[R] Accumulate objects in list after try()
Hi,

I have written a function harvest() and I would like to run it for each value in the vector 1:1000. The function returns a list of 4 objects (obj_1, obj_2, obj_3, obj_4) using the following code at the end of the function:

return(list(obj_1 = obj_1, obj_2 = obj_2, obj_3 = obj_3, obj_4 = obj_4))

Since I am connecting to the web in the function and the connection sometimes fails, causing errors, I invoke the function as follows:

for (i in 1:1000) {
  result <- try(harvest(i))
  if (class(result) == "try-error") next
}

Everything works well except that result only stores obj_1, obj_2, obj_3, obj_4 for the last i in the loop. How do I store obj_1, obj_2, obj_3, obj_4 for the first i in the first 4 elements of result, the objects for the second i in the next 4 elements, etc.?

Thank you very much.

--
View this message in context: http://r.789695.n4.nabble.com/Accumulate-objects-in-list-after-try-tp4650927.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
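One possible way to accumulate the results is sketched below. It is only an illustration: the harvest() defined here is a hypothetical stand-in that fails for some inputs, and the names results and flat are made up for the example. The idea is to write each successful result into its own slot of a preallocated list instead of overwriting a single variable.

```r
# Hypothetical stand-in for the poster's harvest(); fails for every third i.
harvest <- function(i) {
  if (i %% 3 == 0) stop("connection failed")
  list(obj_1 = i, obj_2 = i * 2, obj_3 = i * 3, obj_4 = i * 4)
}

n <- 10
results <- vector("list", n)           # one slot per i
for (i in seq_len(n)) {
  res <- try(harvest(i), silent = TRUE)
  if (inherits(res, "try-error")) next # leave the slot as NULL on failure
  results[[i]] <- res
}

# Drop the slots of failed runs
results <- results[!vapply(results, is.null, logical(1))]

# Flatten one level: 4 consecutive elements per successful i
flat <- unlist(results, recursive = FALSE)
```

Keeping the nested list (results) is often easier to work with than the flat one, since results[[k]]$obj_2 stays addressable by name.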
Re: [R] Reduce(paste, x) question
Thanks Arun, this helps a lot!
[R] Reduce(paste, x) question
I have a question about the Reduce function:

x <- list()
x[[1]] <- LETTERS[1:5]
x[[2]] <- LETTERS[11:15]
Reduce(paste, x)
[1] "A K" "B L" "C M" "D N" "E O"

How do I get this?:

[1] A K
[2] B L
[3] C M
[4] D N
[5] E O

Thanks for your help!
Re: [R] Reduce(paste, x) question
I should have been more specific:

y <- list()
a <- c("A", "K")
b <- c("B", "L")
c <- c("C", "M")
d <- c("D", "N")
e <- c("E", "O")
y[[1]] <- a
y[[2]] <- b
y[[3]] <- c
y[[4]] <- d
y[[5]] <- e
y
[[1]]
[1] "A" "K"
[[2]]
[1] "B" "L"
[[3]]
[1] "C" "M"
[[4]]
[1] "D" "N"
[[5]]
[1] "E" "O"

How do I get a list object like y (each element of y is a vector of strings) from:

x[[1]] <- LETTERS[1:5]
x[[2]] <- LETTERS[11:15]

using only the Reduce function?

Thanks!
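A minimal sketch of one way to get there, not necessarily the answer given on the list: Reduce() can be handed a function that combines two vectors element by element with Map(c, ...). The unname() call is only there because Map() labels its result with the first character vector.

```r
x <- list(LETTERS[1:5], LETTERS[11:15])

# Combine the vectors position by position; each step appends one more column
y <- unname(Reduce(function(a, b) Map(c, a, b), x))

y[[1]]  # the first element pairs the first entries of each vector
```

The same call should also work when x holds more than two vectors of equal length, since each Reduce() step appends one more value to every element.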
Re: [R] strapply and characters adjacent to the matched pattern
Thanks Gabor for your invaluable help! I learned a lot.
Re: [R] strapply and characters adjacent to the matched pattern
Thanks Gabor. That worked really well. I have been reading about POSIX character classes and regular expressions, and I tried to adapt your example to ignore all matches in which the character preceding (rather than following) the match is one of [:alpha:]. So far, I have been unsuccessful. Could anyone help me out here or direct me to a source that explains the combined use of POSIX classes and regular expressions? Thanks!

require(gsubfn)
# this only checks the character following the match and therefore also matches the third element
# however, I want it to match only the 2nd, 5th and 6th elements
strapply(c("abc", "ab", "abdef", "defc", "def", "def "), "(def|ab)($|[^[[:alpha:]])")

The outcome should look like this:

[[1]]
NULL
[[2]]
[1] "ab"
[[3]]
NULL
[[4]]
NULL
[[5]]
[1] "def"
[[6]]
[1] "def"

Gabor Grothendieck wrote:

Try this:

strapply(c("abc", "ab", "ab def"), "(ab|d)($|[^[[:alpha:]])")
[[1]]
NULL
[[2]]
[1] "ab"
[[3]]
[1] "ab"

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
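For checking the character before the match as well as the one after it, one option is a pair of Perl-style lookarounds. The sketch below uses base R (regmatches() with gregexpr(perl = TRUE)) rather than strapply(), so it is a substitute technique, and unmatched elements come back as character(0) instead of NULL.

```r
x <- c("abc", "ab", "abdef", "defc", "def", "def ")

# (?<!...) rejects a letter just before the match, (?!...) a letter just after
pat <- "(?<![[:alpha:]])(def|ab)(?![[:alpha:]])"

hits <- regmatches(x, gregexpr(pat, x, perl = TRUE))
lengths(hits)  # number of matches per element
```

Lookarounds do not consume characters, so the returned match is only the word itself, without the neighbouring character.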
[R] strapply and characters adjacent to the matched pattern
Hi,

In the example below, one of the searched patterns, "SE", is matched in the word "second". I would like to ignore all matches in which the character following the match is one of [:alpha:]. How do I do this without removing the ignore.case = T argument of the strapply function? Thank you very much!

# load library
require(gsubfn)
# read in data
data <- c("Santa Fe Gold Corp|Starpharma Holdings|SE")
# define the object to be searched
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")
# match
strapply(text, data, ignore.case = T)

The preferred outcome would be:

[[1]]
[1] "Santa Fe Gold Corp"
[[2]]
[1] "Starpharma Holdings"

instead of:

[[1]]
[1] "Santa Fe Gold Corp"
[[2]]
[1] "se" "Starpharma Holdings"
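As an illustration of the general idea (not the reply that was posted to the list), wrapping the alternation in word boundaries keeps "SE" from matching inside "second" while staying case-insensitive. This sketch uses base R instead of strapply(); the variable names pat and hits are made up for the example.

```r
data <- "Santa Fe Gold Corp|Starpharma Holdings|SE"
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")

# \\b requires a word boundary on both sides of whichever alternative matches
pat <- paste0("\\b(", data, ")\\b")

hits <- regmatches(text, gregexpr(pat, text, ignore.case = TRUE, perl = TRUE))
```

A word boundary also rejects digits and underscores next to the match, which is slightly stricter than testing for [:alpha:] alone.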
Re: [R] Maximum number of patterns and speed in grep
Hi,

I have a minor follow-up question: in the example below, "ann" and "nn" in the third element of text are matched. I would like to ignore all matches in which the character following the match is one of [:alpha:]. How do I do this without removing the ignore.case = TRUE argument of the strapply function?

So the output should be:

[[1]]
[1] "Santa Fe Gold Corp"
[[2]]
[1] "Starpharma Holdings"
[[3]]
NULL

Rather than:

[[1]]
[1] "Santa Fe Gold Corp"
[[2]]
[1] "Starpharma Holdings"
[[3]]
[1] "ann" "nn"

Thanks!

require(gsubfn)
# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = T, sep = ",")
# define the object to be searched
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings", "the annual earnings exceed those of last year")
k <- 3000 # chunk size
f <- function(from, text) {
  to <- min(from + k - 1, nrow(data))
  r <- paste(data[seq(from, to), 1], collapse = "|")
  r <- gsub("[().*?+{}]", "", r)
  strapply(text, r, ignore.case = TRUE)
}
ix <- seq(1, nrow(data), k)
out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
Re: [R] Maximum number of patterns and speed in grep
Thanks! That worked like a charm.

Math

Gabor Grothendieck wrote:

Although it seems that strapplyc can handle larger regular expressions than grep in R, it seems neither can handle as many as in your example, so process it in chunks:

k <- 3000 # chunk size
f <- function(from, text) {
  to <- min(from + k - 1, nrow(data))
  r <- paste(data[seq(from, to), 1], collapse = "|")
  r <- gsub("[().*?+{}]", "", r)
  strapply(text, r)
}
ix <- seq(1, nrow(data), k)
out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
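The chunking idea can be tried without the original data file. The sketch below is an assumption-laden miniature: the patterns vector is invented, the chunk size is tiny, and base R's regmatches() stands in for strapply(). It keeps the same step of stripping regex metacharacters before the patterns are pasted together.

```r
# Hypothetical pattern list standing in for the first column of data.csv
patterns <- c("Santa Fe Gold Corp", "Starpharma Holdings", "Acme Inc.", "Globex")
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")

k <- 2  # chunk size (3000 in the thread)
chunks <- split(patterns, ceiling(seq_along(patterns) / k))

match_chunk <- function(chunk, text) {
  # drop metacharacters, then build one alternation per chunk
  re <- paste(gsub("[().*?+{}]", "", chunk), collapse = "|")
  regmatches(text, gregexpr(re, text))
}

per_chunk <- lapply(chunks, match_chunk, text = text)

# Collect, for each text element, the matches found across all chunks
out <- lapply(seq_along(text), function(i) {
  unlist(lapply(per_chunk, `[[`, i), use.names = FALSE)
})
```

Removing the metacharacters changes the patterns ("Acme Inc." becomes "Acme Inc"), so it only suits data where those characters are incidental.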
[R] Finding and manipulation clusters of numbers in a sequence of numbers
Hi,

I have the following sequence:

input <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 2)

From this sequence I would like to get to the following sequence:

out <- c(0, 0, 0, 3, 3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 2, 0, 2, 0, 0, 2)

Basically, for each number greater than 0, I would like to add all adjacent numbers, and the adjacent numbers of those numbers, etc., until one of those numbers is equal to 0. I could manually repeat the loops below until the sequence stops changing, but there must be a smarter way. Any suggestions? Thanks!

sequence <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 2)
for (h in 2:(length(sequence) - 1)) {
  sequence[h] <- ifelse(sequence[h] > 0, sequence[h-1] + sequence[h], 0)
}
for (h in 1:(length(sequence) - 1)) {
  sequence[h] <- ifelse(sequence[h] > 0 & sequence[h+1] > sequence[h], sequence[h+1], sequence[h])
}
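One loop-free possibility, offered as a sketch rather than as the answer from the list: label each maximal run of zero or non-zero values with an id, then replace every value by the sum of its run using ave(). The names x and run_id are made up for the example.

```r
x <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, rep(0, 7), 1, 0, 1, rep(0, 4),
       rep(1, 11), 0, 2, 0, 2, 0, 0, 2)

# A new run starts wherever "is positive" flips between neighbours
run_id <- cumsum(c(TRUE, diff(x > 0) != 0))

# Each value becomes the total of the run it belongs to (zero runs stay 0)
out <- ave(x, run_id, FUN = sum)
```

This assumes the values are never negative, as in the example; with negative values the x > 0 test would need to become x != 0.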
Re: [R] Maximum number of patterns and speed in grep
Here's some data (which should give you the error messages):

# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = T, sep = ",")
# first paste all data
data1 <- paste(data[,1], collapse = "|")
# second paste subsets of the data
data2a <- paste(data[1:750,1], collapse = "|")
data2b <- paste(data[751:1500,1], collapse = "|")
# define the object to be searched
text <- c("the first is Santa Fe Gold Corp", "the second is Starpharma Holdings")
# match
strapplyc(text, data1)
strapplyc(text, data2a)
strapplyc(text, data2b)

Thanks in advance!

Math

Gabor Grothendieck wrote:

"PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code."

Note part on last line about posting reproducible code.
Re: [R] Maximum number of patterns and speed in grep
Thanks, I see that it is working on the sample data. My data, however, gives me an error message:

data <- strapplyc(text, batch[[l]])
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class = "tclObj") :
  [tcl] couldn't compile regular expression pattern: parentheses () not balanced.

batch[[l]] is similar to your re string except that there is a larger variety of characters. I haven't been able to figure out which characters are causing trouble here. Any thoughts?

Thank you very much.

Math

Gabor Grothendieck wrote:

Try strapplyc in gsubfn and see http://gsubfn.googlecode.com for more info.

# test data
x <- c("abcd", "z", "dbef")
# re is regexp with 7700 alternatives
# to test with
g <- expand.grid(letters, letters, letters)
gp <- do.call("paste0", g)
gp7700 <- head(gp, 7700)
re <- paste(gp7700, collapse = "|")
# grep gives error message
grep.out <- grep(re, x)
# strapplyc works
library(gsubfn)
which(sapply(strapplyc(x, re), length) > 0)
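To find out which entries break the combined pattern, one approach is to try compiling each entry on its own and collect the ones that fail. The sketch below is only an illustration with invented company names, and it tests against base R's default regex engine, which may not reject exactly the same inputs as the Tcl engine used by strapplyc.

```r
# Hypothetical entries; two of them contain unbalanced brackets
patterns <- c("Santa Fe Gold Corp", "Acme (Holdings", "Globex [Intl", "Initech")

compiles <- vapply(patterns, function(p) {
  tryCatch({
    regexpr(p, "x")   # forces the pattern to be compiled
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
}, logical(1), USE.NAMES = FALSE)

patterns[!compiles]   # the entries that need escaping or cleaning
```

If the entries are meant to be literal strings rather than regular expressions, matching them with fixed = TRUE avoids the problem altogether.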
[R] Maximum number of patterns and speed in grep
Hi,

I am using R's grep function to find patterns in vectors of strings. The number of patterns I would like to match is 7,700 (of different sizes). I noticed that I get an error message when I do the following:

data <- array()
for (j in 1:length(x)) {
  array[j] <- length(grep(paste(patterns[1:7700], collapse = "|"), x[j], value = T))
}

When I break this up into 4 chunks of patterns it works:

data <- array()
for (j in 1:length(x)) {
  array$chunk1[j] <- length(grep(paste(patterns[1:2500], collapse = "|"), x[j], value = T))
  array$chunk1[j] <- length(grep(paste(patterns[2501:5000], collapse = "|"), x[j], value = T))
  array$chunk1[j] <- length(grep(paste(patterns[5001:7500], collapse = "|"), x[j], value = T))
  array$chunk1[j] <- length(grep(paste(patterns[7501:7700], collapse = "|"), x[j], value = T))
}

My questions: what's the maximum size of the patterns argument in grep? Is there a way to do this faster? It is very slow.

Thanks.

Math

Sorry for not providing a reproducible example. It's a size issue which makes it difficult to provide an example.
Re: [R] Maximum number of patterns and speed in grep
Thanks for the quick response. I should phrase my question differently, because everything is working fine; I am just trying to find a more efficient approach:

1. What's the maximum size of the patterns argument in grep? I can't find it online.
2. I am trying to match 7,700 character strings to about 10,000 vectors, each containing about 5,000 strings, using grep. Is there a way to do this faster? It is very slow.

Thanks

Sarah Goslee wrote:

Hi,

Given that you can't provide a full example, please at least provide str() on your data, more complete information on the problem, and ideally a small toy example that demonstrates precisely what you are doing.

For instance, you tell us that you get an error message but you never tell us what it is. Don't you think we might need to know what the error is to be able to diagnose and fix it?

Also, note that your working example simply overwrites array$chunk1[j] four times.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
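If the 7,700 strings are literal text rather than regular expressions, one thing that might be worth trying is matching each of them with fixed = TRUE and combining the results, which avoids building one huge alternation. This is a sketch under that assumption, with invented data, and it makes no claim about the actual speed on the poster's data.

```r
# Hypothetical literal patterns and strings to search
patterns <- c("Gold Corp", "Starpharma", "C++ Ltd")
x <- c("the first is Santa Fe Gold Corp", "nothing here", "C++ Ltd and Starpharma")

# One vectorised fixed-string search per pattern, OR-ed together
hits <- Reduce(`|`, lapply(patterns, grepl, x = x, fixed = TRUE))

sum(hits)  # number of strings in x that contain at least one pattern
```

Because fixed = TRUE treats the pattern literally, characters such as "+" or "(" need no escaping.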
Re: [R] Fw: Extract upper case letters
Thanks, all solutions worked like a charm.

Math

arun kirshna wrote:

Hi,

Try this:

t <- "TheWeatherIsVeryNice"
t1 <- gsub("[:A-Z:]", "", t)
t1
[1] "heeatherseryice"
t2 <- gsub("[:a-z:]", "", t)
t2
[1] "TWIVN"

A.K.
[R] Extract upper case letters
t <- "TheWeatherIsVeryNice"

How do I extract the upper case letters? -> TWIVN

Thanks!
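A small sketch of two base R ways to do this, using the POSIX class [:upper:]; the variable name s is used instead of t only to avoid masking the transpose function.

```r
s <- "TheWeatherIsVeryNice"

# Delete everything that is not an upper case letter
upper <- gsub("[^[:upper:]]", "", s)

# Or pull the upper case letters out as separate elements
upper_vec <- regmatches(s, gregexpr("[[:upper:]]", s))[[1]]
```

The first form gives a single string, the second a character vector with one letter per element.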
Re: [R] How to set cookies in RCurl
Hi,

I am using Prof. Temple Lang's suggestions and I think I should be close, but with the code below I get an error message which I don't fully understand. Any suggestions? Thanks!

Math

library(RCurl)
library(XML)
setwd("C:/Comments")
url <- getURLContent("http://www.scopus.com/results/results.url?sort=plf-f&src=s&sid=M8RcnaPRBgrtA1r_EvZtL7j%3a70&sot=a&sdt=a&sl=32&s=PMID%2811693556%29+OR+PMID%2812239288%29&origin=searchadvanced&txGid=M8RcnaPRBgrtA1r_EvZtL7j%3a7", options(RCurlOptions = list(proxy = "127.0.0.1:2048", proxyuserpwd = "username:password", proxyauth = "gci")), cookiefile = "/Rcookies")
Error in curlOptions(..., .opts = .opts) :
  unnamed curl option(s): list(RCurlOptions = list(proxy = "127.0.0.1:2048", proxyuserpwd = "username:password", proxyauth = "gci"))

Duncan Temple Lang wrote:

Apologies for following up on my own mail, but I forgot to explicitly mention that you will need to specify the appropriate proxy information in the call to getURLContent().

D.

On 6/7/12 8:31 AM, Duncan Temple Lang wrote:

To just enable cookies and their management, use the cookiefile option, e.g.

txt = getURLContent(url, cookiefile = "")

Then you can pass this to readHTMLTable(), best done as

content = readHTMLTable(htmlParse(txt, asText = TRUE))

The function readHTMLTable() doesn't use RCurl and doesn't handle cookies.

D.
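The error message says the curl options arrived as one unnamed argument, which is what wrapping them in options() produces. A hedged sketch of the alternative is to pass them by name, or together through the .opts argument. The proxy values below are the placeholders from the post, fetch_tables is a hypothetical helper name, and the function is only defined here, not called, since it needs network access plus the RCurl and XML packages.

```r
# Curl settings as a plain named list; values are placeholders from the post
opts <- list(
  proxy = "127.0.0.1:2048",
  proxyuserpwd = "username:password",
  cookiefile = ""          # empty string switches on cookie handling
)

# Hypothetical helper: fetch a page with those settings, then parse its tables
fetch_tables <- function(url, opts) {
  txt <- RCurl::getURLContent(url, .opts = opts)
  XML::readHTMLTable(XML::htmlParse(txt, asText = TRUE))
}
```

Whether this is enough for a site behind an authenticating proxy depends on the proxy setup, which the thread does not settle.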
[R] How to set cookies in RCurl
Hi,

I am trying to access a website and read its content. The website is a restricted-access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems allowing RCurl to receive and send cookies. The following lines give me:

library(RCurl)
library(XML)
url <- "http://www.theurl.com"
content <- readHTMLTable(url)
content
$`NULL`
  V1
1
2 Cookies disabled
3
4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

$`NULL`
  V1 V2 V3
1

$`NULL`
  V1
1 Cookies disabled

$`NULL`
  V1
1
2
3

I have carefully read section 4.4 of http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following without success:

curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', curl = curl)

Any suggestions on how to allow for cookies?

Thanks.

Math
Re: [R] How to set cookies in RCurl
Thanks for the fast response. I am not sure how to enter the proxy info in the call. I am working via EZproxy (which, I think, rewrites a URL). According to their website it does this:

1. Within the config.txt/ezproxy.cfg file, various hosts are identified that require access from a local IP address.
2. A remote user makes a web connection to port 2048 of your EZproxy server.
3. When the user authenticates successfully, a cookie is sent to the user's browser.
4. The user's browser presents this during each access to EZproxy.

So, for example, if I enter URL 1, EZproxy dynamically changes it to URL 2:

1. http://www.scopus.com/results/...
2. http://www-scopus-com.ezproxy.cul.columbia.edu/results/...

What kind of proxy information should I look for and where do I enter it in the call? Your help is very much appreciated. Thanks.
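The rewriting shown in the two example URLs can be mimicked with a few lines of string handling. This sketch only reproduces that single example: the function name ezproxy_url is hypothetical, and the rule it applies (dots in the host become hyphens, then the proxy domain is appended) is inferred from the post, not taken from EZproxy's documentation.

```r
# Hypothetical helper that mirrors the URL 1 -> URL 2 example in the post
ezproxy_url <- function(url, suffix = "ezproxy.cul.columbia.edu") {
  # split into scheme, host and the rest of the URL
  m <- regmatches(url, regexec("^(https?://)([^/]+)(.*)$", url))[[1]]
  host <- gsub(".", "-", m[3], fixed = TRUE)
  paste0(m[2], host, ".", suffix, m[4])
}

ezproxy_url("http://www.scopus.com/results/results.url")
```

A rewritten URL on its own would presumably still need the authentication cookie described in steps 3 and 4 of the post.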
Re: [R] gsub/strsplit with multiple patterns/splits
Thank you very much. This definitely helps me out. Math

Jeff Newmiller wrote:

There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help page for specific tools. After you have waded through a tutorial, the following explanation should make more sense.

The braces are extended regex syntax for a repetition of a pattern by some minimum to some maximum number of times. The pattern immediately precedes the repetition specification. In the first case of {0,1} the pattern being repeated is the comma, and in the second case it is any of the characters in the square brackets (a period in this case). The period is a special "match any character" pattern when not part of a set of characters. A common shorthand for "zero or one" of something is a ? symbol.

Also, please learn to provide quoting context for the majority of us who do not use Nabble.

Jeff Newmiller
DCN:<jdnewmil@.ca>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

mdvaan <mathijsdevaan@> wrote:

Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (,{0,1} Inc[.]{0,1}) do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to remove the word Energy? Thank you very much in advance. Math

View this message in context: http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873p4631934.html
[R] gsub/strsplit with multiple patterns/splits
Hi, I have a vector like this:

DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

For each element in the vector I would like to remove the incorporated info, so that my vector looks like this:

DF <- c("Aetna", "Alexander's", "Allegheny Energy")

That means that I have to strip:

strip <- c(", Inc.", " Inc.", ", Inc")

How do I pass multiple patterns/splits to gsub/strsplit? Thanks! Math

View this message in context: http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873.html
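One way to apply several literal patterns is simply to loop over them. The lines below are a minimal sketch, not the answer given on the list; the variable names `firms` and `cleaned` are made up for illustration, and `fixed = TRUE` is used so that the period is treated as a literal character rather than as a regex wildcard.

```r
# Assumed input, taken from the question above
firms <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

# Literal strings to remove; longer variants come first so that
# ", Inc." is removed before the shorter ", Inc" gets a chance to match
strip <- c(", Inc.", " Inc.", ", Inc")

cleaned <- firms
for (s in strip) {
  # fixed = TRUE: match the string literally, no regex interpretation
  cleaned <- gsub(s, "", cleaned, fixed = TRUE)
}
print(cleaned)
```

The order of `strip` matters with this approach: if ", Inc" were applied first, a stray "." would be left behind on "Aetna, Inc.".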
Re: [R] gsub/strsplit with multiple patterns/splits
Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (,{0,1} Inc[.]{0,1}) do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to remove the word Energy? Thank you very much in advance. Math

View this message in context: http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873p4631897.html
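To make the quantifier question concrete, here is a small sketch using the pattern quoted in the message. `{0,1}` means "repeat the preceding item at least 0 and at most 1 times", so the comma and the period are both optional. The variable names and the last pattern (the one that also drops " Energy") are illustrative assumptions, not something posted in the thread.

```r
firms <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

# Pattern from the thread: optional comma, then " Inc", then optional period
with_braces <- gsub(",{0,1} Inc[.]{0,1}", "", firms)

# Same meaning written with the ? shorthand for {0,1}
with_shorthand <- gsub(",? Inc[.]?", "", firms)

# Hypothetical extension: also drop an optional " Energy" in front of the suffix
no_energy <- gsub("( Energy)?,? Inc[.]?", "", firms)

print(with_braces)
print(with_shorthand)
print(no_energy)
```

The parentheses in the last pattern group " Energy" so that the `?` applies to the whole word rather than to its final letter.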
[R] evaluate whether function returns error
Hi, The following returns an error message. How do I evaluate (TRUE or FALSE) whether the function returns an error?

require(XML)
readHTMLTable("http://www.sec.gov/Archives/edgar/data/2969/95012399010952/950123-99-010952.txt")

Thanks in advance! Math

View this message in context: http://r.789695.n4.nabble.com/evaluate-whether-function-returns-error-tp4631406.html
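A common base R approach to this kind of question is tryCatch() or try(). The sketch below avoids the network call and uses stand-in expressions; `returns_error` is a hypothetical helper name, not a function from any package.

```r
# Hypothetical helper: TRUE if evaluating `expr` signals an error, FALSE otherwise
returns_error <- function(expr) {
  tryCatch({
    force(expr)   # the argument is evaluated here, inside tryCatch
    FALSE
  }, error = function(e) TRUE)
}

ok_call  <- returns_error(sqrt(4))        # no error is signalled
bad_call <- returns_error(stop("boom"))   # an error is signalled

# Equivalent check with try(): the result carries class "try-error" on failure
res <- try(log("a"), silent = TRUE)
failed <- inherits(res, "try-error")

print(c(ok_call, bad_call, failed))
```

In the setting of the question, the readHTMLTable() call would take the place of the stand-in expressions.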
[R] Plotting network without overlapping vertices
Hello, I am using the plot.igraph function in the igraph package to plot a network. How do I keep vertices from overlapping? One option would be to pass an argument that prevents vertices from occupying the same coordinates, given their size. A second option would be to increase the area of the plot (and multiply the distance between vertices by a constant) while keeping the size of the vertices the same. I would like to keep my current layout (kamada.kawai). Any suggestions? Thanks, Math

View this message in context: http://r.789695.n4.nabble.com/Plotting-network-without-overlapping-vertices-tp4604559.html
[R] Scrape data from Scopus: login through R?
Hello, The Scopus bibliographic database allows one to manually download batches of 2000 publications. The data is rich but does not include a field containing the author ID. However, author IDs can be retrieved through the hyperlinks on the Scopus website. I have two questions:

1. My institution has a Scopus license, so I need to log in. How do I do that in R (through RCurl, XML?)?
2. How do I scrape hyperlinks?

Your help is appreciated. Thanks Math

View this message in context: http://r.789695.n4.nabble.com/Scrape-data-from-Scopus-login-through-R-tp4579261p4579261.html
[R] ggplot axis limit
Hi, This is probably an easy one, but I am new to ggplot2 and cannot find an answer online. I am bar plotting values of 10 groups. These values are all within a 90-100 range, so I would like to leave out the area of the bars below 90. If I say graph + scale_y_continuous(limit=c(90, 100)), it does limit the axis but the bars disappear completely. Any solution here? Thanks a lot! Mathijs

View this message in context: http://r.789695.n4.nabble.com/ggplot-axis-limit-tp4478835p4478835.html
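The bars vanish because scale limits discard data that fall outside the range, and a bar that starts at 0 falls outside 90-100. Zooming the coordinate system instead leaves the data untouched. A minimal sketch, assuming ggplot2 is installed; the data frame `df` is invented for illustration.

```r
library(ggplot2)

# Made-up data: 10 groups with values between 90 and 100
df <- data.frame(group = factor(1:10), value = seq(91, 100))

# coord_cartesian() zooms the visible region without dropping any data,
# so the bars are kept and simply clipped at y = 90
p <- ggplot(df, aes(x = group, y = value)) +
  geom_bar(stat = "identity") +
  coord_cartesian(ylim = c(90, 100))

print(p)
```

Note that a bar chart whose axis does not start at zero can exaggerate differences between groups, so a point or dot plot may be worth considering for this kind of data.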
[R] Merging fully overlapping groups
Hi, I have data on individuals (B) who participated in events (A). If ALL participants in an event are a subset of the participants in another event, I would like to remove the smaller event; and if the participants in one event are identical to the participants in another event, I would like to remove one of the two (I don't care which one). The following example does that; however, it is extremely slow (and the true dataset is very large). What would be a more efficient way to solve the problem? I really appreciate your help. Thanks!

DF <- data.frame(read.table(textConnection("    A      B
12095  69832
12095  51750
12095   6734
18774  51750
18774  51733
18774   6734
18774  69833
19268  51750
19268   6734
19268  51733
19268  65251
 5169  54441
 5169  15480
 5169   3228
 5966  51733
 5966  65251
 5966  68197
 5966   6734
 5966  51750
 5966  69833
 7189 135523
 7189  65251
 7189  51733
 7189  69833
 7189 135522
 7189  68197
 7189   6734
 7797  51750
 7797   6734
 7797  69833
 7866   6734
 7866  69833
 7866  51733
 8596  51733
 8596  51750
 8596  65251
 8677   6734
 8677  51750
 8677  51733
 8936  68197
 8936   6734
 8936  65251
 8936  51733
 9204  51750
 9204  69833
 9204   6734
 9204  51733"),head=TRUE,stringsAsFactors=FALSE))

data <- unique(DF$A)
for (m in 1:length(data)) {
  for (m in 1:length(data)) {
    tdata <- data[-m]
    q <- 0
    for (n in 1:length(tdata)) {
      if (length(which(DF[DF$A == data[m], 2] %in% DF[DF$A == tdata[n], 2] == TRUE)) == length(DF[DF$A == data[m], 2])) {
        q <- q + 1
      }
    }
    if (q > 0) {
      data <- data[-m]
      m <- m - 1
    }
  }
}
DF <- DF[DF$A %in% data,]

View this message in context: http://r.789695.n4.nabble.com/Merging-fully-overlapping-groups-tp4470999p4470999.html
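One possible speed-up is to split the participants by event once and then process events from largest to smallest, keeping an event only if no already-kept event contains all of its participants. This is a sketch on invented toy data, not the solution posted by the list members; the names `toy`, `sets`, and `keep` are illustrative.

```r
# Toy data (made up): event id A, participant id B
toy <- data.frame(A = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
                  B = c(10, 11, 12, 10, 11, 10, 11, 12, 20, 21))

# Participants per event, computed once instead of inside nested loops
sets <- lapply(split(toy$B, toy$A), unique)

# Visit events from largest to smallest: a superset (or identical set)
# of an event can never be smaller than the event itself
ord <- order(-lengths(sets))

keep <- character(0)
for (ev in names(sets)[ord]) {
  # Is this event fully contained in an event we already decided to keep?
  covered <- any(vapply(sets[keep],
                        function(s) all(sets[[ev]] %in% s),
                        logical(1)))
  if (!covered) keep <- c(keep, ev)
}

result <- toy[toy$A %in% keep, ]
print(keep)
```

Here event 3 duplicates event 1 and event 2 is a subset of event 1, so only events 1 and 4 survive. The comparison is still pairwise in the number of events, so very large inputs may need a smarter index, but each comparison is a single vectorised `%in%`.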
Re: [R] Merging fully overlapping groups
Hi Jean and Peter, Thanks for the help. Both options are indeed faster than my initial procedure. Best, Mathijs

View this message in context: http://r.789695.n4.nabble.com/Merging-fully-overlapping-groups-tp4470999p4473013.html
Re: [R] Select elements from text
Thanks. That worked great!

View this message in context: http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4327711.html
[R] Select elements from text
Hi, I have a series of MS Word files and each file contains plain text. From these texts I would like to extract only those elements (read: words) that are between square brackets. Example of a text:

Most fundamentally, it has led to an effort to clarify the organizational form concept. According to them [see also Smith, Jones and Carroll 2002], categories emerge as audience members recognize dissimilarities among groups of consumers and label them as members of a common set [Nicol 2000].

Now I would like to get the following selection:

see also Smith, Jones and Carroll 2002
Nicol 2000

Any ideas on how to do this? What would be the best way to import the text into R? The entire text as an element in a data frame? Thank you very much! Best, Mathijs

View this message in context: http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4323947.html
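Once the text is available as a single character string, the bracketed parts can be pulled out with gregexpr() and regmatches(). This is a minimal sketch rather than the reply that was posted; `txt`, `hits`, and `cites` are illustrative names, and the string is the example from the question.

```r
txt <- paste(
  "Most fundamentally, it has led to an effort to clarify the organizational",
  "form concept. According to them [see also Smith, Jones and Carroll 2002],",
  "categories emerge as audience members recognize dissimilarities among",
  "groups of consumers and label them as members of a common set [Nicol 2000]."
)

# "\\[.*?\\]" matches an opening bracket, as little text as possible,
# and the next closing bracket (lazy quantifier, hence perl = TRUE)
hits <- regmatches(txt, gregexpr("\\[.*?\\]", txt, perl = TRUE))[[1]]

# Strip the surrounding brackets from each match
cites <- gsub("^\\[|\\]$", "", hits)
print(cites)
```

The lazy `.*?` matters: a greedy `.*` would run from the first opening bracket to the last closing bracket and return one long match.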
Re: [R] Select elements from text
Thanks for the quick response. I get the latter part, but reading the text from MS Word into R is problematic. I am able to read in (scan) all unique elements (following sep = " ") from the text, but unable to paste everything together again. Any idea on how to solve this? It looks like this now:

text <- scan("test.txt", character(0), sep = " ")
text
 [1] "Most"            "fundamentally,"  "it"              "has"
 [5] "led"             "to"              "an"              "effort"
 [9] "to"              "clarify"         "the"             "organizational"
[13] "form"            "concept."        "According"       "to"
[17] "them"            "[see"            "also"            "Smith,"
[21] "Jones"           "and"             "Carroll"         "2002],"
[25] "categories"      "emerge"          "as"              "audience"
[29] "members"         "recognize"       "dissimilarities" "among"
[33] "groups"          "of"              "consumers"       "and"
[37] "label"           "them"            "as"              "members"
[41] "of"              "a"               "common"          "set"
[45] "[Nicol"          "2000]."

View this message in context: http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4325174.html
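The scanned words can be joined back into one string with paste() and its `collapse` argument. The sketch below writes a short stand-in file to a temporary location so that it runs anywhere; the file contents and the names `words`, `full_text`, and `lines_text` are assumptions for illustration.

```r
# Stand-in for test.txt: two lines of the example text
tmp <- tempfile(fileext = ".txt")
writeLines(c("According to them [see also Smith,",
             "Jones and Carroll 2002], categories emerge"), tmp)

# Word-by-word read, as in the message above
words <- scan(tmp, what = character(), sep = " ", quiet = TRUE)

# collapse = " " glues the vector into a single string
full_text <- paste(words, collapse = " ")

# Alternative: read whole lines and join those instead
lines_text <- paste(readLines(tmp), collapse = " ")

print(full_text)
```

Reading with readLines() avoids splitting the text in the first place, which may be simpler when the goal is one string per document.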
Re: [R] Selecting and multiplying
Anyone any idea on how to tackle this problem? Thanks a lot!

View this message in context: http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3793908.html
[R] Selecting and multiplying
Hi, I have created two objects: object c contains yearly distances between cases and object g contains yearly interactions between cases. For each case and every year I would like to calculate the following value:

Vit = sum(Dabt * Iait * Ibit)

where Vit is the value of case i in year t, Dabt is the distance between all cases a and b that have interacted more than 0 times with i, Iait is the number of times i has interacted with a, and Ibit is the number of times i has interacted with b. So for 8027 in 1999 Vit becomes:

(0.27547644 * 1 * 2) + (0.31481129 * 1 * 1) + (0.09896982 * 2 * 1) = 1.06370381

How do I create a data frame that accommodates the values for each case in each year? Thanks in advance!

Example:

library(zoo)
DF1 = data.frame(read.table(textConnection("B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) 1 - tcrossprod(mat / sqrt(rowSums(mat^2))))
c <- lapply(c, function(x) ifelse(x < 0.00111, 0, x))

DF2 = data.frame(read.table(textConnection("A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)

View this message in context: http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3784901.html
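The worked number for case 8027 in 1999 can be reproduced with combn() over the partners of that case. This is only a sketch for a single case and year: the three distances are copied from the message, but which pair each distance belongs to is an assumption made here for illustration, as are the names `D`, `inter`, and `V`.

```r
# Partners of case 8027 in 1999 and the number of interactions with each
inter <- c("8025" = 1, "8026" = 2, "8029" = 1)

# Distances between partners; the pair labels are assumed, not from the thread
ids <- names(inter)
D <- matrix(0, 3, 3, dimnames = list(ids, ids))
D["8025", "8026"] <- D["8026", "8025"] <- 0.27547644
D["8025", "8029"] <- D["8029", "8025"] <- 0.31481129
D["8026", "8029"] <- D["8029", "8026"] <- 0.09896982

# All unordered pairs (a, b) of partners with at least one interaction.
# combn() needs at least two partners; a case with fewer has no pairs.
partners <- names(inter)[inter > 0]
pairs <- combn(partners, 2)

# V = sum over pairs of D[a, b] * I[a] * I[b]
V <- sum(apply(pairs, 2, function(p) D[p[1], p[2]] * inter[[p[1]]] * inter[[p[2]]]))
print(V)
```

Wrapping these steps in a function of case and year, and calling it over all combinations, would give one row per case and year for the requested data frame.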
Re: [R] Selections in lists
Thanks David and Jorge for your comments!

View this message in context: http://r.789695.n4.nabble.com/Selections-in-lists-tp3768562p3784816.html
[R] Selections in lists
Hi, I have produced a list g and I would like to reduce the amount of information contained in each object in g. For each matrix I would like to keep the values where the column name equals g[year][[1]][[x]] and the row names equal g[year][[1]][[-x]]. So in g$`1999`$`8029`, year = 1999 and x = 8029. I have been experimenting with the subset function, but have been unsuccessful. Thanks for your help!

The result for g$`1999`$`8029` should be:

$`1999`$`8029`
      B
B      8029
  8026    1
  8027    1
  8028    1

The result for g$`1999`$`8028` should be:

$`1999`$`8028`
      B
B      8028
  8029    1

The result for g$`1999`$`8027` should be:

$`1999`$`8027`
      B
B      8027
  8025    1
  8026    2
  8029    1

Example:

DF = data.frame(read.table(textConnection("A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF[DF$C %in% y, 1:2]))
years <- sort(unique(DF$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
years <- names(g)
for (t in seq(years)) {
  year = as.character(years[t])
  g[[year]] <- sapply(colnames(g[[year]]), function(var) g[[year]][g[[year]][,var] > 0, g[[year]][var,] > 0])
}

View this message in context: http://r.789695.n4.nabble.com/Selections-in-lists-tp3768562p3768562.html
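The requested result can be obtained by indexing a single column and excluding the case's own row. The following is a sketch under that reading of the question, not the answer given on the list; the helper `partners` and the object `sel` are hypothetical names, and the three-year windows are rebuilt directly from the example data.

```r
DF <- data.frame(
  A = c(80, 80, 80, 81, 82, 82, 83, 83, 90, 90, 90, 84, 84, 85, 85),
  B = c(8025, 8026, 8029, 8026, 8025, 8026, 8025, 8027,
        8026, 8027, 8029, 8026, 8027, 8028, 8029),
  C = c(1995, 1995, 1995, 1996, 1997, 1997, 1997, 1997,
        1998, 1998, 1998, 1999, 1999, 1999, 1999)
)

# Co-occurrence matrices over three-year windows (year t, t-1, t-2)
years <- sort(unique(DF$C))
windows <- embed(years, 3)
g <- lapply(seq_len(nrow(windows)), function(i) {
  crossprod(table(DF[DF$C %in% windows[i, ], c("A", "B")]))
})
names(g) <- windows[, 1]

# Hypothetical helper: column x, restricted to other cases with a count > 0
partners <- function(m, x) {
  keep <- m[, x] > 0 & rownames(m) != x
  m[keep, x, drop = FALSE]   # drop = FALSE keeps the result a matrix
}

sel <- lapply(g, function(m) {
  sapply(colnames(m), function(x) partners(m, x), simplify = FALSE)
})

print(sel[["1999"]][["8027"]])
```

`simplify = FALSE` keeps one small matrix per case, since the pieces have different numbers of rows and cannot be bound into a single array.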
[R] Selecting cases from matrices stored in lists
Hi, I have two lists (c and h - see below) containing matrices with similar cases but different values. I want to split these matrices into multiple matrices based on the values in h. So, I did the following:

years <- c(1997:1999)
for (t in 1:length(years)) {
  year = as.character(years[t])
  h[[year]] <- sapply(colnames(h[[year]]), function(var) h[[year]][h[[year]][,var] > 0, h[[year]][var,] > 0])
}

Now that I have created list h (with split matrices), I would like to use these selections to make similar selections in list c. List c needs to get the exact same shape as h, so that `8026` in 1997 (c$`1997`$`8026`) looks like this:

$`1997`$`8026`
      B
B      8025      8026      8029
  8025 1.000     0.7739527 0.9656091
  8026 0.7739527 1.000     0.7202771
  8029 0.9656091 0.7202771 1.000

Can anyone help me do this? I have no idea how I can get it to work. Thank you very much for your help!

library(zoo)
DF1 = data.frame(read.table(textConnection("B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))

DF2 = data.frame(read.table(textConnection("A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function(x) ifelse(x > 0, 1, 0))

View this message in context: http://r.789695.n4.nabble.com/Selecting-cases-from-matrices-stored-in-lists-tp3759597p3759597.html
Re: [R] Selecting cases from matrices stored in lists
Jean V Adams wrote: [R] Selecting cases from matrices stored in lists mdvaan to: r-help 08/22/2011 07:24 AM Hi, I have two lists (c and h - see below) containing matrices with similar cases but different values. I want to split these matrices into multiple matrices based on the values in h. So, I did the following: years-c(1997:1999) for (t in 1:length(years)) { year=as.character(years[t]) h[[year]]-sapply(colnames(h[[year]]), function(var) h[[year]][h[[year]][,var]0, h[[year]][var,]0]) } Now that I have created list h (with split matrices), I would like to use these selections to make similar selections in list c. List c needs to get the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) looks like this: $`1997`$`8026` B B 8025 8026 8029 8025 1.000 0.7739527 0.9656091 8026 0.7739527 1.000 0.7202771 8029 0.9656091 0.7202771 1.000 Can anyone help me doing this? I have no idea how I can get it to work. Thank you very much for your help! Try this: c2 - h years - names(h) for (t in seq(years)) { year - years[t] c2[[year]] - sapply(colnames(h[[year]]), function(var) c[[t]][h[[year]][ ,var] 0, h[[year]][var, ] 0]) } By the way, it's great that you included code in your question. However, I encountered a couple of errors when running you code (see below). Also, it would be better to use a different name for your list c, because c() is a function in R. Jean library(zoo) DF1 = data.frame(read.table(textConnection(B C D E F G 8025 1995 0 4 1 2 8025 1997 1 1 3 4 8026 1995 0 7 0 0 8026 1996 1 2 3 0 8026 1997 1 2 3 1 8026 1998 6 0 0 4 8026 1999 3 7 0 3 8027 1997 1 2 3 9 8027 1998 1 2 3 1 8027 1999 6 0 0 2 8028 1999 3 7 0 0 8029 1995 0 2 3 3 8029 1998 1 2 3 2 8029 1999 6 0 0 1),head=TRUE,stringsAsFactors=FALSE)) a - read.zoo(DF1, split = 1, index = 2, FUN = identity) sum.na - function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA b - rollapply(a, 3, sum.na, align = right, partial = TRUE) Error in FUN(cdata[st, i], ...) 
: unused argument(s) (partial = TRUE) rollapply() has no argument partial. newDF - lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) names(newDF) - time(a) Error in names(newDF) - time(a) : 'names' attribute [5] must be the same length as the vector [3] newDF has only 3 names, but time(a) is of length 5. c-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2 DF2 = data.frame(read.table(textConnection( A B C 80 8025 1995 80 8026 1995 80 8029 1995 81 8026 1996 82 8025 1997 82 8026 1997 83 8025 1997 83 8027 1997 90 8026 1998 90 8027 1998 90 8029 1998 84 8026 1999 84 8027 1999 85 8028 1999 85 8029 1999),head=TRUE,stringsAsFactors=FALSE)) e - function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) years - sort(unique(DF2$C)) f - as.data.frame(embed(years, 3)) g-lapply(split(f, f[, 1]), e) h-lapply(g, function (x) ifelse(x0,1,0)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Sorry, I am using the devel version of zoo which allows you to use the partial argument. The correct code is given below. I didn't get your suggestion to work. If I understand what you are trying to do (multiplying c and h), this is likely to give the wrong results because h contains values of 0. Since I am ultimately interested in the values of the split matrices in c (based on the original matrices in c), this will probable not work. Or am I just not understanding you? Thanks! 
# devel version of zoo install.packages(zoo, repos = http://r-forge.r-project.org;) library(zoo) DF1 = data.frame(read.table(textConnection(B C D E F G 8025 1995 0 4 1 2 8025 1997 1 1 3 4 8026 1995 0 7 0 0 8026 1996 1 2 3 0 8026 1997 1 2 3 1 8026 1998 6 0 0 4 8026 1999 3 7 0 3 8027 1997 1 2 3 9 8027 1998 1 2 3 1 8027 1999 6 0 0 2 8028 1999 3 7 0 0 8029 1995 0 2 3 3 8029 1998 1 2 3 2 8029 1999 6 0 0 1),head=TRUE,stringsAsFactors=FALSE)) a - read.zoo(DF1, split = 1, index = 2, FUN = identity) sum.na - function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA b - rollapply(a, 3, sum.na, align = right, partial = TRUE) newDF - lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i
Re: [R] Selecting cases from matrices stored in lists
Thanks Jean, changing c[[t]] to c[[year]] solved the issue. Math Jean V Adams wrote: Re: [R] Selecting cases from matrices stored in lists mdvaan to: r-help 08/22/2011 09:46 AM Jean V Adams wrote: [R] Selecting cases from matrices stored in lists mdvaan to: r-help 08/22/2011 07:24 AM Hi, I have two lists (c and h - see below) containing matrices with similar cases but different values. I want to split these matrices into multiple matrices based on the values in h. So, I did the following: years-c(1997:1999) for (t in 1:length(years)) { year=as.character(years[t]) h[[year]]-sapply(colnames(h[[year]]), function(var) h[[year]][h[[year]][,var]0, h[[year]][var,]0]) } Now that I have created list h (with split matrices), I would like to use these selections to make similar selections in list c. List c needs to get the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) looks like this: $`1997`$`8026` B B 8025 8026 8029 8025 1.000 0.7739527 0.9656091 8026 0.7739527 1.000 0.7202771 8029 0.9656091 0.7202771 1.000 Can anyone help me doing this? I have no idea how I can get it to work. Thank you very much for your help! Try this: c2 - h years - names(h) for (t in seq(years)) { year - years[t] c2[[year]] - sapply(colnames(h[[year]]), function(var) c[[t]][h[[year]][ ,var] 0, h[[year]][var, ] 0]) } By the way, it's great that you included code in your question. However, I encountered a couple of errors when running you code (see below). Also, it would be better to use a different name for your list c, because c() is a function in R. 
Jean library(zoo) DF1 = data.frame(read.table(textConnection(B C D E F G 8025 1995 0 4 1 2 8025 1997 1 1 3 4 8026 1995 0 7 0 0 8026 1996 1 2 3 0 8026 1997 1 2 3 1 8026 1998 6 0 0 4 8026 1999 3 7 0 3 8027 1997 1 2 3 9 8027 1998 1 2 3 1 8027 1999 6 0 0 2 8028 1999 3 7 0 0 8029 1995 0 2 3 3 8029 1998 1 2 3 2 8029 1999 6 0 0 1),head=TRUE,stringsAsFactors=FALSE)) a - read.zoo(DF1, split = 1, index = 2, FUN = identity) sum.na - function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA b - rollapply(a, 3, sum.na, align = right, partial = TRUE) Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE) rollapply() has no argument partial. newDF - lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) names(newDF) - time(a) Error in names(newDF) - time(a) : 'names' attribute [5] must be the same length as the vector [3] newDF has only 3 names, but time(a) is of length 5. c-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2 DF2 = data.frame(read.table(textConnection( A B C 80 8025 1995 80 8026 1995 80 8029 1995 81 8026 1996 82 8025 1997 82 8026 1997 83 8025 1997 83 8027 1997 90 8026 1998 90 8027 1998 90 8029 1998 84 8026 1999 84 8027 1999 85 8028 1999 85 8029 1999),head=TRUE,stringsAsFactors=FALSE)) e - function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) years - sort(unique(DF2$C)) f - as.data.frame(embed(years, 3)) g-lapply(split(f, f[, 1]), e) h-lapply(g, function (x) ifelse(x0,1,0)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Sorry, I am using the devel version of zoo which allows you to use the partial argument. The correct code is given below. My error. I didn't have the latest version installed. I didn't get your suggestion to work. 
If I understand what you are trying to do (multiplying c and h), this is likely to give the wrong results because h contains values of 0. Since I am ultimately interested in the values of the split matrices in c (based on the original matrices in c), this will probable not work. Or am I just not understanding you? I'm not doing any multiplication. I just applied your extraction [h[[year]][ ,var] 0, h[[year]][var, ] 0] to the c list rather than the h list. You say you didn't get it to work. Did you get an error message? Or did it run, but not give you the values you wanted? Or ... ? Jean Thanks! # devel version of zoo install.packages(zoo, repos = http://r-forge.r-project.org
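The fix reported at the top of this message (c[[year]] instead of c[[t]]) comes down to indexing a list by name rather than by position. The toy sketch below assumes, as the error output in the thread suggests, that the distance list has one entry for each of five years while h only covers 1997 to 1999; the list contents and names are invented.

```r
# Stand-in for list c: one entry per year, 1995 to 1999
dist_by_year <- list(`1995` = "d1995", `1996` = "d1996", `1997` = "d1997",
                     `1998` = "d1998", `1999` = "d1999")

# Stand-in for names(h): only the last three years
years <- c("1997", "1998", "1999")

# Positional index t = 1, 2, 3 picks the FIRST three entries
by_pos <- sapply(seq_along(years), function(t) dist_by_year[[t]])

# Indexing by name picks the entries that actually match the years
by_name <- sapply(years, function(y) dist_by_year[[y]], USE.NAMES = FALSE)

print(by_pos)
print(by_name)
```

With positional indexing the 1997 selection would silently be applied to the 1995 matrix, which is why the loop ran without error but gave unexpected values.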
Re: [R] Selecting section of matrix
That worked great, thanks! Now that I have created list h (see below), I would like to use the selections made in h to make new selections in list c (see below). List c needs to get the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) looks like this: $`1997`$`8026` B B 8025 8026 8029 8025 1.000 0.7739527 0.9656091 8026 0.7739527 1.000 0.7202771 8029 0.9656091 0.7202771 1.000 Thank you very much for your help! library(zoo) DF1 = data.frame(read.table(textConnection(B C D E F G 8025 1995 0 4 1 2 8025 1997 1 1 3 4 8026 1995 0 7 0 0 8026 1996 1 2 3 0 8026 1997 1 2 3 1 8026 1998 6 0 0 4 8026 1999 3 7 0 3 8027 1997 1 2 3 9 8027 1998 1 2 3 1 8027 1999 6 0 0 2 8028 1999 3 7 0 0 8029 1995 0 2 3 3 8029 1998 1 2 3 2 8029 1999 6 0 0 1),head=TRUE,stringsAsFactors=FALSE)) a - read.zoo(DF1, split = 1, index = 2, FUN = identity) sum.na - function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA b - rollapply(a, 3, sum.na, align = right, partial = TRUE) newDF - lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) names(newDF) - time(a) c-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2 DF2 = data.frame(read.table(textConnection( A B C 80 8025 1995 80 8026 1995 80 8029 1995 81 8026 1996 82 8025 1997 82 8026 1997 83 8025 1997 83 8027 1997 90 8026 1998 90 8027 1998 90 8029 1998 84 8026 1999 84 8027 1999 85 8028 1999 85 8029 1999),head=TRUE,stringsAsFactors=FALSE)) e - function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) years - sort(unique(DF2$C)) f - as.data.frame(embed(years, 3)) g-lapply(split(f, f[, 1]), e) h-lapply(g, function (x) ifelse(x0,1,0)) years-c(1997:1999) for (t in 1:length(years)) { year=as.character(years[t]) h[[year]]-sapply(colnames(h[[year]]), function(var) h[[year]][h[[year]][,var]0, h[[year]][var,]0]) } David Winsemius wrote: On Aug 15, 2011, at 6:09 AM, mdvaan wrote: Hi, I have a question concerning the selection of data. 
Let's say that given list h created below, I would like to select a section of the 1999 matrix. For a case (rownames and colnames) I would like to select the cells that have a value 0. So for case 8025 8025 8026 8027 8025111 8026111 8027111 tst - h$`1999` tst[tst[,8025]0, tst[8025,]0] B B 8025 8026 8027 8025111 8026111 8027111 And for case 8028 8028 8029 802811 802911 tst[tst[,8028]0, tst[8028,]0] B B 8028 8029 802811 802911 And to do it programmatically: sapply( colnames(tst), function(var) tst[tst[,var]0, tst[var,]0]) -- David. DF2 = data.frame(read.table(textConnection( A B C 80 8025 1995 80 8026 1995 80 8029 1995 81 8026 1996 82 8025 1997 82 8026 1997 83 8025 1997 83 8027 1997 90 8026 1998 90 8027 1998 90 8029 1998 84 8026 1999 84 8027 1999 85 8028 1999 85 8029 1999),head=TRUE,stringsAsFactors=FALSE)) e - function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) years - sort(unique(DF2$C)) f - as.data.frame(embed(years, 3)) g-lapply(split(f, f[, 1]), e) h-lapply(g, function (x) ifelse(x0,1,0))# These are the adjacency matrices per year h Thanks very much! -- View this message in context: http://r.789695.n4.nabble.com/Selecting-section-of-matrix-tp3744570p3744570.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/Selecting-section-of-matrix-tp3744570p3750246.html Sent from the R help mailing list archive at Nabble.com. 
[R] Selecting section of matrix
Hi, I have a question concerning the selection of data. Let's say that given list h created below, I would like to select a section of the 1999 matrix. For a case (rownames and colnames) I would like to select the cells that have a value > 0.

So for case 8025

     8025 8026 8027
8025    1    1    1
8026    1    1    1
8027    1    1    1

And for case 8028

     8028 8029
8028    1    1
8029    1    1

DF2 = data.frame(read.table(textConnection(" A    B    C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x>0,1,0)) # These are the adjacency matrices per year
h

Thanks very much!

-- View this message in context: http://r.789695.n4.nabble.com/Selecting-section-of-matrix-tp3744570p3744570.html
[R] Adding objects to a list
# Hi list,
# From the code below I get two list objects (n$values and n$vectors):
dat <- matrix(1:9, 3)
n <- eigen(dat)
n
# How do I add another object to n that replicates n$vectors and is called n$vectors$test?
# Thanks a lot!

-- View this message in context: http://r.789695.n4.nabble.com/Adding-objects-to-a-list-tp3610821p3610821.html
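One possible way to do this, sketched under the assumption that either a sibling element or a nested list is acceptable. Since n$vectors is a matrix rather than a list, it cannot hold a `$test` element while staying a matrix; the names `test`, `original` and `n2` below are illustrative choices, not part of the question.

```r
dat <- matrix(1:9, 3)
n <- eigen(dat)

# Option 1: add the copy as a new top-level element of the list.
n$test <- n$vectors

# Option 2: if the nested name n$vectors$test is really needed,
# turn n$vectors into a list that holds both the original and the copy.
n2 <- eigen(dat)
n2$vectors <- list(original = n2$vectors, test = n2$vectors)

names(n)          # now includes "test"
names(n2$vectors) # "original" "test"
```

Option 2 changes the type of n2$vectors, so any later code expecting a matrix there would need n2$vectors$original instead.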
Re: [R] Multiply list objects
I am still thinking about this problem. The solution could look something like this (it's not yet working):

k <- lapply(h, function (x) x*0) # I keep the same format as h, but set all values to 0
years <- c(1997:1999) # I define the years
for (t in 1:length(years)) {
  year = as.character(years[t])
  ids = rownames(h[year][[1]])
}
for (m in 1:length(relevant_firms)) {
  k[year][[m]] <- lapply(k[year], function (col)
    k[year][[1]][,m] = h[year][[1]][,m]
    k[year][[1]][m,] = h[year][[1]][m,])
}
# I am creating new list objects that should look like this k$'1999'$'8029' and I replace the values in the 8029 column and row by the original ones in h

Any takes on this problem? Thank you very much!

Best

-- View this message in context: http://r.789695.n4.nabble.com/Multiply-list-objects-tp3595719p3603871.html
[R] Multiply list objects
Hi, I am trying to use the objects from the list below to create more objects. For each year in h I am trying to create as many objects as there are B's, keeping only the values of B. Example for 1999:

$`1999`$`8025`
      B
B      8025 8026 8027 8028 8029
  8025    1    1    1    0    0
  8026    1    0    0    0    0
  8027    1    0    0    0    0
  8028    0    0    0    0    0
  8029    0    0    0    0    0

$`1999`$`8026`
      B
B      8025 8026 8027 8028 8029
  8025    0    1    0    0    0
  8026    1    1    1    0    1
  8027    0    1    0    0    0
  8028    0    0    0    0    0
  8029    0    1    0    0    0

$`1999`$`8027`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    1    0    0
  8026    0    0    1    0    0
  8027    1    1    1    0    1
  8028    0    0    0    0    0
  8029    0    0    1    0    0

$`1999`$`8028`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    0    0    0
  8026    0    0    0    0    0
  8027    0    0    0    0    0
  8028    0    0    0    1    1
  8029    0    0    0    1    0

$`1999`$`8029`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    0    0    0
  8026    0    0    0    0    1
  8027    0    0    0    0    1
  8028    0    0    0    0    1
  8029    0    1    1    1    1

Any suggestions? Your help is very much appreciated!

DF = data.frame(read.table(textConnection(" A    B    C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF[DF$C %in% y, 1:2]))
years <- sort(unique(DF$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x>0,1,0))

-- View this message in context: http://r.789695.n4.nabble.com/Multiply-list-objects-tp3595719p3595719.html
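A minimal sketch of one way to build the per-case matrices shown in the example: for each case, start from a zero matrix and copy back only that case's row and column. The helper `keep_cross()` and the result name `k` are illustrative, and `unlist(y)` is added here only to make the year lookup explicit; neither comes from the post.

```r
DF <- read.table(text = "
A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999", header = TRUE)

# Adjacency matrices per 3-year window, as in the post.
e <- function(y) crossprod(table(DF[DF$C %in% unlist(y), 1:2]))
years <- sort(unique(DF$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function(x) ifelse(x > 0, 1, 0))

# Zero out everything except the row and column of one case.
keep_cross <- function(m, id) {
  out <- m * 0
  out[id, ] <- m[id, ]
  out[, id] <- m[, id]
  out
}

# k[[year]][[case]] is the matrix for that case in that year.
k <- lapply(h, function(m) {
  ids <- rownames(m)
  lapply(setNames(ids, ids), function(id) keep_cross(m, id))
})

k$`1999`$`8028`
```

Indexing by the character id rather than by position keeps the lookup independent of how the cases are ordered within each year.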
Re: [R] Divide matrix into multiple smaller matrices
Hi, I still haven't found a solution for this problem. Is there a way in which I can slice the objects in c based on the info in h? Thanks a lot!

-- View this message in context: http://r.789695.n4.nabble.com/Divide-matrix-into-multiple-smaller-matrices-tp3552399p3591868.html
Re: [R] Counting occurrences in a moving window
Would it be possible to use the sqldf package and the ave function to simply run ave over a limited set? So something like:

DF = data.frame(read.table(textConnection("   A    B
8025 1995
8026 1995
8029 1995
8026 1996
8025 1997
8026 1997
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998"),head=TRUE,stringsAsFactors=FALSE))

library(sqldf)
years <- c(1995:1999)
for (t in 1:length(years)) {
  year = as.numeric(years[t])
  m <- sqldf('select * from DF where B between $year-1 AND $year-4')
  n <- ave(m$A, m$A, FUN = length)
}

How do I get the correct values in DF$C? Thanks!!

-- View this message in context: http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3570652.html
Re: [R] Counting occurrences in a moving window
Thank you very much! I really liked the first solution, it worked great for my larger dataset.

M

Gabor Grothendieck wrote:
> On Fri, Jun 3, 2011 at 8:11 AM, mdvaan <mathijsdev...@gmail.com> wrote:
> > Would it be possible to use the sqldf package and the ave function to simply run ave over a limited set? So something like:
> >
> > DF = data.frame(read.table(textConnection("   A    B
> > 8025 1995
> > 8026 1995
> > 8029 1995
> > 8026 1996
> > 8025 1997
> > 8026 1997
> > 8025 1997
> > 8027 1997
> > 8026 1999
> > 8027 1999
> > 8028 1995
> > 8029 1998
> > 8025 1997
> > 8027 1997
> > 8026 1999
> > 8027 1999
> > 8028 1995
> > 8029 1998"),head=TRUE,stringsAsFactors=FALSE))
> >
> > library(sqldf)
> > years <- c(1995:1999)
> > for (t in 1:length(years)) {
> >   year = as.numeric(years[t])
> >   m <- sqldf('select * from DF where B between $year-1 AND $year-4')
> >   n <- ave(m$A, m$A, FUN = length)
> > }
> >
> > How do I get the correct values in DF$C? Thanks!!
>
> In sqldf it would be like this:
>
> sqldf("select x.*, sum(x.A = y.A and y.B < x.B and y.B >= x.B-3) C from DF x, DF y group by x.rowid")
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

-- View this message in context: http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3571916.html
[R] Counting occurrences in a moving window
Hi list, based on the following data.frame I would like to create a variable that indicates the number of occurrences of A in the 3 years prior to the current year:

DF = data.frame(read.table(textConnection("   A    B
8025 1995
8026 1995
8029 1995
8026 1996
8025 1997
8026 1997
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998
8025 1997
8027 1997
8026 1999
8027 1999
8028 1995
8029 1998"),head=TRUE,stringsAsFactors=FALSE))

becomes:

   A    B C
8025 1995 0
8026 1995 0
8029 1995 0
8026 1996 1
8025 1997 1
8026 1997 2
8025 1997 1
8027 1997 0
8026 1999 2
8027 1999 2
8028 1995 0
8029 1998 1
8025 1997 1
8027 1997 0
8026 1999 2
8027 1999 2
8028 1995 0
8029 1998 1

So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996, which are both within the appropriate window (1994 to 1996).

Any ideas? I looked at the rollapply vignette, but couldn't figure out how to apply it to my data. Thanks a lot!

-- View this message in context: http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3568658.html
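A base R sketch of the count described above, assuming the window for a row with year B covers B-3 up to B-1 and that every earlier row with the same A counts once. It compares each row against all rows, so it is quadratic in the number of rows, which is fine for small data but may be slow for large tables.

```r
DF <- data.frame(
  A = c(8025, 8026, 8029, 8026, 8025, 8026, 8025, 8027, 8026,
        8027, 8028, 8029, 8025, 8027, 8026, 8027, 8028, 8029),
  B = c(1995, 1995, 1995, 1996, 1997, 1997, 1997, 1997, 1999,
        1999, 1995, 1998, 1997, 1997, 1999, 1999, 1995, 1998)
)

# For row i, count the rows with the same A whose year lies in
# the three years before row i's year (B[i] - 3 up to B[i] - 1).
DF$C <- vapply(seq_len(nrow(DF)), function(i) {
  sum(DF$A == DF$A[i] & DF$B < DF$B[i] & DF$B >= DF$B[i] - 3)
}, integer(1))

DF
```

The strict `<` on the upper bound is what excludes same-year duplicates, such as the repeated 8025 rows in 1997.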
[R] Divide matrix into multiple smaller matrices
Hi list, Using the script below, I have generated two lists (c and h) containing yearly matrices. Now I would like to divide the matrices in c into multiple matrices based on h. The number of matrices should be equal to: length(unique(DF1$B))*length(h). So each unique value in DF1$B gets a yearly matrix. Each matrix should contain all values from c where element cij is 1. An example for DF1$B = 8025 in 1999:

           8025       8026       8027
8025 0.00000000 0.27547644 0.06905066
8026 0.27547644 0.00000000 0.10499739
8027 0.06905066 0.10499739 0.00000000

Any ideas on how to tackle this problem? Thanks a lot!

library(zoo)
DF1 = data.frame(read.table(textConnection("   B    C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1"),head=TRUE,stringsAsFactors=FALSE))
# Where Column B represents the cases, C is the year and D-G are the types of knowledge units covered
a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i) prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
c <- lapply(c, function (x) 1-x)
c <- lapply(c, function (x) ifelse(x<0.00111, 0, x)) # These are the yearly distance matrices for a 4 year moving window

DF2 = data.frame(read.table(textConnection(" A    B    C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999"),head=TRUE,stringsAsFactors=FALSE))
e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x>0,1,0)) # These are the adjacency matrices per year

-- View this message in context: http://r.789695.n4.nabble.com/Divide-matrix-into-multiple-smaller-matrices-tp3552399p3552399.html
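For readers unsure what the `tcrossprod(mat / sqrt(rowSums(mat^2)))` step in the script computes, here is a small worked sketch on a made-up matrix. The matrix `mat`, its values and the names `unit`, `sim` and `d` are illustrative assumptions; only the formula and the 0.00111 threshold are taken from the script.

```r
# Made-up profile matrix: rows are cases, columns are knowledge types.
mat <- matrix(c(1, 2, 3, 4,
                2, 4, 6, 8,
                4, 0, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("D", "E", "F", "G")))

# Dividing by the row norms scales every row to length 1
# (the vector is recycled down the columns, so row i uses norm i).
unit <- mat / sqrt(rowSums(mat^2))

# The cross-product of unit rows is the cosine similarity between cases.
sim <- tcrossprod(unit)

# Distance as in the script: 1 - similarity, with tiny values set to 0.
d <- 1 - sim
d <- ifelse(d < 0.00111, 0, d)

round(sim, 4)
d
```

Rows a and b point in the same direction, so their similarity is 1 and their distance is 0, while the diagonal of `d` is 0 by construction.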