[R] Accumulate objects in list after try()

2012-11-26 Thread mdvaan
Hi,

I have written a function harvest and I would like to run the function for
each value in a vector c(1:1000). The function returns 4 list objects
(obj_1, obj_2, obj_3, obj_4) using the following code at the end of the
function: return(list(obj_1 = obj_1, obj_2 = obj_2, obj_3 = obj_3, obj_4 =
obj_4)).

Since I am connecting with the web in the function and the connection
sometimes fails causing errors to occur, I invoke the function as follows:

for(i in 1:1000){
  result <- try(harvest(i));
  if(class(result) == "try-error") next;
  }

Everything works well except for the fact that result only stores obj_1,
obj_2, obj_3, obj_4 for the last i in the loop. How do I store obj_1,
obj_2, obj_3, obj_4 for the first i in the first 4 elements of result, the
objects for the second i in the next 4 elements, etc?

Thank you very much.  
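
Note: one possible way to keep every successful result is to pre-allocate a
list and assign into it by index inside the loop. The sketch below is not from
the thread. harvest() here is a hypothetical stand-in that fails for some
inputs, and n, results, ok and flat are illustrative names.

```r
# Hypothetical stand-in for the real harvest(): fails on every third call
harvest <- function(i) {
  if (i %% 3 == 0) stop("connection failed")
  list(obj_1 = i, obj_2 = i * 2, obj_3 = i * 3, obj_4 = i * 4)
}

n <- 10                          # 1000 in the original question
results <- vector("list", n)     # one slot per value of i

for (i in seq_len(n)) {
  res <- try(harvest(i), silent = TRUE)
  if (inherits(res, "try-error")) next   # slot i stays NULL
  results[[i]] <- res
}

# Keep only the iterations that succeeded
ok <- Filter(Negate(is.null), results)

# Optional: one flat list, 4 objects per successful i, in loop order
flat <- unlist(ok, recursive = FALSE)
```

Keeping one list element per i (rather than a flat list) makes it easy to see
which values of i failed, since their slots in results remain NULL.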





--
View this message in context: 
http://r.789695.n4.nabble.com/Accumulate-objects-in-list-after-try-tp4650927.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reduce(paste, x) question

2012-11-02 Thread mdvaan
Thanks Arun, this helps a lot!



--
View this message in context: 
http://r.789695.n4.nabble.com/Reduce-paste-x-question-tp4648151p4648228.html


[R] Reduce(paste, x) question

2012-11-01 Thread mdvaan
I have a question about the Reduce function: 

x <- list()
x[[1]] <- LETTERS[1:5]
x[[2]] <- LETTERS[11:15]
Reduce(paste, x)
[1] "A K" "B L" "C M" "D N" "E O"

How do I get this?:
[1] A K 
[2] B L
[3] C M
[4] D N 
[5] E O 

Thanks for your help!



--
View this message in context: 
http://r.789695.n4.nabble.com/Reduce-paste-x-question-tp4648151.html


Re: [R] Reduce(paste, x) question

2012-11-01 Thread mdvaan
I should have been more specific:

y <- list()
a <- c("A", "K")
b <- c("B", "L")
c <- c("C", "M")
d <- c("D", "N")
e <- c("E", "O")

y[[1]] <- a
y[[2]] <- b
y[[3]] <- c
y[[4]] <- d
y[[5]] <- e

y
[[1]]
[1] "A" "K"

[[2]]
[1] "B" "L"

[[3]]
[1] "C" "M"

[[4]]
[1] "D" "N"

[[5]]
[1] "E" "O"

How do I get a list object like y (each element of y is a vector of strings)
from:

x[[1]] <- LETTERS[1:5]
x[[2]] <- LETTERS[11:15]

using only the Reduce function?

Thanks!
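
Note: a minimal sketch of one way to get there, not taken from the replies in
this thread. It relaxes "only Reduce" slightly by using Map(c, a, b) as the
combining function, and unname() to drop the names that Map adds.

```r
x <- list(LETTERS[1:5], LETTERS[11:15])

# Reduce folds the list pairwise; Map(c, a, b) joins the i-th elements
# of a and b into one character vector per position
y <- unname(Reduce(function(a, b) Map(c, a, b), x))

str(y)
```

The same call should also work when x holds more than two vectors of equal
length, since each further vector is appended element by element.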



--
View this message in context: 
http://r.789695.n4.nabble.com/Reduce-paste-x-question-tp4648151p4648169.html


Re: [R] strapply and characters adjacent to the matched pattern

2012-07-26 Thread mdvaan
Thanks Gabor for your invaluable help! I learned a lot.



--
View this message in context: 
http://r.789695.n4.nabble.com/strapply-and-characters-adjacent-to-the-matched-pattern-tp4637673p4637939.html


Re: [R] strapply and characters adjacent to the matched pattern

2012-07-25 Thread mdvaan
Thanks Gabor. That worked really well. I have been reading about the use of
POSIX and regular expressions and I tried to use your example to see if I
could ignore all matches in which the character preceding (rather than
following) the match is one of [:alpha:]. So far, I have been unsuccessful.
Could anyone help me out here or direct me to a source that explains the
combined use of POSIX and regular expressions? Thanks!

require(gsubfn)
# this only checks for the characters following the match and therefore
# also matches the third element
# however I want it to match only the 2nd, 5th and 6th elements
strapply(c("abc", "ab", "abdef", "defc", "def", " def "),
"(def|ab)($|[^[[:alpha:]])")

The outcome should look like this:
[[1]]
NULL

[[2]]
[1] ab

[[3]]
NULL

[[4]]
NULL

[[5]]
[1] def

[[6]]
[1] def
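
Note: a hedged sketch of one way to also check the preceding character. It
swaps strapply for base R (gregexpr and regmatches with perl = TRUE) so that
lookbehind and lookahead assertions can be used. This is an assumption about
what would fit, not the approach given in the thread, and non-matches come
back as character(0) rather than NULL.

```r
x <- c("abc", "ab", "abdef", "defc", "def", " def ")

# (?<![[:alpha:]])  no letter immediately before the match
# (?![[:alpha:]])   no letter immediately after the match
re <- "(?<![[:alpha:]])(def|ab)(?![[:alpha:]])"

m <- regmatches(x, gregexpr(re, x, perl = TRUE))

# Number of matches per element
lens <- vapply(m, length, integer(1))
```

With this pattern only the 2nd, 5th and 6th elements yield a match.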



Gabor Grothendieck wrote
 


 
 Try this:
 
 strapply(c("abc", "ab", "ab def"), "(ab|d)($|[^[[:alpha:]])")
 [[1]]
 NULL
 
 [[2]]
 [1] ab
 
 [[3]]
 [1] ab
 
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 



--
View this message in context: 
http://r.789695.n4.nabble.com/strapply-and-characters-adjacent-to-the-matched-pattern-tp4637673p4637835.html


[R] strapply and characters adjacent to the matched pattern

2012-07-24 Thread mdvaan
Hi,

In the example below, one of the searched patterns, "SE", is matched in the
word "second". I would like to ignore all matches in which the character
following the match is one of [:alpha:]. How do I do this without removing
the ignore.case = T argument of the strapply function? Thank you very
much!

# load library
require(gsubfn)
# read in data
data <- c("Santa Fe Gold Corp|Starpharma Holdings|SE")
# define the object to be searched
text <- c("the first is Santa Fe Gold Corp",
"the second is Starpharma Holdings")
# match
strapply(text, data, ignore.case = T)

The preferred outcome would be:

[[1]]
[1] Santa Fe Gold Corp

[[2]]
[1] Starpharma Holdings

instead of:

[[1]]
[1] Santa Fe Gold Corp

[[2]]
[1] se  Starpharma Holdings
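
Note: a minimal sketch of the idea of rejecting matches that are followed by a
letter while staying case-insensitive. It uses base R with perl = TRUE and a
negative lookahead instead of strapply, which is a substitution made here for
illustration. The names pattern and hits are not from the post.

```r
text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings")

# (?![[:alpha:]]) rejects any match that is immediately followed by a letter,
# so "se" inside "second" is skipped
pattern <- "(Santa Fe Gold Corp|Starpharma Holdings|SE)(?![[:alpha:]])"

hits <- regmatches(text,
                   gregexpr(pattern, text, ignore.case = TRUE, perl = TRUE))
```

A match at the very end of a string is still accepted, because the lookahead
only fails when a letter actually follows.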






--
View this message in context: 
http://r.789695.n4.nabble.com/strapply-and-characters-adjacent-to-the-matched-pattern-tp4637673.html


Re: [R] Maximum number of patterns and speed in grep

2012-07-23 Thread mdvaan
Hi,

I have a minor follow-up question:

In the example below, "ann" and "nn" in the third element of text are
matched. I would like to ignore all matches in which the character following
the match is one of [:alpha:]. How do I do this without removing the
ignore.case = TRUE argument of the strapply function?

So the output should be:

[[1]]
[1] Santa Fe Gold Corp

[[2]]
[1] Starpharma Holdings

[[3]]
NULL

Rather than:

[[1]]
[1] Santa Fe Gold Corp

[[2]]
[1] Starpharma Holdings

[[3]]
[1] ann nn

Thanks!


require(gsubfn)

# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = T,
sep = ",")

# define the object to be searched
text <- c("the first is Santa Fe Gold Corp",
"the second is Starpharma Holdings",
"the annual earnings exceed those of last year")

k <- 3000 # chunk size

f <- function(from, text) {
  to <- min(from + k - 1, nrow(data))
  r <- paste(data[seq(from, to), 1], collapse = "|")
  r <- gsub("[().*?+{}]", "", r)
  strapply(text, r, ignore.case = TRUE)
}
ix <- seq(1, nrow(data), k)
out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
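
Note: a hedged sketch of how the chunked loop above might be combined with a
"not followed by a letter" check. It uses base R (gregexpr with perl = TRUE)
rather than strapply, and a small hypothetical patterns vector in place of the
first column of data.csv. The names patterns, match_chunk and res are
illustrative only.

```r
# Hypothetical stand-in for data[, 1]
patterns <- c("Santa Fe Gold Corp", "Starpharma Holdings", "ann", "nn")

text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings",
          "the annual earnings exceed those of last year")

k <- 2  # chunk size (3000 in the thread)

match_chunk <- function(from, txt) {
  to <- min(from + k - 1, length(patterns))
  alt <- paste(patterns[from:to], collapse = "|")
  # Wrap the alternation and reject matches followed by a letter
  re <- paste0("(", alt, ")(?![[:alpha:]])")
  unlist(regmatches(txt, gregexpr(re, txt, ignore.case = TRUE, perl = TRUE)))
}

ix <- seq(1, length(patterns), by = k)
res <- lapply(text, function(txt) unlist(lapply(ix, match_chunk, txt)))
```

Here the third element of res is empty, because "ann" and "nn" are only found
inside "annual", where a letter follows.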



--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4637458.html


Re: [R] Maximum number of patterns and speed in grep

2012-07-16 Thread mdvaan
Thanks! That worked like a charm.

Math


Gabor Grothendieck wrote
 

 
 Although it seems that strapplyc can handle larger regular expressions
 than grep in R, it seems neither can handle as many as in your example,
 so process it in chunks:
 
 k <- 3000 # chunk size
 
 f <- function(from, text) {
   to <- min(from + k - 1, nrow(data))
   r <- paste(data[seq(from, to), 1], collapse = "|")
   r <- gsub("[().*?+{}]", "", r)
   strapply(text, r)
 }
 ix <- seq(1, nrow(data), k)
 out <- lapply(text, function(text) unlist(lapply(ix, f, text)))
 
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4636657.html


[R] Finding and manipulation clusters of numbers in a sequence of numbers

2012-07-16 Thread mdvaan
Hi,

I have the following sequence:
input <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 2)

From this sequence I would like to get to the following sequence:
out <- c(0, 0, 0, 3, 3, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,
0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 0, 2, 0, 2, 0, 0, 2)

Basically, for each number greater than 0, I would like to add all adjacent
numbers, and the adjacent numbers of those numbers, etc., until a 0 is
reached, so that every run of non-zero numbers is replaced by its sum.

I could manually repeat the loops below until sequence stops changing but
there must be a smarter way. Any suggestions? Thanks!

sequence <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0,
0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 2)
for (h in 2:(length(sequence) - 1))
  {
  sequence[h] <- ifelse(sequence[h] > 0, sequence[h-1] + sequence[h], 0)
  }

for (h in 1:(length(sequence) - 1))
  {
  sequence[h] <- ifelse(sequence[h] > 0 & sequence[h+1] > sequence[h],
sequence[h+1], sequence[h])
  }
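
Note: a minimal sketch of a loop-free alternative, assuming the goal is to
replace every run of consecutive non-zero values by the sum of that run. The
names x, grp and res are illustrative and not from the post.

```r
x <- c(0, 0, 0, 2, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0,
       0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 0, 2, 0, 0, 2)

# Give each run of zeros / non-zeros its own id: a new run starts
# wherever (x > 0) changes value
grp <- cumsum(c(TRUE, diff(x > 0) != 0))

# Replace every element by the sum of its run; runs of zeros sum to 0
res <- ave(x, grp, FUN = sum)
```

Because ave() returns a vector of the same length as x, res lines up
position by position with the input.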

--
View this message in context: 
http://r.789695.n4.nabble.com/Finding-and-manipulation-clusters-of-numbers-in-a-sequence-of-numbers-tp4636661.html


Re: [R] Maximum number of patterns and speed in grep

2012-07-15 Thread mdvaan
Here's some data (which should give you the error messages):

# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header =
T, sep = ",")

# first paste all data
data1 <- paste(data[,1], collapse = "|")

# second paste subsets of the data
data2a <- paste(data[1:750,1], collapse = "|")
data2b <- paste(data[751:1500,1], collapse = "|")

# define the object to be searched
text <- c("the first is Santa Fe Gold Corp",
"the second is Starpharma Holdings")

# match
strapplyc(text, data1)
strapplyc(text, data2a)
strapplyc(text, data2b)

Thanks in advance!

Math



Gabor Grothendieck wrote
 

 __
 R-help@ mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 Note part on last line about posting reproducible code.
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4636472.html


Re: [R] Maximum number of patterns and speed in grep

2012-07-13 Thread mdvaan
Thanks, I see that it is working in the sample data. My data, however, gives
me an error message: 

data <- strapplyc(text, batch[[l]])
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class =
"tclObj") :
  [tcl] couldn't compile regular expression pattern: parentheses () not
balanced.

batch[[l]] is similar to your re string except that there is a larger
variety of characters. I haven't been able to figure out which characters
are causing trouble here. Any thoughts?
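
Note: an error like "parentheses () not balanced" is what one would expect
when a name containing a regex metacharacter such as "(" is pasted into the
pattern unescaped. A minimal sketch of escaping the names before joining them
is shown below. escape_regex is a hypothetical helper written for this
example, and the company names are made up.

```r
# Put a backslash in front of every regex metacharacter
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Made-up names, two of which contain metacharacters
names_raw <- c("Acme (US) Inc.", "A+B Holdings", "Santa Fe Gold Corp")

re <- paste(escape_regex(names_raw), collapse = "|")

text <- c("shares of Acme (US) Inc. rose", "A+B Holdings fell", "nothing")
found <- grepl(re, text)
```

Escaping keeps the characters in the pattern, whereas deleting them (for
example with gsub) changes what the names match.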

Thank you very much.

Math 




Gabor Grothendieck wrote
 
 
 Try strapplyc in gsubfn and see
   http://gsubfn.googlecode.com
 for more info.
 
 # test data
 x <- c("abcd", "z", "dbef")
 
 # re is regexp with 7700 alternatives
 #  to test with
 g <- expand.grid(letters, letters, letters)
 gp <- do.call(paste0, g)
 gp7700 <- head(gp, 7700)
 re <- paste(gp7700, collapse = "|")
 
 # grep gives error message
 grep.out <- grep(re, x)
 
 # strapplyc works
 library(gsubfn)
 which(sapply(strapplyc(x, re), length) > 0)
 
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4636437.html


[R] Maximum number of patterns and speed in grep

2012-07-06 Thread mdvaan
Hi,

I am using R's grep function to find patterns in vectors of strings. The
number of patterns I would like to match is 7,700 (of different sizes). I
noticed that I get an error message when I do the following: 

data <- array()
for (j in 1:length(x))
{
array[j] <- length(grep(paste(patterns[1:7700], collapse = "|"),  x[j],
value = T))
}

When I break this up into 4 chunks of patterns it works:

data <- array()
for (j in 1:length(x))
{
array$chunk1[j] <- length(grep(paste(patterns[1:2500], collapse = "|"),
x[j], value = T))
array$chunk1[j] <- length(grep(paste(patterns[2501:5000], collapse = "|"),
x[j], value = T))
array$chunk1[j] <- length(grep(paste(patterns[5001:7500], collapse = "|"),
x[j], value = T))
array$chunk1[j] <- length(grep(paste(patterns[7501:7700], collapse = "|"),
x[j], value = T))
}

My questions: what's the maximum size of the patterns argument in grep? Is
there a way to do this faster? It is very slow.

Thanks.

Math

Sorry for not providing a reproducible example. It's a size issue which
makes it difficult to provide an example.

 

--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613.html


Re: [R] Maximum number of patterns and speed in grep

2012-07-06 Thread mdvaan
Thanks for the quick response. I should phrase my question differently
because everything is working fine, I am just trying to find a more
efficient approach:

1. What's the maximum size of the patterns argument in grep? Can't find it
online. 
2. I am trying to match 7,700 character strings to about 10,000 vectors each
containing about 5,000 strings using grep. Is there a way to do this faster?
It is very slow. 
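
Note: if the patterns are literal strings rather than true regular
expressions, one option to try is matching each pattern with fixed = TRUE
instead of building one very large alternation. This is only a sketch under
that assumption. Whether it is actually faster on the real data would need to
be measured, and patterns, x, hits and counts are illustrative names with
made-up values.

```r
patterns <- c("Santa Fe Gold Corp", "Starpharma Holdings", "a.b")
x <- c("the first is Santa Fe Gold Corp",
       "nothing here",
       "a.b and Starpharma Holdings")

# One column per pattern, one row per string; fixed = TRUE treats each
# pattern as a literal substring, so no regex compilation limit applies
hits <- vapply(patterns,
               function(p) grepl(p, x, fixed = TRUE),
               logical(length(x)))

# Number of patterns found in each string
counts <- rowSums(hits)
```

With fixed = TRUE there is also no need to escape characters such as "." or
"(" in the patterns.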

Thanks


Sarah Goslee wrote
 
 Hi,
 
 Given that you can't provide a full example, please at least provide
 str() on your data, more complete information on the problem, and
 ideally a small toy example that demonstrates precisely what you are
 doing.
 
 For instance, you tell us that you get an error message but you
 never tell us what it is. Don't you think we might need to know what
 the error is to be able to diagnose and fix it?
 
 Also, note that your working example simply overwrites
 array$chunk1[j] four times.
 
 Sarah
 

 
 
 -- 
 Sarah Goslee
 http://www.functionaldiversity.org
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Maximum-number-of-patterns-and-speed-in-grep-tp4635613p4635626.html


Re: [R] Fw: Extract upper case letters

2012-06-29 Thread mdvaan
Thanks, all solutions worked like a charm.

Math


arun kirshna wrote
 
 - Forwarded Message -
 From: arun <smartpink111@>
 To: mdvaan <mathijsdevaan@>
 Cc: R help <r-help@>
 Sent: Thursday, June 28, 2012 1:44 AM
 Subject: Re: [R] Extract upper case letters
 
 Hi,
 
 Try this:
 t <- "TheWeatherIsVeryNice"
  t1<-gsub("[:A-Z:]","",t)
  t1
 [1] "heeatherseryice"
 
 
 t2<-gsub("[:a-z:]","",t)
 t2
 [1] "TWIVN"
 
 
 A.K.
 
 
 
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Extract-upper-case-letters-tp4634664p4634802.html


[R] Extract upper case letters

2012-06-27 Thread mdvaan
t <- "TheWeatherIsVeryNice"

How do I extract the upper case letters? -> TWIVN

Thanks!
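
Note: a minimal sketch of one way to do this with base R, by deleting every
character that is not an upper case letter. The variable names txt and upper
are illustrative (txt is used instead of t to avoid masking the base function
t()).

```r
txt <- "TheWeatherIsVeryNice"

# [^[:upper:]] matches any character that is not an upper case letter
upper <- gsub("[^[:upper:]]", "", txt)

upper
```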

--
View this message in context: 
http://r.789695.n4.nabble.com/Extract-upper-case-letters-tp4634664.html


Re: [R] How to set cookies in RCurl

2012-06-21 Thread mdvaan
Hi,

I am using Prof. Temple Lang's suggestions and I think I should be close, but
with the code below I get an error message which I don't fully understand. Any
suggestions? Thanks!

Math

library(RCurl)
library(XML)
setwd("C:/Comments")
url <-
getURLContent("http://www.scopus.com/results/results.url?sort=plf-f&src=s&sid=M8RcnaPRBgrtA1r_EvZtL7j%3a70&sot=a&sdt=a&sl=32&s=PMID%2811693556%29+OR+PMID%2812239288%29&origin=searchadvanced&txGid=M8RcnaPRBgrtA1r_EvZtL7j%3a7",
options(RCurlOptions = list(proxy = "127.0.0.1:2048", proxyuserpwd =
"username:password", proxyauth = "gci")),  cookiefile = "/Rcookies")

Error in curlOptions(..., .opts = .opts) :
  unnamed curl option(s): list(RCurlOptions = list(proxy = "127.0.0.1:2048",
proxyuserpwd = "username:password", proxyauth = "gci"))
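
Note: the error message says that an unnamed argument was passed where named
curl options are expected. In the call above, options(...) is evaluated and
its result is handed to getURLContent() as an unnamed argument. A hedged
sketch of passing the options by name through .opts is shown below. The proxy
values are the placeholders from the post, the URL is the placeholder used
elsewhere in this thread, fetch is a hypothetical helper, and whether these
proxy settings suit an EZproxy setup is not established here.

```r
# Curl options as a plain named list (values are placeholders)
opts <- list(
  cookiefile     = "",                   # empty string enables cookie handling
  followlocation = TRUE,                 # follow redirects
  proxy          = "127.0.0.1:2048",
  proxyuserpwd   = "username:password"
)

fetch <- function(u) {
  # .opts is given by name, so no unnamed curl option is passed
  RCurl::getURLContent(u, .opts = opts)
}

# Only attempt the request interactively and when RCurl is installed
if (interactive() && requireNamespace("RCurl", quietly = TRUE)) {
  txt <- fetch("http://www.theurl.com")
}
```

The options could equally be written as named arguments directly in the
getURLContent() call, as in the cookiefile example quoted in this thread.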



Duncan Temple Lang wrote
 
 Apologies for following up on my own mail, but I forgot
 to explicitly mention that you will need to specify the
 appropriate proxy information in the call to getURLContent().
 
   D.
 
 On 6/7/12 8:31 AM, Duncan Temple Lang wrote:
 To just enable cookies and their management, use the cookiefile
 option, e.g.
 
   txt = getURLContent(url,  cookiefile = "")
 
 Then you can pass this to readHTMLTable(), best done as
 
   content = readHTMLTable(htmlParse(txt, asText = TRUE))
 
 
 The function readHTMLTable() doesn't use RCurl and doesn't
 handle cookies.
 
D.
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693p4634147.html


[R] How to set cookies in RCurl

2012-06-07 Thread mdvaan
Hi,

I am trying to access a website and read its content. The website is a
restricted access website that I access through a proxy server (which
therefore requires me to enable cookies). I have problems in allowing RCurl
to receive and send cookies.

The following lines give me:

library(RCurl)
library(XML)

url <- "http://www.theurl.com"
content <- readHTMLTable(url)

content
$`NULL`
                V1
1
2 Cookies disabled
3
4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

$`NULL`
  V1 V2 V3
1

$`NULL`
                V1
1 Cookies disabled

$`NULL`
  V1
1
2
3

I have carefully read section 4.4 of this:
http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following without
success:

curl <- getCurlHandle()
curlSetOpt(cookiejar = 'cookies.txt', curl = curl)

Any suggestions on how to allow for cookies?

Thanks.

Math

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693.html


Re: [R] How to set cookies in RCurl

2012-06-07 Thread mdvaan
Thanks for the fast response. I am not sure how to enter the proxy info in
the call. 

I am working via EZproxy (which, I think, rewrites URLs). According to their
website it does this:

1. Within the config.txt/ezproxy.cfg file, various hosts are identified that
require access from a local IP address. 
2. A remote user makes a web connection to port 2048 of your EZproxy server. 
3. When the user authenticates successfully, a cookie is sent to the user's
browser. 
4. The user's browser presents this during each access to EZproxy.

So, for example, if I enter URL 1, EZproxy dynamically changes it to URL 2: 
1. http://www.scopus.com/results/...
2. http://www-scopus-com.ezproxy.cul.columbia.edu/results/...

What kind of proxy information should I look for and where do I enter it in
the call? 

Your help is very much appreciated.

Thanks.


Duncan Temple Lang wrote
 
 Apologies for following up on my own mail, but I forgot
 to explicitly mention that you will need to specify the
 appropriate proxy information in the call to getURLContent().
 
   D.
 
 On 6/7/12 8:31 AM, Duncan Temple Lang wrote:
 To just enable cookies and their management, use the cookiefile
 option, e.g.
 
   txt = getURLContent(url,  cookiefile = )
 
 Then you can pass this to readHTMLTable(), best done as
 
   content = readHTMLTable(htmlParse(txt, asText = TRUE))
 
 
 The function readHTMLTable() doesn't use RCurl and doesn't
 handle cookies.
 
D.
 
 On 6/7/12 7:33 AM, mdvaan wrote:
 Hi,

 I am trying to access a website and read its content. The website is a
 restricted access website that I access through a proxy server (which
 therefore requires me to enable cookies). I have problems in allowing
 Rcurl
 to receive and send cookies. 

 The following lines give me:

 library(RCurl)
 library(XML)

 url - http://www.theurl.com;
 content - readHTMLTable(url)

 content
 $`NULL`
   V1
 1
 2 Cookies disabled
 3
 4 Your browser currently does not accept cookies.\rCookies need to be enabled for Scopus to function properly.\rPlease enable session cookies in your browser and try again.

 $`NULL`
   V1 V2 V3
 1

 $`NULL`
   V1
 1 Cookies disabled

 $`NULL`
   V1
 1
 2
 3

 I have carefully read section 4.4 of
 http://www.omegahat.org/RCurl/RCurlJSS.pdf and tried the following
 without success:

 curl <- getCurlHandle()
 curlSetOpt(cookiejar = 'cookies.txt', curl = curl)

 Any suggestions on how to allow for cookies?

 Thanks.

 Math



--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-set-cookies-in-RCurl-tp4632693p4632714.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-31 Thread mdvaan
Thank you very much. This definitely helps me out.

Math


Jeff Newmiller wrote
 
 There are many resources for learning regular expressions (e.g.
 http://gnosis.cx/publish/programming/regular_expressions.html). Once you
 understand the basics you will probably be able to refer to the ?regex
 help page for specific tools. After you have waded through a tutorial, the
 following explanation should make more sense.
 
 The braces are extended regex syntax for a repetition of a pattern by some
 minimum to some maximum number of times. The pattern immediately precedes
 the repetition specification. In the first case of {0,1} the pattern being
 repeated is the comma, and in the second case it is any of the characters
 in the square brackets (a period in this case). The period is a special
 "match any character" pattern when not part of a set of characters. A
 common shorthand for "zero or one of something" is a ? symbol.
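
As a small illustration of the repetition syntax described above, the following sketch applies the pattern from this thread to the example vector from the original question. The variable name `firms` is chosen here for illustration; the last call shows one possible way (alternation with `|`) to also remove the word "Energy".

```r
firms <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

# ",{0,1}" is an optional comma; "[.]{0,1}" is an optional literal period.
no_inc <- gsub(",{0,1} Inc[.]{0,1}", "", firms)

# "?" means "zero or one", so this pattern is equivalent to the one above.
no_inc_short <- gsub(",? Inc[.]?", "", firms)

# Alternation with "|" strips several pieces in one pass (here also " Energy").
no_energy <- gsub(",? Inc[.]?| Energy", "", firms)

no_inc
no_energy
```

Putting the period inside square brackets keeps it literal; a bare `.` would match any character.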
 
 Also, please learn to provide quoting context for the majority of us who
 do not use Nabble.
 ---
 Jeff Newmiller
 Research Engineer (Solar/Batteries/Software/Embedded Controllers)
 ---
 Sent from my phone. Please excuse my brevity.

 mdvaan <mathijsdevaan@> wrote:
 
Thanks! That works like a charm, but I am not sure if I fully understand the
syntax. I looked at the gsub page but still couldn't figure it out. What
does the pattern part (",{0,1} Inc[.]{0,1}") do? What do the 0 and 1 within
the curly brackets refer to? Also, what if, for example, I wanted to
remove the word "Energy"?

Thank you very much in advance.

Math

 


--
View this message in context: 
http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873p4631934.html
Sent from the R help mailing list archive at Nabble.com.



[R] gsub/strsplit with multiple patterns/splits

2012-05-30 Thread mdvaan
Hi,

I have a vector like this:

DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

For each element in the vector I would like to remove the "incorporated"
info, so that my vector looks like this:

DF <- c("Aetna", "Alexander's", "Allegheny Energy")

That means that I have to strip:

strip <- c(", Inc.", " Inc.", ", Inc")

How do I pass multiple patterns/splits to gsub/strsplit?

Thanks!

Math


--
View this message in context: 
http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-30 Thread mdvaan
Thanks! That works like a charm, but I am not sure if I fully understand the
syntax. I looked at the gsub page but still couldn't figure it out. What
does the pattern part (",{0,1} Inc[.]{0,1}") do? What do the 0 and 1 within
the curly brackets refer to? Also, what if, for example, I wanted to
remove the word "Energy"?

Thank you very much in advance.

Math

--
View this message in context: 
http://r.789695.n4.nabble.com/gsub-strsplit-with-multiple-patterns-splits-tp4631873p4631897.html
Sent from the R help mailing list archive at Nabble.com.



[R] evaluate whether function returns error

2012-05-25 Thread mdvaan
Hi,

The following returns an error message. How do I test (TRUE or FALSE)
whether the function call fails?

require(XML)
readHTMLTable("http://www.sec.gov/Archives/edgar/data/2969/95012399010952/950123-99-010952.txt")

Thanks in advance!

Math
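
One way to turn "did this call fail?" into TRUE or FALSE is base R's tryCatch(). The sketch below is a minimal illustration that needs no network access; the helper name `fails` is hypothetical, and stop() merely stands in for a call such as the readHTMLTable() line above.

```r
# Returns TRUE if evaluating `expr` signals an error, FALSE otherwise.
fails <- function(expr) {
  tryCatch({
    expr          # the argument is evaluated lazily, i.e. inside tryCatch()
    FALSE
  }, error = function(e) TRUE)
}

fails(stop("no table found"))   # an error is signalled
fails(sqrt(4))                  # no error

# The same test with try(): on failure the result inherits from "try-error".
res <- try(stop("no table found"), silent = TRUE)
inherits(res, "try-error")
```

Under these assumptions the check for the question above would be `fails(readHTMLTable(url))`, with `url` holding the address of the page.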

--
View this message in context: 
http://r.789695.n4.nabble.com/evaluate-whether-function-returns-error-tp4631406.html
Sent from the R help mailing list archive at Nabble.com.



[R] Plotting network without overlapping vertices

2012-05-02 Thread mdvaan
Hello,

I am using the plot.igraph function in the igraph package to plot a network.
How do I keep vertices from overlapping? One option would be to pass an
argument that prevents vertices from occupying the same coordinates, given
their size. A second option would be to increase the area of the plot (and
multiply the distance between vertices by a constant) while keeping the
size of the vertices the same. I would like to keep my current layout
(kamada.kawai). Any suggestions?

Thanks,

Math  

--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-network-without-overlapping-vertices-tp4604559.html
Sent from the R help mailing list archive at Nabble.com.



[R] Scrape data from Scopus: login through R?

2012-04-22 Thread mdvaan
Hello,

The Scopus bibliographic database allows one to manually download batches of
2000 publications. The data is rich but does not provide one with a field
containing the author id. However, author id's can be retrieved through the
hyperlinks on the Scopus website. I have two questions:

1. My institution has a Scopus license, so I need to log in. How do I do that
in R (through RCurl, XML?)?
2. How do I scrape hyperlinks?

Your help is appreciated.

Thanks

Math

--
View this message in context: 
http://r.789695.n4.nabble.com/Scrape-data-from-Scopus-login-through-R-tp4579261p4579261.html
Sent from the R help mailing list archive at Nabble.com.



[R] ggplot axis limit

2012-03-16 Thread mdvaan
Hi,

This is probably an easy one, but I am new to ggplot2 and cannot find an
answer online.

I am bar plotting values of 10 groups. These values are all within a 90-100
range, so I would like to leave out the area of the bars below 90. If I say
graph + scale_y_continuous(limit=c(90, 100)), it does limit the axis, but
the bars disappear completely. Any solution here?

Thanks a lot!

Mathijs


--
View this message in context: 
http://r.789695.n4.nabble.com/ggplot-axis-limit-tp4478835p4478835.html
Sent from the R help mailing list archive at Nabble.com.



[R] Merging fully overlapping groups

2012-03-14 Thread mdvaan
Hi,

I have data on individuals (B) who participated in events (A). If ALL
participants in an event are a subset of the participants in another event, I
would like to remove the smaller event, and if the participants in one event
are identical to the participants in another event, I would like to
remove one of the events (I don't care which one). The following example
does that, but it is extremely slow (and the true dataset is very large).
What would be a more efficient way to solve the problem? I really appreciate
your help. Thanks!

DF <- data.frame(read.table(textConnection("  A  B
12095 69832
12095 51750
12095 6734
18774 51750
18774 51733
18774 6734
18774 69833
19268 51750
19268 6734
19268 51733
19268 65251
5169 54441
5169 15480
5169 3228
5966 51733
5966 65251
5966 68197
5966 6734
5966 51750
5966 69833
7189 135523
7189 65251
7189 51733
7189 69833
7189 135522
7189 68197
7189 6734
7797 51750
7797 6734
7797 69833
7866 6734
7866 69833
7866 51733
8596 51733
8596 51750
8596 65251
8677 6734
8677 51750
8677 51733
8936 68197
8936 6734
8936 65251
8936 51733
9204 51750
9204 69833
9204 6734
9204 51733"),head=TRUE,stringsAsFactors=FALSE))

data <- unique(DF$A)
for (m in 1:length(data))
{
  for (m in 1:length(data))
  {
    tdata <- data[-m]
    q <- 0
    for (n in 1:length(tdata))
    {
      if (length(which(DF[DF$A == data[m], 2] %in% DF[DF$A == tdata[n], 2] ==
          TRUE)) == length(DF[DF$A == data[m], 2]))
      {
        q <- q + 1
      }
    }
    if (q > 0)
    {
      data <- data[-m]
      m <- m - 1
    }
  }
}
DF <- DF[DF$A %in% data,]
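
A possible alternative is sketched below. It is not one of the replies to this thread, and it has not been benchmarked against the loop above. The idea is to build each event's participant set once with split(), drop exact duplicates, and then test each event only against larger events that have already been kept. The function name `drop_nested_events` and the toy data are made up for illustration; the column names A and B follow the post.

```r
drop_nested_events <- function(DF) {
  # One vector of unique participants per event, named by event id.
  sets <- lapply(split(DF$B, DF$A), unique)
  # Identical participant sets: keep only the first event of each.
  key <- vapply(sets, function(s) paste(sort(s), collapse = ","), character(1))
  sets <- sets[!duplicated(key)]
  # Visit large events first. A set can only be nested in a larger set, and
  # nesting is transitive, so comparing with the events kept so far suffices.
  sets <- sets[order(lengths(sets), decreasing = TRUE)]
  keep <- logical(length(sets))
  for (i in seq_along(sets)) {
    nested <- any(vapply(sets[keep],
                         function(big) all(sets[[i]] %in% big),
                         logical(1)))
    keep[i] <- !nested
  }
  DF[DF$A %in% names(sets)[keep], ]
}

# Toy data: event 2 is a subset of event 1, event 4 is identical to event 1.
toy <- data.frame(A = c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4),
                  B = c(10, 20, 30, 10, 20, 40, 50, 30, 20, 10))
kept <- drop_nested_events(toy)
sort(unique(kept$A))
```

In the toy data only events 1 and 3 survive; which of two identical events is kept depends on the order in which split() returns them.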

--
View this message in context: 
http://r.789695.n4.nabble.com/Merging-fully-overlapping-groups-tp4470999p4470999.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Merging fully overlapping groups

2012-03-14 Thread mdvaan
Hi Jean and Peter,

Thanks for the help. Both options are indeed faster than my initial
procedure.

Best,

Mathijs

--
View this message in context: 
http://r.789695.n4.nabble.com/Merging-fully-overlapping-groups-tp4470999p4473013.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Select elements from text

2012-01-25 Thread mdvaan
Thanks. That worked great!

--
View this message in context: 
http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4327711.html
Sent from the R help mailing list archive at Nabble.com.



[R] Select elements from text

2012-01-24 Thread mdvaan
Hi,

I have a series of MS word files and each file contains plain text. From
these texts I would like to extract only those elements (read: words) that
are between square brackets. Example of a text:

Most fundamentally, it has led to an effort to clarify the organizational
form concept. According to them [see also Smith, Jones and Carroll 2002],
categories emerge as audience members recognize dissimilarities among groups
of consumers and label them as members of a common set [Nicol 2000].

Now I would like to get the following selection:

see also Smith, Jones and Carroll 2002
Nicol 2000

Any ideas on how to do this? What would be the best way to import the text
in R? The entire text as an element in a dataframe? Thank you very much!

Best,

Mathijs
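
For the extraction step, a base-R sketch using gregexpr() and regmatches() is shown below. It assumes the Word text is already available as a plain character string (for example after saving the file as .txt); the variable names are illustrative only.

```r
txt <- paste(
  "Most fundamentally, it has led to an effort to clarify the organizational",
  "form concept. According to them [see also Smith, Jones and Carroll 2002],",
  "categories emerge as audience members recognize dissimilarities among groups",
  "of consumers and label them as members of a common set [Nicol 2000]."
)

# Match "[", then any run of characters that are not "]", then "]".
hits <- regmatches(txt, gregexpr("\\[[^]]*\\]", txt))[[1]]

# Drop the surrounding brackets.
cites <- substr(hits, 2, nchar(hits) - 1)
cites
```

This returns the two bracketed passages from the example text as a character vector, one element per pair of brackets.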


--
View this message in context: 
http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4323947.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Select elements from text

2012-01-24 Thread mdvaan
Thanks for the quick response. I get the latter part, but reading the text
from MS Word into R is problematic. I am able to read in (scan) all unique
elements (using sep = " ") from the text, but unable to paste everything
together again. Any idea on how to solve this? It looks like this now:

text <- scan("test.txt", character(0), sep = " ")

> text
 [1] "Most"            "fundamentally,"  "it"              "has"
 [5] "led"             "to"              "an"              "effort"
 [9] "to"              "clarify"         "the"             "organizational"
[13] "form"            "concept."        "According"       "to"
[17] "them"            "[see"            "also"            "Smith,"
[21] "Jones"           "and"             "Carroll"         "2002],"
[25] "categories"      "emerge"          "as"              "audience"
[29] "members"         "recognize"       "dissimilarities" "among"
[33] "groups"          "of"              "consumers"       "and"
[37] "label"           "them"            "as"              "members"
[41] "of"              "a"               "common"          "set"
[45] "[Nicol"          "2000]."
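
Joining scanned words back into one string is what the collapse argument of paste() is for. A minimal sketch on a few of the tokens above (the variable names are illustrative):

```r
tokens <- c("them", "[see", "also", "Smith,", "Jones", "and", "Carroll", "2002],")

# collapse = " " joins all elements of one vector into a single string;
# sep only separates the different arguments of one paste() call.
one_string <- paste(tokens, collapse = " ")
one_string

# Possible alternative that skips scan() entirely (assumes a plain-text file):
# full <- paste(readLines("test.txt"), collapse = " ")
```

The result is a length-one character vector, which is a convenient input for a bracket-matching regular expression.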

--
View this message in context: 
http://r.789695.n4.nabble.com/Select-elements-from-text-tp4323947p4325174.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Selecting and multiplying

2011-09-06 Thread mdvaan
Anyone any idea on how to tackle this problem? Thanks a lot!

--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3793908.html
Sent from the R help mailing list archive at Nabble.com.



[R] Selecting and multiplying

2011-09-01 Thread mdvaan
Hi,

I have created two objects: object c contains yearly distances between
cases and object g contains yearly interactions between cases. For each case
and every year I would like to calculate the following value:

Vit = sum(Dabt * Iait * Ibit)

Where Vit is the value of case i in year t, Dabt is the distance between all
cases a and b that have interacted more than 0 times with i, Iait is the
number of times i has interacted with a and Ibit is the number of times i
has interacted with b. So for 8027 in 1999 Vit becomes:

(0.27547644 * 1 * 2) + (0.31481129 * 1 * 1) + (0.09896982 * 2 * 1) =
1.06370381

How do I create a data frame that accommodates the values for each case in
each year? Thanks in advance!
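
The worked sum for 8027 in 1999 can be reproduced with a few lines of base R. This is only a sketch for a single case and year. The assignment of the three distances to the pairs of partners is inferred from the order of the factors in the sum above, and the names `Dm`, `n_int` and `pair_value` are made up for illustration.

```r
# Distances between the partners of 8027 in 1999 (values from the worked sum).
ids <- c("8025", "8026", "8029")
Dm <- matrix(0, 3, 3, dimnames = list(ids, ids))
Dm["8025", "8026"] <- Dm["8026", "8025"] <- 0.27547644
Dm["8025", "8029"] <- Dm["8029", "8025"] <- 0.31481129
Dm["8026", "8029"] <- Dm["8029", "8026"] <- 0.09896982

# Number of interactions of 8027 with each partner.
n_int <- c("8025" = 1, "8026" = 2, "8029" = 1)

# Sum D[a, b] * I[a] * I[b] over each unordered pair of partners exactly once.
pair_value <- function(D, I) {
  W <- D * outer(I, I)
  sum(W[upper.tri(W)])
}

V <- pair_value(Dm, n_int)
V
```

Applying such a function over every case and year (for example with nested lapply() calls over the lists c and g) would give the values to collect in a data frame.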

Example:

library(zoo)
DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))

a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
  prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
    dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) 1 - tcrossprod(mat / sqrt(rowSums(mat^2))))
c <- lapply(c, function (x) ifelse(x < 0.00111, 0, x))

DF2 = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)

--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-and-multiplying-tp3784901p3784901.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Selections in lists

2011-09-01 Thread mdvaan
Thanks David and Jorge for your comments!

--
View this message in context: 
http://r.789695.n4.nabble.com/Selections-in-lists-tp3768562p3784816.html
Sent from the R help mailing list archive at Nabble.com.



[R] Selections in lists

2011-08-25 Thread mdvaan
Hi,

I have produced a list g and I would like to reduce the amount of
information contained in each object in g. 
For each matrix I would like to keep the values where the column name equals
g[year][[1]][[x]] and the row name equals g[year][[1]][[-x]]. So in
g$`1999`$`8029`, year = 1999 and x = 8029. I have been experimenting with
the subset function, but have been unsuccessful. Thanks for your help!

The result for g$`1999`$`8029` should be:

$`1999`$`8029`
      B
B      8029
  8026    1
  8027    1
  8028    1

The result for g$`1999`$`8028` should be:

$`1999`$`8028`
      B
B      8028
  8029    1


The result for g$`1999`$`8027` should be:

$`1999`$`8027`
  B
B   8027
  8025 1
  8026 2
  8029 1


Example:

DF = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF[DF$C %in% y, 1:2]))
years <- sort(unique(DF$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
years <- names(g)
for (t in seq(years))
{
year=as.character(years[t])
g[[year]] <- sapply(colnames(g[[year]]), function(var)
g[[year]][g[[year]][,var] > 0, g[[year]][var,] > 0])
}
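
One possible way to get the results listed above is to index each matrix with a logical row selector and a single column name, using drop = FALSE to keep the matrix shape. The sketch below rebuilds only the 1999 window (events from 1997 to 1999) from the example data; the names `ev`, `m`, `partners` and `res` are made up for illustration.

```r
# Events and participants in the 1997-1999 window of the example data.
ev <- data.frame(
  A = c(82, 82, 83, 83, 90, 90, 90, 84, 84, 85, 85),
  B = c(8025, 8026, 8025, 8027, 8026, 8027, 8029, 8026, 8027, 8028, 8029)
)
m <- crossprod(table(ev))   # co-occurrence counts, B by B

# Keep column x, restricted to the other cases that co-occur with x.
partners <- function(m, x) {
  rows <- rownames(m) != x & m[, x] > 0
  m[rows, x, drop = FALSE]
}

res <- lapply(setNames(nm = colnames(m)), function(x) partners(m, x))
res[["8029"]]
res[["8027"]]
```

The same helper could be applied to every year with `lapply(g, function(m) lapply(setNames(nm = colnames(m)), function(x) partners(m, x)))`, assuming `g` still holds the full matrices.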

--
View this message in context: 
http://r.789695.n4.nabble.com/Selections-in-lists-tp3768562p3768562.html
Sent from the R help mailing list archive at Nabble.com.



[R] Selecting cases from matrices stored in lists

2011-08-22 Thread mdvaan
Hi,

I have two lists (c and h - see below) containing matrices with similar
cases but different values. I want to split these matrices into multiple
matrices based on the values in h. So, I did the following:

years <- c(1997:1999)
for (t in 1:length(years))
{
year=as.character(years[t])
h[[year]] <- sapply(colnames(h[[year]]), function(var)
h[[year]][h[[year]][,var] > 0, h[[year]][var,] > 0])
}

Now that I have created list h (with split matrices), I would like to use
these selections to make similar selections in list c. List c needs to get
the exact same shape as h, so that `8026` in 1997 (c$`1997`$`8026`) looks
like this:

$`1997`$`8026` 
  B 
B  8025 8026 8029 
  8025   1.000 0.7739527 0.9656091 
  8026   0.7739527 1.000 0.7202771 
  8029   0.9656091 0.7202771 1.000 

Can anyone help me doing this? I have no idea how I can get it to work.
Thank you very much for your help! 


library(zoo)
DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))

a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
  prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
    dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))

DF2 = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x > 0, 1, 0))

--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-cases-from-matrices-stored-in-lists-tp3759597p3759597.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Selecting cases from matrices stored in lists

2011-08-22 Thread mdvaan

Jean V Adams wrote:
 
 [R] Selecting cases from matrices stored in lists
 mdvaan 
 to:
 r-help
 08/22/2011 07:24 AM
 
 Hi,
 
 I have two lists (c and h - see below) containing matrices with similar
 cases but different values. I want to split these matrices into multiple
 matrices based on the values in h. So, I did the following:
 
 years <- c(1997:1999)
 for (t in 1:length(years))
 {
 year=as.character(years[t])
 h[[year]] <- sapply(colnames(h[[year]]), function(var)
 h[[year]][h[[year]][,var] > 0, h[[year]][var,] > 0])
 }
 
 Now that I have created list h (with split matrices), I would like to 
 use
 these selections to make similar selections in list c. List c needs to 
 get
 the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) looks
 like this: 
 
 $`1997`$`8026` 
   B 
 B  8025 8026 8029 
   8025   1.000 0.7739527 0.9656091 
   8026   0.7739527 1.000 0.7202771 
   8029   0.9656091 0.7202771 1.000 
 
 Can anyone help me doing this? I have no idea how I can get it to work.
 Thank you very much for your help! 
 
 
 Try this:
 
 c2 <- h
 years <- names(h)
 for (t in seq(years))
 {
 year <- years[t]
 c2[[year]] <- sapply(colnames(h[[year]]), function(var)
 c[[t]][h[[year]][ ,var] > 0, h[[year]][var, ] > 0])
 }
 
 By the way, it's great that you included code in your question.
 However, I encountered a couple of errors when running your code (see
 below).
 
 Also, it would be better to use a different name for your list c, 
 because c() is a function in R.
 
 Jean
 
 
 library(zoo)
 DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
 8025  1995  0  4  1  2
 8025  1997  1  1  3  4
 8026  1995  0  7  0  0
 8026  1996  1  2  3  0
 8026  1997  1  2  3  1
 8026  1998  6  0  0  4
 8026  1999  3  7  0  3
 8027  1997  1  2  3  9
 8027  1998  1  2  3  1
 8027  1999  6  0  0  2
 8028  1999  3  7  0  0
 8029  1995  0  2  3  3
 8029  1998  1  2  3  2
 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
 
 a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
 sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
 b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
 
 Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE)
 
 rollapply() has no argument partial.
 
 newDF <- lapply(1:nrow(b), function(i)
   prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
     dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
 
 names(newDF) <- time(a)
 
 Error in names(newDF) <- time(a) :
   'names' attribute [5] must be the same length as the vector [3]
 
 newDF has only 3 names, but time(a) is of length 5.
 
 c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
 
 DF2 = data.frame(read.table(textConnection("  A  B  C
 80  8025  1995
 80  8026  1995
 80  8029  1995
 81  8026  1996
 82  8025  1997
 82  8026  1997
 83  8025  1997
 83  8027  1997
 90  8026  1998
 90  8027  1998
 90  8029  1998
 84  8026  1999
 84  8027  1999
 85  8028  1999
 85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))
 
 e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
 years <- sort(unique(DF2$C))
 f <- as.data.frame(embed(years, 3))
 g <- lapply(split(f, f[, 1]), e)
 h <- lapply(g, function (x) ifelse(x > 0, 1, 0))
 

Sorry, I am using the devel version of zoo which allows you to use the
partial argument. The correct code is given below. 

I didn't get your suggestion to work. If I understand what you are trying to
do (multiplying c and h), this is likely to give the wrong results because h
contains values of 0. Since I am ultimately interested in the values of the
split matrices in c (based on the original matrices in c), this will
probably not work. Or am I just not understanding you?

Thanks!  

# devel version of zoo
install.packages("zoo", repos = "http://r-forge.r-project.org")
library(zoo)
DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))

a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
  prop.table(na.omit(matrix(b[i

Re: [R] Selecting cases from matrices stored in lists

2011-08-22 Thread mdvaan
Thanks Jean, changing c[[t]] to c[[year]] solved the issue.
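
The reason this change matters can be seen in a small sketch: in the thread, list c has one element per year from 1995 to 1999, while h only covers 1997 to 1999, so the position t in h and the position t in c refer to different years. The placeholder strings and the name `c_list` below are made up for illustration.

```r
# c_list covers five years, h only the last three (as in the thread).
c_list <- list("1995" = "c-1995", "1996" = "c-1996", "1997" = "c-1997",
               "1998" = "c-1998", "1999" = "c-1999")
h <- list("1997" = "h-1997", "1998" = "h-1998", "1999" = "h-1999")

years <- names(h)
t <- 1
year <- years[t]      # "1997"

c_list[[t]]           # positional: the first element, i.e. the 1995 entry
c_list[[year]]        # by name: the 1997 entry that matches h[[year]]
```

Indexing both lists by name keeps them aligned even when they do not start at the same year.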

Math


Jean V Adams wrote:
 
 Re: [R] Selecting cases from matrices stored in lists
 mdvaan 
 to:
 r-help
 08/22/2011 09:46 AM
 
 Jean V Adams wrote:
  
  [R] Selecting cases from matrices stored in lists
  mdvaan 
  to:
  r-help
  08/22/2011 07:24 AM
  
  Hi,
  
  I have two lists (c and h - see below) containing matrices with 
 similar
  cases but different values. I want to split these matrices into 
 multiple
  matrices based on the values in h. So, I did the following:
  
  years-c(1997:1999) 
  for (t in 1:length(years)) 
  { 
  year=as.character(years[t]) 
  h[[year]]-sapply(colnames(h[[year]]), function(var)
  h[[year]][h[[year]][,var]0, h[[year]][var,]0]) 
  } 
  
  Now that I have created list h (with split matrices), I would like to 
 
  use
  these selections to make similar selections in list c. List c needs 
 to 
  get
  the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) 
 looks
  like this: 
  
  $`1997`$`8026` 
B 
  B  8025 8026 8029 
8025   1.000 0.7739527 0.9656091 
8026   0.7739527 1.000 0.7202771 
8029   0.9656091 0.7202771 1.000 
  
  Can anyone help me doing this? I have no idea how I can get it to 
 work.
  Thank you very much for your help! 
  
  
  Try this:
  
  c2 - h
  years - names(h)
  for (t in seq(years))
  { 
  year - years[t]
  c2[[year]] - sapply(colnames(h[[year]]), function(var) 
  c[[t]][h[[year]][ ,var]  0, h[[year]][var, ]  0]) 
  }
  
  By the way, it's great that you included code in your question.
  However, I encountered a couple of errors when running you code (see 
  below).
  
  Also, it would be better to use a different name for your list c, 
  because c() is a function in R.
  
  Jean
  
  
  library(zoo) 
  DF1 = data.frame(read.table(textConnection(B  C  D  E  F  G 
  8025  1995  0  4  1  2 
  8025  1997  1  1  3  4 
  8026  1995  0  7  0  0 
  8026  1996  1  2  3  0 
  8026  1997  1  2  3  1 
  8026  1998  6  0  0  4 
  8026  1999  3  7  0  3 
  8027  1997  1  2  3  9 
  8027  1998  1  2  3  1 
  8027  1999  6  0  0  2 
  8028  1999  3  7  0  0 
  8029  1995  0  2  3  3 
  8029  1998  1  2  3  2 
  8029  1999  6  0  0  1),head=TRUE,stringsAsFactors=FALSE)) 
  
  a - read.zoo(DF1, split = 1, index = 2, FUN = identity) 
  sum.na - function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else 
 NA 
  b - rollapply(a, 3,  sum.na, align = right, partial = TRUE) 
  
  Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE)
  
  rollapply() has no argument partial.
  
  newDF - lapply(1:nrow(b), function(i) 
prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, 
  dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 
 1)) 
  
  names(newDF) - time(a) 
  
  Error in names(newDF) - time(a) : 
'names' attribute [5] must be the same length as the vector [3]
  
  newDF has only 3 names, but time(a) is of length 5.
  
  c-lapply(newDF, function(mat) tcrossprod(mat / 
 sqrt(rowSums(mat^2 
  
  DF2 = data.frame(read.table(textConnection(  A  B  C 
  80  8025  1995 
  80  8026  1995 
  80  8029  1995 
  81  8026  1996 
  82  8025  1997 
  82  8026  1997 
  83  8025  1997 
  83  8027  1997 
  90  8026  1998 
  90  8027  1998 
  90  8029  1998 
  84  8026  1999 
  84  8027  1999 
  85  8028  1999 
  85  8029  1999),head=TRUE,stringsAsFactors=FALSE)) 
  
  e - function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) 
  years - sort(unique(DF2$C)) 
  f - as.data.frame(embed(years, 3)) 
  g-lapply(split(f, f[, 1]), e) 
  h-lapply(g, function (x) ifelse(x0,1,0)) 
  
 
 Sorry, I am using the devel version of zoo which allows you to use the
 partial argument. The correct code is given below. 
 
 My error.  I didn't have the latest version installed.
 
 
 I didn't get your suggestion to work. If I understand what you are trying to
 do (multiplying c and h), this is likely to give the wrong results because h
 contains values of 0. Since I am ultimately interested in the values of the
 split matrices in c (based on the original matrices in c), this will
 probably not work. Or am I just not understanding you?
 
 I'm not doing any multiplication.  I just applied your extraction
 [h[[year]][ ,var] > 0, h[[year]][var, ] > 0]
 to the c list rather than the h list.
 
 You say you didn't get it to work.  Did you get an error message?  Or did 
 it run, but not give you the values you wanted?  Or ... ?
 
 Jean
 
 
 Thanks! 
 
 # devel version of zoo
 install.packages("zoo", repos = "http://r-forge.r-project.org")

Re: [R] Selecting section of matrix

2011-08-17 Thread mdvaan
That worked great, thanks! Now that I have created list h (see below), I
would like to use the selections made in h to make new selections in list c
(see below). List c needs to get the exact same shape as h, so that `8026` in
1997 (c$`1997`$`8026`) looks like this:

$`1997`$`8026`
      B
B           8025      8026      8029
  8025 1.0000000 0.7739527 0.9656091
  8026 0.7739527 1.0000000 0.7202771
  8029 0.9656091 0.7202771 1.0000000

Thank you very much for your help!

library(zoo) 

DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))

a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
  prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
    dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))

DF2 = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x > 0, 1, 0))

years <- c(1997:1999)
for (t in 1:length(years))
{
year = as.character(years[t])
h[[year]] <- sapply(colnames(h[[year]]), function(var)
h[[year]][h[[year]][,var] > 0, h[[year]][var,] > 0])
}
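
A minimal sketch of the slicing asked about here, assuming the matrices in c and h for a given year have the same rows and columns in the same order. The objects `adj` and `val` and the helper `slice_by_adjacency` are hypothetical stand-ins (one year of h and one year of c), not objects from the script above.

```r
# Hypothetical stand-ins for one year of h (adjacency) and c (values)
ids <- c("8025", "8026", "8027", "8028", "8029")
adj <- matrix(0, 5, 5, dimnames = list(ids, ids))
adj[1:3, 1:3] <- 1   # 8025, 8026 and 8027 are connected to each other
adj[4:5, 4:5] <- 1   # 8028 and 8029 are connected to each other
val <- matrix(1:25 / 100, 5, 5, dimnames = list(ids, ids))

# For every case, use the positive cells of adj to pick rows/columns of val
slice_by_adjacency <- function(adj, val) {
  sapply(colnames(adj), function(var)
    val[adj[, var] > 0, adj[var, ] > 0, drop = FALSE],
    simplify = FALSE)
}

sliced <- slice_by_adjacency(adj, val)
sliced[["8028"]]   # 2 x 2 block of val for cases 8028 and 8029
```

If the two lists line up year by year, something like `Map(slice_by_adjacency, h[yrs], c[yrs])` with `yrs <- c("1997", "1998", "1999")` should give the nested list, but that depends on c and h really having matching dimensions.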


David Winsemius wrote:
 
 On Aug 15, 2011, at 6:09 AM, mdvaan wrote:
 
 Hi,

 I have a question concerning the selection of data. Let's say that given
 list h created below, I would like to select a section of the 1999 matrix.
 For a case (rownames and colnames) I would like to select the cells that
 have a value > 0. So for case 8025

      8025 8026 8027
 8025    1    1    1
 8026    1    1    1
 8027    1    1    1

 tst <- h$`1999`
 tst[tst[,"8025"] > 0, tst["8025",] > 0]
       B
 B      8025 8026 8027
   8025    1    1    1
   8026    1    1    1
   8027    1    1    1
 
 

 And for case 8028

      8028 8029
 8028    1    1
 8029    1    1

 tst[tst[,"8028"] > 0, tst["8028",] > 0]
       B
 B      8028 8029
   8028    1    1
   8029    1    1

 And to do it programmatically:

 sapply( colnames(tst), function(var) tst[tst[,var] > 0, tst[var,] > 0])
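
 For reference, a self-contained sketch of the same selection using only base R. It rebuilds the 1999 adjacency matrix directly from the 1997-1999 rows of DF2 rather than via embed() and split(), which is an assumption about what h$`1999` contains. The names `counts` and `sel` are illustrative, and `simplify = FALSE` and `drop = FALSE` are additions that are not in the one-liner above.

```r
DF2 <- read.table(text = "A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999", header = TRUE)

# Co-occurrence of B within A over the 1997-1999 window, turned into 0/1
counts <- crossprod(table(DF2[DF2$C %in% 1997:1999, c("A", "B")]))
tst <- ifelse(counts > 0, 1, 0)

# One sub-matrix per case: rows and columns where that case has a value > 0
sel <- sapply(colnames(tst), function(var)
  tst[tst[, var] > 0, tst[var, ] > 0, drop = FALSE],
  simplify = FALSE)

sel[["8028"]]
```

 With `simplify = FALSE` the result stays a list even if all selections happen to have the same size, and `drop = FALSE` keeps a one-cell selection as a matrix.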
 
 -- 
 David.



 DF2 = data.frame(read.table(textConnection("  A  B  C
 80  8025  1995
 80  8026  1995
 80  8029  1995
 81  8026  1996
 82  8025  1997
 82  8026  1997
 83  8025  1997
 83  8027  1997
 90  8026  1998
 90  8027  1998
 90  8029  1998
 84  8026  1999
 84  8027  1999
 85  8028  1999
 85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

 e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
 years <- sort(unique(DF2$C))
 f <- as.data.frame(embed(years, 3))
 g <- lapply(split(f, f[, 1]), e)
 h <- lapply(g, function (x) ifelse(x > 0, 1, 0)) # These are the adjacency matrices per year
 h

 Thanks very much!

 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-section-of-matrix-tp3744570p3750246.html


[R] Selecting section of matrix

2011-08-15 Thread mdvaan
Hi,

I have a question concerning the selection of data. Let's say that given
list h created below, I would like to select a section of the 1999 matrix.
For a case (rownames and colnames) I would like to select the cells that
have a value > 0. So for case 8025

     8025 8026 8027
8025    1    1    1
8026    1    1    1
8027    1    1    1


And for case 8028

     8028 8029
8028    1    1
8029    1    1



DF2 = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x > 0, 1, 0)) # These are the adjacency matrices per year
h

Thanks very much!

--
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-section-of-matrix-tp3744570p3744570.html


[R] Adding objects to a list

2011-06-20 Thread mdvaan
#Hi list,

#From the code below I get two list objects (n$values and n$vectors):
dat <- matrix(1:9,3)
n <- eigen(dat)
n

# How do I add another object to n that replicates n$vectors and is called
n$vectors$test?
# Thanks a lot!
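
One possible approach, sketched under the assumption that the goal is simply to keep a copy of n$vectors under a new name. Because n$vectors is a matrix rather than a list, it cannot hold a named element `test` directly: the copy can either sit next to it inside n, or n$vectors can first be turned into a list. The names `n2` and `original` are illustrative only.

```r
dat <- matrix(1:9, 3)
n <- eigen(dat)

# Option 1: store the copy as a new top-level element of n
n$test <- n$vectors

# Option 2: turn vectors into a list, so that n2$vectors$test exists
n2 <- unclass(eigen(dat))   # work on a plain list
n2$vectors <- list(original = n2$vectors, test = n2$vectors)
str(n2$vectors)
```

Option 2 changes what n2$vectors is (a list instead of a matrix), so code that expects the eigenvector matrix would have to use n2$vectors$original.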





--
View this message in context: 
http://r.789695.n4.nabble.com/Adding-objects-to-a-list-tp3610821p3610821.html


Re: [R] Multiply list objects

2011-06-16 Thread mdvaan
I am still thinking about this problem. The solution could look something
like this (it's not yet working):

k <- lapply(h, function (x) x*0) # I keep the same format as h, but set all values to 0
years <- c(1997:1999) # I define the years
for (t in 1:length(years))
{
year = as.character(years[t])
ids = rownames(h[year][[1]])
}
for (m in 1:length(relevant_firms))
{
k[year][[m]] <- lapply(k[year], function (col) k[year][[1]][,m] =
h[year][[1]][,m] & k[year][[1]][m,] = h[year][[1]][m,])
} # I am creating new list objects that should look like this
# k$'1999'$'8029' and I replace the values in the 8029 column and row by the
# original ones in h

Any takes on this problem? Thank you very much!
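
A minimal sketch of one way this could work for a single year, assuming the 1999 matrix in h is the 0/1 co-occurrence matrix over 1997-1999 (rebuilt here directly instead of via embed() and split()). The helper `keep_row_and_col` and the object names `h1999` and `k1999` are hypothetical.

```r
DF <- read.table(text = "A B C
80 8025 1995
80 8026 1995
80 8029 1995
81 8026 1996
82 8025 1997
82 8026 1997
83 8025 1997
83 8027 1997
90 8026 1998
90 8027 1998
90 8029 1998
84 8026 1999
84 8027 1999
85 8028 1999
85 8029 1999", header = TRUE)

counts <- crossprod(table(DF[DF$C %in% 1997:1999, c("A", "B")]))
h1999 <- ifelse(counts > 0, 1, 0)

# Zero out everything except the row and the column belonging to one id
keep_row_and_col <- function(mat, id) {
  out <- mat * 0
  out[id, ] <- mat[id, ]
  out[, id] <- mat[, id]
  out
}

# One matrix per id, named by id
k1999 <- sapply(rownames(h1999), function(id) keep_row_and_col(h1999, id),
                simplify = FALSE)
k1999[["8025"]]
```

For all years at once, wrapping the sapply() call in `lapply(h, function(mat) ...)` should give the nested k$`1999`$`8029` structure, assuming h is the list built in the original post.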

Best


--
View this message in context: 
http://r.789695.n4.nabble.com/Multiply-list-objects-tp3595719p3603871.html


[R] Multiply list objects

2011-06-14 Thread mdvaan
Hi,

I am trying to use the objects from the list below to create more objects.
For each year in h I am trying to create as many objects as there are B's
keeping only the values of B. Example for 1999: 

$`1999`$`8025`
      B
B      8025 8026 8027 8028 8029
  8025    1    1    1    0    0
  8026    1    0    0    0    0
  8027    1    0    0    0    0
  8028    0    0    0    0    0
  8029    0    0    0    0    0

$`1999`$`8026`
      B
B      8025 8026 8027 8028 8029
  8025    0    1    0    0    0
  8026    1    1    1    0    1
  8027    0    1    0    0    0
  8028    0    0    0    0    0
  8029    0    1    0    0    0

$`1999`$`8027`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    1    0    0
  8026    0    0    1    0    0
  8027    1    1    1    0    1
  8028    0    0    0    0    0
  8029    0    0    1    0    0

$`1999`$`8028`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    0    0    0
  8026    0    0    0    0    0
  8027    0    0    0    0    0
  8028    0    0    0    1    1
  8029    0    0    0    1    0

$`1999`$`8029`
      B
B      8025 8026 8027 8028 8029
  8025    0    0    0    0    0
  8026    0    0    0    0    1
  8027    0    0    0    0    1
  8028    0    0    0    0    1
  8029    0    1    1    1    1

Any suggestions? Your help is very much appreciated!

DF = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF[DF$C %in% y, 1:2]))
years <- sort(unique(DF$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x > 0, 1, 0))

--
View this message in context: 
http://r.789695.n4.nabble.com/Multiply-list-objects-tp3595719p3595719.html


Re: [R] Divide matrix into multiple smaller matrices

2011-06-12 Thread mdvaan
Hi,

I still haven't found a solution for this problem. Is there a way in which I
can slice the objects in c based on the info in h? Thanks a lot!

--
View this message in context: 
http://r.789695.n4.nabble.com/Divide-matrix-into-multiple-smaller-matrices-tp3552399p3591868.html


Re: [R] Counting occurrences in a moving window

2011-06-03 Thread mdvaan
Would it be possible to use the sqldf package and the ave function to simply
run ave over a limited set? So something like:

DF = data.frame(read.table(textConnection("  A  B
8025  1995
8026  1995
8029  1995
8026  1996
8025  1997
8026  1997
8025  1997
8027  1997
8026  1999
8027  1999
8028  1995
8029  1998
8025  1997
8027  1997
8026  1999
8027  1999
8028  1995
8029  1998"),head=TRUE,stringsAsFactors=FALSE))

library(sqldf)
years <- c(1995:1999)
for (t in 1:length(years))
{
year = as.numeric(years[t])
m <- sqldf('select * from DF where B between $year-1 AND $year-4')
n <- ave(m$A,m$A,FUN = length)
}

How do I get the correct values in DF$C? Thanks!!


--
View this message in context: 
http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3570652.html


Re: [R] Counting occurrences in a moving window

2011-06-03 Thread mdvaan
Thank you very much! I really liked the first solution, it worked great for
my larger dataset.

M



Gabor Grothendieck wrote:
 
 On Fri, Jun 3, 2011 at 8:11 AM, mdvaan <mathijsdev...@gmail.com> wrote:
 Would it be possible to use the sqldf package and the ave function to
 simply
 run ave over a limited set? So something like:

 DF = data.frame(read.table(textConnection("  A  B
 8025  1995
 8026  1995
 8029  1995
 8026  1996
 8025  1997
 8026  1997
 8025  1997
 8027  1997
 8026  1999
 8027  1999
 8028  1995
 8029  1998
 8025  1997
 8027  1997
 8026  1999
 8027  1999
 8028  1995
 8029  1998"),head=TRUE,stringsAsFactors=FALSE))

 library(sqldf)
 years <- c(1995:1999)
 for (t in 1:length(years))
 {
 year = as.numeric(years[t])
 m <- sqldf('select * from DF where B between $year-1 AND $year-4')
 n <- ave(m$A,m$A,FUN = length)
 }

 How do I get the correct values in DF$C? Thanks!!
 
 In sqldf it would be like this:
 
 sqldf("select x.*, sum(x.A = y.A and y.B < x.B and y.B >= x.B-3) C
 from DF x, DF y group by x.rowid")
 
 
 -- 
 Statistics & Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3571916.html


[R] Counting occurrences in a moving window

2011-06-02 Thread mdvaan
Hi list, based on the following data.frame I would like to create a variable
that indicates the number of occurrences of A in the 3 years prior to the
current year:

DF = data.frame(read.table(textConnection("  A  B
8025  1995
8026  1995
8029  1995
8026  1996
8025  1997
8026  1997
8025  1997
8027  1997
8026  1999
8027  1999
8028  1995
8029  1998
8025  1997
8027  1997
8026  1999
8027  1999
8028  1995
8029  1998"),head=TRUE,stringsAsFactors=FALSE))

becomes:

A     B     C
8025  1995  0  
8026  1995  0
8029  1995  0
8026  1996  1
8025  1997  1
8026  1997  2
8025  1997  1
8027  1997  0
8026  1999  2
8027  1999  2
8028  1995  0
8029  1998  1
8025  1997  1
8027  1997  0
8026  1999  2
8027  1999  2 
8028  1995  0
8029  1998  1

So 8026 in 1997 = 2 because 8026 can be found in 1995 and 1996 which are
both within the appropriate window (1996 - 1994).

Any ideas? I looked at the rollapply vignette, but couldn't figure out how
to apply it to my data.
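
A base R sketch of the count described above, assuming "the 3 years prior" means years from the current year minus 3 up to the current year minus 1, and that every matching row counts (not only distinct years). The helper `count_prior` and its `window` argument are hypothetical names.

```r
DF <- data.frame(
  A = c(8025, 8026, 8029, 8026, 8025, 8026, 8025, 8027, 8026,
        8027, 8028, 8029, 8025, 8027, 8026, 8027, 8028, 8029),
  B = c(1995, 1995, 1995, 1996, 1997, 1997, 1997, 1997, 1999,
        1999, 1995, 1998, 1997, 1997, 1999, 1999, 1995, 1998))

# Number of rows with the same A whose year lies in [year - window, year - 1]
count_prior <- function(id, year, ids, years, window = 3) {
  sum(ids == id & years < year & years >= year - window)
}

DF$C <- mapply(count_prior, DF$A, DF$B,
               MoreArgs = list(ids = DF$A, years = DF$B))
DF
```

This compares every row with every other row, so it is quadratic in the number of rows; for a large dataset a join-based or rolling approach is likely to scale better.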

Thanks a lot!




--
View this message in context: 
http://r.789695.n4.nabble.com/Counting-occurrences-in-a-moving-window-tp3568658p3568658.html


[R] Divide matrix into multiple smaller matrices

2011-05-26 Thread mdvaan
Hi list,

Using the script below, I have generated two lists (c and h) containing
yearly matrices. Now I would like to divide the matrices in c into multiple
matrices based on h. The number of matrices should be equal to:
length(unique(DF1$B))*length(h). So each unique value in DF1$B gets a
yearly matrix. Each matrix should contain all values from c where element
cij is 1. An example for DF1$B = 8025 in 1999:

           8025       8026       8027
8025 0.00000000 0.27547644 0.06905066
8026 0.27547644 0.00000000 0.10499739
8027 0.06905066 0.10499739 0.00000000

Any ideas on how to tackle this problem? Thanks a lot!

library(zoo)

DF1 = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE)) # Where Column B
# represents the cases, C is the year and D-G are the types of knowledge units
# covered

a <- read.zoo(DF1, split = 1, index = 2, FUN = identity)
sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA
b <- rollapply(a, 3, sum.na, align = "right", partial = TRUE)
newDF <- lapply(1:nrow(b), function(i)
   prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE,
   dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1))
names(newDF) <- time(a)
c <- lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2))))
c <- lapply(c, function (x) 1-x)
c <- lapply(c, function (x) ifelse(x < 0.00111, 0, x)) # These are the yearly
# distance matrices for a 4 year moving window

DF2 = data.frame(read.table(textConnection("  A  B  C
80  8025  1995
80  8026  1995
80  8029  1995
81  8026  1996
82  8025  1997
82  8026  1997
83  8025  1997
83  8027  1997
90  8026  1998
90  8027  1998
90  8029  1998
84  8026  1999
84  8027  1999
85  8028  1999
85  8029  1999"),head=TRUE,stringsAsFactors=FALSE))

e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2]))
years <- sort(unique(DF2$C))
f <- as.data.frame(embed(years, 3))
g <- lapply(split(f, f[, 1]), e)
h <- lapply(g, function (x) ifelse(x > 0, 1, 0)) # These are the adjacency matrices
# per year
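
As a small worked example of the distance step in the script above: dividing each row of a matrix by its Euclidean norm and taking tcrossprod() gives the cosine similarity between rows, and 1 minus that is the distance stored in c. The matrix `mat` below is a made-up toy input and `d` is an illustrative name.

```r
# Toy input: three cases (rows) described by two knowledge types (columns)
mat <- rbind(a = c(1, 0), b = c(1, 1), c = c(0, 2))

# Scale every row to unit length, then take all pairwise row dot products
sim <- tcrossprod(mat / sqrt(rowSums(mat^2)))

# Cosine distance; the diagonal is 0 up to floating point error
d <- 1 - sim
round(d, 4)
```

Because of floating point error the diagonal of `d` can come out as a tiny non-zero number, which is presumably why the script sets very small values to 0 afterwards.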


--
View this message in context: 
http://r.789695.n4.nabble.com/Divide-matrix-into-multiple-smaller-matrices-tp3552399p3552399.html