Re: [R] Fwd: Difficulty in finding some R functions

2014-05-02 Thread Pascal Oettli
Hello,

Maybe from here?
http://www.ie.boun.edu.tr/~hormannw/BounQuantitiveFinance/Thesis/SilaHALULU.pdf

HTH
Pascal

On Fri, May 2, 2014 at 11:11 AM, prachi jain
preciousprachi...@gmail.com wrote:
 Hey, I am trying to find some of the following functions in R packages:

 MLEt

 pt3

 cormatrix2vector

 ParameterEst

 tCopula

 riskBT

 I have checked every package from this link:
 http://cran.r-project.org/web/packages/available_packages_by_name.html but
 am unable to find the above functions. These functions belong to code for
 calculating value-at-risk and expected shortfall using a copula function and
 backtesting the model.

 Could you please help me find these functions by naming the packages they
 belong to? It's really urgent for me, so please reply as soon as possible.

 If required, I can send the entire code for better understanding.

 If this is not the right place to mail a query, please let me know where
 I can send my query; it's really urgent for me.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan



[R] substring if a record has a \

2014-05-02 Thread Mat
Hello together,

I have a little problem. I have a data.frame with a few entries like this
one:
  1
A   Marius Muller -\nIT Services
B   Rockwood\nBrockhues
C   Microlog Services\nMarcos
D   Firefox Services

I now want only the first description in the column, up to the \n. How can
I do this?

The solution should look like this:

  1
A   Marius Muller -
B   Rockwood
C   Microlog Services
D   Firefox Services

Thank you.

Best regards. Mat.



--
View this message in context: 
http://r.789695.n4.nabble.com/substring-if-a-record-has-a-tp4689857.html
Sent from the R help mailing list archive at Nabble.com.



[R] extracting the first few eigenvectors

2014-05-02 Thread Mike Miller
I have a symmetric matrix, X, and I just want the first K eigenvectors 
(those associated with the K largest eigenvalues).  Clearly, this works:


eigs <- eigen( X, symmetric=TRUE )

K_eigenvectors <- eigs$vectors[ , 1:K ]
K_eigenvalues <- eigs$values[ 1:K ]

rm(eigs)

In order to do that, I have to create the matrix eigs$vectors which is the 
same size as X.  Sometimes X has 100 million elements or more.  Usually I 
want only the first 10 eigenvectors instead of all 10,000.  Is there any R 
function that will allow me to extract just the few that I want?  This 
would be analogous to the [V,d] = eigs(X,K) function in Octave/MATLAB.


Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4J



Re: [R] substring if a record has a \

2014-05-02 Thread Pascal Oettli
Hello,

Something like that?

x <- c('Marius Muller -\nIT Services','Rockwood\nBrockhues','Microlog Services\nMarcos','Firefox Services')
x <- sapply(strsplit(x, '\n'), '[', 1)
x
[1] "Marius Muller -"   "Rockwood"          "Microlog Services"
[4] "Firefox Services"

Regards,
Pascal

On Fri, May 2, 2014 at 4:17 PM, Mat matthias.we...@fnt.de wrote:
 Hello together,

 I have a little problem. I have a data.frame with a few entries like this
 one:
   1
 A   Marius Muller -\nIT Services
 B   Rockwood\nBrockhues
 C   Microlog Services\nMarcos
 D   Firefox Services

 I now want only the first description in the column, up to the \n. How can
 I do this?

 The solution should look like this:

   1
 A   Marius Muller -
 B   Rockwood
 C   Microlog Services
 D   Firefox Services

 Thank you.

 Best regards. Mat.






-- 
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan



Re: [R] substring if a record has a \

2014-05-02 Thread Mat
works perfect, thank you :-)



--
View this message in context: 
http://r.789695.n4.nabble.com/substring-if-a-record-has-a-n-tp4689857p4689860.html
Sent from the R help mailing list archive at Nabble.com.



[R] Return TRUE only for first match of values between matrix and vector.

2014-05-02 Thread nevil amos
I wish to return TRUE in a matrix for only the first match of a value
per row, where the value equals that in a vector with the same number of
values as rows in the matrix.

eg:
A <- matrix(c(2,3,2,1,1,2,NA,NA,NA,5,1,0,5,5,5),5,3)
B <- c(2,1,NA,1,5)
desired result:

      [,1]  [,2]  [,3]
[1,]  TRUE FALSE FALSE
[2,] FALSE    NA FALSE
[3,]    NA    NA    NA
[4,]  TRUE    NA FALSE
[5,] FALSE  TRUE FALSE

however A==B returns:
      [,1] [,2]  [,3]
[1,]  TRUE TRUE FALSE
[2,] FALSE   NA FALSE
[3,]    NA   NA    NA
[4,]  TRUE   NA FALSE
[5,] FALSE TRUE  TRUE
and
apply(A,1,function(x) match(B,x))
returns
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   NA    1   NA   NA
[2,]    3   NA   NA    1    1
[3,]   NA    2    2    2   NA
[4,]    3   NA   NA    1    1
[5,]   NA   NA    3    3    2

thanks



Re: [R] extracting the first few eigenvectors

2014-05-02 Thread Berend Hasselman

On 02-05-2014, at 09:17, Mike Miller mbmille...@gmail.com wrote:

 I have a symmetric matrix, X, and I just want the first K eigenvectors (those 
 associated with the K largest eigenvalues).  Clearly, this works:
 
 eigs <- eigen( X, symmetric=TRUE )
 
 K_eigenvectors <- eigs$vectors[ , 1:K ]
 K_eigenvalues <- eigs$values[ 1:K ]
 
 rm(eigs)
 
 In order to do that, I have to create the matrix eigs$vectors which is the 
 same size as X.  Sometimes X has 100 million elements or more.  Usually I 
 want only the first 10 eigenvectors instead of all 10,000.  Is there any R 
 function that will allow me to extract just the few that I want?  This would 
 be analogous to the [V,d] = eigs(X,K) function in Octave/MATLAB.
 

You can find possibly relevant stuff with

library(sos)
findFn("eigenvalue")

which is how I found package rARPACK, which may be exactly what you need/want.

Berend

 Best,
 Mike
 
 -- 
 Michael B. Miller, Ph.D.
 Minnesota Center for Twin and Family Research
 Department of Psychology
 University of Minnesota
 http://scholar.google.com/citations?user=EV_phq4J
 


Re: [R] R and GML data from the Canadian Goverment

2014-05-02 Thread Barry Rowlingson
You want to have a look at the R Spatial Task View for starters...

GML files can be read by the rgdal package's readOGR function. But be
warned GML is a complex beast.

For example, I downloaded canvec_gml_AB_EN.zip from that site; it unzips to
a GML (and an xsd) with a lot of layers, some of which aren't spatial.

Using the ogrinfo command line tool from the gdal library I can see the
layers:

$ ogrinfo -so AB_EN.gml
Had to open data source read-only.
INFO: Open of `AB_EN.gml'
  using driver `GML' successful.
1: d_y_n (None)
2: d_y_n00 (None)
3: d_y_n01 (None)
4: d_y_n02 (None)
5: d_y_n03 (None)
6: d_y_n04 (None)
7: d_y_n05 (None)
8: d_y_n06 (None)
[etc etc]
191: d_y166 (None)
192: d_y167 (None)
193: EN_1360049_0 (Multi Point)
194: EN_1360059_2 (Multi Polygon)
195: EN_1360049_2 (Multi Polygon)
196: EN_1340009_0 (Multi Point)
197: EN_1360059_0 (Multi Point)
198: EN_2170009_0 (Multi Point)
199: EN_1120009_1 (Multi Line String)
200: EN_1180009_1 (Multi Line String)

So that's 192 non-spatial layers and a bunch of point, line and polygon
layers. Let's read one:

library(rgdal)
m = readOGR("./AB_EN.gml", "EN_1360059_2")
plot(m)

 makes a map. It's a bunch of very small polygons, hard to see really!

Further questions will be welcome at the R-sig-geo mailing list, where we
whine about file formats on a daily basis.

Barry




On Thu, May 1, 2014 at 11:26 PM, jcrosbie ja...@crosb.ie wrote:

 I'm trying to create a map of transmission lines in Alberta. In addition,
 I'm
 very new to creating maps.

 The data can be found at: http://geogratis.gc.ca/site/eng/download


 http://ftp2.cits.rncan.gc.ca/pub/canvec/doc/CanVec_distribution_formats_en.pdf

 Would someone be able to point me in the right direction? I haven't been
 able to find an R package which is able to work with the GML data on the
 website.  Does anyone know which package I should use, and of a good example
 out there?

 Thank you



 --
 View this message in context:
 http://r.789695.n4.nabble.com/R-and-GML-data-from-the-Canadian-Goverment-tp4689843.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Return TRUE only for first match of values between matrix and vector.

2014-05-02 Thread arun
Hi,
Try:
indx <- A==B
t(apply(indx,1,function(x) {x[duplicated(x) & !is.na(x)] <- FALSE; x}))
#  [,1]  [,2]  [,3]
#[1,]  TRUE FALSE FALSE
#[2,] FALSE    NA FALSE
#[3,]    NA    NA    NA
#[4,]  TRUE    NA FALSE
#[5,] FALSE  TRUE FALSE
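
For comparison, a minimal sketch of the same idea that uses a running count of matches per row instead of duplicated(). The helper name first_match_only is made up for illustration; A and B are the objects from the original post.

```r
A <- matrix(c(2,3,2,1,1,2,NA,NA,NA,5,1,0,5,5,5), 5, 3)
B <- c(2,1,NA,1,5)

# Hypothetical helper (not from the thread): keep only the first TRUE per row.
first_match_only <- function(eq) {
  hit <- eq & !is.na(eq)                 # definite matches only; NA becomes FALSE
  running <- t(apply(hit, 1, cumsum))    # running count of matches along each row
  eq[hit & running > 1] <- FALSE         # second and later matches are switched off
  eq                                     # NA entries are left untouched
}

res <- first_match_only(A == B)
res
```

On this example it should agree with the duplicated() approach; whether it is any faster on large matrices has not been checked.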

A.K.




On Friday, May 2, 2014 4:47 AM, nevil amos nevil.a...@gmail.com wrote:
I wish to return TRUE in a matrix for only the first match of a value
per row, where the value equals that in a vector with the same number of
values as rows in the matrix.

eg:
A <- matrix(c(2,3,2,1,1,2,NA,NA,NA,5,1,0,5,5,5),5,3)
B <- c(2,1,NA,1,5)
desired result:

      [,1] [,2]  [,3]
[1,]  TRUE FALSE FALSE
[2,] FALSE   NA FALSE
[3,]    NA   NA    NA
[4,]  TRUE   NA FALSE
[5,] FALSE TRUE  FALSE

however A==B returns:
      [,1] [,2]  [,3]
[1,]  TRUE TRUE FALSE
[2,] FALSE   NA FALSE
[3,]    NA   NA    NA
[4,]  TRUE   NA FALSE
[5,] FALSE TRUE  TRUE
and
apply(A,1,function(x) match (B,x))
returns
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   NA    1   NA   NA
[2,]    3   NA   NA    1    1
[3,]   NA    2    2    2   NA
[4,]    3   NA   NA    1    1
[5,]   NA   NA    3    3    2

thanks



Re: [R] Return TRUE only for first match of values between matrix and vector.

2014-05-02 Thread Jorge I Velez
Hi Nevil,

Try

apply(A, 2, function(x) x == B)

HTH,
Jorge.-



On Fri, May 2, 2014 at 6:46 PM, nevil amos nevil.a...@gmail.com wrote:

 I wish to return TRUE in a matrix for only the first match of a value
 per row, where the value equals that in a vector with the same number of
 values as rows in the matrix.

 eg:
 A <- matrix(c(2,3,2,1,1,2,NA,NA,NA,5,1,0,5,5,5),5,3)
 B <- c(2,1,NA,1,5)
 desired result:

       [,1]  [,2]  [,3]
 [1,]  TRUE FALSE FALSE
 [2,] FALSE    NA FALSE
 [3,]    NA    NA    NA
 [4,]  TRUE    NA FALSE
 [5,] FALSE  TRUE FALSE

 however A==B returns:
       [,1] [,2]  [,3]
 [1,]  TRUE TRUE FALSE
 [2,] FALSE   NA FALSE
 [3,]    NA   NA    NA
 [4,]  TRUE   NA FALSE
 [5,] FALSE TRUE  TRUE
 and
 apply(A,1,function(x) match(B,x))
 returns
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1   NA    1   NA   NA
 [2,]    3   NA   NA    1    1
 [3,]   NA    2    2    2   NA
 [4,]    3   NA   NA    1    1
 [5,]   NA   NA    3    3    2

 thanks



Re: [R] substring if a record has a \

2014-05-02 Thread arun
Hi,
Try:
dat <- structure(list(`1` = c("Marius Muller -\nIT Services", 
"Rockwood\nBrockhues", 
"Microlog Services\nMarcos", "Firefox Services")), .Names = "1", class = 
"data.frame", row.names = c("A", 
"B", "C", "D"))

dat$`1` <- gsub("\n.*","",dat$`1`)
dat
#  1
#A   Marius Muller -
#B  Rockwood
#C Microlog Services
#D  Firefox Services
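
The code above assumes the column holds real newline characters. If the data instead contain the two literal characters backslash and n (an assumption worth checking by printing the column), the pattern needs extra escaping. A minimal sketch with made-up values:

```r
# Two literal characters, backslash followed by n (assumed case, not a newline)
x <- c("Rockwood\\nBrockhues", "Firefox Services")

# Regex route: "\\\\n" in R source is the regex \\n, a literal backslash then n
first_regex <- sub("\\\\n.*", "", x)

# Fixed-string route: split on the literal backslash-n and keep the first piece
first_fixed <- sapply(strsplit(x, "\\n", fixed = TRUE), `[`, 1)

first_regex
first_fixed
```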


A.K.


On Friday, May 2, 2014 3:19 AM, Mat matthias.we...@fnt.de wrote:
Hello together,

I have a little problem. I have a data.frame with a few entries like this
one:
      1
A   Marius Muller -\nIT Services
B   Rockwood\nBrockhues
C   Microlog Services\nMarcos
D   Firefox Services

I now want only the first description in the column, up to the \n. How can
I do this?

The solution should look like this:

      1
A   Marius Muller -
B   Rockwood
C   Microlog Services
D   Firefox Services

Thank you.

Best regards. Mat.





[R] creating multiple orthogonal plans

2014-05-02 Thread David Moskowitz
Hi,
I am trying to create multiple orthogonal designs using the Conjoint
package.  I have 4 factors with 5 levels in each.

library(conjoint)
experiment<-expand.grid(
  price<-c("a1","a2","a3","a4","a5"),
  tag<-c("b1","b2","b3","b4","b5"),
  smell<-c("c1","c2","c3","c4","c5"),
  aroma<-c("f1","f2","f3","f4","f5"))
design<-caFactorialDesign(data=experiment,type="fractional")
print(design)
print(cor(caEncodedDesign(design)))

The resulting design from the above code has 22 test combinations (listed
below).  I would like to be able to create multiple designs instead of just
having one.  Is there a way to do that?

    Var1 Var2 Var3 Var4
19    a4   b4   c1   f1
40    a5   b3   c2   f1
96    a1   b5   c4   f1
107   a2   b2   c5   f1
148   a3   b5   c1   f2
157   a2   b2   c2   f2
176   a1   b1   c3   f2
195   a5   b4   c3   f2
239   a4   b3   c5   f2
252   a2   b1   c1   f3
299   a4   b5   c2   f3
313   a3   b3   c3   f3
335   a5   b2   c4   f3
366   a1   b4   c5   f3
381   a1   b2   c1   f4
447   a2   b5   c3   f4
469   a4   b4   c4   f4
478   a3   b1   c5   f4
543   a3   b4   c2   f5
559   a4   b2   c3   f5
587   a2   b3   c4   f5
625   a5   b5   c5   f5

[1] R version 3.0.3 (2014-03-06)

Thank you



Re: [R] extracting the first few eigenvectors

2014-05-02 Thread Yixuan Qiu
Exactly.
The syntax is intended to mimic eigs() in Matlab and Octave.

library(rARPACK)
eigs(X, 10)   # If your X is of class dsyMatrix
eigs_sym(X, 10)   # If X is of class matrix
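
To illustrate why only a few eigenvectors can be obtained without building the full n-by-n eigenvector matrix, here is a base R sketch of orthogonal (subspace) iteration. This is not the algorithm rARPACK uses and top_k_eigen is a made-up name; it finds the eigenvalues largest in magnitude, runs a fixed number of iterations with no convergence check, and is meant only as a small demonstration.

```r
# Hypothetical helper: orthogonal (subspace) iteration in base R.
# Only n-by-K matrices are stored in addition to X itself.
top_k_eigen <- function(X, K, iters = 500) {
  n <- nrow(X)
  Q <- qr.Q(qr(matrix(rnorm(n * K), n, K)))   # random orthonormal start
  for (i in seq_len(iters)) {
    Q <- qr.Q(qr(X %*% Q))                    # multiply, then re-orthonormalise
  }
  S <- crossprod(Q, X %*% Q)                  # K-by-K projected matrix
  e <- eigen(S, symmetric = TRUE)
  list(values = e$values, vectors = Q %*% e$vectors)
}

# Small symmetric test matrix built to have eigenvalues 10, 5, 2, 1, 0.5
set.seed(1)
V <- qr.Q(qr(matrix(rnorm(25), 5, 5)))
X <- V %*% diag(c(10, 5, 2, 1, 0.5)) %*% t(V)
res <- top_k_eigen(X, 2)
res$values
```

For real work a dedicated package such as the one named above is likely the better choice.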


Best,
Yixuan

2014-05-02 4:48 GMT-04:00 Berend Hasselman b...@xs4all.nl:

 On 02-05-2014, at 09:17, Mike Miller mbmille...@gmail.com wrote:

 I have a symmetric matrix, X, and I just want the first K eigenvectors 
 (those associated with the K largest eigenvalues).  Clearly, this works:

 eigs <- eigen( X, symmetric=TRUE )

 K_eigenvectors <- eigs$vectors[ , 1:K ]
 K_eigenvalues <- eigs$values[ 1:K ]

 rm(eigs)

 In order to do that, I have to create the matrix eigs$vectors which is the 
 same size as X.  Sometimes X has 100 million elements or more.  Usually I 
 want only the first 10 eigenvectors instead of all 10,000.  Is there any R 
 function that will allow me to extract just the few that I want?  This would 
 be analogous to the [V,d] = eigs(X,K) function in Octave/MATLAB.


 You can find possibly relevant stuff with

 library(sos)
 findFn("eigenvalue")

 which is how I found package rARPACK, which may be exactly what you need/want.

 Berend

 Best,
 Mike

 --
 Michael B. Miller, Ph.D.
 Minnesota Center for Twin and Family Research
 Department of Psychology
 University of Minnesota
 http://scholar.google.com/citations?user=EV_phq4J




-- 
Yixuan Qiu yixuan@cos.name
Department of Statistics,
Purdue University



Re: [R] Adding a column with a count of unique values

2014-05-02 Thread arun
Hi,
Try:
dat <- read.table(text="Person     Time     Change
2     0     3
2     10     5
2     15     7
3     0     4
3     5     2",sep="",header=TRUE)
dat1 <- transform(dat,Count= ave(rep(1,nrow(dat)), Person, FUN=cumsum))
#or
##If it is ordered by Person
dat2 <- transform(dat, Count= setNames(sequence(table(Person)),NULL))
#or
dat3 <- transform(dat,Count= ave(seq_along(Person), Person, FUN=seq_along))
all.equal(dat1,dat2)
#[1] TRUE
all.equal(dat1,dat3)
#[1] TRUE

A.K.


I have a dataframe that looks like this:
Person     Time     Change
2     0     3
2     10     5
2     15     7
3     0     4
3     5     2
I would like to add a column that counts each row for each person, like this:
Person     Time     Change     Count
2     0     3     1
2     10     5     2
2     15     7     3
3     0     4     1
3     5     2     2
Thanks in advance! 




Re: [R] speeding up applying hist() over rows of a matrix

2014-05-02 Thread William Dunlap
Your original code, as a function of 'm' and 'bins' is
f0 <- function (m, bins) {
    t(apply(m, 1, function(x) hist(x, breaks = bins, plot = FALSE)$counts))
}
and the time it takes to run on your m1 is about 5 s. on my machine
> system.time(r0 <- f0(m1,bins))
   user  system elapsed
   4.95    0.00    5.02


hist(x,breaks=bins) is essentially tabulate(cut(x,bins),nbins=length(bins)-1).
See how much it speeds things up by replacing hist() with tabulate(cut()):
f1 <- function (m, bins)
{
    nbins <- length(bins) - 1L
    t(apply(m, 1, function(x) tabulate(cut(x, bins), nbins = nbins)))
}
That doesn't help with the time, but it does give the same output
> system.time(r1 <- f1(m1,bins))
   user  system elapsed
   4.85    0.10    5.35
> identical(r0, r1)
[1] TRUE

Now try speeding it up by calling cut() on the whole matrix first
and then applying tabulate to each row, as in
f2 <- function (m, bins)  {
    nbins <- length(bins) - 1L
    m <- array(as.integer(cut(m, bins)), dim = dim(m))
    t(apply(m, 1, tabulate, nbins = nbins))
}
That saves quite a bit of time and gives the same output
> system.time(r2 <- f2(m1,bins))
   user  system elapsed
   0.25    0.00    0.25
> identical(r0, r2)
[1] TRUE

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, May 1, 2014 at 12:48 PM, Ortiz-Bobea, Ariel ortiz-bo...@rff.org wrote:
 Hello everyone,



 I'm trying to construct bins for each row in a matrix. I'm using apply() in 
 combination with hist() to do this. Performing this binning for a 10K-by-50 
 matrix takes about 5 seconds, but only 0.5 seconds for a 1K-by-500 matrix. 
 This suggests the bottleneck is accessing rows in apply() rather than the 
 calculations going on inside hist().



 My initial idea is to process as many columns (as make sense for the intended 
 use) at once. However, I still have many many rows to process and I would 
 appreciate any feedback on how to speed this up.



 Any thoughts?



 Thanks,



 Ariel



 Here is the illustration:



 # create data

 m1 <- matrix(10*rnorm(50*10^4), ncol=50)

 m2 <- matrix(10*rnorm(50*10^4), ncol=500)



 # compute bins

 bins - seq(-100,100,1)

 system.time({ out1 <- t(apply(m1,1, function(x) hist(x,breaks=bins, 
 plot=FALSE)$counts)) })

 system.time({ out2 <- t(apply(m2,1, function(x) hist(x,breaks=bins, 
 plot=FALSE)$counts)) })


 ---
 Ariel Ortiz-Bobea
 Fellow
 Resources for the Future
 1616 P Street, N.W.
 Washington, DC 20036



Re: [R] speeding up applying hist() over rows of a matrix

2014-05-02 Thread William Dunlap
And since as.integer(cut(x,bins)) is essentially findInterval(x,bins)
(since we throw away the labels made by cut()), I tried using
findInterval instead of cut() and it cut the time by more than half,
so your 5.0 s. is now c. 0.1 s.
f3 <- function (m, bins)
{
    nbins <- length(bins) - 1L
    m <- array(findInterval(m, bins), dim = dim(m))
    t(apply(m, 1, tabulate, nbins = nbins))
}
> system.time(r3 <- f3(m1,bins))
   user  system elapsed
   0.09    0.00    0.09
> identical(r0,r3)
[1] TRUE
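
One caveat, offered as a sketch rather than something stated in the thread: hist() by default uses intervals closed on the right, while findInterval() by default uses intervals closed on the left, so the counts can differ when values fall exactly on a break. With continuous random data this is unlikely to matter. The left.open argument used below is assumed to be available, which is only the case in more recent versions of R.

```r
x <- c(1, 1.5, 2)       # 1 and 2 sit exactly on breaks
bins <- 0:3
nbins <- length(bins) - 1L

h <- hist(x, breaks = bins, plot = FALSE)$counts

# Default findInterval(): intervals are [a, b)
fi_default <- tabulate(findInterval(x, bins), nbins = nbins)

# left.open = TRUE gives (a, b]; rightmost.closed = TRUE then closes the
# lowest interval on the left, mirroring hist()'s include.lowest = TRUE
fi_open <- tabulate(findInterval(x, bins, left.open = TRUE,
                                 rightmost.closed = TRUE), nbins = nbins)

h
fi_default
fi_open
```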

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, May 2, 2014 at 9:23 AM, William Dunlap wdun...@tibco.com wrote:
 Your original code, as a function of 'm' and 'bins' is
 f0 <- function (m, bins) {
     t(apply(m, 1, function(x) hist(x, breaks = bins, plot = FALSE)$counts))
 }
 and the time it takes to run on your m1 is about 5 s. on my machine
 > system.time(r0 <- f0(m1,bins))
    user  system elapsed
    4.95    0.00    5.02


 hist(x,breaks=bins) is essentially tabulate(cut(x,bins),nbins=length(bins)-1).
 See how much it speeds things up by replacing hist() with tabulate(cut()):
 f1 <- function (m, bins)
 {
     nbins <- length(bins) - 1L
     t(apply(m, 1, function(x) tabulate(cut(x, bins), nbins = nbins)))
 }
 That doesn't help with the time, but it does give the same output
 > system.time(r1 <- f1(m1,bins))
    user  system elapsed
    4.85    0.10    5.35
 > identical(r0, r1)
 [1] TRUE

 Now try speeding it up by calling cut() on the whole matrix first
 and then applying tabulate to each row, as in
 f2 <- function (m, bins)  {
     nbins <- length(bins) - 1L
     m <- array(as.integer(cut(m, bins)), dim = dim(m))
     t(apply(m, 1, tabulate, nbins = nbins))
 }
 That saves quite a bit of time and gives the same output
 > system.time(r2 <- f2(m1,bins))
    user  system elapsed
    0.25    0.00    0.25
 > identical(r0, r2)
 [1] TRUE

 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com


 On Thu, May 1, 2014 at 12:48 PM, Ortiz-Bobea, Ariel ortiz-bo...@rff.org 
 wrote:
 Hello everyone,



 I'm trying to construct bins for each row in a matrix. I'm using apply() in 
 combination with hist() to do this. Performing this binning for a 10K-by-50 
 matrix takes about 5 seconds, but only 0.5 seconds for a 1K-by-500 matrix. 
 This suggests the bottleneck is accessing rows in apply() rather than the 
 calculations going on inside hist().



 My initial idea is to process as many columns (as make sense for the 
 intended use) at once. However, I still have many many rows to process and I 
 would appreciate any feedback on how to speed this up.



 Any thoughts?



 Thanks,



 Ariel



 Here is the illustration:



 # create data

 m1 <- matrix(10*rnorm(50*10^4), ncol=50)

 m2 <- matrix(10*rnorm(50*10^4), ncol=500)



 # compute bins

 bins - seq(-100,100,1)

 system.time({ out1 <- t(apply(m1,1, function(x) hist(x,breaks=bins, 
 plot=FALSE)$counts)) })

 system.time({ out2 <- t(apply(m2,1, function(x) hist(x,breaks=bins, 
 plot=FALSE)$counts)) })


 ---
 Ariel Ortiz-Bobea
 Fellow
 Resources for the Future
 1616 P Street, N.W.
 Washington, DC 20036



[R] Parsing XML file to data frame

2014-05-02 Thread starcraz
Hi all - I am trying to parse the attached XML file into a data frame.
The file is extracted from Hadoop File Services (HFS). I am new to using the
XML package, so I need some help in parsing the data. Below is some code
that I explored to get the attributes into a data frame. Any help is
appreciated.

library(XML)
temp <- xmlParseDoc("sample.xml")
temp.root <- xmlRoot(temp)
xmlName(temp.root)
xmlSize(temp.root) #21 child nodes
temp.root[[2]] #headers
temp.root[[2]][[2]] #extracts just the revision
temp.2 <- xmlToList(temp.root[[2]]) #extracts the info in temp.root[[2]] into a list
temp.2
temp.2.df <- xmlToDataFrame(temp.root[[2]]) #data frame of the list
temp.2.df
xmlValue(temp.root[[2]]) #string the values of the node inside [[2]]

temp.revision <- xmlValue(temp.root[[2]][["Revision"]])
temp.revision

test <- xmlTreeParse("sample.xml")
test




--
View this message in context: 
http://r.789695.n4.nabble.com/Parsing-XML-file-to-data-frame-tp4689883.html
Sent from the R help mailing list archive at Nabble.com.



[R] metafor package: adding additional information under a forest plot

2014-05-02 Thread Dietlinde Schmidt

Dear R-Users,

I have two questions about plotting a forest plot with the metafor package:

1) Is there a function (such as text()) to add additional information (e.g. 
heterogeneity statistics) under the forest plot?
2) I want to add various columns of additional information about each 
study (e.g. quality, time point, measure). I thought I could do that 
with the ilab argument, but I keep running into the line that 
restricts the plot, and although I have tweaked everything, I have not 
succeeded. Basically, I want the format of the forest plot to be wide 
rather than tall, because I have few studies but many columns with 
additional study information. Does someone know how to solve that (via 
xlim, ylim?)?


Thanks a lot for your help

Linde



[R] How to fix the format in write.csv

2014-05-02 Thread ChangJiang Xu
By default, write.csv will change characters such as 5/38 to a date,
May-38. How can I keep the format unchanged?
Thanks.

ChangJiang



[R] Writing to gzcon with rawConnection

2014-05-02 Thread Jamie Olson
I would like to encode/decode some text with deflate/gzip, but I'm
having trouble.

Decoding values from a rawConnection is fairly straightforward, but
I'm having trouble encoding values and then retrieving them.

> gz_out <- gzcon(raw_out <- rawConnection(raw(0),open="wb"))
> writeLines("test",con=gz_out)
> flush(gz_out)
> rawConnectionValue(raw_out)
Error: cannot allocate vector of size 131069.2 Gb
> raw_out <- rawConnection(raw(0),open="wb")
> writeLines("test",con=raw_out)
> rawConnectionValue(raw_out)
[1] 74 65 73 74 0a


Has anyone had success doing this?

--Jamie
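
If the goal is only in-memory compression, one possible route that avoids gzcon() entirely is base R's memCompress() and memDecompress(). This is a sketch, not a fix for the connection code above. Note also that type = "gzip" here is understood to produce zlib-style compressed bytes rather than a complete .gz file, so it may not suit every consumer.

```r
txt <- "test"
z <- memCompress(charToRaw(txt), type = "gzip")      # raw vector of compressed bytes
back <- rawToChar(memDecompress(z, type = "gzip"))   # round trip back to text
z
back
```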



Re: [R] How to fix the format in write.csv

2014-05-02 Thread Bert Gunter
Read your Excel documentation. AFAIK, R just writes text files -- you
need to tell Excel how to read them in.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 11:34 AM, ChangJiang Xu
changjiang.h...@gmail.com wrote:
 By default, write.csv will change characters such as 5/38 to a date,
 May-38. How can I keep the format unchanged?
 Thanks.

 ChangJiang



Re: [R] How to fix the format in write.csv

2014-05-02 Thread Bert Gunter
ChangJiang:

Open the .csv file with Notepad or other plain vanilla text processor,
NOT EXCEL. You will find that the columns are text. Excel
automatically converts them to dates. Read Excel's docs or get help
from someone to learn how to convert the dates back to text.

-- Bert



Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch
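
The point that R writes plain text can be checked from within R. The sketch below uses a small made-up data frame (the column names echo the thread, the values are illustrative) and a temporary file, then inspects the raw lines and reads them back as character data.

```r
# Toy data frame with slash-separated strings like those in the thread.
temp <- data.frame(
  Chr      = c(10, 18),
  AFF      = c("2/1/24", "1/9/17"),
  AFF.test = c("5/49", "11/43"),
  stringsAsFactors = FALSE
)

csv_path <- tempfile(fileext = ".csv")
write.csv(temp, file = csv_path, row.names = FALSE)

# Inspect the raw text that R wrote; the strings are quoted and unchanged.
raw_lines <- readLines(csv_path)
print(raw_lines)

# Reading the file back as character data recovers the original strings.
back <- read.csv(csv_path, colClasses = "character")
print(back)
```

If the raw lines still show "5/49", any "May-49" seen later was introduced by the program that opened the file, not by write.csv.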






Re: [R] How to fix the format in write.csv

2014-05-02 Thread Jim Lemon

Hi ChangJiang,
Date conversion is one of the biggest headaches with Excel. Even if you
import those data into Excel and then specify that the column should be
text format, it won't convert the values back into what they
originally were. Be aware that Excel may, when encountering dates from a
different locale, _silently_ convert the ones that aren't valid in its
current locale, usually by swapping the day and month values. That one
has bitten me when I have had to use Excel. The solution I found was to
always use international format (yyyy-mm-dd) as that didn't seem to be
altered.


Jim
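
A minimal sketch of producing the international format in R before exporting. The input strings are hypothetical and assumed to be day/month/year; the explicit format argument is what avoids any locale guessing.

```r
# Hypothetical day/month/year strings, parsed with an explicit format.
dmy <- c("03/05/2014", "25/12/2013")
dates <- as.Date(dmy, format = "%d/%m/%Y")

# Write them out as yyyy-mm-dd, which is unambiguous across locales.
iso <- format(dates, "%Y-%m-%d")
print(iso)
```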



Re: [R] extracting the first few eigenvectors

2014-05-02 Thread Mike Miller
Thank you, and thanks for writing that code!  That is the perfect answer 
to my question.  I hope the R development team will consider expanding the 
functionality of eigen() to include the option to retain only the first 
few eigenvectors and/or eigenvalues.  To me it seems very common in 
statistics to work with a few eigenvectors instead of with all of them.


Mike
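
For reference, the base-R approach from the original question can be sketched on a small made-up matrix as follows. It computes the full decomposition and then keeps the leading K columns, which is exactly the memory cost the question wants to avoid; for large matrices the eigs_sym() call from rARPACK mentioned in this thread is the intended replacement, shown here only as a comment because it needs that package.

```r
# Small symmetric test matrix (illustrative data only).
set.seed(1)
A <- matrix(rnorm(36), nrow = 6)
X <- crossprod(A)          # t(A) %*% A is symmetric, 6 x 6
K <- 2

# Full decomposition, then keep the K leading eigenpairs.
eig <- eigen(X, symmetric = TRUE)
K_eigenvectors <- eig$vectors[, 1:K, drop = FALSE]
K_eigenvalues  <- eig$values[1:K]
rm(eig)

# Sanity check: X v should equal lambda v for each retained pair.
residual <- X %*% K_eigenvectors -
  K_eigenvectors %*% diag(K_eigenvalues, nrow = K)
print(max(abs(residual)))

# With rARPACK installed, the thread suggests something like:
#   library(rARPACK)
#   eigs_sym(X, K)
```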


On Fri, 2 May 2014, Yixuan Qiu wrote:

> Exactly.
> The syntax is intended to mimic eigs() in Matlab and Octave.
>
> library(rARPACK)
> eigs(X, 10)       # If your X is of class dsyMatrix
> eigs_sym(X, 10)   # If X is of class matrix
>
> Best,
> Yixuan
>
> 2014-05-02 4:48 GMT-04:00 Berend Hasselman b...@xs4all.nl:
>
>> On 02-05-2014, at 09:17, Mike Miller mbmille...@gmail.com wrote:
>>
>>> I have a symmetric matrix, X, and I just want the first K eigenvectors
>>> (those associated with the K largest eigenvalues).  Clearly, this works:
>>>
>>> eigs <- eigen( X, symmetric=TRUE )
>>>
>>> K_eigenvectors <- eigs$vectors[ , 1:K ]
>>> K_eigenvalues <- eigs$values[ 1:K ]
>>>
>>> rm(eigs)
>>>
>>> In order to do that, I have to create the matrix eigs$vectors which is
>>> the same size as X.  Sometimes X has 100 million elements or more.
>>> Usually I want only the first 10 eigenvectors instead of all 10,000.  Is
>>> there any R function that will allow me to extract just the few that I
>>> want?  This would be analogous to the [V,d] = eigs(X,K) function in
>>> Octave/MATLAB.
>>
>> You can find possibly relevant stuff with
>>
>> library(sos)
>> findFn("eigenvalue")
>>
>> which is how I found package rARPACK, which may be exactly what you
>> need/want.
>>
>> Berend
>>
>>> Best,
>>> Mike


Re: [R] How to fix the format in write.csv

2014-05-02 Thread ChangJiang Xu
Thanks. I still didn't get the solution. For example, I have a data frame,
called temp
> temp
  Chr Ref Var     AFF    UNAFF AFF.test    pvalue
1  10   A   G  2/1/24  0/0/190     5/49 2.429e-09
2  18   G   A  1/9/17 0/23/167    11/43 2.484e-04
3   1   G   A  2/2/22  0/8/176     6/46 4.293e-04
4  11   T   G  1/1/25  0/2/188     3/51 1.193e-03
5   2   A   T 1/10/16 38/90/60    12/42 2.220e-03
6   3   G   A 1/16/10 8/49/133    18/36 4.549e-03

Then I want to write a csv file using the R function write.csv, as follows:
> write.csv(temp, file="temp.csv", row.names = FALSE)

and the csv file, temp.csv, looks like the below, not the same as the original
data:
Chr  Ref  Var  AFF         UNAFF     AFF.test  pvalue
10   A    G    02/01/2024  0/0/190   May-49    2.43E-09
18   G    A    01/09/2017  0/23/167  Nov-43    0.0002484
1    G    A    02/02/2022  0/8/176   Jun-46    0.0004293
11   T    G    01/01/2025  0/2/188   Mar-51    0.001193
2    A    T    01/10/2016  38/90/60  Dec-42    0.00222
3    G    A    1/16/10     8/49/133  18/36     0.004549




[R] SQL vs R

2014-05-02 Thread Dr Eberhard Lisse
Hi,

How do I do something like this without using sqldf?

a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

or

e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

greetings, el



Re: [R] speeding up applying hist() over rows of a matrix

2014-05-02 Thread Ortiz-Bobea, Ariel
This works great, thanks a lot!
-AOB
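
The speed-up discussed in this thread can be checked on a small made-up matrix. The sketch below compares the hist()-based version with the findInterval() version; f0 and f3 follow the functions quoted below, while the data size and seed are illustrative. One caveat that is not raised in the thread: cut() and hist() use right-closed intervals by default whereas findInterval() uses left-closed ones, so results can differ for values that fall exactly on a break, which should not occur with continuous random data like this.

```r
set.seed(42)
m <- matrix(10 * rnorm(200 * 50), ncol = 50)   # 200 rows, 50 columns
bins <- seq(-100, 100, 1)

# Original approach: one hist() call per row.
f0 <- function(m, bins) {
  t(apply(m, 1, function(x) hist(x, breaks = bins, plot = FALSE)$counts))
}

# Faster approach: bin the whole matrix once, then tabulate each row.
f3 <- function(m, bins) {
  nbins <- length(bins) - 1L
  idx <- array(findInterval(m, bins), dim = dim(m))
  t(apply(idx, 1, tabulate, nbins = nbins))
}

r0 <- f0(m, bins)
r3 <- f3(m, bins)

# Both results are 200 x 200 count matrices and should agree.
print(all(r0 == r3))
print(system.time(f0(m, bins)))
print(system.time(f3(m, bins)))
```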

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Friday, May 02, 2014 12:31 PM
To: Ortiz-Bobea, Ariel
Cc: r-help@r-project.org
Subject: Re: [R] speeding up applying hist() over rows of a matrix

And since as.integer(cut(x,bins)) is essentially findInterval(x,bins) (since we
throw away the labels made by cut()), I tried using findInterval instead of
cut() and it cut the time by more than half, so your 5.0 s. is now c. 0.1 s.

f3 <- function (m, bins)
{
    nbins <- length(bins) - 1L
    m <- array(findInterval(m, bins), dim = dim(m))
    t(apply(m, 1, tabulate, nbins = nbins))
}
> system.time(r3 <- f3(m1,bins))
   user  system elapsed
   0.09    0.00    0.09
> identical(r0,r3)
[1] TRUE

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, May 2, 2014 at 9:23 AM, William Dunlap wdun...@tibco.com wrote:
> Your original code, as a function of 'm' and 'bins' is
> f0 <- function (m, bins) {
>     t(apply(m, 1, function(x) hist(x, breaks = bins, plot = FALSE)$counts))
> }
> and the time it takes to run on your m1 is about 5 s. on my machine
> > system.time(r0 <- f0(m1,bins))
>    user  system elapsed
>    4.95    0.00    5.02
>
> hist(x,breaks=bins) is essentially tabulate(cut(x,bins),nbins=length(bins)-1).
> See how much it speeds things up by replacing hist() with tabulate(cut()):
> f1 <- function (m, bins)
> {
>     nbins <- length(bins) - 1L
>     t(apply(m, 1, function(x) tabulate(cut(x, bins), nbins = nbins)))
> }
> That doesn't help with the time, but it does give the same output
> > system.time(r1 <- f1(m1,bins))
>    user  system elapsed
>    4.85    0.10    5.35
> > identical(r0, r1)
> [1] TRUE
>
> Now try speeding it up by calling cut() on the whole matrix first and
> then applying tabulate to each row, as in
> f2 <- function (m, bins)
> {
>     nbins <- length(bins) - 1L
>     m <- array(as.integer(cut(m, bins)), dim = dim(m))
>     t(apply(m, 1, tabulate, nbins = nbins))
> }
> That saves quite a bit of time and gives the same output
> > system.time(r2 <- f2(m1,bins))
>    user  system elapsed
>    0.25    0.00    0.25
> > identical(r0, r2)
> [1] TRUE
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com


> On Thu, May 1, 2014 at 12:48 PM, Ortiz-Bobea, Ariel ortiz-bo...@rff.org
> wrote:
>> Hello everyone,
>>
>> I'm trying to construct bins for each row in a matrix. I'm using apply() in
>> combination with hist() to do this. Performing this binning for a 10K-by-50
>> matrix takes about 5 seconds, but only 0.5 seconds for a 1K-by-500 matrix.
>> This suggests the bottleneck is accessing rows in apply() rather than the
>> calculations going on inside hist().
>>
>> My initial idea is to process as many columns (as make sense for the
>> intended use) at once. However, I still have many many rows to process and I
>> would appreciate any feedback on how to speed this up.
>>
>> Any thoughts?
>>
>> Thanks,
>>
>> Ariel
>>
>> Here is the illustration:
>>
>> # create data
>> m1 <- matrix(10*rnorm(50*10^4), ncol=50)
>> m2 <- matrix(10*rnorm(50*10^4), ncol=500)
>>
>> # compute bins
>> bins <- seq(-100,100,1)
>> system.time({ out1 <- t(apply(m1,1, function(x) hist(x,breaks=bins,
>> plot=FALSE)$counts)) })
>> system.time({ out2 <- t(apply(m2,1, function(x) hist(x,breaks=bins,
>> plot=FALSE)$counts)) })
>>
>> ---
>> Ariel Ortiz-Bobea
>> Fellow
>> Resources for the Future
>> 1616 P Street, N.W.
>> Washington, DC 20036


[R] Making a panel of scatter plots in R

2014-05-02 Thread Charlie Nagle
Hello,

I'm trying to figure out how to panel scatter plots of participant data in R 
using the ggplot2 package. I do not want to display separate variables, I 
simply want to create a 4 by 4 or 5 by 5 grid of scatter plots that represent 
the data of each individual participant since I am looking at longitudinal data 
and wish to examine it visually.

I am currently generating basic plots with the function: 
p+geom_point()+facet_grid(~ID). This generates all of the plots, but in a 
single row that is not easy to visualize. Any help you can provide on the 
appropriate command syntax is greatly appreciated!

Sincerely,
Charlie



Re: [R] Making a panel of scatter plots in R

2014-05-02 Thread Jeff Newmiller
Are you looking for facet_wrap? facet_grid usually has two grouping variables...
---------------------------------------------------------------------------
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
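
A minimal sketch of the facet_wrap() suggestion, assuming ggplot2 is installed. The data frame and its columns (ID, time, score) are made up for illustration; only facet_wrap() and its ncol argument matter here.

```r
library(ggplot2)

# Made-up longitudinal data: 16 participants, 5 time points each.
set.seed(1)
dat <- data.frame(
  ID    = factor(rep(1:16, each = 5)),
  time  = rep(1:5, times = 16),
  score = rnorm(80)
)

p <- ggplot(dat, aes(x = time, y = score))

# facet_wrap() wraps the one-dimensional sequence of panels into a grid;
# ncol = 4 gives a 4-by-4 layout for 16 participants.
p_grid <- p + geom_point() + facet_wrap(~ ID, ncol = 4)
print(p_grid)
```

With 25 participants, ncol = 5 would give the 5-by-5 layout mentioned in the question.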



[R] Seeking well-commented versions of mgcv source code

2014-05-02 Thread Andrew Crane-Droesch
Dear List,

I'm looking for well-commented versions of various functions comprising 
mgcv, so that I can modify a piece of it for a project I'm working on.  
In particular I'm looking for

  * testStat
  * summary.gam
  * liu2
  * simf

Obviously I can find these by typing mgcv:::whatever.  But there are a 
lot of nested if statements, making it difficult to follow. Comments in 
the code describing exactly what is happening at each step would make my 
life a lot easier.

Where can I find more-detailed versions of the code?

Thanks,
Andrew




Re: [R] SQL vs R

2014-05-02 Thread Carlos Ortega
Hi,

With the new package dplyr you can write queries equivalent to the SQL syntax
you need.
You can find examples of how to apply it here:

http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/

http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/

Regards,
Carlos.
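
As an alternative to dplyr, both queries can also be written in base R. This is only a sketch: the data frame b and its columns c and f are made up to match the names in the question, and the output column name "count" is an arbitrary choice.

```r
# Made-up stand-in for the table b from the question.
b <- data.frame(
  c = c("d", "x", "d", "d"),
  f = c("u", "v", "u", "w"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
a <- sum(b$c == "d")

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# table() counts rows per value of f and orders the groups by f.
e <- as.data.frame(table(f = b$f), responseName = "count",
                   stringsAsFactors = FALSE)

print(a)
print(e)
```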








-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es



Re: [R] Seeking well-commented versions of mgcv source code

2014-05-02 Thread Jeff Newmiller
Read the Posting Guide, which points out that this list is for plain text 
emails so you should set your email program appropriately.

Read the Writing R Extensions document, which tells you how packages are 
constructed, and from which it will become clear that you want the tar.gz 
package file for whatever package you are interested in.
---------------------------------------------------------------------------
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
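
One way to get hold of the tar.gz source package from within R is sketched below. The helper name fetch_package_source and the default mirror URL are assumptions made for illustration; download.packages() and untar() are standard utils functions. The call itself is left commented out because it needs network access.

```r
# Hypothetical helper: download a package's source tarball from CRAN,
# unpack it, and return the directory holding its R source files.
fetch_package_source <- function(pkg, dest = tempdir(),
                                 repos = "https://cloud.r-project.org") {
  got <- download.packages(pkg, destdir = dest, type = "source",
                           repos = repos)
  tarball <- got[1, 2]             # second column holds the file path
  untar(tarball, exdir = dest)     # unpacks into dest/<pkg>/
  file.path(dest, pkg, "R")
}

# Example (requires network access):
# src_dir <- fetch_package_source("mgcv")
# list.files(src_dir)
```

The unpacked R files keep whatever comments the package author wrote, which the deparsed output of mgcv:::whatever generally does not show.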



Re: [R] How to fix the format in write.csv

2014-05-02 Thread Duncan Mackay
Hi

I second Jim's comments.

I have also been bitten by British dates being converted to American format,
as well as ANSI numeric values being converted to a date format which I did
not want.

And if you want to take a daily series from the 1890s to the 2000s, caveat
emptor.

If you are dealing with dates, use something that will give you a proper
output date format.

Also beware of old versions of Access exporting Excel files: they are really
HTML files.

Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au
 





Re: [R] SQL vs R

2014-05-02 Thread Bert Gunter
By making the effort to learn R?

See e.g. the "An Introduction to R" tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch






[R-es] thanks and an off-topic question

2014-05-02 Thread Fer
First of all, many thanks to those of you who replied to my problems
with maps :-D


Also, although this list is about R, I have a mathematical question that
perhaps you could answer. Is there a term in Spanish used for the
"state process" and the "observational process" in hierarchical models?
The old trick of going to the English Wikipedia page and switching to
Spanish has not worked, and I have no technical dictionary to settle the
question. I am writing a summary of my master's thesis in Spanish and do
not know how to translate or adapt these terms. Perhaps "proceso real"
for state process and "proceso observacional" for observational process?
They sound a bit clumsy...


Thanks in advance
Fer

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es