Re: [R] data.table error

2010-04-29 Thread Tom Short
Johannes, please try the latest version on R-forge (1.4). That error has been fixed, and it's much faster. We hope to have that to CRAN reasonably soon. To install, use: install.packages("data.table", repos="http://R-Forge.R-project.org") - Tom Tom Short On Thu, Apr 29, 2010 at 3:40 PM

Re: [R] Code is too slow: mean-centering variables in a data frame by subgroup

2010-04-07 Thread Tom Short
), with = FALSE], + function(x) x / mean(x, na.rm = TRUE)), + by = group] + } system.time(new.frame2 <- f2(frame)) # ave user system elapsed 0.50 0.08 1.24 system.time(new.frame3 <- f3(frame)) # data.table user system elapsed 0.25 0.01 0.30 - Tom Tom
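For readers without the full message, the ave-based approach timed above can be sketched in base R as follows. This is only an illustration: the data, the column names, and the helper scale_by_group are assumptions, not the f2 from the thread. Like the function in the excerpt, it divides each value by its within-group mean.

```r
# Hypothetical example data; the column names are assumptions for illustration.
frame <- data.frame(
  group = c("a", "a", "b", "b"),
  x     = c(1, 3, 10, 30),
  y     = c(2, 6, 5, 15)
)

# Divide every non-grouping column by its within-group mean using base R's ave().
# scale_by_group is a hypothetical helper name, not a function from the thread.
scale_by_group <- function(frame, group_col = "group") {
  value_cols <- setdiff(names(frame), group_col)
  frame[value_cols] <- lapply(frame[value_cols], function(col) {
    ave(col, frame[[group_col]], FUN = function(x) x / mean(x, na.rm = TRUE))
  })
  frame
}

scaled <- scale_by_group(frame)
print(scaled)
```

ave() returns a vector of the same length as its input, so the result can be assigned straight back into the data frame without any re-ordering step.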

Re: [R] Code is too slow: mean-centering variables in a data frame by subgroup

2010-04-07 Thread Tom Short
, 2010 at 3:46 PM, Tom Short tshort.rli...@gmail.com wrote: Here's how I would have done the data.table method. It's a bit faster than the ave approach on my machine: # install.packages("data.table", repos="http://R-Forge.R-project.org") library(data.table) f3 <- function(frame) { +   frame

Re: [R] rpad ?

2010-03-23 Thread Tom Short
). To get interactivity, the RApache approach requires a fair amount of javascript programming. Rpad gives you interactivity fairly automatically as a webpage with embedded R code. - Tom Tom Short On Tue, Mar 23, 2010 at 4:46 PM, Erich Neuwirth erich.neuwi...@univie.ac.at wrote: We are using

Re: [R] data.table evaluating columns

2010-03-02 Thread Tom Short
is quick). In the data table version, frame[,names[i], with=F] is the same as frame[,names[i], drop=FALSE] (the answer is a list, not a vector). Normally, it's easier to use [[]] or $ indexing to get this. Also, fname[i,j] <- something assignment is still a bit buggy for data.tables. - Tom Tom Short
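The list-versus-vector distinction can be seen with a plain data frame. The sketch below uses base R only, since data.table behaviour may differ between versions; the data and variable names are made up for illustration.

```r
# Hypothetical data frame; names are illustrative only.
frame <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
col_names <- names(frame)

# drop = FALSE keeps the result as a one-column data frame (a list).
as_table <- frame[, col_names[1], drop = FALSE]

# [[ ]] and $ return the column itself as a plain vector.
as_vector <- frame[[col_names[1]]]

print(class(as_table))   # class of the one-column result
print(class(as_vector))  # class of the extracted column
print(mean(as_vector))   # functions expecting a vector work directly
```

Functions such as mean() expect a vector, which is why [[ ]] or $ is usually the more convenient form when one column is needed.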

Re: [R] dramatic speed difference in lapply

2010-02-26 Thread Tom Short
I'm sorry, Rob, but that code is dense enough and formatted badly enough that it's hard to dig through. You may want to try the data.table package. The development version on R-forge is pretty fast for grouping operations like this. I'm not sure if this is what you're really after. It's hard to

Re: [R] how to fast extract values from different list elements

2010-02-25 Thread Tom Short
On Thu, Feb 25, 2010 at 4:10 AM, Heym, Peter-Paul ph...@ipb-halle.de wrote: this works fine but it is very slow (since A and B can be very large and I have to repeat this about 5000 times). I would like to make this faster using e.g. apply or lapply but I didn't get it work using these

Re: [R] how to rearrange a dataframe

2010-02-23 Thread Tom Short
Try this: a <- b <- read.table(textConnection("1 + name1 1 2 3 2 + name2 5 9 10 2 - name3 56 74 93 1 - name4 65 75 98"), skip=1, header=FALSE) swapidx <- with(a, (V1 == 2 & V2 == "+") | (V1 == 1 & V2 == "-")) b[swapidx,] <- b[swapidx, c(1:3,6:4)] This creates an indexing vector that identifies which rows
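Because the excerpt is flattened onto one line, a laid-out version may be easier to follow. The sketch below is a reconstruction under assumptions: the header line that skip=1 discards is hypothetical (the original is not shown in the excerpt), and colClasses is added only to make the column types explicit.

```r
# The first line is a hypothetical header; the original is not in the excerpt.
input <- "id strand name v1 v2 v3
1 + name1 1 2 3
2 + name2 5 9 10
2 - name3 56 74 93
1 - name4 65 75 98"

a <- b <- read.table(
  text = input, skip = 1, header = FALSE,
  colClasses = c("integer", "character", "character",
                 "integer", "integer", "integer")
)

# Logical index of the rows to change: (2, "+") or (1, "-").
swapidx <- with(a, (V1 == 2 & V2 == "+") | (V1 == 1 & V2 == "-"))

# For those rows, reverse the order of the last three columns.
b[swapidx, ] <- b[swapidx, c(1:3, 6:4)]
print(b)
```

Keeping the untouched copy in a means the index is computed from the original values, while b receives the rearranged rows.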

Re: [R] Large dataset importing, columns merging and splitting

2010-01-26 Thread Tom Short
If you need more aggregations on the stock (I assume that's what the first column is), I'd use the data.table package. It allows fast indexing and merge operations. That's handy if you have other features of a stock (like company size or industry sector) that you'd like to include in the
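As a rough illustration of attaching per-stock features before aggregating, here is a sketch in base R using merge() and aggregate() rather than data.table, so it does not show data.table's speed or syntax. All tickers, prices, and sectors are invented for the example.

```r
# Hypothetical data: tickers, prices, and sectors are made up for illustration.
prices <- data.frame(
  stock = c("AAA", "AAA", "BBB", "CCC"),
  price = c(10, 12, 20, 30)
)
features <- data.frame(
  stock  = c("AAA", "BBB", "CCC"),
  sector = c("tech", "tech", "energy")
)

# Attach the per-stock features to every price row.
merged <- merge(prices, features, by = "stock")

# Aggregate on the merged feature, here the mean price per sector.
sector_mean <- aggregate(price ~ sector, data = merged, FUN = mean)
print(sector_mean)
```

With data.table the same idea is expressed through keyed joins and grouped expressions, which is where the speed advantage mentioned above would come from.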

Re: [R] Getting file name from pdf device?

2009-08-01 Thread Tom Short
On Fri, Jul 31, 2009 at 8:49 AM, Rainer M Krug r.m.k...@gmail.com wrote: My question: how can I get the filename of the pdf from the device before it is closed? I've also looked for this and couldn't find a way. I had a similar use, where I wanted to get an R transcript with embedded plots in

Re: [R] Excel Export in a beauty way

2009-06-07 Thread Tom Short
for example), and more. To create HTML, you have several packages that can help you out: R2HTML, Rpad, hwriter, and xtable. Not everything might convert properly, so you may have to experiment. Data frames as tables normally convert nicely. - Tom Tom Short

Re: [R] Do you use R for data manipulation?

2009-05-06 Thread Tom Short
Another tool I find useful is Matthew Dowle's data.table package. It has very fast indexing, can have much lower memory requirements than a data frame, and has some built-in data manipulation capability. Especially with a 64-bit OS, you can use this to keep things in memory where you otherwise