Very sorry to hear this bit you. If you need a copy of names before
changing them by reference :
oldnames <- copy(names(DT))
This will be documented and it's on the bug list to do so. copy is
needed in other circumstances too, see ?copy.
More details here :
On 11/3/2011 3:30 PM, Brian Diggs wrote:
Well, I figured it out. Or at least got it working. I had to run
initexmf --mkmaps
because apparently there was something wrong with my font mappings. I don't
know why; I don't know how. But it works now. I think installing the font
into the
Hi Geoff,
Please see this part of the r-help posting guide :
For questions about functions in standard packages distributed with R
(see the FAQ "Add-on packages in R"), ask questions on R-help. If the
question relates to a contributed package, e.g., one downloaded from CRAN,
try contacting
AKJ,
Please see this recent answer :
http://r.789695.n4.nabble.com/data-table-vs-plyr-reg-output-tp4634518p4634865.html
Matthew
Using Josh's nice example, with data.table's built-in 'by' (optimised
grouping) yields a 6 times speedup (100 seconds down to 15 on
my netbook).
system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ x, data=d[.indx,])) }))
user system elapsed
144.501 0.300 145.525
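For comparison, a hedged sketch of the 'by' version on made-up data (requires data.table; the column and group names are illustrative, not Josh's original ones):

```r
library(data.table)

set.seed(1)
d <- data.table(grp = rep(1:50, each = 20),
                x   = rnorm(1000),
                y   = rnorm(1000))

# one lm() per group via built-in optimised grouping, no lapply(split(...))
all.2a <- d[, as.list(coef(lm(y ~ x))), by = grp]
```

The result has one row per group with columns grp, "(Intercept)" and "x".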
Hi Uwe,
When you cc from Nabble it doesn't show as cc'd on r-help. It's
a web form with an "Email this post to..." box. I asked Nabble
support (over a year ago) if they could reflect that in the cc field of
the post they send to r-help, with no luck.
The previous thread is cited automatically in
Ivo,
Also, perhaps FAQ 2.14 helps : "Can you explain further why
data.table is inspired by A[B] syntax in base?"
http://datatable.r-forge.r-project.org/datatable-faq.pdf
And, 2.15 and 2.16.
Matthew
Steve Lianoglou mailinglist.honey...@gmail.com wrote in message
Package plyr has .parallel.
Searching datatable-help for multicore, say on Nabble here,
http://r.789695.n4.nabble.com/datatable-help-f2315188.html
yields three relevant posts and examples.
Please check wiki do's and don'ts to make sure you didn't
fall into one of those traps, though (we don't
Joshua Wiley jwiley.ps...@gmail.com wrote in message
news:canz9z_kopuwkzb-zxr96pvulhhf2znxntxso9xnyho-_jum...@mail.gmail.com...
On Tue, Oct 4, 2011 at 12:40 AM, Rainer Schuermann
rainer.schuerm...@gmx.net wrote:
Any comments are very welcome,
3. If that fails, and nobody else has a better
Assuming you can install other packages ok, data.table depends on
R >= 2.12.0. Which version of R do you have?
_If_ that's the problem, does anyone know if anything prevents
R's error message from stating which dependency isn't satisfied? I think
I've seen users confused by this before, for other
This is the fastest data.table way I can think of :
ans = mydt[,list(mytime=.N),by=list(id,mygroup)]
ans[,censor:=0L]
ans[J(unique(id)), censor:=1L, mult="last"]
     id mygroup mytime censor
[1,]  1       A      1      1
[2,]  2       B      3      0
[3,]  2       C      3      0
[4,]  2       D
Adam,
because I did not have time to entirely test
Do you (or does your company) have an automated test suite in place?
R 2.10.0 is nearly two years old, and R 2.12.0 is nearly one.
Matthew
AdamMarczak adam.marc...@gmail.com wrote in message
news:1314385041626-3771731.p...@n4.nabble.com...
Hi Justin,
In data.table 1.6.1 there was this news item :
j's environment is now consistently reused so
that local variables may be set which persist
from group to group; e.g., incrementing a group
counter :
DT[,list(z,groupInd<-groupInd+1),by=x]
One of
To close this thread on-list :
packageVersion() was added to R in 2.12.0.
data.table's dependency on 2.12.0 is updated, thanks.
Matthew
Jesse Brown jesse.r.br...@lmco.com wrote in message
news:4e1b21a8.8090...@atl.lmco.com...
Matthew Dowle wrote:
Hi,
Try package 'data.table'. It has
Users of package 'unknownR' already know simplify2array was added in R
2.13.0.
They also know what else was added. Do you?
http://unknownr.r-forge.r-project.org/
Joshua Wiley jwiley.ps...@gmail.com wrote in message
news:canz9z_j+trwoim3scayuaruors+8hyc30pmt_thiex6qmto...@mail.gmail.com...
With data.table, the following is routine :
DT[order(a)]   # ascending
DT[order(-a)]  # descending, if a is numeric
DT[a>5,sum(z),by=c][order(-V1)]  # sum of z grouped by c, just where a>5, then show me the largest first
DT[order(-a,b)]  # order by a descending then by b ascending, if a and b are
Do you know how many functions there are in base R?
How many of them do you know you don't know?
Run unk() to discover your unknown unknowns.
It's fast and it's fun!
unknownR v0.2 is now on CRAN.
More information is on the homepage :
http://unknownr.r-forge.r-project.org/
Or, just install the
data.table offers fast subset, fast grouping and fast ordered joins in a
short and flexible syntax, for faster development. It was first released
in August 2008 and is now the 3rd most popular package on Crantastic
with 20 votes and 7 reviews.
* X[Y] is a fast join for large data.
*
Peter,
If the proprietary part of REvolution's product is ok, then surely
Stanislav's suggestion is too. No?
Matthew
peter dalgaard pda...@gmail.com wrote in message
news:be157cf5-9b4b-45a0-a7d4-363b774f1...@gmail.com...
On Apr 7, 2011, at 09:45 , Stanislav Bek wrote:
Hi,
is it
murdoch.dun...@gmail.com wrote in message
news:4d9da9ff.9020...@gmail.com...
On 07/04/2011 7:47 AM, Matthew Dowle wrote:
Peter,
If the proprietary part of REvolution's product is ok, then surely
Stanislav's suggestion is too. No?
Revolution has said that they believe they follow the GPL
Try data.table:::sortedmatch, which is implemented in C.
It requires its input to be sorted (and doesn't check)
Stavros Macrakis macra...@alum.mit.edu wrote in message
news:BANLkTi=j2lf5syxytv1dd4k9wr0zgk8...@mail.gmail.com...
Is there a generic binary search routine in a standard library
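If data.table is not an option, base R's findInterval is also implemented in C and does a binary-search-style lookup on a sorted vector (a sketch of the idea, not the sortedmatch internals):

```r
v <- c(2, 5, 7, 11)          # must be sorted, as with sortedmatch
findInterval(c(5, 8, 1), v)  # index of last element <= each value
# 2 3 0
```

A value below the smallest element returns 0, which makes "not found before" easy to detect.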
Hi,
One liners in data.table are :
x.dt[,lapply(.SD,mean),by="sample"]
sample replicate height weight age
[1,] A 2.0 12.2 0.503 6.00
[2,] B 1.5 12.75000 0.715 4.50
[3,] C 2.5 11.35250 0.5125000 3.75
[4,] D 2.0
Thanks!
Matthew Dowle wrote:
Thanks for the attempt and required output. How about this?
firststep = DT[,cbind(expand.grid(B,B),v=1/length(B)),by=C][Var1!=Var2]
setkey(firststep,Var1,Var2,C)
firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)]
setkey(firststep,Var1,Var2,C)
Thanks. How about this?
DT$B = factor(DT$B)
firststep = DT[,cbind(expand.grid(B,B),v=1/length(B),C=C[1]),by=A][Var1!=Var2]
setkey(firststep,Var1,Var2,C)
firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)]
setkey(firststep,Var1,Var2,C)
DT[,
Thanks for the attempt and required output. How about this?
firststep = DT[,cbind(expand.grid(B,B),v=1/length(B)),by=C][Var1!=Var2]
setkey(firststep,Var1,Var2,C)
firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)]
setkey(firststep,Var1,Var2,C)
DT[,
Mathijs,
To my eyes you seem to have repeated back what is already done.
More R and less English would help. In other words if it is not 2.5
you need, what is it? Please provide some input and state what the
output should be (and what you tried already).
Matthew
Hello Lars, (cc'd)
Did you ask maintainer(boot) first, as requested by the posting guide?
If you did, but didn't hear back, then please say so, so that we know
you did follow the guide. That maintainer is particularly active, and
particularly efficient though, so I doubt you didn't hear back.
Hello. One (of many) solution might be:
require(data.table)
DT = data.table(read.table(textConnection("A B C
1 1 a 1999
2 1 b 1999
3 1 c 1999
4 1 d 1999
5 2 c 2001
6 2 d 2001"),head=TRUE,stringsAsFactors=FALSE))
firststep =
There's a much shorter way. You don't need that ugly h() with all those $
and potential for bugs !
Using the original f :
dt[,lapply(.SD,f),by=key(dt)]
grp1 grp2 grp3 a b d
xxx 1.00 81.00 161.00
xxx 10.00 90.00
Looking at the timings by each stage may help :
system.time(dt <- data.table(dat))
   user  system elapsed
   1.20    0.28    1.48
system.time(setkey(dt, x1, x2, x3, x4, x5, x6, x7, x8))  # sort by the 8 columns (one-off)
   user  system elapsed
   4.72    0.94    5.67
system.time(udt
Hi Sean,
Try :
key(test.dt) = c("a","b")
Btw, the posting guide asks you to contact the maintainer of the package
before r-help. Otherwise r-help would fill up with posts about 2000+
packages (I guess is the reason). In this case maintainer(data.table)
returns
news:AANLkTik180p4YmBtR3QUCW7r=fdefxzbxsy3zwtik...@mail.gmail.com...
On Mon, Feb 7, 2011 at 5:54 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
Looking at the timings by each stage may help :
system.time(dt <- data.table(dat))
user system elapsed
1.20 0.28 1.48
system.time(setkey(dt, x1, x2
Hadley,
That's fine; please do. I'm happy to explain it offline where the
documentation or comments in the
code aren't sufficient. It's GPL code so you can take it and improve it, or
depend on it.
Whatever works for you. As long as (of course) you don't stand on its
shoulders and then
Note that a key is not actually required, so it's even simpler syntax :
dX = as.data.table(X)
dX[,length(unique(z)),by="x,y"]
x y V1
[1,] 1 1 2
[2,] 1 2 2
[3,] 2 3 2
[4,] 2 4 2
[5,] 3 5 2
[6,] 3 6 2
or passing list() syntax to the 'by' is exactly the same :
require(data.table)
DT = as.data.table(df)
# 1. Patients with ah and ihd
DT[,.SD["ah"%in%diagnosis & "ihd"%in%diagnosis],by=id]
     id diagnosis
[1,]  2        ah
[2,]  2       ihd
[3,]  2        im
[4,]  4        ah
[5,]  4       ihd
[6,]  4    angina
# 2. Patients with ah but no ihd
Try :
objects("package:base")
Also, as it happens, a new package called unknownR is in
development on R-Forge.
Its description says :
Do you know how many functions there are in base R?
How many of them do you know you don't know?
Run unk() to discover your unknown unknowns.
It's fast and
if I understand
correctly.
Matthew
Duncan Murdoch murdoch.dun...@gmail.com wrote in message
news:4cffca13.7070...@gmail.com...
Matthew Dowle wrote:
Might Wayland fix it in Narwhal ?
I hope those names mean something to Rainer, because they mean nothing to
me.
Duncan Murdoch
Duncan
Might Wayland fix it in Narwhal ?
Duncan Murdoch murdoch.dun...@gmail.com wrote in message
news:4cff7177.7030...@gmail.com...
On 08/12/2010 6:07 AM, Rainer M Krug wrote:
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
On 12/08/2010 12:05 PM, Duncan Murdoch wrote:
Rainer M Krug wrote:
Hi
Hello Alex,
Assuming it was just an inadequate example (since a data.frame would suffice
in that case), did you know that a data.frames' columns do not have to be
vectors but can be lists? I don't know if that helps.
DF = data.frame(a=1:3)
DF$b = list(pi, 2:3, letters[1:5])
DF
a
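The idea in full, runnable as-is (base R; lengths() shows each cell of the list column holds a whole object):

```r
DF <- data.frame(a = 1:3)
DF$b <- list(pi, 2:3, letters[1:5])  # a list column: one object per row
lengths(DF$b)  # 1 2 5
nrow(DF)       # still 3 rows
```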
Richard,
Try data.table. See the introduction vignette and the
presentations e.g. there is a slide showing a join to
183,000,000 observations of daily stock prices in
0.002 seconds.
data.table has fast rolling joins (i.e. fast last observation
carried forward) too. I see you asked about that on
Try data.table with the roll=TRUE argument.
Set your keys and then write :
futData[optData,roll=TRUE]
That is fast and as you can see, short. Works on
many millions and even billions of rows in R.
Matthew
http://datatable.r-forge.r-project.org/
Santosh Srinivas
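A small self-contained sketch of that rolling join (requires data.table; the table and column names here are made up, not Santosh's actual data):

```r
library(data.table)

futData <- data.table(id   = 1L,
                      date = as.IDate(c("2010-01-01", "2010-01-10")),
                      fut  = c(100, 101),
                      key  = c("id", "date"))
optData <- data.table(id   = 1L,
                      date = as.IDate("2010-01-05"),
                      opt  = 5,
                      key  = c("id", "date"))

# roll=TRUE carries the last fut observation forward to 2010-01-05
futData[optData, roll = TRUE]
```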
All the solutions in this thread so far use the lapply(split(...)) paradigm
either directly or indirectly. That paradigm doesn't scale. That's the
likely
source of quite a few 'out of memory' errors and performance issues in R.
data.table doesn't do that internally, and its syntax is pretty
+ep8ubu3mxxhhrd...@mail.gmail.com...
On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
All the solutions in this thread so far use the lapply(split(...))
paradigm
either directly or indirectly. That paradigm doesn't scale. That's the
likely
source of quite a few
Wiley wrote:
On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
All the solutions in this thread so far use the lapply(split(...))
paradigm
either directly or indirectly. That paradigm doesn't scale. That's the
likely
source of quite a few 'out of memory' errors
To: r-help
Cc: Jeff, Matt, Duncan, Hadley [ using Nabble to cc ]
Jeff, Matt,
How about the 'refdata' class in package ref.
Also, Hadley's immutable data.frame in plyr 1.1.
Both allow you to refer to subsets of a data.frame or matrix by reference I
believe, if I understand correctly.
Another option for consideration :
library(data.table)
mydt = as.data.table(mydf)
mydt[,as.list(coef(lm(y~x1+x2+x3))),by=fac]
fac X.Intercept. x1 x2 x3
[1,] 0 -0.16247059 1.130220 2.988769 -19.14719
[2,] 1 0.08224509 1.216673 2.847960 -19.16105
[3,] 2
Is this what you mean?
x=c(1,2,2,3,4,5,6,3,2,1)
y=c(2,3,4,2,1,2,3,4,5,6)
matplot(cbind(x,y),type="l")
which(diff(sign(x-y))!=0)+1
[1] 4 8
--
View this message in context:
http://r.789695.n4.nabble.com/Finding-points-where-two-timeseries-cross-over-tp2313257p2313510.html
Sent from the R help
since you are on 64bit. I was working on the basis of squeezing into 32bit.
Matthew
Matthew Dowle mdo...@mdowle.plus.com wrote in message
news:i1faj2$lv...@dough.gmane.org...
Hi Juliet,
Thanks for the info.
It is very slow because of the == in testData[testData$V2==one_ind,]
Why? Imagine
Hi Juliet,
Thanks for the info.
It is very slow because of the == in testData[testData$V2==one_ind,]
Why? Imagine someone looks for 10 people in the phone directory. Would
they search the entire phone directory for the first person's phone number,
starting
on page 1, looking at every single
Hi Ted,
Well since you mentioned data.table (!) ...
If risk_input is a data.table consisting of 3 columns (m_id, sale_date,
return_date) where the dates
are of class IDate (recently added to data.table by Tom) then try :
risk_input[, fitdistr(return_date-sale_date,"normal"), by=list(m_id,
dt = data.table(d,key="grp1,grp2")
system.time(ans1 <- dt[ , list(mean(x),mean(y)) , by=list(grp1,grp2)])
   user  system elapsed
   3.89    0.00    3.91  # your 7.064 is 12.23 for me though, so this
3.9 should be faster for you
However, Rprof() shows that 3.9 is mostly dispatch of mean to
William,
Try a rolling join in data.table, something like this (untested) :
setkey(Data, UnitID, TranDt)  # sort by unit then date
previous = transform(Data, TranDt=TranDt-1)
Data[previous,roll=TRUE]  # lookup the prevailing date before, if any,
for each row within that row's UnitID
data.table is an enhanced data.frame with fast subset, fast
grouping and fast merge. It uses a short and flexible syntax
which extends existing R concepts.
Example:
DT[a>3,sum(b*c),by=d]
where DT is a data.table with 4 columns (a,b,c,d).
data.table 1.4.1 :
* grouping is now 10+ times faster
I don't know about that, but try this :
install.packages("data.table", repos="http://R-Forge.R-project.org")
require(data.table)
summaries = data.table(summaries)
summaries[,sum(counts),by=symbol]
Please let us know if that returns the correct result, and if its
memory/speed is ok ?
Matthew
Steve Lianoglou mailinglist.honey...@gmail.com wrote in message
news:t2ybbdc7ed01004290812n433515b5vb15b49c170f5a...@mail.gmail.com...
Thanks for directing me to the data.table package. I read through some
of the vignettes, and it looks quite nice.
While your sample code would provide
Or try data.table 1.4 on r-forge, its grouping is faster than aggregate :
          agg datatable
X10     0.012     0.008
X100    0.020     0.008
X1000   0.172     0.020
X10000  1.164     0.144
X1e+05  9.397     1.180
install.packages("data.table", repos="http://R-Forge.R-project.org")
Please install v1.3 from R-forge :
install.packages("data.table",repos="http://R-Forge.R-project.org")
It will be ready for CRAN soon.
Please follow up on datatable-h...@lists.r-forge.r-project.org
Matthew
bo bozha...@hotmail.com wrote in message
news:1270689586866-1755876.p...@n4.nabble.com...
Hi Dimitri,
A start has been made at explaining .SD in FAQ 2.1. This was previously on a
webpage, but it's just been moved to a vignette :
https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/branch2/inst/doc/faq.pdf?rev=68root=datatable
Please note: that vignette is part of a
someone else on this list may be able to give you a ballpark estimate
of how much RAM this merge would require.
I don't have an absolute estimate, but try data.table::merge, as it needs
less working memory than base::merge.
20 million rows of 5 columns isn't beyond 32bit :
(1*4 +
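The cut-off calculation aside, here is a rough back-of-envelope of my own (assuming, hypothetically, that all five columns were 4-byte integers):

```r
rows  <- 20e6  # 20 million rows
cols  <- 5
bytes <- 4     # per integer cell
rows * cols * bytes / 2^20  # about 381 MiB for the data itself
```

Real columns (doubles, character) would take more, but the point stands: the data alone is well within a 32-bit address space.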
Rob,
Please look again at Romain's reply to you on 19th March. He informed you
then that Rcpp has its own dedicated mailing list and he gave you the link.
Matthew
R_help Help rhelp...@gmail.com wrote in message
news:ad1ead5f1003291753p68d6ed52q572940f13e1c0...@mail.gmail.com...
Hi,
I'm a
FWIW, I think the problem is fixed on the Rcpp 0.7.11 version (on cran
incoming)
Romain
Le 01/04/10 17:47, Matthew Dowle a écrit :
Rob,
Please look again at Romain's reply to you on 19th March. He informed you
then that Rcpp has its own dedicated mailing list and he gave you the
link
Ashley,
This appears to be your first post to this list. Welcome to R. Over 2 days
is quite a long time to wait though, so you are unlikely to get a reply now.
Feedback: since nlrq is in package quantreg, it's a question about a package
and should be sent to the package maintainer. Some
M Joshi,
I don't know but I guess that some might have looked at your previous thread
on 14 March (also about the geoR package). You received help and good advice
then, but it doesn't appear that you are following it. It appears to be a
similar problem this time.
Also, this list is the wrong
Abraham,
This appears to be your 3rd unanswered post to r-help in March, all 3 have
been about the Zelig package.
Please read the posting guide and find out the correct place to send
questions about packages. Then you might get an answer.
HTH
Matthew
Mathew, Abraham T amat...@ku.edu wrote
You may not have got an answer because you posted to the wrong place. It's a
question about a package. Please read the posting guide.
miriza miri...@sfwmd.gov wrote in message
news:1269886286228-1695430.p...@n4.nabble.com...
Hi!
I am using geeglm to fit a Poisson model to a timeseries of
Contact the authors of those packages ?
miriza miri...@sfwmd.gov wrote in message
news:1269981675252-1745896.p...@n4.nabble.com...
Hi!
I was wondering if there were any packages that would allow me to fit a
GEE
to a single timeseries of counts so that I could account for
autocorrelation
Apparently not, since this your 3rd unanswered thread to r-help this month
about this package.
Please read the posting guide and find out where you should send questions
about packages. Then you might get an answer.
ping chen chen1984...@yahoo.com.cn wrote in message
Geelman,
This appears to be your first post to this list. Welcome to R. Nearly 2 days
is quite a long time to wait though, so you are unlikely to get a reply now.
Feedback : the question seems quite vague and imprecise. It depends on which
R you mean (32bit/64bit) and how much ram you have.
Val,
Type "combine two data sets" (the text you wrote in your post) into
www.rseek.org. The first two links are: "Quick-R: Merge" and "Merging data:
A tutorial". Isn't it quicker for you to use rseek, rather than the time it
takes to write a post and wait for a reply ? Don't you also get more
The type of 'NA' is logical. So x[NA] behaves more like x[TRUE] i.e. silent
recycling.
class(NA)
[1] logical
x=101:108
x[NA]
[1] NA NA NA NA NA NA NA NA
x[c(TRUE,NA)]
[1] 101 NA 103 NA 105 NA 107 NA
x[as.integer(NA)]
[1] NA
HTH
Matthew
Barry Rowlingson b.rowling...@lancaster.ac.uk
Nick,
Good question, but just sent to the wrong place. The posting guide asks you
to contact the package maintainer first before posting to r-help only if you
don't hear back. I guess one reason for that is that if questions about all
2000+ packages were sent to r-help, then r-help's traffic
When you click search on the R homepage, type mosaic into the box, and
click the button, do the top 3 links seem relevant ?
Your previous 2 requests for help :
26 Feb : Response was SuppDists. Yet that is the first hit returned by the
subject line you posted : Hartleys table
22 Feb :
Here are some references. Please read these first and post again if you are
still stuck after reading them. If you do post again, we will need x and y.
1. Introduction to R : 9.2.1 Conditional execution: if statements.
2. R Language Definition : 3.2 Control structures.
3. R for beginners by E
Ricardo,
I see you got no public answer so far, on either of the two lists you posted
to at the same time yesterday. You are therefore unlikely to ever get a
reply.
I also see you've been having trouble getting answers in the past, back to
Nov 09, at least. For example no reply to Credit
Your choice of subject line alone shows some people that you missed some
small details from the posting guide. The ability to notice small details
may be important for you to demonstrate in future. Any answer in this
thread is unlikely to be found by a topic search on subject lines alone
This list is the wrong place for that question. The posting guide tells
you, in bold, to contact the package maintainer first.
If you had already done that, and didn't hear back from him, then you
should tell us, so that we know you followed the guide.
Corey Sparks corey.spa...@utsa.edu
Welcome to R, Barbara. It's quite an incredible community from all walks of
life.
Your beginner questions are answered in the manual. See Introduction to R.
Please read the posting guide again because it contains lots of good advice
for you. Some people read it three times before posting
Thanks for making it quickly reproducible - I was able to see that message
in English within a few seconds.
The start has x=86, but the data is also called x. Remove x=86 from start
and you get a different error.
P.S. - please do include the R version information. It saves time for us,
and we
This post breaks the posting guide in multiple ways. Please read it again
(and then again) - in particular the first 3 paragraphs. You will help
yourself by following it.
The solution is right there in the help page for ?data.frame and other
places including Introduction to R. I think its
Frank, I respect your views but I agree with Gabor. The posting guide does
not support your views.
It is not any of our views that are important but we are following the
posting guide. It covers affiliation. It says only that some consider it
good manners to include a concise signature
Matthew Dowle mdo...@mdowle.plus.com 3/5/2010 12:58 PM
Frank, I respect your views but I agree with Gabor. The posting guide
does
not support your views.
It is not any of our
I'd go a bit further and remind that the r-help posting guide is clear :
For questions about functions in standard packages distributed with R
(see the FAQ "Add-on packages in R"), ask questions on R-help.
If the question relates to a contributed package, e.g., one downloaded from
CRAN, try
appear to be correct. Or just directly sending an email to all of you?
Thanks again,
Rob
On Wed, Mar 3, 2010 at 6:05 AM, Matthew Dowle
mdo...@mdowle.plus.comwrote:
I'd go a bit further and remind that the r-help posting guide is clear :
For questions about functions in standard packages
Dieter,
One way to check if a package is active, is by looking on r-forge. If you
are referring to data.table you would have found it is actually very active
at the moment and is far from abandoned.
What you may be referring to is a warning, not an error, with v1.2 on
R2.10+. That was fixed
I agree with Jim. The term "do analysis" is almost meaningless; the posting
guide makes reference to statements such as that. At least he tried to
define "large", but inconsistently (first of all 850MB, then changed to
10-20-15GB).
Satish wrote: "at one time I will need to load say 15GB into R"
I can't help you further than what's already been posted to you. Maybe
someone else can.
Best of luck.
Satish Vadlamani satish.vadlam...@fritolay.com wrote in message
news:1265397089104-1470667.p...@n4.nabble.com...
Matthew:
If it is going to help, here is the explanation. I have an end state
Yes.
data.df[,wcol,drop=FALSE]
For an explanation of drop see ?"[.data.frame"
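In full (base R, runnable as-is; the column names are illustrative, not Chuck's):

```r
data.df <- data.frame(w = 1:3, z = 4:6)
class(data.df[, "w"])                # "integer": dimensions dropped to a vector
class(data.df[, "w", drop = FALSE])  # "data.frame": one-column frame kept
```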
Chuck White chuckwhi...@charter.net wrote in message
news:20100202212800.o8xbu.681696.r...@mp11...
Additional clarification: the problem only comes when you have one column
selected from the original dataframe. You
should not be important as long as
you can do what you want. SQL is declarative so you just specify what
you want rather than how to get it and invisibly to the user it
automatically draws up a query plan and then uses that plan to get the
result.
On Wed, Jan 27, 2010 at 12:48 PM, Matthew Dowle
and use is to hide the implementation and focus on the problem.
That is why we use high level languages, object orientation, etc.
On Thu, Jan 28, 2010 at 4:37 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
How it represents data internally is very important, depending on the real
goal :
http
it's even faster.
On Thu, Jan 28, 2010 at 8:52 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
Are you claiming that SQL is that utopia? SQL is a row store. It cannot
give the user the benefits of column store.
For example, why does SQL take 113 seconds in the example in this thread :
http
:971536df1001270629w4795da89vb7d77af6e4e8b...@mail.gmail.com...
On Wed, Jan 27, 2010 at 8:56 AM, Matthew Dowle mdo...@mdowle.plus.com
wrote:
How many columns, and of what type are the columns ? As Olga asked too, it
would be useful to know more about what you're really trying to do.
3.5m rows is not actually
Please re-read the posting guide e.g. you didn't provide an example data set
or a way to generate one, or any R version information.
Werner W. pensterfuz...@yahoo.de wrote in message
news:646146.32238...@web23002.mail.ird.yahoo.com...
Hi,
I have browsed the help list and looked at the FAQ
?merge
plyr
data.table
sqldf
crantastic
Dr. Viviana Menzel vivianamen...@gmx.de wrote in message
news:4b58a0e9.3050...@gmx.de...
Hello R-help group,
I have a question about merging lists. I have two lists:
Genes list (hSgenes)
name chr strand start end transStart transEnd
specific
function), but don't worry I won't forget. As you said, it only works if
users contribute to it. That's the power of R!
Ivan
Le 1/21/2010 19:01, Matthew Dowle a écrit :
One way is :
dataset = data.table(ssfamed)
dataset[, whatever some functions are on Asfc, Smc, epLsar, etc
Fantastic. You're much more likely to get a response now. Best of luck.
werner w pensterfuz...@yahoo.de wrote in message
news:1264175935970-1100164.p...@n4.nabble.com...
Thanks Matthew, you are absolutely right.
I am working on Windows XP SP2 32bit with R version 2.9.1.
Here is an
:18, Matthew Dowle a écrit :
Great.
If you mean the crantastic r package, sorry I wasn't clear, I meant the
crantastic website http://crantastic.org/.
If you meant the description of plyr then if the description looks useful
then click the link taking you to the package documentation and read
One way is :
dataset = data.table(ssfamed)
dataset[, whatever some functions are on Asfc, Smc, epLsar, etc ,
by="SPECSHOR,BONE"]
Your SPECSHOR and BONE names will be in your result alongside the results of
the whatever ...
Or try package plyr which does this sort of thing too. And sqldf may
but I have thousands of results so it would be really handy to find a way of
doing this quickly
it's a little difficult to follow those examples
Given your data in data.frame DF, maybe add the following to your list to
investigate :
dat = data.table(DF)
dat[, cor(Score1,Score2),
The user wrote in their first post :
I have a lot of observations in my dataset
Here's one way to do it with a data.table :
a=data.table(a)
ans = a[ , list(dt=dt[dt-min(dt)>7]) , by="var1,var2,var3"]
class(ans$dt) = "Date"
Timings are below comparing the 3 methods. In this
Sounds like a good idea. Would it be possible to give an example of how to
combine plyr with data.table, and why that is better than a data.table only
solution ?
hadley wickham h.wick...@gmail.com wrote in message
news:f8e6ff051001200624r2175e38xf558dc8fa3fb6...@mail.gmail.com...
Note that in