Re: [R] Confusing behaviour in data.table: unexpectedly changing variable

2013-09-25 Thread Matthew Dowle
Very sorry to hear this bit you. If you need a copy of names before changing them by reference : oldnames <- copy(names(DT)) This will be documented and it's on the bug list to do so. copy is needed in other circumstances too, see ?copy. More details here :
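A minimal sketch of the advice above, assuming the data.table package is installed. The table DT and its column names are made up for illustration and are not from the original thread.

```r
library(data.table)

# Hypothetical table, for illustration only.
DT <- data.table(a = 1:3, b = 4:6)

# copy() takes an independent snapshot of the names, so a later
# by-reference rename cannot alter it.
oldnames <- copy(names(DT))

# setnames() renames the column in place, without copying DT.
setnames(DT, "a", "A")

oldnames    # the names as they were before the rename
names(DT)   # the current names
```

Without copy(), the saved vector may share memory with the table's names, which is the surprise described in the post.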

Re: [R] Problem with R CMD check and the inconsolata font business

2013-03-05 Thread Matthew Dowle
On 11/3/2011 3:30 PM, Brian Diggs wrote: Well, I figured it out. Or at least got it working. I had to run initexmf --mkmaps because apparently there was something wrong with my font mappings. I don't know why; I don't know how. But it works now. I think installing the font into the

Re: [R] data.table vs plyr reg output

2012-06-29 Thread Matthew Dowle
Hi Geoff, Please see this part of the r-help posting guide : For questions about functions in standard packages distributed with R (see the FAQ Add-on packages in R), ask questions on R-help. If the question relates to a contributed package , e.g., one downloaded from CRAN, try contacting

Re: [R] how to convert list of matrix (raster:extract o/p) to data table with additional colums (polygon Id, class)

2012-06-29 Thread Matthew Dowle
AKJ, Please see this recent answer : http://r.789695.n4.nabble.com/data-table-vs-plyr-reg-output-tp4634518p4634865.html Matthew -- View this message in context:

Re: [R] SLOW split() function

2011-10-13 Thread Matthew Dowle
Using Josh's nice example, with data.table's built-in 'by' (optimised grouping) yields a 6 times speedup (100 seconds down to 15 on my netbook). system.time(all.2b <- lapply(si, function(.indx) { coef(lm(y ~ + x, data=d[.indx,])) })) user system elapsed 144.501 0.300 145.525

Re: [R] How to map current Europe?

2011-10-13 Thread Matthew Dowle
Hi Uwe, When you cc from Nabble it doesn't show as cc'd on r-help. It's a web form with an Email this post to... box. I asked Nabble support (over a year ago) if they could reflect that in the cc field of the post they send to r-help, with no luck. The previous thread is cited automatically in

Re: [R] fast or space-efficient lookup?

2011-10-10 Thread Matthew Dowle
Ivo, Also, perhaps FAQ 2.14 helps : Can you explain further why data.table is inspired by A[B] syntax in base? http://datatable.r-forge.r-project.org/datatable-faq.pdf And, 2.15 and 2.16. Matthew Steve Lianoglou mailinglist.honey...@gmail.com wrote in message

Re: [R] multicore by(), like mclapply?

2011-10-10 Thread Matthew Dowle
Package plyr has .parallel. Searching datatable-help for multicore, say on Nabble here, http://r.789695.n4.nabble.com/datatable-help-f2315188.html yields three relevant posts and examples. Please check wiki do's and don'ts to make sure you didn't fall into one of those traps, though (we don't

Re: [R] Efficient way to do a merge in R

2011-10-04 Thread Matthew Dowle
Joshua Wiley jwiley.ps...@gmail.com wrote in message news:canz9z_kopuwkzb-zxr96pvulhhf2znxntxso9xnyho-_jum...@mail.gmail.com... On Tue, Oct 4, 2011 at 12:40 AM, Rainer Schuermann rainer.schuerm...@gmx.net wrote: Any comments are very welcome, 3. If that fails, and nobody else has a better

Re: [R] cannot install.packages(data.table)

2011-10-04 Thread Matthew Dowle
Assuming you can install other packages ok, data.table depends on R >= 2.12.0. Which version of R do you have? _If_ that's the problem, does anyone know if anything prevents R's error message from stating which dependency isn't satisfied? I think I've seen users confused by this before, for other

Re: [R] formatting a 6 million row data set; creating a censoring variable

2011-09-01 Thread Matthew Dowle
This is the fastest data.table way I can think of : ans = mydt[,list(mytime=.N),by=list(id,mygroup)] ans[,censor:=0L] ans[J(unique(id)), censor:=1L, mult="last"] id mygroup mytime censor [1,] 1 A 1 1 [2,] 2 B 3 0 [3,] 2 C 3 0 [4,] 2 D

Re: [R] ddply from plyr package - any alternatives?

2011-08-30 Thread Matthew Dowle
Adam, because I did not have time to entirely test Do you (or does your company) have an automated test suite in place? R 2.10.0 is nearly two years old, and R 2.12.0 is nearly one. Matthew AdamMarczak adam.marc...@gmail.com wrote in message news:1314385041626-3771731.p...@n4.nabble.com...

Re: [R] Sequential Naming of ggplot .pngs using plyr

2011-08-11 Thread Matthew Dowle
Hi Justin, In data.table 1.6.1 there was this news item : j's environment is now consistently reused so that local variables may be set which persist from group to group; e.g., incrementing a group counter : DT[,list(z,groupInd<-groupInd+1),by=x] One of

Re: [R] EXTERNAL: Re: subset with aggregate key

2011-07-13 Thread Matthew Dowle
To close this thread on-list : packageVersion() was added to R in 2.12.0. data.table's dependency on 2.12.0 is updated, thanks. Matthew Jesse Brown jesse.r.br...@lmco.com wrote in message news:4e1b21a8.8090...@atl.lmco.com... Matthew Dowle wrote: Hi, Try package 'data.table'. It has

Re: [R] manipulating by lists and ave() functions

2011-07-11 Thread Matthew Dowle
Users of package 'unknownR' already know simplify2array was added in R 2.13.0. They also know what else was added. Do you? http://unknownr.r-forge.r-project.org/ Joshua Wiley jwiley.ps...@gmail.com wrote in message news:canz9z_j+trwoim3scayuaruors+8hyc30pmt_thiex6qmto...@mail.gmail.com...

Re: [R] Simple order() data frame question.

2011-05-12 Thread Matthew Dowle
With data.table, the following is routine : DT[order(a)] # ascending DT[order(-a)] # descending, if a is numeric DT[a>5,sum(z),by=c][order(-V1)] # sum of z group by c, just where a>5, then show me the largest first DT[order(-a,b)] # order by a descending then by b ascending, if a and b are

[R] [R-pkgs] unknownR : you didn't know you didn't know?

2011-04-28 Thread Matthew Dowle
Do you know how many functions there are in base R? How many of them do you know you don't know? Run unk() to discover your unknown unknowns. It's fast and it's fun! unknownR v0.2 is now on CRAN. More information is on the homepage : http://unknownr.r-forge.r-project.org/ Or, just install the

[R] [R-pkgs] data.table 1.6 is now on CRAN

2011-04-28 Thread Matthew Dowle
data.table offers fast subset, fast grouping and fast ordered joins in a short and flexible syntax, for faster development. It was first released in August 2008 and is now the 3rd most popular package on Crantastic with 20 votes and 7 reviews. * X[Y] is a fast join for large data. *

Re: [R] R licence

2011-04-07 Thread Matthew Dowle
Peter, If the proprietary part of REvolution's product is ok, then surely Stanislav's suggestion is too. No? Matthew peter dalgaard pda...@gmail.com wrote in message news:be157cf5-9b4b-45a0-a7d4-363b774f1...@gmail.com... On Apr 7, 2011, at 09:45 , Stanislav Bek wrote: Hi, is it

Re: [R] R licence

2011-04-07 Thread Matthew Dowle
murdoch.dun...@gmail.com wrote in message news:4d9da9ff.9020...@gmail.com... On 07/04/2011 7:47 AM, Matthew Dowle wrote: Peter, If the proprietary part of REvolution's product is ok, then surely Stanislav's suggestion is too. No? Revolution has said that they believe they follow the GPL

Re: [R] General binary search?

2011-04-05 Thread Matthew Dowle
Try data.table:::sortedmatch, which is implemented in C. It requires its input to be sorted (and doesn't check) Stavros Macrakis macra...@alum.mit.edu wrote in message news:BANLkTi=j2lf5syxytv1dd4k9wr0zgk8...@mail.gmail.com... Is there a generic binary search routine in a standard library

Re: [R] How to calculate means for multiple variables in samples with different sizes

2011-03-11 Thread Matthew Dowle
Hi, One liners in data.table are : x.dt[,lapply(.SD,mean),by=sample] sample replicate height weight age [1,] A 2.0 12.2 0.503 6.00 [2,] B 1.5 12.75000 0.715 4.50 [3,] C 2.5 11.35250 0.5125000 3.75 [4,] D 2.0

Re: [R] Transforming relational data

2011-02-22 Thread Matthew Dowle
Thanks! Matthew Dowle wrote: Thanks for the attempt and required output. How about this? firststep = DT[,cbind(expand.grid(B,B),v=1/length(B)),by=C][Var1!=Var2] setkey(firststep,Var1,Var2,C) firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)] setkey(firststep,Var1,Var2,C

Re: [R] Transforming relational data

2011-02-22 Thread Matthew Dowle
Thanks. How about this? DT$B = factor(DT$B) firststep = DT[,cbind(expand.grid(B,B),v=1/length(B),C=C[1]),by=A][Var1!=Var2] setkey(firststep,Var1,Var2,C) firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)] setkey(firststep,Var1,Var2,C) DT[,

Re: [R] Transforming relational data

2011-02-21 Thread Matthew Dowle
Thanks for the attempt and required output. How about this? firststep = DT[,cbind(expand.grid(B,B),v=1/length(B)),by=C][Var1!=Var2] setkey(firststep,Var1,Var2,C) firststep = firststep[,transform(.SD,cv=cumsum(v)),by=list(Var1,Var2)] setkey(firststep,Var1,Var2,C) DT[,

Re: [R] Transforming relational data

2011-02-17 Thread Matthew Dowle
Mathijs, To my eyes you seem to have repeated back what is already done. More R and less English would help. In other words if it is not 2.5 you need, what is it? Please provide some input and state what the output should be (and what you tried already). Matthew -- View this message in

Re: [R] boot.ci error with large data sets

2011-02-16 Thread Matthew Dowle
Hello Lars, (cc'd) Did you ask maintainer(boot) first, as requested by the posting guide? If you did, but didn't hear back, then please say so, so that we know you did follow the guide. That maintainer is particularly active, and particularly efficient though, so I doubt you didn't hear back.

Re: [R] Transforming relational data

2011-02-15 Thread Matthew Dowle
Hello. One (of many) solution might be: require(data.table) DT = data.table(read.table(textConnection(A B C 1 1 a 1999 2 1 b 1999 3 1 c 1999 4 1 d 1999 5 2 c 2001 6 2 d 2001),head=TRUE,stringsAsFactors=FALSE)) firststep =

Re: [R] Convert the output of by() to a data frame

2011-02-08 Thread Matthew Dowle
There's a much shorter way. You don't need that ugly h() with all those $ and potential for bugs ! Using the original f : dt[,lapply(.SD,f),by=key(dt)] grp1 grp2 grp3 a b d xxx 1.00 81.00 161.00 xxx 10.00 90.00

Re: [R] aggregate function - na.action

2011-02-07 Thread Matthew Dowle
Looking at the timings by each stage may help : system.time(dt <- data.table(dat)) user system elapsed 1.20 0.28 1.48 system.time(setkey(dt, x1, x2, x3, x4, x5, x6, x7, x8)) # sort by the 8 columns (one-off) user system elapsed 4.72 0.94 5.67 system.time(udt

Re: [R] using character vector as input argument to setkey (data.tablepakcage)

2011-02-07 Thread Matthew Dowle
Hi Sean, Try : key(test.dt) = c("a","b") Btw, the posting guide asks you to contact the maintainer of the package before r-help. Otherwise r-help would fill up with posts about 2000+ packages (I guess is the reason). In this case maintainer("data.table") returns

Re: [R] aggregate function - na.action

2011-02-07 Thread Matthew Dowle
news:AANLkTik180p4YmBtR3QUCW7r=fdefxzbxsy3zwtik...@mail.gmail.com... On Mon, Feb 7, 2011 at 5:54 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: Looking at the timings by each stage may help : system.time(dt <- data.table(dat)) user system elapsed 1.20 0.28 1.48 system.time(setkey(dt, x1, x2

Re: [R] aggregate function - na.action

2011-02-07 Thread Matthew Dowle
Hadley, That's fine; please do. I'm happy to explain it offline where the documentation or comments in the code aren't sufficient. It's GPL code so you can take it and improve it, or depend on it. Whatever works for you. As long as (of course) you don't stand on its shoulders and then

Re: [R] Counting number of rows with two criteria in dataframe

2011-01-26 Thread Matthew Dowle
Note that a key is not actually required, so it's even simpler syntax : dX = as.data.table(X) dX[,length(unique(z)),by="x,y"] x y V1 [1,] 1 1 2 [2,] 1 2 2 [3,] 2 3 2 [4,] 2 4 2 [5,] 3 5 2 [6,] 3 6 2 or passing list() syntax to the 'by' is exactly the same :

Re: [R] subsets

2011-01-23 Thread Matthew Dowle
require(data.table) DT = as.data.table(df) # 1. Patients with ah and ihd DT[,.SD["ah"%in%diagnosis & "ihd"%in%diagnosis],by=id] id diagnosis [1,] 2 ah [2,] 2 ihd [3,] 2 im [4,] 4 ah [5,] 4 ihd [6,] 4 angina # 2. Patients with ah but no ihd

Re: [R] Listing of available functions

2011-01-04 Thread Matthew Dowle
Try : objects("package:base") Also, as it happens, a new package called unknownR is in development on R-Forge. Its description says : Do you know how many functions there are in base R? How many of them do you know you don't know? Run unk() to discover your unknown unknowns. It's fast and

Re: [R] RGL crashes

2010-12-09 Thread Matthew Dowle
if I understand correctly. Matthew Duncan Murdoch murdoch.dun...@gmail.com wrote in message news:4cffca13.7070...@gmail.com... Matthew Dowle wrote: Might Wayland fix it in Narwhal ? I hope those names mean something to Rainer, because they mean nothing to me. Duncan Murdoch Duncan

Re: [R] RGL crashes

2010-12-08 Thread Matthew Dowle
Might Wayland fix it in Narwhal ? Duncan Murdoch murdoch.dun...@gmail.com wrote in message news:4cff7177.7030...@gmail.com... On 08/12/2010 6:07 AM, Rainer M Krug wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/08/2010 12:05 PM, Duncan Murdoch wrote: Rainer M Krug wrote: Hi

Re: [R] fast subsetting of lists in lists

2010-12-07 Thread Matthew Dowle
Hello Alex, Assuming it was just an inadequate example (since a data.frame would suffice in that case), did you know that a data.frame's columns do not have to be vectors but can be lists? I don't know if that helps. DF = data.frame(a=1:3) DF$b = list(pi, 2:3, letters[1:5]) DF a

Re: [R] Performance tuning tips when working with wide datasets

2010-11-24 Thread Matthew Dowle
Richard, Try data.table. See the introduction vignette and the presentations e.g. there is a slide showing a join to 183,000,000 observations of daily stock prices in 0.002 seconds. data.table has fast rolling joins (i.e. fast last observation carried forward) too. I see you asked about that on

Re: [R] Finding the nearest data in intraday data from two zoo objects

2010-11-24 Thread Matthew Dowle
Try data.table with the roll=TRUE argument. Set your keys and then write : futData[optData,roll=TRUE] That is fast and as you can see, short. Works on many millions and even billions of rows in R. Matthew http://datatable.r-forge.r-project.org/ Santosh Srinivas
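A small sketch of the rolling join mentioned above, assuming the data.table package is installed. The tables reuse the names futData and optData from the post, but their columns and values are invented for illustration.

```r
library(data.table)

# Hypothetical data: futures prices and option quotes at different times.
futData <- data.table(time = c(1, 5, 10), fut = c(100, 101, 102))
optData <- data.table(time = c(2, 5, 11), opt = c(7, 8, 9))

# Set the keys, as the post says; the join uses futData's key.
setkey(futData, time)
setkey(optData, time)

# For each row of optData, look up the futData row with the same time,
# or failing that the most recent earlier one (last observation carried
# forward).
res <- futData[optData, roll = TRUE]
res
```

Each option quote is paired with the futures price prevailing at or before its time, so the quote at time 2 picks up the price recorded at time 1.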

Re: [R] Sorting and subsetting

2010-09-21 Thread Matthew Dowle
All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few 'out of memory' errors and performance issues in R. data.table doesn't do that internally, and its syntax is pretty

Re: [R] Sorting and subsetting

2010-09-21 Thread Matthew Dowle
+ep8ubu3mxxhhrd...@mail.gmail.com... On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few

Re: [R] Sorting and subsetting

2010-09-21 Thread Matthew Dowle
Wiley wrote: On Tue, Sep 21, 2010 at 3:09 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: All the solutions in this thread so far use the lapply(split(...)) paradigm either directly or indirectly. That paradigm doesn't scale. That's the likely source of quite a few 'out of memory' errors

Re: [R] Pass By Value Questions

2010-08-20 Thread Matthew Dowle
To: r-help Cc: Jeff, Matt, Duncan, Hadley [ using Nabble to cc ] Jeff, Matt, How about the 'refdata' class in package ref. Also, Hadley's immutable data.frame in plyr 1.1. Both allow you to refer to subsets of a data.frame or matrix by reference I believe, if I understand correctly.

Re: [R] coef(summary) and plyr

2010-08-09 Thread Matthew Dowle
Another option for consideration : library(data.table) mydt = as.data.table(mydf) mydt[,as.list(coef(lm(y~x1+x2+x3))),by=fac] fac X.Intercept. x1 x2 x3 [1,] 0 -0.16247059 1.130220 2.988769 -19.14719 [2,] 1 0.08224509 1.216673 2.847960 -19.16105 [3,] 2

Re: [R] Finding points where two timeseries cross over

2010-08-04 Thread Matthew Dowle
Is this what you mean? x=c(1,2,2,3,4,5,6,3,2,1) y=c(2,3,4,2,1,2,3,4,5,6) matplot(cbind(x,y),type="l") which(diff(sign(x-y))!=0)+1 [1] 4 8 -- View this message in context: http://r.789695.n4.nabble.com/Finding-points-where-two-timeseries-cross-over-tp2313257p2313510.html Sent from the R help

Re: [R] long to wide on larger data set

2010-07-12 Thread Matthew Dowle
since you are on 64bit. I was working on the basis of squeezing into 32bit. Matthew Matthew Dowle mdo...@mdowle.plus.com wrote in message news:i1faj2$lv...@dough.gmane.org... Hi Juliet, Thanks for the info. It is very slow because of the == in testData[testData$V2==one_ind,] Why? Imagine

Re: [R] long to wide on larger data set

2010-07-12 Thread Matthew Dowle
Hi Juliet, Thanks for the info. It is very slow because of the == in testData[testData$V2==one_ind,] Why? Imagine someone looks for 10 people in the phone directory. Would they search the entire phone directory for the first person's phone number, starting on page 1, looking at every single

Re: [R] Query about using timestamps returned by SQL as 'factor' forsplit

2010-07-09 Thread Matthew Dowle
Hi Ted, Well since you mentioned data.table (!) ... If risk_input is a data.table consisting of 3 columns (m_id, sale_date, return_date) where the dates are of class IDate (recently added to data.table by Tom) then try : risk_input[, fitdistr(return_date-sale_date,"normal"), by=list(m_id,

Re: [R] Performance enhancement for ave

2010-06-29 Thread Matthew Dowle
dt = data.table(d,key="grp1,grp2") system.time(ans1 <- dt[ , list(mean(x),mean(y)) , by=list(grp1,grp2)]) user system elapsed 3.89 0.00 3.91 # your 7.064 is 12.23 for me though, so this 3.9 should be faster for you However, Rprof() shows that 3.9 is mostly dispatch of mean to

Re: [R] lapply or data.table to find a unit's previous transaction

2010-06-03 Thread Matthew Dowle
William, Try a rolling join in data.table, something like this (untested) : setkey(Data, UnitID, TranDt)# sort by unit then date previous = transform(Data, TranDt=TranDt-1) Data[previous,roll=TRUE]# lookup the prevailing date before, if any, for each row within that row's UnitID

[R] [R-pkgs] data.table 1.4.1 now on CRAN

2010-05-07 Thread Matthew Dowle
data.table is an enhanced data.frame with fast subset, fast grouping and fast merge. It uses a short and flexible syntax which extends existing R concepts. Example: DT[a>3,sum(b*c),by=d] where DT is a data.table with 4 columns (a,b,c,d). data.table 1.4.1 : * grouping is now 10+ times faster

Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Matthew Dowle
I don't know about that, but try this : install.packages("data.table", repos="http://R-Forge.R-project.org") require(data.table) summaries = data.table(summaries) summaries[,sum(counts),by=symbol] Please let us know if that returns the correct result, and if its memory/speed is ok ? Matthew

Re: [R] Using plyr::dply more (memory) efficiently?

2010-04-29 Thread Matthew Dowle
Steve Lianoglou mailinglist.honey...@gmail.com wrote in message news:t2ybbdc7ed01004290812n433515b5vb15b49c170f5a...@mail.gmail.com... Thanks for directing me to the data.table package. I read through some of the vignettes, and it looks quite nice. While your sample code would provide

Re: [R] sum specific rows in a data frame

2010-04-20 Thread Matthew Dowle
Or try data.table 1.4 on r-forge, its grouping is faster than aggregate : agg datatable X10 0.012 0.008 X100 0.020 0.008 X1000 0.172 0.020 X1 1.164 0.144 X1e.05 9.397 1.180 install.packages("data.table", repos="http://R-Forge.R-project.org")

Re: [R] match function or ==

2010-04-08 Thread Matthew Dowle
Please install v1.3 from R-forge : install.packages("data.table",repos="http://R-Forge.R-project.org") It will be ready for CRAN soon. Please follow up on datatable-h...@lists.r-forge.r-project.org Matthew bo bozha...@hotmail.com wrote in message news:1270689586866-1755876.p...@n4.nabble.com...

Re: [R] Code is too slow: mean-centering variables in a dataframebysubgroup

2010-04-08 Thread Matthew Dowle
Hi Dimitri, A start has been made at explaining .SD in FAQ 2.1. This was previously on a webpage, but it's just been moved to a vignette : https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/branch2/inst/doc/faq.pdf?rev=68root=datatable Please note: that vignette is part of a

Re: [R] memory error

2010-04-06 Thread Matthew Dowle
someone else on this list may be able to give you a ballpark estimate of how much RAM this merge would require. I don't have an absolute estimate, but try data.table::merge, as it needs less working memory than base::merge. 20 million rows of 5 columns isn't beyond 32bit : (1*4 +

Re: [R] Adding RcppFrame to RcppResultSet causes segmentation fault

2010-04-01 Thread Matthew Dowle
Rob, Please look again at Romain's reply to you on 19th March. He informed you then that Rcpp has its own dedicated mailing list and he gave you the link. Matthew R_help Help rhelp...@gmail.com wrote in message news:ad1ead5f1003291753p68d6ed52q572940f13e1c0...@mail.gmail.com... Hi, I'm a

Re: [R] Adding RcppFrame to RcppResultSet causes segmentation fault

2010-04-01 Thread Matthew Dowle
. FWIW, I think the problem is fixed on the Rcpp 0.7.11 version (on cran incoming) Romain Le 01/04/10 17:47, Matthew Dowle a écrit : Rob, Please look again at Romain's reply to you on 19th March. He informed you then that Rcpp has its own dedicated mailing list and he gave you the link

Re: [R] nlrq parameter bounds

2010-04-01 Thread Matthew Dowle
Ashley, This appears to be your first post to this list. Welcome to R. Over 2 days is quite a long time to wait though, so you are unlikely to get a reply now. Feedback: since nlrq is in package quantreg, it's a question about a package and should be sent to the package maintainer. Some

Re: [R] Error grid must have equal distances in each direction

2010-03-31 Thread Matthew Dowle
M Joshi, I don't know but I guess that some might have looked at your previous thread on 14 March (also about the geoR package). You received help and good advice then, but it doesn't appear that you are following it. It appears to be a similar problem this time. Also, this list is the wrong

Re: [R] Question about 'logit' and 'mlogit' in Zelig

2010-03-31 Thread Matthew Dowle
Abraham, This appears to be your 3rd unanswered post to r-help in March, all 3 have been about the Zelig package. Please read the posting guide and find out the correct place to send questions about packages. Then you might get an answer. HTH Matthew Mathew, Abraham T amat...@ku.edu wrote

Re: [R] zero standard errors with geeglm in geepack

2010-03-31 Thread Matthew Dowle
You may not have got an answer because you posted to the wrong place. It's a question about a package. Please read the posting guide. miriza miri...@sfwmd.gov wrote in message news:1269886286228-1695430.p...@n4.nabble.com... Hi! I am using geeglm to fit a Poisson model to a timeseries of

Re: [R] GEE for a timeseries of count (one cluster)

2010-03-31 Thread Matthew Dowle
Contact the authors of those packages ? miriza miri...@sfwmd.gov wrote in message news:1269981675252-1745896.p...@n4.nabble.com... Hi! I was wondering if there were any packages that would allow me to fit a GEE to a single timeseries of counts so that I could account for autocorrelation

Re: [R] mcmcglmm starting value example

2010-03-31 Thread Matthew Dowle
Apparently not, since this is your 3rd unanswered thread to r-help this month about this package. Please read the posting guide and find out where you should send questions about packages. Then you might get an answer. ping chen chen1984...@yahoo.com.cn wrote in message

Re: [R] GLM / large dataset question

2010-03-31 Thread Matthew Dowle
Geelman, This appears to be your first post to this list. Welcome to R. Nearly 2 days is quite a long time to wait though, so you are unlikely to get a reply now. Feedback : the question seems quite vague and imprecise. It depends on which R you mean (32bit/64bit) and how much ram you have.

Re: [R] Combing

2010-03-29 Thread Matthew Dowle
Val, Type combine two data sets (text you wrote in your post) into www.rseek.org. The first two links are: Quick-R: Merge and Merging data: A tutorial. Isn't it quicker for you to use rseek, rather than the time it takes to write a post and wait for a reply ? Don't you also get more

Re: [R] NA values in indexing

2010-03-26 Thread Matthew Dowle
The type of 'NA' is logical. So x[NA] behaves more like x[TRUE] i.e. silent recycling. class(NA) [1] logical x=101:108 x[NA] [1] NA NA NA NA NA NA NA NA x[c(TRUE,NA)] [1] 101 NA 103 NA 105 NA 107 NA x[as.integer(NA)] [1] NA HTH Matthew Barry Rowlingson b.rowling...@lancaster.ac.uk
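The same point in runnable form, using only base R. The variable names r1 to r3 are added here for illustration.

```r
x <- 101:108

class(NA)                 # NA on its own is a logical value

# A logical index is recycled to the length of x, so a single logical NA
# yields one NA per element.
r1 <- x[NA]

# Recycling the pair (TRUE, NA) keeps every other element and returns NA
# for the rest.
r2 <- x[c(TRUE, NA)]

# An integer NA is a positional index instead, so it selects exactly one
# (missing) element.
r3 <- x[as.integer(NA)]

r1
r2
r3
```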

Re: [R] translating SQL statements into data.table operations

2010-03-25 Thread Matthew Dowle
Nick, Good question, but just sent to the wrong place. The posting guide asks you to contact the package maintainer first before posting to r-help only if you don't hear back. I guess one reason for that is that if questions about all 2000+ packages were sent to r-help, then r-help's traffic

Re: [R] Mosaic

2010-03-24 Thread Matthew Dowle
When you click search on the R homepage, type mosaic into the box, and click the button, do the top 3 links seem relevant ? Your previous 2 requests for help : 26 Feb : Response was SuppDists. Yet that is the first hit returned by the subject line you posted : Hartleys table 22 Feb :

Re: [R] If else statements

2010-03-23 Thread Matthew Dowle
Here are some references. Please read these first and post again if you are still stuck after reading them. If you do post again, we will need x and y. 1. Introduction to R : 9.2.1 Conditional execution: if statements. 2. R Language Definition : 3.2 Control structures. 3. R for beginners by E

Re: [R] Forecasting with Panel Data

2010-03-11 Thread Matthew Dowle
Ricardo, I see you got no public answer so far, on either of the two lists you posted to at the same time yesterday. You are therefore unlikely to ever get a reply. I also see you've been having trouble getting answers in the past, back to Nov 09, at least. For example no reply to Credit

Re: [R] speed

2010-03-10 Thread Matthew Dowle
Your choice of subject line alone shows some people that you missed some small details from the posting guide. The ability to notice small details may be important for you to demonstrate in future. Any answer in this thread is unlikely to be found by a topic search on subject lines alone

Re: [R] Strange result in survey package: svyvar

2010-03-10 Thread Matthew Dowle
This list is the wrong place for that question. The posting guide tells you, in bold, to contact the package maintainer first. If you had already done that, and didn't hear back from him, then you should tell us, so that we know you followed the guide. Corey Sparks corey.spa...@utsa.edu

Re: [R] IMPORTANT - To remove the null elements from a vector

2010-03-09 Thread Matthew Dowle
Welcome to R Barbara. It's quite an incredible community from all walks of life. Your beginner questions are answered in the manual. See Introduction to R. Please read the posting guide again because it contains lots of good advice for you. Some people read it three times before posting

Re: [R] fit a gamma pdf using Residual Sum-of-Squares

2010-03-08 Thread Matthew Dowle
Thanks for making it quickly reproducible - I was able to see that message in English within a few seconds. The start has x=86, but the data is also called x. Remove x=86 from start and you get a different error. P.S. - please do include the R version information. It saves time for us, and we

Re: [R] ifthen() question

2010-03-05 Thread Matthew Dowle
This post breaks the posting guide in multiple ways. Please read it again (and then again) - in particular the first 3 paragraphs. You will help yourself by following it. The solution is right there in the help page for ?data.frame and other places including Introduction to R. I think it's

Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthew Dowle
Frank, I respect your views but I agree with Gabor. The posting guide does not support your views. It is not any of our views that are important but we are following the posting guide. It covers affiliation. It says only that some consider it good manners to include a concise signature

Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthew Dowle
) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Matthew Dowle mdo...@mdowle.plus.com 3/5/2010 12:58 PM Frank, I respect your views but I agree with Gabor. The posting guide does not support your views. It is not any of our

Re: [R] data.table evaluating columns

2010-03-03 Thread Matthew Dowle
I'd go a bit further and remind that the r-help posting guide is clear : For questions about functions in standard packages distributed with R (see the FAQ Add-on packages in R), ask questions on R-help. If the question relates to a contributed package , e.g., one downloaded from CRAN, try

Re: [R] data.table evaluating columns

2010-03-03 Thread Matthew Dowle
appear to be correct. Or just directly sending an email to all of you? Thanks again, Rob On Wed, Mar 3, 2010 at 6:05 AM, Matthew Dowle mdo...@mdowle.plus.comwrote: I'd go a bit further and remind that the r-help posting guide is clear : For questions about functions in standard packages

Re: [R] Three most useful R package

2010-03-03 Thread Matthew Dowle
Dieter, One way to check if a package is active, is by looking on r-forge. If you are referring to data.table you would have found it is actually very active at the moment and is far from abandoned. What you may be referring to is a warning, not an error, with v1.2 on R2.10+. That was fixed

Re: [R] Reading large files

2010-02-05 Thread Matthew Dowle
I agree with Jim. The term do analysis is almost meaningless, the posting guide makes reference to statements such as that. At least he tried to define large, but inconsistently (first of all 850MB, then changed to 10-20-15GB). Satish wrote: at one time I will need to load say 15GB into R

Re: [R] Reading large files

2010-02-05 Thread Matthew Dowle
I can't help you further than what's already been posted to you. Maybe someone else can. Best of luck. Satish Vadlamani satish.vadlam...@fritolay.com wrote in message news:1265397089104-1470667.p...@n4.nabble.com... Matthew: If it is going to help, here is the explanation. I have an end state

Re: [R] merging columns

2010-02-03 Thread Matthew Dowle
Yes. data.df[,wcol,drop=FALSE] For an explanation of drop see ?"[.data.frame" Chuck White chuckwhi...@charter.net wrote in message news:20100202212800.o8xbu.681696.r...@mp11... Additional clarification: the problem only comes when you have one column selected from the original dataframe. You
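A short base R sketch of what drop=FALSE changes. The data frame contents and the single-column selection in wcol are invented for illustration.

```r
# Hypothetical data frame and a selection that happens to name one column.
data.df <- data.frame(a = 1:3, b = letters[1:3])
wcol <- "a"

# By default a one-column selection is simplified to a plain vector.
v <- data.df[, wcol]

# drop = FALSE keeps the result a data.frame, whatever the number of
# selected columns.
d1 <- data.df[, wcol, drop = FALSE]

class(v)
class(d1)
```

Code that later calls names(), ncol() or merge() on the selection then behaves the same whether wcol holds one name or several.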

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
should not be important as long as you can do what you want. SQL is declarative so you just specify what you want rather than how to get it and invisibly to the user it automatically draws up a query plan and then uses that plan to get the result. On Wed, Jan 27, 2010 at 12:48 PM, Matthew Dowle

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
and use is to hide the implementation and focus on the problem. That is why we use high level languages, object orientation, etc. On Thu, Jan 28, 2010 at 4:37 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: How it represents data internally is very important, depending on the real goal : http

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-28 Thread Matthew Dowle
its even faster. On Thu, Jan 28, 2010 at 8:52 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: Are you claiming that SQL is that utopia? SQL is a row store. It cannot give the user the benefits of column store. For example, why does SQL take 113 seconds in the example in this thread : http

Re: [R] RMySQL - Bulk loading data and creating FK links

2010-01-27 Thread Matthew Dowle
:971536df1001270629w4795da89vb7d77af6e4e8b...@mail.gmail.com... On Wed, Jan 27, 2010 at 8:56 AM, Matthew Dowle mdo...@mdowle.plus.com wrote: How many columns, and of what type are the columns ? As Olga asked too, it would be useful to know more about what you're really trying to do. 3.5m rows is not actually

Re: [R] Once again: Error: cannot allocate vector of size

2010-01-22 Thread Matthew Dowle
Please re-read the posting guide e.g. you didn't provide an example data set or a way to generate one, or any R version information. Werner W. pensterfuz...@yahoo.de wrote in message news:646146.32238...@web23002.mail.ird.yahoo.com... Hi, I have browsed the help list and looked at the FAQ

Re: [R] Merging and extracting data from list

2010-01-22 Thread Matthew Dowle
?merge plyr data.table sqldf crantastic Dr. Viviana Menzel vivianamen...@gmx.de wrote in message news:4b58a0e9.3050...@gmx.de... Hello R-help group, I have a question about merging lists. I have two lists: Genes list (hSgenes) namechrstrandstartendtransStarttransEnd

Re: [R] loop on list levels and names

2010-01-22 Thread Matthew Dowle
specific function), but don't worry I won't forget. As you said It only works if users contribute to it. That makes the power of R! Ivan Le 1/21/2010 19:01, Matthew Dowle a écrit : One way is : dataset = data.table(ssfamed) dataset[, whatever some functions are on Asfc, Smc, epLsar, etc

Re: [R] Once again: Error: cannot allocate vector of size

2010-01-22 Thread Matthew Dowle
Fantastic. You're much more likely to get a response now. Best of luck. werner w pensterfuz...@yahoo.de wrote in message news:1264175935970-1100164.p...@n4.nabble.com... Thanks Matthew, you are absolutely right. I am working on Windows XP SP2 32bit with R versions 2.9.1. Here is an

Re: [R] loop on list levels and names

2010-01-22 Thread Matthew Dowle
:18, Matthew Dowle a écrit : Great. If you mean the crantastic r package, sorry I wasn't clear, I meant the crantastic website http://crantastic.org/. If you meant the description of plyr then if the description looks useful then click the link taking you to the package documentation and read

Re: [R] loop on list levels and names

2010-01-21 Thread Matthew Dowle
One way is : dataset = data.table(ssfamed) dataset[, whatever some functions are on Asfc, Smc, epLsar, etc , by="SPECSHOR,BONE"] Your SPECSHOR and BONE names will be in your result alongside the results of the whatever ... Or try package plyr which does this sort of thing too. And sqldf may

Re: [R] Mutliple sets of data in one dataset....Need a loop?

2010-01-21 Thread Matthew Dowle
but I have thousands of results so it would be really handy to find a way of doing this quickly it's a little difficult to follow those examples Given your data in data.frame DF, maybe add the following to your list to investigate : dat = data.table(DF) dat[, cor(Score1,Score2),

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
The user wrote in their first post : I have a lot of observations in my dataset Here's one way to do it with a data.table : a=data.table(a) ans = a[ , list(dt=dt[dt-min(dt)7]) , by="var1,var2,var3"] class(ans$dt) = "Date" Timings are below comparing the 3 methods. In this

Re: [R] problem of data manipulation

2010-01-20 Thread Matthew Dowle
Sounds like a good idea. Would it be possible to give an example of how to combine plyr with data.table, and why that is better than a data.table only solution ? hadley wickham h.wick...@gmail.com wrote in message news:f8e6ff051001200624r2175e38xf558dc8fa3fb6...@mail.gmail.com... Note that in
