I think Stata is fast because it keeps only one work table in RAM. If you keep just one data frame in R, it will run fast too. But ...
On 4/11/07, Robert Duval <[EMAIL PROTECTED]> wrote:
> So I guess my question is...
>
> Is there any hope of R being modified at its core in order to handle
> large datasets more gracefully? (You've mentioned SAS and SPSS, I'd
> add Stata to the list).
>
> Or should we (the users of large datasets) expect to keep on working
> with the present tools for the time to come?
>
> robert
>
> On 4/11/07, Marc Schwartz <[EMAIL PROTECTED]> wrote:
> > On Wed, 2007-04-11 at 11:26 -0500, Marc Schwartz wrote:
> > > On Wed, 2007-04-11 at 17:56 +0200, Bi-Info
> > > (http://members.home.nl/bi-info) wrote:
> > > > I certainly have that idea too. SPSS functions in much the same
> > > > way, although it specialises in PC applications. Adding memory to
> > > > a PC is not very expensive these days. On my first AT some extra
> > > > memory cost 300 dollars or more. These days you get extra memory
> > > > with a package of marshmallows or chocolate bars if you need it.
> > > > All computations on a computer are discrete steps in a way, but
> > > > I've heard that SAS computations are split up into strictly
> > > > divided steps. That also makes procedures "attachable", I've been
> > > > told, and interchangeable. Different procedures can use the same
> > > > code, which is cheaper in either memory usage or disk usage (the
> > > > old days...). That, by the way, makes SAS a complicated machine to
> > > > build, because procedures that are split up into numerous
> > > > fragments require complicated bookkeeping. If you do it that way,
> > > > I've been told, you can do a lot of computations with very little
> > > > memory. One guy actually computed quite complicated models with
> > > > "only 32MB or less", which wasn't very much for "his type of
> > > > calculations". Which means that SAS is efficient in memory
> > > > handling, I think. It's not very efficient in dollar handling...
> > > > I estimate.
> > > >
> > > > Wilfred
> > >
> > > <snip>
> > >
> > > Oh....SAS is quite efficient in dollar handling, at least when it comes
> > > to the annual commercial licenses...along the same lines as the
> > > purported efficiency of the U.S. income tax system:
> > >
> > > "How much money do you have? Send it in..."
> > >
> > > There is a reason why SAS is the largest privately held software company
> > > in the world and it is not due to the academic licensing structure,
> > > which constitutes only about 12% of their revenue, based upon their
> > > public figures.
> >
> > Hmmm......here is a classic example of the problems of reading pie
> > charts.
> >
> > The figure I quoted above, which is from reading the 2005 SAS Annual
> > Report on their web site (such as it is for a private company) comes
> > from a 3D exploded pie chart (ick...).
> >
> > The pie chart uses 3 shades of grey and 5 shades of blue to
> > differentiate 8 market segments and their percentages of total worldwide
> > revenue.
> >
> > I mis-read the 'shade of grey' allocated to Education as being 12%
> > (actually 11.7%).
> >
> > A re-read of the chart, zooming in close on the pie in a PDF reader,
> > appears to actually show that Education is but 1.8% of their annual
> > worldwide revenue.
> >
> > Government based installations, which are presumably the other notable
> > market segment in which substantially discounted licenses are provided,
> > is 14.6%.
> >
> > The report is available here for anyone else curious:
> >
> > http://www.sas.com/corporate/report05/annualreport05.pdf
> >
> > Somebody needs to send SAS a copy of Tufte or Cleveland.
> >
> > I have to go and rest my eyes now...
> > ;-)
> >
> > Regards,
> >
> > Marc
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)