Hello Melanie,

Also make sure that you and your colleague have similar outputs for these
two commands:

> search()
> ls()      # L as in Larry, S as in Sam

"search()" lists the packages and environments attached to your R session
(loaded packages take up RAM).
"ls()" lists the R objects in your workspace (which also take up RAM).

- Ray

On Fri, Oct 26, 2012 at 11:17 AM, Ray DiGiacomo, Jr. <[email protected]> wrote:

> Hello Melanie,
>
> I'm not too familiar with ARFF but I believe it has some headers (and
> possibly footers) that may need to be removed before one can call the
> read.csv function.  I am assuming you and your colleague both manually
> removed the ARFF headers/footers before calling the read.csv function.
>
> You may also want to try changing the read.csv function call to:
>
> frame1 <- read.csv("test.csv", header = FALSE)
>
> You will have to manually change your filename to test.csv first.  Also
> notice that the "sep" argument is not needed as it defaults to a comma.
> I would also use the word "frame" instead of "mat" as the data will not be
> a matrix after you call the read.csv function - it will be a data frame.
> You can turn your data frame into a matrix using other R commands if you
> like.  See this page:
>
> http://stackoverflow.com/questions/5158790/data-frame-or-matrix
>
> Also, there are R packages called "foreign" and "RWeka" which both have
> read.arff functions inside of them.  You may want to give these a try.
>
> You can learn about them here:
>
> See Paper Page 3 (Digital Page 2)
> http://cran.r-project.org/web/packages/foreign/foreign.pdf
>
> See Paper Page 6 (Digital Page 3)
> http://cran.r-project.org/web/packages/RWeka/RWeka.pdf
>
> - Ray
>
>
> On Fri, Oct 26, 2012 at 10:18 AM, Melanie Courtot <[email protected]> wrote:
>
>> Hi Ray and Simon, all,
>>
>> Thanks for the help. My laptop has 8GB of RAM (my colleague has 12 on his
>> desktop). I ssh'ed into his machine and the whole file loads in not even 2
>> seconds.
>> The file is read with mat<-read.csv('test.arff',header=FALSE,sep=','). The
>> arff file is what I use with Weka, which is basically a comma delimited
>> file.
It contains around 7.5M datapoints (6200 rows, 1140 columns) >> >> It seems that with 8GB I should be quite ok? >> >> Based on your suggestions I tried with a part of the file only, which >> does work fine, so it seems that it is indeed a memory problem. Any idea as >> to why? >> >> Thanks, >> Melanie >> >> >> >> Example record (I have 6200 of those) >> >> 856243,negative,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 >> >> >> >> On 2012-10-25, at 6:39 PM, Ray DiGiacomo, Jr. wrote: >> >> > Hi Simon, >> > >> > I took the spec from this Revo SlideShare. The spec is based on a >> regression. >> > >> > >> http://www.revolutionanalytics.com/news-events/free-webinars/2011/intro-to-r-for-sas-spss/ >> > >> > Click the right arrow until you get to slide 3 of 14. Then, look at >> the slide in the lower-right hand corner (slide 12). >> > >> > - Ray >> > >> > >> > >> > >> > >> > On Thu, Oct 25, 2012 at 6:26 PM, Simon Urbanek < >> [email protected]> wrote: >> > >> > On Oct 25, 2012, at 7:42 PM, Ray DiGiacomo, Jr. wrote: >> > >> > > Hello Melanie, >> > > >> > > How much RAM is installed on your MacBook Pro compared to your >> colleague's >> > > Linux machine? >> > > >> > > How big is your dataset in terms of rows and columns? >> > > >> > > I believe R can handle about 10M datapoints per GB of RAM. >> > > >> > >> > What exactly is that an estimate of? In R, 1GB of RAM will store >> ~134Mio datapoints when using numeric matrices/vectors and twice as many as >> integers or logicals. In practice, you will still need some room for >> computation on the data, though. >> > >> > Cheers, >> > Simon >> > >> > >> > > Note that datapoints = rows x columns >> > > >> > > Best Regards, >> > > >> > > Ray DiGiacomo, Jr. 
>> > > Master R Trainer
>> > > President, Lion Data Systems LLC
>> > > President, The Orange County R User Group
>> > > Board Member, TDWI
>> > > [email protected]
>> > > (Mobile) 408-425-7851
>> > > San Juan Capistrano, California
>> > >
>> > > Check out my one-on-one web-based R courses at
>> > > liondatasystems.com/courses
>> > >
>> > >
>> > > On Thu, Oct 25, 2012 at 4:16 PM, Melanie Courtot <[email protected]> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I am trying to run R on my MacBook Pro 2.4 GHz Intel Core i5. I am
>> > >> trying to read a csv file, which works fine on my work colleague's
>> > >> machine (under Linux) but causes my CPU to go up to 100%, makes the
>> > >> GUI unresponsive, and hangs on the command line. Activity Monitor
>> > >> indicates there is only one R thread running.
>> > >>
>> > >> I did see that by default R was using the BLAS library, which is
>> > >> single-threaded, and that there was an option to use vecLib instead.
>> > >> I did this, and
>> > >> ls -l /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib
>> > >> does return
>> > >> /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib ->
>> > >> libRblas.vecLib.dylib
>> > >>
>> > >> I however still see the same behavior: 100% CPU, single thread.
>> > >>
>> > >> I saw that some MacBook Pro (Xeon Nehalem based) had a vecLib bug, so
>> > >> I built the ATLAS library and symlinked R to libtatlas.dylib
>> > >> (unfortunately the precompiled binaries pointed to in a previous
>> > >> email on the list [1] were not available anymore. Building ATLAS
>> > >> was... fun ;)). I was able to get the shared libraries (using
>> > >> --shared in my config) but still see the same behavior when trying to
>> > >> run my code. I was unsure if I should link to libsatlas.dylib or
>> > >> libtatlas.dylib, so tried both (I guess the latter was the right one
>> > >> though).
>> > >>
>> > >> I tried building R from the source (specifying -arch x86_64 and
>> > >> --enable-BLAS-shlib to be able to switch libraries), but same
>> > >> behavior and it seems it is an identical version to the prepackaged
>> > >> one (I tried with BLAS, vecLib and ATLAS).
>> > >>
>> > >> R info: R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows",
>> > >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> > >>
>> > >> Any help would be greatly appreciated.
>> > >>
>> > >> Thanks,
>> > >> Melanie
>> > >>
>> > >> [1] https://stat.ethz.ch/pipermail/r-sig-mac/2010-October/007817.html
>> > >>
>> > >> ---
>> > >> Mélanie Courtot
>> > >> MSFHR/PCIRN Ph.D. Candidate,
>> > >> BCCRC - Terry Fox Laboratory - 12th floor
>> > >> 675 West 10th Avenue
>> > >> Vancouver, BC
>> > >> V5Z 1L3, Canada
>> > >>
>> > >> _______________________________________________
>> > >> R-SIG-Mac mailing list
>> > >> [email protected]
>> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
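
P.S. As a rough check on the memory figures quoted earlier in the thread
(about 134 million numeric values per GB, and a file of 6200 rows by 1140
columns), the arithmetic can be sketched in R. This is only an estimate
under stated assumptions: 8 bytes per double, 4 bytes per integer or
logical, and every column treated as numeric, which is a simplification
because the example record contains a text column ("negative"). All
variable names below are illustrative and do not come from the thread.

```r
# Dimensions quoted in the thread
n_rows   <- 6200
n_cols   <- 1140
n_points <- n_rows * n_cols               # 7,068,000 values (a bit under 7.5M)

# Assumed storage sizes: doubles use 8 bytes, integers/logicals use 4
bytes_per_double  <- 8
bytes_per_integer <- 4

# Values that fit in 1 GiB if stored as doubles
doubles_per_gib <- 2^30 / bytes_per_double   # 134,217,728

# Estimated size of the parsed data, ignoring per-object overhead
mib_as_double  <- n_points * bytes_per_double  / 2^20
mib_as_integer <- n_points * bytes_per_integer / 2^20

cat(sprintf("values in file:      %s\n", format(n_points, big.mark = ",")))
cat(sprintf("doubles per GiB:     %s\n", format(doubles_per_gib, big.mark = ",")))
cat(sprintf("as doubles (MiB):    %.1f\n", mib_as_double))
cat(sprintf("as integers (MiB):   %.1f\n", mib_as_integer))

# Sanity check on a small numeric matrix: object.size() reports the
# 8 bytes per value plus a small fixed header
m        <- matrix(0, nrow = 100, ncol = 100)
measured <- as.numeric(object.size(m))
cat(sprintf("100 x 100 matrix:    %.0f bytes\n", measured))
```

Under these assumptions the parsed data comes to tens of megabytes, so the
data alone would not be expected to fill 8GB; read.csv does need extra
working memory while it parses, so this is a lower bound and not a
diagnosis.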
