Hello Melanie,

Also make sure that you and your colleague have similar outputs for these
two commands:

> search()
> ls() # L as in Larry, S as in Sam

"search()" lists the packages and environments attached to your search path
(attached packages take up RAM).
"ls()" lists the objects in your workspace (which also take up RAM).

- Ray
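
P.S. To compare more than just the names, a rough sketch like the one below
prints how much RAM each object takes. The objects small_vec and demo_mat are
made up for illustration; in your own session you would drop those two lines
and run the rest as-is. The last part is only back-of-the-envelope arithmetic
for a 6200 x 1140 table stored as doubles, using the dimensions you mentioned.

```r
# Hypothetical objects standing in for whatever is in your workspace.
small_vec <- numeric(1000)                     # 1,000 doubles
demo_mat  <- matrix(0, nrow = 100, ncol = 50)  # 5,000 doubles

# Names of the objects in the current workspace.
obj_names <- ls()

# Size in bytes of each object, largest first.
obj_bytes <- sapply(obj_names, function(nm) as.numeric(object.size(get(nm))))
print(sort(obj_bytes, decreasing = TRUE))

# Attached packages and environments, in search-path order.
print(search())

# Rough size in MB of 6200 x 1140 values at 8 bytes per double.
est_mb <- 6200 * 1140 * 8 / 2^20
print(round(est_mb, 1))
```

If both machines print similar numbers here, leftover objects in the
workspace are probably not what differs between them.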





On Fri, Oct 26, 2012 at 11:17 AM, Ray DiGiacomo, Jr. <
[email protected]> wrote:

> Hello Melanie,
>
> I'm not too familiar with ARFF but I believe it has some headers (and
> possibly footers) that may need to be removed before one can call the
> read.csv function.  I am assuming you and your colleague both manually
> removed the ARFF headers/footers before calling the read.csv function.
>
> You may also want to try changing the read.csv function call to:
>
> frame1 <- read.csv("test.csv", header = FALSE)
>
> You will have to manually rename your file to test.csv first.  Also
> notice that the "sep" argument is not needed, as it defaults to a comma.
> I would also use the name "frame" instead of "mat", as the data will not be
> a matrix after you call the read.csv function - it will be a data frame.
> You can turn your data frame into a matrix using other R commands if you
> like.  See this page:
>
> http://stackoverflow.com/questions/5158790/data-frame-or-matrix
>
> Also, there are R packages called "foreign" and "RWeka" which both have
> read.arff functions inside of them.  You may want to give these a try.
>
> You can learn about them here:
>
> See Paper Page 3 (Digital Page 2)
> http://cran.r-project.org/web/packages/foreign/foreign.pdf
>
> See Paper Page 6 (Digital Page 3)
> http://cran.r-project.org/web/packages/RWeka/RWeka.pdf
>
> - Ray
>
>
>
>
>
>
> On Fri, Oct 26, 2012 at 10:18 AM, Melanie Courtot <[email protected]> wrote:
>
>> Hi Ray and Simon, all,
>>
>> Thanks for the help. My laptop has 8GB of RAM (my colleague has 12 on his
>> desktop). I ssh'ed into his machine and the whole file loads in not even 2
>> seconds.
>> The file is read with mat<-read.csv('test.arff',header=FALSE,sep=',').
>> The arff file is what I use with Weka, and is basically a comma-delimited
>> file. It contains around 7.5M datapoints (6200 rows, 1140 columns).
>>
>> It seems that with 8GB I should be quite ok?
>>
>> Based on your suggestions I tried with a part of the file only, which
>> does work fine, so it seems that it is indeed a memory problem. Any idea as
>> to why?
>>
>> Thanks,
>> Melanie
>>
>>
>>
>> Example record (I have 6200 of those)
>>
>> 856243,negative,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>>
>>
>>
>> On 2012-10-25, at 6:39 PM, Ray DiGiacomo, Jr. wrote:
>>
>> > Hi Simon,
>> >
>> > I took the spec from this Revo SlideShare.  The spec is based on a
>> > regression.
>> >
>> > http://www.revolutionanalytics.com/news-events/free-webinars/2011/intro-to-r-for-sas-spss/
>> >
>> > Click the right arrow until you get to slide 3 of 14.  Then, look at
>> > the slide in the lower-right hand corner (slide 12).
>> >
>> > - Ray
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Oct 25, 2012 at 6:26 PM, Simon Urbanek <
>> [email protected]> wrote:
>> >
>> > On Oct 25, 2012, at 7:42 PM, Ray DiGiacomo, Jr. wrote:
>> >
>> > > Hello Melanie,
>> > >
>> > > How much RAM is installed on your MacBook Pro compared to your
>> > > colleague's Linux machine?
>> > >
>> > > How big is your dataset in terms of rows and columns?
>> > >
>> > > I believe R can handle about 10M datapoints per GB of RAM.
>> > >
>> >
>> > What exactly is that an estimate of? In R, 1GB of RAM will store
>> > ~134 million datapoints when using numeric matrices/vectors, and twice
>> > as many as integers or logicals. In practice, you will still need some
>> > room for computation on the data, though.
>> >
>> > Cheers,
>> > Simon
>> >
>> >
>> > > Note that datapoints = rows x columns
>> > >
>> > > Best Regards,
>> > >
>> > > Ray DiGiacomo, Jr.
>> > > Master R Trainer
>> > > President, Lion Data Systems LLC
>> > > President, The Orange County R User Group
>> > > Board Member, TDWI
>> > > [email protected]
>> > > (Mobile) 408-425-7851
>> > > San Juan Capistrano, California
>> > >
>> > > Check out my one-on-one web-based R courses at
>> liondatasystems.com/courses
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Thu, Oct 25, 2012 at 4:16 PM, Melanie Courtot <[email protected]>
>> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I am trying to run R on my MacBook Pro 2.4 GHz Intel Core i5. I am
>> > >> trying to read a CSV file, which works fine on my work colleague's
>> > >> machine (under Linux) but causes my CPU to go up to 100%, makes the
>> > >> GUI unresponsive, and hangs on the command line. Activity Monitor
>> > >> indicates there is only one R thread running.
>> > >>
>> > >> I did see that by default R was using the BLAS library, which is
>> > >> single-threaded, and that there was an option to use vecLib instead.
>> > >> I did this, and
>> > >> ls -l /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib
>> > >> does return
>> > >> /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib ->
>> > >> libRblas.vecLib.dylib
>> > >>
>> > >> I however still see the same behavior: 100% CPU, single thread.
>> > >>
>> > >> I saw that some MacBook Pros (Xeon Nehalem based) had a vecLib bug,
>> > >> so I built the ATLAS library and symlinked R to libtatlas.dylib
>> > >> (unfortunately the precompiled binaries pointed to in a previous
>> > >> email on the list [1] were not available anymore. Building ATLAS
>> > >> was... fun ;)). I was able to get the shared libraries (using
>> > >> --shared in my config) but still see the same behavior when trying
>> > >> to run my code. I was unsure if I should link to libsatlas.dylib or
>> > >> libtatlas.dylib, so I tried both (I guess the latter was the right
>> > >> one, though).
>> > >>
>> > >> I tried building R from source (specifying -arch x86_64 and
>> > >> --enable-BLAS-shlib to be able to switch libraries), but I get the
>> > >> same behavior, and it seems to be an identical version to the
>> > >> prepackaged one (I tried with BLAS, vecLib and ATLAS).
>> > >>
>> > >> R info: R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows",
>> > >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> > >>
>> > >> Any help would be greatly appreciated.
>> > >>
>> > >> Thanks,
>> > >> Melanie
>> > >>
>> > >>
>> > >> [1] https://stat.ethz.ch/pipermail/r-sig-mac/2010-October/007817.html
>> > >>
>> > >> ---
>> > >> Mélanie Courtot
>> > >> MSFHR/PCIRN Ph.D. Candidate,
>> > >> BCCRC - Terry Fox Laboratory - 12th floor
>> > >> 675 West 10th Avenue
>> > >> Vancouver, BC
>> > >> V5Z 1L3, Canada
>> > >>
>> > >> _______________________________________________
>> > >> R-SIG-Mac mailing list
>> > >> [email protected]
>> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>> > >>
>> >
>> >
>>
>>
>
