Hello Melanie,

I'm not too familiar with ARFF, but I believe it has a header section (and
possibly footers) that needs to be removed before read.csv can parse the
file.  I am assuming you and your colleague both manually removed the ARFF
headers/footers before calling read.csv.
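
If you would rather not edit the file by hand, here is a minimal sketch of
stripping the header in R.  It assumes the header ends at a line starting
with "@data" and that the data rows are plain (non-sparse) comma-separated
values.  The tiny demo file and the names arff_path and body are made up
for illustration; substitute your own file path.

```r
# Hypothetical miniature ARFF file, written to a temporary path for illustration.
arff_path <- tempfile(fileext = ".arff")
writeLines(c(
  "@relation demo",
  "@attribute id numeric",
  "@attribute label {negative,positive}",
  "@attribute f1 numeric",
  "@data",
  "856243,negative,0",
  "856244,positive,1"
), arff_path)

lines <- readLines(arff_path)

# Assumption: the data section starts on the line after the "@data" marker.
data_start <- grep("^@data", lines, ignore.case = TRUE)[1]
body <- lines[(data_start + 1):length(lines)]

# Drop blank lines and ARFF comment lines (those starting with "%").
body <- body[nzchar(trimws(body)) & !grepl("^%", body)]

# Parse the remaining rows as ordinary CSV without a header row.
frame1 <- read.csv(text = body, header = FALSE)
```

This will not cope with sparse ARFF rows or quoted values containing
commas, so treat it as a starting point only.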

You may also want to try changing the read.csv function call to:

frame1 <- read.csv("test.csv", header = FALSE)

You will have to manually rename your file to test.csv first.  Also
notice that the "sep" argument is not needed, as it defaults to a comma.
I would also use the name "frame" instead of "mat", as the result of
read.csv is not a matrix - it is a data frame.  You can turn your data
frame into a matrix using other R commands if you like.  See this page:

http://stackoverflow.com/questions/5158790/data-frame-or-matrix
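
To illustrate the data frame versus matrix point, here is a small sketch.
The two-row demo file and the names csv_path, mat_all and mat_num are
made up for illustration; the layout (an id, a text label, then numeric
columns) only loosely mimics your record.

```r
# Hypothetical two-row CSV file with one text column, for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("856243,negative,0,1", "856244,positive,1,0"), csv_path)

frame1 <- read.csv(csv_path, header = FALSE)

# read.csv returns a data frame, not a matrix.
is.data.frame(frame1)
is.matrix(frame1)

# Converting everything at once yields a character matrix, because the
# label column is not numeric.
mat_all <- as.matrix(frame1)

# Keeping only the numeric columns first yields a numeric matrix.
mat_num <- as.matrix(frame1[, sapply(frame1, is.numeric)])
```

A data frame can mix column types while a matrix holds a single type,
which is why the label column forces mat_all to character.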

Also, there are R packages called "foreign" and "RWeka" which both have
read.arff functions inside of them.  You may want to give these a try.

You can learn about them here:

See Paper Page 3 (Digital Page 2)
http://cran.r-project.org/web/packages/foreign/foreign.pdf

See Paper Page 6 (Digital Page 3)
http://cran.r-project.org/web/packages/RWeka/RWeka.pdf
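
As a rough sketch of the "foreign" route (assuming the foreign package is
installed, which it normally is with a standard R installation), something
like the following should work.  The small ARFF file here is made up for
illustration; point read.arff at your own file instead.  RWeka also has a
read.arff function, but it needs Java, so I have left it out.

```r
library(foreign)  # provides read.arff()

# Hypothetical miniature ARFF file, written to a temporary path for illustration.
arff_path <- tempfile(fileext = ".arff")
writeLines(c(
  "@relation demo",
  "@attribute id numeric",
  "@attribute label {negative,positive}",
  "@attribute f1 numeric",
  "@data",
  "856243,negative,0",
  "856244,positive,1"
), arff_path)

# read.arff parses the header itself, so no manual stripping is needed.
# Column names come from the @attribute lines.
frame1 <- read.arff(arff_path)
str(frame1)
```

The result is again a data frame, with the nominal attribute read in as a
factor.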

- Ray






On Fri, Oct 26, 2012 at 10:18 AM, Melanie Courtot <[email protected]> wrote:

> Hi Ray and Simon, all,
>
> Thanks for the help. My laptop has 8GB of RAM (my colleague has 12 on his
> desktop). I ssh'ed into his machine and the whole file loads in not even 2
> seconds.
> The file is read with mat<-read.csv('test.arff',header=FALSE,sep=',') The
> arff file is what I use with Weka, which is basically a comma delimited
> file. It contains around 7.5M datapoints (6200 rows, 1140 columns)
>
> It seems that with 8GB I should be quite ok?
>
> Based on your suggestions I tried with a part of the file only, which does
> work fine, so it seems that it is indeed a memory problem. Any idea as to
> why?
>
> Thanks,
> Melanie
>
>
>
> Example record (I have 6200 of those)
>
> 856243,negative,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
>
>
>
> On 2012-10-25, at 6:39 PM, Ray DiGiacomo, Jr. wrote:
>
> > Hi Simon,
> >
> > I took the spec from this Revo SlideShare.  The spec is based on a
> > regression.
> >
> >
> > http://www.revolutionanalytics.com/news-events/free-webinars/2011/intro-to-r-for-sas-spss/
> >
> > Click the right arrow until you get to slide 3 of 14.  Then, look at the
> > slide in the lower-right hand corner (slide 12).
> >
> > - Ray
> >
> >
> >
> >
> >
> > On Thu, Oct 25, 2012 at 6:26 PM, Simon Urbanek <[email protected]> wrote:
> >
> > On Oct 25, 2012, at 7:42 PM, Ray DiGiacomo, Jr. wrote:
> >
> > > Hello Melanie,
> > >
> > > How much RAM is installed on your MacBook Pro compared to your
> > > colleague's Linux machine?
> > >
> > > How big is your dataset in terms of rows and columns?
> > >
> > > I believe R can handle about 10M datapoints per GB of RAM.
> > >
> >
> > What exactly is that an estimate of? In R, 1GB of RAM will store ~134
> > million datapoints when using numeric matrices/vectors and twice as many
> > as integers or logicals. In practice, you will still need some room for
> > computation on the data, though.
> >
> > Cheers,
> > Simon
> >
> >
> > > Note that datapoints = rows x columns
> > >
> > > Best Regards,
> > >
> > > Ray DiGiacomo, Jr.
> > > Master R Trainer
> > > President, Lion Data Systems LLC
> > > President, The Orange County R User Group
> > > Board Member, TDWI
> > > [email protected]
> > > (Mobile) 408-425-7851
> > > San Juan Capistrano, California
> > >
> > > Check out my one-on-one web-based R courses at
> > > liondatasystems.com/courses
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Oct 25, 2012 at 4:16 PM, Melanie Courtot <[email protected]> wrote:
> > >
> > >> Hi,
> > >>
> > >> I am trying to run R on my MacBook Pro 2.4 GHz Intel Core i5. I am
> > >> trying to read a csv file, which works fine on my work colleague's
> > >> machine (under Linux) but causes my CPU to go up to 100% and makes the
> > >> GUI unresponsive and hangs on the command line. Activity Monitor
> > >> indicates there is only one R thread running.
> > >>
> > >> I did see that by default R was using the BLAS library, which is
> > >> single-threaded, and that there was an option to use vecLib instead.
> > >> I did this, and
> > >> ls -l /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib
> > >> does return
> > >> /Library/Frameworks/R.framework/Resources/lib/libRblas.dylib ->
> > >> libRblas.vecLib.dylib
> > >>
> > >> I however still see the same behavior: 100% CPU, single thread.
> > >>
> > >> I saw that some MacBook Pro (Xeon Nehalem based) had a vecLib bug, so
> > >> I built the ATLAS library and symlinked R to libtatlas.dylib
> > >> (unfortunately the precompiled binaries pointed to in a previous email
> > >> on the list [1] were not available anymore. Building ATLAS was... fun
> > >> ;)) I was able to get the shared libraries (using --shared in my
> > >> config) but still see the same behavior when trying to run my code. I
> > >> was unsure if I should link to libsatlas.dylib or libtatlas.dylib, so
> > >> tried both (I guess the latter was the right one though)
> > >>
> > >> I tried building R from the source (specifying -arch x86_64 and
> > >> --enable-BLAS-shlib to be able to switch libraries), but same behavior
> > >> and it seems it is an identical version to the prepackaged one (I tried
> > >> with BLAS, vecLib and ATLAS)
> > >>
> > >> R info: R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows",
> > >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> > >>
> > >> Any help would be greatly appreciated.
> > >>
> > >> Thanks,
> > >> Melanie
> > >>
> > >>
> > >> [1] https://stat.ethz.ch/pipermail/r-sig-mac/2010-October/007817.html
> > >>
> > >> ---
> > >> Mélanie Courtot
> > >> MSFHR/PCIRN Ph.D. Candidate,
> > >> BCCRC - Terry Fox Laboratory - 12th floor
> > >> 675 West 10th Avenue
> > >> Vancouver, BC
> > >> V5Z 1L3, Canada
> > >>
> > >> _______________________________________________
> > >> R-SIG-Mac mailing list
> > >> [email protected]
> > >> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
> > >>
> > >
> >
> >
>
>


