[R] Cannot allocate vector size of... ?

2007-03-15 Thread Wingfield, Jerad G.
Hello all, 

 

I've been working with R and Fridolin Wild's lsa package a bit over the
past few months, but I'm still pretty much a novice. I have a lot of
files that I want to use to create a semantic space. When I run
the initial textmatrix( ), it runs for about 3-4 hours and eventually
gives me an error. It's always ERROR: cannot allocate vector of size
xxx Kb. I imagine this might be my computer running out of memory, but
I'm not sure. So I thought I would send this to the community at large
for any help/thoughts.

 

I searched the archives and didn't really find anything that specifically
speaks to my situation. So I guess I have a few questions. First, is
this actually an issue with the machine running out of memory? If not,
what might be the cause of the error? If so, is there a way to minimize
the amount of memory used by the vector data structures (e.g., Berkeley
DB)?

 

Thanks,

Gabe Wingfield
IT and Program Specialist I
Center for Applied Social Research
University of Oklahoma
2 Partners Place
3100 Monitor, Suite 100
Norman, OK 73072



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot allocate vector size of... ?

2007-03-15 Thread Wingfield, Jerad G.

Oops. Yep, I totally forgot my specs and such. I'm currently running
R-2.4.1 on a 64-bit Linux box (Fedora Core 6) with 4GB of RAM. The files
are 10-50Kb on average, but this error came about when only working with
~16,000 of them. The final size of the corpus is ~1.7M files. So,
obviously, this memory thing is going to be a large issue for me.
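A back-of-the-envelope estimate helps judge whether memory is a plausible culprit. The sketch below assumes the term-by-document matrix is held as a dense matrix of 8-byte doubles; the helper `dense_matrix_gb` and the vocabulary size `n_terms` are hypothetical placeholders, since the thread does not report how many distinct terms the corpus contains.

```r
# Rough size of a dense term-by-document matrix of doubles, in GB.
# dense_matrix_gb is a hypothetical helper, not part of the lsa package.
dense_matrix_gb <- function(n_terms, n_docs, bytes_per_cell = 8) {
  n_terms * n_docs * bytes_per_cell / 1024^3
}

# Assumed vocabulary size, for illustration only (not from the thread).
n_terms <- 50000

est_16k  <- dense_matrix_gb(n_terms, 16000)   # the ~16,000-file subset
est_full <- dense_matrix_gb(n_terms, 1.7e6)   # the full ~1.7M-file corpus

cat(sprintf("16,000 docs: %.1f GB\n", est_16k))
cat(sprintf("1.7M docs:   %.1f GB\n", est_full))
```

Under these assumptions the 16,000-file subset alone comes to roughly 6 GB, which already exceeds 4GB of RAM, and the full corpus would need hundreds of GB. The real figure depends on the actual vocabulary size and on any intermediate copies made while the matrix is built.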

I'm going through re-searching the help list archives and now it looks
like I have S Poetry to read as well. 

Thanks for all the suggestions. Any others are greatly appreciated as
well.

Gabe Wingfield
IT and Program Specialist I
Center for Applied Social Research
University of Oklahoma
2 Partners Place
3100 Monitor, Suite 100
Norman, OK 73072

-Original Message-
From: Patrick Burns [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 15, 2007 12:31 PM
To: Wingfield, Jerad G.
Subject: Re: [R] Cannot allocate vector size of... ?

You can find a few things not to do (things that waste memory)
in S Poetry.  You don't say how much memory your machine
has, nor how big your objects are.  However, it is possible that
getting more memory for your machine might be the best thing
to do.


Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)



[R] new to R: don't understand errors

2006-10-03 Thread Wingfield, Jerad G.
Hello all, 

 

I'm brand new to the use of R, and I'm trying to quickly learn the
rudiments for a couple of projects here at work. I'm working with the
lsa package and trying to generate various semantic spaces. I seem to do
well with small collections of clean text files, but now that I am
trying to work with larger collections of less-than-perfect files,
I'm getting errors that I don't quite understand. So I'm hoping some of
you out there might recognize my issues and be able to point me in the
right direction to resolve them.

 

Currently, I have a corpus of ~12,000 text files. I've separated them
out into other folders of varying sizes to check whether there is some
sort of limit on the number of files. Even when I use only as many files
as in previous working collections, I still get the errors. So I am
wondering if it might be something in the files themselves...

 

At any rate I routinely get these two errors. The first is generated
when I include a minDocFreq=x, and it looks a little like this when I
run it:

 

  data(stopwords_en)
  CCauto = textmatrix( CultureMineTXT , minWordLength=3, minDocFreq=50, stopwords=stopwords_en)
  Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab,  :
    arguments imply differing number of rows: 1, 0

 

If I remove the minDocFreq, I get a different error:

 

  data(stopwords_en)
  CCauto = textmatrix( CultureMineTXT , minWordLength=3, stopwords=stopwords_en)
  Error in as.vector(x, mode) : invalid argument 'mode'
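One possible reading of "differing number of rows: 1, 0" is that some file yields no terms at all once filtering is applied, which would fit the suspicion that the problem lies in the files themselves. That is an assumption, not something confirmed in this thread. A minimal sketch for flagging such files with base R follows; `find_suspect_files` is a hypothetical helper, and its tokenization only approximates whatever textmatrix( ) does internally.

```r
# List files that are empty or contain no word of at least min_len letters.
# find_suspect_files is a hypothetical helper, not part of the lsa package.
find_suspect_files <- function(dir, min_len = 3) {
  files <- list.files(dir, full.names = TRUE)
  files <- files[!file.info(files)$isdir]          # skip sub-directories
  is_suspect <- vapply(files, function(f) {
    txt <- paste(readLines(f, warn = FALSE), collapse = " ")
    # Replace bytes that cannot be represented in ASCII, so that
    # tolower() does not fail on badly encoded input.
    txt <- iconv(txt, from = "", to = "ASCII", sub = " ")
    words <- unlist(strsplit(tolower(txt), "[^a-z]+"))
    !any(nchar(words) >= min_len)                  # TRUE if nothing usable
  }, logical(1))
  basename(files[is_suspect])
}

# Small self-contained demonstration in a temporary directory.
demo_dir <- file.path(tempdir(), "corpus_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("a perfectly ordinary document", file.path(demo_dir, "ok.txt"))
writeLines("", file.path(demo_dir, "empty.txt"))
writeLines("1 2 3 -- ..", file.path(demo_dir, "noise.txt"))

suspects <- find_suspect_files(demo_dir)
print(suspects)
```

Running the helper on the real corpus folder and setting aside whatever it reports would show whether the errors follow those files. If they persist with a cleaned folder, the cause lies elsewhere.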

 

Any help would be greatly appreciated.

  

Gabe Wingfield
IT and Program Specialist I
Center for Applied Social Research
University of Oklahoma
3200 Marshall Avenue, Suite 201
Norman, OK 73072

P: 405-325-4786
F: 405-321-6936
[EMAIL PROTECTED]

 

