Dear R-users,

I've written a script that produces a frequency table for a group of
texts. The table has a total frequency for each word type and
individual frequency counts for each of the files. (I have not
included the code for creating the column headers.) Below is a sample:

Word    Total  01.txt  02.txt  03.txt  04.txt  05.txt
the     22442    2667    3651    1579    2132    3097
I       18377    3407     454     824     449    3746
and     15521    2377    2174     891    1006    2450
to      13598    1716    1395     905    1021    1983
of      12834    1647    1557     941    1127    1887
it      12440    2160     916     497     493    2449
you     12036    2283     356     293     106    2435

I've encountered two problems when I try to construct and save the file.

"combined.sorted.freq.list" is a named integer vector: the values are
the total frequency counts for each word, and the names are the words.
For each individual file I've created a frequency list that is sorted
in the order of the combined list (NAs have been replaced with 0).
These are called "combined." plus the number of the file, e.g.
combined.01.
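
To make that structure concrete, here is a toy sketch of how such objects could be built. The two short word vectors and the helper align.to.combined() are made up for illustration; they are not the code from the actual script.

```r
# Toy stand-ins for two texts (made-up words, not the real corpus)
text.01 <- c("the", "I", "the", "and", "the", "I")
text.02 <- c("the", "and", "to", "and")

# Combined frequencies as a named integer vector, sorted by decreasing count
combined.counts <- table(c(text.01, text.02))
combined.freq.list <- setNames(as.integer(combined.counts),
                               names(combined.counts))
combined.sorted.freq.list <- sort(combined.freq.list, decreasing = TRUE)

# Hypothetical helper: count the words of one text and return the counts
# in the order of the combined list, with 0 for words the text lacks
align.to.combined <- function(words, reference.names) {
  counts <- table(words)
  aligned <- as.integer(counts)[match(reference.names, names(counts))]
  aligned[is.na(aligned)] <- 0L  # words absent from this text
  aligned
}

combined.01 <- align.to.combined(text.01, names(combined.sorted.freq.list))
combined.02 <- align.to.combined(text.02, names(combined.sorted.freq.list))
```

Because every per-file vector follows the word order of the combined list, the vectors can be placed side by side as columns of one table.
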
If I were to write the line to save the file manually, it would look like this:

combined.table <- paste(names(combined.sorted.freq.list),
    combined.sorted.freq.list, combined.01, combined.02, combined.03,
    combined.04, combined.05, combined.06, combined.07, combined.08,
    combined.09, combined.10, combined.11, combined.12, sep="\t")
# creates a table with columns for the combined list and all of the
# component lists

However, each time I run the script there may be a different number
of text files. I therefore created a character vector holding the names
of the individual frequency lists, called "combined.file.list":

combined.file.count <- seq_along(selected.files)
# numbers the files originally selected (1, 2, ...)
combined.file.list <- sprintf("combined.%02d", combined.file.count)
# creates the names of the combined lists by joining "combined" and each
# file number with a period; the number is zero-padded so that the names
# match combined.01 ... combined.12 above (paste() would give combined.1)

I then tried to include it as one of the elements to be pasted by using get().

combined.table <- paste(names(combined.sorted.freq.list),
    combined.sorted.freq.list, get(combined.file.list[]), sep="\t")
# intended to create a table with columns for the combined list and all
# of the component lists

Unfortunately, get() only retrieves the first component list, since
it can apparently access only one object at a time.

This results in a table with only the total frequency and the count
for the first text:

Word    Total  01.txt
the     22442    2667
I       18377    3407
and     15521    2377
to      13598    1716
of      12834    1647
it      12440    2160
you     12036    2283

If I try to construct the table "piece by piece" as the lists are
created, I get an error message saying that a vector of more than
1.3 Gb cannot be created. Does anyone know how I could use get(), or
some other method, to access all of the objects named in a vector?
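
One possible direction, sketched here with toy data: base R's mget() takes a character vector of names and returns a list of the corresponding objects, and do.call() can then hand that list to paste() as separate arguments. The toy values, the zero-padded object names, and the commented-out output file name are assumptions for illustration, not part of the original script, and this sketch does not address the memory error.

```r
# Toy data standing in for the real frequency lists (values are made up)
combined.sorted.freq.list <- c(the = 4L, and = 3L, I = 2L, to = 1L)
combined.01 <- c(3L, 1L, 2L, 0L)
combined.02 <- c(1L, 2L, 0L, 1L)
selected.files <- c("01.txt", "02.txt")

# Names of the per-file objects: "combined.01", "combined.02", ...
combined.file.list <- sprintf("combined.%02d", seq_along(selected.files))

# mget() looks up every name in the vector and returns a named list
component.lists <- mget(combined.file.list)

# Option 1: one tab-separated string per word, however many files exist
paste.args <- c(list(names(combined.sorted.freq.list),
                     combined.sorted.freq.list),
                unname(component.lists),
                list(sep = "\t"))
combined.table <- do.call(paste, paste.args)

# Option 2: a data frame with one column per file
combined.df <- data.frame(Word = names(combined.sorted.freq.list),
                          Total = unname(combined.sorted.freq.list),
                          component.lists,
                          stringsAsFactors = FALSE)
names(combined.df)[-(1:2)] <- selected.files

# Hypothetical output step; the file name is a placeholder
# write.table(combined.df, "combined.table.txt", sep = "\t",
#             quote = FALSE, row.names = FALSE)
```

The data frame variant keeps the counts numeric and lets write.table() produce the header row, which may be simpler than pasting the columns into strings by hand.
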

Many thanks for any help you can offer!

Joseph

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
