Dear Uwe,
Thanks very much for your prompt reply.
I include the following pieces of information, alongside a zip file with two
folders where the corpus resides.
###
##Full reproducible code:
library(lsa)
# load training text
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
    stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1,
    stopwords=NULL, vocabulary=NULL)
print(matrix1, bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1) # weighting
space = lsa(matrix1, dims = dimcalc_share()) # create LSA space
#as.textmatrix(space)
# fold-in test and gold standard essays
matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\",
    stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1,
    stopwords=NULL, vocabulary=rownames(matrix1))
# yields NaN if the idf weight is added, because it divides by 0
matrix2 = lw_bintf(matrix2)
matrix2fld = fold_in(matrix2, space)
r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
    method = "pearson") # alternative: use = "complete.obs", method = "pearson"
print(r)
##
#end code
I tried to run traceback(), but including the call in the script made no
difference: I still get only the original error message.
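A possible explanation is that source() stops at the first error, so a
traceback() call placed after the failing line in the script is never reached.
Below is a minimal base-R sketch of one way to record the call stack from
inside the script. The helper name with_trace and the stand-in function f are
hypothetical; f only triggers the same kind of data.frame() error, so that the
sketch runs without the lsa package or the corpus.

```r
# Hypothetical helper: evaluate an expression and, if it fails,
# keep both the error object and the call stack at the point of failure.
with_trace <- function(expr) {
  calls <- NULL
  result <- tryCatch(
    withCallingHandlers(expr, error = function(e) {
      calls <<- sys.calls()  # stack as it was when the error was signalled
    }),
    error = function(e) e    # return the error instead of aborting
  )
  list(result = result, calls = calls)
}

# Stand-in for the failing call: data.frame() with columns of length 1 and 0.
f <- function() data.frame(docs = "a.txt", terms = character(0))

out <- with_trace(f())
print(conditionMessage(out$result))  # the error message
print(out$calls)                     # the recorded call stack
```

In the real script the same wrapper could be placed around the second
textmatrix() call, so that the stack is printed even though source() aborts.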
###
#R output, including error message:
###
source("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\lsa.R")
$matrix
              D1 D2 D3 D8 D9 D10 D13 D14 D15
1.    11       1  0  0  0  0   0   0   0   0
2.    1493     1  0  0  0  0   0   0   0   0
3.    1503     1  0  0  0  0   0   0   0   0
896.  voy      0  0  0  0  2   0   1   0   0
897.  vuelv    0  0  0  0  0   0   0   0   0
898.  yo       0  0  0  0  0   0   0   0   0
1790. unic     0  0  0  0  0   0   0   0   1
1791. verific  0  0  0  0  0   0   0   0   1
1792. vier     0  0  0  0  0   0   0   0   1
$legend
[1] D1 = paraR_1.txt D2 = paraR_10.txt D3 = paraR_11.txt
[4] D8 = paraR_2.txt D9 = paraR_3.txt D10 = paraR_4.txt
[7] D13 = paraR_7.txt D14 = paraR_8.txt D15 = paraR_9.txt
Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab, :
  arguments imply differing number of rows: 1, 0
In addition: There were 16 warnings (use warnings() to see them)
##
#end output
##
R version: R 2.5.1 (running on Windows XP)
LSA package: lsa_0.57
Rstem package: 0.3-0 (available at www.omegahat.org/Rstem/)
Thanks in advance for your advice.
Tina.
From: Uwe Ligges [EMAIL PROTECTED]
To: Walter Rojas [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Date: August 19, 2007 08:45:28 AM PDT
Subject: Re: [R] Problem with lsa package (data.frame) on Windows XP
Please specify reproducible examples, it is almost impossible to help
otherwise. Also, please provide all error messages and a traceback().
Please tell us versions of R and versions of the packages you are using.
If you are sure this is an error in the package, please send that
reproducible example to the package maintainer.
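The version details requested above can be collected from within R. A minimal
sketch, using only base R; the two package names are the ones mentioned in
this thread, and a package that is not installed is reported as such rather
than causing an error:

```r
# Packages named in this thread; adjust the vector as needed.
pkgs <- c("lsa", "Rstem")

# Look up each package in the installed-packages table.
ip <- installed.packages()
versions <- vapply(pkgs, function(p) {
  if (p %in% rownames(ip)) unname(ip[p, "Version"]) else "not installed"
}, character(1))

print(R.version.string)  # version of R itself
print(versions)          # named character vector, one entry per package
```

Calling sessionInfo() gives a fuller summary, including the platform.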
Uwe Ligges
Walter Rojas wrote:
Dear R team,
The following piece of code (which uses the lsa package) works fine on
Mac OS X, but when I run the same code on Windows XP, it fails.
### code:
library(lsa)
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
    stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1,
    stopwords=NULL, vocabulary=NULL)
print(matrix1, bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1)
space = lsa(matrix1, dims = dimcalc_share())
as.textmatrix(space)
### the following line fails on Windows XP
matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\",
    stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1,
    stopwords=NULL, vocabulary=rownames(matrix1))
matrix2 = lw_bintf(matrix2)
matrix2fld = fold_in(matrix2, space)
r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
    method = "pearson")
print(r)
An error occurs when creating the second textmatrix with the
vocabulary of the first. The error I get is:
in data.frame(docs = basename(file), terms = names(tab), Freq = tab, :
arguments imply differing number of rows: 1, 0
When I set the vocabulary argument to NULL, this error no longer appears;
however, the code then fails at the fold_in call further down.
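One way to narrow this down is to process the files one at a time and note
which of them fail. The sketch below is only an illustration under stated
assumptions: count_terms is a hypothetical stand-in written in base R, not the
lsa package's code, and it assumes (the thread does not confirm this) that the
error is triggered by a file in which no word matches the vocabulary. The file
names and contents are made up for the example.

```r
# Hypothetical stand-in for the per-file step: count the words of one file
# that occur in the vocabulary and return them as a data frame.
count_terms <- function(file, vocabulary) {
  words <- tolower(scan(file, what = character(), quiet = TRUE))
  hits  <- words[words %in% vocabulary]
  terms <- unique(hits)  # character(0) when nothing matches
  data.frame(docs  = basename(file),
             terms = terms,
             Freq  = as.integer(table(factor(hits, levels = terms))))
}

# Made-up corpus: one file shares words with the vocabulary, one does not.
dir <- tempfile("docs")
dir.create(dir)
writeLines("voy yo voy", file.path(dir, "a.txt"))
writeLines("zzz qqq",    file.path(dir, "b.txt"))
vocab <- c("voy", "yo")

# Run each file separately; try() keeps one failure from stopping the loop.
files   <- list.files(dir, full.names = TRUE)
results <- lapply(files, function(f) try(count_terms(f, vocab), silent = TRUE))
failed  <- basename(files)[vapply(results, inherits, logical(1),
                                  what = "try-error")]
print(failed)  # files for which the stand-in raised an error
```

With the real data, the same loop could call textmatrix() on each file in the
respuestas folder with vocabulary=rownames(matrix1), to see whether the error
is tied to particular files.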
I found another user who reported this same problem on-line; however,
I didn't find any answers.
Thank you very much in advance for your reply.
Tine.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code