Dear Uwe,

Thanks very much for your prompt reply.

Below are the requested details. I have also attached a zip file with the two folders that contain the corpus.

###############################
##Full reproducible code:
################################
library("lsa")

# load training text
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)
print(matrix1,bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1) # weighting
space = lsa(matrix1, dims = dimcalc_share()) # create LSA space
#as.textmatrix(space)

# fold-in test and gold standard essays
matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=rownames(matrix1))
matrix2 = lw_bintf(matrix2) # gives NaN if the idf weighting is added, because it divides by 0
matrix2fld = fold_in(matrix2, space)
r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"], method = "pearson") #use = "complete.obs", method = "pearson");
print(r)

######################
#end code
########################


I tried to get a traceback, but adding the traceback() command to the script did not produce anything beyond the original error message.
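
One possible reason for this: when a script run through source() stops with an error, a traceback() call placed later in the same script is never reached, so it normally has to be typed at the console after the error. A minimal sketch of an alternative, assuming the call stack is wanted from inside the script itself: wrap the failing call in withCallingHandlers() and record sys.calls(). The function failing_step() is a hypothetical stand-in for the textmatrix() call, not part of the lsa package.

```r
# Hypothetical stand-in for the failing call; it triggers the same kind
# of data.frame() error as the one reported in this message.
failing_step <- function() {
  data.frame(docs = "a.txt", terms = character(0), Freq = integer(0))
}

captured_calls <- NULL

result <- tryCatch(
  withCallingHandlers(
    failing_step(),
    error = function(e) {
      # Record the call stack at the moment the error is signalled.
      captured_calls <<- sys.calls()
    }
  ),
  error = function(e) {
    # Return the message so the script can continue and report it.
    conditionMessage(e)
  }
)

print(result)          # the error message
print(captured_calls)  # the calls that led to the error
```

Replacing failing_step() with the real textmatrix() call should show which internal call raised the error.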

###########################
#R output, including error message:
###################################

source("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\lsa.R")
$matrix
             D1 D2 D3 D8 D9 D10 D13 D14 D15
1. 11          1  0  0  0  0   0   0   0   0
2. 1493        1  0  0  0  0   0   0   0   0
3. 1503        1  0  0  0  0   0   0   0   0
896. voy       0  0  0  0  2   0   1   0   0
897. vuelv     0  0  0  0  0   0   0   0   0
898. yo        0  0  0  0  0   0   0   0   0
1790. unic     0  0  0  0  0   0   0   0   1
1791. verific  0  0  0  0  0   0   0   0   1
1792. vier     0  0  0  0  0   0   0   0   1

$legend
[1] "D1 = paraR_1.txt"  "D2 = paraR_10.txt" "D3 = paraR_11.txt"
[4] "D8 = paraR_2.txt"  "D9 = paraR_3.txt"  "D10 = paraR_4.txt"
[7] "D13 = paraR_7.txt" "D14 = paraR_8.txt" "D15 = paraR_9.txt"

Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab, :
       arguments imply differing number of rows: 1, 0
In addition: There were 16 warnings (use warnings() to see them)

##########################
#end output
##############################
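
The row counts in the message (1 and 0) suggest, though this is only a guess from the output above, that one file supplied a file name but no terms at all, for example an empty file or one whose words all fall outside the vocabulary or below minWordLength. A minimal sketch for checking the first possibility, assuming plain-text files in a single folder. The helper name find_empty_docs and its simple tokenisation are illustrative only and are not part of the lsa package.

```r
# Hypothetical helper: list files in a folder that yield no tokens of at
# least min_len characters. The tokenisation is a rough approximation,
# not the one used internally by textmatrix().
find_empty_docs <- function(dir, min_len = 2) {
  files <- list.files(dir, full.names = TRUE)
  is_empty <- unlist(lapply(files, function(f) {
    txt <- tolower(paste(readLines(f, warn = FALSE), collapse = " "))
    tokens <- unlist(strsplit(txt, "[^[:alnum:]]+"))
    sum(nchar(tokens) >= min_len) == 0
  }))
  basename(files[is_empty])
}

# Self-contained demonstration in a temporary folder.
demo_dir <- file.path(tempdir(), "lsa_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("yo voy a casa", file.path(demo_dir, "ok.txt"))
writeLines("", file.path(demo_dir, "empty.txt"))

empty_docs <- find_empty_docs(demo_dir)
print(empty_docs)
```

Pointing find_empty_docs() at the respuestas folder instead of the demo folder would show whether any answer file is effectively empty.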

R version: R 2.5.1 (running on Windows XP)
LSA package: lsa_0.57
Rstem package 0.3-0 (available at www.omegahat.org/Rstem/)

Thanks in advance for your advice.

Tina.

>From: "Uwe Ligges" <[EMAIL PROTECTED]>
>To: "Walter Rojas" <[EMAIL PROTECTED]>
>Cc: <r-help@stat.math.ethz.ch>
>Date: August 19, 2007 08:45:28 AM PDT
>Subject: Re: [R] Problem with lsa package (data.frame) on Windows XP
>
>Please specify reproducible examples, it is almost impossible to help
>otherwise. Also, please provide all error messages and a traceback().
>Please tell us versions of R and versions of the packages you are using.
>If you are sure this is an error in the package, please send that
>reproducible example to the package maintainer.
>
>Uwe Ligges
>
>
>Walter Rojas wrote:
>> Dear R team,
>>
>> The following piece of code (to use the lsa package) works fine on my
>> mac os x, but when I run the same code on Windows XP, it doesn't work
>> any more.
>>
>> ### code:
>> library("lsa")
>> matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)
>> print(matrix1,bag_lines = 3, bag_cols = 3)
>> matrix1 = lw_bintf(matrix1) * gw_idf(matrix1)
>> space = lsa(matrix1, dims = dimcalc_share())
>> as.textmatrix(space)
>>
>> ### the following line fails on windows XP
>> matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=rownames(matrix1))
>> matrix2 = lw_bintf(matrix2)
>> matrix2fld = fold_in(matrix2, space)
>> r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
>> method = "pearson")
>> print(r)
>>
>>
>> An error occurs when creating the second textmatrix with the
>> vocabulary of the first. The error I get is:
>>
>> in data.frame(docs = basename(file), terms = names(tab), Freq = tab,  :
>>          arguments imply differing number of rows: 1, 0
>>
>> When I change the vocabulary argument to NULL, it doesn't report this
>> error any more; however, then the code will fail on the fold_in
>> method further down.
>>
>> I found another user who reported this same problem on-line; however,
>> I didn't find any answers.
>>
>> Thank you very much in advance for your reply.
>> Tine.
>>
>> ______________________________________________
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
