Re: [R] Problem with lsa package (data.frame) on Windows XP

2007-08-20 Thread Tine Stalmans

Dear Uwe,

Thanks very much for your prompt reply.

I include the following pieces of information, alongside a zip file with two 
folders where the corpus resides.


###
##Full reproducible code:

library(lsa)

# load training text
matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
                     stemming=TRUE, language="spanish",
                     minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)

print(matrix1,bag_lines = 3, bag_cols = 3)
matrix1 = lw_bintf(matrix1) * gw_idf(matrix1) # weighting
space = lsa(matrix1, dims = dimcalc_share()) # create LSA space
#as.textmatrix(space)

# fold-in test and gold standard essays
matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\",
                     stemming=TRUE, language="spanish",
                     minWordLength=2, minDocFreq=1, stopwords=NULL,
                     vocabulary=rownames(matrix1))
# adding the idf weight here yields NaN because it divides by 0
matrix2 = lw_bintf(matrix2)

matrix2fld = fold_in(matrix2, space)
r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
         method = "pearson") # use = "complete.obs", method = "pearson"

print(r)

##
#end code
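
A note on the NaN remark in the code comment: the sketch below shows, in
base R only, how a term that occurs in no document can turn a weighted
matrix into NaN. It assumes an inverse document frequency of the form
log2(N / df) + 1, which may differ from what gw_idf() computes internally,
and the toy matrix and variable names are made up for illustration.

```r
# Toy term-by-document counts. Term "c" occurs in no document, as can
# happen when the vocabulary is taken from a different corpus.
m <- matrix(c(1, 0,
              2, 1,
              0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("d1", "d2")))

doc_freq <- rowSums(m > 0)           # documents containing each term
idf <- log2(ncol(m) / doc_freq) + 1  # Inf for "c": division by zero
local_w <- (m > 0) * 1               # binary local weight (assumed form)
weighted <- local_w * idf            # 0 * Inf gives NaN in the "c" row
print(weighted)
```

If this is what happens in the real data, leaving out the idf factor (as the
code above does) avoids the NaN values, at the cost of unweighted terms.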



I tried to run traceback(); however, including this command in the script 
did not change the original error message.
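
For context: traceback() prints the call stack of the last uncaught error,
so a call placed in the script after the failing line is never reached once
source() stops. It is normally typed at the prompt right after the error
appears. As an alternative, the sketch below records the call stack from
inside a script. The two functions are hypothetical stand-ins, not part of
lsa.

```r
# Hypothetical stand-ins for a script whose inner step fails.
read_one_file <- function(path) stop("could not tabulate ", path)
build_matrix  <- function(paths) lapply(paths, read_one_file)

calls <- NULL
msg <- tryCatch(
  withCallingHandlers(
    build_matrix("respId1.txt"),
    # Runs while the failing frames are still on the stack.
    error = function(e) calls <<- sys.calls()
  ),
  # Runs after unwinding; keeps only the error message.
  error = function(e) conditionMessage(e)
)
print(msg)
print(calls)  # the chain of calls that led to the error
```

The printed call list plays the same role as the traceback() output and can
be pasted into a reply.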


###
#R output, including error message:
###


source("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\lsa.R")

$matrix
              D1 D2 D3 D8 D9 D10 D13 D14 D15
1.    11       1  0  0  0  0   0   0   0   0
2.    14931       0  0  0  0   0   0   0   0
3.    15031       0  0  0  0   0   0   0   0
896.  voy      0  0  0  0  2   0   1   0   0
897.  vuelv    0  0  0  0  0   0   0   0   0
898.  yo       0  0  0  0  0   0   0   0   0
1790. unic     0  0  0  0  0   0   0   0   1
1791. verific  0  0  0  0  0   0   0   0   1
1792. vier     0  0  0  0  0   0   0   0   1

$legend
[1] "D1 = paraR_1.txt"  "D2 = paraR_10.txt" "D3 = paraR_11.txt"
[4] "D8 = paraR_2.txt"  "D9 = paraR_3.txt"  "D10 = paraR_4.txt"
[7] "D13 = paraR_7.txt" "D14 = paraR_8.txt" "D15 = paraR_9.txt"

Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab,  :
  arguments imply differing number of rows: 1, 0
In addition: There were 16 warnings (use warnings() to see them)

##
#end output
##
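
The "1, 0" in the error message says that one argument of the data.frame()
call had one row and another had none. One way this can arise, offered here
as a guess rather than a confirmed diagnosis, is a file whose term table is
empty (for example an empty file, or one in which no token survives
filtering against the vocabulary). The sketch below reproduces the message
in base R. The file name and the empty token vector are placeholders, and
as.vector(tab) is used for the Freq column to keep the example simple.

```r
# Placeholder for one document's tokens after filtering: nothing is left,
# so the frequency table is empty.
tokens <- character(0)
tab <- table(tokens)

msg <- tryCatch(
  {
    # One row for docs, zero rows for terms and Freq.
    data.frame(docs = "respId1.txt", terms = names(tab),
               Freq = as.vector(tab))
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this matches the real cause, checking the respuestas folder for empty or
unreadable files would be a reasonable first step.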

R version: R 2.5.1 (running on Windows XP)
LSA package: lsa_0.57
Rstem package 0.3-0 (available at www.omegahat.org/Rstem/)

Thanks in advance for your advice.

Tina.

From: Uwe Ligges [EMAIL PROTECTED]
To: Walter Rojas [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Date: August 19, 2007 08:45:28 AM PDT
Subject: Re: [R] Problem with lsa package (data.frame) on Windows XP

Please provide reproducible examples; it is almost impossible to help
otherwise. Also, please provide all error messages and a traceback().
Please tell us versions of R and versions of the packages you are using.
If you are sure this is an error in the package, please send that
reproducible example to the package maintainer.

Uwe Ligges


Walter Rojas wrote:
 Dear R team,

 The following piece of code (using the lsa package) works fine on
 Mac OS X, but when I run the same code on Windows XP, it no longer
 works.

 ### code:
 library(lsa)
 matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\cuentos\\",
 stemming=TRUE, language="spanish",
 minWordLength=2, minDocFreq=1, stopwords=NULL, vocabulary=NULL)
 print(matrix1,bag_lines = 3, bag_cols = 3)
 matrix1 = lw_bintf(matrix1) * gw_idf(matrix1)
 space = lsa(matrix1, dims = dimcalc_share())
 as.textmatrix(space)

 ### the following line fails on Windows XP
 matrix2 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE.000\\LSA\\respuestas\\",
 stemming=TRUE, language="spanish",
 minWordLength=2, minDocFreq=1, stopwords=NULL,
 vocabulary=rownames(matrix1))
 matrix2 = lw_bintf(matrix2)
 matrix2fld = fold_in(matrix2, space)
 r <- cor(matrix2fld[,"respId1.txt"], matrix2fld[,"respAl1.txt"],
 method = "pearson")
 print(r)


 An error occurs when creating the second textmatrix with the
 vocabulary of the first. The error I get is:

 Error in data.frame(docs = basename(file), terms = names(tab), Freq = tab,  :
   arguments imply differing number of rows: 1, 0

 When I change the vocabulary argument to NULL, it doesn't report this
 error any more; however, then the code will fail on the fold_in
 method further down.

 I found another user who reported this same problem on-line; however,
 I didn't find any answers.

 Thank you very much in advance for your reply.
 Tine.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with lsa package (data.frame) on Windows XP

2007-08-19 Thread Uwe Ligges
Please provide reproducible examples; it is almost impossible to help 
otherwise. Also, please provide all error messages and a traceback(). 
Please tell us versions of R and versions of the packages you are using.
If you are sure this is an error in the package, please send that 
reproducible example to the package maintainer.

Uwe Ligges

