Hello all,
I used the dictionary option of DocumentTermMatrix() to count three terms
across a corpus of 102 documents, but inspect() displays counts for only 10
of them. I need the counts for all 102 documents. Any idea why this happens,
and how do I get results for every document? See my code below.
> myTerms <- c("prostatic", "adenocarcinoma", "grade")
> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
<<DocumentTermMatrix (documents: 102, terms: 3)>>
Non-/sparse entries: 292/14
Sparsity : 5%
Maximal term length: 14
Weighting : term frequency (tf)
Sample :
               Terms
Docs            adenocarcinoma grade prostatic
  Patient14.txt             11     6         3
  Patient15.txt              7    12         2
  Patient16.txt             13    16         4
  Patient19.txt              5    13         2
  Patient24.txt             11    12         4
  Patient25.txt              8     9         4
  Patient41.txt              8    10         4
  Patient46.txt              8    10         3
  Patient8.txt               9    12         2
  Patient9.txt               8    23         2
Pat
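
A possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: the header already reports `documents: 102`, and the `Sample :` label suggests that inspect() is printing only a subset of rows, not that the dictionary missed the other documents. Assuming the tm package is installed, converting the matrix with as.matrix() should expose every row. The 12-document corpus below is a hypothetical stand-in for the real patient files.

```r
library(tm)

# Hypothetical stand-in corpus: 12 tiny documents instead of the 102 patient files.
texts <- rep("prostatic adenocarcinoma of high grade", 12)
docs <- VCorpus(VectorSource(texts))

myTerms <- c("prostatic", "adenocarcinoma", "grade")
dtm <- DocumentTermMatrix(docs, control = list(dictionary = myTerms))

# Convert to an ordinary matrix so that every document is available,
# not just the sample that inspect() prints.
m <- as.matrix(dtm)
dim(m)    # number of documents, number of dictionary terms
head(m)   # first few rows; print m itself to see all of them
```

With the real corpus, the same as.matrix() call on the 102-document matrix should give one row per patient file, which can then be printed or written out with write.csv().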