Re: [R] Invalid input error in tm package

2011-02-24 Thread Kutsal Yesilkagit

Dear Shreyasee,

I got exactly the same error message. After converting the PDF files to plain
text files, I reran the command without any problems. I'm no expert, but the
problem seems to be that the PDF files cannot be read directly as text.
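
A minimal sketch of that workaround follows. It assumes the PDFs have already
been converted to plain text outside R and that a recent version of tm is
installed. The helper is_pdf(), the temporary directory, and the stand-in
files are hypothetical and only illustrate the idea. The check relies on PDF
files beginning with the signature %PDF-, which is also the first line printed
by inspect() in the original post.

```r
# Sketch only: is_pdf() and the temporary files below are illustrative
# assumptions, not part of the original thread or of the tm package.

# TRUE if the file starts with the PDF signature "%PDF-", i.e. it is still a
# binary PDF that readLines() cannot turn into meaningful text.
is_pdf <- function(path) {
  magic <- readBin(path, what = "raw", n = 5L)
  identical(magic, charToRaw("%PDF-"))
}

# Stand-in files in a temporary directory (the real ones live in D:/Files).
txt_dir <- file.path(tempdir(), "files_txt")
dir.create(txt_dir, showWarnings = FALSE)
txt_file <- file.path(txt_dir, "converted.txt")
writeLines("Text extracted from the PDF.", txt_file)

fake_pdf <- file.path(tempdir(), "raw.pdf")
writeBin(c(charToRaw("%PDF-1.3\n"), as.raw(c(0xC5, 0xFE, 0xEB, 0xD7))), fake_pdf)

is_pdf(fake_pdf)  # still a PDF: convert it before building the corpus
is_pdf(txt_file)  # plain text: safe to read as text

# With the converted files in place, build the corpus from plain text.
# Guarded so the sketch also runs where tm is not installed.
if (requireNamespace("tm", quietly = TRUE)) {
  corpus <- tm::VCorpus(
    tm::DirSource(txt_dir, pattern = "\\.txt$"),
    readerControl = list(reader = tm::readPlain, language = "en")
  )
  tm::inspect(corpus)
}
```

If is_pdf() returns TRUE for any file in the source directory, that file still
needs converting before DirSource() reads it as text.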

Kind regards

Kutsal
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Invalid-input-error-in-tm-package-tp1082961p3322539.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Invalid input error in tm package

2010-01-21 Thread Shreyasee
Hello,

I am working with the tm package.
I have two PDF files saved in the directory D:/Files.
I issued the following commands, which produced the warnings and the error
shown after each command:

> surgj <- Corpus(DirSource("D:/Files"),
+                 readerControl = list(language = "ansi"))

Warning messages:
1: In readLines(y, encoding = x$Encoding) :
  incomplete final line found on 'D:/Files/provmedsurgj00978-0005b.pdf'
2: In readLines(y, encoding = x$Encoding) :
  incomplete final line found on 'D:/Files/provmedsurgj00978-0007.pdf'

> inspect(surgj)

A corpus with 2 text documents

The metadata consists of 2 tag-value pairs and a data frame
Available tags are:
  create_date creator
Available variables in the data frame are:
  MetaID

[[1]]
%PDF-1.3
Error: invalid input '%Åþë×' in 'utf8towcs'

Could anybody help me to identify where I went wrong and what I need to do
to proceed further?

Thanks,
Shreyasee
