[jira] [Updated] (TIKA-713) Tika can not parse all of the persian pdf files

2011-10-31 Thread Ahmad Ajiloo (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmad Ajiloo updated TIKA-713: -- Attachment: Simple2.pdf Tika can not parse all of the persian pdf files

[jira] [Commented] (TIKA-713) Tika can not parse all of the persian pdf files

2011-10-31 Thread Ahmad Ajiloo (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140134#comment-13140134 ] Ahmad Ajiloo commented on TIKA-713: --- I'm testing the new Encoding.java file with other Persian

[jira] [Commented] (TIKA-713) Tika can not parse all of the persian pdf files

2011-10-31 Thread Robert Muir (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140148#comment-13140148 ] Robert Muir commented on TIKA-713: -- Thanks for uploading another test file, Ahmad, we'll

[jira] [Updated] (TIKA-713) Tika can not parse all of the persian pdf files

2011-10-31 Thread Ahmad Ajiloo (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmad Ajiloo updated TIKA-713: -- Attachment: Simple3.pdf Complex.pdf I attached these two files for further research.

A problem in the right-to-left languages

2011-10-31 Thread ahmad ajiloo
Hello, When I use Tika to extract text from my Persian PDF files, all the characters come out in reverse order. I mean that the characters are shown from the beginning of the line to the end, but from left to right. However, when I use the Tika GUI via Nutch there is no mistake and the output text is

location of pdfbox in sources of Tika

2011-10-31 Thread ahmad ajiloo
Hello, I have edited a file in the pdfbox project and want to rebuild Tika with this new file. But I can't find the location of the pdfbox sources in the Tika sources to change it. Can anyone help me? Thanks

Re: location of pdfbox in sources of Tika

2011-10-31 Thread Oleg Tikhonov
Hi Ahmad, I hope you built pdfbox using Maven, i.e. by running mvn clean install. If so, the new pdfbox jar file is located in the .m2 local repository. In addition, please find the pom.xml under ../tika-parsers and change the following: <dependency> <groupId>org.apache.pdfbox</groupId>
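
The pom.xml change described in this reply might look roughly like the fragment below. This is only a sketch: the archived message is cut off after the groupId, so the artifactId (pdfbox is the usual Maven coordinate for the PDFBox library) and the version placeholder are assumptions, not text from the original reply. The version would need to match whatever the locally built pdfbox jar was installed as in the .m2 repository.

```xml
<!-- tika-parsers/pom.xml (sketch; artifactId and version are assumptions) -->
<dependency>
  <groupId>org.apache.pdfbox</groupId>
  <artifactId>pdfbox</artifactId>
  <!-- hypothetical placeholder: replace with the version of your local pdfbox build -->
  <version>YOUR-LOCAL-PDFBOX-VERSION</version>
</dependency>
```

With the dependency pointing at the locally installed version, rebuilding Tika with mvn clean install should, under these assumptions, resolve the modified pdfbox jar from the local repository.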

Re: Build failed in Jenkins: Tika-trunk » Apache Tika OSGi bundle #703

2011-10-31 Thread Jukka Zitting
Hi, On Mon, Oct 31, 2011 at 11:32 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Tests in error:  initializationError(org.apache.tika.bundle.BundleIT): java/util/ServiceLoader Oops, I guess the new integration test I added only works with Java 6 and higher. Fixed in revision

Jenkins build is back to normal : Tika-trunk » Apache Tika OSGi bundle #704

2011-10-31 Thread Apache Jenkins Server
See https://builds.apache.org/job/Tika-trunk/org.apache.tika$tika-bundle/704/changes

Jenkins build is back to normal : Tika-trunk #704

2011-10-31 Thread Apache Jenkins Server
See https://builds.apache.org/job/Tika-trunk/704/changes