[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889317#comment-13889317
]
Prashanth Ramaswamy commented on TIKA-245:
--
Nick, Thanks for your response.
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888778#comment-13888778
]
Prashanth Ramaswamy commented on TIKA-245:
--
Hi, I still get the Array index
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888787#comment-13888787
]
Nick Burch commented on TIKA-245:
-
Prashanth - you might be best off opening a new bug for
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857520#comment-13857520
]
Jukka Zitting commented on TIKA-245:
bq. tika is not able to extract contents from chm
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13594074#comment-13594074
]
Tejas Patil commented on TIKA-245:
--
I am working on NUTCH-1454 and I am observing that tika
Tika's CHM support has its limitations. Can you provide such file(s) for
further investigation?
BR,
Oleg
On Wed, Mar 6, 2013 at 1:10 AM, Tejas Patil (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13133260#comment-13133260
]
Tran Nam Quang commented on TIKA-245:
-
@ Oleg
I tested the CHM parser from Tika 0.10 on
Hi Tran Nam Quang,
Currently our CHM extractor skips all entities that are not HTML.
It would be great if you could write a list of desired entities to be
extracted. In addition, if you can, please attach the CHM files you're
working with.
BR,
Oleg
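
To illustrate what "skips all entities that are not HTML" could look like in
practice, here is a minimal sketch. It is not the Tika CHM parser's actual
code: the class name HtmlEntryFilter, its methods, the sample entry names and
the file-extension check are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not taken from the Tika sources.
public class HtmlEntryFilter {

    // Assumption: an entry counts as HTML if its name ends in .html or .htm.
    static boolean isHtmlEntry(String entryName) {
        String lower = entryName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".html") || lower.endsWith(".htm");
    }

    // Keeps the HTML entries and silently skips everything else
    // (images, table-of-contents files, and so on).
    static List<String> keepHtmlEntries(List<String> entryNames) {
        List<String> kept = new ArrayList<>();
        for (String name : entryNames) {
            if (isHtmlEntry(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up entry names of the kind a CHM archive might contain.
        List<String> entries = List.of(
                "/index.html", "/images/logo.png", "/toc.hhc", "/Chapter1.HTM");
        System.out.println(keepHtmlEntries(entries));
    }
}
```

With a filter of this shape, anything stored in non-HTML entries never
reaches the extracted text, which is why a list of desired entity types
helps decide what else should be kept.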
On Sat, Oct 22, 2011 at 8:08 AM, Tran Nam
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13046231#comment-13046231
]
Oleg Tikhonov commented on TIKA-245:
Committed revision 1133556.
Support of CHM Format
Hi Chris,
I've applied the patch to
tika-parsers/src/main/java/org/apache/tika/parser/chm, and also added 3 CHM
files to tika-parsers/src/test/resources/test-documents, along with the tests.
BR,
Oleg
On Sun, Jun 5, 2011 at 1:32 AM, Chris A. Mattmann (JIRA) j...@apache.org wrote:
Hi Oleg,
On Jun 7, 2011, at 6:28 AM, Oleg Tikhonov wrote:
Hi Chris,
I've applied the patch to
tika-parsers/src/main/java/org/apache/tika/parser/chm, and also added 3 CHM
files to tika-parsers/src/test/resources/test-documents, along with the tests.
Thanks, and sorry, I think I confused you with my
Hi,
On Tue, Jun 7, 2011 at 3:52 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Please revert r1132997, and then just modify your patch to make sure that
your java classes and files fit into the appropriate Tika source code area.
Then please attach a new patch real quick so I
Hey Jukka,
On Jun 7, 2011, at 6:55 AM, Jukka Zitting wrote:
Hi,
On Tue, Jun 7, 2011 at 3:52 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Please revert r1132997, and then just modify your patch to make sure that
your java classes and files fit into the appropriate Tika
Hey Jukka,
Thanks for the motivation. I put my money where my mouth was :-)
Oleg, your patch rox. That's all I had to say. My improvement was simply to
commit it to the Tika sources. Feel free to mod/add/whatever on it after that,
per Jukka's comments.
I am going to make one more update, just
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044403#comment-13044403
]
Chris A. Mattmann commented on TIKA-245:
Hi Oleg,
Looking over this patch, I have a
Hello Tran Nam Quang,
It uses the CHMLIB C library, i.e. JNI. In my experience, it works on only a
limited number of OSes; it does not work on Solaris or AIX.
The really good library with limitations mentioned above is
http://sevenzipjbind.sourceforge.net/ and also LGPL (I would say, the best
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13008797#comment-13008797
]
Oleg Tikhonov commented on TIKA-245:
I've implemented chm extractor, based on the same