[jira] [Commented] (TIKA-245) Support of CHM Format

2014-02-03 Thread Prashanth Ramaswamy (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889317#comment-13889317 ] Prashanth Ramaswamy commented on TIKA-245: Nick, thanks for your response.

[jira] [Commented] (TIKA-245) Support of CHM Format

2014-02-01 Thread Prashanth Ramaswamy (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888778#comment-13888778 ] Prashanth Ramaswamy commented on TIKA-245: Hi, I still get the Array index

[jira] [Commented] (TIKA-245) Support of CHM Format

2014-02-01 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888787#comment-13888787 ] Nick Burch commented on TIKA-245: Prashanth - you might be best off opening a new bug for

[jira] [Commented] (TIKA-245) Support of CHM Format

2013-12-27 Thread Jukka Zitting (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857520#comment-13857520 ] Jukka Zitting commented on TIKA-245: bq. tika is not able to extract contents from chm

[jira] [Commented] (TIKA-245) Support of CHM Format

2013-03-05 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594074#comment-13594074 ] Tejas Patil commented on TIKA-245: I am working on NUTCH-1454 and I am observing that tika

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2013-03-05 Thread Oleg Tikhonov
Tika CHM support has its limitations. Can you provide such file(s) for further investigation? BR, Oleg On Wed, Mar 6, 2013 at 1:10 AM, Tejas Patil (JIRA) j...@apache.org wrote: [

[jira] [Commented] (TIKA-245) Support of CHM Format

2011-10-22 Thread Tran Nam Quang (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133260#comment-13133260 ] Tran Nam Quang commented on TIKA-245: @ Oleg I tested the CHM parser from Tika 0.10 on

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-10-22 Thread Oleg Tikhonov
Hi Tran Nam Quang, Currently our CHM extractor skips all entities that are not HTML. It would be great if you could write a list of desired entities to be extracted. In addition, if you can, please attach the CHM files you're working with. BR, Oleg On Sat, Oct 22, 2011 at 8:08 AM, Tran Nam

[jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-08 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046231#comment-13046231 ] Oleg Tikhonov commented on TIKA-245: Committed revision 1133556. Support of CHM Format

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-07 Thread Oleg Tikhonov
Hi Chris, I've applied the patch to the tika-parsers/src/main/java/org/apache/tika/parser/chm, also added 3 chm files to the tika-parsers\src\test\resources\test-documents and the tests. BR, Oleg On Sun, Jun 5, 2011 at 1:32 AM, Chris A. Mattmann (JIRA) j...@apache.org wrote: [

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-07 Thread Mattmann, Chris A (388J)
Hi Oleg, On Jun 7, 2011, at 6:28 AM, Oleg Tikhonov wrote: Hi Chris, I've applied the patch to the tika-parsers/src/main/java/org/apache/tika/parser/chm, also added 3 chm files to the tika-parsers\src\test\resources\test-documents and the tests. Thanks sorry I think I confused you with my

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-07 Thread Jukka Zitting
Hi, On Tue, Jun 7, 2011 at 3:52 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Please revert r1132997, and then just modify your patch to make sure that your java classes and files fit into the appropriate Tika source code area. Then please attach a new patch real quick so I

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-07 Thread Mattmann, Chris A (388J)
Hey Jukka, On Jun 7, 2011, at 6:55 AM, Jukka Zitting wrote: Hi, On Tue, Jun 7, 2011 at 3:52 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: Please revert r1132997, and then just modify your patch to make sure that your java classes and files fit into the appropriate Tika

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-07 Thread Mattmann, Chris A (388J)
Hey Jukka, Thanks for the motivation. I put my money where my mouth was :-) Oleg, your patch rox. That's all I had to say. My improvement was simply to commit it to the Tika sources. Feel free to mod/add/whatever on it after that, per Jukka's comments. I am going to make one more update, just

[jira] [Commented] (TIKA-245) Support of CHM Format

2011-06-04 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044403#comment-13044403 ] Chris A. Mattmann commented on TIKA-245: Hi Oleg, Looking over this patch, I have a

Re: [jira] [Commented] (TIKA-245) Support of CHM Format

2011-03-31 Thread Oleg Tikhonov
Hello Tran Nam Quang, It uses the CHMLIB C library, i.e. JNI. From my previous experience, it works on a limited number of OSes. It does not work on Solaris or AIX. The really good library with limitations mentioned above is http://sevenzipjbind.sourceforge.net/ and also LGPL (I would say, the best

[jira] Commented: (TIKA-245) Support of CHM Format

2011-03-19 Thread Oleg Tikhonov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008797#comment-13008797 ] Oleg Tikhonov commented on TIKA-245: I've implemented a CHM extractor, based on the same