[jira] [Commented] (TIKA-1600) Unable to parse ODT files because of failed to close temporary resources

2015-04-13 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492084#comment-14492084 ] Hong-Thai Nguyen commented on TIKA-1600: The root exception is an NPE when parsing

[jira] [Commented] (TIKA-1581) jhighlight license concerns

2015-03-30 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386900#comment-14386900 ] Hong-Thai Nguyen commented on TIKA-1581: And great thanks to [~kkrugler] with many

[jira] [Updated] (TIKA-1581) jhighlight license concerns

2015-03-27 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1581: --- Fix Version/s: 1.8 jhighlight license concerns ---

[jira] [Resolved] (TIKA-1581) jhighlight license concerns

2015-03-27 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1581. Resolution: Fixed jhighlight license concerns ---

[jira] [Commented] (TIKA-1581) jhighlight license concerns

2015-03-20 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371432#comment-14371432 ] Hong-Thai Nguyen commented on TIKA-1581: I've contacted also 'gbe...@uwyn.com',

[jira] [Comment Edited] (TIKA-1581) jhighlight license concerns

2015-03-20 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371432#comment-14371432 ] Hong-Thai Nguyen edited comment on TIKA-1581 at 3/20/15 3:10 PM:

[jira] [Comment Edited] (TIKA-1581) jhighlight license concerns

2015-03-20 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14371432#comment-14371432 ] Hong-Thai Nguyen edited comment on TIKA-1581 at 3/20/15 3:36 PM:

[jira] [Commented] (TIKA-1505) chmparser breaks down when extracting from file of CHM format v3

2015-01-05 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264786#comment-14264786 ] Hong-Thai Nguyen commented on TIKA-1505: Can you provide also problem files and

[jira] [Resolved] (TIKA-1447) CHM parser: wrong directory list

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1447. Resolution: Fixed CHM parser: wrong directory list

[jira] [Resolved] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1446. Resolution: Fixed CHM parser : wrong decompression of aligned blocks

[jira] [Updated] (TIKA-1447) CHM parser: wrong directory list

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1447: --- Fix Version/s: 1.7 CHM parser: wrong directory list

[jira] [Resolved] (TIKA-1448) CHM parser : defect in file extraction

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1448. Resolution: Fixed CHM parser : defect in file extraction

[jira] [Resolved] (TIKA-1430) CHM parser gets faulty text (fix found)

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1430. Resolution: Fixed CHM parser gets faulty text (fix found)

[jira] [Updated] (TIKA-1430) CHM parser gets faulty text (fix found)

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1430: --- Fix Version/s: 1.7 CHM parser gets faulty text (fix found)

[jira] [Updated] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1446: --- Fix Version/s: 1.7 CHM parser : wrong decompression of aligned blocks

[jira] [Updated] (TIKA-1448) CHM parser : defect in file extraction

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1448: --- Fix Version/s: 1.7 CHM parser : defect in file extraction

[jira] [Updated] (TIKA-672) Proper error handling in the CHM parser

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-672: -- Fix Version/s: 1.7 Proper error handling in the CHM parser

[jira] [Resolved] (TIKA-672) Proper error handling in the CHM parser

2014-11-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-672. --- Resolution: Fixed Check no more System.err/System.out inside CHM parser Proper error handling

[jira] [Commented] (TIKA-1447) CHM parser: wrong directory list

2014-11-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14214535#comment-14214535 ] Hong-Thai Nguyen commented on TIKA-1447: [~binhawking], The work on TIKA-1446 fixed

[jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-12 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208079#comment-14208079 ] Hong-Thai Nguyen commented on TIKA-1446: Hi [~binhawking], I've merged your pull

[jira] [Comment Edited] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-11-12 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14208079#comment-14208079 ] Hong-Thai Nguyen edited comment on TIKA-1446 at 11/12/14 2:38 PM:

[jira] [Commented] (TIKA-1463) TesseractOCRParser does not work in Windows

2014-11-04 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14196343#comment-14196343 ] Hong-Thai Nguyen commented on TIKA-1463: Thank [~lfcnassif], without .exe

[jira] [Created] (TIKA-1463) TesseractOCRParser does work in Windows

2014-11-03 Thread Hong-Thai Nguyen (JIRA)
Hong-Thai Nguyen created TIKA-1463: -- Summary: TesseractOCRParser does work in Windows Key: TIKA-1463 URL: https://issues.apache.org/jira/browse/TIKA-1463 Project: Tika Issue Type: Bug

[jira] [Commented] (TIKA-1463) TesseractOCRParser does work in Windows

2014-11-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14194694#comment-14194694 ] Hong-Thai Nguyen commented on TIKA-1463: Fixed in r1636382 TesseractOCRParser

[jira] [Updated] (TIKA-1463) TesseractOCRParser does not work in Windows

2014-11-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1463: --- Summary: TesseractOCRParser does not work in Windows (was: TesseractOCRParser does work in

[jira] [Updated] (TIKA-1463) TesseractOCRParser does not work in Windows

2014-11-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1463: --- Description: STR: * Case 1: ** Setting tesseractPath to a common installation path of

[jira] [Closed] (TIKA-1463) TesseractOCRParser does not work in Windows

2014-11-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen closed TIKA-1463. -- Resolution: Fixed TesseractOCRParser does not work in Windows

[jira] [Commented] (TIKA-1446) CHM parser : wrong decompression of aligned blocks

2014-10-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181530#comment-14181530 ] Hong-Thai Nguyen commented on TIKA-1446: Thanks a lot [~binhawking], I've had a quick look

[jira] [Commented] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-21 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178186#comment-14178186 ] Hong-Thai Nguyen commented on TIKA-1422: Applied latest fix on r1633325 with some

[jira] [Comment Edited] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-21 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178186#comment-14178186 ] Hong-Thai Nguyen edited comment on TIKA-1422 at 10/21/14 9:48 AM:

[jira] [Commented] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-16 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14173537#comment-14173537 ] Hong-Thai Nguyen commented on TIKA-1422: I'm not using Tesseract

[jira] [Commented] (TIKA-1445) Figure out how to add Image metadata extraction to Tesseract parser

2014-10-13 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169090#comment-14169090 ] Hong-Thai Nguyen commented on TIKA-1445: Interesting question ! For me, parser's

[jira] [Commented] (TIKA-1176) ChmDirectoryListingSet does not correctly enumerate directory entries

2014-10-13 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169146#comment-14169146 ] Hong-Thai Nguyen commented on TIKA-1176: Hi [~mdgeek], thank for your offering code

[jira] [Commented] (TIKA-1428) Microsoft Word 97 - 2003 (.doc) footnote references are Unicode Replacement Character

2014-09-25 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14147880#comment-14147880 ] Hong-Thai Nguyen commented on TIKA-1428: Thanks [~theoettheo], any chance to have a

[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed

2014-09-22 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143041#comment-14143041 ] Hong-Thai Nguyen commented on TIKA-1421: Not only CentOS, this test failed also on

[jira] [Updated] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed

2014-09-22 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1421: --- Priority: Blocker (was: Major) Tika-Parsers tests fail on CentOS6 if tesseract isn't

[jira] [Commented] (TIKA-1412) NPE in OpenDocumentParser

2014-09-22 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143043#comment-14143043 ] Hong-Thai Nguyen commented on TIKA-1412: Add a test at r1626706 NPE in

[jira] [Resolved] (TIKA-1413) OOXML thumbnail name added to body

2014-09-09 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1413. Resolution: Fixed OOXML thumbnail name added to body --

[jira] [Commented] (TIKA-1413) OOXML thumbnail name added to body

2014-09-09 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14126949#comment-14126949 ] Hong-Thai Nguyen commented on TIKA-1413: I agree. Fixed in r1623819 and _id_ is now

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-29 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077885#comment-14077885 ] Hong-Thai Nguyen commented on TIKA-1373: Normally it's on next official 1.6

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073042#comment-14073042 ] Hong-Thai Nguyen commented on TIKA-1373: HtmlParser skips tags generated by

[jira] [Resolved] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1373. Resolution: Fixed AutoDetectParser extracts no text when SourceCodeParser is selected

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071643#comment-14071643 ] Hong-Thai Nguyen commented on TIKA-1373: Can you format your description with

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071713#comment-14071713 ] Hong-Thai Nguyen commented on TIKA-1373: Yes, I saw the trouble when implementing

[jira] [Comment Edited] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071643#comment-14071643 ] Hong-Thai Nguyen edited comment on TIKA-1373 at 7/23/14 1:42 PM:

[jira] [Commented] (TIKA-1095) Only gibberish extracted from this PDF

2014-07-15 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061867#comment-14061867 ] Hong-Thai Nguyen commented on TIKA-1095: Even with latest Tika can't convert this

[jira] [Updated] (TIKA-1095) Only gibberish extracted from this PDF

2014-07-15 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1095: --- Component/s: (was: general) parser Only gibberish extracted from this

[jira] [Commented] (TIKA-1350) OutlookPSTParser: Unknown message type: IPM.Note

2014-06-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040519#comment-14040519 ] Hong-Thai Nguyen commented on TIKA-1350: Richard Johnson (author of java-pstlib) is

[jira] [Commented] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE

2014-05-26 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14008704#comment-14008704 ] Hong-Thai Nguyen commented on TIKA-1308: A virtual FileSystem may be a solution, If

[jira] [Updated] (TIKA-1290) Upgrade to PDFBOX 1.8.5

2014-05-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1290: --- Labels: trivial (was: ) Upgrade to PDFBOX 1.8.5 ---

[jira] [Resolved] (TIKA-1290) Upgrade to PDFBOX 1.8.5

2014-05-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1290. Resolution: Fixed r1592780 Upgrade to PDFBOX 1.8.5 ---

[jira] [Commented] (TIKA-1287) Update NetCDF .jar file on Maven Central

2014-05-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13987521#comment-13987521 ] Hong-Thai Nguyen commented on TIKA-1287: Technically, not difficult to upload new

[jira] [Created] (TIKA-1290) Upgrade to PDFBOX 1.8.5

2014-05-02 Thread Hong-Thai Nguyen (JIRA)
Hong-Thai Nguyen created TIKA-1290: -- Summary: Upgrade to PDFBOX 1.8.5 Key: TIKA-1290 URL: https://issues.apache.org/jira/browse/TIKA-1290 Project: Tika Issue Type: Improvement

[jira] [Commented] (TIKA-1283) Add thumbnail as possible metadata item to TikaCoreProperties

2014-04-28 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13983434#comment-13983434 ] Hong-Thai Nguyen commented on TIKA-1283: +1 for me to create a thumbnail field in

[jira] [Created] (TIKA-1279) Missing return lines at output of SourceCodeParser

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
Hong-Thai Nguyen created TIKA-1279: -- Summary: Missing return lines at output of SourceCodeParser Key: TIKA-1279 URL: https://issues.apache.org/jira/browse/TIKA-1279 Project: Tika Issue

[jira] [Commented] (TIKA-1224) Adding Source code (Java, Groovy, C) parser

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979614#comment-13979614 ] Hong-Thai Nguyen commented on TIKA-1224: Thank [~ben.12] for feedback. For line

[jira] [Resolved] (TIKA-1279) Missing return lines at output of SourceCodeParser

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1279. Resolution: Fixed Fixed at r1589687 Missing return lines at output of SourceCodeParser

[jira] [Updated] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1276: --- Fix Version/s: 1.6 Missing embedded dependencies in tika-bundle

[jira] [Resolved] (TIKA-1276) Missing embedded dependencies in tika-bundle

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1276. Resolution: Fixed Thank [~rwesten], added your patch at r1589717 Missing embedded

[jira] [Resolved] (TIKA-1279) Missing return lines at output of SourceCodeParser

2014-04-24 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1279. Resolution: Fixed Thank [~rgauss] for this good catch. I fixed with more tests in r1589742

[jira] [Updated] (TIKA-623) Add support for Outlook PST

2014-04-04 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-623: -- Assignee: (was: Hong-Thai Nguyen) Add support for Outlook PST ---

[jira] [Resolved] (TIKA-1244) Better parsing of Mbox files

2014-03-31 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1244. Resolution: Fixed Fix Version/s: 1.6 Committed on r1583305, thanks [~lfcnassif] I

[jira] [Assigned] (TIKA-1244) Better parsing of Mbox files

2014-03-28 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen reassigned TIKA-1244: -- Assignee: Hong-Thai Nguyen Better parsing of Mbox files

[jira] [Commented] (TIKA-1244) Better parsing of Mbox files

2014-03-21 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13942965#comment-13942965 ] Hong-Thai Nguyen commented on TIKA-1244: +1 for me too, I was at same intention to

[jira] [Commented] (TIKA-623) Add support for Outlook PST

2014-03-07 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923703#comment-13923703 ] Hong-Thai Nguyen commented on TIKA-623: --- [~lfcnassif], binary attached is handled with

[jira] [Created] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
Hong-Thai Nguyen created TIKA-1257: -- Summary: MS Word Filter out control characters on ouput Key: TIKA-1257 URL: https://issues.apache.org/jira/browse/TIKA-1257 Project: Tika Issue Type:

[jira] [Updated] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1257: --- Attachment: tika-doc-control-char.png 5f01ae23-9e6e-4faa-808a-f78dbb20cc71.doc

[jira] [Resolved] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1257. Resolution: Fixed Fixed on r1574874 MS Word Filter out control characters on ouput

[jira] [Comment Edited] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922490#comment-13922490 ] Hong-Thai Nguyen edited comment on TIKA-1257 at 3/6/14 1:50 PM:

[jira] [Updated] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1257: --- Attachment: (was: 5f01ae23-9e6e-4faa-808a-f78dbb20cc71.doc) MS Word Filter out control

[jira] [Updated] (TIKA-1257) MS Word Filter out control characters on ouput

2014-03-06 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1257: --- Attachment: testControlCharacters.doc MS Word Filter out control characters on ouput

[jira] [Updated] (TIKA-623) Add support for Outlook PST

2014-03-05 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-623: -- Fix Version/s: 1.6 Add support for Outlook PST ---

[jira] [Comment Edited] (TIKA-623) Add support for Outlook PST

2014-03-05 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920692#comment-13920692 ] Hong-Thai Nguyen edited comment on TIKA-623 at 3/5/14 9:30 AM: ---

[jira] [Assigned] (TIKA-623) Add support for Outlook PST

2014-03-05 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen reassigned TIKA-623: - Assignee: Hong-Thai Nguyen Add support for Outlook PST ---

[jira] [Resolved] (TIKA-623) Add support for Outlook PST

2014-03-05 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-623. --- Resolution: Fixed Commit on r1574411 Add support for Outlook PST

[jira] [Resolved] (TIKA-1089) Tika conversion failed on following documents

2014-02-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1089. Resolution: Invalid Fix Version/s: 1.5 Assignee: Hong-Thai Nguyen Should

[jira] [Assigned] (TIKA-1223) Extract thumbnail of OOXML Office files

2014-02-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen reassigned TIKA-1223: -- Assignee: Hong-Thai Nguyen Extract thumbnail of OOXML Office files

[jira] [Resolved] (TIKA-1223) Extract thumbnail of OOXML Office files

2014-02-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1223. Resolution: Fixed r1568954 Extract thumbnail of OOXML Office files

[jira] [Assigned] (TIKA-1223) Extract thumbnail of OOXML Office files

2014-02-17 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen reassigned TIKA-1223: -- Assignee: (was: Hong-Thai Nguyen) Extract thumbnail of OOXML Office files

[jira] [Resolved] (TIKA-1224) Adding Source code (Java, Groovy, C) parser

2014-02-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen resolved TIKA-1224. Resolution: Fixed Adding Source code (Java, Groovy, C) parser

[jira] [Commented] (TIKA-1224) Adding Source code (Java, Groovy, C) parser

2014-02-03 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889491#comment-13889491 ] Hong-Thai Nguyen commented on TIKA-1224: Committed on 1563902 Adding Source code

[jira] [Commented] (TIKA-1224) Adding Source code (Java, Groovy, C) parser

2014-01-21 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877343#comment-13877343 ] Hong-Thai Nguyen commented on TIKA-1224: I agree that parsing deeply each language

[jira] [Commented] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-14 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13870573#comment-13870573 ] Hong-Thai Nguyen commented on TIKA-1215: Great catch. Thank [~jukkaz] Regression:

[jira] [Updated] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-13 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1215: --- Attachment: tika-1215-without-wildcard.patch [~gagravarr], my code style is different the one

[jira] [Commented] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-13 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869590#comment-13869590 ] Hong-Thai Nguyen commented on TIKA-1215: [~talli...@apache.org], here's XML of

[jira] [Commented] (TIKA-90) Allow thumbnails as document metadata

2014-01-09 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13866498#comment-13866498 ] Hong-Thai Nguyen commented on TIKA-90: -- Useful for Open XML Office OpenOffice files and

[jira] [Commented] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-07 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864202#comment-13864202 ] Hong-Thai Nguyen commented on TIKA-1216: I've tested with a simple test case with

[jira] [Updated] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-07 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1215: --- Attachment: TIKA-1215-fix-prefix-namespaces.patch I made a fix with a test for this issue.

[jira] [Comment Edited] (TIKA-1216) parse method of Mp3Parser doesn't work for few mp3 files

2014-01-07 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864202#comment-13864202 ] Hong-Thai Nguyen edited comment on TIKA-1216 at 1/7/14 3:57 PM:

[jira] [Commented] (TIKA-1215) Regression: Unable parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860246#comment-13860246 ] Hong-Thai Nguyen commented on TIKA-1215: [~davemeikle], here's a sample test failed

[jira] [Updated] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1215: --- Summary: Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

[jira] [Comment Edited] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860246#comment-13860246 ] Hong-Thai Nguyen edited comment on TIKA-1215 at 1/2/14 3:11 PM:

[jira] [Comment Edited] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860246#comment-13860246 ] Hong-Thai Nguyen edited comment on TIKA-1215 at 1/2/14 3:12 PM:

[jira] [Comment Edited] (TIKA-1215) Regression: Unable to parse a mp3 file on 1.5 which parsed successfully on 1.4

2014-01-02 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860246#comment-13860246 ] Hong-Thai Nguyen edited comment on TIKA-1215 at 1/2/14 5:20 PM:

[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file

2013-12-27 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857418#comment-13857418 ] Hong-Thai Nguyen commented on TIKA-1152: Thank [~jukkaz], I've checked on trunk.

[jira] [Updated] (TIKA-1215) Regression: Unable parse a mp3 file on 1.5 which parsed successfully on 1.4

2013-12-27 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1215: --- Attachment: Centres 080805@0650 RTBF Matin Première - A propos des rues de Dublin et

[jira] [Created] (TIKA-1215) Regression: Unable parse a mp3 file on 1.5 which parsed successfully on 1.4

2013-12-27 Thread Hong-Thai Nguyen (JIRA)
Hong-Thai Nguyen created TIKA-1215: -- Summary: Regression: Unable parse a mp3 file on 1.5 which parsed successfully on 1.4 Key: TIKA-1215 URL: https://issues.apache.org/jira/browse/TIKA-1215 Project:

[jira] [Comment Edited] (TIKA-1215) Regression: Unable parse a mp3 file on 1.5 which parsed successfully on 1.4

2013-12-27 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857542#comment-13857542 ] Hong-Thai Nguyen edited comment on TIKA-1215 at 12/27/13 3:59 PM:

[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file

2013-12-23 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855528#comment-13855528 ] Hong-Thai Nguyen commented on TIKA-1152: [~gagravarr] or anyone can have look at

[jira] [Commented] (TIKA-1205) Allow PDFParser to fallback to other parser if there is an exception

2013-12-11 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13845398#comment-13845398 ] Hong-Thai Nguyen commented on TIKA-1205: Just a (newbie) question, why limit only
