[jira] [Commented] (TIKA-3044) add -C/--content cli option using WriteOutContentHandler

2020-02-13 Thread Alexander Klimetschek (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036728#comment-17036728 ] Alexander Klimetschek commented on TIKA-3044: Pull request:

[jira] [Created] (TIKA-3044) add -C/--content cli option using WriteOutContentHandler

2020-02-13 Thread Alexander Klimetschek (Jira)
Alexander Klimetschek created TIKA-3044: Summary: add -C/--content cli option using WriteOutContentHandler Key: TIKA-3044 URL: https://issues.apache.org/jira/browse/TIKA-3044 Project: Tika

[jira] [Commented] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work)

2020-02-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036514#comment-17036514 ] Hudson commented on TIKA-3040: SUCCESS: Integrated in Jenkins build tika-branch-1x #308 (See

[jira] [Commented] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work)

2020-02-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036508#comment-17036508 ] Hudson commented on TIKA-3040: UNSTABLE: Integrated in Jenkins build Tika-trunk #1773 (See

[jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23

2020-02-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036483#comment-17036483 ] Hudson commented on TIKA-3006: SUCCESS: Integrated in Jenkins build tika-branch-1x #307 (See

[jira] [Commented] (TIKA-3043) vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST

2020-02-13 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036421#comment-17036421 ] Nick Burch commented on TIKA-3043: If you are building an all-in-one jar, you need to merge certain

[jira] [Commented] (TIKA-3026) Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor

2020-02-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036394#comment-17036394 ] Hudson commented on TIKA-3026: UNSTABLE: Integrated in Jenkins build tika-branch-1x #306 (See

[jira] [Commented] (TIKA-3026) Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor

2020-02-13 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036392#comment-17036392 ] Hudson commented on TIKA-3026: SUCCESS: Integrated in Jenkins build Tika-trunk #1772 (See

[jira] [Created] (TIKA-3043) vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST

2020-02-13 Thread CHARUSHEELA BOPARDIKAR (Jira)
CHARUSHEELA BOPARDIKAR created TIKA-3043: Summary: vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST Key: TIKA-3043 URL: https://issues.apache.org/jira/browse/TIKA-3043

[jira] [Commented] (TIKA-3026) Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor

2020-02-13 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036361#comment-17036361 ] Tim Allison commented on TIKA-3026: I pushed an initial draft to master and branch_1x. Let me know

[jira] [Created] (TIKA-3042) Date format extraction problem in XLS/XLSX

2020-02-13 Thread Zoltan Farago (Jira)
Zoltan Farago created TIKA-3042: Summary: Date format extraction problem in XLS/XLSX Key: TIKA-3042 URL: https://issues.apache.org/jira/browse/TIKA-3042 Project: Tika Issue Type: Task