[jira] [Updated] (TIKA-1138) Empty body and empty title with some TXT documents

2013-06-25 Thread Koutsoulis Philippe (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koutsoulis Philippe updated TIKA-1138: Description: *No error in logs* *+Extract from my Structured Text:+* {noformat} ?xml

[jira] [Comment Edited] (TIKA-1138) Empty body and empty title with some TXT documents

2013-06-25 Thread Koutsoulis Philippe (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692867#comment-13692867 ] Koutsoulis Philippe edited comment on TIKA-1138 at 6/25/13 9:09 AM:

[jira] [Comment Edited] (TIKA-1138) Empty body and empty title with some TXT documents

2013-06-25 Thread Koutsoulis Philippe (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692867#comment-13692867 ] Koutsoulis Philippe edited comment on TIKA-1138 at 6/25/13 9:08 AM:

[jira] [Commented] (TIKA-973) PDF form data isn't included in extracted content.

2013-06-25 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693068#comment-13693068 ] Tim Allison commented on TIKA-973: Will submit patch and tests by end of the week.

[jira] [Commented] (TIKA-1109) Metadata not extracted before the context in OOXML (pptx)

2013-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693097#comment-13693097 ] Nick Burch commented on TIKA-1109: Some parsers fetch the metadata first, some do it

[jira] [Commented] (TIKA-1070) StackOverflow error in org.apache.tika.sax.ToXMLContentHandler$ElementInfo.getPrefix(ToXMLContentHandler.java:58)

2013-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693106#comment-13693106 ] Nick Burch commented on TIKA-1070: None of our XML gurus have spotted a problem with

[jira] [Resolved] (TIKA-1070) StackOverflow error in org.apache.tika.sax.ToXMLContentHandler$ElementInfo.getPrefix(ToXMLContentHandler.java:58)

2013-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1070. Resolution: Fixed. Fix Version/s: 1.5. StackOverflow error in

RFC822Parser build error on gump

2013-06-25 Thread Nick Burch
Hi All, Anyone have any idea about this compiler error on the tika parsers project as hit by gump? http://vmgump.apache.org/gump/public/tika/tika-parsers/gump_work/build_tika_tika-parsers.html Gump notifications will hopefully start again soon, which'd let us find out about breaking changes

[jira] [Updated] (TIKA-1130) .docx text extract leaves out some portions of text

2013-06-25 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1130: Attachment: TIKA-1130.patch. Ray's initial test restored after POI-55142 was committed. Thank you,

[jira] [Commented] (TIKA-1130) .docx text extract leaves out some portions of text

2013-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693520#comment-13693520 ] Nick Burch commented on TIKA-1130: The POI 3.10 beta 1 release vote has just started,