[jira] [Updated] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-2179: Fix Version/s: 1.15, 2.0

[jira] [Resolved] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2179. Resolution: Fixed. Thank you, [~seanstory], for opening this. Let us know what else you find.

[jira] [Commented] (TIKA-2180) Multiple requests on Tika to extract text slows down

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691261#comment-15691261 ] Tim Allison commented on TIKA-2180: Wait, are you throwing 22 (roughly) concurrent requests at

[jira] [Updated] (TIKA-2185) NegativeArraySizeException on a valid Word file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2185: Attachment: PatentW final.doc

[jira] [Created] (TIKA-2185) NegativeArraySizeException on a valid Word file

2016-11-23 Thread Seva Alekseyev (JIRA)
Seva Alekseyev created TIKA-2185: Summary: NegativeArraySizeException on a valid Word file Key: TIKA-2185 URL: https://issues.apache.org/jira/browse/TIKA-2185 Project: Tika Issue Type: Bug

[jira] [Created] (TIKA-2184) RecordFormatException on a valid Excel file

2016-11-23 Thread Seva Alekseyev (JIRA)
Seva Alekseyev created TIKA-2184: Summary: RecordFormatException on a valid Excel file Key: TIKA-2184 URL: https://issues.apache.org/jira/browse/TIKA-2184 Project: Tika Issue Type: Bug

[jira] [Updated] (TIKA-2184) RecordFormatException on a valid Excel file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2184: Attachment: HIVT Discrepancy Report- 3-29-04UCSF.xls

[jira] [Commented] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691208#comment-15691208 ] Hudson commented on TIKA-2179: SUCCESS: Integrated in Jenkins build Tika-trunk #1144 (See

[jira] [Commented] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691200#comment-15691200 ] Hudson commented on TIKA-2179: UNSTABLE: Integrated in Jenkins build tika-2.x #176 (See

[jira] [Comment Edited] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691150#comment-15691150 ] Tim Allison edited comment on TIKA-2179 at 11/23/16 7:44 PM: I committed a

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Description: On the following PowerPoint file, which opens fine with PowerPoint:

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Attachment: Marcia Lecture.PPT

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Description: On the following PowerPoint file, which opens fine with PowerPoint:

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Attachment: IAVI Team meeting FINAL.ppt

[jira] [Commented] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691150#comment-15691150 ] Tim Allison commented on TIKA-2179: I committed a reasonable first pass at this. Still left on the list

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Description: On the following PowerPoint file, which opens fine with PowerPoint:

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Attachment: daids.ppt

[jira] [Updated] (TIKA-2153) TaggedIOException on a valid Powerpoint file

2016-11-23 Thread Seva Alekseyev (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Seva Alekseyev updated TIKA-2153: Description: On the following PowerPoint file, which opens fine with PowerPoint:

[jira] [Commented] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691110#comment-15691110 ] Hudson commented on TIKA-2179: FAILURE: Integrated in Jenkins build tika-2.x-windows #77 (See

tika-2.x-windows - Build # 77 - Still Failing

2016-11-23 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-2.x-windows (build #77) Status: Still Failing Check console output at https://builds.apache.org/job/tika-2.x-windows/77/ to view the results.

[jira] [Commented] (TIKA-2179) WordMLParser fails to parse a word xml file

2016-11-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690971#comment-15690971 ] Tim Allison commented on TIKA-2179: How's this look: {noformat} 0: cp:revision : 2 0: date :

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Ahmad Sawalhah (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690406#comment-15690406 ] Ahmad Sawalhah commented on TIKA-2183: I am sure that if I renamed this file to any other English

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Ahmad Sawalhah (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690396#comment-15690396 ] Ahmad Sawalhah commented on TIKA-2183: Traceback (most recent call last): File

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690368#comment-15690368 ] Chris A. Mattmann commented on TIKA-2183: Hi Ahmed and Nick - @Ahmed can you please provide the

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Ahmad Sawalhah (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690297#comment-15690297 ] Ahmad Sawalhah commented on TIKA-2183: I did, thanks

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690279#comment-15690279 ] Nick Burch commented on TIKA-2183: Ping [~chrismattmann] (he's the maintainer of those bindings at

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Ahmad Sawalhah (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690267#comment-15690267 ] Ahmad Sawalhah commented on TIKA-2183: Tika for Python; I used pip install tika. It is working fine

[jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15690022#comment-15690022 ] Nick Burch commented on TIKA-2183: How are you calling Tika? I'd guess some sort of Python wrapper? If

[jira] [Created] (TIKA-2183) Can't Read file if its name is Arabic

2016-11-23 Thread Ahmad Sawalhah (JIRA)
Ahmad Sawalhah created TIKA-2183: Summary: Can't Read file if its name is Arabic Key: TIKA-2183 URL: https://issues.apache.org/jira/browse/TIKA-2183 Project: Tika Issue Type: Bug