[jira] [Commented] (TIKA-1355) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043283#comment-14043283 ] Nick Burch commented on TIKA-1355: -- I wonder if this is the same problem as POI bug 56514

[jira] [Updated] (TIKA-1355) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1355: - Labels: needs-dependency-upgrade (was: ) > Unexpected RuntimeException from > org.apache.tika.parser.mic

[jira] [Commented] (TIKA-1355) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043378#comment-14043378 ] Nick Burch commented on TIKA-1355: -- Good to know. This'll be fixed when the next Apache PO

[jira] [Updated] (TIKA-1355) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1355: - Component/s: parser > Unexpected RuntimeException from > org.apache.tika.parser.microsoft.ooxml.OOXMLPars

[jira] [Updated] (TIKA-1350) OutlookPSTParser: Unknown message type: IPM.Note

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1350: - Labels: libpst needs-dependency-upgrade parser pst (was: libpst parser pst) > OutlookPSTParser: Unknown m

[jira] [Updated] (TIKA-1352) Upgrade to PDFBox 1.8.6

2014-06-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1352: - Labels: needs-dependency-upgrade (was: ) > Upgrade to PDFBox 1.8.6 > --- > >

[jira] [Commented] (TIKA-1358) Add support for newer iWork file formats

2014-06-26 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044468#comment-14044468 ] Nick Burch commented on TIKA-1358: -- First thing we'd probably want is to re-create the cur

[jira] [Resolved] (TIKA-1359) Wrong getting started link on site

2014-06-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1359. -- Resolution: Fixed Fix Version/s: 1.6 Thanks for this, applied in r1606795. > Wrong getting start

[jira] [Commented] (TIKA-1357) Buffered text in EnviHeaderParser

2014-06-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047815#comment-14047815 ] Nick Burch commented on TIKA-1357: -- Might be good to update the unit tests to have an extr

[jira] [Commented] (TIKA-1360) Update description and fix typos in site

2014-06-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047933#comment-14047933 ] Nick Burch commented on TIKA-1360: -- Is it worth linking to the contribution page from one

[jira] [Commented] (TIKA-1360) Update description and fix typos in site

2014-06-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048007#comment-14048007 ] Nick Burch commented on TIKA-1360: -- Next suggestion - should those bare references to dev@

[jira] [Resolved] (TIKA-1360) Update description and fix typos in site

2014-06-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1360. -- Resolution: Fixed Fix Version/s: 1.6 Thanks, v3 applied in r1606937. > Update description and fi

[jira] [Commented] (TIKA-1361) Update MP4Parser to 1.0.2

2014-07-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052836#comment-14052836 ] Nick Burch commented on TIKA-1361: -- Any chance you could try bumping the dependency locall

[jira] [Resolved] (TIKA-1364) Issue in metadata extraction for xslm (Excel Macro 2007) file

2014-07-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1364. -- Resolution: Invalid Tika 1.5 needs a newer version of Apache POI than that. If you're using the Tika Ap

[jira] [Resolved] (TIKA-1327) New parser for Matlab .mat files

2014-07-14 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1327. -- Resolution: Fixed Thanks for the patch, applied in r1610506. > New parser for Matlab .mat files > -

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-07-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062288#comment-14062288 ] Nick Burch commented on TIKA-1369: -- General comment - adding more dependencies to the Tika

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063292#comment-14063292 ] Nick Burch commented on TIKA-1365: -- There shouldn't normally be a difference between calli

[jira] [Commented] (TIKA-1358) Add support for newer iWork file formats

2014-07-21 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069469#comment-14069469 ] Nick Burch commented on TIKA-1358: -- Any chance you could attach zips of the test files as

[jira] [Commented] (TIKA-1371) passing parameters via URL no longer works (regression)

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071693#comment-14071693 ] Nick Burch commented on TIKA-1371: -- The /detect/stream URL ought to work, see DetectorReso

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071697#comment-14071697 ] Nick Burch commented on TIKA-1373: -- I've just tried it with svn trunk, and I think I see t

[jira] [Commented] (TIKA-1373) AutoDetectParser extracts no text when SourceCodeParser is selected

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071744#comment-14071744 ] Nick Burch commented on TIKA-1373: -- {quote}Yes, I saw the trouble when implementing this p

[jira] [Commented] (TIKA-1371) passing parameters via URL no longer works (regression)

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071748#comment-14071748 ] Nick Burch commented on TIKA-1371: -- I've just tried with a recent nightly build, and it wo

[jira] [Commented] (TIKA-1371) passing parameters via URL no longer works (regression)

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071953#comment-14071953 ] Nick Burch commented on TIKA-1371: -- {quote}Another useful feature would be to allow the se

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-07-23 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072311#comment-14072311 ] Nick Burch commented on TIKA-1269: -- I guess we'll need a bit of Maven / Ant-in-Maven magic

[jira] [Commented] (TIKA-1269) Self-hosted documentation for the JAX-RS Server

2014-07-24 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073164#comment-14073164 ] Nick Burch commented on TIKA-1269: -- It's a bit hard to be sure on Miredot when most (all?)

[jira] [Resolved] (TIKA-1361) Update MP4Parser to 1.0.2

2014-07-24 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1361. -- Resolution: Fixed Fix Version/s: 1.6 > Update MP4Parser to 1.0.2 > - > >

[jira] [Commented] (TIKA-1377) As a user, I would like to see "album artist", "disc number", and "compilation" in parsed MP3 and MP4 types

2014-07-28 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077017#comment-14077017 ] Nick Burch commented on TIKA-1377: -- Any chance you could upload the changed binary files t

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-07-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077578#comment-14077578 ] Nick Burch commented on TIKA-1369: -- Please send the pull request to the main github repo -

[jira] [Commented] (TIKA-1377) As a user, I would like to see "album artist", "disc number", and "compilation" in parsed MP3 and MP4 types

2014-07-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079192#comment-14079192 ] Nick Burch commented on TIKA-1377: -- MP3 parser changes applied in r1614610 and r1614615, w

[jira] [Resolved] (TIKA-1377) As a user, I would like to see "album artist", "disc number", and "compilation" in parsed MP3 and MP4 types

2014-07-30 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1377. -- Resolution: Fixed > As a user, I would like to see "album artist", "disc number", and > "compilation" i

[jira] [Created] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-07-30 Thread Nick Burch (JIRA)
Nick Burch created TIKA-1380: Summary: Upgrade to Apache POI 3.11 beta 1 Key: TIKA-1380 URL: https://issues.apache.org/jira/browse/TIKA-1380 Project: Tika Issue Type: Improvement Compon

[jira] [Updated] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-01 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1380: - Attachment: TIKA-1380.patch The attached patch fixes the handful of breaking api changes (which mostly mak

[jira] [Updated] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1380: - Attachment: TIKA-1380c.patch Updated patch, including fixes from Tim and Andreas's patches. This should be

[jira] [Commented] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084813#comment-14084813 ] Nick Burch commented on TIKA-1380: -- Patch committed in r1615624 to trunk, and in r1615636

[jira] [Resolved] (TIKA-1093) [OfficeParser] NullPointerException

2014-08-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1093. -- Resolution: Fixed Fix Version/s: 1.6 Now parses on Tika both from trunk or from the 1.6 release br

[jira] [Resolved] (TIKA-1118) OOXML parser throws when relationship points to 0 byte embedded part

2014-08-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1118. -- Resolution: Fixed Fix Version/s: 1.7 Check enabled in r1615638. If you do come across a file tha

[jira] [Resolved] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1380. -- Resolution: Fixed Fix Version/s: 1.7 1.6 As of r1615692, I've addressed all th

[jira] [Commented] (TIKA-1275) Upgrade Commons compress to 1.8.1

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086126#comment-14086126 ] Nick Burch commented on TIKA-1275: -- I'd suggest putting the version into the properties ne

[jira] [Commented] (TIKA-1275) Upgrade Commons compress to 1.8.1

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086158#comment-14086158 ] Nick Burch commented on TIKA-1275: -- I'd put both tukaani.version and compress.version ther

[jira] [Commented] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086164#comment-14086164 ] Nick Burch commented on TIKA-1380: -- /null.bin doesn't sound right to me. I think a fix to

[jira] [Commented] (TIKA-1380) Upgrade to Apache POI 3.11 beta 1

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086192#comment-14086192 ] Nick Burch commented on TIKA-1380: -- The previous version seemed to use directory name + fi

[jira] [Reopened] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch reopened TIKA-1383: -- This change seems to have broken the listing of endpoints on the welcome page > Simplify TikeServerCli endp

[jira] [Commented] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086268#comment-14086268 ] Nick Burch commented on TIKA-1383: -- Note that the Tika Server unit tests have their own wa

[jira] [Commented] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086288#comment-14086288 ] Nick Burch commented on TIKA-1383: -- Currently, we only have unit tests, which test specifi

[jira] [Commented] (TIKA-1383) Simplify TikeServerCli endpoint setup code

2014-08-05 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086346#comment-14086346 ] Nick Burch commented on TIKA-1383: -- I can see arguments both ways for if TikaWelcome shoul

[jira] [Deleted] (TIKA-1386) Add forbidden-apis checker to TIKA build

2014-08-06 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch deleted TIKA-1386: - > Add forbidden-apis checker to TIKA build > > > Key:

[jira] [Commented] (TIKA-1388) Tika IOUtils java.lang.OutOfMemoryError

2014-08-07 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088967#comment-14088967 ] Nick Burch commented on TIKA-1388: -- Not sure how this is a Buffer Overflow? The buffer is

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-08-13 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095274#comment-14095274 ] Nick Burch commented on TIKA-1243: -- Tika now uses Commons Compress 1.8.1, you should try w

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-08-13 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095528#comment-14095528 ] Nick Burch commented on TIKA-1243: -- Looks like a limitation in Apache Commons Compress, yo

[jira] [Commented] (TIKA-1387) Add forbidden-apis checker to TIKA build

2014-08-13 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095723#comment-14095723 ] Nick Burch commented on TIKA-1387: -- I have gone through the changes which used Locale.getD

[jira] [Commented] (TIKA-1387) Add forbidden-apis checker to TIKA build

2014-08-13 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095899#comment-14095899 ] Nick Burch commented on TIKA-1387: -- Whoops, yes, fixed in r1617796. > Add forbidden-apis

[jira] [Commented] (TIKA-1397) Can Tika make the metadata extraction of time stamps as timezone sensitive

2014-08-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098315#comment-14098315 ] Nick Burch commented on TIKA-1397: -- If the data stored in the file has a timezone on it, T

[jira] [Commented] (TIKA-1397) Can Tika make the metadata extraction of time stamps as timezone sensitive

2014-08-18 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100396#comment-14100396 ] Nick Burch commented on TIKA-1397: -- Some file formats store their times with timezone info

[jira] [Commented] (TIKA-1390) Create tika-example module

2014-08-18 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14100549#comment-14100549 ] Nick Burch commented on TIKA-1390: -- When completed, these can be pulled into the site alon

[jira] [Commented] (TIKA-1399) [Patch] add support for AxCrypt file type detection

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102194#comment-14102194 ] Nick Burch commented on TIKA-1399: -- That magic string looks suspiciously long to me. I'll

[jira] [Resolved] (TIKA-1399) [Patch] add support for AxCrypt file type detection

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1399. -- Resolution: Fixed Fix Version/s: 1.7 > [Patch] add support for AxCrypt file type detection >

[jira] [Commented] (TIKA-1399) [Patch] add support for AxCrypt file type detection

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102389#comment-14102389 ] Nick Burch commented on TIKA-1399: -- Having read a few likely looking bits of the source, a

[jira] [Commented] (TIKA-1400) Unsupport parse xls file content of headers and footers.

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103399#comment-14103399 ] Nick Burch commented on TIKA-1400: -- When you say "content can not be parsed", do you mean

[jira] [Updated] (TIKA-1400) Excel (xls, xlsx) printing page headers and footers not extracted

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1400: - Summary: Excel (xls, xlsx) printing page headers and footers not extracted (was: Unsupport parse xls file

[jira] [Commented] (TIKA-1400) Excel (xls, xlsx) printing page headers and footers not extracted

2014-08-19 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103517#comment-14103517 ] Nick Burch commented on TIKA-1400: -- Ah, I see, it's the print headers/footers you're expec

[jira] [Updated] (TIKA-1401) occured infinite loop using tika library

2014-08-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1401: - Component/s: (was: metadata) detector > occured infinite loop using tika library > --

[jira] [Commented] (TIKA-1401) occured infinite loop using tika library

2014-08-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110294#comment-14110294 ] Nick Burch commented on TIKA-1401: -- At first glance, it looks like we might need to bring

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-08-25 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110391#comment-14110391 ] Nick Burch commented on TIKA-1369: -- I think this might've already been handled through TIK

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-08-26 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110542#comment-14110542 ] Nick Burch commented on TIKA-1369: -- If you could confirm, then close this ticket + pull re

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-08-26 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110602#comment-14110602 ] Nick Burch commented on TIKA-1369: -- I would defer to [~rgauss] on that, he's more of the e

[jira] [Commented] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115294#comment-14115294 ] Nick Burch commented on TIKA-1404: -- Any chance you could re-test with a recent nightly bui

[jira] [Assigned] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch reassigned TIKA-1404: Assignee: Nick Burch > tika-server leaking temporary files when converting Word97 (doc) > --

[jira] [Commented] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115299#comment-14115299 ] Nick Burch commented on TIKA-1404: -- Also, since you mention production - you might be bett

[jira] [Commented] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115397#comment-14115397 ] Nick Burch commented on TIKA-1404: -- If you build from source / a svn checkout, you'll find

[jira] [Updated] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-31 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1404: - Component/s: (was: server) cli > tika-server leaking temporary files when converting W

[jira] [Commented] (TIKA-1404) tika-server leaking temporary files when converting Word97 (doc)

2014-08-31 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116810#comment-14116810 ] Nick Burch commented on TIKA-1404: -- The tika-app server code was missing a call to close t

[jira] [Updated] (TIKA-1404) tika-app server leaking temporary files when converting Word97 (doc)

2014-08-31 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch updated TIKA-1404: - Summary: tika-app server leaking temporary files when converting Word97 (doc) (was: tika-server leaking te

[jira] [Resolved] (TIKA-1404) tika-app server leaking temporary files when converting Word97 (doc)

2014-08-31 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1404. -- Resolution: Fixed Fix Version/s: 1.7 > tika-app server leaking temporary files when converting Wor

[jira] [Commented] (TIKA-1407) Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a

2014-09-01 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117534#comment-14117534 ] Nick Burch commented on TIKA-1407: -- Firstly, can you please post the problematic file - I

[jira] [Resolved] (TIKA-1407) Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a

2014-09-02 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1407. -- Resolution: Fixed Fix Version/s: 1.6 Pre-compiled nightly jars are available from the CI system -

[jira] [Commented] (TIKA-1409) Error asking for a directory mime-type

2014-09-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119912#comment-14119912 ] Nick Burch commented on TIKA-1409: -- Directories don't have mime types, only content does

[jira] [Commented] (TIKA-1409) Error asking for a directory mime-type

2014-09-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119983#comment-14119983 ] Nick Burch commented on TIKA-1409: -- I believe that inodes are a unix-specific thing, so th

[jira] [Commented] (TIKA-1410) Temporary OLE File Leak

2014-09-07 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124861#comment-14124861 ] Nick Burch commented on TIKA-1410: -- How are you calling Tika? Tika App? Tika Server? Tika

[jira] [Comment Edited] (TIKA-1410) Temporary OLE File Leak

2014-09-07 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124861#comment-14124861 ] Nick Burch edited comment on TIKA-1410 at 9/7/14 10:37 AM: --- How a

[jira] [Commented] (TIKA-1410) Temporary OLE File Leak

2014-09-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125823#comment-14125823 ] Nick Burch commented on TIKA-1410: -- I don't use Windows, so I can't be sure, but I think t

[jira] [Resolved] (TIKA-1412) NPE in OpenDocumentParser

2014-09-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1412. -- Resolution: Fixed Fix Version/s: 1.7 Thanks, patch applied in r1623466. > NPE in OpenDocumentPars

[jira] [Commented] (TIKA-1411) Temporary 7z file leak

2014-09-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125966#comment-14125966 ] Nick Burch commented on TIKA-1411: -- Any chance you could produce a patch file of your prop

[jira] [Resolved] (TIKA-1246) Include LastModifiedDate in metadata of archive entries

2014-09-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1246. -- Resolution: Fixed Fix Version/s: 1.7 Added, with unit tests, in r1623501. > Include LastModifiedD

[jira] [Resolved] (TIKA-1411) Temporary 7z file leak

2014-09-08 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1411. -- Resolution: Fixed Fix Version/s: 1.7 Thanks! Applied in r1623593. > Temporary 7z file leak > ---

[jira] [Resolved] (TIKA-1284) TikaException for Microsoft Powerpoint Document [ ppt ]

2014-09-09 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1284. -- Resolution: Fixed Fix Version/s: 1.6 Marking as fixed based on Felix's comments > TikaException f

[jira] [Resolved] (TIKA-1189) Fails to parse PPT file

2014-09-09 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1189. -- Resolution: Fixed Fix Version/s: 1.6 Marking as fixed based on Felix's comments > Fails to parse

[jira] [Commented] (TIKA-1415) PowerPoint2003 embedded with word. The embedded file can not be detected.

2014-09-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134369#comment-14134369 ] Nick Burch commented on TIKA-1415: -- We have unit tests which show Tika (trunk) successfull

[jira] [Commented] (TIKA-1415) PowerPoint2003 embedded with word. The embedded file can not be detected.

2014-09-16 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135103#comment-14135103 ] Nick Burch commented on TIKA-1415: -- Your unit tests don't enable recursion, so I'm not sur

[jira] [Commented] (TIKA-1420) Add Metadata Extraction to Arbitrary Parsers

2014-09-20 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142056#comment-14142056 ] Nick Burch commented on TIKA-1420: -- Are you envisioning something that will look for certa

[jira] [Commented] (TIKA-1420) Add Metadata Extraction to Arbitrary Parsers

2014-09-21 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142691#comment-14142691 ] Nick Burch commented on TIKA-1420: -- I don't see why you couldn't do it as a decorating han

[jira] [Commented] (TIKA-1420) Add Metadata Extraction to Arbitrary Parsers

2014-09-24 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146187#comment-14146187 ] Nick Burch commented on TIKA-1420: -- For now, I'd suggest putting this into the Examples pa

[jira] [Commented] (TIKA-1420) Add Metadata Extraction to Arbitrary Parsers

2014-09-24 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14146976#comment-14146976 ] Nick Burch commented on TIKA-1420: -- Since it's an example, it might be good to put in a he

[jira] [Commented] (TIKA-1420) Add Metadata Extraction to Arbitrary Parsers

2014-09-28 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151228#comment-14151228 ] Nick Burch commented on TIKA-1420: -- Now it's dependency free, don't see why it can't be in

[jira] [Commented] (TIKA-1431) How to extract embedded images in a document?

2014-09-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151587#comment-14151587 ] Nick Burch commented on TIKA-1431: -- If you go to http://localhost:9998/ you'll see the lis

[jira] [Resolved] (TIKA-1444) Detection for VirtualPC VHD files

2014-10-13 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1444. -- Resolution: Fixed Fix Version/s: 1.7 I don't think we can remove the .vhd extension from VHDL, as

[jira] [Resolved] (TIKA-1450) Tika does not detect the correct mime-type for webp images

2014-10-17 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1450. -- Resolution: Fixed Fix Version/s: 1.7 Thanks for this. Mimetype added in r1632699, and unit test ba

[jira] [Commented] (TIKA-1452) parser.parse() throws exception after which the procesed file is not getting renamed/moved/deleted

2014-10-21 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178362#comment-14178362 ] Nick Burch commented on TIKA-1452: -- Can you provide a junit test case that shows how to re

[jira] [Commented] (TIKA-1452) parser.parse() throws exception after which the procesed file is not getting renamed/moved/deleted

2014-10-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185068#comment-14185068 ] Nick Burch commented on TIKA-1452: -- The stacktrace you have posted is quite clear on you h

[jira] [Commented] (TIKA-1460) Could not parse predefined CMAP file for 'Adobe-GBK1-UCS2'

2014-10-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188192#comment-14188192 ] Nick Burch commented on TIKA-1460: -- We do really ideally need the problematic file, any ch

[jira] [Commented] (TIKA-1461) Bad mime detection of certain JAR file

2014-10-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188479#comment-14188479 ] Nick Burch commented on TIKA-1461: -- Do you know the license of that file? And/or of a diff
