[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Paul Ramirez (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279927#comment-14279927 ] Paul Ramirez commented on TIKA-1518: Thanks Konstantin for the example. If you have the

[jira] [Updated] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Juha Haaga (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juha Haaga updated TIKA-1028: - Attachment: test.eml Mail containing encrypted zip file. Zip password is "test". > Tika-server quits parsi

[jira] [Updated] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Juha Haaga (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juha Haaga updated TIKA-1028: - Attachment: (was: encrypted-zip.msg) > Tika-server quits parsing of rfc-822 document prematurely when i

[jira] [Commented] (TIKA-1511) Create a parser for SQLite3

2015-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279748#comment-14279748 ] Tim Allison commented on TIKA-1511: --- Sounds good, y, I think the user will have to handcr

[jira] [Commented] (TIKA-1222) Tika does not extract attachments from RFC822 files

2015-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279679#comment-14279679 ] Hudson commented on TIKA-1222: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #435 (See [https://b

[jira] [Commented] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279629#comment-14279629 ] Hudson commented on TIKA-1028: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #419 (See [https://b

[jira] [Commented] (TIKA-1222) Tika does not extract attachments from RFC822 files

2015-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279628#comment-14279628 ] Hudson commented on TIKA-1222: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #419 (See [https://b

[jira] [Commented] (TIKA-1222) Tika does not extract attachments from RFC822 files

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279612#comment-14279612 ] Nick Burch commented on TIKA-1222: -- I've done something similar by r1652321. It's heavily

[jira] [Commented] (TIKA-1222) Tika does not extract attachments from RFC822 files

2015-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279599#comment-14279599 ] Hudson commented on TIKA-1222: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #434 (See [https://b

[jira] [Commented] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279600#comment-14279600 ] Hudson commented on TIKA-1028: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #434 (See [https://b

[jira] [Commented] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279572#comment-14279572 ] Nick Burch commented on TIKA-1028: -- I've added a temporary workaround in r1652317. It does

[ANNOUNCE] Apache Tika 1.7 Released

2015-01-15 Thread Tyler Palsulich
The Apache Tika project is pleased to announce the release of Apache Tika 1.7. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting a

Re: [VOTE] Apache Tika 1.7 Release

2015-01-15 Thread Tyler Palsulich
Found it: https://github.com/chrismattmann/apachestuff/blob/master/extract-tika-contribs :) Thanks! Tyler On Thu, Jan 15, 2015 at 8:57 AM, Tyler Palsulich wrote: > Thanks, Chris! That sounds useful. Let me know when you get a chance to > upload it somewhere. > > Tyler > > On Wed, Jan 14, 2015 a

[jira] [Commented] (TIKA-1519) Don't allow whatever is in http-equiv Content-Type to overwrite actual Content-Type in HtmlParser

2015-01-15 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279058#comment-14279058 ] Luis Filipe Nassif commented on TIKA-1519: -- Maybe a more general "Content-Type-Hin

[jira] [Closed] (TIKA-1514) http-equiv content-type extraction should pick first parseable content value

2015-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison closed TIKA-1514. - Resolution: Won't Fix Closing this and opening separate issue TIKA-1519. > http-equiv content-type extract

[jira] [Updated] (TIKA-1519) Don't allow whatever is in http-equiv Content-Type to overwrite actual Content-Type in HtmlParser

2015-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1519: -- Summary: Don't allow whatever is in http-equiv Content-Type to overwrite actual Content-Type in HtmlParse

[jira] [Created] (TIKA-1519) Don't allow whatever is in http-equiv Content-Type overwrite actual Content-Type in HtmlParser

2015-01-15 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1519: - Summary: Don't allow whatever is in http-equiv Content-Type overwrite actual Content-Type in HtmlParser Key: TIKA-1519 URL: https://issues.apache.org/jira/browse/TIKA-1519

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279008#comment-14279008 ] Tim Allison commented on TIKA-1513: --- Thank you, [~lfcnassif] and [~gagravarr]! I think I

[jira] [Commented] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279003#comment-14279003 ] Tim Allison commented on TIKA-1028: --- Will look into this and TIKA-1222 tomorrow unless so

[RESULT] [VOTE] Apache Tika 1.7 Release Candidate #3

2015-01-15 Thread Tyler Palsulich
Hi All, The VOTE for releasing Apache Tika 1.7 RC#3 finished with the following tally: +1: Chris Mattmann, David Meikle, Hong-Thai Nguyen, Nick Burch, Tim Allison, Tyler Palsulich. +0: [None]. -1: [None]. Thank you everyone for voting! I will move forward with the release. Have a good day, Tyler

Re: [VOTE] Apache Tika 1.7 Release

2015-01-15 Thread Tyler Palsulich
Thanks, Chris! That sounds useful. Let me know when you get a chance to upload it somewhere. Tyler On Wed, Jan 14, 2015 at 11:22 PM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > yeah good idea Nick. Also I had a script that would partially auto-generate > the contributors a

[jira] [Comment Edited] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278651#comment-14278651 ] Luis Filipe Nassif edited comment on TIKA-1028 at 1/15/15 4:49 PM: --

[jira] [Commented] (TIKA-1516) Downgrade Rome dependency to 0.9 to avoid nasty NPE

2015-01-15 Thread Konstantin Gribov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278821#comment-14278821 ] Konstantin Gribov commented on TIKA-1516: - Also, it seems to be a classloader issu

[jira] [Commented] (TIKA-1516) Downgrade Rome dependency to 0.9 to avoid nasty NPE

2015-01-15 Thread Konstantin Gribov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278800#comment-14278800 ] Konstantin Gribov commented on TIKA-1516: - Upstream bug isn't fixed yet. See https:

[jira] [Commented] (TIKA-1516) Downgrade Rome dependency to 0.9 to avoid nasty NPE

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278751#comment-14278751 ] Nick Burch commented on TIKA-1516: -- Is it not possible to upgrade to a newer version of Ro

[jira] [Commented] (TIKA-1516) Downgrade Rome dependency to 0.9 to avoid nasty NPE

2015-01-15 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278743#comment-14278743 ] Lewis John McGibbney commented on TIKA-1516: Anyone to approve this folks? > D

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278678#comment-14278678 ] Nick Burch commented on TIKA-1513: -- If it's the project themselves pushing it to central,

[jira] [Commented] (TIKA-1513) Add mime detection and parsing for dbf files

2015-01-15 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278674#comment-14278674 ] Luis Filipe Nassif commented on TIKA-1513: -- I talked to iryndin and he liked the i

[jira] [Commented] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278651#comment-14278651 ] Luis Filipe Nassif commented on TIKA-1028: -- I think fixing tika-1222 it will be

[jira] [Updated] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment.

2015-01-15 Thread Juha Haaga (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juha Haaga updated TIKA-1028: - Affects Version/s: 1.7 1.5 1.6 > Tika-server quits parsing of

[jira] [Commented] (TIKA-1517) MIME type detection with probability

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278587#comment-14278587 ] Nick Burch commented on TIKA-1517: -- Two quick questions so far: * How would this work wit

[jira] [Reopened] (TIKA-1329) Add RecursiveParserWrapper aka Jukka's (and Nick's) RecursiveMetadataParser

2015-01-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch reopened TIKA-1329: -- I'm re-opening this, as while we have the RecursiveParserWrapper, we don't yet have anything in the tika-exa

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Konstantin Gribov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278446#comment-14278446 ] Konstantin Gribov commented on TIKA-1518: - To pull latest Tika you can use snippet

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Paul Ramirez (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278424#comment-14278424 ] Paul Ramirez commented on TIKA-1518: As I build a patch what component should this go i

[jira] [Created] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Paul Ramirez (JIRA)
Paul Ramirez created TIKA-1518: -- Summary: Docker with Tika Server Key: TIKA-1518 URL: https://issues.apache.org/jira/browse/TIKA-1518 Project: Tika Issue Type: New Feature Reporter:

[jira] [Commented] (TIKA-1301) Establish TikaServer on Apache hosted VM

2015-01-15 Thread Paul Ramirez (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278419#comment-14278419 ] Paul Ramirez commented on TIKA-1301: In the spirit of fun and because I'm going to do i

[jira] [Updated] (TIKA-1517) MIME type detection with probability

2015-01-15 Thread Luke sh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke sh updated TIKA-1517: -- Description: Improvement and intuition The original implementation for MIME type selection/detection is a bit les

[jira] [Updated] (TIKA-1517) MIME type detection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Summary: MIME type detection with probability (was: MIME type selection with probability) > MIME type detec

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Attachment: BaysianTest.java Simple demo program for the MIME type probability detection > MIME type select

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Description: Problem and intuition The original implementation in MIME type determination is a bit less flexi

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Description: Problem and intuition The original implementation in MIME type determination is a bit less flexi

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Description: Problem and intuition The original implementation in MIME type determination is a bit less flexi

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Description: Problem and intuition The original implementation in MIME type determination is a bit less flexi

[jira] [Updated] (TIKA-1517) MIME type selection with probability

2015-01-15 Thread Shuai Liu (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Liu updated TIKA-1517: Description: Problem and intuition The original implementation in MIME type determination is a bit less flexi