tika-trunk-jdk1.7 - Build # 964 - Still Failing

2016-04-22 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #964). Status: Still Failing. Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/964/ to view the results.

tika-trunk-jdk1.7 - Build # 963 - Failure

2016-04-22 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #963). Status: Failure. Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/963/ to view the results.

[jira] [Commented] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254825#comment-15254825 ] Chris A. Mattmann commented on TIKA-1885: - ping [~adeshgup] > Tika MIME updates for *.cdf and

[jira] [Updated] (TIKA-1917) Just a quick fix to allow NLTK Parser extract measurement information from text

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1917: Fix Version/s: (was: 1.14) 1.13 > Just a quick fix to allow NLTK

[jira] [Updated] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1885: Fix Version/s: (was: 1.14) 1.13 > Tika MIME updates for *.cdf and

[jira] [Updated] (TIKA-1939) Preparation for Tika 1.13 release

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1939: Fix Version/s: (was: 1.14) 1.13 > Preparation for Tika 1.13 release >

[jira] [Updated] (TIKA-1955) MIME types updates and additions for Scientific Data based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1955: Fix Version/s: (was: 1.14) 1.13 > MIME types updates and additions

[jira] [Updated] (TIKA-1913) Integrate MIT Information Extraction(MITIE) into Tika to perform Named Entity Recognition

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1913: Fix Version/s: (was: 1.14) 1.13 > Integrate MIT Information

[jira] [Updated] (TIKA-1801) Integrate MITIE Named Entity Recognition support

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1801: Fix Version/s: (was: 1.14) 1.13 > Integrate MITIE Named Entity

[jira] [Updated] (TIKA-1220) Parser implementration for IFC files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1220: Fix Version/s: (was: 1.13) 1.14 > Parser implementration for IFC

[jira] [Updated] (TIKA-985) Support for HTML5 elements

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-985: --- Fix Version/s: (was: 1.13) 1.14 > Support for HTML5 elements >

[jira] [Updated] (TIKA-1366) Update some of Tika Server services to support JAX-RS 2.0 AsyncResponse

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1366: Fix Version/s: (was: 1.13) 1.14 > Update some of Tika Server services

[jira] [Updated] (TIKA-1318) Use of Deprecated Word6Extractor.getParagraphText() Method

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1318: Fix Version/s: (was: 1.13) 1.14 > Use of Deprecated

[jira] [Updated] (TIKA-1417) Create Extract Embedded Images from PDFs Example

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1417: Fix Version/s: (was: 1.13) 1.14 > Create Extract Embedded Images from

[jira] [Updated] (TIKA-1540) New Tika plugin for image based feature extraction using computer vision techniques

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1540: Fix Version/s: (was: 1.13) 1.14 > New Tika plugin for image based

[jira] [Updated] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1815: Fix Version/s: (was: 1.13) 1.14 > Text content from parser is empty

[jira] [Updated] (TIKA-1925) Composite External Parser like Exiftool fails to run on Windows.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1925: Fix Version/s: (was: 1.13) 1.14 > Composite External Parser like

[jira] [Updated] (TIKA-1688) Tika Version in Metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1688: Fix Version/s: (was: 1.13) 1.14 > Tika Version in Metadata >

[jira] [Updated] (TIKA-894) Add webapp mode for Tika Server, simplifies deployment

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-894: --- Fix Version/s: (was: 1.13) 1.14 > Add webapp mode for Tika Server,

[jira] [Updated] (TIKA-774) ExifTool Parser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-774: --- Fix Version/s: (was: 1.13) 1.14 > ExifTool Parser > --- > >

[jira] [Updated] (TIKA-1059) Better Handling of InterruptedException in ExternalParser and ExternalEmbedder

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1059: Fix Version/s: (was: 1.13) 1.14 > Better Handling of

[jira] [Updated] (TIKA-1456) Visual Sentiment API parser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1456: Fix Version/s: (was: 1.13) 1.14 > Visual Sentiment API parser >

[jira] [Updated] (TIKA-1329) Add RecursiveParserWrapper aka Jukka's (and Nick's) RecursiveMetadataParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1329: Fix Version/s: (was: 1.13) 1.14 > Add RecursiveParserWrapper aka

[jira] [Updated] (TIKA-1706) Bring back commons-io to tika-core

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1706: Fix Version/s: (was: 1.13) 1.14 > Bring back commons-io to tika-core

[jira] [Updated] (TIKA-1640) Make ExternalParser support aliases for key names in extracted metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1640: Fix Version/s: (was: 1.13) 1.14 > Make ExternalParser support aliases

[jira] [Updated] (TIKA-980) MicrodataContentHandler for Apache Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-980: --- Fix Version/s: (was: 1.13) 1.14 > MicrodataContentHandler for Apache

[jira] [Updated] (TIKA-1672) Integrate tika-java7 component

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1672: Fix Version/s: (was: 1.13) 1.14 > Integrate tika-java7 component >

[jira] [Updated] (TIKA-1697) Parser Implementation for AkomaNtoso Legal XML Documents

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1697: Fix Version/s: (was: 1.13) 1.14 > Parser Implementation for

[jira] [Updated] (TIKA-1738) ForkClient does not always delete temporary bootstrap jar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1738: Fix Version/s: (was: 1.13) 1.14 > ForkClient does not always delete

[jira] [Updated] (TIKA-1208) Migrate Any23 mime contributions to Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1208: Fix Version/s: (was: 1.13) 1.14 > Migrate Any23 mime contributions to

[jira] [Updated] (TIKA-1801) Integrate MITIE Named Entity Recognition support

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1801: Fix Version/s: (was: 1.13) 1.14 > Integrate MITIE Named Entity

[jira] [Updated] (TIKA-1953) tika-server NullPointerException while processing rtfs

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1953: Fix Version/s: (was: 1.13) 1.14 > tika-server NullPointerException

[jira] [Updated] (TIKA-891) Use POST in addition to PUT on method calls in tika-server

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-891: --- Fix Version/s: (was: 1.13) 1.14 > Use POST in addition to PUT on method

[jira] [Updated] (TIKA-1598) Parser Implementation for Streaming Video

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1598: Fix Version/s: (was: 1.13) 1.14 > Parser Implementation for Streaming

[jira] [Updated] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-988: --- Fix Version/s: (was: 1.13) 1.14 > We don't extract a placeholder for a

[jira] [Updated] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-819: --- Fix Version/s: (was: 1.13) 1.14 > Make Option to Exclude Embedded Files'

[jira] [Updated] (TIKA-1607) Introduce new arbitrary object key/values data structure for persistence of Tika Metadata

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1607: Fix Version/s: (was: 1.13) 1.14 > Introduce new arbitrary object

[jira] [Updated] (TIKA-1425) Automatic batching of Microsoft service calls

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1425: Fix Version/s: (was: 1.13) 1.14 > Automatic batching of Microsoft

[jira] [Updated] (TIKA-1395) Create embedded image extraction example

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1395: Fix Version/s: (was: 1.13) 1.14 > Create embedded image extraction

[jira] [Updated] (TIKA-1952) Access Date is getting modified while capturing the MetaData information using AutoDetectParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1952: Fix Version/s: (was: 1.13) 1.14 > Access Date is getting modified

[jira] [Updated] (TIKA-1913) Integrate MIT Information Extraction(MITIE) into Tika to perform Named Entity Recognition

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1913: Fix Version/s: (was: 1.13) 1.14 > Integrate MIT Information

[jira] [Updated] (TIKA-1888) Update mimetype for application/x-netcdf

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1888: Fix Version/s: (was: 1.13) 1.14 > Update mimetype for

[jira] [Updated] (TIKA-1108) Represent individual slides in pptx

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1108: Fix Version/s: (was: 1.13) 1.14 > Represent individual slides in pptx

[jira] [Updated] (TIKA-1295) Make some Dublin Core items multi-valued

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1295: Fix Version/s: (was: 1.13) 1.14 > Make some Dublin Core items

[jira] [Updated] (TIKA-1379) error in Tika().detect for xml files with xades signature

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1379: Fix Version/s: (was: 1.13) 1.14 > error in Tika().detect for xml

[jira] [Updated] (TIKA-1508) Add uniformity to parser parameter configuration

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1508: Fix Version/s: (was: 1.13) 1.14 > Add uniformity to parser parameter

[jira] [Updated] (TIKA-1436) improvement to PDFParser

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1436: Fix Version/s: (was: 1.13) 1.14 > improvement to PDFParser >

[jira] [Updated] (TIKA-1829) org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:92) NPE

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1829: Fix Version/s: (was: 1.13) 1.14 >

[jira] [Updated] (TIKA-1705) Update ASM dependency to 5.0.4

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1705: Fix Version/s: (was: 1.13) 1.14 > Update ASM dependency to 5.0.4 >

[jira] [Updated] (TIKA-1709) Tika Server doesn't handle multi-part attachments or form-encoded inputs

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1709: Fix Version/s: (was: 1.13) 1.14 > Tika Server doesn't handle

[jira] [Updated] (TIKA-1390) Create tika-example module

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1390: Fix Version/s: (was: 1.13) 1.14 > Create tika-example module >

[jira] [Updated] (TIKA-539) Encoding detection is too biased by encoding in meta tag

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-539: --- Fix Version/s: (was: 1.13) 1.14 > Encoding detection is too biased by

[jira] [Updated] (TIKA-1955) MIME types updates and additions for Scientific Data based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1955: Fix Version/s: (was: 1.13) 1.14 > MIME types updates and additions

[jira] [Updated] (TIKA-1724) Create parser for .obo file format.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1724: Fix Version/s: (was: 1.13) 1.14 > Create parser for .obo file format.

[jira] [Updated] (TIKA-1276) Missing embedded dependencies in tika-bundle

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1276: Fix Version/s: (was: 1.13) 1.14 > Missing embedded dependencies in

[jira] [Updated] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1308: Fix Version/s: (was: 1.13) 1.14 > Support in memory parse mode(don't

[jira] [Updated] (TIKA-1939) Preparation for Tika 1.13 release

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1939: Fix Version/s: (was: 1.13) 1.14 > Preparation for Tika 1.13 release >

[jira] [Updated] (TIKA-1800) MediaType#parse does not decode escaped special characters

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1800: Fix Version/s: (was: 1.13) 1.14 > MediaType#parse does not decode

[jira] [Updated] (TIKA-1106) CLAVIN Integration

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1106: Fix Version/s: (was: 1.13) 1.14 > CLAVIN Integration >

[jira] [Updated] (TIKA-1465) Implement extraction of non-global variables from netCDF3 and netCDF4

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1465: Fix Version/s: (was: 1.13) 1.14 > Implement extraction of non-global

[jira] [Updated] (TIKA-1343) Create a Tika Translator implementation that uses JoshuaDecoder

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1343: Fix Version/s: (was: 1.13) 1.14 > Create a Tika Translator

[jira] [Updated] (TIKA-1577) NetCDF Data Extraction

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1577: Fix Version/s: (was: 1.13) 1.14 > NetCDF Data Extraction >

[jira] [Updated] (TIKA-1301) Establish TikaServer on Apache hosted VM

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1301: Fix Version/s: (was: 1.13) 1.14 > Establish TikaServer on Apache

[jira] [Updated] (TIKA-1885) Tika MIME updates for *.cdf and *.xar and custom zero length file detector based on TREC-DD-Polar

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1885: Fix Version/s: (was: 1.13) 1.14 > Tika MIME updates for *.cdf and

[jira] [Updated] (TIKA-1505) chmparser breaks down when extracting from file of CHM format v3

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1505: Fix Version/s: (was: 1.13) 1.14 > chmparser breaks down when

[jira] [Updated] (TIKA-1609) Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1609: Fix Version/s: (was: 1.13) 1.14 > Leverage Google's LibPhonenumber

[jira] [Updated] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-987: --- Fix Version/s: (was: 1.13) 1.14 > Embedded drawing (SHAPE MERGEFORMAT)

[jira] [Updated] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1367: Fix Version/s: (was: 1.13) 1.14 > Tika documentation should list

[jira] [Updated] (TIKA-1917) Just a quick fix to allow NLTK Parser extract measurement information from text

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1917: Fix Version/s: (was: 1.13) 1.14 > Just a quick fix to allow NLTK

[jira] [Updated] (TIKA-1674) Add example to show how to extract embedded files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1674: Fix Version/s: (was: 1.13) 1.14 > Add example to show how to extract

[jira] [Updated] (TIKA-1808) Head section closed too eager

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1808: Fix Version/s: (was: 1.13) 1.14 > Head section closed too eager >

[jira] [Updated] (TIKA-1840) No way to link slide notes to slide in PPT output.

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1840: Fix Version/s: (was: 1.13) 1.14 > No way to link slide notes to slide

[jira] [Updated] (TIKA-1513) Add mime detection and parsing for dbf files

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1513: Fix Version/s: (was: 1.13) 1.14 > Add mime detection and parsing for

[jira] [Resolved] (TIKA-1723) Integrate language-detector into Tika

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1723. - Resolution: Fixed Fix Version/s: 1.13 This is now done, Ken's Optimaize langdetect,

[jira] [Resolved] (TIKA-1696) Language Identification with Text Processing Toolkit from MITLL

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1696. - Resolution: Fixed This is now done, Ken's Optimaize langdetect, N-gram langdetect and

[jira] [Resolved] (TIKA-1872) Backport tika-langdetect from 2.x branch to 1.13 branch

2016-04-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1872. - Resolution: Fixed This is now done, Ken's Optimaize langdetect, N-gram langdetect and

[GitHub] tika pull request: Backport tika-langdetect from 2.x branch to 1.1...

2016-04-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/tika/pull/90

[jira] [Resolved] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1924. --- Resolution: Fixed Re-upgraded to 1.1.18, added work-around to avoid infinite loops, added tiny

[GitHub] tika pull request: Tika 1913 - MIT Information Extraction itegrate...

2016-04-22 Thread manalishah
GitHub user manalishah opened a pull request: https://github.com/apache/tika/pull/108 Tika 1913 - MIT Information Extraction itegrated with Tika. This pull request comprises yet another NamedEntityRecognizer that uses the open-source trained models and functions of MIT-nlp to

[jira] [Commented] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254029#comment-15254029 ] Tim Allison commented on TIKA-1924: --- As [~b...@benmccann.com] pointed out, this upgrade removes our

[jira] [Resolved] (TIKA-1931) Revert mp4 parser version because of new permanent hangs with 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1931. --- Resolution: Won't Fix I think there's a workaround that will allow 1.1.18 and prevent infinite loops.

[jira] [Reopened] (TIKA-1931) Revert mp4 parser version because of new permanent hangs with 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1931: --- No need to revert. > Revert mp4 parser version because of new permanent hangs with 1.1.18 >

[jira] [Reopened] (TIKA-1924) Upgrade com.googlecode.mp4parser's isoparser to 1.1.18

2016-04-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1924: --- I think we can add a workaround to prevent the infinite loop at the Tika level. > Upgrade

RE: aspectj dependency

2016-04-22 Thread Ben McCann
Thank you! On Apr 22, 2016 5:18 AM, "Allison, Timothy B." wrote: > Hi Ben, > > We tried to upgrade to 1.1.18 on TIKA-1924. Unfortunately, there was a > bug (which we reported [0]) that causes the parser to go into an infinite > loop on some files in our test corpus. We

Re: JIRA issue?

2016-04-22 Thread Nick Burch
On Thu, 21 Apr 2016, Ben McCann wrote: I'd like to create an issue on the JIRA. When I visit https://issues.apache.org/jira/browse/TIKA/ and hit Create, I don't see Tika as an option. I can only create issues for Zookeeper and other projects. If you let us know your JIRA username, someone can

RE: aspectj dependency

2016-04-22 Thread Allison, Timothy B.
Hi Ben, We tried to upgrade to 1.1.18 on TIKA-1924. Unfortunately, there was a bug (which we reported [0]) that causes the parser to go into an infinite loop on some files in our test corpus. We had to back off to 1.1.7 (TIKA-1931), and unfortunately as I look this morning, that seems to