[jira] [Commented] (TIKA-1982) Add language (and possibly other fields) to /rmeta endpoint

2018-07-16 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546037#comment-16546037 ] Chris A. Mattmann commented on TIKA-1982: - [~talli...@apache.org] any chance we could get this

[jira] [Commented] (TIKA-2672) Upgrade dl4j to 1.0.0-beta

2018-07-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534984#comment-16534984 ] Chris A. Mattmann commented on TIKA-2672: - GREAT WORK [~ThejanWijesinghe] thanks my guy > Upgrade

[jira] [Commented] (TIKA-2684) Tika does not extract *.fits header text, just file level metadata

2018-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533837#comment-16533837 ] Chris A. Mattmann commented on TIKA-2684: - gotcha, well if you tell me the GDAL commands necessary

[jira] [Comment Edited] (TIKA-2684) Tika does not extract *.fits header text, just file level metadata

2018-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533824#comment-16533824 ] Chris A. Mattmann edited comment on TIKA-2684 at 7/5/18 3:40 PM: - hehe,

[jira] [Commented] (TIKA-2684) Tika does not extract *.fits header text, just file level metadata

2018-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533824#comment-16533824 ] Chris A. Mattmann commented on TIKA-2684: - hehe, well I know GDAL handles fits, and Tika is

[jira] [Assigned] (TIKA-94) Speech recognition

2018-06-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-94: - Assignee: Chris A. Mattmann > Speech recognition > -- > >

[jira] [Commented] (TIKA-94) Speech recognition

2018-06-11 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508258#comment-16508258 ] Chris A. Mattmann commented on TIKA-94: --- Yes [~ThejanWijesinghe] let's start with your Tensorflow one

[jira] [Commented] (TIKA-94) Speech recognition

2018-06-10 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507635#comment-16507635 ] Chris A. Mattmann commented on TIKA-94: --- great to hear. Check this out:

[jira] [Commented] (TIKA-94) Speech recognition

2018-06-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506106#comment-16506106 ] Chris A. Mattmann commented on TIKA-94: --- Thanks for asking [~edwinyeozl]. There is an opportunity here

[jira] [Commented] (TIKA-2646) Tika parse["content"] returns jumbled text across cells of a table in a pdf

2018-05-26 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491792#comment-16491792 ] Chris A. Mattmann commented on TIKA-2646: - [~adidier] see comment above from [~lfcnassif] > Tika

[jira] [Commented] (TIKA-2520) OptimaizeLangDetector#loadModels() should not be called for every single langdetect HTTP request

2018-05-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489926#comment-16489926 ] Chris A. Mattmann commented on TIKA-2520: - Integrated into 2.x master too: {noformat} [INFO]

[jira] [Comment Edited] (TIKA-2520) OptimaizeLangDetector#loadModels() should not be called for every single langdetect HTTP request

2018-05-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489790#comment-16489790 ] Chris A. Mattmann edited comment on TIKA-2520 at 5/24/18 8:56 PM: --

[jira] [Resolved] (TIKA-2520) OptimaizeLangDetector#loadModels() should not be called for every single langdetect HTTP request

2018-05-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2520. - Resolution: Fixed Fix Version/s: 1.19 {noformat} nonas:tika2.0.0 mattmann$ git

[jira] [Assigned] (TIKA-2520) OptimaizeLangDetector#loadModels() should not be called for every single langdetect HTTP request

2018-05-24 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-2520: --- Assignee: Chris A. Mattmann > OptimaizeLangDetector#loadModels() should not be called

[jira] [Commented] (TIKA-2646) Tika parse["content"] returns jumbled text across cells of a table in a pdf

2018-05-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484338#comment-16484338 ] Chris A. Mattmann commented on TIKA-2646: - Tim thanks - this is for a project at JPL and I asked

[jira] [Resolved] (TIKA-2400) Standardizing current Object Recognition REST parsers

2017-11-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2400. - Resolution: Fixed Assignee: Chris A. Mattmann merged! > Standardizing current

[jira] [Commented] (TIKA-2503) Try to upgrade httpclient to >=4.5.3

2017-11-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249913#comment-16249913 ] Chris A. Mattmann commented on TIKA-2503: - thanks Tim, no we don't have coverage. I bet that

[jira] [Commented] (TIKA-2503) Try to upgrade httpclient to >=4.5.3

2017-11-13 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249877#comment-16249877 ] Chris A. Mattmann commented on TIKA-2503: - for OpeNDAP datasets I believe we would need this...yep

[jira] [Resolved] (TIKA-2464) No PIL found while running the docker image 'InceptionVideoRestDockerfile'

2017-09-14 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2464. - Resolution: Fixed Fix Version/s: 1.17 Committed thanks [~armathur]! > No PIL found

[jira] [Assigned] (TIKA-2464) No PIL found while running the docker image 'InceptionVideoRestDockerfile'

2017-09-14 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-2464: --- Assignee: Chris A. Mattmann > No PIL found while running the docker image

[jira] [Resolved] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output?

2017-08-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2332. - Resolution: Fixed committed thanks! > Output SNOMED codes for CUIs in CTAKES output? >

[jira] [Updated] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output?

2017-08-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2332: Fix Version/s: 1.17 > Output SNOMED codes for CUIs in CTAKES output? >

[jira] [Updated] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output?

2017-08-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2332: Labels: memex (was: ) > Output SNOMED codes for CUIs in CTAKES output? >

[jira] [Assigned] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output?

2017-08-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-2332: --- Assignee: Chris A. Mattmann > Output SNOMED codes for CUIs in CTAKES output? >

[jira] [Updated] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds

2017-08-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2355: Component/s: parser > Cache trained mode while running ObjectRecognition server from Docker

[jira] [Updated] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds

2017-08-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2355: Labels: memex (was: ) > Cache trained mode while running ObjectRecognition server from

[jira] [Updated] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds

2017-08-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2355: Description: DockerBuilds of ObjectRecognition downloads model every time server starts.

[jira] [Resolved] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds

2017-08-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2355. - Resolution: Fixed Assignee: Chris A. Mattmann Fixed! {noformat} LMC-053601:tf

[jira] [Updated] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds

2017-08-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2355: Fix Version/s: 1.17 > Cache trained mode while running ObjectRecognition server from Docker

[jira] [Commented] (TIKA-2434) Language detection slow, cpu intensive, CLI interrupts work

2017-08-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120263#comment-16120263 ] Chris A. Mattmann commented on TIKA-2434: - [~talli...@apache.org] ping > Language detection slow,

[jira] [Updated] (TIKA-2402) Support all image formats in Object Recognition REST Parser

2017-08-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2402: Labels: memex (was: ) > Support all image formats in Object Recognition REST Parser >

[jira] [Resolved] (TIKA-2402) Support all image formats in Object Recognition REST Parser

2017-08-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2402. - Resolution: Fixed Assignee: Chris A. Mattmann - fixed! > Support all image formats

[jira] [Commented] (TIKA-2434) Language detection slow, cpu intensive, CLI interrupts work

2017-08-01 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109908#comment-16109908 ] Chris A. Mattmann commented on TIKA-2434: - Tim, for #1, once you know how it runs against the

[jira] [Updated] (TIKA-2262) Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types

2017-07-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2262: Labels: deeplearning gsoc2017 machine_learning memex (was: deeplearning gsoc2017

[jira] [Commented] (TIKA-2262) Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types

2017-07-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079637#comment-16079637 ] Chris A. Mattmann commented on TIKA-2262: - Documentation is here:

[jira] [Resolved] (TIKA-2262) Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types

2017-07-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2262. - Resolution: Fixed Fix Version/s: 1.17 Congratulations [~ThejanWijesinghe] your work

[jira] [Assigned] (TIKA-2262) Supporting Image-to-Text (Image Captioning) in Tika for Image MIME Types

2017-07-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-2262: --- Assignee: Chris A. Mattmann (was: Thamme Gowda) > Supporting Image-to-Text (Image

[jira] [Commented] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078876#comment-16078876 ] Chris A. Mattmann commented on TIKA-1988: - For now yes [~talli...@mitre.org] until we fix

[jira] [Commented] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078283#comment-16078283 ] Chris A. Mattmann commented on TIKA-1988: - Sounds good to me...almost done with tika-nlp will

[jira] [Commented] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078237#comment-16078237 ] Chris A. Mattmann commented on TIKA-1988: - Agree on #3. I'm going to take a first cut at tika-nlp.

[jira] [Commented] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078155#comment-16078155 ] Chris A. Mattmann commented on TIKA-1988: - #1 - absolutely - i thought putting the model download

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077692#comment-16077692 ] Chris A. Mattmann commented on TIKA-2298: - docs added here:

[jira] [Resolved] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1988. - Resolution: Fixed - merged into master thanks [~msha...@usc.edu], [~tgow...@gmail.com] and

[jira] [Updated] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1988: Labels: age machine_learning memex nlp opennlp (was: age memex nlp opennlp) > Age Detection

[jira] [Updated] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1988: Fix Version/s: 1.16 > Age Detection Tika Recogniser > - > >

[jira] [Assigned] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-1988: --- Assignee: Chris A. Mattmann > Age Detection Tika Recogniser >

[jira] [Updated] (TIKA-1988) Age Detection Tika Recogniser

2017-07-07 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1988: Labels: age memex nlp opennlp (was: ) > Age Detection Tika Recogniser >

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075859#comment-16075859 ] Chris A. Mattmann commented on TIKA-2298: - fixed, was a simple typo - you forgot to set the config

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075855#comment-16075855 ] Chris A. Mattmann commented on TIKA-2298: - docs added in:

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075849#comment-16075849 ] Chris A. Mattmann commented on TIKA-2298: - [~talli...@apache.org] your latest update causes Jenkins

[jira] [Updated] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2298: Labels: ObjectRecognitionParser gsoc memex (was: ObjectRecognitionParser memex) > To

[jira] [Updated] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2298: Labels: ObjectRecognitionParser memex (was: ObjectRecognitionParser) > To improve object

[jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075755#comment-16075755 ] Chris A. Mattmann commented on TIKA-2298: - YES sounds perfect thanks [~talli...@apache.org] > To

[jira] [Resolved] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-07-05 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2298. - Resolution: Fixed Assignee: Chris A. Mattmann Thanks to [~asmehra95] and

[jira] [Updated] (TIKA-1988) Age Detection Tika Recogniser

2017-06-29 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1988: Summary: Age Detection Tika Recogniser (was: Tika parser for extracting text based

[jira] [Commented] (TIKA-1988) Tika parser for extracting text based features

2017-06-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058744#comment-16058744 ] Chris A. Mattmann commented on TIKA-1988: - sorry I missed it! will look now > Tika parser for

[jira] [Commented] (TIKA-1988) Tika parser for extracting text based features

2017-06-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058731#comment-16058731 ] Chris A. Mattmann commented on TIKA-1988: - sounds great [~msha...@usc.edu] any progress? > Tika

[jira] [Commented] (TIKA-2368) Clean up SentimentParser dependencies

2017-06-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043056#comment-16043056 ] Chris A. Mattmann commented on TIKA-2368: - hey [~talli...@apache.org] we're working on this right

[jira] [Commented] (TIKA-2373) Fix licenses via rat before 1.15 release

2017-05-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019789#comment-16019789 ] Chris A. Mattmann commented on TIKA-2373: - exclude away! > Fix licenses via rat before 1.15

[jira] [Commented] (TIKA-1334) Add presentation layer for results of each run

2017-05-22 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019785#comment-16019785 ] Chris A. Mattmann commented on TIKA-1334: - awesome! they look great! > Add presentation layer for

[jira] [Resolved] (TIKA-1106) CLAVIN Integration

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1106. - Resolution: Won't Fix we already have the GeoTopicParser so going to close this one out.

[jira] [Resolved] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1815. - Resolution: Fixed Fix Version/s: (was: 1.16) 1.15 > Text

[jira] [Updated] (TIKA-1106) CLAVIN Integration

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1106: Fix Version/s: (was: 1.15) 1.16 > CLAVIN Integration >

[jira] [Updated] (TIKA-1738) ForkClient does not always delete temporary bootstrap jar

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1738: Fix Version/s: (was: 1.15) 1.16 > ForkClient does not always delete

[jira] [Updated] (TIKA-1815) Text content from parser is empty when NamedEntityParser is enabled

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1815: Fix Version/s: (was: 1.15) 1.16 > Text content from parser is empty

[jira] [Updated] (TIKA-985) Support for HTML5 elements

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-985: --- Fix Version/s: (was: 1.15) 1.16 > Support for HTML5 elements >

[jira] [Updated] (TIKA-1505) chmparser breaks down when extracting from file of CHM format v3

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1505: Fix Version/s: (was: 1.15) 1.16 > chmparser breaks down when

[jira] [Updated] (TIKA-1329) Add RecursiveParserWrapper aka Jukka's (and Nick's) RecursiveMetadataParser

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1329: Fix Version/s: (was: 1.15) 1.16 > Add RecursiveParserWrapper aka

[jira] [Updated] (TIKA-1577) NetCDF Data Extraction

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1577: Fix Version/s: (was: 1.15) 1.16 > NetCDF Data Extraction >

[jira] [Updated] (TIKA-1379) error in Tika().detect for xml files with xades signature

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1379: Fix Version/s: (was: 1.15) 1.16 > error in Tika().detect for xml

[jira] [Updated] (TIKA-1800) MediaType#parse does not decode escaped special characters

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1800: Fix Version/s: (was: 1.15) 1.16 > MediaType#parse does not decode

[jira] [Updated] (TIKA-1808) Head section closed too eager

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1808: Fix Version/s: (was: 1.15) 1.16 > Head section closed too eager >

[jira] [Updated] (TIKA-1609) Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1609: Fix Version/s: (was: 1.15) 1.16 > Leverage Google's LibPhonenumber

[jira] [Updated] (TIKA-1640) Make ExternalParser support aliases for key names in extracted metadata

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1640: Fix Version/s: (was: 1.15) 1.16 > Make ExternalParser support aliases

[jira] [Updated] (TIKA-980) MicrodataContentHandler for Apache Tika

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-980: --- Fix Version/s: (was: 1.15) 1.16 > MicrodataContentHandler for Apache

[jira] [Updated] (TIKA-1295) Make some Dublin Core items multi-valued

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1295: Fix Version/s: (was: 1.15) 1.16 > Make some Dublin Core items

[jira] [Resolved] (TIKA-2016) A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-2016. - Resolution: Fixed this is fixed - thanks to [~thammegowda]! > A parser that combines

[jira] [Updated] (TIKA-539) Encoding detection is too biased by encoding in meta tag

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-539: --- Fix Version/s: (was: 1.15) 1.16 > Encoding detection is too biased by

[jira] [Updated] (TIKA-1465) Implement extraction of non-global variables from netCDF3 and netCDF4

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1465: Fix Version/s: (was: 1.15) 1.16 > Implement extraction of non-global

[jira] [Updated] (TIKA-2016) A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2016: Fix Version/s: (was: 1.16) 1.15 > A parser that combines Apache

[jira] [Updated] (TIKA-1390) Create tika-example module

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1390: Fix Version/s: (was: 1.15) 1.16 > Create tika-example module >

[jira] [Updated] (TIKA-1454) Extracting as HTML loses links in xlsx, ppt, and pptx files

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1454: Fix Version/s: (was: 1.15) 1.16 > Extracting as HTML loses links in

[jira] [Updated] (TIKA-1616) Tika Parser for GIBS Metadata

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1616: Fix Version/s: (was: 1.15) 1.16 > Tika Parser for GIBS Metadata >

[jira] [Updated] (TIKA-2016) A parser that combines Apache OpenNLP and Apache Tika and provides facilities for automatically deriving sentiment from text.

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2016: Fix Version/s: (was: 1.15) 1.16 > A parser that combines Apache

[jira] [Updated] (TIKA-1952) Access Date is getting modified while capturing the MetaData information using AutoDetectParser

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1952: Fix Version/s: (was: 1.15) 1.16 > Access Date is getting modified

[jira] [Updated] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-2298: Fix Version/s: (was: 1.15) 1.16 > To improve object recognition

[jira] [Updated] (TIKA-1706) Bring back commons-io to tika-core

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1706: Fix Version/s: (was: 1.15) 1.16 > Bring back commons-io to tika-core

[jira] [Updated] (TIKA-1220) Parser implementration for IFC files

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1220: Fix Version/s: (was: 1.15) 1.16 > Parser implementration for IFC

[jira] [Updated] (TIKA-1108) Represent individual slides in pptx

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1108: Fix Version/s: (was: 1.15) 1.16 > Represent individual slides in pptx

[jira] [Updated] (TIKA-1308) Support in memory parse mode(don't create temp file): to support run Tika in GAE

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1308: Fix Version/s: (was: 1.15) 1.16 > Support in memory parse mode(don't

[jira] [Updated] (TIKA-715) Some parsers produce non-well-formed XHTML SAX events

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-715: Fix Version/s: (was: 1.15) 1.16 > Some parsers produce non-well-formed

[jira] [Updated] (TIKA-1328) Translate Metadata and Content

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1328: Fix Version/s: (was: 1.15) 1.16 > Translate Metadata and Content >

[jira] [Updated] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-988: Fix Version/s: (was: 1.15) 1.16 > We don't extract a placeholder for a

[jira] [Updated] (TIKA-1697) Parser Implementation for AkomaNtoso Legal XML Documents

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1697: Fix Version/s: (was: 1.15) 1.16 > Parser Implementation for

[jira] [Updated] (TIKA-1598) Parser Implementation for Streaming Video

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1598: Fix Version/s: (was: 1.15) 1.16 > Parser Implementation for Streaming

[jira] [Updated] (TIKA-1518) Docker with Tika Server

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1518: Fix Version/s: (was: 1.15) 1.16 > Docker with Tika Server >

[jira] [Updated] (TIKA-1953) tika-server NullPointerException while processing rtfs

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1953: Fix Version/s: (was: 1.15) 1.16 > tika-server NullPointerException

[jira] [Updated] (TIKA-774) ExifTool Parser

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-774: Fix Version/s: (was: 1.15) 1.16 > ExifTool Parser >

[jira] [Updated] (TIKA-1425) Automatic batching of Microsoft service calls

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1425: Fix Version/s: (was: 1.15) 1.16 > Automatic batching of Microsoft

[jira] [Updated] (TIKA-1417) Create Extract Embedded Images from PDFs Example

2017-05-21 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1417: Fix Version/s: (was: 1.15) 1.16 > Create Extract Embedded Images from