tika-trunk-jdk1.7 - Build # 733 - Failure

2015-06-06 Thread Apache Jenkins Server
The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #733). Status: Failure. Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/733/ to view the results.

Re: tika-trunk-jdk1.7 - Build # 733 - Failure

2015-06-06 Thread Mattmann, Chris A (3980)
This was due to the SVN issues that infra was dealing with last night. I’ll go ahead and spin RC #2 shortly. ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion

Re: Configuring parsers and translators

2015-06-06 Thread Tyler Palsulich
Hi Nick, I've been mulling this over since you sent the first message. But, I'm afraid I don't have a good solution or developed ideas. I agree, it would be very nice to consolidate all configuration for all parsers in the server and app. Is it feasible to put everything into tika-config? Then

Re: Configuring parsers and translators

2015-06-06 Thread Nick Burch
On Sat, 6 Jun 2015, Tyler Palsulich wrote: (Devil's advocate hat slightly on.) My one hesitation about putting it all into tika-config is that the default might get to be a monstrosity -- difficult for new users to use. Assuming you don't want any translators, and have no non-standard paths

[jira] [Commented] (TIKA-1652) Tika Server should allow config file override from the command line like Tika App

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575993#comment-14575993 ] Chris A. Mattmann commented on TIKA-1652: - +1, agreed. I'll wrap them both up

[jira] [Commented] (TIKA-1652) Tika Server should allow config file override from the command line like Tika App

2015-06-06 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575986#comment-14575986 ] Tyler Palsulich commented on TIKA-1652: --- I think this is a duplicate of TIKA-1426?

Re: Configuring parsers and translators

2015-06-06 Thread Mattmann, Chris A (3980)
Hey Tyler, I hear you, but balance that against all the hidden things here and there, and everywhere, that I constantly keep discovering and having to pore through lines of TikaConfig - service loaders, class loaders. When things work right - no problem. When something goes wrong; HUGE waste of

[jira] [Commented] (TIKA-1645) Extraction of biomedical information using CTAKESParser

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575996#comment-14575996 ] Chris A. Mattmann commented on TIKA-1645: - I got this working with both tika-app

[jira] [Assigned] (TIKA-1645) Extraction of biomedical information using CTAKESParser

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-1645: --- Assignee: Chris A. Mattmann (was: Giuseppe Totaro) Extraction of biomedical

[jira] [Resolved] (TIKA-1645) Extraction of biomedical information using CTAKESParser

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1645. - Resolution: Fixed Fix Version/s: (was: 1.10) 1.9

[jira] [Resolved] (TIKA-1642) Integrate cTAKES into Tika

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1642. - Resolution: Fixed Fix Version/s: 1.9 Assignee: Chris A. Mattmann (was:

[jira] [Commented] (TIKA-1642) Integrate cTAKES into Tika

2015-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576051#comment-14576051 ] Hudson commented on TIKA-1642: -- ABORTED: Integrated in tika-trunk-jdk1.7 #734 (See

[jira] [Commented] (TIKA-1652) Tika Server should allow config file override from the command line like Tika App

2015-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576053#comment-14576053 ] Hudson commented on TIKA-1652: -- ABORTED: Integrated in tika-trunk-jdk1.7 #734 (See

[jira] [Commented] (TIKA-1426) Let's allow users to specify a tika config file on the commandline for tika-app and tika-server

2015-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576052#comment-14576052 ] Hudson commented on TIKA-1426: -- ABORTED: Integrated in tika-trunk-jdk1.7 #734 (See

[jira] [Commented] (TIKA-1645) Extraction of biomedical information using CTAKESParser

2015-06-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14576054#comment-14576054 ] Hudson commented on TIKA-1645: -- ABORTED: Integrated in tika-trunk-jdk1.7 #734 (See

Re: Configuring parsers and translators

2015-06-06 Thread Tyler Palsulich
(Devil's advocate hat slightly on.) My one hesitation about putting it all into tika-config is that the default might get to be a monstrosity -- difficult for new users to use. Tyler On Sat, Jun 6, 2015 at 3:48 PM Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: I think it would

[jira] [Resolved] (TIKA-1426) Let's allow users to specify a tika config file on the commandline for tika-app and tika-server

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1426. - Resolution: Fixed Fix Version/s: (was: 1.10) 1.9

Re: [VOTE] Release Apache Tika 1.9 Candidate #1

2015-06-06 Thread David Meikle
Hey Chris, On 1 Jun 2015, at 06:38, Mattmann, Chris A (3980) chris.a.mattm...@jpl.nasa.gov wrote: Please vote on releasing this package as Apache Tika 1.9. The vote is open for the next 72 hours and passes if a majority of at least three +1 Tika PMC votes are cast. [ ] +1 Release this

Re: svn commit: r1683969 - /tika/trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser

2015-06-06 Thread Mattmann, Chris A (3980)
Also the lovely thing here too is that since cTAKESParser is a decorator for AutoDetectParser there is magical infinite recursion if it’s enabled via SPI. TODO: make this a LOT cleaner in 1.10+. ++ Chris Mattmann, Ph.D. Chief
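
Archive note: the recursion described in this message can be illustrated with a minimal, self-contained sketch. It is not Tika code. The class names PlainTextParser, AutoDetect and AnnotatingDecorator and the plain-list registry are hypothetical stand-ins, and the sketch assumes the auto-detecting parser simply dispatches to the first registered parser. It shows only the general failure mode: a decorator that wraps the auto-detecting parser calls itself again once it is also placed in the registry that the auto-detecting parser dispatches to.

```python
class PlainTextParser:
    """Hypothetical leaf parser: does the real work."""

    def parse(self, text, registry):
        return [text.upper()]


class AutoDetect:
    """Hypothetical composite: dispatches to the first registered parser."""

    def parse(self, text, registry):
        return registry[0].parse(text, registry)


class AnnotatingDecorator:
    """Hypothetical decorator: delegates to AutoDetect, then adds output."""

    def parse(self, text, registry):
        result = AutoDetect().parse(text, registry)
        return result + ["annotated"]


def run(registry, text="abc"):
    """Parse via AutoDetect and report a recursion blow-up instead of crashing."""
    try:
        return AutoDetect().parse(text, registry)
    except RecursionError:
        return "infinite recursion"


# Decorator used explicitly, outside the registry: terminates normally.
print(AnnotatingDecorator().parse("abc", [PlainTextParser()]))

# Decorator registered alongside the other parsers: AutoDetect picks the
# decorator, which calls AutoDetect again, and so on without end.
print(run([AnnotatingDecorator(), PlainTextParser()]))
```

Under these assumptions, keeping the decorator out of the automatically loaded registry and wiring it in explicitly avoids the loop.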

Re: Configuring parsers and translators

2015-06-06 Thread Mattmann, Chris A (3980)
I think it would be great to have all this in the Tika Config. The one thing then is to provide an example default config and to make it *hugely* clear rather than all the levels of indirection that we currently have going on which makes it super hard when there is a config error (SPI, swallowing

[jira] [Commented] (TIKA-1645) Extraction of biomedical information using CTAKESParser

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14575997#comment-14575997 ] Chris A. Mattmann commented on TIKA-1645: - Documentation:

[jira] [Resolved] (TIKA-1652) Tika Server should allow config file override from the command line like Tika App

2015-06-06 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1652. - Resolution: Fixed - Fixed: {noformat} bash-3.2$ svn commit -m Fix for TIKA-1652,

[VOTE] Release Apache Tika 1.9 Candidate #2

2015-06-06 Thread Mattmann, Chris A (3980)
Hi Folks, A second candidate for the Tika 1.9 release is available at: https://dist.apache.org/repos/dist/dev/tika/ The release candidate is a zip archive of the sources in: http://svn.apache.org/repos/asf/tika/tags/1.9-rc2/ The SHA1 checksum of the archive is

[jira] [Created] (TIKA-1652) Tika Server should allow config file override from the command line like Tika App

2015-06-06 Thread Chris A. Mattmann (JIRA)
Chris A. Mattmann created TIKA-1652: --- Summary: Tika Server should allow config file override from the command line like Tika App Key: TIKA-1652 URL: https://issues.apache.org/jira/browse/TIKA-1652

Re: Configuring parsers and translators

2015-06-06 Thread Nick Burch
Anyone have any thoughts on this? On Fri, 8 May 2015, Nick Burch wrote: Hi All This came up in TIKA-1623, but I thought it might be better brought out to the list for discussion To configure parsers on a per-document basis, such as setting PDF spacing tolerances, or telling Tesseract what