Re: Resource Sharing Tika Corpus with Any23

2018-11-30 Thread Lewis John McGibbney
Hi Tim, Thanks for the reply... answer inline On 2018/11/30 19:22:23, Tim Allison wrote: > I think that'd be great. Some questions: > > 1) Would you use the same input docs that we're using or would you > need/want a new TB drive for your input/output? The same docs I suspect. We *could*

Re: 1.20?

2018-11-30 Thread loompa
Hi, On Wed, 21 Nov 2018 at 13:00, Tim Allison wrote: > Dave, > Should I try to get the Docker plugin working again? > That would be great. I think I may have gone down the wrong path building an image at package time, as there doesn't seem to be an easy way to publish it as an Apache labelled

[jira] [Commented] (TIKA-2550) ToTextHandler includes element content

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705393#comment-16705393 ] Hudson commented on TIKA-2550: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1602 (See

[jira] [Commented] (TIKA-2550) ToTextHandler includes element content

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705363#comment-16705363 ] Hudson commented on TIKA-2550: -- UNSTABLE: Integrated in Jenkins build tika-2.x-windows #355 (See

[jira] [Commented] (TIKA-2550) ToTextHandler includes element content

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705343#comment-16705343 ] Hudson commented on TIKA-2550: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #134 (See

[jira] [Resolved] (TIKA-2550) ToTextHandler includes element content

2018-11-30 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-2550. --- Resolution: Fixed Assignee: Tim Allison Fix Version/s: 1.20 2.0.0

[jira] [Commented] (TIKA-2776) Tika server child restart

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705283#comment-16705283 ] Hudson commented on TIKA-2776: -- SUCCESS: Integrated in Jenkins build tika-branch-1x #133 (See

[jira] [Commented] (TIKA-2776) Tika server child restart

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705281#comment-16705281 ] Hudson commented on TIKA-2776: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1601 (See

[jira] [Commented] (TIKA-2776) Tika server child restart

2018-11-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705251#comment-16705251 ] Hudson commented on TIKA-2776: -- UNSTABLE: Integrated in Jenkins build tika-2.x-windows #354 (See

Re: Resource Sharing Tika Corpus with Any23

2018-11-30 Thread Tim Allison
I think that'd be great. Some questions: 1) Would you use the same input docs that we're using or would you need/want a new TB drive for your input/output? How much space will you need for your eval framework including outputs? 2) Would you be willing to coordinate with us and PDFBox and POI

Resource Sharing Tika Corpus with Any23

2018-11-30 Thread Lewis John Mcgibbney
Hi dev@tika, Over at Any23 we have been discussing the prospect of running large-scale jobs over a significant, challenging dataset, as is done with Tika via Tika batch on the VM. Is there any possibility that a very small number of us from the Any23 team could access the VM and the dataset(s)? If

[jira] [Commented] (TIKA-2727) Parsing and detect mime type of XML file stuck in infinite loop

2018-11-30 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705058#comment-16705058 ] Tim Allison commented on TIKA-2727: --- CVE-2018-11796 http://tika.apache.org/security.html > Parsing and

[jira] [Commented] (TIKA-2727) Parsing and detect mime type of XML file stuck in infinite loop

2018-11-30 Thread David Dillard (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705055#comment-16705055 ] David Dillard commented on TIKA-2727: - Any plans to get a CVE for this issue?  A hang sounds like a

JDK 12 build 22 is now available at: jdk.java.net/12/

2018-11-30 Thread Rory O'Donnell
Hi Tim, NOTE: The JDK 12 schedule rampdown phase 1 of the release is coming up in a few weeks on Dec. 13, 2018. JDK 12 Early Access build 22 is now available at: jdk.java.net/12/ Release Note updates since last email

[jira] [Commented] (TIKA-2776) Tika server child restart

2018-11-30 Thread Mario Bisonti (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704474#comment-16704474 ] Mario Bisonti commented on TIKA-2776: - Hallo Tim. I obtained a restart of child: 2018-11-30 01:21:01