[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390381#comment-16390381
 ] 

Hudson commented on TIKA-1518:
--

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #216 (See 
[https://builds.apache.org/job/tika-2.x-windows/216/])
TIKA-1518 -- turn dockerfile-maven-plugin back on.  Accidentally (tallison: rev 
ca19696657cca2ec83160f9a16cbb36bfc35cde6)
* (edit) tika-server/pom.xml


> Docker with Tika Server
> ---
>
> Key: TIKA-1518
> URL: https://issues.apache.org/jira/browse/TIKA-1518
> Project: Tika
>  Issue Type: New Feature
>Reporter: Paul Ramirez
>Assignee: Dave Meikle
>Priority: Major
> Fix For: 2.0, 1.17
>
> Attachments: tika-server-docker-err-msg.txt
>
>
> This version should be able to demonstrate as many of Apache Tika's 
> capabilities as possible. For instance with GDAL, Tesseract, and FFmpeg to 
> show parsers which require installation of other dependencies. In addition, 
> this should help move TIKA-1301 forward and should leverage the suggestion 
> made by [~lewismc] of a script which can pull down the latest version of 
> Apache Tika.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390351#comment-16390351
 ] 

Hudson commented on TIKA-1518:
--

SUCCESS: Integrated in Jenkins build tika-branch-1x #6 (See 
[https://builds.apache.org/job/tika-branch-1x/6/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: 
[https://github.com/apache/tika/commit/42aa774f1e1d232ee9f98b58ace9f0417231716b])
* (edit) tika-server/pom.xml




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390335#comment-16390335
 ] 

Hudson commented on TIKA-1518:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1454 (See 
[https://builds.apache.org/job/Tika-trunk/1454/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: 
[https://github.com/apache/tika/commit/deb9e96f29d3a322804016d4533bb76de7c40e2c])
* (edit) tika-server/pom.xml
TIKA-1518 -- turn dockerfile-maven-plugin back on.  Accidentally (tallison: 
[https://github.com/apache/tika/commit/ca19696657cca2ec83160f9a16cbb36bfc35cde6])
* (edit) tika-server/pom.xml




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390317#comment-16390317
 ] 

Dave Meikle commented on TIKA-1518:
---

[~talli...@mitre.org] - ah it looks like the proxy settings aren't being passed 
into the Docker container.

Normally I've passed proxy settings via buildArgs to docker but I am not sure 
how this is handled by the Maven plugin.  I've not done docker behind a proxy 
for a while.

Can you try -X on the maven command to see what is being set?
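
For reference, passing a proxy into a plain docker build usually looks something like the sketch below. It assumes the proxy is already exported in the conventional http_proxy / https_proxy / no_proxy environment variables (Docker accepts these as predefined build args). The helper name proxy_build_args is made up for illustration, and whether the Maven plugin forwards anything equivalent is exactly the open question here. The JAR_FILE value follows the buildArgs naming used in the pom for the 2.0.0-SNAPSHOT build.

```shell
#!/bin/sh
# Hypothetical helper: print one --build-arg flag for each proxy variable
# that is set (and non-empty) in the current environment.
proxy_build_args() {
  for name in http_proxy https_proxy no_proxy; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      printf ' --build-arg %s=%s' "$name" "$value"
    fi
  done
}

# Dry run: print the docker command instead of executing it.
echo "docker build$(proxy_build_args) --build-arg JAR_FILE=tika-server-2.0.0-SNAPSHOT.jar -t apache/tika-server:2.0.0-SNAPSHOT ."
```

Comparing the printed command with what mvn -X reports for the plugin should show whether the proxy values are reaching the build at all.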



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390309#comment-16390309
 ] 

Hudson commented on TIKA-1518:
--

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #215 (See 
[https://builds.apache.org/job/tika-2.x-windows/215/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: 
rev deb9e96f29d3a322804016d4533bb76de7c40e2c)
* (edit) tika-server/pom.xml




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390284#comment-16390284
 ] 

Dave Meikle commented on TIKA-1518:
---

It is a choice we have to make. There are three main routes to Docker
packaging that I have used:
 # Automated builds that pull in pre-packaged artefacts and bundle them into an
image on any change in a repository - like what we are doing in the
docker-tikaserver approach, where it goes and downloads the signed JARs
 # Automated builds that compile the code in the image (e.g. using the maven
Docker image) and then package it
 # Building a release image and then distributing that - which is what this
does, but it requires us to decide when an official release is available and
push it somewhere

The first and second are really good for leveraging things like Docker Hub to
automatically build from your repository, whereas the third means you have to
have Docker on your machine when you want to build an image.

I have never really liked number two, as it recompiles the code each time a
change triggers a build, so you can easily end up packaging different code as
the same version without realising it.

The challenge with the approach in docker-tikaserver is keeping up when assets
that are being pulled in move - e.g. when a release JAR is moved from
dist.apache.org - but that could easily be solved by going to Nexus for the
JARs based on the release packages.
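
As a rough sketch of the "go to a Maven repository for the JARs" idea: release artefacts sit at predictable paths under the standard Maven repository layout, so the download URL can be derived from the version alone. The repository root below (Maven Central) and the helper name release_jar_url are assumptions for illustration; an Apache Nexus URL would slot in the same way.

```shell
#!/bin/sh
# Assumed repository root; any Maven-layout repository works the same way.
REPO_ROOT="https://repo1.maven.org/maven2"

# Hypothetical helper: build the URL of the tika-server release JAR for a
# given version, using the groupId/artifactId/version directory layout.
release_jar_url() {
  version="$1"
  echo "$REPO_ROOT/org/apache/tika/tika-server/$version/tika-server-$version.jar"
}

url="$(release_jar_url 1.17)"

# Dry run: print the download and signature-check steps instead of running them.
echo "curl -sSL $url -o tika-server.jar"
echo "curl -sSL $url.asc -o tika-server.jar.asc"
echo "gpg --verify tika-server.jar.asc tika-server.jar"
```

Because the path depends only on the version, the image build would not break when a release is moved off the dist area.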

I personally quite like the third approach, as it means you explicitly create
an image that has its own life, and I was thinking that we could potentially
add this to the release process, pushing the image from the release build to
Docker Hub/Nexus/another repo so it is an official build.

Not sure what others think?



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390274#comment-16390274
 ] 

Tim Allison commented on TIKA-1518:
---

Your 
[commit|https://github.com/apache/tika/commit/deb9e96f29d3a322804016d4533bb76de7c40e2c#diff-332a9cfb880c4a30e2abc7af93035120]
 sure fixed it by turning it off.  :D  



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390275#comment-16390275
 ] 

Tim Allison commented on TIKA-1518:
---

And sorry for letting the  slip through!!!



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390268#comment-16390268
 ] 

Tim Allison commented on TIKA-1518:
---

Not quite, different error this time (see attached file)...could be user error, 
I have no doubt!

OTOH, do we want to require Docker on devs' computers?



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390241#comment-16390241
 ] 

Dave Meikle commented on TIKA-1518:
---

{quote}I do have Docker installed, [0] but it is Windows, and I've noticed 
some, um, areas for improvement in Docker on Windows.
{quote}
I've found on Windows I have had to enable the "Expose daemon on 
tcp://localhost:2375 without TLS" in Docker for Windows to talk to it with many 
of the clients out there. Does this solve it for you?
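
A quick way to check this from a shell, as a sketch: the Spotify docker-client seen in the stack trace is understood to take its endpoint from the DOCKER_HOST environment variable when one is set, so pointing it at the exposed port and testing the connection with the docker CLI first can narrow things down. The function name docker_endpoint is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report the endpoint a Docker client would use,
# falling back to the unencrypted local TCP port from the setting above.
docker_endpoint() {
  echo "${DOCKER_HOST:-tcp://localhost:2375}"
}

DOCKER_HOST="$(docker_endpoint)"
export DOCKER_HOST
echo "Using Docker daemon at $DOCKER_HOST"

# If the docker CLI is installed, ask the daemon for its version; a
# "connection refused" here suggests the port is not exposed.
if command -v docker >/dev/null 2>&1; then
  docker version || echo "Could not reach the daemon at $DOCKER_HOST"
fi
```

If the CLI check succeeds but the Maven build still fails, the problem is more likely in how the plugin is configured than in Docker itself.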



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390216#comment-16390216
 ] 

Tim Allison commented on TIKA-1518:
---

bq.  this is me getting too excited

?!
 
I do have Docker installed, but it is Windows, and I've noticed some, um, areas 
for improvement in Docker on Windows.

Thank you!



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390202#comment-16390202
 ] 

Dave Meikle commented on TIKA-1518:
---

Sorry [~talli...@mitre.org] - this is me getting too excited. I'll need to 
remove it from being hooked on the "build" phase so those without Docker can 
build without this!

Will do this just now.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-07 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389532#comment-16389532
 ] 

Tim Allison commented on TIKA-1518:
---

Hi [~davemeikle], with the new dockerfile-maven-plugin, I'm getting the
following.  I'm behind a proxy, and I'm on Windows, but you'd think localhost
would work?!  Any recommendations?  Thank you!

{noformat}
[INFO] --- dockerfile-maven-plugin:1.3.7:build (default) @ tika-server ---
[INFO] Building Docker context C:\Users\tallison\Idea Projects\tika-asf2-git-2.x\tika-server
[INFO]
[INFO] Image will be built as apache/tika-server:2.0.0-SNAPSHOT
[INFO]
[WARNING] An attempt failed, will retry 1 more times
org.apache.maven.plugin.MojoExecutionException: Could not build image
    at com.spotify.plugin.dockerfile.BuildMojo.buildImage(BuildMojo.java:185)
    at com.spotify.plugin.dockerfile.BuildMojo.execute(BuildMojo.java:105)
    at com.spotify.plugin.dockerfile.AbstractDockerMojo.tryExecute(AbstractDockerMojo.java:246)
    at com.spotify.plugin.dockerfile.AbstractDockerMojo.execute(AbstractDockerMojo.java:235)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:309)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:194)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:107)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:993)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:345)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:191)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: com.spotify.docker.client.exceptions.DockerException: java.util.concurrent.ExecutionException: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
    at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:2512)
    at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:2443)
    at com.spotify.docker.client.DefaultDockerClient.version(DefaultDockerClient.java:501)
    at com.spotify.docker.client.DefaultDockerClient.authRegistryHeader(DefaultDockerClient.java:2555)
    at com.spotify.docker.client.DefaultDockerClient.build(DefaultDockerClient.java:1396)
    at com.spotify.docker.client.DefaultDockerClient.build(DefaultDockerClient.java:1365)
    at com.spotify.plugin.dockerfile.BuildMojo.buildImage(BuildMojo.java:178)
    ... 25 more
Caused by: java.util.concurrent.ExecutionException: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:2441)
    ... 30 more
Caused by: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: 

[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385176#comment-16385176
 ] 

Hudson commented on TIKA-1518:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1444 (See 
[https://builds.apache.org/job/Tika-trunk/1444/])
TIKA-1518: Updated the README and changed image name to tika-server for 
(dmeikle: 
[https://github.com/apache/tika/commit/ad3d763cf1f883518aa5da7f4a65842337409870])
* (edit) tika-server/README.md
* (edit) tika-server/pom.xml




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385167#comment-16385167
 ] 

Dave Meikle commented on TIKA-1518:
---

As the current Dockerfile was out of date, I've updated it to use the build 
artefacts to create the docker image. This means you can run the following in 
the tika-server project:

{{mvn package dockerfile:build}}
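
A sketch of how the result could be smoke-tested once built. The image name and tag are assumptions taken from the plugin output reported on this issue for the 2.0.0-SNAPSHOT build (they follow the Maven project version), port 9998 comes from the Dockerfile's EXPOSE line, and /version is the tika-server version endpoint. The helper name image_name is made up for illustration, and the steps are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: compose the image reference for a given project version.
image_name() {
  echo "apache/tika-server:$1"
}

IMAGE="$(image_name 2.0.0-SNAPSHOT)"

# Dry run: print the build, run and check steps instead of executing them.
cat <<EOF
mvn package dockerfile:build
docker run -d -p 9998:9998 $IMAGE
curl http://localhost:9998/version
EOF
```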

We can set up the POM to push to Docker Hub during the deploy stage, which can
be executed at release time so that we always release a tagged version that
can be used.

Will speak to INFRA about getting access to an account owned by the Apache 
organisation.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385146#comment-16385146
 ] 

Hudson commented on TIKA-1518:
--

SUCCESS: Integrated in Jenkins build Tika-trunk #1443 (See 
[https://builds.apache.org/job/Tika-trunk/1443/])
TIKA-1518: Add local docker build based on dockerfile-maven-plugin (dmeikle: 
[https://github.com/apache/tika/commit/d653f95174e23deb3166459f1bb8eb75073afaae])
* (edit) tika-server/Dockerfile
* (edit) tika-server/pom.xml
TIKA-1518: Updated CHANGES file to include description (dmeikle: 
[https://github.com/apache/tika/commit/cebace98062cfa15bdc04f096b8c7586671738b5])
* (edit) CHANGES.txt




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385112#comment-16385112
 ] 

ASF GitHub Bot commented on TIKA-1518:
--

dameikle closed pull request #227: TIKA-1518: Add local docker build based on 
dockerfile-maven-plugin
URL: https://github.com/apache/tika/pull/227
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/tika-server/Dockerfile b/tika-server/Dockerfile
index f197d142e..279fff0d1 100644
--- a/tika-server/Dockerfile
+++ b/tika-server/Dockerfile
@@ -13,25 +13,20 @@
 #  specific language governing permissions and limitations
 #  under the License.
 
-FROM ubuntu:latest
+FROM ubuntu:xenial
 MAINTAINER Apache Tika Team
 
-ENV TIKA_VERSION 1.7
-ENV TIKA_SERVER_URL https://www.apache.org/dist/tika/tika-server-$TIKA_VERSION.jar
-
 RUN apt-get update \
-   && apt-get install openjdk-7-jre-headless curl gdal-bin tesseract-ocr \
-   tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu -y \
-   && curl -sSL https://people.apache.org/keys/group/tika.asc -o /tmp/tika.asc \
-   && gpg --import /tmp/tika.asc \
-   && curl -sSL "$TIKA_SERVER_URL.asc" -o /tmp/tika-server-${TIKA_VERSION}.jar.asc \
-   && NEAREST_TIKA_SERVER_URL=$(curl -sSL http://www.apache.org/dyn/closer.cgi/${TIKA_SERVER_URL#https://www.apache.org/dist/}\?asjson\=1 \
-   | awk '/"path_info": / { pi=$2; }; /"preferred":/ { pref=$2; }; END { print pref " " pi; };' \
-   | sed -r -e 's/^"//; s/",$//; s/" "//') \
-   && echo "Nearest mirror: $NEAREST_TIKA_SERVER_URL" \
-   && curl -sSL "$NEAREST_TIKA_SERVER_URL" -o /tika-server-${TIKA_VERSION}.jar \
-   && gpg --verify /tmp/tika-server-${TIKA_VERSION}.jar.asc /tika-server-${TIKA_VERSION}.jar \
+   && apt-get install openjdk-8-jre-headless curl gdal-bin tesseract-ocr \
+  tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu -y \
    && apt-get clean -y && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
 
+ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
+RUN export JAVA_HOME
+
+ARG JAR_FILE
+ADD target/${JAR_FILE} /tika-server.jar
+
 EXPOSE 9998
-ENTRYPOINT java -jar /tika-server-${TIKA_VERSION}.jar -h 0.0.0.0
+ENTRYPOINT java -jar /tika-server.jar -h 0.0.0.0
+
diff --git a/tika-server/pom.xml b/tika-server/pom.xml
index 985387951..6fc38694e 100644
--- a/tika-server/pom.xml
+++ b/tika-server/pom.xml
@@ -260,6 +260,26 @@
   
 
   
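+      <plugin>
+        <groupId>com.spotify</groupId>
+        <artifactId>dockerfile-maven-plugin</artifactId>
+        <version>1.3.7</version>
+        <executions>
+          <execution>
+            <id>default</id>
+            <goals>
+              <goal>build</goal>
+            </goals>
+          </execution>
+        </executions>
+        <configuration>
+          <repository>apache/tika</repository>
+          <tag>${project.version}</tag>
+          <buildArgs>
+            <JAR_FILE>tika-server-${project.version}.jar</JAR_FILE>
+          </buildArgs>
+        </configuration>
+      </plugin>
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-jar-plugin</artifactId>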


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2018-03-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385088#comment-16385088
 ] 

ASF GitHub Bot commented on TIKA-1518:
--

dameikle opened a new pull request #227: TIKA-1518: Add local docker build 
based on dockerfile-maven-plugin
URL: https://github.com/apache/tika/pull/227
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-08-01 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650316#comment-14650316
 ] 

Dave Meikle commented on TIKA-1518:
---

Moved to 1.11 so that the work to get onto DockerHub stays tracked.

 Docker with Tika Server
 ---

 Key: TIKA-1518
 URL: https://issues.apache.org/jira/browse/TIKA-1518
 Project: Tika
  Issue Type: New Feature
Reporter: Paul Ramirez
 Fix For: 1.11





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-31 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299805#comment-14299805
 ] 

Dave Meikle commented on TIKA-1518:
---

Right folks, I have added the Dockerfile to the tika-server root. Thinking about 
this more, I am not sure we can easily ship an image as an official artefact - 
maybe others have a view on that - so I will just keep mine kicking around until 
we get something more official.

In theory, if we could set up an organisation in Dockerhub and add the automatic 
commit hooks to our GitHub organisation profile, we could auto-build new images 
- but this all depends on the stance legal would have.

[~tpalsulich] - have you heard anything back on this?

 Docker with Tika Server
 ---

 Key: TIKA-1518
 URL: https://issues.apache.org/jira/browse/TIKA-1518
 Project: Tika
  Issue Type: New Feature
Reporter: Paul Ramirez
 Fix For: 1.8


 This version should be able to demonstrate as many of Apache Tika's 
 capabilities as possible. For instance with GDAL, Tesseract, and FFmpeg to 
 show parsers which require installation of other dependencies. In addition, 
 this should help move TIKA-1301 forward and should leverage the suggestion 
 made by [~lewismc] of a script which can pull down the latest version of 
 Apache Tika.





[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299843#comment-14299843
 ] 

Hudson commented on TIKA-1518:
--

SUCCESS: Integrated in tika-trunk-jdk1.7 #463 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.7/463/])
TIKA-1518: Added Dockerfile to support building a Tika Server image (dmeikle: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1656191)
* /tika/trunk/tika-server/Dockerfile




[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-31 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299961#comment-14299961
 ] 

Chris A. Mattmann commented on TIKA-1518:
-

Sure. As for images, I think we should use whatever infrastructure ASF infra 
sets up for us - I bet they'll have a place for us to deal with images. Rather 
than waiting for them to decide, I recommend joining the 
infrastruct...@apache.org list (if you haven't already), joining the 
conversation, giving them a real use case (TIKA-1518), and describing our needs 
and requirements. I'm sure [~ke4qqq] and team will be happy to listen and help.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-30 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299697#comment-14299697
 ] 

Dave Meikle commented on TIKA-1518:
---

Sorry gang, I've been travelling a lot.

#1, Totally up for adding this to tika-server, and I think we may be able to do 
the automated build by linking Docker Hub to the Tika Git mirrors, but I haven't 
done this before. Will give it a try when I commit the Dockerfile.

#3, [~tpalsulich] I have a little AngularJS app that I use for something very 
similar. I could brand it up and add some functionality so it acts as a nice 
front-end app for tika-server. Let me clean it up and stick it in too.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-29 Thread Tyler Palsulich (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297614#comment-14297614
 ] 

Tyler Palsulich commented on TIKA-1518:
---

2. Sent a message. Andrew Bayer responded saying it's in the works -- should 
have more information next month.

3. In my opinion, TIKA-1302 is more about backend testing of Tika. I'm referring 
to a frontend where someone can upload a file to the server and see what Tika 
pulls out of it. I'll open a new sub-issue of TIKA-1302.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-29 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297623#comment-14297623
 ] 

Chris A. Mattmann commented on TIKA-1518:
-

[~tpalsulich], talk to [~talli...@apache.org]; he already got this set up with a 
donation from Rackspace. We have a VM with Tika Server on it. We just need to 
doc it and promote it. Right, Tim? Lewis and I have access, and you can too!



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-29 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298002#comment-14298002
 ] 

Tim Allison commented on TIKA-1518:
---

[~tpalsulich], yes, the server was initially intended for TIKA-1302, which should 
eventually run fairly continuously and publish results of runs of different 
versions of Tika, but it wasn't designed to be TIKA-1301.  That said, we have a 
server, thanks to Rackspace, so why not use it for TIKA-1301 now?  Send me a 
personal email with your desired userid, and I'll give you access tomorrow.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-28 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296439#comment-14296439
 ] 

Chris A. Mattmann commented on TIKA-1518:
-

Thanks, Tyler. Can you raise #2 on infrastruct...@apache.org and then keep folks 
here posted? That would be awesome. As for #1, +1 from me. Re #3, there is a 
TIKA issue on that; I think it's TIKA-1312.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-26 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291999#comment-14291999
 ] 

Konstantin Gribov commented on TIKA-1518:
-

Thank you, [~davemeikle]. It works perfectly, so it can easily be used to 
evaluate Tika.

I'll add info to the wiki if it isn't there already.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-26 Thread Paul Ramirez (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292435#comment-14292435
 ] 

Paul Ramirez commented on TIKA-1518:


Missed this over the weekend while playing with Docker, but yes, 
[~chrismattmann], it looks to be exactly what I was thinking. +1 to leaving this 
open until it's in the Apache Tika codebase. Dave, I will definitely use this 
for a project and commit updates to it.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-24 Thread Dave Meikle (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290507#comment-14290507
 ] 

Dave Meikle commented on TIKA-1518:
---

Hi [~grossws] - I have added the automated build here:
https://registry.hub.docker.com/u/logicalspark/docker-tikaserver/

Apologies for the delay, DockerHub wasn't very stable for me whilst on my 
travels.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-20 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284926#comment-14284926
 ] 

Konstantin Gribov commented on TIKA-1518:
-

I've dropped my version to avoid unnecessary duplication.

[~davemeikle], can you also create an automated build on Docker Hub? 
Instructions can be found 
[here|https://docs.docker.com/docker-hub/builds/#setting-up-automated-builds-with-github].



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-20 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284226#comment-14284226
 ] 

Chris A. Mattmann commented on TIKA-1518:
-

Guys, it looks like [~davemeikle] has already started some work on this in his 
GitHub repo:

https://github.com/LogicalSpark/docker-tikaserver

Dave, FYI on this JIRA issue; I'm not sure if it's related, I just saw your repo 
by following your GitHub. Paul R - maybe you can use this?



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-16 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280241#comment-14280241
 ] 

Konstantin Gribov commented on TIKA-1518:
-

Ok, I'll create it soon



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Paul Ramirez (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279927#comment-14279927
 ] 

Paul Ramirez commented on TIKA-1518:


Thanks, Konstantin, for the example. If you have the time, that would be awesome.



[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Paul Ramirez (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278424#comment-14278424
 ] 

Paul Ramirez commented on TIKA-1518:


As I build a patch, which component should this go into? Any suggestions on 
things that will need to be a part of this beyond the dependencies I've listed?





[jira] [Commented] (TIKA-1518) Docker with Tika Server

2015-01-15 Thread Konstantin Gribov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278446#comment-14278446
 ] 

Konstantin Gribov commented on TIKA-1518:
-

To pull the latest Tika you can use a snippet like mine:

{noformat}
# ...

# see https://www.apache.org/dist/tomcat/tomcat-8/KEYS
RUN gpg --keyserver pgp.mit.edu --recv-keys \
    05AB33110949707C93A279E3D3EFE6B686867BA6 \
    F7DA48BB64BCB84ECBA7EE6935CD23C10D498E23
# keylist (stripped for jira)

ENV TOMCAT_MAJOR 8
ENV TOMCAT_VERSION 8.0.15
ENV TOMCAT_TGZ_URL https://www.apache.org/dist/tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz

# Ask closer.cgi for the preferred mirror, download the tarball from it,
# then fetch the signature from the canonical site and verify before unpacking.
RUN NEAREST_TOMCAT_TGZ_URL=$(curl -sSL http://www.apache.org/dyn/closer.cgi/${TOMCAT_TGZ_URL#https://www.apache.org/dist/}\?asjson\=1 \
        | awk '/"path_info": / { pi=$2; }; /"preferred":/ { pref=$2; }; END { print pref " " pi; };' \
        | sed -r -e 's/^"//; s/",$//; s/" "//') \
    && echo Nearest mirror: $NEAREST_TOMCAT_TGZ_URL \
    && curl -sSL $NEAREST_TOMCAT_TGZ_URL -o tomcat.tar.gz \
    && curl -sSL $TOMCAT_TGZ_URL.asc -o tomcat.tar.gz.asc \
    && gpg --verify tomcat.tar.gz.asc \
    && tar -xvf tomcat.tar.gz --strip-components=1
{noformat}
The full Dockerfile can be viewed on GitHub 
(https://github.com/grossws/docker-comp-tomcat8/blob/master/Dockerfile)

If you want, I can make a Docker image and an automated build for it.
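
For anyone adapting the mirror lookup above, here is a minimal shell sketch of just the URL handling, runnable without network access. It assumes the closer.cgi JSON puts each key on its own line with quoted "path_info" and "preferred" string values. The sample response and the mirror.example.org host are made up for illustration, and the function names closer_url and nearest_url are hypothetical, not part of the original Dockerfile. It also uses `sed -E` with slightly more tolerant patterns than the snippet's `sed -r`.

```shell
#!/bin/sh
# Illustrative sketch only; function names and sample data are hypothetical.

DIST_PREFIX="https://www.apache.org/dist/"
TOMCAT_TGZ_URL="https://www.apache.org/dist/tomcat/tomcat-8/v8.0.15/bin/apache-tomcat-8.0.15.tar.gz"

# Build the closer.cgi lookup URL by stripping the dist prefix from the
# canonical download URL (POSIX "${var#prefix}" expansion).
closer_url() {
  printf 'http://www.apache.org/dyn/closer.cgi/%s?asjson=1\n' "${1#"$DIST_PREFIX"}"
}

# Read closer.cgi-style JSON on stdin and print "<preferred mirror><path_info>".
# awk grabs the two quoted values; sed drops the quotes, any trailing comma,
# and the separator between the two values.
nearest_url() {
  awk '/"path_info": / { pi = $2 }; /"preferred":/ { pref = $2 }; END { print pref " " pi }' \
    | sed -E -e 's/^"//; s/",?$//; s/",? "//'
}

# Made-up response in the assumed shape (not captured from a real mirror).
sample_json='{
  "path_info": "tomcat/tomcat-8/v8.0.15/bin/apache-tomcat-8.0.15.tar.gz",
  "preferred": "http://mirror.example.org/apache/"
}'

closer_url "$TOMCAT_TGZ_URL"
printf '%s\n' "$sample_json" | nearest_url
```

In a Dockerfile, the output of nearest_url would feed the `curl -sSL ... -o tomcat.tar.gz` step, while the `.asc` signature is still fetched from the canonical www.apache.org URL so that the download from the mirror can be verified.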
