[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390381#comment-16390381 ]

Hudson commented on TIKA-1518:
------------------------------

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #216 (See [https://builds.apache.org/job/tika-2.x-windows/216/])
TIKA-1518 -- turn dockerfile-maven-plugin back on. Accidentally (tallison: rev ca19696657cca2ec83160f9a16cbb36bfc35cde6)
* (edit) tika-server/pom.xml

> Docker with Tika Server
> -----------------------
>
> Key: TIKA-1518
> URL: https://issues.apache.org/jira/browse/TIKA-1518
> Project: Tika
> Issue Type: New Feature
> Reporter: Paul Ramirez
> Assignee: Dave Meikle
> Priority: Major
> Fix For: 2.0, 1.17
>
> Attachments: tika-server-docker-err-msg.txt
>
>
> This version should be able to demonstrate as many of Apache Tika's
> capabilities as possible. For instance with GDAL, Tesseract, and FFmpeg to
> show parsers which require installation of other dependencies. In addition,
> this should help move TIKA-1301 forward and should leverage the suggestion
> made by [~lewismc] of a script which can pull down the latest version of
> Apache Tika.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390351#comment-16390351 ]

Hudson commented on TIKA-1518:
------------------------------

SUCCESS: Integrated in Jenkins build tika-branch-1x #6 (See [https://builds.apache.org/job/tika-branch-1x/6/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: [https://github.com/apache/tika/commit/42aa774f1e1d232ee9f98b58ace9f0417231716b])
* (edit) tika-server/pom.xml
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390335#comment-16390335 ]

Hudson commented on TIKA-1518:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1454 (See [https://builds.apache.org/job/Tika-trunk/1454/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: [https://github.com/apache/tika/commit/deb9e96f29d3a322804016d4533bb76de7c40e2c])
* (edit) tika-server/pom.xml
TIKA-1518 -- turn dockerfile-maven-plugin back on. Accidentally (tallison: [https://github.com/apache/tika/commit/ca19696657cca2ec83160f9a16cbb36bfc35cde6])
* (edit) tika-server/pom.xml
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390317#comment-16390317 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

[~talli...@mitre.org] - ah, it looks like the proxy settings aren't being passed into the Docker container. Normally I've passed proxy settings to Docker via buildArgs, but I am not sure how this is handled by the Maven plugin. I've not done Docker behind a proxy for a while. Can you try -X on the Maven command to see what is being set?
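For anyone following along, here is a minimal sketch of the "proxy settings via build args" idea, using plain `docker build` rather than the Maven plugin. It assumes the proxy is exposed through the conventional `http_proxy`, `https_proxy` and `no_proxy` environment variables; the helper name `proxy_build_args` and the image tag are made up for illustration. Whether the dockerfile-maven-plugin forwards these values on its own is exactly the open question above, so this only shows what would be passed by hand.

```shell
#!/bin/sh
# Sketch only: turn proxy variables found in the environment into
# `docker build --build-arg` flags. Docker treats http_proxy, https_proxy
# and no_proxy as predefined build arguments, so the Dockerfile needs no
# matching ARG lines for them.

proxy_build_args() {
  for name in http_proxy https_proxy no_proxy; do
    # Indirect lookup of the variable called $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      printf '%s %s=%s\n' '--build-arg' "$name" "$value"
    fi
  done
}

# Dry run: print the command instead of executing it.
# "tika-server:proxy-test" is a hypothetical local tag.
echo "docker build $(proxy_build_args | tr '\n' ' ')-t tika-server:proxy-test ."
```

Running the script with no proxy variables set prints a plain `docker build` command; with them set, each one appears as its own `--build-arg` flag, which makes it easy to compare against what `mvn -X` reports.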
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390309#comment-16390309 ]

Hudson commented on TIKA-1518:
------------------------------

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #215 (See [https://builds.apache.org/job/tika-2.x-windows/215/])
TIKA-1518: Detach docker file build from build phase in Maven execution (david: rev deb9e96f29d3a322804016d4533bb76de7c40e2c)
* (edit) tika-server/pom.xml
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390284#comment-16390284 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

It is a choice we have to make. There are three main routes to Docker packaging that I have used:
# Automated builds that pull in pre-packaged artefacts and bundle them into an image on any change in a repository - like what we are doing in the docker-tikaserver approach, where it goes and downloads the signed JARs
# Automated builds that compile the code in the image (e.g. using the Maven Docker image) and then package it
# Building a release image and then distributing that - which is what this does, but it requires us to decide when an official release is available and push it somewhere

The first and second are really good for leveraging things like Docker Hub to automatically build from your repository, whereas the third means you have to have Docker on your machine when you want to build an image.

I never really like number two as it means the builds are always recompiles of the code each time a change is triggered, so you can easily be packing up different code as the same version without realising it.

The challenge with the approach in docker-tikaserver is keeping up when the assets being pulled in move - i.e. when a release JAR is moved from dist.apache.org - but that could easily be solved by going to Nexus for the JARs based on the release packages.

I personally quite like the third approach as it means you explicitly create an image that has its own life, and I was thinking that we could potentially add this to the release process, pushing the image from the release build to Docker Hub/Nexus/another repository so it is an official build.

Not sure what others think?
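To make the "go to Nexus for the JARs" idea for the first route a little more concrete, here is a rough sketch that derives a release JAR's download URL from its Maven coordinates, relying only on the standard Maven repository layout. The base URL `https://repo.example/maven2` is a placeholder, the function name `maven_jar_url` is hypothetical, and the coordinates `org.apache.tika:tika-server:1.17` are assumed for illustration.

```shell
#!/bin/sh
# Sketch: build the download URL of a released JAR under the standard
# Maven repository layout:
#   <base>/<groupId with dots as slashes>/<artifactId>/<version>/<artifactId>-<version>.jar

maven_jar_url() {
  base=$1; group=$2; artifact=$3; version=$4
  # org.apache.tika -> org/apache/tika
  group_path=$(printf '%s' "$group" | tr '.' '/')
  printf '%s/%s/%s/%s/%s-%s.jar\n' \
    "$base" "$group_path" "$artifact" "$version" "$artifact" "$version"
}

# repo.example is a placeholder, not a real repository.
url=$(maven_jar_url "https://repo.example/maven2" "org.apache.tika" "tika-server" "1.17")
echo "JAR:       $url"
echo "Signature: $url.asc"
```

Because the path is a pure function of the coordinates, a Dockerfile using this approach would not break when a release is moved off a distribution mirror, which is the maintenance problem described above. The detached `.asc` signature next to the JAR could then be checked with `gpg --verify`, as the earlier Dockerfile did.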
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390274#comment-16390274 ]

Tim Allison commented on TIKA-1518:
-----------------------------------

Your [commit|https://github.com/apache/tika/commit/deb9e96f29d3a322804016d4533bb76de7c40e2c#diff-332a9cfb880c4a30e2abc7af93035120] sure fixed it by turning it off. :D
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390275#comment-16390275 ]

Tim Allison commented on TIKA-1518:
-----------------------------------

And sorry for letting that slip through!!!
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390268#comment-16390268 ]

Tim Allison commented on TIKA-1518:
-----------------------------------

Not quite, different error this time (see attached file)...could be user error, I have no doubt! OTOH, do we want to require Docker on devs' computers?
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390241#comment-16390241 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

{quote}I do have Docker installed, [0] but it is Windows, and I've noticed some, um, areas for improvement in Docker on Windows. {quote}

I've found that on Windows I have had to enable the "Expose daemon on tcp://localhost:2375 without TLS" option in Docker for Windows for many of the clients out there to be able to talk to it. Does this solve it for you?
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390216#comment-16390216 ]

Tim Allison commented on TIKA-1518:
-----------------------------------

bq. this is me getting too excited

?! I do have Docker installed, but it is Windows, and I've noticed some, um, areas for improvement in Docker on Windows. Thank you!
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390202#comment-16390202 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

Sorry [~talli...@mitre.org] - this is me getting too excited. I'll need to remove it from being hooked on the "build" phase so those without Docker can build without this! Will do this just now.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389532#comment-16389532 ]

Tim Allison commented on TIKA-1518:
-----------------------------------

Hi [~davemeikle], with the new dockerfile-maven-plugin, I'm getting the following. I'm behind a proxy, and I'm on Windows, but you'd think localhost would work?! Any recommendations? Thank you!

{noformat}
[INFO] --- dockerfile-maven-plugin:1.3.7:build (default) @ tika-server ---
[INFO] Building Docker context C:\Users\tallison\Idea Projects\tika-asf2-git-2.x\tika-server
[INFO]
[INFO] Image will be built as apache/tika-server:2.0.0-SNAPSHOT
[INFO]
[WARNING] An attempt failed, will retry 1 more times
org.apache.maven.plugin.MojoExecutionException: Could not build image
    at com.spotify.plugin.dockerfile.BuildMojo.buildImage(BuildMojo.java:185)
    at com.spotify.plugin.dockerfile.BuildMojo.execute(BuildMojo.java:105)
    at com.spotify.plugin.dockerfile.AbstractDockerMojo.tryExecute(AbstractDockerMojo.java:246)
    at com.spotify.plugin.dockerfile.AbstractDockerMojo.execute(AbstractDockerMojo.java:235)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:154)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:146)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:309)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:194)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:107)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:993)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:345)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:191)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: com.spotify.docker.client.exceptions.DockerException: java.util.concurrent.ExecutionException: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
    at com.spotify.docker.client.DefaultDockerClient.propagate(DefaultDockerClient.java:2512)
    at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:2443)
    at com.spotify.docker.client.DefaultDockerClient.version(DefaultDockerClient.java:501)
    at com.spotify.docker.client.DefaultDockerClient.authRegistryHeader(DefaultDockerClient.java:2555)
    at com.spotify.docker.client.DefaultDockerClient.build(DefaultDockerClient.java:1396)
    at com.spotify.docker.client.DefaultDockerClient.build(DefaultDockerClient.java:1365)
    at com.spotify.plugin.dockerfile.BuildMojo.buildImage(BuildMojo.java:178)
    ... 25 more
Caused by: java.util.concurrent.ExecutionException: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException: org.apache.http.conn.HttpHostConnectException: Connect to localhost:2375 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused: connect
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at com.spotify.docker.client.DefaultDockerClient.request(DefaultDockerClient.java:2441)
    ... 30 more
Caused by: com.spotify.docker.client.shaded.javax.ws.rs.ProcessingException:
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385176#comment-16385176 ]

Hudson commented on TIKA-1518:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1444 (See [https://builds.apache.org/job/Tika-trunk/1444/])
TIKA-1518: Updated the README and changed image name to tika-server for (dmeikle: [https://github.com/apache/tika/commit/ad3d763cf1f883518aa5da7f4a65842337409870])
* (edit) tika-server/README.md
* (edit) tika-server/pom.xml
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385167#comment-16385167 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

As the current Dockerfile was out of date, I've updated it to use the build artefacts to create the Docker image. This means you can run the following in the tika-server project:

{{mvn package dockerfile:build}}

We can set up the POM to allow a push to Docker Hub, hooked on the deploy stage and executed at release time, so we always release a tagged version that can be used. Will speak to INFRA about getting access to an account owned by the Apache organisation.
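As a rough sketch of what the local workflow looks like end to end, the script below prints (but does not run) the build command from the comment above, followed by commands to start the resulting image and check that the server answers. The image name `apache/tika-server:<version>` is taken from the build log elsewhere in this thread and port 9998 from the Dockerfile's `EXPOSE` line; the helper names `image_ref` and `print_steps` are invented for illustration, and the version string is just an example.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for building and smoke-testing the
# image locally. Nothing here invokes Maven or Docker.

# Image reference as reported by the plugin's "Image will be built as" line.
image_ref() {
  printf 'apache/tika-server:%s\n' "$1"
}

print_steps() {
  # 1. Build the JAR and the image (run from the tika-server module).
  echo "mvn package dockerfile:build"
  # 2. Start the container, publishing the port the Dockerfile exposes.
  echo "docker run --rm -p 9998:9998 $(image_ref "$1")"
  # 3. Ask the running server for its version as a quick smoke test.
  echo "curl http://localhost:9998/version"
}

print_steps "2.0.0-SNAPSHOT"
```

Printing rather than executing keeps the sketch usable on machines without Docker, which matters given the discussion about not requiring Docker for a normal build.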
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385146#comment-16385146 ]

Hudson commented on TIKA-1518:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1443 (See [https://builds.apache.org/job/Tika-trunk/1443/])
TIKA-1518: Add local docker build based on dockerfile-maven-plugin (dmeikle: [https://github.com/apache/tika/commit/d653f95174e23deb3166459f1bb8eb75073afaae])
* (edit) tika-server/Dockerfile
* (edit) tika-server/pom.xml
TIKA-1518: Updated CHANGES file to include description (dmeikle: [https://github.com/apache/tika/commit/cebace98062cfa15bdc04f096b8c7586671738b5])
* (edit) CHANGES.txt
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385112#comment-16385112 ]

ASF GitHub Bot commented on TIKA-1518:
--------------------------------------

dameikle closed pull request #227: TIKA-1518: Add local docker build based on dockerfile-maven-plugin
URL: https://github.com/apache/tika/pull/227

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/tika-server/Dockerfile b/tika-server/Dockerfile
index f197d142e..279fff0d1 100644
--- a/tika-server/Dockerfile
+++ b/tika-server/Dockerfile
@@ -13,25 +13,20 @@
 # specific language governing permissions and limitations
 # under the License.
 
-FROM ubuntu:latest
+FROM ubuntu:xenial
 MAINTAINER Apache Tika Team
 
-ENV TIKA_VERSION 1.7
-ENV TIKA_SERVER_URL https://www.apache.org/dist/tika/tika-server-$TIKA_VERSION.jar
-
 RUN apt-get update \
-    && apt-get install openjdk-7-jre-headless curl gdal-bin tesseract-ocr \
-        tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu -y \
-    && curl -sSL https://people.apache.org/keys/group/tika.asc -o /tmp/tika.asc \
-    && gpg --import /tmp/tika.asc \
-    && curl -sSL "$TIKA_SERVER_URL.asc" -o /tmp/tika-server-${TIKA_VERSION}.jar.asc \
-    && NEAREST_TIKA_SERVER_URL=$(curl -sSL http://www.apache.org/dyn/closer.cgi/${TIKA_SERVER_URL#https://www.apache.org/dist/}\?asjson\=1 \
-        | awk '/"path_info": / { pi=$2; }; /"preferred":/ { pref=$2; }; END { print pref " " pi; };' \
-        | sed -r -e 's/^"//; s/",$//; s/" "//') \
-    && echo "Nearest mirror: $NEAREST_TIKA_SERVER_URL" \
-    && curl -sSL "$NEAREST_TIKA_SERVER_URL" -o /tika-server-${TIKA_VERSION}.jar \
-    && gpg --verify /tmp/tika-server-${TIKA_VERSION}.jar.asc /tika-server-${TIKA_VERSION}.jar \
+    && apt-get install openjdk-8-jre-headless curl gdal-bin tesseract-ocr \
+        tesseract-ocr-eng tesseract-ocr-ita tesseract-ocr-fra tesseract-ocr-spa tesseract-ocr-deu -y \
     && apt-get clean -y && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
 
+ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
+RUN export JAVA_HOME
+
+ARG JAR_FILE
+ADD target/${JAR_FILE} /tika-server.jar
+
 EXPOSE 9998
-ENTRYPOINT java -jar /tika-server-${TIKA_VERSION}.jar -h 0.0.0.0
+ENTRYPOINT java -jar /tika-server.jar -h 0.0.0.0
+
diff --git a/tika-server/pom.xml b/tika-server/pom.xml
index 985387951..6fc38694e 100644
--- a/tika-server/pom.xml
+++ b/tika-server/pom.xml
@@ -260,6 +260,26 @@
+
+com.spotify
+dockerfile-maven-plugin
+1.3.7
+
+
+default
+
+ build
+
+
+
+
+ apache/tika
+ ${project.version}
+
+tika-server-${project.version}.jar
+
+
+
 org.apache.maven.plugins
 maven-jar-plugin

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385088#comment-16385088 ]

ASF GitHub Bot commented on TIKA-1518:
--------------------------------------

dameikle opened a new pull request #227: TIKA-1518: Add local docker build based on dockerfile-maven-plugin
URL: https://github.com/apache/tika/pull/227
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650316#comment-14650316 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

Moved to 1.11 so that the work to get onto Docker Hub stays tracked.

Docker with Tika Server
-----------------------

Key: TIKA-1518
URL: https://issues.apache.org/jira/browse/TIKA-1518
Project: Tika
Issue Type: New Feature
Reporter: Paul Ramirez
Fix For: 1.11

This version should be able to demonstrate as many of Apache Tika's capabilities as possible. For instance with GDAL, Tesseract, and FFmpeg to show parsers which require installation of other dependencies. In addition, this should help move TIKA-1301 forward and should leverage the suggestion made by [~lewismc] of a script which can pull down the latest version of Apache Tika.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299805#comment-14299805 ]

Dave Meikle commented on TIKA-1518:
-----------------------------------

Right folks, I have added the Dockerfile to the tika-server root. Thinking about this more, I am not sure if we can easily ship an image as an official artefact - maybe others have a view on that - so I will just keep mine kicking around until we get something more official.

In theory, if we could set up an organisation on Docker Hub and add the automatic commit hooks to our GitHub organisation profile, we could auto-build new images - but this all depends on the stance legal would have. [~tpalsulich] - have you heard anything back on this?

Docker with Tika Server
-----------------------

Key: TIKA-1518
URL: https://issues.apache.org/jira/browse/TIKA-1518
Project: Tika
Issue Type: New Feature
Reporter: Paul Ramirez
Fix For: 1.8

This version should be able to demonstrate as many of Apache Tika's capabilities as possible. For instance with GDAL, Tesseract, and FFmpeg to show parsers which require installation of other dependencies. In addition, this should help move TIKA-1301 forward and should leverage the suggestion made by [~lewismc] of a script which can pull down the latest version of Apache Tika.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299843#comment-14299843 ]
Hudson commented on TIKA-1518:
SUCCESS: Integrated in tika-trunk-jdk1.7 #463 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/463/])
TIKA-1518: Added Dockerfile to support building a Tika Server image (dmeikle: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1656191)
* /tika/trunk/tika-server/Dockerfile
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299961#comment-14299961 ]
Chris A. Mattmann commented on TIKA-1518:
Sure. As for images, I think we should use whatever infrastructure ASF infra sets up for us - I bet they'll have a place for us to deal with images. Rather than waiting for them to decide, I recommend joining the infrastruct...@apache.org lists (if not already), joining the conversation, giving them a real use case (TIKA-1518), and describing our needs and requirements. I'm sure [~ke4qqq] and team will be happy to listen and help.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299697#comment-14299697 ]
Dave Meikle commented on TIKA-1518:
Sorry gang, been travelling a lot.
#1, Totally up for adding this to tika-server, and I think we may be able to do the automated build by linking Docker Hub to the Tika Git mirrors, but I haven't done this before. Will give it a try when I commit the Dockerfile.
#3, [~tpalsulich] I have a little AngularJS app that I use for something very similar, which I could brand up and extend to act as a nice front-end app for tika-server. Let me clean it up and stick it in too.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297614#comment-14297614 ]
Tyler Palsulich commented on TIKA-1518:
2. Sent a message. Andrew Bayer responded saying it's in the works -- should have more information next month.
3. In my opinion, TIKA-1302 is more of backend testing of Tika. I'm referring to a frontend where someone can upload a file to the server and can see what Tika pulls out of it. I'll open a new sub-issue of TIKA-1302.
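For readers wondering what such a frontend would do under the hood, here is a minimal sketch in Python of the upload request it might send. It assumes a tika-server instance listening on its default port 9998 and uses the server's `PUT /tika` text-extraction endpoint; the helper names `build_extract_request` and `extract_text` are hypothetical and not part of Tika.

```python
import urllib.request

# Assumed address of a locally running tika-server (9998 is its default port).
DEFAULT_TIKA_URL = "http://localhost:9998"


def build_extract_request(data: bytes, base_url: str = DEFAULT_TIKA_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT request asking tika-server for plain text."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tika",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str, base_url: str = DEFAULT_TIKA_URL) -> str:
    """Upload the file at `path` and return the text the server extracts from it."""
    with open(path, "rb") as fh:
        request = build_extract_request(fh.read(), base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `extract_text("report.pdf")` against a running container would return the extracted text; a real frontend would add error handling and probably query the metadata endpoint as well.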
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297623#comment-14297623 ]
Chris A. Mattmann commented on TIKA-1518:
[~tpalsulich] talk to [~talli...@apache.org]; he already got this set up with a donation from Rackspace. We have a VM with Tika Server on it. We just need to document it and promote it. Right, Tim? Lewis and I have access, you can too!
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298002#comment-14298002 ]
Tim Allison commented on TIKA-1518:
[~tpalsulich], yes, the server was initially intended for TIKA-1302, which should eventually run fairly continuously and publish the results of runs of different versions of Tika; it wasn't designed for TIKA-1301. That said, we have a server, thanks to Rackspace, so why not use it for TIKA-1301 now? Send me a personal email with your desired userid, and I'll give you access tomorrow.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296439#comment-14296439 ]
Chris A. Mattmann commented on TIKA-1518:
Thanks Tyler. Can you raise #2 on infrastruct...@apache.org? That would be an awesome idea, and then keep folks here posted. As for #1, +1 from me. RE: #3, there is a TIKA issue on that, I think it's TIKA-1312
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291999#comment-14291999 ]
Konstantin Gribov commented on TIKA-1518:
Thank you, [~davemeikle]. It works perfectly, so it can easily be used to evaluate Tika. I'll add info to the wiki if it isn't there already.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292435#comment-14292435 ]
Paul Ramirez commented on TIKA-1518:
Missed this over the weekend while playing with Docker, but yes [~chrismattmann], it looks to be exactly what I was thinking. +1 to leaving this open until it's in the Apache Tika codebase. Dave, I will definitely use this for a project and commit updates to it.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290507#comment-14290507 ]
Dave Meikle commented on TIKA-1518:
Hi [~grossws] - I have added the automated build here: https://registry.hub.docker.com/u/logicalspark/docker-tikaserver/
Apologies for the delay, DockerHub wasn't very stable for me whilst on my travels.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284926#comment-14284926 ]
Konstantin Gribov commented on TIKA-1518:
I've dropped my version to avoid unnecessary duplication. [~davemeikle], can you also create an automated build on Docker Hub? Instructions can be found [here|https://docs.docker.com/docker-hub/builds/#setting-up-automated-builds-with-github].
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284226#comment-14284226 ]
Chris A. Mattmann commented on TIKA-1518:
Guys, it looks like [~davemeikle] has already started some work on this in his GitHub repo: https://github.com/LogicalSpark/docker-tikaserver
Dave, FYI on this JIRA issue; I'm not sure if it's related, I just saw your repo by following your GitHub. Paul R - maybe you can use this?
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280241#comment-14280241 ]
Konstantin Gribov commented on TIKA-1518:
Ok, I'll create it soon
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279927#comment-14279927 ]
Paul Ramirez commented on TIKA-1518:
Thanks Konstantin for the example. If you have the time that would be awesome.
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278424#comment-14278424 ]
Paul Ramirez commented on TIKA-1518:
As I build a patch, which component should this go into? Any suggestions on things that will need to be part of this beyond the dependencies I've listed?
[jira] [Commented] (TIKA-1518) Docker with Tika Server
[ https://issues.apache.org/jira/browse/TIKA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278446#comment-14278446 ]
Konstantin Gribov commented on TIKA-1518:
To pull the latest Tika you can use a snippet like mine:
{noformat}
# ...
# see https://www.apache.org/dist/tomcat/tomcat-8/KEYS
RUN gpg --keyserver pgp.mit.edu --recv-keys \
    05AB33110949707C93A279E3D3EFE6B686867BA6 \
    F7DA48BB64BCB84ECBA7EE6935CD23C10D498E23
# keylist (stripped for jira)

ENV TOMCAT_MAJOR 8
ENV TOMCAT_VERSION 8.0.15
ENV TOMCAT_TGZ_URL https://www.apache.org/dist/tomcat/tomcat-$TOMCAT_MAJOR/v$TOMCAT_VERSION/bin/apache-tomcat-$TOMCAT_VERSION.tar.gz

# ask closer.cgi for the nearest mirror, download from it,
# then verify against the signature fetched from the main site
RUN NEAREST_TOMCAT_TGZ_URL=$(curl -sSL http://www.apache.org/dyn/closer.cgi/${TOMCAT_TGZ_URL#https://www.apache.org/dist/}\?asjson\=1 \
    | awk '/"path_info": / { pi=$2; }; /"preferred":/ { pref=$2; }; END { print pref " " pi; };' \
    | sed -r -e 's/^"//; s/",$//; s/" "//') \
  && echo "Nearest mirror: $NEAREST_TOMCAT_TGZ_URL" \
  && curl -sSL "$NEAREST_TOMCAT_TGZ_URL" -o tomcat.tar.gz \
  && curl -sSL "$TOMCAT_TGZ_URL.asc" -o tomcat.tar.gz.asc \
  && gpg --verify tomcat.tar.gz.asc \
  && tar -xvf tomcat.tar.gz --strip-components=1
{noformat}
The full Dockerfile can be viewed on GitHub (https://github.com/grossws/docker-comp-tomcat8/blob/master/Dockerfile)
If you want, I can make a Docker image and an automated build for it.
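The mirror-selection step in the snippet above may be easier to follow outside of awk and sed. The sketch below does the same join in Python, assuming the `?asjson=1` response is a JSON object with top-level string fields `preferred` and `path_info` (the two keys the awk patterns match). The function name and the sample mirror host are hypothetical.

```python
import json


def nearest_mirror_url(closer_json: str) -> str:
    """Join the preferred mirror base URL with the requested artifact path."""
    info = json.loads(closer_json)
    base = info["preferred"].rstrip("/")
    path = info["path_info"].lstrip("/")
    return f"{base}/{path}"


# Hand-written sample shaped like the response the Dockerfile snippet parses;
# mirror.example.org is a placeholder, not a real mirror.
sample = json.dumps({
    "path_info": "tomcat/tomcat-8/v8.0.15/bin/apache-tomcat-8.0.15.tar.gz",
    "preferred": "http://mirror.example.org/apache/",
})
print(nearest_mirror_url(sample))
```

Parsing the response as JSON avoids depending on how the server happens to lay out its lines, which the awk/sed version relies on.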