Cool. It’s the auto run that I really need, and the other part that I don’t think I’ve tackled properly is the managing of logs…
I’m going to check with my project to see if they support Snap packages.

Eric

> On Dec 16, 2019, at 5:10 PM, Tom Barber <[email protected]> wrote:
>
> Just saw this fly by and FYI on Linux systems that support Snap packages
> (Ubuntu/Debian/Arch/Fedora etc) you can `snap install tika-server`. It
> doesn’t yet auto-run, I don’t believe, but you can just run
> `tika-server.run`, and adding an init script wouldn’t take 5 minutes.
>
> Tom
>
> On 16 December 2019 at 18:42:55, Eric Pugh ([email protected]) wrote:
>
>> Hi folks!
>>
>> I’ve got a mostly completed PR for having install scripts for Tika Server,
>> and I’m hoping a committer will take a look at the PR, and give feedback
>> (and ideally commit in time for 1.24!)
>>
>> A couple of things:
>>
>> 1) This was completely influenced by
>> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#service-installation-script
>> In fact, I started with the Solr scripts.
>>
>> 2) I’ve deleted all the Solr specific aspects (I think), however there may
>> still be more to delete.
>>
>> 3) This requires a change to how we release Tika. Previously we shipped
>> tika-app.jar, tika-eval.jar, and tika-server.jar, and now, I think, we
>> want to add the tika-server-bin.tgz and tika-server-bin.zip binary
>> distributions.
>>
>> I’m happy to start writing accompanying “how to deploy Tika Server” docs if
>> this PR looks good! Or, please give input and I’ll make the updates.
>>
>> Eric
>>
>> > On Dec 12, 2019, at 2:39 PM, Eric Pugh <[email protected]> wrote:
>> >
>> > I’ve created this JIRA to track this work:
>> > https://issues.apache.org/jira/browse/TIKA-3010
>> >
>> > And a WIP PR is at https://github.com/apache/tika/pull/305
>> >
>> > My thought is to put something together that mimics how we deploy Solr,
>> > and see how that works. I have a need for an install process that a
>> > general IT person can follow, who isn’t a Tika expert or a Docker user.
>> >
>> >> On Dec 4, 2019, at 12:28 PM, Chris Mattmann <[email protected]> wrote:
>> >>
>> >> Thanks for bringing this conversation up Eric.
>> >>
>> >> Historically if you look over the last 5 years, I think what you are
>> >> asking below has sort of already become the de facto truth. Most people
>> >> are in fact using Tika server, whether they are individual devs, govvies,
>> >> commercial folk and the like. Big, small and medium projects. Evidenced
>> >> by the expansion of Tika APIs into pretty much every PL I know of and use
>> >> actively today.
>> >>
>> >> Given that, we probably should update the main website docs to make this
>> >> more prominent. The tika server docs on the wiki are pretty darn good.
>> >> But they don’t get prime real estate. Would be wonderful if someone wants
>> >> to update the website to make it more prominent.
>> >>
>> >> The downstream Tika Python lib that I maintain has tons of activity, is
>> >> used by 350+ projects, and relies solely on Tika-Server. My
>> >> recommendation to the Solr folks (having created SOLR-7633) from the 2014
>> >> DARPA MEMEX days was to move towards a Tika Server based SolrCell dep,
>> >> and that’s the right way to go IMO.
>> >>
>> >> Chris
>> >>
>> >> From: Eric Pugh <[email protected]>
>> >> Reply-To: "[email protected]" <[email protected]>
>> >> Date: Wednesday, December 4, 2019 at 12:24 PM
>> >> To: "[email protected]" <[email protected]>
>> >> Subject: [EXTERNAL] Do we have a community supported approach for
>> >> deploying Tika Server in production?
>> >>
>> >> Hi all - Hoping this is a reasonable Tika-dev versus Tika-user question!
>> >>
>> >> Over in Solr land there has been renewed discussion about streamlining
>> >> what Solr is....
>> >>
>> >> In regards to rich content extraction and the Tika project, it seems like
>> >> the two ideas that continue to preserve the existing behavior are:
>> >>
>> >> 1) To convert the ExtractingRequestHandler into a Package (Plugin) for
>> >> Solr. This slims down the standard Solr download, and *might* make it
>> >> easier to update the version of Tika + dependent jars used?
>> >>
>> >> 2) The second approach is to instead require Tika-Server to be running
>> >> (https://issues.apache.org/jira/browse/SOLR-7633) and just have Solr
>> >> delegate the call to Tika-Server.
>> >>
>> >> I was thinking about why I like option 1 better than 2, and I think it
>> >> boils down to how mature the IT organization I am working with is. Some
>> >> IT organizations have large dev-ops teams, and are working at major
>> >> scale, and managing a fleet of Tika-Server on Kubernetes with Load
>> >> Balancer dynamically scaling up and down is simple and second nature!
>> >> However, many organizations aren’t like that.
>> >>
>> >> So I guess what I’m asking is do we have a reasonable supported approach
>> >> for deploying Tika Server for non-tika savvy organizations? I’m thinking
>> >> about Solr, and specifically the fact that Solr has a well defined set of
>> >> Service Installation scripts. When I follow the directions in
>> >> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#taking-solr-to-production
>> >> I can feel confident that when the server is rebooted, then Solr will
>> >> come back up! Plus there is log rotation and all the rest.
>> >>
>> >> In contrast, when I look at the Tika website, specifically the
>> >> https://tika.apache.org/1.22/gettingstarted.htm page, the message is
>> >> to run Tika as a command line application, or embedded in your
>> >> application.
>> >>
>> >> I’m wondering if Tika-Server needs to be made more prominent, and treated
>> >> as the “primary method of interacting with Tika”? Do we need as a
>> >> community to focus more on Tika-Server? In our getting started
>> >> documentation, in our usage documentation, and in our examples?
>> >>
>> >> Do we need to create the equivalent of the Service Installation scripts
>> >> for Tika-Server?
>> >>
>> >> Wanted to stoke the discussion!
>> >>
>> >> Eric
>> >>
>> >> _______________________
>> >> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
>> >> http://www.opensourceconnections.com | My Free/Busy
>> >> <http://tinyurl.com/eric-cal>
>> >> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> >> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> >>
>> >> This e-mail and all contents, including attachments, is considered to be
>> >> Company Confidential unless explicitly stated otherwise, regardless of
>> >> whether attachments are marked as such.
>
> Spicule Limited is registered in England & Wales. Company Number: 09954122.
> Registered office: First Floor, Telecom House, 125-135 Preston Road,
> Brighton, England, BN1 6AF. VAT No. 251478891.
>
> All engagements are subject to Spicule Terms and Conditions of Business. This
> email and its contents are intended solely for the individual to whom it is
> addressed and may contain information that is confidential, privileged or
> otherwise protected from disclosure, distributing or copying. Any views or
> opinions presented in this email are solely those of the author and do not
> necessarily represent those of Spicule Limited. The company accepts no
> liability for any damage caused by any virus transmitted by this email. If
> you have received this message in error, please notify us immediately by
> reply email before deleting it from your system. Service of legal notice
> cannot be effected on Spicule Limited by email.
