Cool.   

It’s the auto run that I really need, and the other part that I don’t think 
I’ve tackled properly is the managing of logs…

I’m going to check with my project to see if they support Snap packages.
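
A minimal sketch of the auto-run piece, assuming a systemd-based host and that the Snap exposes its command as /snap/bin/tika-server.run. Both are assumptions: the thread only mentions `tika-server.run`. The unit description and restart policy are hypothetical too. The script only prints a unit file; it installs nothing.

```shell
#!/bin/sh
# Sketch only: prints a systemd unit that would start the snap-installed
# Tika Server at boot. The command path, description, and restart policy
# are assumptions, not something the snap is known to ship.

# Assumed location of the snap command; override via the environment.
TIKA_CMD="${TIKA_CMD:-/snap/bin/tika-server.run}"

# Write the unit file text to stdout.
render_unit() {
  cat <<EOF
[Unit]
Description=Apache Tika Server (snap)
After=network.target

[Service]
ExecStart=${TIKA_CMD}
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

render_unit
```

Redirecting the output to /etc/systemd/system/tika-server.service and then running `systemctl daemon-reload` and `systemctl enable --now tika-server` should start it at boot. Under systemd, a service's stdout and stderr go to the journal by default, which may cover part of the log-management question.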

Eric


> On Dec 16, 2019, at 5:10 PM, Tom Barber <[email protected]> wrote:
> 
> Just saw this fly by. FYI, on Linux systems that support Snap packages 
> (Ubuntu/Debian/Arch/Fedora etc.) you can `snap install tika-server`. It 
> doesn’t yet auto-run, I don’t believe, but you can just run 
> `tika-server.run`, and adding an init script wouldn’t take 5 minutes.
> 
> Tom
> 
> On 16 December 2019 at 18:42:55, Eric Pugh ([email protected]) wrote:
> 
>> Hi folks! 
>> 
>> I’ve got a mostly completed PR for having install scripts for Tika Server, 
>> and I’m hoping a committer will take a look at the PR, and give feedback 
>> (and ideally commit in time for 1.24!) 
>> 
>> A couple of things: 
>> 
>> 1) This was completely influenced by the Solr service installation script:
>> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#service-installation-script
>> In fact, I started with the Solr scripts.
>> 
>> 2) I’ve deleted all the Solr-specific aspects (I think); however, there 
>> may still be more to delete.
>> 
>> 3) This requires a change to how we release Tika. Previously we shipped 
>> tika-app.jar, tika-eval.jar, and tika-server.jar; now, I think, we 
>> want to add the tika-server-bin.tgz and tika-server-bin.zip binary 
>> distributions.
>> 
>> I’m happy to start writing accompanying “how to deploy Tika Server” docs if 
>> this PR looks good! Or, please give input and I’ll make the updates.
>> 
>> Eric 
>> 
>> 
>> > On Dec 12, 2019, at 2:39 PM, Eric Pugh <[email protected]> wrote:
>> >  
>> > I’ve created this JIRA to track this work: 
>> > https://issues.apache.org/jira/browse/TIKA-3010
>> >  
>> > And a WIP PR is at https://github.com/apache/tika/pull/305
>> >  
>> > My thought is to put something together that mimics how we deploy Solr, 
>> > and see how that works. I have a need for an install process that a 
>> > general IT person can follow, who isn’t a Tika expert or a Docker user.
>> >  
>> >> On Dec 4, 2019, at 12:28 PM, Chris Mattmann <[email protected]> wrote:
>> >>  
>> >> Thanks for bringing this conversation up Eric. 
>> >>  
>> >>  
>> >>  
>> >> Historically, if you look over the last 5 years, I think what you are 
>> >> asking below has sort of already become the de facto truth. Most people 
>> >> are in fact using Tika Server, whether they are individual devs, 
>> >> govvies, commercial folk and the like.
>> >>  
>> >> Big, small and medium projects. This is evidenced by the expansion of 
>> >> Tika APIs into pretty much every PL I know of and actively use today.
>> >>  
>> >>  
>> >>  
>> >> Given that, we probably should update the main website docs to make this 
>> >> more prominent. The tika server docs on the 
>> >> wiki are pretty darn good. But they don’t get prime real estate. Would be 
>> >> wonderful if someone wants to update the  
>> >> website to make it more prominent. 
>> >>  
>> >>  
>> >>  
>> >> The downstream Tika Python lib that I maintain has tons of activity, is 
>> >> used by 350+ projects, and relies solely on Tika-Server. My 
>> >> recommendation to the Solr folks (having created SOLR-7633) from the 
>> >> 2014 DARPA MEMEX days was to move towards a Tika Server based SolrCell 
>> >> dep, and that’s the right way to go IMO.
>> >>  
>> >>  
>> >>  
>> >> Chris 
>> >>  
>> >> From: Eric Pugh <[email protected]> 
>> >> Reply-To: "[email protected]" <[email protected]> 
>> >> Date: Wednesday, December 4, 2019 at 12:24 PM 
>> >> To: "[email protected]" <[email protected]>
>> >> Subject: [EXTERNAL] Do we have a community supported approach for 
>> >> deploying Tika Server in production? 
>> >>  
>> >>  
>> >>  
>> >> Hi all - Hoping this is a reasonable Tika-dev versus Tika-user question! 
>> >>  
>> >>  
>> >>  
>> >> Over in Solr land there has been renewed discussion about streamlining 
>> >> what Solr is....  
>> >>  
>> >>  
>> >>  
>> >> In regards to rich content extraction and the Tika project, it seems like 
>> >> the two ideas that continue to preserve the existing behavior are: 
>> >>  
>> >>  
>> >>  
>> >> 1) To convert the ExtractingRequestHandler into a Package (Plugin) for 
>> >> Solr. This slims down the standard Solr download, and *might* make it 
>> >> easier to update the version of Tika + dependent jars used? 
>> >>  
>> >>  
>> >>  
>> >> 2) The second approach is to instead require Tika-Server to be running 
>> >> (https://issues.apache.org/jira/browse/SOLR-7633) and just have Solr 
>> >> delegate the call to Tika-Server.
>> >>  
>> >>  
>> >>  
>> >>  
>> >>  
>> >> I was thinking about why I like option 1 better than 2, and I think it 
>> >> boils down to how mature the IT organization I am working with is. Some 
>> >> IT organizations have large dev-ops teams, and are working at major 
>> >> scale, and managing a fleet of Tika-Server on Kubernetes with Load 
>> >> Balancer dynamically scaling up and down is simple and second nature! 
>> >> However, many organizations aren’t like that. 
>> >>  
>> >>  
>> >>  
>> >> So I guess what I’m asking is do we have a reasonable supported approach 
>> >> for deploying Tika Server for non-tika savvy organizations? I’m thinking 
>> >> about Solr, and specifically the fact that Solr has a well defined set of 
>> >> Service Installation scripts. When I follow the directions in 
>> >> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#taking-solr-to-production
>> >> I can feel confident that when the server is rebooted, then Solr will 
>> >> come back up! Plus there is log rotation and all the rest. 
>> >>  
>> >>  
>> >>  
>> >> In contrast, when I look at the Tika website, specifically the 
>> >> https://tika.apache.org/1.22/gettingstarted.htm page, the message is 
>> >> to run Tika as a command line application, or embedded in your 
>> >> application.
>> >>  
>> >>  
>> >>  
>> >> I’m wondering if Tika-Server needs to be made more prominent, and treated 
>> >> as the “primary method of interacting with Tika”? Do we need as a 
>> >> community to focus more on Tika-Server? In our getting started 
>> >> documentation, in our usage documentation, and in our examples? 
>> >>  
>> >>  
>> >>  
>> >> Do we need to create the equivalent of the Service Installation scripts 
>> >> for Tika-Server?  
>> >>  
>> >>  
>> >>  
>> >> Wanted to stoke the discussion! 
>> >>  
>> >>  
>> >>  
>> >> Eric 
>> >>  
>> >>  
>> >>  
>> >> _______________________ 
>> >>  
>> >> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
>> >> http://www.opensourceconnections.com | My Free/Busy 
>> >> <http://tinyurl.com/eric-cal> 
>> >>  
>> >> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
>> >> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> >>  
>> >> This e-mail and all contents, including attachments, is considered to be 
>> >> Company Confidential unless explicitly stated otherwise, regardless of 
>> >> whether attachments are marked as such. 
>> >  
>> >  
>> 
>> 
> 
> Spicule Limited is registered in England & Wales. Company Number: 09954122. 
> Registered office: First Floor, Telecom House, 125-135 Preston Road, 
> Brighton, England, BN1 6AF. VAT No. 251478891.
> 
> 
> 
> All engagements are subject to Spicule Terms and Conditions of Business. This 
> email and its contents are intended solely for the individual to whom it is 
> addressed and may contain information that is confidential, privileged or 
> otherwise protected from disclosure, distributing or copying. Any views or 
> opinions presented in this email are solely those of the author and do not 
> necessarily represent those of Spicule Limited. The company accepts no 
> liability for any damage caused by any virus transmitted by this email. If 
> you have received this message in error, please notify us immediately by 
> reply email before deleting it from your system. Service of legal notice 
> cannot be effected on Spicule Limited by email.
> 

