Hi all, I’ve gone ahead and added the -spawnChild property as a default when 
running Tika Server as a service. I’d love some eyes on the PR and, if it 
looks good, to get it committed.

Feedback welcome!

Eric



> On Dec 17, 2019, at 12:53 PM, Eric Pugh <[email protected]> 
> wrote:
> 
> Cool.   
> 
> It’s the auto run that I really need, and the other part that I don’t think 
> I’ve tackled properly is the managing of logs…
> 
> I’m going to check with my project to see if they support Snap packages.
> 
> Eric
> 
> 
>> On Dec 16, 2019, at 5:10 PM, Tom Barber <[email protected]> wrote:
>> 
>> Just saw this fly by. FYI, on Linux systems that support Snap packages 
>> (Ubuntu/Debian/Arch/Fedora etc.) you can `snap install tika-server`. It 
>> doesn’t yet auto-run, I believe, but you can just run `tika-server.run`, 
>> and adding an init script wouldn’t take 5 minutes.
>> 
>> Tom
>> 
>> On 16 December 2019 at 18:42:55, Eric Pugh ([email protected]) wrote:
>> 
>>> Hi folks! 
>>> 
>>> I’ve got a mostly completed PR for having install scripts for Tika Server, 
>>> and I’m hoping a committer will take a look at the PR, and give feedback 
>>> (and ideally commit in time for 1.24!) 
>>> 
>>> A couple of things: 
>>> 
>>> 1) This was completely influenced by 
>>> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#service-installation-script
>>> In fact, I started with the Solr scripts. 
>>> 
>>> 2) I’ve deleted all the Solr specific aspects (I think), however there may 
>>> still be more to delete.  
>>> 
>>> 3) This requires a change to how we release Tika. Previously we shipped 
>>> tika-app.jar, tika-eval.jar, and tika-server.jar; now, I think, we 
>>> want to add the tika-server-bin.tgz and tika-server-bin.zip binary 
>>> distributions. 
>>> 
>>> I’m happy to start writing accompanying “how to deploy Tika Server” docs if 
>>> this PR looks good! Or, please give input and I’ll make the updates.
>>> 
>>> Eric 
>>> 
>>> 
>>> > On Dec 12, 2019, at 2:39 PM, Eric Pugh <[email protected]> wrote: 
>>> >  
>>> > I’ve created this JIRA to track this work: 
>>> > https://issues.apache.org/jira/browse/TIKA-3010 
>>> >  
>>> > And a WIP PR is at https://github.com/apache/tika/pull/305 
>>> >  
>>> > My thought is to put something together that mimics how we deploy Solr, 
>>> > and see how that works. I have a need for an install process that a 
>>> > general IT person can follow, who isn’t a Tika expert or a Docker user. 
>>> >  
>>> >  
>>> >  
>>> >  
>>> >> On Dec 4, 2019, at 12:28 PM, Chris Mattmann <[email protected]> wrote: 
>>> >>  
>>> >> Thanks for bringing this conversation up Eric. 
>>> >>  
>>> >>  
>>> >>  
>>> >> Historically if you look over the last 5 years, I think what you are 
>>> >> asking below has sort of already become the de facto 
>>> >> truth. Most people are in fact using Tika server, whether they are 
>>> >> individual devs, govvies, commercial folk and the like.  
>>> >>  
>>> >> Big, small and medium projects. Evidenced by the expansion of Tika APIs 
>>> >> into pretty much every PL I know of and use actively today. 
>>> >>  
>>> >>  
>>> >>  
>>> >> Given that, we probably should update the main website docs to make this 
>>> >> more prominent. The tika server docs on the 
>>> >> wiki are pretty darn good. But they don’t get prime real estate. Would 
>>> >> be wonderful if someone wants to update the  
>>> >> website to make it more prominent. 
>>> >>  
>>> >>  
>>> >>  
>>> >> The downstream Tika Python lib that I maintain has tons of activity, is 
>>> >> used by 350+ projects, and relies solely on Tika-Server. My 
>>> >> recommendation to the Solr folks (having created SOLR-7633) from the 
>>> >> 2014 DARPA MEMEX days was to move towards a Tika Server based SolrCell 
>>> >> dep, and that’s the right way to go IMO. 
>>> >>  
>>> >>  
>>> >>  
>>> >> Chris 
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >> From: Eric Pugh <[email protected]> 
>>> >> Reply-To: "[email protected]" <[email protected]> 
>>> >> Date: Wednesday, December 4, 2019 at 12:24 PM 
>>> >> To: "[email protected]" <[email protected]> 
>>> >> Subject: [EXTERNAL] Do we have a community supported approach for 
>>> >> deploying Tika Server in production? 
>>> >>  
>>> >>  
>>> >>  
>>> >> Hi all - Hoping this is a reasonable Tika-dev versus Tika-user question! 
>>> >>  
>>> >>  
>>> >>  
>>> >> Over in Solr land there has been renewed discussion about streamlining 
>>> >> what Solr is....  
>>> >>  
>>> >>  
>>> >>  
>>> >> In regards to rich content extraction and the Tika project, it seems 
>>> >> like the two ideas that continue to preserve the existing behavior are: 
>>> >>  
>>> >>  
>>> >>  
>>> >> 1) To convert the ExtractingRequestHandler into a Package (Plugin) for 
>>> >> Solr. This slims down the standard Solr download, and *might* make it 
>>> >> easier to update the version of Tika + dependent jars used? 
>>> >>  
>>> >>  
>>> >>  
>>> >> 2) The second approach is to instead require Tika-Server to be running 
>>> >> (https://issues.apache.org/jira/browse/SOLR-7633) and just have Solr 
>>> >> delegate the call to Tika-Server. 
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >>  
>>> >> I was thinking about why I like option 1 better than 2, and I think it 
>>> >> boils down to how mature the IT organization I am working with is. Some 
>>> >> IT organizations have large dev-ops teams, and are working at major 
>>> >> scale, and managing a fleet of Tika-Server on Kubernetes with Load 
>>> >> Balancer dynamically scaling up and down is simple and second nature! 
>>> >> However, many organizations aren’t like that. 
>>> >>  
>>> >>  
>>> >>  
>>> >> So I guess what I’m asking is do we have a reasonable supported approach 
>>> >> for deploying Tika Server for non-tika savvy organizations? I’m thinking 
>>> >> about Solr, and specifically the fact that Solr has a well defined set 
>>> >> of Service Installation scripts. When I follow the directions in 
>>> >> https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#taking-solr-to-production
>>> >> I can feel confident that when the server is rebooted, then Solr will 
>>> >> come back up! Plus there is log rotation and all the rest. 
>>> >>  
>>> >>  
>>> >>  
>>> >> In contrast, when I look at the Tika website, specifically the 
>>> >> https://tika.apache.org/1.22/gettingstarted.htm page, the message 
>>> >> is to run Tika as a command line application, or embedded in your 
>>> >> application.  
>>> >>  
>>> >>  
>>> >>  
>>> >> I’m wondering if Tika-Server needs to be made more prominent, and 
>>> >> treated as the “primary method of interacting with Tika”? Do we need as 
>>> >> a community to focus more on Tika-Server? In our getting started 
>>> >> documentation, in our usage documentation, and in our examples? 
>>> >>  
>>> >>  
>>> >>  
>>> >> Do we need to create the equivalent of the Service Installation scripts 
>>> >> for Tika-Server?  
>>> >>  
>>> >>  
>>> >>  
>>> >> Wanted to stoke the discussion! 
>>> >>  
>>> >>  
>>> >>  
>>> >> Eric 
>>> >>  
>>> >>  
>>> >>  
>>> >> _______________________ 
>>> >>  
>>> >> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
>>> >> http://www.opensourceconnections.com | My Free/Busy 
>>> >> <http://tinyurl.com/eric-cal> 
>>> >>  
>>> >> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
>>> >> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>>> >>  
>>> >> This e-mail and all contents, including attachments, is considered to be 
>>> >> Company Confidential unless explicitly stated otherwise, regardless of 
>>> >> whether attachments are marked as such. 
>>> >  
>>> >  
>>> 
>>> 
>> 
>> Spicule Limited is registered in England & Wales. Company Number: 09954122. 
>> Registered office: First Floor, Telecom House, 125-135 Preston Road, 
>> Brighton, England, BN1 6AF. VAT No. 251478891.
>> 
>> 
>> 
>> All engagements are subject to Spicule Terms and Conditions of Business. 
>> This email and its contents are intended solely for the individual to whom 
>> it is addressed and may contain information that is confidential, privileged 
>> or otherwise protected from disclosure, distributing or copying. Any views 
>> or opinions presented in this email are solely those of the author and do 
>> not necessarily represent those of Spicule Limited. The company accepts no 
>> liability for any damage caused by any virus transmitted by this email. If 
>> you have received this message in error, please notify us immediately by 
>> reply email before deleting it from your system. Service of legal notice 
>> cannot be effected on Spicule Limited by email.
>> 
> 
> 

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.
