My initial thought was to use scheduling built with DIH: http://wiki.apache.org/solr/DataImportHandler#Scheduling
But I think just a cron job should do the same for me.

Thanks

On Tue, Sep 1, 2015 at 8:51 AM, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov> wrote:

> On 8/31/2015 11:26 AM, Troy Edwards wrote:
> > I am having a hard time finding documentation on DataImportHandler
> > scheduling in SolrCloud. Can someone please post a link to that? I
> > have a requirement that the DIH should be initiated at a specific time
> > Monday through Friday.
>
> Troy, is your question how to use scheduled tasks? Shawn pointed you to
> the right direction. I thought it more likely that you want to schedule a
> cron task to run on any of your servers running SolrCloud, and you want the
> job to run even if the cluster is degraded.
>
> Here's an idea - schedule your job Monday on node 1, Tuesday on node 2,
> etc. That way, if the cluster is degraded (a node is down),
> re-indexing/delta indexing still happens, it just happens slower. You
> can certainly write a zookeeper client to make each cron job compete to see
> who does the job - questions on how to do this should be directed to a
> zookeeper users' mailing list.
>
> -----Original Message-----
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Monday, August 31, 2015 7:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DataImportHandler scheduling
>
> On 8/31/2015 11:26 AM, Troy Edwards wrote:
> > I am having a hard time finding documentation on DataImportHandler
> > scheduling in SolrCloud. Can someone please post a link to that? I
> > have a requirement that the DIH should be initiated at a specific time
> > Monday through Friday.
>
> Every modern operating system (and most of the previous versions of every
> modern OS) has a built-in task scheduling system. For Windows, it's
> literally called Task Scheduler. For most other operating systems, it's
> called cron.
>
> Including dataimport scheduling capability in Solr has been discussed, and
> I think someone even wrote a working version ... but since every OS already
> has scheduling capability that has had years of time to mature, why should
> Solr reinvent the wheel and take the risk that the implementation will have
> bugs?
>
> Currently virtually all updates to Solr's index must be initiated outside
> of Solr, and there is good reason to make sure that Solr doesn't ever
> modify the index without outside input. The only thing I know of right now
> that can update the index automatically is Document Expiration, but the
> expiration time is decided when the document is indexed, and the original
> indexing action is external to Solr.
>
> https://lucidworks.com/blog/document-expiration/
>
> Thanks,
> Shawn
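
The cron approach discussed in this thread could look roughly like the sketch below. It is only an illustration, not anyone's actual setup: the node URLs, the collection name "mycollection", the script path, and the 02:30 start time are all made up, and it assumes the DIH request handler is registered at /dataimport. It is also a variation on the "Monday on node 1, Tuesday on node 2" idea: instead of one cron entry per node, a single cron host picks the target node by weekday.

```python
#!/usr/bin/env python3
"""Trigger a DataImportHandler import; intended to be started by cron.

Example crontab entry (Monday through Friday at 02:30, hypothetical path):
    30 2 * * 1-5 /usr/bin/python3 /opt/scripts/dih_import.py
"""
import datetime
import sys
import urllib.parse
import urllib.request

# Hypothetical values; replace with your own nodes and collection.
NODES = [
    "http://solr1:8983",
    "http://solr2:8983",
    "http://solr3:8983",
    "http://solr4:8983",
    "http://solr5:8983",
]
COLLECTION = "mycollection"


def pick_node(day, nodes=NODES):
    """Choose a node by weekday (Monday is 0), wrapping if there are fewer nodes."""
    return nodes[day.weekday() % len(nodes)]


def dih_url(node, collection=COLLECTION, command="full-import"):
    """Build the DIH URL; assumes the handler path is /dataimport."""
    query = urllib.parse.urlencode({"command": command, "wt": "json"})
    return f"{node}/solr/{collection}/dataimport?{query}"


def main():
    url = dih_url(pick_node(datetime.date.today()))
    # The request only starts the import; DIH runs it in the background.
    with urllib.request.urlopen(url, timeout=30) as resp:
        print(resp.status, url)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Passing command="delta-import" to dih_url would request a delta import instead. Note that this sketch does not retry on another node if the chosen one is down; that would need extra handling, or the ZooKeeper-based coordination mentioned above.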