My initial thought was to use the scheduling described on the DIH wiki:
http://wiki.apache.org/solr/DataImportHandler#Scheduling

But I think just a cron job should do the same for me.
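
For the archive, here is a minimal sketch of what the cron-driven approach could look like. It assumes the handler is registered at /dataimport in solrconfig.xml and that Solr listens on localhost:8983; the script name, the collection name, and the helper function names are hypothetical and only for illustration.

```python
#!/usr/bin/env python3
"""Trigger a DataImportHandler import over HTTP; meant to be called from cron."""
import sys
import urllib.parse
import urllib.request

# Assumption: adjust host and port to match one of your Solr nodes.
SOLR_BASE = "http://localhost:8983/solr"


def build_dih_url(base, collection, command="full-import", clean=False):
    """Return the URL that asks the /dataimport handler to run a command."""
    params = urllib.parse.urlencode({
        "command": command,                      # full-import, delta-import, status
        "clean": "true" if clean else "false",   # whether to wipe the index first
        "commit": "true",
        "wt": "json",
    })
    return f"{base}/{collection}/dataimport?{params}"


def trigger_import(url, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def main(collection):
    """Start the import; return a shell-style exit code for cron to log."""
    try:
        print(trigger_import(build_dih_url(SOLR_BASE, collection)))
        return 0
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        print(f"DIH request failed: {exc}", file=sys.stderr)
        return 1


# Only runs when a collection name is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

A crontab entry along these lines would then run it at 02:30, Monday through Friday (the paths and the time are placeholders):

    30 2 * * 1-5 /usr/bin/python3 /opt/scripts/run_dih.py mycollection

The request only starts the import; checking the result would need a later call with command=status.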

Thanks

On Tue, Sep 1, 2015 at 8:51 AM, Davis, Daniel (NIH/NLM) [C] <
daniel.da...@nih.gov> wrote:

> On 8/31/2015 11:26 AM, Troy Edwards wrote:
> > I am having a hard time finding documentation on DataImportHandler
> > scheduling in SolrCloud. Can someone please post a link to that? I
> > have a requirement that the DIH should be initiated at a specific time
> > Monday through Friday.
>
> Troy, is your question how to use scheduled tasks? Shawn pointed you in
> the right direction. I thought it more likely that you want to schedule a
> cron task to run on any of your servers running SolrCloud, and you want the
> job to run even if the cluster is degraded.
>
> Here's an idea - schedule your job Monday on node 1, Tuesday on node 2,
> etc. That way, if the cluster is degraded (a node is down),
> re-indexing/delta indexing still happens, it just happens slower. You
> can certainly write a zookeeper client to make each cron job compete to see
> who does the job - questions on how to do this should be directed to a
> zookeeper users' mailing list.
>
> -----Original Message-----
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Monday, August 31, 2015 7:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: DataImportHandler scheduling
>
> On 8/31/2015 11:26 AM, Troy Edwards wrote:
> > I am having a hard time finding documentation on DataImportHandler
> > scheduling in SolrCloud. Can someone please post a link to that? I
> > have a requirement that the DIH should be initiated at a specific time
> > Monday through Friday.
>
> Every modern operating system (and most of the previous versions of every
> modern OS) has a built-in task scheduling system.  For Windows, it's
> literally called Task Scheduler.  For most other operating systems, it's
> called cron.
>
> Including dataimport scheduling capability in Solr has been discussed, and
> I think someone even wrote a working version ... but since every OS already
> has scheduling capability that has had years of time to mature, why should
> Solr reinvent the wheel and take the risk that the implementation will have
> bugs?
>
> Currently virtually all updates to Solr's index must be initiated outside
> of Solr, and there is good reason to make sure that Solr doesn't ever
> modify the index without outside input.  The only thing I know of right now
> that can update the index automatically is Document Expiration, but the
> expiration time is decided when the document is indexed, and the original
> indexing action is external to Solr.
>
> https://lucidworks.com/blog/document-expiration/
>
> Thanks,
> Shawn
>
>
