Hi,

2014-03-06 15:46 GMT+01:00 Jukka Zitting <jukka.zitt...@gmail.com>:

> Hi,
>
> On Thu, Mar 6, 2014 at 9:32 AM, Tommaso Teofili
> <tommaso.teof...@gmail.com> wrote:
> > for my Solr (indexing) resiliency use case [1] I've implemented an
> > extension of the Solr client which caches requests if Solr goes down
> > and replays them once the Solr instance comes back up.
> >
> > Now if the repository goes down during the Solr downtime we lose the
> > cached requests, as they live in memory. We could persist such queued
> > requests as nodes in the repository and fetch them once Solr comes
> > back up, but then they may get indexed themselves, and that would
> > lead to a loop.
> > So I wonder if there's any way we can tell the repository we don't want
> > certain nodes (based on e.g. primaryType and/or path and/or a property
> > existing/missing) to be indexed, whatever an IndexEditor is supposed
> > to do.
>
> Sounds like an XY problem:
>
> X: Ensuring that the Solr index is (eventually) consistent with content
> in the repository even if the Solr server is down at times.
> Y: Excluding certain nodes from being indexed.
>

I'm just interested in X.


>
> There's a much easier solution to X:
>
> The async indexer mechanism keeps track of the last repository
> checkpoint that has been indexed. If you throw an exception during
> indexing when the Solr server goes down, the latest checkpoint
> won't be marked as indexed, and the next iteration of the async
> indexer will restart from the previous checkpoint, recreating
> all the potentially failed Solr indexing requests.
>

OK, so since everything already works like that (the SolrIndexEditor throws a
CommitFailedException when Solr is unreachable), we don't have to do anything,
good :-)
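For anyone following along, the checkpoint-based retry Jukka describes can be
sketched with a minimal, self-contained simulation. The class and method names
below are illustrative only, not Oak's actual API; the point is just that the
checkpoint is advanced only on success, so a failed cycle is replayed in full
on the next run:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Oak's real classes): an async indexer that
// remembers the last checkpoint it successfully indexed up to.
class CheckpointedIndexer {
    private int lastIndexedCheckpoint = 0; // last checkpoint marked as indexed
    private boolean solrUp = true;
    final List<Integer> indexed = new ArrayList<>();

    void setSolrUp(boolean up) { solrUp = up; }

    // One async-indexer cycle: index every revision between the last
    // successful checkpoint and the current head. If Solr is down we throw
    // (standing in for CommitFailedException), so lastIndexedCheckpoint is
    // NOT advanced and the next cycle replays the same revisions.
    void runCycle(int headCheckpoint) {
        List<Integer> batch = new ArrayList<>();
        for (int rev = lastIndexedCheckpoint + 1; rev <= headCheckpoint; rev++) {
            if (!solrUp) {
                throw new IllegalStateException("Solr unreachable");
            }
            batch.add(rev);
        }
        indexed.addAll(batch);              // only reached on success
        lastIndexedCheckpoint = headCheckpoint;
    }

    int lastIndexed() { return lastIndexedCheckpoint; }
}
```

Running one cycle with Solr down leaves the checkpoint where it was, and the
next successful cycle picks up all the revisions the failed cycle missed, so
nothing needs to be persisted separately.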

Regards,
Tommaso


>
> BR,
>
> Jukka Zitting
>
