Hello,

It is a DocumentNodeStore-based instance. We don't extract data from binary
files; we only index metadata stored on nodes.

Regards.

On Wed, Jun 7, 2017 at 7:04 AM, Chetan Mehrotra <[email protected]>
wrote:

> > I'm not sure how to minimize the impact of performing a re-index (or new
> > index creation) that will take 48 hours (using Oak 1.4). I mean, I don't
> > want to block updates to other indexes.
>
> Is this a SegmentNodeStore-based setup or a DocumentNodeStore-based one?
>
> The reindexing log would have some stats on the time spent in indexing
> and the time spent in text extraction. Can you check which part takes
> the most time? If it's text extraction, you can reduce the time spent
> there by using the Pre-Extraction support [1]. This allows extracting
> the text beforehand and then using it at the time of actual indexing.
>
> Changing the "indexing lane" should help, but it is tricky to get right
> and is something we are currently improving (OAK-6246 and OAK-5553).
>
> > indexes won't be updated. On the other hand, it seems that using the
> > *reindex-async* flag (see OAK-1456
> > <https://issues.apache.org/jira/browse/OAK-1456>) could do the trick. I
>
> This mode is useful for property indexes: in the end it removes the
> async flag and makes the index synchronous, which would cause issues
> for a Lucene-based index.
>
> Chetan Mehrotra
> [1] https://jackrabbit.apache.org/oak/docs/query/lucene.html#text-extraction
>
>
> On Tue, Jun 6, 2017 at 9:02 PM, Alvaro Cabrerizo <[email protected]>
> wrote:
> > Hello,
> >
> > I'm not sure how to minimize the impact of performing a re-index (or new
> > index creation) that will take 48 hours (using Oak 1.4). I mean, I don't
> > want to block updates to other indexes.
> >
> > First, we have set the value of async to *fulltext-async* for the new
> > index. I guess that, at least, all the indexes managed by the *async* lane
> > <http://jackrabbit.apache.org/oak/docs/query/indexing.html#indexing-lane>
> > will not be affected (please confirm if I'm right). Then we try to
> > minimize the impact on the fulltext-async lane. According to OAK-5553
> > <https://issues.apache.org/jira/browse/OAK-5553> there isn't much we can do
> > while the indexing process is active for the new index, as the rest of the
> > indexes won't be updated. On the other hand, it seems that using the
> > *reindex-async* flag (see OAK-1456
> > <https://issues.apache.org/jira/browse/OAK-1456>) could do the trick. I
> > mean, will setting reindex-async=true on the new index allow other indexes
> > (in the same lane) to be updated while it is being populated? If that is
> > true, we could create the index with that flag and then remove it.
> >
> > Regards.
>
