Re: Oak Indexing. Was Re: Property index replacement / evolution

2016-08-11 Thread Ian Boston
Hi,

On 11 August 2016 at 13:03, Chetan Mehrotra 
wrote:

> On Thu, Aug 11, 2016 at 5:19 PM, Ian Boston  wrote:
> > correct.
> > Documents are sharded by ID so all updates hit the same shard.
> > That may result in network traffic if the shard is not local.
>
> Focusing on the ordering part as that is the most critical aspect compared
> to the others. (Backup and restore with a sharded index is a separate
> problem, to be discussed later.)
>
> So even if there is a single master for a given path, how would it
> order the changes, given that local changes only give a partial view of
> the end state?
>

In theory, the index should be driven by the eventual consistency of the
source repository, eventually reaching the same consistent state, and
updating on each state change. That probably means the queue should only
contain pointers to Documents and only index the Document as retrieved. I
don't know if that can ever work.
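A minimal sketch of that idea, assuming a hypothetical DocumentStore lookup
and EsIndexer facade (neither is an existing Oak or ES API): the queue holds
only document IDs, and the indexer re-reads the document at index time so the
indexed state converges on whatever the repository eventually settles on.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: queue document IDs (pointers), never document state.
public class PointerIndexQueue implements Runnable {

    /** Hypothetical lookup into the repository's document store. */
    public interface DocumentStore {
        IndexableDocument read(String id); // latest visible state, or null if gone
    }

    /** Hypothetical indexer facade over the ES client. */
    public interface EsIndexer {
        void index(IndexableDocument doc);
        void delete(String id);
    }

    public interface IndexableDocument { String getId(); }

    private final BlockingQueue<String> ids = new LinkedBlockingQueue<>();
    private final DocumentStore store;
    private final EsIndexer indexer;

    public PointerIndexQueue(DocumentStore store, EsIndexer indexer) {
        this.store = store;
        this.indexer = indexer;
    }

    /** Called on commit: only the pointer is queued, not the payload. */
    public void changed(String id) {
        ids.offer(id);
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                String id = ids.take();
                // Re-read at index time, so the index reflects the state the
                // repository has converged to, not the state at commit time.
                IndexableDocument doc = store.read(id);
                if (doc == null) {
                    indexer.delete(id);
                } else {
                    indexer.index(doc);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}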


>
> Also, in such a setup would each query need to consider multiple shards
> for the final result, or would each node "eventually" sync index changes
> from other nodes (complete replication) so that a query would only use the
> local index?
>
> For me, ensuring consistency between how index updates are sent to ES and
> Oak's view of changes was the blocking feature for enabling
> parallelization of the indexing process. It needs to be ensured that for
> concurrent commits the end result in the index is in sync with the
> repository state.
>

Agreed; the same has blocked me on various attempts.


>
> The current single-threaded async index update avoids all such race conditions.
>

Perhaps this is the "root" of the problem. The only way to index Oak
consistently is with a single thread globally, as is done now.

That's still possible with ES.
Run a single thread on the master that indexes into a co-located ES
cluster.
If the full text extraction is distributed, then the master only needs the
resources to write the local shard.
It's not as good as parallelising the queue, but given the structure of Oak
it might be the only way.
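A rough sketch of that arrangement (the lease check and index cycle are
placeholder interfaces, not existing Oak APIs): one global indexing thread,
active only on the node that currently holds the indexing lease.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative only: a single indexing thread, active on one node at a time.
public class MasterOnlyIndexer {

    /** Hypothetical lease check: true only on the elected indexing master. */
    public interface IndexingLease { boolean held(); }

    /** Hypothetical single index pass (diff since last state, push to the local ES shard). */
    public interface IndexCycle { void runOnce(); }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start(IndexingLease lease, IndexCycle cycle) {
        // One global thread: cycles only run while this node holds the lease,
        // preserving the sequential-update guarantee of the current design.
        scheduler.scheduleWithFixedDelay(() -> {
            if (lease.held()) {
                cycle.runOnce();
            }
        }, 0, 5, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}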

Even so, future revisions will be in the index long before Oak has synced
the root document.

The current implementation doesn't have to think about this as the indexing
is single threaded globally *and* each segment update is committed first by a
hard Lucene commit and second by a root document sync, guaranteeing the
sequential update nature.

BTW, how does Hybrid manage to parallelise the indexing and maintain
consistency?

Best Regards
Ian



>
> Chetan Mehrotra
>


Re: Oak Indexing. Was Re: Property index replacement / evolution

2016-08-11 Thread Ian Boston
On 11 August 2016 at 11:10, Chetan Mehrotra 
wrote:

> On Thu, Aug 11, 2016 at 3:03 PM, Ian Boston  wrote:
> > Both Solr Cloud and ES address this by sharding and
> > replicating the indexes, so that all commits are soft, instant and real
> > time. That introduces problems.
> ...
> > Both Solr Cloud and ES address this by sharding and
> > replicating the indexes, so that all commits are soft, instant and real
> > time.
>
> This would really be useful. However I have couple of aspects to clear
>
> Index Update Guarantee
> 
>
> Let's say a commit succeeds, then we update the index and the index
> update fails for some reason. Would that update be missed, or can there
> be some mechanism to recover? I am not very sure about the WAL here;
> it may be the answer, but I am still confirming.
>

For ES (I don't know how the Solr Cloud WAL behaves):
The update isn't accepted until it's written to the WAL, so if something
fails before that, then it's up to how the queue of updates is managed,
which is client side.
If it's written to the WAL, whatever happens it will be indexed eventually,
provided the WAL is available. Think of the WAL as equivalent to the Oak
Journal, IIUC. The WAL is present on all replicas, so provided 1 replica of
the shard is available, no data is lost.
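A hedged sketch of the client-side handling implied here (the EsIndexer call
and retry policy are illustrative, not a real ES client API): an update that
fails before being acknowledged, i.e. before reaching the WAL, stays in the
client-side queue and is retried on a later pass.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative client-side queue: nothing is dropped until ES acknowledges
// the update, which (per the discussion above) implies it reached the WAL.
public class AcknowledgedIndexQueue {

    /** Hypothetical facade over the ES client; returns true once acknowledged. */
    public interface EsIndexer {
        boolean tryIndex(String id, String json);
    }

    private final BlockingQueue<String[]> pending = new LinkedBlockingQueue<>();
    private final EsIndexer indexer;

    public AcknowledgedIndexQueue(EsIndexer indexer) {
        this.indexer = indexer;
    }

    public void submit(String id, String json) {
        pending.offer(new String[] { id, json });
    }

    /** One retry pass; failed updates stay queued for the next pass. */
    public void drainOnce() throws InterruptedException {
        int toProcess = pending.size();
        for (int i = 0; i < toProcess; i++) {
            String[] update = pending.take();
            if (!indexer.tryIndex(update[0], update[1])) {
                // Not acknowledged (so not in the WAL): keep it client side
                // and retry later rather than losing it.
                pending.offer(update);
            }
        }
    }
}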




>
> In Oak, with the way the async index update works based on checkpoints,
> it is ensured that the index would "eventually" contain the right data and
> no update would be lost. If there is a failure in an index update then that
> cycle would fail and the next cycle would start again from the same base
> state.
>

Sounds like the same level of guarantee, depending on how the client side is
implemented. Typically I didn't bother with a queue between the application
and the ES client because the ES client was so fast.


>
> Order of index update
> -
>
> Let's say I have 2 cluster nodes where the same node is being updated
>
> Original state /a {x:1}
>
> Cluster Node N1 - /a {x:1, y:2}
> Cluster Node N2 - /a {x:1, z:3}
>
> End State /a {x:1, y:2, z:3}
>
> At the Oak level both commits would succeed as there is no conflict.
> However N1 and N2 would not see each other's updates immediately;
> that would depend on the background read. So in this case what would the
> index update look like?
>
> 1. Would index updates for specific paths go to some master which would
> order the updates?
>

Correct.
Documents are sharded by ID so all updates hit the same shard.
That may result in network traffic if the shard is not local.
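For illustration, ES places a document on a shard derived from its routing
key, which defaults to the document ID (shard = hash(routing) % number of
primary shards). The hash below is a stand-in, not ES's actual Murmur3
implementation.

// Illustrative routing: all updates to the same ID land on the same shard.
public final class ShardRouting {

    public static int shardFor(String documentId, int numberOfPrimaryShards) {
        int hash = documentId.hashCode() & 0x7fffffff; // non-negative stand-in hash
        return hash % numberOfPrimaryShards;
    }

    public static void main(String[] args) {
        // Updates to /a from any Oak cluster node hit the same shard,
        // which may or may not be local to the node that made the change.
        System.out.println(shardFor("/a", 5));
    }
}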



> 2. Or would it end up with either of {x:1, y:2} or {x:1, z:3}?
>
> Here the current async index update logic ensures that it sees the
> eventually expected order of changes and hence would be consistent
> with the repository state.


> Backup and Restore
> ---
>
> Would the backup now involve backing up ES index files from each
> cluster node? Or, assuming full replication, would it involve backing up
> files from any one of the nodes? Would the backup be in sync with the last
> changes done in the repository (assuming a sudden shutdown where changes got
> committed to the repository but not yet to any index)?
>
> Here the current approach of storing index files as part of MVCC storage
> ensures that the index state is consistent with some "checkpointed" state
> in the repository. And post restart it would eventually catch up with the
> current repository state and hence would not require a complete rebuild
> of the index in case of unclean shutdowns.
>

If the revision is present in the document, then I assume it can be
filtered at query time.
However, there may be problems here, as one might have to find some way of
indexing the revision history of a document like the format in
MongoDB... I did wonder if a better solution was to use ES as the primary
storage; then all the property indexes would be present by default with no
need for any Lucene index plugin, but I stopped thinking about that
with the 1s root document sync as my interest was real time.
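A hedged illustration of the query-time filtering idea (the field names and
interfaces are made up for this sketch, not an existing Oak or ES schema):
each indexed document carries the revision it was indexed at, and queries
drop hits whose revision is not yet visible to the session.

import java.util.ArrayList;
import java.util.List;

// Illustrative only: filter index hits by revision visibility at query time.
public class RevisionFilteredQuery {

    /** Hypothetical index hit: a path plus the revision it was indexed at. */
    public static class Hit {
        final String path;
        final long revision;

        public Hit(String path, long revision) {
            this.path = path;
            this.revision = revision;
        }
    }

    /** Hypothetical visibility check backed by the session's head revision. */
    public interface RevisionVisibility {
        boolean isVisible(long revision);
    }

    public List<Hit> filter(List<Hit> rawHits, RevisionVisibility visibility) {
        // Hits indexed at revisions the session cannot yet see are dropped,
        // so the index can run ahead of the synced root document without
        // exposing future state to the caller.
        List<Hit> visible = new ArrayList<>();
        for (Hit h : rawHits) {
            if (visibility.isVisible(h.revision)) {
                visible.add(h);
            }
        }
        return visible;
    }
}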

Best Regards
Ian


>
>
> Chetan Mehrotra
>


Re: Oak Indexing. Was Re: Property index replacement / evolution

2016-08-11 Thread Chetan Mehrotra
On Thu, Aug 11, 2016 at 3:03 PM, Ian Boston  wrote:
> Both Solr Cloud and ES address this by sharding and
> replicating the indexes, so that all commits are soft, instant and real
> time. That introduces problems.
...
> Both Solr Cloud and ES address this by sharding and
> replicating the indexes, so that all commits are soft, instant and real
> time.

This would really be useful. However I have couple of aspects to clear

Index Update Guarantee


Let's say a commit succeeds, then we update the index and the index
update fails for some reason. Would that update be missed, or can there
be some mechanism to recover? I am not very sure about the WAL here;
it may be the answer, but I am still confirming.

In Oak, with the way the async index update works based on checkpoints,
it is ensured that the index would "eventually" contain the right data and
no update would be lost. If there is a failure in an index update then that
cycle would fail and the next cycle would start again from the same base
state.
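A rough sketch of that cycle as described (the method names such as
checkpoint, diff and apply are placeholders, not the actual Oak
AsyncIndexUpdate API): each run diffs from the last successful checkpoint to
a fresh one, and a failed run leaves the base checkpoint in place so the next
run retries from the same state.

// Illustrative checkpoint-driven async index cycle (placeholder API names).
public class CheckpointedIndexCycle {

    /** Hypothetical repository facade. */
    public interface Repository {
        String checkpoint();                            // snapshot of current state
        void release(String checkpoint);                // discard an old snapshot
        Iterable<String> diff(String from, String to);  // changed paths between snapshots
    }

    /** Hypothetical index writer. */
    public interface IndexWriter {
        void apply(Iterable<String> changedPaths) throws Exception;
    }

    private final Repository repo;
    private final IndexWriter writer;
    private String baseCheckpoint; // last successfully indexed state

    public CheckpointedIndexCycle(Repository repo, IndexWriter writer, String initial) {
        this.repo = repo;
        this.writer = writer;
        this.baseCheckpoint = initial;
    }

    public void runOnce() {
        String head = repo.checkpoint();
        try {
            writer.apply(repo.diff(baseCheckpoint, head));
            // Only on success does the base move forward; no update is lost.
            repo.release(baseCheckpoint);
            baseCheckpoint = head;
        } catch (Exception e) {
            // Failed cycle: drop the new checkpoint and retry from the
            // same base state on the next run.
            repo.release(head);
        }
    }
}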

Order of index update
-

Let's say I have 2 cluster nodes where the same node is being updated

Original state /a {x:1}

Cluster Node N1 - /a {x:1, y:2}
Cluster Node N2 - /a {x:1, z:3}

End State /a {x:1, y:2, z:3}

At the Oak level both commits would succeed as there is no conflict.
However N1 and N2 would not see each other's updates immediately;
that would depend on the background read. So in this case what would the
index update look like?

1. Would index updates for specific paths go to some master which would
order the updates?
2. Or would it end up with either of {x:1, y:2} or {x:1, z:3}?

Here the current async index update logic ensures that it sees the
eventually expected order of changes and hence would be consistent
with the repository state.

Backup and Restore
---

Would the backup now involve backing up ES index files from each
cluster node? Or, assuming full replication, would it involve backing up
files from any one of the nodes? Would the backup be in sync with the last
changes done in the repository (assuming a sudden shutdown where changes got
committed to the repository but not yet to any index)?

Here the current approach of storing index files as part of MVCC storage
ensures that the index state is consistent with some "checkpointed" state
in the repository. And post restart it would eventually catch up with the
current repository state and hence would not require a complete rebuild
of the index in case of unclean shutdowns.


Chetan Mehrotra


Re: Oak Indexing. Was Re: Property index replacement / evolution

2016-08-11 Thread Ian Boston
Hi,

There is no need to have several different plugins to deal with the
standalone, small-scale cluster, and large-scale cluster deployments. It
might be desirable for some reason, but it's not necessary.

I have pushed the code I was working on before I got distracted to a GitHub
repo. [1] is where the co-located ES cluster starts. If the
property es-server-url is defined, an external ES cluster is used.

The repo is WIP and incomplete, and you will see 2 attempts to port the
Lucene plugin; take2 is the second. As I said, I stopped when it became
apparent there was a 1s latency imposed by Oak. I think you enlightened me
to that behavior on oak-dev.

I don't know how to co-locate a Solr Cloud cluster in the same way, given it
needs ZooKeeper. (I don't know enough about Solr Cloud TBH.)
If Oak can't stomach using ES as a library, it could, with enough time
and resources, re-implement the pattern or something close.
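A hedged sketch of the configuration switch described above (the class and
method names are placeholders, not the actual code in the linked repo): when
es-server-url is set the plugin talks to an external cluster, otherwise it
starts a co-located node alongside Oak.

import java.util.Properties;

// Illustrative configuration switch (placeholder names, not the repo's API).
public class EsServerSelector {

    /** Hypothetical handle onto either an embedded node or a remote cluster. */
    public interface EsEndpoint {
        String description();
    }

    public EsEndpoint select(Properties config) {
        String url = config.getProperty("es-server-url");
        if (url != null && !url.isEmpty()) {
            // es-server-url defined: use the external ES cluster.
            return () -> "external ES cluster at " + url;
        }
        // No URL configured: fall back to a co-located node so small
        // deployments need no third-party infrastructure.
        return () -> "co-located ES node started in-process";
    }
}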

Best Regards
Ian

1
https://github.com/ieb/oak-es/blob/master/src/main/java/org/apache/jackrabbit/oak/plusing/index/es/index/ESServer.java#L27

On 11 August 2016 at 09:58, Chetan Mehrotra 
wrote:

> A couple of points around the motivation and target use cases of Hybrid
> Indexing and Oak indexing in general.
>
> Based on my understanding of various deployments, any application
> based on Oak has 2 types of query requirements:
>
> QR1. Application Query - These mostly involve some property
> restrictions and are invoked by code itself to perform some operation.
> The properties involved here in most cases would be sparse, i.e. present
> in a small subset of the whole repository content. Such queries need to be
> very fast and they might be invoked very frequently. Such queries
> should also be more accurate, and results should not lag the repository
> state much.
>
> QR2. User provided query - These queries would consist of either or
> both of property restrictions and fulltext constraints. The target
> nodes may form the majority of the overall repository content. Such
> queries need to be fast, but being user driven they need not be very fast.
>
> Note that the speed criteria are very subjective and relative here.
>
> Further, Oak needs to support deployments:
>
> 1. On a single setup - for dev, prod on SegmentNodeStore
> 2. Cluster setup on premise
> 3. Deployment in some data center
>
> So Oak should enable deployments where smaller setups do not
> require any third-party system, while still allowing plugging in a dedicated
> system like ES/Solr if the need arises. So both use cases need to be
> supported.
>
> And further, even if it has access to such a third-party server it might
> be fine to rely on embedded Lucene for #QR1 and just delegate queries
> under #QR2 to the remote system. This would ensure that query results are
> still fast for usage falling under #QR1.
>
> Hybrid Index Usecase
> -
>
> So far for #QR1 we only had property indexes and, to an extent, the Lucene
> based property index, where results lag the repository state and the lag
> might be significant depending on load.
>
> Hybrid indexes aim to support queries under #QR1 and can be seen as a
> replacement for existing non-unique property indexes. Such indexes
> would have lower storage requirements and would not put much load on
> remote storage for execution. It's not meant as a replacement for
> ES/Solr but rather intends to address a different type of usage.
>
> Very large Indexes
> -
>
> For deployments with a very large repository, Solr or ES based indexes
> would be preferable, and there oak-solr can be used (some day oak-es!).
>
> So in brief, Oak should be self-sufficient for smaller deployments and
> still allow plugging in Solr/ES for large deployments, and there also
> provide a choice to the admin to configure a subset of indexes for such
> usage depending on the size.
>
>
>
>
>
>
> Chetan Mehrotra
>
>
> On Thu, Aug 11, 2016 at 1:59 PM, Ian Boston  wrote:
> > Hi,
> >
> > On 11 August 2016 at 09:14, Michael Marth  wrote:
> >
> >> Hi Ian,
> >>
> >> No worries - good discussion.
> >>
> >> I should point out though that my reply to Davide was based on a
> >> comparison of the current design vs the Jackrabbit 2 design (in which
> >> indexes were stored locally). Maybe I misunderstood Davide’s comment.
> >>
> >> I will split my answer to your mail in 2 parts:
> >>
> >>
> >> >
> >> >Full text extraction should be separated from indexing, as the DS blobs
> >> are
> >> >immutable, so is the full text. There is code to do this in the Oak
> >> >indexer, but it's not used to write to the DS at present. It should be
> >> done
> >> >in a Job, distributed to all nodes, run only once per item. Full text
> >> >extraction is hugely expensive.
> >>
> >> My understanding is that Oak currently:
> >> A) runs full text extraction in a separate thread (separate from the
> >> “other” indexer)
> >> B) runs it only once per cluster
> >> If that is correct then the difference to what you mention above would
> be
> >> that 

Re: Oak Indexing. Was Re: Property index replacement / evolution

2016-08-11 Thread Chetan Mehrotra
A couple of points around the motivation and target use cases of Hybrid
Indexing and Oak indexing in general.

Based on my understanding of various deployments, any application
based on Oak has 2 types of query requirements:

QR1. Application Query - These mostly involve some property
restrictions and are invoked by code itself to perform some operation.
The properties involved here in most cases would be sparse, i.e. present
in a small subset of the whole repository content. Such queries need to be
very fast and they might be invoked very frequently. Such queries
should also be more accurate, and results should not lag the repository
state much.

QR2. User provided query - These queries would consist of either or
both of property restrictions and fulltext constraints. The target
nodes may form the majority of the overall repository content. Such
queries need to be fast, but being user driven they need not be very fast.

Note that the speed criteria are very subjective and relative here.

Further, Oak needs to support deployments:

1. On a single setup - for dev, prod on SegmentNodeStore
2. Cluster setup on premise
3. Deployment in some data center

So Oak should enable deployments where smaller setups do not
require any third-party system, while still allowing plugging in a dedicated
system like ES/Solr if the need arises. So both use cases need to be
supported.

And further, even if it has access to such a third-party server it might
be fine to rely on embedded Lucene for #QR1 and just delegate queries
under #QR2 to the remote system, as sketched below. This would ensure that
query results are still fast for usage falling under #QR1.
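A hedged sketch of that delegation (names and the query description are
illustrative placeholders, not the Oak query engine API): property-only
application queries stay on the embedded/local index, user-driven or fulltext
queries go to the remote ES/Solr index.

// Illustrative router for the QR1/QR2 split described above.
public class QueryRouter {

    public enum Target { LOCAL_LUCENE, REMOTE_ES }

    /** Hypothetical minimal description of a parsed query. */
    public static class ParsedQuery {
        final boolean hasFulltextConstraint;
        final boolean userDriven;

        public ParsedQuery(boolean hasFulltextConstraint, boolean userDriven) {
            this.hasFulltextConstraint = hasFulltextConstraint;
            this.userDriven = userDriven;
        }
    }

    public Target route(ParsedQuery q) {
        // QR2: user-provided queries, possibly with fulltext constraints,
        // are delegated to the remote ES/Solr index.
        if (q.hasFulltextConstraint || q.userDriven) {
            return Target.REMOTE_ES;
        }
        // QR1: sparse property restrictions issued by application code stay
        // on the local index so results are fast and lag little.
        return Target.LOCAL_LUCENE;
    }
}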

Hybrid Index Usecase
-

So far for #QR1 we only had property indexes and, to an extent, the Lucene
based property index, where results lag the repository state and the lag
might be significant depending on load.

Hybrid indexes aim to support queries under #QR1 and can be seen as a
replacement for existing non-unique property indexes. Such indexes
would have lower storage requirements and would not put much load on
remote storage for execution. It's not meant as a replacement for
ES/Solr but rather intends to address a different type of usage.

Very large Indexes
-

For deployments with a very large repository, Solr or ES based indexes
would be preferable, and there oak-solr can be used (some day oak-es!).

So in brief, Oak should be self-sufficient for smaller deployments and
still allow plugging in Solr/ES for large deployments, and there also
provide a choice to the admin to configure a subset of indexes for such
usage depending on the size.






Chetan Mehrotra


On Thu, Aug 11, 2016 at 1:59 PM, Ian Boston  wrote:
> Hi,
>
> On 11 August 2016 at 09:14, Michael Marth  wrote:
>
>> Hi Ian,
>>
>> No worries - good discussion.
>>
>> I should point out though that my reply to Davide was based on a
>> comparison of the current design vs the Jackrabbit 2 design (in which
>> indexes were stored locally). Maybe I misunderstood Davide’s comment.
>>
>> I will split my answer to your mail in 2 parts:
>>
>>
>> >
>> >Full text extraction should be separated from indexing, as the DS blobs
>> are
>> >immutable, so is the full text. There is code to do this in the Oak
>> >indexer, but it's not used to write to the DS at present. It should be
>> done
>> >in a Job, distributed to all nodes, run only once per item. Full text
>> >extraction is hugely expensive.
>>
>> My understanding is that Oak currently:
>> A) runs full text extraction in a separate thread (separate from the
>> “other” indexer)
>> B) runs it only once per cluster
>> If that is correct then the difference to what you mention above would be
>> that you would like the FT indexing not be pinned to one instance but
>> rather be distributed, say round-robin.
>> Right?
>>
>
>
> Yes.
>
>
>>
>>
>> >Building the same index on every node doesn't scale for the reasons you
>> >point out, and eventually hits a brick wall.
>> >http://lucene.apache.org/core/6_1_0/core/org/apache/
>> lucene/codecs/lucene60/package-summary.html#Limitations.
>> >(Int32 on Document ID per index). One of the reasons for the Hybrid
>> >approach was the number of Oak documents in some repositories will exceed
>> >that limit.
>>
>> I am not sure what you are arguing for with this comment…
>> It sounds like an argument in favour of the current design - which is
>> probably not what you mean… Could you explain, please?
>>
>
> I didn't communicate that very well.
>
> Currently Lucene (6.1) has a limit of Int32 on the number of documents it
> can store in an index, IIUC. There is a long term desire to increase that
> to Int64, but no long term commitment, as it's probably significant
> work given arrays in Java are indexed with Int32.
>
> The Hybrid approach doesn't help the potential Lucene brick wall, but one
> motivation for looking at it was the number of Oak Documents including
> those under /oak:index which is, in some cases, approaching that limit.
>
>
>
>>
>>
>> Thanks!
>> Michael