Hi,

There are different ways to approach this in Oak.

Your application can register an event listener and get notified about changes 
once they are visible on the local cluster node.
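
To illustrate the register-and-notify pattern, here is a minimal, self-contained sketch. It deliberately does not depend on javax.jcr, so LocalNode and ChangeListener are simplified stand-ins for the repository and listener types; in a real deployment you would register the listener through the JCR observation API instead, as noted in the comment.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch of the listener approach. In a real Oak deployment
// you would use the JCR observation API instead, roughly:
//   session.getWorkspace().getObservationManager()
//          .addEventListener(listener, eventTypes, path, isDeep, null, null, false);
// The types below are simplified stand-ins, not Oak or JCR classes.
public class ListenerSketch {

    /** Stand-in for javax.jcr.observation.EventListener (which receives an EventIterator). */
    interface ChangeListener {
        void onEvent(String path);
    }

    /** Stand-in for the repository on the local cluster node. */
    static class LocalNode {
        private final List<ChangeListener> listeners = new ArrayList<>();

        void addEventListener(ChangeListener l) {
            listeners.add(l);
        }

        /** Called once a change has become visible on this node. */
        void changeBecameVisible(String path) {
            for (ChangeListener l : listeners) {
                l.onEvent(path);
            }
        }
    }

    public static void main(String[] args) {
        LocalNode node = new LocalNode();
        List<String> processed = new ArrayList<>();
        // The job only runs once the change is visible locally.
        node.addEventListener(processed::add);
        node.changeBecameVisible("/content/uploads/image-1.png");
        System.out.println(processed);
    }
}
```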

The application can store a visibility token with the job data you have in 
Kafka. The visibility token concept is described on the Clusterable [0] 
interface, an extension of the NodeStore that is implemented by the 
DocumentNodeStore. On the processing cluster node, the visibility token is 
then used to suspend the job until the changes are visible.
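
As a concrete illustration, here is a minimal, self-contained sketch of that suspend-until-visible pattern. It does not use the real Oak classes: VisibilitySource is a hypothetical stand-in for the Clusterable methods the pattern relies on (getVisibilityToken() and isVisible(token, maxWaitMillis)), and the revision counter merely simulates changes becoming visible on a node.

```java
// Sketch of the visibility-token pattern. VisibilitySource is a hypothetical
// stand-in for org.apache.jackrabbit.oak.spi.state.Clusterable; the revision
// counter simulates the background sync that makes changes visible locally.
public class VisibilityTokenSketch {

    /** Stand-in for the Clusterable methods used by the pattern. */
    interface VisibilitySource {
        String getVisibilityToken();
        boolean isVisible(String token, long maxWaitMillis) throws InterruptedException;
    }

    /** Simulated node: changes become visible once the local revision catches up. */
    static class SimulatedNode implements VisibilitySource {
        private volatile long localRevision;

        void advanceTo(long revision) {
            localRevision = revision;
        }

        @Override
        public String getVisibilityToken() {
            return Long.toString(localRevision);
        }

        @Override
        public boolean isVisible(String token, long maxWaitMillis) throws InterruptedException {
            long deadline = System.currentTimeMillis() + maxWaitMillis;
            while (Long.parseLong(token) > localRevision) {
                if (System.currentTimeMillis() >= deadline) {
                    return false; // changes did not become visible in time
                }
                Thread.sleep(10); // in Oak this waits on background reads instead
            }
            return true;
        }
    }

    /** Suspend the job until the producer's changes are visible locally. */
    static boolean processWhenVisible(VisibilitySource local, String token)
            throws InterruptedException {
        return local.isVisible(token, 5_000);
    }

    public static void main(String[] args) throws InterruptedException {
        SimulatedNode producer = new SimulatedNode();
        producer.advanceTo(42);                       // writer commits change 42
        String token = producer.getVisibilityToken(); // token travels with the Kafka message

        SimulatedNode consumer = new SimulatedNode(); // different cluster node, still at rev 0
        new Thread(() -> consumer.advanceTo(42)).start(); // background sync catches up
        System.out.println(processWhenVisible(consumer, token));
    }
}
```

The consumer blocks in isVisible() until its local view has caught up to the token, which is exactly the "suspend the job until the changes are visible" step described above.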

Regards
 Marcel

[0] 
https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/spi/state/Clusterable.html


On 15.12.18, 02:23, "ems eril" <emsro...@gmail.com> wrote:

    Hi Matt ,
    
      Yes, you're correct: the job is triggered by a consumer listening to a
    Kafka queue. But I have to disagree with your earlier statement that this
    is not an Oak issue. In Mongo you can control the write concern and make
    replication synchronous, but we cannot do something similar in Oak.
    
    Thanks
    
    On Fri, Dec 14, 2018 at 3:25 PM Matt Ryan <mattr...@apache.org> wrote:
    
    > Hi,
    >
    > I believe your concern is:  Content could be uploaded to the cluster via
    > one Oak instance, and your job to process the content runs in a different
    > Oak instance, and that there is a possibility that the job to process the
    > content reads from a MongoDB node that has stale data, so the content is
    > not available yet.
    >
    > If I've understood your concern correctly, you are correct that this is
    > something you have to worry about, that there is a possibility that when
    > the job runs it gets stale data because where it reads from has not been
    > updated yet.  However, that's not something being caused by Oak; this
    > would be something you'd have to deal with whether Oak was there or not, no
    > matter what type of backing database cluster was being used.
    >
    > Maybe I'm still missing something in your question.  How are you planning
    > to trigger your job?
    >
    >
    >
    > On Fri, Dec 14, 2018 at 1:01 PM ems eril <emsro...@gmail.com> wrote:
    >
    > > Hi Matt ,
    > >
    > >    I was looking for more details on the inner workings.  I came across
    > > this https://markmail.org/message/jbkrsmz3krllqghr where it mentioned
    > > that changes in the cluster would eventually appear across other nodes
    > > and that this is not a Mongo-specific issue but something Oak has
    > > introduced.  I can set the write concern to majority in Mongo, but if
    > > Oak has its own eventual consistency model this can cause stale reads
    > > from other nodes, which would be a problem with the distributed job I'm
    > > trying to create.
    > >
    > > Thanks
    > >
    > > On Fri, Dec 14, 2018 at 8:02 AM Matt Ryan <mattr...@apache.org> wrote:
    > >
    > > > Hi Emily,
    > > >
    > > > Content is stored in Oak in two different configurable storage
    > > > services.  This is a bit of an oversimplification, but basically the
    > > > structure of the content repository - the content tree, nodes,
    > > > properties, etc. - is stored in a Node Store [0] and the binary
    > > > content is stored in a Blob Store [1] (you'll also sometimes see the
    > > > term "data store").  Oak manages all of this transparently to
    > > > external clients.
    > > >
    > > > Oak clustering is therefore achieved by configuring Oak instances to
    > > > use clusterable storage services underneath [2].  For the node store,
    > > > an implementation of a DocumentNodeStore [3] is needed; one such
    > > > implementation uses MongoDB [4].  For the blob store, an implementation
    > > > of a SharedDataStore is needed.  For example, both the SharedS3DataStore
    > > > and AzureDataStore implementations can be used as a data store for an
    > > > Oak cluster.
    > > >
    > > > So, assume you were using MongoDB and S3.  Setting up an Oak cluster
    > > > then merely means that you have more than one Oak instance, each of
    > > > which is configured to use the MongoDB cluster as the node store, and
    > > > S3 as the data store.
    > > >
    > > >
    > > > [0] -
    > > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/overview.md
    > > > [1] -
    > > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
    > > > [2] -
    > > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/clustering.md
    > > > [3] -
    > > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md
    > > > [4] -
    > > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/document/mongo-document-store.md
    > > >
    > > >
    > > > Does that help?
    > > >
    > > >
    > > > -MR
    > > >
    > > > On Thu, Dec 13, 2018 at 5:52 PM ems eril <emsro...@gmail.com> wrote:
    > > >
    > > > > Hi Team,
    > > > >
    > > > >    I'm really interested in understanding how an Oak cluster works
    > > > > and how cluster nodes sync up.  These are some of the questions I
    > > > > have:
    > > > >
    > > > > 1) How do the nodes sync?
    > > > > 2) What is Mongo's role?
    > > > > 3) How do indexes in a cluster work and sync up?
    > > > > 4) What is the distribution model - master/slave or multi-master?
    > > > > 5) What is coordinated by the master node?
    > > > > 6) How is the master node elected?
    > > > >
    > > > >    One use case I have is to be able to leverage an Oak cluster to
    > > > > upload images/videos and have a consumer on one of the nodes process
    > > > > them in a distributed way.  I'd like to try my best to avoid
    > > > > unnecessary read checks if possible.
    > > > >
    > > > > Thanks
    > > > >
    > > > > Emily
    > > > >
    > > >
    > >
    >
    
