Hi Matt,

Yes, you're correct: the job is triggered by a consumer listening to a
Kafka topic. But I have to disagree with your earlier statement that this
is not an Oak issue. In MongoDB you can control the write concern and make
replication synchronous, but we cannot do something similar in Oak.
Thanks

On Fri, Dec 14, 2018 at 3:25 PM Matt Ryan <mattr...@apache.org> wrote:

> Hi,
>
> I believe your concern is: content could be uploaded to the cluster via
> one Oak instance, your job to process the content runs in a different
> Oak instance, and there is a possibility that the job reads from a
> MongoDB node with stale data, so the content is not available yet.
>
> If I've understood your concern correctly, you are right that this is
> something you have to worry about: when the job runs it may get stale
> data because the node it reads from has not been updated yet. However,
> that's not something caused by Oak; you would have to deal with it
> whether Oak was there or not, no matter what type of backing database
> cluster was being used.
>
> Maybe I'm still missing something in your question. How are you
> planning to trigger your job?
>
> On Fri, Dec 14, 2018 at 1:01 PM ems eril <emsro...@gmail.com> wrote:
>
> > Hi Matt,
> >
> > I was looking for more details on the inner workings. I came across
> > https://markmail.org/message/jbkrsmz3krllqghr, which mentions that
> > changes in the cluster eventually appear across other nodes and that
> > this is not a Mongo-specific issue but something Oak has introduced.
> > I can set the write concern to majority in Mongo, but if Oak has its
> > own eventual consistency model, that can cause stale reads from other
> > nodes, which would be a problem for the distributed job I'm trying
> > to create.
> >
> > Thanks
> >
> > On Fri, Dec 14, 2018 at 8:02 AM Matt Ryan <mattr...@apache.org> wrote:
> >
> > > Hi Emily,
> > >
> > > Content is stored in Oak in two different configurable storage
> > > services. This is a bit of an oversimplification, but basically the
> > > structure of the content repository - the content tree, nodes,
> > > properties, etc. - is stored in a Node Store [0], and the binary
> > > content is stored in a Blob Store [1] (you'll also sometimes see
> > > the term "data store"). Oak manages all of this transparently to
> > > external clients.
> > >
> > > Oak clustering is therefore achieved by configuring Oak instances
> > > to use clusterable storage services underneath [2]. For the node
> > > store, an implementation of a DocumentNodeStore [3] is needed; one
> > > such implementation uses MongoDB [4]. For the blob store, an
> > > implementation of a SharedDataStore is needed. For example, both
> > > the SharedS3DataStore and AzureDataStore implementations can be
> > > used as a data store for an Oak cluster.
> > >
> > > So, assume you were using MongoDB and S3. Setting up an Oak cluster
> > > then merely means that you have more than one Oak instance, each of
> > > which is configured to use the MongoDB cluster as the node store
> > > and S3 as the data store.
> > >
> > > [0] -
> > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/overview.md
> > > [1] -
> > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/plugins/blobstore.md
> > > [2] -
> > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/clustering.md
> > > [3] -
> > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/documentmk.md
> > > [4] -
> > > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/nodestore/document/mongo-document-store.md
> > >
> > > Does that help?
> > >
> > > -MR
> > >
> > > On Thu, Dec 13, 2018 at 5:52 PM ems eril <emsro...@gmail.com> wrote:
> > >
> > > > Hi Team,
> > > >
> > > > I'm really interested in understanding how an Oak cluster works
> > > > and how cluster nodes sync up.
> > > > These are some of the questions I have:
> > > >
> > > > 1) How do the nodes sync?
> > > > 2) What is Mongo's role?
> > > > 3) How do indexes in a cluster work and sync up?
> > > > 4) What is the distributed model - master/slave or multi-master?
> > > > 5) What is coordinated by the master node?
> > > > 6) How is the master node elected?
> > > >
> > > > One use case I have is to leverage an Oak cluster to upload
> > > > images/videos and have a consumer on one of the nodes process
> > > > them in a distributed way. I'd like to avoid unnecessary read
> > > > checks if possible.
> > > >
> > > > Thanks
> > > >
> > > > Emily
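Since Oak's changes only eventually appear on other cluster nodes, one
practical workaround for the Kafka-triggered job is to poll for the
uploaded content with bounded retries before processing it. A minimal,
hypothetical sketch - `lookup`, the path, and the timings are placeholders
standing in for whatever repository read the consumer actually performs,
not Oak API:

```python
import time

def wait_for_content(lookup, path, attempts=5, delay=0.5):
    """Poll lookup(path) until it returns content or retries run out.

    With eventual consistency, the first reads on this node may miss
    content that was written via a different cluster node.
    """
    for attempt in range(attempts):
        content = lookup(path)
        if content is not None:
            return content
        time.sleep(delay * (2 ** attempt))  # exponential backoff
    raise TimeoutError(f"content never appeared at {path!r}")

# Simulated store: content "replicates" to this node after two
# stale reads, then becomes visible on the third.
reads = {"count": 0}
def fake_lookup(path):
    reads["count"] += 1
    return b"image-bytes" if reads["count"] >= 3 else None

print(wait_for_content(fake_lookup, "/content/images/cat.jpg", delay=0.01))
# prints b'image-bytes'
```

This doesn't remove the consistency window, but it keeps the consumer from
failing on content that simply hasn't arrived on its node yet, which may
be the cheapest way to avoid the "unnecessary read checks" Emily mentions.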