Re: Using Jackrabbit in orchestration environment

Clay Ferguson Sat, 24 Jun 2017 10:00:27 -0700

Related to the updating of indexes: I'm working on a P2P capability which
will make a JCR repo behave essentially like a distributed blockchain
database (i.e. a "ledger"), where every node has a full copy of the DB/repo.
One capability required for that, which I've already completed, is a
Merkle-tree-like mechanism that lets me tell whether the full content under
any given subgraph is identical to that on some separate "peer" (network
node), simply by comparing a SHA-256 hash from both nodes (each node being
in a totally independent repository).
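
To make the idea concrete, here is a minimal sketch in Python rather than
against the JCR API. It is not the actual implementation: plain dicts with
hypothetical "props" and "children" keys stand in for JCR nodes, and
children are hashed in name order, which is an assumption (a real repo may
have orderable children or same-name siblings).

```python
import hashlib

def _field(text):
    # Length-prefix each field so adjacent values cannot run together.
    data = text.encode("utf-8")
    return len(data).to_bytes(8, "big") + data

def merkle_hash(node):
    """Return a SHA-256 hex digest covering a node and its whole subgraph."""
    h = hashlib.sha256()
    # Hash the node's own properties in sorted order for determinism.
    for key in sorted(node.get("props", {})):
        h.update(b"P" + _field(key) + _field(str(node["props"][key])))
    # Fold in each child's name and the digest of the child's subgraph.
    for name in sorted(node.get("children", {})):
        child_digest = merkle_hash(node["children"][name])
        h.update(b"C" + _field(name) + bytes.fromhex(child_digest))
    return h.hexdigest()

# Two peers holding the same content produce the same root digest.
peer_a = {"props": {"title": "home"},
          "children": {"docs": {"props": {"size": 3}}}}
peer_b = {"children": {"docs": {"props": {"size": 3}}},
          "props": {"title": "home"}}
print(merkle_hash(peer_a) == merkle_hash(peer_b))
```

Because every digest includes the digests of its children, changing any
property anywhere under a node changes the hash of that node and of all
its ancestors.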

The method for maintaining 'identical' copies of the repos (technically a
subgraph in each) will be to use the Merkle tree to drive a "sync" that does
the least-effort data transfers from peer to peer. I may end up using an
open source BitTorrent library to transmit the data between peers
efficiently. So John, that kind of technique (the BitTorrent protocol) could
theoretically help you distribute index files across nodes rather than
regenerating index files manually every time you spin one up.
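
As a rough illustration of a "least effort" sync, the sketch below walks two
trees and descends only into subgraphs whose hashes differ. Everything here
is an assumption made for illustration: the dict layout ("props",
"children"), the function names, and the fact that both trees sit in one
process. Real peers would exchange only hashes over the network and would
cache digests instead of recomputing them at every level.

```python
import hashlib

def merkle_hash(node):
    """SHA-256 hex digest over a node's properties and all of its children."""
    h = hashlib.sha256()
    for key in sorted(node.get("props", {})):
        for part in (key, str(node["props"][key])):
            data = part.encode("utf-8")
            h.update(b"P" + len(data).to_bytes(8, "big") + data)
    for name in sorted(node.get("children", {})):
        data = name.encode("utf-8")
        h.update(b"C" + len(data).to_bytes(8, "big") + data)
        h.update(bytes.fromhex(merkle_hash(node["children"][name])))
    return h.hexdigest()

def diff_paths(local, remote, path="/"):
    """List the paths that must be transferred to make the two trees equal."""
    if merkle_hash(local) == merkle_hash(remote):
        return []  # identical subgraphs: nothing below here needs checking
    changed = []
    if local.get("props", {}) != remote.get("props", {}):
        changed.append(path)
    local_kids = local.get("children", {})
    remote_kids = remote.get("children", {})
    for name in sorted(set(local_kids) | set(remote_kids)):
        child_path = path.rstrip("/") + "/" + name
        if name not in local_kids or name not in remote_kids:
            changed.append(child_path)  # exists on one peer only
        else:
            changed.extend(
                diff_paths(local_kids[name], remote_kids[name], child_path))
    return changed

local = {"children": {
    "a": {"props": {"x": 1}},
    "b": {"props": {"y": 2}, "children": {"c": {"props": {"z": 3}}}}}}
remote = {"children": {
    "a": {"props": {"x": 1}},
    "b": {"props": {"y": 2}, "children": {"c": {"props": {"z": 4}}}},
    "d": {"props": {"w": 5}}}}
print(diff_paths(local, remote))  # ['/b/c', '/d']
```

The unchanged subgraph "/a" is skipped after a single hash comparison,
which is where the saving comes from on large trees.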

I admit I haven't even researched "Clusters" (in Jackrabbit), and I don't
know whether those are sharded/federated or whether they keep a full "copy"
on each node. Interestingly, if you're a fan of blockchain, I will also be
using a public-key encryption system in this app to authenticate who added
what content: each 'edit' (node property modification) gets hashed, the hash
is encrypted with the user's private key (i.e. digitally signed), and that
encrypted hash is stored on the tree. So the entire app I am implementing
will BE a true blockchain, implemented as a layer built on top of the JCR.
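
The hash-then-encrypt step can be sketched with textbook RSA using only the
Python standard library. This is a toy under loud assumptions: the key is
built from two small, publicly known Mersenne primes, there is no padding,
and the function names and the path/property/value layout are hypothetical.
A real system would use a vetted crypto library with a proper signature
scheme.

```python
import hashlib

# Toy key pair from two well-known Mersenne primes. Far too small to be
# secure; it exists only to make the arithmetic visible.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

def edit_digest(path, prop, value):
    """Hash one property edit and reduce it into the range of the toy key."""
    data = "\x00".join([path, prop, value]).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign_edit(path, prop, value, d=D, n=N):
    """'Encrypt' the edit's hash with the private key (textbook RSA)."""
    return pow(edit_digest(path, prop, value), d, n)

def verify_edit(path, prop, value, signature, e=E, n=N):
    """Recover the hash with the public key and compare it to a fresh one."""
    return pow(signature, e, n) == edit_digest(path, prop, value)

sig = sign_edit("/content/page", "title", "Hello")
print(verify_edit("/content/page", "title", "Hello", sig))     # True
print(verify_edit("/content/page", "title", "Tampered", sig))  # False
```

Anyone holding the public key (N, E) can check that a stored signature
matches the current property value, but only the holder of D could have
produced it.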

I think of what I'm doing as a "reference implementation" of what could
eventually become a blockchain specification for the JCR: an extension to
the JCR API that adds a blockchain protocol/layer on top of JCR. Hopefully
it will become an Apache project of its own, and a formal spec for how to
use JCR to build out blockchains. What I am doing is along the lines of
Ethereum, in making blockchain a more generic, accessible, reusable
technology, but AFAIK Ethereum is not built on JCR, and I believe in
building on top of JCR. Anyone who understands Merkle trees AND the JCR, and
is also fully cognizant of blockchain, would come to this same conclusion, I
believe.

So I hope at least a couple of the guys who are well-connected at Adobe
will pass the word up the chain of command regarding this concept. In 10
years nobody will want to use a content repository that doesn't have the
level of 'trust' that can only come from a blockchain. I think in 10 to 20
years even RDBs will have 'blockchain verifiable' transactions as built-in
functions. But for now, a protocol layer on top of and separate from the
JCR that specifically does blockchain functionality seems like the next
step for blockchain technology and also for JCR. Who knows, maybe the world
is ready for Adobe to start a cryptocurrency of their own!? Perhaps that
would be the financial incentive to get them interested in this? I have
$10K for that ICO ready and waiting!!

I've probably violated the terms and conditions of this mailing list and I
apologize if so. I went slightly beyond a reply to John.

Best regards,
Clay Ferguson
https://github.com/Clay-Ferguson/meta64
[email protected]

On Sat, Jun 24, 2017 at 6:52 AM, John Chilton <[email protected]> wrote:

> Thanks Galo, this is useful information.
>
> When you say, “large” working sets, how large is large — just looking for
> order of magnitude (Gig, Tera, Peta….)?
>
> Also, are you aware of any Mesos frameworks that offer capabilities
> similar to K8s stateful sets?
>
> Thanks again,
>
> -John
>
> > On Jun 23, 2017, at 6:37 PM, Galo Gimenez <[email protected]> wrote:
> >
> > One issue you will find with Jackrabbit is indexing: local storage is
> > ephemeral, so new nodes need to re-index, and on large working sets this
> > can take hours.
> >
> > Kubernetes introduced stateful sets, which give you very stable naming
> > and storage inside the cluster, and a consistent ordering when nodes are
> > started:
> > https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
> >
> > — Galo
> >
> >> On Jun 23, 2017, at 11:03 PM, John Chilton <[email protected]> wrote:
> >>
> >> We are running in an orchestration environment: either
> >> Mesos/Chronos/Marathon or Kubernetes.
> >>
> >> Each Docker container needs to join the Jackrabbit cluster for the
> >> lifetime of that container and then leave the Jackrabbit cluster when
> >> its work is complete.
> >> When each container joins the Jackrabbit cluster it is assigned a
> >> unique cluster node id (repository.xml). We also have no upper bound on
> >> the number of our containers that may join the cluster at any given
> >> time.
> >>
> >> Will this “dynamic” clustering work, or will we encounter issues? Is
> >> this ill-advised? Or are there things we need to do beyond uniquely
> >> identifying each cluster node?
> >> I am trying to get ahead of issues that may arise when exercising this.
> >> Any thoughts at all would be appreciated.
> >>
> >> Thanks,
> >>
> >> -John
> >>
> >
>
>
