Re: blobs are not being retained when using MicroKernel interface, adds "str:" prefix to blobId property value

Adrien Lamoureux Tue, 23 Sep 2014 09:31:27 -0700

Thomas,

I'm looking for a way to handle reading and writing of unstructured content,
including files, from multiple client application servers. They, in turn,
would serve authors in an authoring web application. There would only be a
handful of these concurrent users. A publishing mechanism would routinely
take the content and produce static files served by a standard web server.


The content tree is rather flat, with a certain parent having many
children. We are currently using Jackrabbit 2.x with DavEx as the means to
communicate with the remote repository, and we find it very reliable, but
it's also becoming very slow at reading a large list of nodes from a
single parent (due to so many remote requests). We have a layer of
abstraction on the application servers, so we can switch to some other
vendor without too much difficulty (we are not tightly coupled to the JCR
for content reading/writing).

So it doesn't need to be very scalable, but it does need to be efficient at
quickly making changes, and listing child nodes with properties on those
child nodes from multiple clients.
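
As a rough illustration of that kind of single-request listing, here is a
minimal Java sketch that builds the form body for one getNodes call. The
parameter names (path, depth, offset, count, filter) are taken from the curl
requests quoted further down in this thread; the class and method names are
made up for illustration and are not part of Oak, and actually sending the
body to getNodes.html is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Oak: builds the URL-encoded form body
// for a single getNodes request, mirroring the curl calls quoted below.
final class GetNodesForm {

    // URL-encode one form value as UTF-8.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Assemble the form fields in the same order as the quoted requests.
    static String body(String path, int depth, long offset, int count, String filter) {
        return "path=" + encode(path)
                + "&depth=" + depth
                + "&offset=" + offset
                + "&count=" + count
                + "&filter=" + encode(filter);
    }

    public static void main(String[] args) {
        // Values copied from the first getNodes request in this thread.
        String filter = "{\"nodes\":[\"*\"],\"properties\":[\"*\"]}";
        System.out.println(body("/testFile1.jpg", 2, 0, -1, filter));
    }
}
```

The point of the sketch is only that one request carries the whole listing,
rather than one remote call per child node.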

We used to have Jackrabbit running in a clustered environment with a
Jackrabbit instance on each application server, but we found it to be too
unstable. I'm willing to revisit this as a solution, but I'm hesitant due
to our previous experiences and I found no documentation for clustering
with Oak:
http://jackrabbit.apache.org/oak/docs/clustering.html

I have already written an implementation based on the MicroKernel, and it's
working fine, except for this issue with the blobs. I also noticed you have
oak-http, and correct me if I'm wrong, but this doesn't seem to handle
files.
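
To make the blob symptom concrete, here is a minimal client-side sketch in
Java that detects the extra "str:" prefix in front of ":blobId:" and recovers
the blob id from a property value. BlobIdCheck and its methods are
hypothetical names, not part of Oak, and this only diagnoses the value seen
by the client; it does nothing to make the server retain the blob.

```java
// Hypothetical helper, not part of Oak: inspects property values returned
// by getNodes for the unwanted "str:" prefix described in this thread.
final class BlobIdCheck {

    static final String BLOB_PREFIX = ":blobId:";
    static final String STR_PREFIX = "str:";

    // True when the value looks like "str::blobId:<id>".
    static boolean hasStrayStrPrefix(String value) {
        return value != null && value.startsWith(STR_PREFIX + BLOB_PREFIX);
    }

    // Returns the bare blob id, or null if the value is not a blob reference.
    static String blobIdOf(String value) {
        if (value == null) {
            return null;
        }
        String v = hasStrayStrPrefix(value)
                ? value.substring(STR_PREFIX.length())
                : value;
        return v.startsWith(BLOB_PREFIX)
                ? v.substring(BLOB_PREFIX.length())
                : null;
    }

    public static void main(String[] args) {
        // Value shaped like the one returned for testFile1.jpg in this thread.
        String returned = "str::blobId:"
                + "93e6002eb8f3c4128b2ce18351e16b0d72b870f6e1ee507b5221579f0dd31a33";
        System.out.println(hasStrayStrPrefix(returned));
        System.out.println(blobIdOf(returned));
    }
}
```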

Thanks,

Adrien

On Tue, Sep 23, 2014 at 1:35 AM, Thomas Mueller <[email protected]> wrote:

> Hi,
>
> I'm not sure, maybe this is an XY problem:
> http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem
>
> So... what problem do you want to solve? For example, are you looking for
> a scalable solution (if yes why), or do you want multiple clients (from
> different machines or processes) to access the same repository? Or
> something else?
>
> Regards,
> Thomas
>
> On 22/09/14 19:17, "Adrien Lamoureux" <[email protected]> wrote:
>
> >Thank you for replying.
> >
> >I was hoping to use the MicroKernel with simple JSON for remote
> >access. If NodeStoreKernel is not being maintained, can I assume that
> >SegmentMK is? If possible, how do I obtain a SegmentMK backed oak instance
> >with MongoDB for blob store exposed through the MicroKernel API?
> >
> >Thanks,
> >
> >Adrien
> >
> >On Mon, Sep 22, 2014 at 9:32 AM, Stefan Guggisberg <
> >[email protected]> wrote:
> >
> >> hi adrien,
> >>
> >> On Mon, Sep 22, 2014 at 5:35 PM, Adrien Lamoureux
> >> <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > No one has responded to the issues I'm having with the MicroKernel.
> >>
> >> sorry, missed that one.
> >>
> >> the problem you're having ('str:' being prepended to ':blobid:...')
> >> seems to be caused by a bug
> >> in o.a.j.oak.kernel.NodeStoreKernel.
> >>
> >> >
> >> > Is this the correct location to ask these questions? I tried finding a
> >> > solution to this issue in your documentation and found none.
> >>
> >> you could file a jira issue. however, i am not sure NodeStoreKernel is
> >> being actively maintained.
> >>
> >> cheers
> >> stefan
> >>
> >> >
> >> > Thanks,
> >> >
> >> > Adrien
> >> >
> >> > On Tue, Sep 16, 2014 at 1:51 PM, Adrien Lamoureux <
> >> > [email protected]> wrote:
> >> >
> >> >> Hello,
> >> >>
> >> >> I've been testing Oak 1.0.5, and changed Main.java under oak-run to enable
> >> >> a MicroKernel to run at startup with the standalone service at the bottom
> >> >> of the addServlets() method:
> >> >>
> >> >>         private void addServlets(Oak oak, String path) {
> >> >>             Jcr jcr = new Jcr(oak);
> >> >>
> >> >>             // 1 - OakServer
> >> >>             ContentRepository repository = oak.createContentRepository();
> >> >>
> >> >>             .............
> >> >>
> >> >>             org.apache.jackrabbit.oak.core.ContentRepositoryImpl repoImpl =
> >> >>                 (org.apache.jackrabbit.oak.core.ContentRepositoryImpl) repository;
> >> >>
> >> >>             org.apache.jackrabbit.oak.kernel.NodeStoreKernel nodeStoreK =
> >> >>                 new org.apache.jackrabbit.oak.kernel.NodeStoreKernel(repoImpl.getNodeStore());
> >> >>
> >> >>             org.apache.jackrabbit.mk.server.Server mkserver =
> >> >>                 new org.apache.jackrabbit.mk.server.Server(nodeStoreK);
> >> >>             mkserver.setPort(28080);
> >> >>             mkserver.setBindAddress(java.net.InetAddress.getByName("localhost"));
> >> >>             mkserver.start();
> >> >>         }
> >> >>
> >> >> I then used an org.apache.jackrabbit.mk.client.Client to connect to it,
> >> >> and everything seemed to work fine, including writing / reading blobs.
> >> >> However, the blobs are not being retained, and it appears to be impossible
> >> >> to set a ":blobId:" prefix for a property value without it forcing an
> >> >> additional 'str:' prefix.
> >> >>
> >> >> Here are a couple of examples using curl to create a node with a single
> >> >> property to hold the blobId. The first uses the proper ":blobId:" prefix,
> >> >> the other doesn't:
> >> >>
> >> >> curl -X POST --data 'path=/&message=' --data-urlencode \
> >> >>   'json_diff=+"testFile1.jpg" : {"testFileRef":":blobId:93e6002eb8f3c4128b2ce18351e16b0d72b870f6e1ee507b5221579f0dd31a33"}' \
> >> >>   http://localhost:28080/commit.html
> >> >>
> >> >> RETURNED:
> >> >>
> >> >> curl -X POST --data \
> >> >>   'path=/testFile1.jpg&depth=2&offset=0&count=-1&filter={"nodes":["*"],"properties":["*"]}' \
> >> >>   http://localhost:28080/getNodes.html
> >> >>
> >> >> {
> >> >>   "testFileRef": "str::blobId:93e6002eb8f3c4128b2ce18351e16b0d72b870f6e1ee507b5221579f0dd31a33",
> >> >>   ":childNodeCount": 0
> >> >> }
> >> >>
> >> >> I then tried without the blobId prefix, and it did not add a prefix:
> >> >>
> >> >> curl -X POST --data 'path=/&message=' --data-urlencode \
> >> >>   'json_diff=+"testFile2.jpg" : {"testFileRef":"93e6002eb8f3c4128b2ce18351e16b0d72b870f6e1ee507b5221579f0dd31a33"}' \
> >> >>   http://localhost:28080/commit.html
> >> >>
> >> >> RETURNED:
> >> >>
> >> >> curl -X POST --data \
> >> >>   'path=/testFile2.jpg&depth=2&offset=0&count=-1&filter={"nodes":["*"],"properties":["*"]}' \
> >> >>   http://localhost:28080/getNodes.html
> >> >>
> >> >> {
> >> >>   "testFileRef": "93e6002eb8f3c4128b2ce18351e16b0d72b870f6e1ee507b5221579f0dd31a33",
> >> >>   ":childNodeCount": 0
> >> >> }
> >> >>
> >> >> The blob itself was later removed/deleted, presumably by some sort of
> >> >> cleanup mechanism. I'm assuming that it couldn't find the reference to
> >> >> the blob.
> >> >>
> >> >> As a sanity check, I tried saving a different one-line text file at the
> >> >> Java Content Repository level of abstraction, and this is the result:
> >> >>
> >> >> curl -X POST --data \
> >> >>   'path=/testFile&depth=2&offset=0&count=-2&filter={"nodes":["*"],"properties":["*"]}' \
> >> >>   http://localhost:28080/getNodes.html
> >> >>
> >> >> {
> >> >>   "jcr:created": "dat:2014-09-16T13:41:38.084-07:00",
> >> >>   "jcr:createdBy": "admin",
> >> >>   "jcr:primaryType": "nam:nt:file",
> >> >>   ":childNodeCount": 1,
> >> >>   "jcr:content": {
> >> >>     ":childOrder": "[0]:Name",
> >> >>     "jcr:encoding": "UTF-8",
> >> >>     "jcr:lastModified": "dat:2014-09-16T13:41:38.094-07:00",
> >> >>     "jcr:mimeType": "text/plain",
> >> >>     "jcr:data": ":blobId:428ed7545cd993bf6add8cd74cd6ad70f517341bbc1b31615f9286c652cd214a",
> >> >>     "jcr:primaryType": "nam:nt:unstructured",
> >> >>     ":childNodeCount": 0
> >> >>   }
> >> >> }
> >> >>
> >> >> The ":blobId:" prefix appears intact in this case.
> >> >>
> >> >> Any help would be greatly appreciated, as I would like to start using the
> >> >> MicroKernel for remote access, and file retention is critical.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Adrien
> >> >>
> >> >>
> >> >>
> >> >>
> >>
>
>
