[VOTE] Release Apache Jackrabbit Oak 1.0.5
A candidate for the Jackrabbit Oak 1.0.5 release is available at:

    https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.5/

The release candidate is a zip archive of the sources in:

    https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.5/

The SHA1 checksum of the archive is 2cd71913fe66ba9491ee7edb4e82469e228412c9.

A staged Maven repository is available for review at:

    https://repository.apache.org/

The command for running automated checks against this release candidate is:

    $ sh check-release.sh oak 1.0.5 2cd71913fe66ba9491ee7edb4e82469e228412c9

Please vote on releasing this package as Apache Jackrabbit Oak 1.0.5. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast.

    [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.5
    [ ] -1 Do not release this package because...

My vote is +1

Regards
Thomas
Re: MissingLastRevSeeker
On 2014-08-26 08:03, Amit Jain wrote:
> Hi Julian,
>
> The LastRevRecoveryAgent is executed at 2 places:
>
> 1. On DocumentNodeStore startup, where the MissingLastRevSeeker is used to get potential candidates for recovery.
> 2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds).
>
> The short description is that MissingLastRevSeeker will be called rarely in this case. The long description: in this case a less expensive query is executed to find all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shut down and their 'leaseEndTime' has not expired, then MissingLastRevSeeker will check all potential candidates.
>
>> Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query.
>
> +1, since it will be executed on every startup. RDBDocumentStore already maintains an index on the _modified property, so optimized querying is possible.
>
> Thanks
> Amit

OK, so can we put what's needed into the DocumentStore API, or alternatively have an extension interface, that both MongoDocumentStore and RDBDocumentStore could implement?

Best regards, Julian
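The extension interface Julian asks about could look roughly like the sketch below. The interface and method names (LastRevAwareDocumentStore, findModifiedSince) are made up for illustration and are not the actual Oak API; the in-memory implementation only demonstrates the intended contract.

```java
// Hypothetical extension interface; all names here are illustrative only.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

interface LastRevAwareDocumentStore {
    /** Return the ids of documents modified at or after the given time. */
    List<String> findModifiedSince(long timestampMillis);
}

/** Minimal in-memory implementation, only to demonstrate the contract. */
class InMemoryDocumentStore implements LastRevAwareDocumentStore {
    private final Map<String, Long> modified = new LinkedHashMap<>();

    void put(String id, long modifiedMillis) {
        modified.put(id, modifiedMillis);
    }

    @Override
    public List<String> findModifiedSince(long timestampMillis) {
        // A real store (Mongo, RDB) would answer this with an indexed
        // query on _modified instead of a full scan like this one.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : modified.entrySet()) {
            if (e.getValue() >= timestampMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With such an interface, both MongoDocumentStore and RDBDocumentStore could supply their own optimized query while MissingLastRevSeeker stays store-agnostic.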
Re: MissingLastRevSeeker
Hi Julian,

The LastRevRecoveryAgent is executed at 2 places:

1. On DocumentNodeStore startup, where the MissingLastRevSeeker is used to get potential candidates for recovery.
2. At regular intervals defined by the property 'lastRevRecoveryJobIntervalInSecs' in the DocumentNodeStoreService (default 60 seconds).

The short description is that MissingLastRevSeeker will be called rarely in this case. The long description: in this case a less expensive query is executed to find all the stale clusterNodes for which recovery is to be performed. If there are clusterNodes that have unexpectedly shut down and their 'leaseEndTime' has not expired, then MissingLastRevSeeker will check all potential candidates.

>> Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query.

+1, since it will be executed on every startup. RDBDocumentStore already maintains an index on the _modified property, so optimized querying is possible.

Thanks
Amit

On Mon, Aug 25, 2014 at 7:36 PM, Julian Reschke wrote:
> Hi there,
>
> it appears that the MissingLastRevSeeker (oak-core), when run, will be
> very slow on large repos, unless they use a MongoDocumentStore (which has a
> special-cased query).
>
> Question: when will this code execute? I've seen it occasionally during
> benchmarking, but it doesn't seem to happen always.
>
> Proposal: if this code *is* used regularly, we'll need an API so that
> DocumentStore implementations other than Mongo can optimize the query.
>
> Best regards, Julian
Re: JCR API implementation transparency
fyi, I created https://issues.apache.org/jira/browse/OAK-2052

On Mon, Aug 25, 2014 at 10:32 PM, Chetan Mehrotra wrote:
> On Tue, Aug 26, 2014 at 10:44 AM, Tobias Bocanegra wrote:
>> IMO, this should work, even if the value is not a ValueImpl. In this
>> case, it should fall back to the API methods to read the binary.
>
> +1
>
> Chetan Mehrotra
Re: JCR API implementation transparency
On Tue, Aug 26, 2014 at 10:44 AM, Tobias Bocanegra wrote:
> IMO, this should work, even if the value is not a ValueImpl. In this
> case, it should fall back to the API methods to read the binary.

+1

Chetan Mehrotra
JCR API implementation transparency
Hi,

I'm looking at an issue [0] where "copying" of a JCR value fails because the source and destination repository implementations are different. So basically:

    s1 = repository1.login(); // remote repository via davex
    s2 = repository2.login(); // local oak repository
    p1 = s1.getProperty();
    n2 = s2.getNode();
    n2.setProperty(p1.getName(), p1.getValue());

AFAICT this usually works, but not for binary values. It eventually fails in org.apache.jackrabbit.oak.plugins.value.ValueImpl#getBlob(javax.jcr.Value):

    public static Blob getBlob(Value value) {
        checkState(value instanceof ValueImpl);
        return ((ValueImpl) value).getBlob();
    }

...because the value is not a ValueImpl but a QValue.

IMO, this should work even if the value is not a ValueImpl. In this case, it should fall back to the API methods to read the binary. WDYT?

Regards, Toby

[0] https://issues.apache.org/jira/browse/JCRVLT-58
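The fallback Toby proposes boils down to an instanceof check with a generic-API escape hatch instead of a hard checkState. The sketch below models that shape with stand-in types (SimpleValue, OakValue) rather than the real javax.jcr.Value / ValueImpl classes, so it illustrates the pattern only, not an actual patch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Stand-in for javax.jcr.Value: only the generic stream accessor.
interface SimpleValue {
    InputStream getStream();
}

// Stand-in for oak-core's ValueImpl with its implementation shortcut.
class OakValue implements SimpleValue {
    private final byte[] data;

    OakValue(byte[] data) { this.data = data; }

    byte[] getBlob() { return data; } // fast, implementation-specific path

    @Override
    public InputStream getStream() {
        return new ByteArrayInputStream(data);
    }
}

class BlobAccess {
    /** Use the implementation shortcut when possible; otherwise fall back
     *  to reading the binary through the generic API instead of failing. */
    static byte[] getBlob(SimpleValue value) {
        if (value instanceof OakValue) {
            return ((OakValue) value).getBlob();
        }
        try (InputStream in = value.getStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The fallback path is slower (it streams the whole binary), but it makes values from a foreign repository implementation (e.g. a QValue arriving via davex) copyable at all.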
Re: [DISCUSS] supporting faceting in Oak query engine
Aloha,

you should definitely talk to the HippoCMS developers. They forked Jackrabbit 2.x to add faceting as virtual nodes. They ran into some performance issues, but I am sure they still have valuable feedback on this.

regards,
Lukas Kahwe Smith

> On 25 Aug 2014, at 18:43, Laurie Byrum wrote:
>
> Hi Tommaso,
> I am happy to see this thread!
>
> Questions:
> Do you expect to want to support hierarchical or pivoted facets soonish? If so, does that influence this decision?
> Do you know how ACLs will come into play with your facet implementation? If so, does that influence this decision? :-)
>
> Thanks!
> Laurie
>
>> On 8/25/14 7:08 AM, "Tommaso Teofili" wrote:
>>
>> Hi all,
>>
>> since this has been asked every now and then [1], and since I think it's a
>> pretty useful and common feature for search engines nowadays, I'd like to
>> discuss the introduction of facets [2] for the Oak query engine.
>>
>> Pros: having facets in search results usually helps filtering (drill down)
>> the results before browsing all of them, so the main usage would be for
>> client code.
>>
>> Impact: probably a change / addition in both the JCR and Oak APIs to support
>> returning other than "just nodes" (a NodeIterator and a Cursor
>> respectively).
>>
>> Right now a couple of ideas on how we could do that come to my mind, both
>> based on the approach of having an Oak index for them:
>> 1. a (multivalued) property index for facets, meaning we would store the
>> facets in the repository, so that we would run a query against it to get
>> the facets of an originating query.
>> 2. a dedicated QueryIndex implementation, eventually leveraging Lucene
>> faceting capabilities, which could "use" the Lucene index we already have,
>> together with a "sidecar" index [3].
>>
>> What do you think?
>> Regards,
>> Tommaso
>>
>> [1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets
>> [2] : http://en.wikipedia.org/wiki/Faceted_search
>> [3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
Re: [DISCUSS] supporting faceting in Oak query engine
Hi Tommaso,
I am happy to see this thread!

Questions:
Do you expect to want to support hierarchical or pivoted facets soonish? If so, does that influence this decision?
Do you know how ACLs will come into play with your facet implementation? If so, does that influence this decision? :-)

Thanks!
Laurie

On 8/25/14 7:08 AM, "Tommaso Teofili" wrote:
>Hi all,
>
>since this has been asked every now and then [1], and since I think it's a
>pretty useful and common feature for search engines nowadays, I'd like to
>discuss the introduction of facets [2] for the Oak query engine.
>
>Pros: having facets in search results usually helps filtering (drill down)
>the results before browsing all of them, so the main usage would be for
>client code.
>
>Impact: probably a change / addition in both the JCR and Oak APIs to support
>returning other than "just nodes" (a NodeIterator and a Cursor
>respectively).
>
>Right now a couple of ideas on how we could do that come to my mind, both
>based on the approach of having an Oak index for them:
>1. a (multivalued) property index for facets, meaning we would store the
>facets in the repository, so that we would run a query against it to get
>the facets of an originating query.
>2. a dedicated QueryIndex implementation, eventually leveraging Lucene
>faceting capabilities, which could "use" the Lucene index we already have,
>together with a "sidecar" index [3].
>
>What do you think?
>Regards,
>Tommaso
>
>[1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets
>[2] : http://en.wikipedia.org/wiki/Faceted_search
>[3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
MissingLastRevSeeker
Hi there,

it appears that the MissingLastRevSeeker (oak-core), when run, will be very slow on large repos, unless they use a MongoDocumentStore (which has a special-cased query).

Question: when will this code execute? I've seen it occasionally during benchmarking, but it doesn't seem to happen always.

Proposal: if this code *is* used regularly, we'll need an API so that DocumentStore implementations other than Mongo can optimize the query.

Best regards, Julian
[DISCUSS] supporting faceting in Oak query engine
Hi all,

since this has been asked every now and then [1], and since I think it's a pretty useful and common feature for search engines nowadays, I'd like to discuss the introduction of facets [2] for the Oak query engine.

Pros: having facets in search results usually helps filtering (drill down) the results before browsing all of them, so the main usage would be for client code.

Impact: probably a change / addition in both the JCR and Oak APIs to support returning other than "just nodes" (a NodeIterator and a Cursor respectively).

Right now a couple of ideas on how we could do that come to my mind, both based on the approach of having an Oak index for them:

1. a (multivalued) property index for facets, meaning we would store the facets in the repository, so that we would run a query against it to get the facets of an originating query.
2. a dedicated QueryIndex implementation, eventually leveraging Lucene faceting capabilities, which could "use" the Lucene index we already have, together with a "sidecar" index [3].

What do you think?

Regards,
Tommaso

[1] : http://markmail.org/search/?q=oak%20faceting#query:oak%20faceting%20list%3Aorg.apache.jackrabbit.oak-dev+page:1+state:facets
[2] : http://en.wikipedia.org/wiki/Faceted_search
[3] : http://lucene.apache.org/core/4_0_0/facet/org/apache/lucene/facet/doc-files/userguide.html
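Whichever index option is chosen, the client-visible result of faceting is essentially a set of per-value counts returned alongside the query results. A toy model of that drill-down output, with nodes represented as plain property maps (an assumption for illustration, not an Oak API):

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FacetCounter {
    /** Count how often each value of the given property occurs in a
     *  result set: the kind of per-value counts a facet index would
     *  return next to the query results to support drill-down. */
    static Map<String, Long> countFacets(List<Map<String, String>> results,
                                         String property) {
        return results.stream()
                .map(node -> node.get(property))
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(v -> v, TreeMap::new,
                        Collectors.counting()));
    }
}
```

A real implementation would of course compute these counts inside the index (e.g. via Lucene's facet module) rather than by iterating the full result set, which is exactly why the API impact question above matters.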
oak-run benchmarks
Hi there,

I'm currently looking at the benchmark behavior for the RDB persistence, and I believe I'm seeing degrading performance with each additional run of the benchmark.

To make cases like these easier to find, would it make sense to also report on whether there's a trend in benchmark times (such as by reporting the average ratio of runtime between subsequent runs)?

Best regards, Julian
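The "average ratio of runtime between subsequent runs" could be computed roughly as below (as the geometric mean of run-to-run ratios; a value noticeably above 1.0 would flag degrading performance). This is a hypothetical reporting helper, not existing oak-run code:

```java
class BenchmarkTrend {
    /** Geometric mean of the run-to-run runtime ratios. A result close to
     *  1.0 means stable timings; noticeably above 1.0 means each run
     *  tends to be slower than the previous one. */
    static double meanRatio(double[] runtimes) {
        if (runtimes.length < 2) {
            return 1.0; // no trend can be computed from a single run
        }
        double product = 1.0;
        for (int i = 1; i < runtimes.length; i++) {
            product *= runtimes[i] / runtimes[i - 1];
        }
        return Math.pow(product, 1.0 / (runtimes.length - 1));
    }
}
```

Note that the product of consecutive ratios telescopes to last/first, so this measure reflects the overall drift across the runs and is not thrown off by a single noisy run in the middle.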
buildbot failure in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a new failure on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

    http://ci.apache.org/builders/oak-trunk-win7/builds/497

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1620305
Blamelist: mduerig

BUILD FAILED: failed compile

sincerely,
 -The Buildbot
Oak 1.0.5 release plan
Sorry, wrong subject. It's the Oak 1.0.5 release of course.

On 25/08/14 14:06, "Thomas Mueller" wrote:
>Hi,
>
>Now that 1.0.4 is out, it's time to plan the next minor release.
>
>I'm planning to cut the 1.0.5 release today in about 30 minutes.
>
>Regards
>Thomas
Re: Oak 1.0.4 release plan
Hi,

Now that 1.0.4 is out, it's time to plan the next minor release. I'm planning to cut the 1.0.5 release today in about 30 minutes.

Regards
Thomas
buildbot success in ASF Buildbot on oak-trunk-win7
The Buildbot has detected a restored build on builder oak-trunk-win7 while building ASF Buildbot. Full details are available at:

    http://ci.apache.org/builders/oak-trunk-win7/builds/496

Buildbot URL: http://ci.apache.org/
Buildslave for this Build: bb-win7
Build Reason: scheduler
Build Source Stamp: [branch jackrabbit/oak/trunk] 1620287
Blamelist: alexparvulescu

Build succeeded!

sincerely,
 -The Buildbot
Re: NodeStore#checkpoint api reevaluation
Hi,

On 22/08/14 16:31, "Alex Parvulescu" wrote:
>Following OAK-2039 there was a discussion around the current design of the
>#checkpoint apis. [0]
>
>It looks a bit confusing that you can call the apis to create a checkpoint
>and get back a reference but when retrieving it, it might not exist, even
>if the calls are back to back.
>With OAK-2039 I've added some warning logs when a checkpoint cannot be
>created but a ref is still returned, to understand if this is a system load
>problem, or something more profound.

What is the reason the SegmentNodeStore does a commitSemaphore.tryAcquire() instead of a commitSemaphore.acquire() like in SegmentNodeStore.merge()?

>I believe that nobody has any issues with the #retrieve method, all the
>confusion is really about the #checkpoint parts, currently marked as
>'@Nonnull'.
>
>Alternatives mentioned are
> - return null if the checkpoint was not created
> - throw an exception
>
>I vote -0 for the change, I believe that making this more complicated than
>it needs to be (more null checks, or a try/catch) doesn't really benefit
>anybody.

I think we should improve it somehow, because I find the current behaviour quite confusing. The current implementation of SegmentNodeStore.checkpoint() IMO violates the contract: it may return a string reference to a checkpoint which was never created and which obviously won't be valid for the requested lifetime. In my view, a client should be able to detect this in a simple way. Right now you would have to call retrieve() to find out whether checkpoint() actually worked.

Returning a null value works better if we specify under what conditions no checkpoint can be created. After all, a client would have to implement some code in response to a null value. E.g. should it retry later, because the checkpoint cannot be created when the system is under load? This would be a good fit if we keep the current implementation in SegmentNodeStore.

An exception works better if we say an implementation should always be able to create a checkpoint and only fail if it cannot perform the operation because of e.g. an underlying IOException.

Regards
Marcel
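The null option implies client code along the lines of this hedged sketch, where a null reference from #checkpoint triggers a retry. The CheckpointStore interface and its semantics are illustrative stand-ins, not the real NodeStore API:

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in for the NodeStore checkpoint API.
interface CheckpointStore {
    /** @return a checkpoint reference, or null if none could be created
     *          (e.g. because the system is under load). */
    String checkpoint(long lifetimeMillis);
}

class CheckpointClient {
    /** The kind of null handling a client would need if #checkpoint were
     *  allowed to return null: retry a few times, then give up. */
    static Optional<String> checkpointWithRetry(CheckpointStore store,
                                                long lifetimeMillis,
                                                int attempts) {
        for (int i = 0; i < attempts; i++) {
            String ref = store.checkpoint(lifetimeMillis);
            if (ref != null) {
                return Optional.of(ref);
            }
            try {
                TimeUnit.MILLISECONDS.sleep(10); // back off before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
        return Optional.empty();
    }
}
```

This is the extra complexity the -0 vote above refers to: every caller either writes such a retry loop or propagates the "no checkpoint" case further up.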
Re: NodeStore#checkpoint api reevaluation
On 22.8.14 4:31, Alex Parvulescu wrote:
> Hi,
>
> Following OAK-2039 there was a discussion around the current design of the #checkpoint apis. [0]
>
> It looks a bit confusing that you can call the apis to create a checkpoint and get back a reference but when retrieving it, it might not exist, even if the calls are back to back.

Reading the Javadoc carefully, this is to be expected. However, I think this could be improved, either by making the Javadoc for #checkpoint more explicit about it or by reflecting it in the return value. Instead of returning null for the latter option, we could also return a constant value representing a not-available checkpoint. With that, client code wouldn't need to change but could check the returned value if desired.

Michael

> With OAK-2039 I've added some warning logs when a checkpoint cannot be created but a ref is still returned, to understand if this is a system load problem, or something more profound.
>
> I believe that nobody has any issues with the #retrieve method, all the confusion is really about the #checkpoint parts, currently marked as '@Nonnull'.
>
> Alternatives mentioned are
>  - return null if the checkpoint was not created
>  - throw an exception
>
> I vote -0 for the change, I believe that making this more complicated than it needs to be (more null checks, or a try/catch) doesn't really benefit anybody.
>
> If there are thoughts around how this should change, please feel free to join in.
>
> best,
> alex
>
> [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/spi/state/NodeStore.java#L124
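The constant-value idea could be sketched as below: #checkpoint stays non-null by returning a well-known sentinel when no checkpoint could be created, and #retrieve naturally returns null for it, so existing callers keep compiling. All names here are hypothetical, not the actual Oak implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the actual Oak SegmentNodeStore.
class Checkpoints {
    /** Well-known reference meaning "no checkpoint could be created". */
    static final String UNAVAILABLE = "missing-checkpoint";

    private final Map<String, Long> checkpoints = new HashMap<>();
    private final boolean underLoad;

    Checkpoints(boolean underLoad) {
        this.underLoad = underLoad;
    }

    /** Never returns null, matching the current @Nonnull contract. */
    String checkpoint(long lifetimeMillis) {
        if (underLoad) {
            return UNAVAILABLE; // creation failed, but callers still work
        }
        String ref = "cp-" + checkpoints.size();
        checkpoints.put(ref, System.currentTimeMillis() + lifetimeMillis);
        return ref;
    }

    /** For the sentinel this simply returns null, which clients already
     *  have to handle because a checkpoint may have expired anyway. */
    Long retrieve(String ref) {
        return checkpoints.get(ref);
    }
}
```

Interested clients can compare the returned reference against the constant to detect the failure immediately, while uninterested ones see the same behaviour as today.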