[jira] [Commented] (JCR-4460) allow to run remoted conformance tests with a custom servlet context path
[ https://issues.apache.org/jira/browse/JCR-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892577#comment-16892577 ] Felix Meschberger commented on JCR-4460: You might want to look how it is being used in the Sling JCR WebDAV bundles for some more insight, [~reschke] > allow to run remoted conformance tests with a custom servlet context path > - > > Key: JCR-4460 > URL: https://issues.apache.org/jira/browse/JCR-4460 > Project: Jackrabbit Content Repository > Issue Type: Task > Components: jackrabbit-jcr2dav >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Labels: candidate_jcr_2_16 > Fix For: 2.20, 2.19.4, 2.18.3 > > Attachments: use-context-path.diff > > > Add a system property that selects a servlet context path for testing. > To run tests with non-root path: > {noformat} > mvn clean install -PintegrationTesting -DWebDAVServletContext="/foobar/" > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (JCR-4460) allow to run remoted conformance tests with a custom servlet context path
[ https://issues.apache.org/jira/browse/JCR-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892531#comment-16892531 ] Felix Meschberger commented on JCR-4460: {quote}There probably was a good reason for it's introduction {quote} I guess so, yes. But I cannot remember. One explanation I would have is usage inside an OSGi framework where the servlet context path might be injected differently... But honestly, I do not remember ... > allow to run remoted conformance tests with a custom servlet context path > - > > Key: JCR-4460 > URL: https://issues.apache.org/jira/browse/JCR-4460 > Project: Jackrabbit Content Repository > Issue Type: Task > Components: jackrabbit-jcr2dav >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Labels: candidate_jcr_2_16 > Fix For: 2.20, 2.19.4, 2.18.3 > > Attachments: use-context-path.diff > > > Add a system property that selects a servlet context path for testing. > To run tests with non-root path: > {noformat} > mvn clean install -PintegrationTesting -DWebDAVServletContext="/foobar/" > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re: [FileVault][discuss] performance improvement proposal
Hi This looks great. As for configuration: What is the reason for having a configuration option ? Not being able to decide ? Or real customer need for having it configurable ? I think we should start with reasonble heuristics first and consider configuration options in case there is a need/desire. Regards Felix Am 06.03.2017 um 16:43 schrieb Timothée Maret>: Hi, With Sling content distribution (using FileVault), we observe a significantly lower throughput for content packages containing binaries. The main bottleneck seems to be the compression algorithm applied to every element contained in the content package. I think that we could improve the throughput significantly, simply by avoiding to re-compress binaries that are already compressed. In order to figure out what binaries are already compressed, we could use match the content type stored along the binary against a list of configurable content types. I have done some micro tests with this idea (patch in [0]). I think that the results are promising. Exporting a single 250 MB JPEG is 80% faster (22.4 sec -> 4.3 sec) for a 3% bigger content package (233.2 MB -> 240.4 MB) Exporting AEM OOTB /content/dam is 50% faster (11.9 sec -> 5.9 sec) for a 5% bigger content package (92.8 MB -> 97.4 MB) Import for the same cases is 66% faster respectively 32% faster. I think this could either be done by default and allowing to configure the list of types that skip compression. Alternatively, it could be done on a project level, by extending FileVault with the following 1. For each package, allow to define the default compression level (best compression, best speed) 2. Expose an API that allow to plugin a custom logic to decide how to compress a given artefact In any case, the changes would be backward compatible. Content packages created with the new code would be installable on instances running the old code and vice versa. wdyt ? Regards, Timothee [0] https://github.com/tmaret/jackrabbit-filevault/tree/performance-avoid-compressing-already-compressed-binaries-based-on-content-type-detection [1] https://docs.oracle.com/javase/7/docs/api/java/util/zip/Deflater.html#BEST_SPEED
Re: aftermath on https://issues.apache.org/jira/browse/OAK-5336
Hi Re transient dependency: an option would have been to exclude the commons dependency on the directory dependency and explicitly add a more recent commons dependency on the project. Commons generally does a fairly decent job on keeping backwards compatibility. Regards Felix -- Typos caused by my iPhone > Am 21.12.2016 um 17:14 schrieb Julian Reschke: > > So, summarizing: > > 1) I was reviewing build dependencies after discovering an old pull request > for Jackrabbit, complaining on the use of a security challenged version of > commons-collections (see https://issues.apache.org/jira/browse/JCR-4080) > > 2) Asked Manfred to bump up the version of org.apache.directory.api.api-all > in auth-ldap, which itself had a dependency on the old version of > commons-collections (see https://issues.apache.org/jira/browse/OAK-5336) > > 3) Tests passed on our Windows machines, but not on Jenkins. Turns out that > tests were disabled on Windows (see > https://issues.apache.org/jira/browse/OAK-2904) > > 4) Finally fixed tests by also bumping up the test dependency for the > directory server implementation. > > 5) After some digging, found *why* the tests were failing on Windows, fixed > that, and re-enabled them (https://issues.apache.org/jira/browse/OAK-5358) > > 6) We're still referencing a Release Candidate for > org.apache.directory.api.api-all, and the API *has* changed in the last 12 > months. We need to make sure that once that is released, we update our code > (and branches as well). Opened https://issues.apache.org/jira/browse/OAK-5361 > (scheduling it for 1.8) to track this. > > Best regards, Julian
[jira] [Commented] (JCR-4060) unintended export versions due to changed defaults in maven bundle plugin
[ https://issues.apache.org/jira/browse/JCR-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747768#comment-15747768 ] Felix Meschberger commented on JCR-4060: ACK. > unintended export versions due to changed defaults in maven bundle plugin > - > > Key: JCR-4060 > URL: https://issues.apache.org/jira/browse/JCR-4060 > Project: Jackrabbit Content Repository > Issue Type: Bug >Affects Versions: 2.12.5, 2.13.4 >Reporter: Julian Reschke >Priority: Blocker > Fix For: 2.13.6, 2.14, 2.12.7 > > Attachments: ExportVersionChecker.jar, ExportVersionChecker.java > > > It appears that the maven bundle plugin change for JCR-3937 has caused > default export version numbers to be assigned where before we hadn't any. > For instance, in jackrabbit-api, the package "o.a.j.api" has a > package-info.java setting the version to 2.4, and this is what get's exported. > However, o.a.j.api.query does not have a package-info, and now gets exported > with a version number of 2.13.5, whereas previously it didn't get any version > number. > a) Does anybody know whether this change in behavior of the bundle plugin is > documented anywhere? > b) Do we need to fix something here, or are automatically assigned version > numbers indeed ok? If we need to do something, what is the correct fix? > Freeze the version numbers at the currently assigned version? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (JCR-4060) unintended export versions due to changed defaults in maven bundle plugin
[ https://issues.apache.org/jira/browse/JCR-4060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747595#comment-15747595 ] Felix Meschberger commented on JCR-4060: I think the most sensible thing to do would be to create java-package.info files in all the exported packages exporting these packages at the latest bundle release version. This way you don't break existing consumers and make sure that going forward, the export versions should be properly managed as the packages may evolve (well probably not any more, so having a fixed version number is even better). ... and then not having full-blown all-in releases any more also will help, but that is for another day ;-) > unintended export versions due to changed defaults in maven bundle plugin > - > > Key: JCR-4060 > URL: https://issues.apache.org/jira/browse/JCR-4060 > Project: Jackrabbit Content Repository > Issue Type: Bug >Affects Versions: 2.12.5, 2.13.4 >Reporter: Julian Reschke >Priority: Blocker > Fix For: 2.13.6, 2.14, 2.12.7 > > Attachments: ExportVersionChecker.jar, ExportVersionChecker.java > > > It appears that the maven bundle plugin change for JCR-3937 has caused > default export version numbers to be assigned where before we hadn't any. > For instance, in jackrabbit-api, the package "o.a.j.api" has a > package-info.java setting the version to 2.4, and this is what get's exported. > However, o.a.j.api.query does not have a package-info, and now gets exported > with a version number of 2.13.5, whereas previously it didn't get any version > number. > a) Does anybody know whether this change in behavior of the bundle plugin is > documented anywhere? > b) Do we need to fix something here, or are automatically assigned version > numbers indeed ok? If we need to do something, what is the correct fix? > Freeze the version numbers at the currently assigned version? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Semantic version in Oak
Hi > Am 08.12.2015 um 00:29 schrieb Francesco Mari: > > 2015-12-07 21:02 GMT+01:00 David Bosschaert : > > ... > The more frequent case is a change in the development branch (1.5) that has > to be backported to the maintenance branches (1.4, 1.2, 1.0). If the change > breaks the API of the affected packages, it could be troublesome - see > below. > > >> >> 1 and 2 above should be pretty simple and 3 should pretty much never >> happen anyway... >> >> Two additional notes. While we're talking about 'Oak 1.4' I assume >> that is the version of the bundle. I assume each package in that >> bundle evolves independently which means that they could have versions >> other than 1.4 (e.g. org.apache.foo = 1.1.2 and org.apache.bar = 1.3 >> etc). >> > > Packages evolve independently, but they do in potentially divergent > branches. This is the kind of timeline that we usually face: > > - Oak 1.4 has a package org.foo.bar 1.0 > - Some changes happen on the development branch 1.5 > - Oak 1.5 now has a package org.foo.bar 1.1 > - A change X happen in the development branch 1.5 > - Oak 1.5 now has a package org.foo.bar 1.2 > - The change X has to be backported to the maintenance branch 1.4 > - Oak 1.4 now should have a package org.foo.bar 1.1 > > Assuming that the versions were incremented following the semantic > versioning rules, we now have two packages - both called org.foo.bar and > both having version 1.1 - that live on two different branches and contain > different code. > > The only obvious solution that comes to my mind is to bump the major > version of every package right after the development branch 1.5 is started, > but I don't like this approach very much because it would break > compatibility with existing clients for no obvious reason. This scenario is the exact problem you are facing while branching and evolving the branches in parallel to trunk. The only end-developer friendly solution is to byte the bullet and do it really properly and make sure you evolve exported packages (being your API) in a truly diligent matter: Consider a package name and its export version as the package’s identity and always make sure this identity (label) refers to the identical exported API. It’s a pain with unduly large exported packages, but its the only way you can serve your community. Regards Felix
Re: Semantic version in Oak
Hi > Am 08.12.2015 um 09:49 schrieb Thomas Mueller: > > Hi, > > I think the main difference between Oak and Sling is, AFAIK, that Sling is > "forward only", and does not maintain branches, and does not backport > things. Just to be clear, this is not about Sling vs. Oak. This is whether Oak can support semantic versioning of its exported API. If you find, that Oak cannot do it. Fine, then don’t do it. As a consumer I will have to live with uncertainties. But that is still better than being misleaded. > > In Oak, we add new features in trunk (changing the API), and backport some > of those features, and not necessarily all of them, and not necessarily in > the same order as they were implemented: > > == Trunk == > > add feature A => bump export version to 1.1 >... later on ... > add feature B => bump export version to 2.0 >... later on ... > add feature C => bump export version to 2.1 > > > == Branch == > > backport feature C => bump export version to ? to get C you also need A and B and hence would be at 2.1 * you cannot do 1.1 because this means feature A but neither B nor C * you cannot just do 2.1 (or even 2.2) because you are omitting both A and B and thus consumers of 2.1 (or 2.2) expecting A or B besides C would break * you cannot do 3.x because this would later break migration to trunk which is 2.1 >... later on ... > backport feature A => bump export version to ? Since you already have C you already have A (and B) and hence are at 2.1 already. Please note, though: This is *not* about an implementation detail. This is about something visible like a new public (or protected) method or a class/interface. Semantic versioning is about resonating about something and agreeing on what level of compatibility I as a consumer can expect from the provider/producer - consider it being the part of an API contract dealing with evolution. It comes with a price for the provider/producer. If the provider is not able to abide by this contract, I suggest not to enter it. Regards Felix > > > Regards, > Thomas > > > > On 08/12/15 09:41, "Michael Dürig" wrote: > >> Packages evolve independently, but they do in potentially divergent branches. This is the kind of timeline that we usually face: - Oak 1.4 has a package org.foo.bar 1.0 - Some changes happen on the development branch 1.5 - Oak 1.5 now has a package org.foo.bar 1.1 - A change X happen in the development branch 1.5 - Oak 1.5 now has a package org.foo.bar 1.2 - The change X has to be backported to the maintenance branch 1.4 - Oak 1.4 now should have a package org.foo.bar 1.1 Assuming that the versions were incremented following the semantic versioning rules, we now have two packages - both called org.foo.bar and both having version 1.1 - that live on two different branches and contain different code. The only obvious solution that comes to my mind is to bump the major version of every package right after the development branch 1.5 is started, but I don't like this approach very much because it would break compatibility with existing clients for no obvious reason. >>> >>> This scenario is the exact problem you are facing while branching and >>> evolving the branches in parallel to trunk. >>> >>> The only end-developer friendly solution is to byte the bullet and do >>> it really properly and make sure you evolve exported packages (being >>> your API) in a truly diligent matter: Consider a package name and its >>> export version as the package¹s identity and always make sure this >>> identity (label) refers to the identical exported API. >>> >> >> I fail to see how this would work with branches. For Francesco's example >> this would mean that we'd need to backport everything into the branch >> effectively aligning it with trunk and thus obviating its purpose. >> >> Michael >> >> >> >
[jira] [Commented] (JCR-3870) Export SessionImpl#getItemOrNull in JackrabbitSession
[ https://issues.apache.org/jira/browse/JCR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14487313#comment-14487313 ] Felix Meschberger commented on JCR-3870: This is actually not an edge case and has always been there. Export SessionImpl#getItemOrNull in JackrabbitSession - Key: JCR-3870 URL: https://issues.apache.org/jira/browse/JCR-3870 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-api Affects Versions: 2.10 Reporter: Joel Richard Priority: Critical Labels: performance getItemOrNull should be exported in JackrabbitSession. This would allow to combine itemExists and getItem in Sling which would reduce the rendering time by 8%. See the following mail thread for more information: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201504.mbox/%3CD1495A09.3B670%25anchela%40adobe.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (JCR-3870) Export SessionImpl#getItemOrNull in JackrabbitSession
[ https://issues.apache.org/jira/browse/JCR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484909#comment-14484909 ] Felix Meschberger commented on JCR-3870: I don't think it is worth exporting such a method. The most basic problem is the JCR API requesting the getItem method to throw in case the item does not exist. So the itemExists method has been added to be able to check the existence (or visibility, actually) of the item before trying to access it. So, the actual performance problem boils down to being an implementation problem: The API specification as [~mreutegg] pointed out suggest to check for existence before accessing the item and this is what Sling does. In other word Sling follows the API specification guidance. As such Sling is expecting the implementation to do the best to support the API and the API guidance. Export SessionImpl#getItemOrNull in JackrabbitSession - Key: JCR-3870 URL: https://issues.apache.org/jira/browse/JCR-3870 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-api Affects Versions: 2.10 Reporter: Joel Richard Priority: Critical Labels: performance getItemOrNull should be exported in JackrabbitSession. This would allow to combine itemExists and getItem in Sling which would reduce the rendering time by 8%. See the following mail thread for more information: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201504.mbox/%3CD1495A09.3B670%25anchela%40adobe.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource
Hi Joel Thanks for pointing out. Yet, this has been discussed before. Turns out that checking for item existence before accessing the item is JCR API best practice. So Sling follows best practice. If this generates a performance problem, I would suggest this to fix in the implementation of the API such that API usage best practice guidelines can be follows. Regarding the performance impact: If this createResource method has such a dramatic impact on request processing performance in Sling, it might be that Session.itemExists and/or Session.getItem methods are really expensive. Which would be yet another reason why we should concentrate on improving the implementation rather than introducing new API. Regards Felix Am 07.04.2015 um 09:00 schrieb Joel Richard joelr...@adobe.com: Hi, In JcrResourceProvider.createResource (Sling) it uses fist Session.itemExists to check whether an item exists and then Session.getItem to retreive it. Even though the item data is cached for the second call, this adds 8% overhead to the page rendering. 14% of the whole page rendering time is spent in itemExists. Sling could just only use getItem and catch the exception if the item does not exist. Unfortunately, this will not perform well if the resource resolver is often used to read items which do not exist because exceptions are slow. Technically, the simplest solution would be to export the method getItemOrNull. This method could be Oak-specfic and only be used by Sling when running on Oak. Another approach might be to have the last item cached in the session and return it immediately if getItem is called right after itemExists. No doubt, this is ugly and would solve only this specific problem, but since it is a very common pattern, it could make sense. Maybe it is also possible to improve the performance of itemExists itself, but that's a question which somebody else has to answer. - Joel
[jira] [Comment Edited] (JCR-3870) Export SessionImpl#getItemOrNull in JackrabbitSession
[ https://issues.apache.org/jira/browse/JCR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484909#comment-14484909 ] Felix Meschberger edited comment on JCR-3870 at 4/8/15 7:55 AM: I don't think it is worth exporting such a method. The most basic problem is the JCR API requesting the getItem method to throw in case the item does not exist. So the itemExists method has been added to be able to check the existence (or visibility, actually) of the item before trying to access it. So, the actual performance problem boils down to being an implementation problem: The API specification, as [~mreutegg] pointed out, suggests to check for existence before accessing the item and this is what Sling does. In other word Sling follows the API specification guidance. As such Sling is expecting the implementation to do the best to support the API and the API guidance. was (Author: fmeschbe): I don't think it is worth exporting such a method. The most basic problem is the JCR API requesting the getItem method to throw in case the item does not exist. So the itemExists method has been added to be able to check the existence (or visibility, actually) of the item before trying to access it. So, the actual performance problem boils down to being an implementation problem: The API specification as [~mreutegg] pointed out suggest to check for existence before accessing the item and this is what Sling does. In other word Sling follows the API specification guidance. As such Sling is expecting the implementation to do the best to support the API and the API guidance. Export SessionImpl#getItemOrNull in JackrabbitSession - Key: JCR-3870 URL: https://issues.apache.org/jira/browse/JCR-3870 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-api Affects Versions: 2.10 Reporter: Joel Richard Priority: Critical Labels: performance getItemOrNull should be exported in JackrabbitSession. This would allow to combine itemExists and getItem in Sling which would reduce the rendering time by 8%. See the following mail thread for more information: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201504.mbox/%3CD1495A09.3B670%25anchela%40adobe.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Unnecessary itemExists overhead in JcrResourceProvider.createResource
Hi Am 08.04.2015 um 11:12 schrieb Michael Dürig mdue...@apache.org: On 8.4.15 10:57 , Julian Reschke wrote: Turns out that checking for item existence before accessing the item is JCR API best practice. So Sling follows best practice. Why would it be best practice? Hmm, yes, looks like I misread something. So while it may not be JCR API best practices, using exceptions for flow control is not really good and actually expensive also considering: You need to handle an exception anyway (due to possible race conditions). With Oak you generally don't have such races thanks to MVCC. But in general you are right, checking existence before access is prone to races. Indeed, I would not know of a case where such a race condition really created an issue in reality … Regards Felix
[jira] [Commented] (JCR-3870) Export SessionImpl#getItemOrNull in JackrabbitSession
[ https://issues.apache.org/jira/browse/JCR-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485513#comment-14485513 ] Felix Meschberger commented on JCR-3870: bq. there are many other places, where sling is optimized around how the JCR implementation behaves Really ? Off the top of my head I know of the Oak Observer (falling back to regular JCR Observation is not available). Others ? Indeed, at one point in early time, Sling had JCR Session pooling until Jackrabbit Repository.login became drammatically faster than pooled session cleanup. At which time we happily removed Session pooling from Sling. Export SessionImpl#getItemOrNull in JackrabbitSession - Key: JCR-3870 URL: https://issues.apache.org/jira/browse/JCR-3870 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-api Affects Versions: 2.10 Reporter: Joel Richard Priority: Critical Labels: performance getItemOrNull should be exported in JackrabbitSession. This would allow to combine itemExists and getItem in Sling which would reduce the rendering time by 8%. See the following mail thread for more information: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201504.mbox/%3CD1495A09.3B670%25anchela%40adobe.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Initial work for the specification of a remote API
+100 ! * Type=remove is exactly DELETE and we should do it * type=add is just PUT or POST * type=set likewise is just PUT or POST * type=unset is exactly DELETE So, please use those. Regards Felix Am 26.01.2015 um 10:00 schrieb Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 09:55, Francesco Mari mari.france...@gmail.com wrote: I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be though as a generalization of the Read tree operation. can you elaborate why you are using an RPC style protocol, rather something more along the lines of REST. for example: { type: remove, path: /a/b/c } could just be a DELETE on /a/b/c regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Initial work for the specification of a remote API
Hi Whether you use JSOP or RFC 6902 is essentially irrelevant. Maybe I tend to slightly favour a standardised approach hence RFC 6902. Regards Felix Am 26.01.2015 um 12:38 schrieb Francesco Mari mari.france...@gmail.com: My point is that probably you don't need to extend a format when the format you are extending is already powerful enough to express what you need. Other people already applied this concept successfully with the creation of the JSON Patch standard [1]. [1]: https://tools.ietf.org/html/rfc6902 2015-01-26 12:21 GMT+01:00 Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 12:04, Francesco Mari mari.france...@gmail.com wrote: The document I posted uses JSON only as a simple way to describe generic data structures. There is a big disclaimer at the beginning of operations.md. The operations are supposed to be described in an abstract way, without any procol-dependent technology. Please, let's evaluate operations.md without thinking so much about JSON, JSOP or other serialization strategies. That said, since the topic was brought up, I have to admit that I'm not a big fan of JSOP. I don't see any benefit in that format, since it doesn't really add anything that couldn't be done with plain JSON, if I understand correctly. well that is the idea .. as its based on JSON :) it essentially extends JSON to specifically make it possible to express PATCH type requests in the context of a content repository. stuff like re-ordering etc. in that sense its also useful to handle the issue you talk about: dealing with multiple changes that you might have inside a remote session without needing a session, since you can do it all in a single request. regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Export org.apache.jackrabbit.oak.plugins.tree from oak-core
Hi Am 14.11.2014 um 08:21 schrieb Chetan Mehrotra chetan.mehro...@gmail.com: E.g. the AbstractTree has protected mutable fields (bad practice per se, if exported even worse) If that is by design (required for MutableTree) would it still be bad. What would be the issue? Just because something is by design, does not make it good :-) And I don’t say all fields must be final (though there is good reasons to do so). But all fields should (I’d even say must) be private and only be accessible through getters. All non-private fields make it close to impossible to ever refactor these fields. Add to it having the containing classes be exported as API and you get implementation lock-in. Regards Felix Or TreeConstants exposing an „internal“ constant (should be a final class not an interface, actually) Can change it to class but I do not see much issue in using interface as holder for constants. Allows me to save bit on redundant typing of 'public static final'!. The constant is not that internal, just a name of hidden field I honestly don’t think we should be dropping good common practice and design on the notion of less typing, right ? ;-) An interface is something you implement to expose a behaviour. An interface sporting constants is not defining behaviour. Therefore a „container“ for constants should not be an interface. Regards Felix
Re: Export org.apache.jackrabbit.oak.plugins.tree from oak-core
Hi Am 14.11.2014 um 11:22 schrieb Chetan Mehrotra chetan.mehro...@gmail.com: On Fri, Nov 14, 2014 at 3:29 PM, Felix Meschberger fmesc...@adobe.com wrote: But all fields should (I’d even say must) be private and only be accessible through getters. All non-private fields make it close to impossible to ever refactor these fields. Add to it having the containing classes be exported as API and you get implementation lock-in. Ack. Makes sense and as Michael mentioned that this class is not suitable to be exposed as part of API due such reasons only An interface is something you implement to expose a behaviour. An interface sporting constants is not defining behaviour. Therefore a „container“ for constants should not be an interface. Ack. I knew that I would get negative marks from you for defending that when I sent that mail :) No negative marks at all ! Just to explain my train of thoughts :-) … and there are some things I am picky on :-) Regards Felix
Re: Export org.apache.jackrabbit.oak.plugins.tree from oak-core
Hi I have the impression this is not a good idea, unless you cleanse the classes. E.g. the AbstractTree has protected mutable fields (bad practice per se, if exported even worse) Or TreeConstants exposing an „internal“ constant (should be a final class not an interface, actually) Or ChildOrderDiff which operates on a hidden field (and thus should not be exposed, I would assume) Unfortunately ImmutableTree extends AbstractTree so exporting the former requires exporting the latter … From the outside, I have not the best of all gut feelings here... Regards Felix Am 14.11.2014 um 07:53 schrieb Chetan Mehrotra chetan.mehro...@gmail.com: Hi Team, For OAK-2261 I need to use ImmutableTree from org.apache.jackrabbit.oak.plugins.tree package which is currently not exported by Oak core. To proceed further I would like to export this package as part of OAK-2269 Unless someone objects I would make the change by EOD today Chetan Mehrotra
Re: testing helper
That’s exactly what I had in mind :-) Regards Felix Am 21.08.2014 um 10:21 schrieb Chetan Mehrotra chetan.mehro...@gmail.com: Probably we can package the test classes as an attached artifact and make use of that. For example oak-lucene uses the Oak Core test via dependency groupIdorg.apache.jackrabbit/groupId artifactIdoak-core/artifactId version${project.version}/version classifiertests/classifier scopetest/scope /dependency Chetan Mehrotra On Thu, Aug 21, 2014 at 1:36 PM, Davide Giannella dav...@apache.org wrote: On 20/08/2014 10:11, Marcel Reutegger wrote: oops, you are right, that would be a bad idea. I thought this is about a production class and not a test utility. I can see an additional bundle like testing-commons that can be imported with scope test by other projects. The pain point in here is that the testing helpers: functions, classes, etc; uses part of the oak-core api (NodeBuilder, NodeState, etc) and without having the exposed API as a bundle but together with the implementation we go in a loop of having oak-core depending on testing-commons and testing-commons depending on oak-core. D.
Re: testing helper
Hi Am 20.08.2014 um 09:43 schrieb Marcel Reutegger mreut...@adobe.com: Hi, oak-commons might be a good fit… In the test artifact attachment ? I would suggest not to add test classes to the primary artifact. Regards Felix regards marcel On 19/08/14 09:17, Davide Giannella dav...@apache.org wrote: Hello team, as part of OAK-2035 I had to copy some helpers from oak-core to oak-jcr. I was in a hurry so I copied oak-jcr/src/test/java/org/apache/jackrabbit/oak/jcr/util/ValuePathTuple.ja va and some functions I pasted in OrderedIndexIT.java. Now I was thinking what would be a proper way to share, for testing only, some classes/methods across multiple projects: in the case above oak-core and oak-jcr? Cheers Davide
[jira] [Commented] (JCR-3802) User Management: API for System Users
[ https://issues.apache.org/jira/browse/JCR-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090775#comment-14090775 ] Felix Meschberger commented on JCR-3802: Sounds good to me. The API patch defines the properties of a system user. But what is the use case for system users from a pure Jackrabbit point of view ? I understand the support for Sling's SlingRepository.loginService method and I think that makes sense. Is that enough to justify the introduction of system users in Jackrabbit ? (Don't get me wrong. I think the idea is perfectly valid and with my Sling hat on, I welcome it very much.) And some comments on the POC patch: * SystemUserImpl.checkValidTree checks for type AuthorizableType.USER to check whether the tree is of the correct type. Is that correct or should that be AuthorizableType.SYSTEM_USER ? * UserValidator prohibits disabling system users. What is the reason ? I would think we should be able to disable these users just as any other users for some quick stop-gap security measures. * UserUtil.isSystemUser is missing from the POC patch * UserManagerImpl adds use of com.google.common.base.Strings[.isNullOrEmpty]. Is that by intent or should this rather be org.apache.lang.StringUtil ? User Management: API for System Users - Key: JCR-3802 URL: https://issues.apache.org/jira/browse/JCR-3802 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api Reporter: angela Attachments: JCR-3802___POC_implementation_for_Oak.patch, JCR-3802_proposed_API.patch Apache Sling recently added the ability to define dedicated service users that allows to replace the troublesome {{SlingRepository.loginAdministrative}} with {{SlingRepository.loginService}}. In a default Sling repository backed by a Jackrabbit/Oak content repository this results in user accounts being created for these service users. Since these service users are never expected to represent a real subject that has a userId/password pair or profile information editable by this very subject, I figured out that it may be useful to cover these service users with the following API extensions, which would allow us to identify if a given user account is in fact a service (or system) user. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: How to implement a queue in Oak?
Hi I have the impression, that JCR itself can adequately solve the problem using ordered child nodes. The problem is not JCR per-se but the implementation and the early-on decision to not optimize the use case of ordered child nodes (in favor of having support for optimized implementation of the unsorted case). Now with Oak, we also have customizable indices. Would it be possible to define an ordering property, index that property and then use a query to get these nodes. The query could be created such that the ordering property index would be considered (only). As a result we get quick and transparent sorted nodes ? Regards Felix Am 01.08.2014 um 07:56 schrieb Carsten Ziegeler cziege...@apache.org: I'm wondering if anyone has a good idea how to model a queue with efficient operations in JCR - or is JCR not suited for this use case? Regards Carsten 2014-07-30 15:57 GMT+02:00 Carsten Ziegeler cziege...@apache.org: Using a different storage than JCR would be easy in my case, however I *want* to use JCR Carsten 2014-07-30 14:55 GMT+02:00 Lukas Smith sm...@pooteeweet.org: Hi, I can totally see that it might be useful to be able to go through the Oak/JCR API to have a queue but maybe this is stretching Oak a bit far if you end up with 1k+ queues. However I think it would be great to look more into federation for this. I think ModeShape supports this quite well already, ie. being able to hook in another JCR tree, a file system, a git repository, CMIS .. I am sure that it would also be possible to implement on top of some MQ standard. see also https://docs.jboss.org/author/display/MODE/Federation?_sscc=t regards, Lukas On 30 Jul 2014, at 14:41, Angela Schreiber anch...@adobe.com wrote: hi carsten if you are expecting your nodes to be in a given order (e.g. the order of creation) you need to have a parent that has orderable children... in which case we don't make any promises about huge child collections... it will not work well. if you don't have the requirement of ordered children, you can have _many_ but need to make sure that your parent node doesn't have orderable children (e.g. oak:Unstructured)... but then you cannot expect that new children are appended at the end of the list... there is no list and there is not guaranteed order. i guess you have a little misunderstanding when it comes to the concept of orderable child nodes - JSR 283 will be your friend. regards angela On 30/07/14 13:27, Carsten Ziegeler cziege...@apache.org wrote: Hi, afaik with Oak the too many child nodes problem of JR2 is solved, therefore I'm wondering what the best way to store a queue in the repository is? In my use cases, there are usually not many items within a single queue, let's say a few hundreds. In some cases the queue might grow to some thousands but not more than maybe 20k. The idea is that new entries (nodes) are added to the end of the queue, and processing would read the first node from the queue, update the properties and once done, remove it. My initial design was to simply store all entries as sub nodes of some queue root entry without any hierarchy. addNode should add them at the end and simply iteration over the child nodes of the root gives the first entry. No need for sortable nodes. Does this make sense? Is there anything else to be considered? Regards Carsten -- Carsten Ziegeler Adobe Research Switzerland cziege...@apache.org -- Carsten Ziegeler Adobe Research Switzerland cziege...@apache.org -- Carsten Ziegeler Adobe Research Switzerland cziege...@apache.org
Re: OAK and large list of orderable child nodes
Hi Am 31.07.2014 um 09:12 schrieb Bertrand De Coatpont lebes...@adobe.com: Hello, In the JCR world, is there a specific API to obtain (for reading) a specific range of child nodes within a large list of orderable child nodes (e.g. Nodes 490-500 out of 2), and is OAK helping/changing anything in this area compared to previous versions? Node.getNodes() returns the children as a NodeIterator which is a RangeIterator, which has the skip(long), long getSize(), and long getPosition() methods. So, you might check this: try { NodeIterator children = node.getNodes(); children.skip(490); for (int i=0; i 10; i++) { Node child = children.nextNode(); // do something useful } } catch (NoSuchElementException nse) { // if there are less than 490 elements } catch (RepositoryException re) { // oh well :-) } Hope this helps Regards Felix Thanks! Bertrand Bertrand de Coatpont Group Product Manager, Adobe Experience Management Adobe 512.284.4195 (tel), /bertranddecoatpont lebes...@adobe.com www.adobe.com
Re: non-space whitespace in name
Hi From a user's perspective, it concerns me that item names are being changed when migrating from Jackrabbit (2) to Oak (Jackrabbit 3) … This may (or may not) cause applications to mysteriously break. Just my $.02 to consider — not a requirement to change OAK-1624 Regards Felix Am 13.06.2014 um 17:36 schrieb Tobias Bocanegra tri...@apache.org: On Fri, Jun 13, 2014 at 6:51 AM, Julian Reschke julian.resc...@gmx.de wrote: On 2014-06-13 15:37, Tobias Bocanegra wrote: On Thu, Jun 12, 2014 at 10:55 PM, Julian Reschke julian.resc...@gmx.de wrote: On 2014-06-13 02:14, Tobias Bocanegra wrote: Hi, according to [0] oak does not allow a non-space whitespace in the name. this is different than in jackrabbit. also it should be allowed based on [1]. the problem at hand is, that we have content with 'no-break-space' chars in node names, that doesn't install in oak anymore. regards, toby [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/name/Namespaces.java#L252 [1] http://www.w3.org/TR/xml/#NT-Char Looking at Jackrabbit's PathParser (org.apache.jackrabbit.spi.commons.conversion), it seems that non-SP whitespace characters aren't allowed here either. but creating nodes with such chars works. so, is it a bug or not? Does it? Maybe there's a higher-level component that actually converts non-SP whitespace to proper whitespace before passing the name to JCR? in jackrabbit, the PathParser treats all non-sp-ws as tab-characters: [2], but does not complain about it. however, if we keep this restriction, it should also be converted during a content upgrade. I created an issue to track this [3]. regards, toby [2] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-spi-commons/src/main/java/org/apache/jackrabbit/spi/commons/conversion/PathParser.java#L257 [3] https://issues.apache.org/jira/browse/OAK-1891
Re: non-space whitespace in name
Hi From a user's perspective, it concerns me that item names are being changed when migrating from Jackrabbit (2) to Oak (Jackrabbit 3) … This may (or may not) cause applications to mysteriously break. Just my $.02 to consider — not a requirement to change OAK-1624 Regards Felix Am 13.06.2014 um 17:36 schrieb Tobias Bocanegra tri...@apache.org: On Fri, Jun 13, 2014 at 6:51 AM, Julian Reschke julian.resc...@gmx.de wrote: On 2014-06-13 15:37, Tobias Bocanegra wrote: On Thu, Jun 12, 2014 at 10:55 PM, Julian Reschke julian.resc...@gmx.de wrote: On 2014-06-13 02:14, Tobias Bocanegra wrote: Hi, according to [0] oak does not allow a non-space whitespace in the name. this is different than in jackrabbit. also it should be allowed based on [1]. the problem at hand is, that we have content with 'no-break-space' chars in node names, that doesn't install in oak anymore. regards, toby [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/name/Namespaces.java#L252 [1] http://www.w3.org/TR/xml/#NT-Char Looking at Jackrabbit's PathParser (org.apache.jackrabbit.spi.commons.conversion), it seems that non-SP whitespace characters aren't allowed here either. but creating nodes with such chars works. so, is it a bug or not? Does it? Maybe there's a higher-level component that actually converts non-SP whitespace to proper whitespace before passing the name to JCR? in jackrabbit, the PathParser treats all non-sp-ws as tab-characters: [2], but does not complain about it. however, if we keep this restriction, it should also be converted during a content upgrade. I created an issue to track this [3]. regards, toby [2] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-spi-commons/src/main/java/org/apache/jackrabbit/spi/commons/conversion/PathParser.java#L257 [3] https://issues.apache.org/jira/browse/OAK-1891
Re: Embedding Groovy in oak-run for Oak Shell (OAK-1805)
Hi Am 26.05.2014 um 06:12 schrieb Michael Dürig mdue...@apache.org: Hi, Not embarking on the language war train, I think this is a good addition and we should go forward with it. Apart from that - and this is probably a separate discussion - I also think we should split oak-run up as it is getting too heavy. Couldn't we make it into a rather bare bone OSGi container where users could select what to run with on the command line (package manager like, that is)? E.g. whether to have benchmarks, scalability, server, ... With such a setup we could also make the languages a scripting console supports pluggable while at the same time keeping the oak-run module clean and mean. That sounds like an excellent idea. You might want to leverage the Sling Launchpad for this (along with OSGi Installer). Alternatively, and probably actually easier for you to define the modules/blocks, Apache Kafka: Kafka has the notion of Features which are described in simple XML terms and which allow to dynamically install these features along with dependencies. Regards Felix Michael On 22.5.14 6:37 , Chetan Mehrotra wrote: Hi, Currently Marcel has implemented a Java based basic shell access to Oak in OAK-1805. I have reworked the logic and used Groovysh [1] to provide a richer shell experience. * The shell is branded for Oak * Makes use of all the features provided by groovysh command completion, history, colored output etc * Full power of Groovy! * Most of current command implemented by Marcel ported * Ability to execute script at command line itself similar to Mongo shell Things to considers * Requires groovy jar to be embedded increasing the size ~6 mb * Some of commands need to be written in groovy Sample output {noformat} $ java -jar oak-run-1.1-SNAPSHOT.jar console mongodb://localhost:27017/oak Apache Jackrabbit Oak 1.1-SNAPSHOT Jackrabbit Oak Shell (Apache Jackrabbit Oak 1.1-SNAPSHOT, JVM: 1.6.0_45) Type ':help' or ':h' for help. --- / ls :async apps bin etc / cd :async apps bin etc home jcr:system libs oak:indexrep:policy rep:repoPolicy system tmp var / cd var /var {noformat} Adding groovy would increase size of oak-run by ~6 mb (from 27 to 34) and also requires some of the code logic to be implemented in Groovy, So would it be ok to apply the patch Thoughts? Chetan Mehrotra PS: Details are also provided in bug note [1] http://groovy.codehaus.org/Groovy+Shell [2] http://groovy.codehaus.org/JSR-223+access+to+other+JVM+languages
Re: Login with userid that contains windows domain
Hi Am 08.04.2014 um 19:18 schrieb Tobias Bocanegra tri...@apache.org: On Tue, Apr 8, 2014 at 8:09 AM, Felix Meschberger fmesc...@adobe.com wrote: Hi With my Sling hat on: Toby's variant 2 (JCR Client, e.g. Sling AuthenticationHandler, should do the work) sounds wrong to me. Because for Sling, this is just an opaque user name and there is no reason, why the generic JCR client should interpret it in any way - after all the JCR client deals with JCR and nothing else. Else others could come around and claim other interpretations and special casing … well, it's not quite so simple. for other kind of credentials, the client that calls repository.login() constructs the correct credentials that can be used by a login module. but there might be more information available to pass along, e.g. a security token. in those cases, you wouldn't rely on the login module solely. IMO, the DOMAIN\userid is (was) just a simple way for windows to extend their login chain with domain information. also, what do you expect from: repository.login(creds_with_domain_in_userid).getUserId() ? surely the userid w/o domain. Well, I would expect Session.getUserId to return the same as I provided in the Credentials. But then, the API states, that this is not necessairily the case. If the domain in the user name should be handled with significance, this should be done by the LoginModule assuming significance. it all boils down to the problem of credentials to userid mapping. and the simple credentials clearly state that they contain a password and a userid (and not a login id, or windows domain+userid, or ldap DN, or email address). we could solve this transparently for all login modules that extend from AbstractLoginModule with a general processCredentials() method that extracts the userid and/or domain specifier. but I would favor a more general credentials - userid mapping. for example to support the use case to login with an email address but having a different userid. So you propose special casing for the windows domain mechanism. What if users login with an absolute LDAP/X.500 DN ? Would you extend the special casing to support extracting the CN ? What if the CN is not the actual user ID ? The whole point of having LoginModule is to have this transparent and extensible. You don't want to code special cases in a common abstract base class again. Regards Felix regards, toby Just my $.02. Regards Felix Am 08.04.2014 um 09:15 schrieb Angela Schreiber anch...@adobe.com: hoi variant 2 only works if you just have a single IdentityProvider configured with your external login module, right? based on how we deal with these situations otherwise in Oak and in particular in the security area, i would feel more comfortable if we at some point had the ability to support multiple IdentityProvider implementations. in particular since the external login module is no longer specific for a particular backend but very generic and just uses the configured IdentityProvider to perform the JAAS login. IMO are different ways on how to achieve this: if we thing of having 2 identity provider implementations we could either have 2 entries in the JAAS config listing the ExternalLoginModule with different configuration (i.e. IdentityProvider implementation) or we had 1 single entry but a composing IdentityProvider that manages both identity providers. for either possibility the domain information would be needed in the login module and i see the following possibility to get this: A. define an attribute on the SimpleCredentials that contains the domain. B. define a dedicated Credentials interface extending from SimpleCredentials which specifically allows to obtain the domain information. C. the domain is part of the userId exposed by SimpleCredentials and extracted during the login call only (this is your variant 1). from my point of view 1/C looks a quite troublesome as it requires to add some magic to the userId, which is properly understood and handled by a single login module only (assuming that we would not want the domain to be stored as part of the userID of the synchronized user). A/B would be compatible with your proposal 2 below without loosing the domain information... i have slight preference for B as it would allow to separate the domain information from other credentials attributes. since the ExternalLoginModule could handle both SimpleCredentials without domain information attribute as you suggested) and the new domain-SimpleCredentials, we can easily enhance the SSOAuthenticationHandler and ExternalLoginModule after 1.0 to fully support different domains/IdentityProviders during repository login. would that make sense to you? kind regards angela On 07/04/14 20:26, Tobias Bocanegra tri...@apache.org wrote: Hi, I have an issue where the user tries to login using credentials that include
Re: Login with userid that contains windows domain
Hi Am 09.04.2014 um 09:09 schrieb Tobias Bocanegra tri...@apache.org: we could solve this transparently for all login modules that extend from AbstractLoginModule with a general processCredentials() method that extracts the userid and/or domain specifier. but I would favor a more general credentials - userid mapping. for example to support the use case to login with an email address but having a different userid. So you propose special casing for the windows domain mechanism. What if users login with an absolute LDAP/X.500 DN ? Would you extend the special casing to support extracting the CN ? What if the CN is not the actual user ID ? well, I think then the authenticator should use different credentials. If the user provides user name and password ? Absolutely not. If the user is presenting user name, password and some 3rd party OTP or such, absolutely. Interpreting the provided user name ? Probably not. That's not the business of the authentication handler, since to the handler the user name is an opaque string of characters. however, mapping the DN to as userid would be job of the login module. i.e. provide the userid for session.getUserId() and populate the subject with the correct principals. As is mapping the domain/user name to a userid. Same game. The whole point of having LoginModule is to have this transparent and extensible. You don't want to code special cases in a common abstract base class again. yes, but we (currently) have 3 login modules: default, token, external (and adobe granite has SSO). all of them would need to handle the windows domain (well, maybe not the token lm). Probably not. I would think LoginModules can or cannot handle. If one module cannot handle, it cannot authenticate. Depending on how the module is configured into the system (required or sufficient), login may succeed or not. Regards Felix regards, toby Regards Felix regards, toby Just my $.02. Regards Felix Am 08.04.2014 um 09:15 schrieb Angela Schreiber anch...@adobe.com: hoi variant 2 only works if you just have a single IdentityProvider configured with your external login module, right? based on how we deal with these situations otherwise in Oak and in particular in the security area, i would feel more comfortable if we at some point had the ability to support multiple IdentityProvider implementations. in particular since the external login module is no longer specific for a particular backend but very generic and just uses the configured IdentityProvider to perform the JAAS login. IMO are different ways on how to achieve this: if we thing of having 2 identity provider implementations we could either have 2 entries in the JAAS config listing the ExternalLoginModule with different configuration (i.e. IdentityProvider implementation) or we had 1 single entry but a composing IdentityProvider that manages both identity providers. for either possibility the domain information would be needed in the login module and i see the following possibility to get this: A. define an attribute on the SimpleCredentials that contains the domain. B. define a dedicated Credentials interface extending from SimpleCredentials which specifically allows to obtain the domain information. C. the domain is part of the userId exposed by SimpleCredentials and extracted during the login call only (this is your variant 1). from my point of view 1/C looks a quite troublesome as it requires to add some magic to the userId, which is properly understood and handled by a single login module only (assuming that we would not want the domain to be stored as part of the userID of the synchronized user). A/B would be compatible with your proposal 2 below without loosing the domain information... i have slight preference for B as it would allow to separate the domain information from other credentials attributes. since the ExternalLoginModule could handle both SimpleCredentials without domain information attribute as you suggested) and the new domain-SimpleCredentials, we can easily enhance the SSOAuthenticationHandler and ExternalLoginModule after 1.0 to fully support different domains/IdentityProviders during repository login. would that make sense to you? kind regards angela On 07/04/14 20:26, Tobias Bocanegra tri...@apache.org wrote: Hi, I have an issue where the user tries to login using credentials that include a windows domain in the userid attribute. for example: MYDOMAIN\toby. I'm not sure which layer should handle the domain part correctly, and I think it really depends on the setup. also, I'm not an AD expert and I don't know how the domain part would be used (selecting a forest in the AD server? or selecting a different AD server?). the problem especially comes up in SSO situations, where the LOGON_USER is passed over to a web application (e.g. sling) that then uses the
Re: Login with userid that contains windows domain
Am 09.04.2014 um 19:31 schrieb Tobias Bocanegra tri...@apache.org: oh another use case: login with case-insensitive user id. this is similar in that respect, that the 'id' in the credentials used to login, is not (or must not be) identical to the userid of the resolved authorizable. but the question is, where would this be configured? on all login modules? This is also LoginModule specific IMHO Regards Felix regards, toby On Wed, Apr 9, 2014 at 12:14 AM, Felix Meschberger fmesc...@adobe.com wrote: Hi Am 09.04.2014 um 09:09 schrieb Tobias Bocanegra tri...@apache.org: we could solve this transparently for all login modules that extend from AbstractLoginModule with a general processCredentials() method that extracts the userid and/or domain specifier. but I would favor a more general credentials - userid mapping. for example to support the use case to login with an email address but having a different userid. So you propose special casing for the windows domain mechanism. What if users login with an absolute LDAP/X.500 DN ? Would you extend the special casing to support extracting the CN ? What if the CN is not the actual user ID ? well, I think then the authenticator should use different credentials. If the user provides user name and password ? Absolutely not. If the user is presenting user name, password and some 3rd party OTP or such, absolutely. Interpreting the provided user name ? Probably not. That's not the business of the authentication handler, since to the handler the user name is an opaque string of characters. however, mapping the DN to as userid would be job of the login module. i.e. provide the userid for session.getUserId() and populate the subject with the correct principals. As is mapping the domain/user name to a userid. Same game. The whole point of having LoginModule is to have this transparent and extensible. You don't want to code special cases in a common abstract base class again. yes, but we (currently) have 3 login modules: default, token, external (and adobe granite has SSO). all of them would need to handle the windows domain (well, maybe not the token lm). Probably not. I would think LoginModules can or cannot handle. If one module cannot handle, it cannot authenticate. Depending on how the module is configured into the system (required or sufficient), login may succeed or not. Regards Felix regards, toby Regards Felix regards, toby Just my $.02. Regards Felix Am 08.04.2014 um 09:15 schrieb Angela Schreiber anch...@adobe.com: hoi variant 2 only works if you just have a single IdentityProvider configured with your external login module, right? based on how we deal with these situations otherwise in Oak and in particular in the security area, i would feel more comfortable if we at some point had the ability to support multiple IdentityProvider implementations. in particular since the external login module is no longer specific for a particular backend but very generic and just uses the configured IdentityProvider to perform the JAAS login. IMO are different ways on how to achieve this: if we thing of having 2 identity provider implementations we could either have 2 entries in the JAAS config listing the ExternalLoginModule with different configuration (i.e. IdentityProvider implementation) or we had 1 single entry but a composing IdentityProvider that manages both identity providers. for either possibility the domain information would be needed in the login module and i see the following possibility to get this: A. define an attribute on the SimpleCredentials that contains the domain. B. define a dedicated Credentials interface extending from SimpleCredentials which specifically allows to obtain the domain information. C. the domain is part of the userId exposed by SimpleCredentials and extracted during the login call only (this is your variant 1). from my point of view 1/C looks a quite troublesome as it requires to add some magic to the userId, which is properly understood and handled by a single login module only (assuming that we would not want the domain to be stored as part of the userID of the synchronized user). A/B would be compatible with your proposal 2 below without loosing the domain information... i have slight preference for B as it would allow to separate the domain information from other credentials attributes. since the ExternalLoginModule could handle both SimpleCredentials without domain information attribute as you suggested) and the new domain-SimpleCredentials, we can easily enhance the SSOAuthenticationHandler and ExternalLoginModule after 1.0 to fully support different domains/IdentityProviders during repository login. would that make sense to you? kind regards angela On 07/04/14 20:26, Tobias Bocanegra tri...@apache.org wrote: Hi, I have an issue where
Re: Login with userid that contains windows domain
Hi With my Sling hat on: Toby's variant 2 (JCR Client, e.g. Sling AuthenticationHandler, should do the work) sounds wrong to me. Because for Sling, this is just an opaque user name and there is no reason, why the generic JCR client should interpret it in any way - after all the JCR client deals with JCR and nothing else. Else others could come around and claim other interpretations and special casing … If the domain in the user name should be handled with significance, this should be done by the LoginModule assuming significance. Just my $.02. Regards Felix Am 08.04.2014 um 09:15 schrieb Angela Schreiber anch...@adobe.com: hoi variant 2 only works if you just have a single IdentityProvider configured with your external login module, right? based on how we deal with these situations otherwise in Oak and in particular in the security area, i would feel more comfortable if we at some point had the ability to support multiple IdentityProvider implementations. in particular since the external login module is no longer specific for a particular backend but very generic and just uses the configured IdentityProvider to perform the JAAS login. IMO are different ways on how to achieve this: if we thing of having 2 identity provider implementations we could either have 2 entries in the JAAS config listing the ExternalLoginModule with different configuration (i.e. IdentityProvider implementation) or we had 1 single entry but a composing IdentityProvider that manages both identity providers. for either possibility the domain information would be needed in the login module and i see the following possibility to get this: A. define an attribute on the SimpleCredentials that contains the domain. B. define a dedicated Credentials interface extending from SimpleCredentials which specifically allows to obtain the domain information. C. the domain is part of the userId exposed by SimpleCredentials and extracted during the login call only (this is your variant 1). from my point of view 1/C looks a quite troublesome as it requires to add some magic to the userId, which is properly understood and handled by a single login module only (assuming that we would not want the domain to be stored as part of the userID of the synchronized user). A/B would be compatible with your proposal 2 below without loosing the domain information... i have slight preference for B as it would allow to separate the domain information from other credentials attributes. since the ExternalLoginModule could handle both SimpleCredentials without domain information attribute as you suggested) and the new domain-SimpleCredentials, we can easily enhance the SSOAuthenticationHandler and ExternalLoginModule after 1.0 to fully support different domains/IdentityProviders during repository login. would that make sense to you? kind regards angela On 07/04/14 20:26, Tobias Bocanegra tri...@apache.org wrote: Hi, I have an issue where the user tries to login using credentials that include a windows domain in the userid attribute. for example: MYDOMAIN\toby. I'm not sure which layer should handle the domain part correctly, and I think it really depends on the setup. also, I'm not an AD expert and I don't know how the domain part would be used (selecting a forest in the AD server? or selecting a different AD server?). the problem especially comes up in SSO situations, where the LOGON_USER is passed over to a web application (e.g. sling) that then uses the repository. I can imagine the following scenarios: a) domain is constant/does not apply/or is a leftover from the SSO. so the repository does not (and never will) know about domains. b) domain is part of the userid, i.e. effectively selects a different user, but the same AD is used for all external accounts c) domain is part of the userid, but the domain also selects different ADs. Right now, the external login module does not handle the domain specifier specifically, so would behave like (b) - although I think that the user would not be found on the AD via LDAP the way it is currently built. Also, for a simple SSO setup, where the authentication module of the web app retrieves the LOGON_USER, I think the domain should be stripped there and not being included in the jcr credentials. so this basically boils down to the question: 1) should we implement special handling for windows domain specifiers in the login modules? 2) should we ignore windows domain and delegate this work to the JCR client? (e.g. the sling authentication handler should strip off the domain when building the jcr credentials) I think as long as the domain is not part of the user selection/authentication, we should do 2). WDYT? Regards, Toby
[jira] [Commented] (JCR-3745) Add JackrabbitObservationManager with additional methods for registering event listeners
[ https://issues.apache.org/jira/browse/JCR-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933083#comment-13933083 ] Felix Meschberger commented on JCR-3745: I like the filter idea, but I think it is strange to have an interface defining a Bean API instead of just have the simple flat bean class. Alternatively the interface would just define getter methods. Add JackrabbitObservationManager with additional methods for registering event listeners Key: JCR-3745 URL: https://issues.apache.org/jira/browse/JCR-3745 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api Reporter: Michael Dürig Assignee: Michael Dürig I'd like to add an additional method for adding event listeners to the Jackrabbit API: void addEventListener(EventListener listener, int eventTypes, String[] absPaths, boolean isDeep, String[] uuid, String[] nodeTypeName, boolean noLocal, boolean noExternal) throws RepositoryException; Compared to the JCR method of the same name, this method takes an array of absPath and additional boolean argument: Only events whose associated parent node is at one of the paths in codeabsPaths/code (or within its subgraph, if codeisDeep/code is codetrue/code) will be received. and Additionally, if codenoExternal/code is codetrue/code, then events from external cluster nodes are ignored. Otherwise, they are not ignored. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (JCR-3745) Add JackrabbitObservationManager with additional methods for registering event listeners
[ https://issues.apache.org/jira/browse/JCR-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933167#comment-13933167 ] Felix Meschberger commented on JCR-3745: [~rombert] I think we can just keep both methods and should be fine. [~mduerig] I seem to understand we agree on the setter methods for interfaces: How about defining JackrabbitEventListener as a final class ? (NB: Actually it just occurrs to me why setters on interfaces are a really bad idea: This basically defines a mutable object and so the filter used for registration cannot be used by the ObservationManager essentially causing the latter to clone (or otherwise copy) the settings for internal use. Slightly better would be a builder interface with setters and the actual interface with getters -- still two objects, though ...) Add JackrabbitObservationManager with additional methods for registering event listeners Key: JCR-3745 URL: https://issues.apache.org/jira/browse/JCR-3745 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api Reporter: Michael Dürig Assignee: Michael Dürig Attachments: JCR-3745.patch I'd like to add an additional method for adding event listeners to the Jackrabbit API: void addEventListener(EventListener listener, int eventTypes, String[] absPaths, boolean isDeep, String[] uuid, String[] nodeTypeName, boolean noLocal, boolean noExternal) throws RepositoryException; Compared to the JCR method of the same name, this method takes an array of absPath and additional boolean argument: Only events whose associated parent node is at one of the paths in codeabsPaths/code (or within its subgraph, if codeisDeep/code is codetrue/code) will be received. and Additionally, if codenoExternal/code is codetrue/code, then events from external cluster nodes are ignored. Otherwise, they are not ignored. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (JCR-3745) Add JackrabbitObservationManager with additional methods for registering event listeners
[ https://issues.apache.org/jira/browse/JCR-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933197#comment-13933197 ] Felix Meschberger commented on JCR-3745: bq. Class yes, final why? What's the use of extending it ? The ObservationManager is the consumer of the instance and it can only support what it can support and thus is exposed in this base class. If we later see, that extension makes sense we can still make it non-final. But making a non-final class final is essentially killing backwards compatibility. Add JackrabbitObservationManager with additional methods for registering event listeners Key: JCR-3745 URL: https://issues.apache.org/jira/browse/JCR-3745 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api Reporter: Michael Dürig Assignee: Michael Dürig Attachments: JCR-3745.patch I'd like to add an additional method for adding event listeners to the Jackrabbit API: void addEventListener(EventListener listener, int eventTypes, String[] absPaths, boolean isDeep, String[] uuid, String[] nodeTypeName, boolean noLocal, boolean noExternal) throws RepositoryException; Compared to the JCR method of the same name, this method takes an array of absPath and additional boolean argument: Only events whose associated parent node is at one of the paths in codeabsPaths/code (or within its subgraph, if codeisDeep/code is codetrue/code) will be received. and Additionally, if codenoExternal/code is codetrue/code, then events from external cluster nodes are ignored. Otherwise, they are not ignored. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Versioning of oak-jcr?
Hi Julian Cool ! With my Apache Felix hat on: I am pretty sure this project would be interested in a contribution ;-) Regards Felix Am 13.03.2014 um 10:51 schrieb Julian Sedding jsedd...@gmail.com: To mitigate such issues, I have developed a baselining-maven-plugin[0], which leverages the baselining feature of bnd in order to check that package exports are correctly versioned according to semantic versioning guidelines. If there is interest, I'd be happy to provide a patch for the Oak project. Regards Julian [0] https://github.com/code-distillery/baselining-maven-plugin On Wed, Mar 12, 2014 at 5:24 PM, Carsten Ziegeler cziege...@apache.org wrote: Hi, I just noticed that the api exported by the oak-jcr bundle in trunk is 0.16 (trunk version is 0.19-SNAPSHOT). This implies that the api hasn't changed between the 0.16 release and trunk. Unfortunately that's not true, as my code does not compile with 0.16, so I guess the export version needs to be updated Carsten -- Carsten Ziegeler cziege...@apache.org
Re: How to detect whether Oak is powering the repository?
Hi Isn't this new behaviour exposed through API which is not available on non-Oak systems ? So just trying to instantiate a class implementing that new API would signal that. Regards Felix Am 12.03.2014 um 17:10 schrieb Carsten Ziegeler cziege...@apache.org: This is for SLING-3279 which is about improved observation handling when Oak is available. So I need to check, if Oak is used. In that case I can use the improved handling and if not, revert to the old code. This allows to have a single bundle to deploy as 95% of the code is the same. The tests are run against both versions 2014-03-12 17:06 GMT+01:00 Bertrand Delacretaz bdelacre...@apache.org: Le 12 mars 2014 14:53, Thomas Mueller muel...@adobe.com a écrit : Hi, I'm wondering why would you want to check it? I've been doing that as well in sling tests that can run against various repositories, to make sure the test setup is correct. If the tests break sometimes in the future we can fix them, no big deal. Bertrand -- Carsten Ziegeler cziege...@apache.org
Re: Extending semantics of absPath Parameter of ObservationManager.addEventListener
Hi I am with Angela on this and would prefer a new method instead of overloading the string semantics. Regards Felix Am 11.03.2014 um 14:29 schrieb Angela Schreiber anch...@adobe.com: hi michael Since JCR observation only supports listening on a single path per listener, a commonly seen pattern is listeners registered on the root path and then filtering for the required paths in the event handler. yes, i have seen this a lot. This causes a lot of overhead in the case where the number of events on the target paths is low wrt. the total number of events. One way we could address this in an backward compatible way is to slightly extend the semantics of the absPath parameter and allow for a list of path instead of a single one. AFICS we could use | for separating such paths (e.g. /foo/bar|/baz/qox). This could then also be back ported to JR2 if needed. WDYT? that would be an option... but one that violates the API contract as the path param is defined to be an absolute path. alternatively, we could make this a Jackrabbit API extension to the JCR API and explicitly allow for multiple paths to be specified. this would be backwards compatible as well and less error prone when it comes to specification compliance. kind regards angela
Re: Code style
+1 Coding style is part of the corporate identity of the code, in a sense. Regards Felix Am 04.03.2014 um 09:49 schrieb Thomas Mueller muel...@adobe.com: Hi, Question: why don't we use the maven checkstyle plugin? for what? we are quite liberal in coding style and respect personal preferences. the only checkstyle rules that make sense are the ones that find or prevent bugs (like avoiding .* imports). Sure, the exact formatting rules are a matter of personal taste. But why do companies such as Google *enforce* specific rules? http://google-styleguide.googlecode.com/svn/trunk/javaguide.html Becase most of the time, developers *read* the code. And it's not just you who reads your code, but everybody else as well. Formatting rules help people read the code. I personally have a hard time understanding code that doesn't follow basic coding conventions. In fact I dislike such code. If you read a book, you expect certain formatting rules (identation, whitespace usage, and so on), and if those rules are broken, you can not concentrate on the content. In addition, some code conventions avoid bugs, for example the recent famous Apple SSL/TLS bug: https://www.imperialviolet.org/2014/02/22/applebug.html With decent rules, and the Maven checkstyle plugin, it's easy to avoid such bugs. Therefore, I think we should use strict rules. Regards, Thomas
Re: Partitions! (Was: Workspaces, once more)
Hi Jukka I like the idea. Can the methods not supporting cross-partition operation be enumerated ? I would think it primarily concerns move. What about remove involving a partitioned subtree ? Regards Felix -- Typos caused by my iPhone Am 19.02.2014 um 10:52 schrieb Jukka Zitting jukka.zitt...@gmail.com: Hi, In face-to-face discussion it came up that to avoid confusion, it would make sense to use some other term than workspaces for the proposed functionality. Also, we should extend the JackrabbitRepository interface with some extra methods to make it clear that the client isn't accessing a normal JCR workspace with the proposed feature. Borrowing a term from operating systems, I propose that we'd call such standalone backend a repository partition. Like on-disk partitions, each repository partition could have a completely separate configuration, and operations like moving stuff around that work within a single partition would not work across partition boundaries. BR, Jukka Zitting On Wed, Feb 19, 2014 at 4:09 AM, Jukka Zitting jukka.zitt...@gmail.com wrote: Hi, We discussed our options for (not) implementing workspace support a few times in the past, but the outcome so far has been to postpone the final decision and do just the minimum amount of work to keep all our options open. As we now get closer to production-readiness, I think it's time to finally make this decision. Here's what I propose: We won't support workspaces in the full JCR sense (shared jcr:system, cross-workspace operations, etc.). However, we do allow a repository to have more than one workspace, each workspace being it's own mini-repository with it's own user accounts, node types, version histories, etc. Instead of a one-to-one mapping between the JCR Repository and the Oak ContentRepository, the mapping would be one-to-many with the workspace name as the key. Why should we do this? Here's my rationale: 1) Implementing full JCR workspace support requires a lot of non-trivial work, which touches and complicates much of our existing functionality. 2) We haven't come across many major deployments or use cases where full workspace support would be needed. The few cases where it is used will still be covered by Jackrabbit Classic. 3) However, there still are a few use cases where a client would like to access data from more than one backend location, and having separate Repository instances for such cases is a bit troublesome, especially as Sling makes a fairly strong assumption of the system having just a single Repository. 4) At Adobe we have proprietary connector code for accessing external repositories and virtually mounting them within the main repository. This functionality relies on the kind of limited workspace support described above. 5) It would be useful in some cases to be able to store some content for example in a TarMK backend and other in a MongoMK one, but still access both backends through a single Repository. The proposed workspace mechanism would make this possible with minimum impact on existing code. To do this, we'd need to extend the Jcr utility class to accept a String, NodeStore map instead or as an alternative to just a single NodeStore. And in an OSGi environment the NodeStore services would become service factories that could produce any number of configured NodeStore services, with the repository service tracking all the available NodeStores and making them available as different workspaces. WDYT? BR, Jukka Zitting
Re: Efficient import of binary data into Oak
Hi That was my first thought, too: Nothing prevents the Binary implementation from checking whether the InputStream is a FileInputStream and then access the FileChannel from it. In the concrete case of Sling, the Sling RequestParameter.getInputStream() happens to call the Commons Upload FileItem.getInputStream() method which happens to return such a FileInputStream (if the item is actually stored in the filesystem, otherwise a ByteArrayInputStream happens to be returned). Regards Felix Am 18.02.2014 um 08:46 schrieb Ian Boston i...@tfd.co.uk: Hi, Is there a reason you would not use the commons upload streaming Api to connect the target output stream to the request stream? Iirc you can test if both have nio channels and if the do just connect the two. I have used this in the past to eliminate all GC activity and spooling. The streaming Api is sensitive to order of the multiparts. You must use them as they appear and not expect to be able to treat request parameters as a map. In addition it is sensitive to other frameworks buffering or accessing the request input stream. Best regards Ian On Tuesday, February 18, 2014, Chetan Mehrotra chetan.mehro...@gmail.com wrote: Hi, Currently in a Sling based application where a user uploads a file to the JCR following sequence of steps are executed 1. User uploads file via HTTP request mostly using Multi-Part form data based upload 2. Sling uses Commons File Upload to parse the multi-part request which uses a DiskFileItemFactory and write the binary content to a temporary file (for file size 256 KB) [1] 3. Later the servlet would access the JCR Session and create a Binary value by extracting the InputStream 4. The file content would then be spooled into the BlobStore Effect of different blobstore Now depending on the type of BlobStore one of the following code flow would happen A - JR2 DataStores - The inputstream would be copied to file B - S3DataStore - The AWS SDK would be creating a temporary file and then that file content would be streamed back to the S3 C - Segment - Content from InputStream would be stored as part of various segments D - MongoBlobStore - Content from InputStream would be pushed to remote mongo via multiple remote calls Things to note in above sequence 1. Uploaded content is copied twice. 2. The whole content is spooled via InputStream through JVM Heap Possible areas of Improvement 1. If the BlobStore is finally using some File (on same hard disk not NFS) then it might be better to *move* the file which was created in upload. This would help local FileDataStore and S3DataStore 2. Avoid spooling via InputStream if possible. Spooling via IS is slow [3]. Though in most cases we use efficient buffered copy which is marginally slower than NIO based variants. However avoiding moving byte[] might reduce pressure on GC (probably!) Changes required If we can have a way to create JCR Binary implementations which enables DataStore/BlobStore to efficiently transfer content then that would help. For example for File based DS the Binary created can keep a reference to the source File object and that Binary is used in JCR API. Eventually the FileDataStore can treat it in a different way and move the file. Another example is S3DataStore - In some cases the file has already been transferred to S3 using other options. And the user wants to transfer the S3 file from its bucket to our bucket. So a Binary implementation which can just wrap the S3 url would enable the S3DataStore to transfer the content without streaming all content again [4] Any thoughts on the best way to enable users of Oak to create Binaries via other means (compared to current mode which only enables via InputStream) and enable the DataStores to make use of such binaries? Chetan Mehrotra [1] https://github.com/apache/sling/blob/trunk/bundles/engine/src/main/java/org/apache/sling/engine/impl/parameters/ParameterSupport.java#L190 [2] http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html [3] http://www.baptiste-wicht.com/2010/08/file-copy-in-java-benchmark/3/ [4] http://stackoverflow.com/questions/9664904/best-way-to-move-files-between-s3-buckets
Re: Make Whiteboard accessible through ContentRepository
Hi This thread indeed raises the question, why Oak has to come up with something (the Whiteboard) that is almost but not quite like OSGi instead of going all the way through ? As a stop-gap measure, instead of going full-OSGi you could also just leverage the feature, that you really need: The service registry and base on something like Karl Pauls' µServices [2] and PojoSR [3] Interestingly, when I started with what became Apache Sling I worked on a thing called the Extension Frameworkg for JCR Repositories [1] until it turned out that it basically would be reinventing OSGi … and so Sling became an OSGi application. Regards Felix [1] http://svn.apache.org/repos/asf/jackrabbit/sandbox/inactive/extension-framework/ [2] http://www.pro-vision.de/content/medialib/pro-vision/production/adaptto/2013/adaptto2013-osgi--services-karl-pauls-pdf/_jcr_content/renditions/rendition.download_attachment.file/adaptto2013-osgi--services-karl-pauls.pdf [3] http://code.google.com/p/pojosr/ Am 09.02.2014 um 10:05 schrieb Davide Giannella giannella.dav...@gmail.com: On Sat, Feb 8, 2014 at 6:58 PM, Tobias Bocanegra tri...@apache.org wrote: ... ps: I still think we should turn the problem around, and make everything OSGi services and start a small OSGi container for the runtime :-) I was thinking the same tonight. I was going to ask why (any historical decisions) Oak in the oak-run doesn't use a simple bundled-up OSGi container and runs the related jar, that are already OSGi bundles, in it. It would make for example a lot easier to inject a CommitHook like a custom index. So far the only way to achieve so is to recompile the oak-run adding .with(new MyIndexProvider()) while I'd rather add a Service implementation the OSGi whiteboard. D.
Re: Roadmap for Jackrabbit 2.x and 3.0
Am 17.01.2014 um 11:01 schrieb Bertrand Delacretaz bdelacre...@apache.org: On Wed, Jan 15, 2014 at 7:35 PM, Jukka Zitting jukka.zitt...@gmail.com wrote: g) ...Or as a last resort, abandon the idea of a joint deployment package. Jackrabbit Classic and Oak would be shipped in separate deployment artifacts Does this have impact on how people can migrate existing Jackrabbit repositories to Oak, or is migration a separate concern and plan? Yes, please ! Regards Felix -Bertrand
Re: Roadmap for Jackrabbit 2.x and 3.0
Hi While (f-OSGi) has an appeal to me and think should be done in any case, I would think (g-separate) is the right way to go to prevent complexity with IMVHO little benefit. Just my CHF 0.05 Regards Felix Am 15.01.2014 um 12:35 schrieb Jukka Zitting jukka.zitt...@gmail.com: Hi, Let's pick this up again! On Thu, Jan 17, 2013 at 6:00 AM, Jukka Zitting jukka.zitt...@gmail.com wrote: * Jackrabbit 3.0: Early next year, after the 2.6 and 2.x branches have been created, we'd replace the current trunk with an Oak-based JCR implementation. As mentioned above, instead of replacing the trunk entirely, I'd keep both codebases in trunk for now and include *both* the 2.x and Oak repositories in things like jackrabbit-webapp and jackrabbit-standalone, with a configuration option to decide which repository implementation should be used in each particular deployment. I was looking at this a few months ago, and there's one big blocker that makes such a double deployment somewhat complicated: Since Jackrabbit 2.x (Jackrabbit Classic?) still uses an older version of Lucene, it's hard merge with Oak where Lucene 4 is used. I have a few ideas on how this problem could be resolved: a) Upgrade Jackrabbit Classic to use Lucene 4. As discussed earlier (http://markmail.org/message/nv5jeeoda7qe5qen) this is pretty hard, and it's questionable whether the benefits are worth the effort. b) Downgrade Oak to use Lucene 3. This should be doable with not much effort, as the Lucene integration in Oak is much simpler than in Jackrabbit Classic. It might even be possible to make oak-lucene version-independent, so it would work with both Lucene 3 and 4. c) Ship the jackrabbit deployment packages without Lucene integration for Oak. This would allow people to start playing with Oak in their existing deployments, but require some deployment changes for full Oak functionality. d) Use the class rewriting tricks in something like the Shade plugin [1] to be able to include both Lucene 3 *and* 4 in the same deployment packages. I'm not sure if this is even possible with Lucene, or how much effort it would require. e) Use a custom classloader setup to load the correct version of Lucene depending on the selected Jackrabbit mode. f) Adjust the Jackrabbit deployment packages to use an embedded OSGi container, and use it to selectively deploy the required implementation components, including the correct version of Lucene. g) Or as a last resort, abandon the idea of a joint deployment package. Jackrabbit Classic and Oak would be shipped in separate deployment artifacts. I'm thinking of trying to implement one or two of these alternatives within the next few weeks, and cut Jackrabbit 2.8 based on that work and including something like Oak 0.16 as a beta feature. Assuming that approach works and Oak stabilizes as planned, we could then follow up with Jackrabbit 3.0 fairly soon after 2.8. [1] http://maven.apache.org/plugins/maven-shade-plugin/ BR, Jukka Zitting
[jira] [Commented] (JCR-3705) Extract data store API and implementations from jackrabbit-core
[ https://issues.apache.org/jira/browse/JCR-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837643#comment-13837643 ] Felix Meschberger commented on JCR-3705: +1 for a new jackrabbit-data component Extract data store API and implementations from jackrabbit-core --- Key: JCR-3705 URL: https://issues.apache.org/jira/browse/JCR-3705 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-core Reporter: Jukka Zitting In Oak we'd like to use the Jackrabbit data stores (OAK-805). Doing so would currently require a direct dependency to jackrabbit-core, which is troublesome for various reasons. Since the DataStore interface and its implementations are mostly independent of the rest of Jackrabbit internals, it should be possible to avoid that dependency by moving the data store bits to some other component. One alternative would be to place them in jackrabbit-jcr-commons, another to create a separate new jackrabbit-data component for this purpose. WDYT? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (JCR-3293) AbstractLoginModule: get rid of trust_credentials_attribute
[ https://issues.apache.org/jira/browse/JCR-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13815935#comment-13815935 ] Felix Meschberger commented on JCR-3293: Codewise, something like this, I think: {code} Subject s = getAndPopulateTheSubject(); Session session = Subject.doAs(s, new PrivilegedExceptionAction() { public Session run() throws RepositoryException { return repository.login(); } }); {code} (plus proper exception handling and unwrapping, of course) AbstractLoginModule: get rid of trust_credentials_attribute --- Key: JCR-3293 URL: https://issues.apache.org/jira/browse/JCR-3293 Project: Jackrabbit Content Repository Issue Type: Bug Components: jackrabbit-core Affects Versions: 2.4 Reporter: angela based on JCR-2355 we added a very simplistic way to indicate to the login module that the given credentials have been preauthenticated. as already stated in the original issue this poses a major security issue as it leaves the repository access untrusted. i would like to raise those security concern again and would therefore like to get rid of that hack in the long run. the suggested procedure: - deprecate the attribute (immediately) - log a warning if it is used (immediately) - document how to fix code that is currently relying on that attribute - remove support altogether for the next major release -- This message was sent by Atlassian JIRA (v6.1#6144)
AW: Can I use LoginModulePlugins in Oak?
Chetan may know more but IIRC Oak will support pluggable login modules out of the box. So we will probably deprecate our own plugins or retrofit them into Oak's pluggable API. Regards Felix Ursprüngliche Nachricht Von: Bertrand Delacretaz bdelacre...@apache.org Datum: An: oak-dev@jackrabbit.apache.org Betreff: Can I use LoginModulePlugins in Oak? Hi, Some Sling integration tests (SLING-3221) are failing, due to a form login mechanism that's supported by Sling's FormLoginModulePlugin [1] when running on Jackrabbit, and is not present on our Oak setup. Can I use plugins that implement org.apache.sling.jcr.jackrabbit.server.security.LoginModulePlugin in Oak (and how), or is there a similar mechanism? -Bertrand [1] https://svn.apache.org/repos/asf/sling/trunk/bundles/auth/form/src/main/java/org/apache/sling/auth/form/impl/FormLoginModulePlugin.java
Re: Oak JCR Observation scalability aspects and concerns
Hi Am 22.10.2013 um 11:17 schrieb Chetan Mehrotra: On Mon, Oct 21, 2013 at 11:39 PM, Jukka Zitting jukka.zitt...@gmail.com wrote: 3) The Observer mechanism allows a listener to look at repository changes in variable granularity and frequency depending on application needs and current repository load. Thus an Oak Observer can potentially process orders of magnitude more changes than a JCR event listener that needs to look at each individual changed item. +1 I think in Sling case it would make sense for it to be implemented as an Observer. And I had a look at implementation of some of the listener implementations of [1] and I think they can be easily moved to Sling OSGi events To be discussed on the Sling list -- though wearing my Sling hat I am extremely weary of implementing an Oak-dependency in Sling. Sling uses JCR. Regards Felix Chetan Mehrotra [1] https://gist.github.com/chetanmeh/7081328/raw/listeners-list-filtered.txt smime.p7s Description: S/MIME cryptographic signature
Re: Oak JCR Observation scalability aspects and concerns
Hi Am 22.10.2013 um 15:27 schrieb Jukka Zitting: Hi, On Tue, Oct 22, 2013 at 5:21 AM, Felix Meschberger fmesc...@adobe.com wrote: Am 22.10.2013 um 11:17 schrieb Chetan Mehrotra: I think in Sling case it would make sense for it to be implemented as an Observer. And I had a look at implementation of some of the listener implementations of [1] and I think they can be easily moved to Sling OSGi events To be discussed on the Sling list -- though wearing my Sling hat I am extremely weary of implementing an Oak-dependency in Sling. Sling uses JCR. Yet Sling is actively looking to expand its support for other non-JCR backends. ;-) That bears an interesting question, though: What is the relationship of Oak to JCR ? I think we should do the same thing here, i.e. have an implementation-independent abstraction in Sling that can be implemented both by plain JCR and directly by Oak. As discussed, the main scalability problem with the current JcrResourceListener design is that it needs to handle *all* changes and the event producer has no way to know which events really are needed. To avoid that problem and to make life easier for most typical listeners, I would suggest adding a whiteboard service interface like the following: What you are describing is already implemented in the JcrResourceListener and the OSGi EventAdmin service ;-) The JcrResourceListener just gets JCR Observation events, creates the OSGi Event objects and hands them over for distribution by the OSGi EventAdmin service. The latter service is then responsible for dispatching taking the EventHandler service registration properties into account for filtering. Events are collated on a node level with names of added, removed, and modified properties listed in event properties and the node path provided by the path property. Plus we add an indication of whether the event occurred locally or on another cluster node as well as the event's user id if available. Adding more properties to filter on would certainly be possible. Regards Felix interface ContentChangeListener { void contentAdded(String pathToAddedNode); void contentChanged(String pathToChangedNode); void contentRemoved(String pathToRemovedNode); } By registering such a service with a set of filter properties that identify which content changes are of interest, the client will start receiving callbacks at these methods whenever such changes are detected. The filter properties could be something like this: paths - the paths under which to listen for changes types - the types of nodes of interest nodes - the names of nodes of interest properties - the names of properties of interest For example, the following declaration would result in callbacks whenever there's a base version change of a versionable README.txt node somewhere under /source subtree: paths = /source types = mix:versionable nodes = README.txt properties = jcr:baseVersion Additionally, a granularity property could be set to coarse to indicate that it's fine to deliver events just at the top of a modified subtree. For example, if changes are detected both at /foo and /foo/bar, a coarsely grained listener would only need to be notified at /foo. Setting the property to fine would result in callbacks for both /foo and /foo/bar. For proper access controls, the service would also need to have a credentials property that contains the access credentials to be used for determining which events the listener is entitled to. It should be fairly straightforward to support such a service interface both with plain JCR observers and with an Oak Observer, with the latter being potentially orders of magnitude faster. BR, Jukka Zitting smime.p7s Description: S/MIME cryptographic signature
Re: Oak JCR Observation scalability aspects and concerns
Hi Am 22.10.2013 um 15:52 schrieb Jukka Zitting: Hi, On Tue, Oct 22, 2013 at 9:41 AM, Felix Meschberger fmesc...@adobe.com wrote: The JcrResourceListener just gets JCR Observation events, creates the OSGi Event objects and hands them over for distribution by the OSGi EventAdmin service. The latter service is then responsible for dispatching taking the EventHandler service registration properties into account for filtering. Right, but the problem here is that this design can't scale out to millions of events per second, as it requires each individual OSGi Event to be instantiated before filtering. The proposed service interface doesn't need to do that, so it can achieve a much higher throughput, and since no event queue is needed, there's no need to worry about the queue filling up. That's one Event object per event -- not one event per listener per event. This is completely different to JCR. Regards Felix BR, Jukka Zitting smime.p7s Description: S/MIME cryptographic signature
[jira] [Commented] (JCRVLT-18) Set default autosave threshold based on repository implementation
[ https://issues.apache.org/jira/browse/JCRVLT-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800429#comment-13800429 ] Felix Meschberger commented on JCRVLT-18: - [~tmueller] I think for package installation, stability is more important than performance. The main drawback of autosave (which is why I really don't like it) is that it is not atomic and may leave partially installed packages behind which are hard to cleanup. I would suggest to remove auto save alltogether and rather instruct users to use smaller packages. Set default autosave threshold based on repository implementation - Key: JCRVLT-18 URL: https://issues.apache.org/jira/browse/JCRVLT-18 Project: Jackrabbit FileVault Issue Type: Improvement Reporter: Tobias Bocanegra Priority: Minor with jackrabbit 2.0 we had a limitation of the size of the transient space as it is held in memory. in order to support large packages, the AutoSave threshold is set to 1024 nodes. with jackrabbit 3.0 the transient space is more or less unlimited in size, and we can install large packages in 1 save, which improves installation atomicity. however, the bigger the transient size, the higher the chance for collisions during installation of large packages, so saving in chunks yields to a more robust installation behavior. suggestions: - autosave threshold of 0 should mean 'auto' - autosave threshold of -1 should mean 'never' - packages can provide their desired autosave threshold via properties -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Migration without an embedded Jackrabbit
Hi I see the problem and I agree that this in fact *is* a problem. But I still don't agree with an integrated, transparent solution to this upgrade problem. And I never will -- such application bloat and even code duplication along with testing and maintenance etc. requirements just sound scaring. Also: Code duplication is one of the big evils in application development. IMNSHO migration of Jackrabbit 2 based repositories to Oak is a one-shot problem: you apply this once to a repository and be done. So why load the application with a host of unneeded pieces ? Rather, I suggest to come up with a standalone application, which can be a conglomerate of original Jackrabbit and Oak libraries and which do the migration in one step. This application can be optimized and fine-tuned to just this single use-case: migration. This way, both Jackrabbit 2 and Oak applications stay clean of such migration junk. This also makes it clear that migration of storage from Jackrabbit2 to Oak is not something that can and will be done by just snipping your fingers, but which is a potentially long-running and complex operation. Regards Felix Am 11.10.2013 um 16:28 schrieb Jukka Zitting: Hi, I've been thinking about the upgrade/migration code (oak-upgrade, OAK-458) over the past few days, and trying to figure out how we could achieve that without having to keep the full Jackrabbit 2.x codebase as dependency. The same question comes up for the support for Jackrabbit 2.x datastores (OAK-805). The key problem here is that the Jackrabbit 2.x codebase is already so convoluted that it's practically impossible to just pick up say something like an individual persistence manager or data store implementation and access it directly without keeping the rest of the 2.x codebase around. This is troublesome for many reasons, for example using such components require lots of extra setup code (essentially a full RepositoryImpl instance) and the size of the required extra dependencies is about a dozen megabytes. Thus I'm inclined to instead just implement the equivalent functionality directly in Oak. This requires some code duplication (we'd for example need the same persistence managers in both Oak and Jackrabbit), but the versions in Oak could be a lot simpler and more streamlined as only a subset of the functionality is needed. To reduce the amount of duplication we could push some of the shared utility code (like NodePropBundle, etc.) to jackrabbit-jcr-commons or to a new jackrabbit-shared component. WDYT? BR, Jukka Zitting smime.p7s Description: S/MIME cryptographic signature
Re: Migration without an embedded Jackrabbit
Hi Thanks for the clarifications. Yet, they more confirm my first impression than they resolve the concerns. Particularly your timing and smoothness assumptions. Migrating JR2 to Oak is much more complex than migrating from one version of JR2 to the next. I would even assume it is way more complex than migrating from JR1 to JR2. So, let's agree to disagree ;-) Regards Felix Am 14.10.2013 um 15:02 schrieb Jukka Zitting: Hi, On Mon, Oct 14, 2013 at 4:38 AM, Felix Meschberger fmesc...@adobe.com wrote: IMNSHO migration of Jackrabbit 2 based repositories to Oak is a one-shot problem: you apply this once to a repository and be done. So why load the application with a host of unneeded pieces ? I'd like to make the upgrade as smooth and transparent as possible. I all Jackrabbit versions so far the upgrade has required just starting up a new version of the repository, and any required migration steps have been handled transparently under the hood. Rather, I suggest to come up with a standalone application, which can be a conglomerate of original Jackrabbit and Oak libraries and which do the migration in one step. That's a valid alternative, especially since the Oak upgrade is by far the most complex migration so far. And I agree with your concerns about code duplication. In this case though I believe the benefits outweigh the drawbacks, see below. This application can be optimized and fine-tuned to just this single use-case: migration. Unfortunately it can't. While the Oak internals are designed with these kinds of bulk operations in mind, the Jackrabbit internals are not. There are some pretty major optimizations (like streaming bundles from a persistence manager, vs. loading them one-by-one) that we could do with custom upgrade-oriented code and that wouldn't be available with the standard Jackrabbit components. This way, both Jackrabbit 2 and Oak applications stay clean of such migration junk. Note that with the approach I'm proposing, all the custom migration code would go to the oak-upgrade component that's independent of the rest of the stack. Once the upgrade is done, a deployment can safely drop that component. This also makes it clear that migration of storage from Jackrabbit2 to Oak is not something that can and will be done by just snipping your fingers, but which is a potentially long-running and complex operation. My goal here is to make the upgrade *not* be a long-running and complex operation. BR, Jukka Zitting smime.p7s Description: S/MIME cryptographic signature
Re: Login parameters
Hi Am 29.07.2013 um 11:20 schrieb Jukka Zitting: Hi, It would be useful to be able to pass in extra parameters in a Repository.login() call, in order to for example control the auto-refresh or read-only status of the created session. Unfortunately the standard JCR API doesn't provide any straightforward way to do this, so I've come up with a few potential solutions: 1. Use the attribute feature of SimpleCredentials to pass such parameters: SimpleCredentials credentials = ...; credentials.setAttribute(AutoRefresh, true); Session session = repository.login(credentials, ...); The downside is that it's tied to the specific Credentials class being used. Yes, but this would be my favourite: For example in Apache Sling we have an authentication framework which allows plugging in a service which processes credential data (a MapString, Object later converted to SimpleCredentials with attributes). A service could be devised to set these flags depending on some configuration or request property. 2. Use the URI query parameter syntax to pass such parameters as a part of the workspace name: String workspace = ...; Session session = repository.login(..., workspace + ?AutoRefresh=true); The downside is the extra complexity of parsing the workspace string and the need to in many case look up the default workspace name (unless we define ?... to refer to the default workspace). Looks clumsy to me. After all the workspace name is a name not an URI. 3. Extend the JackrabbitRepository interface with a new login() method that explicitly allows such extra parameters: MapString, Object parameters = Collections.singletonMap(AutoRefresh, true); Session session = repository.login(..., parameters); The downside is the need for the custom API extension and the adjustments to all relevant implementations. We probably could justify adding such a method in JCR 2.1. Agreed. Plus this would be the fourth login method ... How many are there to come ? 4. Add a new utility class that uses a thread-local variable to pass such extra parameters through a normal login() call: MapString, Object parameters = Collections.singletonMap(AutoRefresh, true); Session session = LoginWrapper.login(repository, ..., parameters); The downside is the need for the custom utility class, and the extra complexity (especially for remoting) of using thread-local variables. Plus: this is hacky and leads to implementation dependent code à-la if (isJackrabbit) { doLoginWrapper } else { doRepositoryLogin } Regards Felix WDYT? BR, Jukka Zitting
Re: Login parameters
Am 29.07.2013 um 11:58 schrieb Michael Dürig: On 29.7.13 11:20, Jukka Zitting wrote: It would be useful to be able to pass in extra parameters in a Repository.login() call, in order to for example control the auto-refresh or read-only status of the created session. Unfortunately the standard JCR API doesn't provide any straightforward way to do this, so I've come up with a few potential solutions: For completeness sake: we could also change session attributes retrospectively by adding a setAttribute() method to JackrabbitSession. What would be the use for it ? providing some contextual information ? Is JackrabbitSession the right thing for that ? In a Servlet API request context we have Request (plus ServletContext and maybe JSP PageContext) attributes. In other context we have other functionality. We should probably not misuse the JackrabbitSession for such things .. Regards Felix However I don't think this is a good idea. Michael
Re: [OT] Donation of Adobe's File Vault and packaging tools
Hi Am 24.06.2013 um 23:40 schrieb Tobias Bocanegra: I was wondering what the status for this is? being buried in day-to-day work the last months, I just now completed the preparation for the contribution. I need one or 2 more days to find the best way for contributing (e.g. big patch vs. direct SVN commits). Since this is an existing code base: wouldn't we need proper IP-clearance [1] ? Regards Felix [1] http://incubator.apache.org/ip-clearance/index.html
Re: Observation throughput
Hi This sounds like hmm, lets see, may or may not work better; but it does today on my box ... Regards Felix Am 25.06.2013 um 12:47 schrieb Michael Dürig: On 25.6.13 9:26, Marcel Reutegger wrote: it might indeed be better to ask the MK for the diff earlier and avoid the comparison of unmodified child nodes. I lowered the threshold that determines from when on the Microkernel diff should be used from 100 to 10 at revision 1496403. This value performed best on MongoMk when comparing 100, 10 and 1 and it doesn't have much effect on other implementations. Michael
Re: Move MongoMK to Oak Core
Hi I have the impression, that this imght solve your short-term issue but in the long term you actually dilute modularity in that you integrte a single, special MK implementation into Oak Core. I may be way off the current implementation, but from a modularization POV this sounds perfectly wrong. Regards Felix Am 18.06.2013 um 07:31 schrieb Chetan Mehrotra: I wanted to make use of the new Whiteboard support [1] for publishing Cache related MBeans and for Scheduler Service. I see two options 1. Move the Whiteboard support to oak Commons 2. Move MongoMk to Oak Core Similar requirement was felt for use of StringCache in MongoMK [2]. There also it was suggested to move Mongo MK to Oak Core. So should we move the codebase to Oak Core and change the package from o.a.j.mongomk to o.a.j.oak.plugins.mongomk Chetan Mehrotra [1] https://issues.apache.org/jira/browse/OAK-867 [2] https://issues.apache.org/jira/browse/OAK-861
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641519#comment-13641519 ] Felix Meschberger commented on JCR-3534: Well, this message is an access token. If that would be the case, this would be bad design. The message data must not be a general access token. A key-value list would be nice This sounds like YAGNI -- I doubt we are going to use it. And if so, we can still adapt. There is just one place in the code involved which knows how to talk to itself Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641019#comment-13641019 ] Felix Meschberger commented on JCR-3534: To simplify development/support, the message should readable, for example JSON or an URL. Example (shortened) This sounds like the old mantra in the XML-days: Everything had to be XML, bla, bla, bla. Please just keep this simple and don't overexagerate. Having a string of colon separated values (if the value is structured in some way) is more that enough. Otherwise you incurr the price of parsing JSON ... Not worth it IMHO. Having expiry and encrypting the identifier would prevent further damage in case the BinaryReferenceMessage leaks. What is the problem with this message leaking as opposed to the actual data leaking which would be transported over the same channel completely unencrypted ? Again, this sounds like over-engineering to me. We should leave data protection to the transport layer (e.g. SSL) and just care to make sure that a data reference cannot be made up by an attacker (in the sense of try to find out whether a document exists). Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13634889#comment-13634889 ] Felix Meschberger commented on JCR-3534: Re. signed message or HMAC I agree with Jukka: The user id cannot be part of this transaction because there is no reason to assume that to be the same. Even if the name is the same the actual entity represented by the name need not be the same. This is essentially the same issue that NFS is faced with the numeric user and group IDs. But: we also need a way to inject that value through some public API such as the ValueFactory or some other means. Assuming RMI or Davex is not the going to work because we have two separate systems where the data is extracted through the JackrabbitValue.getContentIdentifier() (or some new API) method, serialized in a custom way and then reinjected through some new API (or the ValueFactory if that can differentiate between a message binary and a real binary !). Re. Future Proof Angela is quite right: It is essential, that whatever mechanism we implement for Jackrabbit 2 should also be available for Oak. Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13634905#comment-13634905 ] Felix Meschberger commented on JCR-3534: Re. ValueFactory#createValue(String, PropertyType.BINARY) Some thought on this: The implementation must make sure to (a) properly identify the hashed/signed/whatever value as being an identifier (no false positives or negatives!) and (b) reject identifiers for which there is no data store entry. Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13634974#comment-13634974 ] Felix Meschberger commented on JCR-3534: createValue(Binary) If we have a custom Binary implementation, we could also fail early when trying to create that instance: If the content Id cannot be resolved to a valid entry, the Binary object cannot be created and thus there is no need to call createValue(Binary) at all. Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13604991#comment-13604991 ] Felix Meschberger commented on JCR-3534: This does not solve my problems at all: I have two separate systems (separate JVMs and no clustering but shared data store) involved. So a Binary object from one system cannot be used on the other system without serializing and deserializing it. My proposal uses the data store identity as the serialization mechanism and uses the new JackrabbitSesssion.getValueByContentIdentity as the deserialization mechanism. Another option would be to have such a method on a new JackrabbitValueFactory class. But since we have a content identifier and want to get at content, we need some level of access control. So we need that access control in the JackrabbitValueFactory which would be the only method in the JackrabbitValueFactory which employs access control to create a value Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605061#comment-13605061 ] Felix Meschberger commented on JCR-3534: suggested getValueByContentId() method wouldn't work with Oak So JackrabbitValue.getContentIdentifier will not be supported in Oak, either ? I'd adjust the deployment configuration if you want to make those repositories share data more intimately. How if clustering is not an option ? Consider for example, I have an author box and a publish box. For various reasons both share the same Data Store. The straight forward approach to send binaries from the author box to the publish box would be to extract the binary from the Data Store on the author box and push it back in to the Data Store on the publish box, just for the Data Store to realize the binary is actually the same. On the reading side we can retrieve an identifier (JackrabbitValue.getContentIdentifier). I just need a way to reuse the binary data referred to by that content Identifier on the publish side. Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13605083#comment-13605083 ] Felix Meschberger commented on JCR-3534: but is the overhead high enough to justify the leak in the abstraction well, copying lots of 20MB binaries ? instead of lots of 50 character strings ? Do the math ;-) The abstraction already leaked with the JackrabbitValue.getContentIdentitiy method (ok, that would be the broken window ;-) ). Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Getting a value by its data identifier
Hi, I think there really are two sides to the story: (a) getting an ID (b) getting the data for that ID We may or may not be able -- on a large scale -- to prevent (a). After all getting an ID might just be the result of wold guessing and doing a brute force attack. We have to be able to limit (b): While restricting to admin sessions might be an option, I think that is not the right way to do it. I tend to agree with AlexK that a permission might be the way to do it. The problematic thing really is that permission checking is hooked to a repository path (and thus related to an Item) whereas here we don't have an item: The DataStore BLOB does not know where it belongs to -- and in a shared DataStore setup, there might not even be an owner property. In short: forget about (a). For(b) use a custom permission on / to grant access to the new method (denied by default, of course). Regards Felix Am 12.03.2013 um 16:09 schrieb Thomas Mueller: Hi, (a) Would such a method technically be possible (preventing actual large binary data copy !) ? Yes I think it's possible. Would this be needed for Oak or Jackrabbit 2.x or both? (c) Can we and if yes, how can we control access ? Currently the content identifier is the content hash (SHA-1), so there is no risk of enumeration or scanning attack (not sure what is the right word for this - where the attacker blindly tries out many possible ids in the hope to find one). One risk is that an attacker can prove a certain document is stored in the repository, where the attacker already has the document or at least knows the hash code. For example he could prove the wikileaks file x is stored in the repository, which might be a problem if possession of the wikileaks file x is illegal. Not sure if we need protection against that; if yes, we might only allow this method to be called for admin sessions or so. Another risk is that an attacker that has a list of identifiers might be able to get the documents in that way, if they are stored in the repository. The question is how did the attacker get the identifier, but if it's a simple SHA-1 it might be a bigger risk. One way to protect against that might be to encrypt the SHA-1 hash code with a repository-wide, configurable private key or so. Regards, Thomas -- Felix Meschberger | Principal Scientist | Adobe
Re: Getting a value by its data identifier
Hi, Am 12.03.2013 um 15:02 schrieb Alexander Klimetschek: On 12.03.2013, at 12:32, Felix Meschberger fmesc...@adobe.com wrote: Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. The idea sounds good to me :-) (Disclaimer: discussed this with Felix f2f before) Questions: (c) Can we and if yes, how can we control access ? It's a bit tricky, and I think the best way to do it is: - by default no access at all (getValueByContentIdentity() returns null aka not found) I would prefer a SecurityException, but JCR has a notion of no access looks the same as non-existing, so an ItemNotFoundException would probably be thrown in this case (due to JCR throwing an exception if something does not exist instead of just returning null). - have a special privilege for this feature, that you only want to enable for users that need this feature - because such a repository-wide optimization feature generally does require a user with wide permissions +1 We could use a repository level permission like we have to workspace creation. - nice to have: avoid that the content ID is a hash of the binary, so that an attacker (who already go the above privilege) still cannot infer existence of a binary he knows; but then he might have enough read write access already, as a user with that permission is likely to have broad rights, as for copying things over from one instance to another requires that We don't do such security by obscurity things for regular path and node ID acces. So we might not want to try it here. Rather we should provide proper access control on access. (d) What else ? This is practically only about Binaries and the FileDataStore, but the JackrabbitValue.getContentIdentity() is generic across all value types. If there might be such a store for other properties in the future, the content id must uniquely identify that store (e.g. value type) as well. I would expect such a content identity to be globally unique and internally handled by the repository such that roundtripping between getContentIdentity and getValueByContentIdentity can be guaranteed (provided access control allows for it. Regards Felix Cheers, Alex -- Felix Meschberger | Principal Scientist | Adobe
[jira] [Created] (JCR-3534) Add JackrabbitSession.getValueByContentId method
Felix Meschberger created JCR-3534: -- Summary: Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Reporter: Felix Meschberger we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Meschberger updated JCR-3534: --- Attachment: JCR-3534.patch Proposed patch adding the new API. To access the DataStore item by its ID I converted one method from InternalValue from package private to public since the InternvalValue.create(DataStore, String) method cannot be used: The getContentIdentifier method does not return the store prefix required by the create(DataStore, String) method. Also the create method does not validate the existence of a data store entry for the provided identifier. Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Getting a value by its data identifier
Hi I created https://issues.apache.org/jira/browse/JCR-3534 with a patch implementing the proposed method along with a unit test validating round tripping. Regards Felix Am 12.03.2013 um 12:32 schrieb Felix Meschberger: Hi all we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. Questions: (a) Would such a method technically be possible (preventing actual large binary data copy !) ? (b) Would a patch be accepted ? (c) Can we and if yes, how can we control access ? (c) What else ? Regards Felix -- Felix Meschberger | Principal Scientist | Adobe -- Felix Meschberger | Principal Scientist | Adobe
[jira] [Updated] (JCR-3534) Add JackrabbitSession.getValueByContentId method
[ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Meschberger updated JCR-3534: --- Component/s: jackrabbit-core jackrabbit-api Affects Version/s: 2.6 Add JackrabbitSession.getValueByContentId method Key: JCR-3534 URL: https://issues.apache.org/jira/browse/JCR-3534 Project: Jackrabbit Content Repository Issue Type: New Feature Components: jackrabbit-api, jackrabbit-core Affects Versions: 2.6 Reporter: Felix Meschberger Attachments: JCR-3534.patch we have a couple of use cases, where we would like to leverage the global data store to prevent sending around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances configured to use the same DataStore (for the sake of this discussion assume we have the problems of concurrent access and garbage collection under control). When sending content from one instance to the other instance we don't want to send potentially large binary data (e.g. video files) if not needed. The idea is for the sender to just send the content identity from JackrabbitValue.getContentIdentity(). The receiver would then check whether the such content already exists and would reuse if so: String ci = contentIdentity_from_sender; try { Value v = session.getValueByContentIdentity(ci); Property p = targetNode.setProperty(propName, v); } catch (ItemNotFoundException ie) { // unknown or invalid content Identity } catch (RepositoryException re) { // some other exception } Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method would allow for round tripping the JackrabbitValue.getContentIdentity() preventing superfluous binary data copying and moving. See also the dev@ thread http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Behaviour of Move Operations
So you essentially say: Behaviour of the repository is best effort and we -- at the end of the day -- cannot trust the repository ? Sounds frightening. IMHO the repository should be failsafe and thus eventually solve the issue making sure we don't end up with two copies of the same node (actually both copies, if referenceable, will even have the same node ID Regards Felix Am 27.02.2013 um 16:35 schrieb Jukka Zitting: Hi, On Wed, Feb 27, 2013 at 5:09 PM, Carsten Ziegeler cziege...@apache.org wrote: How will this be handled with Oak? Could it happen that due to this happening concurrently that the node ends up twice in the repository (at /1/node and /2/node in my example)? The behavior depends on the underlying MicroKernel implementation. With the new SegmentMK I've been working on, you can control the behavior: * If both cluster nodes use the same (root) journal, then only one of them succeeds and the other one will fail with an exception. The behavior is more or less the same as with current Jackrabbit. * If the cluster nodes use different journals (with background merging), then one of the moves will succeed and depending on timing the other one either fails or ends up producing a duplicate copy of the tree. The latter option is designed to boost write concurrency in scenarios where it's OK for some operations to get lost or produce somewhat inconsistent results (high-volume commenting or logging systems, etc.). Operations for which such behavior is not desirable should use the first option. BR, Jukka Zitting -- Felix Meschberger | Principal Scientist | Adobe
Re: Behaviour of Move Operations
Hi, Am 01.03.2013 um 13:47 schrieb Michael Dürig: What Jukka is saying is that the repository gives you a choice between consistency and availability. Since both you cannot have. I think you don't want to given the user that choice ... I'd opt for best possible availability (or probably you mean performance?) with guaranteed consistency. Regards Felix Michael On 1.3.13 12:40, Felix Meschberger wrote: So you essentially say: Behaviour of the repository is best effort and we -- at the end of the day -- cannot trust the repository ? Sounds frightening. IMHO the repository should be failsafe and thus eventually solve the issue making sure we don't end up with two copies of the same node (actually both copies, if referenceable, will even have the same node ID Regards Felix Am 27.02.2013 um 16:35 schrieb Jukka Zitting: Hi, On Wed, Feb 27, 2013 at 5:09 PM, Carsten Ziegeler cziege...@apache.org wrote: How will this be handled with Oak? Could it happen that due to this happening concurrently that the node ends up twice in the repository (at /1/node and /2/node in my example)? The behavior depends on the underlying MicroKernel implementation. With the new SegmentMK I've been working on, you can control the behavior: * If both cluster nodes use the same (root) journal, then only one of them succeeds and the other one will fail with an exception. The behavior is more or less the same as with current Jackrabbit. * If the cluster nodes use different journals (with background merging), then one of the moves will succeed and depending on timing the other one either fails or ends up producing a duplicate copy of the tree. The latter option is designed to boost write concurrency in scenarios where it's OK for some operations to get lost or produce somewhat inconsistent results (high-volume commenting or logging systems, etc.). Operations for which such behavior is not desirable should use the first option. BR, Jukka Zitting -- Felix Meschberger | Principal Scientist | Adobe -- Felix Meschberger | Principal Scientist | Adobe
Re: New Jackrabbit committer: Tommaso Teofili
Welcome and Congratulations ! Regards Felix Am 20.02.2013 um 15:43 schrieb Michael Dürig: Hi, Please welcome Tommaso Teofili as a new committer and PMC member of the Apache Jackrabbit project. The Jackrabbit PMC recently decided to offer Tommaso committership based on his contributions. I'm happy to announce that he accepted the offer and that all the related administrative work has now been taken care of. Welcome to the team, Tommaso! Michael -- Felix Meschberger | Principal Scientist | Adobe
Re: New Jackrabbit committer: Cédric Damioli
Welcome and Congratulations ! Regards Felix Am 20.02.2013 um 15:44 schrieb Michael Dürig: Hi, Please welcome Cédric Damioli as a new committer and PMC member of the Apache Jackrabbit project. The Jackrabbit PMC recently decided to offer Cédric committership based on his contributions. I'm happy to announce that he accepted the offer and that all the related administrative work has now been taken care of. Welcome to the team, Cédric! Michael -- Felix Meschberger | Principal Scientist | Adobe
Re: Time for jackrabbit-jcr-auth?
Hi Can you separate API from implementation in the same step ? Currently API and implementation is nicely mixed, which makes it close to impossible to properly use in an OSGi context. Regards Felix Am 19.02.2013 um 10:52 schrieb Jukka Zitting: Hi, When looking at the login() code for OAK-634 I realized that there's a a lot of duplication between jackrabbit-core and oak-core in this area. Would it make sense to split out the authentication code to something like jackrabbit-jcr-auth that could be used by both jackrabbit-core and oak-core. AFAICT there aren't too many places in the authentication code that require deep integration with the repository internals (unlike in authorization), so it should be possible to extract the relevant code to a separate component. Or am I mistaken? BR, Jukka Zitting -- Felix Meschberger | Principal Scientist | Adobe
[jira] [Commented] (JCR-3489) enhance get/set Property value access, expanding the supported types set
[ https://issues.apache.org/jira/browse/JCR-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536861#comment-13536861 ] Felix Meschberger commented on JCR-3489: Speaking of Sling: Our solution based on the Adaptable interface with the ValueMap is only a secondary benefit of the Resource API we did to abstract the complex JCR API with checked exceptions into a simpler, easier to use API which also allows for much simpler and easier extension by other implementations. I seriously doubt our Sling implementation will be changed to use this new functionality because our implementation exists and is proven and is embedded in a larger context (Adaptable with AdapterFactory). enhance get/set Property value access, expanding the supported types set Key: JCR-3489 URL: https://issues.apache.org/jira/browse/JCR-3489 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-jcr-commons Affects Versions: 2.5.2 Reporter: Simone Tripodi Priority: Minor Fix For: 2.6 Attachments: JCR-3489.patch The idea is having a small EDSL that simplifies the access to {{Property}} value, so rather than coding the following: {code} Property property = ...; boolean oldValue = property.getBoolean(); boolean newValue = !oldValue; property.setValue(newValue); {code} it could be simplified specifying wich type the users are interested on: {code} PropertyAccessors propertiesAccessor = ...; boolean oldValue = propertiesAccessor.get(property).to(boolean.class); boolean newValue = !oldValue; propertiesAccessor.set(newValue).to(property); {code} where {{PropertiesAccessor}} is the delegated to handle right types handling. By default it supports default {{Property}} value types, but it could be extended. It could happen also that users would like to support a larger set of types, maybe performing conversions to/from default {{Property}} types, so rather than inserting the custom code in the app when required, they could use the {{PropertiesAccessor}}; they first need to register the Accessor implementation to (un)bind the type: {code} propertiesAccessor.handle(URI.class).with(new PropertyAccessorURI() { @Override public void set(URI value, Property target) throws ValueFormatException, RepositoryException { // ... } @Override public URI get(Property property) throws ValueFormatException, RepositoryException { // TODO ... return null; } }); {code} so they can use the accessor via the {{PropertiesAccessor}}: {code} URI oldValue = propertiesAccessor.get(property).to(URI.class); URI newValue = URI.create(http://jackrabbit.apache.org/;); propertiesAccessor.set(newValue).to(property); {code} Patch coming soon! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Conflict handling in Oak
Hi, Just remember that MAY is difficult to handle by developers: Can I depend on it or not ? What if the MAY feature does not exist ? What if I develop on an implementation providing the MAY feature and then running on an implementation not providing the MAY feature ? In essence, a MAY feature basically must be considered as non-existing :-( All in all, please don't use MAY. Thanks from a developer ;-) Regards Felix Am 18.12.2012 um 09:37 schrieb Marcel Reutegger: Hi, To address 1) I suggest we define a set of clear cut cases where any Microkernel implementations MUST merge. For the other cases I'm not sure whether we should make them MUST NOT, SHOULD NOT or MAY merge. I agree and I think three cases are sufficient. MUST, MUST NOT and MAY. MUST is for conflicts we know are easy and straight forward to resolve. MUST NOT is for conflicts that are known to be problematic because there's no clean resolution strategy. MAY is for conflicts that have a defined resolution but we think happen rarely and is not worth implementing. I don't see how SHOULD NOT is useful in this context. regards marcel
[jira] [Commented] (JCR-3465) JcrUtils.getOrCreateByPath() creates a whole subtree instead of a single branch
[ https://issues.apache.org/jira/browse/JCR-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506318#comment-13506318 ] Felix Meschberger commented on JCR-3465: +1 to fixing. AFAICT from reading the JavaDoc support for a/b/../../c/d../../e/f is not implied. In fact I even consider this an unexpected side effect. JcrUtils.getOrCreateByPath() creates a whole subtree instead of a single branch --- Key: JCR-3465 URL: https://issues.apache.org/jira/browse/JCR-3465 Project: Jackrabbit Content Repository Issue Type: Bug Reporter: Michael Dürig Priority: Minor Given a leaf node n, JcrUtils.getOrCreateByPath(n, a/b/../../c/d/../../e/f, false, null, null, true); will result in paths a/b, c/d and e/f being added to n where I'd only expect the path e/f. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Support for long multivalued properties
Hi, Am 15.11.2012 um 14:06 schrieb Lukas Kahwe Smith: On Nov 15, 2012, at 14:02 , Thomas Mueller muel...@adobe.com wrote: Hi, before adding this i would rather want to see support for hash maps. Sounds interesting.. could you give more details please? well right now you can only store ordered lists. especially in PHP its common to use associative arrays $var = array( 'foo' = 'bar', 'ding' = 'dong', ) Hm, this looks more like node var with properties foo set to bar and ding set to dong to me. Regards Felix right now when I want to store such data in JCR/Jackarabbit I need to split this into 2 multi valued properties, one with the values and one with the keys, which then need to be reassembled on read. that being said .. if we want to add this in OAK, we should look into adding it into JCR 2.1 aka JSR-333 too regards, Lukas Kahwe Smith m...@pooteeweet.org
Re: New Jackrabbit committer: Philipp Marx
Welcome and Congratulations ! Regards Felix Am 01.11.2012 um 09:57 schrieb Michael Dürig: Hi, Please welcome Philipp Marx as a new committer and PMC member of the Apache Jackrabbit project. The Jackrabbit PMC recently decided to offer Philipp committership based on his contributions. I'm happy to announce that he accepted the offer and that all the related administrative work has now been taken care of. Welcome to the team, Philipp! Michael
Re: New Jackrabbit committer: Chetan Mehrotra
Welcome Chetan and Congratulations. Keep up the good work ! Regards Felix Am 21.09.2012 um 09:48 schrieb Michael Dürig: Hi, Please welcome Chetan Mehrotra as a new committer and PMC member of the Apache Jackrabbit project. The Jackrabbit PMC recently decided to offer Chetan committership based on his contributions. I'm happy to announce that he accepted the offer and that all the related administrative work has now been taken care of. Welcome to the team, Chetan! Michael
Re: New Jackrabbit committer: Randall Hauch
Congratulations and Welcome ! Regards Felix Am 21.09.2012 um 09:50 schrieb Michael Dürig: Hi, Better a bit late than never. Sorry about that Randall, here is the official welcome message! Please welcome Randall Hauch as a new committer and PMC member of the Apache Jackrabbit project. The Jackrabbit PMC recently decided to offer Randall committership based on his contributions. I'm happy to announce that he accepted the offer and that all the related administrative work has now been taken care of. Welcome to the team, Randall! Michael
[jira] [Commented] (OAK-245) Add import for org.h2 in oak-mk bundle
[ https://issues.apache.org/jira/browse/OAK-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435797#comment-13435797 ] Felix Meschberger commented on OAK-245: --- I don't consider this a bug. Let's say it doesn't work well with OSGi. Ok, lets land in on doesn't cope well with dynamic module systems. See also http://blog.meschberger.ch/2010/12/classforname-probably-not.html and BJ Hargrave's findings on this. Add import for org.h2 in oak-mk bundle -- Key: OAK-245 URL: https://issues.apache.org/jira/browse/OAK-245 Project: Jackrabbit Oak Issue Type: Bug Components: mk Reporter: Chetan Mehrotra Labels: osgi Attachments: import-h2.patch, OAK-245-load-driver.patch The oak-mk bundle depends on H2 database. It internally uses Class.forName('org.h2.Driver) to load the H2 driver. Due to usage of Class.forName Bnd is not able to add org.h2 package to Import-Package list. So it should have an explicit entry in the maven-bundle-plugin config as shown below {code:xml} Import-Package org.h2;resolution:=optional, * /Import-Package {code} Without this MicroKernalService loading would fail with a CNFE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (OAK-245) Add import for org.h2 in oak-mk bundle
[ https://issues.apache.org/jira/browse/OAK-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435106#comment-13435106 ] Felix Meschberger commented on OAK-245: --- bq. Class.forName('org.h2.Driver) Does it do forName without a class loader ? This would constitute a bug in itself. Add import for org.h2 in oak-mk bundle -- Key: OAK-245 URL: https://issues.apache.org/jira/browse/OAK-245 Project: Jackrabbit Oak Issue Type: Bug Components: mk Reporter: Chetan Mehrotra Labels: osgi Attachments: import-h2.patch The oak-mk bundle depends on H2 database. It internally uses Class.forName('org.h2.Driver) to load the H2 driver. Due to usage of Class.forName Bnd is not able to add org.h2 package to Import-Package list. So it should have an explicit entry in the maven-bundle-plugin config as shown below {code:xml} Import-Package org.h2;resolution:=optional, * /Import-Package {code} Without this MicroKernalService loading would fail with a CNFE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (OAK-225) Sling I18N queries not supported by Oak
[ https://issues.apache.org/jira/browse/OAK-225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431687#comment-13431687 ] Felix Meschberger commented on OAK-225: --- bq. it would be relatively easy to change the query in Sling You are of course welcome to raise an issue in Sling proposing a different Query, which (a) requires no to minial code changes and (b) works on both Jackrabit 2 and Oak. Thanks ;-) Sling I18N queries not supported by Oak --- Key: OAK-225 URL: https://issues.apache.org/jira/browse/OAK-225 Project: Jackrabbit Oak Issue Type: Bug Components: core Affects Versions: 0.3 Reporter: Jukka Zitting Priority: Minor Labels: sling, xpath The Sling I18N component issues XPath queries like the following: {code:none} //element(*,mix:language)[fn:lower-case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message) {code} Such queries currently fail with the following exception: {code:none} javax.jcr.query.InvalidQueryException: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: ( at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:115) at org.apache.jackrabbit.oak.jcr.query.QueryImpl.execute(QueryImpl.java:85) at org.apache.sling.jcr.resource.JcrResourceUtil.query(JcrResourceUtil.java:52) at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.queryResources(JcrResourceProvider.java:262) ... 54 more Caused by: java.text.ParseException: Query: //element(*,mix:language)[fn:lower-(*)case(@jcr:language)='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message); expected: ( at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.getSyntaxError(XPathToSQL2Converter.java:704) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.read(XPathToSQL2Converter.java:410) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseExpression(XPathToSQL2Converter.java:336) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseCondition(XPathToSQL2Converter.java:279) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseAnd(XPathToSQL2Converter.java:252) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.parseConstraint(XPathToSQL2Converter.java:244) at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.convert(XPathToSQL2Converter.java:153) at org.apache.jackrabbit.oak.query.QueryEngineImpl.parseQuery(QueryEngineImpl.java:86) at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:99) at org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:39) at org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:110) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (JCR-3350) Easy-to-use utility class for adding ACEs to nodes
[ https://issues.apache.org/jira/browse/JCR-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403828#comment-13403828 ] Felix Meschberger commented on JCR-3350: Some comments: * I would wrap the RepositoryException inside a generic RuntimeException. This makes handling unduly hard and inside a Jackrabbit class, it is probably appropriate to have RepositoryExceptions thrown. But those should be documented * Is it required to check for the existence of the item first ? In other words, would setting the ACL fail if the item would not exist ? What if the item is a property ? * I would get rid of the autoSave flag and just document this method does not save or rollback at all. Easy-to-use utility class for adding ACEs to nodes -- Key: JCR-3350 URL: https://issues.apache.org/jira/browse/JCR-3350 Project: Jackrabbit Content Repository Issue Type: Improvement Components: jackrabbit-jcr-commons Reporter: Jeff Young Assignee: angela Labels: newbie, patch Attachments: AccessControlUtils.java, JCR-3350_-_Easy-to-use_utility_class_for_adding_ACEs_to_nodes.patch There should be any easy (one-line) method for adding an ACE to a node in a repo supporting resource-based ACLs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Native HTTP bindings for Oak
Hi, Am 28.06.2012 um 01:15 schrieb Jukka Zitting: do you envision oak-jcr being a client of this http binding? No, we already have the Oak API for that. How about remotely located oak-jcr installs ? As in: oak-jcr on box X talking to oak-core on box Y. This should probably be able to leverage this new binding. Or do you envision some different Oak API remoting ? Regards Felix
Re: Native HTTP bindings for Oak
Hi, Am 27.06.2012 um 11:20 schrieb Jukka Zitting: Hi, On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber anch...@adobe.com wrote: i don't fully see the use case for such long living sessions. FWIW, this was my first thought, too: This completely breaks stateless-ness of HTTP and introduces the use of Sessions. We can do that, but we have to know exactly, what it means and costs. The rationale is the same as for the branch feature we added to the MicroKernel. Instead of having to maintain a separate transient tree abstraction on the client side (which might be troublesome given potentially limited storage capacity), it's better to be able to send uncommitted data to the server for storage in a temporary branch where it can be accessed using the existing tree abstraction already provided by Oak. Most notably the session feature allows us to use such a HTTP binding to implement remoting of the Oak API without the need for client-side storage space and associated extra code for managing it. IMO also importing big trees and batch read/writing should be covered by a single request. That quickly leads to increasingly complex server side features like filtering or conditional saves. For example, think of a client like the Sling engine that first resolves a path potentially with custom mappings, then follows a complex set of resource type references, and finally renders a representation of the resolved content based on the resource type definitions that were found. Ideally (for consistency and better caching support) it should be possible to perform the entire operation based on a stable snapshot of the repository, but there's no way that all the information required by such a process could be included in the response of a single Oak HTTP request. With my Sling hat on, I am not sure about this example ;-) IMNSHO Sling should operate on JCR API and not on Oak Native HTTP binding. Regards Felix Exposing the branch feature as proposed avoids the need for complex server-side request processing logic and makes it easier to implement many client-side features that would otherwise have to use local storage or temporary content subtrees visible to other repository clients. BR, Jukka Zitting
Re: Native HTTP bindings for Oak
Hi, Am 27.06.2012 um 12:13 schrieb Jukka Zitting: Hi, On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger fmesc...@adobe.com wrote: FWIW, this was my first thought, too: This completely breaks stateless-ness of HTTP and introduces the use of Sessions. I think you're misreading the proposal. The feature uses separate URI spaces so all information needed to access such sessions is encoded in each request and depends on no shared state between the client and the server. Its not about shared state but about state maintained on the server which means the exchange is not stateless any longer. The only difference to a Serlvet API HttpSession is, that the session key is part of the URI path instead of a request parameter or cookie. Regards Felix
[jira] [Commented] (OAK-153) Split the CommitHook interface
[ https://issues.apache.org/jira/browse/OAK-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402151#comment-13402151 ] Felix Meschberger commented on OAK-153: --- .bq The lifecycle of such hooks is up to the deployment Ok, this essentially means the split prevents bracketing. Maybe this should just be documented. Split the CommitHook interface -- Key: OAK-153 URL: https://issues.apache.org/jira/browse/OAK-153 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Jukka Zitting Assignee: Jukka Zitting The {{CommitHook}} interface has two methods, {{beforeCommit()}} and {{afterCommit()}}, since the symmetry originally seemed like a good idea. However, in practice these methods are not really so symmetric after all. For example, unlike {{afterCommit()}} the {{beforeCommit()}} method may end up being called multiple times for a given changeset if it needs to be repeatedly rebased or otherwise revised before it can be committed. There isn't even any guarantee that a particular changeset on which {{beforeCommit()}} has been called ever gets committed. And on the other hand there are good reasons to avoid calling {{afterCommit()}} on each and every commit that has been made. Instead it could be called only every now and then to cover larger sets of changes. Thus I'd like to split the {{CommitHook}} interface to two parts that I'd tentatively call {{CommitEditor}} and {{Observer}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Native HTTP bindings for Oak
Hi, Am 27.06.2012 um 13:49 schrieb Michael Dürig: On 27.6.12 12:50, Felix Meschberger wrote: Hi, Am 27.06.2012 um 12:13 schrieb Jukka Zitting: Hi, On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger fmesc...@adobe.com wrote: FWIW, this was my first thought, too: This completely breaks stateless-ness of HTTP and introduces the use of Sessions. I think you're misreading the proposal. The feature uses separate URI spaces so all information needed to access such sessions is encoded in each request and depends on no shared state between the client and the server. Its not about shared state but about state maintained on the server which means the exchange is not stateless any longer. Yes but that's no different from any other POST/PUT/DELETE request. Absolutely not. Per-se these requests are stateless. Jukka's proposal really introduces server state even though he hides it behind a resource URL. The state introduced is the JCR session kept on the server. Regards Felix Michael The only difference to a Serlvet API HttpSession is, that the session key is part of the URI path instead of a request parameter or cookie. Regards Felix
Re: Native HTTP bindings for Oak
Hi, Ah ! Sounds much better now. Thanks alot for the clarification. So $ curl -X DELETE http://localhost:8080/branch/X would in fact drop the branch, right ? Regards Felix Am 27.06.2012 um 15:00 schrieb Jukka Zitting: Hi, On Wed, Jun 27, 2012 at 12:50 PM, Felix Meschberger fmesc...@adobe.com wrote: Its not about shared state but about state maintained on the server which means the exchange is not stateless any longer. I don't follow this argument; the entire repository is one big piece of server-side state. Let's drop the term session here as it's clearly confusing things and call this feature branching: $ curl http://localhost:8080/content {} $ curl -d create=true http://localhost:8080/branch Location: http://localhost:8080/branch/X $ curl http://localhost:8080/branch/X {} $ curl -d foo=bar http://localhost:8080/branch/X {foo:bar} $ curl http://localhost:8080/content {} $ curl -d commit=true -d remove=true http://localhost:8080/branch/X $ curl http://localhost:8080/content {foo:bar} The only difference between such an operation and that of using a separate cloned subtree (or workspace) is that the latter is visible to all repository clients and the former only to those that have the relevant URI. BR, Jukka Zitting
[jira] [Commented] (OAK-153) Split the CommitHook interface
[ https://issues.apache.org/jira/browse/OAK-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401406#comment-13401406 ] Felix Meschberger commented on OAK-153: --- What is the intended lifecycle of these hooks ? Created on call or created on Repository/Oak/MK start ? What if a class implements both interfaces ? Background: I could imagine an implementation where beforeCommit and afterCommit cooperate in that the final afterCommit might be interested in stuff done for beforeCommit. Depending on how the classes are instantiated this might be simple or not. Split the CommitHook interface -- Key: OAK-153 URL: https://issues.apache.org/jira/browse/OAK-153 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Jukka Zitting Assignee: Jukka Zitting The {{CommitHook}} interface has two methods, {{beforeCommit()}} and {{afterCommit()}}, since the symmetry originally seemed like a good idea. However, in practice these methods are not really so symmetric after all. For example, unlike {{afterCommit()}} the {{beforeCommit()}} method may end up being called multiple times for a given changeset if it needs to be repeatedly rebased or otherwise revised before it can be committed. There isn't even any guarantee that a particular changeset on which {{beforeCommit()}} has been called ever gets committed. And on the other hand there are good reasons to avoid calling {{afterCommit()}} on each and every commit that has been made. Instead it could be called only every now and then to cover larger sets of changes. Thus I'd like to split the {{CommitHook}} interface to two parts that I'd tentatively call {{CommitEditor}} and {{Observer}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: time for JDK 1.6?
Hi, Am 31.05.2012 um 10:22 schrieb Julian Reschke: Hi, yesterday, while working on OAK, I accidentally made a jackrabbit (-tests) change containing a 1.6ism (String.isEmpty). Nothing serious, and easy to fix. Technically I am all for dropping the Java 5 requirement -- Java 5 is essentially no more. On the other hand: In Sling we have setup the Codehaus Animal Sniffer plugin to verify Java 5 API use only and fail the build otherwise. Regards Felix But it makes me wonder whether it makes sense to use 1.6 in OAK, but 1.5 in Jackrabbit (given the fact that we use subprojects from jackrabbit, such as -tests and -commons). Best regards, Julian
Re: Oak plugin model
Hi, With my OSGi hat on, this is ok with a few caveats: * Separate the (service) interfaces from implementations in distinct packages. It has been one the biggest annoyances with the current Jackrabbit implementation, that such plugin interfaces are generally mixed with implementations in the same package (apart from signatures often time referring to implementation classes). * Make sure to consider livecycle: In a traditional application which starts and stops, a service might be available from the start till the end of the application. In a modular application runnuning an OSGi framework a plugin (service) may come and go at any time. This should be accounted for in some way or another. Regards Felix Am 15.05.2012 um 18:18 schrieb Jukka Zitting: Hi, When we talk about Oak and about where in our stack a particular piece of functionality should be implemented, we often tend to approach the design as a layered stack of components from oak-jcr to oak-core and on to the underlying implementation. Possible extra features are usually presented as either new functionality in one of the existing components or as an extra layer in between or on top of this stack. This linear top-down view of dependencies seriously limits the architectural freedom and potential for extensibility we have. I've brought this up a few times in the context of other discussions, but I wanted to raise it also as a general concept to make sure we're all on the same page. To make this non-layered approach to Oak architecture more obvious, I outlined this idea in a diagram [1] that shows various potential features as pluggable extensions. Such an approach provides a nice third alternative to the kinds of discussions where we've been debating whether a particular feature (like name mapping, query parsing, etc.) should be located in oak-jcr or oak-core. By treating such functionality as pluggable extensions, accessed through well defined service provider interfaces, we get to keep the main Oak API clean and simple (e.g. no need for it to cover concepts like a namespace registry or an abstract query tree) and make the overall system much more modular and easier to customize. Ultimately such extra plugin components may well end up as separate Maven components, but until the related service interfaces and plugin boundaries are well defined it's better to keep all such code together and simply use Java package boundaries to separate them. That's the rationale behind the .oak.plugins package I recently created in oak-core, and I propose that we'll start moving also other pluggable parts (like some of the query bits and applicable parts of the recent security work) there. [1] http://people.apache.org/~jukka/2012/oak-plugins.png BR, Jukka Zitting
Re: Consolidate oak Utilities
Hi, Am 27.04.2012 um 12:39 schrieb Jukka Zitting: For example (and I know this has been discussed at length already), much of the path and json/p handling code shared between -core and -mk is just there to first generate jsop strings and then parse them again. It's obviously good to avoid duplicating such code, but even better if we could avoid it in the first place. Which would be an argument of non-stringly typed MK API and leave serialization stuff to be internals of any MK API implementation -- be it for actual persistence or for a remoting layer... Regards Felix
Re: JCR-MK property mapping
Hi, Am 25.04.2012 um 11:40 schrieb Jukka Zitting: Hi, On Wed, Apr 25, 2012 at 9:54 AM, Julian Reschke julian.resc...@gmx.de wrote: Would it make sense to optimize the persistence in that we wouldn't store the primary type when it happens to be nt:unstructured? Yes, though the default type should be something like oak:unstructured or jr3:unstructured that isn't orderable like nt:unstructured. Do we need a namespace ? How about just Unstructured ? Regards Felix
[jira] [Commented] (OAK-67) Initial OSGi Bundle Setup
[ https://issues.apache.org/jira/browse/OAK-67?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260289#comment-13260289 ] Felix Meschberger commented on OAK-67: -- I am ok with just exportin .api. But what about .util ? I would really find it easily informative to have packages intended for internal use only clearly marked with a kind-of-tag in the package. For example o.a.j.mk.util may be seen as internal or external ... but o.a.j.mk.internal.util will never be mistaken to be external. Initial OSGi Bundle Setup - Key: OAK-67 URL: https://issues.apache.org/jira/browse/OAK-67 Project: Jackrabbit Oak Issue Type: Improvement Components: core Affects Versions: 0.1 Reporter: Felix Meschberger Fix For: 0.2 Attachments: OAK-67.patch Since the plan for the 0.2 release intents to add initial OSGi budling functionality, we need to track this addition. Will come up with a patch and change proposal for such bundling. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Oak 0.2 release plan
Hi, Am 23.04.2012 um 09:33 schrieb Jukka Zitting: * OSGi bundle packaging I have created OAK-67 [1] to track this and will come up with a patch and recommendations for changes Regards Felix [1] https://issues.apache.org/jira/browse/OAK-67