Re: Initial work for the specification of a remote API
On Mon, Jan 26, 2015 at 12:38 PM, Francesco Mari mari.france...@gmail.com wrote: ...Other people already applied this concept successfully with the creation of the JSON Patch standard [1] I wasn't aware of that upcoming standard, looks interesting indeed! -Bertrand [1]: https://tools.ietf.org/html/rfc6902
Re: Initial work for the specification of a remote API
My reference to JSON Patch was not a recommendation to use it in this context; it was a counter-example to explain that, in my opinion, JSOP adds nothing to JSON except another notation that must be understood, parsed and supported, potentially in non-Java environments too. On the other hand, JSON Patch is certainly an inspiration for this use case. The reason I don't recommend it is that it works with a flat namespace. Some operations, namely add and remove, have overloaded behavior depending on the object referenced by the path property. The repository, instead, works with two namespaces, one for properties and another for nodes. Given a path /a/b/c, it is not immediately clear whether the client intends to manipulate a child c of the node /a/b or a property c of the same node. I would prefer, instead, our own version of patch, with a very small set of operations that convey the right semantics without any guessing involved. 2015-01-26 13:50 GMT+01:00 Bertrand Delacretaz bdelacre...@apache.org: On Mon, Jan 26, 2015 at 12:38 PM, Francesco Mari mari.france...@gmail.com wrote: ...Other people already applied this concept successfully with the creation of the JSON Patch standard [1] I wasn't aware of that upcoming standard, looks interesting indeed! -Bertrand [1]: https://tools.ietf.org/html/rfc6902
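A minimal sketch of the "own version of patch" idea from the message above, assuming an in-memory tree. The operation names (add-node, remove-node, set-property, unset-property) and the Node class are hypothetical and are not taken from operations.md; the point is only that the operation type, not the path, says which namespace is meant.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Two separate namespaces, as in the repository model described above.
    properties: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)


def resolve(root, path):
    """Walk an absolute path such as /a/b and return the node it names."""
    node = root
    for name in [p for p in path.split("/") if p]:
        node = node.children[name]
    return node


def apply_op(root, op):
    """Apply one operation; its type states the namespace explicitly."""
    parent = resolve(root, op["parent"])
    kind = op["type"]
    if kind == "add-node":
        parent.children[op["name"]] = Node()
    elif kind == "remove-node":
        del parent.children[op["name"]]
    elif kind == "set-property":
        parent.properties[op["name"]] = op["value"]
    elif kind == "unset-property":
        del parent.properties[op["name"]]
    else:
        raise ValueError(f"unknown operation type: {kind}")


# The same name "c" under /a/b can be a child node and a property at once.
root = Node(children={"a": Node(children={"b": Node()})})
apply_op(root, {"type": "add-node", "parent": "/a/b", "name": "c"})
apply_op(root, {"type": "set-property", "parent": "/a/b", "name": "c", "value": 42})
b = resolve(root, "/a/b")
```

With a single flat path /a/b/c, the two calls above could not be told apart without inspecting the target, which is the guessing the message wants to avoid.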
Re: Initial work for the specification of a remote API
On 26.1.15 3:13 , Francesco Mari wrote: 3. enables horizontal scalability by eliminating state on the server. I didn't follow all the communication yet as I'm on vacation. However I think we can weaken this precondition to 3'. eliminating *transient* state on the server. See https://issues.apache.org/jira/browse/OAK-2416. With this you should be able to get 1., 2., and 3'. Michael
Re: Initial work for the specification of a remote API
On 26 Jan 2015, at 15:13, Francesco Mari mari.france...@gmail.com wrote: What you are proposing is a REST interface where every node can be treated as an individual resource. This will not work. To fully leverage the MVCC architecture of Oak, a set of related GET, POST, PUT and DELETE requests must belong to the same context. This context, in Oak, is represented by a ContentSession. To express the fact that two or more requests belong to the same ContentSession, the API must provide the following functionalities: 1. Open a session. This returns a session ID that must be saved by the client. 2. Require a valid session ID on every following POST, PUT, DELETE and GET method. 3. Commit or discard a session. This also requires a session ID, obviously. This solution forces the server to maintain ContentSession instances. When a client opens a session on an instance of the server, it is bound to that server instance until the ContentSession is committed or discarded. This approach enforces a really strict constraint on the horizontal scalability of the server. We can still provide a REST API, but to avoid this constraint we have to make the API a little bit more coarse-grained. In this case we have to work with trees, and not with single nodes (technically speaking, a node is a tree with depth zero, but let's ignore it for the sake of the explanation). The root of the repository is the root of the tree we are working on. Every tree in the repository (the root or any sub-tree of the root) supports a GET method that allows reading that tree, optionally filtering out properties and children and enforcing the maximum depth to read. The root of the repository is a special tree that also supports a PATCH method. This method allows the client to specify a set of operations to make the tree move from one valid state to another valid state. The PATCH operation is not bound to any session, is atomic and can be performed by any instance of the server. This is my analysis so far.
I am open to any solution that would combine these qualities: 1. makes effective use of the MVCC approach of Oak. 2. is compliant with the REST architecture. 3. enables horizontal scalability by eliminating state on the server. I think it would be good to still provide a clean REST API, just because some use cases can indeed be covered this way and it will keep things simple. Under the hood Oak supports cheap branching, so another option would be to move the “session” management to the client by simply allowing them to fork the content, issue the changes and then finalize things by sending a merge. Again this can work well for many use cases but not all. Finally, yes, we will also want a way to batch such changes into a single request, since round trips do matter of course. One of the tricky aspects we learned from using the old Jackrabbit 2.x remoting API, however, is that it can become hard to predict when a change set becomes too big to still be reasonably processed via the remoting API in one batch. regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Oak 1.0.10 release
Hi, thanks Amit for taking care of the 1.0.10 release. In the meantime OAK-2442 came up and I'd like to include the fix in a release from the 1.0 branch. I will therefore cut a new 1.0.11 release soon. Regards Marcel On 22/01/15 15:50, Marcel Reutegger mreut...@adobe.com wrote: Hi, it's been a while since the last release off the 1.0 branch and we resolved a number of issues in the meantime. I'd like to cut a release as soon as we have resolved the remaining three issues currently scheduled for 1.0.10: - OAK-2433: IllegalStateException for ValueMap on _revisions - OAK-2434: Lucene AND query with a complex OR phrase returns incorrect result - OAK-2431: Avoid wrapping of LuceneIndexProvider with AggregateIndexProvider in tests Regards Marcel
Re: Initial work for the specification of a remote API
What you are proposing is a REST interface where every node can be treated as an individual resource. This will not work. To fully leverage the MVCC architecture of Oak, a set of related GET, POST, PUT and DELETE requests must belong to the same context. This context, in Oak, is represented by a ContentSession. To express the fact that two or more requests belong to the same ContentSession, the API must provide the following functionalities: 1. Open a session. This returns a session ID that must be saved by the client. 2. Require a valid session ID on every following POST, PUT, DELETE and GET method. 3. Commit or discard a session. This also requires a session ID, obviously. This solution forces the server to maintain ContentSession instances. When a client opens a session on an instance of the server, it is bound to that server instance until the ContentSession is committed or discarded. This approach enforces a really strict constraint on the horizontal scalability of the server. We can still provide a REST API, but to avoid this constraint we have to make the API a little bit more coarse-grained. In this case we have to work with trees, and not with single nodes (technically speaking, a node is a tree with depth zero, but let's ignore it for the sake of the explanation). The root of the repository is the root of the tree we are working on. Every tree in the repository (the root or any sub-tree of the root) supports a GET method that allows reading that tree, optionally filtering out properties and children and enforcing the maximum depth to read. The root of the repository is a special tree that also supports a PATCH method. This method allows the client to specify a set of operations to make the tree move from one valid state to another valid state. The PATCH operation is not bound to any session, is atomic and can be performed by any instance of the server. This is my analysis so far. I am open to any solution that would combine these qualities: 1. makes effective use of the MVCC approach of Oak. 2. is compliant with the REST architecture. 3. enables horizontal scalability by eliminating state on the server. 2015-01-26 13:00 GMT+01:00 Felix Meschberger fmesc...@adobe.com: +100 ! * Type=remove is exactly DELETE and we should do it * type=add is just PUT or POST * type=set likewise is just PUT or POST * type=unset is exactly DELETE So, please use those. Regards Felix On 26.01.2015 at 10:00, Lukas Kahwe Smith sm...@pooteeweet.org wrote: On 26 Jan 2015, at 09:55, Francesco Mari mari.france...@gmail.com wrote: I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be thought of as a generalization of the Read tree operation. can you elaborate why you are using an RPC style protocol, rather than something more along the lines of REST. for example: { type: remove, path: /a/b/c } could just be a DELETE on /a/b/c regards, Lukas Kahwe Smith sm...@pooteeweet.org
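The tree-oriented GET described in this message (read a tree, optionally filter properties, cap the depth) can be sketched roughly as follows. This is an illustration over plain dictionaries, not Oak code; the function name read_tree and the parameters max_depth and property_filter are assumptions made up for the example.

```python
def read_tree(tree, max_depth, property_filter=None):
    """Return a copy of `tree` limited to `max_depth` levels of children.

    `property_filter` is an optional set of property names to keep;
    None keeps every property.
    """
    props = {
        name: value
        for name, value in tree.get("properties", {}).items()
        if property_filter is None or name in property_filter
    }
    result = {"properties": props}
    if max_depth > 0:
        # Descend one level and spend one unit of the depth budget.
        result["children"] = {
            name: read_tree(child, max_depth - 1, property_filter)
            for name, child in tree.get("children", {}).items()
        }
    return result


repo = {
    "properties": {"title": "root", "secret": "x"},
    "children": {
        "a": {
            "properties": {"title": "A"},
            "children": {"b": {"properties": {"title": "B"}, "children": {}}},
        }
    },
}
view = read_tree(repo, max_depth=1, property_filter={"title"})
```

A depth of zero returns only the properties of the requested node, which matches the remark that a node is a tree with depth zero. The read carries no session ID, so any server instance could answer it.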
[VOTE] Release Apache Jackrabbit Oak 1.0.11
Hi, A candidate for the Jackrabbit Oak 1.0.11 release is available at: https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.11/ The release candidate is a zip archive of the sources in: https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.11/ The SHA1 checksum of the archive is 46dbc7cef124cdfd607257f7b33caf1e02271696. A staged Maven repository is available for review at: https://repository.apache.org/ The command for running automated checks against this release candidate is: $ sh check-release.sh oak 1.0.11 46dbc7cef124cdfd607257f7b33caf1e02271696 Please vote on releasing this package as Apache Jackrabbit Oak 1.0.11. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.11 [ ] -1 Do not release this package because... My vote is +1. Regards Marcel
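For reviewers who want to see what the checksum comparison amounts to, here is a small sketch that computes the SHA1 of a downloaded archive and compares it with the announced value. It is not the check-release.sh script mentioned above, which is not reproduced in this thread; the function names are made up for the example and the archive path is whatever the reviewer downloaded.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Stream the file through SHA1 so large archives are not read at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_announced_checksum(path, announced):
    """Compare case-insensitively and ignore surrounding whitespace."""
    return sha1_of_file(path) == announced.strip().lower()
```

A reviewer would call matches_announced_checksum with the local path of the source zip and the 40-character value from the vote mail.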
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.11
[X] +1 Release this package as Apache Jackrabbit Oak 1.0.11 On Mon, Jan 26, 2015 at 4:27 PM, Marcel Reutegger mreut...@adobe.com wrote: Hi, A candidate for the Jackrabbit Oak 1.0.11 release is available at: https://dist.apache.org/repos/dist/dev/jackrabbit/oak/1.0.11/ The release candidate is a zip archive of the sources in: https://svn.apache.org/repos/asf/jackrabbit/oak/tags/jackrabbit-oak-1.0.11/ The SHA1 checksum of the archive is 46dbc7cef124cdfd607257f7b33caf1e02271696. A staged Maven repository is available for review at: https://repository.apache.org/ The command for running automated checks against this release candidate is: $ sh check-release.sh oak 1.0.11 46dbc7cef124cdfd607257f7b33caf1e02271696 Please vote on releasing this package as Apache Jackrabbit Oak 1.0.11. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.11 [ ] -1 Do not release this package because... My vote is +1. Regards Marcel
Re: [segment] offline compaction broken?
Hi Stefan, Offline compaction should work properly. Can you quickly check the number of checkpoints? alex On Mon, Jan 26, 2015 at 6:12 PM, Stefan Egli stefane...@apache.org wrote: Hi, Before I dig too deep I built the latest trunk and tried to run offline compaction but see a weird behavior where oak-run starts filling one tar file after the other basically increasing seemingly endlessly. Is this known or only me? Cheers, Stefan
[segment] offline compaction broken?
Hi, Before I dig too deep I built the latest trunk and tried to run offline compaction but see a weird behavior where oak-run starts filling one tar file after the other basically increasing seemingly endlessly. Is this known or only me? Cheers, Stefan
Re: [VOTE] Release Apache Jackrabbit Oak 1.0.11
On 2015-01-26 16:27, Marcel Reutegger wrote: ... Please vote on releasing this package as Apache Jackrabbit Oak 1.0.11. The vote is open for the next 72 hours and passes if a majority of at least three +1 Jackrabbit PMC votes are cast. [ ] +1 Release this package as Apache Jackrabbit Oak 1.0.11 [ ] -1 Do not release this package because... My vote is +1. Regards Marcel [X] +1 Release this package as Apache Jackrabbit Oak 1.0.11
Re: Initial work for the specification of a remote API
On Mon, Jan 26, 2015 at 10:28 AM, Francesco Mari mari.france...@gmail.com wrote: At the beginning I wanted to expose a more granular interface for node operations, mapping every node to a fully REST resource BTW, what happened to JSOP? http://slideshare.net/uncled/jsop -Bertrand
Re: Initial work for the specification of a remote API
On 26 Jan 2015, at 11:07, Bertrand Delacretaz bdelacre...@apache.org wrote: On Mon, Jan 26, 2015 at 10:28 AM, Francesco Mari mari.france...@gmail.com wrote: At the beginning I wanted to expose a more granular interface for node operations, mapping every node to a fully REST resource BTW, what happened to JSOP? http://slideshare.net/uncled/jsop I was going to bring that up too. Last time I spoke to David N. he was basically open for someone else to take over pushing it forward. regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Initial work for the specification of a remote API
I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be thought of as a generalization of the Read tree operation. 2015-01-25 11:14 GMT+01:00 David Buchmann da...@liip.ch: hi, I started to write down a draft for the remote API in a public GitHub repository [1]. I didn't write much so far, but I invite every interested party to take a look at it for suggestions and improvements. [1]: https://github.com/francescomari/oak-remote looks good to me. we wrote a php client[1] to the jackrabbit remoting. the things i see seem to fix some of the issues we had. a key topic for us was to reduce the number of requests we need. at some point, somebody built a request to get multiple paths in one call, and there was discussion about a request to get content by multiple uuids. having those would help us a lot - in a normal php application flow, each web request starts its own context and loses the state at the end, so we fetch a bunch of nodes for every request... as soon as you have something running, i will be happy to try to use the API and give feedback how well it works. cheers,david [1] https://github.com/jackalope/jackalope-jackrabbit
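The multi-tree read discussed here, fetching several paths in one call instead of one request per node, could look roughly like this. It is a sketch over a dictionary-based tree; read_trees and its return shape are assumptions for illustration, not part of the draft API.

```python
def read_trees(repository, paths):
    """Resolve several absolute paths in one call.

    Returns a dict keyed by the requested path; a path that does not
    exist maps to None instead of failing the whole request.
    """
    result = {}
    for path in paths:
        node = repository
        for name in [p for p in path.split("/") if p]:
            if node is None:
                break
            node = node.get("children", {}).get(name)
        result[path] = node
    return result


repo = {"children": {"a": {"children": {"b": {"children": {}}}}}}
trees = read_trees(repo, ["/a/b", "/x/y"])
```

Keying the result by path lets a stateless client, such as the PHP flow described above, collect everything it needs for one web request in a single round trip.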
Re: Initial work for the specification of a remote API
On 26 Jan 2015, at 09:55, Francesco Mari mari.france...@gmail.com wrote: I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be thought of as a generalization of the Read tree operation. can you elaborate why you are using an RPC style protocol, rather than something more along the lines of REST. for example: { type: remove, path: /a/b/c } could just be a DELETE on /a/b/c regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: svn commit: r1654266 - in /jackrabbit/oak/branches/1.0: oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/ oak-auth-external/src/main/java/org/apache/j
On 23/01/2015 16:54, resc...@apache.org wrote: Author: reschke Date: Fri Jan 23 16:54:35 2015 New Revision: 1654266 URL: http://svn.apache.org/r1654266 Log: fix svn:eol-style I always found the svn/eol thingy annoying, so after a chat with Julian I did a bit of research. While in svn it's not possible to change the content of the commit before actually storing it (0), it's possible to install a hook which forces the client to have the proper settings (1). (0) http://stackoverflow.com/questions/5671406/force-svneol-style-native-on-the-server (1) http://stackoverflow.com/questions/2509803/how-to-avoid-mixed-eol-styles-in-a-svn-repository I don't know if it's possible to add hooks on apache svn, but if so, what are your thoughts on introducing such a hook? Cheers Davide
Re: Initial work for the specification of a remote API
At the beginning I wanted to expose a more granular interface for node operations, mapping every node to a fully REST resource. I figured out that this has some drawbacks. First of all, you have to create some kind of context between HTTP requests to guarantee that operations happen in your transient space and don't influence other clients. The only solution I found was to open a session on the server, assign an ID to it, and force clients to include the session ID on every request. To persist the changes made in a session, the client had to explicitly commit it. This way I could guarantee repeatable reads and isolated writes. This means that the server has to maintain multiple ContentSession objects and a mapping between session IDs and ContentSessions. This pushes a lot of state onto the server and prevents horizontal scalability. When a client opens a new session, it is bound to the server holding the session. To have a scalable solution, I had to push some complexity and state to the client. The only way a client can change the repository is to provide a patch - a series of operations - that represents the steps needed to bring the repository to the desired final state. The server can then apply those steps and persist the new state in the repository. Note that a patch can be processed by any server in a clustered deployment, because the client is not bound to any specific instance. 2015-01-26 10:00 GMT+01:00 Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 09:55, Francesco Mari mari.france...@gmail.com wrote: I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be thought of as a generalization of the Read tree operation. can you elaborate why you are using an RPC style protocol, rather than something more along the lines of REST. for example: { type: remove, path: /a/b/c } could just be a DELETE on /a/b/c regards, Lukas Kahwe Smith sm...@pooteeweet.org
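To make the "patch as a series of operations" idea concrete, here is a rough sketch of an all-or-nothing apply step. It assumes a dictionary-based tree and two made-up operation types (add-node, set-property); it does not show how Oak commits changes, only why no session has to survive between requests.

```python
import copy


def apply_patch(repository, operations):
    """Apply all operations to a copy and return the new state.

    If any operation fails, an exception propagates and the original
    repository object is left untouched, so the patch is atomic.
    """
    working = copy.deepcopy(repository)
    for op in operations:
        parent = working
        for name in [p for p in op["parent"].split("/") if p]:
            parent = parent["children"][name]  # KeyError if the path is missing
        if op["type"] == "add-node":
            if op["name"] in parent["children"]:
                raise ValueError("node already exists: " + op["name"])
            parent["children"][op["name"]] = {"properties": {}, "children": {}}
        elif op["type"] == "set-property":
            parent["properties"][op["name"]] = op["value"]
        else:
            raise ValueError("unsupported operation: " + op["type"])
    return working


repo = {"properties": {}, "children": {}}
patch = [
    {"type": "add-node", "parent": "/", "name": "a"},
    {"type": "set-property", "parent": "/a", "name": "title", "value": "hello"},
]
new_state = apply_patch(repo, patch)
```

Because the request carries every step, the function needs nothing but the current repository state and the patch, which is what allows any instance in a cluster to process it.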
Re: svn commit: r1654266 - in /jackrabbit/oak/branches/1.0: oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/ oak-auth-external/src/main/java/org/apache/j
On 2015-01-26 10:13, Davide Giannella wrote: On 23/01/2015 16:54, resc...@apache.org wrote: Author: reschke Date: Fri Jan 23 16:54:35 2015 New Revision: 1654266 URL: http://svn.apache.org/r1654266 Log: fix svn:eol-style I always found the svn/eol thingy annoying, so after a chat with Julian I did a bit of research. While in svn it's not possible to change the content of the commit before actually storing it (0), it's possible to install a hook which forces the client to have the proper settings (1). (0) http://stackoverflow.com/questions/5671406/force-svneol-style-native-on-the-server (1) http://stackoverflow.com/questions/2509803/how-to-avoid-mixed-eol-styles-in-a-svn-repository I don't know if it's possible to add hooks on apache svn, but if so, what are your thoughts on introducing such a hook? Cheers Davide Sounds very good to me. Best regards, Julian
Re: Initial work for the specification of a remote API
On Mon, Jan 26, 2015 at 9:55 AM, Francesco Mari mari.france...@gmail.com wrote: ...I think a multi-tree read request could be a good improvement to the API... In general, considering that you send a bunch of commands to the server and get a bunch of corresponding responses back might help a lot in reducing traffic. You might need to repeat part of the command in the response (or at least its ID) to keep things simple on the client side. -Bertrand
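The batching suggestion above, with each response repeating the ID of its command, can be sketched like this. The command shape (id, type, path) and the handler table are assumptions invented for the example.

```python
def run_batch(commands, handlers):
    """Execute each command and echo its ID in the matching response."""
    responses = []
    for command in commands:
        handler = handlers.get(command["type"])
        if handler is None:
            responses.append(
                {"id": command["id"], "ok": False, "error": "unknown command type"}
            )
            continue
        responses.append(
            {"id": command["id"], "ok": True, "result": handler(command)}
        )
    return responses


# A stand-in handler; a real server would read from the repository here.
handlers = {"read": lambda command: "tree at " + command["path"]}
responses = run_batch(
    [
        {"id": "r1", "type": "read", "path": "/a"},
        {"id": "r2", "type": "bogus"},
    ],
    handlers,
)
# The client can index responses by ID without relying on their order.
by_id = {response["id"]: response for response in responses}
```

Echoing the ID keeps the client simple: it does not have to remember the order in which commands were sent to match them with their results.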
Re: Initial work for the specification of a remote API
The document I posted uses JSON only as a simple way to describe generic data structures. There is a big disclaimer at the beginning of operations.md. The operations are supposed to be described in an abstract way, without any protocol-dependent technology. Please, let's evaluate operations.md without thinking so much about JSON, JSOP or other serialization strategies. That said, since the topic was brought up, I have to admit that I'm not a big fan of JSOP. I don't see any benefit in that format, since it doesn't really add anything that couldn't be done with plain JSON, if I understand correctly. When we start thinking about an HTTP binding of the remote API, nothing stops us from supporting both JSON and JSOP, using HTTP content negotiation to allow the client to use whichever it likes. To begin, though, I would prefer to stick to JSON, since at the moment it is the more interoperable format of the two. 2015-01-26 11:09 GMT+01:00 Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 11:07, Bertrand Delacretaz bdelacre...@apache.org wrote: On Mon, Jan 26, 2015 at 10:28 AM, Francesco Mari mari.france...@gmail.com wrote: At the beginning I wanted to expose a more granular interface for node operations, mapping every node to a fully REST resource BTW, what happened to JSOP? http://slideshare.net/uncled/jsop I was going to bring that up too. Last time I spoke to David N. he was basically open for someone else to take over pushing it forward. regards, Lukas Kahwe Smith sm...@pooteeweet.org
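The content negotiation mentioned above could work along these lines: the server inspects the Accept header and picks the serialization it supports with the highest q-value. This is a simplified sketch that ignores most of the HTTP rules (type wildcards such as application/* and specificity ordering). The media type application/x-jsop is a placeholder invented for the example; the thread does not name a media type for JSOP.

```python
# "application/x-jsop" is a hypothetical media type used only for illustration.
SUPPORTED = ["application/json", "application/x-jsop"]


def negotiate(accept_header, supported=SUPPORTED):
    """Return the supported media type with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if media_type == "*/*":
            candidates = list(supported)
        else:
            candidates = [m for m in supported if m == media_type]
        if candidates and q > best_q:
            best, best_q = candidates[0], q
    return best
```

Listing application/json first in SUPPORTED makes it the default for clients that accept anything, in line with the preference to start with JSON.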
Re: Initial work for the specification of a remote API
On 26 Jan 2015, at 12:04, Francesco Mari mari.france...@gmail.com wrote: The document I posted uses JSON only as a simple way to describe generic data structures. There is a big disclaimer at the beginning of operations.md. The operations are supposed to be described in an abstract way, without any protocol-dependent technology. Please, let's evaluate operations.md without thinking so much about JSON, JSOP or other serialization strategies. That said, since the topic was brought up, I have to admit that I'm not a big fan of JSOP. I don't see any benefit in that format, since it doesn't really add anything that couldn't be done with plain JSON, if I understand correctly. well that is the idea .. as it's based on JSON :) it essentially extends JSON to specifically make it possible to express PATCH type requests in the context of a content repository. stuff like re-ordering etc. in that sense it's also useful to handle the issue you talk about: dealing with multiple changes that you might have inside a remote session without needing a session, since you can do it all in a single request. regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Initial work for the specification of a remote API
My point is that you probably don't need to extend a format when the format you are extending is already powerful enough to express what you need. Other people already applied this concept successfully with the creation of the JSON Patch standard [1]. [1]: https://tools.ietf.org/html/rfc6902 2015-01-26 12:21 GMT+01:00 Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 12:04, Francesco Mari mari.france...@gmail.com wrote: The document I posted uses JSON only as a simple way to describe generic data structures. There is a big disclaimer at the beginning of operations.md. The operations are supposed to be described in an abstract way, without any protocol-dependent technology. Please, let's evaluate operations.md without thinking so much about JSON, JSOP or other serialization strategies. That said, since the topic was brought up, I have to admit that I'm not a big fan of JSOP. I don't see any benefit in that format, since it doesn't really add anything that couldn't be done with plain JSON, if I understand correctly. well that is the idea .. as it's based on JSON :) it essentially extends JSON to specifically make it possible to express PATCH type requests in the context of a content repository. stuff like re-ordering etc. in that sense it's also useful to handle the issue you talk about: dealing with multiple changes that you might have inside a remote session without needing a session, since you can do it all in a single request. regards, Lukas Kahwe Smith sm...@pooteeweet.org
Re: Initial work for the specification of a remote API
+100 ! * Type=remove is exactly DELETE and we should do it * type=add is just PUT or POST * type=set likewise is just PUT or POST * type=unset is exactly DELETE So, please use those. Regards Felix On 26.01.2015 at 10:00, Lukas Kahwe Smith sm...@pooteeweet.org wrote: On 26 Jan 2015, at 09:55, Francesco Mari mari.france...@gmail.com wrote: I think a multi-tree read request could be a good improvement to the API. Technically speaking, it may be thought of as a generalization of the Read tree operation. can you elaborate why you are using an RPC style protocol, rather than something more along the lines of REST. for example: { type: remove, path: /a/b/c } could just be a DELETE on /a/b/c regards, Lukas Kahwe Smith sm...@pooteeweet.org
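The mapping Felix lists can be written down as a small table. This only restates the four bullets above; the function to_http_request and the prefer parameter are made up for the example, and the choice between PUT and POST is left open exactly as in the message.

```python
# Operation type -> HTTP methods, as listed in the message above.
VERBS_FOR_OPERATION = {
    "remove": ["DELETE"],
    "add": ["PUT", "POST"],
    "set": ["PUT", "POST"],
    "unset": ["DELETE"],
}


def to_http_request(operation, prefer="PUT"):
    """Translate one RPC-style operation into an (HTTP method, path) pair."""
    verbs = VERBS_FOR_OPERATION[operation["type"]]
    method = prefer if prefer in verbs else verbs[0]
    return method, operation["path"]


request = to_http_request({"type": "remove", "path": "/a/b/c"})
```

Note that remove and unset both become DELETE on a path, so this mapping runs into the node-versus-property question raised elsewhere in the thread.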
Re: Initial work for the specification of a remote API
Hi, whether you use JSOP or RFC 6902 is essentially irrelevant. Maybe I slightly favour a standardised approach, hence RFC 6902. Regards Felix On 26.01.2015 at 12:38, Francesco Mari mari.france...@gmail.com wrote: My point is that you probably don't need to extend a format when the format you are extending is already powerful enough to express what you need. Other people already applied this concept successfully with the creation of the JSON Patch standard [1]. [1]: https://tools.ietf.org/html/rfc6902 2015-01-26 12:21 GMT+01:00 Lukas Kahwe Smith sm...@pooteeweet.org: On 26 Jan 2015, at 12:04, Francesco Mari mari.france...@gmail.com wrote: The document I posted uses JSON only as a simple way to describe generic data structures. There is a big disclaimer at the beginning of operations.md. The operations are supposed to be described in an abstract way, without any protocol-dependent technology. Please, let's evaluate operations.md without thinking so much about JSON, JSOP or other serialization strategies. That said, since the topic was brought up, I have to admit that I'm not a big fan of JSOP. I don't see any benefit in that format, since it doesn't really add anything that couldn't be done with plain JSON, if I understand correctly. well that is the idea .. as it's based on JSON :) it essentially extends JSON to specifically make it possible to express PATCH type requests in the context of a content repository. stuff like re-ordering etc. in that sense it's also useful to handle the issue you talk about: dealing with multiple changes that you might have inside a remote session without needing a session, since you can do it all in a single request. regards, Lukas Kahwe Smith sm...@pooteeweet.org