Re: [jr3] Orderable child nodes: required (to be the default)?
On 2012-01-23 11:43, Felix Meschberger wrote: ... I am not sure, whether this proposal does not open a can of worms: Consider using a node for child nodes whose retrieval should be ordered by insertion order. This is currently guaranteed by switching on ordered child nodes on the parent node, right ? So, applications using this will always use ordered nodes and thus suffer from performance again ... not good. ... How is "ordered by insertion order" and "allowing manual re-ordering" different from an application point of view? ... Best regards, Julia
Re: [jr3] Orderable child nodes: required (to be the default)?
On Sat, Jan 21, 2012 at 11:37 PM, Tobias Bocanegra wrote: so for large childnode lists, a stable but uncontrollable order would be ok, although violating the spec? I wouldn't violate the spec for this. If you have an orderable node (like nt:unstructured), then the repository should keep track of the order of child nodes regardless of how many they are. IMHO it's OK for the repository *not* to scale up or perform well if someone dumps a million child nodes to an orderable parent. However, it would be great if we could implement efficient storage for nodes with lots (i.e. millions) of unorderable children. In such a case I'd say we can expect the client to explicitly use a parent node with a custom node type that doesn't require orderability. I am not sure, whether this proposal does not open a can of worms: Consider using a node for child nodes whose retrieval should be ordered by insertion order. This is currently guaranteed by switching on ordered child nodes on the parent node, right ? So, applications using this will always use ordered nodes and thus suffer from performance again ... not good. Well, there is no free lunch... If we want to go 'big data' and distributed, we will have to make trade offs. What we need to do IMO is to carefully identify the areas where we can/want to trade off and make sure these are well understood. Michael
Re: [jr3] Orderable child nodes: required (to be the default)?
On 23.01.12 11:43, "Felix Meschberger" wrote: >Wouldn't it be such, that "unordered" might mean "no defined but stable >ordering" while ordered means "insertion order unless eplicitly changed" ? +1 Alex -- Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, Am 23.01.2012 um 10:43 schrieb Jukka Zitting: > Hi, > > On Sat, Jan 21, 2012 at 11:37 PM, Tobias Bocanegra wrote: >> so for large childnode lists, a stable but uncontrollable order >> would be ok, although violating the spec? > > I wouldn't violate the spec for this. If you have an orderable node > (like nt:unstructured), then the repository should keep track of the > order of child nodes regardless of how many they are. IMHO it's OK for > the repository *not* to scale up or perform well if someone dumps a > million child nodes to an orderable parent. > > However, it would be great if we could implement efficient storage for > nodes with lots (i.e. millions) of unorderable children. In such a > case I'd say we can expect the client to explicitly use a parent node > with a custom node type that doesn't require orderability. I am not sure, whether this proposal does not open a can of worms: Consider using a node for child nodes whose retrieval should be ordered by insertion order. This is currently guaranteed by switching on ordered child nodes on the parent node, right ? So, applications using this will always use ordered nodes and thus suffer from performance again ... not good. > > I wouldn't even promise a stable iteration order for such cases. The > repository should be free to for example reorder the internal data > structure based on frequent access patterns. +1 Though paging to the list will then not be possible at all ! Wouldn't it be such, that "unordered" might mean "no defined but stable ordering" while ordered means "insertion order unless eplicitly changed" ? Both must really perform well or we get bad press again Regards Felix
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, On Sat, Jan 21, 2012 at 11:37 PM, Tobias Bocanegra wrote: > so for large childnode lists, a stable but uncontrollable order > would be ok, although violating the spec? I wouldn't violate the spec for this. If you have an orderable node (like nt:unstructured), then the repository should keep track of the order of child nodes regardless of how many they are. IMHO it's OK for the repository *not* to scale up or perform well if someone dumps a million child nodes to an orderable parent. However, it would be great if we could implement efficient storage for nodes with lots (i.e. millions) of unorderable children. In such a case I'd say we can expect the client to explicitly use a parent node with a custom node type that doesn't require orderability. I wouldn't even promise a stable iteration order for such cases. The repository should be free to for example reorder the internal data structure based on frequent access patterns. BR, Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
I think there is a use case for large flat unordered lists (using a node type with orderable = false). For example, dictionaries, where the key and value are in the properties and the node name is not really relevant, and where you don't want to create nested structures just for the sake of scalability - which also increases the total number of nodes. Cheers, Alex On 21.01.12 23:37, "Tobias Bocanegra" wrote: >I agree, >so for large childnode lists, a stable but uncontrollable order would >be ok, although violating the spec? > >-- > >toby >On Sat, Jan 21, 2012 at 5:04 AM, Stefan Guggisberg > wrote: >> On Sat, Jan 21, 2012 at 3:00 AM, Tobias Bocanegra >>wrote: >>> nt:unstructured has orderable childnodes, and we also preach to use >>> nt:unstructured whereever you can. Also, i assume that a lot of >>> applications benefit from node ordering (like a CMS). So I think that >>> the majority of the repositories have a lot of nodes with ordered >>> childnodes. >> >> >> agreed, at least for small child node lists (e.g. < 1k child nodes). >> but i doubt there's a need for large (e.g. 1M entries) orderable >> child node lists. >> >>> thus i would not put too much effort in having an >>> efficient non-ordered format, but making the ordered storage as fast >>> as possible. >> >> i am thinking of large child node collections with stable order. >> however, they wouldn't be insertion-ordered nor client-orderable. >> that's the compromise. >> >>> using lexicographical ordering for unordered child nodes >>> would certainly be a good option, >> >> that would be an implementation detail. as long as the order >> is stable (repeated reads of the same node revision should >> provide the same order of the child nodes) we shouldn't make >> any promises about how the order is defined. >> >> cheers >> stefan >> >>> [...] as i can reduce the lookup (but not >>> insert). also, i think that re-orderings happen relatively rarely. >>> >>> regards, toby >>> >>> On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg >>> wrote: On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting wrote: > Hi, > > Thinking about this more generally, it would definitely be useful to > be able to use a more efficient underlying storage for unorderable > nodes. However, there still are hard use cases that require us to > support orderable nodes as indicated by the type of the parent node. > Thus the implementation basically has two options: > > 1) Use a single data structure for all nodes, like Jackrabbit > currently does. This simplifies the implementation but prevents us > from enjoying the performance and scalability benefits of unorderable > nodes. > > 2) Use two data structures, one for orderable and another for > unorderable nodes. This is more complex (not least because it links > node types to the underlying storage model), but is probably required > if we want to efficiently support up to millions of child nodes per a > single parent. i agree that it's probably required for efficiently handling both large and small child node lists. i am not sure that it necessarily means linking node types to the underlying storage modal. the microkernel prototype currently has no notion of node types and that's IMO a good thing. node types have a way to dominant role in the current jackrabbit core implementation. jackrabbit started out as the official reference implementation for jsr-170, the focus was on supporting every feature of the spec. in jr3 i guess we can and should trade some 'note type correctness' for improved efficiency. > > It might also be possible to have the underlying storage model > unorderable, and just include extra ordering metadata at a higher > layer where also node types are handled. Option 2b, if you like, with > probably fairly significant overhead when iterating over nodes (the > implementation would probably need to pre-fetch all child nodes in > advance to access their ordering metadata). > > If we do have native support for unorderable nodes, then I wouldn't > promise any particular ordering (alphabetic, insertion order, etc.) > but rather leave it undefined like in Java's Map interface. The > underlying implementation is then free to use whatever ordering it > thinks is best. i agree. cheers stefan > > PS. Note that ordering affects not just node traversal, but also the > document ordering used in search results. Though in practice we > already now disable document ordering of search results by default. > > BR, > > Jukka Zitting -- Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel
Re: [jr3] Orderable child nodes: required (to be the default)?
I agree, so for large childnode lists, a stable but uncontrollable order would be ok, although violating the spec? -- toby On Sat, Jan 21, 2012 at 5:04 AM, Stefan Guggisberg wrote: > On Sat, Jan 21, 2012 at 3:00 AM, Tobias Bocanegra wrote: >> nt:unstructured has orderable childnodes, and we also preach to use >> nt:unstructured whereever you can. Also, i assume that a lot of >> applications benefit from node ordering (like a CMS). So I think that >> the majority of the repositories have a lot of nodes with ordered >> childnodes. > > > agreed, at least for small child node lists (e.g. < 1k child nodes). > but i doubt there's a need for large (e.g. 1M entries) orderable > child node lists. > >> thus i would not put too much effort in having an >> efficient non-ordered format, but making the ordered storage as fast >> as possible. > > i am thinking of large child node collections with stable order. > however, they wouldn't be insertion-ordered nor client-orderable. > that's the compromise. > >> using lexicographical ordering for unordered child nodes >> would certainly be a good option, > > that would be an implementation detail. as long as the order > is stable (repeated reads of the same node revision should > provide the same order of the child nodes) we shouldn't make > any promises about how the order is defined. > > cheers > stefan > >> [...] as i can reduce the lookup (but not >> insert). also, i think that re-orderings happen relatively rarely. >> >> regards, toby >> >> On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg >> wrote: >>> On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting >>> wrote: Hi, Thinking about this more generally, it would definitely be useful to be able to use a more efficient underlying storage for unorderable nodes. However, there still are hard use cases that require us to support orderable nodes as indicated by the type of the parent node. Thus the implementation basically has two options: 1) Use a single data structure for all nodes, like Jackrabbit currently does. This simplifies the implementation but prevents us from enjoying the performance and scalability benefits of unorderable nodes. 2) Use two data structures, one for orderable and another for unorderable nodes. This is more complex (not least because it links node types to the underlying storage model), but is probably required if we want to efficiently support up to millions of child nodes per a single parent. >>> >>> i agree that it's probably required for efficiently handling both large >>> and small child node lists. >>> >>> i am not sure that it necessarily means linking node types to the >>> underlying storage modal. the microkernel prototype currently has >>> no notion of node types and that's IMO a good thing. >>> >>> node types have a way to dominant role in the current jackrabbit >>> core implementation. jackrabbit started out as the official reference >>> implementation for jsr-170, the focus was on supporting every >>> feature of the spec. >>> >>> in jr3 i guess we can and should trade some 'note type correctness' >>> for improved efficiency. >>> It might also be possible to have the underlying storage model unorderable, and just include extra ordering metadata at a higher layer where also node types are handled. Option 2b, if you like, with probably fairly significant overhead when iterating over nodes (the implementation would probably need to pre-fetch all child nodes in advance to access their ordering metadata). If we do have native support for unorderable nodes, then I wouldn't promise any particular ordering (alphabetic, insertion order, etc.) but rather leave it undefined like in Java's Map interface. The underlying implementation is then free to use whatever ordering it thinks is best. >>> >>> i agree. >>> >>> cheers >>> stefan >>> PS. Note that ordering affects not just node traversal, but also the document ordering used in search results. Though in practice we already now disable document ordering of search results by default. BR, Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
On Sat, Jan 21, 2012 at 3:00 AM, Tobias Bocanegra wrote: > nt:unstructured has orderable childnodes, and we also preach to use > nt:unstructured whereever you can. Also, i assume that a lot of > applications benefit from node ordering (like a CMS). So I think that > the majority of the repositories have a lot of nodes with ordered > childnodes. agreed, at least for small child node lists (e.g. < 1k child nodes). but i doubt there's a need for large (e.g. 1M entries) orderable child node lists. > thus i would not put too much effort in having an > efficient non-ordered format, but making the ordered storage as fast > as possible. i am thinking of large child node collections with stable order. however, they wouldn't be insertion-ordered nor client-orderable. that's the compromise. > using lexicographical ordering for unordered child nodes > would certainly be a good option, that would be an implementation detail. as long as the order is stable (repeated reads of the same node revision should provide the same order of the child nodes) we shouldn't make any promises about how the order is defined. cheers stefan > [...] as i can reduce the lookup (but not > insert). also, i think that re-orderings happen relatively rarely. > > regards, toby > > On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg > wrote: >> On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting >> wrote: >>> Hi, >>> >>> Thinking about this more generally, it would definitely be useful to >>> be able to use a more efficient underlying storage for unorderable >>> nodes. However, there still are hard use cases that require us to >>> support orderable nodes as indicated by the type of the parent node. >>> Thus the implementation basically has two options: >>> >>> 1) Use a single data structure for all nodes, like Jackrabbit >>> currently does. This simplifies the implementation but prevents us >>> from enjoying the performance and scalability benefits of unorderable >>> nodes. >>> >>> 2) Use two data structures, one for orderable and another for >>> unorderable nodes. This is more complex (not least because it links >>> node types to the underlying storage model), but is probably required >>> if we want to efficiently support up to millions of child nodes per a >>> single parent. >> >> i agree that it's probably required for efficiently handling both large >> and small child node lists. >> >> i am not sure that it necessarily means linking node types to the >> underlying storage modal. the microkernel prototype currently has >> no notion of node types and that's IMO a good thing. >> >> node types have a way to dominant role in the current jackrabbit >> core implementation. jackrabbit started out as the official reference >> implementation for jsr-170, the focus was on supporting every >> feature of the spec. >> >> in jr3 i guess we can and should trade some 'note type correctness' >> for improved efficiency. >> >>> >>> It might also be possible to have the underlying storage model >>> unorderable, and just include extra ordering metadata at a higher >>> layer where also node types are handled. Option 2b, if you like, with >>> probably fairly significant overhead when iterating over nodes (the >>> implementation would probably need to pre-fetch all child nodes in >>> advance to access their ordering metadata). >>> >>> If we do have native support for unorderable nodes, then I wouldn't >>> promise any particular ordering (alphabetic, insertion order, etc.) >>> but rather leave it undefined like in Java's Map interface. The >>> underlying implementation is then free to use whatever ordering it >>> thinks is best. >> >> i agree. >> >> cheers >> stefan >> >>> >>> PS. Note that ordering affects not just node traversal, but also the >>> document ordering used in search results. Though in practice we >>> already now disable document ordering of search results by default. >>> >>> BR, >>> >>> Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
nt:unstructured has orderable childnodes, and we also preach to use nt:unstructured whereever you can. Also, i assume that a lot of applications benefit from node ordering (like a CMS). So I think that the majority of the repositories have a lot of nodes with ordered childnodes. thus i would not put too much effort in having an efficient non-ordered format, but making the ordered storage as fast as possible. using lexicographical ordering for unordered child nodes would certainly be a good option, as i can reduce the lookup (but not insert). also, i think that re-orderings happen relatively rarely. regards, toby On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg wrote: > On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting > wrote: >> Hi, >> >> Thinking about this more generally, it would definitely be useful to >> be able to use a more efficient underlying storage for unorderable >> nodes. However, there still are hard use cases that require us to >> support orderable nodes as indicated by the type of the parent node. >> Thus the implementation basically has two options: >> >> 1) Use a single data structure for all nodes, like Jackrabbit >> currently does. This simplifies the implementation but prevents us >> from enjoying the performance and scalability benefits of unorderable >> nodes. >> >> 2) Use two data structures, one for orderable and another for >> unorderable nodes. This is more complex (not least because it links >> node types to the underlying storage model), but is probably required >> if we want to efficiently support up to millions of child nodes per a >> single parent. > > i agree that it's probably required for efficiently handling both large > and small child node lists. > > i am not sure that it necessarily means linking node types to the > underlying storage modal. the microkernel prototype currently has > no notion of node types and that's IMO a good thing. > > node types have a way to dominant role in the current jackrabbit > core implementation. jackrabbit started out as the official reference > implementation for jsr-170, the focus was on supporting every > feature of the spec. > > in jr3 i guess we can and should trade some 'note type correctness' > for improved efficiency. > >> >> It might also be possible to have the underlying storage model >> unorderable, and just include extra ordering metadata at a higher >> layer where also node types are handled. Option 2b, if you like, with >> probably fairly significant overhead when iterating over nodes (the >> implementation would probably need to pre-fetch all child nodes in >> advance to access their ordering metadata). >> >> If we do have native support for unorderable nodes, then I wouldn't >> promise any particular ordering (alphabetic, insertion order, etc.) >> but rather leave it undefined like in Java's Map interface. The >> underlying implementation is then free to use whatever ordering it >> thinks is best. > > i agree. > > cheers > stefan > >> >> PS. Note that ordering affects not just node traversal, but also the >> document ordering used in search results. Though in practice we >> already now disable document ordering of search results by default. >> >> BR, >> >> Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting wrote: > Hi, > > Thinking about this more generally, it would definitely be useful to > be able to use a more efficient underlying storage for unorderable > nodes. However, there still are hard use cases that require us to > support orderable nodes as indicated by the type of the parent node. > Thus the implementation basically has two options: > > 1) Use a single data structure for all nodes, like Jackrabbit > currently does. This simplifies the implementation but prevents us > from enjoying the performance and scalability benefits of unorderable > nodes. > > 2) Use two data structures, one for orderable and another for > unorderable nodes. This is more complex (not least because it links > node types to the underlying storage model), but is probably required > if we want to efficiently support up to millions of child nodes per a > single parent. i agree that it's probably required for efficiently handling both large and small child node lists. i am not sure that it necessarily means linking node types to the underlying storage modal. the microkernel prototype currently has no notion of node types and that's IMO a good thing. node types have a way to dominant role in the current jackrabbit core implementation. jackrabbit started out as the official reference implementation for jsr-170, the focus was on supporting every feature of the spec. in jr3 i guess we can and should trade some 'note type correctness' for improved efficiency. > > It might also be possible to have the underlying storage model > unorderable, and just include extra ordering metadata at a higher > layer where also node types are handled. Option 2b, if you like, with > probably fairly significant overhead when iterating over nodes (the > implementation would probably need to pre-fetch all child nodes in > advance to access their ordering metadata). > > If we do have native support for unorderable nodes, then I wouldn't > promise any particular ordering (alphabetic, insertion order, etc.) > but rather leave it undefined like in Java's Map interface. The > underlying implementation is then free to use whatever ordering it > thinks is best. i agree. cheers stefan > > PS. Note that ordering affects not just node traversal, but also the > document ordering used in search results. Though in practice we > already now disable document ordering of search results by default. > > BR, > > Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
On Fri, Jan 20, 2012 at 4:56 PM, Jukka Zitting wrote: > Hi, > > On Fri, Jan 20, 2012 at 4:35 PM, Thomas Mueller wrote: >> If we use alphabetical child node lists by default for Jackrabbit 3, then >> we would need to use a different default node type? > > Right, though I suppose there are quite a few applications that use > nt:unstructured either as-is or as a base type for many purposes. > >> Is it even possible to use a different node type than nt:unstructured >> in Jackrabbit 3? > > It is. The default use of nt:unstructured in Jackrabbit 1.x and 2.x > boils down to the use of nt:unstructured as the base type of rep:root. > It should be fairly straightforward to change the root type to specify > an unorderable type as the default for any child nodes. yes, i agree. cheers stefan > > Doing so may cause some breakage in existing clients that don't > explicitly specify the types they want, but that breakage should be > manageable with proper release notes and workaround instructions. > > BR, > > Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, Thinking about this more generally, it would definitely be useful to be able to use a more efficient underlying storage for unorderable nodes. However, there still are hard use cases that require us to support orderable nodes as indicated by the type of the parent node. Thus the implementation basically has two options: 1) Use a single data structure for all nodes, like Jackrabbit currently does. This simplifies the implementation but prevents us from enjoying the performance and scalability benefits of unorderable nodes. 2) Use two data structures, one for orderable and another for unorderable nodes. This is more complex (not least because it links node types to the underlying storage model), but is probably required if we want to efficiently support up to millions of child nodes per a single parent. It might also be possible to have the underlying storage model unorderable, and just include extra ordering metadata at a higher layer where also node types are handled. Option 2b, if you like, with probably fairly significant overhead when iterating over nodes (the implementation would probably need to pre-fetch all child nodes in advance to access their ordering metadata). If we do have native support for unorderable nodes, then I wouldn't promise any particular ordering (alphabetic, insertion order, etc.) but rather leave it undefined like in Java's Map interface. The underlying implementation is then free to use whatever ordering it thinks is best. PS. Note that ordering affects not just node traversal, but also the document ordering used in search results. Though in practice we already now disable document ordering of search results by default. BR, Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
On Fri, Jan 20, 2012 at 4:09 PM, Thomas Mueller wrote: > Hi, > > The current Jackrabbit implementation uses orderable child nodes by default, > meaning nodes are returned in the same order as they are created. As an > example, if I create the nodes /test/c, then /test/b, and then /test/a, and > then read the node list, I will get them back in that same order. I'm > wondering if this is really required (at all, and to be the default > behavior). Specially if there are a lot of child nodes. > > Instead of using the insertion order, I propose to sort the child node list > alphabetically: /test/a, /test/b, /test/c - no matter in what order the > nodes where created. This will allow an efficient lookup (using binary > search), and for large child node lists (more than one thousand child nodes > per node) modifications would be about twice as fast as using insertion > order (plus it would need less disk space). > > Would it be a problem if Jackrabbit 3 sorts child nodes by name (always, or > at least by default)? i guess no. unless that this behavior is expected/mandated. > > Another option would be to use (insertion) order until the child node list > gets too large (for example 1000 elements), and from then on sort the child > nodes by name (the previous ordering would then be lost). i am currently working on a similar approach in the microkernel prototype (sandbox). i am thinking about switching from 'insertion ordered' to 'unordered' (the order depends on the implementation, in my case it's based on the hashcode of the child node name) once a certain threshold is reached. cheers stefan > > Regards, > Thomas >
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, On Fri, Jan 20, 2012 at 4:35 PM, Thomas Mueller wrote: > If we use alphabetical child node lists by default for Jackrabbit 3, then > we would need to use a different default node type? Right, though I suppose there are quite a few applications that use nt:unstructured either as-is or as a base type for many purposes. > Is it even possible to use a different node type than nt:unstructured > in Jackrabbit 3? It is. The default use of nt:unstructured in Jackrabbit 1.x and 2.x boils down to the use of nt:unstructured as the base type of rep:root. It should be fairly straightforward to change the root type to specify an unorderable type as the default for any child nodes. Doing so may cause some breakage in existing clients that don't explicitly specify the types they want, but that breakage should be manageable with proper release notes and workaround instructions. BR, Jukka Zitting
Re: [jr3] Orderable child nodes: required (to be the default)?
On 20.01.12 16:50, "Thomas Mueller" wrote: >>Isn't the current appending the simplest and fastest? > >No. It requires to keep two indexes: one by number, and one by name. >Having only one index (by name) is about twice as fast, and needs about >half the disk space / memory. Ah, I guess this is because both cases (orderable & non-orderable) share the same data structure underneath. Wouldn't it make sense for J3 to support both cases naturally in the persistence layer? Cheers, Alex -- Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, > Isn't the current appending the simplest and fastest? No. It requires to keep two indexes: one by number, and one by name. Having only one index (by name) is about twice as fast, and needs about half the disk space / memory. Regards, Thomas
Re: [jr3] Orderable child nodes: required (to be the default)?
On 20.01.12 16:09, "Thomas Mueller" mailto:[email protected]>> wrote: The current Jackrabbit implementation uses orderable child nodes by default, meaning nodes are returned in the same order as they are created. I think "same order as they are created" is not really "orderable child nodes" - if a node does not have the orderable flag, you cannot re-order them (through the API). I'd call this default behavior simply "appending". And I think for nodes with the orderable flag, this is a good default behaviour and shouldn't be changed Instead of using the insertion order, I propose to sort the child node list alphabetically: /test/a, /test/b, /test/c - no matter in what order the nodes where created. ...but for non-orderable nodes (i.e. a big "bunch") I agree that alphabetical order by default is good. Based on experience, all the UIs for non-orderable nodes (e.g nt:hierarchyNodes) end up doing alphabetical ordering anyway (for the default view and if they don't allow for custom display ordering), as that's what makes it easy for humans to find the way around. This will allow an efficient lookup (using binary search), and for large child node lists (more than one thousand child nodes per node) modifications would be about twice as fast as using insertion order (plus it would need less disk space). Isn't the current appending the simplest and fastest? Would it be a problem if Jackrabbit 3 sorts child nodes by name (always, or at least by default)? Yes, but for non-orderable nodes only, of course. For orderable nodes I think a goal should be to keep the list ordered the whole stack down (API, in-memory, in-between formats (like json) and persistence layer), to handle it efficiently. I.e. without any "ordering number" column and thus requiring re-orderings at some level. Another option would be to use (insertion) order until the child node list gets too large (for example 1000 elements), and from then on sort the child nodes by name (the previous ordering would then be lost). Hmm, that would take away the benefit of the alphabetical default order, as applications would still need to order it alphabetically in UIs (for the lists shorter than 1000 elements). Cheers, Alex -- Alexander Klimetschek Developer // Adobe (Day) // Berlin - Basel
Re: [jr3] Orderable child nodes: required (to be the default)?
Hi, I guess one problem is that the default node type, nt:unstructured, supports orderable child nodes: http://www.day.com/specs/jcr/2.0/3_Repository_Model.html#3.7.13.1%20nt:unst ructured "This node type also supports client-orderable child nodes." I guess "client-orderable" also means "insertion order". If we use alphabetical child node lists by default for Jackrabbit 3, then we would need to use a different default node type? Is it even possible to use a different node type than nt:unstructured in Jackrabbit 3? Regards, Thomas On 1/20/12 4:26 PM, "[email protected]" wrote: >On 2012-01-20 16:09, Thomas Mueller wrote: >> Hi, >> >> The current Jackrabbit implementation uses orderable child nodes by >> default, meaning nodes are returned in the same order as they are >> created. As an example, if I create the nodes /test/c, then /test/b, and >> then /test/a, and then read the node list, I will get them back in that >> same order. I'm wondering if this is really required (at all, and to be >> the default behavior). Specially if there are a lot of child nodes. >> >> Instead of using the insertion order, I propose to sort the child node >> list alphabetically: /test/a, /test/b, /test/c - no matter in what order >> the nodes where created. This will allow an efficient lookup (using >> binary search), and for large child node lists (more than one thousand >> child nodes per node) modifications would be about twice as fast as >> using insertion order (plus it would need less disk space). >> >> Would it be a problem if Jackrabbit 3 sorts child nodes by name (always, >> or at least by default)? >> ... > >Spec-wise the returned ordering shouldn't matter unless the node type >indicates that the child nodes are indeed ordered. > >If we suspect that there's code out there making other assumptions, we >may want to change that in an earlier jackrabbit release to find out for >sure :-)
Re: [jr3] Orderable child nodes: required (to be the default)?
On 2012-01-20 16:09, Thomas Mueller wrote: Hi, The current Jackrabbit implementation uses orderable child nodes by default, meaning nodes are returned in the same order as they are created. As an example, if I create the nodes /test/c, then /test/b, and then /test/a, and then read the node list, I will get them back in that same order. I'm wondering if this is really required (at all, and to be the default behavior). Specially if there are a lot of child nodes. Instead of using the insertion order, I propose to sort the child node list alphabetically: /test/a, /test/b, /test/c - no matter in what order the nodes where created. This will allow an efficient lookup (using binary search), and for large child node lists (more than one thousand child nodes per node) modifications would be about twice as fast as using insertion order (plus it would need less disk space). Would it be a problem if Jackrabbit 3 sorts child nodes by name (always, or at least by default)? ... Spec-wise the returned ordering shouldn't matter unless the node type indicates that the child nodes are indeed ordered. If we suspect that there's code out there making other assumptions, we may want to change that in an earlier jackrabbit release to find out for sure :-)
