> interesting discussion!
> 
> jim, yes, in my case nesting won't be very deep so i've been thinking about 
> your suggestion:
> 
> 1 : orig message
> 1-2 : reply to 1
> 1-2-5 : reply to 2
> 1-3 : reply to 1
> 1-3-4 : reply to
> 
> you say "1-3-4" is an ID, right?

Yes

> so tell me, if the message with the ID "1-3-4" wants to find out who his 
> parent is, then we'd have to parse the ID, remove one digit and query the 
> database for "1-3" to get the parent. do you think this is optimal for the 
> couchdb?

You could do it this way; I've only outlined a strategy for ID's that sorts in 
a hierarchal fashion, not a way to determine ancestry. I'd probably suggest 
just storing the parent id in the child - depends upon your actual use case.

Another alternative, but would require the use of a view, is to store the 
parent ancestry as an array in each document instead of using a composite id ( 
I believe others have suggested this too):

{ id:1, ancestry: []} 
{ id:2, ancestry: [1]} 
{ id:5, ancestry: [1, 2]} 
{ id:3, ancestry: [1]} 
{ id:4, ancestry: [1, 3]} 
{ id:6, ancestry: [1, 2]} 
{ id:7, ancestry: [1, 3]} 

then a map function:

function(doc) {
        emit(doc.ancestry.concat([doc.id]), null);
}

and just use the built-in _count reduce if you need to count the number of 
nodes in a tree (or sub tree)

This will create a view that sorts in the same way as the id, but using an 
array instead of a string. This view will look something like, noting that the 
key is what's sorted:

{
        rows: [
                { id: 1, key: [ 1 ], value= null},
                { id: 2, key: [ 1, 2 ], value= null},
                { id: 5, key: [ 1, 2, 5 ], value= null},
                { id: 6, key: [ 1, 2, 6 ], value= null},
                { id: 3, key: [ 1, 3 ], value= null},
                { id: 4, key: [ 1, 3, 4 ], value= null},
                { id: 7, key: [ 1, 3, 7 ], value= null},
        ]
}

you can then query an entire tree with startkey=[1]&endkey=[1,{}] which will 
give you all the descendants for the one tree with one query.  fetching a sub 
tree works similarly: startkey=[1,2]&endkey=[1,2,{}] 

and you will have equivalent functionality to using a composite id with 
possibly a simpler to comprehend mechanism for building queries and  creating 
ids at the expense of creating a view...

you'll probably also want a view that just lists root messages too… that would 
be:

function(doc) {
        if (doc.ancestry.length == 0)
                emit(doc.id, null);
}

- Jim


Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International
t.      @nsomnac

> 
> On 22/02/13 20:57, Jim Klo wrote:
>> On Feb 21, 2013, at 11:31 PM, "Elisiano Petrini" <[email protected]> wrote:
>> 
>>> I understand the concept, but on the same level there could be more than
>>> one child. Using this as an ID would work only if you put something
>>> unique to the node in composing the ID.
>>> 
>> I don't recall multiple nodes being a constraint in the problem, even then 
>> it's solvable using sharded ID's - the reality is there's always finite 
>> number of nodes. The only thing that changes is that tree won't sort 
>> naturally in the order they arrive. Remember, the parent node has to exist 
>> everywhere for a collision to occur.
>> 
>>> Generating a sequential ID (per tree level) depending on the order of
>>> arrival might lead to collisions (if messages arrive at the same time).
>>> 
>> Sure, but only if you have a badly designed sequence generators that are not 
>> designed to operate in a sharded manner. A simple way is to pre-allocate a 
>> pool of ID's to a process (say 500 ID's per process). No other process 
>> should be able to use another process' allocation. Once exhausted, the 
>> process fetches another pool of IDs. This may not be ideal in a mobile 
>> scenario, but would probably still work, you just might have to devise a 
>> smarter allocation algorithm based upon usage and connectivity frequencies. 
>> (And now I've really digressed) This is a very common solution done in the 
>> RDBMS world to combat ID conflicts.
>> 
>> 
>>> Anyhow, I feel we're digressing here :)
>>> 
>> It is worth discussion as not everyone is experienced in massively parallel 
>> systems. :)
>> 
>> If the originator has only a single node - they shouldn't have any conflict 
>> issues with the simplest solution.
>> 
>> 
>> 
>>> On Fri, 2013-02-22 at 16:23 +0900, svilen wrote:
>>>> that's the path to root stored as id
>>>> so answers to 1-2 will be say 1-2-5 1-2-6 1-2-33
>>>> 
>>>>> Hi Kim,
>>>>>    I don't think using this kind of documents' ids is a viable
>>>>> solution: what if there is more that one message in response to
>>>>> another (ie:several people answering to the same message)?
>>>>> 
>>>>> On Fri, 2013-02-22 at 07:00 +0000, Jim Klo wrote:
>>>>>> In theory you mention infinite depth, but is that realistic?
>>>>>> 
>>>>>> A simple way is to make ID's a composite serial and chain them
>>>>>> together.
>>>>>> 
>>>>>> 1 : orig message
>>>>>> 1-2 : reply to 1
>>>>>> 1-2-5 : reply to 2
>>>>>> 1-3 : reply to 1
>>>>>> 1-3-4 : reply to
>>>>>> 
>>>>>> The keys will then sort in order, such that can build the tree DFS
>>>>>> and use range queries to get just segments.
>>>>>> 
>>>>>> You can obviously do some creative things in compression - using
>>>>>> different number bases to handle deep trees.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Feb 21, 2013, at 10:10 PM, "Michael Heuberger"
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>>> andrey, svilen, many thanks again.
>>>>>>> 
>>>>>>> agreed, storing immediate children in a document is a bad idea. i
>>>>>>> think i go for storing only an immediate parent id to get an
>>>>>>> infinite tree depth.
>>>>>>> 
>>>>>>> thanks for your inspirations guys
>>>>>>> 
>>>>>>> cheers
>>>>>>> michael
>>>>>>> 
>>>>>>> On 22/02/13 18:49, Andrey Kuprianov wrote:
>>>>>>>> PS - storing only an immediate parent id would be the solution
>>>>>>>> for an infinite tree depth.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Feb 22, 2013 at 1:48 PM, Andrey Kuprianov <
>>>>>>>> [email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Storing immediate children in a parent is a bad idea in a sense
>>>>>>>>> that you will never be able to traverse your tree back to the
>>>>>>>>> root, given a child node.
>>>>>>>>> 
>>>>>>>>> Storing all in one doc - you will run out of space before you
>>>>>>>>> know it. Documents have a finite length and the larger the
>>>>>>>>> document, the slower it will get to serialize and unserialize
>>>>>>>>> it.
>>>>>>>>> 
>>>>>>>>> Storing just an immediate parent in children might result in a
>>>>>>>>> costly multiple read operations, if you want to get root of the
>>>>>>>>> tree given an arbitrary child.
>>>>>>>>> 
>>>>>>>>> Therefore the only feasible solution is to store references to
>>>>>>>>> the parent chain - i.e. ancestors - starting from an immediate
>>>>>>>>> parent and ending with a tree root.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Here's a simple calculation for ancestors approach:
>>>>>>>>> 
>>>>>>>>> Suppose the maximum document size is 1MB. A uuid is 32
>>>>>>>>> characters in length + taking into account commas and brackets,
>>>>>>>>> then the depth of your tree (I am not taking content into
>>>>>>>>> account) could be roughly:
>>>>>>>>> 
>>>>>>>>> ((32 + 1)*n  + 2 - 1) = 1024 * 1024  ==> n = 31775 levels
>>>>>>>>> 
>>>>>>>>> However, max document size is configurable from the settings
>>>>>>>>> (as of CouchDB 1.2) and could easily be up to 4GB, giving you
>>>>>>>>> over 130mil of levels. Should be enough... :)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 22, 2013 at 1:14 PM, Michael Heuberger <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> thanks andrey and svil
>>>>>>>>>> 
>>>>>>>>>> depth is infinite. in theory a message can have million
>>>>>>>>>> replies forth and back. similar like a very long email thread.
>>>>>>>>>> 
>>>>>>>>>> i think storing ancestors ą la andrey would be too much hassle
>>>>>>>>>> but works well for few levels.
>>>>>>>>>> 
>>>>>>>>>> svil, what did you mean with dicts and with keeping all
>>>>>>>>>> children in one doc? can you tell me more?
>>>>>>>>>> 
>>>>>>>>>> thanks,
>>>>>>>>>> michael
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 22/02/13 17:27, svilen wrote:
>>>>>>>>>> 
>>>>>>>>>>> my rough guess:
>>>>>>>>>>> 
>>>>>>>>>>> if it's finite depth, then all in one doc:
>>>>>>>>>>> - list of (item or (list of ...))
>>>>>>>>>>> - or same with dicts
>>>>>>>>>>> 
>>>>>>>>>>> else, one doc per message keeping just link-to-parent,
>>>>>>>>>>> or multiple links-to-grand...grand-parents and/or root.
>>>>>>>>>>> similar to the strategies in SQL - nested etc.
>>>>>>>>>>> keeping all chidlren of same node in one doc is also
>>>>>>>>>>> possible..
>>>>>>>>>>> 
>>>>>>>>>>> in any case either traversal or storing or both will be
>>>>>>>>>>> difficult.
>>>>>>>>>>> 
>>>>>>>>>>> ciao
>>>>>>>>>>> svil
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, 22 Feb 2013 17:13:01 +1300
>>>>>>>>>>> Michael Heuberger
>>>>>>>>>>> <michael.heuberger@**binarykitchen.com<[email protected]>>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hello guys
>>>>>>>>>>>> I'd like to store messages in a tree like structure.
>>>>>>>>>>>> Whenever you reply to a message, then the original message
>>>>>>>>>>>> becomes the parent message.
>>>>>>>>>>>> 
>>>>>>>>>>>> How would you implement something like this in CouchDB? Just
>>>>>>>>>>>> curious and need a little guidance + advice here.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Michael
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> 
>>>>>>>>>>>> Binary Kitchen
>>>>>>>>>>>> Michael Heuberger
>>>>>>>>>>>> 4c Dunbar Road
>>>>>>>>>>>> Mt Eden
>>>>>>>>>>>> Auckland 1024
>>>>>>>>>>>> (New Zealand)
>>>>>>>>>>>> 
>>>>>>>>>>>> Mobile (text only) ...  +64 21 261 89 81
>>>>>>>>>>>> Email ................  [email protected]
>>>>>>>>>>>> Website ..............  http://www.binarykitchen.com
>>>>>>>>>> --
>>>>>>>>>> 
>>>>>>>>>> Binary Kitchen
>>>>>>>>>> Michael Heuberger
>>>>>>>>>> 4c Dunbar Road
>>>>>>>>>> Mt Eden
>>>>>>>>>> Auckland 1024
>>>>>>>>>> (New Zealand)
>>>>>>>>>> 
>>>>>>>>>> Mobile (text only) ...  +64 21 261 89 81
>>>>>>>>>> Email ................  [email protected]
>>>>>>>>>> Website ..............  http://www.binarykitchen.com
>>>>>>> -- 
>>>>>>> 
>>>>>>> Binary Kitchen
>>>>>>> Michael Heuberger
>>>>>>> 4c Dunbar Road
>>>>>>> Mt Eden
>>>>>>> Auckland 1024
>>>>>>> (New Zealand)
>>>>>>> 
>>>>>>> Mobile (text only) ...  +64 21 261 89 81
>>>>>>> Email ................  [email protected]
>>>>>>> Website ..............  http://www.binarykitchen.com
>>> 
> 
> -- 
> 
> Binary Kitchen
> Michael Heuberger
> 4c Dunbar Road
> Mt Eden
> Auckland 1024
> (New Zealand)
> 
> Mobile (text only) ...  +64 21 261 89 81
> Email ................  [email protected]
> Website ..............  http://www.binarykitchen.com
> 

Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to