CouchDB views have a feature called linked documents: http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Linked_documents

You could store each token as a doc, then store the order of the tokens in a
separate doc. To change the order of the tokens you'd update only the
"order" doc.

Consider this position doc:

    { "_id": "Genesis-1:1", "type": "position",
      "position": ["token987", "token123", "token456"] }

And these token docs:

    [
      { "_id": "token123", "type": "token", "word": "the" },
      { "_id": "token987", "type": "token", "word": "In" },
      { "_id": "token456", "type": "token", "word": "beginning" }
    ]

Then a view like this:

    function(doc) {
      if (doc.type == "position") {
        var token = doc.position;
        for (var i = 0; i < token.length; i++) {
          // The value must be an object with an _id field; that is what
          // makes CouchDB fetch the linked token doc instead of the
          // position doc when the view is queried with include_docs=true.
          emit([doc._id, i], { _id: token[i] });
        }
      }
    }

queried with include_docs=true, returns rows like these (abridged):

    {"total_rows":3,"offset":0,"rows":[
    {"id":"Genesis-1:1","key":["Genesis-1:1",0],"doc":{"_id":"token987","type":"token","word":"In"}},
    {"id":"Genesis-1:1","key":["Genesis-1:1",1],"doc":{"_id":"token123","type":"token","word":"the"}},
    {"id":"Genesis-1:1","key":["Genesis-1:1",2],"doc":{"_id":"token456","type":"token","word":"beginning"}}
    ]}

Maybe you can make an approach like this work for you?

FB

On Wed, Nov 3, 2010 at 9:16 AM, Dirkjan Ochtman <[email protected]> wrote:

> On Wed, Nov 3, 2010 at 14:04, Weston Ruter <[email protected]> wrote:
> > That is a good idea, but the problem with Bible translations in
> > particular is the issue of overlapping hierarchies: like chapter and
> > verse don't always fall along same divisions as section and paragraph.
> > So the data model I've been moving toward is standoff markup, where
> > there is a set of tokens (words, punctuation) for the entire book and
> > then a set of structures (paragraphs, verses, etc) that refer to the
> > start token and end token, so when getting a structure it needs to
> > retrieve all tokens from start to end. The use of standoff markup and
> > overlapping hierarchies makes your idea of using sorting buckets not
> > feasible, I don't think. Thanks for the idea though!
>
> Not sure I agree.
> My "buckets" are somewhat arbitrary and don't actually have to be mapped
> to any real structure. The trick is just that by prefixing with a bucket
> index, you don't have to update all tokens anymore, you only have to
> update tokens inside the bucket (or the next bucket if you happened to be
> moving a token to the next bucket). Your standoff thing (I'm not really
> used to that term, so no clue if I'm using it correctly) would still work,
> only you now reference tokens by bucket and token index, not just token
> index.
>
> Cheers,
>
> Dirkjan
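
For what it's worth, the bucket idea quoted above could be sketched roughly like this. It is plain JavaScript rather than anything CouchDB-specific, and the bucket size and the helper names (toBuckets, insertToken, positionKeys) are made up for illustration. The point is only that an insert renumbers tokens inside one bucket and leaves every other [bucket, index] key alone:

```javascript
// Hypothetical sketch: tokens are addressed by [bucket, index] rather than
// by one global index. Bucket boundaries are arbitrary, as Dirkjan says.
const BUCKET_SIZE = 3; // assumed size, chosen only to keep the example small

// Split a flat token list into fixed-size buckets.
function toBuckets(tokens, size) {
  const buckets = [];
  for (let i = 0; i < tokens.length; i += size) {
    buckets.push(tokens.slice(i, i + size));
  }
  return buckets;
}

// Insert a token at [bucket, index]. Only tokens after it in the same
// bucket get a new index; the return value is how many were shifted.
function insertToken(buckets, bucket, index, token) {
  buckets[bucket].splice(index, 0, token);
  return buckets[bucket].length - index - 1;
}

// List every token with its [bucket, index] key, in reading order.
function positionKeys(buckets) {
  const rows = [];
  buckets.forEach((tokens, b) => {
    tokens.forEach((token, i) => rows.push({ key: [b, i], token: token }));
  });
  return rows;
}

const buckets = toBuckets(
  ["In", "the", "beginning", "God", "created", "the", "heavens"],
  BUCKET_SIZE
);

// Insert a new token into bucket 0; buckets 1 and 2 keep their keys.
const shifted = insertToken(buckets, 0, 1, "INSERTED");
console.log(shifted, positionKeys(buckets));
```

A standoff structure would then store its start and end as [bucket, index] pairs instead of single integers, so a range lookup is still a startkey/endkey query over the compound key.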
