> If memory serves the database's by_id tree uses Erlang term sorting for
> collation instead of ICU. ICU is of course the default collation option for
> MR views. Regards,
>
> Adam

That is interesting. I will try to confirm it, because it would mean that the dictionary I am using now,

"-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"

which is ICU ordered, is not optimal for the doc_ids. Can you tell me what an "Erlang term order" base64 dictionary would look like?

Anyway, I am curious. I understand that the size of the doc_id has a big impact on the performance and size of the database, since the doc_id is present in a lot of internal structures. What I do not fully understand is why the *ordering* of doc_ids when inserting documents would have any effect on insert speed or view generation. In my naive view of CouchDB, the documents are just written to one big file as they are POSTed, in the order that they arrive. How would the doc_id order affect this process?
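
On the dictionary question: if the by_id tree really does compare doc_ids as Erlang binaries, and if that comparison is byte by byte (both are assumptions here, not something confirmed in this thread), then the dictionary would simply be the same 64 characters sorted by byte value. A minimal Python sketch of that reordering, where the names ICU_DICT and bytewise_dictionary are made up for illustration:

```python
# The 64-character dictionary quoted above (ICU ordered, per the message).
ICU_DICT = "-@0123456789aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ"


def bytewise_dictionary(chars: str) -> str:
    """Return the same characters sorted by raw byte value.

    Assumption: doc_ids are compared as binaries, byte by byte, so the
    "optimal" dictionary is just the plain ASCII ordering of the alphabet.
    """
    return "".join(sorted(chars, key=lambda c: c.encode("utf-8")))


BYTE_DICT = bytewise_dictionary(ICU_DICT)
print(BYTE_DICT)
# -0123456789@ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
```

Under that assumption the visible difference from the ICU dictionary is that "@" sorts after the digits and all uppercase letters sort before all lowercase ones.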
