> On 12 Oct 2016, at 11:55, Bogdan Andu <bog...@gmail.com> wrote:
> On Mon, Oct 10, 2016 at 4:36 PM, Jan Lehnardt <j...@apache.org> wrote:
>>> On 10 Oct 2016, at 14:59, Bogdan Andu <bog...@gmail.com> wrote:
>>> yes, I know, but the couchdb storage engine cannot optimize
>>> this while operating normally. Only after compaction is finished is
>>> it optimized.
>>> I presume that the entire btree is traversed to detect revisions and
>>> btree nodes.
>>> I have no revisions on documents.
>>> My case clearly leans toward the unused nodes.
>>> Couldn't those nodes be detected in a timely manner,
>>> while inserting (appending to the end of file) documents, and be deleted
>> we could do that, but then we’d open ourselves up for database corruption
>> during power-, hardware- or software-failures. There are sophisticated
>> techniques to safeguard against that, but they come with their own set
>> of trade-offs, one of which is code complexity. Other databases have
>> millions of lines of code in just this area and CouchDB is <100kLoC total.
> there is an interesting project called Scalaris (http://scalaris.zib.de/)
> that uses the Paxos commit protocol and
> algorithms borrowed from torrent technology, but they do not store the
> db on disk.
> another interesting technology is the Hibari database, which uses a concept of
> bricks and virtual nodes.
CouchDB 2.0 has a similar layer for in-cluster duplication and reliability.
You are asking about per-node reliability and on-disk storage and the
trade-offs I explained still apply there.
>>> But I assume that the btree must be traversed every time an insert is
>>> (or may be traversed from a few nodes above the last 100 or 1000 new
>> Yes, for individual docs it is each time; for bulk doc requests with
>> somewhat sequential doc ids it is roughly once per bulk request.
>>> Now the question is why and how those nodes become unusable.
>>> What conditions are necessary for the db to produce dead nodes?
>> As soon as a document (or set of docs in a bulk docs request) is written,
>> we stop referencing existing btree nodes up the tree in the particular
> but I think that to stop referencing the nodes does not mean garbage-collecting
That is correct, these nodes then are the garbage :)
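
To make the "stop referencing" part concrete, here is a toy sketch in Python.
It is not CouchDB's code: it uses a plain binary search tree instead of a
B+tree, a Python list stands in for the database file, and every name in it
(AppendOnlyTree, put, compact, ...) is made up for the illustration. It only
shows that each write appends a fresh copy of the path up to the root, that
the old copies are left behind unreferenced, and that compaction is a copy of
the reachable nodes into a new file.

```python
from typing import List, Optional, Set, Tuple

# One record in the append-only "file": (key, value, left_offset, right_offset).
Node = Tuple[str, str, Optional[int], Optional[int]]


class AppendOnlyTree:
    """Hypothetical copy-on-write tree; a list stands in for the db file."""

    def __init__(self) -> None:
        self.file: List[Node] = []        # records are only ever appended
        self.root: Optional[int] = None   # offset of the current root node

    def _append(self, node: Node) -> int:
        self.file.append(node)
        return len(self.file) - 1

    def _insert(self, offset: Optional[int], key: str, value: str) -> int:
        # Never modify an existing record: append a new copy of every node
        # on the path from the changed leaf up to the root.
        if offset is None:
            return self._append((key, value, None, None))
        k, v, left, right = self.file[offset]
        if key == k:
            return self._append((key, value, left, right))
        if key < k:
            return self._append((k, v, self._insert(left, key, value), right))
        return self._append((k, v, left, self._insert(right, key, value)))

    def put(self, key: str, value: str) -> None:
        # The old root, and the old path below it, are no longer referenced.
        self.root = self._insert(self.root, key, value)

    def get(self, key: str) -> Optional[str]:
        off = self.root
        while off is not None:
            k, v, left, right = self.file[off]
            if key == k:
                return v
            off = left if key < k else right
        return None

    def live_offsets(self) -> Set[int]:
        # Everything reachable from the current root is live.
        live: Set[int] = set()
        stack = [self.root]
        while stack:
            off = stack.pop()
            if off is None:
                continue
            live.add(off)
            _, _, left, right = self.file[off]
            stack.extend((left, right))
        return live

    def garbage_count(self) -> int:
        return len(self.file) - len(self.live_offsets())

    def compact(self) -> "AppendOnlyTree":
        # Copy only the reachable nodes into a brand-new "file".
        fresh = AppendOnlyTree()

        def copy(off: Optional[int]) -> Optional[int]:
            if off is None:
                return None
            k, v, left, right = self.file[off]
            return fresh._append((k, v, copy(left), copy(right)))

        fresh.root = copy(self.root)
        return fresh
```

In this sketch, three inserts followed by one update leave more dead records
than live ones in the file, even though no document has old revisions you
care about. Calling compact() yields a file that holds only the live nodes.
Finding and reusing the dead space in place, safely across crashes, is the
part that needs the heavy machinery mentioned above.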
>>> If you could manage to avoid this I think you have a self-compacting
>>> Just my 2 cents.
>> Again, this is a significant engineering effort. E.g. InnoDB does what
>> you propose and it took 100s of millions of dollars and 10 years to get
>> up to speed and reliability. CouchDB does not have these kinds of
>>> just a side question.. wouldn't it be nice to have multiple storage engines
>>> that follow the same
>>> replication protocol, of course
>> We are working on this already :)
> wow, and what are the candidates for alternative backends? I presume one of
> them is leveldb, because everybody has it. Even mnesia has it.
We are still in the design phases of this from the CouchDB point of view.
Potential storage backends are next. I also expect at least a trial for
LevelDB, but no promises yet :)