On Fri, Jul 8, 2016 at 12:44 AM, Robert Kowalski <[email protected]> wrote:
> Couch 1.x and Couch 2.x will choke as soon as the indexer tries to
> process a too large document that was added. The indexing stops and
> you have to manually remove the doc. In the best case you built an
> automatic process around the process. The automatic process removes
> the document instead of the human.
An automatic process that removes stored data in production? You must be
kidding (:

Limiting the document size here sounds like the wrong way to solve the
indexer issue of not being able to handle such documents. Two solutions
come to mind:

- the indexer skips big documents, making enough noise to help the user
  notice the problem;
- the indexer is fixed to handle big documents.

From the user's side the second option is the only right one: it's my
data, I put it into the database, I trust the database to be able to
process it, and it shouldn't fail me. What should a user do when he hits
the limit and cannot store a document because the indexer is buggy, but
he needs this data to be processed? He becomes very annoyed, because he
needs that data as is, and any attempt to split it into multiple
documents may be impossible (we have neither cross-document links nor
transactions). What's the next step for him? Changing the database, for
sure.

So I think the indexer argument is quite weak and strange. A stronger one
is about cutting off the possibility of uploading bloated data when, by
design, there are sane boundaries for the stored data. If all your
documents average 1 MiB and your database receives data from the world,
you would like to explicitly drop anomalies of dozens or hundreds of MiB,
because that's not the data you're working with.

See also: https://github.com/apache/couchdb-chttpd/pull/114 - Tony Sun
made some attempts to add such a limit to CouchDB. There are a couple of
problems with actually implementing such a limit in a predictable and
lightweight way, because we have awesome _update functions (; But I
believe that all of them can be overcome.

--
,,,^..^,,,
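P.S. To make the _update problem concrete, here is a minimal sketch of an
update handler (the design doc and field names are made up for the
example). Each call appends the request body to the stored document, so
every individual HTTP request can be tiny while the resulting document
grows without bound; a limit checked only against the request size would
never fire, it has to be applied to the document that the update function
returns:

    // _design/limits/updates/append (hypothetical design doc)
    function (doc, req) {
      // First call: no doc with this id yet, so create one.
      if (!doc) {
        doc = {_id: req.id || req.uuid, entries: []};
      }
      // Append the (small) request body; the stored doc keeps growing.
      doc.entries.push(req.body);
      return [doc, "appended\n"];
    }

Called repeatedly via the _update endpoint (e.g.
PUT /db/_design/limits/_update/append/some-id) with a few bytes of body,
it will happily push the document past any sane size.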
