This topic interests me as well. How do I read the data once it is split up? Do I have to implement that logic in my application, or does CouchDB work out what I'm looking for and redirect me to the right database? And what if I need to query data across two or more databases?
Thanks

On 20/10/2012, at 08:59, Alexander Shorin <[email protected]> wrote:

> Hi Erik!
>
> The common practice for all databases (SQL, NoSQL) that serve
> fast-growing data is partitioning[1]: splitting the data into one
> partition per datetime period. Depending on how fast the data grows,
> this period may be a year, a month, or even a day. To apply this
> practice to CouchDB, you split the data across databases with the
> period in their names, e.g.:
>
> world_logs/2012/10
> world_logs/2012/09
> world_logs/2012/08
> world_logs/2012/07
> ...
>
> Note the slashes in the names. With this trick CouchDB will create a
> directory hierarchy for these databases on the filesystem:
> + world_logs/
> | ---- + 2012/
> | ---- | ---- + 07.couch
> | ---- | ---- + 08.couch
> | ---- | ---- + 09.couch
> | ---- | ---- + 10.couch
>
> So if your data grows by 1M docs per year, splitting it by month
> creates 12 databases with ~100K documents each. The big difference
> from one big database is that "old" data already has a computed view
> index; if you add a new view, you don't need to wait until all the
> data is indexed. You'll get results much faster, since the index is
> built only for the small chunk you are currently interested in.
>
> Also, you could still keep one big database with all the data
> alongside, which imports data from these small databases through
> replication.
>
> That's about how to organize data to make views run faster. You
> could also try switching from the JavaScript query server to the
> Erlang[2] one. The Erlang query server is native and doesn't suffer
> from stdio and JSON serialization/deserialization overhead. For me it
> speeds up indexing by about 3-4 times, depending on the complexity of
> the map function.
>
> P.S. There is good news for you: the 1.3 release will have a new
> query server engine (already in the master branch) that feels to me
> a bit faster than the one in 1.2.
>
> [1]: http://en.wikipedia.org/wiki/Partition_%28database%29
> [2]: http://wiki.apache.org/couchdb/EnableErlangViews
>
> --
> ,,,^..^,,,
>
>
> On Sat, Oct 20, 2012 at 4:08 AM, Erik Pearson <[email protected]> wrote:
>> Hi,
>>
>> I'm wondering if there are any write performance improvements on the
>> horizon? Although day to day read queries are great, and modest updates are
>> fine, bulk updates and index rebuilding are pretty painful. I know
>> performance tips are a broad enough topic without focusing it down. Since I
>> need to deal with multiple databases which will grow at about a million
>> documents per year, I'm in a bit of pain even testing the database with
>> significant depth of data (e.g. 5 years).
>>
>> I'd be happy to provide my use case and experience, but thought I'd cut my
>> usually verbose missives down to the bare question.
>>
>> Thanks,
>> Erik.
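
For reference, here is a rough sketch of what application-side routing for the monthly scheme quoted above might look like. It only computes database names and makes no requests. The helper names (partition_name, partition_path, partitions_between) are made up for illustration and are not part of CouchDB. The one CouchDB-specific detail assumed here is that a slash in a database name has to be sent as %2F in the request URL.

```python
from datetime import date
from urllib.parse import quote


def partition_name(prefix, day):
    """Return the monthly database name for a date, e.g. world_logs/2012/10."""
    return f"{prefix}/{day.year:04d}/{day.month:02d}"


def partition_path(name):
    """Percent-encode a database name for an HTTP path (each "/" becomes %2F)."""
    return quote(name, safe="")


def partitions_between(prefix, start, end):
    """List the monthly database names covering start..end, inclusive.

    Returns an empty list when start falls in a later month than end.
    """
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}/{year:04d}/{month:02d}")
        month += 1
        if month > 12:
            # Roll over from December to January of the next year.
            year, month = year + 1, 1
    return names


if __name__ == "__main__":
    # Hypothetical usage: find the databases a July-to-October query touches.
    for name in partitions_between("world_logs", date(2012, 7, 15), date(2012, 10, 20)):
        print(name, "->", partition_path(name))
```

Under this sketch, a query that spans several months would be sent to each listed database in turn and the results merged in the application. Whether that is preferable to querying the single replicated "big" database mentioned in the quoted message depends on the workload.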
