Native views are much faster, for sure. Just be aware that they are not sandboxed: you must trust everyone who writes design documents, because native view code can execute any system command.

On 20 October 2012 12:54, Erik Pearson <[email protected]> wrote:
> Hi Alex,
>
> Thanks for the great write up.
>
> After posting the question yesterday, I went ahead and installed the latest
> CouchDB (1.3) from GitHub, and rewrote the map/reduce functions in Erlang.
> I'd rather write them in Erlang anyway :)
>
> Erlang native view performance is much better! It indeed is several times
> faster compared to JavaScript views in 1.2. That is great progress. Is 1.3
> (GitHub master) considered to be stable?
>
> One measure I tried that makes no difference is separating view functions
> into separate documents. They build in separate Erlang processes, but the
> overall rate of building the index is the same (roughly 1000 changes/sec)
> as all views in one design doc. Perhaps because Erlang is already
> saturating my CPUs with just one view rebuild, or perhaps because of other
> bottlenecks like disk access?
>
> Now we just need a few convenience functions to make writing Erlang views
> less painful... but I'm going to write a separate post on that shortly.
>
> I've read about the upcoming integration of BigCouch, and that is indeed
> exciting and reassuring.
>
> Thanks,
> Erik.
>
> On Sat, Oct 20, 2012 at 4:59 AM, Alexander Shorin <[email protected]> wrote:
>
>> Hi Erik!
>>
>> The common practice for all databases (SQL, NoSQL) that serve
>> fast-growing data is partitioning[1]: splitting the data into one
>> partition per datetime period. Depending on how fast the data grows,
>> this period may be a year, a month or even a day. Applying this
>> practice to CouchDB, you split the data across databases with the
>> period in their names, e.g.:
>>
>> world_logs/2012/10
>> world_logs/2012/09
>> world_logs/2012/08
>> world_logs/2012/07
>> ...
>>
>> Note the slashes in the names. With this trick CouchDB will create a
>> directory hierarchy for these databases on the filesystem:
>>
>> + world_logs/
>> | ---- + 2012/
>> | ---- | ---- + 07.couch
>> | ---- | ---- + 08.couch
>> | ---- | ---- + 09.couch
>> | ---- | ---- + 10.couch
>>
>> So if your data grows by 1M docs per year, splitting it by month will
>> create 12 databases with ~100K documents each. The big difference from
>> one big database is that the "old" data already has a computed view
>> index; if you add a new view, you don't need to wait until all the
>> data is indexed. You'll get results much faster, since the index will
>> be built only for the small chunk you are currently interested in.
>>
>> Also, you could still keep one big database with all the data at the
>> same time, which imports data from these small databases through
>> replication.
>>
>> That's about how to organize the data to make views run faster. You
>> could also try to switch from the JavaScript query server to the
>> Erlang[2] one. The Erlang query server is native and doesn't suffer
>> from stdio and JSON serialization/deserialization overhead. For me it
>> gives an indexing boost of about 3-4 times, depending on the
>> complexity of the map function.
>>
>> P.S. There is good news for you: in the 1.3 release there will be a
>> new query server engine (already in the master branch) that, to my
>> feeling, is a bit faster than the one in 1.2.
>>
>> [1]: http://en.wikipedia.org/wiki/Partition_%28database%29
>> [2]: http://wiki.apache.org/couchdb/EnableErlangViews
>>
>> --
>> ,,,^..^,,,
>>
>>
>> On Sat, Oct 20, 2012 at 4:08 AM, Erik Pearson <[email protected]> wrote:
>> > Hi,
>> >
>> > I'm wondering if there are any write performance improvements on the
>> > horizon? Although day to day read queries are great, and modest updates
>> > are fine, bulk updates and index rebuilding are pretty painful. I know
>> > performance tips are a broad enough topic without focusing it down.
>> > Since I need to deal with multiple databases which will grow at about a
>> > million documents per year, I'm in a bit of pain even testing the
>> > database with significant depth of data (e.g. 5 years).
>> >
>> > I'd be happy to provide my use case and experience, but thought I'd cut
>> > my usually verbose missives down to the bare question.
>> >
>> > Thanks,
>> > Erik.
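
For anyone who wants to try the per-month layout Alexander describes above, here is a minimal Python sketch of how a client might pick the database for a given date. The function names and the base URL are my own assumptions for illustration, not anything from CouchDB itself. The one detail worth noting is that the slashes in the database name have to be percent-encoded as %2F in the request URL, otherwise they are read as path separators.

```python
from datetime import date
from urllib.parse import quote


def partition_db_name(prefix: str, day: date) -> str:
    """Return the monthly database name, e.g. 'world_logs/2012/10'."""
    return f"{prefix}/{day.year:04d}/{day.month:02d}"


def partition_db_url(base_url: str, prefix: str, day: date) -> str:
    """Return the URL of the monthly database for the given date.

    The slashes inside the database name are percent-encoded (safe='')
    so that the whole name stays a single path segment.
    """
    name = partition_db_name(prefix, day)
    return f"{base_url.rstrip('/')}/{quote(name, safe='')}"


if __name__ == "__main__":
    # Hypothetical local server; 5984 is the default CouchDB port.
    day = date(2012, 10, 20)
    print(partition_db_name("world_logs", day))
    print(partition_db_url("http://localhost:5984", "world_logs", day))
```

A writer would send each new document to the URL for its own month, so only the current month's database (and its view index) keeps changing.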
