Damien used to say, "There are vitamins and there are pain pills."
BigCouch is a vitamin[1]: a long-term fix for the general health and robustness of the system. npm needs a pain pill, and it is going to get one.

Why do I respect the Node.js community? Certainly not because of the language! No, because they get things done and move quickly. I expect a fix for this problem to be in production before we could even wrap up a discussion about architectural changes. Fortunately, this fix will land right where it should: in the application. There is nothing wrong with storing URLs as document data and having the client fetch those itself, as long as you understand the trade-offs, which npm does.

[1]: I am aware that there is not a shred of evidence that multivitamin supplements improve the health of normal people. But you know what I mean.

On Wed, Nov 27, 2013 at 6:59 PM, Robert Newson <[email protected]> wrote:
> I think npm mostly struggles with disk issues (all attachments are in
> the same file, and it's 100G) and replication (a document with lots of
> attachments has to be transferred fully over the same connection
> without interruption, or else it starts over).
>
> Both of these are fixable without taking the extreme measure of moving
> the attachments out of CouchDB entirely. That would pretty much
> eliminate the point of using CouchDB for this registry. It's a
> perfectly reasonable thing for the registry owners to do, but changing
> CouchDB is going too far. I've previously advocated for "external"
> attachments, whether that's a file per attachment or a separate .att
> file of all attachments. I've since recanted; it's not compelling
> enough to compensate for the extra failure conditions (the .couch file
> exists but the .att file is gone, say).
>
> For the actual problems, the BigCouch merge will bring sharding (a
> q=10 database would consist of ten 10G files, each individually
> compactable, which can be hosted on different machines, etc.).
> CouchDB 1.5.0 improved replication behaviour around attachments, but
> there's definitely more work to be done. In particular, we could make
> attachment replication resumable. Currently, if we replicate 99.9% of
> a large attachment, lose our connection, and resume, we'll start over
> from byte 0. This is why, elsewhere, there's a suggestion of "one
> attachment per document". That is a horrible and artificial constraint
> just to work around replicator deficiencies. We should encourage sane
> design (related attachments together in the same document) and fix the
> bugs that prevent heavy users from following it.
>
> B.
>
>
> On 27 November 2013 07:27, Benoit Chesneau <[email protected]> wrote:
> > On Wed, Nov 27, 2013 at 8:26 AM, Benoit Chesneau
> > <[email protected]> wrote:
> >>
> >> On Wed, Nov 27, 2013 at 8:14 AM, Alexander Shorin
> >> <[email protected]> wrote:
> >>
> >>> http://blog.nodejs.org/2013/11/26/npm-post-mortem/
> >>>
> >>> > Move attachments out of CouchDB: Work has begun to move the
> >>> > package tarballs out of CouchDB and into Joyent's Manta service.
> >>> > Additionally, MaxCDN has generously offered to provide CDN
> >>> > services for npm, once the tarballs are moved out of the registry
> >>> > database. This will help improve delivery speed, while
> >>> > dramatically reducing the file system I/O load on the CouchDB
> >>> > servers. Work is progressing slowly, because at each stage in the
> >>> > plan, we are making sure that current replication users are
> >>> > minimally impacted.
> >>>
> >>> I wonder whether this is caused by CouchDB's non-optimal I/O, and
> >>> whether the COUCHDB-769 fix could address it?
> >>>
> >>> https://issues.apache.org/jira/browse/COUCHDB-769
> >>>
> >>> There is an alpha patch attached. Maybe it's a good time to push it
> >>> forward? What is left to do for it?
> >>>
> >>> --
> >>> ,,,^..^,,,
> >>
> >> I would say a better internal API. I am also interested in working
> >> on that.
> >
> > also +1
>
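The "URLs as document data" approach from the top of the thread can be sketched in a few lines. This is an illustrative sketch only: the field names are loosely modeled on npm registry documents, and the URLs and package name are made up.

```python
def tarball_urls(package_doc):
    """Collect the tarball URL for each published version of a package.

    Instead of storing tarballs as CouchDB attachments, the document
    only carries a URL per version; the client fetches the tarball
    itself (e.g. from a CDN), keeping large blobs out of the database.
    """
    return {
        version: meta["dist"]["tarball"]
        for version, meta in package_doc.get("versions", {}).items()
    }


# A hypothetical registry document carrying URLs instead of attachments:
doc = {
    "_id": "left-pad",
    "versions": {
        "1.0.0": {"dist": {"tarball": "https://cdn.example.com/left-pad-1.0.0.tgz"}},
        "1.0.1": {"dist": {"tarball": "https://cdn.example.com/left-pad-1.0.1.tgz"}},
    },
}

urls = tarball_urls(doc)
```

The trade-off mentioned above is visible here: the database stays small and replication stays cheap, but the client now depends on the external host being available.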
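The resumability problem Robert describes (losing 99.9% of a transfer and restarting from byte 0) is the kind of thing a standard HTTP Range request can address. The sketch below only builds the header value a resuming client would send; the byte counts are illustrative, and this is not how CouchDB's replicator actually works today.

```python
def resume_range_header(bytes_already_received, total_size=None):
    """Build a Range header value to resume an interrupted transfer.

    Instead of restarting from byte 0 after a dropped connection, the
    client asks the server for only the bytes it has not yet received.
    Range byte positions are zero-based and inclusive.
    """
    if total_size is not None:
        return f"bytes={bytes_already_received}-{total_size - 1}"
    # Open-ended form: everything from this offset to the end.
    return f"bytes={bytes_already_received}-"


# After receiving 99.9% of a 1 GB attachment, only the tail is re-requested:
hdr = resume_range_header(999_000_000, 1_000_000_000)
```

A replicator built this way would checkpoint how many bytes of an attachment it has durably written, then resume with such a header, rather than treating the attachment as all-or-nothing.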
