On Jul 16, 2008, at 6:56 PM, David King wrote:

We'd love to hear what you come up with and also to solve any problems you might encounter on your way. Please let us know. Please note that CouchDB at this point is not optimised. We are still in the 'getting it right' phase before we come to the 'getting it fast'. That said, CouchDB is plenty fast already, but there is also the potential to greatly speed up things.


So I'm trying a smaller version of this first (9 million records), and I've hit a snag. I have some rather simple Python code to read from Postgres and write to CouchDB (it uses couchdb-python, where 'db' is a couchdb.client.Database object):

   chunker = IteratorChunker(get_stuff())

   while not chunker.done:
       print "fetching"
       chunk = chunker.next_chunk(1000)
       if chunk:
           print "Adding %d items, starting with %s" % (len(chunk), chunk[0]['_id'])
           db.update(chunk)

db.update(docs) (see <http://code.google.com/p/couchdb-python/source/browse/trunk/couchdb/client.py>, line 360) uses the bulk API, like:

data = self.resource.post('_bulk_docs', content={'docs': documents})
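
The IteratorChunker class is not shown in this message. For readers following along, here is a minimal sketch of what such a helper might look like, assuming only the interface the loop above relies on (a `done` flag and a `next_chunk(n)` method). The implementation is a hypothetical stand-in, not the original class:

```python
import itertools


class IteratorChunker(object):
    """Hypothetical stand-in for the chunker used in the loop above."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.done = False

    def next_chunk(self, size):
        # Pull up to `size` items from the underlying iterator.
        chunk = list(itertools.islice(self._it, size))
        # A short (or empty) chunk means the source is exhausted.
        if len(chunk) < size:
            self.done = True
        return chunk
```

When the source length is an exact multiple of the chunk size, the final call returns an empty list, which is why the loop checks `if chunk:` before calling db.update.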

At apparently random points throughout this process, but almost always before 15,000 records or so, the process dies with an exception, the tail end of which looks like:

  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 707, in send
    self.sock.sendall(str)
  File "<string>", line 1, in sendall
socket.error: (54, 'Connection reset by peer')

If I have Futon up while it's running, I occasionally get a JavaScript error along the lines of "killed" at the same time (reproducing it is difficult).

I could have it catch the reset connection and re-try, but why would this be happening?
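
A minimal sketch of the catch-and-retry workaround mentioned above. The helper name `update_with_retry` and its `attempts`, `delay`, and `sleep` parameters are assumptions made for illustration; they are not part of couchdb-python. It uses `except ... as`, which needs Python 2.6 or later (on 2.5, write `except socket.error, e`). It retries only on a connection reset and re-raises anything else:

```python
import errno
import socket
import time


def update_with_retry(update, chunk, attempts=5, delay=1.0, sleep=time.sleep):
    """Call update(chunk), retrying if the peer resets the connection.

    `update` is any callable taking the chunk, e.g. db.update.
    """
    for attempt in range(1, attempts + 1):
        try:
            return update(chunk)
        except socket.error as e:
            # Older Pythons carry the code in args[0]; newer ones set .errno.
            code = getattr(e, 'errno', None)
            if code is None and e.args:
                code = e.args[0]
            # Give up on other socket errors or once attempts are exhausted.
            if code != errno.ECONNRESET or attempt == attempts:
                raise
            # Back off a little longer after each failed attempt.
            sleep(delay * attempt)
```

In the loop above this would replace the direct call with `update_with_retry(db.update, chunk)`. Whether a chunk that hit a reset was partly written is not something this sketch checks, so a retried chunk may report conflicts for documents whose `_id` already exists.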


You appear to be hitting the weird mochiweb connection reset bug. It causes test failures too. We are looking into it.
