Hi,

just for info: on a current project I needed to import 6M+ docs, and the sweet spot was 10K docs per bulk upload; higher values gave worse results. I don't have the numbers handy, but converting the docs from CSV and bulk uploading them into Couch took several hours, I'd guess around 8 (on a rather old IBM Blade machine)... (And the real pain was handling malformed CSV parts, patching FasterCSV not to choke on them, etc.)
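For anyone following along, the batched upload pattern being discussed looks roughly like this. A minimal sketch using only the Python standard library; `_bulk_docs` is CouchDB's actual bulk endpoint, but the `db_url` and batch size of 10K are just the values from this thread, not universal recommendations:

```python
import json
import urllib.request
from itertools import islice

def batches(docs, size=10_000):
    """Yield successive lists of at most `size` docs from any iterable."""
    it = iter(docs)
    while chunk := list(islice(it, size)):
        yield chunk

def bulk_upload(db_url, docs, size=10_000):
    """POST each batch of docs to CouchDB's _bulk_docs endpoint."""
    for chunk in batches(docs, size):
        body = json.dumps({"docs": chunk}).encode()
        req = urllib.request.Request(
            db_url.rstrip("/") + "/_bulk_docs",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()  # CouchDB returns one status entry per doc
```

The batch size is worth benchmarking on your own hardware, per the discussion below: too small and HTTP round-trip overhead dominates, too large and each request gets slow and memory-hungry.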

Karel

On 28 Jan 2010, at 15:02, Troy Kruthoff wrote:

Just curious, what batch size did you use? I was just about to run some test data to see where the sweet spot is for our hardware; I remember reading somewhere that someone thought it was around 3K docs.

Troy


On Jan 28, 2010, at 4:21 AM, Sean Clark Hess wrote:

Sweet... down to 28 minutes with bulk. Thanks

On Thu, Jan 28, 2010 at 4:25 AM, Sean Clark Hess <[email protected]> wrote:

Ah, I forgot about bulk! Thanks!


On Thu, Jan 28, 2010 at 4:24 AM, Alex Koshelev <[email protected]> wrote:

How do you import data to CouchDB? Do you use _bulk API?
---
Alex Koshelev


On Thu, Jan 28, 2010 at 1:51 PM, Sean Clark Hess <[email protected]> wrote:

I'm trying to import 7 million rows into couch from an xml document. If I use a database with a "normal" interface (comparing with Mongo here), the process completes in 37 minutes. If I use couch, it takes 10 hours. I think it might be due to the overhead of the http interface, but I'm not sure.

Is there any way to get data in there faster?

~sean
