Hi,
just for info: on a current project I needed to import 6M+ docs, and the sweet spot was 10K docs per batch upload. Higher values gave worse results. I don't have the numbers handy, but it took a couple of hours to convert the docs from CSV and bulk upload them into Couch; I guess around 8 hrs (on a rather old IBM Blade machine)... (And the real pain was handling malformed CSV parts, patching FasterCSV to not choke on it, etc.)
Karel
On Jan 28, 2010, at 15:02, Troy Kruthoff wrote:
Just curious, what batch size did you use? I was just about to run some test data to see where the sweet spot is for our hardware; I remember reading somewhere that someone thought it was around 3K docs.
Troy
On Jan 28, 2010, at 4:21 AM, Sean Clark Hess wrote:
Sweet... down to 28 minutes with bulk. Thanks
On Thu, Jan 28, 2010 at 4:25 AM, Sean Clark Hess
<[email protected]> wrote:
Ah, I forgot about bulk! Thanks!
On Thu, Jan 28, 2010 at 4:24 AM, Alex Koshelev
<[email protected]> wrote:
How do you import data to CouchDB? Do you use _bulk API?
---
Alex Koshelev
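For reference, the bulk API Alex mentions is `POST /{db}/_bulk_docs`, which takes a JSON body of the form `{"docs": [...]}` and writes all docs in one request. A small sketch of building that payload; the doc contents here are made up for illustration:

```ruby
require "json"

# Build the JSON body for CouchDB's POST /{db}/_bulk_docs endpoint.
# The body is an object with a "docs" array; _id is optional (CouchDB
# assigns one if omitted).
payload = JSON.generate(
  "docs" => [
    { "_id" => "doc1", "title" => "first"  },
    { "_id" => "doc2", "title" => "second" }
  ]
)

# CouchDB replies with a JSON array containing one result (id/rev, or an
# error) per submitted doc.
```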
On Thu, Jan 28, 2010 at 1:51 PM, Sean Clark Hess <[email protected]
>
wrote:
I'm trying to import 7 million rows into couch from an xml
document. If
I
use a database with a "normal" interface (comparing with Mongo
here),
the
process completes in 37 minutes. If I use couch, it takes 10
hours. I
think
it might be due to the overhead of the http interface, but I'm
not sure.
Is there any way to get data in there faster?
~sean