Hi,
Writing such an import script is really simple. I doubt there is
anything ready-made, because it would have to be very flexible in
order to do any real work.
See the example I am attaching to this mail.
I use such scripts constantly.
Pay special attention to the "bulk insert" part: sending the documents
in batches instead of one request per document. Otherwise you will
have serious performance issues.
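To make the batching idea concrete, here is a minimal sketch that is
independent of the attached script. The names `batches` and `bulk_load`
and the batch size are my own choices for illustration; `send` stands in
for whatever performs the bulk request (with couchdb-python that would
be `db.update`).

```python
def batches(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly shorter, batch.
        yield batch


def bulk_load(docs, send, size=5000):
    """Call `send` once per batch and return the number of calls made."""
    calls = 0
    for batch in batches(docs, size):
        send(batch)  # e.g. db.update(batch) with couchdb-python
        calls += 1
    return calls


if __name__ == '__main__':
    # Stand-in for the database: just remember each batch that was "sent".
    sent = []
    calls = bulk_load(({'n': i} for i in range(12)), sent.append, size=5)
    print(calls)
    print([len(b) for b in sent])
```

With 12 documents and a batch size of 5 this makes three calls, carrying
5, 5 and 2 documents, so the leftover documents are not lost.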
The csv module I am importing here is part of the standard library
(since Python 2.3), so there is nothing to install for it. The couchdb
module is not; you will have to install couchdb-python (or whatever it
is called in your distro).
Also read the csv module's documentation in order to choose the right
csv dialect for your data.
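As a small illustration of why the dialect matters, the sketch below
parses the same in-memory sample with two dialects. The sample data and
the `SemicolonDialect` name are made up for the example; `csv.excel` is
the comma-separated dialect that ships with the csv module.

```python
import csv
import io

# Made-up sample: semicolon-delimited, with a quoted field that itself
# contains a semicolon.
SAMPLE = 'code;description\r\nA00;"Cholera; unspecified"\r\n'


class SemicolonDialect(csv.excel):
    """Hypothetical dialect: like csv.excel, but semicolon-delimited."""
    delimiter = ';'


def parse(text, dialect):
    """Parse `text` with the given dialect and return a list of rows."""
    return list(csv.reader(io.StringIO(text), dialect=dialect))


# The wrong dialect finds no delimiter, so each line is a single field.
rows_wrong = parse(SAMPLE, csv.excel)
# The right dialect splits the columns and honours the quoting.
rows_right = parse(SAMPLE, SemicolonDialect)

if __name__ == '__main__':
    print(rows_wrong)
    print(rows_right)
```

Nothing fails loudly with the wrong dialect; you just get one-column
rows, so it is worth checking a few parsed rows before a large import.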
This is by no means the "perfect" way of importing csv documents, but
it does the job for me, and I hope you'll find it useful too.
On Fri, Oct 16, 2009 at 4:36 PM, AnotherNetFellow
<[email protected]> wrote:
> Hi,
> i'm testing CouchDB as an alternative to MySQL (using python rewriting my
> application data model doesn't seem so difficult).
>
> To have some bench i want to put in it something like 3 million entries and
> see the get/put/delete latency. I have this data available in CSV format, or
> in alternative to this i can import them in MySQL and then export everything
> in SQL language.
>
> Do you know if there is a simple way to import everything on couch db?
>
> I think i can write a python script to read a csv file and put all entries
> in couch, but hope something similar has already been written.
>
> Thankyou very much
>
> Giorgio
>
> --
> --
> AnotherNetFellow
> Email: [email protected]
>
#!/usr/bin/env python
import csv

from couchdb import Database
from yourApp.models import SomeModel

csv_path = 'Some-file.csv'
BATCH_SIZE = 5000  # documents per bulk request


class ICD10_Dialect(csv.Dialect):
    delimiter = ','
    quotechar = '"'
    escapechar = '\\'
    quoting = csv.QUOTE_MINIMAL
    lineterminator = '\r\n'


db = Database('http://localhost:5984/pylonsfarm')
docs = []

# Stream the file row by row instead of loading it all into memory.
with open(csv_path, 'r') as csv_file:
    for row in csv.reader(csv_file, dialect=ICD10_Dialect()):
        # Here you write some logic to generate instances
        # of your model from the row contents.
        doc = SomeModel(spam=row[0])  # etc.
        docs.append(doc)
        # Send one bulk request per full batch.
        if len(docs) >= BATCH_SIZE:
            print('bulk inserting %d documents' % len(docs))
            db.update(docs)
            docs = []

# Insert whatever is left over after the last full batch.
if docs:
    db.update(docs)
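
The "generate instances of your model" step in the script is left open
on purpose. As one possible shape for it, here is a sketch that builds
plain dicts instead of model instances (couchdb-python's `db.update`
accepts a list of dicts as well). The column names `code` and
`description`, the `type` tag and the choice of `_id` are assumptions
made up for this example, not something your data has to follow.

```python
import csv
import io


def row_to_doc(row):
    """Turn one CSV row (a dict keyed by column name) into a document."""
    return {
        '_id': row['code'],  # hypothetical: reuse the code as the doc id
        'description': row['description'].strip(),
        'type': 'icd10_entry',  # hypothetical tag for use in views
    }


def docs_from_csv(csv_file):
    """Lazily yield one document per data row; the first row is the header."""
    for row in csv.DictReader(csv_file):
        yield row_to_doc(row)


if __name__ == '__main__':
    sample = io.StringIO('code,description\r\nA00, Cholera \r\n')
    for doc in docs_from_csv(sample):
        print(doc)
```

Because `docs_from_csv` is a generator, it can feed the batching loop
directly without holding all rows in memory.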