Hi Tim, https://speakerdeck.com/wohali/10-common-misconceptions-about-apache-couchdb?slide=4 (;
I agree with her. It isn't simple to import SQL data into NoSQL, especially document-oriented storage. Different concepts and data requirements apply there.

--
,,,^..^,,,

On Wed, Nov 20, 2013 at 11:28 PM, Tim Black <t...@alwaysreformed.com> wrote:
> John,
>
> On 11/19/2013 04:28 AM, John Norris wrote:
>> I am trying to retrofit an existing web app using SQL to couchdb with
>> ektorp. I have set up couchdb and run through some tutorials (ektorp,
>> seven databases in seven weeks, definitive guide).
>> If I have several objects (represented as pojos) then in SQL this
>> would probably equate to several tables within the database.
>> But in couchdb, a database is a number of documents? And those
>> documents can represent several object types? So do I need each
>> document to have a field representing what type it is? (eg a field
>> that is unique to that document type).
> So far as I can understand your question, it depends on whether each
> pojo object contains many rows of similar data. If they don't, then
> represent each object as one doc, like this:
>
> { "_id": "12345", "type": "pojo" }
>
>> Or does each document type go in its own database?
> If each pojo object contains many rows of similar data, I'd probably
> break it up into one document per row and keep all the pojos in the same
> database, so I could query across all pojos. I don't think it's
> possible to query across multiple databases in CouchDB.
>
> Here are two files I use to migrate data from sqlite to CouchDB, which I
> offer here as an example for any who are doing similar work:
>
> csv2json.py:
>
> ----------------
> #!/usr/bin/env python
>
> import csv, sys, json
>
> # Open the CSV file passed as the first command line argument
> f = open(sys.argv[1], 'r')
> reader = csv.DictReader(f)
> rows = []
> for row in reader:
>     # Iterate over a snapshot of the keys, since the loop mutates the dict
>     for key in list(row.keys()):
>         # Remove underscore from beginning of attribute names
>         if key.startswith('_'):
>             new_key = key.lstrip('_')
>             row[new_key] = row[key]
>             del row[key]
>         # Convert id columns to namespaced ids to avoid conflicts
>         if key == 'id':
>             row['_id'] = sys.argv[2] + '.' + row['id']
>             del row['id']
>         if key == 'user_id':
>             row['_id'] = sys.argv[2] + '.' + row['user_id']
>             del row['user_id']
>         if key == 'type':
>             row['job'] = row['type']
>             del row['type']
>     # Insert document collection column, which equals the sqlite table name
>     row['collection'] = sys.argv[2]
>     rows.append(row)
> # Wrap in CouchDB _bulk_docs JSON format
> out = '{"docs":%s}' % json.dumps(rows)
>
> print(out)
> -----------------
>
> sqlite2csv2couchdb.sh
>
> ------------------
> #!/bin/bash
>
> # Get the database from the production site
> scp remote_host:path/to/sqlite.db .
>
> DB="http://username:password@localhost:5984/projects"
>
> # TODO: Use filtered replication to save the design docs
> # Delete old copy of database
> curl -X DELETE $DB
> # Wait a second to let CouchDB delete the old database
> sleep 1
> # Create new copy of database
> curl -X PUT $DB
>
> # TODO: Set permissions on couchdb database
> # Create list of tables
> tables=`sqlite3 sqlite.db 'SELECT tbl_name FROM sqlite_master WHERE type="table"'`
>
> while read -r line
> do
>     # Filter out the visits tables
>     if [ "$line" != "visit" ] && [ "$line" != "visit_identity" ]
>     then
>         # Get table of data as CSV with a header row
>         rows=$(sqlite3 -csv -header sqlite.db "SELECT * FROM $line")
>         echo "$rows" > tmp.csv
>         rows=$( python csv2json.py tmp.csv $line )
>
>         # Write JSON to a file to avoid a curl error from having too many
>         # command line arguments
>         echo "$rows" > tmp.json
>
>         # Insert table into couchdb
>         curl -d @tmp.json -H "Content-Type:application/json" -X POST $DB/_bulk_docs &> /dev/null
>     fi
> done <<< "$tables"
>
> rm tmp.json
> rm tmp.csv
> rm sqlite.db
> ---------------------
>
> Tim
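
One follow-up on Tim's scripts: once every imported doc carries the "collection" field that csv2json.py adds, a view keyed on that field is what lets you pull back one former table at a time from the single database. Below is a minimal sketch of that idea, assuming Python with the `requests` library and the "projects" database URL from the script above; the design doc name, view name, and the "some_table" value are only examples, not anything from Tim's setup.

----------------
#!/usr/bin/env python
# Sketch: query imported docs by their "collection" field through a CouchDB view.
# Assumes the `requests` library and the "projects" database from the script
# above; the design doc name, view name, and table name are placeholders.

import json
import requests

DB = "http://username:password@localhost:5984/projects"

# Define a view that keys every doc on its collection (former sqlite table) name.
design_doc = {
    "views": {
        "by_collection": {
            "map": "function(doc) { if (doc.collection) { emit(doc.collection, null); } }"
        }
    }
}
# PUT creates the design doc (returns a 409 conflict if it already exists).
requests.put(DB + "/_design/app", data=json.dumps(design_doc),
             headers={"Content-Type": "application/json"})

# Fetch all docs that came from one sqlite table.
table = "some_table"  # placeholder for one of your table names
resp = requests.get(DB + "/_design/app/_view/by_collection",
                    params={"key": json.dumps(table), "include_docs": "true"})
for row in resp.json()["rows"]:
    print(row["doc"]["_id"])
----------------

The same query could of course be done with curl; the point is just that the collection field ends up playing the role the table name used to play.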