On 5 November 2012 19:22, Kevin Burton <[email protected]> wrote:
> I am calling CreateDocument<Document>() but I suspect that testing if the
> document exists first may perform better in the long run. I am using
> DreamSeat for my driver but I suspect other drivers have a similar "test". My
> problem is that I don't know what to test for and I am unfamiliar with the
> available methods. Any one successfully use such a pattern (preferably with
> DreamSeat) that tests for existence then creates if the document doesn't
> exist? Keep in mind I don't initially have an id. Thank you.
>

Hi Kevin,

A number of folk have said "read the guide first" and it's sound advice:
http://guide.couchdb.org/draft/index.html
You're stuck on conceptual stuff that's well covered in the book. I recommend skipping the sofa chapter (it was written some time ago).

Secondly, I recommend having a play at first with pure HTTP, i.e. curl or similar. This is simply so you get a real feel for how your data is actually stored and manipulated, before layering an abstraction on top. It *will* save you time in the short run, and it's not scary. I learned a huge amount about HTTP itself along the way and I'm definitely not done yet. So it's all good. You can also watch the HTTP calls in & out of futon using Chrome Debugger, or by putting a proxy like CharlesProxy in between.

I'm assuming you are using Windows (yay!) so http://wiki.apache.org/couchdb/Quirks_on_Windows has some tips on using curl successfully, and I'm happy to help out if you get stuck. Let me know if anything is not clear or out of date.

If you're initially bulk uploading data, I would do 3 things differently to what you're currently doing.

1. assign UUIDs (the document _id) myself
This is the only enforced unique indexed attribute in a DB, so use it well. Put something you want in it. It's basically free text ** within reason.

2. insert them in sorted UUID order
CouchDB is a database and sorting matters. Couch uses a B~tree ** and so if you insert randomly you spend a lot of time forcing the re-write of intermediate nodes for no gain. As Couch is an append-only datastore this means several things:
- wasted space until you compact
- slower insert performance as you have multiple writes instead of one
http://horicky.blogspot.co.at/2008/10/couchdb-implementation.html

3. try inserting the first few docs by hand with curl.
And read up on the _bulk_docs API, this is much much faster.

Re your drivers, there are several but I personally don't use any of them. There are more popular ones (based on my dodgy recollection) here:
http://wiki.apache.org/couchdb/Related_Projects
Hopefully some of the other Windows folk will pipe up.

A+
Dave

** handwavey
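
For what it's worth, points 1 and 2 plus the _bulk_docs call might look roughly like the sketch below. It is Python over plain HTTP rather than DreamSeat, and everything named in it is an assumption for illustration: the helper names (with_ids, bulk_payload, post_bulk), the idea of taking the _id from one of your own fields, and the database URL in the usage note. The plain string sort assumes ASCII ids.

```python
import json
import urllib.request


def with_ids(records, key_field):
    """Return copies of the records with _id taken from key_field, sorted by _id.

    key_field is a hypothetical natural key in your own data; any value that is
    unique per document would do.
    """
    docs = [dict(rec, _id=str(rec[key_field])) for rec in records]
    # Sorted ids mean the inserts arrive in key order instead of randomly.
    return sorted(docs, key=lambda d: d["_id"])


def bulk_payload(docs):
    """Encode docs as the JSON body that POST /{db}/_bulk_docs expects."""
    return json.dumps({"docs": docs}).encode("utf-8")


def post_bulk(db_url, docs):
    """POST all docs in one request; returns CouchDB's per-document results."""
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=bulk_payload(docs),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage would be something like post_bulk("http://localhost:5984/mydb", with_ids(rows, "sku")), where the URL and the "sku" field are made up. Because the ids are your own, a document that already exists should come back as a conflict entry for that row in the results rather than as a second copy, so there is no separate existence test to run first.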
