Hi all,

(I originally posted this on the mobile-couchbase list, but was recommended to post it here instead, since this is more of a general couchdb question)

I am struggling with something that I know would be quite trivial to do in SQL, but seems to be a lot more complex to do in couchdb. Perhaps 20 years of experience with SQL is clouding my judgement and so I am trying rather to break away from the SQL/relational mindset and understand how this would be better designed in couch.

I am developing an iOS art gallery app, and wish to display information about artworks, artists and galleries.

The menu screen requires a list of all artworks together with the gallery they are currently housed at and the artist that created the artwork:

 * artwork title
 * artwork medium
 * artwork dimensions
 * artist full_name
 * gallery name,
 * gallery city

for each artwork, respectively.

Imagine the following objects in my database (I am not including all the fields for sake of brevity. i.e. only the ones that are important).

|     Artist
    ------
    _id
    full_name
    birth_year
    birth_place

    Gallery
    -------
    _id
    name
    city

    Artwork
    -------
    _id
    title
    medium
    dimensions
    artist_id_ (foreign key)
    gallery_id_ (foreign key)
|

In SQL it would be as easy as one query with two joins:

|SELECT
    artwork.title,
    artwork.medium,
    artwork.dimensions,
    artist.full_name,
    gallery.name,
    gallery.city
FROM
    artwork INNER JOIN artist
    ON artwork.artist_id_ = artist._id INNER JOIN gallery
    ON artwork.gallery_id_ = gallery._id
|

From this I would be able to get the data I need in one query (i.e. one call to the database).

|     Guernica  | Pablo Picasso     | Museo Reina Sofia, Madrid, Spain
    Mona Lisa | Leonardo da Vinci | Musée du Louvre, Paris, France
    David     | Michelangelo      | Uffizi Gallery, Florence, Italy.
|

Using couchdb I am storing the data in a very similar way to the way I would in a relational database, represented as something along the lines of:

|{
   "type"       : "Artwork",
   "_id"        : "d665afaedde12b44c5159cf1782c5c1e",
   "_rev"       : "1-446201af8a1ddfa3759164a01be899de",
   "artist_id"  : "d665afaedde12b44c5159cf1782c2d0b",
   "gallery_id" : "d665afaedde12b44c5159cf1782d44af",
   "title"      : "Guernica",
   "medium"     : "Oil on canvas",
   "dimensions" : "349 cm × 776 cm (137.4 in × 305.5 in)"
}

{
   "type"       : "Artist",
   "_id"        : "d665afaedde12b44c5159cf1782c2d0b",
   "_rev"       : "1-8fa8a6dd50d9d9072b08db1a840128b1",
   "full_name"  : "Pablo Picasso",
   "birth_year" : "1881"
}

{
   "type"       : "Gallery",
   "_id"        : "d665afaedde12b44c5159cf1782d44af",
   "_rev"       : "1-27a139e40a7f88d6465812eec3c73a0f",
   "name"       : "Museo Reina Sofia"
}
|

To get the same result in couch, I would have to do one query to get all the artworks, e.g.

|function(doc) {
  if(doc.type&&  doc.type == 'Artist') {
    emit(doc._id,doc);
  }
}
|

then I would need to do a |GET| for the _id of each |Gallery| and |Artist| returned per |Artwork| respectively to fill in the missing information. This is usually known as the |n+1| queries problem (in this case |2n+1|).

I have always worked on the assumption that the database should be leveraged to do as much of the heavy lifting as possible with as few network hits as possible (i.e. SQL is always faster than client manipulation). With a list of 120 artworks, this means 241 HTTP GETs to the database.

I am aware of "linked documents" in views (http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Linked_documents), however this seems to only work with one related row.

I can't really see how "view collation" (http://wiki.apache.org/couchdb/View_collation) would work here either. With view collation the correlated information ends up being on different rows, and I think that would be better suited for where say a single Artists or Gallery is displayed with associated Artworks. My case is the opposite, whereby Artworks plus corresponding information from Artists and Galleries is required.

I am also aware that one could embed each gallery or artist in each artwork or vice versa, but that seems to be a lot of duplication, and would make it difficult to change artist or gallery information if required. Maybe this is how it should be done?

For this small app, this is not going to be a performance issue. I am trying rather to break away from a SQL/relational mindset and understand how this would be better designed in a database like couch.

Thanks,

Steve

Reply via email to