Hi everyone,

This is a resend of a mail I originally sent only to Jan.  He suggested I send
it on for others to comment on.  Feel free to let me know what you think!

Hey Jan,

a while ago when you were in Barcelona I mentioned the music program I
was working on and how I wanted to move it to couchdb.  You said I
should write down what I was trying to do and what my problems are.
Since I recently started hacking on it again (as a use case for paisley,
the twisted library for couchdb), I wrote some things down.

I'm attaching the rst that describes the kinds of documents/objects the
program deals with.  As you can imagine, I'm having trouble moving from
a normalized data model to a more couchdb-friendly one.  I added two
ideas for denormalizing that I think would be ok for me, but there are
some where I think it wouldn't make sense.

There is one big question I have that I don't know if it's possible at
all or not... but given how couchdb implements 'views' on the whole
document store that are created on the first request to that view and
then kept up to date incrementally - why would it be so hard to use
couchdb as a normalized data store, and then create a view that
effectively denormalizes data doing more than the single join that the
'view for sql jockeys' chapter explains ?

Isn't the principle the same thing, and isn't couchdb simply only
lacking a way of expressing that kind of view ? Or am I missing
something fundamental here ?

Any feedback on my design ideas and how I want to model the data is
appreciated.  Let me know if you have questions; in particular it's
important to treat Track as the central object, with Slices into
AudioFiles just being the on-disk representation of that track.

Thanks
Thomas


-- 
morgen wordt het beter beter voor iedereen
dan krijg ik de strop
en jij wat je verdiende
--
GStreamer - bringing multimedia to your desktop
http://gstreamer.freedesktop.org/


Objects
=======

Track
-----

Track is the central object.  A track represents the idea of a single song.
Tracks have a name (the song title) and a list of artists performing it.

A Track contains a link to the artist_ids for this track.

Tracks are not directly linked to an audio file for various reasons:
 * one track can have multiple audio files (single file album,
   hidden bonus tracks)
 * you can know about tracks without having the actual file (shopping list,
   wish list, deleted files)

Audiofile
---------

An audio file is a file containing digital audio data, compressed or
uncompressed.  This data can represent one or more tracks.  The file
can be lossless or lossy.

Audiofiles have a samplerate, duration, and an md5sum calculated on everything
but the metadata.  This can be format- or codec-specific.

Slice
-----

A slice identifies a part of an audio file representing one track.

A slice contains links to:
 * the audiofile it slices
 * the track it represents

It has variables like start and end, as well as audio statistics like
peak, rms, and some special rms parameters used for mixing.

An audio file can have multiple slices; for example:
 * 'All Apologies' on Nirvana's 'In Utero' has 20 minutes of silence
   plus a bonus track
 * 'Swallow' on Placebo's debut album has silence plus a beautiful piano-based
   bonus track.
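
As a rough sketch of how Track, Audiofile and Slice relate, the Python below
models them as plain records.  The field names (``audiofile_id``,
``track_id``, ``start``, ``end``) and all sample values are assumptions made
for illustration, not the program's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Track:
    id: str
    name: str
    artist_ids: list


@dataclass
class AudioFile:
    id: str
    samplerate: int
    duration: float  # seconds


@dataclass
class Slice:
    audiofile_id: str  # the audio file it slices
    track_id: str      # the track it represents
    start: float       # seconds into the audio file
    end: float


def slices_for_audiofile(slices, audiofile_id):
    """Return the slices of one audio file, ordered by start time."""
    found = [s for s in slices if s.audiofile_id == audiofile_id]
    return sorted(found, key=lambda s: s.start)


# Hypothetical data: one file holding a song, silence, and a bonus track.
audiofile = AudioFile(id="file-1", samplerate=44100, duration=1500.0)
slices = [
    Slice("file-1", "track-bonus", 1300.0, 1500.0),
    Slice("file-1", "track-main", 0.0, 230.0),
]
ordered = slices_for_audiofile(slices, "file-1")
```

The gap between the two slices is the silence: it belongs to the audio file
but to no track.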

Artist
------

An artist identifies a performer of tracks/albums/...
It has no links.
It has fields to help in sorting and displaying.

It is linked to by:
 * Track
 * Album

Album
-----

An Album is a collection of tracks.

An Album contains a link to the artist_ids for this album.

An Album does not directly list tracks.  See TrackAlbum.

TrackAlbum
----------

A TrackAlbum links a Track to an Album it appears on.  An Album finds its
tracks through these links rather than listing them directly.

Directory
---------

A directory in which audio files live.
It has its relative name.

It links to a parent which can be another directory or a volume.

Directories are implemented with relative paths and recursively to make it
easy to

* move directories
* mount volumes in different locations (e.g. same nfs on different machines,
  or a usb disk mounted on different mount points)
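
A minimal sketch of the recursive lookup, assuming each directory document
stores its relative ``name`` and a ``parent`` id, and that the mount point of
each volume is supplied per machine.  The document layout and the
``resolve_path`` helper are hypothetical.

```python
import posixpath

# Hypothetical documents keyed by id; "parent" points to a directory or volume.
docs = {
    "vol-1": {"type": "volume"},
    "dir-1": {"type": "directory", "name": "music", "parent": "vol-1"},
    "dir-2": {"type": "directory", "name": "albums", "parent": "dir-1"},
}


def resolve_path(docs, directory_id, mount_points):
    """Walk parent links up to the volume and build an absolute path.

    mount_points maps a volume id to where it is mounted on this machine.
    """
    parts = []
    doc_id = directory_id
    while docs[doc_id]["type"] == "directory":
        parts.append(docs[doc_id]["name"])
        doc_id = docs[doc_id]["parent"]
    # doc_id now refers to the volume at the top of the chain.
    return posixpath.join(mount_points[doc_id], *reversed(parts))


# The same documents resolve differently depending on the mount point.
on_laptop = resolve_path(docs, "dir-2", {"vol-1": "/mnt/usb"})
on_desktop = resolve_path(docs, "dir-2", {"vol-1": "/media/disk"})
```

Moving a directory then only means changing one ``parent`` link, and
remounting a volume only means changing one mount point.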


Volume
------

A storage volume for audio files and their directories.  It can be attached to
or removed from a computer.

Volumes can theoretically be shared across computers; for example an NFS share.
Maybe we need a VolumeMapping per computer ?  They could be mounted on
different paths.

Potential denormalizations
==========================

 * put ratings for track/artist/album on the respective documents
 * put slices in track documents
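
To make the second idea concrete, here is a sketch of embedding slices in
track documents.  It assumes documents are plain dicts with a couchdb-style
``_id`` and that each slice carries a ``track_id``; the other field names and
the ``embed_slices`` helper are hypothetical.

```python
import copy


def embed_slices(tracks, slices):
    """Return copies of the track documents with their slices embedded."""
    by_track = {}
    for s in slices:
        # Drop the back-reference to the track; it is redundant once embedded.
        embedded = {k: v for k, v in s.items() if k != "track_id"}
        by_track.setdefault(s["track_id"], []).append(embedded)

    result = []
    for track in tracks:
        doc = copy.deepcopy(track)
        doc["slices"] = by_track.get(track["_id"], [])
        result.append(doc)
    return result


# Hypothetical data: one track with a file, one known only by name.
tracks = [
    {"_id": "track-1", "type": "track", "name": "Example Song"},
    {"_id": "track-2", "type": "track", "name": "Wish List Song"},
]
slices = [
    {"track_id": "track-1", "audiofile_id": "file-1", "start": 0.0, "end": 230.0},
]
denormalized = embed_slices(tracks, slices)
```

A track without any audio file simply ends up with an empty ``slices`` list,
so wish-list tracks still fit this layout.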


Missing concepts
================

 * Various computers/devices owned
 * their storage locations (some of which might be shared across devices)
 * sync/caching rules
 * playlists

Questions
=========

* What's the best way to query for all tracks that have online audio files ?
* Does it make sense to build aggregate tables on each db update somehow ?
