Right, the problem with building a list of counts in a batch is what happens if a song is added while you are building the counts.
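To make the race concrete, here is a rough Python sketch. It is only an in-memory simulation: the `songs` and `playlists` dicts and the two helper functions are hypothetical stand-ins for the two tables and the batch job, not Cassandra calls. It shows the count snapshot going stale between the scan and the delete.

```python
# Hypothetical in-memory stand-ins for the two tables (not Cassandra API calls).
songs = {"s1": b"blob-1", "s2": b"blob-2"}   # song id -> song data
playlists = {"p1": {"s1"}}                    # playlist id -> referenced song ids


def build_counts(playlists, songs):
    """Batch step: scan every playlist and count references per song."""
    counts = {song_id: 0 for song_id in songs}
    for refs in playlists.values():
        for song_id in refs:
            if song_id in counts:
                counts[song_id] += 1
    return counts


def delete_orphans(songs, counts):
    """Delete every song whose snapshot count is zero; return the deleted ids."""
    orphans = [song_id for song_id, n in counts.items() if n == 0]
    for song_id in orphans:
        songs.pop(song_id, None)
    return orphans


# 1. The batch job takes its snapshot: s2 looks orphaned.
counts = build_counts(playlists, songs)

# 2. Before the job acts on it, a user adds s2 to a new playlist.
playlists["p2"] = {"s2"}

# 3. The job deletes based on the stale snapshot.
deleted = delete_orphans(songs, counts)

# Song ids that some playlist still references but that no longer exist.
dangling = {sid for refs in playlists.values() for sid in refs if sid not in songs}
print("deleted:", deleted, "dangling references:", sorted(dangling))
```

In a real cluster the gap between steps 1 and 3 is however long the scan takes, so the window for this is much wider than it looks here.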
On Wed, Feb 26, 2014 at 10:32 AM, Green, John M (HP Education) <[email protected]> wrote:

> Edward,
>
> Thanks for your insight.
>
> One other thought I had was to store a reference count with the "song".
> When the last "playlist" referencing the "song" is deleted the "song" will
> also be deleted because the reference count decrements to zero. However,
> this would create some nastiness when it comes to reliably maintaining
> reference counts. I'm not sure if it would help to split the reference
> count into two monotonically increasing counters (number of references
> added, and number of references deleted).
>
> In my case, users cannot browse a repository of "songs" to build a
> playlist from scratch. They can only import "songs" themselves or create
> references to "songs" other users have explicitly made available to them.
> Once a "song" is not referred to by any "playlist" it will never be
> re-discovered so it should be deleted. This could be done in some sort of
> background data maintenance job that runs periodically. Even if it is a
> low-priority background job it looks like it will create a lot of overhead
> (scanning and producing counts).
>
> John
>
> *From:* Edward Capriolo [mailto:[email protected]]
> *Sent:* Wednesday, February 26, 2014 5:56 AM
> *To:* [email protected]
> *Subject:* Re: Naive question about orphan rows
>
> It is probably ok to have redundant songs in playlists, Cassandra is about
> denormalization.
>
> Dealing with this issue is going to be hard since the only way to deal
> with this would be scanning through the first CF and producing counts, then
> using that information to delete in the second table. However, that
> information can change rapidly and then will fall out of sync fast.
>
> The only ways to handle this are
>
> 1) never delete songs
> 2) store copies of songs in playlist
>
> On Friday, February 21, 2014, Green, John M (HP Education) <
> [email protected]> wrote:
>
> I'm very much a newbie so this may be a silly question but ...
>
> I have a situation similar to the music service example (
> http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html)
> of songs and playlists. However, in my case, the "songs" would be
> considered orphans that should be deleted when no "playlists" refer to
> them. Relational databases have mechanisms to manage this relationship so
> that a "song" could be deleted as soon as the last "playlist" referencing
> it is deleted. While I do NOT need to manage this as an atomic
> transaction, I'm wondering what is the best way to delete orphaned rows
> (i.e., "songs" not referenced by any "playlists") using Cassandra.
>
> I guess an alternative approach would be to store "songs" directly in
> the "playlists" but this could lead to many redundant copies of the same
> "song" which is something I'm hoping to avoid. In my case the "playlists"
> could have thousands of entries and the "songs" might be blobs of tens of
> Mbytes. Maybe I'm just having a hard time abandoning my relational roots?
>
> John
>
> --
> Sorry this was sent from mobile. Will do less grammar and spell check than
> usual.
