Thanks for your help on this.  I am pretty sure I have the query down as I
need.  I used a variation of your UNION'ed query.

Since we don't have one ISDN for each title, I stripped the suffix off of
the ID (-01 or -03) to artificially create a unique ID for each title.

Thanks again!

>From: "Cornillon, Matthieu (Consultant)" <[EMAIL PROTECTED]>
>Reply-To: [EMAIL PROTECTED]
>To: CF-Talk <[EMAIL PROTECTED]>
>Subject: RE: Comparing two records within the query
>Date: Thu, 9 Sep 2004 11:54:22 -0400
>
> >> Another option is to do a self-join.
>
> > This works, though I am not getting all rows returned and I haven't
>quite
>
> > figured why yet.  I hate to rely on the titles being exactly the same.  
>I
>
> > have data where one title contains the words DVD while the VHS version
> > doesn't.
>
>  Unfortunately, you are running into a fundamental data identity issue
>here.
>Here, identity is "a particular movie, regardless of format".  The goal of
>your application is to return a list of identities (as defined above), with
>available format information included.
>
>If you had some unique identifier (similar to ISBN) for movies, something
>that said 002943018 = "What's Eating Gilbert Grape?" regardless of whether
>it was VHS, Beta, DVD, Laserdisc, or celluloid, you could do this
>comparison
>flawlessly and easily.  Unfortunately, I am guessing that you have no such
>unique identifier, and further that no such identifier even exists.
>
>(Interesting idea: use IMDBs data.  They identify "What's Eating Gilbert
>Grape" as tt0108550.  If you used this, you could even dynamically generate
>links to IMDB for more information.  You would have to create a table of
>titles and these unique IDs somehow, and this would be a bunch of work,
>but--as I said--interesting idea.  If you link to them, they might not even
>mind, in which case you could ask them to send you a list of titles and
>IDs.
>Again, just a brainstorm.)
>
>Back to the issue at hand: if you had such an identifier, you would be all
>set.  But you don't.  So, you are using title, which is pretty good.  But
>you have problems.  "The Manchurian Candidate" is not good as an identity:
>it could mean either the 1962 or the 2004 version.  Also, you have this
>format data appended.
>
>I recommend that you add a column to the table that is "cleaned title".  
>You
>then cycle through the database and remove that VHS and DVD note.  While
>you
>do it, you could create a mediaformat column as suggested before.  You
>could
>leave the original title as is, if you want, or go to the cleaned title.  I
>see no way around doing at least this.
>
>Of course, once you have done that, the question is why you don't have a
>relational database with one table that holds videos where each row is a
>unique identity, and another table that holds stock, indicating, among
>other
>things, what media format the stock item is.
>
>If you don't split the table like this, I would still recommend that you
>create some sort of internal unique ID that you use to identify a movie
>entity.  Others might know more on this issue, but my guess is that
>comparing titles (even after the cleaning described above) will be slower
>than comparing numeric values, thereby leading to faster joins.
>
>HTH,
>Matthieu
>
>
>
[Todays Threads] [This Message] [Subscription] [Fast Unsubscribe] [User Settings] [Donations and Support]

Reply via email to