So the ideal use case for our xISBN cache is that we would be querying
only a local database, and that database would only return ISBNs (or bib
numbers) of other editions which are actually in our catalog.

Very nice idea.


My guess is that we should have a row for each ISBN in the system, along
with a column that links that ISBN to some common identifier that will
symbolically mean e.g. "SQL for Dummies, any version".  We could then
ask "What is the identifier for the first ISBN of the item being
displayed?" and then do a second query that asks "What other ISBNs in
our catalog have that identifier?"

My gut instinct is that while this is nice, I'm wondering if for this
particular project it couldn't be simplier.  Why not just simply have a
two column table, both being ISBNs.  Something like source,related.  Then
you simply feed in the results for every hit from the xISBN service.

And of course you might want to refresh this over time, or try to use some
other techniques.   (Ie occasionally look for places where source isbn has
related isbn, but related does not have source.).

Of course, there would be a definite advantage to be able to have the
identifier.  I don't know what number you'd use.  I would think you'd just
have to use a internal id (auto-incrementing or something of that nature).


The only remaining question, I think, is whether we should have one
table with ISBN as primary key and identifier as a second column (one
row per ISBN), and then another table with identifier as primary key and
ISBN as second column (again, one row per ISBN).  This could make our
queries faster at the cost of doubling storage space.  On the other
hand, perhaps we don't need to optimize that far.  It depends on how
mySQL reacts to searching the entire table for a non-primary-key column
value.  I'll have to fiddle with it and find out, unless one of you
knows already.


Nope, sounds like a fiddling thing to me.  I'd suspect that a two column
table should be optimized for lookups in either direction since they tend
to be used in joining queries (ie are a relation).  But I don't know the
behavior of MySQL by default.

Jon Gorman

Reply via email to