That's why I'd love to know whether the xISBN database uses a common
identifier for each set of ISBNs, and whether (and I know 'pretty
please' is a poor justification for changing an API) it might be exposed
for this reason.


Hopefully the OCLC people can answer that.  It might be in the work Andy
suggested yesterday.  One idea I had yesterday was that if you don't
care that much about the id internally, you could use an auto-increment.

To clarify, we'll assume that any isbn in a set will return the same set
from xISBN.
I.e., asking for isbns related to a returns a, b and c.  Asking for b or c
should also return a, b and c.

So we can do as Andy suggested and start building our table by taking the
set of all current isbns, normalized a bit I'd imagine.

In a computationally-expensive method:

Start with the first isbn (x) and get the set of related isbns from xISBN
(A).  Iterate over every member of A, testing whether the member has
already been assigned to a group.  If it has, stop the loop and
assign x to the same group.  If none in A have been assigned a group,
start a new group and add x.  Then repeat for each remaining isbn.
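
A rough sketch of that loop in Python, in case it helps.  lookup_related
is a hypothetical stand-in for whatever call fetches the related set from
xISBN (I'm not assuming anything about the real API), and the group id is
just an auto-increment counter as suggested above.  It also leans on the
assumption that every isbn in a set returns the same set.

```python
from typing import Callable, Dict, Iterable, Optional, Set


def assign_groups(
    isbns: Iterable[str],
    lookup_related: Callable[[str], Set[str]],
) -> Dict[str, int]:
    """Map each isbn to an auto-increment group id.

    lookup_related is a hypothetical function that returns the set of
    isbns xISBN reports as related to the one given (the set A).
    """
    group_of: Dict[str, int] = {}
    next_id = 1  # auto-increment counter used as the group id

    for x in isbns:
        if x in group_of:
            continue  # duplicate in the input list, already handled

        group_id: Optional[int] = None
        for member in lookup_related(x):
            if member in group_of:
                # A related isbn already has a group: stop and reuse it.
                group_id = group_of[member]
                break

        if group_id is None:
            # Nothing in A has a group yet, so start a new one.
            group_id = next_id
            next_id += 1

        group_of[x] = group_id

    return group_of


# Made-up data standing in for xISBN responses.
fake_xisbn = {
    "a": {"a", "b", "c"},
    "b": {"a", "b", "c"},
    "c": {"a", "b", "c"},
    "d": {"d"},
}
groups = assign_groups(["a", "b", "c", "d"], lambda isbn: fake_xisbn[isbn])
```

Here a, b and c end up sharing one group id and d gets its own.  Note the
cost: one xISBN lookup per isbn, plus the scan over A each time.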

You'll have to do this every once in a while to make sure you're getting
all the new books.

Hopefully this makes up for the advice I gave yesterday ;).  I'm
sure you can come up with a better algorithm, though;
something about the backward lookup every time makes me think that
there's a better way.


ps.  Andy's right, normalization is a good, good thing.  The only reason I
suggested looking at the costs was that I thought it would be a lot easier
than trying to come up with a method to generate unique ids for a "group".
Since my grasp of FRBR/xISBN is a little shaky, I'll avoid any specific
terminology.
(Like I said in my original email, having an identifier for groups is a
definite advantage.)
