Hello,

On 24.07.2007 at 18:01, Graham Triggs wrote:

> On Tue, 2007-07-24 at 11:12 -0400, Robert Tansley wrote:
>> ... However, IDs may depend both on the object type and
>> related objects -- e.g. bitstream IDs may include the item ID as a
>> path component, or the version number etc -- and the ID scheme  
>> itself.
>
> IDs will (? should!) always be free of the context of the process of
> assignment.

On 24.07.2007 at 17:32, Ekaterina Pechekhonova wrote:

> Do you think it's appropriate for external IDs to support "internal"
> system hierarchy and version numbers ? (I am not arguing just  
> wondering)

I am arguing, and I agree with Graham's view that IDs should
not encode any metadata of the object identified. I hope my
story is not too lengthy; it is only loosely related
to the subject, but for me it fits in here.


The former programmer of our project changed the handle
manager to support child naming authorities. During my ongoing
effort to upgrade to 1.4x, I had to read and understand that
code, and I thought a bit about how Handles are created and
stored in the database.

We had to make this change to convince contributing parties
to put their data into our repository. They were not clear
about how they could get their metadata back if our
project loses funding (which is the case right now,
BTW). The export alone was not the whole answer, as the
persistent identifier would still depend on our system
remaining online.

Our changes are not of a quality that we could send
in as a patch right now, and following this discussion
I am not sure they will reach that state before becoming
obsolete. You will find some problems right away when
visiting our site. The most obvious one: communities that
constitute child naming authorities are not entities of
their own child naming authority but of the parent
authority, so the community remains the only dependent
item when it is moved to another repository.


Aside from that, why do I strongly believe that metadata
and identifier should not be mixed? First, it could lead
people to "forge" URLs because they believe they know how
the scheme works. This is happening right now with some
crawlers on our site. I am not yet sure that we don't
provide them with the wrong links ourselves, but I can't
see how we would do this right now (my logs are not
complete, and I will follow up on this issue).

Second, and more important, the current model, where IDs are
a sequence, is not the best solution either. The sequence
carries meta-information and might lead to contradictory
conclusions. A lower ID tells you that an item was
created before another item. But you can change the date
through the submission interface. This might be relevant
in cases where a submitter has to prove credibly that
his work was published before or after a certain date.
As admin of the repo, you would be the one who has to
explain where the difference between date order and ID
sequence of the two items comes from. I would object to
helping the lawyers with that. I am interested in the
integrity of my repo, but not in others using this fact
at will.

I think the current model should be changed to some
fixed-length hash value of sufficient length, but I don't know
how to do that with the means of the database. It should
be done by the db because the ID is a primary key of the
table where it is stored. Also, such a change needs to
support preexisting handles as well, probably requiring
an additional column in the table or other schema mods.
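
To make the idea a bit more concrete, here is a rough sketch in
Python with SQLite. It is only an illustration under several
assumptions: the table handle_demo, its columns, and the functions
new_opaque_id, register and resolve are all made up for this mail
and are not the DSpace schema or API. It also generates the random
value in the application and only lets the database enforce
uniqueness through the primary key, which is not quite "done by the
db" as I would prefer.

```python
import secrets
import sqlite3

ID_BYTES = 16  # 128 random bits, rendered as 32 hex characters


def new_opaque_id() -> str:
    """Return a fixed-length random identifier that encodes no metadata."""
    return secrets.token_hex(ID_BYTES)


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table for illustration only, not the real DSpace schema.
    conn.execute(
        """
        CREATE TABLE handle_demo (
            opaque_id     TEXT PRIMARY KEY,  -- new fixed-length identifier
            legacy_handle TEXT UNIQUE,       -- pre-existing handle, may be NULL
            resource_id   INTEGER NOT NULL
        )
        """
    )


def register(conn, resource_id, legacy_handle=None, max_tries=5):
    """Store a resource under a fresh opaque ID, retrying on ID collisions."""
    for _ in range(max_tries):
        candidate = new_opaque_id()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO handle_demo VALUES (?, ?, ?)",
                    (candidate, legacy_handle, resource_id),
                )
            return candidate
        except sqlite3.IntegrityError:
            clash = conn.execute(
                "SELECT 1 FROM handle_demo WHERE opaque_id = ?", (candidate,)
            ).fetchone()
            if clash is None:
                raise  # the conflict was not on the opaque ID
    raise RuntimeError("could not allocate a unique identifier")


def resolve(conn, identifier):
    """Look up a resource by its opaque ID or by its legacy handle."""
    row = conn.execute(
        "SELECT resource_id FROM handle_demo"
        " WHERE opaque_id = ? OR legacy_handle = ?",
        (identifier, identifier),
    ).fetchone()
    return row[0] if row else None
```

The extra legacy_handle column is the "additional column" mentioned
above: old sequential handles keep resolving, while newly issued
identifiers say nothing about creation order or hierarchy.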

Bye, Christian



_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
