Mark,
I am not a Longwell developer, and my comment is a diversion from
the core of your question. However, using collected metadata to
disambiguate author names has a significant amount of merit,
provided that the corpus of information relating to the author in
question contains a small enough range of name variants relative to
the amount of information collected.
By collected information I mean publications and associated metadata
that increase the level of confidence that two names are in fact the
same author. I was at a CNI Workshop in Washington at the end of last
month where this was discussed, and IMVHO (:)) provided that there is
some control over the generation of potential names, a sufficiently
populated triple store will contain the information and the
capabilities to generate links or additional relationships between
names with a level of confidence that they are the same. Longwell's
term vectors (I did get that right, it has term vectors, doesn't it?)
may also be able to cluster an author's style and language to add to
the confidence.
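
To make the idea a little more concrete, here is a rough sketch in
Python. It is not anything Longwell does; the function names, the
(surname, first initial) matching rule and the use of Jaccard overlap
as the "confidence" are all assumptions chosen purely for
illustration. The evidence sets stand in for whatever collected
metadata (co-authors, subjects, affiliations) is available.

```python
def name_key(name):
    """Reduce a name variant to a (surname, first initial) pair, lowercased.

    Hypothetical matching rule: handles "Given Surname" and "Surname, Given".
    """
    name = name.strip()
    if "," in name:
        # "Abelson, H." style: surname comes first
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        # "Hal Abelson" style: surname comes last
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return (surname.lower(), given[:1].lower())


def same_author_confidence(name_a, name_b, evidence_a, evidence_b):
    """Return a score in [0, 1] that two name variants are the same author.

    Names whose keys differ score 0. Otherwise the score is the Jaccard
    overlap of the metadata collected for each variant (an assumed measure).
    """
    if name_key(name_a) != name_key(name_b):
        return 0.0
    union = evidence_a | evidence_b
    if not union:
        return 0.0  # compatible names, but no metadata to support a link
    return len(evidence_a & evidence_b) / len(union)
```

A link between two variants would then only be generated when the
score clears whatever threshold you are comfortable with, which is
where the control over potential names comes in.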

Another factor that might help you in your search is other identity
references that might not originate from within your own store of
name pointers, but are represented as external URIs with a level of
trust/confidence, e.g. OpenSocial FOAF stores, OpenID etc.
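
As a sketch of how such external references might feed into the
confidence, the Python below combines the trust values of the
external identifiers that two names share. The noisy-OR combination,
the function name and the example URIs are my own assumptions for
illustration, not part of FOAF, OpenID or Longwell.

```python
def combined_confidence(shared_identifiers):
    """Combine per-source trust values with a noisy-OR (assumed rule).

    shared_identifiers maps an external URI that both names point to
    onto the trust placed in that source, a number between 0 and 1.
    Each extra shared identifier can only raise the result.
    """
    disbelief = 1.0
    for trust in shared_identifiers.values():
        disbelief *= 1.0 - trust
    return 1.0 - disbelief


# Hypothetical identifiers shared by "Hal Abelson" and "Abelson, H."
shared = {
    "https://example.org/openid/habelson": 0.9,
    "https://example.org/foaf/habelson#me": 0.5,
}
confidence = combined_confidence(shared)
```

With no shared identifiers the function returns 0.0, so external
references only ever add to whatever confidence the local store
already provides.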

As I said, not a Longwell developer, but an interested observer.
Ian





On 4 Mar 2008, at 23:53, Mark Diggory wrote:

> Longwell Developers,
>
> Well, the subject's a little "obscure" and I'm not sure I have the
> right terminology, but here's the question in more detail.
>
> In DSpace we can have an Item with an Author name that is for the
> same person but has multiple variants....
>
>> Hal Abelson
>> H. Abelson
>> Abelson, H.
>> Abelson, Hal
>
>
> The resulting import into Longwell would maintain each of these
> values separately. (excuse my poor n3 abilities)...
>
>> <http://hdl.handle.net/1721.1/37585> <dc:contributor> "Hal Abelson"
>> <http://hdl.handle.net/1721.1/38487> <dc:contributor> "H. Abelson"
>> <http://hdl.handle.net/1721.1/38487> <dc:contributor> "Abelson, H."
>> <http://hdl.handle.net/1721.1/37600> <dc:contributor> "Abelson, Harold"
>
>
> (And certainly they are each valid variants of Hal Abelson's name).
> I'm concerned that "some of us" out there perceive it to be Longwell's
> current capability that by simply "adding" RDF statements that
> designate that these are equivalents, Longwell will magically
> allow you to have these reduced to an "agreed upon" single value such
> that (I don't really know n3... I'm making this up as I go)...
>
>> "Hal Abelson" <owl:sameAs> <xxxx>
>> "H. Abelson" <owl:sameAs> <xxxx>
>> "Abelson, H." <owl:sameAs> <xxxx>
>> "Abelson, Harold" <owl:sameAs> <xxxx>
>
>
> where
>
>> <xxxx> <rdfs:label> "Harold Abelson"
>
> And that by adding these sorts of statements to Longwell (or
> something "like" them), it will begin replacing those values with
> that "Label"? That maintaining such mappings in Longwell will allow
> Longwell to magically clean our metadata and reduce duplicate values
> that occur in the facets?
>
> I ask this because my interpretation of what you could do with
> Longwell was that you could develop a Sail for Sesame that was able
> to "filter" such equivalencies, but you had to know them "long
> before" your data was actually placed into Longwell.  That basically
> you're just "filtering" the data before it gets stored... Not very
> exciting... why not just do it before you send Longwell the RDF in
> the first place?
>
> Any clarification on Longwell's "actual capabilities" in this area
> would seriously assist us in evaluating it as a valid tool to base a
> discovery UI on for DSpace and reduce any misconception that I feel
> is going on in our MIT Libraries group.  My concern is that Longwell is
> being perceived as a mechanism to "clean up" presentation of metadata,
> where I see its actual behavior to be more based on the old premise
> of "garbage in, garbage out".  My analysis needs to determine if our
> group is actually realistic in its expectations of Longwell's
> capability and correct those viewpoints if that is not actually the
> case.
>
> thanks,
> Mark
> ~~~~~~~~~~~~~
> Mark R. Diggory - DSpace Developer and Systems Manager
> MIT Libraries, Systems and Technology Services
> Massachusetts Institute of Technology
>
>
>
>
>
> _______________________________________________
> General mailing list
> [email protected]
> http://simile.mit.edu/mailman/listinfo/general
