Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Bruce D'Arcus


On Apr 5, 2006, at 11:50 AM, Matt Price wrote:


Should it be a little more extensive here?  So, for instance:  I am
extremely disorganized, and in the absence of a satisfactory
bibliographic solution have dealt with various bibs in the last few
years.  On one paper I use one bib, for another project I may have a
wholly different one.  So should the uri be:

   person:[EMAIL PROTECTED]:SOME_HASH_HERE:smith99


I'm not really sure exactly what it should be, but yeah, it'd take some 
thought.



I should also add that using uris for association is likely what will
be the outcome of the metadata work at the ODF TC. It provides a
standard and general mechanism to link content and metadata.

How's that?


do you guys have some docs on this emerging standard?


It's not emerging; it's already widely used:

http://en.wikipedia.org/wiki/Uniform_Resource_Identifier

See how standards like RDF and XLink use uris for linking. One example 
of the former relevant to this discussion:


http://www.xml.com/pub/a/2004/06/02/dijalog.html

Bruce

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Matt Price
On Wed, Apr 05, 2006 at 12:18:27PM -0400, Bruce D'Arcus wrote:
 
 On Apr 5, 2006, at 12:12 PM, Matt Price wrote:
 
 sorry, I didn't mean URIs, I meant the metadata work at the ODF TC.
 
 OIC.
 
 There's nothing yet, but so long as we agree on allowing standard 
 embedded metadata, I believe there's consensus support for defining one 
 or more linking attributes that would associate content (like 
 citations) with that metadata.
 
 That was uncontroversial when we last talked about it at least.

In this context, does "standard embedded metadata" mean metadata
that follows already existing standards, or a new OASIS standard for
document metadata?

m

 
 Bruce
 
 

--
 .''`.   Matt Price 
: :'  :  Debian User
`. `'`   hemi-geek
  `- 
-- 




Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Matthias Steffens
On Wed, 5-Apr-2006 10:27 -0400, Bruce D'Arcus wrote:
  If I ping a server and give a list of isbns and dois and an
  optional username, problem solved; no?
 
  Yes and no. It depends on the quality of the bibliographic data.
[...]
 OK, fair enough. But we need to design the system to enable what I'm
 arguing for as ideal I think.

Yep, and this is a good thing, otherwise there wouldn't be any
progress! ;-) It's important to stay as backwards compatible as
possible, though.

 The fallback can well be that for some users with poor data sources,
 their citation uris are rather dumb uris like:
 
   person:[EMAIL PROTECTED]:smith99

Something like this would be good and I'd consider it equally important.

 But we need to start getting with the network here, so that it'll
 even be possible for a user to be reading a book they want to cite,
 add a citation by typing in the isbn in the citation id field, and
 OOoBib will grab that relevant record from the Library of Congress
 server. Likewise for DOIs.

Yes, that's a very nice feature. However, when I'm writing a paper, 95%
of the cited references do already exist in my bibliographic database
and I want to use these (and not a copy from somewhere else) since I
know that I've verified my own entries for correctness (multiple
times). The same cannot be said for any remotely fetched data and I'd
need to check each entry for correctness. This is just an example. My
point here is that it really depends on the user's specific needs.

On Wed, 5-Apr-2006 10:34 -0400, Bruce D'Arcus wrote:

 As I said, I know there are real world difficulties with this
 approach, but consider all the (much greater) problems of the
 alternative: every user has their own unique reference scheme. Two
 people collaborate on a document, one citing an article using xyz and
 the other the exact same article using 123. Imagine THAT headache!

That's a very good point. Still, I think that this, again, is
completely dependent on the user's individual needs. If you're writing
your thesis, collaboration may be less important. But it may be
absolutely crucial when writing a scientific paper together with your
co-authors.

In summary, I think that both keying methods (database-independent and
database-dependent) have major advantages and disadvantages. Thus, the
solution should be to simply allow for both methods. Ideally, multiple
identifiers would be stored and sent to the bibliographic database
which could then decide what to do. One logic could be: If the
database-dependent information (username, cite key, local record ID)
can be resolved, prefer this method to fetch the user's personal entry,
otherwise try to fetch the data from trusted sources (such as LoC)
using the database-independent identifiers.
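
The lookup order described above can be sketched as follows. This is only an illustration under stated assumptions: `resolve`, the `(username, cite_key)` key shape, and the two dictionaries standing in for a personal database and a trusted source such as LoC are all hypothetical, and the ISBN value is a placeholder.

```python
from typing import Optional


def resolve(citation: dict, local_db: dict, trusted: dict) -> Optional[dict]:
    """Prefer the user's own record; fall back to a trusted source."""
    # Database-dependent identifiers (hypothetical key shape).
    local_key = (citation.get("username"), citation.get("cite_key"))
    if local_key in local_db:
        return local_db[local_key]
    # Database-independent identifiers, tried in a fixed order.
    for scheme in ("doi", "isbn"):
        value = citation.get(scheme)
        if value is not None and (scheme, value) in trusted:
            return trusted[(scheme, value)]
    return None


# Stand-ins for a personal database and a remote trusted source.
local_db = {("matthias", "smith99"): {"title": "Verified local entry"}}
trusted = {("isbn", "0000000000"): {"title": "Remote entry"}}

own = resolve(
    {"username": "matthias", "cite_key": "smith99", "isbn": "0000000000"},
    local_db, trusted,
)
remote = resolve(
    {"username": "matt", "cite_key": "smith99", "isbn": "0000000000"},
    local_db, trusted,
)
```

Here `own` is the user's verified entry, while `remote` comes from the trusted source because that username and cite key cannot be resolved locally.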

Matthias




Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Bruce D'Arcus


On Apr 5, 2006, at 2:24 PM, Matthias Steffens wrote:

However, when I'm writing a paper, 95% of the cited references do 
already exist in my bibliographic database and I want to use these 
(and not a copy from somewhere else) since I know that I've verified 
my own entries for correctness (multiple times). The same cannot be 
said for any remotely fetched data and I'd need to check each entry 
for correctness. This is just an example.


OK, but I take it you're using RefBase; a single database?

What do you do for Matt, who has different databases, where the same 
reference has different local db numbers and cite keys?


FWIW, the way EndNote handles this is that citations include author and 
year, so if it can't find the proper record by id, it uses those to 
present the user with choices.
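
A rough sketch of that kind of fallback, for illustration only. It is not EndNote's actual algorithm; `find_candidates` and the record fields are hypothetical names, and the matching rule (same author, same year) is an assumption.

```python
def find_candidates(citation: dict, records: list) -> list:
    """Return [record] on an exact id match, else all author/year matches."""
    for record in records:
        if record["id"] == citation["id"]:
            return [record]
    # No record with that id: offer every record with the same author and year.
    return [
        record for record in records
        if record["author"].lower() == citation["author"].lower()
        and record["year"] == citation["year"]
    ]


records = [
    {"id": 12, "author": "Smith", "year": 1999, "title": "A"},
    {"id": 47, "author": "Smith", "year": 1999, "title": "B"},
    {"id": 50, "author": "Jones", "year": 2001, "title": "C"},
]

exact = find_candidates({"id": 47, "author": "Smith", "year": 1999}, records)
choices = find_candidates({"id": 999, "author": "smith", "year": 1999}, records)
```

When the id is unknown, `choices` holds both Smith 1999 records, and the user would have to pick one.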



My point here is that it really depends on the user's specific needs.


True.


Thus, the solution should be to simply allow for both methods.


OK.

Ideally, multiple identifiers would be stored and sent to the 
bibliographic database which could then decide what to do.


Yes, ideally. But I'm not sure how practical that is (to get 
implemented).


One logic could be: If the database-dependent information (username, 
cite key, local record ID) can be resolved, prefer this method to 
fetch the user's personal entry, otherwise try to fetch the data from 
trusted sources (such as LoC) using the database-independent 
identifiers.


I think I'd separate this out further:

1) how to identify (local vs. universal id)
2) how to locate (generic vs. user-based)

As I said before, one could use an isbn-based uri to grab a record from 
a local db.


My sense is that we could have rules and configuration options to set 
these options. Am not exactly sure what they'd be, but it probably 
wouldn't be too hard to figure out. Maybe:


For identifying citations, use:

universal identifiers (enhances portability)
user-specific labels
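
One way to picture the two separate settings (how to identify, how to locate) is as a small configuration object. This is a sketch only; the class and option names below are hypothetical and do not come from any existing OOoBib code.

```python
from dataclasses import dataclass
from enum import Enum


class IdScheme(Enum):
    UNIVERSAL = "universal"    # e.g. ISBN or DOI; enhances portability
    USER_LABEL = "user-label"  # e.g. a cite key such as smith99


class LocateVia(Enum):
    GENERIC = "generic"        # any server that knows the identifier
    USER_BASED = "user-based"  # a specific user's database account


@dataclass(frozen=True)
class CitationSettings:
    identify_with: IdScheme = IdScheme.UNIVERSAL
    locate_via: LocateVia = LocateVia.USER_BASED


# The two choices are independent: an isbn-based identifier can still be
# looked up in the user's own database.
settings = CitationSettings(IdScheme.UNIVERSAL, LocateVia.USER_BASED)
```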

Bruce




Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Matthias Steffens
On Wed, 5-Apr-2006 15:29 -0400, Bruce D'Arcus wrote:

  Ideally, multiple identifiers would be stored and sent to the
  bibliographic database which could then decide what to do.
 
  Yes, ideally. But I'm not sure how practical that is (to get
  implemented).
 
  The database would need to perform additional queries if the first
  choice doesn't result in a single record being found. I think that
  this is feasible though.
 
 I'm talking more at the document format level. The citation proposal
 only allows one key/id.

Does that mean that you can only specify one single identifier? Be it
ISBN, DOI, local record ID or user-specific cite key? You can't specify
multiple identifiers? This would mean that all our discussion is
meaningless, wouldn't it?

 Allowing more complicated coding in an already complicated spec would
 no doubt be controversial for the TC, and for implementors. Moreover,
 it would treat citations as a special class of object, which would
 also probably be controversial.

I understand that. But the TC folks should also understand that the
entire bibliographic database will be completely useless, if people
can't link to their own records. This will be a major frustration.
Being modern and ideal is nice, but if it only suits 5% of the crowd,
something is wrong.

  Sounds reasonable, but maybe it should read:
 
   For identifying citations, prefer:
 
  "Prefer" would indicate that both identifiers will be used but with
  different priorities.
 
 Yes, absolutely. And come to think of it, there should be another 
 config option for preferred sources, with optional user parameter(s).

Yes, personal info such as usernames may be different across the
various databases.

 As far as I can see, the ONLY reason to have a natural language key
 is because one doesn't have a universal identifier. Your concern is
 mostly about *where* you get your records from, not how you identify
 them (you want *your* records because you trust them).

Basically, yes. It's not only the source (*where*) but I want
specifically my own records (*your*). So even within the same source,
I don't want the buggy and incomplete records of my colleague but my own
ones.

 So for me, I'd want a rule that says to use universal ids wherever
 possible, and to fall back to a label I provide where necessary.

I'd say it the other way 'round: prefer my own records wherever
possible but fall back to universal ids where necessary, e.g. if nothing
is found or when collaborating with others.

 I'd also perhaps default to my database and user account, with
 options to ping other servers if data is missing.

Yes, exactly!

 Wouldn't that solve the problems with the best balance of concerns?

Yes.

Matthias




Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread David Wilson
On Thursday 06 April 2006 4:24 am, Matthias Steffens wrote:
 On Wed, 5-Apr-2006 10:27 -0400, Bruce D'Arcus wrote:


 Yes, that's a very nice feature. However, when I'm writing a paper, 95%
 of the cited references do already exist in my bibliographic database
 and I want to use these (and not a copy from somewhere else) since I
 know that I've verified my own entries for correctness (multiple
 times). The same cannot be said for any remotely fetched data and I'd
 need to check each entry for correctness. 

(If you wonder why I make late entries into some of the discussions - it's 
because I am not up at 2:30 am)

Yes, I agree: we cannot assume that library catalogues are correct - even the 
sainted US LOC. I was told recently that a common library cataloguing 
practice, and one used by my university, is that when a new book comes in to be 
catalogued, the cataloguer does a world-wide library search and copies the 
first cataloguing entry found. Now if they all do this, all the libraries have 
copies of the very first cataloguing entry produced for that book by X from 
library Y, and X may not be all that skilled at writing them because he or she 
mostly spends their time copying other libraries' efforts.
 
This also partly explains why books on the same topic are not always together 
on the shelves.

Also the libraries I have used often have problems collecting the books of 
one author under the same author listing. So you have books by 

Smith Fred, S
Smith Fred, S  (1934- )
Smith Fred, S  (1934-1987)

(Which will look poor in your dissertation, and be even worse if you assumed 
they were different people)

So the point is that collecting internet cataloguing data will not be a magic 
corrector of data. Useful, but it will still need checking by the user.
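
As a small illustration of the checking involved, the three headings above can be grouped by stripping the trailing life dates. This is a sketch under assumptions: `normalize_author` is a hypothetical helper and the pattern only covers the "(1934- )" and "(1934-1987)" forms shown here. Two different people with the same name would also be merged, so the result is a hint for the user, not an automatic correction.

```python
import re
from collections import defaultdict

# Trailing life dates such as "(1934- )" or "(1934-1987)".
_LIFE_DATES = re.compile(r"\s*\(\d{4}-\s*(\d{4})?\s*\)\s*$")


def normalize_author(heading: str) -> str:
    """Drop trailing life dates and collapse runs of whitespace."""
    without_dates = _LIFE_DATES.sub("", heading)
    return " ".join(without_dates.split())


headings = [
    "Smith Fred, S",
    "Smith Fred, S  (1934- )",
    "Smith Fred, S  (1934-1987)",
]

# Map each normalized heading to the raw variants that produced it.
groups = defaultdict(list)
for heading in headings:
    groups[normalize_author(heading)].append(heading)
```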

David
-- 
---
David N. Wilson
Co-Project Lead for the Bibliographic 
OpenOffice Project
http://bibliographic.openoffice.org




Re: [dev-biblio] Re: Re: embedded references/functional requirements wiki page

2006-04-05 Thread Bruce D'Arcus


On Apr 5, 2006, at 6:06 PM, Matthias Steffens wrote:


I'd also perhaps default to my database and user account, with
options to ping other servers if data is missing.


Yes, exactly!


Wouldn't that solve the problems with the best balance of concerns?


Yes.


So do the two yes responses suggest I don't need to respond to the 
previous objections?


Bruce
