Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-21 Thread Peter Noerr
For our federated search service we very much echo Jonathan's real-time 
requirements/use case (we don't build indexes, so bulk download is not of 
interest):

access - real-time query (purpose - to enhance data about items found by other 
means)
query - by standard IDs (generally this is "known item" augmentation, so 
"discovery" queries by keywords, etc are not so much required)
data format - almost anything "standard"  (we can translate it into the 
internal data model structure)
big value add - relationships, mainly the "upward" ones, towards work
data quantity - all details of directly related items, plus 2nd level links, 
possibly all details all the way up to (and including) the work (this is a 
trade-off of processing time on the service side to gather this information, 
and on our side to de-construct vs. the time to set up and manage multiple 
service calls to get the data about individual items in the link chain. In our 
experience it is almost always quicker to get it "all-at-once" than to send 
repeated messages, even if the total amount of data is less in the latter. But, 
mileage may vary here.) 
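
A minimal sketch of the trade-off described above, under stated assumptions: the service is simulated in memory, and the names FakeService, get_entity and get_chain, the record fields, and the entity IDs are all hypothetical (nothing here is an actual API). It only counts simulated round trips; real timings would depend on the service.

```python
# Hypothetical in-memory stand-in for a FRBRized data service.
# Entity IDs and field names are invented for illustration only.
RECORDS = {
    "m1": {"type": "manifestation", "title": "Score, 1st ed.", "parent": "e1"},
    "e1": {"type": "expression", "title": "Original notation", "parent": "w1"},
    "w1": {"type": "work", "title": "Symphony", "parent": None},
}


class FakeService:
    def __init__(self, records):
        self.records = records
        self.calls = 0  # number of simulated round trips

    def get_entity(self, entity_id):
        """One round trip per entity."""
        self.calls += 1
        return dict(self.records[entity_id])

    def get_chain(self, entity_id):
        """One round trip returning the entity and everything above it."""
        self.calls += 1
        chain = []
        current = entity_id
        while current is not None:
            record = dict(self.records[current])
            chain.append(record)
            current = record["parent"]
        return chain


def walk_up_one_by_one(service, entity_id):
    """Client-side traversal: one call per link in the chain."""
    chain = []
    current = entity_id
    while current is not None:
        record = service.get_entity(current)
        chain.append(record)
        current = record["parent"]
    return chain


step_service = FakeService(RECORDS)
stepwise = walk_up_one_by_one(step_service, "m1")

bulk_service = FakeService(RECORDS)
all_at_once = bulk_service.get_chain("m1")

# Same data either way; only the number of round trips differs.
print(step_service.calls, bulk_service.calls)
```

Both approaches return the same chain; the "all-at-once" call simply moves the traversal to the service side.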


Peter

> -Original Message-
> From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
> Jonathan Rochkind
> Sent: Wednesday, April 21, 2010 7:59 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] What do you want out of a frbrized data web
> service?
> 
> So, okay, the "value added" stuff you have will indeed be relationships
> between entities, which is not too unexpected.
> 
> So, yes, I would want a real-time query service (for enhancement of
> individual items on display in my system) _as well as_ a bulk download
> (for enhancing my data on indexing).
> 
> For real time query, I'd have a specific entity at hand roughly
> corresponding to a 'manifestation'.  I'd want to look it up in your
> system by any identifiers I have (oclcnum, lccn, isbn, issn; any other
> music-related identifiers that are useful?) to find a match. Then I'd
> want to find out its workset ID (or possibly expression ID?) in your
> system, and be able to find all the OTHER manifestations/expressions in
> those sets, from your system, with citation details about those items.
> (Author, title, publisher, year, etc; also oclcnum/lccn/isbn/issn/etc
> if available. Just giving me Marc with everything might be sufficient).
> If you have work identifiers from other systems that correspond to your
> workID (OCLC workID? etc), I'd want to know those.
> 
> For bulk download, yeah, I'd just want everything you could give me,
> really.
> 
> Some of the details can't really be spec'd in advance, it requires an
> iterative process of people trying to use it and seeing what they need.
> I know this makes things hard from a grant-funded project management
> perspective.
> 
> Jonathan
> 
> Riley, Jenn wrote:
> > On 4/20/10 7:18 PM, "Jonathan Rochkind"  wrote:
> >
> >
> >> But first, to really answer the question, we need some more information
> >> from you. What data do you actually have of value? Just saying "we have
> >> FRBRized data" doesn't really tell me, "FRBRized data" can be almost
> >> anything, really.   Can you tell us more about what value you think
> >> you've added to your data as a result of your "FRBRization"?  What do
> >> you have that wasn't there before?  Better relationships between
> >> manifestations?  Something else?
> >>
> >
> > Heh, I was intentionally vague in an attempt to avoid skewing the discussion
> > in certain directions, but I was obviously *too* vague - my apologies. Here
> > are the sorts of things we'd imagined and are looking to prioritize:
> >
> > - Give me a list of all manifestations that match some arbitrary query terms
> > - Given this manifestation identifier, show me all expressions on it and
> > what works they realize
> > - Give me a list of all works that match some arbitrary query terms
> > - Given this work identifier, show all expressions and manifestations of it
> > - Show me all of the people who match some arbitrary query terms (women
> > composers in Vienna in the 1860s, for example)
> > - Which works have expressions with this specific relationship to this
> > particular known person?
> >
> > Basically we're exploring when we should support queries as words vs.
> > previously-known identifiers, when a response will all be a set of records
> > for the same entity vs. several different ones with the 

Re: [CODE4LIB] What do you want out of a FRBRized data web service?

2010-04-21 Thread Karen Coyle

Quoting "Ziso, Ya'aqov" :


Karen Coyle,

By ‘create entities’ (below) is it NECESSARY to create records (and   
keep them up-to-date), or is it possible/preferable to create them   
on the fly?

./Ya’aqov


the display is created on the fly. the entities need to be somehow  
embodied as entities in the database. But the entity is, for example,  
the author, not that whole page of information. It's an identified  
"thing" in the database that can have relationships with other  
"things" -- which can then be brought into the display.


You can see how OL has defined things by looking through the list of types:

http://openlibrary.org/type

Everything that we consider a "data element" is a type, and types can  
be created that contain other types (look at author as an example).  
This makes for a lot of flexibility, and in theory any type could be  
treated as an entity (although it doesn't make sense for all of them).  
I'm sure this is only one of many ways to do this...


kc



It would be ideal to have an actual entity for each of the FRBR
1, 2 and 3 entities. We could even create entities that aren't
exactly in FRBR, such as for publication dates, publishers,
languages. And the main view is not of a single entity, but an
entity in relation to other entities. What's nearby? What happens
when I combine these two? (Also see WorldCat Identities as an
example of data that can be shown in relation to a person entity.)







--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234  

skype: kcoylenet


Re: [CODE4LIB] What do you want out of a FRBRized data web service?

2010-04-21 Thread Ziso, Ya'aqov
Karen Coyle,

By ‘create entities’ (below) is it NECESSARY to create records (and keep them 
up-to-date), or is it possible/preferable to create them on the fly?
./Ya’aqov

It would be ideal to have an actual entity for each of the FRBR 1, 2 and 
3 entities. We could even create entities that aren't exactly in FRBR, such as 
for publication dates, publishers, languages. And the main view is not of a 
single entity, but an entity in relation to other entities. What's nearby? What 
happens when I combine these two? (Also see WorldCat Identities as an example 
of data that can be shown in relation to a person entity.) 


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-21 Thread Karen Coyle

Quoting "Riley, Jenn" :


Hi all,


So if there were FRBRized data available to you (at
least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what
would you do with it? What kinds of questions would your service (discovery
system, whatever) ask a service that made this data available? What kinds of
information would you want in a response? Would you have uses that called
for downloading of "all" data at once or would you instead be better off
with real-time queries to a web service? It's questions like that we're
interested in brainstorming with this group about.


Take a look at the Open Library's use of entities:
  http://upstream.openlibrary.org

When you search on a subject, you get a subject entity/page. When you  
search on an author, you get an author entity/page. Information about  
the subject and the author, plus the relationships available in  
current data (related subjects, related persons, etc.), is all there.


For users, I think that the group 2 and 3 entities and the various  
relationships are key to discovery. After that, what relationships you  
can derive from the data gives users a way to navigate (rather than  
search).


It would be ideal to have an actual entity for each of the FRBR 1, 2  
and 3 entities. We could even create entities that aren't exactly in  
FRBR, such as for publication dates, publishers, languages. And the  
main view is not of a single entity, but an entity in relation to  
other entities. What's nearby? What happens when I combine these two?  
(Also see WorldCat Identities as an example of data that can be shown  
in relation to a person entity.)


kc





Basically, what type of access to the data we're generating is most
important, since we have finite resources to expend on this right now.

Thanks, all!

Jenn

[1] http://www.loc.gov/cds/downloads/FRBR.PDF
[2] http://vfrbr.info


Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com





--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-21 Thread Jonathan Rochkind
So, okay, the "value added" stuff you have will indeed be relationships 
between entities, which is not too unexpected.


So, yes, I would want a real-time query service (for enhancement of 
individual items on display in my system) _as well as_ a bulk download 
(for enhancing my data on indexing).


For real time query, I'd have a specific entity at hand roughly 
corresponding to a 'manifestation'.  I'd want to look it up in your 
system by any identifiers I have (oclcnum, lccn, isbn, issn; any other 
music-related identifiers that are useful?) to find a match. Then I'd 
want to find out its workset ID (or possibly expression ID?) in your 
system, and be able to find all the OTHER manifestations/expressions in 
those sets, from your system, with citation details about those items. 
(Author, title, publisher, year, etc; also oclcnum/lccn/isbn/issn/etc if 
available. Just giving me Marc with everything might be sufficient).   
If you have work identifiers from other systems that correspond to your 
workID (OCLC workID? etc), I'd want to know those.
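
A rough sketch of the real-time lookup described above, assuming a hypothetical in-memory data set. The record layout, the function names (build_identifier_index, lookup_related) and every identifier value are invented for illustration; a real service would expose this over HTTP in whatever format it chooses.

```python
# Hypothetical manifestation records; all identifier values are placeholders.
MANIFESTATIONS = {
    "m1": {"workset": "ws1", "title": "Sonata (score)", "publisher": "Press A",
           "year": 1990, "ids": {"isbn": "isbn-0001", "oclcnum": "oclc-0001"}},
    "m2": {"workset": "ws1", "title": "Sonata (recording)", "publisher": "Label B",
           "year": 2001, "ids": {"isbn": "isbn-0002"}},
    "m3": {"workset": "ws2", "title": "Quartet (score)", "publisher": "Press A",
           "year": 1985, "ids": {"lccn": "lccn-0003"}},
}


def build_identifier_index(manifestations):
    """Map (scheme, value) pairs to manifestation IDs."""
    index = {}
    for mid, record in manifestations.items():
        for scheme, value in record["ids"].items():
            index[(scheme, value)] = mid
    return index


def lookup_related(manifestations, index, known_ids):
    """Try each identifier the caller has; return the workset and its other members."""
    for scheme, value in known_ids.items():
        mid = index.get((scheme, value))
        if mid is not None:
            break
    else:
        return None  # none of the caller's identifiers matched

    workset = manifestations[mid]["workset"]
    others = [
        {"id": other_id, "title": rec["title"], "publisher": rec["publisher"],
         "year": rec["year"], "ids": rec["ids"]}
        for other_id, rec in sorted(manifestations.items())
        if rec["workset"] == workset and other_id != mid
    ]
    return {"matched": mid, "workset": workset, "others": others}


INDEX = build_identifier_index(MANIFESTATIONS)
result = lookup_related(MANIFESTATIONS, INDEX,
                        {"lccn": "no-such-lccn", "isbn": "isbn-0002"})
print(result["workset"], [o["id"] for o in result["others"]])
```

The first identifier that matches wins, which is one possible policy; a service could equally report conflicts when different identifiers point at different records.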


For bulk download, yeah, I'd just want everything you could give me, 
really.


Some of the details can't really be spec'd in advance, it requires an 
iterative process of people trying to use it and seeing what they need. 
I know this makes things hard from a grant-funded project management 
perspective.


Jonathan

Riley, Jenn wrote:

On 4/20/10 7:18 PM, "Jonathan Rochkind"  wrote:

  

But first, to really answer the question, we need some more information
from you. What data do you actually have of value? Just saying "we have
FRBRized data" doesn't really tell me, "FRBRized data" can be almost
anything, really.   Can you tell us more about what value you think
you've added to your data as a result of your "FRBRization"?  What do
you have that wasn't there before?  Better relationships between
manifestations?  Something else?



Heh, I was intentionally vague in an attempt to avoid skewing the discussion
in certain directions, but I was obviously *too* vague - my apologies. Here
are the sorts of things we'd imagined and are looking to prioritize:

- Give me a list of all manifestations that match some arbitrary query terms
- Given this manifestation identifier, show me all expressions on it and
what works they realize
- Give me a list of all works that match some arbitrary query terms
- Given this work identifier, show all expressions and manifestations of it
- Show me all of the people who match some arbitrary query terms (women
composers in Vienna in the 1860s, for example)
- Which works have expressions with this specific relationship to this
particular known person?

Basically we're exploring when we should support queries as words vs.
previously-known identifiers, when a response will all be a set of records
for the same entity vs. several different ones with the relationships
between them recorded, to what degree answering a query will involve
traversing lots of relationships - stuff like that. Having some real use
cases will help us decide what kind of a service to offer and what
technology we'll use to implement that service.

We do hope to also be able to publish Linked Data in some form - that's
probably going to come a little later, but it's definitely on "the list".

To answer one of your other questions, the V/FRBR project is focusing on
musical materials (scores and recordings) in particular, but we hope to set
up frameworks that would be useful for library bibliographic and authority
data in general.

Jenn


Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com


  


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-21 Thread Riley, Jenn
On 4/20/10 7:18 PM, "Jonathan Rochkind"  wrote:

> But first, to really answer the question, we need some more information
> from you. What data do you actually have of value? Just saying "we have
> FRBRized data" doesn't really tell me, "FRBRized data" can be almost
> anything, really.   Can you tell us more about what value you think
> you've added to your data as a result of your "FRBRization"?  What do
> you have that wasn't there before?  Better relationships between
> manifestations?  Something else?

Heh, I was intentionally vague in an attempt to avoid skewing the discussion
in certain directions, but I was obviously *too* vague - my apologies. Here
are the sorts of things we'd imagined and are looking to prioritize:

- Give me a list of all manifestations that match some arbitrary query terms
- Given this manifestation identifier, show me all expressions on it and
what works they realize
- Give me a list of all works that match some arbitrary query terms
- Given this work identifier, show all expressions and manifestations of it
- Show me all of the people who match some arbitrary query terms (women
composers in Vienna in the 1860s, for example)
- Which works have expressions with this specific relationship to this
particular known person?
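
To make two of the identifier-based queries in this list concrete, here is a small sketch under stated assumptions: relationships are held as plain (subject, object) pairs, and the entity IDs, the list names EMBODIES and REALIZES, and the function names are hypothetical, not anything the V/FRBR project has defined.

```python
# Hypothetical group 1 relationships, stored as simple pairs.
# A recording (manifestation) may embody several expressions.
EMBODIES = [  # (manifestation, expression)
    ("cd1", "perf1"),
    ("cd1", "perf2"),
    ("score1", "notation1"),
]
REALIZES = [  # (expression, work)
    ("perf1", "sonata"),
    ("perf2", "quartet"),
    ("notation1", "sonata"),
]


def works_for_manifestation(manifestation_id):
    """Given a manifestation, list its expressions and the works they realize."""
    result = {}
    for manifestation, expression in EMBODIES:
        if manifestation == manifestation_id:
            result[expression] = [w for e, w in REALIZES if e == expression]
    return result


def tree_for_work(work_id):
    """Given a work, list its expressions and the manifestations of each."""
    tree = {}
    for expression, work in REALIZES:
        if work == work_id:
            tree[expression] = [m for m, e in EMBODIES if e == expression]
    return tree


print(works_for_manifestation("cd1"))
print(tree_for_work("sonata"))
```

Both queries are two-hop traversals over the same pairs, just in opposite directions.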

Basically we're exploring when we should support queries as words vs.
previously-known identifiers, when a response will all be a set of records
for the same entity vs. several different ones with the relationships
between them recorded, to what degree answering a query will involve
traversing lots of relationships - stuff like that. Having some real use
cases will help us decide what kind of a service to offer and what
technology we'll use to implement that service.

We do hope to also be able to publish Linked Data in some form - that's
probably going to come a little later, but it's definitely on "the list".

To answer one of your other questions, the V/FRBR project is focusing on
musical materials (scores and recordings) in particular, but we hope to set
up frameworks that would be useful for library bibliographic and authority
data in general.

Jenn


Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-21 Thread Ziso, Ya'aqov
Here's a thought: 

John Riley in an authority record is linked through the 670 field (author of 
Cells today), where Cells today is the 245 in a bibliographic record. Let's 
assume there are about 4 John Rileys who wrote about cells, each in their own 
bib record. If any bibliographic record is part of a 'chain' of FRBRized 
manifestations, and one of these manifestations also includes a date (relevant 
to a certain Riley, John), a more detailed description of those cells, a 502 
(such as: Riley, John submitted a dissertation on a specific branch of 
chemistry), or a video with John Riley's picture, I can benefit by linking (via 
an API query) to that information to distinguish that Riley, John.

Jenn, Jonathan, does my scenario make sense?
Ya'aqov

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Jonathan 
Rochkind [rochk...@jhu.edu]
Sent: Tuesday, April 20, 2010 7:18 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] What do you want out of a frbrized data web service?

I started preparing a longer answer to this, and still will provide one
eventually.

But first, to really answer the question, we need some more information
from you. What data do you actually have of value? Just saying "we have
FRBRized data" doesn't really tell me, "FRBRized data" can be almost
anything, really.   Can you tell us more about what value you think
you've added to your data as a result of your "FRBRization"?  What do
you have that wasn't there before?  Better relationships between
manifestations?  Something else?

I forget, were you focusing on specific material types (music or moving
image?) in this project, or is this just general materials, covering the
gamut of what one would expect from a major academic library?  If you've
done special work with music or moving image, what is the nature of the
value added there?

Do these questions make sense?   To know how I might want to use the
data, I need to know a bit more about what you've actually got that's
useful, which "it's FRBRized" doesn't really tell me.

But as far as "do you want real-time queries to a web service, or bulk
download of the data?" -- yes, I'd want both, probably. Either one will
be the most convenient depending on what I'm trying to do. If you _had_
to pick one, it would be 'bulk download', because _anything_ is possible
with bulk download -- but for certain uses, it can take a lot more work
on my part for bulk download, so if that's all there is there, it will
be a higher barrier for use than if real-time web api was available.
But if _only_ real-time queries are available, then certain things are
just impossible (mainly indexing-time enhancement of my data).

Jonathan

Riley, Jenn wrote:
> Hi all,
>
> At Indiana University we're working on a project that will help us see
> concretely what FRBRized [1] library data and discovery systems might look
> like. [2] One of our project goals is to share the raw FRBRized data widely
> so that others can look at it to see how it's structured, reuse it, improve
> on it, comment on the FRBRization effectiveness, etc. We're planning on
> allowing remote/Web Services/API/SRU/some machine-to-machine method like
> that access to the data. As we're starting to think about how we should set
> that up, we thought it would be useful to gather some use cases from the
> code4lib community, as it's the folks here that are experimenting with
> services like this. So if there were FRBRized data available to you (at
> least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what
> would you do with it? What kinds of questions would your service (discovery
> system, whatever) ask a service that made this data available? What kinds of
> information would you want in a response? Would you have uses that called
> for downloading of "all" data at once or would you instead be better off
> with real-time queries to a web service? It's questions like that we're
> interested in brainstorming with this group about.
>
> Basically, what type of access to the data we're generating is most
> important, since we have finite resources to expend on this right now.
>
> Thanks, all!
>
> Jenn
>
> [1] http://www.loc.gov/cds/downloads/FRBR.PDF
> [2] http://vfrbr.info
>
> 
> Jenn Riley
> Metadata Librarian
> Digital Library Program
> Indiana University - Bloomington
> Wells Library W501
> (812) 856-5759
> www.dlib.indiana.edu
>
> Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com
>
>


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-20 Thread Jonathan Rochkind
I started preparing a longer answer to this, and still will provide one 
eventually.


But first, to really answer the question, we need some more information 
from you. What data do you actually have of value? Just saying "we have 
FRBRized data" doesn't really tell me, "FRBRized data" can be almost 
anything, really.   Can you tell us more about what value you think 
you've added to your data as a result of your "FRBRization"?  What do 
you have that wasn't there before?  Better relationships between 
manifestations?  Something else? 

I forget, were you focusing on specific material types (music or moving 
image?) in this project, or is this just general materials, covering the 
gamut of what one would expect from a major academic library?  If you've 
done special work with music or moving image, what is the nature of the 
value added there?


Do these questions make sense?   To know how I might want to use the 
data, I need to know a bit more about what you've actually got that's 
useful, which "it's FRBRized" doesn't really tell me. 

But as far as "do you want real-time queries to a web service, or bulk 
download of the data?" -- yes, I'd want both, probably. Either one will 
be the most convenient depending on what I'm trying to do. If you _had_ 
to pick one, it would be 'bulk download', because _anything_ is possible 
with bulk download -- but for certain uses, it can take a lot more work 
on my part for bulk download, so if that's all there is there, it will 
be a higher barrier for use than if real-time web api was available.  
But if _only_ real-time queries are available, then certain things are 
just impossible (mainly indexing-time enhancement of my data).  
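
As an illustration of the "indexing-time enhancement" that bulk download makes possible, here is a minimal sketch. The dump format (a two-column CSV), the field names and all identifier values are assumptions made up for this example; no actual dump format has been proposed.

```python
import csv
import io

# Hypothetical bulk dump: one row per identifier, with its workset ID.
BULK_DUMP = """identifier,workset_id
isbn:0000000001,ws1
isbn:0000000002,ws1
isbn:0000000003,ws2
"""


def load_workset_map(dump_text):
    """Parse the dump once, up front, into an identifier -> workset lookup."""
    reader = csv.DictReader(io.StringIO(dump_text))
    return {row["identifier"]: row["workset_id"] for row in reader}


def enhance_for_indexing(docs, workset_map):
    """Add a workset_id field to each local record before it is indexed."""
    enhanced = []
    for doc in docs:
        copy = dict(doc)
        copy["workset_id"] = workset_map.get(doc["identifier"])  # None if unknown
        enhanced.append(copy)
    return enhanced


local_docs = [
    {"identifier": "isbn:0000000002", "title": "Local record A"},
    {"identifier": "isbn:9999999999", "title": "Local record B"},
]
workset_map = load_workset_map(BULK_DUMP)
for doc in enhance_for_indexing(local_docs, workset_map):
    print(doc["title"], doc["workset_id"])
```

With only a real-time API, the same enhancement would need one query per local record at index time, which is the barrier described above.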


Jonathan

Riley, Jenn wrote:

Hi all,

At Indiana University we're working on a project that will help us see
concretely what FRBRized [1] library data and discovery systems might look
like. [2] One of our project goals is to share the raw FRBRized data widely
so that others can look at it to see how it's structured, reuse it, improve
on it, comment on the FRBRization effectiveness, etc. We're planning on
allowing remote/Web Services/API/SRU/some machine-to-machine method like
that access to the data. As we're starting to think about how we should set
that up, we thought it would be useful to gather some use cases from the
code4lib community, as it's the folks here that are experimenting with
services like this. So if there were FRBRized data available to you (at
least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what
would you do with it? What kinds of questions would your service (discovery
system, whatever) ask a service that made this data available? What kinds of
information would you want in a response? Would you have uses that called
for downloading of "all" data at once or would you instead be better off
with real-time queries to a web service? It's questions like that we're
interested in brainstorming with this group about.

Basically, what type of access to the data we're generating is most
important, since we have finite resources to expend on this right now.

Thanks, all!

Jenn

[1] http://www.loc.gov/cds/downloads/FRBR.PDF
[2] http://vfrbr.info


Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com

  


Re: [CODE4LIB] What do you want out of a frbrized data web service?

2010-04-20 Thread Robert Sanderson
Exposing the records as Linked Data, rather than just plain old XML
would be an interesting demonstration of how the library world can
generate and, more importantly, curate massive amounts of data.  They
could then be linked to and from by other resources/services -- for
example linking a copy of a book on Amazon as an Item to the
Manifestation it's drawn from could allow for powerful graph oriented
search.
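
A small sketch of the kind of graph-oriented lookup this would allow, assuming the data is reduced to plain subject/predicate/object triples. The "ex:" identifiers and predicate names are placeholders invented for this example, not terms from any published vocabulary, and a real deployment would use an RDF store rather than a Python set.

```python
# Hypothetical triples linking Items to a Manifestation and on up to a Work.
TRIPLES = {
    ("ex:amazon-copy-1", "ex:exemplarOf", "ex:manifestation-1"),
    ("ex:library-copy-7", "ex:exemplarOf", "ex:manifestation-1"),
    ("ex:manifestation-1", "ex:embodies", "ex:expression-1"),
    ("ex:expression-1", "ex:realizes", "ex:work-1"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def subjects(predicate, obj):
    """All subjects of triples with the given predicate and object."""
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == obj)


def works_for_item(item):
    """Follow item -> manifestation -> expression -> work."""
    nodes = [item]
    for predicate in ("ex:exemplarOf", "ex:embodies", "ex:realizes"):
        nodes = [o for node in nodes for o in objects(node, predicate)]
    return nodes


def sibling_items(item):
    """Other items that exemplify the same manifestation."""
    return sorted(
        s
        for manifestation in objects(item, "ex:exemplarOf")
        for s in subjects("ex:exemplarOf", manifestation)
        if s != item
    )


print(works_for_item("ex:amazon-copy-1"))
print(sibling_items("ex:amazon-copy-1"))
```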

Rob


On Tue, Apr 20, 2010 at 3:50 PM, Riley, Jenn  wrote:
> Hi all,
>
> At Indiana University we're working on a project that will help us see
> concretely what FRBRized [1] library data and discovery systems might look
> like. [2] One of our project goals is to share the raw FRBRized data widely
> so that others can look at it to see how it's structured, reuse it, improve
> on it, comment on the FRBRization effectiveness, etc. We're planning on
> allowing remote/Web Services/API/SRU/some machine-to-machine method like
> that access to the data. As we're starting to think about how we should set
> that up, we thought it would be useful to gather some use cases from the
> code4lib community, as it's the folks here that are experimenting with
> services like this. So if there were FRBRized data available to you (at
> least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what
> would you do with it? What kinds of questions would your service (discovery
> system, whatever) ask a service that made this data available? What kinds of
> information would you want in a response? Would you have uses that called
> for downloading of "all" data at once or would you instead be better off
> with real-time queries to a web service? It's questions like that we're
> interested in brainstorming with this group about.
>
> Basically, what type of access to the data we're generating is most
> important, since we have finite resources to expend on this right now.
>
> Thanks, all!
>
> Jenn
>
> [1] http://www.loc.gov/cds/downloads/FRBR.PDF
> [2] http://vfrbr.info
>
> 
> Jenn Riley
> Metadata Librarian
> Digital Library Program
> Indiana University - Bloomington
> Wells Library W501
> (812) 856-5759
> www.dlib.indiana.edu
>
> Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com
>


[CODE4LIB] What do you want out of a frbrized data web service?

2010-04-20 Thread Riley, Jenn
Hi all,

At Indiana University we're working on a project that will help us see
concretely what FRBRized [1] library data and discovery systems might look
like. [2] One of our project goals is to share the raw FRBRized data widely
so that others can look at it to see how it's structured, reuse it, improve
on it, comment on the FRBRization effectiveness, etc. We're planning on
allowing remote/Web Services/API/SRU/some machine-to-machine method like
that access to the data. As we're starting to think about how we should set
that up, we thought it would be useful to gather some use cases from the
code4lib community, as it's the folks here that are experimenting with
services like this. So if there were FRBRized data available to you (at
least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what
would you do with it? What kinds of questions would your service (discovery
system, whatever) ask a service that made this data available? What kinds of
information would you want in a response? Would you have uses that called
for downloading of "all" data at once or would you instead be better off
with real-time queries to a web service? It's questions like that we're
interested in brainstorming with this group about.

Basically, what type of access to the data we're generating is most
important, since we have finite resources to expend on this right now.

Thanks, all!

Jenn

[1] http://www.loc.gov/cds/downloads/FRBR.PDF
[2] http://vfrbr.info


Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com