Re: [CODE4LIB] What do you want out of a frbrized data web service?
For our fed search service we very much echo Jonathan's real-time requirements/use case (we don't build indexes, so bulk download is not of interest):

access - real-time query (purpose: to enhance data about items found by other means)
query - by standard IDs (generally this is "known item" augmentation, so "discovery" queries by keywords, etc. are not so much required)
data format - almost anything "standard" (we can translate it into our internal data model structure)
big value add - relationships, mainly the "upward" ones, towards the work
data quantity - all details of directly related items, plus 2nd level links, possibly all details all the way up to (and including) the work

The data quantity point is a trade-off: processing time on the service side to gather this information, and on our side to deconstruct it, vs. the time to set up and manage multiple service calls to get the data about individual items in the link chain. In our experience it is almost always quicker to get it "all-at-once" than to send repeated messages, even if the total amount of data is less in the latter. But mileage may vary here.

Peter

> -Original Message-
> From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind
> Sent: Wednesday, April 21, 2010 7:59 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] What do you want out of a frbrized data web service?
>
> So, yes, I would want a real-time query service (for enhancement of
> individual items on display in my system) _as well as_ a bulk download
> (for enhancing my data on indexing).
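The "all-at-once" vs. repeated-calls trade-off in Peter's message can be illustrated with a small in-memory stand-in for a service. This is only a sketch: the identifiers, the PARENT/DETAILS tables and the function names are all invented for illustration and do not describe any real API.

```python
# Hypothetical in-memory stand-in for a FRBRized data service.
# All identifiers, tables and function names are invented for illustration.
PARENT = {
    "isbn:0000000001": "expression:e1",   # manifestation -> expression
    "expression:e1": "work:w1",           # expression -> work
}
DETAILS = {
    "isbn:0000000001": {"type": "manifestation", "title": "Example score"},
    "expression:e1": {"type": "expression", "form": "notated music"},
    "work:w1": {"type": "work", "title": "Example work"},
}


def get_one(entity_id):
    """One 'service call': details for a single entity plus its upward link."""
    return {"id": entity_id, **DETAILS[entity_id], "parent": PARENT.get(entity_id)}


def get_chain(entity_id):
    """One 'service call' that returns the whole upward chain, work included."""
    chain = []
    current = entity_id
    while current is not None:
        chain.append(get_one(current))
        current = PARENT.get(current)
    return chain


def walk_with_repeated_calls(entity_id):
    """Client-side walk: one round trip per entity in the link chain."""
    calls, chain, current = 0, [], entity_id
    while current is not None:
        record = get_one(current)
        calls += 1
        chain.append(record)
        current = record["parent"]
    return chain, calls


chain_a = get_chain("isbn:0000000001")  # a single round trip
chain_b, round_trips = walk_with_repeated_calls("isbn:0000000001")
```

Both approaches return the same records; the second simply pays one round trip per link, which is the cost Peter suggests usually dominates.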
Re: [CODE4LIB] What do you want out of a FRBRized data web service?
Quoting "Ziso, Ya'aqov":

> Karen Coyle,
>
> By ‘create entities’ (below) is it NECESSARY to create records (and keep
> them up-to-date), or is it possible/preferable to create them on the fly?
>
> ./Ya’aqov

The display is created on the fly. The entities need to be somehow embodied as entities in the database. But the entity is, for example, the author, not that whole page of information. It's an identified "thing" in the database that can have relationships with other "things" -- which can then be brought into the display.

You can see how OL has defined things by looking through the list of types:

http://openlibrary.org/type

Everything that we consider a "data element" is a type, and types can be created that contain other types (look at author as an example). This makes for a lot of flexibility, and in theory any type could be treated as an entity (although it doesn't make sense for all of them).

I'm sure this is only one of many ways to do this...

kc

>> It would be ideal to have an actual entity for each of the FRBR 1, 2
>> and 3 entities. We could even create entities that aren't exactly in
>> FRBR, such as for publication dates, publishers, languages. And the
>> main view is not of a single entity, but an entity in relation to other
>> entities. What's nearby? What happens when I combine these two? (Also
>> see WorldCat Identities as an example of data that can be shown in
>> relation to a person entity.)

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
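Karen's distinction between a stored entity and a display built on the fly can be sketched as follows. This is a toy model with made-up keys and relationship names; it is not Open Library's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An identified 'thing' in the database (not a display page)."""
    key: str
    type: str
    label: str
    links: list = field(default_factory=list)  # (relationship, target key) pairs


DB = {}  # hypothetical entity store, keyed by entity key


def add(entity):
    """Store an entity and hand it back for further linking."""
    DB[entity.key] = entity
    return entity


author = add(Entity("/authors/a1", "author", "Example Author"))
work = add(Entity("/works/w1", "work", "Example Work"))
subject = add(Entity("/subjects/s1", "subject", "Example subject"))
work.links += [("created_by", author.key), ("about", subject.key)]


def render(key):
    """Build the display on the fly from the entity and its neighbours."""
    entity = DB[key]
    lines = [f"{entity.type}: {entity.label}"]
    for relationship, target in entity.links:
        lines.append(f"  {relationship} -> {DB[target].label}")
    return "\n".join(lines)


page = render("/works/w1")
```

Only the entities and their links are stored; the page text exists only for as long as it takes to display it.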
Re: [CODE4LIB] What do you want out of a FRBRized data web service?
Karen Coyle,

By ‘create entities’ (below) is it NECESSARY to create records (and keep them up-to-date), or is it possible/preferable to create them on the fly?

./Ya’aqov

> It would be ideal to have an actual entity for each of the FRBR 1, 2
> and 3 entities. We could even create entities that aren't exactly in
> FRBR, such as for publication dates, publishers, languages. And the
> main view is not of a single entity, but an entity in relation to other
> entities. What's nearby? What happens when I combine these two? (Also
> see WorldCat Identities as an example of data that can be shown in
> relation to a person entity.)
Re: [CODE4LIB] What do you want out of a frbrized data web service?
Quoting "Riley, Jenn":

> Hi all,
>
> So if there were FRBRized data available to you (at least for FRBR group
> 1 and group 2 entities; *maybe* group 3 as well), what would you do with
> it? What kinds of questions would your service (discovery system,
> whatever) ask a service that made this data available? What kinds of
> information would you want in a response? Would you have uses that
> called for downloading of "all" data at once or would you instead be
> better off with real-time queries to a web service? It's questions like
> that we're interested in brainstorming with this group about.

Take a look at the Open Library's use of entities:

http://upstream.openlibrary.org

When you search on a subject, you get a subject entity/page. When you search on an author, you get an author entity/page. Information about the subject and the author, plus the relationships that are available in current data (related subjects, related persons, etc.), are all there.

For users, I think that the group 2 and 3 entities and the various relationships are key to discovery. After that, what relationships you can derive from the data gives users a way to navigate (rather than search).

It would be ideal to have an actual entity for each of the FRBR 1, 2 and 3 entities. We could even create entities that aren't exactly in FRBR, such as for publication dates, publishers, languages. And the main view is not of a single entity, but an entity in relation to other entities. What's nearby? What happens when I combine these two? (Also see WorldCat Identities as an example of data that can be shown in relation to a person entity.)

kc

> Basically, what type of access to the data we're generating is most
> important, since we have finite resources to expend on this right now.

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Re: [CODE4LIB] What do you want out of a frbrized data web service?
So, okay, the "value added" stuff you have will indeed be relationships between entities, which is not too unexpected.

So, yes, I would want a real-time query service (for enhancement of individual items on display in my system) _as well as_ a bulk download (for enhancing my data on indexing).

For real-time query, I'd have a specific entity at hand roughly corresponding to a 'manifestation'. I'd want to look it up in your system by any identifiers I have (oclcnum, lccn, isbn, issn; any other music-related identifiers that are useful?) to find a match. Then I'd want to find out its workset ID (or possibly expression ID?) in your system, and be able to find all the OTHER manifestations/expressions in those sets, from your system, with citation details about those items. (Author, title, publisher, year, etc.; also oclcnum/lccn/isbn/issn/etc. if available. Just giving me MARC with everything might be sufficient.) If you have work identifiers from other systems that correspond to your workID (OCLC workID? etc.), I'd want to know those.

For bulk download, yeah, I'd just want everything you could give me, really.

Some of the details can't really be spec'd in advance; it requires an iterative process of people trying to use it and seeing what they need. I know this makes things hard from a grant-funded project management perspective.

Jonathan
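The real-time lookup Jonathan describes (match on any identifier at hand, find the workset, then list the other manifestations in it) might look roughly like this from the client's point of view. The tables, identifier values and function names are hypothetical placeholders, not part of any existing service.

```python
# Hypothetical data standing in for the service's index.
# (identifier scheme, value) -> manifestation id
ID_INDEX = {
    ("isbn", "0000000001"): "m1",
    ("oclcnum", "1111"): "m1",
    ("lccn", "2222"): "m2",
}
MANIFESTATIONS = {
    "m1": {"title": "Example score", "year": 1990, "workset": "ws1"},
    "m2": {"title": "Example score, 2nd ed.", "year": 2001, "workset": "ws1"},
    "m3": {"title": "Unrelated recording", "year": 1985, "workset": "ws2"},
}


def find_manifestation(identifiers):
    """Return the first manifestation matching any (scheme, value) pair."""
    for pair in identifiers:
        match = ID_INDEX.get(pair)
        if match is not None:
            return match
    return None


def siblings(manifestation_id):
    """Return the workset ID and the OTHER manifestations in that workset."""
    workset = MANIFESTATIONS[manifestation_id]["workset"]
    others = {
        mid: record
        for mid, record in MANIFESTATIONS.items()
        if record["workset"] == workset and mid != manifestation_id
    }
    return workset, others


# Try every identifier we happen to have; the ISSN here matches nothing.
hit = find_manifestation([("issn", "none"), ("oclcnum", "1111")])
workset_id, others = siblings(hit)
```

Trying identifiers in order and stopping at the first match is only one possible policy; a real client might prefer to check that all identifiers agree.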
Re: [CODE4LIB] What do you want out of a frbrized data web service?
On 4/20/10 7:18 PM, "Jonathan Rochkind" wrote:

> But first, to really answer the question, we need some more information
> from you. What data do you actually have of value? Just saying "we have
> FRBRized data" doesn't really tell me, "FRBRized data" can be almost
> anything, really. Can you tell us more about what value you think
> you've added to your data as a result of your "FRBRization"? What do
> you have that wasn't there before? Better relationships between
> manifestations? Something else?

Heh, I was intentionally vague in an attempt to avoid skewing the discussion in certain directions, but I was obviously *too* vague - my apologies. Here are the sorts of things we'd imagined and are looking to prioritize:

- Give me a list of all manifestations that match some arbitrary query terms
- Given this manifestation identifier, show me all expressions on it and what works they realize
- Give me a list of all works that match some arbitrary query terms
- Given this work identifier, show all expressions and manifestations of it
- Show me all of the people who match some arbitrary query terms (women composers in Vienna in the 1860s, for example)
- Which works have expressions with this specific relationship to this particular known person?

Basically we're exploring when we should support queries as words vs. previously-known identifiers, when a response will all be a set of records for the same entity vs. several different ones with the relationships between them recorded, to what degree answering a query will involve traversing lots of relationships - stuff like that. Having some real use cases will help us decide what kind of a service to offer and what technology we'll use to implement that service.

We do hope to also be able to publish Linked Data in some form - that's probably going to come a little later, but it's definitely on "the list".

To answer one of your other questions, the V/FRBR project is focusing on musical materials (scores and recordings) in particular, but we hope to set up frameworks that would be useful for library bibliographic and authority data in general.

Jenn

Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com
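Three of the query types in Jenn's list (identifier-based traversal down from a work, up from a manifestation, and a keyword search over works) can be sketched over a tiny relationship table. The table names, IDs and titles below are invented for illustration and say nothing about how V/FRBR actually stores or serves its data.

```python
# Hypothetical relationship tables for a handful of group 1 entities.
EMBODIES = {"m1": ["e1", "e2"], "m2": ["e2"]}  # manifestation -> expressions
REALIZES = {"e1": "w1", "e2": "w2"}            # expression -> work
TITLES = {"w1": "Example sonata", "w2": "Example quartet"}


def expressions_and_works(manifestation_id):
    """Given a manifestation, list its expressions and the works they realize."""
    return [(e, REALIZES[e]) for e in EMBODIES.get(manifestation_id, [])]


def expressions_and_manifestations(work_id):
    """Given a work, list its expressions and the manifestations of each."""
    result = {}
    for expression, work in REALIZES.items():
        if work == work_id:
            result[expression] = sorted(
                m for m, expressions in EMBODIES.items() if expression in expressions
            )
    return result


def search_works(term):
    """Keyword query: works whose title contains the term (case-insensitive)."""
    return sorted(w for w, title in TITLES.items() if term.lower() in title.lower())
```

Here "m1" stands for something like a recording that carries performances of two different works, which is the kind of case where the manifestation-to-work traversal returns more than one answer.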
Re: [CODE4LIB] What do you want out of a frbrized data web service?
Here's a thought: John Riley in an authority record is linked through the 670 field (author of "Cells today"), where "Cells today" is the 245 in a bibliographic record. Let's assume there are about 4 John Rileys who wrote about cells, each in their own bib record.

If any bibliographic record is part of a 'chain' of FRBRized manifestations, where one of these manifestations also includes a date (relevant to a certain Riley, John), a more detailed description of those cells, a 502 (such as "Riley, John submitted a dissertation on a specific branch in chemistry"), or a video with John Riley's picture, I can benefit and link (via an API query) to that information to distinguish that Riley, John.

Jenn, Jonathan, does my scenario make sense?

Ya'aqov

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind [rochk...@jhu.edu]
Sent: Tuesday, April 20, 2010 7:18 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] What do you want out of a frbrized data web service?

Can you tell us more about what value you think you've added to your data as a result of your "FRBRization"? What do you have that wasn't there before? Better relationships between manifestations? Something else?
Re: [CODE4LIB] What do you want out of a frbrized data web service?
I started preparing a longer answer to this, and still will provide one eventually.

But first, to really answer the question, we need some more information from you. What data do you actually have of value? Just saying "we have FRBRized data" doesn't really tell me; "FRBRized data" can be almost anything, really. Can you tell us more about what value you think you've added to your data as a result of your "FRBRization"? What do you have that wasn't there before? Better relationships between manifestations? Something else?

I forget, were you focusing on specific material types (music or moving image?) in this project, or is this just general materials, covering the gamut of what one would expect from a major academic library? If you've done special work with music or moving image, what is the nature of the value added there?

Do these questions make sense? To know how I might want to use the data, I need to know a bit more about what you've actually got that's useful, which "it's FRBRized" doesn't really tell me.

But as far as "do you want real-time queries to a web service, or bulk download of the data?" -- yes, I'd want both, probably. Either one will be the most convenient depending on what I'm trying to do. If you _had_ to pick one, it would be 'bulk download', because _anything_ is possible with bulk download -- but for certain uses, it can take a lot more work on my part for bulk download, so if that's all there is, it will be a higher barrier for use than if a real-time web API were available. But if _only_ real-time queries are available, then certain things are just impossible (mainly indexing-time enhancement of my data).

Jonathan
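Jonathan's point about indexing-time enhancement is the case where a bulk download fits naturally: the whole mapping is loaded once and applied to every local record while the index is built. A minimal sketch, assuming a hypothetical line-delimited JSON dump that maps an oclcnum to a work ID (the dump format and field names are made up):

```python
import json

# Hypothetical bulk dump: one JSON object per line, invented field names.
BULK_DUMP = "\n".join([
    json.dumps({"oclcnum": "1111", "work_id": "w1"}),
    json.dumps({"oclcnum": "3333", "work_id": "w1"}),
])


def load_work_ids(dump_text):
    """Parse the dump once into an oclcnum -> work_id lookup table."""
    table = {}
    for line in dump_text.splitlines():
        record = json.loads(line)
        table[record["oclcnum"]] = record["work_id"]
    return table


def enhance(local_records, work_ids):
    """Attach a work_id (or None when unmatched) to each record before indexing."""
    return [{**record, "work_id": work_ids.get(record["oclcnum"])} for record in local_records]


local = [
    {"oclcnum": "1111", "title": "Example"},
    {"oclcnum": "9999", "title": "No match"},
]
indexed = enhance(local, load_work_ids(BULK_DUMP))
```

Doing the same thing through a real-time service would mean one request per local record at indexing time, which is the barrier Jonathan is pointing at.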
Re: [CODE4LIB] What do you want out of a frbrized data web service?
Exposing the records as Linked Data, rather than just plain old XML, would be an interesting demonstration of how the library world can generate and, more importantly, curate massive amounts of data. The records could then be linked to and from by other resources/services -- for example, linking a copy of a book on Amazon as an Item to the Manifestation it's drawn from could allow for powerful graph-oriented search.

Rob

On Tue, Apr 20, 2010 at 3:50 PM, Riley, Jenn wrote:
> One of our project goals is to share the raw FRBRized data widely
> so that others can look at it to see how it's structured, reuse it, improve
> on it, comment on the FRBRization effectiveness, etc. We're planning on
> allowing remote/Web Services/API/SRU/some machine-to-machine method like
> that access to the data.
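Rob's example of linking a bookseller's Item to a Manifestation and then searching the graph can be sketched with plain subject/predicate/object triples. The URIs and the "ex:" predicate names are placeholders invented for this sketch; they are not taken from any published FRBR vocabulary or from the V/FRBR project.

```python
# Hypothetical triples; every URI and predicate name is a placeholder.
TRIPLES = {
    ("http://example.org/shop/item/42", "ex:exemplarOf",
     "http://example.org/manifestation/m1"),
    ("http://example.org/manifestation/m1", "ex:embodies",
     "http://example.org/expression/e1"),
    ("http://example.org/expression/e1", "ex:realizes",
     "http://example.org/work/w1"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def follow(start, predicates):
    """Follow a path of predicates from a starting node through the graph."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier


# From a shop item up to the work it ultimately belongs to.
works = follow(
    "http://example.org/shop/item/42",
    ["ex:exemplarOf", "ex:embodies", "ex:realizes"],
)
```

Once an outside resource publishes the first link, the rest of the path is just a traversal over the library's own relationships.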
[CODE4LIB] What do you want out of a frbrized data web service?
Hi all,

At Indiana University we're working on a project that will help us see concretely what FRBRized [1] library data and discovery systems might look like. [2] One of our project goals is to share the raw FRBRized data widely so that others can look at it to see how it's structured, reuse it, improve on it, comment on the FRBRization effectiveness, etc. We're planning on allowing remote/Web Services/API/SRU/some machine-to-machine method like that access to the data.

As we're starting to think about how we should set that up, we thought it would be useful to gather some use cases from the code4lib community, as it's the folks here that are experimenting with services like this. So if there were FRBRized data available to you (at least for FRBR group 1 and group 2 entities; *maybe* group 3 as well), what would you do with it? What kinds of questions would your service (discovery system, whatever) ask a service that made this data available? What kinds of information would you want in a response? Would you have uses that called for downloading of "all" data at once or would you instead be better off with real-time queries to a web service? It's questions like that we're interested in brainstorming with this group about.

Basically, what type of access to the data we're generating is most important, since we have finite resources to expend on this right now.

Thanks, all!

Jenn

[1] http://www.loc.gov/cds/downloads/FRBR.PDF
[2] http://vfrbr.info

Jenn Riley
Metadata Librarian
Digital Library Program
Indiana University - Bloomington
Wells Library W501
(812) 856-5759
www.dlib.indiana.edu

Inquiring Librarian blog: www.inquiringlibrarian.blogspot.com