[CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
Peter Schlumpf writes:

Bill, You have hit the nail on the head! This is EXACTLY what I am trying to do! It's the underlying stuff that I am trying to get at. Looking at RDF may yield some good ideas. But I am not thinking in terms of RDF or XML, triples, or MARC, standards, or any of that stuff that gets thrown around here. Even the Internet is not terribly necessary. I am thinking in terms of data structures, pointers, sparse matrices, relationships between objects and yes, set theory too -- things like that. The former is pretty much cruft that lies upon the latter, and it mostly just gets in the way. Noise, as you put it, Bill! A big problem here is that Libraryland has a bad habit of getting itself lost in the details and going off on all kinds of tangents. As I said before, the biggest prison is between the ears. Throw out all that junk in there and just start over! When I begin programming this thing my only tools will be a programming language (C or Java), a text editor (vi) and my head.

This is very idyllic and (I hope this doesn't sound too patronising) probably necessary from time to time. But I've seen too many initiatives like this that start out making huge conceptual strides and then start tripping over all those gushdurned DETAILS. I think it's disingenuous to talk as though the details aren't important: 90% of every project is the details, and while the other 10% is the fun part, building new conceptual frameworks usually seems to involve throwing out all the accumulated crud, which -- guess what? -- turns out to be the embodiment in code of accumulated wisdom. Babies, bathwater, all that ... except that the bathwater turns out to be made of millions of tiny babies, and -- what's that you say? My metaphor has skidded off the track? Oh well.

Mike Taylor m...@indexdata.com http://www.miketaylor.org.uk
"Are you suggesting that coconuts migrate?" -- Monty Python and the Holy Grail.
Re: [CODE4LIB] Something completely different
Alexander Johannesen wrote:

On Wed, Apr 15, 2009 at 10:32, stuart yeates stuart.yea...@vuw.ac.nz wrote:

Yes, we mint something very similar (see http://authority.nzetc.org/52969/ for mine), but none of our interoperability partners do. None of our local libraries, none of our local archives and only one of our local museums (by virtue of some work we did with them). All of them publish and most consume some form of RDF.

Hmm, RDF resources are just URIs, so I'm still a bit unsure about what you mean. Are you talking about the fact that the RDF definitions (and not the RDF vocabs themselves) aren't encoded in your TM engine?

Interoperability isn't just about using the same URL for the same concept. It's about being able to import each other's data for matching (which involves having the in-house tools and experience); it's about being able to provide mutual support and aid (which involves speaking the same language); it's about being part of a community of practice and all that that entails.

Additionally many of the taxonomies we're interested in are available in RDF but not topic maps.

Converting them to a Topic Map isn't that hard to do, but I guess there is *a* cost there.

There is little cost in converting it to a topic map, but using a topic map derived from a shared ontology, rather than the shared ontology itself, places barriers to interoperability, adds to the burden of maintenance we need to carry and puts us at least one further step from the heart of the community using that ontology. Given that there are at least four ontologies I'd like to be using, multiply that by four.

cheers stuart
-- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
Re: [CODE4LIB] Something completely different
Alexander Johannesen wrote:

We currently use topic maps, a lot, in our infrastructure. If we were starting again tomorrow, I'd advocate using RDF instead, mainly because of the much better tool support and take-up.

Hmm, not a good thing at all. Could you elaborate, though, as I use it as part of infrastructure too, and wouldn't touch RDF / SemWeb without a long stick? I'm into application semantics and shared knowledge-bases. What are you guys doing where you feel the support and tools are lacking? And what are the RDF alternatives?

RDF, unlike topic maps, is being used by substantial numbers of people who we interact with in the real world and would like to interoperate with. If we used RDF rather than topic maps internally, that interoperability would be much, much cheaper. It's tempting to say it's free, but it's not quite, because it does impose some constraints.

In my eyes, the core thing that RDF supports that topic maps don't seem to is seamless reuse by people you don't care about. For example the people at http://lcsubjects.org have never heard of us (that I know of), but we can use their URLs like http://lcsubjects.org/subjects/sh90005545#concept to represent our roles.

cheers stuart
-- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
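The reuse Stuart describes, pointing at an identifier that someone else minted without any coordination with them, can be sketched in a few lines. This is only an illustration with plain tuples, not a real RDF library; the lcsubjects.org URI comes from the message above, while the example.org identifiers, the predicate names and the helper external_references are hypothetical names invented here.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# URI minted by a third party and reused as-is (quoted in the message above).
LC_CONCEPT = "http://lcsubjects.org/subjects/sh90005545#concept"

# Hypothetical local identifiers, invented purely for this sketch.
LOCAL_PERSON = "http://example.org/person/1"
HAS_ROLE = "http://example.org/vocab/hasRole"
HAS_NAME = "http://example.org/vocab/name"

triples = [
    Triple(LOCAL_PERSON, HAS_ROLE, LC_CONCEPT),
    Triple(LOCAL_PERSON, HAS_NAME, "Example Person"),
]


def external_references(statements, local_prefix):
    """Return the URIs in object position that were not minted under local_prefix."""
    return sorted(
        {
            t.obj
            for t in statements
            if t.obj.startswith("http") and not t.obj.startswith(local_prefix)
        }
    )
```

Calling external_references(triples, "http://example.org/") picks out the one borrowed URI; nothing about the local data had to be agreed with its publisher first.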
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 07:10, stuart yeates stuart.yea...@vuw.ac.nz wrote:

RDF, unlike topic maps, is being used by substantial numbers of people who we interact with in the real world and would like to interoperate with. If we used RDF rather than topic maps internally, that interoperability would be much, much cheaper. It's tempting to say it's free, but it's not quite, because it does impose some constraints.

But it's not that hard to create a bridge from RDF to Topic Maps and back, no? Or is your interop story different?

In my eyes, the core thing that RDF supports that topic maps don't seem to is seamless reuse by people you don't care about.

Yes, this has been brought up on several occasions, including by me at the TMRA 2008. But then, it's not so much that RDF does something that Topic Maps doesn't *support*, it's that it's packaged differently. So, where RDF has five standard ontology levels (RDF, RDFS, OWL DL/Lite/Full), Topic Maps has one simpler one (TMDM), yet neither can express anything better or differently than the other. My theory here is that people *like* 5 layers of RDF, because it gives the false sensation of choice. But it's all ontological definitions. However, the 5 levels of RDF do indeed create a defined platform for sharing (if not cast in iron), whereas in the TM world you need to include it / create it. Oh, and of course the academics seem to have embraced W3C and anything by the authority of TBL, and its effect is trickling down.

For example the people at http://lcsubjects.org have never heard of us (that I know of), but we can use their URLs like http://lcsubjects.org/subjects/sh90005545#concept to represent our roles.

Not sure I understand your example. Here's my Topic Map identifier in a Topic Map: http://psi.ontopedia.net/Alexander_Johannesen Identifier and locator, and resolvable, and can be used by anyone.
Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
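To make the "bridge from RDF to Topic Maps and back" remark a little more concrete, here is a toy sketch of one naive mapping. It is an assumption made for illustration, not the mapping used by any particular Topic Maps engine or standard: statements with a literal object become occurrences on a topic, and statements with a URI object become binary associations. The PSI is the one quoted above; the example.org predicates and the function names are hypothetical.

```python
from collections import defaultdict


def is_uri(value: str) -> bool:
    """Crude test for 'this object is an identifier rather than a literal'."""
    return value.startswith(("http://", "https://"))


def triples_to_topics(triples):
    """Naive mapping: literal objects -> occurrences, URI objects -> associations."""
    topics = defaultdict(lambda: {"occurrences": [], "associations": []})
    for subject, predicate, obj in triples:
        if is_uri(obj):
            topics[subject]["associations"].append((predicate, obj))
            _ = topics[obj]  # make sure the association target exists as a topic
        else:
            topics[subject]["occurrences"].append((predicate, obj))
    return dict(topics)


def topics_to_triples(topics):
    """Inverse of the naive mapping above."""
    triples = []
    for subject, topic in topics.items():
        triples.extend((subject, p, o) for p, o in topic["occurrences"])
        triples.extend((subject, p, o) for p, o in topic["associations"])
    return triples


ALEX = "http://psi.ontopedia.net/Alexander_Johannesen"  # PSI from the message above
NAME = "http://example.org/vocab/name"                  # hypothetical predicate
INTEREST = "http://example.org/vocab/interestedIn"      # hypothetical predicate
TOPIC_MAPS = "http://example.org/subject/topic-maps"    # hypothetical subject

sample = [
    (ALEX, NAME, "Alexander Johannesen"),
    (ALEX, INTEREST, TOPIC_MAPS),
]
```

For this small sample the round trip is lossless. The cost Stuart is worried about shows up elsewhere: someone has to own and maintain the mapping for every shared ontology that gets converted.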
Re: [CODE4LIB] Something completely different
Alexander Johannesen wrote:

On Wed, Apr 15, 2009 at 07:10, stuart yeates stuart.yea...@vuw.ac.nz wrote:

For example the people at http://lcsubjects.org have never heard of us (that I know of), but we can use their URLs like http://lcsubjects.org/subjects/sh90005545#concept to represent our roles.

Not sure I understand your example. Here's my Topic Map identifier in a Topic Map: http://psi.ontopedia.net/Alexander_Johannesen Identifier and locator, and resolvable, and can be used by anyone.

Yes, we mint something very similar (see http://authority.nzetc.org/52969/ for mine), but none of our interoperability partners do. None of our local libraries, none of our local archives and only one of our local museums (by virtue of some work we did with them). All of them publish and most consume some form of RDF. Additionally many of the taxonomies we're interested in are available in RDF but not topic maps.

cheers stuart
-- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 10:32, stuart yeates stuart.yea...@vuw.ac.nz wrote:

Yes, we mint something very similar (see http://authority.nzetc.org/52969/ for mine), but none of our interoperability partners do. None of our local libraries, none of our local archives and only one of our local museums (by virtue of some work we did with them). All of them publish and most consume some form of RDF.

Hmm, RDF resources are just URIs, so I'm still a bit unsure about what you mean. Are you talking about the fact that the RDF definitions (and not the RDF vocabs themselves) aren't encoded in your TM engine?

Additionally many of the taxonomies we're interested in are available in RDF but not topic maps.

Converting them to a Topic Map isn't that hard to do, but I guess there is *a* cost there.

Regards, Alex
-- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
+1 Sharon M. Foster, 91.7% Librarian Speaker-to-Computers http://www.vsa-software.com/mlsportfolio/ On Thu, Apr 9, 2009 at 10:37 PM, Bill Dueber b...@dueber.com wrote: On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor m...@indexdata.com wrote: I'm not sure what to make of this except to say that Yet Another XML Bibliographic Format is NOT the answer! I recognize that you're being flippant, and yet think there's an important nugget in here. When you say it that way, it makes it sound as if folks are debating the finer points of OAI-MARC vs MARC-XML -- that it's simply syntactic sugar (although I'm certainly one to argue for the importance of syntactic sugar) over the top of what we already have. What's actually being discussed, of course, is the underlying data model. E-R pairs primarily analyzed by set theory, triples forming directed graphs, whether or not links between data elements can themselves have attributes -- these are all possible characteristics of the fundamental underpinning of a data model to describe the data we're concerned with. The fact that they all have common XML representations is noise, and referencing the currently-most-common xml schema for these things is just convenient shorthand in a community that understands the exemplars. The fact that many in the library community don't understand that syntax is not the same as a data model is how we ended up with RDA. (Mike: I don't know your stuff, but I seriously doubt you're among that group. I'm talkin' in general, here.) Bibliographic data is astoundingly complex, and I believe wholeheartedly that modeling it sufficiently is a very, very hard task. 
But no matter the underlying model, we should still insist on starting with the basics that computer science folks have been using for decades now: uids (and, these days, guids) for the important attributes, separation of data and display, definition of sufficient data types and reuse of those types whenever possible, separation of identity and value, full normalization of data, zero ambiguity in the relationship diagram as a fundamental tenet, and a rigorous mathematical model to describe how it all fits together. This is hard stuff. But it's worth doing right. -- Bill Dueber Library Systems Programmer University of Michigan Library
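A few of the basics Bill lists (uids, separation of identity and value, separation of data and display) can be shown in a small sketch. The class and field names below are hypothetical and chosen only for illustration; this is not a proposed bibliographic model.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


def new_uid() -> str:
    """Mint an opaque identifier that carries no meaning of its own."""
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Agent:
    uid: str   # identity: stable, never shown to patrons
    name: str  # value: may be corrected without touching the identity


@dataclass
class Work:
    uid: str
    title: str
    creator_uids: List[str] = field(default_factory=list)  # links by identity, not by name


def display(work: Work, agents: Dict[str, Agent]) -> str:
    """Display is derived from the data at render time and is never stored in it."""
    names = ", ".join(agents[uid].name for uid in work.creator_uids)
    return f"{work.title} / {names}"


# Hypothetical sample data.
agents = {"a1": Agent("a1", "Old Form of Name")}
work = Work("w1", "Some Title", ["a1"])
```

Because the work refers to its creator by uid, correcting the name means replacing one Agent value; the Work record itself does not change, and display(work, agents) picks up the new form automatically.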
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 10:37 PM, Bill Dueber b...@dueber.com wrote: This is hard stuff. But it's worth doing right. +1 The issue here isn't about serializations or transmission formats. It's about data modeling. Our current bibliographic data model is horribly inefficient, with antiquated design ideas and far too rigid to handle the explosion in information sources, and that has nothing to do with ISO 2709 (although the byte limit might). Then there's the actual inventory control data model... -Ross.
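The byte limit Ross mentions is presumably the record length field: an ISO 2709 record begins with a 24-character leader whose first five characters hold the total record length as decimal digits, so a single record cannot exceed 99,999 bytes. A minimal sketch, with hypothetical function names and a made-up sample leader:

```python
MAX_RECORD_LENGTH = 99_999  # largest value that fits in five decimal digits


def record_length(leader: str) -> int:
    """Read the record length from the first five characters of a 24-character leader."""
    if len(leader) != 24:
        raise ValueError("an ISO 2709 leader is exactly 24 characters long")
    return int(leader[:5])


def fits_in_one_record(total_bytes: int) -> bool:
    """True if a record of this size can state its own length in the leader."""
    return total_bytes <= MAX_RECORD_LENGTH


# Made-up leader, for illustration only.
sample_leader = "00714cam a2200205 a 4500"
```

The limit belongs to the transmission format rather than to the data model, which is the distinction Ross is drawing.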
Re: [CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
Bill and Peter, Very nice posts. XML, RDF, MARC and DC are all different ways to present information (of course, XML, RDF, and DC are easier for machines to read and process). However, down at the fundamentals, I think it can go deeper: basically, data structures and algorithms make things work. RDF (with triples) is a directed graph. A graph is a powerful (the most powerful?) data structure with which you can model everything. However, some graph theory problems are NP-hard. Fundamentally we are talking about math. So a balance needs to be struck between how complex the model is and how easy (or possible) it is to implement. As computing power grows, complex data modeling and data mining are on the horizon. Yan

-----Original Message----- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Peter Schlumpf Sent: Thursday, April 09, 2009 10:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] You got it! Re: [CODE4LIB] Something completely different

Bill, You have hit the nail on the head! This is EXACTLY what I am trying to do! It's the underlying stuff that I am trying to get at. Looking at RDF may yield some good ideas. But I am not thinking in terms of RDF or XML, triples, or MARC, standards, or any of that stuff that gets thrown around here. Even the Internet is not terribly necessary. I am thinking in terms of data structures, pointers, sparse matrices, relationships between objects and yes, set theory too -- things like that. The former is pretty much cruft that lies upon the latter, and it mostly just gets in the way. Noise, as you put it, Bill! A big problem here is that Libraryland has a bad habit of getting itself lost in the details and going off on all kinds of tangents. As I said before, the biggest prison is between the ears. Throw out all that junk in there and just start over! When I begin programming this thing my only tools will be a programming language (C or Java), a text editor (vi) and my head. But before I really start that, right now I am writing a paper that explains how this stuff works at a very low level. It's mostly an effort to get my thoughts down clearly, but I will share a draft of it with y'all on here soon. Peter Schlumpf
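Yan's point that a set of triples is a directed graph can be sketched with nothing more than a dictionary of adjacency lists. The node and edge labels below are hypothetical and exist only for the example.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Each (subject, predicate, object) triple is a labelled edge subject -> object."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def reachable(graph, start):
    """Breadth-first search: every node that can be reached by following edges from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _predicate, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical data.
triples = [
    ("book:1", "creator", "person:1"),
    ("person:1", "memberOf", "org:1"),
    ("book:2", "creator", "person:2"),
]
graph = build_graph(triples)
```

Not every graph question is hard: reachability like this runs in time linear in the number of nodes and edges. The NP-hard cases Yan alludes to are problems such as general subgraph matching, which is where the balance between model complexity and implementability starts to matter.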
Re: [CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
(Attention: lurker emerging) To me what it comes down to is neither simplicity nor complexity, but extensibility. In a perfect world, our data models should be capable of representing very sophisticated and robust relationships at a high level of granularity, while still accommodating ease of metadata production and contribution (especially by non-experts and those outside the library community). I agree that none of our existing data structures/syntaxes are a priori fundamental or infallible. But what is promising to me about RDF is its intuitive mode of expression and extensibility (exactly the kind I advocate above). Casey
Re: [CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
Extensibility is absolutely key. I know that some people consider XML to be inherently extensible, but I'm concerned that the conceptual model presented by FRBR doesn't support extensibility. For example, the FRBR entity Place represents only the place as a subject. If you want to represent places anywhere else in the record, you are SOL. Ditto the Event entity. The attributes in FRBR have no inherent structure, so you have, say, Manifestation with a whole page of attributes that are each defined at the most detailed level. You have reduction ratio (microform) but no reproduction info field that you could extend for another physical format. You have date of publication but no general date property that could be extended to other dates that are needed (in fact, the various date fields have no relation to each other). To have an extensible data structure we need to have some foundation classes that we can build on, and nothing in FRBR, RDA, or MARC gives us that. kc
Re: [CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
I completely agree with Karen regarding how FRBR falls short in not allowing for more relationships between Group 1-2 and Group 3 entities. FRBRoo fleshes out some of these things, but in a woefully unwieldy way, IMO. Conversely, FRBR in RDF (at http://vocab.org/frbr) consolidates some classes and properties (e.g. Responsible entity, a superclass of Person, Family and Corporate body), and to me approaches the kind of extensibility we need. Unfortunately, it does not include data properties, which I agree are problematic, as Karen illustrates. I do maintain that FRBR is the kind of *conceptual* model that, for the most part, can guide the development of effective data structures. However, it is far too abstract to be implemented verbatim. This is what I think RDA is trying to do with attributes like Title for the work. I wonder: why is there not an ontology expert on the JSC? (If I'm wrong and there is, someone please correct me.) Casey
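The "foundation classes" Karen asks for, and the Responsible entity superclass Casey points to, might look roughly like the sketch below. The entity names Person, Family, Corporate body, Responsible entity and Manifestation come from the messages above; DateStatement, its kind field and the helper dates_of_kind are hypothetical names invented for this illustration, not part of FRBR or RDA.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResponsibleEntity:
    """General class; the specific kinds of agent extend it."""
    name: str


class Person(ResponsibleEntity):
    pass


class Family(ResponsibleEntity):
    pass


class CorporateBody(ResponsibleEntity):
    pass


@dataclass
class DateStatement:
    """One general date property; the 'kind' says which date it is."""
    value: str
    kind: str = "unspecified"  # e.g. "publication", "digitization" (hypothetical labels)


@dataclass
class Manifestation:
    title: str
    dates: List[DateStatement] = field(default_factory=list)
    responsible: List[ResponsibleEntity] = field(default_factory=list)


def dates_of_kind(manifestation: Manifestation, kind: str) -> List[DateStatement]:
    """Select dates by kind without needing a separate field per kind."""
    return [d for d in manifestation.dates if d.kind == kind]


# Hypothetical sample record.
record = Manifestation(
    title="Some Title",
    dates=[DateStatement("2009", "publication"), DateStatement("2010", "digitization")],
    responsible=[Person("A. Author"), CorporateBody("Some Press")],
)
```

A new kind of date or a new kind of agent is then an extension of something that already exists, and code written against the general class keeps working.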
Re: [CODE4LIB] Something completely different
Cloutman, David writes:

I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC, is what would an alternative bibliographic data format look like if it was designed with the intent of opening access to the data in our ILS systems to developers in a more informal manner? I was thinking of an XML format that a developer could work with without formal training, the basics of which could be learned in an hour, and could reasonably represent the essential fields of the 90% of records that are most likely to be viewed by a public library patron.

I read this and immediately thought, oh, that's MODS: http://en.wikipedia.org/wiki/Metadata_Object_Description_Schema

Then I read on through the thread and found that Stuart Yeates recommended TEI instead. Then I read on a few more messages, and found that Alex Dolski thought Dublin Core XML was the answer. Then I read on a bit further, and found half a dozen people arguing for RDF, triplestores and topic maps. (In fact, the only thing that _no-one_ has recommended is anything based on RDA :-) )

I'm not sure what to make of this except to say that Yet Another XML Bibliographic Format is NOT the answer!

... anyway, all of this is far, far away from the point. MARC is old and ugly, yes; but then so am I, and I get the job done, just like MARC. That format is responsible for about 0.2% of our difficulties, and replacing it would make essentially no difference to anything that we actually care about.

Mike Taylor m...@indexdata.com http://www.miketaylor.org.uk
"_Scelidosaurus_ [is] vastly more important for understanding dinosaur anatomy evolution than yet another dromaeosaur (can't believe I just wrote that, even though it's true)" -- Thomas R. Holtz, Jr.
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor m...@indexdata.com wrote: ... anyway, all of this is far, far away from the point. MARC is old and ugly yes; but then so am I, and I get the job done, just like MARC. That format is responsible for about 0.2% of our difficulties, and replacing it would make essentially no difference to anything that we actually care about. The *encoding* however is responsible for about 20% of my difficulties. MARC-8 should die... --Joe
Re: [CODE4LIB] Something completely different
From: Mike Taylor m...@indexdata.com ... anyway, all of this is far, far away from the point. MARC is old and ugly yes; but then so am I, I don't think you're old, Mike. --Ray
Re: [CODE4LIB] Something completely different
Ray Denenberg, Library of Congress writes: From: Mike Taylor m...@indexdata.com ... anyway, all of this is far, far away from the point. MARC is old and ugly yes; but then so am I, I don't think you're old, Mike. And _I_ don't think _you're_ ugly. :-) _/|____ /o ) \/ Mike Taylorm...@indexdata.comhttp://www.miketaylor.org.uk )_v__/\ Whenever there is a conflict between human rights and property rights, human rights must prevail -- Abraham Lincoln, quoted by Richard Stallman.
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor m...@indexdata.com wrote: Cloutman, David writes: I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC, is what would an alternative bibliographic data format look like if it was designed with the intent for opening access to the data in our ILS systems to developers in a more informal manner? I was thinking of an XML format that a developer could work with without formal training, the basics of which could be learned in an hour, and could reasonably represent the essential fields of the 90% of records that are most likely to be viewed by a public library patron. I read this and immediately thought, oh, that's MODS: http://en.wikipedia.org/wiki/Metadata_Object_Description_Schema Then I read on through the thread and found that Stuart Yeates recommended TEI instead. Then I read on a few more messages, and found that Alex Dolski thought Dublin Core XML was the answer. Then I read on a bit further, and found half a dozen people arguing for RDF, triplestores and topic maps. (In fact, the only thing that _no-one_ has recommended is anything based on RDA :-) ) I'm not sure what to make of this except to say that Yet Another XML Bibliographic Format is NOT the answer! In theory, there is no difference between practice and theory, but in practice there is. In theory, Yet Another XML Bibliographic Format is NOT the answer, but in practice it is. ;-) Kevin -- Kevin S. Clarke Coordinator of Web Services Belk Library Information Commons Appalachian State University 218 College Street Boone, NC 28608 clark...@appstate.edu (828) 262-8472 There are two kinds of people in the world: those who believe there are two kinds of people and those who know better.
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor m...@indexdata.com wrote: I'm not sure what to make of this except to say that Yet Another XML Bibliographic Format is NOT the answer! I recognize that you're being flippant, and yet think there's an important nugget in here. When you say it that way, it makes it sound as if folks are debating the finer points of OAI-MARC vs MARC-XML -- that it's simply syntactic sugar (although I'm certainly one to argue for the importance of syntactic sugar) over the top of what we already have. What's actually being discussed, of course, is the underlying data model. E-R pairs primarily analyzed by set theory, triples forming directed graphs, whether or not links between data elements can themselves have attributes -- these are all possible characteristics of the fundamental underpinning of a data model to describe the data we're concerned with. The fact that they all have common XML representations is noise, and referencing the currently-most-common xml schema for these things is just convenient shorthand in a community that understands the exemplars. The fact that many in the library community don't understand that syntax is not the same as a data model is how we ended up with RDA. (Mike: I don't know your stuff, but I seriously doubt you're among that group. I'm talkin' in general, here.) Bibliographic data is astoundingly complex, and I believe wholeheartedly that modeling it sufficiently is a very, very hard task. But no matter the underlying model, we should still insist on starting with the basics that computer science folks have been using for decades now: uids (and, these days, guids) for the important attributes, separation of data and display, definition of sufficient data types and reuse of those types whenever possible, separation of identity and value, full normalization of data, zero ambiguity in the relationship diagram as a fundamental tenet, and a rigorous mathematical model to describe how it all fits together. 
This is hard stuff. But it's worth doing right. -- Bill Dueber Library Systems Programmer University of Michigan Library
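A few of the basics listed above (uids, separation of identity and value, separation of data and display) can be sketched in a handful of lines. This is only an illustration under stated assumptions: the Person and Work classes, the creator_uid field and the display function are hypothetical names invented for this sketch, and the author name is supplied for the example rather than taken from the thread.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    uid: uuid.UUID  # identity: never changes
    name: str       # value: may be corrected later


@dataclass(frozen=True)
class Work:
    uid: uuid.UUID
    title: str
    creator_uid: uuid.UUID  # points at a Person by identity, not by name string


def display(work, people):
    """Display is computed from the data, not stored in it."""
    return f"{work.title} / {people[work.creator_uid].name}"


author = Person(uuid.uuid4(), "Boris Pasternak")
novel = Work(uuid.uuid4(), "Doctor Zhivago", author.uid)
people = {author.uid: author}
before = display(novel, people)

# Correct the name form; the Work record is untouched because it holds only the uid.
people[author.uid] = Person(author.uid, "Pasternak, Boris")
after = display(novel, people)
```

Because the work refers to its creator by uid, changing the form of the name is a one-record update, and what the patron sees is derived at display time.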
[CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something completely different
Bill, You have hit the nail on the head! This is EXACTLY what I am trying to do! It's the underlying stuff that I am trying to get at. Looking at RDF may yield some good ideas. But I am not thinking in terms of RDF or XML, triples, or MARC, standards, or any of that stuff that gets thrown around here. Even the Internet is not terribly necessary. I am thinking in terms of data structures, pointers, sparse matrices, relationships between objects and yes, set theory too -- things like that. The former is pretty much cruft that lies upon the latter, and it mostly just gets in the way. Noise, as you put it, Bill! A big problem here is that Libraryland has a bad habit of getting itself lost in the details and going off on all kinds of tangents. As I said before, the biggest prison is between the ears. Throw out all that junk in there and just start over! When I begin programming this thing my only tools will be a programming language (C or Java), a text editor (vi), and my head. But before I really start that, right now I am writing a paper that explains how this stuff works at a very low level. It's mostly an effort to get my thoughts down clearly, but I will share a draft of it with y'all on here soon. Peter Schlumpf -Original Message- From: Bill Dueber b...@dueber.com Sent: Apr 9, 2009 10:37 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Something completely different On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor m...@indexdata.com wrote: I'm not sure what to make of this except to say that Yet Another XML Bibliographic Format is NOT the answer! I recognize that you're being flippant, and yet think there's an important nugget in here. When you say it that way, it makes it sound as if folks are debating the finer points of OAI-MARC vs MARC-XML -- that it's simply syntactic sugar (although I'm certainly one to argue for the importance of syntactic sugar) over the top of what we already have. What's actually being discussed, of course, is the underlying data model. 
E-R pairs primarily analyzed by set theory, triples forming directed graphs, whether or not links between data elements can themselves have attributes -- these are all possible characteristics of the fundamental underpinning of a data model to describe the data we're concerned with. The fact that they all have common XML representations is noise, and referencing the currently-most-common xml schema for these things is just convenient shorthand in a community that understands the exemplars. The fact that many in the library community don't understand that syntax is not the same as a data model is how we ended up with RDA. (Mike: I don't know your stuff, but I seriously doubt you're among that group. I'm talkin' in general, here.) Bibliographic data is astoundingly complex, and I believe wholeheartedly that modeling it sufficiently is a very, very hard task. But no matter the underlying model, we should still insist on starting with the basics that computer science folks have been using for decades now: uids (and, these days, guids) for the important attributes, separation of data and display, definition of sufficient data types and reuse of those types whenever possible, separation of identity and value, full normalization of data, zero ambiguity in the relationship diagram as a fundamental tenet, and a rigorous mathematical model to describe how it all fits together. This is hard stuff. But it's worth doing right. -- Bill Dueber Library Systems Programmer University of Michigan Library
Re: [CODE4LIB] Something completely different
Hi Peter, Peter Schlumpf wrote: What I had in mind for something different is this: Think of a single database of only associations between objects, and nothing more than that. If I'm understanding you correctly, what you have in mind is a triplestore. A database for storing purely relationship triples and nothing else. A triple is made up of the name of an object, the name of a type of relationship and the name of another object. A few example abstract triples might be: (The Lord of the Rings, author, J. R. R. Tolkien) (The Hobbit, sequel, The Lord of the Rings) (J. R. R. Tolkien, pet, Fido) (Fido, species, Canis lupus familiaris) (Canis lupus familiaris, commonName, Dog) etc. RDF is a particular implementation of this abstract concept, where the names of both the objects and relationships are URIs. An RDF triplestore is a piece of software that stores these relationships and allows them to be queried flexibly and efficiently (in theory). It allows for an interesting extension too -- weighting those associations. Suppose we use it to create a search structure, and each time we go from one object referencing another we increment a counter for that link by one. I'm a little confused by this. What would the use case be? Are you talking about a popularity score for each relationship to weight searches with? Cheers, Alex
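To make the abstract triples above concrete, here is a minimal in-memory sketch. It is not RDF and not any particular triplestore product; the TripleStore class and its add and match methods are hypothetical names made up for illustration, and plain strings stand in for the URIs that RDF would use.

```python
class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples in sorted order; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()
store.add("The Lord of the Rings", "author", "J. R. R. Tolkien")
store.add("The Hobbit", "sequel", "The Lord of the Rings")
store.add("J. R. R. Tolkien", "pet", "Fido")
store.add("Fido", "species", "Canis lupus familiaris")
store.add("Canis lupus familiaris", "commonName", "Dog")

# Everything known about Fido as a subject:
fido_facts = store.match(subject="Fido")
# Every triple that points at Tolkien:
about_tolkien = store.match(obj="J. R. R. Tolkien")
```

A real store would index the triples rather than scan them, but the query model (fix some positions, leave the others open) is the same idea.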
Re: [CODE4LIB] Something completely different
On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson azar...@liverpool.ac.uk wrote: I would encourage looking at rdf triplestores seriously, if the graph approach is the direction that you want to go in. Or, Topic Maps which is *not* a triplestore, closer to the OO model (basically a meta data model), and don't carry the stack overflow of RDF (RDF, RDFs, OWL 1-2-3) nor anonymous nodes. :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
Alexander Johannesen wrote: On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson azar...@liverpool.ac.uk wrote: I would encourage looking at rdf triplestores seriously, if the graph approach is the direction that you want to go in. Or, Topic Maps which is *not* a triplestore, closer to the OO model (basically a meta data model), and don't carry the stack overflow of RDF (RDF, RDFs, OWL 1-2-3) nor anonymous nodes. :) That's not an entirely useful comparison of topic maps and RDF. I suggest: http://www.ontopia.net/topicmaps/materials/tmrdf.html We currently use topic maps, a lot, in our infrastructure. If we were starting again tomorrow, I'd advocate using RDF instead, mainly because of the much better tool support and take-up. cheers stuart -- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 14:33, stuart yeates stuart.yea...@vuw.ac.nz wrote: That's not an entirely useful comparison of topic maps and RDF. If I intended to be useful I'd write something substantial, backed up with stuff other than humour. I'll give that a go the next time. :) We currently use topic maps, a lot, in our infrastructure. If we were starting again tomorrow, I'd advocate using RDF instead, mainly because of the much better tool support and take-up. Hmm, not a good thing at all. Could you elaborate, though, as I use it as part of infrastructure too, and wouldn't touch RDF / SemWeb without a long stick? I'm into application semantics and shared knowledge-bases. What are you guys doing where you feel the support and tools are lacking? And what are the RDF alternatives? Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
Ross, I'm not questioning the technical assertion -- obviously you can combine properties from different vocabularies. My problem is with making sense of FRBR in relation to the properties, either in RDA or in bibo. Do you say that a particular grouping of properties is of type FRBR:Manifestation, or is the property defined in the vocabulary as in the Manifestation domain? RDA does the latter (although not in a semantic web way). Each data element in RDA belongs to a particular FRBR entity, so you never actually use the FRBR entities in your metadata. (Although the examples that Alistair Miles did [1] use the levels as part of the record organization.) I actually prefer the usage that I gave in my examples, in which relationships carry the FRBR meaning and bibliographic properties can be used at any level. The schema in the registry is completely flat partly because of the choice made by RDA to include the FRBR levels in the data elements themselves. The other 'partly' is because the creators of RDA are still pretty much thinking in terms of traditional bibliographic data, ISBD and MARC. kc [1] Linked from each scenario at http://dublincore.org/dcmirdataskgroup/Scenarios Ross Singer wrote: Right, ok, so an RDF graph can say the same resource is multiple things at the same time, so that's how you deal with this: http://lccn.loc.gov/95100870 rdf:type bibo:Book . http://lccn.loc.gov/95100870 dc:title Doctor Zhivago@en . http://lccn.loc.gov/95100870 dc:creator http://www.worldcat.org/identities/lccn-n79-18438 . http://lccn.loc.gov/95100870 rda:uniformTitle Doktor Zhivago. English . http://lccn.loc.gov/95100870 rdf:type rda:EditionStatement . http://lccn.loc.gov/95100870 rdf:type frbr:Manifestation . http://lccn.loc.gov/95100870 frbr:embodimentOf http://dbpedia.org/resource/Doctor_Zhivago . I'm guessing on the RDA assertions, because the schema in the metadataregistry doesn't make much sense to me. 
Anyway, this shows how you can say multiple things from different vocabularies for one resource. -Ross. On Mon, Apr 6, 2009 at 8:10 PM, Karen Coyle li...@kcoyle.net wrote: Jonathan Rochkind wrote: I'm curious why you think that doesn't work? Isn't place of publication a characteristic of a particular manifestation? While, title, according to traditional library practices where you take it from the title page, is also a characteristic of a particular manifestation, is it not? (uniform title is _usually_ a characteristic of a work, unless we get into music cataloging and some other 'edge' cases. Our traditional practices -- which aren't actually changed that much by RDA, are rather confusing.) Well, I was responding to Ross' statement that bibo and FRBR could be used in combination, depending on whether one was at that moment describing 'bibliographic things' or 'work things'. bibo doesn't have a uniform title, so the question is: can you use a bibo title and say that it is a work title? I thought that Ross was indicating something of that nature -- that you could have a FRBR 'work thing' with bibo properties. I'm trying to understand how that works since Work is a class. Don't you have to indicate the domain and range of a property in its definition? RDA tries to solve this by creating different properties for every concept+FRBR entity: title of the work (Work), title proper (Manifestation). [I don't understand why expressions don't have titles a translation is an expression, after all.] I am confused about what one would do about the fact that RDA defines attributes a bit different than FRBR itself does. It's not too surprising -- FRBR is really just a draft, hardly tested in the world. When RDA tried to make it a bit more concrete, it's not surprising that they found they had to make changes to make it workable. Not sure what to do about that in the grand scheme of things, if RDA and FRBR both end up registering different vocabularies. 
I guess we'll just have two different vocabularies though, which isn't too shocking I guess. I'm not sure there's anything to do, but I do know that the developers of RDA feel very strongly that in RDA they have 'implemented' FRBR, so we have to find a way to integrate FRBR and RDA in the registered RDA vocabulary. I agree that there's no problem with having RDA and FRBR as two different vocabularies, it's the effort of bringing them together that boggles me. I feel like it leaves a lot of loose ends. I'd be happy to see FRBR revised, or to have it re-defined without the attributes, thus allowing metadata developers to use the bibliographic relationship properties with any set of descriptive elements. I'm having trouble with the FRBR Group 1 entities as classes. I see them instead as relationships, and vocab.org does seem to treat them as relationships, not as 'things.' I see a distinct difference between a person entity and a work entity, because there is no thing that is a work. I see work as a relationship
Re: [CODE4LIB] Something completely different
So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier to read presentation: http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdf&input=&output=exhibit&callback= I have to say I feel like this schema is trying to do way too much and subsequently loses the resource specificity that RDF would be providing. For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's? By using properties like titleOfTheWork, dateOfWork and all of the properties that are specifically about TheSeries there is tremendous duplication of text. If Work was its own class, you would only need to say that this manifestation was an embodimentOf of it and reuse all of the title-based properties for manifestation. The series-specific property names seem redundant, as well, since isn't SeriesStatement defining a series? Why do you need titleProperOfSeries if you already have titleProper? What does property 'uri' mean? I also can't figure out how people/institutions are modeled in this schema, since none of the elements have ranges. Are they their own resources? If so, what? The way it looks at a glance, they're strings? There are also different properties for dimensions, dimensionsOfMap, dimensionsOfStillImage, etc. Why is there any need for anything more than 'dimensions'? This is redefining what the resource 'is' in multiple places, but the fact that this is a still image is made somewhere else, right? If so, isn't it self-evident that the dimensions are of a still image? It seems to me that very little work was done to find preexisting vocabularies to reuse and this schema still presents a very 'document-centric' or 'record-centric' view of data. -Ross. On Tue, Apr 7, 2009 at 9:39 AM, Karen Coyle li...@kcoyle.net wrote: Ross, I'm not questioning the technical assertion -- obviously you can combine properties from different vocabularies. 
My problem is with making sense of FRBR in relation to the properties, either in RDA or in bibo. Do you say that a particular grouping of properties is of type FRBR:Manifestation, or is the property defined in the vocabulary as in the Manifestation domain? RDA does the latter (although not in a semantic web way). Each data element in RDA belongs to a particular FRBR entity, so you never actually use the FRBR entities in your metadata. (Although the examples that Alistair Miles did [1] use the levels as part of the record organization.) I actually prefer the usage that I gave in my examples, in which relationships carry the FRBR meaning and bibliographic properties can be used at any level. The schema in the registry is completely flat partly because of the choice made by RDA to include the FRBR levels in the data elements themselves. The other 'partly' is because the creators of RDA are still pretty much thinking in terms of traditional bibliographic data, ISBD and MARC. kc [1] Linked from each scenario at http://dublincore.org/dcmirdataskgroup/Scenarios Ross Singer wrote: Right, ok, so an RDF graph can say the same resource is multiple things at the same time, so that's how you deal with this: http://lccn.loc.gov/95100870 rdf:type bibo:Book . http://lccn.loc.gov/95100870 dc:title Doctor Zhivago@en . http://lccn.loc.gov/95100870 dc:creator http://www.worldcat.org/identities/lccn-n79-18438 . http://lccn.loc.gov/95100870 rda:uniformTitle Doktor Zhivago. English . http://lccn.loc.gov/95100870 rdf:type rda:EditionStatement . http://lccn.loc.gov/95100870 rdf:type frbr:Manifestation . http://lccn.loc.gov/95100870 frbr:embodimentOf http://dbpedia.org/resource/Doctor_Zhivago . I'm guessing on the RDA assertions, because the schema in the metadataregistry doesn't make much sense to me. Anyway, this shows how you can say multiple things from different vocabularies for one resource. -Ross. 
On Mon, Apr 6, 2009 at 8:10 PM, Karen Coyle li...@kcoyle.net wrote: Jonathan Rochkind wrote: I'm curious why you think that doesn't work? Isn't place of publication a characteristic of a particular manifestation? While, title, according to traditional library practices where you take it from the title page, is also a characteristic of a particular manifestation, is it not? (uniform title is _usually_ a characteristic of a work, unless we get into music cataloging and some other 'edge' cases. Our traditional practices -- which aren't actually changed that much by RDA, are rather confusing.) Well, I was responding to Ross' statement that bibo and FRBR could be used in combination, depending on whether one was at that moment describing 'bibliographic things' or 'work things'. bibo doesn't have a uniform title, so the question is: can you use a bibo title and say that it is a work title? I thought that Ross was indicating something of that nature -- that you could have a FRBR 'work thing' with bibo properties. I'm trying
Re: [CODE4LIB] Something completely different
On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote: I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Back to this original question, when I imagine these things, I imagine building an ILS that relies on an unusual data persistence backend, discounts industry-standard data formats, and explicitly ignores the political concerns of adopting, deploying, and maintaining it. And I get a little bit nervous. For what it's worth (and I think this touches on the ontological discussion in this thread, too) -- my experience has been that it's easier to build a piece of software that solves a problem compellingly, solving technical hurdles as you need to than it is to come up with solutions to anticipated technical problems before starting on making a product. More concretely: if you build a software product, I don't care at all whether it's based on a SQL straitjacket or a luscious RDF comforter. I care if it solves a problem well, and that I can install it and run it easily. Cheers, -Nate
Re: [CODE4LIB] Something completely different
Also back to the original question, what is an ILS in the first place? The discussion has focused on bibliographic records, but that's just one part of what's in the ILS in use at the library where I work. I see one of the big problems with current ILSs being not so much the ILS per se, but library managers'/librarians' expectations that they should have a single core system that handles all the following functionality: - maintaining a database of patron records with attached fine and fee information, which books they have out, what is waiting on the hold shelf for them, etc. - maintaining a library accounting hierarchy with the ability to run reports like it's halfway through the year and you've spent 90% of your budget for children's fiction - maintaining an acquisitions system so records for purchases are reflected into the accounting system and also as new bib records for on-order materials - serials check-in so that missing issues can be claimed - and of course a cataloging module and an OPAC. Without the ability to support all the back-end processing and accounting, simply replacing the front-end OPAC and the bibliographic database does nothing to eliminate the need for an ILS, unless it also opens the way to feed data in and out of cheap off-the-shelf accounting and purchasing systems that aren't library-specific. A lot of libraries still won't want to put together even that much out of parts, and will prefer an ILS, but if it were me, I think I'd look at reengineering some of the parts to become more interchangeable with stuff like standard accounting software. I must admit I was kind of horrified when I first got here and found that all this functionality was resident in a single system. No wonder these things are so honking expensive. 
Genny Engel Sonoma County Library gen...@sonoma.lib.ca.us 707 545-0831 x581 www.sonomalibrary.org njv...@wisc.edu 04/07/09 08:59AM On Sun, Apr 5, 2009 at 10:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote: I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Back to this original question, [...]
Re: [CODE4LIB] Something completely different
Which is why the interface specifications are at least as important, if not more important, as the specs for each of the modules that you enumerated. If the interfaces are well-defined, then the components can be designed and developed with a minimum of further interactions among developers. In fact, there might eventually be more than one implementation of a particular module, allowing a library to assemble an ILS out of interchangeable components. (I'm assuming open source--it seems unlikely that proprietary vendors will ever come around.) Sharon M. Foster, 91.7% Librarian Speaker-to-Computers http://www.vsa-software.com/mlsportfolio/ On Tue, Apr 7, 2009 at 2:52 PM, Genny Engel gen...@sonoma.lib.ca.us wrote: Also back to the original question, what is an ILS in the first place? [...] Without the ability to support all the back-end processing and accounting, simply replacing the front-end OPAC and the bibliographic database does nothing to eliminate the need for an ILS, unless it also opens the way to feed data in and out of cheap off-the-shelf accounting and purchasing systems that aren't library-specific. A lot of libraries still won't want to put together even that much out of parts, and will prefer an ILS, but if it were me, I think I'd look at reengineering some of the parts to become more interchangeable with stuff like standard accounting software. I must admit I was kind of horrified when I first got here and found that all this functionality was resident in a single system. No wonder these things are so honking expensive. Genny Engel Sonoma County Library gen...@sonoma.lib.ca.us 707 545-0831 x581 www.sonomalibrary.org
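As a rough illustration of the point about well-defined interfaces, here is a sketch in which an acquisitions module depends only on a small ledger interface, so the accounting side could be swapped for another implementation. All names here (Ledger, InMemoryLedger, Acquisitions, the fund label, amounts in cents) are hypothetical and chosen for this example; nothing in this thread specifies an actual ILS interface.

```python
from typing import Protocol


class Ledger(Protocol):
    """The only thing acquisitions needs from an accounting system."""

    def record_purchase(self, fund: str, amount_cents: int) -> None: ...

    def spent(self, fund: str) -> int: ...


class InMemoryLedger:
    """One possible implementation; an off-the-shelf package could be another."""

    def __init__(self):
        self._spent = {}

    def record_purchase(self, fund, amount_cents):
        self._spent[fund] = self._spent.get(fund, 0) + amount_cents

    def spent(self, fund):
        return self._spent.get(fund, 0)


class Acquisitions:
    """Places orders and reports them to whatever ledger it was given."""

    def __init__(self, ledger: Ledger):
        self._ledger = ledger
        self.on_order = []

    def order(self, title, fund, price_cents):
        self._ledger.record_purchase(fund, price_cents)
        self.on_order.append(title)


ledger = InMemoryLedger()
acquisitions = Acquisitions(ledger)
acquisitions.order("The Hobbit", "childrens-fiction", 1299)
```

The acquisitions code never names a concrete accounting class, which is the property that would let a library assemble the pieces from different sources.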
Re: [CODE4LIB] Something completely different
An interesting thread! It will take a while for me to digest the ideas. What I had in mind for something different is this: Think of a single database of only associations between objects, and nothing more than that. Objects defined in this database can reference any and all other objects in the database. These objects could represent anything: Title records or item records in an opac. A collection of files on a computer. Web sites. Links. Database queries. All of the above. Each object in this database contains just enough information to say that it exists and has a pointer to the thing in the outside world that it represents. Although the basic system would allow the objects in it to link to each other in arbitrary ways, we could impose rules on it to create a system. An OPAC. A map. Other things that I can't think of right now. I think a key thought here is that it is a database of pure relationships that can be set up and manipulated. But the descriptive data is stored elsewhere. It allows for an interesting extension too -- weighting those associations. Suppose we use it to create a search structure, and each time we go from one object referencing another we increment a counter for that link by one. There are many ways to implement something like this, and I have one in mind, but this is sort of the theory behind it. It is going back to simple things. Peter Schlumpf -Original Message- From: Karen Coyle li...@kcoyle.net Sent: Apr 6, 2009 1:49 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Something completely different Cloutman, David wrote: I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC, is what would an alternative bibliographic data format look like if it was designed with the intent for opening access to the data in our ILS systems to developers in a more informal manner? 
I was thinking of an XML format that a developer could work with without formal training, Well, speaking of 'without formal training' -- I posted this to the Open Library technology list, but using the OL, which is triple-based and open access, I was able to create a simple demo Pipe of how you could determine the earliest date of publication of a book (with an interest in looking at potential copyright status). Caveat is that the API I'm using is still pretty stubby, so it only retrieves on exact title (this will be fixed sometime in the future). The pipe is here: http://pipes.yahoo.com/pipes/pipe.info?_id=216efa8c3b04764ca77ad181b1cc66e4 kc the basics of which could be learned in an hour, and could reasonably represent the essential fields of the 90% of records that are most likely to be viewed by a public library patron. In my mind, such a format would allow creators of community-based web sites to pull data from their local library, and repurpose it without having to learn a lot of arcane formats (e.g. MARC) or esoteric protocols (e.g. Z39.50). The sacrifice, of course, would be losing some of the richness MARC allows, but I think in many common situations the really complex records are not what patrons are interested in. You may want to consider prototyping this in your application. I see such an effort to be vital in making our systems relevant in future computing environments, and I am skeptical that a simple, workable solution would come out of the initial efforts of a standardization committee. Just my 2 cents. - David --- David Cloutman dclout...@co.marin.ca.us Electronic Services Librarian Marin County Free Library -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Peter Schlumpf Sent: Sunday, April 05, 2009 8:40 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Something completely different Greetings! I have been lurking on (or ignoring) this forum for years. And libraries too. Some of you may know me. 
I am the Avanti guy. I am, perhaps, the first person to try to produce an open source ILS back in 1999, though there is a David Duncan out there who tried before I did. I was there when all this stuff was coming together. Since then I have seen a lot of good things happen. There's Koha. There's Evergreen. They are good things. I have also seen first hand how libraries get screwed over and over by commercial vendors with their crappy software. I believe free software is the answer to that. I have neglected Avanti for years, but now I am ready to return to it. I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Sometimes the biggest prison is between the ears. I am in a position to do this now, and that's what I have decided to do. I am getting busy. Peter Schlumpf Email Disclaimer: http://www.co.marin.ca.us/nav/misc
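The database of pure associations described at the top of this message, with a counter bumped each time a link is followed, might look something like the sketch below. This is one guess at the idea, not the implementation the author has in mind: AssociationDB, add_object, link, follow and neighbours are hypothetical names, and the pointer strings are made-up stand-ins for wherever the descriptive data actually lives.

```python
class AssociationDB:
    """Objects hold only an id and a pointer to the outside world; links carry weights."""

    def __init__(self):
        self.pointers = {}  # object id -> external locator (descriptive data lives elsewhere)
        self.links = {}     # source id -> set of target ids
        self.weights = {}   # (source id, target id) -> times the link was followed

    def add_object(self, obj_id, pointer):
        self.pointers[obj_id] = pointer
        self.links.setdefault(obj_id, set())

    def link(self, source, target):
        if source not in self.pointers or target not in self.pointers:
            raise KeyError("both objects must exist before they can be linked")
        self.links[source].add(target)

    def follow(self, source, target):
        """Traverse a link, increment its counter, and return the target's pointer."""
        if target not in self.links.get(source, set()):
            raise KeyError("no such link")
        key = (source, target)
        self.weights[key] = self.weights.get(key, 0) + 1
        return self.pointers[target]

    def neighbours(self, source):
        """Targets of source, most-followed first (ties broken by id)."""
        targets = self.links.get(source, set())
        return sorted(targets, key=lambda t: (-self.weights.get((source, t), 0), t))


db = AssociationDB()
db.add_object("title:1", "opac://title/1")
db.add_object("item:1a", "opac://item/1a")
db.add_object("item:1b", "opac://item/1b")
db.link("title:1", "item:1a")
db.link("title:1", "item:1b")
db.follow("title:1", "item:1b")
db.follow("title:1", "item:1b")
db.follow("title:1", "item:1a")
ranked = db.neighbours("title:1")
```

Read this way, the counters act as a usage score per link, so a search structure built on top could offer the most-travelled associations first.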
Re: [CODE4LIB] Something completely different
I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC: what would an alternative bibliographic data format look like if it were designed with the intent of opening access to the data in our ILS systems to developers in a more informal manner? I was thinking of an XML format that a developer could work with without formal training, the basics of which could be learned in an hour, and which could reasonably represent the essential fields of the 90% of records that are most likely to be viewed by a public library patron. In my mind, such a format would allow creators of community-based web sites to pull data from their local library and repurpose it without having to learn a lot of arcane formats (e.g. MARC) or esoteric protocols (e.g. Z39.50). The sacrifice, of course, would be losing some of the richness MARC allows, but I think in many common situations the really complex records are not what patrons are interested in. You may want to consider prototyping this in your application. I see such an effort as vital in making our systems relevant in future computing environments, and I am skeptical that a simple, workable solution would come out of the initial efforts of a standardization committee. Just my 2 cents.

- David

--- David Cloutman dclout...@co.marin.ca.us Electronic Services Librarian Marin County Free Library

-Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Peter Schlumpf Sent: Sunday, April 05, 2009 8:40 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Something completely different

Greetings! I have been lurking on (or ignoring) this forum for years. And libraries too. Some of you may know me. I am the Avanti guy. I am, perhaps, the first person to try to produce an open source ILS back in 1999, though there is a David Duncan out there who tried before I did. I was there when all this stuff was coming together. Since then I have seen a lot of good things happen. There's Koha. There's Evergreen. They are good things. I have also seen first hand how libraries get screwed over and over by commercial vendors with their crappy software. I believe free software is the answer to that. I have neglected Avanti for years, but now I am ready to return to it. I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Sometimes the biggest prison is between the ears. I am in a position to do this now, and that's what I have decided to do. I am getting busy. Peter Schlumpf

Email Disclaimer: http://www.co.marin.ca.us/nav/misc/EmailDisclaimer.cfm
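For anyone who wants something concrete to poke at, here is a minimal sketch in Python of the kind of flat XML record David describes above. The element names (record, title, creator, published, format, subject) are invented for illustration and do not come from any existing standard; the point is only that a developer could read such a record with nothing beyond the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: every element name here is an assumption made up
# for this example, not part of MARC, Dublin Core, or any real schema.
RECORD_XML = """
<record id="example-1">
  <title>Moby Dick</title>
  <creator>Melville, Herman</creator>
  <published>1851</published>
  <format>book</format>
  <subject>Whaling</subject>
  <subject>Sea stories</subject>
</record>
"""

def parse_record(xml_text):
    """Flatten one <record> into a dict; every field maps to a list of strings."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id")}
    for child in root:
        # Repeated elements (e.g. several <subject>) accumulate in one list.
        record.setdefault(child.tag, []).append((child.text or "").strip())
    return record

record = parse_record(RECORD_XML)
print(record["title"][0], record["subject"])
```

Whether a handful of repeatable flat fields really covers 90% of what patrons look at is exactly the open question; the sketch only shows that consuming such a format would need no special training.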
Re: [CODE4LIB] Something completely different
On Sun, 5 Apr 2009, Peter Schlumpf wrote: [trimmed] I want to get back to simple things. Imagine if there were no Marc records. Minimal layers of abstraction. No politics. No vendors. No SQL straightjacket. What would an ILS look like without those things? Sometimes the biggest prison is between the ears. Perhaps a slightly different perspective on looking at requirements: What should be easier to do, but is a pain currently? -Joe
Re: [CODE4LIB] Something completely different
The linked open data crowd might suggest: Bibliographic Ontology Specification (aka bibo) http://bibliontology.com/ Abstract: The Bibliographic Ontology Specification provides main concepts and properties for describing citations and bibliographic references (i.e. quotes, books, articles, etc) on the Semantic Web. A lot of work has gone into this to make it work with a wide variety of possible use cases. It acknowledges FRBR, but doesn't require it. The Swedish national library uses a tiny fraction of BIBO, along with DC and other RDF vocabularies. BIBO as a whole is much more granular than MARC, but whether that makes it more or less suited as a library format probably depends on who you are. Tom On Sun, Apr 5, 2009 at 11:40 AM, Peter Schlumpf pschlu...@earthlink.net wrote: [trimmed]
Re: [CODE4LIB] Something completely different
On Mon, 6 Apr 2009, Jonathan Rochkind wrote:

Joe Hourcle wrote: Perhaps a slightly different perspective on looking at requirements: What should be easier to do, but is a pain currently?

My answers to that won't point to a more simplified data structure I think some are hoping for.
1. For a serial title, identify if a particular issue of that serial is held, and where.
2. Group alternate editions of the same work.
3. Identify the form/genre of an item.
There are more. This is what immediately comes to mind. Most of these issues are issues of metadata control, and not trivial to solve.

The really sad thing is -- I agree with every one of those items, as they're problems that I run into with my own (non-bibliographic) data. I've got a few others that might not be as useful in libraries:*

Identify items contained in more than one catalog: e.g., Hugo award winners that were also on the NYT best sellers list.

Identify the lack of correlation between catalogs: e.g., identify items in this week's NYT best sellers list that we _don't_ have at our library.

Group (or filter) results by similar form and/or processing: e.g., all unabridged audio books in English on CD; large print books in Spanish in hardback; etc. (or, as an alternative, allow users to set preferences to select their preferred formats).

(I don't follow ILS features, so it's possible that some might be able to handle these, but when I've talked to folks in the past, their responses have seemed to suggest that I have an odd way of looking at records.)

-Joe

* Examples are approximations ... I'm actually looking for records such as:

Find periods of time where there was coronal dimming before a coronal mass ejection.

Find where there are entries in the LASCO/CME catalog that aren't in the CACTUS catalog (and vice versa).

Only show Level0 data if there isn't associated Level1 data for the observation; only show reduced data if there's no full resolution data; use SOHO/EIT data unless there's a gap of more than 1 hr, then fill using data from the highest resolution EUV telescope in the sun-earth line.
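The cross-catalog questions above are, at bottom, set operations, provided the catalogs share an identifier. A minimal sketch in Python follows; the identifiers are placeholders and the assumption that all three lists are keyed the same way is mine. In practice, getting to a shared key is the hard metadata-control part.

```python
# Hypothetical identifiers; a real comparison would need a key that all
# catalogs agree on (ISBN, work ID, observation ID, ...).
hugo_winners = {"id:A", "id:B", "id:C"}
nyt_bestsellers = {"id:B", "id:C", "id:D"}
library_holdings = {"id:A", "id:B"}

# Items contained in more than one catalog (intersection).
in_both = hugo_winners & nyt_bestsellers

# Best sellers the library does NOT hold (difference).
not_held = nyt_bestsellers - library_holdings

# Entries in one catalog but not the other, in either direction
# (symmetric difference), i.e. the "and vice versa" case.
only_in_one = hugo_winners ^ nyt_bestsellers

print(sorted(in_both), sorted(not_held), sorted(only_in_one))
```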
Re: [CODE4LIB] Something completely different
Alex Dolski wrote: I know that a large percentage of the data in our MARC records is not being used for finding/gathering or even display, so in that case, what good is it?

This is, of course, a chicken and egg thing. The reason why a lot of MARC data remains inconsistent is precisely because it is not being used for finding or display. Anyone who has worked with a faceted search application has seen this in action. As soon as you aggregate subject headings, genre designations, etc., into facets you begin to see all kinds of data problems that you never noticed before because they are scattered among thousands of records that previously could only be viewed individually. Of course, fixing bad or inconsistent data is probably an order of magnitude easier than adding data to records after the fact.

--Dave

== David Walker Library Web Services Manager California State University http://xerxes.calstate.edu

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Alex Dolski [alex.dol...@unlv.edu] Sent: Monday, April 06, 2009 10:38 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Something completely different

I think Dublin Core XML is an excellent attempt at what you're talking about if you want to consider it a bibliographic data format, which I guess could be one of its many uses. I know that a large percentage of the data in our MARC records is not being used for finding/gathering or even display, so in that case, what good is it? There is a lot of richness in those records, but it's so all-over-the-place that whatever value it might have had gets killed by all the inconsistency. In my experience, good, consistent metadata that captures the essence of an object is more useful than highly-detailed, inconsistent metadata (which all highly-detailed metadata tends to be) in a fine-grained element set. I think there may be a cultural element to this as well, in that IR people think of metadata in terms of its utility for IR purposes (at which DC tends to be extremely practical) and catalogers think of it as a thorough-as-possible description of an object (at which DC is quite inadequate).

Alex

Cloutman, David wrote: [trimmed]
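Dave Walker's point about facets exposing inconsistent data is easy to demonstrate. The following is a small Python sketch under made-up data: the records, the "genre" field name, and the normalization rule are all assumptions for illustration. Counting the raw values shows the variants side by side, which is exactly what never happens when records are viewed one at a time.

```python
from collections import Counter

# Hypothetical records with the kind of variation that accumulates
# when a field is never displayed or searched.
records = [
    {"title": "A", "genre": ["Science fiction"]},
    {"title": "B", "genre": ["science fiction."]},
    {"title": "C", "genre": ["Sci-fi"]},
    {"title": "D", "genre": ["Science fiction"]},
]

def facet_counts(records, field):
    """Count how many times each raw value of `field` occurs."""
    counts = Counter()
    for rec in records:
        counts.update(rec.get(field, []))
    return counts

def normalize(value):
    """Very crude cleanup: trim, drop trailing period, lowercase."""
    return value.strip().rstrip(".").lower()

raw = facet_counts(records, "genre")

# Fold the raw facet values together after normalization.
folded = Counter()
for value, n in raw.items():
    folded[normalize(value)] += n

print(raw)
print(folded)
```

Note that the crude normalization merges the case and punctuation variants but still leaves "sci-fi" as a separate facet value; that kind of synonym needs authority control rather than string cleanup.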
Re: [CODE4LIB] Something completely different
Well, the future of the ILS is to use general computing standards rather than creating library-specific ones. Essentially, from a computing theory point of view, a graph is the way to represent all the information (a graph can represent a tree or a line; when you look at MARC, it is a linear model). Graphs are powerful, but graph theory can be difficult and extremely complex, and some graph problems are NP-hard. I think that RDF-based standards (DC? Something else? Or maybe there is no need for just one metadata standard) can be used to maximize interoperability, allow further information discovery, and at the same time provide suitable description for different types of materials. Yan -Original Message- From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Karen Coyle Sent: Monday, April 06, 2009 10:49 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Something completely different [trimmed]
Re: [CODE4LIB] Something completely different
On Mon, Apr 6, 2009 at 2:17 PM, Karen Coyle li...@kcoyle.net wrote: My problem with bibo is that it's strongly oriented toward academic journal articles... I would like to see a comparison to MARC, if anyone has done that, which might give us an idea of what isn't there. For example, I don't see the various work/work, work/expression relationships. But it has great detail in some areas, like time intervals and access rights.

Well, I'm not sure I agree with the assessment that it's geared towards academic journals... there's been a lot of work towards all kinds of citations, esp. court cases and whatnot. See the examples: http://wiki.bibliontology.com/index.php/Examples

As far as not including FRBR, BIBO doesn't have to, because the FRBR vocabs, http://vocab.org/frbr/core.html and http://vocab.org/frbr/extended.html, already do. This way BIBO can focus on describing citations, FRBR can focus on work/expression/manifestation/item relationships, and other vocabularies can focus on other attributes (size, location, circ status, whatever). This is part of the flexibility of RDF: the ability to pick and choose among schemas to describe resources however you need to.

-Ross.
Re: [CODE4LIB] Something completely different
It is designed as a container for citations. Articles are one such example, but that well-understood format is not BIBO's main focus. They've been going after the tough ones, including legal cases, conference presentations, letters, etc. Oh, yeah, books, book chapters, quotations. For a partial list, see http://wiki.bibliontology.com/index.php/Examples On Mon, Apr 6, 2009 at 2:17 PM, Karen Coyle li...@kcoyle.net wrote: My problem with bibo is that it's strongly oriented toward academic journal articles... I would like to see a comparison to MARC, if anyone has done that, which might give us an idea of what isn't there. For example, I don't see the various work/work, work/expression relationships. But it has great detail in some areas, like time intervals and access rights. kc Tom Keays wrote: [trimmed]
Re: [CODE4LIB] Something completely different
Cloutman, David wrote: I'm open to seeing new approaches to the ILS in general. A related question I had the other day, speaking of MARC, is what would an alternative bibliographic data format look like if it was designed with the intent for opening access to the data our ILS systems to developers in a more informal manner? I was thinking of an XML format that a developer could work with without formal training, [trimmed]

Well, speaking of 'without formal training' -- I posted this to the Open Library technology list, but using the OL, which is triple-based and open access, I was able to create a simple demo Pipe showing how you could determine the earliest date of publication of a book (with an interest in looking at potential copyright status). The caveat is that the API I'm using is still pretty stubby, so it only retrieves on exact title (this will be fixed sometime in the future). The pipe is here: http://pipes.yahoo.com/pipes/pipe.info?_id=216efa8c3b04764ca77ad181b1cc66e4

kc

-- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234
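The logic of the earliest-publication-date Pipe described above can be sketched offline in a few lines of Python. This is not the Pipe itself and makes no calls to the Open Library API; the edition records, the "publish_date" field name, and the year-extraction rule are all assumptions for illustration. It does show why date strings in bibliographic data need some parsing before they can be compared.

```python
import re

# Hypothetical edition records for one exact title; values are made up
# to mimic the messy date strings found in catalog data.
editions = [
    {"title": "Moby Dick", "publish_date": "1930"},
    {"title": "Moby Dick", "publish_date": "c1851"},
    {"title": "Moby Dick", "publish_date": "New York, 1892."},
    {"title": "Moby Dick", "publish_date": ""},
]

# A four-digit year from 1500 to 2099 that is not part of a longer number.
YEAR = re.compile(r"(?<!\d)(1[5-9]\d\d|20\d\d)(?!\d)")

def year_of(date_text):
    """Return the first plausible year in a free-text date, or None."""
    match = YEAR.search(date_text)
    return int(match.group(1)) if match else None

def earliest_year(editions, title):
    """Earliest parseable year among editions whose title matches exactly."""
    years = [year_of(e["publish_date"]) for e in editions if e["title"] == title]
    years = [y for y in years if y is not None]
    return min(years) if years else None

print(earliest_year(editions, "Moby Dick"))
```

Like the Pipe, this matches on exact title only, so variant titles of the same work are missed; that limitation is the same one noted in the message.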
Re: [CODE4LIB] Something completely different
Ross Singer wrote: On Mon, Apr 6, 2009 at 2:17 PM, Karen Coyle li...@kcoyle.net wrote: My problem with bibo is that it's strongly oriented toward academic journal articles... I would like to see a comparison to MARC, if anyone has done that, which might give us an idea of what isn't there. For example, I don't see the various work/work, work/expression relationships. But it has great detail in some areas, like time intervals and access rights. Well, I'm not sure I agree with the assessment that it's geared towards academic journals... there's been a lot of work towards all kinds of citations, esp. court cases and whatnot. See the examples: http://wiki.bibliontology.com/index.php/Examples Still looks pretty limited to me. What academics cite isn't a full bibliographic universe. No music, no films, no way to do realia. And citing isn't the same as bibliographic description. Don't get me wrong, I think it's very complete as a citation format, I just don't think it meets other needs. The right tool for the job... and all that. As far as not including FRBR, BIBO doesn't have to, because the FRBR vocabs: http://vocab.org/frbr/core.html and http://vocab.org/frbr/extended.html already do. This way BIBO can focus on describing citations, FRBR can focus on work/expression/manifestion/item relationships and other vocabularies can focus on other attributes (size, location, circ status, whatever). Somehow, though, they have to work together, at least where they are describing the same thing. I think the interaction between things like FRBR/Work and citation is interesting and complex. The RDA Online effort is working to allow you to assign particular data elements to FRBR entities through application profiles -- thus you can have a 'work title' which may be different to the 'manifestation title.' 
No one uses these differences in citations, but then again we haven't yet used them in library catalogs -- both citations and current library cataloging limit themselves to describing manifestations. However, if you are writing a literary criticism of Moby Dick you probably aren't only referring to a particular manifestation, but to the work as a whole. Right now, citation standards don't address this. Also note that IFLA is registering the FRBR vocabulary in the metadataregistry.org registry. I suspect it will look different to the one at vocab.org, although I haven't looked at the IFLA trial version in comparison to the one at vocab.org. Presumably FRAD will also be registered by IFLA in the same way. kc This is part of the flexibility of RDF, the ability to pick and choose among schemas to describe resources however you need to. -Ross. -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234
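The work title versus manifestation title distinction discussed above can be made concrete with a toy model. This Python sketch is a heavy simplification: it collapses the FRBR expression level, the class and field names are mine rather than RDA's or FRBR's, and the edition details are illustrative values. It only shows that a citation could point at either level.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Manifestation:
    title: str       # the title as it appears on this particular publication
    publisher: str
    year: int

@dataclass
class Work:
    title: str       # the title by which the work as a whole is known
    manifestations: List[Manifestation] = field(default_factory=list)

def cite(work: Work, manifestation: Optional[Manifestation] = None) -> str:
    """Cite the work as a whole, or one specific manifestation of it."""
    if manifestation is None:
        return work.title
    return f"{manifestation.title} ({manifestation.publisher}, {manifestation.year})"

# Illustrative values only.
work = Work(title="Moby Dick")
work.manifestations.append(
    Manifestation("Moby-Dick; or, The Whale", "Harper & Brothers", 1851))
work.manifestations.append(
    Manifestation("The Whale", "Richard Bentley", 1851))

print(cite(work))                          # the literary-criticism case
print(cite(work, work.manifestations[1]))  # the usual citation/catalog case
```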
Re: [CODE4LIB] Something completely different
The TEI format does a decent job of representing bibliographic information. The TEI approach is to treat all instances of bibliographic reference as similarly as possible. So the title page of a work, the reference markers in the text and the references at the end of the work are all described in the same conceptual framework. http://www.tei-c.org/release/doc/tei-p5-doc/html/CO.html#COBI The TEI is modular, so you can declare whether you're commingling bibliographic tags with part-of-speech, manuscript description and other kinds of tags. cheers stuart Cloutman, David wrote: [trimmed] -- Stuart Yeates http://www.nzetc.org/ New Zealand Electronic Text Centre http://researcharchive.vuw.ac.nz/ Institutional Repository
Re: [CODE4LIB] Something completely different
On Mon, Apr 6, 2009 at 3:42 PM, Karen Coyle li...@kcoyle.net wrote: Still looks pretty limited to me. What academics cite isn't a full bibliographic universe. No music, no films, no way to do realia. And citing isn't the same as bibliographic description. Don't get me wrong, I think it's very complete as a citation format, I just don't think it meets other needs. The right tool for the job... and all that.

But again, that's where RDF comes in; they even address the other ontologies to defer to: http://wiki.bibliontology.com/index.php/Development_Brainstorming#Possible_ontologies_to_Reuse

Karen Coyle wrote: Somehow, though, they have to work together, at least where they are describing the same thing.

Right, but that's how it would work. If these resources were modeled in RDF, they'd have URIs. To say 'bibliographic things' you'd use bibo attributes with the URI. To say work-grouping things you'd use FRBR/FRAR attributes with the URI. So as long as they're using the same URIs, they're describing the same thing.

-Ross.
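The "same URI, different vocabularies" idea can be illustrated without any RDF tooling by treating statements as plain (subject, predicate, object) tuples. In this Python sketch the URIs are example.org placeholders, and the prefixed property names are illustrative stand-ins that I have not checked against the published BIBO or FRBR vocabularies. What matters is that two independently produced sets of statements merge into one description purely because they share a subject URI.

```python
from collections import defaultdict

BOOK = "http://example.org/book/1"   # hypothetical URI for one resource
WORK = "http://example.org/work/1"   # hypothetical URI for its work

# Statements from a citation-oriented source (property names illustrative).
citation_triples = [
    (BOOK, "dcterms:title", "Moby Dick"),
    (BOOK, "bibo:isbn", "0000000000"),
]

# Statements from a source concerned only with work groupings.
frbr_triples = [
    (BOOK, "frbr:embodimentOf", WORK),
]

def merge(*graphs):
    """Combine triple lists into {subject: {predicate: [objects]}}."""
    merged = defaultdict(lambda: defaultdict(list))
    for graph in graphs:
        for subject, predicate, obj in graph:
            merged[subject][predicate].append(obj)
    return merged

description = merge(citation_triples, frbr_triples)[BOOK]
for predicate, values in sorted(description.items()):
    print(predicate, values)
```

Nothing in the merge step knows or cares which vocabulary a predicate came from, which is the flexibility being argued for; it also does nothing to settle which properties belong to which FRBR entity.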
Re: [CODE4LIB] Something completely different
Ross Singer wrote: Right, but that's how it would work. If these resources were modeled in RDF, they'd have URIs. What you would do is to say 'bibliographic things' you'd use bibo attributes with the URI. To say work grouping things you'd use FRBR/FRAR attributes with the URI. So as long as they're using the same URIs, they're describing the same thing. OK, Now I think I see where we're missing each other. Right now, IFLA is not thinking about registering (or creating identifiers for) the FRBR attributes, just the entities and relationships. I'm not sure that the attributes make the cut... and they aren't the same as the properties that RDA has defined. RDA properties have been assigned to particular FRBR entities (Groups 1 and 2 only, since RDA didn't do Group 3) in the RDA documentation, but there isn't complete agreement within the cataloging community as to which properties go with which FRBR Group 1 entities. So what RDA online is experimenting with is applying the FRBR entities as classes to RDA properties in an application profile that brings together RDA 'data elements' (properties in RDF) and FRBR entities (classes in RDF). (I haven't seen the result yet in the http://metadataregistry.org so I'm unclear on how the FRBR relationships will be used. I think they've been registered as properties.) I'm not at all sure what will happen with the FRBR attributes that are in the FRBR document, but they seem to have been rejected by the JSC in the RDA process. Nor can I figure out what's going to happen when the FRAD draft is made official. FRAD essentially includes all of FRBR plus some other properties and relationships. Now bibo has many attributes that might be the same as RDA attributes, or that could at least have some meaning within the FRBR defined classes. FRBR entities could be used with bibo, if the idea for RDA in the http://metadataregistry.org works, by creating an application profile for bibo + FRBR classes. 
kc -- --- Karen Coyle / Digital Library Consultant kco...@kcoyle.net http://www.kcoyle.net ph.: 510-540-7596 skype: kcoylenet fx.: 510-848-3913 mo.: 510-435-8234
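To make the "same URI, two vocabularies" idea concrete, here is a minimal sketch in plain Python, with triples written as tuples rather than with any RDF library. The URIs are invented for illustration, and the prefixed names (bibo:Book, frbr:Manifestation, frbr:embodimentOf) are stand-ins that should be checked against the published vocabularies before being relied on.

```python
from collections import defaultdict

# Hypothetical URIs, for illustration only.
BOOK = "http://example.org/item/zauberberg"
EXPRESSION = "http://example.org/expression/zauberberg-de"

# One source describes the resource as a "bibliographic thing".
bibo_triples = [
    (BOOK, "rdf:type", "bibo:Book"),
    (BOOK, "dcterms:title", "Der Zauberberg"),
]

# Another source describes the same URI as a "work grouping thing".
frbr_triples = [
    (BOOK, "rdf:type", "frbr:Manifestation"),
    (BOOK, "frbr:embodimentOf", EXPRESSION),
]


def merge_by_subject(*graphs):
    """Group (subject, predicate, object) triples from several sources by subject URI."""
    merged = defaultdict(lambda: defaultdict(list))
    for graph in graphs:
        for subject, predicate, obj in graph:
            merged[subject][predicate].append(obj)
    return merged


description = merge_by_subject(bibo_triples, frbr_triples)
for predicate, values in description[BOOK].items():
    print(predicate, values)
```

Because both sources use the same subject URI, the statements land in one description without either vocabulary needing to know about the other. Whether the two sets of types are actually compatible is a separate question, which is what the application profile would have to settle.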
Re: [CODE4LIB] Something completely different
Sorry, spoke/wrote too soon. FRBR at vocab.org isn't using the FRBR attributes either. And it does have the entities as classes. I'm still not sure how one can model a relationship between RDA or bibo properties and FRBR Group 1 entities and their properties. RDA tries to assign descriptive properties (like 'title' and 'place of publication') to particular FRBR Group 1 entities, which I think doesn't work.

OK, I'm off to think about this some more. With some BIG pieces of paper.

kc
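As a rough illustration of what "assigning a property to an entity" amounts to in RDF terms, here is a sketch of the RDFS domain rule in plain Python: if a property is declared with a domain class, anything that uses the property is inferred to be a member of that class. The ex: property and class names and the domain assignments below are hypothetical, not taken from the RDA or FRBR registrations.

```python
# Hypothetical property definitions: each property names the class in its domain.
PROPERTY_DOMAINS = {
    "ex:title": "ex:Manifestation",
    "ex:placeOfPublication": "ex:Manifestation",
}


def infer_types(triples, domains):
    """Apply the RDFS domain rule: if p has domain C and (s, p, o) holds, then s is a C."""
    inferred = set()
    for subject, predicate, _ in triples:
        if predicate in domains:
            inferred.add((subject, "rdf:type", domains[predicate]))
    return inferred


triples = [
    ("ex:thing1", "ex:title", "Magic Mountain"),
    ("ex:thing1", "ex:placeOfPublication", "y"),
    # dcterms:title has no domain declared here, so nothing is inferred for thing2.
    ("ex:thing2", "dcterms:title", "Der Zauberberg"),
]

inferred = infer_types(triples, PROPERTY_DOMAINS)
print(sorted(inferred))
```

The consequence is that a property tied to one class cannot be reused on another kind of thing without that thing also being inferred into the class, which is one way to see why binding descriptive properties to Group 1 entities is a heavy commitment.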
Re: [CODE4LIB] Something completely different
Karen Coyle wrote:
> Sorry, spoke/wrote too soon. FRBR at vocab.org isn't using the FRBR attributes either. And it does have the entities as classes. I'm still not sure how one can model a relationship between RDA or bibo properties and FRBR Group 1 entities and their properties. RDA tries to assign descriptive properties (like 'title' and 'place of publication') to particular FRBR Group 1 entities, which I think doesn't work.

I'm curious why you think that doesn't work? Isn't place of publication a characteristic of a particular manifestation? And title, according to traditional library practice, where you take it from the title page, is also a characteristic of a particular manifestation, is it not? (Uniform title is _usually_ a characteristic of a work, unless we get into music cataloging and some other 'edge' cases. Our traditional practices -- which aren't actually changed that much by RDA -- are rather confusing.)

I am confused about what one would do about the fact that RDA defines attributes a bit differently than FRBR itself does. It's not too surprising -- FRBR is really just a draft, hardly tested in the world. When RDA tried to make it a bit more concrete, it's not surprising that they found they had to make changes to make it workable. Not sure what to do about that in the grand scheme of things, if RDA and FRBR both end up registering different vocabularies. I guess we'll just have two different vocabularies, though, which isn't too shocking.

Jonathan
Re: [CODE4LIB] Something completely different
Jonathan Rochkind wrote:
> I'm curious why you think that doesn't work? Isn't place of publication a characteristic of a particular manifestation? And title, according to traditional library practice, where you take it from the title page, is also a characteristic of a particular manifestation, is it not? (Uniform title is _usually_ a characteristic of a work, unless we get into music cataloging and some other 'edge' cases. Our traditional practices -- which aren't actually changed that much by RDA -- are rather confusing.)

Well, I was responding to Ross' statement that bibo and FRBR could be used in combination, depending on whether one was at that moment describing 'bibliographic things' or 'work things'. bibo doesn't have a uniform title, so the question is: can you use a bibo title and say that it is a work title? I thought that Ross was indicating something of that nature -- that you could have a FRBR 'work thing' with bibo properties. I'm trying to understand how that works, since Work is a class. Don't you have to indicate the domain and range of a property in its definition?

RDA tries to solve this by creating different properties for every concept+FRBR entity: title of the work (Work), title proper (Manifestation). [I don't understand why expressions don't have titles; a translation is an expression, after all.]

> I am confused about what one would do about the fact that RDA defines attributes a bit differently than FRBR itself does. It's not too surprising -- FRBR is really just a draft, hardly tested in the world. When RDA tried to make it a bit more concrete, it's not surprising that they found they had to make changes to make it workable. Not sure what to do about that in the grand scheme of things, if RDA and FRBR both end up registering different vocabularies. I guess we'll just have two different vocabularies, though, which isn't too shocking.

I'm not sure there's anything to do, but I do know that the developers of RDA feel very strongly that in RDA they have 'implemented' FRBR, so we have to find a way to integrate FRBR and RDA in the registered RDA vocabulary. I agree that there's no problem with having RDA and FRBR as two different vocabularies; it's the effort of bringing them together that boggles me. I feel like it leaves a lot of loose ends. I'd be happy to see FRBR revised, or to have it re-defined without the attributes, thus allowing metadata developers to use the bibliographic relationship properties with any set of descriptive elements.

I'm having trouble with the FRBR Group 1 entities as classes. I see them instead as relationships, and vocab.org does seem to treat them as relationships, not as 'things.' I see a distinct difference between a person entity and a work entity, because there is no thing that is a work. I see work as a relationship between two bibliographic statements. (This is vague in my mind, so I won't be surprised if it doesn't make sense.)

As an example, if I have a group of bibliographic properties, say an author and a title, and I say:

    Magic Mountain, by Thomas Mann -- expresses -- Der Zauberberg, by Thomas Mann

then I have created an 'expression to work' relationship, and so Der Zauberberg is a Work. If I do this, I don't need an explicit Work title. If I have a badly created Manifestation that has on its title page 'Magic Mountian', I can do:

    Magic Mountian, published by x in y -- manifests -- Magic Mountain, by Thomas Mann -- expresses -- Der Zauberberg, by Thomas Mann

In this way, I don't have to declare different title elements with different domains/ranges (which is essentially what RDA does in an awkward way) to connect them to the FRBR Group 1 classes, and the FRBR properties become more usable because you don't have to declare your bibliographic properties in terms of the FRBR classes.

Now, IF you can use any properties, say, dcterms:title, with the FRBR properties, like 'manifests', then the whole thing is solved. I think it works that way, but that is definitely NOT what RDA has done; it has incorporated the domain (FRBR class) in the bibliographic properties. I think that what I describe above in my examples works; and if it does, then the problem is with RDA.

In the end, it's the relationship between properties and classes in FRBR and RDA that is giving me a headache, and the headache mainly has to do with FRBR Group 1. I think this is my bête noire, and so I will now go read something soothing and let my blood pressure drop a bit.

kc
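The relationship-first reading described in this message can be sketched in plain Python, as one possible interpretation rather than anything FRBR or RDA specifies. Each description carries only generic properties, and the Work/Expression/Manifestation role is read off the position a description holds in a 'manifests' or 'expresses' link. The identifiers m1, e1 and w1, the unprefixed 'publisher' and 'place' keys, and the rule table are hypothetical.

```python
# Each description is a bag of generic properties; no FRBR class is declared on it.
descriptions = {
    "m1": {"dcterms:title": "Magic Mountian", "publisher": "x", "place": "y"},
    "e1": {"dcterms:title": "Magic Mountain", "dcterms:creator": "Thomas Mann"},
    "w1": {"dcterms:title": "Der Zauberberg", "dcterms:creator": "Thomas Mann"},
}

# The chain from the example: m1 -- manifests -- e1 -- expresses -- w1.
relationships = [
    ("m1", "manifests", "e1"),
    ("e1", "expresses", "w1"),
]

# Hypothetical rule table: the role implied for each end of a relationship.
ROLE_RULES = {
    "manifests": ("Manifestation", "Expression"),
    "expresses": ("Expression", "Work"),
}


def roles_from_relationships(links, rules):
    """Derive each description's role from the relationships it takes part in."""
    roles = {}
    for source, relation, target in links:
        source_role, target_role = rules[relation]
        roles.setdefault(source, set()).add(source_role)
        roles.setdefault(target, set()).add(target_role)
    return roles


roles = roles_from_relationships(relationships, ROLE_RULES)
for key in descriptions:
    print(descriptions[key]["dcterms:title"], "->", sorted(roles[key]))
```

Note that all three descriptions use the same dcterms:title property. Nothing in the title element says which Group 1 entity it belongs to; that information lives entirely in the relationships, which is the contrast with RDA's per-entity title properties.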