Re: [CODE4LIB] Metadata
Great discussion! My cataloger heart says that you can never have enough, but my manager brain says that you only have a limited amount of resources and time to dedicate. ;cD I agree with Diane saying that metadata is not static. Your use cases will change in the future due to changed user needs or new discovery layer functionality, etc. The most important thing is getting a good core set of fields/elements in place at the beginning (keeping use cases and system functionality in mind) and evolving from there. You also might want to keep in mind how much work it will be if you add metadata fields in the future. Will you be able to easily add information for the new fields in existing metadata records? An indirect comment - if you find yourself using fields now that you end up not using in the near future due to search functionality/changes, do not delete [or ignore, for that matter] those fields. 9 times out of 10 you will find yourself wanting to use that metadata again in some shape or form, and if you delete it, you're stuck with basically spending resources recreating that metadata. Thanks, Becky, who is trying to cultivate her seedling EAC-CPF records for her POW's new Islandora IR - Becky Yoose Systems Librarian Grinnell College www.libcatcode.org - catalogers and coders Q&A under one site
[CODE4LIB] Job Posting: Electronic Resources Librarian, University of Notre Dame Hesburgh Libraries
Electronic Resources Librarian February 13, 2012 The University of Notre Dame Hesburgh Libraries seeks a knowledgeable, creative and dynamic individual for the position of Electronic Resources Librarian. Working in a collaborative team environment, this position will be responsible for improving discoverability of electronic resources tools and services as well as ongoing electronic resources management and support. The position reports to the Head, Acquisitions, Resources and Discovery Services Department in the Information Systems and Digital Access Division. The Electronic Resources Librarian works collaboratively with stakeholders in the Libraries to improve and enhance access to growing electronic resources collections. The preferred candidate will have demonstrated a strong interest in electronic resources management, emerging technologies, and user information needs. Primary Responsibilities: - Provides professional leadership and expertise in management of technologies used to support and enhance access to electronic resources - As part of the electronic resources support team, assists with all aspects of electronic resources support. Communicates with library users, publishers, vendors and library staff to resolve access problems - Works collaboratively with stakeholders to coordinate and lead improvements to existing electronic resources tools - Works collaboratively with stakeholders to develop new services to improve user access to electronic resources - Works collaboratively with stakeholders to integrate electronic resources tools and services into the Libraries website - Provides guidance and expertise within the Libraries for RefWorks, including advanced user support and training for faculty and students. 
Liaison with campus IT departments regarding RefWorks - Assists with management of electronic resources using CORAL Electronic Resources Management System - Monitors trends and best practices in library resource access and discoverability Experience and Qualifications Minimum qualifications: MLS degree from ALA-accredited program or non-U.S. equivalent. Required: - Electronic resources experience - Knowledge of trends and applications in electronic resource management - Strong service orientation - Excellent oral, written, and interpersonal communication skills - Ability to balance multiple projects and set priorities in a time-sensitive environment - Enthusiasm for the fast-paced, evolving nature of electronic resources - Familiarity with OpenURL resolver, federated search, EZproxy, ILS and ERM Management - The ability to work in a highly collaborative and team-oriented environment as well as the ability to take a leadership role in group activities Preferred: - Experience with programming, preferably with PHP, Ruby, or other scripting language - Experience with vendors and content providers, including subscription agents, publishers, and library consortia Environment: The University of Notre Dame is a highly selective national Catholic teaching and research university in northern Indiana about ninety miles from Chicago. Approximately 8,200 undergraduates and 3,100 graduate students pursue a broad range of studies. The University Libraries http://www.library.nd.edu hold about 3 million volumes and provide access to more than 23,000 serials. The Libraries have 140 staff and 55 librarians. The Libraries is a member of the Academic Libraries of Indiana (ALI), ARL, NERL and other consortia. The University of Notre Dame is an Equal Opportunity/Affirmative Action Employer strongly committed to diversity. We value qualified candidates who can bring a variety of backgrounds to our community. 
Further details and applications: More information can be found about this position at the Libraries’ website: http://www.library.nd.edu/about/employment/ To apply, please include a letter, curriculum vitae and the names, addresses, phone numbers and email addresses of three references. Electronic submission of applications is required. Send all application documents to: msten...@nd.edu The review of applications will begin on March 11, 2012 and will continue until the candidates are chosen.
[CODE4LIB] Touch Screens in the Library
Hi All, I was wondering if anyone has implemented (or plans to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. Thanks, Cynthia
Re: [CODE4LIB] Touch Screens in the Library
NCSU has done some work you might be interested in. See this article: Lessons in Public Touchscreen Development by Andreas K. Orphanides In October 2010, the NCSU Libraries debuted its first public touchscreen information kiosk, designed to provide on-demand access to useful and commonly consulted real-time displays of library information. This article presents a description of the hardware and software development process, as well as the rationale behind a variety of design and implementation decisions. This article also provides an analysis of usage of the touchscreen since its debut, including a numerical analysis of most popular content areas, and a heatmap-based analysis of user interaction patterns with the kiosk's interface components. http://journal.code4lib.org/articles/5832 -Tod Tod Olson t...@uchicago.edu Systems Librarian University of Chicago Library On Feb 13, 2012, at 9:50 AM, Cynthia Ng wrote: Hi All, I was wondering if anyone has implemented (or plan to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. Thanks, Cynthia
Re: [CODE4LIB] Metadata
My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata, I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
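The point about sampling rates and event markers being "important parameters in running an analysis at all" can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Recording` type, its field names, and the sample values are hypothetical and do not come from any real dataset or tool mentioned in the thread.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Recording:
    """Hypothetical container; every field name here is illustrative."""
    participant_id: str
    sampling_rate_hz: float            # usually filed under "metadata"
    samples_mmhg: list[float]          # usually filed under "data"
    event_markers_s: dict[str, float] = field(default_factory=dict)

    def window_mean(self, event: str, duration_s: float) -> float:
        """Mean pressure over duration_s seconds starting at the named event."""
        # Without the sampling rate and the event marker there is no way
        # to turn "1.5 seconds after the stimulus" into list indices.
        start = round(self.event_markers_s[event] * self.sampling_rate_hz)
        stop = start + round(duration_s * self.sampling_rate_hz)
        return mean(self.samples_mmhg[start:stop])


rec = Recording(
    participant_id="P001",
    sampling_rate_hz=2.0,
    samples_mmhg=[118, 119, 121, 130, 134, 133, 125, 120],
    event_markers_s={"stimulus": 1.5},
)
print(rec.window_mean("stimulus", duration_s=1.5))
```

Changing `sampling_rate_hz` while leaving the samples alone changes which measurements fall inside the window, which is one way to see why the "meta" fields are not ancillary to the analysis.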
Re: [CODE4LIB] Touch Screens in the Library
Oh, I should amend that article with a comment. I just switched over from Firefox to Opera because OH MY GOD FIREFOX YOU USED TO BE A GOOD WEB BROWSER. Opera actually works pretty well for our implementation -- it has a nice built in kiosk mode and URL whitelist, and there were minimal changes required for switching over from Firefox. If you go with Opera get in touch with me and I'll send you the config files I used. -dre. On Mon, Feb 13, 2012 at 10:55 AM, Tod Olson t...@uchicago.edu wrote: NCSU has done some work you might be interested in. See this article: Lessons in Public Touchscreen Development by Andreas K. Orphanides In October 2010, the NCSU Libraries debuted its first public touchscreen information kiosk, designed to provide on-demand access to useful and commonly consulted real-time displays of library information. This article presents a description of the hardware and software development process, as well as the rationale behind a variety of design and implementation decisions. This article also provides an analysis of usage of the touchscreen since its debut, including a numerical analysis of most popular content areas, and a heatmap-based analysis of user interaction patterns with the kiosk's interface components. http://journal.code4lib.org/articles/5832 -Tod Tod Olson t...@uchicago.edu Systems Librarian University of Chicago Library On Feb 13, 2012, at 9:50 AM, Cynthia Ng wrote: Hi All, I was wondering if anyone has implemented (or plan to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. Thanks, Cynthia
Re: [CODE4LIB] Touch Screens in the Library
On Feb 13, 2012, at 10:50 AM, Cynthia Ng wrote: Hi All, I was wondering if anyone has implemented (or plan to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. I saw an article a couple of months back about one of the Harvard libraries using a Microsoft Surface: http://osc.hul.harvard.edu/liblab/proj/wolbach-user-experience-lab (I took note, as the pictures of the sun are from the Solar Dynamics Observatory's AIA telescopes) I'm guessing it's out of the price range for most of us, though. -Joe - Joe Hourcle Programmer/Analyst Solar Data Analysis Center Goddard Space Flight Center
[CODE4LIB] breakout topics c4l12
Hi All, I meant to write down the breakout topics from both tues and wed, but didn't. Did anyone? And if so, would you forward to me off-list? I would also like to throw out my thanks to the organizers and others who made it such a successful and productive conference! Thanks, Tim
Re: [CODE4LIB] Touch Screens in the Library
We've bolted an ipad to our circ desk for students to search our course reserves interface: http://www.tararobertson.ca/diy-kiosk/ The industrial design shop technicians are building a bunch of secure enclosures for our grad show. I can see if they are willing to share their design. Cheers, tara On 12-02-13 7:50 AM, Cynthia Ng wrote: Hi All, I was wondering if anyone has implemented (or plan to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. Thanks, Cynthia -- Tara Robertson systems and technical services librarian| tel 604 630 4566 fax 604 630 4531 emily carr university of art + design http://www.ecuad.ca | 1399 Johnston Street, Vancouver BC V6H 3R9
Re: [CODE4LIB] Touch Screens in the Library
Last year I spoke with Jennifer Rosenfeld from Woodbury (@jenro on the twitters) about her early homebrew attempt at installing iPads as information look-up devices: http://www.flickr.com/photos/redbobsled/sets/72157624705494862/detail/ She had lots of practical advice, and would be worth contacting. Dan * Daniel Suchy User Services Technology Analyst University of California, San Diego Libraries 858.534.6819 dsu...@ucsd.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cynthia Ng Sent: Monday, February 13, 2012 7:51 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Touch Screens in the Library Hi All, I was wondering if anyone has implemented (or plan to implement) touch screens in their library? We're looking mostly at doing it for wayfinding (finding items, rooms, etc.) but I'd definitely be interested in hearing about any other uses. What kind of hardware did you choose? What software are you using? If you did it in-house, what language(s) did you use? Any ideas/help would be great. Thanks, Cynthia
Re: [CODE4LIB] Metadata
amen! On Mon, Feb 13, 2012 at 10:57 AM, Nate Vack njv...@wisc.edu wrote: My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
Re: [CODE4LIB] Metadata
I'll second this amen. It was only when I entered the library world that I learned about the concept of metadata. Of course, I'd been using metadata for 12 years, but I'd never labeled it as such. To me it was just data. Useful information. It took time for this concept of metadata to mesh with what I already knew. Also, is this simply an over-classification of things that seems to be a humorously stereotypical thing that librarians do? :) --Joel Joel Richard Lead Web Developer, Web Services Department Smithsonian Institution Libraries | http://www.sil.si.edu/ (202) 633-1706 | richar...@si.edu On Feb 13, 2012, at 2:49 PM, Rosalyn Metz wrote: amen! On Mon, Feb 13, 2012 at 10:57 AM, Nate Vack njv...@wisc.edu wrote: My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
Re: [CODE4LIB] Metadata
Could this conversation be described as metametadata? *runs, hides* Thanks, Becky Bonus: Metacow - http://wisconsin.cowparade.com/cow/detail/3973/ On Mon, Feb 13, 2012 at 2:16 PM, Richard, Joel M richar...@si.edu wrote: I'll second this amen. It was only when I entered the library world that I learned about the concept of metadata. Of course, I'd been using metadata for 12 years, but I'd never labeled it as such. To me it was just data. Useful information. It took time for this concept of metadata to mesh with what I already knew. Also, is this simply an over-classification of things that seems to be a humorously stereotypical thing that librarians do? :) --Joel Joel Richard Lead Web Developer, Web Services Department Smithsonian Institution Libraries | http://www.sil.si.edu/ (202) 633-1706 | richar...@si.edu On Feb 13, 2012, at 2:49 PM, Rosalyn Metz wrote: amen! On Mon, Feb 13, 2012 at 10:57 AM, Nate Vack njv...@wisc.edu wrote: My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
Re: [CODE4LIB] Metadata
I got such dirty looks when I used the term metametadata to describe something. ;) -Kurt On 02/13/2012 02:39 PM, Becky Yoose wrote: Could this conversation be described as metametadata? *runs, hides* Thanks, Becky
[CODE4LIB] Berkeley DB and NOID
Does anyone here have expertise with Berkeley DB? I was running an instance of NOID (which uses Berkeley DB) to mint and resolve ARKs. I updated the OS for the server it was running on from Ubuntu 9 to Ubuntu 10. Now NOID has stopped working and complains that the db version doesn't match: Program version 4.8 doesn't match environment version 4.7 I have no experience at all with Berkeley DB and could use some advice. Thanks, Josh -- Joshua Gomez Digital Library Programmer Analyst George Washington University Libraries 2130 H St, NW Washington, DC 20052 (202) 994-8267
Re: [CODE4LIB] RDF advice
Ethan, The semantics do seem odd there. It doesn't seem like a skos:Concept would typically link to a metadata record about -- if I'm following you right -- a specific coin. Is this sort of a FRBRish approach, where your skos:Concept is similar to the abstraction of a frbr:Work (that is, the idea of a particular coin), where your metadata records are really describing the common features of a particular coin? If that's close, it seems like the richer metadata is really a sort of definition of the skos:Concept, so maybe skos:definition would do the trick? Something like this: ex:wheatPenny a skos:Concept ; skos:prefLabel "Wheat Penny" ; skos:definition "Your richer, non RDF metadata document describing the front and back, years minted, etc." . In XML that might be like: <skos:Concept rdf:about="http://example.org/wheatPenny"> <skos:prefLabel>Wheat Penny</skos:prefLabel> <skos:definition> Your richer, non RDF metadata document describing the front and back, years minted, etc. </skos:definition> </skos:Concept> It might raise an eyebrow to have, instead of a literal value for skos:definition, another set of structured, non RDF metadata. Better in that case to go with a document reference, and make your richer metadata a standalone document with its own URI: ex:wheatPenny skos:definition ex:wheatPennyDefinition.xml . <skos:Concept rdf:about="http://example.org/wheatPenny"> <skos:definition rdf:resource="http://example.org/wheatPenny.xml" /> </skos:Concept> I'm looking at the "Documentation as a Document Reference" section in SKOS Primer: http://www.w3.org/TR/2009/NOTE-skos-primer-20090818/ Again, if I'm following, that might be the closest approach. Hope that helps, Patrick On 02/11/2012 09:53 PM, Ethan Gruber wrote: Hi Patrick, The richer metadata model is an ontology for describing coins. It is more complex than, say, VRA Core or MODS, but not as hierarchically complicated as an EAD finding aid. I'd like to link a skos:Concept to one of these related metadata records. 
It doesn't matter if I use skos, owl, etc. to describe this relationship, so long as it is a semantically appropriate choice. Ethan On Sat, Feb 11, 2012 at 2:32 PM, Patrick Murray-John patrickmjc...@gmail.com wrote: Ethan, Maybe I'm being daft in missing it, but could I ask about more details in the richer metadata model? My hunch is that, depending on the details of the information you want to bring in, there might be more precise alternatives to what's in SKOS. Are you aiming to have a link between a skos:Concept and texts/documents related to that concept? Patrick On 02/11/2012 03:14 PM, Ethan Gruber wrote: Hi Ross, Thanks for the input. My main objective is to make the richer metadata available one way or another to people using our web services. Do you think it makes more sense to link to a URI of the richer metadata document as skos:related (or similar)? I've seen two uses for skos:related--one to point to related skos:concepts, the other to point to web resources associated with that concept, e.g., a wikipedia article. I have a feeling the latter is incorrect, at least according to the documentation I've read on the w3c. For what it's worth, VIAF uses owl:sameAs/@rdf:resource to point to dbpedia and other web resources. Thanks, Ethan On Sat, Feb 11, 2012 at 12:21 PM, Ross Singer rossfsin...@gmail.com wrote: On Fri, Feb 10, 2012 at 11:51 PM, Ethan Gruber ewg4x...@gmail.com wrote: Hi Ross, No, the richer ontology is not an RDF vocabulary, but it adheres to linked data concepts. Hmm, ok. That doesn't necessarily mean it will work in RDF. I'm looking to do something like this example of embedding mods in rdf: http://www.daisy.org/zw/ZedAI_Meta_Data_-_MODS_Recommendation#RDF.2FXML_2 Yeah, I'll be honest, that looks terrible to me. This looks, to me, like kind of a misunderstanding of RDF and RDF/XML. Regardless, this would make useless RDF (see below). 
One of the hard things to understand about RDF, especially when you're coming at it from XML (and, by association, RDF/XML) is that RDF isn't hierarchical, it's a graph. This is one of the reasons that the XML serialization is so awkward: it looks like something familiar to XML people, but it doesn't work well with their tools (XPath, for example) despite the fact that it, you know, should. It's equally frustrating for RDF people because it's really verbose and its syntax can come in a million variations (more on that later in the email) making it excruciatingly hard to parse. These semantic ontologies are so flexible, it seems like I *can* do anything, so I'm left wondering what I *should* do--what makes the most sense, semantically. Is it possible to nest rdf:Description into the skos:Concept of my previous example, and then place <nuds:nuds>...more sophisticated model...</nuds:nuds> into rdf:Description (or alternatively, set rdf:Description/@rdf:resource
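The document-reference approach suggested at the top of this message (a skos:Concept whose skos:definition points at a standalone document URI) can be sketched with Python's standard library. This is a minimal sketch, not anyone's production code: the function name `concept_with_definition_doc` is hypothetical, and the example.org URIs are the placeholder values from the thread. Only the RDF and SKOS namespace URIs are real.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Ask ElementTree to emit the conventional rdf: and skos: prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("skos", SKOS_NS)


def concept_with_definition_doc(concept_uri: str, label: str, doc_uri: str) -> str:
    """Serialize a skos:Concept whose skos:definition references a document URI."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    concept = ET.SubElement(
        root, f"{{{SKOS_NS}}}Concept", {f"{{{RDF_NS}}}about": concept_uri}
    )
    ET.SubElement(concept, f"{{{SKOS_NS}}}prefLabel").text = label
    # The definition is a link to the richer record, not an embedded copy of it.
    ET.SubElement(
        concept, f"{{{SKOS_NS}}}definition", {f"{{{RDF_NS}}}resource": doc_uri}
    )
    return ET.tostring(root, encoding="unicode")


xml_text = concept_with_definition_doc(
    "http://example.org/wheatPenny",
    "Wheat Penny",
    "http://example.org/wheatPenny.xml",
)
print(xml_text)
```

Keeping the richer, non-RDF record at its own URI means the RDF stays a small graph of links, and the detailed record can keep whatever schema suits it.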
Re: [CODE4LIB] RDF advice
Hi Patrick, Thanks. That does make sense. Hopefully others will weigh in with agreement (or disagreement). Sometimes these semantic languages are so flexible that it's unsettling. There are a million ways to do something with only de facto standards rather than restricted schemas. For what it's worth, the metadata files describe coin-types, an intellectual concept in numismatics succinctly described at http://coins.about.com/od/coinsglossary/g/coin_type.htm, not physical objects in a collection. Ethan On Mon, Feb 13, 2012 at 4:28 PM, Patrick Murray-John patrickmjc...@gmail.com wrote: [snip]
Re: [CODE4LIB] neo4j
Hey Kent, Awesome. Thanks for the info. So, using gremlin, are you using some of the other Tinkerpop technologies? And, haha, in researching stuff this weekend, I actually saw an email you sent to the neo4j google group about the lucene boosting issue… I started playing around with RDF.rb, and was really impressed, although using that doesn't give you all the stuff tinkerpop does. b,chris. On Sat, Feb 11, 2012 at 12:32 AM, Kent Fitch kent.fi...@gmail.com wrote: Hi, AustLit ( http://www.austlit.edu.au ) is in the early stages of a migration from javaServlets/xslt/oracle to java/neo4j/gremlin. The web version of AustLit was developed in 2000 based on FRBR with a strong emphasis on events realised with a topic map model, so the sql implementation is close to a triple-store. More information on the details is here: http://www.austlit.edu.au/about , http://www.austlit.edu.au/about/metadata and http://www.austlit.edu.au:/DataModel/index.html (ALEG was the working name for AustLit redevelopment in 2000). Last year a decision was taken to move AustLit from a subscription service to open access, and from updates being performed solely by dedicated bibliographers and researchers (members of various AustLit teams distributed across Australia) to include community contributions, so rather than work these changes into a 12-year-old system, it was decided to start afresh with an approach which would more naturally support the AustLit data model. So, we experimented with Neo4j, and were impressed with its performance. For example, loading our current data from Oracle into an empty neo4j database takes about 30 minutes (using a run-of-the-mill 3-year-old server), producing a graph of 14m nodes and 20m relationships. Performing custom indexing of this data using the built-in Lucene integration takes about 2.5 hours, but that's a function of the extensive indexing we're performing. 
As you'd probably expect, we do have some issues we're working through, such as - integration with Lucene is abstracted by the neo4j index interface, so it is difficult or impossible to use some native Lucene features. For example, boosting index nodes based on their inherent importance and using this boost in lucene to determine relevance cannot be done. - our data model is complex, and added to the requirements to version every node and relationship (i.e., record changes, allow rollback), our graph traversals are correspondingly complex, but I suspect as we become more familiar with graph traversal idioms in gremlin and cypher, they'll become as normal as sql. But so far, neo4j seems fast and robust, and we're optimistic! Kent Fitch On Sat, Feb 11, 2012 at 9:42 AM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: Hej hej, Is anyone using neo4j in their library projects? If the answer is ja, I would be very interested in hearing how it's going. How are you using it? Is it something that is in production and is adding value or is it more a skunkworks-type effort? What languages are you using? Are you using an ORM (like Rails or Django)? I would also be really interested in hearing thoughts, stories, and opinions about the idea of using a graph db or triple store in their stack. tack! b, fitz.
Re: [CODE4LIB] Metadata
I think this is a rather different situation from the one libraries commonly deal with, where there is a pretty clear distinction between data representing the full text of a 189-page book by Author X, and the descriptive data that is made up by catalogers or publishers, and is not part of Author X's work at all. In addition, it is somewhat useful to distinguish between full-text data and descriptive metadata because the nature of the work you can do with these two types of data can be so very different. You simply can't use the average library catalog to look up Author X's novel that starts with the sentence "So a string walks into a bar." The actual data (the novel) is not in the catalog (which is composed only of metadata). Genny Engel Sonoma County Library gen...@sonoma.lib.ca.us 707 545-0831 x581 www.sonomalibrary.org -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nate Vack Sent: Monday, February 13, 2012 7:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Metadata My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
Re: [CODE4LIB] Metadata
On Mon, Feb 13, 2012 at 4:25 PM, Genny Engel gen...@sonoma.lib.ca.us wrote: You simply can't use the average library catalog to look up Author X's novel that starts with the sentence "So a string walks into a bar." The actual data (the novel) is not in the catalog (which is composed only of metadata). That's a technical limitation. If you're Google Books (or any other fulltext index), the actual data *is* in the catalog, and data and metadata are again functionally identical. The best working definition of metadata I've come up with is "something I have a field for in my data cataloging program." I think it's kind of a circular issue: We know metadata and data are separate because our software and workflow require it. Software and workflows are designed to separate metadata and data because we know they're separate. -n
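To make the "technical limitation" point concrete, here is a minimal sketch in Python. The record fields, the sample text, and both function names are invented for illustration; no real catalog or full-text index works exactly this way. It only shows that the same phrase search fails against descriptive fields and succeeds once the text itself is indexed.

```python
# Hypothetical data: field names, title and sample text are made up.
catalog = [
    {"id": 1, "author": "Author X", "title": "An Example Novel", "pages": 189},
]
full_text = {
    1: "So a string walks into a bar. The bartender looks up...",
}

def search_metadata(records, phrase):
    """Return ids of records whose descriptive fields contain the phrase."""
    needle = phrase.lower()
    return [
        record["id"]
        for record in records
        if any(needle in str(value).lower()
               for key, value in record.items() if key != "id")
    ]

def search_full_text(texts, phrase):
    """Return ids of documents whose body text contains the phrase."""
    needle = phrase.lower()
    return [doc_id for doc_id, body in texts.items() if needle in body.lower()]

phrase = "a string walks into a bar"
metadata_hits = search_metadata(catalog, phrase)   # descriptive fields only
full_text_hits = search_full_text(full_text, phrase)  # the work itself
print(metadata_hits, full_text_hits)
```

A real system would use an inverted index rather than a substring scan; the scan is just the shortest way to show the difference.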
Re: [CODE4LIB] Berkeley DB and NOID
The standard BerkeleyDB library probably changed when you upgraded Ubuntu, and it complains that the NOID database (written with the old library) is incompatible. You should be able to use db_upgrade to convert the NOID database (NOID/noid.bdb). db_upgrade is a command line utility that comes with BerkeleyDB. -John --- On Mon, 13 Feb 2012, Joshua Gomez wrote: Does anyone here have expertise with Berkeley DB? I was running an instance of NOID (which uses Berkeley DB) to mint and resolve ARKs. I updated the OS for the server it was running on from Ubuntu 9 to Ubuntu 10. Now NOID has stopped working and complains that the db version doesn't match: Program version 4.8 doesn't match environment version 4.7 I have no experience at all with Berkeley DB and could use some advice. Thanks, Josh -- Joshua Gomez Digital Library Programmer Analyst George Washington University Libraries 2130 H St, NW Washington, DC 20052 (202) 994-8267
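A cautious way to try the db_upgrade suggestion is to copy the database file first and only then upgrade the original. The sketch below is an assumption-laden illustration, not a tested recovery procedure: the function name and the ".bak" suffix are made up, the NOID/noid.bdb path is the one mentioned above, and the utility may be installed under a versioned name on some systems, so the command name is a parameter.

```python
import shutil
import subprocess
from pathlib import Path

def upgrade_noid_db(db_path, upgrade_cmd="db_upgrade", run=subprocess.run):
    """Copy a Berkeley DB file to <name>.bak, then run db_upgrade on the original.

    upgrade_cmd is an assumption: adjust it if your system installs the
    utility under a different name. run is injectable so the function can
    be exercised without Berkeley DB installed.
    """
    db_path = Path(db_path)
    backup = db_path.with_name(db_path.name + ".bak")
    shutil.copy2(db_path, backup)  # keep the old-format file in case the upgrade fails
    run([upgrade_cmd, str(db_path)], check=True)  # raises if the utility reports failure
    return backup

# Example call (not executed here):
#   upgrade_noid_db("NOID/noid.bdb")
```

Stop NOID before running this, and keep the backup until minting and resolving have been checked.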
Re: [CODE4LIB] RDF advice
Hi Ethan, I will defer to those with greater insight than myself into what has been discussed earlier in this thread as to some of the semantics you are trying to crystallise here. What I can offer instead is a bit of advice as to lubricating the process. Firstly, stay as far away from XML as possible whilst trying to shape your model/ontologies - it a) introduces hierarchical thinking/visualisation into what may well not be a problem of hierarchy, b) is difficult to read, and c) is, in the world of RDF, best reserved for machine-to-machine communication. Secondly, put away the computer and get out the white/blackboard and pen. Start drawing some ellipses, rectangles, and arrows. When you have a model that looks something like the real world you are trying to represent (not the traditional metadata records you previously held), transform that into a form of RDF that a computer will understand. This is an approximation of the process the British Library used to work their way towards their data model http://dataliberate.com//wp-content/uploads/2012/01/British-Library-Data-Model-v1.01.pdf for the British National Bibliography. Oh, and the XML? - Let a tool like Raptor produce it for you from the more human-friendly Turtle you come up with. ~Richard. On 13 February 2012 21:43, Ethan Gruber ewg4x...@gmail.com wrote: Hi Patrick, Thanks. That does make sense. Hopefully others will weigh in with agreement (or disagreement). Sometimes these semantic languages are so flexible that it's unsettling. There are a million ways to do something with only de facto standards rather than restricted schemas. For what it's worth, the metadata files describe coin-types, an intellectual concept in numismatics succinctly described at http://coins.about.com/od/coinsglossary/g/coin_type.htm, not physical objects in a collection. 
Ethan -- Richard Wallis Founder, Data Liberate http://dataliberate.com Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
Re: [CODE4LIB] RDF advice
On 2/13/12 1:43 PM, Ethan Gruber wrote: Hi Patrick, Thanks. That does make sense. Hopefully others will weigh in with agreement (or disagreement). Sometimes these semantic languages are so flexible that it's unsettling. There are a million ways to do something with only de facto standards rather than restricted schemas. For what it's worth, the metadata files describe coin-types, an intellectual concept in numismatics succinctly described at http://coins.about.com/od/coinsglossary/g/coin_type.htm, not physical objects in a collection. I believe this is similar to what FOAF does with primary topic: http://xmlns.com/foaf/spec/#term_primaryTopic In FOAF that usually points to a web page ABOUT the subject of the FOAF data, so a Wikipedia web page about Stephen King would get this primary topic property. Presuming that your XML is HTTP-accessible, it might fit into this model. kc Ethan On Mon, Feb 13, 2012 at 4:28 PM, Patrick Murray-John patrickmjc...@gmail.com wrote: Ethan, The semantics do seem odd there. It doesn't seem like a skos:Concept would typically link to a metadata record about -- if I'm following you right -- a specific coin. Is this sort of a FRBRish approach, where your skos:Concept is similar to the abstraction of a frbr:Work (that is, the idea of a particular coin), where your metadata records are really describing the common features of a particular coin? If that's close, it seems like the richer metadata is really a sort of definition of the skos:Concept, so maybe skos:definition would do the trick? Something like this:

    ex:wheatPenny a skos:Concept ;
        skos:prefLabel "Wheat Penny" ;
        skos:definition "Your richer, non RDF metadata document describing the front and back, years minted, etc." .

In XML that might be like:

    <skos:Concept rdf:about="http://example.org/wheatPenny">
      <skos:prefLabel>Wheat Penny</skos:prefLabel>
      <skos:definition>Your richer, non RDF metadata document describing the front and back, years minted, etc.</skos:definition>
    </skos:Concept>

It might raise an eyebrow to have, instead of a literal value for skos:definition, another set of structured, non RDF metadata. Better in that case to go with a document reference, and make your richer metadata a standalone document with its own URI:

    ex:wheatPenny skos:definition ex:wheatPennyDefinition.xml .

    <skos:Concept rdf:about="http://example.org/wheatPenny">
      <skos:definition rdf:resource="http://example.org/wheatPenny.xml" />
    </skos:Concept>

I'm looking at the Documentation as a Document Reference section in SKOS Primer: http://www.w3.org/TR/2009/NOTE-skos-primer-20090818/ Again, if I'm following, that might be the closest approach. Hope that helps, Patrick On 02/11/2012 09:53 PM, Ethan Gruber wrote: Hi Patrick, The richer metadata model is an ontology for describing coins. It is more complex than, say, VRA Core or MODS, but not as hierarchically complicated as an EAD finding aid. I'd like to link a skos:Concept to one of these related metadata records. It doesn't matter if I use skos, owl, etc. to describe this relationship, so long as it is a semantically appropriate choice. Ethan On Sat, Feb 11, 2012 at 2:32 PM, Patrick Murray-John patrickmjc...@gmail.com wrote: Ethan, Maybe I'm being daft in missing it, but could I ask about more details in the richer metadata model? My hunch is that, depending on the details of the information you want to bring in, there might be more precise alternatives to what's in SKOS. Are you aiming to have a link between a skos:Concept and texts/documents related to that concept? Patrick On 02/11/2012 03:14 PM, Ethan Gruber wrote: Hi Ross, Thanks for the input. My main objective is to make the richer metadata available one way or another to people using our web services. Do you think it makes more sense to link to a URI of the richer metadata document as skos:related (or similar)? 
I've seen two uses for skos:related--one to point to related skos:concepts, the other to point to web resources associated with that concept, e.g., a wikipedia article. I have a feeling the latter is incorrect, at least according to the documentation I've read on the w3c. For what it's worth, VIAF uses owl:sameAs/@rdf:resource to point to dbpedia and other web resources. Thanks, Ethan On Sat, Feb 11, 2012 at 12:21 PM, Ross Singerrossfsin...@gmail.com wrote: On Fri, Feb 10, 2012 at 11:51 PM, Ethan Gruberewg4x...@gmail.com wrote: Hi Ross, No, the richer ontology is not an RDF vocabulary, but it adheres to linked data concepts. Hmm, ok. That doesn't necessarily mean it will work in RDF. I'm looking to do something like this example of embedding mods in rdf: http://www.daisy.org/zw/ZedAI_Meta_Data_-_MODS_**http://www.daisy.org/zw/ZedAI_**Meta_Data_-_MODS_**
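The two options Patrick describes (a literal skos:definition versus a document reference) can be sketched as a small Python helper that writes out Turtle. Everything here is illustrative: the function names are hypothetical, the ex: prefix and the wheatPenny URIs are the example.org placeholders from the thread, and the string escaping only covers simple literals.

```python
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/> .\n"
)

def _lit(text):
    """Quote a plain string as a Turtle literal (backslash, quote, newline only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'

def concept_with_literal_definition(local_name, label, definition):
    """Option 1: the definition is a literal string on the concept."""
    return (
        f"{PREFIXES}\n"
        f"ex:{local_name} a skos:Concept ;\n"
        f"    skos:prefLabel {_lit(label)} ;\n"
        f"    skos:definition {_lit(definition)} .\n"
    )

def concept_with_document_reference(local_name, label, document_uri):
    """Option 2: the definition points at a standalone document with its own URI."""
    return (
        f"{PREFIXES}\n"
        f"ex:{local_name} a skos:Concept ;\n"
        f"    skos:prefLabel {_lit(label)} ;\n"
        f"    skos:definition <{document_uri}> .\n"
    )

print(concept_with_document_reference(
    "wheatPenny", "Wheat Penny", "http://example.org/wheatPenny.xml"))
```

The second form keeps the richer, non-RDF record out of the graph and leaves consumers to dereference the URI if they want it.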
Re: [CODE4LIB] neo4j
My proposal for code4lib on this topic was not selected, but I was invited to give the same talk at the Berkeley Information School Friday afternoon seminar last week (but I had about 40 mins rather than 20). Here are the notes from my talk last Friday: http://tingletech.github.com/296a-1-2012/ Also, I did some quick screenrs of what I would have talked about (but I didn't really practice, I would have prepared more for a real talk, these are sort of phoning it in) http://www.screenr.com/1lws http://www.screenr.com/pfws http://www.screenr.com/Pg9s Here is a page that is powered by Tinkerpop/Neo4J/rexster in production http://socialarchive.iath.virginia.edu/xtf/view?mode=RGraph&docId=franklin-benjamin-1706-1790-cr.xml I've found tinkerpop, gremlin, and rexster to be very easy to work with, and the tinkerpop list is very helpful. I'm also using a triple store to power a SPARQL interface: http://socialarchive.iath.virginia.edu/sparql/ On Mon, Feb 13, 2012 at 2:23 PM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: Hey Kent, Awesome. thanks for the info. So, using gremlin, are you using some of the other Tinkerpop technologies? And, haha, in researching stuff this weekend, I actually saw an email you sent to the neo4j google group about the lucene boosting issue… I started playing around with RDF.rb , and was really impressed, although using that doesn't give you all the stuff tinkerpop does. b,chris. On Sat, Feb 11, 2012 at 12:32 AM, Kent Fitch kent.fi...@gmail.com wrote: Hi, AustLit ( http://www.austlit.edu.au ) is in the early stages of a migration from javaServlets/xslt/oracle to java/neo4j/gremlin. The web version of AustLit was developed in 2000 based on FRBR with a strong emphasis on events realised with a topic map model, so the sql implementation is close to a triple-store. 
More information on the details is here: http://www.austlit.edu.au/about , http://www.austlit.edu.au/about/metadata and http://www.austlit.edu.au:/DataModel/index.html (ALEG was the working name for AustLit redevelopment in 2000). Last year a decision was taken to move AustLit from a subscription service to open access, and from updates being performed solely by dedicated bibliographers and researchers (members of various AustLit teams distributed across Australia) to include community contributions, so rather than work these changes into a 12-year-old system, it was decided to start afresh with an approach which would more naturally support the AustLit data model. So, we experimented with Neo4j, and were impressed with its performance. For example, loading our current data from Oracle into an empty neo4j database takes about 30 minutes (using a run-of-the-mill 3-year-old server), producing a graph of 14m nodes and 20m relationships. Performing custom indexing of this data using the built-in Lucene integration takes about 2.5 hours, but that's a function of the extensive indexing we're performing. As you'd probably expect, we do have some issues we're working through, such as - integration with Lucene is abstracted by the neo4j index interface, so it is difficult or impossible to use some native Lucene features. For example, boosting index nodes based on their inherent importance and using this boost in Lucene to determine relevance cannot be done. - our data model is complex, and added to the requirement to version every node and relationship (i.e., record changes, allow rollback), our graph traversals are correspondingly complex, but I suspect as we become more familiar with graph traversal idioms in Gremlin and Cypher, they'll become as normal as SQL. But so far, neo4j seems fast and robust, and we're optimistic! Kent Fitch On Sat, Feb 11, 2012 at 9:42 AM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: Hej hej, Is anyone using neo4j in their library projects? 
If the answer is ja, I would be very interested in hearing how it's going. How are you using it? Is it something that is in production and is adding value or is it more a skunkworks-type effort? What languages are you using? Are you using an ORM (like Rails or Django)? I would also be really interested in hearing thoughts, stories, and opinions about the idea of using a graph db or triple store in their stack. tack! b, fitz.
Re: [CODE4LIB] Metadata
Genny, I agree that the actual data is not in the catalog per se, but it IS in a database somewhere. And the beauty of that digital information (which is where we are all headed) is that all of it can really now be mashed together to produce something new. The contents of _A Tale of Two Cities_ can now be seen in so many different ways: a histogram of word frequency, a chart of which characters have the most dialogue, locations in the novel can be mapped geographically over the course of the story. (I only wish I had an interactive map when reading A Game of Thrones to tell me who was where at which part of the novel!) And you can then search for books that take place in certain cities, or in a time period, or have people who wear beige top hats in victorian England. The possibilities are endless! But the point is, to a computer, it's all just bits and bytes and numbers for the crunching. To open up these avenues of new things, we need to change our thinking about what these things are. And that is exciting. --Joel On Feb 13, 2012, at 5:25 PM, Genny Engel wrote: I think this is a rather different situation from the one libraries commonly deal with, where there is a pretty clear distinction between data representing the full text of a 189-page book by Author X, and the descriptive data that is made up by catalogers or publishers, and is not part of Author X's work at all. In addition, it is somewhat useful to distinguish between full-text data and descriptive metadata because the nature of the work you can do with these two types of data can be so very different. You simply can't use the average library catalog to look up Author X's novel that starts with the sentence So a string walks into a bar. The actual data (the novel) is not in the catalog (which is composed only of metadata). 
Genny Engel Sonoma County Library gen...@sonoma.lib.ca.us 707 545-0831 x581 www.sonomalibrary.org -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nate Vack Sent: Monday, February 13, 2012 7:57 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Metadata My take on this discussion, coming from a research lab: Metadata isn't meta. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. If you don't yet have a use in mind for your data, then you have a place to start working :) -n
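As a small illustration of Joel's point that full text is "just bits and bytes" once it is in a database, the word-frequency idea can be sketched in a few lines of Python. The function name and the tokenising rule (lowercase runs of letters and apostrophes) are assumptions chosen for brevity, and the sample is only the opening clause of _A Tale of Two Cities_, not the whole novel.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count lowercase word occurrences; a word is a run of letters or apostrophes."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

sample = "It was the best of times, it was the worst of times"
freq = word_frequencies(sample)

# Print a crude text histogram, most frequent words first.
for word, count in freq.most_common():
    print(f"{word:>6} {'#' * count}")
```

Run over a whole text, the same counter is the raw material for the histogram Joel mentions; character dialogue and place mapping need more structure than a word count can give.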
Re: [CODE4LIB] Metadata
You realize, of course, that discussing the use of the word metametadata could be described as metametametadata? Which would make my post metametametametadata. At which point it all turns into silliness (which it certainly wasn't before... right? :-) ). Best, Kåre From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kurt Nordstrom I got such dirty looks when I used the term metametadata to describe something. ;) -Kurt On 02/13/2012 02:39 PM, Becky Yoose wrote: Could this conversation be described as metametadata? *runs, hides* Thanks, Becky
Re: [CODE4LIB] neo4j
Hi Chris, Gremlin is the only Tinkerpop technology we've used so far. Re the boosting, we've ended up storing the document boost as a neo4j property on the node, and post-processing the hit list from Lucene to get each node and combine the Lucene score with the boost to determine our final relevance score. This adds about 0.6 sec elapsed per 10K nodes (on a Xeon E5430 @ 2.66GHz), which is not ideal but will probably be OK. Regards, Kent On Tue, Feb 14, 2012 at 9:23 AM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: Hey Kent, Awesome. thanks for the info. So, using gremlin, are you using some of the other Tinkerpop technologies? And, haha, in researching stuff this weekend, I actually saw an email you sent to the neo4j google group about the lucene boosting issue… I started playing around with RDF.rb , and was really impressed, although using that doesn't give you all the stuff tinkerpop does. b,chris.
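The post-processing step Kent describes can be sketched roughly as follows. This is not AustLit's code: the function name, the node ids and the sample numbers are invented, and the message does not say how score and boost are combined, so multiplication is used here purely as a placeholder that can be swapped out.

```python
def rerank(hits, boost_of, combine=lambda score, boost: score * boost):
    """Re-order search hits by a combined relevance score.

    hits:     iterable of (node_id, lucene_score) pairs from the index query
    boost_of: function returning the boost stored as a property on the node
    combine:  how score and boost are merged; multiplication is an assumption
    """
    scored = [(node_id, combine(score, boost_of(node_id))) for node_id, score in hits]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up data standing in for node properties and a Lucene hit list.
boosts = {"a": 1.0, "b": 3.0, "c": 0.5}
hits = [("a", 2.0), ("b", 1.0), ("c", 3.0)]

ranked = rerank(hits, lambda node_id: boosts.get(node_id, 1.0))
print(ranked)
```

Because every hit needs one property lookup, the cost grows linearly with the size of the hit list, which is consistent with the per-10K-nodes figure quoted above.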
Re: [CODE4LIB] Metadata
On 13 February 2012 15:57, Nate Vack njv...@wisc.edu wrote: My take on this discussion, coming from a research lab: Metadata isn't meta. Well, coming from a publishing and repositories world, my take is slightly different. For example, in recordings of, say, blood pressure over time, it's common to think about things such as participant identifiers, acquisition dates, event markers, and sampling rates as metadata, and the actual measurements as data. But really: those meta things aren't ancillary to data analysis; they're essential in keeping analyses organized, and often important parameters in running an analysis at all. That's an interesting distinction though. Do you need all that data in order to make sense of the results? You don't [necessarily] need to know who conducted some research, or when they conducted it, in order to analyse and make sense of the data. In the context of having the data, this other information becomes irrelevant in terms of understanding what that data says. But in a wider context, you do need such additional information in order to be able to use it. If you don't know who conducted the research, when it was conducted, etc. then you can't reference it. You can't place it into another context (a follow-up study to validate the findings, or to see if something has changed over time). And it's not a case of saying something has to fall into one category or another; it may be necessary / useful in both. Breaking things down into data versus metadata I think, encourages a false (and not very interesting) dichotomy. If information has a use, call it what it is: data. Store everything that's useful. The problem isn't that we have labels for data to be used in different contexts. It's that just because something does have a label [and may not necessarily be important to you in your context], that doesn't mean that it's any less important than something else with another label. G