Re: [CODE4LIB] Formalizing Code4Lib?
Another example to look at is Open Repositories, which entered into an MOU with CLIR last year to serve as "financial sponsor" for the OR conference series. In this model, CLIR does not bear the financial risk of the annual conference but essentially serves as a banker for any surplus generated. The host institution each year is the one that enters into contracts with hotels, etc., and bears the financial and legal risks of hosting, but there is an implied expectation that the funds held for OR by CLIR would be used to help cover a loss that occurs due to extraordinary circumstances. Since, like Hydra and Code4Lib, OR does not exist as a legal entity, the MOU is between the OR Steering Committee and CLIR. Jon -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Esmé Cowles Sent: Tuesday, June 07, 2016 4:24 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Formalizing Code4Lib? I don't think there is any Hydra legal entity (hence the need for a financial host), and the MOU is signed on behalf of the leadership committee. So I think it boils down to being organized enough for the financial host to be comfortable entering into an agreement with them. I can ask the people I know on the Hydra leadership committee to get more info on how the arrangement works. -Esmé > On Jun 7, 2016, at 4:19 PM, Jenn C <jen...@gmail.com> wrote: > > This sounds like an intriguing option. What is "Hydra" that it is able > to enter into an MOU - is the steering group an incorporated entity? > > On Tue, Jun 7, 2016 at 3:40 PM, Esmé Cowles <escow...@ticklefish.org> wrote: > >> I remember another option being brought up: picking an official >> organizational home for C4L that would handle being the financial >> host for the conference, and possibly other things (conference >> carryover, scholarship fundraising, holding intellectual property, >> etc.). An existing library non-profit might be able to do this without that >> much overhead. 
>> >> For example, Hydra has a MOU with DuraSpace for exactly this kind of >> arrangement, and there was a post recently about renewing the >> arrangement for another year, including the MOU: >> >> https://groups.google.com/d/msg/hydra-tech/jCua5KILos4/yRpOalF6AgAJ >> >> In the past, there has been a great deal of resistance to making C4L >> more organized, and especially on the amount of work needed to run a >> non-profit organization. So having a financial host arrangement >> could be a lighter-weight option. >> >> -Esmé >> >>> On Jun 7, 2016, at 3:31 PM, Coral Sheldon-Hess >>> <co...@sheldon-hess.org> >> wrote: >>> >>> I think this deserves its own thread--thanks for bringing it up, >> Christina! >>> >>> I'm also interested in investigating how to formalize Code4Lib as an >>> entity, for all of the reasons listed earlier in the thread. I can't >>> volunteer to be the leader/torch-bearer/main source of energy behind >>> the investigation right now (sorry), but I'm happy to join any group >>> that >> takes >>> this on. I might be willing to *co*-lead, if that is what it takes >>> to get the process started. >>> >>> And, yes, anyone who has talked to me or read my rants about the >>> proliferation of library professional organizations is going to >>> think my volunteering for this is really funny. But I think forming >>> a group to gather information gives us the chance to determine, as a >>> community, whether Code4Lib delivers enough value and has enough of >>> a separate identity to be worth forming Yet Another Professional >>> Organization (my >> gut >>> answer, today? "yes"), or whether we would do better to fold into, >>> or become a sub-entity of, some existing organization; or, >>> (unlikely) should Code4Lib stop being A Big International Thing and just do >>> regional stuff? >>> Or some other option I haven't listed--I don't even know what all >>> the options are, right now. 
>>> >>> One note on the "no, let's not organize" sentiment: the problem with >>> a >> flat >>> organization, or an anarchist collective, or a complete "do-ocracy," >>> is that the decision-making structures aren't as obvious to >>> newcomers, or >> even >>> long-term members who aren't already part of those structures. There >>> is value to formality, within reason. I mean... right now, I don't >>> know how >> to >>> go about getting &
Re: [CODE4LIB] SSL certificates and proxy servers
> I want to make a plea too, not to fragment Code4Lib, but rather to consolidate
> EZProxy knowledge to post these queries to the EZProxy list.
>
> For good, bad or indifferent, OCLC is putting together an EZProxy community
> wiki and for those EZProxy folks who come after you, who are not C4Lers, I ask
> that whatever info go there.

If we're going to go that far, why not also put it in the existing system? http://www.oclc.org/community/ezproxy.en.html? Honestly, I'm not expecting much from the wiki. I've tried using the community resource in its current form and have had errors, things disappearing, etc. I think I may have put something up in the community site, but honestly, I'm probably never going to log in again if I don't have to. A lot of it is simply poor management and needless restrictions, which will be the same no matter what software they use. This particular question is definitely a FAQ and someday I'll get around to trying to write up something and I'll put it up ..somewhere. Maybe even just up in github and send it to the link. I don't see the harm in repeating info here. I'm guessing folks who find this new information aren't already on ezproxy and won't be on there. They're not likely to find it either; the ezproxy-l list doesn't seem very well exposed to searching.

> (@Jon, kind of looking at you because I worry that EZProxy expertise such as
> yours will get lost. I know it seems impossible, but one day we may all go on to
> other work. I for one am looking forward to an exciting second career as a
> Starbucks barista; I hear my Master's degree will serve me well there ;-)

I'm guessing no matter where or how I put the information, people will still ask the questions :). What I know about ezproxy comes partly from the mailing list, in large part from just reading the OCLC documentation, and a little from ./ezproxy --help or whatever it is :).
I'll try to dump some of the info or create an FAQ one of these days, but it probably won't be today. Or, of course, someone else could visit http://search.gmane.org/?query=wildcard=gmane.education.ezproxy and type in the search box wildcard and summarize the various emails on the topic ;). Jon Gorman Library IT University of Illinois 217 244-4688
Re: [CODE4LIB] SSL certificates and proxy servers
> Hi Code4Lib,
> We're looking into applying an SSL certificate to an EZproxy server and aren't
> sure exactly how a wildcard cert gets handled in that context.
> Anyone have experience with this?

Yup.

> The fuzzy part is that we're not clear how wildcard certificates that handle
> subdomain matching (e.g., *.example.org) translate into wild-looking proxied
> domains (like search.whatever.com.proxy.example.org).

This depends a lot on the version number of EZproxy. The older versions of EZproxy look for a couple of things:

* proxy-by-hostname needs to be on (sounds like you have that)
* The wildcard MUST be in the CN, not a SAN. You'll likely want to use your login domain in the SAN, depending on levels.

Given those two things, when ezproxy sees that it has a wildcard in the CN, it'll change from using periods to hyphens. I think, although I can't remember for sure, at some point in 6.x this was fixed so a wildcard in a CN or SAN will work. I'd definitely verify that through testing though. A license of ezproxy should let you run a separate test instance on another machine. You can verify this by just creating a self-signed wildcard cert. You'll get a warning, but you should also see the ezproxy behavior change. I find dnsmasq can be helpful as well.

So you'll want to get a wildcard cert for the one level of subdomain. While you're at it, make sure it's a 2048 bit key and SHA-2. I've been seeing a lot of people running into problems with old 3 year certs that they've finally gotten around to putting into place.

> This might be more of an EZproxy config question and more appropriate to that
> list. There's also documentation
> <https://www.oclc.org/support/services/ezproxy/documentation/cfg/ssl.en.html>
> out there. But if anyone can comment on the process, whether the
> documentation was helpful to you, what sort of wildcard cert you got to
> address this problem, etc., we'd be interested to hear from you.
It's asked frequently enough that if I wasn't quite so lazy, I'd make it into the top FAQ question. The documentation was ok, but it's really not all that complicated. Jon Gorman Library IT University of Illinois 217 244-4688
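For anyone puzzling over why the period-to-hyphen switch matters, here is a minimal Python sketch. It assumes the usual simplified rule that a leading "*" in a certificate name covers exactly one DNS label. The function names are made up for illustration, and the hyphenated form shown is only an approximation of what EZproxy generates, so check the real hostnames on a test instance.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as *.proxy.example.org covers hostname.

    Simplified rule (an assumption for this sketch): a leading '*' stands for
    exactly one DNS label, so it never spans a period.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def hyphenated_proxy_host(origin_host: str, proxy_domain: str) -> str:
    """Hypothetical helper: collapse the origin host's periods into hyphens."""
    return origin_host.replace(".", "-") + "." + proxy_domain


proxy_domain = "proxy.example.org"
dotted = "search.whatever.com." + proxy_domain
hyphenated = hyphenated_proxy_host("search.whatever.com", proxy_domain)

print(wildcard_matches("*." + proxy_domain, dotted))      # False
print(wildcard_matches("*." + proxy_domain, hyphenated))  # True
```

The dotted name has three extra labels in front of proxy.example.org, so a single-level wildcard cannot cover it; the hyphenated name has only one.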
[CODE4LIB] Job: Visiting Information System Analyst (3 year term)
Visiting Information System Analyst (3 year term) University of Illinois at Urbana-Champaign Library IT - Infrastructure Management and Support (IMS) Position Available: This is a 12 month, full-time Academic Professional position with the University Library's IT Infrastructure Management and Support team. It is a visiting, three-year appointment in the first instance, but there is a possibility of extension. Duties and Responsibilities: Supporting commercial, open-source, and locally developed Library applications; Managing service and application life-cycles: Working with project stakeholders and senior programming staff to gather and analyze requirements for projects, and design approaches to meeting project requirements; Working independently or as a member of a small team, responsibility for implementing the approved recommendations, especially for in-house development, but also for customization or integration of purchased and open source software; Applying best practices in various software development methodologies, including version control, automated testing and code refactoring, and leveraging appropriate programming frameworks and technical architectures to the requirements and proposed solutions; Required Qualifications: 3 years of experience in the field and a Bachelor's Degree; or 1 year of experience in the field with a Computer Science Bachelor's Degree; Ability to work in a diverse environment; Demonstrable experience documenting systems and procedures; Programming and software development experience in scripting and client-side applications languages, including one or more of Perl, Python, PHP, Visual Basic, and Java; Current and working knowledge of HTML, CSS and JavaScript; Familiarity with programming web applications; Working knowledge of relational database design principles; Ability to work independently or under only general direction; Strong communication skills. 
See https://jobs.illinois.edu for Preferred Qualifications Apply: To ensure full consideration, please complete your candidate profile at https://jobs.illinois.edu and upload a letter of interest, resume, and contact information including email addresses for three professional references. Applications not submitted through this website will not be considered. The University of Illinois conducts criminal background checks on all job candidates upon acceptance of a contingent offer. For questions, please call: 217-333-8169. DEADLINE: in order to ensure full consideration, applications must be received by November 16, 2015. The U of I is an EEO Employer/Vet/Disabled www.inclusiveillinois.illnois.edu
Re: [CODE4LIB] Hours of Operation on Website - management tool
Most of the rough edges are around some of the one-time administrative actions like setting up new libraries, locations, and term schedules, although there’s also some UI improvements in our near future. FWIW, we just 'finished' a first pass at a little Rails engine around managing location data: https://github.com/pulibrary/locations It, too, is a little rough around the edges (esp wrt views) and has some site-specific stuff, like a gazillion 'location codes' to make it work with existing systems...but that's sorta why we built it. -Jon -- Jon Stroop Application Development Manager Princeton University Library jstr...@princeton.edu On 07/01/2015 11:35 AM, Chris Beer wrote: Hi Ken, We’ve recently been working on rebuilding an application for managing our hours. It’s Ruby on Rails, not-yet-in-production, full of rough edges, and has some Stanford-specific business logic, but it’s relatively simple and (probably) works for us: https://github.com/sul-dlss/library_hours_rails/releases/tag/v0.0.1 Currently, it’s envisioned as a backend service for staff to add and manage hours, with downstream consumers using the API to present the hours as appropriate. Our initial consumers include the main library website, our library catalog, and some other business process applications. We’ve also started thinking about embeddable HTML views of the hours to replace some of the clunky processing we’re currently doing in Drupal, but haven’t pursued that yet. Interesting features include: - JSON-API view of a location’s hours; (what I assume is a bespoke..) 
Drupal calendar feed; import and export for spreadsheets of hours; - multiple library (and location-within-a-library) support; - granular access control for updating hours; we have the notion of global hours administrators, but expect to also support library- and location-specific authorization, allowing library managers to set and update the hours for a subset of our locations [1]; - support for setting operating hours for a term and/or exceptions for particular days (e.g. holidays and the like) using an in-place editor; - we have a notion of location-specific messages associated with exceptions to the normal schedule (e.g. the Art library is closed this week for Y), which can be reflected in applications that consume the library hours Most of the rough edges are around some of the one-time administrative actions like setting up new libraries, locations, and term schedules, although there’s also some UI improvements in our near future. Thanks, Chris Beer Digital Library Systems and Services Stanford University Libraries [1] Although I’m more interested in allowing any staff member to update the hours, and provide better notifications when a location’s hours change; that said, strong access control is much easier to reason about and codify.. 
On Jul 1, 2015, at 6:01 AM, Ken Irwin kir...@wittenberg.edu wrote: Hi folks, I'm hoping to find some sort of web-based app that can manage the library's hours of operations, including: * Displaying today's hours * Displaying an upcoming schedule of hours * Updatable through a GUI by non-techy library staff * Able to update our Google Places account hours (which, I note, currently lists our school-year hours as our open hours today), perhaps on a daily basis * Preferably a stand-alone thing that can provide data on an ad hoc basis (as opposed to a CMS-specific thing like a WP plugin or a Drupal module) * PHP preferred but not necessary * OSS / free preferred but not necessary I feel certain that someone else has already wanted this enough to create it. Anyone have a solution they're happy with? Thanks Ken
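The "term hours plus per-day exceptions with a message" model described in this thread can be sketched in a few lines of Python. This is only an illustration of the idea, not the data model of library_hours_rails or the Princeton locations engine; every class, field, and sample value below is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, time
from typing import Dict, List, Optional, Tuple

Hours = Optional[Tuple[time, time]]  # (open, close); None means closed


@dataclass
class TermSchedule:
    start: date
    end: date
    weekly: Dict[int, Hours]  # keyed by date.weekday(), Monday == 0


@dataclass
class Location:
    name: str
    terms: List[TermSchedule] = field(default_factory=list)
    # An exception overrides the term schedule for one day: (hours, message)
    exceptions: Dict[date, Tuple[Hours, str]] = field(default_factory=dict)

    def hours_for(self, day: date) -> Tuple[Hours, str]:
        """Return (hours, message) for a day; exceptions win over term hours."""
        if day in self.exceptions:
            return self.exceptions[day]
        for term in self.terms:
            if term.start <= day <= term.end:
                return term.weekly.get(day.weekday()), ""
        return None, "No schedule published"


# Made-up sample data: weekdays 9-5 over the summer, one holiday closure.
summer = TermSchedule(
    start=date(2015, 6, 1),
    end=date(2015, 8, 31),
    weekly={d: (time(9), time(17)) for d in range(5)},
)
art = Location("Art library", terms=[summer])
art.exceptions[date(2015, 7, 3)] = (None, "Closed for the holiday")

print(art.hours_for(date(2015, 7, 1)))
print(art.hours_for(date(2015, 7, 3)))
```

A "today's hours" widget or a nightly push to an external listing would then just call hours_for(date.today()) and serialize the result.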
[CODE4LIB] Recommendations for places to advertise for a library systems guru?
Hi all, I thought I'd ask folks where one could advertise systems positions beyond the more traditional library venues. The more obvious seem to be LITA/ALA, here at Code4Lib, and perhaps some of the other library organizations. Posting in area newspapers is also a typical move for us. But I'm also considering IEEE and ACM job listings and asking CS faculty for recommendations. I'm sure there are even more that I haven't thought of. So I'm curious about other suggestions or ideas. In particular, are there any that have worked to draw in candidates with a strong IT background? Jon Gorman University of Illinois
[CODE4LIB] Job: Ontology Engineer/Semantic Applications Developer, Cornell University Library
Ontology Engineer/Semantic Applications Developer, Cornell University Library https://cornellu.taleo.net/careersection/10164/jobdetail.ftl?job=25577 Description Join the team advancing open source, linked data initiatives for a world class academic research library on the beautiful Cornell University campus in Ithaca, New York. Ithaca has been named one of the top 100 places to live, a top 10 recreation city, a best green place to live, and one of the “foodiest” towns in America. Apply your experience and unique talents in Albert R. Mann Library as a senior level Ontology Engineer/Semantic Applications Developer on a team promoting innovation and quality in information technology. Develop and promote international standards and frameworks for scholarly content on the web, with frequent opportunities for engagement with the linked data and information science communities. Reinvent core library systems as networks of linked data connecting rich traditional library resources with diverse, distributed knowledge to meet the rapidly evolving needs of today’s researchers and students. Become the architect of compelling web applications and services to library, university, regional, and international projects engaged in disciplines ranging from earth systems and climate science to agricultural research in the developing world. Join the international team developing the open source VIVO-ISF ontology (https://github.com/vivo-isf/vivo-isf-ontology) and VIVO software (http://vivoweb.org and http://github.com/vivo-project/) and lead the creative application of VIVO technology at Cornell. 
The Ontology Engineer/Semantic Applications Developer will: * Research, create, maintain, and extend ontologies, knowledge bases, and software tools in a distributed, linked data environment to support data integration, interoperability, usability, query, analysis, visualization, and dissemination * Provide technical leadership in ontology selection and design including evaluation for consistency, modularity, efficiency, and reasoning; develop mechanisms for ontology versioning, community driven editing, and deployment in software applications in concert with local, national, and international collaborators * Support dramatically increasing the production scale of semantic web applications * Participate in and contribute to open source software communities * Prepare strategic guidance, white papers, project proposals, visualizations, presentations, technical documentation, and reports * Provide training and technical assistance to academic and professional staff at Cornell and partner institutions * Contribute to the full range of information technology services provided by the Cornell University Library (may functionally supervise the work of others and lead project teams) Qualifications * Bachelor’s degree in library science, information science, computer science or other relevant discipline * More than 5 years of relevant experience * Experience designing and implementing OWL ontologies and other metadata standards * Experience applying semantic web and linked data standards to real world applications * Expert level Java programming; web application development experience in Java, Python, Ruby or similar language; knowledge of current database management systems, SQL, and non relational alternatives * Excellent interpersonal and oral and written communication skills * Evidence of ability to assess, analyze, plan, and solve problems creatively and collaboratively in a complex, rapidly-changing environment Preferred Qualifications * Familiarity with principles and 
methodologies for applying reasoning and rules * Experience with agile software development, continuous integration, testing frameworks * A track record of contributing to open source communities * Experience with statistical programming and/or data mining * Experience working in higher education or research Background check may be required. No relocation assistance is provided for this position. Visa sponsorship is not available for this position. Cornell University is an innovative Ivy League university and a great place to work. Our inclusive community of scholars, students and staff impart an uncommon sense of larger purpose and contribute creative ideas to further the university’s mission of teaching, discovery and engagement. Located in Ithaca, NY, Cornell's far-flung global presence includes the medical college's campuses on the Upper East Side of Manhattan and Doha, Qatar, as well as the new Cornell Tech campus to be built on Roosevelt Island in the heart of New York City. Diversity and Inclusion are a part of Cornell University’s heritage. We’re an employer and educator recognized for valuing AA/EEO, Protected Veterans,
[CODE4LIB] Job: Semantic Applications and Linked Data Developer at Cornell University Library
Semantic Applications and Linked Data Developer Albert R. Mann Library Cornell University, Ithaca, NY 14853 Job Posting #25577: http://goo.gl/Oz1PdD Cornell University’s Mann Library IT Team is seeking a senior-level Semantic Applications and Linked Data Developer who will apply innovative knowledge representation techniques and tools to library, university, regional, and international projects. Mann Library is a friendly and collaborative workplace where flexible, thoughtful and self-motivated librarians and IT staff engage together in a portfolio of projects encompassing climate science, international agricultural knowledge sharing, GIS and data visualization, and transitions from library catalogs to linked data. Join the team developing the VIVO software (vivoweb.org and github.org/vivo-project) and VIVO-ISF ontology through an international consortium of universities, research institutions, government agencies, and non-profits, promoting rich semantic interconnectivity among researchers, activities, and outputs anywhere in the world. Responsibilities include: * Researching, synthesizing, and applying the most appropriate knowledge, technologies and tools to improve data harvesting, integration, interoperability, and dissemination to consuming websites and services * Active participation with local, national, and international collaborators on ontology development, standards initiatives, and open source software * Contributing to the full range of information technology services provided by the Cornell University Library (may functionally supervise the work of others and lead project teams) * Preparing strategic guidance, white papers, project proposals, visualizations, presentations, technical documentation, and reports influencing technology development practices and policies and having broad impact within and beyond the University. 
* Providing training and guidance to academic and professional staff at Cornell and partner institutions Required Qualifications: * Bachelor’s degree in an information science (library science, information science, computer science or equivalent) or other relevant discipline and more than 5 years of relevant experience * Experience applying Semantic Web and Linked Data standards (RDF, SPARQL, and OWL) to real-world applications; expert-level Java programming; web application development experience in Java, Python, Ruby or similar language; web presentation layer experience with HTML5, CSS, JavaScript, and JSON; knowledge of current database management systems, SQL, and non-relational alternatives * Excellent interpersonal and oral and written communication skills and a strong, user-centered service orientation * Evidence of ability to assess, analyze, plan, and solve problems creatively and collaboratively in a complex, rapidly-changing environment Preferred Qualifications: * Experience designing, programming, and deploying creative and effective web applications * Experience working with metadata standards, ontologies, and thesauri * Familiarity with data interchange standards, reasoning, rules, XML and XSLT, and with statistics, data mining, text mining libraries, algorithms, and applications * Experience with Agile Software Development methodologies, with UNIX shell scripting, log analysis, and scheduling * Experience contributing source code, ontologies, testing, documentation, and/or support to open source communities * Experience working in higher education or research Background check may be required. No relocation assistance is provided for this position. Visa sponsorship is not available for this position. Cornell University is an innovative Ivy League university and a great place to work. 
Our inclusive community of scholars, students and staff impart an uncommon sense of larger purpose and contribute creative ideas to further the university's mission of teaching, discovery and engagement. Located in Ithaca, NY, Cornell's far-flung global presence includes the medical college's campuses on the Upper East Side of Manhattan and Doha, Qatar, as well as the new Cornell Tech campus to be built on Roosevelt Island in the heart of New York City. Diversity and Inclusion are a part of Cornell University’s heritage. We’re an employer and educator recognized for valuing AA/EEO, Protected Veterans, and Individuals with Disabilities.
Re: [CODE4LIB] Library Privacy, RIP (Was: Canvas Fingerprinting by AddThis)
I don't believe the horse has left the barn forever. As Bruce Schneier says, security is a process, not a product. And as we learn more about this space we can advocate in our own institutions for greater awareness and perhaps adjustments to the technologies we use to evaluate online activity. AddThis and ShareThis probably have limited value for the data they compromise. Google Analytics is probably a much better trade. EZproxy too... Jon On Fri, Aug 15, 2014 at 2:07 PM, Eric Hellman e...@hellman.net wrote: On Aug 14, 2014, at 4:32 PM, William Denton w...@pobox.com wrote: At the university where I work Google Analytics is the standard, and we use it on the library's web site. There's probably no way around that---but we can tell people how to block the tracking, which will help them locally (ironically) and everywhere else. (I use Piwik at home, and like it, but moving to that here would be a long-term project, only partly for technical reasons.) I think a reasonable place to draw a line in the sand is use for advertising. If you look at the Google Analytics site, it doesn't appear that they can use Analytics tracking for advertising, because they don't make the carve-outs for children that I believe would be required if they did. So if you trust Google, and assume they know everything anyway, you can let them track users. AddThis and ShareThis, on the other hand, have TOS that let them use tracking for advertising, and that's what their business is. So, hypothetically, a teen could look at library catalog records for books about childbirth, and as a result, later be shown ads for pregnancy tests, and that would be something the library has permitted. A criminal prosecutor could subpoena either Google or AddThis/ShareThis to obtain tracking data for anyone in your library who had read books about Nazism or the Black Panthers or witchcraft, completely without involving the library. Do you think Google would easily comply with that sort of request? Would AddThis? 
Would EBSCO? At Unglue.it, we use Google Analytics, but we have avoided things like Facebook Like and the third party shares because we didn't like the tradeoff. But maybe the horse has left the barn forever. Eric
Re: [CODE4LIB] very large image display?
Jonathan, We use an image server I wrote, Loris, plus OpenSeadragon. Here's an example: http://libimages.princeton.edu/osd-demo/?feedme=pudl0123%2F8172070%2F01%2F0001.jp2 That image is 152500 x 4000 px: http://libimages.princeton.edu/loris/pudl0123%2F8172070%2F01%2F0001.jp2/info.json Loris is on Github: https://github.com/pulibrary/loris as is OpenSeadragon: https://github.com/openseadragon/openseadragon More generally, this is one of many problems IIIF (International Image Interoperability Framework) exists to try to solve. You might want to check out our site, which has links to other tools as well: http://iiif.io/ Hope this helps, -Jon On 07/25/2014 11:36 AM, Jonathan Rochkind wrote: Does anyone have a good solution to recommend for display of very large images on the web? I'm thinking of something that supports pan and scan, as well as loading only certain tiles for the current view to avoid loading an entire giant image. A URL to more info to learn about things would be another way of answering this question, especially if it involves special server-side software. I'm not sure where to begin. Googling around I can't find any clearly good solutions. Has anyone done this before and been happy with a solution? Thanks for any info! Jonathan
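To make the tiling idea concrete: an IIIF-style viewer asks the server for one rectangular region at a time, using URLs of the form {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. Below is a rough Python sketch that builds such URLs. It assumes Image API 2.x parameter names ("full" size, "default" quality; 1.x servers use "native" for quality), and the helper names, tile size, and sample dimensions are made up for illustration. A real client such as OpenSeadragon reads the supported values from info.json instead of guessing.

```python
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "full", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build an IIIF Image API request URL (2.x-style parameters assumed)."""
    # Slashes inside the identifier must be percent-encoded (%2F).
    ident = quote(identifier, safe="")
    return f"{base}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def tile_regions(width: int, height: int, tile: int = 1024):
    """Yield 'x,y,w,h' regions that cover the full image, row by row."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield f"{x},{y},{min(tile, width - x)},{min(tile, height - y)}"


base = "http://libimages.princeton.edu/loris"
identifier = "pudl0123/8172070/01/0001.jp2"

# Hypothetical dimensions, chosen only to keep the output short.
regions = list(tile_regions(2500, 1000))
print(regions)
print(iiif_image_url(base, identifier, region=regions[0]))
```

The viewer only requests the regions that intersect the current viewport, which is what keeps a very wide image from being downloaded whole.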
Re: [CODE4LIB] iiif compatible servers
Eric, FWIW, an HTTP resolver that could be used with Fedora has been a big topic for Loris recently, and a few of us are trying to spec out what that would look like. The discussion/proposal is here: https://github.com/pulibrary/loris/issues/98 and spreads to a few other linked issues. I'd be happy to hear what you think. -Jon Sent from my mobile. Please excuse typos. -Original Message- From: James, Eric eric.ja...@yale.edu To: CODE4LIB@LISTSERV.ND.EDU Sent: Fri, 25 Jul 2014 17:39 Subject: [CODE4LIB] iiif compatible servers Looking to implement a iiif compatible server, primarily for jp2s in fcrepo3. Just read the 'very large image display?' thread and looking at the http://iiif.io/technical-details.html, it appears options include: loris: https://github.com/pulibrary/loris IIP: http://iipimage.sourceforge.net/documentation/server/ djatoka iiif: ( https://github.com/jronallo/djatoka) The iiif djatoka gem immediately caught my eye as I've implemented djatoka w/ fcrepo3 in a previous project, but am interested if there are any opinions in choosing any one of these over another. Thanks, Eric From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Esmé Cowles [escow...@ticklefish.org] Sent: Friday, July 25, 2014 4:44 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] very large image display? We previously used the Zoomify Flash applet, but now use Leaflet.js with the Zoomify tileset plugin: https://github.com/turban/Leaflet.Zoomify One thing I like about this approach is that it minimizes the amount of Javascript code the clients have to load, since we use Leaflet.js for our maps and it's already loaded. -Esme -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Friday, July 25, 2014 10:36 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] very large image display? Does anyone have a good solution to recommend for display of very large images on the web? 
I'm thinking of something that supports pan and scan, as well as loading only certain tiles for the current view to avoid loading an entire giant image. A URL to more info to learn about things would be another way of answering this question, especially if it involves special server-side software. I'm not sure where to begin. Googling around I can't find any clearly good solutions. Has anyone done this before and been happy with a solution? Thanks for any info! Jonathan
Re: [CODE4LIB] LC Call # splitting/sorting scripts?
This? https://code.google.com/p/library-callnumber-lc/ On 07/11/2014 12:01 PM, Robert Dumas wrote: Hey all: Does anyone know of any scripts (preferably in Ruby or Python) which can slice up an LC call number and sort a table of items by LC call number?
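For a sense of what such a script has to do, here is a deliberately simplified Python sketch. It is not the algorithm used by library-callnumber-lc; the regular expressions and the function name are assumptions for illustration, and it ignores many real-world wrinkles (volume numbers, dates embedded in the class portion, local prefixes). The point is that the class number must compare numerically and cutter digits must compare as decimals, which a plain string sort gets wrong (it puts QA76 before QA9).

```python
import re

# Class letters, optional class number (with decimal), then everything else.
LC_PATTERN = re.compile(
    r"""^\s*
    (?P<letters>[A-Z]{1,3})\s*
    (?P<number>\d+(?:\.\d+)?)?\s*
    (?P<rest>.*)$""",
    re.VERBOSE,
)
# A cutter: optional period, one letter, then digits read as a decimal fraction.
CUTTER = re.compile(r"\.?([A-Z])(\d+)")


def lc_sort_key(call_number: str):
    """Split an LC call number into a tuple that sorts in shelf order."""
    m = LC_PATTERN.match(call_number.upper())
    if not m:
        return ("", 0.0, (), call_number)
    number = float(m.group("number") or 0)
    # Cutter digits are decimals, so comparing them as strings is correct:
    # "35" < "4" just as .35 < .4
    cutters = tuple(CUTTER.findall(m.group("rest")))
    return (m.group("letters"), number, cutters, m.group("rest"))


items = [
    {"title": "Item 1", "call_number": "QA76.73.R83 F58 2013"},
    {"title": "Item 2", "call_number": "QA9 .L5"},
    {"title": "Item 3", "call_number": "PS3545 .I5 1990"},
    {"title": "Item 4", "call_number": "QA76.9 .D3 C4"},
    {"title": "Item 5", "call_number": "QA76.73.P98 L8"},
]

shelf_order = sorted(items, key=lambda row: lc_sort_key(row["call_number"]))
for row in shelf_order:
    print(row["call_number"], "|", row["title"])
```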
Re: [CODE4LIB] Book Club software tools and approaches?
I like the Google Drive Form idea. Might be able to do that or some variation. Thanks! Jon Gorman On Tue, Jul 1, 2014 at 3:39 PM, Matt Cordial matt.cord...@gmail.com wrote: We've been using a G+ community for event announcements and discussions. It's been fine. We're pretty small so we don't need a lot in terms of management. https://plus.google.com/communities/113393567679559625537 For voting, we've used both a Google Drive Form and simply a discussion thread. On Tue, Jul 1, 2014 at 6:38 AM, Jon Gorman jonathan.gor...@gmail.com wrote: Hi all, I've been musing on software tools that might be useful for book clubs. I'm not necessarily looking for a turnkey solution explicitly geared towards book clubs, but more a thought experiment of what tools might be useful for an ongoing in the real world book club. Some needs that software tools might help keep track of: * A way to vote for what books to read next * Schedule of times * An estimator calculator (reading level of book + length of book, estimated sessions). * way to add notes or linked materials * online discussions to supplement in person meetings * glossary/dictionary functionality perhaps? In my own thoughts some of the online services like GoodReads, Shelfari and LibraryThing seems to at least offer some tools and information. A system that I haven't had a chance to explore enough, Loomis, might help with the decision making parts. Part of the impetus for this is I've recently joined a technical book club. At the moment we're using a wiki, which is working fine, but in particular the voting is clunky. I could picture something where members can add/link to something like librarything in a list and the book with the most votes (w/ ties being broken randomly) is the next book in the queue. So anyone out there already doing something similar? Thoughts? Ideas? Jon Gorman University of Illinois
[CODE4LIB] Book Club software tools and approaches?
Hi all,

I've been musing on software tools that might be useful for book clubs. I'm not necessarily looking for a turnkey solution explicitly geared towards book clubs, but more a thought experiment about what tools might be useful for an ongoing, real-world book club.

Some needs that software tools might help keep track of:

* A way to vote for what books to read next
* Schedule of times
* An estimator calculator (reading level of book + length of book, estimated sessions)
* A way to add notes or linked materials
* Online discussions to supplement in-person meetings
* Glossary/dictionary functionality, perhaps?

In my own thoughts, some of the online services like GoodReads, Shelfari and LibraryThing seem to at least offer some tools and information. A system that I haven't had a chance to explore enough, Loomis, might help with the decision-making parts.

Part of the impetus for this is that I've recently joined a technical book club. At the moment we're using a wiki, which is working fine, but the voting in particular is clunky. I could picture something where members can add/link to something like LibraryThing in a list, and the book with the most votes (w/ ties being broken randomly) is the next book in the queue.

So, anyone out there already doing something similar? Thoughts? Ideas?

Jon Gorman
University of Illinois
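The voting rule described above (most votes wins, ties broken randomly) is small enough to sketch in a few lines of Python. This is only an illustration under simple assumptions: each ballot is a single book title, and the `pick_next_book` name is made up for the example rather than taken from any existing tool.

```python
import random
from collections import Counter


def pick_next_book(ballots, rng=None):
    """Return the title with the most votes, breaking ties at random."""
    if not ballots:
        raise ValueError("no ballots cast")
    rng = rng or random.Random()
    tally = Counter(ballots)
    top = max(tally.values())
    # Sort the tied titles so a seeded rng gives a reproducible pick.
    tied = sorted(title for title, votes in tally.items() if votes == top)
    return rng.choice(tied)


ballots = ["Refactoring", "SICP", "Refactoring", "The Pragmatic Programmer"]
next_book = pick_next_book(ballots)
```

Passing a seeded `random.Random` makes the tie-break repeatable, which may matter if members want to audit how a tie was resolved.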
Re: [CODE4LIB] College Question!
Riley,

First, I wonder if there's anyone on this list who doesn't wish they had your foresight! You already have a rare opportunity in that you're thinking about this now and not in your mid-20s, so way to go!

We spoke about this a little @ the c4l conference, but I'll say more. I majored in music performance and even did a masters in it as well, which means that practically speaking I have a high school education. :-) I don't really mean that, but until you've had the experience it's difficult to explain (or at least I find it difficult) how relevant a degree in the arts/humanities can be to a job in technology--and there's no shortage of people who have taken this exact path.

I did do an MLS, but see above re: high school education. At the time (~13 yrs ago) I felt like I needed to do it to get a job (I also didn't necessarily expect to wind up in systems, but that's another story), but, honestly, everything I know I learned on the job, or /a/ job, or the overnight hours between going to said job, which leads me to my point:

Wherever you go to school, and regardless of your major, if you ultimately want to wind up working in a library, you should start now. Any brick and mortar university is going to have student jobs available (work study or otherwise) at the library, and while it may just be as a desk clerk or whatever, keep your ears open (we already know you're not shy): at some point there's going to be some stats that need munging, some Access (or even worse) database that needs migration, some web work to be done, or whatever and, et voilà, you're off!

The point is, professional degree != professional experience, and--frankly--you probably don't want to be working at a place that requires a systems librarian to have an MLIS anyway, and certainly not in 4-5 years. Get as much experience as possible, do a CS degree, but also learn how to write and communicate OR do an arts degree, but also learn how to program (etc.), and you'll be fine.
-Jon On 05/28/2014 11:17 PM, Riley Childs wrote: I was curious about the type of degrees people had. I am heading off to college next year (class of 2015) and am trying to figure out what to major in. I want to be a systems librarian, but I can't tell what to major in! I wanted to hear about what paths people took and how they ended up where they are now. BTW Y'All at NC State need a better tour bus driver (not the c4l tour, the admissions tour) ;) the bus ride was like a rickety roller coaster... Also, if you know of any scholarships please let me know ;) you would be my BFF :P Riley Childs Student Asst. Head of IT Services Charlotte United Christian Academy (704) 497-2086 RileyChilds.net Sent from my Windows Phone, please excuse mistakes
[CODE4LIB] New IIIF API specifications drafts published
The IIIF Editors are pleased to announce draft revisions of the International Image Interoperability Framework Image and Presentation (formerly 'Metadata') API specifications. * http://iiif.io/api/image/2.0/ * http://iiif.io/api/presentation/2.0/ These releases reflect a significant amount of input from both the IIIF working groups and the larger library, archives, and museum communities following roughly a year of experience either implementing or experimenting with the previous versions. A complete list of the changes can be found on the IIIF website: * http://iiif.io/api/image/2.0/change-log.html * http://iiif.io/api/presentation/2.0/change-log.html We welcome your feedback, questions, and use cases, and encourage you to submit them to the IIIF Discussion Listserv: iiif-disc...@googlegroups.com. Drafts will be kept open for comment until the beginning of August, with the goal of final release in September. However, we would appreciate feedback early in order to work on and gain consensus for any necessary changes. Sincerely, The IIIF Image and Presentation API Editors: Benjamin Albritton Michael Appleby Robert Sanderson Stuart Snydman Jon Stroop Simeon Warner -- Jon Stroop Digital Initiatives Developer/Analyst Princeton University Library jstr...@princeton.edu
Re: [CODE4LIB] Call for Old Conf Tshirt Logos
I'll try to do some digging as well Jon Gorman On Fri, Apr 11, 2014 at 9:38 AM, Lisa Rabey academichu...@gmail.com wrote: On Fri, Apr 11, 2014 at 8:30 AM, Francis Kayiwa fkay...@colgate.edu wrote: +1 Go for it Lisa! ./fxk I can start digging into the hows/whys sometime in early May and report back. If anyone has anything of interest (past C4L list convos, recommendations, etc), pass them along! -- Lisa M. Rabey | @pnkrcklibrarian http://exitpursuedbyabear.net | http://lisa.rabey.net
Re: [CODE4LIB] Call for Old Conf Tshirt Logos
I've long thought a friends of code4lib would be useful organization, but never quite pulled it together... On Apr 10, 2014 10:41 PM, Tom Cramer tcra...@stanford.edu wrote: Is black light a 501c3? Nope. Just an OSS project with lots of contributors from awesome places : ) Off the top of my head, and in alphabetical order, the obvious (to me) ones in this space that might be candidates are DuraSpace and Lyrasis. In time, DP.LA seems like a great possible candidate, though it is US-centric, I'm unsure of its corporate status (though they do seem to be able to cash and sign checks), and right now they might view C4L as a distraction more than an asset or timely alliance. (Others on this list might be in a better position to comment, ahem...) I'm sure I'm leaving out other possibilities. - Tom Riley Childs Student Asst. Head of IT Services Charlotte United Christian Academy (704) 497-2086 RileyChilds.net Sent from my Windows Phone, please excuse mistakes From: Roy Tennantmailto:roytenn...@gmail.com Sent: 4/10/2014 11:25 PM To: CODE4LIB@LISTSERV.ND.EDUmailto:CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos We should probably toss out some ideas before approaching anyone. Getting the right fit would be important. Which 501(c)3's in our space do we think we may want to approach about being our fiscal agent? Maybe we should collect a list of suggestions and then (natch) vote on who to approach? We could then go down the list until we got a yes. Roy On Thu, Apr 10, 2014 at 8:21 PM, Riley Childs rchi...@cucawarriors.com wrote: That might be a better idea then a fully independent code4lib organization. Riley Childs Student Asst. 
Head of IT Services Charlotte United Christian Academy (704) 497-2086 RileyChilds.net Sent from my Windows Phone, please excuse mistakes From: Tom Cramermailto:tcra...@stanford.edu Sent: 4/10/2014 11:20 PM To: CODE4LIB@LISTSERV.ND.EDUmailto:CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos What about approaching one of the existing 501c3's in our space to see if they might be interested in and able to take this on for the community? In addition to shirt revenues and yacht maintenance fees, it would be good to have an agency that could help do banking for scholarships, and perhaps pay forward any surpluses from one year's conference to the next year's hosts. - Tom On Apr 10, 2014, at 8:10 PM, Riley Childs wrote: No, I think it should go toward my yacht ;P. In all seriousness, code4lib needs an entity, simply to collect money for this sorta thing. LegalZoom any one? ;) Riley Childs Student Asst. Head of IT Services Charlotte United Christian Academy (704) 497-2086 RileyChilds.net Sent from my Windows Phone, please excuse mistakes From: Alicia Cozinemailto:ali...@curationexperts.com Sent: 4/10/2014 11:07 PM To: CODE4LIB@LISTSERV.ND.EDUmailto:CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos Could one of the scholarship sponsors adopt this as a way to fund future conference scholarships? Alicia On Apr 10, 2014, at 9:53 PM, Roy Tennant roytenn...@gmail.com wrote: That's good on the tax front, but it would be nice if eventually we could find a way to make money to help out with the conference. But that will take an organization, and so far we've avoided that. Roy On Thu, Apr 10, 2014 at 5:42 PM, Riley Childs rchi...@cucawarriors.com wrote: It is running though spreadshirt set up with 0% commissions, so no monies are being collected. I think as long as I don't collect any money, we should be good. 
Riley Childs Junior IT Admin email: rchi...@cucawarriors.com office: +1 (704) 537-0031 x101 cell: +1 (704) 497-2086 Please Think Before Hitting Reply All I Do Web Design! RileyChilds.net/services From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cary Gordon [listu...@chillco.com] Sent: Thursday, April 10, 2014 8:27 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos I hope that the IRS doesn't put a lien on the yacht he buys with the proceeds. You're right, though. Probably better if some organization or institution could step up. Again we are slightly challenged by our state of non-organization. Cary On Apr 10, 2014, at 4:36 PM, Roy Tennant roytenn...@gmail.com wrote: I think one of the things that has held us back about the store in the past was the lack of a fiscal agent. That is, someone is going to be taking in money on behalf of Code4Lib (presumably), but where does it go? Since we have no organization we have no fiscal presence. No bank
[CODE4LIB] Newcomer dinner - Pit Group 4
Group 4 for The Pit: It seems like there will be a sizable exodus from the conf hotel to the restaurant around 6 PM, so let's plan to meet then or shortly before in the lobby so that we can get ourselves organized. I'll find a way to make myself known to you. -Jon
Re: [CODE4LIB] Book scanner suggestions redux
Great points, Jason! We have run into the same issue with Windows 7 drivers on our ILL scanner here.

Jon Goodell, MA, AHIP
UAMS Reference and Outreach Librarian
501-526-5641, jgood...@uams.edu

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of jason bengtson
Sent: Wednesday, March 19, 2014 6:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Book scanner suggestions redux

It's interesting to me to see people question the long term viability of an open source project. Not because it isn't a valid concern, but because, especially with scanners, the same issue arises with the commercial stuff. Just recently we have had to do some finagling with two very expensive ILL scanners so that we can isolate them from the network. Minolta doesn't make any Windows 7 or later drivers for them (nor does anyone else), effectively anchoring them to XP. I've seen this a few times now with scanners (probably because they tend to be longer term investments than other peripherals). The same seems to happen a lot with medical imaging devices. If I were a cynic I might suspect that Minolta and friends wanted to ensure turnover.

I'm viewing the current situation as a stopgap until we can look at replacing the scanners, but, when we do that, I intend to move forward on much lower-priced alternatives. Given that, for a variety of reasons, we're pretty much a Windows shop, and given what seems to be the increasing pace of Windows releases, I feel like we have to consider that our scanners will have a highly indeterminate but likely limited shelf life. It's too bad . . . some company could probably do well by creating and selling third party drivers for some of these old imaging machines.

Best regards,
Jason Bengtson, MLIS, MA
Head of Library Computing and Information Systems
Assistant Professor, Graduate College
Department of Health Sciences Library and Information Management
University of Oklahoma Health Sciences Center
405-271-2285, opt.
5
405-271-3297 (fax)
jason-bengt...@ouhsc.edu
http://library.ouhsc.edu
www.jasonbengtson.com

NOTICE: This e-mail is intended solely for the use of the individual to whom it is addressed and may contain information that is privileged, confidential or otherwise exempt from disclosure. If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify us by replying to the original message at the listed email address. Thank You.

On Mar 19, 2014, at 5:50 AM, Johannes Baiter johannes.bai...@gmail.com wrote:

Hi all, spreads developer chiming in here :-)

@Cindy: I'm curious - how does the shooting time per page compare to something like a Minolta PS7000? We've got an old PS7000, but my experience with the one I've used before was that it took sooo long to shoot each page. Also, the PS7000 model didn't accommodate a bound volume that wouldn't open flat all that well. Would this be an improvement over that?

With my Canon A2200s I can currently shoot at 1400-1500 pages per hour, although the bottleneck is probably my lifting the cradle/flipping the pages.

@Aaron: It seems like the software piece is a big variable with the DIYBookScanner. It's interesting to hear about various setups, I just wonder about the long(ish) term viability of some of these open source projects. Obviously, the software is essential for an efficient system and I'm not sure we're interested in building/maintaining our own suite of tools.

While I can't give any guarantees, I'm very optimistic that I'll continue development for the foreseeable future.
I'm very passionate about the software and the project (DIYBookScanner) as a whole and my list of things I'd like to do in the software should probably suffice for at least the next two years :-) And even in the case that I should be hit by a bus, I've tried to make the code as clear and idiomatic as possible, so an experienced Python developer should be able to get up to speed pretty quickly. Additionally, as Raffaele already mentioned, spreads is very modular, you can add your own functionality very easily through the Plugin API.

If you are playing with the thought of using spreads in your institution, please drop me a message, I would love to hear about your workflow and what kinds of things you'd like the software to do.

All the best,
Johannes
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
On Fri, Jan 24, 2014 at 12:19 PM, Edward Summers e...@pobox.com wrote: Luckily nobody’s really using it ; so it’s not a huge problem :-D Gee, thanks Ed. :-) Jon
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
On Fri, Jan 24, 2014 at 11:16 AM, Robert Sanderson azarot...@gmail.comwrote: All in my opinion, and all debatable. I hope that your choice goes well for you, I'd like to repeat: just because I agree with that choice, and I'm defending it here, it wasn't my choice. Not at all. And the concerns you express were well-aired and very carefully considered before the choice was made. And yours :) Ok, that makes me feel a bit personally defensive... I just want to be sure that it's clear that while I agree with my client, as a developer I'm not happy with opaque URIs for predicates any more than you are. The defaults that I've written into the Open Metadata Registry for coining URIs are: opaque numeric for value vocabularies, and camel-casing of the label in the default language of the vocabulary for predicate vocabularies. I think that's the way it should usually work -- my personal best practice. But these are decisively multilingual vocabularies, without a 'default' language for the labels. It's a French and Spanish and English and Hebrew and Arabic and Italian (etc.) vocabulary. It's not an English vocabulary. There's no default label to use. The obvious (and well-researched) solution is an entirely opaque, non-lexical URI. When I, wearing a developer hat, insist (as you do) that it makes the vocabularies virtually impossible to be used in development, my client regrets that there doesn't seem to be any other solution. The solution that we came up with was that, rather than have no lexical URIs, we would have _all_ of the lexical URIs, and declare them as owl:sameAs. We could have used owl:equivalentProperty (and we may have to in some cases where the translation isn't lexical but rather conceptual) but it's not as strong. The significant downside is that it immediately makes the vocabularies owl:full. 
At some point in the future, we may publish the mappings from lexical to opaque as a separate map for each language that can be included in the vocabulary or not and that would 'solve' the owl:full problem, sortof. The rejection of a single lexical URI in English wasn't 'politically correct' in the pejorative sense that we usually use that phrase, but rather an acknowledgement and embrace of a multilingual community. It was the politic thing to do. And yeah, our solution is debatable, but it's a debate we've often had over the years, with both colleagues and clients, in public and in private, and sometimes there's just no pleasing everyone, so we just do the best we can with the tools we have, eh? And build some new tools, which we're also working on. Cheers for the useful debate. Jon
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
Hi Rob, the conversation continues below... On Thu, Jan 23, 2014 at 7:01 PM, Robert Sanderson azarot...@gmail.comwrote: Hi Jon, To present the other side of the argument so that others on the list can make an informed decision... Thanks for reminding me that this is an academic panel discussion in front of an audience, rather than a conversation. On Thu, Jan 23, 2014 at 4:22 PM, Jon Phipps jphi...@madcreek.com wrote: I've developed a quite strong opinion that vocabulary developers should not _ever_ think that they can understand the semantics of a vocabulary resource by 'reading' the URI. 100% Agreed. Good documentation is essential for any ontology, and it has to be read to understand the semantics. You cannot just look at oa:hasTarget, out of context, and have any idea what it refers to. However if that URI is readable it makes developers lives much easier in a lot of situations, and it has no additional cost. Opaque URIs for predicates is the digital equivalent of thumbing your nose at the people you should be courting -- the people who will actually use your ontology in any practical sense. It says: We don't care about you enough to make your life one step easier by having something that's memorable. You will always have to go back to the ontology every time and reread this documentation, over and over and over again. What you suggest is that an identifier (e.g. @azaroth42 or ORCID: -0003-4441-6852 https://orcid.org/-0003-4441-6852) should always be readable as a convenience to the developer. RDA does provide a 'readable in the language of the reader' uri specifically as a convenience to the developer. A feature that I lobbied for. It's just not the /canonical/ URI, because it's an identifier of a property, not the property itself, and that property is independent of the language used to label it. 
It's the difference between Metadata Management Associates, PO Box 282, Jacksonville, NY 14854, USA (for people) and 14854-0282 (a perfectly functional complete address in the USA namespace), which is precisely the same identifier of that box for machines, and ultimately for the postmaster, who doesn't care whose name is on the box numbered 282, who only needs to know that highly memorable name when someone uses the convenience of not bothering to look up the box number and just sends mail addressed to us at 14854, or even just Jacksonville. And no I don't want to start a URL vs. URI/URN/IRI discussion. Do you have some expectation that in order for the data to be useful your relational or object database identifiers must be readable? Identifiers for objects, no. The table names and field names? Yes. How many DBAs do you know that create tables with opaque identifiers for the column names? How many XML schemas do you know that use opaque identifiers for the element names? My count is 0 from many many many instances. And the reason is the same as having readable predicate URIs -- so that when you look at the table, schema, ontology, triple or what have you, there is some mnemonic value from the name to its intent. Our experience obviously differs in this regard. I've seen many, many databases that have relatively opaque column identifiers that were relabeled in the query to suit the audience for the query. I've seen many French databases, with French content, intended for a French audience, designed by French developers, that had French 'column headers'. The point here is that the identifiers /identify/ a property that exists independent of the language of the data being used to describe a resource. If RDA _had_ to pick a single language to satisfy your requirement for a single readable identifier, which one? 
To assume that the one language should be English says to the non-english speaking world We don't care about you enough to make your life one step easier by having something that's memorable By whom, and in English? This to me is a frankly colonial assumption of the dominance of English in the world of metadata. In the world of computing in general. for if while ... all English. While there are turing complete languages out there, the ones that don't have real world language constructions are toys, like Whitespace for example. Even the lolcats programming language is more usable than whitespace. Again, it's a cost/value consideration. There are many people who will understand English, and when developers program, they're surrounded by it. If your intended audience is primarily people who speak French, then you would be entirely justified in using URIs with labels from French. Or Chinese, though the IRI expansion would be more of a pain :) Despite the fact that developers are surrounded by English I've worked with many highly skilled developers who didn't speak or read English. Who relied on documentation and meetings in their own language. What RDA is trying to convey is the specific bibliographic knowledge
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
Hi Ben,

On Thu, Jan 23, 2014 at 4:48 AM, Ben Companjen ben.compan...@dans.knaw.nl wrote:
> Returning an HTML document (or XML document as I get) in response to a request for an RDA property or class is wrong in the Linked Data sense [note 1]. This is explained in the W3C WG Note that you referred to in recipe 2 [2].

I'm the co-author of that note, so I'm all too familiar with it. :-) At the moment, it shouldn't be possible to request html from rdaregistry.info without getting redirected to www.rdaregistry.info (hosted on github using github pages). Although I'm doing a minimal job of checking the HTTP Accept header.

> Are you planning on introducing 303-redirects?

I'm deeply embarrassed (really) by the fact that the redirect is not a 303 and that it may not be consistent. As well as by the fact that it doesn't return the requested fragment (which I still believe is best practice). So, yeah, as soon as I get back from the ALA Midwinter conference (sooner if I can get some meeting-free time). I'll at least get a 303 redirect header in there (still learning nginx).

Cheers!
Jon
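For readers following along, the behaviour under discussion (inspect the Accept header, then answer a request for a term URI with a 303 See Other pointing at a suitable document) can be sketched in Python. This is not how rdaregistry.info is configured: the `example.org` targets and the function names are placeholders invented for illustration, and wildcard media ranges simply fall through to the default.

```python
# Hypothetical document locations, for illustration only.
REPRESENTATIONS = {
    "text/html": "http://www.example.org/Elements/a/",
    "text/turtle": "http://example.org/Elements/a.ttl",
    "application/rdf+xml": "http://example.org/Elements/a.rdf",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def redirect_for(accept_header, default="text/html"):
    """Return (status, location) for a request to a vocabulary term URI."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in REPRESENTATIONS:
            return 303, REPRESENTATIONS[media_type]
    return 303, REPRESENTATIONS[default]
```

The status is 303 in every branch because the term URI names a property, not a document, so the server points the client at a document about it instead of answering 200 directly.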
[CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
Well, the notion of 'beta' is a bit complicated... The vocabularies aren't beta and shouldn't be regarded as such. They've been well-vetted and reviewed, and various folks, including me, have been working on them for quite a few years, with plenty of feedback from quite a few 'communities'. That said, the dereferencing service infrastructure isn't yet quite right, but we're pretty happy that it mostly works the way we need it to right now -- it's not just good, it's good enough. For now.

I've developed a quite strong opinion that vocabulary developers should not _ever_ think that they can understand the semantics of a vocabulary resource by 'reading' the URI. Do you have some expectation that in order for the data to be useful your relational or object database identifiers must be readable? By whom, and in English? This to me is a frankly colonial assumption of the dominance of English in the world of metadata. The proper understanding of the semantics, although still relatively minimal, is from the definition, not the URI.

Our coining and inclusion of multilingual (eventually) lexical URIs based on the label is a concession to developers who feel that they can't effectively 'use' the vocabularies unless they can read the URIs. Go for it. Use them. The machines, if they're configured correctly, will fetch the correct URI permanently.

I grant that writing ad hoc SPARQL queries with opaque URIs can be intensely frustrating, but the vocabularies aren't designed specifically to support that incredibly narrow use case. If you want to see/use label-based browse, use the Open Metadata Registry (and yes that could be improved too): http://metadataregistry.org/schemaprop/list/schema_id/81.html

Ultimately I'm not responding on this list to defend decisions that I didn't personally make, despite the fact that I completely support the decision.
WRT the bug you mention, please take the trouble to put an issue on GitHub so we can track it: https://github.com/RDARegistry/RDA-Vocabularies/issues

...but, the issue isn't that the sameAs assertions don't appear in the Turtle representation, it's that they do appear in the N3 representation we've published using = (valid N3, invalid Turtle), and we don't actually publish Turtle at the moment, even if that's what you ask for. We publish N3 generated using the very useful RDF translation service: http://rdf-translator.appspot.com/

...which uses RDFLib to generate N3, and there appears to be a bug in RDFLib that isn't a bug: https://github.com/RDFLib/rdflib/issues/218

I haven't had time to effectively research our options, but clearly we need to either generate both Turtle and N3 serializations (my preference), or just Turtle.

Jon

On Thu, Jan 23, 2014 at 10:50 AM, Dan Scott deni...@gmail.com wrote:

On Thu, Jan 23, 2014 at 10:08 AM, Jon Phipps jphi...@madcreek.com wrote:

Hi Ben,

On Thu, Jan 23, 2014 at 4:48 AM, Ben Companjen ben.compan...@dans.knaw.nl wrote:

Returning an HTML document (or XML document as I get) in response to a request for an RDA property or class is wrong in the Linked Data sense [note 1]. This is explained in the W3C WG Note that you referred to in recipe 2 [2].

I'm the co-author of that note, so I'm all too familiar with it. :-) At the moment, it shouldn't be possible to request html from rdaregistry.info without getting redirected to www.rdaregistry.info (hosted on github using github pages). Although I'm doing a minimal job of checking the HTTP Accept header.

Are you planning on introducing 303-redirects?

I'm deeply embarrassed (really) by the fact that the redirect is not a 303 and that it may not be consistent.
As well as by the fact that it doesn't return the requested fragment (which I still believe is best practice). So, yeah, as soon as I get back from the ALA Midwinter conference (sooner if I can get some meeting-free time). I'll at least get a 303 redirect header in there (still learning nginx). Oh. I'm going to take a guess that this announcement was pushed out to meet an ALA Midwinter deadline, and therefore was a tad premature. If that's the case (or even if not), why not market it as a beta, collect up the known bugs in a visible place, and (perhaps most importantly) invite the denizens of the W3C Public Linked Open Data mailing list to weigh in on the opaque identifiers vs. meaningful identifiers vs. both opaque + meaningful direction? You want this vocabulary to be adopted and used; it would be really good to have their buy-in to the vision. In my opinion, I think it would be a mistake to continue with the opaque identifiers as the primary identifiers; the vocabulary is almost unreadable as it stands. And I believe they will make communication between people trying to implement it harder
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
Hi Dan,

Thanks for taking such an interest! Regarding your questions and concerns:

'slash' vs. 'hash' URIs: As a matter of design, we coin URIs for retrieval of information about the resource identified by the URI by machines, not humans. The most current formal rules[1] state that retrieving a 'slash' fragment should return just that fragment when resolved. We're currently breaking that rule by always returning the entire vocabulary, as if it was indeed using hash URIs, and will fix it in the next few weeks. An example of such a fragment (generated by the Open Metadata Registry for http://rdaregistry.info/Elements/w/P10001) is here: http://metadataregistry.org/schemaprop/show/id/15304.rdf

We believe, as a matter of good design, that URIs coined for large vocabularies should minimize retrieval bandwidth, particularly since it's highly unlikely that the entire vocabulary will (or should) be retrieved when the properties are used individually as part of an application profile. The entire vocabulary can always be acquired by requesting it from the vocabulary's namespace URI: http://rdaregistry.info/Elements/w/

Lexical (readable, but not semantic) URIs: One of the most common misuses of vocabularies is the misunderstanding of the semantics of the property identified by the URI based on the user's personal, colloquial, or domain-specific interpretation of the semantics of the URI (dc:title is the one I've seen misused most often). So we believe that good vocabulary design _should_ obscure the semantics, requiring that the actual vocabulary documentation be viewed by a human.

The other problem is that the 'semantics' are most often broadly identified with the lexical label used in the URI. Vocabularies, no matter how stable semantically, _will_ evolve and that evolution often results in a change to the label(s), even if the semantics communicated by the URI don't change. And then there's the issue of spelling (British English vs. American English) and language.
Should we assume that the entire world must use, and _understand_ English in order to effectively use a vocabulary? We don't think so. To at least partially address this we have coined multiple URIs for each property, as explained here: http://www.rdaregistry.info/Elements/e/ All RDA URIs have both an immutable canonical form and a 'readable', lexical form, which is subject to change (changes will be redirected). The lexical URIs follow the naming convention you identified and are largely based on the current English (British) label. Content-type: application/octet-stream: We just got the server (nginx) setup yesterday and we haven't yet set the mime types correctly. Again we'll fix that very shortly. Jon Phipps Metadata Management Associates Open Metadata Registry [1] http://www.w3.org/TR/swbp-vocab-pub/ Jon On Wed, Jan 22, 2014 at 12:57 PM, Dan Scott deni...@gmail.com wrote: I'm still pretty new at this linked data thing, but I find it strange that RDA element properties URIs such as http://rdaregistry.info/Elements/a/P50034 and http://rdaregistry.info/Elements/a/P50209 both return the same HTML page in a browser. Would it not have been more usable if the properties used hash-URIs that could have located the particular property on the particular page (e.g. http://rdaregistry.info/Elements/a#P50034)? Also, a plain curl request returns Content-type: application/octet-stream -- but it's pretty clearly Turtle, so I think that should be Content-type: text/turtle I would have liked to see more meaningful URIs--like http://rdaregistry.info/Elements/agent/addressOf instead of http://rdaregistry.info/Elements/a/P50209--as meaningful URIs seem a lot more approachable to this non-machine, but I guess that would have been a lot more work. On Tue, Jan 21, 2014 at 10:45 AM, Diane Hillmann metadata.ma...@gmail.com wrote: Folks: I hope this announcement will be of general interest (and apologies if you receive more than one). 
Diane -- Forwarded message -- From: JSC Secretary jscsecret...@rdatoolkit.org Date: Tue, Jan 21, 2014 at 10:23 AM Subject: [rules] Publication of the RDA Element Vocabularies snip recipients RDA colleagues, See the announcement below, also posted on the JSC website. Feel free to share this information with your colleagues. Regards, Judy Kuhagen = = = = = The Joint Steering Committee for Development of RDA (JSC), Metadata Management Associates, and ALA Publishing (on behalf of the co-publishers of RDA) are pleased to announce that the RDA elements and relationship designators have been published in the Open Metadata Registry (OMR) as Resource Description Framework (RDF) element sets suitable for linked data and semantic Web applications. The elements include versions unconstrained by Functional Requirements for Bibliographic Records (FRBR) and Functional Requirements for Authority Data (FRAD), the standard library models
Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies
Hi Karen, On Wed, Jan 22, 2014 at 6:40 PM, Karen Coyle li...@kcoyle.net wrote: I would still prefer something memorable at this stage. The 'lexical', and therefore more memorable, URIs based on the English label will always resolve to the canonical URI. If the lexical label changes, but the semantics don't change, both the old and new lexical URIs will still resolve to the same canonical URI. Of course if both the label and the semantics change, then it's a new property and gets a new URI. We think that what's urgently needed is a far, far better html representation of the vocabularies: one that makes it obvious that humans can guess mnemonically at a resolvable URI from the label, bearing in mind that this will (hopefully) cause machines (and browsers) to follow the inevitable redirect to the canonical URI. We're actively working on that better representation. Jon
Re: [CODE4LIB] problem in old etd xml files
Right, hence my earlier suggestion of just replacing the entities ;). It's not exactly the approach you describe, as yours would deal with common cases that didn't get properly set up in the dtd, but it also would be a bit more difficult to map for weird custom entities. My email was a bit rambling, but the magic sauce I recommended was something like xmllint --loaddtd --noent --dropdtd FRONT.XML > FRONT_nodtdent.xml (In reality you'd want to automate that a little more; xmllint uses the libxml libraries if I remember correctly, so there are likely bindings that do the same thing.) What that seems to do is load the dtd (which xmllint no longer does unless it needs to), take any entity and replace it with what's in the dtd, and then just drop the dtd. I didn't look closely, but it doesn't seem to just replace it with the numeric code (&#255;), but uses the actual unicode character. (You still need to fix the several mistakes that have already been observed and pointed out by folks like Jason: the xml:stylesheet that needs to be xml-stylesheet, and making sure the filenames are actually correct for case-sensitive OSes.) Jon G. On Mon, Dec 9, 2013 at 7:48 PM, Roy Tennant roytenn...@gmail.com wrote: For my money, the text transform should look only for exact matches (e.g., &aacute;, &nbsp;, &copy;) and replace them with their numeric counterparts. Roy On Mon, Dec 9, 2013 at 5:41 PM, jason bengtson j.bengtson...@gmail.com wrote: For testing purposes I just nixed them. As I noted, to rework the file a person would probably want to use a more critical eye with find and replace. Totally doable. On Dec 9, 2013, at 7:37 PM, Jon Gorman jonathan.gor...@gmail.com wrote: How did you fix the ampersands? I ask, because if you just did a simple text transform from & to &amp;, it would mask the problem of the entity escaping I think... Not at work, so I don't have a good example and the file is downloading very slowly here, so I'll try to do one from memory. 
There were several &aacute; in the XML which mapped to an accented character in the DTD via the Entity. If you just substituted & with &amp;, you'd get &amp;aacute;, which would render inline as the literal text &aacute;. It would superficially solve the issue since browsers would no longer give the errors about the dtd since it wouldn't be trying to load entities from the DTDs. And depending how you did it, you likely could also replace a correctly encoded one to make &amp;amp;, leading to some very odd stuff. I wouldn't be surprised to find some unescaped ampersands, but the solution I posted will essentially replace the entities with their text, hopefully causing most characters to appear correctly. You definitely still need to fix some of the other stuff. (I suspect it never worked for most browsers and XML systems, most likely only IE). Jon Gorman University of Illinois Best regards, Jason Bengtson, MLIS, MA Head of Library Computing and Information Systems Assistant Professor, Graduate College Department of Health Sciences Library and Information Management University of Oklahoma Health Sciences Center 405-271-2285, opt. 5 405-271-3297 (fax) jason-bengt...@ouhsc.edu http://library.ouhsc.edu www.jasonbengtson.com
Re: [CODE4LIB] problem in old etd xml files
A lot of modern systems won't load entities (or will limit it somehow) because of the denial of service attack that is possible. Look for XML Entity Reference Denial of Service. I can't remember if PUBLIC declarations are treated any differently than SYSTEM ones. (I would have suspected it to trust SYSTEM ones more, but they'd still be exploitable by the same bug). (There's also a fair number of other errors, so I'm somewhat skeptical that the example worked on many browsers even then. It's possible IE was flexible enough that it would have worked). One thing you might want to do is take out the entities. I can't remember why I had to do this, but xmllint seemed to do the trick. (I found a snippet at http://stackoverflow.com/questions/614067/how-to-resolve-all-entity-references-in-xml-and-create-a-new-xml-in-c, but it's missing the necessary --loaddtd.) xmllint --loaddtd --noent --dropdtd FRONT.xml > FRONT_nodtdent.xml I mean, you don't need the dtd for validation, particularly since I suspect given the errors it may not validate anyhow. It might make the files a little harder to read when reading the raw source, but I suspect that's not typically a problem. Jon Gorman University of Illinois On Mon, Dec 9, 2013 at 2:10 PM, Robertson, Wendy C wendy-robert...@uiowa.edu wrote: Back in 1999-2002 a handful of our theses were submitted as a collection of xml files. We posted the files in our repository several years ago (we posted a zipped folder with all the files). At that time, if you opened front.xml you would be able to access the thesis. We have not touched the files in the close to 5 years since we posted them, but the files no longer open correctly. One of the problem theses is http://ir.uiowa.edu/etd/189/. Front.xml begins <?xml version="1.0" encoding="UTF-8"?> <?xml:stylesheet type="text/css" href="UIowa2K1.css" ?> <!DOCTYPE thesis SYSTEM "UIowa2K.dtd"> I have tried the following changes but they do not help 1) Adding standalone="no" to the xml declaration -- <?xml version="1.0" encoding="UTF-8" standalone="no"?> 2) Changing the case of UIowa2K1.css and UIowa2K.dtd to match the files (which are in all caps) 3) Changing xml:stylesheet to xml-stylesheet Chrome shows errors that entities are not defined, but they are defined in the dtd. I would appreciate any assistance in making these documents available again. Thanks! Wendy Robertson Digital Scholarship Librarian * The University of Iowa Libraries 1015 Main Library * Iowa City, Iowa 52242 wendy-robert...@uiowa.edu * 319-335-5821
Re: [CODE4LIB] problem in old etd xml files
How did you fix the ampersands? I ask, because if you just did a simple text transform from & to &amp;, it would mask the problem of the entity escaping I think... Not at work, so I don't have a good example and the file is downloading very slowly here, so I'll try to do one from memory. There were several &aacute; in the XML which mapped to an accented character in the DTD via the Entity. If you just substituted & with &amp;, you'd get &amp;aacute;, which would render inline as the literal text &aacute;. It would superficially solve the issue since browsers would no longer give the errors about the dtd since it wouldn't be trying to load entities from the DTDs. And depending how you did it, you likely could also replace a correctly encoded one to make &amp;amp;, leading to some very odd stuff. I wouldn't be surprised to find some unescaped ampersands, but the solution I posted will essentially replace the entities with their text, hopefully causing most characters to appear correctly. You definitely still need to fix some of the other stuff. (I suspect it never worked for most browsers and XML systems, most likely only IE). Jon Gorman University of Illinois
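A minimal Python sketch of the two points made in this thread, offered as an illustration rather than anything the posters actually ran. It assumes the entity names in the files match the standard HTML 4 names in Python's `html.entities.name2codepoint`; custom entities defined only in the thesis DTD would pass through untouched. The function names are hypothetical.

```python
import re
from html.entities import name2codepoint

# The five entities XML predefines; these must stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# A named entity reference such as &aacute;
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# An ampersand that does NOT start a named or numeric reference.
BARE_AMPERSAND = re.compile(
    r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)"
)


def named_to_numeric(text):
    """Replace exact-match named entities with numeric references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined and unknown names alone
        return "&#%d;" % name2codepoint[name]
    return NAMED_ENTITY.sub(replace, text)


def escape_bare_ampersands(text):
    """Escape only ampersands that are not already part of a reference."""
    return BARE_AMPERSAND.sub("&amp;", text)


# The naive transform double-escapes existing references:
naive = "&aacute; &amp;".replace("&", "&amp;")
```

Here `naive` ends up as `&amp;aacute; &amp;amp;`, which is the failure mode described above, while the two functions leave already-valid references intact.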
Re: [CODE4LIB] Looking for two coders to help with discoverability of videos
Hi Kelley, If you haven't already, you might want to look at the music score and sound recording FRBRization work done on the Variations-FRBR project here at Indiana University. I'm not sure how directly useful this would be for your work with moving images, but there may be some useful mapping ideas: FRBR XML schemas: http://www.dlib.indiana.edu/projects/vfrbr/schemas/1.1/index.shtml MARC-FRBR mapping specifications: http://www.dlib.indiana.edu/projects/vfrbr/projectDoc/metadata/mappings/spring2010/vfrbrSpring2010mappings.shtml Java FRBRization code and documentation: http://www.dlib.indiana.edu/projects/vfrbr/projectDoc/index.shtml Jon -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kelley McGrath Sent: Tuesday, December 03, 2013 12:35 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of videos Robert, Your work also sounds very interesting and definitely overlaps with some of what we want to do. It seems like a lot of people are trying to get useful format information out of MARC records and it's unfortunate that it is so complicated. I would be very interested to see your logic for determining format and dealing with self-contradictory records. Runtime from the 008 is, as you say, pretty straightforward, but not always filled out and useless if the resource is longer than 999 minutes. It's interesting that you mention identifying directors. We have also been working on a similar, although more generalized, process. We're trying to identify all of the personal and organizational names mentioned in video records and, where possible, their roles. Our existing process is pretty accurate for personal names and for roles in English. It tends to struggle with credits involving multiple corporate bodies and we're working on building a lexicon of non-English terms for common roles. 
We're also trying to get people to hand-annotate credits to build a corpus to help us improve our process. (Help us out at http://olac-annotator.org/. And if you're willing to be on call to help with translating non-English credits, email me with the language(s) you'd be able to help out with. We also just started a mailing list at https://lists.uoregon.edu/mailman/listinfo/olac-credits) Matching MARC records for moving images with external data sources is also on our radar. Most feature film type material can probably be identified by the attributes you mention: title, original date and director (probably 2 out of 3 would work in most cases). We are also hoping to use these attributes (and possibly others) to cluster records for the same FRBR work. It would be great to talk with you more about this off-list. Kelley kell...@uoregon.edu From: Robert Haschart [rh...@virginia.edu] Sent: Monday, December 02, 2013 10:49 AM To: Code for Libraries Cc: Kelley McGrath Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of videos Kelley, The work you are proposing is interesting and overlaps somewhat both with work I have already done and with a new project I'm looking into here at UVa. I have been the primary contributor to the Marc4j java project for the past several years and am the creator of the project SolrMarc which extracts data from Marc records based on a customizable specification, to build Solr index records to facilitate rich discovery. Much of my work on creating and improving these projects has been in service of my actual job of creating and maintaining the Solr Index behind our Blacklight-based discovery interface. 
As a part of that work I have created custom SolrMarc routines that extract the format of items similar to what is described in Example 3, including looking in the leader, 006, 007 and 008 to determine the format as-coded but further looking in the 245 h, 300 and 538 fields to heuristically determine when the format as-coded is incorrect and ought to be overridden. Most of the heuristic determination is targeted towards Video material, and was initiated when I found an item that due to a coding error was listed as a Video in Braille format. Further, I have developed a set of custom routines that look more closely at Video items, one of which already extracts the runtime from the 008[18-20] field. Modifying it from its current form, which returns the runtime in minutes, to instead return it as HH:MM as specified in your xls file, and to further handle the edge case of 008[18-20] = 000 by returning "over 16:39", would literally take about 15 minutes. Another of these custom routines, one that is more fully formed, is code for extracting the Director of a video from the Marc record. It examines the contents of the fields 245c, 508a, 500a, 505a, 505t, employing heuristics and targeted natural language processing techniques, to attempt to correctly extract the Director
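The runtime conversion mentioned in this message could look roughly like the following. This is a hedged Python sketch, not the SolrMarc routine itself (which is Java and is not shown in the thread); the function name is hypothetical, and it assumes the three characters from 008[18-20] have already been extracted as a string.

```python
def runtime_hhmm(code):
    """Convert the three-character 008[18-20] running time to HH:MM.

    Returns None when the value is not a number of minutes
    (for example '---' for unknown or 'nnn' for not applicable).
    """
    if code == "000":
        # 000 means the running time does not fit in three digits,
        # i.e. it is longer than 999 minutes (16 hours 39 minutes).
        return "over 16:39"
    if len(code) != 3 or not code.isdigit():
        return None
    hours, minutes = divmod(int(code), 60)
    return "%02d:%02d" % (hours, minutes)
```

For instance, `runtime_hhmm("090")` gives `01:30`, and the largest codable value, `999`, gives `16:39`.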
Re: [CODE4LIB] ruby-marc api design feedback wanted
Coming from nowhere on this...is there a place where it would be convenient to flag which behavior the user (of the library) wants? I think you're correct that most of the time you'd just want to blow through it (or replace it), but for the situation where this isn't the case, I think the Right Thing to do is raise the exception. I don't think you would want to bury it in some assumption made internal to the library unless that assumption can be turned off. -Jon On 11/19/2013 07:51 PM, Jonathan Rochkind wrote: ruby-marc users, a question. I am working on some Marc8 to UTF-8 conversion for ruby-marc. Sometimes, what appears to be an illegal byte will appear in the Marc8 input, and it can not be converted to UTF8. The software will support two alternatives when this happens: 1) Raising an exception. 2) Replacing the illegal byte with a replacement char and/or omitting it. I feel like most of the time, users are going to want #2. I know that's what I'm going to want nearly all the time. Yet, still, I am feeling uncertain whether that should be the default. Which should be the default behavior, #1 or #2? If most people most of the time are going to want #2 (is this true?), then should that be the default behavior? Or should #1 still be the default behavior, because by default bad input should raise, not be silently recovered from, even though most people most of the time won't want that, heh. Jonathan
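The "let the caller choose" idea can be sketched in Python, purely as an analogy: this is not the ruby-marc API, and Python has no built-in MARC-8 codec, so UTF-8 decoding stands in for the real conversion. The function and option names are hypothetical. Raising is the default here, and the lenient behaviors have to be requested explicitly.

```python
# Map a user-facing option to Python's built-in codec error handlers.
_HANDLERS = {
    "raise": "strict",     # option 1: raise an exception on a bad byte
    "replace": "replace",  # option 2a: substitute U+FFFD for the bad byte
    "omit": "ignore",      # option 2b: silently drop the bad byte
}


def decode_field(raw, on_invalid="raise"):
    """Decode bytes to text, with caller-selected handling of bad bytes."""
    try:
        handler = _HANDLERS[on_invalid]
    except KeyError:
        raise ValueError("on_invalid must be one of %s" % sorted(_HANDLERS))
    return raw.decode("utf-8", errors=handler)
```

A caller who wants to blow through bad input would write `decode_field(data, on_invalid="replace")`; everyone else gets an exception pointing at the offending byte.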
Re: [CODE4LIB] Loris
Ed, I added support for IIIF syntax to OpenSeadragon: https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js so it just works. Not sure if Ian has cut a release recently, but it's on the master branch anyway. -Js On 11/08/2013 04:00 PM, Edward Summers wrote: On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to talk iiif? //Ed
Re: [CODE4LIB] Loris
Whoops, wait. I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to add support for OSd. Then I made some changes and added support for 1.1. Credit where credit is due -Js On 11/08/2013 04:40 PM, Jon Stroop wrote: Ed, I added support for IIIF syntax to OpenSeadragon: https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js so it just works. Not sure if Ian has cut a release recently, but it's on the master branch anyway. -Js On 11/08/2013 04:00 PM, Edward Summers wrote: On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to talk iiif? //Ed
Re: [CODE4LIB] Loris
Bleh. You know what I meant. On 11/8/13 5:13 PM, Jon Stroop wrote: Whoops, wait. I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to add support for OSd. Then I made some changes and added support for 1.1. Credit where credit is due -Js On 11/08/2013 04:40 PM, Jon Stroop wrote: Ed, I added support for IIIF syntax to OpenSeadragon: https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js so it just works. Not sure if Ian has cut a release recently, but it's on the master branch anyway. -Js On 11/08/2013 04:00 PM, Edward Summers wrote: On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to talk iiif? //Ed
Re: [CODE4LIB] Loris
Seriously! On 11/8/13 6:21 PM, Michael J. Giarlo wrote: Stick to Python, Jon. ;) On Fri, Nov 8, 2013 at 3:17 PM, Jon Stroop jstr...@princeton.edu wrote: Bleh. You know what I meant. On 11/8/13 5:13 PM, Jon Stroop wrote: Whoops, wait. I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to add support for OSd. Then I made some changes and added support for 1.1. Credit where credit is due -Js On 11/08/2013 04:40 PM, Jon Stroop wrote: Ed, I added support for IIIF syntax to OpenSeadragon: https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js so it just works. Not sure if Ian has cut a release recently, but it's on the master branch anyway. -Js On 11/08/2013 04:00 PM, Edward Summers wrote: On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to talk iiif? //Ed
Re: [CODE4LIB] Loris
It aims to do the same thing...serve big JP2s (and other images) over the web, so from that perspective, yes. But, beyond that, time will tell. One nice thing about coding against a well-thought-out spec is that there are lots of implementations from which you can choose[1]--though as far as I know Loris is the only one that supports the IIIF syntax natively (maybe IIP?). We still have Djatoka floating around in a few places here, but, as many people have noted over the years, it takes a lot of shimming to scale it up, and, as far as I know, the project has more or less been abandoned. I haven't done too much in the way of benchmarking, but to date don't have any reason to think Loris can't perform just as well. The demo I sent earlier is working against a very large jp2 with small tiles[2] which means a lot of rapid hits on the server, and between that, (a little bit of) JMeter and ab testing, and a fair bit of concurrent use from the c4l community this afternoon, I feel fairly confident about it being able to perform as well as Djatoka in a production environment. By the way, you can page through some other images here: http://libimages.princeton.edu/osd-demo/ Not much of an answer, I realize, but, as I said, time and usage will tell. -Js 1. http://iiif.io/apps-demos.html 2. http://libimages.princeton.edu/loris/pudl0052%2F6131707%2F0001.jp2/info.json On 11/8/13 8:07 PM, Peter Murray wrote: A clarifying question: is Loris effectively a Python-based replacement for the Java-based djatoka [1] server? Peter [1] http://sourceforge.net/apps/mediawiki/djatoka/index.php?title=Main_Page On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote: c4l, I was reminded earlier this week at DLF (and a few minutes ago by Tom and Simeon) that I hadn't ever announced a project I've been working on for the last year or so to this list. I showed an early version in a lightning talk at code4libcon last year. 
Meet Loris: https://github.com/pulibrary/loris Loris is a Python based image server that implements the IIIF Image API version 1.1 level 2[1]. http://www-sul.stanford.edu/iiif/image-api/1.1/ It can take JP2 (if you make Kakadu available to it), TIFF, or JPEG source images, and hand back JPEG, PNG, TIF, and GIF (why not...). Here's a demo of the server directly: http://goo.gl/8XEmjp And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR -Js 1. http://www-sul.stanford.edu/iiif/image-api/1.1/ 2. http://openseadragon.github.io/ -- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library jstr...@princeton.edu -- Peter Murray Assistant Director, Technology Services Development LYRASIS peter.mur...@lyrasis.org +1 678-235-2955 800.999.8558 x2955
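For readers unfamiliar with the request syntax, the IIIF Image API 1.1 puts everything in the URL path: identifier/region/size/rotation/quality.format, with the identifier URL-encoded. The following Python sketch only builds such URLs; the helper names are hypothetical, it does not talk to a server, and it assumes the 1.1 parameter names (for example `native` as the default quality).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="native", fmt="jpg"):
    """Build an IIIF Image API 1.1 image request URL."""
    # Slashes inside the identifier must be percent-encoded so the
    # server can tell them apart from the path separators.
    encoded = quote(identifier, safe="")
    return "%s/%s/%s/%s/%s/%s.%s" % (
        base, encoded, region, size, rotation, quality, fmt)


def iiif_info_url(base, identifier):
    """Build the URL of the image information (info.json) document."""
    return "%s/%s/info.json" % (base, quote(identifier, safe=""))
```

With the base URL and identifier taken from the footnote in the reply above, `iiif_info_url("http://libimages.princeton.edu/loris", "pudl0052/6131707/0001.jp2")` reproduces the info.json link quoted there, and a tile request would pass something like `region="0,0,256,256"`.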
[CODE4LIB] Job: Digital Repository Software Developer at Princeton University
Note: this job is in Academic Services at Princeton, not in the Library, though we do work together from time to time. The full posting is here: http://jobs.princeton.edu/applicants/Central?quickFind=64011 Cross-posted. Please excuse any duplicate copies you receive. *Princeton University seeks Digital Repository Software Developer* In September of 2011 the Faculty of Princeton University approved an open access policy intended to make faculty's scholarly articles available to a wider public. Princeton is now in the process of ramping up its efforts to implement the policy. These efforts will include the development of the repository that will hold the scholarly articles. The Office of Information Technology seeks a Digital Repository Software Developer to establish and enhance digital repositories to house academic publications, research data, and related digital assets. The primary focus of the position will be to develop software and systems for collecting and depositing academic journal articles subject to Princeton University's Open Access Policy for Faculty Publications into an open access repository. This repository will enhance both the preservation and dissemination of scholarship at Princeton. The Digital Repository Software Developer will report to the Digital Repository Architect and will work closely with the University's Scholarly Communications Librarian and other IT and Library staff. -- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library jstr...@princeton.edu
Re: [CODE4LIB] A Proposal to serialize MARC in JSON
It looks like it's there in pymarc as well: https://github.com/edsu/pymarc/blob/master/pymarc/record.py#L386 On 09/03/2013 03:02 PM, Bill Dueber wrote: I can see where you might think that no progress has been made because the only real document of the format is that old, old blog post. The problem, however, is not a lack of progress but a lack of documentation of that progress. File_MARC (PHP), MARC::Record (perl), ruby-marc (ruby) and marc4j (java) will all deal, to one extent or another, either with the JSON directly or with a hash/map data structure that maps directly to that JSON structure. [BTW, can anyone summarize the state of pymarc wrt marc-in-json?] On Tue, Sep 3, 2013 at 5:09 AM, dasos ili dasos_...@yahoo.gr wrote: It is exactly three years back, and no real progress has been made concerning this proposal to serialize MARC in JSON: http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/ Meanwhile new tools for searching and retrieving records have come in, such as Solr and Elasticsearch. Any ideas on how one could alter (or propose a new format) more suited to the mechanisms of these two search platforms? Any example implementations would also be really appreciated, thank you in advance
Re: [CODE4LIB] Python and Ruby
s/ruby/any_language/ Why not learn both? As with spoken languages, knowing more than one makes it easier for you to think at a higher level of abstraction and therefore makes you a better developer, and, as others have alluded to, will allow you to choose the 'right tool [framework, library, etc] for the right job'. Plus, as Giarlo said, they're not really that different. From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Chris Fitzpatrick [chrisfitz...@gmail.com] Sent: Monday, July 29, 2013 1:39 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Python and Ruby One thing to factor in is that if you learn ruby you run the risk of becoming one of those people who constantly talks, tweets, blogs, posts to this mailing list about how great ruby is. This can have a very negative impact on your work productivity. On Monday, July 29, 2013, Dana Pearson wrote: Josh, I work exclusively with XSLT but specialize in metadata only; no need for content display choices. Maybe a candidate for library programming language... XSLT 2.0 has useful analyze-string element to cover Roy's point. By the way, Josh, live just down the road in Leeton. regards, dana On Mon, Jul 29, 2013 at 12:04 PM, Roy Tennant roytenn...@gmail.com wrote: On Mon, Jul 29, 2013 at 9:57 AM, Peter Schlumpf pschlu...@earthlink.net wrote: Imagine if the library community had its own programming/scripting language, at least one that is domain relevant. What would it look like? Whatever else it had, it would have to have a sophisticated way to inspect text for patterns -- that is, regular expressions. Roy -- Dana Pearson dbpearsonmlis.com
Re: [CODE4LIB] ILLiad - RemoteAuth and OpenURL
By the way, a similar thread on the ezproxy list brought up this list: http://mail.geneseo.edu/mailman/listinfo/workflowtoolkit-l Which is apparently about ILLiad best practices. I've just subscribed and started reading through the archives. Jon Gorman University of Illinois On Thu, Jul 11, 2013 at 3:53 PM, Jimmy Ghaphery jghap...@vcu.edu wrote: yeh for us we did go with the documentation as best we could and use an ISAPI filter. The final straw for us as a hosted site was that OCLC said they could not support this method and steered us to the EzProxy auth method. On Thu, Jul 11, 2013 at 4:13 PM, Jon Gorman jonathan.gor...@gmail.com wrote: I am also following this conversation, I am wondering if you consult the following about RemoteAuth Authentication, but still failed? https://prometheus.atlas-sys.com/display/illiad/RemoteAuth+Authentication Ling UIC Library Don't know about Jimmy, but the flow on that page is one of the issues we have. We can't trigger the system to have that login behavior. From what I can tell of the logs, it decides the page to redirect to in a session before it ever checks the user status or the remote user header. I don't know how, given that, that the flow could really happen that way. (I might be missing something). As far as I can tell the other settings are what they should be, after all if they weren't I can't imagine that it would work most of the time, just not on the initial logon. We did have an issue with a session heartbeat type of thing that had a similar behavior (the headers would just drop off, somehow associated with the heartbeat process). Thankfully we were able to disable that in the authentication software. Does anyone using RemoteAuth actually see that flow (get challenged to either register or update your info after first successful login?) If you do, what are you using as a link into the system? Jon Gorman University of Illinois -- Jimmy Ghaphery Head, Digital Technologies VCU Libraries 804-827-3551
Re: [CODE4LIB] ILLiad - RemoteAuth and OpenURL
I am also following this conversation, I am wondering if you consult the following about RemoteAuth Authentication, but still failed? https://prometheus.atlas-sys.com/display/illiad/RemoteAuth+Authentication Ling UIC Library Don't know about Jimmy, but the flow on that page is one of the issues we have. We can't trigger the system to have that login behavior. From what I can tell of the logs, it decides the page to redirect to in a session before it ever checks the user status or the remote user header. I don't know how, given that, that the flow could really happen that way. (I might be missing something). As far as I can tell the other settings are what they should be, after all if they weren't I can't imagine that it would work most of the time, just not on the initial logon. We did have an issue with a session heartbeat type of thing that had a similar behavior (the headers would just drop off, somehow associated with the heartbeat process). Thankfully we were able to disable that in the authentication software. Does anyone using RemoteAuth actually see that flow (get challenged to either register or update your info after first successful login?) If you do, what are you using as a link into the system? Jon Gorman University of Illinois
Re: [CODE4LIB] MARC record model to be inserted in mongodb
Have you seen Ross' post: http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/ ? pymarc can get you this json, e.g.:

```
import pymarc

# Parse a MARCXML file into a list of pymarc Record objects
records = pymarc.parse_xml_to_array('/path/to/some/marc.xml')
# Serialize each record as a MARC-in-JSON string
json_file = [record.as_json() for record in records]
```

or, for that matter, if you happen to be using Mongo's Python API, you /may/ be able to call `as_dict()` when you store the record:

```
my_mongo_collection.insert(record.as_dict())
```

It looks like ruby-marc does something similar, and presumably the Mongo API for Ruby uses Ruby hashes the way that the Python API uses dicts, so a similar approach is probably possible in Ruby. As for "...an efficient way so as to get results with the appropriate queries." I guess that all depends on what you're trying to do. -Jon -- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library jstr...@princeton.edu On 07/05/2013 05:47 AM, dasos ili wrote: Could you please give us any suggestions on a data model example regarding a MARC record? The goal is to be able to store it in mongodb, in an efficient way so as to get results with the appropriate queries. thank you in advance
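To make the data model concrete, here is a hand-built record in the shape described in the linked MARC-in-JSON proposal: a leader string plus a list of single-key field objects, where control fields map a tag to a string and data fields map a tag to indicators and a subfield list. The record content is made up, and the helper function is a hypothetical illustration of how one might pull values out of such a structure in plain Python; it says nothing about how to index or query efficiently in MongoDB.

```python
import json

# Made-up sample record in the MARC-in-JSON shape.
record = {
    "leader": "01471cjm a2200349 a 4500",
    "fields": [
        {"001": "5674874"},
        {"245": {
            "ind1": "1",
            "ind2": "0",
            "subfields": [
                {"a": "Example title /"},
                {"c": "Example author."},
            ],
        }},
    ],
}


def subfield_values(rec, tag, code):
    """Collect every value of subfield `code` from fields tagged `tag`."""
    values = []
    for field in rec["fields"]:
        for field_tag, content in field.items():
            if field_tag != tag or isinstance(content, str):
                continue  # skip other tags and control fields
            for subfield in content["subfields"]:
                if code in subfield:
                    values.append(subfield[code])
    return values


title_parts = subfield_values(record, "245", "a")
serialized = json.dumps(record)  # the same dict is what would be stored
```

Because each field is a single-key object inside a list, repeated tags and field order are both preserved, which is the main reason the proposal does not use one big tag-keyed dictionary.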
Re: [CODE4LIB] Regular expression for maximum 4-digit number
I have zero Excel skills, but chances are you could do this with any scripting language if you were to export the file as text (e.g. CSV). -Jon On 07/02/2013 11:02 AM, Harper, Cynthia wrote: Is there a way to return (in Excel, if possible) the largest 4-digit number (by word boundaries) in a string? I've extracted the 863 fields from Millennium for my active periodicals, and want to find the latest year in each run. I'm willing to estimate it by taking the largest 4-digit number in the string. I'm doing this in Excel. Any help? Cindy Harper Electronic Services and Serials Librarian Virginia Theological Seminary 3737 Seminary Road Alexandria VA 22304 703-461-1794 char...@vts.edu
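As a rough illustration of the scripting route, here is a Python sketch that picks the largest standalone 4-digit number out of a string. The function name and the sample holdings string are made up, and it follows the estimate described in the question: any 4-digit number bounded by word boundaries counts, whether or not it is really a year.

```python
import re

# Exactly four digits with a word boundary on each side, so longer
# digit runs such as 12345 are not matched.
FOUR_DIGITS = re.compile(r"\b\d{4}\b")


def largest_four_digit(text):
    """Return the largest standalone 4-digit number in text, or None."""
    matches = FOUR_DIGITS.findall(text)
    if not matches:
        return None
    return max(int(match) for match in matches)


latest = largest_four_digit("v.1-12 1987-1998; v.13 (2001)")
```

Applied to each row of a CSV export (for example with the standard `csv` module), this would give one estimated latest year per periodical run.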
[CODE4LIB] JCDL 2013 registration deadline extended to June 5
Early-bird registration for JCDL 2013 has been extended to June 5. Register online at http://www.regonline.com/JCDL2013. Rates available at http://jcdl2013.org/registration. The full program is available at http://jcdl2013.sched.org/. The ACM/IEEE Joint Conference on Digital Libraries is a major international forum focusing on digital libraries and associated technical, practical, organizational, and social issues, taking place in Indianapolis, Indiana, USA on July 22-26, 2013. The theme for JCDL 2013 is Digital Libraries at the Crossroads, in recognition of our location (Indiana is known as the Crossroads of America) and in recognition of the changes forthcoming from the age of mass digitization, big data, and the ever changing nature of scholarly communications. Program Highlights: * 3 outstanding keynote speakers: Jill Cousins, Clifford Lynch, and David de Roure. More information at: http://jcdl2013.org/keynotespeakers; * 6 workshops covering topics such as data and software preservation, digital scholarship, research methods and artifacts preservation, web archiving, mining publications, and CURATEcamp. More information at http://jcdl2013.org/workshops; * 6 tutorials on topics including Europeana data model collections, ResourceSync, Introduction to Digital Libraries, building collections with Greenstone, mining data semantics, and using open annotation. More information at http://jcdl2013.org/tutorials; * A diverse range of papers - 28 full papers and 22 short papers. More information at http://jcdl2013.org/papers; * And much more, including posters and demonstrations. More information at http://jcdl2013.org/posters-demonstrations. Indianapolis is a wonderful conference city friendly to both walkers and cyclists, with many dining, entertainment, and sports options accessible from the downtown area. Check out the visitors guide developed for ACRL 2013: http://conference.acrl.org/indy-pages-163.php. 
More JCDL travel details are available at http://jcdl2013.org/travel.
Re: [CODE4LIB] XML Parsing and Python
Mike,

I haven't used minidom extensively, but my guess is that

    doc.toprettyxml(indent=" ", encoding="utf-8")

isn't actually changing the encoding because it can't parse the string in your content variable. I'm surprised that you're not getting tossed a UnicodeError, but the docs for Node.toxml() [1] might shed some light:

    To avoid UnicodeError exceptions in case of unrepresentable text data, the encoding argument should be specified as “utf-8”.

So what happens if you're not explicit about the encoding, i.e. just doc.toprettyxml()? This would hopefully at least move your exception to a more appropriate place.

In any case, one solution would be to scrub the string in your content variable to get rid of the invalid characters (hopefully they're insignificant). Maybe something like this:

    def unicode_filter(char):
        try:
            unicode(char, encoding='utf-8', errors='strict')
            return char
        except UnicodeDecodeError:
            return ''

    content = 'abc\xFF'
    content = ''.join(map(unicode_filter, content))
    print content

Not really my area of expertise, but maybe worth a shot.

-Jon

1. http://docs.python.org/2/library/xml.dom.minidom.html#xml.dom.minidom.Node.toxml

--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library
jstr...@princeton.edu

On 03/04/2013 03:00 PM, Michael Beccaria wrote:
> I'm working on a project that takes the OCR data found in a PDF and places it in a custom XML file. I use Python scripts to create the XML file. Something like this (trimmed down a bit):
>
>     from xml.dom.minidom import Document
>     doc = Document()
>     Page = doc.createElement("Page")
>     doc.appendChild(Page)
>     f = StringIO(txt)
>     lines = f.readlines()
>     for line in lines:
>         word = doc.createElement("String")
>         ...
>         word.setAttribute("CONTENT", content)
>         Page.appendChild(word)
>     return doc.toprettyxml(indent=" ", encoding="utf-8")
>
> This creates a file, simply, that looks like this:
>
>     <?xml version="1.0" encoding="utf-8"?>
>     <Page HEIGHT="3296" WIDTH="2609">
>      <String CONTENT="BuffaloLaunch" />
>      <String CONTENT="Club" />
>      <String CONTENT="Offices" />
>      <String CONTENT="Installed" />
>      ...
>     </Page>
>
> I am able to get this document created OK and saved to an XML file. The problem occurs when I try to have it read using the lxml library:
>
>     from lxml import etree
>     doc = etree.parse(filename)
>
> I am running across errors like "XMLSyntaxError: Char 0x out of allowed range, line 94, column 19". Which, when I look at the file, is true. There is a 0X character in the content field.
>
> How is a file able to be created using minidom (which I assume would create a valid XML file) and then fail when parsing with lxml? What should I do to fix this on the encoding side so that errors don't show up on the parsing side?
>
> Thanks,
> Mike
>
> Mike Beccaria
> Systems Librarian
> Head of Digital Initiative
> Paul Smith's College
> 518.327.6376
> mbecca...@paulsmiths.edu
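To make the "scrub the content before writing" idea from this thread concrete, here is a minimal sketch. It is written for Python 3 rather than the Python 2 used in the messages above, and it swaps in a different technique: instead of testing whether each byte decodes as UTF-8, it removes every character outside the XML 1.0 "Char" range. As far as I know, minidom does not check characters when it serializes, which would explain why the file gets written and only fails later on parsing. The names `strip_invalid_xml_chars` and `build_page` and the sample words are illustrative assumptions, not part of anyone's actual script.

```python
import re
from xml.dom.minidom import Document

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Drop characters that XML 1.0 does not allow."""
    return INVALID_XML_CHARS.sub("", text)


def build_page(words):
    """Build a <Page> document with one <String CONTENT="..."/> per word."""
    doc = Document()
    page = doc.createElement("Page")
    doc.appendChild(page)
    for content in words:
        word = doc.createElement("String")
        # Clean each value before it goes into the attribute.
        word.setAttribute("CONTENT", strip_invalid_xml_chars(content))
        page.appendChild(word)
    # With an encoding argument, toprettyxml() returns bytes.
    return doc.toprettyxml(indent=" ", encoding="utf-8")


if __name__ == "__main__":
    # "Club\x00" simulates OCR text carrying a stray NUL character.
    print(build_page(["BuffaloLaunch", "Club\x00", "Offices"]).decode("utf-8"))
```

Filtering on the XML character range keeps legitimate non-ASCII text (accented letters and so on) and only drops the control characters that a parser such as lxml would reject.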
[CODE4LIB] Goose Island - quick stupid question - where does bus leave from
Does the bus leave from the hotel or the UIC Forum? Jon Gorman
[CODE4LIB] C4L2013 Game Night - UIC Library
Hi all,

Some quick notes:

Again, there's a sign up for individual games at http://wiki.code4lib.org/index.php/2013_game_night . This will make it easier for us to get started quickly and also help avoid having a large crowd of people just standing around. If you brought a game and want to play it, add it to the list.

We're going to stop playing a little earlier than we had on the wiki. We're hoping to close and lock the doors at 10:30, so people should be winding down at 10:00. It's recommended to travel back to the conference hotel in groups.

Please bring your badge with you so it'll be a bit easier to make sure folks in the room are people who are supposed to be there.

If there's overflow, we'll try to form groups at the room to go out and find some spaces to game at. There are some restaurants on Halsted by the UIC Forum.

Again, the wiki should have the latest info.

Jon G.
[CODE4LIB] Code4Lib 2013 - Game Night - hotel card found
Hi folks, Someone who attended the game night left their room key. It's been passed along to some of the folks who will be opening the conference tomorrow and they'll also make an announcement about it. Jon Gorman
[CODE4LIB] Game Night Code4Lib 2013
Hi all,

I've been getting some questions and I realized there was some confusion about the Game Night. I was a bit late in organizing it and quite frankly haven't done the best job. I put out a request for people to express their interest by Jan. 14th by signing up on the wiki or sending me an email, but I didn't actually put that date in the wiki and it was only mentioned in the email on this list (which was on the 10th of Jan if I remember). That wasn't a hard and fast deadline, mostly so we could get an idea of what sized room we need.

However, in the past week or two, we've gotten a lot more people signing up and I've also heard from several folks now that they thought the signup was only for bringing games. As it stands, I realized this morning we had about 15 people expressing interest a month ago and now are looking at over twice that number. I don't want to turn anyone away, but this does pose some logistical hurdles.

Mea culpa, this is my fault, not any of the Chicago folks'. I'm going to try to work with the folks on the ground on seeing if we can get another room at the UIC Library. I'll also try to find some surrounding locations that can serve as overspill, like cafes that would be fine having a table of people show up and play.

I'm also nervous about the number of games vs people who want to play games. If you are attending and can bring some games and teach them, that would be wonderful. (Also, I've run gaming events like this up to about 20 people, but could really use a person or two to serve as a helper. Mainly that just means joining people to games, answering questions, etc.)

Due to the scale, I have some ideas like signup sheets for various games at the registration desk, rather like the signups for the newcomers' dinner.

Again, sorry about this,
Jon Gorman
Re: [CODE4LIB] Game Night Code4Lib 2013
Hi all,

Sorry for a bit of delay on posting. I've got a few folks who have volunteered to help. It's hard to tell numbers for sure, since some folks might not come and others may show up that haven't signed up. As Francis says, the solution is likely to be nimble. (And again, I want to thank Francis and the rest of the host crew. They've been doing fabulous with disorganized folks like me ;) ).

First, some logistical details. I'm thinking that we'll say that a goal will be to have this rough schedule:

7:30 - start setting up games, getting organized
7:45 - start first round of games
10:30 - start wrapping up
11:00 - call it a night? (Walking back or catching the bus as a group to the various hotels may not be a bad idea)

I've got a plan (with helpful advice from several folks, thanks!), and we'll see if it works. I'm going to work a bit tonight on setting up a new page on the wiki. It's going to be structured in a manner that's similar to the newcomer dinner, but instead will be games. Each game will have a number of seats. If you're bringing a game and are willing to play/teach it, add an entry. Estimate a starting time if the game isn't going to start when the game night does. It'll probably look something like...

Game Name (#n - if more than one entry for the game, add a number to make it less confusing) 7:45.
Game description (with maybe a link to boardgamegeek)
1. Patty Gauzweiller (T)
2. Leslie Humphries
3. Mona Wert
4. Eddie Ramirez
5.

To sign up, put your name in one of the seats. Don't add seats ;). If you can teach/lead the game, note it. (If you want to teach but not play, that's awesome. I haven't quite figured out how to note this, but I'm thinking I'll just add a line at the bottom.)

We'll try to set up sections big enough for the games and put up signs. Here's the warning. I'll probably be making judgement calls on what games get set up in the main room, preferring games that I and any volunteers who are just coordinating can teach, and also based on other factors. If we hit the reasonable size for the room, we'll try to have some recommendations for places to go w/ the group.

This is probably not the ideal solution as it makes quicker/lighter games somewhat tricky, but I'm hoping that for some of those, some people won't mind playing multiple games in a row, maybe teaching someone who will teach the next group, and allow a little mingling that way.

Jon Gorman

On Fri, Feb 8, 2013 at 4:06 PM, Francis Kayiwa kay...@uic.edu wrote:
> On Fri, Feb 08, 2013 at 04:39:22PM -0500, Cynthia Ng wrote:
>> Just an idea if space is really an issue. Would it be possible to simply get a second room next to (or at least nearby) the first one? As I imagine not everyone will be playing the same game, I don't see it as a problem.
>
> As I said to Jon. The people here will have to be nimble. The big problem is as a `historically` commuter campus the open spaces become a premium late at night. What we will need from those who signed up is willingness to track email/wiki for changes. I've asked for other spaces but no word yet.
>
> Finally unless you have more than 40, this room will fit the current number without a problem. Also no (as they will find out when they get there) it isn't a matter of spill over to the next room.
>
> Cheers,
> ./fxk
Re: [CODE4LIB] Game Night Code4Lib 2013
I've added the page at http://wiki.code4lib.org/index.php/2013_game_night. Sorry, I realize this is a bit last minute. If for some reason you can't edit the wiki but want to sign up for a slot or add a game you're willing to run, send the info to me. I'll add it as I get time. I'll probably be adding some more of my games, but I need to go to dinner ;). Jon Gorman
[CODE4LIB] Code4Lib 2013 - Game Night
Hi all, Just a brief email to say that I sent an email to all the folks who have supplied contact info for the Game Night. It's not required that you do so, but if you were thinking of attending, please sign up at http://wiki.code4lib.org/index.php/2013_social_activities#Game_Night.21 so we know how many people are coming. If you sent me contact info in order to be kept in the loop for last minute changes and I didn't send an email directly to you a little while ago, send it again. I apologize, things have been a bit hectic lately and I'm almost positive I left someone off that sent me an email. Jon Gorman
Re: [CODE4LIB] Code4Lib Conference streaming?
Three cheers for UIC folks! Jon Gorman
[CODE4LIB] C4L2013 Game Night - UIC Library - Tuesday 11th, 7:30 pm
Hi all,

Thanks to Francis, we've got a room for the game night at the UIC Library. Looks like it'll start at 7:30 pm, to give folks time to get dinner. Not sure yet how late it can go.

I'm going to be updating/modifying info on the social wiki (will move some of the stuff out to its own section). I'll try to get to that tonight or tomorrow night. If you want me to also send you email when I make changes to the wiki page (http://wiki.code4lib.org/index.php/2013_social_activities) or get more info about Game Night, send an email w/ the subject starting with "C4L2013 Game Night". Actually, also reply to me personally with phone info if you don't mind texting and want to be alerted of any last minute changes or the like without checking the wiki.

I'll also try to add notes for the people who signed up on the wiki (or reply to personal emails) on games they might bring so we don't end up with 20 sets of regular playing cards taking up valuable luggage space ;). I'll be bringing a number of games from my personal collection as well.

Sorry for the brief note, but wanted to get something out. I'll probably not send any more emails about this directly to the list, so again, send me an email starting with "C4L2013 Game Night" if you want to be notified, or keep an eye on the wiki.

Jon Gorman
Re: [CODE4LIB] Zoia
On Fri, Jan 18, 2013 at 9:38 AM, Karen Coyle li...@kcoyle.net wrote:
> ... and BTW, if people see Zoia as a bit of a problem during the conference, doesn't that mean that Zoia is a bit of a problem all of the time? Is there a reason to be polite and inclusive during the conference but not every day?

There's actually two different but closely related issues:

1) Plugins that generate a lot of information/responses, which have been a problem as they can interrupt the flow of questions/discussions during the conference. @blockparty lists the songs being played by people who have registered their IRC nick for scrobbling. It produces a lot of lines, and a couple of calls can push the conversation off people's screens. Not a problem with the normal traffic in the room, but when going from maybe 20/30 active participants to hundreds it can be an issue. There's probably some others like @google or @naf with a long response that could be disabled as well. @naf is a nice one for demonstrating zoia, but @marc is pretty compact and also wonderfully library-centric ;).

2) Plugins that are crude/offensive like @mf and the urban dictionary one.

I think the thread kicked off with the first one, but I think it rapidly brought in the issue of the latter. I'm in agreement that the latter category probably should be just removed. The first category probably would be useful to disable during the conference but to keep otherwise.

Jon Gorman
Re: [CODE4LIB] A gentle proposal: slim down zoia during the conference
> I like the ideas of disabling some of the @zoia bot plugins for the conference at least. For what it's worth, Jon Gorman was working on a version of `@herald` that provided introductory information to those new to the IRC channel. (I'm hoping he can speak to details.)

Details of Greeter (the Herald-intro bot): It's my first foray into both supybot and python. I've got a couple of things still on the todo list before throwing it in channel. I did a fork that can be seen here: https://github.com/jtgorman/supybot-plugins/tree/master/plugins/Greeter

I think off hand I have something that seems to mostly work, but I want to get "@greeter add nick" and "@greeter remove nick" so people can prevent alternative nicks from being spammed, and some sort of init routine that pulls in a list of nicks to ignore if the db is not present. The latter may just end up waiting, I don't know.

Feel free to submit pull requests. I'll then try to figure out the git magic to get into code4lib. (Or I'll just check out a fresh version of the code4lib and copy the directory and commit that.)

Hoping to get something in shape by the end of the week that can be added to Zoia. Suggestions on the message welcome. (Right now it has "Welcome to code4lib! Visit http://code4lib.org/irc to find out more about this channel. Type @helpers for a list of people in channel who can help." Going to change "@helpers" into "@helpers #code4lib".)

Thanks to Mark who reminded me about the @helpers plugin. ( I don't think it's

Jon Gorman
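The bookkeeping described here (greet a nick once, let people add or remove nicks, and seed an ignore list when there is no database yet) can be sketched in plain Python. This is only an illustration of the idea and is not the Greeter plugin's code; it does not use the supybot API at all. The class name `GreeterState`, its method names, and the lower-casing of nicks (a simplification of IRC's case rules) are all assumptions made for the example.

```python
GREETING = (
    "Welcome to code4lib! Visit http://code4lib.org/irc to find out more "
    "about this channel. Type @helpers for a list of people in channel "
    "who can help."
)


class GreeterState:
    """Remembers which nicks should not be greeted (hypothetical helper)."""

    def __init__(self, seed_nicks=()):
        # Seed list stands in for the "init routine" that loads nicks to
        # ignore when no database exists yet. Nicks are compared in lower
        # case, which is a simplification.
        self._known = {nick.lower() for nick in seed_nicks}

    def add(self, nick):
        """Mark a nick as known so it is never greeted."""
        self._known.add(nick.lower())

    def remove(self, nick):
        """Forget a nick so it will be greeted on its next join."""
        self._known.discard(nick.lower())

    def greeting_for(self, nick):
        """Return the greeting for a new nick, or None if already known."""
        key = nick.lower()
        if key in self._known:
            return None
        self._known.add(key)
        return GREETING


if __name__ == "__main__":
    state = GreeterState(seed_nicks=["zoia"])
    for joining in ["zoia", "newperson", "newperson"]:
        print(joining, "->", state.greeting_for(joining))
```

Registering an alternative nick with `add` is what would keep someone from being greeted again under a second name.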
Re: [CODE4LIB] Game Night during Code4Lib 2013
Hi, At the moment it looks like we've got about 11 people or so interested in the game night. I'm thinking at this point of scheduling it for later on Tuesday to avoid conflicts with the newcomer dinners. I will (with the wonderful assistance of the hosts) start looking at some possible locations and transport. More details to follow. Jon Gorman
Re: [CODE4LIB] code4lib 2013 location
Gah, I think I forgot to announce this on the list, but there's also this google map: https://maps.google.com/maps/ms?msid=213549257652679418473.0004ce6c25e6cdeb0319d&msa=0 which I put on the social page http://wiki.code4lib.org/index.php/2013_social_activities

I'll go ahead and add the hotel and conference site to that as well if it's not already there.

On Fri, Jan 11, 2013 at 7:12 PM, Bill Dueber b...@dueber.com wrote:
> Because it seems like it might be useful, I've started a publicly-editable google map at http://goo.gl/maps/LWqay
>
> Right now, it has two points: the hotel and the conference location. Please add stuff as appropriate if the urge strikes you.
>
> On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:
>> On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
>>> I'm sorry, but that doesn't actually clear up anything for me. The location on the Lanyrd page just says Chicago. So, is the conference still happening at UIC? Since the conference hotel isn't super close, does that mean there will be transportation provided?
>>
>> The entire conference and pre-conference is at UIC. The Forum is a revenue generating part of UIC. The pre-conference will be at the University Libraries on Monday with the exception of the Drupal one.
>>
>> The hotel is a mile or thereabouts from UIC Forum. Here is the problem with us natives planning. It never crossed our minds that walking a mile while on the *upper limit* of our shuttling to and from work is not the norm for everyone. This was brought to our attention and we will have a shuttle from the Hotel to the Conference venue.
>>
>>> While we're on the subject, are the pre-conferences happening at the same location?
>>
>> See above.
>>
>> ./fxk
>>
>>> On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
>>>> On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
>>>>> Hi all,
>>>>>
>>>>> Apparently code4lib 2013 is going to be held at the UIC Forum http://www.uic.edu/depts/uicforum/
>>>>>
>>>>> I assumed it would be at the conference hotel. This is just a note so that others do not make the same assumption, since nowhere in the information about the conference is the location made clear.
>>>>>
>>>>> Since the conference hotel is 1 mile from the venue, I assume transportation will be available.
>>>>
>>>> That's a good assumption to make. As to the confusion, as I said to you when you asked me about this a couple of days ago, http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our proposal. If you look at the document it also suggests that we were going to have the conference registration staggered by timezones. We have elected not to update that because that was our proposal. When preparing our proposal we borrowed heavily from Yale's and IU's proposals, and if someone would like to steal from us I think it is fair to leave that as is.
>>>>
>>>> If you want the conference page use the lanyrd.com link below. I can't even take credit for doing that. All of that goes to @pberry
>>>>
>>>> http://lanyrd.com/2013/c4l13/
>>>>
>>>> Cheers,
>>>> ./fxk
>>>>
>>>>> best,
>>>>> Erik Hetzner
>>>>> Sent from my free software system http://fsf.org/.
>>
>> --
>> Speed is subsittute fo accurancy.
>
> --
> Bill Dueber
> Library Systems Programmer
> University of Michigan Library
[CODE4LIB] Game Night during Code4Lib 2013
Hi all, I'm trying to gauge interest in Game Night during Code4Lib 2013. Now, I've signed up for the Wednesday Goose Island tour, but can back out of that if Wednesday night works the best. Right now there's a handful of folks on the wiki who have expressed interest, but could you send me an email or sign up on http://wiki.code4lib.org/index.php/2013_social_activities by Monday morning (the 14th) if you would like to go? Also, some indication of games you might be able to bring or games you like to play would be useful. I'm just trying to figure out how many folks are interested so I have a rough idea of the number of games and the space we need. Also, I'm leaning towards Monday or Tuesday night, but letting me know a night preference as well might be useful. (If this is what people would like for a Wednesday night non-beery alternative, I have other chances to do a Goose Island tour ;) ). Jon Gorman
Re: [CODE4LIB] basic IRC question/comments
You can also choose to anonymize yourself by choosing a nick that best represents something you're interested in or identify with that is not used in other social spheres. It really is completely up to you what you feel most comfortable with, and there are typically no hard and fast rules.

One thing to keep in mind is that your nick might be anonymous, but IRC in general is done in the clear and some connection information will be published by default. I think that's partially a legacy of how long IRC has been around. When someone logs into a channel you'll see something like foo...@1241workstation.uiowa.edu. There's ways to cloak that id by registering that nick and donating some money to the organization that runs freenode, PDPC. That's a bit trickier to set up. The user registration FAQ of freenode can be useful: http://freenode.net/faq.shtml#userregistration.

So when someone who is registered and cloaked logs in, the connection will display something like "foobar@professional.cloaked has joined the channel" (I can't remember the exact string).

So just know that if someone is logging the channel (which is possible, there's plenty of clients and ways to do it) and you come in several times with different nicks but the same network address, they'll know it's likely the same person.

Jon Gorman
Re: [CODE4LIB] basic IRC question/comments
Oh, forgot to mention. If you use a web client or use tor, that will obscure the connection info by the nature of that connection ;). Jon Gorman
Re: [CODE4LIB] basic IRC question/comments
And, sorry for being annoying, but some things were pointed out to me in #code4lib, so I'm issuing yet another followup.

1) The technique freenode uses for cloaks isn't as strong as it used to be. Also, it's possible to accidentally log in without a cloak, etc. Don't expect them to be very secure.

2) There's ways to get a cloak without financial contribution. How exactly to do this I leave as an exercise to the reader. I never really worried about it too much; the cloak was just a perk when I made the donation.

3) Apparently most web clients will pass on the browser IP, not the server IP address. So don't count on that to make you anonymous.

So the general thrust is, if you really, really need anonymous communication, be wary of IRC. However, in general people usually respect the nicks from my experience and won't press people for their actual identities. Also, as mentioned before, most IRC servers/channels are not encrypted and pretty easy to log.

Jon Gorman
Re: [CODE4LIB] Mentorship Buddies
Having a sort of speed dating setup might help make better fits between mentors and mentees, as well. +1, not only to satisfy the 'room full of nerds' case, but also the fact that people spend their free time @ code4libcon in a variety of ways, and not everyone might want to, e.g., wind up in the hospitality suite. On 11/28/2012 09:45 AM, Ross Singer wrote: On Nov 27, 2012, at 9:33 PM, Cynthia Ng cynthia.s...@gmail.com wrote: Getting traction for mentoring online is always difficult, but what about starting that mentorship at code4libcon? +1 - being face-to-face might help ease the tension. Having a sort of speed dating setup might help make better fits between mentors and mentees, as well. That is, a roomful of nerds deferring passively to one another might not get us very far :) Something more structured about what people want to learn and what mentors know and how they get along together would probably make for a more productive outcome. -Ross. Maybe almost like a buddy system, so that the first meeting between a mentor and mentee is at a code4libcon (national, regional, or otherwise) if possible. This might simply be a good idea for first timers who are not going with colleagues too. Just throwing out some ideas here... On Tue, Nov 27, 2012 at 7:49 PM, Nick Ruest rue...@gmail.com wrote: Matt McCollow proposed something like this a while back. We have a page up and everything! But, it never got much traction. http://www.mail-archive.com/code4lib@listserv.nd.edu/msg14270.html http://wiki.code4lib.org/index.php/Mentorship -nruest On 12-11-27 07:30 PM, Bess Sadler wrote: +1 to this idea. I have benefited tremendously over the years from kind people taking me under their wings. Many of us try to do this one-on-one, but some kind of introduction service would be a huge benefit for the community, I would think. Mentorship is a great example of a robust solution - a solution that addresses more than one problem at once. 
I suspect that this would not only improve our diversity as a community, it might also solve some tech leadership / succession planning problems and maybe expose some training needs. Bess On Nov 27, 2012, at 4:20 PM, Nathan Tallman ntall...@gmail.com wrote: This is a slightly different topic, but relates to Kelley's post: Does code4lib have a mentor program where more inexperienced geeks can pair up with someone to guide their development? I don't have anyone like that in my network, but would really like to. I don't mean to discount the existing resources on code4lib or this list, which both have been very useful. I'm sure I could just start by attending some of the conferences, but for more inexperienced people they can be a bit intimidating, albeit inspiring. It would also be a way to directly engage minorities. Just a thought. Nathan On Tue, Nov 27, 2012 at 6:20 PM, Kelley McGrath kell...@uoregon.edu wrote: I'll second the idea of approaching people individually and explicitly asking them to participate. It worked on me. I never would have written my first article for the Code4Lib Journal or become a member of the editorial committee if someone hadn't encouraged me individually (Thanks Jonathan!). It would also be good to find a way to somehow target the pool of lurkers who maybe aren't already connected to someone and get them more involved. As far as anonymous proposals go, we recently had a very good workshop on implicit bias here. Someone brought up that found significant changes in the gender proportions in symphony orchestras after candidates started auditioning behind screens. There are also lots of studies about the different responses to the same resume/application depending on whether a stereotypically male/female or white/black name was used. Probably it's impossible to make proposals completely anonymous, but it would be an interesting experiment to leave off the names. 
Kelley PS Interestingly, I wouldn't instinctively self-identify as a member of the Code4Lib community, although my first thought is that that has more to do with not being a coder than with being a woman. ** Kelley McGrath Metadata Management Librarian University of Oregon Libraries 1299 University of Oregon Eugene, OR 97403 541-346-8232 kell...@uoregon.edu -- -nruest
Re: [CODE4LIB] anti-harassment policy for code4lib?
It's sad that we have to address this formally (as formal as c4l gets anyway), but that's reality, so yes, bess++ indeed, and mjgiarlo++, anarchivist++ for the quick assist. The responses to the list in the past couple of hours alone suggest that this is something much of the community would want to get behind. To that end, and as a show of (positive) force--not to mention how cool our community is--I think it might be neat if we could find a way to make whatever winds up being drafted something we can sign; i.e. attach our personal names. I don't know how that would work exactly...maybe via the wiki (where it seems to me a lot of good info goes to die) or the code4lib Github (slightly better since you could link to your credentials in a an environment much larger than our own, and everyone could have a copy), but something along those lines. I'm happy to help if I can. Anyway, just a thought. -Jon -- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library jstr...@princeton.edu http://pudl.princeton.edu http://findingaids.princeton.edu On 11/26/12 6:33 PM, Michael J. Giarlo wrote: All, Building on what Bess and others have written, and on the GitHub repo that anarchivist set up, I've contributed a rough draft of a Code4Lib code of conduct: https://github.com/code4lib/antiharassment-policy/blob/master/code_of_conduct.md This strawperson code of conduct is based on DLF Forum's, which is based on the Ada Initiative's sample policy. It is modified slightly to reflect a broader scope of the conference, conference social events, the IRC channel, and the mailing list. Throw darts, rinse, repeat. 
-Mike On Mon, Nov 26, 2012 at 6:10 PM, Robert Sanderson azarot...@gmail.comwrote: +1, of course :) You might wish to consider some further derivatives/related pages: http://www.diglib.org/about/code-of-conduct/ http://wikimediafoundation.org/wiki/Friendly_space_policy https://thestrangeloop.com/about/policies http://www.apache.org/foundation/policies/anti-harassment.html Rob On Mon, Nov 26, 2012 at 3:57 PM, Mariner, Matthew matthew.mari...@ucdenver.edu wrote: +1 for all of the below Matthew C. Mariner Head of Special Collections and Digital Initiatives Assistant Professor Auraria Library 1100 Lawrence StreetDenver, CO 80204-2041 matthew.mari...@ucdenver.edu http://library.auraria.edu :: http://archives.auraria.edu On 11/26/12 3:51 PM, Tom Cramer tcra...@stanford.edu wrote: +1 for Bess's motion +1 for Roy's expansion to C4L online interactions as well as face to face +1 for Karen's focus on general inclusivity and fair play For me the hardest thing is how one monitors and resolves issues that arise. As a group with no formal management, I suppose the conference organizers become the deciders if such a necessity arises. If it's elsewhere (email, IRC) -- that's a bit trickier. The Ada project's detailed guides should help, but if there is a policy it seems that there necessarily has to be some responsible body -- even if ad hoc. It seems to me that there would be tremendous benefit in having 1.) an explicit statement of the community norms around harassment and fair play in general. In the best case, this would help avoid uncomfortable or inappropriate situations before they occur. 2.) a defined process for handling any incidents that do arise, which in the case of this community I would imagine would revolve around reporting, communication, negotiation and arbitration rather than adjudication by a standing body (which I agree is hard to see in this crowd). 
I know several high schools have adopted peer arbitration networks for conflict resolution rather than referring incidents to the Principal's Office--perhaps therein lies a model for us for any incidents that may not be resolved simply through dialogue. - Tom On Nov 26, 2012, at 2:32 PM, Karen Coyle wrote: Bess and Code4libbers, I've only been to one c4l conference and it was a very positive experience for me, but I also feel that this is too valuable of a community for us to risk it getting itself into crisis mode over some unintended consequences or a bad apple incident. For that reason I would support the adoption of an anti-harassment policy in part for its consciousness-raising value. Ideally this would be not only about sexual harassment but would include general goals for inclusiveness and fair play within the community. And it would also serve as an acknowledgment that none of us is perfect, but we can deal with it. For me the hardest thing is how one monitors and resolves issues that arise. As a group with no formal management, I suppose the conference organizers become the deciders if such a necessity arises. If it's elsewhere (email, IRC) -- that's a bit trickier. The Ada project's detailed guides should help, but if there is a policy it seems that there necessarily has to be some responsible body -- even if ad hoc. kc On 11/26/12 2:16 PM, Bess
Re: [CODE4LIB] extracting tiff info
If you want everything in that RDF, you're probably wanting to extract the XMP data. Have a look at exiv2: http://www.exiv2.org/

Basically:

    exiv2 -px your_image.tif

will dump what you want to stdout.

-Jon

-- Jon Stroop Digital Initiatives Programmer/Analyst Princeton University Library

On 11/19/2012 04:31 PM, Kyle Banerjee wrote:
> Howdy all, I need to extract all the metadata from a few thousand images on a network drive and put it into a spreadsheet. Since the files are huge (each is 100MB+) and my connection isn't that fast, I strongly prefer to not move them before working on them -- i.e. I'm using cygwin and/or windows. Just eyeballing these things, I see the headers contain everything I need in purty rdf. What's the best way to extract this? I thought tiffinfo would do the trick, but it's just giving me technical info. Of course I can just parse the files with perl but I'm thinking there just has to be a slicker way to do this. What's my best option? Thanks, kyle
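To get from the exiv2 dump to a spreadsheet, something along these lines might work. It is only a sketch: it assumes exiv2 is on the PATH and that `exiv2 -px` prints one property per line as key, type, count, value (multi-line values would need extra handling). The function names are made up for illustration.

```python
import csv
import subprocess


def parse_exiv2_xmp(text):
    """Parse `exiv2 -px` output into a {key: value} dict.

    Assumption: each line reads "key  type  count  value", separated by
    runs of whitespace. Lines that do not start with "Xmp." are skipped.
    """
    props = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 3)
        if len(parts) < 3 or not parts[0].startswith("Xmp."):
            continue
        props[parts[0]] = parts[3] if len(parts) == 4 else ""
    return props


def xmp_for_file(path):
    """Run exiv2 on one image and return its XMP properties."""
    result = subprocess.run(["exiv2", "-px", path],
                            capture_output=True, text=True)
    return parse_exiv2_xmp(result.stdout)


def write_spreadsheet(paths, out):
    """Write one CSV row per image, one column per XMP key seen anywhere."""
    rows = {path: xmp_for_file(path) for path in paths}
    columns = sorted({key for props in rows.values() for key in props})
    writer = csv.writer(out)
    writer.writerow(["file"] + columns)
    for path, props in rows.items():
        writer.writerow([path] + [props.get(col, "") for col in columns])
```

Calling `write_spreadsheet(list_of_tiff_paths, open("metadata.csv", "w", newline=""))` would leave the images where they are on the network drive; only the exiv2 text output is read.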
Re: [CODE4LIB] Mobile device usage (iOS vs. Android)
> Any thought?

I guess I'd be somewhat wary of comparing general trends to a more defined population. I'm guessing your campus population is not typical of the national population, instead probably skewed towards a younger population with higher disposable income (and also perhaps more sensitive to peer pressure) and hence might not follow general trends ;). Also, how is your 70% traffic figured? Do you have any way to determine if perhaps a few outliers are creating a significant amount of traffic? (In other words, do you know if the mobile traffic actually represents ownership, or might there be a smaller group of iPhone users who happen to use the library services more? I'd guess the smaller the population accessing via mobile, the more likely a small population could skew the results.) Also, how are you measuring the Android users? Is it possible you're missing some who would be using non-default browsers or browsers modified by a carrier? I don't unfortunately have any stats, but I do seem to remember seeing some numbers locally that would indicate the iOS share of web usage is still pretty high. Android phones are becoming very, very cheap but data plans aren't. Also, the form factor and the processing power of some of the cheaper Androids make web searching less than thrilling. I could see someone using an Android that they get for free, but not accessing the library for a variety of reasons. It would be interesting if one could compare the usage of different Android devices but the difficulty of data collection here might be enormous. (I'm not sure offhand if there's an easy way to distinguish, say, a Samsung Galaxy 2 from an Optimus.) Jon Gorman
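The outlier question can be checked by counting distinct visitors next to raw hits. The sketch below is illustrative only: the `visitor_id` field (a session cookie, say) and the substring tests on the user agent are assumptions, and real user-agent strings are messier than this.

```python
from collections import Counter


def platform_of(user_agent):
    """Very rough platform guess from a user-agent string (assumption)."""
    ua = user_agent.lower()
    if "android" in ua:
        return "Android"
    if "iphone" in ua or "ipad" in ua or "ipod" in ua:
        return "iOS"
    return "other"


def hits_and_visitors(log_rows):
    """log_rows: iterable of (visitor_id, user_agent) pairs.

    Returns two dicts keyed by platform: raw hit counts and counts of
    distinct visitors. A large gap between the two suggests that a few
    heavy users are driving the traffic share.
    """
    hits = Counter()
    visitors = {}
    for visitor_id, user_agent in log_rows:
        platform = platform_of(user_agent)
        hits[platform] += 1
        visitors.setdefault(platform, set()).add(visitor_id)
    return dict(hits), {p: len(ids) for p, ids in visitors.items()}
```

If one platform has 70% of the hits but a much smaller share of distinct visitors, the traffic figure says more about a handful of heavy users than about device ownership.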
Re: [CODE4LIB] haititrust
You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles. -Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:
> If I needed/wanted to know what materials held by my library were also in the HathiTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HathiTrust and limit the results to items my library owns? --Eric Lease Morgan
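If the Hathi files route is taken, the filtering itself is a few lines once a file is downloaded. This is a sketch under assumptions: the files are treated as tab-delimited text, and both the column position (`source_column=5`) and the source code value are placeholders to be checked against the Hathi files documentation before use.

```python
import csv


def volumes_from_source(lines, source_code, source_column=5):
    """Yield rows whose source column matches source_code.

    lines: an iterable of tab-delimited text lines (an open file works).
    source_column: zero-based index of the original-location column.
        The default is a placeholder assumption, not a documented fact.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    wanted = source_code.lower()
    for row in reader:
        if len(row) > source_column and row[source_column].strip().lower() == wanted:
            yield row
```

Note that this only finds volumes digitized from a given library, much like the Original Location facet. Matching against what a library actually holds would still need a comparison with local identifiers.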
[CODE4LIB] 2012 VIVO Conference
(apologies for any cross-postings)

In the past 3 years, a growing international movement of developers, researchers, administrators, funders, librarians and informaticians has converged around the vision of openly representing research and researchers via Linked Open Data. VIVO is helping to make this vision a reality through its community, through open software and the VIVO ontology, and a growing number of adopters and collaborators worldwide, across multiple knowledge domains. The 2012 VIVO conference will explore how to participate in and best take advantage of the emerging Linked Open Data world encompassing and expanding our understanding of research.

Who should attend?
Scholars, scientists, researchers, developers, librarians, publishers, funding agencies, research officers, students, institutional officials and those supporting the development of research discovery, data sharing and team science.

Conference highlights
The conference begins with a full day of workshops for those new to VIVO, those implementing VIVO and those wishing to develop applications using VIVO. Keynote addresses, invited speakers, scientific panels, contributed papers and posters will cover a range of topics, including the semantic web, linked open data, VIVO sustainability, adopting and implementing VIVO, research networking, network visualization, ontology and the role of VIVO in support of team science.
Registration, Call for Papers and Apps Contest, hotel and travel information
http://vivoweb.org/conference

Topics of interest
* Facilitating researcher collaboration and networking
* Managing/discovering knowledge about researchers across institutional, disciplinary, and national boundaries
* Approaches to the adoption of VIVO and related systems that interoperate through shared ontologies and Linked Open Data
* The intersection of VIVO and international research standards
* Research representation ontology development
* Open representations of research and implications for the research process, collaboration, and virtual research communities
* Perspectives on policy, research representation, and research impact, including questions of privacy, individual vs. institutional sourcing of data, and change over time
* Semantic Web development and extensions of the VIVO platform to reach the full Web community
* Open research data and related issues in discovery, reuse, and attribution

About VIVO
VIVO is an open source, open ontology, open process platform for hosting information about scientists’ interests, activities and accomplishments. VIVO supports open development and integration of science through simple, standard semantic web technologies. Learn more at http://vivoweb.org

Jon Corson-Rikert Head, Information Technology Services VIVO Development Lead 201 Albert R. Mann Library Cornell University Ithaca, NY 14853 607 255-4608 j...@cornell.edu
Re: [CODE4LIB] Sharing code
On Fri, Mar 9, 2012 at 11:34 AM, Whitworth, Cliff cliff.whitwo...@unt.edu wrote: NOOB to list and am appreciative of this discussion. My boss is encouraging me to share code and pointed me to code4lib. the majority of my code is recycled / repurposed from others so I've had reservations about sharing mainly because of what's taken from others. At the least, I'm mindful about leaving acknowledgements intact. Is there a good resource on how to start sharing code and ethical considerations? Howdy and welcome Cliff! In short, I think there's a push over the past few years to share more and more code, even when it's small. There's a lot of individuals scattered in the library world who are not necessarily on local teams who end up doing the same work over and over again. There's some tension with this as there's also projects that tend to get abandoned or just don't have as much support and community as they could. I've been bad about releasing source myself. I've got a barrier in our lawyers, who I really need to push to let me have more leeway for releasing stuff. There's been a couple of articles over the years on the code4lib journal, see... First, an argument on why to just put stuff out there by Dale Askey: COLUMN: We Love Open Source Software. No, You Can’t Have Our Code http://journal.code4lib.org/articles/527 See Terry Reese's excellent article in the latest issue: Purposeful Development: Being Ready When Your Project Moves From ‘Hobby’ to Mission Critical http://journal.code4lib.org/articles/6393 Michael Doran gave an excellent talk a few years back that really stuck in my head with the very issue I've been reluctant to put more effort into: lawyers and code: The Intellectual Property Disclosure: OpenSource in Academia 21:09 - 4 years ago http://video.google.com/videoplay?docid=-3341633878207243364 There's a lot of other good articles in the journal and on people's various blog posts. 
Github is all the rage these days, so at some point I'll need to figure out how to use it ;). Again, welcome! Jon Gorman
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
> It used to be that way, at least it was this way when I grew up in open source (in the 90s, before Eric Raymond invented the term). And it makes sense, for successful projects that have at least a moderate number of users. Just dumping your code on github helps very few people.

You realize this isn't Apache, right? It seems a small project, mostly maintained by folks as they get time. There are no SCRUM meetings or hallway meetings, no foundation, no checklist. Surely you can't take two interactions as reflective of the culture of open source. It seems to have been a small piece of code shared so others wouldn't have to do it over again, and it's grown with time. The primary thrust seems to be for library developers, not catalogers or folks learning python code. The typo you brought up (assuming you meant issue #22, https://github.com/edsu/pymarc/issues/22) was patched by one of the team members within an hour or two, from what I can tell. In general though github is the sourceforge of years past, but even better. It seems entirely reasonable to ask for a patch to me. Perhaps it could have been handled more delicately by both sides. Perhaps you weren't treated as nicely as you'd like. There's probably some truth to that. But at the same time, Ed did include a wink at the end after requesting the patch. Had you perhaps cut him some slack instead of immediately responding incredulously, you'd have found it was fixed when he got time. Or not. He has his own priorities, as do other folks who contributed to the code. If you're unhappy with the dump on github approach, then don't use the software. No one ran around forcing folks to do it. It's one of those lightweight github approaches, just another approach to open source software. In all the years I've been involved with open source, every project has had its own unique culture.
There's responsibility on the user before using software to figure out what it is. If it doesn't meet their expectation, I see little reason that the developer should feel compelled to change unless they're getting paid for the work. Obviously some people have found the dump on github approach useful if they've contributed patches. Can't we all just shake hands virtually or something? Jon Gorman
Re: [CODE4LIB] Microsoft Transit-SQL
> I am looking for a good text on Microsoft Transit-SQL. I have searched high and low and all I find are books focused on Microsoft SQL Server.

Do you mean Transact-SQL (which I usually just see abbreviated T-SQL)? The online documentation at MSDN isn't great, but it's not horrible. That's usually what I use. I mean, usually it's just a matter of looking up how it implements SQL and some of the local variants. (Do you need recommendations for books on SQL?) Jon Gorman
[CODE4LIB] How to get on irc
Hi all, Quick link for those trying to get on irc for the first time. There's some info on http://code4lib.org/irc

Basic:
download an irc client (I like xchat)
connect to the freenode server
type /join #code4lib

Gotta go, presentation started

Jon Gorman University of Illinois
Re: [CODE4LIB] Koha in the Running
> I'm curious to know of this list's current thoughts on Koha as an ILS. Where would you rank it among the various options, open source and vendor?

I'm confused, what do you mean by open source and vendor? There are vendors/companies that develop for and support Koha. Open source and vendor/commercial activity are not mutually exclusive. Did you mean open source and proprietary? There are lots of combinations of ILSes and ways to manage them out there:

Open source ILS / local servers / no support contract
Open source ILS / hosted / no support contract
Open source ILS / hosted / support contract
Proprietary ILS / local servers / no support
Proprietary ILS / hosted / support

Some of those combinations are pretty rare, but I could see all of them existing. And you could distinguish between support and development contracts. A nice advantage of open source is that you can always change vendors or fund someone who's not your usual developer group, depending on how the community around the project has been established. It's harder to do that with proprietary software, but I've still heard of it happening. Are you interested in stuff like that? Or are you more just interested in how people's experience using the Koha software itself compares to other ILS options out there? Or the actual overall experience? Or which Koha vendor is the best? Jon G.
Re: [CODE4LIB] My crazed idea about dealing with registration limitations
Maybe keynotes happen on the middle day; the one time where the whole group comes together, though it would require a 2x size space... This could also reduce the length to 4.5 days. On 12/22/2011 10:05 AM, Peter Murray wrote: That is a crazy idea. I don't know about putting the speakers on the hook for two days -- particularly keynote speakers. Still, it would be interesting for a site to flesh this out and propose something along these lines. Peter On Dec 21, 2011, at 6:44 PM, Fleming, Declan wrote: Hi - so I know this is nuts. If we start with a couple premises for the code4lib conference: 1. Single thread is crucial. 2. 250 is about the top limit of a single threaded conference. 3. 400+ people want to attend. 4. The conference takes 2.5 days. What if we ran the 2.5 day conference twice in one week? 1. Session 1 runs from Monday until noon on Weds. 2. Session 2 runs from 1p on Weds until the end of Friday. 3. Every one of the 23 accepted talks is given twice, once in each Session, in the same order. 4. Each Session is attended by a different set of attendees. We could serve 500 attendees this way. If everyone came for the week, there could be parallel seminars, hack fests, BootCamps, THATcamps, CURATEcamps, c4lcamps, etc... for the half of the 500 that wasn't in the main conference. People could also just decide to come for the 2.5 day main conference, I guess. I SAID it was crazy. ;) D
Re: [CODE4LIB] Obvious answer to registration limitations
I had planned to come to code4lib and knew it filled up fast. I joined the mailing list so I could find out about the registration as soon as it happened. It came out in mid-morning and I happened to be in a meeting until 12 or so and by the time I tried to register it was sold out. This is annoying. Why not find a venue that is big enough to meet the obvious demand? There are surely plenty of larger venues in a city such as Seattle. The actual time when registration was going to open was published in a variety of venues (on the wiki, on the mailing lists, and it seemed someone was asking the question every fifteen minutes in the channel, including me ;) ). I purposely avoided scheduling meetings around that time and rescheduled some that were. On the other hand, it would be interesting to see a proposal for a larger code4lib and I imagine Minnesota has lots of places that can host a larger one. The deadline isn't until Jan. 22nd See http://code4lib.org/node/425 As always, if you want Code4Lib to do something or change, all you have to do is plan and work for it. That's why we're a loose collective and not a professional organization. I personally would not vote on making it much larger. It seems every order of magnitude increase takes it away from the techie origins and more like CiL or Internet Librarian. On the other hand, regardless of the size, I still suspect I'll find people willing to discuss the technical stuff, I just might stop showing up for most of the actual talks. Jon Gorman. On Mon, Dec 19, 2011 at 8:47 AM, Elfstrand, Stephen F stephen.elfstr...@mnsu.edu wrote: Stephen Elfstrand PALS Executive Director stephen.elfstr...@mnsu.edu 507.389.5059
Re: [CODE4LIB] Any ideas for free pdf to excel conversion?
> I'm looking for a way to pull 29 pages of pdf tables into excel so I can munge the data into an excel project and all my free trials so far have only converted a few pages at a time.

Copy and paste? If it needs to be somewhat automated: pdftotext -> some cut/paste / sed / regex -> open in Excel? You might need to fiddle with the pdftotext settings, but I've been pretty successful with that before doing something else. Jon G.
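The regex step of that pipeline could look roughly like this. It is a sketch that assumes the text came from something like `pdftotext -layout tables.pdf tables.txt`, so that table columns are separated by runs of two or more spaces. Real tables with empty cells or wrapped text will need more care, and the function names are made up.

```python
import csv
import io
import re


def layout_text_to_rows(text, min_gap=2):
    """Split layout-preserved text into rows of cells.

    A run of at least min_gap whitespace characters is treated as a
    column break (an assumption about how the columns were spaced).
    Blank lines, including those around page breaks, are dropped.
    """
    gap = re.compile(r"\s{%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(gap.split(line))
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Writing the result of `rows_to_csv(layout_text_to_rows(open("tables.txt").read()))` to a .csv file gives something Excel opens directly, all 29 pages at once.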
Re: [CODE4LIB] server side vs client side
On Thu, Dec 1, 2011 at 11:49 AM, Nate Hill nathanielh...@gmail.com wrote:
> As I was struggling with the syntax trying to figure out how to use javascript to load a .txt file, process it and then spit out some html on a web page, I suddenly found myself asking why I was trying to do it with javascript rather than PHP. Is there a right/wrong or better/worse approach for doing something like that? Why would I want to choose one approach rather than the other?

I tend to try to do most stuff server-side. Javascript I try to keep just to enhance the GUI and perhaps do some AJAXy stuff. There is the fact that if you're using an external API that's not crucial, you might want to just do it on the javascript side. So think about cover images in a catalog, for example. You could have the server-side script go out, grab the image, put it in a local cache, then prepare the link within the actual html. But if something goes wrong, you might either take really long to return that page or never return it. The approach that most folks take is to have some javascript that does an AJAX call. So the page loads on the client, and then when the image comes back the cover image is added. If it never happens, you've at least sent the page. I know some who tend to always go to javascript because they're used to not having control of the underlying system, except to add html to templates and sneak in javascript that way. However, that's awkward, difficult to maintain, error-prone, and likely horrible for accessibility. If you control the underlying PHP, then yeah, do it on the PHP side ;). My advice here is somewhat simplistic and general. You do have my curiosity up now though. What was your goal with trying to load that text file? Jon Gorman
Re: [CODE4LIB] Models of MARC in RDF
You may know about this one already, but the BL exposed the British National Bibliography as RDF last summer. The project has a page[1] with a good amount of info--the data model[2] might be a good place to start. -Jon 1. http://www.bl.uk/bibliographic/datafree.html 2. http://www.bl.uk/bibliographic/pdfs/datamodelv1_01.pdf On 11/26/2011 10:58 AM, Karen Coyle wrote: A few of the code4lib talk proposals mention projects that have or will transform MARC records into RDF. If any of you have documentation and/or examples of this, I would be very interested to see them, even if they are under construction. Thanks, kc -- Jon Stroop Metadata Analyst Firestone Library Princeton University Princeton, NJ 08544 Email: jstr...@princeton.edu Phone: (609)258-0059 Fax: (609)258-0441 http://pudl.princeton.edu http://findingaids.princeton.edu http://www.cpanda.org
[CODE4LIB] Fwd: [semweb-25] Metropolitan Musem of Art hiring a Semantic Web Developer
May be of interest to someone on this list.

Original Message
Subject: [semweb-25] Metropolitan Musem of Art hiring a Semantic Web Developer
Date: Thu, 24 Nov 2011 11:01:27 -0500
From: don undeen donund...@yahoo.com
Reply-To: semweb...@meetup.com
To: semweb...@meetup.com

Hello, Hoping that this isn't a spam, but the Metropolitan Museum of Art's Digital Media Department is hiring for an Information Systems Developer. This position will be involved in advanced data architecture solutions, to support a variety of web and in-gallery technology. This work may entail:
- Setting up and administering triple stores, NoSQL dbs, and CMSs like Drupal
- designing interfaces, modules, and workflows for same
- Implementing collective intelligence algorithms,
- experimenting with new technologies, developing prototypes and proofs-of-concept
- and (to be honest) some drudgery, like data delivery, ETL, and report generation

See the application on linkedin, here: http://www.linkedin.com/jobs?viewJob=jobId=2157751srchIndex=0trk=njsrch_hitsgoback=%2Efjs_information+systems+developer_*1_*1_I_us_*1_*1_1_R_true_*2_*2_*2_*2_*2_*2_*2_*2

I know many of you do more than just SemWeb work, and many of you are on this list because you like to find new ways to tackle vexing problems. That's what we're looking for. If you choose to submit a resume, please send it to the email address provided, but also cc me: don.und...@metmuseum.org

I look forward to hearing from you.

yours, Don Undeen Manager, Media Lab Digital Media Department Metropolitan Museum of Art
-- Jon Stroop Metadata Analyst Firestone Library Princeton University Princeton, NJ 08544 Email: jstr...@princeton.edu Phone: (609)258-0059 Fax: (609)258-0441 http://pudl.princeton.edu http://findingaids.princeton.edu http://www.cpanda.org
Re: [CODE4LIB] Professional development advice?
Probably the most important thing you can do is simply play around with the technology. Get some ideas of what you want to play around with. Then try to do it or see if someone else has already done it. If someone else has done it, try to figure out how (open source for the win). When I was starting out I liked having classes, just because they usually create goals and end points. To be honest though it's been a little while since I've actually taken a class. I probably should again, but life does get busy. Books and very good websites are a close second. Look for classes in either your CS department or the local community college. If you want to do web development, start looking for a language and framework you like. Set up a box, install a webserver on it. Find a web application you like and try to get it up and running. (Give running your own Koha server a try!) I don't know if it will help, but here's some knowledge I'd look for in any web developer who was looking for a library job:
* What version control systems do they know?
* Do they know configuration management tools like Puppet?
* Why they liked particular projects they worked on and what they may not have liked about them.
* Basic network knowledge.
* Some basic knowledge of design principles and usability testing. They don't need to be a master, but I hope they're at least aware of some of the techniques.

I'm not really concerned about particular languages or frameworks. Mainly I'm looking for signs that they're comfortable with web development and know some of the pitfalls and issues that can happen in the library environment. Have they run into issues with combining diacritics, confused librarians, what to call services? Also, I'm watching for any warning signs, like they can't distinguish between client-side javascript and server-side processing, or the only test they seem to use is "does it display?". That would make me instantly wary. Jon Gorman
Re: [CODE4LIB] Professional development advice?
On Mon, Nov 28, 2011 at 11:50 AM, Kyle Banerjee baner...@uoregon.edu wrote: Having a playground where you can experiment aggressively is useful. I'm a fan of Amazon EC2 because you can create servers in minutes for pennies per hour and try things you'd never want to do with real hardware. It's nice when you can completely restore a destroyed server in a couple minutes. Ah, in a similar vein, having a VM setup can help a lot with playing around. Look into VirtualBox and set up a VM. It's a lot easier once you get the hang of it than the old days when you almost needed a physical machine to play around. Jon Gorman
Re: [CODE4LIB] Plea for help from Horowhenua Library Trust to Koha Community
Hi Joann, Have you considered sending this to some of the tech podcasts? I think both the Command-Line podcast (http://thecommandline.net/) and Linux Outlaws (http://sixgun.org/linuxoutlaws/) would be great audiences and receptive to this story. I'm a regular listener of both, and if you want me to contact them, so they would get it from a regular listener, I'd be more than happy to forward your message with some personal notes. (And the paypal link too ;) ). Jon Gorman

On Mon, Nov 21, 2011 at 6:51 PM, Joann Ransom jran...@library.org.nz wrote:
> Horowhenua Library Trust is the birth place of Koha and the longest serving member of the Koha community. Back in 1999 when we were working on Koha, the idea that 12 years later we would be having to write an email like this never crossed our minds. It is with tremendous sadness that we must write this plea for help to you, the other members of the Koha community. The situation we find ourselves in, is that after over a year of battling against it, PTFS/Liblime have managed to have their application for a Trademark on Koha in New Zealand accepted. We now have 3 months to object, but to do so involves lawyers and money. We are a small semi rural Library in New Zealand and have no cash spare in our operational budget to afford this, but we do feel it is something we must fight. For the library that invented Koha to now have to have a legal battle to prevent a US company trademarking the word in NZ seems bizarre, but it is at this point that we find ourselves. So, we ask you, the users and developers of Koha, from the birth place of Koha, please if you can help in any way, let us know. Background reading: - Code4Lib article http://journal.code4lib.org/articles/1638: How hard can it be : developing in Open Source [history of the development of Koha] by Joann Ransom and Chris Cormack.
- Timeline http://koha-community.org/about/history/ of Koha :development - Koha history visualization http://www.youtube.com/watch?v=Tl1a2VN_pec Help us If you would like to help us fund legal costs please use the paypal donate button below. Otherwise, any discussion, public support and ideas on how to proceed would be gratefully received. Regards Jo. -- Joann Ransom RLIANZA Head of Libraries, Horowhenua Library Trust.
Re: [CODE4LIB] marc-8
> In Perl, how do I specify MARC-8 when reading (decoding) and writing (encoding) data?

> You can't. MARC-8 is a character set that is unknown to the operating system. Your best bet is to convert MARC-8-encoded records into UTF-8.

> /me throws his hands up in the air and screams! Okay. How do I go about converting MARC-8 encoded records into UTF-8? I know yaz-marcdump changes the encoding bit in MARC leaders. Does it also convert MARC-8 characters to UTF-8? (I guess I could simply try it and see what happens.)

I seem to remember there was an older version of yaz-marcdump that seemed a bit buggy (with a certain combination of command-line options it would just change the leader but not convert the encoding). It's also possible I was just working with a script that specified the encoding change but not the leader.

I'd say get the most recent version of yaz (don't use anything in an OS repository) and then follow the docs: http://www.indexdata.com/yaz/doc/yaz-marcdump.html. The first example is what you want:

    yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 marc21.raw marc21.utf8.raw

The -f is the source encoding, the -t is the target encoding, and the -l 9=97 sets leader position 9 to 'a' (97 is the decimal character code for 'a').

I've typically found this is one of the easier ways to do the character set conversion, although the various Perl modules (if they're recent enough) should be able to handle the conversion as well through the MARC::Charset library. Check the CPAN pages.

Jon Gorman

ps. For the love of all that is good, don't try to do anything in Perl with the raw MARC record to do the encoding change yourself. I've seen someone really screw records up because they altered individual characters, which in turn led to different byte lengths. This caused all sorts of insanity, which meant really weird things happened with MARC parsers that tried to follow the MARC directory (which uses byte addresses to deal with variable fields).
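To make the byte-length point in the postscript concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: Latin-1 stands in for a single-byte source encoding (as far as I know the Python standard library has no MARC-8 codec), and the sample word is made up.

```python
# Illustrative only: the same visible text can occupy a different number
# of bytes depending on the encoding. Latin-1 is a stand-in here, not MARC-8.
text = "Bibliothèque"

latin1_bytes = text.encode("latin-1")  # one byte per character for this word
utf8_bytes = text.encode("utf-8")      # "è" takes two bytes in UTF-8

print(len(latin1_bytes), len(utf8_bytes))

# A MARC directory entry stores each field's length and starting offset in
# bytes. If a field were re-encoded in place without rebuilding the
# directory, the stored numbers would be off by this much for this field,
# and every later field's starting offset would shift as well.
drift = len(utf8_bytes) - len(latin1_bytes)
print(drift)
```

This is why a tool that rebuilds the leader and directory as part of the conversion is the safer route than editing characters in the raw record.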
Re: [CODE4LIB] ISBN Regular Expression
Also, I don't know OpenBook well enough to know your source data, but don't forget that publishers have printed ISBNs in a lot of different ways over the years. The regex would choke on any hyphens, and users copying from printed material could well type them in. For example, one of the books near my desk has the ISBN printed like 0-521-61678-6. If this is user input and nothing is stripping characters like that out, it could cause problems. (I think I've also seen spaces used instead of hyphens, but I'm less positive about this.)

Jon Gorman

On Mon, Oct 24, 2011 at 9:44 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

> John: That's not going to work. An ISBN can end in X as a check digit, which is not [0-9]. You are going to be rejecting valid ISBNs; you have a bug.

On 10/24/2011 10:40 AM, John Miedema wrote:

> Here's a php function I use in OpenBook to test if a user has entered a 10 or 13 digit ISBN.
>
>     //test if 10 or 13 digits ISBN
>     function openbook_utilities_validISBN($testisbn) {
>         return (ereg ("([0-9]{10})", $testisbn, $regs) || ereg ("([0-9]{13})", $testisbn, $regs));
>     }

On Fri, Oct 21, 2011 at 1:44 PM, Kozlowski, Brendon bkozlow...@sals.edu wrote:

> Hi all. I'm somewhat surprised that I've never had to validate an ISBN manually up until now. I suppose that's a testament to all of the software out there. However, I now find that I need to validate both the 10-digit and 13-digit ISBNs. I realize there's also a check digit and a REGEX cannot check this value - one step at a time. Right now I just want to work on the REGEX.
>
> Does anyone know the exact specifications of both forms of an ISBN? The ISBN organization's website didn't seem to be overly clear to me. Alternatively, if anyone has a full working regular expression for this purpose I would definitely not mind if they'd be willing to share.
>
> The only thing I'm doing which is abnormal is that I am not requiring the hyphenation or spaces between numbers since some of this data will be coming from a system, and some will be coming from human input.
>
> Brendon Kozlowski
> Web Administrator
> Saratoga Springs Public Library
> 49 Henry Street Saratoga Springs, NY, 12866
> [518] 584-7860 x217
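Pulling the points from this thread together (strip hyphens and spaces first, allow X as the final character of a 10-digit ISBN, then check the check digit separately), a minimal sketch might look like the following. It is written in Python rather than PHP, and the function names are made up for illustration; it is not the OpenBook code.

```python
import re


def normalize_isbn(raw):
    """Drop hyphens and whitespace and upper-case a trailing 'x'."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_valid_isbn10(isbn):
    """Nine digits plus a check character that may be X (worth 10)."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        return False
    total = 0
    for position, char in enumerate(isbn):
        value = 10 if char == "X" else int(char)
        total += (10 - position) * value  # weights run 10 down to 1
    return total % 11 == 0


def is_valid_isbn13(isbn):
    """Thirteen digits, weighted alternately 1 and 3, sum divisible by 10."""
    if not re.fullmatch(r"[0-9]{13}", isbn):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn))
    return total % 10 == 0


def is_valid_isbn(raw):
    isbn = normalize_isbn(raw)
    return is_valid_isbn10(isbn) or is_valid_isbn13(isbn)


print(is_valid_isbn("0-521-61678-6"))  # the ISBN quoted in the reply above
```

Using a full-string match matters here: a pattern that only searches for ten digits somewhere in the input would also accept longer strings of digits.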
[CODE4LIB] Variations/FRBR project announces release of RDF data and project source code
(Apologies for cross-posting...)

Indiana University announces the availability of several deliverables from the IMLS-funded Variations/FRBR project, all of which are accessible from the project website, http://vfrbr.info.

An export of FRBRized data with an RDF binding of the Variations/FRBR data model is available in two forms: a single compressed archive containing all triples, and smaller separate files with batches of triples by entity type. Also available are an ontology in OWL and a set of RDF design templates. All data exports contain data for 80,000 sound recordings and 105,000 scores, based on holdings of Indiana University's Cook Music Library.

Project source code is downloadable in four subprojects: persistence, FRBRization, export, and search. The vfrbr-persist project provides tools for creating the MySQL database and Java classes providing connection to the database. The vfrbr-frbrize-marc project provides the tools for FRBRizing MARC records and storing the results in the database. The vfrbr-export project enables XML exports from the database. The vfrbr-scherzo project contains the end-user search interface. All source code is released under a BSD open source license.

The Scherzo search interface at http://vfrbr.info/search has been enhanced to include scores as well as recordings. Keyword search is now available, along with a publication date facet, and usability has been improved through numerous small changes.

Comments and questions on the Variations/FRBR project may be sent to vf...@dlib.indiana.edu.

Regards,
Jon
---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953
Re: [CODE4LIB] TIFF Metadata to XML?
Edward,

JHOVE (1) should be able to do this, and I believe you can pass the included shell script a directory and have it extract data for everything it finds and can parse inside.

-Jon

On 07/18/2011 09:18 AM, Edward M. Corrado wrote:

> Hello All,
>
> Before I re-invent the wheel or try many different programs, does anyone have a suggestion on a good way to extract embedded metadata added by cameras and (more importantly) photo-editing programs such as Photoshop from TIFF files and save it as XML? I have 60k photos that have metadata including keywords, descriptions, creator, and other fields embedded in them, and I need to extract the metadata so I can load them into our digital archive.
>
> Right now, after looking at a few tools and doing a number of Google searches, I haven't found anything that seems to do what I want. As of now I am leaning towards extracting the metadata using exiv2 and creating a script (shell, perl, whatever) to put the fields I need into a pseudo-Dublin Core XML format. I say pseudo because I have a few fields that are not Dublin Core. I am assuming there is a better way. (Although part of me thinks it might be easier to do that than exporting to XML and using XSLT to transform the file, since I might need to do a lot of cleanup of the data regardless.)
>
> Anyway, before I go any further, does anyone have any thoughts/ideas/suggestions?
>
> Edward
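As a rough sketch of the "script that puts the fields I need into a pseudo-Dublin Core XML format" step, something like the following Python could work once the metadata has been extracted by whatever tool is chosen. Everything specific here is an assumption for illustration: the input dictionary, the source field names in FIELD_MAP, and the `record` wrapper element are made up, and the extraction itself (exiv2, JHOVE, etc.) is not shown.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping from extracted field names to Dublin Core elements.
FIELD_MAP = {
    "Keywords": "subject",
    "Description": "description",
    "Creator": "creator",
}


def to_dublin_core(extracted):
    """Build a small XML record from a dict of field name -> list of values."""
    record = ET.Element("record")
    for source_name, dc_name in FIELD_MAP.items():
        for value in extracted.get(source_name, []):
            element = ET.SubElement(record, "{%s}%s" % (DC_NS, dc_name))
            element.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(record, encoding="unicode")


sample = {
    "Creator": ["Jane Doe"],
    "Keywords": ["campus", "library"],
    "Description": ["Reading room, fish & chips night"],
}
print(to_dublin_core(sample))
```

Fields that are not Dublin Core could be added under a second, local namespace in the same way, which keeps the cleanup logic in one script instead of splitting it between an export step and XSLT.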
Re: [CODE4LIB] Question about C4L 2011 in Bloomington
Hi Tania,

If you're talking about the graph paper pads, my understanding is that these were designed in-house by the Communications Office in our IT organization (UITS) and printed by a local printing firm. The Communications Office is happy to share the design files if you're interested in modifying them for your use - I'll e-mail you offline about that.

Jon
---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tania Fersenheim
Sent: Wednesday, July 13, 2011 1:36 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Question about C4L 2011 in Bloomington

Do any of the organizers of the C4L conference in Bloomington back in February know what company made the pads of paper you handed out at registration? We like the format and might like to order some for ourselves. I can send a scan if someone wants to see what I am talking about.

--
Tania Fersenheim
Manager of Library Systems
Brandeis University Library and Technology Services
415 South Street, (MS 017/P.O. Box 549110) Waltham, MA 02454-9110
Phone: 781.736.4698 Fax: 781.736.4577
email: tan...@brandeis.edu
Re: [CODE4LIB] MARCXML to MODS: 590 Field
I'm going to guess that it's because 59x fields are defined for local use: http://www.loc.gov/marc/bibliographic/bd59x.html ...but someone from LC should be able to confirm. -Jon -- Jon Stroop Metadata Analyst Firestone Library Princeton University Princeton, NJ 08544 Email: jstr...@princeton.edu Phone: (609)258-0059 Fax: (609)258-0441 http://pudl.princeton.edu http://diglib.princeton.edu http://diglib.princeton.edu/ead http://www.cpanda.org/cpanda On 05/19/2011 11:45 AM, Richard, Joel M wrote: Dear hive-mind, Does anyone know why the Library of Congress-supplied MARCXML to MODS XSLT [1] does not handle the MARC 590 Local Notes field? It seems to handle everything else, not that I've done an exhaustive search... :) Granted, I could copy/create my own XSLT and add this functionality in myself, but I'm curious as to whether or not there's some logic behind this decision to not include it. Logic that I would not naturally understand since I'm not formally trained as a librarian. Thanks! --Joel [1] http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl Joel Richard IT Specialist, Web Services Department Smithsonian Institution Libraries | http://www.sil.si.edu/ (202) 633-1706 | richar...@si.edu
Re: [CODE4LIB] is this valid marc ?
You've gotten some other good responses, but I thought I'd mention the LoC and OCLC sites on MARC if you haven't seen them yet. First, the LoC site at http://www.loc.gov/marc/. This is what I use as a guide and a reference. Some folks prefer the OCLC docs http://www.oclc.org/bibformats/en/, particularly if they're an OCLC member. Of course, these apply to MARC-21 and not UniMarc. Not sure what good resources are out there for UniMARC. Jon Gorman
Re: [CODE4LIB] linked data endpoints
Just to clarify, are you picturing some sort of feedback loop? I'm just trying to get a better picture of the process (sounds like an interesting project). In other words, do you have something like:

1) Take in a full-text document (like, say, a novel?).
2) Run it through NER and pull out locations, places, things.
3) Have a user who's read the novel (or perhaps display those words in context?) go through each of the locations and pick a lat/long using Google Maps as an interface (i.e., say that this Dublin is Dublin, OH, not Dublin, Ireland).
4) Do something similar with names, only using some sort of resource like dbpedia to display possible individuals?
5) Mark up the original file in an XML doc w/ identifiers around those occurrences?

Is that what you're picturing?

Jon G.
Who doesn't really know enough about linked data to contribute, but is interested nonetheless.
Re: [CODE4LIB] If you were starting over, what would you learn and how would you do it?
Here's my take on whether or not the projects are going to be useful in job hunting. It's a bit of a gamble, and honestly they may not be. On the other hand, I would certainly take a portfolio as a very good sign of a candidate in my own hunts. But realistically, the job market's just too wild at the moment. It does seem to be smoothing out, though.

I would certainly run the portfolio by some systems people you really respect and ask them to give an honest opinion. Such projects can be revealing not just in a positive way but in a negative one too. (And I feel bad being negative; perhaps just blame it on a bad week. I've seen very few portfolios that detracted from my opinion of a candidate.)

On the other hand, though, personal experience, particularly when it's well supported through independent study and discussion with others, gives a huge boost to your skills. I don't know if a candidate in this job market can afford NOT to spend at least some personal time developing their skills. Perhaps in an ideal world school and on-the-job training would cover all the ground. If you can, though, double-dip and just take a course assignment to the next level or something like that.

In other words, such personal work probably won't greatly increase your chances of beating out the competition, but without it you're likely going to have a hard time making a good impression. Of course, hopefully you enjoy this tech stuff, so spending personal time isn't too burdensome ;). But I understand; these days it seems like I never have enough time to work on my personal geeky projects.

Sorry for the convoluted answer, hopefully it'll help. We can always use more geeky librarians ;).

Jon Gorman
Re: [CODE4LIB] yaz-marcdump
From a good article on this at http://www.indexdata.com/blog/2009/10/z3950-dummies-part-4:

    $ yaz-marcdump -f marc-8 -t utf-8 -o marc -l 9=97 part01.dat part.mrc

(97 = 'a')

If I remember correctly, some of this functionality has changed over various versions, so I'm not sure if the -l option is still needed, but better safe than sorry. You might also want to check the man page for your particular version of yaz-marcdump.

Jon G.

On Mon, May 2, 2011 at 9:39 AM, Eric Lease Morgan emor...@nd.edu wrote:

> Does the -t flag in yaz-marcdump tell the program to convert characters in MARC records to specific character sets, or does it merely change the value in a MARC leader to denote the character set of the record as a whole? In other words, will yaz-marcdump do its best to convert MARC-8 characters found in MARC records into UTF-8 characters?
>
> --
> Eric Morgan
> University of Notre Dame
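For anyone wondering what "-l 9=97" refers to, here is a small Python sketch that only reads the flag; it does not convert anything. The sample leader is made up, and the function name is an illustration, not part of yaz or any MARC library. In MARC 21, leader position 9 (counting from 0) holds the character coding scheme: 'a' for UCS/Unicode, blank for MARC-8.

```python
def declared_encoding(leader):
    """Report what leader position 9 of a MARC 21 record claims."""
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader is 24 characters long")
    flag = leader[9]
    if flag == "a":
        return "UCS/Unicode"
    if flag == " ":
        return "MARC-8"
    return "unknown"


# 97 is simply the decimal character code for 'a'.
print(chr(97))

# Made-up leader that claims Unicode, and the same leader with the flag blanked.
unicode_leader = "00714cam a2200205 a 4500"
marc8_leader = unicode_leader[:9] + " " + unicode_leader[10:]

print(declared_encoding(unicode_leader))
print(declared_encoding(marc8_leader))
```

Note that the flag is only a claim about the record. Checking it tells you what the record says it is, not whether the bytes in the fields actually match.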
Re: [CODE4LIB] utf8 \xC2 does not map to Unicode
> I'm making headway on my MARC records, but only through the use of brute force. I used wget to retrieve the MARC records (as well as associated PDF and text files) from the Internet Archive.

I know IA has some bad marc records (and also records w/ bad encoding) from my experience with them in the past. I'm also not sure what the web server / wget will do to the files.

> I did play a bit with yaz-marcdump to seemingly convert things from marc-8 to utf-8, but I'm not so sure it does what is expected. Does it actually convert characters, or does it simply change a value in the leader of each record? If the former, then how do I know it is not double-encoding things? If the latter, then my resulting data set is still broken.

I seem to remember there was a bug with yaz-marcdump where it was just toggling the leader. (Or a design flaw where you had to specify a character conversion as well.) But I thought that was fixed a while ago. It's probably one of the better tools out there for this type of stuff.

> If MARC records are not well-formed and do not validate according to the standard, then just like XML processors, they should be used. Garbage in. Garbage out.

I'm guessing you meant they shouldn't be used? ;). XML processors aren't really known for flexibility in this regard.

Unfortunately there are a lot of issues here. Not the least of them is that some of the worst problems I've seen are introduced by well-meaning folks who do things like dump a file out into MARCXML or a marc-breaker format, twiddle with bits, and start using tools to dump unicode text into what is really a marc-8 file. Then at some point in the pipeline enough character encoding conversions happen that the file ends up being messed up.

And then there's always the legacy data that got bungled up in an encoding transfer. I know we've got some bad CJK characters due to this. At some point in converting our marc-8 records, one or two characters got mapped to something that's not in the unicode spec at all. At some point we'll clean up those records, you know, when we've got some spare time :P.

The problem here has been the tools, and they pass whatever internal validations are enforced. Probably more stages need to check for validity, but there are a lot of records that would fail if they did. (I don't even want to think about how many people disable validation, or use the same software stack that generated the marc in the first place, or about changes within the marc spec itself over time that make validation even more difficult.)

Jon Gorman
Re: [CODE4LIB] utf8 \xC2 does not map to Unicode
I'm not quite convinced that it's marc-8 just because there's \xC2 ;). If you look at a hex dump, I'm seeing a lot of what might be combining characters. The leader appears to have 'a' in the field to indicate unicode. In the raw hex I'm seeing a lot of two-character sequences like: 756c 69c3 83c2 a872 (culir). If I knew my utf-8 better, I could guess what combining diacritics these are. Doing a lookup on http://www.fileformat.info seems to indicate that this might be utf-8, a 'DIAERESIS'.

When debugging any encoding issue it's always good to know:

a) how the records were obtained
b) how they have been manipulated before you touch them (basically, how many times may they have been converted by some bungling process)?
c) what encoding they claim to be now?
d) what encoding they actually are, if any?

It's been a while since I used MARC::Batch. Is there any reason you're using that instead of just using MARC::Record? I'd try just creating a MARC::Record object.

I've seen people do really bizarre things to break MARC files, such as editing the raw binary (thus invalidating the leader and the directory, as the byte counts were no longer right). I hate to say it, but we still come across files that are no longer in any encoding due to too many bad conversions. It's possible these are as well. The enca tool (haven't used it much) guesses this is utf-8 mixed w/ non-text data.

Jon
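One possible reading of the bytes c3 83 c2 a8 in that hex dump, offered as a guess rather than a diagnosis: they are exactly what you get when the UTF-8 bytes for "è" are mistaken for Latin-1 and then encoded to UTF-8 a second time. The Python sketch below reproduces that and shows the reverse step; whether this is what actually happened to these records is an assumption.

```python
original = "è"

# Correct UTF-8 for the character: two bytes.
utf8_once = original.encode("utf-8")
print(utf8_once.hex())

# Misread those two bytes as Latin-1 characters, then encode again.
double_encoded = utf8_once.decode("latin-1").encode("utf-8")
print(double_encoded.hex())

# The eight bytes quoted from the hex dump, undone the same way in reverse.
dump_bytes = bytes.fromhex("756c69c383c2a872")
repaired = dump_bytes.decode("utf-8").encode("latin-1").decode("utf-8")
print(repaired)
```

The second byte pair, c2 a8, is the UTF-8 form of U+00A8 DIAERESIS, which would explain the lookup result mentioned above. The reverse step only works on data that was double-encoded in exactly this way, so it is worth testing on a copy of a few records before trusting it.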
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
On Wed, Mar 23, 2011 at 10:13 AM, Cindy Harper char...@colgate.edu wrote:

> Sorry to bother you all with it. Everyone's happy family is different, to hash a quote, but I hope I'm still welcome in Code4Lib, even if I'm not hired to be a library coder. Just a library (Windows) sys admin. Or maybe we need a spin-off code4lib for the amateurs among us.

I think Bill meant: why are you coming down here with us trolls when you're at such a nice place? You're quite welcome, although you certainly have my curiosity up about why you want to run php_yaz in the first place. You didn't have much in the way of details in your initial email. It might change some people's advice if you're not intending the system to be a long-term production system. (And I'm still curious what systems are even using php_yaz.)

Jon Gorman
Re: [CODE4LIB] code4lib 2011 announcements
Peter,

Thanks for letting us know. There is indeed a problem with the streaming links on the website. The URLs displayed on the page are correct, but they are linking to the wrong address. This should be fixed by the time the streamed sessions start tomorrow. Note that today's preconference sessions are not being streamed.

Jon
---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953

On 2/7/11 10:17 AM, Peter MacDonald pmacd...@hamilton.edu wrote:

> I am being challenged for a username and passphrase when I try to view the live stream at http://www.indiana.edu/~uits/code4lib/program/sessions.php
>
> How do we get them?
>
> Thanks, Peter
>
> Peter MacDonald
> Library Information Systems Specialist
> Hamilton College Library
> 315 859-4493

On Mon, Feb 7, 2011 at 9:34 AM, McDonald, Robert H. rhmcd...@indiana.edu wrote:

> Hi Everyone,
>
> Just a few announcements about Code4Lib 2011. We are so happy that so many could join us this week in Bloomington. We kicked off our pre-conferences today and are looking forward to an exciting week. I have a couple of announcements for this list about code4lib 2011.
>
> For all those not in attendance, we will be streaming the conference live. This will be done in 4 sessions (Day 1 morning, Day 1 afternoon, Day 2 all day, Day 3 till noon). To get access to the streams please go to: http://www.indiana.edu/~uits/code4lib/program/sessions.php
>
> Also, one of our sponsors for this year's code4lib 2011 conference, Elsevier, is hosting a code challenge that will take place from now until March 1, 2011. It is open to all, but I am trying to help them find those in the code4lib community who are interested in working with their brand new API for their SciVerse Suite. There are some cool prizes too. For more on this event, please see below.
>
> Thanks again for all of your suggestions that will make code4lib 2011 a conference to remember.
>
> Best, Robert
>
> ++
> Elsevier Code Challenge
> Building OpenSocial Apps for Libraries
>
> Elsevier is sponsoring a code challenge to build OpenSocial apps for Libraries. Using JavaScript and HTML5, Librarians can write customized apps for the SciVerse suite: ScienceDirect, Scopus and Hub. The Elsevier Challenge is now open for all librarians and coders and closes on March 1, 2011. The prizes for this code challenge are as follows:
>
> Prizes:
> 1. $1500 (Amazon gift card)
> 2. $1000 (Amazon gift card)
> 3. $500 (Amazon gift card)
>
> The next web is going to be based on apps that allow librarians and end users to customize their search needs. Elsevier's new SciVerse platform has extended Apache Shindig, the OpenSocial container, for apps to appear alongside search results, full-text articles and meta-data, which can be accessed through the SciVerse APIs. Using open APIs and open data, apps can mashup SciVerse content with third party data and services. The Elsevier Challenge at Code4Lib challenges librarians to build better tools and services on the SciVerse suite: ScienceDirect, Scopus and Hub and customize the search tools for their library and users.
>
> To register for the challenge, email challenge-regis...@elsevier.com
> For instructions: go to http://developer.sciverse.com/code4lib
> To get started: http://developer.sciverse.com/sdk
> For more information, email: challenge-i...@elsevier.com
>
> **
> Robert H. McDonald
> Associate Dean for Library Technologies and Digital Libraries
> Associate Director, Data to Insight Center-Pervasive Technology Institute
> Executive Director, Kuali OLE
> Indiana University
> Herman B Wells Library 234
> 1320 East 10th Street Bloomington, IN 47405
> Phone: 812-856-4834
> Email: rob...@indiana.edu
> Skype/GTalk: rhmcdonald
> AIM/MSN: rhmcdonald1
Re: [CODE4LIB] Registration website issues?
Yup, it's slow going. So far it seems that if you just keep hitting reload after the errors, it eventually gets through. It's keeping the information in session somehow. Of course, I'm on step 8 after 40 minutes, so I'm hoping I don't have to start over again.

Jon Gorman

On Mon, Dec 13, 2010 at 11:39 AM, Doran, Michael D do...@uta.edu wrote:

> Is anyone else having trouble connecting to the Code4Lib registration website (https://www.confmanager.com/main.cfm?cid=2375)? It took me about 15 minutes to get connected initially, now it's hanging after page 2 (of 9?).
>
> -- Michael
>
> # Michael Doran, Systems Librarian
> # University of Texas at Arlington
> # 817-272-5326 office
> # 817-688-1926 mobile
> # do...@uta.edu
> # http://rocky.uta.edu/doran/

-Original Message-
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Karen Coyle
Sent: Monday, December 13, 2010 9:51 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Announcing OLAC's prototype FRBR-inspired moving image discovery interface

Quoting Beacom, Matthew matthew.bea...@yale.edu:

> Sometimes I feel like we should all have the FRBR diagram tattooed on our arms so we can consult it any time anywhere. :-) With as complex a thing as a film--so many authors, images, music, dialog, acting, sets, costume, etc., etc., etc., applying the FRBR model is tough, and your implementation is quite sensible. However, I had a small question about one thing you said about FRBR not allowing language at the work level. That doesn't seem right to me. How could the language of a thing that is primarily or even partially a work made of language--like a novel or a motion picture with spoken dialogue--not be considered at the work level rather than at some other level?

Matthew, I can't answer how it is possible, but I can tell you that it is a fact: language is an attribute of Expression, not of Work. That's kind of the key meaning of frbr:Expression -- it is the Expression of the Work, and the Work doesn't exist until Expressed. So Work is a very abstract concept in FRBR. (Which is why more than one attempted implementation of FRBR that I have seen combines Work and Expression attributes in some way.)

Not only that, but Kelley's model uses something that I consider to be missing from FRBR: the concept of an original Expression. For FRBR (and thus for RDA) all expressions are in a sense equal; there is no privileged first or original expression. Yet there is evidence that this is a useful concept in the minds of users. Some recent user studies [1] around FRBR showed that this is a concept that users come up with spontaneously. Also, I can't think of any field of study where knowing what the original expression of a work was wouldn't be important.

> Because of the way we treat translations--not just in FRBR--as what FRBR calls expressions, not as new works, a translation from the original language to another would be considered an FRBR expression.

Could you explain this a bit more? The FRBR relationship "translation of" is an Expression-to-Expression relationship. (See my personal cheat sheet of RDA/FRBR relationships [2]).

kc

[1] http://www.asis.org/asist2010/abstracts/75.html
[2] http://kcoyle.net/rda/group1relsby.html

> Thank you. Matthew
>
> -Original Message- ... This also allowed us to get around some of the areas of more orthodox FRBR modeling that we found unhelpful. For example, FRBR doesn't allow language at the Work level, but we think it is important to record the original language of a moving image at the top level.

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Re: [CODE4LIB] Which O'Reilly books should we give away at Code4Lib 2011?
So somebody actually attempts to answer your question, here are some O'Reilly books that would probably be a good fit for the conference:

Search/Library related
- Ambient Findability
- Information Architecture for the World Wide Web
- Search Patterns

Conference/culture related
- Confessions of a Public Speaker
- Hackers & Painters
- Beautiful Code

Just plain fun
- A stack o' Make magazines?

I picked these somewhat from memory and somewhat from using my wand of serendipity. If more suggestions are needed, I can probably put more actual thought into it ;).

Jon Gorman

On Tue, Dec 7, 2010 at 8:31 PM, Kevin S. Clarke kscla...@gmail.com wrote:

> Hi all,
>
> If you have particular O'Reilly titles that you'd like for us to ask O'Reilly for, send them to me and I'll put them in our request.
>
> Thanks, Kevin
Re: [CODE4LIB] unwanted (bogus) characters in marc
There's something about this that's tugging at my memory and hints that the problem might not be quite what the error message says, as far as an invalid unicode character goes. I guess my first couple of questions:

1) What identifiers/records are you pulling? I didn't see any actual examples in your email. Can you construct the URL that the perl script is fetching and give it to us? I'd guess it's very likely the original marc record is goofed up due to some transforms. I've seen it from people doing really weird things to records as part of the submit process to IA.
2) You're sure that it is a unicode marc record and not marc-8, right?
3) What version is your MARC::Record module? Might want to upgrade if it's old; there have been some bug fixes.

Jon Gorman

On Thu, Oct 7, 2010 at 5:51 AM, Eric Lease Morgan emor...@nd.edu wrote:

> How do I trap for unwanted (bogus) characters in MARC records?
>
> I have a set of Internet Archive identifiers, and have written the following Perl loop to get the MARC records associated with each one:
>
>     # process each identifier
>     my $ua = LWP::UserAgent->new( agent => AGENT );
>     while ( <DATA> ) {
>
>         # get the identifier
>         chop;
>         my $identifier = $_;
>         print $identifier, "\n";
>
>         # get its corresponding MARC record
>         my $response = $ua->get( ROOT . "$identifier/$identifier" . "_meta.mrc" );
>         if ( ! $response->is_success ) {
>             warn $response->status_line;
>             next;
>         }
>
>         # save it
>         open MARC, "> $identifier.mrc" or die "Can't open $identifier.mrc: $!\n";
>         binmode MARC, ":utf8";
>         print MARC $response->content;
>         close MARC;
>
>     }
>
> I then use the venerable marcdump to see the fruits of my labors: marcdump *.mrc. Unfortunately, marcdump returns the following error against (at least) one of my files:
>
>     bienfaitsducatho00pina.mrc
>     utf8 \xC3 does not map to Unicode at /System/Library/Perl/5.10.0/darwin-thread-multi-2level/Encode.pm line 162.
>
> What is going on here? Am I saving my files incorrectly? Is the original MARC data inherently incorrect? Is there some way I can fix the MARC record in question?
>
> --
> Eric Lease Morgan
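One cheap way to "trap" a damaged file before handing it to a MARC parser is to compare the record length declared in the first five bytes of the leader with the number of bytes actually on disk. The Python sketch below does that on synthetic data; the helper names are made up, the fake record is not a real MARC record, and the idea that a text-mode output layer re-encoded the downloaded bytes is only one possible explanation for the error in this thread.

```python
RECORD_TERMINATOR = b"\x1d"


def looks_structurally_sound(raw):
    """Check the leader's declared length against the actual byte count."""
    if len(raw) < 24 or not raw[:5].isdigit():
        return False
    declared_length = int(raw[:5].decode("ascii"))
    return declared_length == len(raw) and raw.endswith(RECORD_TERMINATOR)


def make_fake_record(payload):
    """Synthetic stand-in: 5-digit length, payload, record terminator."""
    body = payload + RECORD_TERMINATOR
    return ("%05d" % (5 + len(body))).encode("ascii") + body


good = make_fake_record("particulière".encode("utf-8") + b"x" * 20)

# Simulate bytes being treated as Latin-1 text and written out as UTF-8.
mangled = good.decode("latin-1").encode("utf-8")

print(looks_structurally_sound(good))
print(looks_structurally_sound(mangled))
print(len(mangled) - len(good))
```

A file containing several records would need to be split on the declared lengths first. This check says nothing about whether the character data is correct, only whether the byte counts still add up.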