Re: [CODE4LIB] Formalizing Code4Lib?

2016-06-07 Thread Dunn, Jon William Butcher
Another example to look at is Open Repositories, which entered into an MOU with 
CLIR last year under which CLIR serves as "financial sponsor" for the OR conference series. In 
this model, CLIR does not bear the financial risk of the annual conference but 
essentially serves as a banker for any surplus generated. The host institution 
each year is the one that enters into contracts with hotels, etc., and bears 
the financial and legal risks of hosting, but there is an implied expectation 
that the funds held for OR by CLIR would be used to help cover a loss that 
occurs due to extraordinary circumstances.

Since, like Hydra and Code4Lib, OR does not exist as a legal entity, the MOU is 
between the OR Steering Committee and CLIR.

Jon

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Esmé 
Cowles
Sent: Tuesday, June 07, 2016 4:24 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Formalizing Code4Lib?

I don't think there is any Hydra legal entity (hence the need for a financial 
host), and the MOU is signed on behalf of the leadership committee.  So I think 
it boils down to being organized enough for the financial host to be 
comfortable entering into an agreement with them.

I can ask the people I know on the Hydra leadership committee to get more info 
on how the arrangement works.

-Esmé

> On Jun 7, 2016, at 4:19 PM, Jenn C <jen...@gmail.com> wrote:
> 
> This sounds like an intriguing option. What is "Hydra" that it is able 
> to enter into an MOU - is the steering group an incorporated entity?
> 
> On Tue, Jun 7, 2016 at 3:40 PM, Esmé Cowles <escow...@ticklefish.org> wrote:
> 
>> I remember another option being brought up: picking an official 
>> organizational home for C4L that would handle being the financial 
>> host for the conference, and possibly other things (conference 
>> carryover, scholarship fundraising, holding intellectual property, 
>> etc.).  An existing library non-profit might be able to do this without that 
>> much overhead.
>> 
>> For example, Hydra has a MOU with DuraSpace for exactly this kind of 
>> arrangement, and there was a post recently about renewing the 
>> arrangement for another year, including the MOU:
>> 
>> https://groups.google.com/d/msg/hydra-tech/jCua5KILos4/yRpOalF6AgAJ
>> 
>> In the past, there has been a great deal of resistance to making C4L 
>> more organized, especially over the amount of work needed to run a 
>> non-profit organization.  So having a financial host arrangement 
>> could be a lighter-weight option.
>> 
>> -Esmé
>> 
>>> On Jun 7, 2016, at 3:31 PM, Coral Sheldon-Hess 
>>> <co...@sheldon-hess.org>
>> wrote:
>>> 
>>> I think this deserves its own thread--thanks for bringing it up,
>> Christina!
>>> 
>>> I'm also interested in investigating how to formalize Code4Lib as an 
>>> entity, for all of the reasons listed earlier in the thread. I can't 
>>> volunteer to be the leader/torch-bearer/main source of energy behind 
>>> the investigation right now (sorry), but I'm happy to join any group 
>>> that
>> takes
>>> this on. I might be willing to *co*-lead, if that is what it takes 
>>> to get the process started.
>>> 
>>> And, yes, anyone who has talked to me or read my rants about the 
>>> proliferation of library professional organizations is going to 
>>> think my volunteering for this is really funny. But I think forming 
>>> a group to gather information gives us the chance to determine, as a 
>>> community, whether Code4Lib delivers enough value and has enough of 
>>> a separate identity to be worth forming Yet Another Professional 
>>> Organization (my
>> gut
>>> answer, today? "yes"), or whether we would do better to fold into, 
>>> or become a sub-entity of, some existing organization; or, 
>>> (unlikely) should Code4Lib stop being A Big International Thing and just do 
>>> regional stuff?
>>> Or some other option I haven't listed--I don't even know what all 
>>> the options are, right now.
>>> 
>>> One note on the "no, let's not organize" sentiment: the problem with 
>>> a
>> flat
>>> organization, or an anarchist collective, or a complete "do-ocracy," 
>>> is that the decision-making structures aren't as obvious to 
>>> newcomers, or
>> even
>>> long-term members who aren't already part of those structures. There 
>>> is value to formality, within reason. I mean... right now, I don't 
>>> know how
>> to
>>> go about getting &

Re: [CODE4LIB] SSL certificates and proxy servers

2016-02-17 Thread Gorman, Jon
> I want to make a plea too, not to fragment Code4Lib, but rather to consolidate
> EZProxy knowledge to post these queries to the EZProxy list.
> 
> For good, bad or indifferent, OCLC is putting together an EZProxy community
> wiki and for those EZProxy folks who come after you, who are not C4Lers, I ask
> that whatever info go there.

If we're going to go that far, why not also put it in the existing system? 
http://www.oclc.org/community/ezproxy.en.html? 

Honestly, I'm not expecting much from the wiki. I tried using the community 
resource as it is now in the past and have had errors, things disappearing, 
etc.  I think I may have put something up in the community site, but honestly, 
I'm probably never going to log in again if I don't have to. A lot of it is 
simply poor management and needless restrictions, which will be the same no 
matter what software they use.

This particular question is definitely a FAQ and someday I'll get around to 
trying to write up something and I'll put it up ... somewhere. Maybe even just put it up 
on GitHub and send out the link.

I don't see the harm in repeating info here.  I'm guessing folks who find this 
new information aren't already on ezproxy-l and won't be joining it. They're not 
likely to find it there either, since the ezproxy-l list doesn't seem very well exposed to 
searching.



> (@Jon, kind of looking at you because I worry that EZProxy expertise such as
> yours will get lost. I know it seems impossible, but one day we may all go on 
> to
> other work. I for one am looking forward to an exciting second career as a
> Starbucks barista; I hear my Master's degree will serve me well there ;-)

I'm guessing no matter where or how I put the information, people will still 
ask the questions :).  What I know about ezproxy comes a bit 
from the mailing list, in large part from just reading the OCLC documentation, and 
a little from ./ezproxy --help or whatever it is :).

I'll try to dump some of the info or create an FAQ one of these days, but it 
probably won't be today.

Or, of course, someone else could visit 
http://search.gmane.org/?query=wildcard=gmane.education.ezproxy and type 
in the search box wildcard and summarize the various emails on the topic ;).


Jon Gorman
Library IT
University of Illinois
217 244-4688


Re: [CODE4LIB] SSL certificates and proxy servers

2016-02-17 Thread Gorman, Jon
> Hi Code4Lib,
> We're looking into applying an SSL certificate to an EZproxy server and aren't
> sure exactly how a wildcard cert gets handled in that context.
> Anyone have experience with this?

Yup. 
 
> The fuzzy part is that we're not clear how wildcard certificates that handle
> subdomain matching (e.g., *.example.org) translate into wild-looking proxied
> domains (like search.whatever.com.proxy.example.org).

This depends a lot on the version of EZproxy.

Older versions of EZproxy look for a couple of things:

* proxy-by-hostname needs to be on (sounds like you have that)
* The wildcard MUST be in the CN, not a SAN. You'll likely want to put your 
login domain in a SAN, depending on levels.

Given those two things, when ezproxy sees that it has a wildcard in the CN, 
it'll change from using periods to hyphens.
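
To make the period-to-hyphen switch concrete, here is a minimal Python sketch. It is not EZproxy code: the helper names `wildcard_matches` and `hyphenate_proxied_host` are made up for illustration, the host names are the ones from the question, and the hyphenated form is an assumption about what the rewritten name looks like. The only point is that a wildcard is generally treated as covering a single leftmost label, so a dotted proxied name falls outside `*.proxy.example.org` while a hyphenated one fits.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as *.example.org covers hostname.

    The wildcard is treated as standing in for exactly one leftmost DNS
    label, which is how browsers generally apply it.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        # The wildcard label must match something non-empty.
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def hyphenate_proxied_host(origin_host: str, proxy_domain: str) -> str:
    """Illustrative only: collapse the origin host into one label with hyphens."""
    return origin_host.replace(".", "-") + "." + proxy_domain


dotted = "search.whatever.com.proxy.example.org"
hyphenated = hyphenate_proxied_host("search.whatever.com", "proxy.example.org")

print(wildcard_matches("*.proxy.example.org", dotted))      # too many labels
print(wildcard_matches("*.proxy.example.org", hyphenated))  # one label deep
```

Testing against a real instance, as suggested below, is still the way to confirm what your EZproxy version actually does.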

I think, although I can't remember for sure, at some point in 6.x this was 
fixed so a wildcard in a CN or SAN will work. I'd definitely verify that 
through testing though. 

A license of ezproxy should let you run a separate test instance on another 
machine. You can verify this by just creating a self-signed wildcard cert. 
You'll get a warning, but you should also see the ezproxy behavior change. I 
find dnsmasq can be helpful as well.  

So you'll want to get a wildcard cert for the one level of subdomain.  While 
you're at it, make sure it's a 2048 bit key and SHA-2. I've been seeing a lot 
of people running into problems with old 3 year certs that they've finally gotten 
around to putting into place.

 
> This might be more of an EZproxy config question and more appropriate to that
> list. There's also documentation
> <https://www.oclc.org/support/services/ezproxy/documentation/cfg/ssl.en.html>
> out there. But if anyone can comment on the process, whether the
> documentation was helpful to you, what sort of wildcard cert you got to
> address this problem, etc., we'd be interested to hear from you.

It's asked frequently enough that if I wasn't quite so lazy, I'd make it into 
the top FAQ question. The documentation was ok, but it's really not all that 
complicated. 



Jon Gorman
Library IT
University of Illinois
217 244-4688


[CODE4LIB] Job: Visiting Information System Analyst (3 year term)

2015-10-12 Thread Gorman, Jon
Visiting Information System Analyst (3 year term)
University of Illinois at Urbana-Champaign
Library IT - Infrastructure Management and Support (IMS)


Position Available: This is a 12 month, full-time Academic Professional 
position with the University Library's IT Infrastructure Management and Support 
team. It is a visiting, three-year appointment in the first instance, but there 
is a possibility of extension.

Duties and Responsibilities:   Supporting commercial, open-source, and locally 
developed Library applications; Managing service and application life-cycles; 
Working with project stakeholders and senior programming staff to gather and 
analyze requirements for projects, and design approaches to meeting project 
requirements; Working independently or as a member of a small team, 
taking responsibility for implementing the approved recommendations, especially for 
in-house development, but also for customization or integration of purchased 
and open source software; Applying best practices in various software 
development methodologies, including version control, automated testing and 
code refactoring, and leveraging appropriate programming frameworks and 
technical architectures to the requirements and proposed solutions.

Required Qualifications:

3 years of experience in the field and a Bachelor's Degree; or 1 year of 
experience in the field with a Computer Science Bachelor's Degree; Ability to 
work in a diverse environment; Demonstrable experience documenting systems and 
procedures; Programming and software development experience  in scripting and 
client-side applications languages, including one or more of Perl, Python, PHP, 
Visual Basic, and Java; Current and working knowledge of HTML, CSS and 
JavaScript; Familiarity with programming web applications; Working knowledge of 
relational database design principles; Ability to work independently or under 
only general direction; Strong communication skills.  See 
https://jobs.illinois.edu for Preferred Qualifications



Apply: To ensure full consideration, please complete your candidate profile at 
https://jobs.illinois.edu and upload a letter of interest, resume, and contact 
information including email addresses for three professional references. 
Applications not submitted through this website will not be considered. The 
University of Illinois conducts criminal background checks on all job 
candidates upon acceptance of a contingent offer. For questions, please call: 
217-333-8169.



DEADLINE: in order to ensure full consideration, applications must be received 
by November 16, 2015.
The U of I is an EEO Employer/Vet/Disabled www.inclusiveillinois.illinois.edu


Re: [CODE4LIB] Hours of Operation on Website - management tool

2015-07-01 Thread Jon Stroop

> Most of the rough edges are around some of the one-time administrative actions 
> like setting up new libraries, locations, and term schedules, although there’s 
> also some UI improvements in our near future.


FWIW, we just 'finished' a first pass at a little Rails engine around 
managing location data:


https://github.com/pulibrary/locations

It, too, is a little rough around the edges (esp wrt views) and has some 
site-specific stuff, like a gazillion 'location codes' to make it work 
with existing systems...but that's sorta why we built it.


-Jon

--
Jon Stroop
Application Development Manager
Princeton University Library
jstr...@princeton.edu



On 07/01/2015 11:35 AM, Chris Beer wrote:

Hi Ken,

We’ve recently been working on rebuilding an application for managing our 
hours. It’s Ruby on Rails, not-yet-in-production, full of rough edges, and has 
some Stanford-specific business logic, but it’s relatively simple and 
(probably) works for us:

https://github.com/sul-dlss/library_hours_rails/releases/tag/v0.0.1 

Currently, it’s envisioned as a backend service for staff to add and manage 
hours, with downstream consumers using the API to present the hours as 
appropriate. Our initial consumers include the main library website, our 
library catalog, and some other business process applications. We’ve also 
started thinking about embeddable HTML views of the hours to replace some of 
the clunky processing we’re currently doing in Drupal, but haven’t pursued that 
yet.

Interesting features include:

- JSON-API view of a location’s hours; (what I assume is a bespoke..) Drupal 
calendar feed; import and export for spreadsheets of hours;
- multiple library (and location-within-a-library) support;
- granular access control for updating hours; we have the notion of global 
hours administrators, but expect to also support library- and location-specific 
authorization, allowing library managers to set and update the hours for a 
subset of our locations [1];
- support for setting operating hours for a term and/or exceptions for 
particular days (e.g. holidays and the like) using an in-place editor;
- we have a notion of location-specific messages associated with exceptions to 
the normal schedule (e.g. the Art library is closed this week for Y), which can 
be reflected in applications that consume the library hours
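
As a rough illustration of how term hours, per-day exceptions, and exception messages could fit together for a downstream consumer, here is a minimal Python sketch. It is not taken from library_hours_rails (which is Ruby on Rails); the class names, field names, and payload shape are all assumptions made up for this example.

```python
import json
from dataclasses import dataclass, field
from datetime import date, time
from typing import Dict, Optional


@dataclass
class DayHours:
    opens: Optional[time]   # None means closed all day
    closes: Optional[time]
    message: str = ""       # e.g. a note attached to an exception


@dataclass
class Location:
    name: str
    term_hours: Dict[int, DayHours]  # keyed by weekday, 0 = Monday
    exceptions: Dict[date, DayHours] = field(default_factory=dict)

    def hours_for(self, day: date) -> Optional[DayHours]:
        """An exception for the exact date wins over the term schedule."""
        if day in self.exceptions:
            return self.exceptions[day]
        return self.term_hours.get(day.weekday())


def hours_payload(location: Location, day: date) -> dict:
    """Build a JSON-friendly dict a consuming application could render."""
    hours = location.hours_for(day)
    closed = hours is None or hours.opens is None or hours.closes is None
    return {
        "location": location.name,
        "date": day.isoformat(),
        "open": not closed,
        "opens": None if closed else hours.opens.strftime("%H:%M"),
        "closes": None if closed else hours.closes.strftime("%H:%M"),
        "message": hours.message if hours else "",
    }


art = Location(
    name="Art",
    term_hours={0: DayHours(time(9), time(17))},  # Mondays 9-5 during the term
    exceptions={date(2015, 7, 6): DayHours(None, None, "Closed this week")},
)
print(json.dumps(hours_payload(art, date(2015, 7, 13))))
print(json.dumps(hours_payload(art, date(2015, 7, 6))))
```

Keeping exceptions as a separate lookup keyed by date means the regular term schedule never has to be edited for holidays or one-off closures.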

Most of the rough edges are around some of the one-time administrative actions 
like setting up new libraries, locations, and term schedules, although there’s 
also some UI improvements in our near future.


Thanks,
Chris Beer
Digital Library Systems and Services
Stanford University Libraries


[1] Although I’m more interested in allowing any staff member to update the 
hours, and provide better notifications when a location’s hours change; that 
said, strong access control is much easier to reason about and codify..


On Jul 1, 2015, at 6:01 AM, Ken Irwin kir...@wittenberg.edu wrote:

Hi folks,

I'm hoping to find some sort of web-based app that can manage the library's 
hours of operations, including:

* Displaying today's hours

* Displaying an upcoming schedule of hours

* Updatable through a GUI by non-techy library staff

* Able to update our Google Places account hours (which, I note, 
currently lists our school-year hours as our open hours today), perhaps on a 
daily basis

* Preferably a stand-alone thing that can provide data on an ad hoc 
basis (as opposed to a CMS-specific thing like a WP plugin or a Drupal module)

* PHP preferred but not necessary

* OSS / free preferred but not necessary

I feel certain that someone else has already wanted this enough to create it. 
Anyone have a solution they're happy with?

Thanks
Ken


[CODE4LIB] Recommendations for places to advertise for a library systems guru?

2015-04-22 Thread Jon Gorman
Hi all,

I thought I'd ask folks what resources and places one could use to advertise
systems positions beyond some of the more traditional ones for libraries.

The more obvious seem to be LITA/ALA, here at Code4Lib, and perhaps some of
the other library organizations. Posting in newspapers in the area is also
a typical move for us.

But I'm also considering IEEE & ACM job listings and asking CS faculty for
recommendations.

I'm sure there's even more that I haven't thought of. So I'm curious about
other suggestions or ideas? Particularly are there any that have worked to
draw in candidates with a strong IT background?

Jon Gorman
University of Illinois


[CODE4LIB] Job: Ontology Engineer/Semantic Applications Developer, Cornell University Library

2015-03-09 Thread Jon Corson-Rikert
Ontology Engineer/Semantic Applications Developer, Cornell University Library


https://cornellu.taleo.net/careersection/10164/jobdetail.ftl?job=25577


Description


Join the team advancing open source, linked data initiatives for a world class 
academic research library on the beautiful Cornell University campus in Ithaca, 
New York. Ithaca has been named one of the top 100 places to live, a top 10 
recreation city, a best green place to live, and one of the “foodiest” towns in 
America.


Apply your experience and unique talents in Albert R. Mann Library as a senior 
level Ontology Engineer/Semantic Applications Developer on a team promoting 
innovation and quality in information technology. Develop and promote 
international standards and frameworks for scholarly content on the web, with 
frequent opportunities for engagement with the linked data and information 
science communities. Reinvent core library systems as networks of linked data 
connecting rich traditional library resources with diverse, distributed 
knowledge to meet the rapidly evolving needs of today’s researchers and 
students. Become the architect of compelling web applications and services to 
library, university, regional, and international projects engaged in 
disciplines ranging from earth systems and climate science to agricultural 
research in the developing world. Join the international team developing the 
open source VIVO-ISF ontology (https://github.com/vivo-isf/vivo-isf-ontology) 
and VIVO software (http://vivoweb.org and http://github.com/vivo-project/) and 
lead the creative application of VIVO technology at Cornell.


The Ontology Engineer/ Semantic Applications Developer will:

  *   Research, create, maintain, and extend ontologies, knowledge bases, and 
software tools in a distributed, linked data environment to support data 
integration, interoperability, usability, query, analysis, visualization, and 
dissemination
  *   Provide technical leadership in ontology selection and design including 
evaluation for consistency, modularity, efficiency, and reasoning; develop 
mechanisms for ontology versioning, community driven editing, and deployment in 
software applications in concert with local, national, and international 
collaborators
  *   Support dramatically increasing the production scale of semantic web 
applications
  *   Participate in and contribute to open source software communities
  *   Prepare strategic guidance, white papers, project proposals, 
visualizations, presentations, technical documentation, and reports
  *   Provide training and technical assistance to academic and professional 
staff at Cornell and partner institutions
  *   Contribute to the full range of information technology services provided 
by the Cornell University Library (may functionally supervise the work of 
others and lead project teams)

Qualifications

  *   Bachelor’s degree in library science, information science, computer 
science or other relevant discipline
  *   More than 5 years of relevant experience
  *   Experience designing and implementing OWL ontologies and other metadata 
standards
  *   Experience applying semantic web and linked data standards to real world 
applications
  *   Expert level Java programming; web application development experience in 
Java, Python, Ruby or similar language; knowledge of current database 
management systems, SQL, and non relational alternatives
  *   Excellent interpersonal and oral and written communication skills
  *   Evidence of ability to assess, analyze, plan, and solve problems 
creatively and collaboratively in a complex, rapidly-changing environment

Preferred Qualifications

  *   Familiarity with principles and methodologies for applying reasoning and 
rules
  *   Experience with agile software development, continuous integration, 
testing frameworks
  *   A track record of contributing to open source communities
  *   Experience with statistical programming and/or data mining
  *   Experience working in higher education or research

Background check may be required. No relocation assistance is provided for this 
position. Visa sponsorship is not available for this position.


Cornell University is an innovative Ivy League university and a great place to 
work. Our inclusive community of scholars, students and staff impart an 
uncommon sense of larger purpose and contribute creative ideas to further the 
university’s mission of teaching, discovery and engagement. Located in Ithaca, 
NY, Cornell's far-flung global presence includes the medical college's campuses 
on the Upper East Side of Manhattan and Doha, Qatar, as well as the new Cornell 
Tech campus to be built on Roosevelt Island in the heart of New York City.


Diversity and Inclusion are a part of Cornell University’s heritage. We’re an 
employer and educator recognized for valuing AA/EEO, Protected Veterans, 

[CODE4LIB] Job: Semantic Applications and Linked Data Developer at Cornell University Library

2014-09-24 Thread Jon Corson-Rikert
Semantic Applications and Linked Data Developer
Albert R. Mann Library
Cornell University, Ithaca, NY 14853
Job Posting #25577: http://goo.gl/Oz1PdD

Cornell University’s Mann Library IT Team is seeking a senior-level Semantic 
Applications and Linked Data Developer who will apply innovative knowledge 
representation techniques and tools to library, university, regional, and 
international projects. Mann Library is a friendly and collaborative workplace 
where flexible, thoughtful and self-motivated librarians and IT staff engage 
together in a portfolio of projects encompassing climate science, international 
agricultural knowledge sharing, GIS and data visualization, and transitions 
from library catalogs to linked data. Join the team developing the VIVO 
software (vivoweb.org and github.com/vivo-project) and VIVO-ISF ontology 
through an international consortium of universities, research institutions, 
government agencies, and non-profits, promoting rich semantic interconnectivity 
among researchers, activities, and outputs anywhere in the world.

Responsibilities include:

  *   Researching, synthesizing, and applying the most appropriate knowledge, 
technologies and tools to improve data harvesting, integration, 
interoperability, and dissemination to consuming websites and services
  *   Active participation with local, national, and international 
collaborators on ontology development, standards initiatives, and open source 
software
  *   Contributing to the full range of information technology services 
provided by the Cornell University Library (may functionally supervise the work 
of others and lead project teams)
  *   Preparing strategic guidance, white papers, project proposals, 
visualizations, presentations, technical documentation, and reports influencing 
technology development practices and policies and having broad impact within 
and beyond the University.
  *   Providing training and guidance to academic and professional staff at 
Cornell and partner institutions

Required Qualifications:

  *   Bachelor’s degree in an information science (library science, information 
science, computer science or equivalent) or other relevant discipline and more 
than 5 years of relevant experience
  *   Experience applying Semantic Web and Linked Data standards (RDF, SPARQL, 
and OWL) to real-world applications; expert-level Java programming; web 
application development experience in Java, Python, Ruby or similar language; 
web presentation layer experience with HTML5, CSS, JavaScript, and JSON; 
knowledge of current database management systems, SQL, and non-relational 
alternatives
  *   Excellent interpersonal and oral and written communication skills and a 
strong, user-centered service orientation
  *   Evidence of ability to assess, analyze, plan, and solve problems 
creatively and collaboratively in a complex, rapidly-changing environment

Preferred Qualifications:

  *   Experience designing, programming, and deploying creative and effective 
web applications
  *   Experience working with metadata standards, ontologies, and thesauri
  *   Familiarity with data interchange standards, reasoning, rules, XML and 
XSLT, and with statistics, data mining, text mining libraries, algorithms, and 
applications
  *   Experience with Agile Software Development methodologies, with UNIX shell 
scripting, log analysis, and scheduling
  *   Experience contributing source code, ontologies, testing, documentation, 
and/or support to open source communities
  *   Experience working in higher education or research

Background check may be required. No relocation assistance is provided for this 
position. Visa sponsorship is not available for this position.

Cornell University is an innovative Ivy League university and a great place to 
work. Our inclusive community of scholars, students and staff impart an 
uncommon sense of larger purpose and contribute creative ideas to further the 
university's mission of teaching, discovery and engagement. Located in Ithaca, 
NY, Cornell's far-flung global presence includes the medical college's campuses 
on the Upper East Side of Manhattan and Doha, Qatar, as well as the new Cornell 
Tech campus to be built on Roosevelt Island in the heart of New York City.

Diversity and Inclusion are a part of Cornell University’s heritage. We’re an 
employer and educator recognized for valuing AA/EEO, Protected Veterans, and 
Individuals with Disabilities.




Re: [CODE4LIB] Library Privacy, RIP (Was: Canvas Fingerprinting by AddThis)

2014-08-15 Thread Jon Goodell
I don't believe the horse has left the barn forever. As Bruce Schneier
says, security is a process, not a product. And as we learn more about this
space we can advocate in our own institutions for greater awareness and
perhaps adjustments to the technologies we use to evaluate online activity.
AddThis and ShareThis probably have limited value for the data they
compromise. Google Analytics is probably a much better trade. EZproxy too...

Jon


On Fri, Aug 15, 2014 at 2:07 PM, Eric Hellman e...@hellman.net wrote:

 On Aug 14, 2014, at 4:32 PM, William Denton w...@pobox.com wrote:

  At the university where I work Google Analytics is the standard, and we
 use it on the library's web site.  There's probably no way around
 that---but we can tell people how to block the tracking, which will help
 them locally (ironically) and everywhere else.  (I use Piwik at home, and
 like it, but moving to that here would be a long-term project, only partly
 for technical reasons.)

 I think a reasonable place to draw a line in the sand is use for
 advertising. If you look at the Google Analytics site, it doesn't appear
 that they can use Analytics tracking for advertising, because they don't
 make the carve-outs for children that I believe would be required if they
 did. So if you trust google, and assume they know everything anyway, you
 can let them track users.

 AddThis and ShareThis, on the other hand, have TOS that let them use
 tracking for advertising, and that's what their business is. So,
 hypothetically, a teen could look at library catalog records for books
 about childbirth, and as a result, later be shown ads for pregnancy tests,
 and that would be something the library has permitted.

 A criminal prosecutor could subpoena either Google or AddThis/ShareThis to
 obtain tracking data for anyone in your library who had read books about
 Nazism or the Black Panthers or witchcraft,  completely without involving
 the library. Do you think Google would easily comply with that sort of
 request? Would AddThis? Would EBSCO?

 At Unglue.it, we use Google Analytics, but we have avoided things like
 Facebook Like, and the third party shares because we didn't like the
 tradeoff.

 But maybe the horse has left the barn forever.

 Eric



Re: [CODE4LIB] very large image display?

2014-07-25 Thread Jon Stroop

Jonathan,

We use an image server I wrote, Loris, plus OpenSeadragon. Here's an 
example:


http://libimages.princeton.edu/osd-demo/?feedme=pudl0123%2F8172070%2F01%2F0001.jp2

That image is 152500 x 4000 px:

http://libimages.princeton.edu/loris/pudl0123%2F8172070%2F01%2F0001.jp2/info.json

Loris is on Github: https://github.com/pulibrary/loris
as is OpenSeadragon: https://github.com/openseadragon/openseadragon

More generally, this is one of many problems IIIF (International Image 
Interoperability Framework) exists to try to solve. You might want to 
check out our site, which has links to other tools as well: http://iiif.io/
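
For anyone who wants to see what a tile request looks like, here is a minimal Python sketch of building IIIF Image API URLs of the form base/identifier/region/size/rotation/quality.format. The base URL and identifier are the ones from the links above; the function names, the 1024 px tile size, and the `default` quality keyword are assumptions for illustration (the quality keyword in particular differs between versions of the Image API, so check the server's info.json).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation=0, quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL.

    The identifier is percent-encoded so that slashes inside it are not
    mistaken for path separators.
    """
    ident = quote(identifier, safe="")
    return f"{base}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def tile_regions(width, height, tile=1024):
    """Yield 'x,y,w,h' regions that cover the full image in tile-sized pieces."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            w = min(tile, width - x)
            h = min(tile, height - y)
            yield f"{x},{y},{w},{h}"


base = "http://libimages.princeton.edu/loris"
identifier = "pudl0123/8172070/01/0001.jp2"

# One tile from the top-left corner, scaled to 512 px wide.
print(iiif_image_url(base, identifier, region="0,0,1024,1024", size="512,"))

# Regions for a small hypothetical 2500 x 1000 px image.
print(list(tile_regions(2500, 1000)))
```

A viewer such as OpenSeadragon works out these regions itself from info.json; the sketch only shows the shape of the requests it ends up making.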


Hope this helps,
-Jon

On 07/25/2014 11:36 AM, Jonathan Rochkind wrote:

Does anyone have a good solution to recommend for display of very large images 
on the web?  I'm thinking of something that supports pan and scan, as well as 
loading only certain tiles for the current view to avoid loading an entire 
giant image.

A URL to more info to learn about things would be another way of answering this 
question, especially if it involves special server-side software.  I'm not sure 
where to begin. Googling around I can't find any clearly good solutions.

Has anyone done this before and been happy with a solution?

Thanks for any info!

Jonathan


Re: [CODE4LIB] iiif compatible servers

2014-07-25 Thread Jon Stroop
Eric,
FWIW, an HTTP resolver that could be used with Fedora has been a big topic for 
Loris recently, and a few of us are trying to spec out what that would look 
like.

The discussion/proposal is here: https://github.com/pulibrary/loris/issues/98 
and spreads to a few other linked issues. I'd be happy to hear what you think. 

-Jon

Sent from my mobile.  Please excuse typos. 

-Original Message-
From: James, Eric eric.ja...@yale.edu
To: CODE4LIB@LISTSERV.ND.EDU
Sent: Fri, 25 Jul 2014 17:39
Subject: [CODE4LIB] iiif compatible servers

Looking to implement an iiif-compatible server, primarily for jp2s in fcrepo3.

Just read the 'very large image display?' thread and looking at the 
http://iiif.io/technical-details.html, it appears options include:

loris: https://github.com/pulibrary/loris
IIP: http://iipimage.sourceforge.net/documentation/server/
djatoka iiif: ( https://github.com/jronallo/djatoka)

The iiif djatoka gem immediately caught my eye as I've implemented djatoka w/ 
fcrepo3 in a previous project, but am interested if there are any opinions in 
choosing any one of these over another.

Thanks,
Eric

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Esmé Cowles 
[escow...@ticklefish.org]
Sent: Friday, July 25, 2014 4:44 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] very large image display?

We previously used the Zoomify Flash applet, but now use Leaflet.js with the 
Zoomify tileset plugin:

https://github.com/turban/Leaflet.Zoomify

One thing I like about this approach is that it minimizes the amount of 
Javascript code the clients have to load, since we use Leaflet.js for our maps 
and it's already loaded.

-Esme

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
 Jonathan Rochkind
 Sent: Friday, July 25, 2014 10:36 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] very large image display?

 Does anyone have a good solution to recommend for display of very large 
 images on the web?  I'm thinking of something that supports pan and scan, as 
 well as loading only certain tiles for the current view to avoid loading an 
 entire giant image.
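
 As a rough illustration of the "only certain tiles" idea: given a viewport in image pixel coordinates, the viewer only needs the tiles that overlap it. A minimal sketch, assuming square tiles on a simple grid and non-negative integer coordinates; real tile schemes (Zoomify, IIIF, etc.) add zoom levels and their own naming on top of this:

```python
def visible_tiles(view_x, view_y, view_w, view_h, tile_size=256):
    """Return (col, row) indices of the tiles that overlap the viewport."""
    if view_w <= 0 or view_h <= 0:
        return []
    first_col = view_x // tile_size
    last_col = (view_x + view_w - 1) // tile_size
    first_row = view_y // tile_size
    last_row = (view_y + view_h - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 600x300 viewport at offset (300, 300) touches 6 tiles, not the whole image.
print(len(visible_tiles(300, 300, 600, 300)))
```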

 A URL to more info to learn about things would be another way of answering 
 this question, especially if it involves special server-side software.  I'm 
 not sure where to begin. Googling around I can't find any clearly good 
 solutions.

 Has anyone done this before and been happy with a solution?

 Thanks for any info!

 Jonathan


Re: [CODE4LIB] LC Call # splitting/sorting scripts?

2014-07-11 Thread Jon Stroop

This?

https://code.google.com/p/library-callnumber-lc/

On 07/11/2014 12:01 PM, Robert Dumas wrote:

Hey all:

Does anyone know of any scripts (preferably in Ruby or Python) which can slice 
up an LC call number and sort a table of items by LC call number?
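
For a sense of what such a script involves, here is a minimal Python sketch of splitting a call number and building a sort key. The regular expressions are a simplification I am assuming for illustration; they handle common shapes like "QA76.9 .B35 1999" but not every LC edge case, so a maintained library such as the one linked above is probably the safer choice for production use.

```python
import re

# Rough pattern: class letters, optional class number, then everything else.
LC_PATTERN = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)?\s*(.*)$", re.IGNORECASE)
# A cutter is a letter followed directly by digits, e.g. ".B35".
CUTTER = re.compile(r"\.?\s*([A-Z])(\d+)", re.IGNORECASE)


def lc_sort_key(call_number):
    """Split an LC call number into a tuple that sorts in shelf order."""
    match = LC_PATTERN.match(call_number)
    if not match:
        return ("", 0.0, [], call_number.upper())
    letters, number, rest = match.groups()
    # Cutter digits are decimal fractions, so they compare as strings (B35 < B4).
    cutters = [(letter.upper(), digits) for letter, digits in CUTTER.findall(rest)]
    return (letters.upper(), float(number or 0), cutters, rest.upper())


items = ["QA76.9 .B4", "QA76.73 .P98", "Q 11 .A1", "QA9 .Z5", "QA76.9 .B35 1999"]
for call_number in sorted(items, key=lc_sort_key):
    print(call_number)
```

Sorting a table of items is then a matter of passing `lc_sort_key` applied to the call-number column as the sort key.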
  


Re: [CODE4LIB] Book Club software tools and approaches?

2014-07-03 Thread Jon Gorman
I like the Google Drive Form idea.  Might be able to do that or some
variation. Thanks!


Jon Gorman


On Tue, Jul 1, 2014 at 3:39 PM, Matt Cordial matt.cord...@gmail.com wrote:

 We've been using a G+ community for event announcements and discussions.
 It's been fine. We're pretty small so we don't need a lot in terms of
 management.

 https://plus.google.com/communities/113393567679559625537

  For voting, we've used both a Google Drive Form and simply a discussion
 thread.


 On Tue, Jul 1, 2014 at 6:38 AM, Jon Gorman jonathan.gor...@gmail.com
 wrote:

  Hi all,
 
  I've been musing on software tools that might be useful for book clubs.
 
  I'm not necessarily looking for a turnkey solution explicitly geared
  towards book clubs, but more a thought experiment of what tools might be
  useful for an ongoing, real-world book club.
 
  Some needs that software tools might help keep track of:
 
  * A way to vote for what books to read next
  * Schedule of times
  * An estimator calculator (reading level of book + length of book,
  estimated sessions).
  * way to add notes or linked materials
  * online discussions to supplement in person meetings
  * glossary/dictionary functionality perhaps?
 
  In my own thoughts some of the online services like GoodReads, Shelfari and
  LibraryThing seem to at least offer some tools and information. A system
  that I haven't had a chance to explore enough, Loomis, might help with the
  decision making parts.
 
 
  Part of the impetus for this is I've recently joined a technical book club.
  At the moment we're using a wiki, which is working fine, but in particular
  the voting is clunky.  I could picture something where members can add/link
  to something like librarything in a list and the book with the most votes
  (w/ ties being broken randomly) is the next book in the queue.
 
  So anyone out there already doing something similar? Thoughts? Ideas?
 
  Jon Gorman
  University of Illinois
 



[CODE4LIB] Book Club software tools and approaches?

2014-07-01 Thread Jon Gorman
Hi all,

I've been musing on software tools that might be useful for book clubs.

I'm not necessarily looking for a turnkey solution explicitly geared
towards book clubs, but more a thought experiment of what tools might be
useful for an ongoing, real-world book club.

Some needs that software tools might help keep track of:

* A way to vote for what books to read next
* Schedule of times
* An estimator calculator (reading level of book + length of book,
estimated sessions).
* way to add notes or linked materials
* online discussions to supplement in person meetings
* glossary/dictionary functionality perhaps?

In my own thoughts some of the online services like GoodReads, Shelfari and
LibraryThing seem to at least offer some tools and information. A system
that I haven't had a chance to explore enough, Loomis, might help with the
decision making parts.


Part of the impetus for this is I've recently joined a technical book club.
At the moment we're using a wiki, which is working fine, but in particular
the voting is clunky.  I could picture something where members can add/link
to something like librarything in a list and the book with the most votes
(w/ ties being broken randomly) is the next book in the queue.
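
The voting rule itself is small enough to sketch. This is only an illustration of "most votes wins, ties broken randomly", assuming each vote is just a book title; the function name is made up and it is not tied to any of the services mentioned above:

```python
import random
from collections import Counter


def next_book(votes, rng=random):
    """Return the title with the most votes; ties are broken randomly."""
    if not votes:
        return None
    tally = Counter(votes)
    top = max(tally.values())
    # Every title sharing the top count is a candidate.
    leaders = sorted(title for title, count in tally.items() if count == top)
    return rng.choice(leaders)


print(next_book(["SICP", "TAOCP", "SICP", "Refactoring"]))
```

Passing a seeded `random.Random` instance as `rng` makes the tie-break repeatable if the club wants to audit a result.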

So anyone out there already doing something similar? Thoughts? Ideas?

Jon Gorman
University of Illinois


Re: [CODE4LIB] College Question!

2014-05-29 Thread Jon Stroop

Riley,

First, I wonder if there's anyone on this list who doesn't wish they had 
your foresight! You already have a rare opportunity in that you're 
thinking about this now and not in your mid-20s, so way to go!


We spoke about this a little @ the c4l conference, but I'll say more. I 
majored in music performance and even did a masters in it as well, which 
means that practically speaking I have a high school education. :-) I 
don't really mean that, but until you've had the experience it's 
difficult to explain (or at least I find it difficult) how relevant a 
degree in the arts/humanities can be to a job in technology--and there's 
no shortage of people who have taken this exact path.


I did do an MLS, but see above re: high school education. At the time 
(~13 yrs ago) I felt like I needed to do it to get a job (I also didn't 
necessarily expect to wind up in systems, but that's another story), 
but, honestly, everything I know I learned on the job, or /a/ job, or 
the overnight hours between going to said job, which leads me to my 
point: Wherever you go to school, and regardless of your major, if you 
ultimately want to wind up working in a library, you should start now. 
Any brick and mortar university is going to have student jobs available 
(work study or otherwise) at the library, and while it may just be as a 
desk clerk or whatever, keep your ears open (we already know you're not 
shy): at some point there's going to be some stats that need munging, 
some Access (or even worse) database that needs migration, some web work 
to be done, or whatever and, et voilà, you're off!


The point is, professional degree != professional experience, 
and--frankly--you probably don't want to be working at a place that 
requires a systems librarian to have an MLIS anyway, and certainly not 
in 4-5 years. Get as much experience as possible, do a CS degree, but 
also learn how to write and communicate OR do an arts degree, but also 
learn how to program (etc.), and you'll be fine.


-Jon

On 05/28/2014 11:17 PM, Riley Childs wrote:

I was curious about the type of degrees people had. I am heading off to college 
next year (class of 2015) and am trying to figure out what to major in. I want 
to be a systems librarian, but I can't tell what to major in! I wanted to hear 
about what paths people took and how they ended up where they are now.

BTW Y'All at NC State need a better tour bus driver (not the c4l tour, the 
admissions tour) ;) the bus ride was like a rickety roller coaster...   

Also, if you know of any scholarships please let me know ;) you would be my BFF 
:P


Riley Childs
Student
Asst. Head of IT Services
Charlotte United Christian Academy
(704) 497-2086
RileyChilds.net
Sent from my Windows Phone, please excuse mistakes


[CODE4LIB] New IIIF API specifications drafts published

2014-05-29 Thread Jon Stroop
The IIIF Editors are pleased to announce draft revisions of the 
International Image Interoperability Framework Image and Presentation 
(formerly 'Metadata') API specifications.


 * http://iiif.io/api/image/2.0/
 * http://iiif.io/api/presentation/2.0/

These releases reflect a significant amount of input from both the IIIF 
working groups and the larger library, archives, and museum communities 
following roughly a year of experience either implementing or 
experimenting with the previous versions.


A complete list of the changes can be found on the IIIF website:

 * http://iiif.io/api/image/2.0/change-log.html
 * http://iiif.io/api/presentation/2.0/change-log.html

We welcome your feedback, questions, and use cases, and encourage you to 
submit them to the IIIF Discussion Listserv: 
iiif-disc...@googlegroups.com. Drafts will be kept open for comment 
until the beginning of August, with the goal of final release in 
September. However, we would appreciate feedback early in order to work 
on and gain consensus for any necessary changes.


Sincerely,

The IIIF Image and Presentation API Editors:
Benjamin Albritton
Michael Appleby
Robert Sanderson
Stuart Snydman
Jon Stroop
Simeon Warner

--
Jon Stroop
Digital Initiatives Developer/Analyst
Princeton University Library
jstr...@princeton.edu


Re: [CODE4LIB] Call for Old Conf Tshirt Logos

2014-04-11 Thread Jon Gorman
I'll try to do some digging as well

Jon Gorman


On Fri, Apr 11, 2014 at 9:38 AM, Lisa Rabey academichu...@gmail.com wrote:

 On Fri, Apr 11, 2014 at 8:30 AM, Francis Kayiwa fkay...@colgate.edu
 wrote:
 
  +1
 
  Go for it Lisa!
 
  ./fxk


 I can start digging into the hows/whys sometime in early May and
 report back. If anyone has anything of interest (past C4L list convos,
 recommendations, etc), pass them along!


 --

 Lisa M. Rabey | @pnkrcklibrarian

 
 http://exitpursuedbyabear.net | http://lisa.rabey.net



Re: [CODE4LIB] Call for Old Conf Tshirt Logos

2014-04-10 Thread Jon Gorman
I've long thought a friends of code4lib would be a useful organization, but
never quite pulled it together...
On Apr 10, 2014 10:41 PM, Tom Cramer tcra...@stanford.edu wrote:

  Is Blacklight a 501c3?

 Nope. Just an OSS project with lots of contributors from awesome places : )

 Off the top of my head, and in alphabetical order, the obvious (to me)
 ones in this space that might be candidates are DuraSpace and Lyrasis.

 In time, DP.LA seems like a great possible candidate, though it is
 US-centric, I'm unsure of its corporate status (though they do seem to be
 able to cash and sign checks), and right now they might view C4L as a
 distraction more than an asset or timely alliance. (Others on this list
 might be in a better position to comment, ahem...)

 I'm sure I'm leaving out other possibilities.

 - Tom




  Riley Childs
  Student
  Asst. Head of IT Services
  Charlotte United Christian Academy
  (704) 497-2086
  RileyChilds.net
  Sent from my Windows Phone, please excuse mistakes
  
  From: Roy Tennant roytenn...@gmail.com
  Sent: 4/10/2014 11:25 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos
 
  We should probably toss out some ideas before approaching anyone. Getting
  the right fit would be important. Which 501(c)3's in our space do we
 think
  we may want to approach about being our fiscal agent? Maybe we should
  collect a list of suggestions and then (natch) vote on who to approach?
 We
  could then go down the list until we got a yes.
  Roy
 
 
  On Thu, Apr 10, 2014 at 8:21 PM, Riley Childs rchi...@cucawarriors.com
 wrote:
 
  That might be a better idea than a fully independent code4lib
 organization.
 
  Riley Childs
  Student
  Asst. Head of IT Services
  Charlotte United Christian Academy
  (704) 497-2086
  RileyChilds.net
  Sent from my Windows Phone, please excuse mistakes
  
  From: Tom Cramer tcra...@stanford.edu
  Sent: 4/10/2014 11:20 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos
 
  What about approaching one of the existing 501c3's in our space to see
 if
  they might be interested in and able to take this on for the community?
 
  In addition to shirt revenues and yacht maintenance fees, it would be
 good
  to have an agency that could help do banking for scholarships, and
 perhaps
  pay forward any surpluses from one year's conference to the next year's
  hosts.
 
  - Tom
 
 
 
  On Apr 10, 2014, at 8:10 PM, Riley Childs wrote:
 
  No, I think it should go toward my yacht ;P.
  In all seriousness, code4lib needs an entity, simply to collect money
  for this sorta thing. LegalZoom anyone? ;)
 
  Riley Childs
  Student
  Asst. Head of IT Services
  Charlotte United Christian Academy
  (704) 497-2086
  RileyChilds.net
  Sent from my Windows Phone, please excuse mistakes
  
  From: Alicia Cozine ali...@curationexperts.com
  Sent: 4/10/2014 11:07 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos
 
  Could one of the scholarship sponsors adopt this as a way to fund
 future
  conference scholarships?
 
  Alicia
 
  On Apr 10, 2014, at 9:53 PM, Roy Tennant roytenn...@gmail.com wrote:
 
  That's good on the tax front, but it would be nice if eventually we
  could
  find a way to make money to help out with the conference. But that
 will
  take an organization, and so far we've avoided that.
  Roy
 
 
  On Thu, Apr 10, 2014 at 5:42 PM, Riley Childs 
 rchi...@cucawarriors.com
  wrote:
 
  It is running through Spreadshirt, set up with 0% commissions, so no
  monies
  are being collected. I think as long as I don't collect any money, we
  should be good.
 
  Riley Childs
  Junior
  IT Admin
  email: rchi...@cucawarriors.com
  office: +1 (704) 537-0031 x101
  cell: +1 (704) 497-2086
 
  Please Think Before Hitting Reply All
  I Do Web Design! RileyChilds.net/services
  
  From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Cary
  Gordon [listu...@chillco.com]
  Sent: Thursday, April 10, 2014 8:27 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Call for Old Conf Tshirt Logos
 
  I hope that the IRS doesn't put a lien on the yacht he buys with the
  proceeds.
 
  You're right, though. Probably better if some organization or
  institution
  could step up. Again we are slightly challenged by our state of
  non-organization.
 
  Cary
 
  On Apr 10, 2014, at 4:36 PM, Roy Tennant roytenn...@gmail.com
 wrote:
 
  I think one of the things that has held us back about the store in
 the
  past
  was the lack of a fiscal agent. That is, someone is going to be
  taking in
  money on behalf of Code4Lib (presumably), but where does it go?
 Since
  we
  have no organization we have no fiscal presence. No bank 

[CODE4LIB] Newcomer dinner - Pit Group 4

2014-03-24 Thread Jon Stroop

Group 4 for The Pit:
It seems like there will be a sizable exodus from the conf hotel to the 
restaurant around 6 PM, so let's plan to meet then or shortly before in 
the lobby so that we can get ourselves organized. I'll find a way to 
make myself known to you.

-Jon


Re: [CODE4LIB] Book scanner suggestions redux

2014-03-19 Thread Goodell, Jon
Great points, Jason! We have run into the same issue with Windows 7 drivers on 
our ILL scanner here. 

Jon Goodell, MA, AHIP
UAMS Reference and Outreach Librarian
501-526-5641, jgood...@uams.edu

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of jason 
bengtson
Sent: Wednesday, March 19, 2014 6:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Book scanner suggestions redux

It's interesting to me to see people question the long term viability of an 
open source project. Not because it isn't a valid concern, but because, 
especially with scanners, the same issue arises with the commercial stuff. Just 
recently we have had to do some finagling with two very expensive ILL scanners 
so that we can isolate them from the network. Minolta doesn't make any Windows 
7 or later drivers for them (nor does anyone else), effectively anchoring them 
to XP. I've seen this a few times now with scanners (probably because they tend 
to be longer term investments than other peripherals). The same seems to happen 
a lot with medical imaging devices. If I were a cynic I might suspect that 
Minolta and friends wanted to ensure turnover. I'm viewing the current 
situation as a stopgap until we can look at replacing the scanners, but, when 
we do that, I intend to move forward on much lower-priced alternatives. Given 
that, for a variety of reasons, we're pretty much a Windows shop, and given 
what seems to be the increasing pace of Windows releases, I feel like we 
have to consider that our scanners will have a highly indeterminate 
but likely limited shelf life. It's too bad . . . some company could probably 
do well by creating and selling third party drivers for some of these old 
imaging machines.

Best regards,

Jason Bengtson, MLIS, MA
Head of Library Computing and Information Systems Assistant Professor, Graduate 
College Department of Health Sciences Library and Information Management 
University of Oklahoma Health Sciences Center 405-271-2285, opt. 5405-271-3297 
(fax) jason-bengt...@ouhsc.edu http://library.ouhsc.edu www.jasonbengtson.com

On Mar 19, 2014, at 5:50 AM, Johannes Baiter johannes.bai...@gmail.com wrote:

 Hi all,
 
 spreads developer chiming in here :-)
 
 @Cindy:
 
 I'm curious - how does the shooting time per page compare to something like
 a Minolta PS7000? We've got an old PS7000, but my experience with the one
 I've used before was that it took sooo long to shoot each page.  Also, the
 PS7000 model didn't accommodate a bound volume that wouldn't open flat all
 that well.  Would this be an improvement over that?
 
 
 With my Canon A2200s I can currently shoot at 1400-1500 pages per 
 hour, although the bottleneck is probably my lifting the 
 cradle/flipping the pages.
 
 @Aaron:
 
 It seems like the software piece is a big variable with the DIYBookScanner.
 It's interesting to hear about various setups, I just wonder about the
 long(ish) term viability of some of these open source projects. Obviously,
 the software is essential for an efficient system and I'm not sure we're
 interested in building/maintaining our own suite of tools.
 
 
 
 While I can't give any guarantees, I'm very optimistic that I'll 
 continue development for the foreseeable future.
 I'm very passionate about the software and the project (DIYBookScanner) 
 as a whole and my list of things I'd like to do in the software should 
 probably suffice for at least the next two years :-) And even in the 
 case that I should be hit by a bus, I've tried to make the code as 
 clear and idiomatic as possible, so an experienced Python developer 
 should be able to get up to speed pretty quickly.
 
 Additionally, as Raffaele already mentioned, spreads is very modular, 
 you can add your own functionality very easily through the Plugin API.
 
 If you are playing with the thought of using spreads in your 
 institution, please drop me a message, I would love to hear about your 
 workflow and what kinds of things you'd like the software to do.
 
 All the best,
 Johannes


Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-25 Thread Jon Phipps
On Fri, Jan 24, 2014 at 12:19 PM, Edward Summers e...@pobox.com wrote:

 Luckily nobody’s really using it ; so it’s not a huge problem :-D


Gee, thanks Ed. :-)

Jon


Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-25 Thread Jon Phipps
On Fri, Jan 24, 2014 at 11:16 AM, Robert Sanderson azarot...@gmail.com wrote:

  All in my opinion, and all debatable. I hope that your choice goes well
  for
   you,
 
  I'd like to repeat: just because I agree with that choice, and I'm
  defending it here, it wasn't my choice. Not at all. And the concerns you
  express were well-aired and very carefully considered before the choice
 was
  made.
 

 And yours :)



Ok, that makes me feel a bit personally defensive...

I just want to be sure that it's clear that while I agree with my client,
as a developer I'm not happy with opaque URIs for predicates any more than
you are. The defaults that I've written into the Open Metadata Registry for
coining URIs are: opaque numeric for value vocabularies, and camel-casing
of the label in the default language of the vocabulary for predicate
vocabularies. I think that's the way it should usually work -- my personal
best practice.
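
As a rough sketch of those two defaults (this is my illustration, not the Open Metadata Registry's actual code; the base URIs and function names are hypothetical):

```python
import re


def camel_case(label):
    """Turn a human-readable label into a lowerCamelCase token."""
    words = re.findall(r"[^\W_]+", label)
    if not words:
        return ""
    first, rest = words[0].lower(), words[1:]
    return first + "".join(w[:1].upper() + w[1:].lower() for w in rest)


def predicate_uri(base, label):
    """Lexical URI for a predicate, built from its default-language label."""
    return f"{base.rstrip('/')}/{camel_case(label)}"


def value_uri(base, numeric_id):
    """Opaque numeric URI for a value-vocabulary term."""
    return f"{base.rstrip('/')}/{numeric_id}"


# example.org bases are placeholders, not real vocabulary namespaces.
print(predicate_uri("http://example.org/vocab/", "Date of publication"))
print(value_uri("http://example.org/terms", 1001))
```

The camel-casing step is exactly where a default language is needed: without one there is no single label to feed into `camel_case`.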

But these are decisively multilingual vocabularies, without a 'default'
language for the labels. It's a French and Spanish and English and Hebrew
and Arabic and Italian (etc.) vocabulary. It's not an English vocabulary.
There's no default label to use. The obvious (and well-researched) solution
is an entirely opaque, non-lexical URI. When I, wearing a developer hat,
insist (as you do) that it makes the vocabularies virtually impossible to
be used in development, my client regrets that there doesn't seem to be any
other solution.

The solution that we came up with was that, rather than have no lexical
URIs, we would have _all_ of the lexical URIs, and declare them as
owl:sameAs. We could have used owl:equivalentProperty (and we may have to
in some cases where the translation isn't lexical but rather conceptual)
but it's not as strong. The significant downside is that it immediately
makes the vocabularies owl:full. At some point in the future, we may
publish the mappings from lexical to opaque as a separate map for each
language that can be included in the vocabulary or not and that would
'solve' the owl:full problem, sortof.
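
To make the shape of that solution concrete, here is a minimal sketch of one opaque canonical URI with a lexical URI per language, each declared owl:sameAs the canonical one. All of the URIs and labels below are invented for illustration and are not real RDA identifiers; only the owl:sameAs property URI is real.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical URIs for illustration only.
CANONICAL = "http://example.org/Elements/P10001"
LEXICAL_BY_LANG = {
    "en": "http://example.org/Elements/en/hasTitle",
    "fr": "http://example.org/Elements/fr/aPourTitre",
}


def same_as_triples(canonical, lexical_by_lang):
    """Emit one N-Triples line per language-specific lexical URI."""
    return [f"<{lexical}> <{OWL_SAME_AS}> <{canonical}> ."
            for _, lexical in sorted(lexical_by_lang.items())]


def build_lookup(canonical, lexical_by_lang):
    """Map every known URI (lexical or opaque) to the canonical one."""
    lookup = {canonical: canonical}
    for lexical in lexical_by_lang.values():
        lookup[lexical] = canonical
    return lookup


for line in same_as_triples(CANONICAL, LEXICAL_BY_LANG):
    print(line)
print(build_lookup(CANONICAL, LEXICAL_BY_LANG)[LEXICAL_BY_LANG["fr"]])
```

Publishing the per-language maps separately would amount to shipping each language's `same_as_triples` output as its own optional file.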

The rejection of a single lexical URI in English wasn't 'politically
correct' in the pejorative sense that we usually use that phrase, but
rather an acknowledgement and embrace of a multilingual community. It was
the politic thing to do.

And yeah, our solution is debatable, but it's a debate we've often had over
the years, with both colleagues and clients, in public and in private, and
sometimes there's just no pleasing everyone, so we just do the best we can
with the tools we have, eh? And build some new tools, which we're also
working on.

Cheers for the useful debate.

Jon


Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-24 Thread Jon Phipps
Hi Rob, the conversation continues below...

On Thu, Jan 23, 2014 at 7:01 PM, Robert Sanderson azarot...@gmail.com wrote:

 Hi Jon,

 To present the other side of the argument so that others on the list can
 make an informed decision...


Thanks for reminding me that this is an academic panel discussion in front
of an audience, rather than a conversation.


 On Thu, Jan 23, 2014 at 4:22 PM, Jon Phipps jphi...@madcreek.com wrote:

  I've developed a quite strong opinion that vocabulary developers should
 not
  _ever_ think that they can understand the semantics of a vocabulary
  resource by 'reading' the URI.


 100% Agreed. Good documentation is essential for any ontology, and it has
 to be read to understand the semantics. You cannot just look at
 oa:hasTarget, out of context, and have any idea what it refers to.

 However if that URI is readable it makes developers lives much easier in a
 lot of situations, and it has no additional cost. Opaque URIs for
 predicates is the digital equivalent of thumbing your nose at the people
 you should be courting -- the people who will actually use your ontology in
 any practical sense.  It says: We don't care about you enough to make your
 life one step easier by having something that's memorable. You will always
 have to go back to the ontology every time and reread this documentation,
 over and over and over again.


What you suggest is that an identifier (e.g. @azaroth42 or ORCID:
-0003-4441-6852 https://orcid.org/-0003-4441-6852) should always
be readable as a convenience to the developer. RDA does provide a 'readable
in the language of the reader' uri specifically as a convenience to the
developer. A feature that I lobbied for. It's just not the /canonical/ URI,
because it's an identifier of a property, not the property itself, and that
property is independent of the language used to label it.

It's the difference between Metadata Management Associates, PO Box 282,
Jacksonville, NY 14854, USA (for people) and 14854-0282 (a perfectly
functional complete address in the USA namespace), which is precisely the
same identifier of that box for machines, and ultimately for the
postmaster, who doesn't care whose name is on the box numbered 282, who
only needs to know that highly memorable name when someone uses the
convenience of not bothering to look up the box number and just sends mail
addressed to us at 14854, or even just Jacksonville. And no I don't want to
start a URL vs. URI/URN/IRI discussion.


 Do you have some expectation that in order
  for the data to be useful your relational or object database identifiers
  must be readable?


 Identifiers for objects, no. The table names and field names? Yes. How many
 DBAs do you know that create tables with opaque identifiers for the column
 names?  How many XML schemas do you know that use opaque identifiers for
 the element names?

 My count is 0 from many many many instances.  And the reason is the same as
 having readable predicate URIs -- so that when you look at the table,
 schema, ontology, triple or what have you, there is some mnemonic value
 from the name to its intent.

Our experience obviously differs in this regard. I've seen many, many
databases that have relatively opaque column identifiers that were
relabeled in the query to suit the audience for the query. I've seen many
French databases, with French content, intended for a French audience,
designed by French developers, that had French 'column headers'.

The point here is that the identifiers /identify/ a property that exists
independent of the language of the data being used to describe a resource.
If RDA _had_ to pick a single language to satisfy your requirement for a
single readable identifier, which one? To assume that the one language
should be English says to the non-English-speaking world, "We don't care
about you enough to make your life one step easier by having something
that's memorable."



  By whom, and in English? This to me is a frankly colonial
  assumption of the dominance of English in the world of metadata.


 In the world of computing in general. for if while ... all English.
 While there are turing complete languages out there, the ones that don't
 have real world language constructions are toys, like Whitespace for
 example.  Even the lolcats programming language is more usable than
 whitespace.

 Again, it's a cost/value consideration.  There are many people who will
 understand English, and when developers program, they're surrounded by it.
 If your intended audience is primarily people who speak French, then you
 would be entirely justified in using URIs with labels from French. Or
 Chinese, though the IRI expansion would be more of a pain :)



Despite the fact that developers are surrounded by English I've worked with
many highly skilled developers who didn't speak or read English. Who relied
on documentation and meetings in their own language. What RDA is trying to
convey is the specific bibliographic knowledge

Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-23 Thread Jon Phipps
Hi Ben,

On Thu, Jan 23, 2014 at 4:48 AM, Ben Companjen
ben.compan...@dans.knaw.nl wrote:

 Returning an HTML document (or XML document as I get) in
 response to a request for an RDA property or class is wrong in the Linked
 Data sense [note 1]. This is explained in the W3C WG Note that you
 referred to in recipe 2 [2].


I'm the co-author of that note, so I'm all too familiar with it. :-)

At the moment, it shouldn't be possible to request html from
rdaregistry.info without getting redirected to www.rdaregistry.info (hosted
on github using github pages). Although I'm doing a minimal job of checking
the HTTP Accept header.


 Are you planning on introducing 303-redirects?


I'm deeply embarrassed (really) by the fact that the redirect is not a 303
and that it may not be consistent. As well as by the fact that it doesn't
return the requested fragment (which I still believe is best practice). So,
yeah, as soon as I get back from the ALA Midwinter conference (sooner if I
can get some meeting-free time). I'll at least get a 303 redirect header in
there (still learning nginx).
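
For anyone following along, the behaviour being discussed boils down to: look at the Accept header, then answer with a 303 See Other pointing at the matching document. A minimal sketch of just that decision logic, with hypothetical hosts and file extensions; on the real server this would live in the nginx configuration rather than in Python, and proper negotiation would also honour q-values:

```python
# Hypothetical hosts; not the actual rdaregistry.info layout.
HTML_BASE = "http://www.example.org"
DATA_BASE = "http://data.example.org"


def negotiate(path, accept_header):
    """Return (status, location) for a request to a vocabulary URI."""
    accept = (accept_header or "").lower()
    # Deliberately naive: substring matching, no q-value handling.
    if "text/turtle" in accept:
        target = f"{DATA_BASE}{path}.ttl"
    elif "application/rdf+xml" in accept:
        target = f"{DATA_BASE}{path}.xml"
    else:
        target = f"{HTML_BASE}{path}.html"
    # 303 See Other: the URI names a property, the target is a document about it.
    return 303, target


print(negotiate("/Elements/a", "text/turtle"))
print(negotiate("/Elements/a", "text/html,application/xhtml+xml"))
```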

Cheers!

Jon


[CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-23 Thread Jon Phipps
Well, the notion of 'beta' is a bit complicated... The vocabularies aren't
beta and shouldn't be regarded as such. They've been well-vetted and
reviewed and various folks, including me, have been working on them for
quite a few years, with plenty of feedback from quite a few 'communities'.
That said, the dereferencing service infrastructure isn't yet quite right,
but we're pretty happy that it mostly works the way we need it to right now --
it's not just good, it's good enough. For now.

I've developed a quite strong opinion that vocabulary developers should not
_ever_ think that they can understand the semantics of a vocabulary
resource by 'reading' the URI. Do you have some expectation that in order
for the data to be useful your relational or object database identifiers
must be readable? By whom, and in English? This to me is a frankly colonial
assumption of the dominance of English in the world of metadata. The proper
understanding of the semantics, although still relatively minimal, is from
the definition, not the URI. Our coining and inclusion of multilingual
(eventually) lexical URIs based on the label is a concession to developers
who feel that they can't effectively 'use' the vocabularies unless they can
read the URIs. Go for it. Use them. The machines, if they're configured
correctly, will fetch the correct URI permanently. I grant that writing ad
hoc sparql queries with opaque URIs can be intensely frustrating, but the
vocabularies aren't designed specifically to support that incredibly narrow
use case. If you want to see/use label-based browse use the Open Metadata
Registry (and yes that could be improved too):
http://metadataregistry.org/schemaprop/list/schema_id/81.html

Ultimately I'm not responding on this list to defend decisions that I
didn't personally make, despite the fact that I completely support the
decision.

WRT the bug you mention, please take the trouble to put an issue on GitHub
so we can track it:
https://github.com/RDARegistry/RDA-Vocabularies/issues
...but, the issue isn't that the sameAs assertions don't appear in the
turtle representation, it's that they do appear in the N3 representation
we've published using = (valid N3, invalid turtle), and we don't actually
publish turtle at the moment, even if that's what you ask for. We publish
N3 generated using the very useful RDF translation service:
http://rdf-translator.appspot.com/
...which uses RDFLib to generate N3, and there appears to be a bug in
RDFLib that isn't a bug:
https://github.com/RDFLib/rdflib/issues/218

I haven't had time to effectively research our options, but clearly we need
to either generate both turtle and N3 serializations (my preference), or
just turtle.

Jon



On Thu, Jan 23, 2014 at 10:50 AM, Dan Scott deni...@gmail.com wrote:

 On Thu, Jan 23, 2014 at 10:08 AM, Jon Phipps jphi...@madcreek.com wrote:
  Hi Ben,
 
  On Thu, Jan 23, 2014 at 4:48 AM, Ben Companjen
  ben.compan...@dans.knaw.nl wrote:
 
  Returning an HTML document (or XML document as I get) in
  response to a request for an RDA property or class is wrong in the
 Linked
  Data sense [note 1]. This is explained in the W3C WG Note that you
  referred to in recipe 2 [2].
 
 
  I'm the co-author of that note, so I'm all too familiar with it. :-)
 
  At the moment, it shouldn't be possible to request html from
  rdaregistry.info without getting redirected to www.rdaregistry.info
  (hosted on github using github pages), although I'm doing a minimal job
  of checking the HTTP Accept header.
 
 
  Are you planning on introducing 303-redirects?
 
 
  I'm deeply embarrassed (really) by the fact that the redirect is not a
 303
  and that it may not be consistent. As well as by the fact that it doesn't
  return the requested fragment (which I still believe is best practice).
 So,
  yeah, as soon as I get back from the ALA Midwinter conference (sooner if
 I
  can get some meeting-free time). I'll at least get a 303 redirect header
 in
  there (still learning nginx).
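
The behaviour promised here (look at the Accept header, then answer a request
for the canonical URI with a 303 pointing at a suitable document) might be
sketched roughly as follows. This is an illustration in Python, not the nginx
configuration in use; the function names, the media-type table and the target
URLs are all assumptions made for the example.

```python
# Sketch of Accept-header negotiation ending in a 303 redirect.
# TARGETS is an assumed mapping, not the real rdaregistry.info setup.
TARGETS = {
    "text/html": "http://www.rdaregistry.info/Elements/w/",
    "text/turtle": "http://rdaregistry.info/Elements/w.ttl",        # hypothetical
    "application/rdf+xml": "http://rdaregistry.info/Elements/w.xml",  # hypothetical
}
DEFAULT_TYPE = "text/turtle"


def parse_accept(header):
    """Return the media types in an Accept header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(accept_header):
    """Return (status, location) for a request to the canonical URI."""
    for media_type in parse_accept(accept_header or ""):
        if media_type in TARGETS:
            return 303, TARGETS[media_type]
        if media_type == "*/*":
            break  # anything goes: fall through to the default
    return 303, TARGETS[DEFAULT_TYPE]
```

A browser sending text/html first would be sent to the www host, while a
client asking for RDF would be sent to a machine-readable document; either
way the status is 303 rather than a plain redirect.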

 Oh. I'm going to take a guess that this announcement was pushed out to
 meet an ALA Midwinter deadline, and therefore was a tad premature.

 If that's the case (or even if not), why not market it as a beta,
 collect up the known bugs in a visible place, and (perhaps most
 importantly) invite the denizens of the W3C Public Linked Open Data
 mailing list to weigh in on the opaque identifiers vs. meaningful
 identifiers vs. both opaque + meaningful direction? You want this
 vocabulary to be adopted and used; it would be really good to have
 their buy-in to the vision.

 In my opinion, I think it would be a mistake to continue with the
 opaque identifiers as the primary identifiers; the vocabulary is
 almost unreadable as it stands. And I believe they will make
 communication between people trying to implement it harder

Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-22 Thread Jon Phipps
Hi Dan,

Thanks for taking such an interest!

Regarding your questions and concerns:

'slash' vs. 'hash' URIs:
As a matter of design, we coin URIs for retrieval of information about the
resource identified by the URI by machines, not humans. The most current
formal rules[1] state that retrieving a 'slash' fragment should return just
that fragment when resolved. We're currently breaking that rule by always
returning the entire vocabulary, as if it were indeed using hash URIs, and
will fix it in the next few weeks. An example of such a fragment (generated
by the Open Metadata Registry for
http://rdaregistry.info/Elements/w/P10001)
is here:
http://metadataregistry.org/schemaprop/show/id/15304.rdf

We believe, as a matter of good design, that URIs coined for large
vocabularies should minimize retrieval bandwidth, particularly since it's
highly unlikely that the entire vocabulary will (or should) be retrieved
when the properties are used individually as part of an application
profile. The entire vocabulary can always be acquired by requesting it from
the vocabulary's namespace URI:
http://rdaregistry.info/Elements/w/

Lexical (readable, but not semantic) URIs:
One of the most common misuses of vocabularies is the misunderstanding of
the semantics of the property identified by the URI based on the user's
personal, colloquial, or domain-specific interpretation of the semantics of
the URI (dc:title is the one I've seen misused most often). So we believe
that good vocabulary design _should_ obscure the semantics requiring that
the actual vocabulary documentation be viewed by a human.

The other problem is that the 'semantics' are most often broadly identified
with the lexical label used in the URI. Vocabularies, no matter how stable
semantically, _will_ evolve and that evolution often results in a change to
the label(s), even if the semantics communicated by the URI don't change.

And then there's the issue of spelling (British English vs. American
English) and language. Should we assume that the entire world must use, and
_understand_ English in order to effectively use a vocabulary? We don't
think so.

To at least partially address this we have coined multiple URIs for each
property, as explained here:
http://www.rdaregistry.info/Elements/e/
All RDA URIs have both an immutable canonical form and a 'readable',
lexical form, which is subject to change (changes will be redirected). The
lexical URIs follow the naming convention you identified and are largely
based on the current English (British) label.

Content-type: application/octet-stream:
We just got the server (nginx) set up yesterday and we haven't yet set the
mime types correctly. Again, we'll fix that very shortly.

Jon Phipps
Metadata Management Associates
Open Metadata Registry

[1] http://www.w3.org/TR/swbp-vocab-pub/


Jon



On Wed, Jan 22, 2014 at 12:57 PM, Dan Scott deni...@gmail.com wrote:

 I'm still pretty new at this linked data thing, but I find it strange
 that RDA element properties URIs such as
 http://rdaregistry.info/Elements/a/P50034 and
 http://rdaregistry.info/Elements/a/P50209 both return the same HTML
 page in a browser. Would it not have been more usable if the
 properties used hash-URIs that could have located the particular
 property on the particular page (e.g.
 http://rdaregistry.info/Elements/a#P50034)?

 Also, a plain curl request returns Content-type:
 application/octet-stream -- but it's pretty clearly Turtle, so I think
 that should be Content-type: text/turtle

 I would have liked to see more meaningful URIs--like
 http://rdaregistry.info/Elements/agent/addressOf instead of
 http://rdaregistry.info/Elements/a/P50209--as meaningful URIs seem a
 lot more approachable to this non-machine, but I guess that would have
 been a lot more work.




 On Tue, Jan 21, 2014 at 10:45 AM, Diane Hillmann
 metadata.ma...@gmail.com wrote:
  Folks:
 
  I hope this announcement will be of general interest (and apologies if
 you
  receive more than one).
 
  Diane
 
  -- Forwarded message --
  From: JSC Secretary jscsecret...@rdatoolkit.org
  Date: Tue, Jan 21, 2014 at 10:23 AM
  Subject: [rules] Publication of the RDA Element Vocabularies
  snip recipients
 
  RDA colleagues,
 
  See the announcement below, also posted on the JSC website.  Feel free to
  share this information with your colleagues.
 
  Regards, Judy Kuhagen
 
  = = = = =
 
  The Joint Steering Committee for Development of RDA (JSC), Metadata
  Management Associates, and ALA Publishing (on behalf of the co-publishers
  of RDA) are pleased to announce that the RDA elements and relationship
  designators have been published in the Open Metadata Registry (OMR) as
  Resource Description Framework (RDF) element sets suitable for linked
 data
  and semantic Web applications.
 
  The elements include versions unconstrained by Functional Requirements
  for Bibliographic Records (FRBR) and Functional Requirements for
 Authority
  Data (FRAD), the standard library models

Re: [CODE4LIB] Fwd: [rules] Publication of the RDA Element Vocabularies

2014-01-22 Thread Jon Phipps
Hi Karen,

On Wed, Jan 22, 2014 at 6:40 PM, Karen Coyle li...@kcoyle.net wrote:

 I would still prefer something memorable at this stage.


The 'lexical', and therefore more memorable, URIs based on the English
label will always resolve to the canonical URI. If the lexical label
changes, but the semantics don't change, both the old and new lexical URIs
will still resolve to the same canonical URI. Of course if both the label
and the semantics change, then it's a new property and gets a new URI.

We think that what's urgently needed is a far, far better html
representation of the vocabularies: one that makes it obvious that humans
can guess mnemonically at a resolvable URI from the label, bearing in mind
that this will (hopefully) cause machines (and browsers) to follow the
inevitable redirect to the canonical URI. We're actively working on that
better representation.

Jon


Re: [CODE4LIB] problem in old etd xml files

2013-12-10 Thread Jon Gorman
Right, hence my earlier suggestion of just replacing the entities ;). It's
not exactly the approach you describe, as yours would deal with common
cases that didn't get properly set up in the dtd, but it also would be a
bit more difficult to map for weird custom entities.

My email was a bit rambling, but the magic sauce I recommended was
something like

xmllint --loaddtd --noent --dropdtd FRONT.XML > FRONT_nodtdent.xml

(In reality you'd want to automate that a little more, xmllint uses the
libxml libraries if I remember correctly, so there are likely bindings that
do the same thing.)
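
As one way of automating that a little more, here is a minimal sketch that
shells out to xmllint for every XML file in a folder. It assumes xmllint is on
the PATH; the helper names and the idea of writing the result beside the
source file are assumptions for illustration, not something from the thread.

```python
import subprocess
from pathlib import Path


def xmllint_command(source):
    """Build the xmllint call that expands entities and drops the DTD."""
    return ["xmllint", "--loaddtd", "--noent", "--dropdtd", str(source)]


def output_name(source):
    """FRONT.XML -> FRONT_nodtdent.xml, next to the source file."""
    source = Path(source)
    return source.with_name(source.stem + "_nodtdent.xml")


def expand_entities(source):
    """Run xmllint on one file and write its output beside the source."""
    target = output_name(source)
    with open(target, "wb") as out:
        # check=True raises CalledProcessError if xmllint reports a failure
        subprocess.run(xmllint_command(source), stdout=out, check=True)
    return target


def expand_all(folder):
    """Process every .xml/.XML file in a folder, skipping earlier output."""
    done = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() == ".xml" and not path.stem.endswith("_nodtdent"):
            done.append(expand_entities(path))
    return done
```

Calling expand_all on an unzipped thesis folder would leave the originals
untouched and produce one *_nodtdent.xml file per source file.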

What that seems to do is load the dtd (which xmllint no longer does unless
it needs to), take any entity and replace it with what's in the dtd, and
then just drop the dtd. I didn't look closely, but it doesn't seem to just
transplant it with the numeric code (&#255;), but uses the actual unicode
character.

(You still need to fix the several mistakes that have already been observed
and pointed out by folks like Jason: the xml:stylesheet that needs to be
xml-stylesheet, and making sure the filenames are actually correct for
case-sensitive OSes.)

Jon G.


On Mon, Dec 9, 2013 at 7:48 PM, Roy Tennant roytenn...@gmail.com wrote:

 For my money, the text transform should look only for exact matches (e.g.,
 &aacute;, &nbsp;, &copy;) and replace them with their numeric
 counterparts.
 Roy
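
The exact-match transform suggested here could be sketched along these lines.
It is a rough illustration that borrows the HTML entity table from Python's
standard library; whether the thesis DTD defines its entities the same way is
an assumption that would need checking against UIowa2K.dtd, and the function
name is made up for the example.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric(text):
    """Replace known named entities with numeric character references."""
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names alone
        return "&#%d;" % name2codepoint[name]
    return ENTITY.sub(swap, text)


print(to_numeric("caf&eacute; &amp; &nbsp;&copy; &custom;"))
```

Custom entities that are not in the table (the &custom; above) pass through
unchanged, which is where this approach needs a hand-built mapping.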


 On Mon, Dec 9, 2013 at 5:41 PM, jason bengtson j.bengtson...@gmail.com
 wrote:

  For testing purposes I just nixed them. As I noted, to rework the file a
  person would probably want to use a more critical eye with find and
  replace. Totally doable.
 
 
  On Dec 9, 2013, at 7:37 PM, Jon Gorman jonathan.gor...@gmail.com
 wrote:
 
   How did you fix the ampersands? I ask, because if you just did a simple
   text transform from & to &amp;, it would mask the problem of the entity
   escaping I think...
  
   Not at work, so I don't have a good example and the file is downloading
   very slowly here, so I'll try to do one from memory.
  
   There were several &aacute; in the XML which mapped to an accent
   character in the DTD via the Entity.
  
   If you just substituted & with &amp;, you'd get &amp;aacute;, which
   would render inline as &aacute;. It would superficially solve the issue
   since browsers would no longer give the errors about the dtd since it
   wouldn't be trying to load entities from the DTDs. And depending how you
   did it, you likely could also replace a correctly encoded one to make
   &amp;amp;, leading to some very odd stuff.
  
   I wouldn't be surprised to find some unescaped ampersands, but the
   solution I posted will essentially replace the entities with their text,
   hopefully causing most characters to appear correctly. You definitely
   still need to fix some of the other stuff. (I suspect it never worked
   for most browsers and XML systems, most likely only IE).
  
   Jon Gorman
   University of Illinois
 
  Best regards,
 
  Jason Bengtson, MLIS, MA
  Head of Library Computing and Information Systems
  Assistant Professor, Graduate College
  Department of Health Sciences Library and Information Management
  University of Oklahoma Health Sciences Center
  405-271-2285, opt. 5
  405-271-3297 (fax)
  jason-bengt...@ouhsc.edu
  http://library.ouhsc.edu
  www.jasonbengtson.com
 
 



Re: [CODE4LIB] problem in old etd xml files

2013-12-09 Thread Jon Gorman
A lot of modern systems won't load entities (or will limit it somehow)
because of the denial of service attack that is possible.  Look for XML
Entity Reference Denial of Service. I can't remember if Public declarations
are treated any differently than System ones. (I would have suspected it to
trust SYSTEM ones more, but they'd still be exploitable by the same bug).


(There's also a fair number of other errors, I'm somewhat skeptical that
the example worked on many browsers even then. It's possible IE was
flexible enough it would have worked).

One thing you might want to do is take out the entities.

I can't remember why I had to do this, but xmllint seemed to do the trick.
(I found a snippet at
http://stackoverflow.com/questions/614067/how-to-resolve-all-entity-references-in-xml-and-create-a-new-xml-in-c,
but it's missing the necessary --loaddtd)

xmllint --loaddtd --noent --dropdtd FRONT.xml > FRONT_nodtdent.xml

I mean, you don't need the dtd for validation, particularly since I suspect
given the errors it may not validate anyhow.

It might make the files a little harder to read when reading the raw
source, but I suspect that's not typically a problem.

Jon Gorman
University of Illinois



On Mon, Dec 9, 2013 at 2:10 PM, Robertson, Wendy C 
wendy-robert...@uiowa.edu wrote:

 Back in 1999-2002 a handful of our theses were submitted  as a collection
 of xml files.  We posted the files in our repository several years ago (we
 posted a zipped folder with all the files).  At that time, if you opened
 front.xml you would be able to access the thesis. We have not touched the
 files in the close to 5 years since we posted them, but the files no longer
 open correctly. One of the problem theses is http://ir.uiowa.edu/etd/189/.

 Front.xml begins
 <?xml version="1.0" encoding="UTF-8"?>
 <?xml:stylesheet type="text/css" href="UIowa2K1.css" ?>
 <!DOCTYPE thesis SYSTEM "UIowa2K.dtd">

 I have tried the following changes but they do not help

 1)  Adding standalone="no" to the xml declaration  -- <?xml
 version="1.0" encoding="UTF-8" standalone="no"?>

 2)  Changing the case of UIowa2K1.css and UIowa2K.dtd to match the
 files (which are in all caps)

 3)  Changing xml:stylesheet to xml-stylesheet

 Chrome shows errors that entities are not defined, but they are defined in
 the dtd.

 I would appreciate any assistance in making these documents available
 again. Thanks!

 Wendy Robertson
 Digital Scholarship Librarian *  The University of Iowa Libraries
 1015 Main Library  *  Iowa City, Iowa 52242
 wendy-robert...@uiowa.edu * 319-335-5821



Re: [CODE4LIB] problem in old etd xml files

2013-12-09 Thread Jon Gorman
How did you fix the ampersands? I ask, because if you just did a simple
text transform from & to &amp;, it would mask the problem of the entity
escaping I think...

Not at work, so I don't have a good example and the file is downloading
very slowly here, so I'll try to do one from memory.

There were several &aacute; in the XML which mapped to an accent character
in the DTD via the Entity.

If you just substituted & with &amp;, you'd get &amp;aacute;, which would
render inline as &aacute;. It would superficially solve the issue since
browsers would no longer give the errors about the dtd since it wouldn't be
trying to load entities from the DTDs. And depending how you did it, you
likely could also replace a correctly encoded one to make &amp;amp;,
leading to some very odd stuff.
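
To make the pitfall concrete, here is a small sketch contrasting the simple
transform with one that only escapes ampersands that do not already start an
entity or character reference. The function names and the sample string are
made up for illustration, and the pattern is a heuristic: text such as "AT&T;"
would be misread as an entity.

```python
import re

# An ampersand that does NOT already begin a named or numeric reference.
BARE_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def naive_escape(text):
    """The simple text transform: every & becomes &amp;."""
    return text.replace("&", "&amp;")


def careful_escape(text):
    """Escape only ampersands that are not part of a reference."""
    return BARE_AMP.sub("&amp;", text)


sample = "&aacute; &amp; R&D"
print(naive_escape(sample))    # entities get double-escaped
print(careful_escape(sample))  # only the bare ampersand in R&D changes
```

The careful version leaves &aacute; in place, so the entity still has to be
resolved somehow (from the DTD, or by replacing it outright).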

I wouldn't be surprised to find some unescaped ampersands, but the solution
I posted will essentially replace the entities with their text, hopefully
causing most characters to appear correctly. You definitely still need to
fix some of the other stuff. (I suspect it never worked for most browsers
and XML systems, most likely only IE).

Jon Gorman
University of Illinois


Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-03 Thread Dunn, Jon William Butcher
Hi Kelley,

If you haven't already, you might want to look at the music score and sound 
recording FRBRization work done on the Variations-FRBR project here at Indiana 
University. I'm not sure how directly useful this would be for your work with 
moving images, but there may be some useful mapping ideas:

FRBR XML schemas: 
http://www.dlib.indiana.edu/projects/vfrbr/schemas/1.1/index.shtml 

MARC-FRBR mapping specifications: 
http://www.dlib.indiana.edu/projects/vfrbr/projectDoc/metadata/mappings/spring2010/vfrbrSpring2010mappings.shtml
 

Java FRBRization code and documentation: 
http://www.dlib.indiana.edu/projects/vfrbr/projectDoc/index.shtml 

Jon

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kelley 
McGrath
Sent: Tuesday, December 03, 2013 12:35 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of 
videos

Robert,

Your work also sounds very interesting and definitely overlaps with some of 
what we want to do. It seems like a lot of people are trying to get useful 
format information out of MARC records and it's unfortunate that it is so 
complicated. I would be very interested to see your logic for determining 
format and dealing with self-contradictory records. Runtime from the 008 is, as 
you say, pretty straightforward, but not always filled out and useless if the 
resource is longer than 999 minutes.

It's interesting that you mention identifying directors. We have also been 
working on a similar, although more generalized, process. We're trying to 
identify all of the personal and organizational names mentioned in video 
records and, where possible, their roles. Our existing process is pretty 
accurate for personal names and for roles in English. It tends to struggle with 
credits involving multiple corporate bodies and we're working on building a 
lexicon of non-English terms for common roles. We're also trying to get people 
to hand-annotate credits to build a corpus to help us improve our process. 
(Help us out at http://olac-annotator.org/. And if you're willing to be on call 
to help with translating non-English credits, email me with the language(s) 
you'd be able to help out with. We also just started a mailing list at 
https://lists.uoregon.edu/mailman/listinfo/olac-credits)

Matching MARC records for moving images with external data sources is also on 
our radar. Most feature film type material can probably be identified by the 
attributes you mention: title, original date and director (probably 2 out of 3 
would work in most cases). We are also hoping to use these attributes (and 
possibly others) to cluster records for the same FRBR work.

It would be great to talk with you more about this off-list.

Kelley
kell...@uoregon.edu

From: Robert Haschart [rh...@virginia.edu]
Sent: Monday, December 02, 2013 10:49 AM
To: Code for Libraries
Cc: Kelley McGrath
Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of 
videos

Kelley,

The work you are proposing is interesting and overlaps somewhat both with work 
I have already done and with a new project I'm looking into here at UVa.
I have been the primary contributor to the Marc4j java project for the past 
several years and am the creator of the project SolrMarc which extracts data 
from Marc records based on a customizable specification, to build Solr index 
records to facilitate rich discovery.

Much of my work on creating and improving these projects has been in service of 
my actual job of creating and maintaining the Solr Index
behind our Blacklight-based discovery interface.   As a part of that
work I have created custom SolrMarc routines that extract the format of items 
similar to what is described in Example 3, including looking in the leader, 
006, 007 and 008 to determine the format as-coded but further looking in the 
245 h, 300 and 538 fields to heuristically determine when the format as-coded 
is incorrect and ought to be
overridden.   Most of the heuristic determination is targeted towards
Video material, and was initiated when I found an item that due to a coding 
error was listed as a Video in Braille format.

Further I have developed a set of custom routines that look more closely at 
Video items, one of which already extracts the runtime from the 008[18-20] 
field. Modifying it from its current form, which returns the runtime in 
minutes, to instead return it as HH:MM as specified in your xls file, and to 
further handle the edge case of 008[18-20] = 000 by returning "over 16:39", 
would literally take about 15 minutes.
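
The conversion described above is small enough to sketch. This is not the
SolrMarc routine itself (that one is Java); it is a hypothetical Python
helper, and the choice to return None for non-numeric codes such as '---' or
'nnn' is an assumption.

```python
def runtime_hhmm(field_008):
    """Format 008/18-20 (running time in minutes) as HH:MM.

    Returns None when the positions hold no usable number.
    """
    code = field_008[18:21]
    if code == "000":
        return "over 16:39"  # 000: runtime does not fit in three digits
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        return None  # e.g. '---', 'nnn', or a field that is too short
    hours, minutes = divmod(int(code), 60)
    return "%02d:%02d" % (hours, minutes)


# Positions 0-17 and 21 onward are filler here; only 18-20 matter.
print(runtime_hhmm("x" * 18 + "095" + "x" * 19))
```

The upper bound follows from the field width: 999 minutes is 16 hours and 39
minutes, which is why anything longer can only be reported as "over 16:39".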

Another of these custom routines that is more fully-formed, is code for 
extracting the Director of a video from the Marc record.  It examines the 
contents of the fields 245c, 508a, 500a, 505a, 505t, employing heuristics and 
targeted natural language processing techniques, to
attempt to correctly extract the Director

Re: [CODE4LIB] ruby-marc api design feedback wanted

2013-11-20 Thread Jon Stroop
Coming from nowhere on this...is there a place where it would be 
convenient to flag which behavior the user (of the library) wants? I 
think you're correct that most of the time you'd just want to blow 
through it (or replace it), but for the situation where this isn't the 
case, I think the Right Thing to do is raise the exception. I don't 
think you would want to bury it in some assumption made internal to the 
library unless that assumption can be turned off.
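
A flag of that kind might look something like the sketch below. It is written
in Python rather than Ruby, and it decodes UTF-8 as a stand-in because MARC-8
is not in the standard library; the function name, the parameter names and
the default of raising are all assumptions, not the ruby-marc API.

```python
def to_utf8_text(raw, invalid="raise", replacement="\ufffd"):
    """Decode bytes, either raising on bad input or substituting for it.

    invalid="raise"   -> UnicodeDecodeError propagates to the caller
    invalid="replace" -> each undecodable sequence becomes `replacement`
                         (pass replacement="" to omit it instead)
    """
    if invalid == "raise":
        return raw.decode("utf-8")
    if invalid != "replace":
        raise ValueError("invalid must be 'raise' or 'replace'")
    pieces = []
    position = 0
    while position < len(raw):
        try:
            pieces.append(raw[position:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start/err.end are offsets into the slice just decoded
            pieces.append(raw[position:position + err.start].decode("utf-8"))
            pieces.append(replacement)
            position += err.end
    return "".join(pieces)
```

With this shape the caller has to opt in to lossy behaviour explicitly, e.g.
to_utf8_text(data, invalid="replace"), so nothing is silently recovered from
unless someone asked for that.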


-Jon


On 11/19/2013 07:51 PM, Jonathan Rochkind wrote:

ruby-marc users, a question.

I am working on some Marc8 to UTF-8 conversion for ruby-marc.

Sometimes, what appears to be an illegal byte will appear in the Marc8 
input, and it can not be converted to UTF8.


The software will support two alternatives when this happens: 1) 
Raising an exception. 2) Replacing the illegal byte with a replacement 
char and/or omitting it.


I feel like most of the time, users are going to want #2.  I know 
that's what I'm going to want nearly all the time.


Yet, still, I am feeling uncertain whether that should be the default. 
Which should be the default behavior, #1 or #2?  If most people most 
of the time are going to want #2 (is this true?), then should that be 
the default behavior?   Or should #1 still be the default behavior, 
because by default bad input should raise, not be silently recovered 
from, even though most people most of the time won't want that, heh.


Jonathan


Re: [CODE4LIB] Loris

2013-11-08 Thread Jon Stroop

Ed,

I added support for IIIF syntax to OpenSeadragon:

https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js

so it just works. Not sure if Ian has cut a release recently, but it's 
on the master branch anyway.


-Js

On 11/08/2013 04:00 PM, Edward Summers wrote:

On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote:

And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR

Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to 
talk iiif?

//Ed


Re: [CODE4LIB] Loris

2013-11-08 Thread Jon Stroop

Whoops, wait.
I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to add 
support for OSd. Then I made some changes and added support for 1.1. 
Credit where credit is due

-Js

On 11/08/2013 04:40 PM, Jon Stroop wrote:

Ed,

I added support for IIIF syntax to OpenSeadragon:

https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js

so it just works. Not sure if Ian has cut a release recently, but 
it's on the master branch anyway.


-Js

On 11/08/2013 04:00 PM, Edward Summers wrote:

On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote:

And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR

Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to 
talk iiif?

//Ed




Re: [CODE4LIB] Loris

2013-11-08 Thread Jon Stroop

Bleh. You know what I meant.

On 11/8/13 5:13 PM, Jon Stroop wrote:

Whoops, wait.
I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to 
add support for OSd. Then I made some changes and added support for 
1.1. Credit where credit is due

-Js

On 11/08/2013 04:40 PM, Jon Stroop wrote:

Ed,

I added support for IIIF syntax to OpenSeadragon:

https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js

so it just works. Not sure if Ian has cut a release recently, but 
it's on the master branch anyway.


-Js

On 11/08/2013 04:00 PM, Edward Summers wrote:

On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote:

And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR

Thanks for sharing that Jon. Did you have to do much to get OpenSeadragon to 
talk iiif?

//Ed






Re: [CODE4LIB] Loris

2013-11-08 Thread Jon Stroop

Seriously!

On 11/8/13 6:21 PM, Michael J. Giarlo wrote:

Stick to Python, Jon. ;)


On Fri, Nov 8, 2013 at 3:17 PM, Jon Stroop jstr...@princeton.edu wrote:


Bleh. You know what I meant.


On 11/8/13 5:13 PM, Jon Stroop wrote:


Whoops, wait.
I wrote a formula for Chris Thatcher to add support for IIIF 1.0 to add
support for OSd. Then I made some changes and added support for 1.1. Credit
where credit is due
-Js

On 11/08/2013 04:40 PM, Jon Stroop wrote:


Ed,

I added support for IIIF syntax to OpenSeadragon:

https://github.com/openseadragon/openseadragon/blob/master/src/iiif1_1tilesource.js

so it just works. Not sure if Ian has cut a release recently, but it's
on the master branch anyway.

-Js

On 11/08/2013 04:00 PM, Edward Summers wrote:


On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote:


And here's a sample of the server backing OpenSeadragon[2]: http://goo.gl/Gks6lR


Thanks for sharing that Jon. Did you have to do much to get
OpenSeadragon to talk iiif?

//Ed





Re: [CODE4LIB] Loris

2013-11-08 Thread Jon Stroop
It aims to do the same thing...serve big JP2s (and other images) over 
the web, so from that perspective, yes. But, beyond that, time will 
tell. One nice thing about coding against a well-thought-out spec is 
that there are lots of implementations from which you can choose[1]--though 
as far as I know Loris is the only one that supports the IIIF syntax 
natively (maybe IIP?). We still have Djatoka floating around in a few 
places here, but, as many people have noted over the years, it takes a 
lot of shimming to scale it up, and, as far as I know, the project has 
more or less been abandoned.


I haven't done too much in the way of benchmarking, but to date don't 
have any reason to think Loris can't perform just as well. The demo I 
sent earlier is working against a very large jp2 with small tiles[2] 
which means a lot of rapid hits on the server, and between that, (a 
little bit of) JMeter and ab testing, and a fair bit of concurrent use 
from the c4l community this afternoon, I feel fairly confident about it 
being able to perform as well as Djatoka in a production environment.


By the way, you can page through some other images here: 
http://libimages.princeton.edu/osd-demo/


Not much of an answer, I realize, but, as I said, time and usage will tell.

-Js

1. http://iiif.io/apps-demos.html
2. 
http://libimages.princeton.edu/loris/pudl0052%2F6131707%2F0001.jp2/info.json



On 11/8/13 8:07 PM, Peter Murray wrote:

A clarifying question: is Loris effectively a Python-based replacement for the 
Java-based djatoka [1] server?


Peter

[1] http://sourceforge.net/apps/mediawiki/djatoka/index.php?title=Main_Page


On Nov 8, 2013, at 3:05 PM, Jon Stroop jstr...@princeton.edu wrote:


c4l,
I was reminded earlier this week at DLF (and a few minutes ago by Tom
and Simeon) that I hadn't ever announced a project I've been working on
for the last year or so to this list. I showed an early version in a
lightning talk at code4libcon last year.

Meet Loris: https://github.com/pulibrary/loris

Loris is a Python based image server that implements the IIIF Image API
version 1.1 level 2[1].

http://www-sul.stanford.edu/iiif/image-api/1.1/

It can take JP2 (if you make Kakadu available to it), TIFF, or JPEG
source images, and hand back JPEG, PNG, TIF, and GIF (why not...).
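
For anyone who hasn't looked at the spec, a request in IIIF Image API 1.1
syntax is just a URL of the form
identifier/region/size/rotation/quality.format. The sketch below builds such
URLs in Python; the function names are made up for the example, and the base
URL and identifier are taken from the info.json link elsewhere in this
thread. It is an illustration of the URL pattern, not code from Loris.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="native", fmt="jpg"):
    """Build an Image API 1.1 style request URL."""
    parts = [
        base.rstrip("/"),
        quote(identifier, safe=""),  # slashes in the identifier become %2F
        region,
        size,
        rotation,
        "%s.%s" % (quality, fmt),
    ]
    return "/".join(parts)


def iiif_info_url(base, identifier):
    """Build the URL of the image's info.json description."""
    return "%s/%s/info.json" % (base.rstrip("/"), quote(identifier, safe=""))


BASE = "http://libimages.princeton.edu/loris"
IDENT = "pudl0052/6131707/0001.jp2"
print(iiif_info_url(BASE, IDENT))
print(iiif_image_url(BASE, IDENT, region="0,0,256,256", size="pct:50"))
```

A tiling client such as OpenSeadragon works by issuing many requests of the
second kind, each with a different region and size.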

Here's a demo of the server directly: http://goo.gl/8XEmjp

And here's a sample of the server backing OpenSeadragon[2]:
http://goo.gl/Gks6lR

-Js

1. http://www-sul.stanford.edu/iiif/image-api/1.1/
2. http://openseadragon.github.io/

--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library
jstr...@princeton.edu

--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
peter.mur...@lyrasis.org
+1 678-235-2955
800.999.8558 x2955


[CODE4LIB] Job: Digital Repository Software Developer at Princeton University

2013-10-11 Thread Jon Stroop
Note: this job is in Academic Services at Princeton, not in the Library, 
though we do work together from time to time. The full posting is here:


http://jobs.princeton.edu/applicants/Central?quickFind=64011

Cross-posted. Please excuse any duplicate copies you receive.

*Princeton University seeks Digital Repository Software Developer*

In September of 2011 the Faculty of Princeton University approved an 
open access policy intended to make faculty's scholarly articles 
available to a wider public. Princeton is now in the process of ramping 
up its efforts to implement the policy. These efforts will include the 
development of the repository that will hold the scholarly articles. The 
Office of Information Technology seeks a Digital Repository Software 
Developer to establish and enhance digital repositories to house 
academic publications, research data, and related digital assets.  The 
primary focus of the position will be to develop software and systems 
for collecting and depositing academic journal articles subject to 
Princeton University's Open Access Policy for Faculty Publications into 
an open access repository.  This repository will enhance both the 
preservation and dissemination of scholarship at Princeton.


The Digital Repository Software Developer will report to the Digital 
Repository Architect and will work closely with the University's 
Scholarly Communications Librarian and other IT and Library staff.


--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library
jstr...@princeton.edu


Re: [CODE4LIB] A Proposal to serialize MARC in JSON

2013-09-03 Thread Jon Stroop

It looks like it's there in pymarc as well:

https://github.com/edsu/pymarc/blob/master/pymarc/record.py#L386


On 09/03/2013 03:02 PM, Bill Dueber wrote:

I can see where you might think that no progress has been made because
the only real document of the format is that old, old blog post.

The problem, however, is not a lack of progress but a lack of documentation
of that progress. File_MARC (PHP), MARC::Record (perl), ruby-marc (ruby)
and marc4j (java) will all deal, to one extent or another, either with the
JSON directly or with a hash/map data structure that maps directly to that
JSON structure.
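
For readers who haven't seen it, the hash/map structure in question can be
sketched as below: a leader string plus a list of single-key objects, one per
field, with data fields carrying indicators and a list of single-key
subfield objects. The builder function and the sample record are made up for
illustration; this does not use the API of any of the libraries named above.

```python
import json


def marc_in_json(leader, control_fields, data_fields):
    """Build the dict layout used by the MARC-in-JSON proposal.

    control_fields: list of (tag, value)
    data_fields:    list of (tag, ind1, ind2, [(code, value), ...])
    """
    fields = [{tag: value} for tag, value in control_fields]
    for tag, ind1, ind2, subfields in data_fields:
        fields.append({
            tag: {
                "subfields": [{code: value} for code, value in subfields],
                "ind1": ind1,
                "ind2": ind2,
            }
        })
    return {"leader": leader, "fields": fields}


# A made-up record, just to show the shape.
record = marc_in_json(
    "00000nam a2200000 a 4500",
    [("001", "example-0001")],
    [("245", "1", "0", [("a", "An example title /"), ("c", "A. Author.")])],
)
print(json.dumps(record, indent=2))
```

Because fields and subfields are lists of single-key objects rather than one
big object, repeated tags and subfield order both survive the round trip.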

[BTW, can anyone summarize the state of pymarc wrt marc-in-json?]





On Tue, Sep 3, 2013 at 5:09 AM, dasos ili dasos_...@yahoo.gr wrote:


It is exactly three years back, and no real progress has been made
concerning  this proposal to serialize MARC in JSON:


http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/


Meanwhile new tools for searching and retrieving records have come in,
such as Solr and Elasticsearch. Any ideas on how one could alter (or
propose a new format) more suited to the mechanisms of these two search
platforms?

Any example implementations would be also really appreciated,

thank you in advance






Re: [CODE4LIB] Python and Ruby

2013-07-29 Thread Jon P. Stroop
s/ruby/any_language/

Why not learn both? As with spoken languages, knowing more than one makes it 
easier for you to think at a higher level of abstraction and therefore makes 
you a better developer, and, as others have alluded to, will allow you to 
choose the 'right tool [framework, library, etc] for the right job'.

Plus, as Giarlo said, they're not really that different.


From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Chris 
Fitzpatrick [chrisfitz...@gmail.com]
Sent: Monday, July 29, 2013 1:39 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Python and Ruby

One thing to factor in is that if you learn ruby you run the risk of
becoming one of those people who constantly talks,tweets,blogs, posts to
this mailing list about how great ruby is. This can have a very negative
impact on your work productivity.

On Monday, July 29, 2013, Dana Pearson wrote:

 Josh,

 I work exclusively with XSLT but specialize in metadata only no need for
 content display choices

 maybe a candidate for library programming language...XSLT 2.0 has useful
 analyze-string element to cover Roy's point

 by the way, Josh, live just down the road in Leeton

 regards,
 dana


 On Mon, Jul 29, 2013 at 12:04 PM, Roy Tennant roytenn...@gmail.com
 wrote:

  On Mon, Jul 29, 2013 at 9:57 AM, Peter Schlumpf pschlu...@earthlink.net
  wrote:
   Imagine if the library community had its own programming/scripting
  language, at least one that is domain relevant.
   What would it look like?
 
  Whatever else it had, it would have to have a sophisticated way to
  inspect text for patterns -- that is, regular expressions.
  Roy
 



 --
 Dana Pearson
 dbpearsonmlis.com



Re: [CODE4LIB] ILLiad - RemoteAuth and OpenURL

2013-07-12 Thread Jon Gorman
By the way,  a similar thread on the ezproxy list brought up this list:

http://mail.geneseo.edu/mailman/listinfo/workflowtoolkit-l

Which is apparently about ILLiad best practices.  I've just subscribed and
started reading through the archives.

Jon Gorman
University of Illinois


On Thu, Jul 11, 2013 at 3:53 PM, Jimmy Ghaphery jghap...@vcu.edu wrote:

 yeh for us we did go with the documentation as best we could and use an ISAPI
 filter. The final straw for us as a hosted site was that OCLC said they
 could not support this method and steered us to the EzProxy auth method.


 On Thu, Jul 11, 2013 at 4:13 PM, Jon Gorman jonathan.gor...@gmail.com
 wrote:

   I am also following this conversation, I am wondering if you consult the
   following about RemoteAuth Authentication,
   but still failed?
  
  
 
 https://prometheus.atlas-sys.com/display/illiad/RemoteAuth+Authentication
  
   Ling
   UIC Library
 
  Don't know about Jimmy, but the flow on that page is one of the issues we
  have. We can't trigger the system to have that login behavior.  From what I
  can tell of the logs, it decides the page to redirect to in a session
  before it ever checks the user status or the remote user header.  I don't
  know how, given that, that the flow could really happen that way. (I might
  be missing something).
 
  As far as I can tell the other settings are what they should be, after all
  if they weren't I can't imagine that it would work most of the time, just
  not on the initial logon.
 
  We did have an issue with a session heartbeat type of thing that had a
  similar behavior (the headers would just drop off, somehow associated with
  the heartbeat process). Thankfully we were able to disable that in the
  authentication software.
 
  Does anyone using RemoteAuth actually see that flow (get challenged to
  either register or update your info after first successful login?)  If you
  do, what are you using as a link into the system?
 
 
  Jon Gorman
  University of Illinois
 



 --
 Jimmy Ghaphery
 Head, Digital Technologies
 VCU Libraries
 804-827-3551



Re: [CODE4LIB] ILLiad - RemoteAuth and OpenURL

2013-07-11 Thread Jon Gorman
 I am also following this conversation, I am wondering if you consult the
following about RemoteAuth Authentication,
 but still failed?

 https://prometheus.atlas-sys.com/display/illiad/RemoteAuth+Authentication

 Ling
 UIC Library

Don't know about Jimmy, but the flow on that page is one of the issues we
have. We can't trigger the system to have that login behavior.  From what I
can tell of the logs, it decides the page to redirect to in a session
before it ever checks the user status or the remote user header.  I don't
know how, given that, that the flow could really happen that way. (I might
be missing something).

As far as I can tell the other settings are what they should be, after all
if they weren't I can't imagine that it would work most of the time, just
not on the initial logon.

We did have an issue with a session heartbeat type of thing that had a
similar behavior (the headers would just drop off, somehow associated with
the heartbeat process). Thankfully we were able to disable that in the
authentication software.

Does anyone using RemoteAuth actually see that flow (get challenged to
either register or update your info after first successful login?)  If you
do, what are you using as a link into the system?


Jon Gorman
University of Illinois


Re: [CODE4LIB] MARC record model to be inserted in mongodb

2013-07-05 Thread Jon Stroop
Have you seen Ross' post: 
http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/ 
?


pymarc can get you this json, e.g.:

```
import pymarc

records = pymarc.parse_xml_to_array('/path/to/some/marc.xml')

json_file = [record.as_json() for record in records]

```

or, for that matter, if you happen to be using Mongo's Python API, you 
/may/ be able to call `as_dict()` when you store the record:


```
my_mongo_collection.insert(record.as_dict())
```

It looks like ruby-marc does something similar, and presumably the Mongo 
API for Ruby uses Ruby hashes the way that the Python API uses dicts, so 
a similar approach is probably possible in Ruby.


As for "...an efficient way so as to get results with the appropriate 
queries", I guess that all depends on what you're trying to do.


-Jon

--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library
jstr...@princeton.edu



On 07/05/2013 05:47 AM, dasos ili wrote:

Could you please give us any suggestions on a data model example regarding a 
MARC record? The goal is to be able to store it in mongodb, in an efficient way 
so as to get results with the appropriate queries.

thank you in advance



Re: [CODE4LIB] Regular expression for maximum 4-digit number

2013-07-02 Thread Jon Stroop
I have zero Excel skills, but chances are you could do this with any 
scripting language if you were to export the file as text (e.g. CSV).
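
A minimal sketch of the scripting route, assuming Python and that each 863 string has been exported as plain text. The `largest_four_digit` name and the sample string are made up for illustration.

```python
import re

# Exactly four digits, bounded on both sides by word boundaries.
FOUR_DIGITS = re.compile(r"\b\d{4}\b")


def largest_four_digit(text):
    """Return the largest standalone 4-digit number in `text`, or None."""
    matches = FOUR_DIGITS.findall(text)
    return max(int(m) for m in matches) if matches else None


print(largest_four_digit("v.1-12 (1998-2009), v.14 (2011), 12345 ignored"))
```

Because `\b` treats letters and digits alike as word characters, a run such as `12345` or `v2011` is skipped. For a CSV export, the same function could be applied to the relevant column of each row.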

-Jon

On 07/02/2013 11:02 AM, Harper, Cynthia wrote:

Is there a way to return (in Excel, if possible) the largest 4-digit number (by 
word boundaries) in a string?  I've extracted the 863 fields from Millennium 
for my active periodicals, and want to find the latest year in each run.  I'm 
willing to estimate it by taking the largest 4-digit number in the string. I'm 
doing this in Excel.  Any help?

Cindy Harper
Electronic Services and Serials Librarian
Virginia Theological Seminary
3737 Seminary Road
Alexandria VA 22304
703-461-1794
char...@vts.edu


[CODE4LIB] JCDL 2013 registration deadline extended to June 5

2013-05-30 Thread Dunn, Jon William Butcher
Early-bird registration for JCDL 2013 has been extended to June 5. Register
online at http://www.regonline.com/JCDL2013. Rates available at
http://jcdl2013.org/registration. The full program is available at
http://jcdl2013.sched.org/.

 

The ACM/IEEE Joint Conference on Digital Libraries is a major international
forum focusing on digital libraries and associated technical, practical,
organizational, and social issues, taking place in Indianapolis, Indiana,
USA on July 22-26, 2013. The theme for JCDL 2013 is Digital Libraries at
the Crossroads, in recognition of our location (Indiana is known as the
Crossroads of America) and in recognition of the changes forthcoming from
the age of mass digitization, big data, and the ever changing nature of
scholarly communications.

 

Program Highlights:

 

* 3 outstanding keynote speakers: Jill Cousins, Clifford Lynch, and David de
Roure. More information at: http://jcdl2013.org/keynotespeakers;

* 6 workshops covering topics such as data and software preservation,
digital scholarship, research methods and artifacts preservation, web
archiving, mining publications, and CURATEcamp. More information at
http://jcdl2013.org/workshops;

* 6 tutorials on topics including Europeana data model & collections,
ResourceSync, Introduction to Digital Libraries, building collections with
Greenstone, mining data semantics, and using open annotation. More
information at http://jcdl2013.org/tutorials;

* A diverse range of papers - 28 full papers and 22 short papers. More
information at http://jcdl2013.org/papers;

* And much more, including posters and demonstrations. More information at
http://jcdl2013.org/posters-demonstrations. 

 

Indianapolis is a wonderful conference city friendly to both walkers and
cyclists, with many dining, entertainment, and sports options accessible
from the downtown area. Check out the visitors guide developed for ACRL
2013: http://conference.acrl.org/indy-pages-163.php. More JCDL travel
details are available at http://jcdl2013.org/travel.

 


Re: [CODE4LIB] XML Parsing and Python

2013-03-05 Thread Jon Stroop

Mike,
I haven't used minidom extensively but my guess is that 
doc.toprettyxml(indent="  ", encoding="utf-8") isn't actually changing the 
encoding because it can't parse the string in your content variable. I'm 
surprised that you're not getting tossed a UnicodeError, but the docs 
for Node.toxml() [1] might shed some light:


To avoid UnicodeError exceptions in case of unrepresentable text data, 
the encoding argument should be specified as “utf-8”.


So what happens if you're not explicit about the encoding, i.e. just 
doc.toprettyxml()? This would hopefully at least move your exception to 
a more appropriate place.


In any case, one solution would be to scrub the string in your content 
variable to get rid of the invalid characters (hopefully they're 
insignificant). Maybe something like this:


def unicode_filter(char):
    try:
        unicode(char, encoding='utf-8', errors='strict')
        return char
    except UnicodeDecodeError:
        return ''

content = 'abc\xFF'
content = ''.join(map(unicode_filter, content))
print content

Not really my area of expertise, but maybe worth a shot
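
The error quoted below (a character out of the allowed range) suggests the culprits may be characters that XML 1.0 forbids, such as control characters in OCR output, rather than bytes that fail to decode. A minimal sketch under that assumption, written for Python 3; `strip_invalid_xml_chars` is a hypothetical helper, not part of minidom or lxml, and it assumes that silently dropping those characters is acceptable.

```python
import re
from xml.dom.minidom import Document

# Characters XML 1.0 does not allow: C0 controls other than tab, newline and
# carriage return, plus surrogates and the noncharacters U+FFFE and U+FFFF.
INVALID_XML_CHARS = re.compile(
    r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def strip_invalid_xml_chars(text):
    """Remove characters that an XML 1.0 parser would reject."""
    return INVALID_XML_CHARS.sub("", text)


doc = Document()
page = doc.createElement("Page")
doc.appendChild(page)
word = doc.createElement("String")
# Clean the text before it goes into the attribute; minidom itself does not
# check whether the characters are legal XML.
word.setAttribute("CONTENT", strip_invalid_xml_chars("Buffalo\x00Launch\x0b"))
page.appendChild(word)

xml_bytes = doc.toprettyxml(indent="  ", encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```

Cleaning at the point where the attribute is set keeps the written file parseable by stricter libraries, at the cost of losing whatever the dropped characters were meant to be.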
-Jon

1. 
http://docs.python.org/2/library/xml.dom.minidom.html#xml.dom.minidom.Node.toxml


--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library
jstr...@princeton.edu




On 03/04/2013 03:00 PM, Michael Beccaria wrote:

I'm working on a project that takes the ocr data found in a pdf and places it 
in a custom xml file.

I use Python scripts to create the xml file. Something like this (trimmed down 
a bit):

from xml.dom.minidom import Document
doc = Document()
Page = doc.createElement("Page")
doc.appendChild(Page)
f = StringIO(txt)
lines = f.readlines()
for line in lines:
    word = doc.createElement("String")
    ...
    word.setAttribute("CONTENT", content)
    Page.appendChild(word)
return doc.toprettyxml(indent="  ", encoding="utf-8")


This creates a file, simply, that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<Page HEIGHT="3296" WIDTH="2609">
   <String CONTENT="BuffaloLaunch" />
   <String CONTENT="Club" />
   <String CONTENT="Offices" />
   <String CONTENT="Installed" />
   ...
</Page>

I am able to get this document to be created ok and saved to an xml file. The 
problem occurs when I try and have it read using the lxml library:

from lxml import etree
doc = etree.parse(filename)


I am running across errors like XMLSyntaxError: Char 0x out of allowed range, 
line 94, column 19. Which when I look at the file, is true. There is a 0X 
character in the content field.

How is a file able to be created using minidom (which I assume would create a 
valid xml file) and then failing when parsing with lxml? What should I do to 
fix this on the encoding side so that errors don't show up on the parsing side?
Thanks,
Mike

Mike Beccaria
Systems Librarian
Head of Digital Initiative
Paul Smith's College
518.327.6376
mbecca...@paulsmiths.edu
Become a friend of Paul Smith's Library on Facebook today!


[CODE4LIB] Goose Island - quick stupid question - where does bus leave from

2013-02-13 Thread Jon Gorman
Does the bus leave from the hotel or the uic forum?


Jon Gorman


[CODE4LIB] C4L2013 Game Night - UIC Library

2013-02-12 Thread Jon Gorman
Hi all,

Some quick notes:

Again, there's a sign up for individual games. This will make it
easier for us to get started quickly and also help from having a large
crowd of people just standing around,
http://wiki.code4lib.org/index.php/2013_game_night .  If you brought a
game and want to play it, add it to the list.

We're going to stop playing a little earlier than we had on the wiki.
We're hoping to close and lock the doors at 10:30, so people should
be winding down at 10:00.

It's recommended to travel back to the conference hotel in groups.

Please bring your badge with you so it'll be a bit easier to make sure
folks in the room are people who are supposed to be there.

If there's overflow, we'll try to form groups at the room to go out to
try to find some spaces to game at.  There's some restaurants on
Halsted by the UIC Forum.

Again, the wiki should have the latest info.

Jon G.


[CODE4LIB] Code4LIb 2013 - Game Night - hotel card found

2013-02-12 Thread Jon Gorman
Hi folks,

Someone who attended the game night left their room key. It's been
passed along to some of the folks who will be opening the conference
tomorrow and they'll also make an announcement about it.


Jon Gorman


[CODE4LIB] Game Night Code4Lib 2013

2013-02-08 Thread Jon Gorman
Hi all,

I've been getting some questions and I realized there was some
confusion about the Game night.  I was a bit late in organizing it and
quite frankly haven't done the best job.

I put out a request for people to express their interest by Jan.
14th by signing up on the wiki or sending me an email, but I didn't
actually put that date in the wiki and it was only mentioned in the
email on this list (which was on the 10th of Jan if I remember).  That
wasn't a hard and fast deadline, mostly so we could get an idea of
what sized room we need.  However, in the past week or two, we've
had a lot more people sign up and I've also heard from several
folks now that they thought the signup was only for bringing games.
As it stands, though, I realized this morning we had about 15 people
expressing interest a month ago and now are looking at over twice that
number.

I don't want to turn anyone away, but this does pose some logistical hurdles.

Mea culpa, this is my fault, not any of the Chicago folks.  I'm going
to try to work with the folks on the ground on seeing if we can get
another room at the UIC Library. I'll also try to find out some
surrounding locations that can serve as overspill, like cafes that
would be fine having a table of people show up and play.  I'm also
nervous about the number of games vs people who want to play games.
If you are attending and can bring some games and teach them, that
would be wonderful. (Also, I've run gaming events like this up to
about 20 people, but could really use a person or two to serve as a
helper.  Mainly that just means joining people to games, answering
questions, etc) Due to the scale, I have some ideas like signup sheets
for various games at the registration desk, rather like the signups
for the newcomer's dinner.

Again, sorry about this,

Jon Gorman


Re: [CODE4LIB] Game Night Code4Lib 2013

2013-02-08 Thread Jon Gorman
Hi all,

Sorry for a bit of delay on posting.  I've got a few folks who have
volunteered to help. It's hard to tell numbers for sure, since some
folks might not come and others may show up that haven't signed up.

As Francis says, the solution is likely to be nimble.  (And again, I
want to thank Francis and the rest of the host crew.  They've been
doing fabulous with disorganized folks like me ;) ).

First, some logistical details.

I'm thinking that we'll say that a goal will be to have this rough schedule:

7:30   - start setting up games, getting organized
7:45   - start first round of games
10:30 - start wrapping up.
11:00 - call it a night?  (Walk back or catch the bus as a group to
various hotels may not be a bad idea)

I've got a plan (with helpful advice from several folks, thanks!), and
we'll see if it works. I'm going to work a bit tonight on setting up
a new page on the wiki. It's going to be structured in a manner that's
similar to the newcomer dinner, but instead will be games. Each game
will have a number of seats. If you're bringing a game and are willing
to play/teach it, add an entry.  Estimate a starting time if it's not
going to be when game night starts.

It'll probably look something like...

Game Name (#n - if more than one entry for the game, add a number to
make less confusing) 7:45.
Game description (with maybe link to boardgamegeek)
1. Patty Gauzweiller (T)
2. Leslie Humphries
3. Mona Wert
4. Eddie Ramirez
5.

To sign up, put your name in one of the seats.  Don't add seats ;). If
you can teach/lead the game, note it. (If you want to teach but not
play, that's awesome. I haven't quite figured out how to note this,
but I'm thinking I'll just add a line at the bottom.)

We'll try to set up sections big enough for the games and put up signs.

Here's the warning.  I'll probably be making judgement calls on what
games get set up in the main room, preferring games that I and any
volunteers just coordinating can teach to go there and also based on
other factors.  If we hit the reasonable size for the room, we'll try
to have some recommendations for places to go with the group.

This is probably not the ideal solution as it makes quicker/lighter
games somewhat tricky, but I'm hoping that for some of those, people
won't mind playing multiple games in a row, maybe teaching someone
who will teach the next group and allowing a little mingling that way.

Jon Gorman




On Fri, Feb 8, 2013 at 4:06 PM, Francis Kayiwa kay...@uic.edu wrote:
 On Fri, Feb 08, 2013 at 04:39:22PM -0500, Cynthia Ng wrote:
 Just an idea if space is really an issue. Would it be possible to
 simply get a second room next to (or at least nearby) the first one?
 As I imagine not everyone will be playing the same game, I don't see it
 as a problem.

 As I said to Jon. The people here will have to be nimble. The big
 problem is as a `historically` commuter campus the open spaces become a
 premium late at night. What we will need from those who signed up is
 willingness to track email/wiki for changes. I've asked for other spaces
 but no word yet. Finally unless you have more than 40, this room will
 fit the current number without a problem.

 Also no (as they will find out when they get there) it isn't a matter of
 spill over to the next room.

 Cheers,
 ./fxk



Re: [CODE4LIB] Game Night Code4Lib 2013

2013-02-08 Thread Jon Gorman
I've added the page at
http://wiki.code4lib.org/index.php/2013_game_night.  Sorry, I realize
this is a bit last minute.

If for some reason you can't edit the wiki but want to sign up for a
slot or add a game you're willing to run, send the info to me.  I'll
add it as I get time.

I'll probably be adding some more of my games, but I need to go to dinner ;).

Jon Gorman


[CODE4LIB] Code4Lib 2013 - Game Night

2013-02-06 Thread Jon Gorman
Hi all,

Just a brief email to say that I sent an email to all the folks who
have supplied contact info for the Game Night. It's not required that
you do so, but if you were thinking of attending, please sign up at
http://wiki.code4lib.org/index.php/2013_social_activities#Game_Night.21
so we know how many people are coming.

If you sent me contact info in order to be kept in the loop for last
minute changes and I didn't send an email directly to you a little
while ago, send it again.  I apologize, things have been a bit hectic
lately and I'm almost positive I left someone off that sent me an
email.

Jon Gorman


Re: [CODE4LIB] Code4Lib Conference streaming?

2013-01-30 Thread Jon Gorman
Three cheers for UIC folks!

Jon Gorman


[CODE4LIB] C4L2013 Game Night - UIC Library - Tuesday 11th, 7:30 pm

2013-01-23 Thread Jon Gorman
Hi all,

Thanks to Francis, we've got a room for the game night at the UIC
Library. Looks like it'll start at 7:30 pm, to give folks time to get
dinner.  Not sure yet how late it can go.

I'm going to be updating/modifying info on the social wiki (will move
some of the stuff out to its own section).  I'll try to get to that
tonight or tomorrow night.

If you want me to also send you email when I make changes to the wiki
page (http://wiki.code4lib.org/index.php/2013_social_activities) or
get more info about Game Night, send an email w/ the subject starting
with "C4L2013 Game Night". Also, if you don't mind texts and want to be
alerted of any last minute changes without checking the wiki, reply to
me personally with phone info.

I'll also try to add notes for the people who signed up on the wiki (or
reply to personal emails) on games they might bring so we don't end up
with 20 sets of regular playing cards taking up valuable luggage space
;).  I'll be bringing a number of games from my personal collection as
well.

Sorry for the brief note, but wanted to get something out.  I'll
probably not send any more emails about this directly to the list, so
again, send me an email starting with C4L2013 Game Night if you want
to be notified or keep an eye on the wiki.

Jon Gorman


Re: [CODE4LIB] Zoia

2013-01-18 Thread Jon Gorman
On Fri, Jan 18, 2013 at 9:38 AM, Karen Coyle li...@kcoyle.net wrote:

 ... and BTW, if people see Zoia as a bit of a problem during the conference,
 doesn't that mean that Zoia is a bit of a problem all of the time? Is there
 a reason to be polite and inclusive during the conference but not every day?

There's actually two different but closely related issues:

1) Plugins that generate a lot of information/responses which have
been a problem as they can interrupt flow of questions/discussions
during the conference. @blockparty lists what songs people are playing
that have registered their irc nick & scrobble.  It produces a lot of
lines and a couple of calls can cause people's screens to
scroll-off.  Not a problem with the normal traffic in the room, but
when going from maybe 20/30 active participants to hundreds it can be
an issue.

There's probably some others like @google or @naf with a long response
that could be disabled as well.  @naf is a nice one for demonstrating
zoia, but @marc is pretty compact and also wonderfully library-centric
;).

2) Plugins that are crude/offensive like @mf and the urban dictionary one.

I think the thread kicked off with the first one, but I think it
rapidly brought in the issue of the latter.  I'm in agreement that the
latter category probably should be just removed.  The first category
probably would be useful to disable during the conference but otherwise keep.

Jon Gorman


Re: [CODE4LIB] A gentle proposal: slim down zoia during the conference

2013-01-17 Thread Jon Gorman
I like the ideas of disabling some of the @zoia bot plugins for the
conference at least.


 For what it's worth, Jon Gorman was working on a version of `@herald`
 that provided introductory information to those new to the IRC
 channel. (I'm hoping he can speak to details.)

Details of Greeter (the Herald-intro bot):

It's my first foray into both supybot and python.  I've got a couple
of things still on the todo list before throwing it in channel. I did
a fork that can be seen here:

https://github.com/jtgorman/supybot-plugins/tree/master/plugins/Greeter

I think off hand I have something that seems to mostly work, but I
want to get @greeter add nick and @greeter remove nick so people can
prevent alternative nicks from being spammed and some sort of init
routine that pulls in a list of nicks to ignore if the db is not
present.  The latter may just end up waiting, I don't know. Feel free
to submit pull requests.  I'll then try to figure out the git magic to
get into code4lib.  (Or I'll just check out a fresh version of the
code4lib and copy the directory and commit that)
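
The behaviour described above can be sketched in plain Python as follows. This is not the plugin's actual code and does not use the supybot API; the `Greeter` class, its method names, and the reading of "add nick" as "stop greeting this nick" are all assumptions made for illustration.

```python
class Greeter:
    """Greets a nick the first time it joins, unless the nick is already known."""

    def __init__(self, message, known_nicks=()):
        self.message = message
        # Stand-in for the plugin's db, seeded from an initial ignore list.
        self.known = {nick.lower() for nick in known_nicks}

    def add_nick(self, nick):
        """Assumed meaning of '@greeter add nick': never greet this nick."""
        self.known.add(nick.lower())

    def remove_nick(self, nick):
        """Assumed meaning of '@greeter remove nick': greet it on next join."""
        self.known.discard(nick.lower())

    def on_join(self, nick):
        """Return the greeting for a new nick, or None for a known one."""
        key = nick.lower()
        if key in self.known:
            return None
        self.known.add(key)
        return self.message


greeter = Greeter("Welcome to code4lib!", known_nicks=["zoia"])
print(greeter.on_join("newcomer"))  # greeted on first join
print(greeter.on_join("newcomer"))  # not greeted again
```

Seeding `known_nicks` in the constructor stands in for the planned init routine that pulls in a list of nicks to ignore when no db is present.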

Hoping to get something in shape by the end of the week that can be
added to Zoia. Suggestions on the message welcome.  (Right now it has
"Welcome to code4lib! Visit http://code4lib.org/irc to find out more
about this channel. Type @helpers for a list of people in channel who
can help." Going to change @helpers into @helpers #code4lib.)

Thanks to Mark who reminded me about the @helpers plugin. ( I don't think it's


Jon Gorman


Re: [CODE4LIB] Game Night during Code4Lib 2013

2013-01-14 Thread Jon Gorman
Hi,

At the moment it looks like we've got about 11 people or so interested
in the game night.  I'm thinking at this point of scheduling it for
later on Tuesday to avoid conflicts with the newcomer dinners.  I will
(with the wonderful assistance of the hosts) start looking at some
possible locations and transport.

More details to follow.

Jon Gorman


Re: [CODE4LIB] code4lib 2013 location

2013-01-11 Thread Jon Gorman
Gah, I think I forgot to announce this on the list, but there's also
this google map:
https://maps.google.com/maps/ms?msid=213549257652679418473.0004ce6c25e6cdeb0319dmsa=0

which I put on the social page
http://wiki.code4lib.org/index.php/2013_social_activities

I'll go ahead and add the hotel and conference site to that as well if
it's not already there.

On Fri, Jan 11, 2013 at 7:12 PM, Bill Dueber b...@dueber.com wrote:
 Because it seems like it might be useful, I've started a publicly-editable
 google map at

 http://goo.gl/maps/LWqay

 Right now, it has two points: the hotel and the conference location. Please
 add stuff as appropriate if the urge strikes you.




 On Fri, Jan 11, 2013 at 7:54 PM, Francis Kayiwa kay...@uic.edu wrote:

 On Fri, Jan 11, 2013 at 06:41:26PM -0500, Cynthia Ng wrote:
  I'm sorry, but that doesn't actually clear up anything for me. The
  location on the lanyrd page just says Chicago. So, is the conference
  still happening at UIC? Since the conference hotel isn't super close,
  does that mean there will be transportation provided?

 The entire conference and pre-conference is at UIC. The Forum is a
 revenue generating part of UIC. The pre-conference will be at the
 University Libraries on Monday with the exception of the Drupal one.

 The hotel is a mile or thereabouts from UIC Forum. Here is the problem
 with us natives planning. It never crossed our minds that walking a mile
 while on the *upper limit* of our shuttling to and from work is not the
 norm for everyone. This was brought to our attention and we will have a
 shuttle from the Hotel to the Conference venue.

 
  While we're on the subject, are the pre-conferences happening at the
  same location?


 See above.

 ./fxk

 
  On Fri, Jan 11, 2013 at 2:51 PM, Francis Kayiwa kay...@uic.edu wrote:
   On Fri, Jan 11, 2013 at 10:41:54AM -0800, Erik Hetzner wrote:
   Hi all,
  
   Apparently code4lib 2013 is going to be held at the UIC Forum
  
 http://www.uic.edu/depts/uicforum/
  
   I assumed it would be at the conference hotel. This is just a note so
   that others do not make the same assumption, since nowhere in the
   information about the conference is the location made clear.
  
   Since the conference hotel is 1 mile from the venue, I assume
   transportation will be available.
  
   That's a good assumption to make. As to the confusion  I said to you
   when you asked me about this a couple of days ago.
  
   http://www.uic.edu/~kayiwa/code4lib.html was supposed to be our
   proposal. If you look at the document it also suggests that we were
   going to have the conference registration staggered by timezones. We
   have elected not to update that because that was our proposal. When
   preparing our proposal we borrowed heavily from Yale's and IU's
 proposal
   and if someone would like to steal from us I think it is fair to leave
   that as is.
  
   If you want the conference page use the lanyrd.com link below. I can't
   even take credit for doing that. All of that goes to @pberry
  
   http://lanyrd.com/2013/c4l13/
  
   Cheers,
   ./fxk
  
  
  
  
   best, Erik Hetzner
  
   Sent from my free software system http://fsf.org/.
  
  
  
  
   --
   Speed is subsittute fo accurancy.
 

 --
 Speed is subsittute fo accurancy.




 --
 Bill Dueber
 Library Systems Programmer
 University of Michigan Library


[CODE4LIB] Game Night during Code4Lib 2013

2013-01-10 Thread Jon Gorman
Hi all,

I'm trying to gauge interest in Game Night during Code4Lib 2013.  Now,
I've signed up for the Wednesday Goose Island tour, but can back out
of that if Wednesday night works the best.

Right now there are a handful of folks on the wiki who have expressed
interest, but could you send me an email or sign up on
http://wiki.code4lib.org/index.php/2013_social_activities by Monday
morning (the 14th) if you would like to go? Also, some indication of
games you might be able to bring or games you like to play would be
useful. I'm just trying to figure out how many folks are interested so
I have a rough idea of number of games and the space we need.

Also I'm leaning towards Monday or Tuesday night, but letting me know
a night preference as well might be useful. (If this is what people
would like for a Wednesday night non-beery alternative, I have other
chances to do a Goose Island tour ;) ).

Jon Gorman


Re: [CODE4LIB] basic IRC question/comments

2012-12-10 Thread Jon Gorman
 You can also choose to anonymize yourself by choosing a nick that best 
 represents something you're interested
 in or identify with that is not used on other social spheres. It really is 
 completely up to you on what you feel most
 comfortable with and there is typically no hard/fast rules.

One thing to keep in mind is that your nick might be anonymous, but
irc in general is done in the clear  and some connection information
will be published by default. I think that's partially a legacy of how
long IRC has been around.

When someone logs into a channel you'll see something like
foo...@1241workstation.uiowa.edu.  There's ways to cloak that id by
registering that nick and donating some money to the organization that
runs freenode, pdpc.  That's a bit trickier to setup.  The user
registration faq of freenode can be useful:
http://freenode.net/faq.shtml#userregistration.

So when someone who is registered and cloaked logs in, the
connection will display something like "foobar@professional.cloaked has
joined the channel" (I can't remember the exact string).

So just know that if someone is logging the channel (which is
possible, there's plenty of clients and ways to do it) and you come in
several times with different nicks but the same network address
they'll know it's likely the same person.
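
To make the point above concrete, here is a minimal Python sketch of how
little it takes to link nicks that join from the same address. The log
line format and the sample hosts are hypothetical (every client logs joins
a bit differently), so treat the regular expression as an assumption to
adjust, not as any client's real format.

```python
import re
from collections import defaultdict

# Hypothetical log format: "nick (user@host) has joined #channel".
# Real clients differ; adjust the pattern to match your own logs.
JOIN_RE = re.compile(
    r"^(?P<nick>\S+) \((?P<user>[^@\s]+)@(?P<host>[^)\s]+)\) has joined"
)

def nicks_by_host(log_lines):
    """Map each host to the set of nicks seen joining from it."""
    seen = defaultdict(set)
    for line in log_lines:
        match = JOIN_RE.match(line)
        if match:
            seen[match.group("host")].add(match.group("nick"))
    return dict(seen)

# Made-up sample data for illustration only.
sample_log = [
    "foobar (~fb@ws1241.example.edu) has joined #code4lib",
    "libfan (~lf@ws1241.example.edu) has joined #code4lib",
    "other (~o@dsl.example.net) has joined #code4lib",
]

# Report hosts that more than one nick has joined from.
for host, nicks in nicks_by_host(sample_log).items():
    if len(nicks) > 1:
        print(host, "->", sorted(nicks))
```

With the sample data, the two nicks sharing ws1241.example.edu are reported
together, which is exactly the correlation described above.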

Jon Gorman


Re: [CODE4LIB] basic IRC question/comments

2012-12-10 Thread Jon Gorman
Oh, forgot to mention. If you use a web client or use tor, that will
obscure the connection info by the nature of that connection ;).

Jon Gorman



Re: [CODE4LIB] basic IRC question/comments

2012-12-10 Thread Jon Gorman
And, sorry for being annoying, but some things were pointed out to me
in #code4lib, so I'm issuing yet another followup.

1) the technique freenode uses for cloaks isn't as strong as it used
to be.  Also, it's possible to accidentally log in without a cloak,
etc. Don't expect them to be very secure.

2) There's ways to get a cloak without financial contribution.  How
exactly to do this I leave as an exercise to the reader.  I never
really worried about it too much, the cloak was just a perk when I
made the donation.

3) Apparently most web clients will pass on the browser ip, not the
server ip address.  So don't count on that to make you anonymous.

So the general thrust is, if you really, really need anonymous
communication, be wary of irc.  However, in general people usually
respect the nicks from my experience and won't press people for their
actual identities.

Also, as mentioned before, most irc servers/channels are not encrypted
and pretty easy to log.

Jon Gorman





Re: [CODE4LIB] Mentorship Buddies

2012-11-28 Thread Jon Stroop

Having a sort of speed dating setup might help make better fits between 
mentors and mentees, as well.
+1, not only to satisfy the 'room full of nerds' case, but also the fact 
that people spend their free time @ code4libcon in a variety of ways, 
and not everyone might want to, e.g., wind up in the hospitality suite.



On 11/28/2012 09:45 AM, Ross Singer wrote:

On Nov 27, 2012, at 9:33 PM, Cynthia Ng cynthia.s...@gmail.com wrote:


Getting traction for mentoring online is always difficult, but what
about starting that mentorship at code4libcon?


+1 - being face-to-face might help ease the tension.

Having a sort of speed dating setup might help make better fits between 
mentors and mentees, as well.

That is, a roomful of nerds deferring passively to one another might not get us 
very far :)  Something more structured about what people want to learn and what 
mentors know and how they get along together would probably make for a more 
productive outcome.

-Ross.


Maybe almost like a buddy system, so that the first meeting between a
mentor and mentee is at a code4libcon (national, regional, or
otherwise) if possible.

This might simply be a good idea for first timers who are not going
with colleagues too.

Just throwing out some ideas here...

On Tue, Nov 27, 2012 at 7:49 PM, Nick Ruest rue...@gmail.com wrote:

Matt McCollow proposed something like this a while back. We have a page up
and everything! But, it never got much traction.

http://www.mail-archive.com/code4lib@listserv.nd.edu/msg14270.html
http://wiki.code4lib.org/index.php/Mentorship

-nruest

On 12-11-27 07:30 PM, Bess Sadler wrote:

+1 to this idea. I have benefited tremendously over the years from kind
people taking me under their wings. Many of us try to do this one-on-one,
but some kind of introduction service would be a huge benefit for the
community, I would think.

Mentorship is a great example of a robust solution - a solution that
addresses more than one problem at once. I suspect that this would not only
improve our diversity as a community, it might also solve some tech
leadership / succession planning problems and maybe expose some training
needs.

Bess

On Nov 27, 2012, at 4:20 PM, Nathan Tallman ntall...@gmail.com wrote:


This is a slightly different topic, but relates to Kelley's post: Does
code4lib have a mentor program where more inexperienced geeks can pair up
with someone to guide their development? I don't have anyone like that in
my network, but would really like to. I don't mean to discount the existing
resources on code4lib or this list, which both have been very useful. I'm
sure I could just start by attending some of the conferences, but for more
inexperienced people they can be a bit intimidating, albeit inspiring.

It would also be a way to directly engage minorities.

Just a thought.

Nathan


On Tue, Nov 27, 2012 at 6:20 PM, Kelley McGrath kell...@uoregon.edu
wrote:


I'll second the idea of approaching people individually and explicitly
asking them to participate. It worked on me. I never would have written my
first article for the Code4Lib Journal or become a member of the editorial
committee if someone hadn't encouraged me individually (Thanks Jonathan!).

It would also be good to find a way to somehow target the pool of lurkers
who maybe aren't already connected to someone and get them more involved.

As far as anonymous proposals go, we recently had a very good workshop on
implicit bias here. Someone brought up that there were significant changes in
the gender proportions in symphony orchestras after candidates started
auditioning behind screens. There are also lots of studies about the
different responses to the same resume/application depending on whether a
stereotypically male/female or white/black name was used. Probably it's
impossible to make proposals completely anonymous, but it would be an
interesting experiment to leave off the names.

Kelley

PS Interestingly, I wouldn't instinctively self-identify as a member of
the Code4Lib community, although my first thought is that that has more to
do with not being a coder than with being a woman.


**
Kelley McGrath
Metadata Management Librarian
University of Oregon Libraries
1299 University of Oregon
Eugene, OR 97403

541-346-8232
kell...@uoregon.edu


--
-nruest


Re: [CODE4LIB] anti-harassment policy for code4lib?

2012-11-26 Thread Jon Stroop
It's sad that we have to address this formally (as formal as c4l gets 
anyway), but that's reality, so yes, bess++ indeed, and mjgiarlo++, 
anarchivist++ for the quick assist.


The responses to the list in the past couple of hours alone suggest that 
this is something much of the community would want to get behind. To 
that end, and as a show of (positive) force--not to mention how cool our 
community is--I think it might be neat if we could find a way to make 
whatever winds up being drafted something we can sign; i.e. attach our 
personal names. I don't know how that would work exactly...maybe via the 
wiki (where it seems to me a lot of good info goes to die) or the 
code4lib Github (slightly better since you could link to your 
credentials in an environment much larger than our own, and everyone 
could have a copy), but something along those lines. I'm happy to help 
if I can.


Anyway, just a thought.
-Jon

--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library

jstr...@princeton.edu

http://pudl.princeton.edu
http://findingaids.princeton.edu


On 11/26/12 6:33 PM, Michael J. Giarlo wrote:

All,

Building on what Bess and others have written, and on the GitHub repo that
anarchivist set up, I've contributed a rough draft of a Code4Lib code of
conduct:

https://github.com/code4lib/antiharassment-policy/blob/master/code_of_conduct.md

This strawperson code of conduct is based on DLF Forum's, which is based on
the Ada Initiative's sample policy. It is modified slightly to reflect a
broader scope of the conference, conference social events, the IRC channel,
and the mailing list.

Throw darts, rinse, repeat.

-Mike


On Mon, Nov 26, 2012 at 6:10 PM, Robert Sanderson azarot...@gmail.com wrote:


+1, of course :)

You might wish to consider some further derivatives/related pages:
 http://www.diglib.org/about/code-of-conduct/
 http://wikimediafoundation.org/wiki/Friendly_space_policy
 https://thestrangeloop.com/about/policies
 http://www.apache.org/foundation/policies/anti-harassment.html

Rob



On Mon, Nov 26, 2012 at 3:57 PM, Mariner, Matthew 
matthew.mari...@ucdenver.edu wrote:


+1 for all of the below

Matthew C. Mariner
Head of Special Collections and Digital Initiatives
Assistant Professor
Auraria Library
1100 Lawrence Street
Denver, CO 80204-2041
matthew.mari...@ucdenver.edu
http://library.auraria.edu :: http://archives.auraria.edu





On 11/26/12 3:51 PM, Tom Cramer tcra...@stanford.edu wrote:


+1 for Bess's motion
+1 for Roy's expansion to C4L online interactions as well as face to face
+1 for Karen's focus on general inclusivity and fair play


For me the hardest thing is how one monitors and resolves issues that
arise. As a group with no formal management, I suppose the conference
organizers become the deciders if such a necessity arises. If it's
elsewhere (email, IRC) -- that's a bit trickier. The Ada project's
detailed guides should help, but if there is a policy it seems that
there necessarily has to be some responsible body -- even if ad hoc.


It seems to me that there would be tremendous benefit in having

1.) an explicit statement of the community norms around harassment and
fair play in general. In the best case, this would help avoid
uncomfortable or inappropriate situations before they occur.

2.) a defined process for handling any incidents that do arise, which in
the case of this community I would imagine would revolve around
reporting, communication, negotiation and arbitration rather than
adjudication by a standing body (which I agree is hard to see in this
crowd). I know several high schools have adopted peer arbitration
networks for conflict resolution rather than referring incidents to the
Principal's Office--perhaps therein lies a model for us for any incidents
that may not be resolved simply through dialogue.

- Tom



On Nov 26, 2012, at 2:32 PM, Karen Coyle wrote:


Bess and Code4libbers,

I've only been to one c4l conference and it was a very positive
experience for me, but I also feel that this is too valuable of a
community for us to risk it getting itself into crisis mode over some
unintended consequences or a bad apple incident. For that reason I
would support the adoption of an anti-harassment policy in part for its
consciousness-raising value. Ideally this would be not only about sexual
harassment but would include general goals for inclusiveness and fair
play within the community. And it would also serve as an acknowledgment
that none of us is perfect, but we can deal with it.

For me the hardest thing is how one monitors and resolves issues that
arise. As a group with no formal management, I suppose the conference
organizers become the deciders if such a necessity arises. If it's
elsewhere (email, IRC) -- that's a bit trickier. The Ada project's
detailed guides should help, but if there is a policy it seems that
there necessarily has to be some responsible body -- even if ad hoc.

kc


On 11/26/12 2:16 PM, Bess

Re: [CODE4LIB] extracting tiff info

2012-11-19 Thread Jon Stroop
If you want everything in that RDF, you're probably wanting to extract 
the XMP data. Have a look at exiv2: http://www.exiv2.org/


Basically:

 exiv2 -px your_image.tif

will dump what you want to stdout.
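
Since the goal is a spreadsheet, a rough Python sketch of wrapping that
command might look like the following. It assumes exiv2 is on the PATH and
that each line of `exiv2 -px` output has the usual four columns (key,
type, count, value); the function names are made up for illustration, and
the parsing should be checked against real output from your files before
trusting it.

```python
import csv
import subprocess

def parse_exiv2_line(line):
    """Split one line of `exiv2 -px` output into (key, value).

    Assumes a four-column layout: key, type, count, value.
    Returns None for lines that do not fit that shape.
    """
    parts = line.split(None, 3)
    if len(parts) < 3:
        return None
    key = parts[0]
    value = parts[3].strip() if len(parts) == 4 else ""
    return key, value

def xmp_properties(path):
    """Run exiv2 on one image and return its XMP properties as a dict.

    If a key appears on more than one line, the last value wins.
    """
    result = subprocess.run(
        ["exiv2", "-px", path],
        capture_output=True, text=True, check=False,
    )
    props = {}
    for line in result.stdout.splitlines():
        parsed = parse_exiv2_line(line)
        if parsed:
            props[parsed[0]] = parsed[1]
    return props

def write_csv(paths, out):
    """Write one row per image, one column per XMP key seen in any image."""
    rows = [(p, xmp_properties(p)) for p in paths]
    keys = sorted({k for _, props in rows for k in props})
    writer = csv.writer(out)
    writer.writerow(["file"] + keys)
    for path, props in rows:
        writer.writerow([path] + [props.get(k, "") for k in keys])

# Example use (not run here, since it needs exiv2 and real files):
#   with open("metadata.csv", "w", newline="") as out:
#       write_csv(["your_image.tif"], out)
```

Only the metadata is read from each file, but the sketch runs exiv2 once
per image, so a few thousand large files on a slow network drive will still
take a while.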
-Jon

--
Jon Stroop
Digital Initiatives Programmer/Analyst
Princeton University Library

On 11/19/2012 04:31 PM, Kyle Banerjee wrote:

Howdy all,

I need to extract all the metadata from a few thousand images on a network
drive and put it into spreadsheet. Since the files are huge (each is
100MB+) and my connection isn't that fast, I strongly prefer to not move
them before working on them -- i.e. I'm using cygwin and/or windows.

Just eyeballing these things, I see the headers contain everything I need
in purty rdf. What's the best way to extract this? I thought tiffinfo would
do the trick, but it's just giving me technical info. Of course I can just
parse the files with perl but I'm thinking there just has to be a slicker
way to do this. What's my best option? Thanks,

kyle


Re: [CODE4LIB] Mobile device usage (iOS vs. Android)

2012-10-30 Thread Jon Gorman
 Any thought?

I guess I'd be somewhat wary of comparing general trends to a more
defined population.  I'm guessing your campus population is not
typical of the national population, instead probably skewed towards a
younger population with higher disposable income (and also perhaps
more sensitive to peer pressure) and hence might not follow general
trends ;).

Also, how is your 70% traffic figured?  Do you have any way to
determine if perhaps a few outliers are creating a significant amount
of traffic?  (In other words, do you know if the mobile traffic
actually represents ownership, or might there be a smaller group of
iPhone users who happen to use the library services more? I'd guess
the smaller the population accessing via mobile, the more likely a
small population could skew the results.)

Also, how are you measuring the Android users?  Is it possible you're
missing some who would be using non-default browsers or browsers
modified by a carrier?
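
As a rough illustration of both questions, here is a small Python sketch
that counts requests and distinct visitors per platform. The substring
checks are a crude assumption about User-Agent strings, the visitor ids
and sample strings are made up, and browsers that don't announce "Android"
would land in "other", which is the undercounting risk raised above.

```python
from collections import Counter

def classify_user_agent(ua):
    """Very rough platform guess from a User-Agent string."""
    ua = ua.lower()
    if "iphone" in ua or "ipad" in ua or "ipod" in ua:
        return "ios"
    if "android" in ua:
        return "android"
    return "other"

def platform_counts(hits):
    """hits: iterable of (visitor_id, user_agent) pairs.

    Returns (requests per platform, distinct visitors per platform).
    """
    requests = Counter()
    visitors = {}
    for visitor_id, ua in hits:
        platform = classify_user_agent(ua)
        requests[platform] += 1
        visitors.setdefault(platform, set()).add(visitor_id)
    return requests, {p: len(v) for p, v in visitors.items()}

# Made-up sample: one heavy iOS user, two light Android users.
sample_hits = [
    ("a", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)"),
    ("a", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)"),
    ("a", "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)"),
    ("b", "Mozilla/5.0 (Linux; U; Android 4.0.4; en-us)"),
    ("c", "Mozilla/5.0 (Linux; U; Android 2.3.6; en-us)"),
]

requests, visitors = platform_counts(sample_hits)
print("requests:", dict(requests))
print("visitors:", visitors)
```

In the sample, iOS has most of the requests but fewer distinct visitors,
which is the kind of skew a traffic percentage alone would hide.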

I don't unfortunately have any stats, but I do seem to remember seeing
some numbers locally that would indicate iOS count of web usage is
still pretty high. Android phones are becoming very, very cheap but
data plans aren't.  Also, the form factor and the processing power of
some of the cheaper androids make web searching less than thrilling.
I could see someone using an Android that they get for free, but not
accessing the library for a variety of reasons.

It would be interesting if one could compare the usage of different
Android devices but the difficulty of data collection here might be
enormous. (I'm not sure offhand if there's an easy way to
distinguish, say, a Samsung Galaxy 2 from an Optimus.)

Jon Gorman


Re: [CODE4LIB] haititrust

2012-08-03 Thread Jon Stroop
You can do an empty query in their catalog, and use the Original 
Location facet to filter to a holding library. Programmatically, I'm not 
sure, but you'd probably need to use the Hathi files: 
http://www.hathitrust.org/hathifiles.
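
If you go the Hathi files route, the matching itself could be sketched in
Python roughly as below. This is only a sketch under assumptions: it
treats the file as tab-delimited text, matches on OCLC numbers (my choice
of identifier, not something confirmed here), and leaves the column
position as a parameter because I haven't checked the actual layout; look
that up in the hathifiles documentation. The sample rows are invented.

```python
import csv

def held_in_hathitrust(hathifile_lines, local_oclc_numbers, oclc_column):
    """Return the local OCLC numbers that also appear in a hathifile.

    hathifile_lines: iterable of tab-delimited text lines.
    oclc_column: zero-based index of the OCLC-number field; an assumption
    the caller must check against the hathifiles documentation.
    """
    local = {n.strip() for n in local_oclc_numbers}
    found = set()
    reader = csv.reader(hathifile_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= oclc_column:
            continue
        # A field may hold several comma-separated numbers (assumption).
        for number in row[oclc_column].split(","):
            number = number.strip()
            if number in local:
                found.add(number)
    return found

# Invented rows for illustration; real hathifiles have many more columns.
sample_rows = [
    "vol1\tallow\t12345\n",
    "vol2\tdeny\t67890,11111\n",
    "vol3\tallow\t\n",
]

print(sorted(held_in_hathitrust(sample_rows, ["11111", "99999", "12345"], 2)))
```

For a real run you would pass an open file handle for the downloaded
hathifile instead of the sample rows, plus the OCLC numbers exported from
your own catalog.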


-Jon

On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:

If I needed/wanted to know what materials held by my library were also in the 
HathiTrust, then programmatically how could I figure this out? In other words, 
do you know of a way to query the HathiTrust and limit the results to items my 
library owns? --Eric Lease Morgan


[CODE4LIB] 2012 VIVO Conference

2012-05-24 Thread Jon Corson-Rikert
(apologies for any cross-postings)

In the past 3 years, a growing international movement of developers, 
researchers, administrators, funders, librarians and informaticians has 
converged around the vision of openly representing research and researchers via 
Linked Open Data. VIVO is helping to make this vision a reality through its 
community, through open software and the VIVO ontology, and a growing number of 
adopters and collaborators worldwide, across multiple knowledge domains.  The 
2012 VIVO conference will explore how to participate in and best take advantage 
of the emerging Linked Open Data world encompassing and expanding our 
understanding of research.
 
Who should attend?
Scholars, scientists, researchers, developers, librarians, publishers, funding 
agencies, research officers, students, institutional officials and those 
supporting the development of research discovery, data sharing and team science.
 
Conference highlights
The conference begins with a full day of workshops for those new to VIVO, those 
implementing VIVO and those wishing to develop applications using VIVO.  
Keynote addresses, invited speakers, scientific panels, contributed papers and 
posters will cover a range of topics, including the semantic web, linked open 
data, VIVO sustainability, adopting and implementing VIVO, research networking, 
network visualization, ontology and the role of VIVO in support of team science.
 
Registration, Call for Papers and Apps Contest, hotel and travel information
http://vivoweb.org/conference

Topics of interest
* Facilitating researcher collaboration and networking
* Managing/discovering knowledge about researchers across institutional, 
disciplinary, and national boundaries
* Approaches to the adoption of VIVO and related systems that interoperate 
through shared ontologies and Linked Open Data
* The intersection of VIVO and international research standards
* Research representation ontology development
* Open representations of research and implications for the research process, 
collaboration, and virtual research communities
* Perspectives on policy, research representation, and research impact, 
including questions of privacy, individual vs. institutional sourcing of data, 
and change over time
* Semantic Web development and extensions of the VIVO platform to reach the 
full Web community
* Open research data and related issues in discovery, reuse, and attribution
 
About VIVO
VIVO is an open source, open ontology, open process platform for hosting 
information about scientists’ interests, activities and accomplishments.  VIVO 
supports open development and integration of science through simple, standard 
semantic web technologies.  Learn more at http://vivoweb.org


Jon Corson-Rikert
Head, Information Technology Services
VIVO Development Lead
201 Albert R. Mann Library
Cornell University
Ithaca, NY 14853
607 255-4608
j...@cornell.edu


Re: [CODE4LIB] Sharing code

2012-03-12 Thread Jon Gorman
On Fri, Mar 9, 2012 at 11:34 AM, Whitworth, Cliff
cliff.whitwo...@unt.edu wrote:
 NOOB to list and am appreciative of this discussion. My boss is encouraging 
 me to share code and pointed me to code4lib. the majority of my code is 
 recycled / repurposed from others so I've had reservations about sharing 
 mainly because of what's taken from others. At the least, I'm mindful about 
 leaving acknowledgements intact. Is there a good resource on how to start 
 sharing code and ethical considerations?


Howdy and welcome Cliff!

In short, I think there's a push over the past few years to share more
and more code, even when it's small.  There's a lot of individuals
scattered in the library world who are not necessarily on local teams
who end up doing the same work over and over again.  There's some
tension with this as there's also projects that tend to get abandoned
or just don't have as much support and community as they could.

I've been bad about releasing source myself.  I've got a barrier in
our lawyers, who I really need to push to let me have more leeway for
releasing stuff.


There's been a couple of articles over the years on the code4lib journal, see...

First, an argument on why to just put stuff out there by Dale Askey:
COLUMN: We Love Open Source Software. No, You Can’t Have Our Code
http://journal.code4lib.org/articles/527

See Terry Reese's excellent article in the latest issue: Purposeful
Development: Being Ready When Your Project Moves From ‘Hobby’ to
Mission Critical http://journal.code4lib.org/articles/6393

Michael Doran gave an excellent talk a few years back that really
stuck in my head with the very issue I've been reluctant to put more
effort into: lawyers and code:  The Intellectual Property Disclosure:
OpenSource in Academia
http://video.google.com/videoplay?docid=-3341633878207243364

There's a lot of other good articles in the journal and on people's
various blog posts.  Github is all the rage these days, so at some
point I'll need to figure out how to use it ;).

Again, welcome!

Jon Gorman


Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records

2012-03-09 Thread Jon Gorman
 It used to be that way, at least it was this way when I grew up in open
 source (in the 90s, before Eric Raymond invented the term). And it makes
 sense, for successful projects that have at least a moderate number of
 users.  Just dumping your code on github helps very few people.


You realize this isn't Apache, right?  It seems a small project,
mostly maintained by folks as they get time.  There's no SCRUM
meetings or hallway meetings, no foundation, no checklist.  Surely you
can't generalize from two interactions as reflective of the culture
of open source.  It seems to have been a small piece of code shared
so others wouldn't have to do it over again and it's grown with time.
The primary thrust seems to be for library developers, not catalogers
or folks learning python code.

The typo you brought up was patched by one of the team members within
an hour or two from what I can tell.  (Assuming you meant issue #22
https://github.com/edsu/pymarc/issues/22).  From what I can tell
someone patched it in less than an hour.

In general though github is the sourceforge of years past, but even
better.  It seems entirely reasonable to ask for a patch to me.
Perhaps it could have been handled more delicately by both sides.
Perhaps you weren't treated as nicely as you'd like.  There's probably
some truth to that.  But at the same time, Ed did include a wink at
the end after requesting the patch.  Had you perhaps cut him some
slack instead of immediately responding incredulously, you'd have found
it was fixed when he got time. Or not.  He has his own priorities as do
other folks who contributed to the code.

If you're unhappy with the dump on github approach, then don't use the
software.  No one ran around forcing folks to do it.  It's one of
those lightweight github approaches, just another approach to open
source software.  In all the years I've also been involved with open
source every project has had its own unique culture.  There's
responsibility on the user before using software to figure out what it
is.  If it doesn't meet their expectation, I see little reason that
the developer should feel compelled to change unless they're getting
paid for the work.  Obviously some people have found the dump on
github approach useful if they've contributed patches.

Can't we all just shake hands virtually or something?

Jon Gorman


Re: [CODE4LIB] Microsoft Transit-SQL

2012-03-06 Thread Jon Gorman
 I am looking for a good text on Microsoft Transit-SQL.  I have searched high 
 and low and
 all I find are books focused on Microsoft SQL Server.

Do you mean Transact-SQL (which I usually just see abbreviated T-SQL)
?  The online documentation at msdn isn't great, but it's not
horrible.   That's usually what I use.

I mean, usually it's just a matter of looking up how it implements SQL
and some of the local variants.

(Do you need recommendations for books on SQL?)

Jon Gorman


[CODE4LIB] How to get on irc

2012-02-07 Thread Jon Gorman
Hi all,

Quick link for those trying to get on irc for the first time

There's some info on http://code4lib.org/irc

Basic:
download an irc client (I like xchat)
connect to the freenode server
type /join #code4lib

Gotta go, presentation started

Jon Gorman
University of Illinois


Re: [CODE4LIB] Koha in the Running

2012-01-12 Thread Jon Gorman
 I'm curious to know of this list's current thoughts on Koha as an ILS. Where
 would you rank it among the various options, open source and vendor?


I'm confused, what do you mean by open source and vendor?

There's vendors/companies that develop for and support Koha.  Open
source and vendor/commercial activity are not  mutually exclusive.
Did you mean open source and proprietary?  There are lots of combinations
of ILSes and how to manage them out there.

Open Source ILS / local servers / no support contracts
Open source ILS / hosted / no support contract
Open source ILS / hosted / support contract

Proprietary ILS / local servers / no support
Proprietary ILS / hosted / no support contract
Proprietary ILS / hosted / support

Some of those combinations are pretty rare, but I could see all of
them existing.

And you could distinguish between support and development contracts,
with the nice advantage of open source you can always change vendors
or fund someone who's not your usual developer group depending on how
the community around the project has been established.  Harder to do
that with proprietary software, but I've still heard of it happening.

Are you interested in stuff like that?

Or are you more just interested in how people's experience using the Koha
software itself compares to other ILS options out there? Or the actual
overall experience?  Or which Koha vendor is the best?

Jon G.


Re: [CODE4LIB] My crazed idea about dealing with registration limitations

2011-12-22 Thread Jon Stroop
Maybe keynotes happen on the middle day; the one time where the whole 
group comes together, though it would require a 2x size space... This 
could also reduce the length to 4.5 days.


On 12/22/2011 10:05 AM, Peter Murray wrote:

That is a crazy idea.  I don't know about putting the speakers on the hook for 
two days -- particularly keynote speakers.  Still, it would be interesting for 
a site to flesh this out and propose something along these lines.


Peter

On Dec 21, 2011, at 6:44 PM, Fleming, Declan wrote:

Hi - so I know this is nuts.

If we start with a couple premises for the code4lib conference:

1.  Single thread is crucial.
2.  250 is about the top limit of a single threaded conference.
3.  400+ people want to attend.
4.  The conference takes 2.5 days.

What if we ran the 2.5 day conference twice in one week?

1.  Session 1 runs from Monday until noon on Weds.
2.  Session 2 runs from 1p on Weds until the end of Friday.
3.  Every one of the 23 accepted talks is given twice, once in each Session, in 
the same order.
4.  Each Session is attended by a different set of attendees.

We could serve 500 attendees this way.

If everyone came for the week, there could be parallel seminars, hack fests, 
BootCamps, THATcamps, CURATEcamps, c4lcamps, etc... for the half of the 500 
that wasn't in the main conference.  People could also just decide to come for 
the 2.5 day main conference, I guess.

I SAID it was crazy.  ;)

D





Re: [CODE4LIB] Obvious answer to registration limitations

2011-12-19 Thread Jon Gorman
 I had planned to come to code4lib and knew it filled up fast. I joined the
 mailing list so I could find out about the registration as soon as it
 happened. It came out in mid-morning and I happened to be in a meeting until
 12 or so and by the time I tried to register it was sold out. This is
 annoying. Why not find a venue that is big enough to meet the obvious demand?
 There are surely plenty of larger venues in a city such as Seattle.


The actual time when registration was going to open was published in a
variety of venues (on the wiki, on the mailing lists, and it seemed
someone was asking the question every fifteen minutes in the channel,
including me ;) ).  I purposely avoided scheduling meetings around
that time and rescheduled some that were.

On the other hand, it would be interesting to see a proposal for a
larger code4lib and I imagine Minnesota has lots of places that can
host a larger one.  The deadline isn't until Jan. 22nd.  See
http://code4lib.org/node/425

As always, if you want Code4Lib to do something or change, all you
have to do is plan and work for it.  That's why we're a loose
collective and not a professional organization.

I personally would not vote for making it much larger.  It seems every
order of magnitude increase takes it away from its techie origins and
makes it more like CiL or Internet Librarian.  On the other hand, regardless of
the size, I still suspect I'll find people willing to discuss the
technical stuff, I just might stop showing up for most of the actual
talks.

Jon Gorman.


On Mon, Dec 19, 2011 at 8:47 AM, Elfstrand, Stephen F
stephen.elfstr...@mnsu.edu wrote:



 Stephen Elfstrand
 PALS Executive Director
 stephen.elfstr...@mnsu.edu
 507.389.5059


Re: [CODE4LIB] Any ideas for free pdf to excel conversion?

2011-12-14 Thread Jon Gorman
 I'm looking for a way to pull 29 pages of pdf tables into excel so I can
 munge the data into an excel project and all my free trials so far have
 only converted a few pages at a time.


copy and paste?

If it needs to be somewhat automated

pdftotext -> some cut & paste / sed / regex -> open in excel?

You might need to fiddle with the pdftotext settings, but I've been
pretty successful with that before doing something else.
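
A rough sketch of the "somewhat automated" route, in Python rather than sed. It assumes the text came from something like `pdftotext -layout input.pdf output.txt`, and that table cells end up separated by runs of two or more spaces; both are assumptions about the PDF in question, and the function name is made up for illustration.

```python
import csv
import io
import re


def layout_text_to_csv(text):
    """Turn column-aligned text (e.g. from `pdftotext -layout`) into CSV.

    Assumption: cells are separated by two or more spaces, and single
    spaces only occur inside a cell.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between table rows
        writer.writerow(re.split(r"\s{2,}", line))
    return out.getvalue()


sample = "Branch name      Visits   Loans\nMain Library     1200     950\n"
print(layout_text_to_csv(sample))
```

The resulting CSV opens directly in Excel. Tables with empty cells or wrapped text will probably need hand cleanup afterwards.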

Jon G.


Re: [CODE4LIB] server side vs client side

2011-12-01 Thread Jon Gorman
On Thu, Dec 1, 2011 at 11:49 AM, Nate Hill nathanielh...@gmail.com wrote:
 As I was struggling with the syntax trying to figure out how to use
 javascript to load a .txt file, process it and then spit out some html on a
 web page, I suddenly found myself asking why I was trying to do it with
 javascript rather than PHP.

 Is there a right/wrong or better/worse approach for doing something like
 that? Why would I want to choose one approach rather than the other?


I tend to try to do most stuff server-side.  Javascript I try to keep
just to enhance the GUI and perhaps do some AJAXy stuff.  The exception
is an external API that's not crucial to the page: there you might want
to just call it from the javascript side.  Think about cover images in
a catalog, for example.

You could have the server-side script go out, grab the image, put it
in a local cache, then prepare the link within the actual html.  But
if something goes wrong, you might either take really long to return
that page or never return it.
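
If you do fetch on the server side, a timeout at least bounds how long the page can hang. A minimal sketch in Python (the thread is about PHP, so treat this as illustration only; `fetch_cover` and its `opener` parameter are hypothetical names, and the caching step is left out):

```python
import urllib.request


def fetch_cover(url, timeout=2.0, opener=urllib.request.urlopen):
    """Return the image bytes, or None if the remote service is slow or down.

    The timeout is what keeps a dead cover service from holding up the
    whole page.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:  # URLError and socket timeouts are both OSError subclasses
        return None
```

When this returns None the page can be rendered without the cover, which is roughly what the AJAX approach gives you for free.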

The approach that most folks do is that they have some javascript that
does an AJAX call.  So the page loads on the client and then when the
image comes back the cover image will be added.  If it never happens,
you've sent the page at least.

I know some who tend to always go to javascript because they're used
to not having control of the underlying system except to add html
to templates and sneak in javascript that way.

However, that's awkward, difficult to maintain, error-prone, and
likely horrible for accessibility.  If you control the underlying
PHP then yeah, do it on the PHP side ;).

My advice here is somewhat simplistic and general.

You do have my curiosity up now though.  What was your goal with trying
to load that text file?

Jon Gorman


Re: [CODE4LIB] Models of MARC in RDF

2011-11-28 Thread Jon Stroop
You may know about this one already, but the BL exposed the British 
National Bibliography as RDF last summer. The project has a page[1] with 
a good amount of info--the data model[2] might be a good place to start.

-Jon

1. http://www.bl.uk/bibliographic/datafree.html
2. http://www.bl.uk/bibliographic/pdfs/datamodelv1_01.pdf

On 11/26/2011 10:58 AM, Karen Coyle wrote:
A few of the code4lib talk proposals mention projects that have or 
will transform MARC records into RDF. If any of you have documentation 
and/or examples of this, I would be very interested to see them, even 
if they are under construction.


Thanks,
kc



--
Jon Stroop
Metadata Analyst
Firestone Library
Princeton University
Princeton, NJ 08544

Email: jstr...@princeton.edu
Phone: (609)258-0059
Fax: (609)258-0441

http://pudl.princeton.edu
http://findingaids.princeton.edu
http://www.cpanda.org


[CODE4LIB] Fwd: [semweb-25] Metropolitan Musem of Art hiring a Semantic Web Developer

2011-11-28 Thread Jon Stroop

May be of interest to someone on this list.

 Original Message 
Subject: 	[semweb-25] Metropolitan Musem of Art hiring a Semantic Web 
Developer

Date:   Thu, 24 Nov 2011 11:01:27 -0500
From:   don undeen donund...@yahoo.com
Reply-To:   semweb...@meetup.com
To: semweb...@meetup.com



Hello,
Hoping that this isn't a spam, but the Metropolitan Museum of Art's 
Digital Media Department is hiring for an Information Systems Developer.
This position will be involved in advanced data architecture solutions, 
to support a variety of web and in-gallery technology.


This work may entail:
- Setting up and administering triple stores, NoSQL dbs, and CMSs like 
Drupal

- designing interfaces, modules, and workflows for same
- Implementing collective intelligence algorithms,
- experimenting with new technologies, developing prototypes and 
proofs-of-concept
- and (to be honest) some drudgery, like data delivery, ETL, and report 
generation


See the application on linkedin, here:
http://www.linkedin.com/jobs?viewJob=jobId=2157751srchIndex=0trk=njsrch_hitsgoback=%2Efjs_information+systems+developer_*1_*1_I_us_*1_*1_1_R_true_*2_*2_*2_*2_*2_*2_*2_*2 



I know many of you do more than just SemWeb work, and many of you are on 
this list because you like to find new ways to tackle vexing problems. 
That's what we're looking for.


If you choose to submit a resume, please send it to the email address 
provided, but also cc me:

don.und...@metmuseum.org

I look forward to hearing from you.

yours,
Don Undeen
Manager, Media Lab
Digital Media Department
Metropolitan Museum of Art






--
Jon Stroop
Metadata Analyst
Firestone Library
Princeton University
Princeton, NJ 08544

Email: jstr...@princeton.edu
Phone: (609)258-0059
Fax: (609)258-0441

http://pudl.princeton.edu
http://findingaids.princeton.edu
http://www.cpanda.org


Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Jon Gorman
Probably the most important thing you can do is simply play around
with the technology.  Get some ideas of what you want to play around
with.  Then try to do it or see if someone else has already done it.
If someone else has done it, try to figure out how (open source for
the win).

When I was starting out I liked having classes, just because they
usually create goals and end points.  To be honest though it's been a
little while since I've actually taken a class.  I probably should
again, but life does get busy.

Books and very good websites are a close second.   Look for classes in
either your CS department or the local community college.

If you want to do web development, start looking for a language and
framework you like.  Set up a box, install a webserver on it.  Find a
web application you like and try to get it up and running.  (Try
something like running your own Koha server!)

I don't know if it will help, but here's some knowledge I'd look for
in any web developer that was looking for a library job:

* What version control systems do they know?

* Do they know configuration management tools like Puppet?

* Why they liked particular projects they worked on and what they may
not have liked about them.

* Basic network knowledge.
* Some basic knowledge of design principles and usability testing.
They don't need to be a master, but I hope they're at least aware of
some of the techniques.


I'm not really concerned about particular languages or frameworks.

Mainly I'm looking for signs that they're comfortable with web
development and know some of the pitfalls and issues that can happen
in the library environment.  Have they run into issues with combining
diacritics, confused librarians, or what to call services?  Also, I'm
watching for any warning signs, like they can't distinguish
between client-side javascript and server-side processing, or their only
test seems to be "does it display?".  That would make me instantly wary.


Jon Gorman


Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Jon Gorman
On Mon, Nov 28, 2011 at 11:50 AM, Kyle Banerjee baner...@uoregon.edu wrote:
 Having a playground where you can experiment aggressively is useful. I'm a
 fan of Amazon EC2 because you can create servers in minutes for pennies per
 hour and try things you'd never want to do with real hardware. It's nice
 when you can completely restore a destroyed server in a couple minutes.

Ah, in a similar vein, having a VM setup can help a lot with playing
around.  Look into VirtualBox and set up a VM.  It's a lot easier once
you get the hang of it than the old days when you almost needed a
physical machine to play around.

Jon Gorman


Re: [CODE4LIB] Plea for help from Horowhenua Library Trust to Koha Community

2011-11-22 Thread Jon Gorman
Hi Joann,

Have you considered sending this to some of the tech podcasts?  I
think both the Command-Line podcast (http://thecommandline.net/) and
Linux Outlaws (http://sixgun.org/linuxoutlaws/) would be great
audiences and receptive to this story.

I'm a regular listener of both, and if you want me to contact them so
they get it from a regular listener, I'd be more than happy
to forward your message with some personal notes.  (And the paypal
link too ;) ).

Jon Gorman

On Mon, Nov 21, 2011 at 6:51 PM, Joann Ransom jran...@library.org.nz wrote:
 Horowhenua Library Trust is the birth place of Koha and the longest serving
 member of the Koha community. Back in 1999 when we were working on Koha,
 the idea that 12 years later we would be having to write an email like this
 never crossed our minds. It is with tremendous sadness that we must write
 this plea for help to you, the other members of the Koha community.

 The situation we find ourselves in, is that after over a year of battling
 against it, PTFS/Liblime have managed to have their application for a
 Trademark on Koha in New Zealand accepted. We now have 3 months to object,
 but to do so involves lawyers and money. We are a small semi rural Library
 in New Zealand and have no cash spare in our operational budget to afford
 this, but we do feel it is something we must fight.

 For the library that invented Koha to now have to have a legal battle to
 prevent a US company trademarking the word in NZ seems bizarre, but it is at
 this point that we find ourselves.

 So, we ask you, the users and developers of Koha, from the birth place of
 Koha, please if you can help in any way, let us know.

 Background reading:

   - Code4Lib article http://journal.code4lib.org/articles/1638: How hard
   can it be : developing in Open Source [history of the development of Koha]
   by Joann Ransom and Chris Cormack.
   - Timeline http://koha-community.org/about/history/ of Koha
   development
   - Koha history visualization http://www.youtube.com/watch?v=Tl1a2VN_pec


 Help us
 If you would like to help us fund legal costs please use the paypal donate
 button below.




 Otherwise, any discussion, public support and ideas on how to proceed would
 be gratefully received.

 Regards


 Jo.

 --
 Joann Ransom RLIANZA
 Head of Libraries,
 Horowhenua Library Trust.



Re: [CODE4LIB] marc-8

2011-10-24 Thread Jon Gorman
 In Perl, how do I specify MARC-8 when reading (decoding) and writing
 (encoding) data?

 You can't.  MARC-8 is a character set that is unknown to the operating 
 system.  Your best bet is to convert MARC-8-encoded records into UTF-8.

 /me throws his hands up in the air and screams!

 Okay. How do I go about converting MARC-8 encoded records into UTF-8? I know 
 yaz-marcdump changes the encoding bit in MARC leaders. Does it also convert 
 MARC-8 characters to UTF-8? (I guess I could simply try it and see what 
 happens.)


I seem to remember there was an older version of yaz-marcdump that
seemed a bit buggy (would just change the header but not change
encoding despite command-line options, if there was a certain
combination chosen).  It's also possible I was just working with a
script that specified the encoding change but not the leader.

I'd say get the most recent version of yaz (don't use anything in an
OS repository) and then follow the docs:
http://www.indexdata.com/yaz/doc/yaz-marcdump.html.  The first example
is what you want:

 yaz-marcdump -f MARC-8 -t UTF-8 -o marc -l 9=97 marc21.raw >marc21.utf8.raw

The -f is the source encoding, the -t is the target encoding, and
-l 9=97 sets leader position 9 to 'a' (97 is the decimal character
code for 'a').
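
To make the 9=97 bit less cryptic, here is a small Python sketch of the same substitution. The leader string is a made-up example and the function name is hypothetical; it only illustrates that position 9 (counting from 0) gets the character whose decimal code is 97.

```python
def set_leader_char(leader, position, code):
    """Return a copy of a 24-character MARC leader with one position replaced.

    Mirrors what `-l 9=97` asks for: position 9 (counting from 0) becomes
    chr(97), which is 'a'.
    """
    if len(leader) != 24:
        raise ValueError("a MARC leader is always 24 characters")
    return leader[:position] + chr(code) + leader[position + 1:]


leader = "00714cam  2200205 a 4500"  # made-up leader, blank at position 9
print(set_leader_char(leader, 9, 97))
```

This only flips the flag, of course. The actual character conversion is what -f and -t are for.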

I've typically found this is one of the easier ways to do the
character set encoding, although the various Perl modules (if they're
recent enough) should be able to handle the conversion as well through
the MARC::Charset library.  Check the cpan pages.

Jon Gorman

ps.  For the love of all that is good, don't try to do anything in
Perl with the raw MARC record to do the encoding change yourself.
I've seen someone really screw records up because they altered
individual characters, which in turn led to different byte lengths.
This caused all sorts of insanity which meant really weird things
happened with MARC parsers that tried to follow the MARC directory
(which uses byte addresses to deal with variable fields).
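
A tiny Python illustration of why that goes wrong, assuming a UTF-8 record: swapping one character for another can leave the character count alone while changing the byte count, and the directory is counting bytes. The helper name is made up.

```python
def marc_field_length(text, encoding="utf-8"):
    """Length in bytes of a field's data, which is what the directory counts."""
    return len(text.encode(encoding))


before = "Bronte"
after = "Bront\u00eb"  # "Brontë", one character swapped by hand

# Same number of characters...
print(len(before), len(after))
# ...but a different number of bytes, so every later offset is now off by one.
print(marc_field_length(before), marc_field_length(after))
```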


Re: [CODE4LIB] ISBN Regular Expression

2011-10-24 Thread Jon Gorman
Also, I don't know OpenBook well enough to know your source data, but don't forget
a lot of publishers have printed ISBNs in different ways over the past
few years.  The regex would choke on any hyphens.  If users are
copying from printed material, they could type them in. For example,
one of the books near my desk has the ISBN printed like  0-521-61678-6

If this is user input and nothing is stripping characters like that
out, it could cause problems.

(I think I've also seen spaces used instead of hyphens, but I'm less
sure about this.)
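
For what it's worth, the stripping and the check digit are both easy once they're separated from the regex. A rough Python sketch (not PHP, and the function names are made up; the check-digit rules are the usual ISBN-10 mod 11 and ISBN-13 mod 10 ones):

```python
import re


def normalize_isbn(raw):
    """Drop hyphens and spaces, and upper-case a trailing x."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_valid_isbn(raw):
    """Check shape and check digit for ISBN-10 and ISBN-13."""
    isbn = normalize_isbn(raw)
    if re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        # ISBN-10: weights 10..1, X counts as 10, total divisible by 11.
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch))
            for i, ch in enumerate(isbn)
        )
        return total % 11 == 0
    if re.fullmatch(r"[0-9]{13}", isbn):
        # ISBN-13: weights alternate 1, 3; total divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
        return total % 10 == 0
    return False


print(is_valid_isbn("0-521-61678-6"))
```

The regexes only accept X in the last position of a 10-character ISBN, and anything that isn't exactly 10 or 13 characters after stripping is rejected.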

Jon Gorman


On Mon, Oct 24, 2011 at 9:44 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 John: That's not going to work, an ISBN can end in X as a check digit,
 which is not [0-9].  You are going to be rejecting valid ISBN's, you have a
 bug.

 On 10/24/2011 10:40 AM, John Miedema wrote:

 Here's a php function I use in OpenBook to test if a user has entered a 10
 or 13 digit ISBN.

 //test if 10 or 13 digits ISBN
 function openbook_utilities_validISBN($testisbn) {
     return (ereg("([0-9]{10})", $testisbn, $regs) || ereg("([0-9]{13})", $testisbn, $regs));
 }



 On Fri, Oct 21, 2011 at 1:44 PM,
 Kozlowski,Brendon bkozlow...@sals.edu wrote:

 Hi all.



 I'm somewhat surprised that I've never had to validate an ISBN manually up
 until now. I suppose that's a testament to all of the software out there.



 However, I now find that I need to validate both the 10-digit and 13-digit
 ISBNs. I realize there's also a check digit and a REGEX cannot check this
 value - one step at a time. Right now I just want to work on the REGEX.



 Does anyone know the exact specifications of both forms of an ISBN? The
 ISBN organization's website didn't seem to be overly clear to me.
 Alternatively, if anyone has a full working regular expression for this
 purpose I would definitely not mind if they'd be willing to share.



 The only thing I'm doing which is abnormal is that I am not requiring the
 hyphenation or spaces between numbers since some of this data will be coming
 from a system, and some will be coming from human input.




 Brendon Kozlowski
 Web Administrator
 Saratoga Springs Public Library
 49 Henry Street
 Saratoga Springs, NY, 12866
 [518] 584-7860 x217







[CODE4LIB] Variations/FRBR project announces release of RDF data and project source code

2011-08-09 Thread Dunn, Jon William Butcher
(Apologies for cross-posting...)

Indiana University announces the availability of several deliverables from the 
IMLS-funded Variations/FRBR project, all of which are accessible from the 
project website, http://vfrbr.info.

An export of FRBRized data with an RDF binding of the Variations/FRBR data 
model is available in two forms: a single compressed archive containing all 
triples, and smaller separate files with batches of triples by entity type. 
Also available are an ontology in OWL and a set of RDF design templates. All 
data exports contain data for 80,000 sound recordings and 105,000 scores, based 
on holdings of Indiana University's Cook Music Library.

Project source code is downloadable in four subprojects: persistence, 
FRBRization, export, and search. The vfrbr-persist project provides tools for 
creating the MySQL database and Java classes providing connection to the 
database. The vfrbr-frbrize-marc project provides the tools for FRBRizing MARC 
records and storing the results in the database. The vfrbr-export project 
enables XML exports from the database. The vfrbr-scherzo project contains the 
end-user search interface. All source code is released under a BSD open source 
license.

The Scherzo search interface at http://vfrbr.info/search has been enhanced to 
include scores as well as recordings. Keyword search is now available, along 
with a publication date facet, and usability has been improved through numerous 
small changes.

Comments and questions on the Variations/FRBR project may be sent to 
vf...@dlib.indiana.edu.

Regards,

Jon

---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953


Re: [CODE4LIB] TIFF Metadata to XML?

2011-07-18 Thread Jon Stroop

Edward,
JHOVE (1)  should be able to do this, and I believe you can pass the 
included shell script a directory and have it extract data for 
everything it finds and can parse inside.

-Jon

On 07/18/2011 09:18 AM, Edward M. Corrado wrote:

Hello All,

Before I re-invent the wheel or try many different programs, does
anyone have a suggestion on a good way to extract embedded Metadata
added by cameras and (more importantly) photo-editing programs such as
Photoshop from TIFF files and save it as XML? I have  60k photos
that have metadata including keywords, descriptions, creator, and
other fields embedded in them and I need to extract the metadata so I
can load them into our digital archive.

Right now, after looking at a few tools and doing a number of
Google searches, I haven't found anything that seems to do what I
want. As of now I am leaning towards extracting the metadata using
exiv2 and creating a script (shell, perl, whatever) to put the fields
I need into a pseudo-Dublin Core XML format. I say pseudo because I
have a few fields that are not Dublin Core. I am assuming there is a
better way. (Although part of me thinks it might be easier to do that
than exporting to XML and using XSLT to transform the file since I
might need to do a lot of cleanup of the data regardless.)
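
As a sketch of the pseudo-Dublin Core idea, assuming the fields have already been pulled out of the exiv2 output into (name, value) pairs. The `record` wrapper element, the list of DC names, and the function name are all hypothetical, and Python is used only for illustration:

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_FIELDS = {"creator", "description", "subject", "title", "date"}


def to_pseudo_dc(fields):
    """Build a pseudo-Dublin Core record from (name, value) pairs.

    Names in DC_FIELDS go into the Dublin Core namespace; anything else is
    kept as a plain, un-namespaced element (the "pseudo" part).
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, value in fields:
        tag = "{%s}%s" % (DC_NS, name) if name in DC_FIELDS else name
        ET.SubElement(record, tag).text = value
    return ET.tostring(record, encoding="unicode")


print(to_pseudo_dc([("creator", "Jane Doe"), ("rollNumber", "17")]))
```

Any cleanup of the values can happen on the pairs before they go in, which is one argument for the script route over XSLT.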

Anyway, before I go any further, does anyone have any
thoughts/ideas/suggestions?

Edward


Re: [CODE4LIB] Question about C4L 2011 in Bloomington

2011-07-14 Thread Dunn, Jon William Butcher
Hi Tania,

If you're talking about the graph paper pads, my understanding is that these 
were designed in-house by the Communications Office in our IT organization 
(UITS) and printed by a local printing firm. The Communications Office is happy 
to share the design files if you're interested in modifying them for your use - 
I'll e-mail you offline about that.

Jon

---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tania 
Fersenheim
Sent: Wednesday, July 13, 2011 1:36 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Question about C4L 2011 in Bloomington

Do any of the organizers of the C4L conference in Bloomington back in
February know what company made the pads of paper you handed out at
registration?

We like the format and might like to order some for ourselves.  I can send a
scan if someone wants to see what I am talking about.

-- 

Tania Fersenheim
Manager of Library Systems

Brandeis University
Library and Technology Services

415 South Street, (MS 017/P.O. Box 549110)
Waltham, MA 02454-9110
Phone: 781.736.4698
Fax: 781.736.4577
email: tan...@brandeis.edu


Re: [CODE4LIB] MARCXML to MODS: 590 Field

2011-05-19 Thread Jon Stroop

I'm going to guess that it's because 59x fields are defined for local use:

http://www.loc.gov/marc/bibliographic/bd59x.html

...but someone from LC should be able to confirm.
-Jon

--
Jon Stroop
Metadata Analyst
Firestone Library
Princeton University
Princeton, NJ 08544

Email: jstr...@princeton.edu
Phone: (609)258-0059
Fax: (609)258-0441

http://pudl.princeton.edu
http://diglib.princeton.edu
http://diglib.princeton.edu/ead
http://www.cpanda.org/cpanda



On 05/19/2011 11:45 AM, Richard, Joel M wrote:

Dear hive-mind,

Does anyone know why the Library of Congress-supplied MARCXML to MODS XSLT [1] 
does not handle the MARC 590 Local Notes field? It seems to handle everything 
else, not that I've done an exhaustive search... :)

Granted, I could copy/create my own XSLT and add this functionality in myself, 
but I'm curious as to whether or not there's some logic behind this decision to 
not include it. Logic that I would not naturally understand since I'm not 
formally trained as a librarian.

Thanks!
--Joel

[1] http://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-4.xsl


Joel Richard
IT Specialist, Web Services Department
Smithsonian Institution Libraries | http://www.sil.si.edu/
(202) 633-1706 | richar...@si.edu


Re: [CODE4LIB] is this valid marc ?

2011-05-19 Thread Jon Gorman
You've gotten some other good responses, but I thought I'd mention the
LoC and OCLC sites on MARC if you haven't seen them yet.

First, the LoC site at http://www.loc.gov/marc/.  This is what I use
as a guide and a reference.

Some folks prefer the OCLC docs http://www.oclc.org/bibformats/en/,
particularly if they're an OCLC member.

Of course, these apply to MARC-21 and not UniMarc.  Not sure what good
resources are out there for UniMARC.

Jon Gorman


Re: [CODE4LIB] linked data endpoints

2011-05-16 Thread Jon Gorman
Just to clarify, are you picturing some sort of feedback loop?  I'm
just trying to get a better picture of the process (sounds like an
interesting project).

In other words, do you have something like:

1) take in a full-text document (like, say, a novel?)
2) Run it through NER, pull out locations, places, things.
3) Have a user who's read the novel (or perhaps display those words in
context?) go through each of the locations and pick a lat & long using
Google Maps as an interface.  (I.e., say this Dublin is Dublin, OH, not
Dublin, Ireland.)
4) Do something similar with names, only using some sort of resource
like dbpedia to display possible individuals?
5) markup the original file in an XML doc w/ identifiers around those
occurrences?

Is that what you're picturing?

Jon G.

Who doesn't really know enough about linked data to contribute, but is
interested nonetheless.


Re: [CODE4LIB] If you were starting over, what would you learn and how would you do it?

2011-05-06 Thread Jon Gorman
Here's my take on whether or not the projects are going to be useful
in job hunting.  It's a bit of a gamble and honestly they may not.  On
the other hand, I certainly would take a portfolio as a very good sign
of a candidate in my own hunts.  But realistically, the job market's
just too wild at the moment.  It does seem to be smoothing out though.

Certainly I would run the portfolio by some systems people you really
respect and ask them to give an honest opinion.  Such projects can be
revealing not just in a positive way but a negative one too.  (And I
feel bad being negative, perhaps just blame it on a bad week.  I've
seen very few portfolios that detracted from my opinion of a
candidate.)

On the other hand though, personal experience, particularly well
supported through independent study and also discussion with others
gives a huge boost to your skills.  I don't know if a candidate in
this job market can afford NOT to spend at least some personal time in
developing their skills.  Perhaps in an ideal world school and
on-the-job training would cover all the ground.  If you can though, double-dip
and just take a course assignment to the next level or something like
that.

In other words, such personal work probably won't greatly increase
your chances of beating out the competition, but without it likely
you're going to have a hard time making a good impression.

Of course, hopefully you enjoy this tech stuff so spending personal
time isn't too burdensome ;).  But I understand, these days it seems
like I never have enough time to work on my personal geeky projects.

Sorry for the convoluted answer, hopefully  it'll help.  We can always
use more geeky librarians ;).

Jon Gorman


Re: [CODE4LIB] yaz-marcdump

2011-05-02 Thread Jon Gorman
From a good article on this at
http://www.indexdata.com/blog/2009/10/z3950-dummies-part-4.

$ yaz-marcdump -f marc-8 -t utf-8 -o marc -l 9=97 part01.dat > part.mrc

(97 = 'a')


If I remember correctly some of this functionality has also changed
over various versions so not sure if this is still needed, but better
safe than sorry.  Might also want to check the man page with your
particular version of yaz-marcdump.

Jon G.

On Mon, May 2, 2011 at 9:39 AM, Eric Lease Morgan emor...@nd.edu wrote:
 Does the -t flag in yaz-marcdump tell the program to convert characters in 
 MARC records to specific character sets, or does merely change the value in a 
 MARC leader to denote the character set of the record as a whole? In other 
 words, will yaz-marcdump do its best to convert MARC-8 characters found in 
 MARC records into UTF-8 characters?

 --
 Eric Morgan
 University of Notre Dame



Re: [CODE4LIB] utf8 \xC2 does not map to Unicode

2011-04-11 Thread Jon Gorman
 I'm making headway on my MARC records, but only through the use of brute 
 force.

 I used wget to retrieve the MARC records (as well as associated PDF and text 
 files) from the
 Internet Archive.

I know IA has some bad marc records (and also records w/ bad encoding)
from my experience with them in the past.  I'm also not sure what the
web server / wget will do to the files as well.

 I did play a bit with yaz-marcdump to seemingly convert things from marc-8 to
 utf-8, but I'm not so sure it does what is expected. Does it actually convert
 characters, or does it simply change a value in the leader of each record? If
 the former, then how do I know it is not double-encoding things? If the
 latter, then my resulting data set is still broken.

There was a bug I seem to remember with yaz-marcdump where it was just
toggling the leader.  (Or a design flaw where you had to specify a
character conversion as well.).  But that was fixed a while ago I
thought. It's probably one of the better tools out there for this type
of stuff.

 If MARC records are not well-formed and do not validate according to the 
 standard, then just like
 XML processors, they should be used. Garbage in. Garbage out.

I'm guessing you meant they shouldn't be used? ;).  XML processors
aren't really known for flexibility in this regard.

Unfortunately there are a lot of issues here, not the least of which is
that some of the worst problems I've seen are introduced by well-meaning
folks who do things like dump a file out into MARCXML or a marc-breaker
format, twiddle with bits, and start using tools to dump unicode text into
what is really a marc-8 file.  Then at some point in the pipeline
enough character encoding conversions happen that the
file ends up being messed up.

And then there's always the legacy data that got bungled up in an
encoding transfer.  I know we've got some bad CJK characters due to
this.  At some point in converting our marc-8 records one or two
characters got mapped to something that's not in the unicode spec at
all.  At some point we'll clean up those records, you know, when we've
got some spare time :P.

The problem here has been the tools: the records pass whatever internal
validations the tools enforce.  Probably more stages need to check for
validity, but there's a lot of records that would fail if they did.
(I don't even want to think about how many people disable validation,
or use the same software stack that generated the marc in the first
place, or changes within the marc spec itself over time that make
validation even more difficult.)

Jon Gorman


Re: [CODE4LIB] utf8 \xC2 does not map to Unicode

2011-04-06 Thread Jon Gorman
I'm not quite convinced that it's marc-8 just because there's \xC2 ;).
 If you look at a hex dump I'm seeing a lot of what might be combining
characters.  The leader appears to have 'a' in the field to indicate
unicode.  In the raw hex I'm seeing a lot of two-character sequences
like: 756c 69c3 83c2 a872 (culir).  If I knew my utf-8 better, I
could guess what combining diacritics these are.  Doing a look up on
http://www.fileformat.info seems to indicate that this might be utf-8,
a 'DIAERESIS'.
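
The hex quoted above can be decoded directly to see what it holds. This is a small Python sketch (not from the thread) that decodes those eight bytes as UTF-8 and then tests one hypothesis: that a UTF-8 'è' (C3 A8) was at some point misread as Latin-1 and encoded a second time. That reading is a guess the bytes happen to support, not something confirmed about the file's history.

```python
# Bytes copied from the hex dump quoted above: 756c 69c3 83c2 a872
raw = bytes.fromhex("756c69c383c2a872")

once = raw.decode("utf-8")
print(once)                                      # uliÃ¨r
print([f"U+{ord(c):04X}" for c in once[3:5]])    # ['U+00C3', 'U+00A8']

# Hypothesis: undo one Latin-1 -> UTF-8 round of extra encoding.
twice = once.encode("latin-1").decode("utf-8")
print(twice)                                     # ulièr
```

U+00A8 is the spacing DIAERESIS character, which matches the lookup result; it is not a combining mark.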

When debugging any encoding issue it's always good to know

a) how the records were obtained,
b) how they have been manipulated before you touched them (basically,
how many times they may have been converted by some bungling process),
c) what encoding they claim to be now,
and
d) what encoding they actually are, if any.
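
Questions (c) and (d) can be checked mechanically for a single record. The Python sketch below uses hypothetical helper names and assumes the input is one raw MARC 21 record. It reads the claim from leader position 09 and separately tests whether the bytes are well-formed UTF-8.

```python
def claimed_encoding(record: bytes) -> str:
    """Read leader position 09: 'a' means Unicode, blank means MARC-8."""
    flag = record[9:10]
    if flag == b"a":
        return "utf-8"
    if flag == b" ":
        return "marc-8"
    return "unknown"


def decodes_as_utf8(record: bytes) -> bool:
    """Check only that the bytes are well-formed UTF-8."""
    try:
        record.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Well-formed is not the same as correct: double-encoded text and plain ASCII both pass the second check, so a mismatch is informative but a match proves little.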


It's been a while since I used MARC::Batch.  Is there any reason
you're using that instead of just using MARC::Record?  I'd try just
creating a MARC::Record object.

I've seen people do really bizarre things to break MARC files, such as
editing the raw binary (thus invalidating the leader and the directory,
as the byte counts were no longer right).
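
A hand-edited file of that kind can often be spotted by comparing each record's declared length with its actual length. The Python sketch below uses a made-up function name and assumes the whole file fits in memory and that no record contains a stray 0x1D byte. It relies on leader positions 00-04 holding the record length as five ASCII digits, terminator included.

```python
RECORD_TERMINATOR = b"\x1d"  # every MARC 21 record ends with this byte


def length_mismatches(data: bytes):
    """Yield (index, declared, actual) for records whose leader length is off."""
    chunks = data.split(RECORD_TERMINATOR)
    # The final chunk is whatever follows the last terminator; skip it.
    for index, chunk in enumerate(chunks[:-1]):
        record = chunk + RECORD_TERMINATOR
        digits = record[:5]
        declared = int(digits) if digits.isdigit() else -1
        if declared != len(record):
            yield index, declared, len(record)
```

This checks only the overall length; a fuller check would also walk the directory entries and compare their offsets and field lengths.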

I hate to say it, but we still come across files that are no longer in
any encoding due to too many bad conversions.  It's possible these are
as well.

The enca tool (haven't used it much) guesses this is utf-8 mixed w/
non-text data.

Jon


Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?

2011-03-23 Thread Jon Gorman
On Wed, Mar 23, 2011 at 10:13 AM, Cindy Harper <char...@colgate.edu> wrote:
  Sorry to bother you all with it.  Everyone's happy family is
 different, to hash a quote, but I hope I'm still welcome in Code4Lib, even
 if I'm not hired to be a library coder. Just a library (Windows) sys admin.
 Or maybe we need a spin-off code4lib for the amateurs among us.

I think Bill meant why are you coming down here with us trolls when
you're at such a nice place?  You're quite welcome, although you
certainly have my curiosity up about why you want to run php_yaz in
the first place.  You didn't have much in the way of details in your
initial email.  It might change some people's advice if you're not
intending the system to be a long-term production system.  (And I'm
still curious what systems are even using php_yaz.)

Jon Gorman


Re: [CODE4LIB] code4lib 2011 announcements

2011-02-07 Thread Dunn, Jon William Butcher
Peter,

Thanks for letting us know. There is indeed a problem with the streaming
links on the website. The URLs displayed on the page are correct, but they
are linking to the wrong address. This should be fixed by the time the
streamed sessions start tomorrow. Note that today's preconference sessions
are not being streamed.


Jon

---
Jon Dunn
Director, Library Technologies and Digital Libraries
IU Bloomington Libraries / University Information Technology Services
Indiana University
j...@indiana.edu
(812) 855-0953




On 2/7/11 10:17 AM, Peter MacDonald <pmacd...@hamilton.edu> wrote:

I am being challenged for a username and passphrase when I try to view the
live stream at

http://www.indiana.edu/~uits/code4lib/program/sessions.php

How do we get them?

Thanks,
Peter

Peter MacDonald
Library Information Systems Specialist
Hamilton College Library
315 859-4493


On Mon, Feb 7, 2011 at 9:34 AM, McDonald, Robert H.
<rhmcd...@indiana.edu> wrote:

 Hi Everyone,

 Just a few announcements about Code4Lib 2011. We are so happy that so many
 could join us this week in Bloomington. We kicked off our pre-conferences
 today and are looking forward to an exciting week.

 I have a couple of announcements for this list about code4lib 2011.

 For all those not in attendance, we will be streaming the conference live.
 This will be done in 4 sessions (Day 1 morning, Day 1 afternoon, Day 2 all
 day, Day 3 till noon). To get access to the streams, please go to:
 http://www.indiana.edu/~uits/code4lib/program/sessions.php

 Also, one of our sponsors for this year's code4lib 2011 conference,
 Elsevier, is hosting a code challenge that will take place from now until
 March 1, 2011. It is open to all, but I am trying to help them find those
 in the code4lib community who are interested in working with their brand
 new API for their SciVerse Suite. There are some cool prizes too. For more
 on this event, please see below.

 Thanks again for all of your suggestions that will make code4lib 2011 a
 conference to remember.


 Best,

 Robert

 ++

 Elsevier Code Challenge: Building OpenSocial Apps for Libraries
 Elsevier is sponsoring a code challenge to build OpenSocial apps for
 Libraries. Using JavaScript and HTML5, Librarians can write customized
 apps for the SciVerse suite: ScienceDirect, Scopus and Hub. The Elsevier
 Challenge is now open for all librarians and coders and closes on March 1,
 2011.

 The prizes for this code challenge are as follows:

 1.   $1500 (Amazon gift card)
 2.   $1000 (Amazon gift card)
 3.   $500  (Amazon gift card)

 The next web is going to be based on apps that allow librarians and end
 users to customize their search needs. Elsevier's new SciVerse platform
 has extended Apache Shindig, the OpenSocial container, for apps to appear
 alongside search results, full-text articles and meta-data, which can be
 accessed through the SciVerse APIs. Using open APIs and open data, apps
 can mash up SciVerse content with third party data and services.

 The Elsevier Challenge at Code4Lib challenges librarians to build better
 tools and services on the SciVerse suite (ScienceDirect, Scopus and Hub)
 and customize the search tools for their library and users.

 To register for the challenge, email challenge-regis...@elsevier.com
 For instructions: go to http://developer.sciverse.com/code4lib
 To get started: http://developer.sciverse.com/sdk
 For more information, email: challenge-i...@elsevier.com


 **
 Robert H. McDonald
 Associate Dean for Library Technologies and Digital Libraries
 Associate Director, Data to Insight Center-Pervasive Technology Institute
 Executive Director, Kuali OLE
 Indiana University
 Herman B Wells Library 234
 1320 East 10th Street
 Bloomington, IN 47405
 Phone: 812-856-4834
 Email: rob...@indiana.edu
 Skype/GTalk: rhmcdonald
 AIM/MSN: rhmcdonald1



Re: [CODE4LIB] Registration website issues?

2010-12-13 Thread Jon Gorman
Yup, it's slow going.  It seems so far if you just keep hitting reload
after the errors it eventually gets through.  It's keeping the
information in session somehow.

Of course, I'm on step 8 after 40 minutes, so I'm hoping I don't
have to start over again.


Jon Gorman

On Mon, Dec 13, 2010 at 11:39 AM, Doran, Michael D <do...@uta.edu> wrote:
 Is anyone else having trouble connecting to the Code4Lib registration website 
 (https://www.confmanager.com/main.cfm?cid=2375)?  It took me about 15 minutes 
 to get connected initially, now it's hanging after page 2 (of 9?).

 -- Michael

 # Michael Doran, Systems Librarian
 # University of Texas at Arlington
 # 817-272-5326 office
 # 817-688-1926 mobile
 # do...@uta.edu
 # http://rocky.uta.edu/doran/


 -Original Message-
 From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
 Karen Coyle
 Sent: Monday, December 13, 2010 9:51 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Announcing OLAC's prototype FRBR-inspired moving
 image discovery interface

 Quoting Beacom, Matthew <matthew.bea...@yale.edu>:

 Sometimes I feel like we should all have the FRBR diagram tattoo'd on
 our arms so we can consult it any time anywhere. :-)


 
  With as complex a thing as a film--so many authors, images, music,
  dialog, acting, sets, costume, etc., etc., etc., applying the FRBR
  model is tough, and your implementation is quite sensible. However,
  I had a small question about one thing you said about FRBR not
  allowing language at the work level. That doesn't seem right to me.
  How could the language of a thing that is primarily or even
  partially made of language--like a novel or a motion picture
  with spoken dialogue--not be considered at the work level, rather
  than at some other level?

 Matthew, I can't answer how it is possible but I can tell you that it
 is a fact: language is an attribute of Expression, not of Work. That's
 kind of the key meaning of frbr:Expression -- it is the Expression of
 the Work, and the Work doesn't exist until Expressed. So Work is a
 very abstract concept in FRBR. (Which is why more than one attempted
 implementation of FRBR that I have seen combines Work and Expression
 attributes in some way.)

 Not only that, but Kelley's model uses something that I consider to be
 missing from FRBR: the concept of an original Expression. For FRBR
 (and thus for RDA) all expressions are in a sense equal; there is no
 privileged first or original expression. Yet there is evidence that
 this is a useful concept in the minds of users. Some recent user
 studies [1] around FRBR showed that this is a concept that users come
 up with spontaneously. Also, I can't think of any field of study where
 knowing what the original expression of a work was wouldn't be
 important.

  Because of the way we treat translations--not just in FRBR--as what
  FRBR calls expressions not as new works, a translation from the
  original language to another would be considered an FRBR expression.
  Could you explain this a bit more?

 The FRBR relationship "translation of" is an Expression-to-Expression
 relationship. (See my personal cheat sheet of RDA/FRBR relationships
 [2]).
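
 The two points above (language belongs to Expression, and "translation of" links one Expression to another) can be pictured as a tiny data model. This is only an illustrative Python sketch: the class and field names and the sample title are invented here, and it is not OLAC's or anyone else's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    title: str  # deliberately no language attribute at the Work level


@dataclass
class Expression:
    work: Work
    language: str
    # Expression-to-Expression relationship; None when nothing is recorded.
    translation_of: Optional["Expression"] = None


work = Work(title="Example Work")  # placeholder title
french = Expression(work=work, language="fre")
english = Expression(work=work, language="eng", translation_of=french)
```

 Nothing in this model marks `french` as the original; it is merely the Expression that is not recorded as a translation of another, which is the gap described above.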

 kc
 [1] http://www.asis.org/asist2010/abstracts/75.html
 [2] http://kcoyle.net/rda/group1relsby.html

 
  Thank you.
 
  Matthew
 
 
 
  -Original Message-
  ...
 
  This also allowed us to get around some of the areas of more
  orthodox FRBR modeling that we found unhelpful. For example, FRBR
  doesn't allow language at the Work level, but we think it is
  important to record the original language of a moving image at the
  top level.
 



 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 ph: 1-510-540-7596
 m: 1-510-435-8234
 skype: kcoylenet



Re: [CODE4LIB] Which O'Reilly books should we give away at Code4Lib 2011?

2010-12-08 Thread Jon Gorman
So somebody actually attempts to answer your question:


Some O'Reilly books that would probably be a good fit for the conference:

Search/Library related -
Ambient Findability
Information Architecture for the World Wide Web
Search Patterns

Conference/culture related -
Confessions of a Public Speaker
Hackers & Painters
Beautiful Code

Just plain fun -
A stack o' Make magazines?


I picked these somewhat from memory and somewhat from using my wand of
serendipity.  If more suggestions are needed, I can probably put more
actual thought into it ;).

Jon Gorman

On Tue, Dec 7, 2010 at 8:31 PM, Kevin S. Clarke <kscla...@gmail.com> wrote:
 Hi all,

 If you have particular O'Reilly titles that you'd like for us to ask
 O'Reilly for, send them to me and I'll put them in our request.

 Thanks,
 Kevin



Re: [CODE4LIB] unwanted (bogus) characters in marc

2010-10-07 Thread Jon Gorman
There's something about this that's tugging at my memory that hints it
might not be quite what the error message said as far as an invalid
unicode character.

I guess my first couple of questions:

1) What identifiers/records are you pulling?  I didn't see any actual
examples in your email.  Can you construct the url that the perl
script is doing and give it to us?

I'd guess it's very likely the original marc record is goofed up due
to some transforms.  I've seen it from people doing really weird
things to records as part of the submit process to IA.

2) You're sure that is a unicode marc record and not marc-8, right?

3) What version is your MARC::Record module?  Might want to upgrade if
it's old, there's been some bug fixes.

Jon Gorman


On Thu, Oct 7, 2010 at 5:51 AM, Eric Lease Morgan <emor...@nd.edu> wrote:
 How do I trap for unwanted (bogus) characters in MARC records?

 I have a set of Internet Archive identifiers, and have written the following
 Perl loop to get the MARC records associated with each one:

  # process each identifier
  my $ua = LWP::UserAgent->new( agent => AGENT );
  while ( <DATA> ) {

    # get the identifier
    chop;
    my $identifier = $_;
    print $identifier, "\n";

    # get its corresponding MARC record
    my $response = $ua->get( ROOT . "$identifier/$identifier" . "_meta.mrc" );
    if ( ! $response->is_success ) {

      warn $response->status_line;
      next;

    }

    # save it
    open MARC, "> $identifier.mrc" or die "Can't open $identifier.mrc: $!\n";
    binmode MARC, ":utf8";
    print MARC $response->content;
    close MARC;

  }

 I then use the venerable marcdump to see the fruits of my labors: marcdump 
 *.mrc. Unfortunately, marcdump returns the following error against (at least) 
 one of my files:

  bienfaitsducatho00pina.mrc
  utf8 \xC3 does not map to Unicode at /System/Library/
  Perl/5.10.0/darwin-thread-multi-2level/Encode.pm line 162.

 What is going on here? Am I saving my files incorrectly? Is the original MARC
 data inherently incorrect? Is there some way I can fix the MARC record in
 question?
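
 On the "am I saving my files incorrectly?" question, one possibility worth ruling out (the thread does not establish it) is that the output layer re-encodes bytes that were already UTF-8. In Perl, printing a byte string through a `:utf8` layer writes every byte above 0x7F as a two-byte sequence. The Python sketch below uses a made-up payload and file name. It shows a binary-mode save leaving the bytes untouched, and what the same bytes look like after one extra round of encoding.

```python
import os
import tempfile

# Stand-in for a downloaded record body; the UTF-8 bytes for 'è' are C3 A8.
payload = "particulière".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.mrc")  # hypothetical file name

    # Binary mode writes the bytes exactly as received.
    with open(path, "wb") as handle:
        handle.write(payload)
    with open(path, "rb") as handle:
        saved = handle.read()

# The same bytes treated as Latin-1 text and then encoded to UTF-8 again:
# C3 A8 becomes C3 83 C2 A8.
reencoded = payload.decode("latin-1").encode("utf-8")

print(saved == payload)        # True
print(reencoded == payload)    # False
```

 Comparing a saved file byte for byte against a fresh download is a quick way to tell whether the damage came from the save step or was already in the source record.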

 --
 Eric Lease Morgan


