Re: [CODE4LIB] date fields

2016-07-12 Thread Jonathan Rochkind
> but since there is really no standard field for such a value, anything I
choose is all but arbitrary. I’ll use some 9xx field, just to make things
easy. I can always (and easily) change it later.

More like there are SEVERAL standard fields for such a value.

You can certainly put it in one of the existing standard fields, you just
have to actually follow the (often byzantine legacy) rules for such entry.
For instance, the date you want may very well already be in the fixed field
008, and you could certainly add it if it weren't. But the rules and
practices for 008 are confusing -- in part, because the actual real world
universe of "what is the date of this thing" is itself complex in the real
world of actually cataloged things, including serials and series,
manuscripts, reprints and facsimiles, old things where we aren't sure of
the exact dates, etc.  And in part just because the MARC standard is kind
of old and creaky, especially with regard to fixed fields like 008 being
designed to cram maximum amount of information in minimum bytes, beyond any
reasonable economy actually needed today.
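The 008 byte layout described above can be sketched in a few lines of Python. This is a hedged illustration only: the helper name and the sample record are invented, and real 008 handling also has to interpret the "type of date" (DtSt) code, which governs what Date 1 and Date 2 actually mean.

```python
# Sketch: pulling the dates out of a MARC 008 fixed field.
# Per the MARC 21 bibliographic format, byte 06 is the type-of-date
# code (DtSt), bytes 07-10 are Date 1, and bytes 11-14 are Date 2.

def dates_from_008(field_008: str):
    """Return (type_of_date, date1, date2) from an 008 string, or None
    if the field is too short to contain the date bytes."""
    if len(field_008) < 15:
        return None
    return field_008[6], field_008[7:11], field_008[11:15]

# An invented 008 for a single monograph published in 1900:
example = "850101s1900    nyu           000 0 eng d"
print(dates_from_008(example))  # ('s', '1900', '    ')
```

Date 2 comes back as blanks here because a single known date ('s') has no second date; codes like 'q' (questionable) or 'd' (serial date range) fill both.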

I just learned about the 264 from Karen Miller's post (thanks Karen), I
dunno about that one, but it looks like it might be applicable too.

Standards, why have just one when you can have a dozen?

On Tue, Jul 12, 2016 at 10:12 AM, Eric Lease Morgan  wrote:

> On Jul 11, 2016, at 4:32 PM, Kyle Banerjee 
> wrote:
>
> >>
> https://github.com/traject/traject/blob/e98fe35f504a2a519412cd28fdd97dc514b603c6/lib/traject/macros/marc21_semantics.rb#L299-L379
> >
> > Is the idea that this new field would be stored as MARC in the system
> (the
> > ILS?).
> >
> > If so, the 9xx solution already suggested is probably the way to go if
> the
> > 008 route suggested earlier won't work for you. Otherwise, you run a risk
> > that some form of record maintenance will blow out all your changes.
> >
> > The actual use case you have in mind makes a big difference in what paths
> > make sense, so more detail might be helpful.
>
>
> Thank you, one & all, for the input & feedback. After thinking about it
> for a while, I believe I will save my normalized dates in a local (9xx)
> field of some sort.
>
> My use case? As a part of the "Catholic Portal", I aggregate many
> different types of metadata and essentially create a union catalog of rare
> and infrequently held materials of a Catholic nature. [1] In an effort to
> measure “rarity” I've counted and tabulated the frequency of a given title
> in WorldCat. I now want to measure the age of the materials in the
> collection. To do that I need to normalize dates and evaluate them. Ideally
> I would save the normalized dates back in MARC and give the MARC back to
> Portal member libraries, but since there is really no standard field for
> such a value, anything I choose is all but arbitrary. I’ll use some 9xx
> field, just to make things easy. I can always (and easily) change it later.
>
> [1] "Catholic Portal” - http://www.catholicresearch.net
>
> —
> Eric Lease Morgan
>


Re: [CODE4LIB] date fields

2016-07-11 Thread Jonathan Rochkind
There's some super useful data in the MARC fixed fields too -- more useful
than the semi-transcribed values in 260c, although it's also a pain to
access/transform to something reasonably machine actionable.

Here's the code from traject that tries to get a reasonable date out of
marc fixed fields, falling back to 260c if it needs to.
https://github.com/traject/traject/blob/e98fe35f504a2a519412cd28fdd97dc514b603c6/lib/traject/macros/marc21_semantics.rb#L299-L379

There are already quite a few places in MARC for dates. It's just they're
all weird. You're making up yet another new kind of date with your own local
meaning and specs. I doubt there's an existing MARC field you can put it in
where it won't just add to the confusion. (obligatory reference to
https://xkcd.com/927/).

I'd just put it in a 9xx or xx9 field of your choosing, they are reserved
for local use.
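The normalization Eric describes (reducing transcribed values like "[1900]", "c1900", or "190?" to a plain integer before stashing it in a local field) can be sketched like this. A hedged illustration only: the function name and the exact rules are invented, and real 260$c data is far messier than these cases.

```python
import re

def normalize_date(raw: str):
    """Reduce a transcribed 260$c value like '[1900]', 'c1900', or '190?'
    to a four-digit integer, or None if no year can be recovered.
    (An em dash, as in '19\u2014', would first need mapping to '-'.)"""
    # Find four date characters: digits, with '?', '-', or 'u' standing
    # in for unknown digits.
    m = re.search(r"(\d[\d?\-u]{3})", raw)
    if not m:
        return None
    # Treat unknown trailing digits as zero, so '190?' becomes 1900.
    return int(re.sub(r"[?\-u]", "0", m.group(1)))

for raw in ["[1900]", "c1900", "190?", "19--", "n.d."]:
    print(raw, "->", normalize_date(raw))
```

The result could then be written to whichever local 9xx field is chosen, alongside a note subfield as Joy suggests below.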

On Mon, Jul 11, 2016 at 3:19 PM, Joy Nelson 
wrote:

> Hi Eric-
> Are you planning on storing the 'normalized' dates forever in the MARC?
> i.e. leave the c1900 in the 260$c and have 1900 in another place?
>
> I think what you do depends on your ILS and tools.  My first reaction would
> be to stash the date in an unused subfield in the 260.  If your system
> allows you to add 'non standard' subfields, you could use 260$z to stash
> it.
>
> But, then I start to think that might rankle some catalogers to have 'non
> standard' date data in the 260 (or 264).  I would probably then look at
> using one of the local use tags.  901-907, 910, or 945-949.  You could put
> the date in $a and even a brief description in a second subfield.
> 901$a1900$bnormalized date for project XYZ -initials/date
>
> -Joy
>
> On Mon, Jul 11, 2016 at 12:51 PM, Eric Lease Morgan 
> wrote:
>
> > I’m looking for date fields.
> >
> > Or more specifically, I have been given a pile o’ MARC records, and I
> will
> > be extracting for analysis the values of dates from MARC 260$c. From the
> > resulting set of values — which will include all sorts of string values
> > ([1900], c1900, 190?, 19—, 1900, etc.) — I plan to normalize things to
> > integers like 1900. I then want to save/store these normalized values
> back
> > to my local set of MARC records. I will then re-read the data to create
> > things like timelines, to answer questions like “How old is old?”, or to
> > “simply” look for trends in the data.
> >
> > What field would y’all suggest I use to store my normalized date content?
> >
> > —
> > Eric Morgan
> >
>
>
>
> --
> Joy Nelson
> Director of Migrations
>
> ByWater Solutions 
> Support and Consulting for Open Source Software
> Office: Fort Worth, TX
> Phone/Fax (888)900-8944
> What is Koha? 
>


Re: [CODE4LIB] C4L17 - Potential Venue Shift to LA and Call for Proposals

2016-06-15 Thread Jonathan Rochkind
I wouldn't have even done a vote at all -- I think when we vote on
conference hosts, we are choosing people to steward the conference and make
sure it happens, as good as it can be using their judgement for what that
looks like and how to make it happen.  The fact that the NC folks are
attempting to make sure the torch can get passed instead of just throwing
up their hands and saying "it's back at you, community, we're no longer
involved" shows that stewardship was well-placed. I think it would have
been totally appropriate for them to simply pass the torch.

But if votes are going to happen, they need to happen as quickly as
possible if you want the conf to actually come off, at least in the
spring.  How is "7 days after a credible proposal that includes financial
backing" not an "arbitrary deadline"?  Are you willing to wait forever for
such a "credible proposal" to show up? Who decides if it's "credible"?
Once a proposal shows up, anyone else that was trying to work on a proposal
now has exactly 7 days to get one in, but they had no idea what their
deadline was until the first proposal showed up, which hopefully they
noticed on the email list so they know what their deadline is now?  Or only
the first proposal to get in gets a yes/no vote, and anyone else doesn't
get included in the vote, first to get the proposal to email wins?

There are a bunch of different ways it could be done, but calendar dates
are important for an orderly process, and speedy calendar dates are
important for the conf to actually happen, and I think nitpicking and
arguing over the process the NC folks have chosen is pointless, they were
entrusted to steward the thing, the process they've come up with is
reasonable, just go with it.

On Wed, Jun 15, 2016 at 3:20 PM, Cary Gordon  wrote:

> I think that we should avoid arbitrary limits such as a July 1st deadline.
> We should open up any credible proposal that includes financial backing to
> discussion and a vote closing seven days after the proposal is posted to
> this list.
>
> Cary
>
> > On Jun 15, 2016, at 12:05 PM, Brian Rogers  wrote:
> >
> > Greetings once more from the Chattanooga Local Planning Committee -
> >
> > We come with another update regarding the annual Code4Lib conference.
> After the announcement of our survey, two other groups immediately reached
> out about the possibility of hosting the conference. Of those two, the one
> that is the most confident about being able to secure a fiscal host and
> still pull off everything within the existing timeframe, is the LA-based
> C4L-SoCal. We spoke with three of their members earlier in the week - Gary
> Thompson, Christina Salazar, and Joshua Gomez. After discussion, we
> collectively envision a collaboration between the two groups, given the
> effort, energy and commitment the Chattanooga group has already invested.
> The LA group would handle more of the venue and local arrangements, with
> the Chattanooga group helping spearhead other planning elements.
> >
> > Thus, the idea is to host the annual conference in the greater LA area.
> >
> > However, even though Chattanooga's proposal was the only one put forth
> for next year, since this suggestion does reflect a significant change, and
> because LA is still working on securing a fiscal host, we are proposing to
> the community the following:
> >
> > - Since a handful of individuals came forth w/alternative cities
> subsequent to my last update, any group who now wishes to put forth a
> proposal, do so by July 1st.
> > - Given the specter of timecrunch, we ask anyone, including LA, who
> would put forth another city, to only do so with written confirmation of a
> fiscal host by that same deadline.
> > - If more than one city has put forth a proposal and secured a fiscal
> host within that window of time, we will put it to a community vote, with
> polls being left up through July 15th.
> >
> > As always, comments and suggestions welcome. Thanks for all the existing
> feedback, dialogue, various offers people have come forth with, and the
> patience while we try to wrangle up a physical home for 2017.
> >
> > - Brian Rogers
>


[CODE4LIB] Blacklight Community Survey Results

2015-10-20 Thread Jonathan Rochkind
In late August/early September you may recall I released a Blacklight 
Community Survey. I got 18 responses.


The survey covered the nature of organizations implementing BL, rough 
categories of usage of the BL apps, versions of dependencies in use, and 
free form likes and dislikes about BL.


Just posted on my blog, I have links to the raw data, as well as some of 
my own summary, interpretation, and analysis.


https://bibwild.wordpress.com/2015/10/20/blacklight-community-survey-results/


[CODE4LIB] Blacklight (inc. Hydra) Community Survey

2015-08-20 Thread Jonathan Rochkind

Hi all,

I have a survey targeted at those installing/maintaining/hosting a 
Blacklight installation, including BL-based stacks like Hydra, to see 
how people are using Blacklight and in what environments.


If you are such, please take it if you can spare the time!

https://docs.google.com/forms/d/1q2NL5pAKg5OsmnSIbjiBZdueu91ZZNqYKYPzfV-fLZ0/viewform?usp=send_form

All collected data/answers will be public.

Jonathan


Re: [CODE4LIB] Last day (tomorrow) for Code4lib 2016 submissions

2015-02-19 Thread Jonathan Rochkind

And sell your ability to host a good conference, of course!

Personally, I look as much at what I can tell about the capacity of the 
organizers to host a good conf as I look at which city it's in, and I 
hope others do too!


Jonathan

On 2/19/15 10:55 AM, Francis Kayiwa wrote:

Hello friends,

Tomorrow at midnight (Pacific) is the last day to submit your proposals
to host Code4lib 2016.

You will have a year and lots of support to iron out the details. ;-)
For now, just sell your city.

http://wiki.code4lib.org/2016_Hosting_Proposals

./fxk



Re: [CODE4LIB] linked data and open access

2014-12-22 Thread Jonathan Rochkind
 And as has already been pointed out, no one has really shown an impressive end 
 user use for linked data, which American decision making tends to be more 
 driven by.

Well, that raises an important question -- whether for an 'end user use' or some other 
use, do people have examples of neat/important/useful things done with linked 
data in Europe, especially that would have been harder or less likely without 
the data being modelled/distributed as linked data?  


From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Brent Hanner 
[behan...@mediumaevum.com]
Sent: Monday, December 22, 2014 6:11 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] linked data and open access

There are deeper issues at work here than just the kind of obvious surface 
issues.

One of the reason Europe embraced rdf triples and linked data was timing.  The 
EU was forming its centralized information institutions at the same time the idea 
of linked data to solve certain problems came about.  So they took it and ran 
with it.  In the US we have been primarily driven by the big data movement that 
gained steam shortly after.  And as has already been pointed out, no one has 
really shown an impressive end user use for linked data, which American decision 
making tends to be more driven by.


Europeans can think about data and databases differently than we can here in 
the US.  In Europe a database is intellectual property, in the US only parts of 
the database that fall under copyright law are intellectual property, which for 
most databases isn't much.  You can’t copyright a fact.  So in the US once you 
release the data into the wild it's usually public domain.


As for government data, the Federal and most state governments are in need of 
an overhaul that would make it possible.  If you don’t have the systems or 
people in place who can make it happen, it won’t happen.  Heck, the federal 
government can’t even get a single set of accounting software and whatnot.


So it isn’t just a lack of leadership or will, there are other things at work 
as well.



Brent






Sent from Windows Mail





From: Karen Coyle
Sent: Friday, December 19, 2014 10:32 AM
To: CODE4LIB@LISTSERV.ND.EDU





Yep, yep, and yep.

Plus I'd add that the lack of centralization of library direction (read:
states) is also a hindrance here. Having national leadership would be
great. Being smaller also wouldn't hurt.

kc

On 12/19/14 6:48 AM, Eric Lease Morgan wrote:
 I don’t know about y’all, but it seems to me that things like linked data and 
 open access are larger trends in Europe than here in the United States. Is 
 there a larger commitment to sharing in Europe when compared to the United 
 States? If so, is this a factor based on the nonexistence of a national 
 library in the United States? Is this your perception too? —Eric Morgan

--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600


Re: [CODE4LIB] announcing blacklight_folders v 1.0

2014-12-19 Thread Jonathan Rochkind
Nice, thank you!

With the default bookmarks functionality, if you create bookmarks as an 
un-authenticated user but then later log in, those bookmarks you created are 
automatically merged into your authenticated bookmarks, so they'll be saved 
along with your account. 

Does blacklight_folders behave similarly?

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Alicia Cozine 
[ali...@curationexperts.com]
Sent: Friday, December 19, 2014 3:25 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] announcing blacklight_folders v 1.0

Data Curation Experts welcomes the latest new gem in the Blacklight and Hydra 
family. See our blog post at 
http://curationexperts.com/2014/12/19/introducing-blacklight-folders/

Thanks to the Indiana University team of Jon Dunn, Rachael Cohen, Courtney 
Greene, Matt Sargent, Mark Feddersen, and David Jiao for a great project.

Happy holidays to all,
Alicia

Alicia Cozine
Data Curation Experts


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Jonathan Rochkind

On 11/20/14 1:06 PM, Kyle Banerjee wrote:

BTW, you can do some funky things with EZP that include
conditional logic


Can you say more about funky things you can do with EZProxy involving 
conditional logic? Cause I've often wanted that but haven't found any! 
Are you talking about a particular part/area of EZProxy? (Logging?).


Re: [CODE4LIB] MARC reporting engine

2014-11-03 Thread Jonathan Rochkind
Hm. You don't need to keep all 800k records in memory, you just need to 
keep the data you need in memory, right? I'd keep a hash keyed by 
authorized heading, with the values I need there.


I don't think you'll have trouble keeping such a hash in memory, for a 
batch process run manually once in a while -- modern OS's do a great job 
with virtual memory making it invisible (but slower) when you use more 
memory than you have physically, if it comes to that, which it may not.


If you do, you could keep the data you need in the data store of your 
choice, such as a local DBM database, which ruby/python/perl will all 
let you do pretty painlessly, accessing a hash-like data structure which 
is actually stored on disk not in memory but which you access more or 
less the same as an in-memory hash.
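The approach described above -- a hash keyed by authorized heading, swapped for an on-disk hash-like store if memory becomes a worry -- looks roughly like this in Python with the standard library's `shelve` module. The heading values are invented, purely for illustration.

```python
import os
import shelve
import tempfile

# Counting occurrences of authorized headings: a plain dict works for
# modest data; shelve gives the same hash-like interface backed by a
# file on disk, so nothing has to fit in RAM.
headings = ["Smith, John", "Doe, Jane", "Smith, John"]

path = os.path.join(tempfile.mkdtemp(), "headings.db")
with shelve.open(path) as counts:
    for h in headings:
        counts[h] = counts.get(h, 0) + 1

# Reopen later (even in another run) and the counts are still there.
with shelve.open(path) as counts:
    print(sorted(counts.items()))  # [('Doe, Jane', 1), ('Smith, John', 2)]
```

In a real batch job the values would hold whatever per-heading metadata the report needs, not just a count.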


But, yes, it will require some programming, for sure.

A MARC Indexer can mean many things, and I'm not sure you need one 
here, but as it happens I have built something you could describe as a 
MARC Indexer, and I guess it wasn't exactly straightforward, it's 
true. I'm not sure it's of any use to you here for your use case, but 
you can check it out at https://github.com/traject-project/traject


On 11/2/14 9:29 PM, Stuart Yeates wrote:

Do any of these have built-in indexing? 800k records isn't going to
fit in memory and if building my own MARC indexer is 'relatively
straightforward' then you're a better coder than I am.

cheers stuart

-- I have a new phone number: 04 463 5692

 From: Code for Libraries
CODE4LIB@LISTSERV.ND.EDU on behalf of Jonathan Rochkind
rochk...@jhu.edu Sent: Monday, 3 November 2014 1:24 p.m. To:
CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] MARC reporting
engine

If you are, can become, or know, a programmer, that would be
relatively straightforward in any programming language using the open
source MARC processing library for that language. (ruby marc, pymarc,
perl marc, whatever).

Although you might find more trouble than you expect around
authorities, with them being less standardized in your corpus than
you might like.  From: Code
for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Stuart Yeates
[stuart.yea...@vuw.ac.nz] Sent: Sunday, November 02, 2014 5:48 PM To:
CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] MARC reporting engine

I have ~800,000 MARC records from an indexing service
(http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am
trying to generate:

(a) a list of person authorities (and sundry metadata), sorted by how
many times they're referenced, in wikimedia syntax

(b) a view of a person authority, with all the records by which
they're referenced, processed into a wikipedia stub biography

I have established that this is too much data to process in XSLT or
multi-line regexps in vi. What other MARC engines are there out
there?

The two options I'm aware of are learning multi-line processing in
sed or learning enough koha to write reports in whatever their
reporting engine is.

Any advice?

cheers stuart -- I have a new phone number: 04 463 5692




Re: [CODE4LIB] MARC reporting engine

2014-11-02 Thread Jonathan Rochkind
If you are, can become, or know, a programmer, that would be relatively 
straightforward in any programming language using the open source MARC 
processing library for that language. (ruby marc, pymarc, perl marc, whatever). 
 

Although you might find more trouble than you expect around authorities, with 
them being less standardized in your corpus than you might like. 

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Stuart Yeates 
[stuart.yea...@vuw.ac.nz]
Sent: Sunday, November 02, 2014 5:48 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] MARC reporting engine

I have ~800,000 MARC records from an indexing service 
(http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am trying to 
generate:

(a) a list of person authorities (and sundry metadata), sorted by how many 
times they're referenced, in wikimedia syntax

(b) a view of a person authority, with all the records by which they're 
referenced, processed into a wikipedia stub biography

I have established that this is too much data to process in XSLT or multi-line 
regexps in vi. What other MARC engines are there out there?

The two options I'm aware of are learning multi-line processing in sed or 
learning enough koha to write reports in whatever their reporting engine is.

Any advice?

cheers
stuart
--
I have a new phone number: 04 463 5692


[CODE4LIB] CrossRef/DOI content-negotiation for metadata lookup?

2014-10-23 Thread Jonathan Rochkind
Hi, the DOI system supports some metadata lookup via HTTP 
content-negotiation.


I found this blog post talking about CrossRef's support:

http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html

But I know DataCite supports it to some extent too.

Does anyone know if there's overall registrar-agnostic documentation 
from DOI for this service?


Or, if there's kept-updated documentation from CrossRef and/or DataCite 
on it?


From that blog post, it says rdf+xml, turtle, and atom+xml should all 
be supported as response formats.


But atom+xml seems to not be supported -- if I try the very example from 
that blog post, I just get a 406 No Acceptable Resource Available.


I am not sure if this is a bug, or if CrossRef at a later point than 
that blog post decided not to support atom+xml. Anyone know how I'd find 
out, or get more information?
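For reference, the content-negotiation request itself is just an HTTP GET with an Accept header against the DOI resolver, which redirects to the registrar. A minimal sketch with Python's urllib; the DOI below is a placeholder, and the format strings are the ones the blog post mentions.

```python
from urllib.request import Request, urlopen

# Ask https://doi.org/<doi> for a metadata format via the Accept header;
# the resolver redirects to the registrar (CrossRef, DataCite, ...).
doi = "10.5555/12345678"  # placeholder DOI, not a real registered one
req = Request("https://doi.org/" + doi,
              headers={"Accept": "application/rdf+xml"})

# Uncomment to actually resolve. An HTTP 406 back means the registrar
# does not offer the requested format (as with atom+xml above).
# with urlopen(req) as resp:
#     print(resp.headers.get("Content-Type"))

print(req.get_header("Accept"))
```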


Jonathan


Re: [CODE4LIB] Linux distro for librarians

2014-10-19 Thread Jonathan Rochkind
What's an 'apostate distro'?

Anyhow, not all librarians work in roles where data recovery is the priority, 
so prioritizing data recovery wouldn't apply to all librarians. For those 
librarians who do work in a role where for some or all of their systems data 
recovery is a focus... why wouldn't you use the existing distro(s) that focus 
on that?

Developing and maintaining a distro such that is secure, reliable, and has a 
good total cost of ownership for users (back/forwards compatibility, etc)... 
is a very 'expensive' (labor intensive) proposition.  I think you are having 
trouble explaining what benefits you see in doing this, and convincing 
people they are worth the expense/risk. Unless you can make such a case (and 
show that you understand the 'costs' involved), I think you will have trouble 
recruiting people for your project. 

Jonathan

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Cornel Darden 
Jr. [corneldarde...@gmail.com]
Sent: Sunday, October 19, 2014 1:21 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Linux distro for librarians

Hello,

I'm not proposing that all librarians should use the same operating system. I 
think different people have different needs. When looking at the operating 
system that I found on foss4lib, it had many programs that librarians 'could' use 
in their work. I wish there were a better one. Maybe I'll eventually get to it 
and start a community. I think discovery is important here. Similar to many 
data recovery distros like knoppix. Doesn't need to be apostate distro, but 
point is data recovery, and gathering all the tools necessary for that end is 
the purpose of the operating system. This would be similar to one for 
librarians.


It sounds like we are starting a Linux debate. That was not my intention. But 
Linux, I believe is much better for information professionals. There are lots 
of things to work out: security, oppressive IT regimes, coding skills of 
librarians, compatibility, etc. I'm certain this is the century of open source. 
The things we're currently trying to do with technology demand it. Large adoption 
by librarians may be the missing link to the Linux and open source revolution.

But again, the debate was not my intent, just wanted to know If such an 
operating system exists.

Thanks,

Cornel Darden Jr.
MSLIS
Library Department Chair
South Suburban College
7087052945

Our Mission is to Serve our Students and the Community through lifelong 
learning.

Sent from my iPhone

 On Oct 19, 2014, at 8:30 AM, Craig Boman craig.bo...@gmail.com wrote:

 Hi Cornel,

 As a linux librarian myself, there may be some issues with having a linux
 OS for librarians, as follows.

 First is security. Although linux is supposedly better for security, linux
 security requires setting up your own PC firewall rules where Windows or
 Macs have most of these predetermined. Most university IT departments, I
 would presume, have Windows anti-virus vendors which they encourage library
 IT to use. And also almost all of our university's technology
 infrastructure is more friendly towards PC. For example pay-for-print
 systems, etc. If individual librarians have linux, they wouldn't be able to
 print easily at our university.

 What about the differing needs of librarians? Are you presuming the needs
 of all librarians are similar? My needs in library IT are drastically
 different than your needs in reference using Ubuntu. Technical services
 staff have differing needs from IT. And most library staff don't have a
 choice what they get to use, due to how universities purchase PC's with
 Windows pre-installed.

 There appear to be a lot of technical and practical limitations to making a
 linux software for librarians. From a library IT standpoint, having
 different operating systems which we in IT then have to troubleshoot would
 be a bit of a nightmare; standardization does have some benefits.

 Please elaborate on how a linux for librarians would make our jobs easier?
 Are you referring to automation? Surely any automation features available
 in linux can be emulated in Windows, no? Have you looked at AutoIT or
 AutoHotKey?

 All the best,
 Craig Boman, MLIS




 On Sun, Oct 19, 2014 at 2:32 AM, Cornel Darden Jr. corneldarde...@gmail.com
 wrote:

 Hello,

 I did find Potthakalaya on foss4lib. I'm not sure if it has a very active
 development: https://foss4lib.org/package/potthakalaya
 But it was what I was looking for. It looks like a linux operating system
 built on puppy linux that comes pre-packaged with software for librarians
 that make our jobs easier and more efficient! Does anyone know of anything
 similar. Something like this is very helpful to the field.

 Thanks,

 On Sat, Oct 18, 2014 at 8:30 PM, Pottinger, Hardy J. 
 pottinge...@missouri.edu wrote:

 Honestly, your host distro doesn't much matter, everything will be in
 Docker soon. Here's a quick way to get there

 

Re: [CODE4LIB] Library app basics

2014-10-07 Thread Jonathan Rochkind
It's perhaps a little bit outdated by now, since things change so fast, 
but there is a Code4Lib Journal article on one library's approach a few 
years ago, which you may find useful.


http://journal.code4lib.org/articles/5014

And, actually, googling for that one, I found several other ones too on 
library app mobile development, which I haven't actually read myself yet:


http://journal.code4lib.org/articles/7336
http://journal.code4lib.org/articles/2055
http://journal.code4lib.org/articles/2947

On 10/7/14 2:51 PM, Will Martin wrote:

My boss has directed me to start looking into producing a phone app for
the library, or better yet finding a way to integrate with the existing
campus-wide app.  Could I pick the list's brains?

1) Is there some tolerably decent cross-platform app language, or am I
going to be learning 3 different languages for iOS, Android, and Windows
phone?  I've dabbled in all kinds of things, but my bread-and-butter
work has been PHP on a LAMP stack.  Apps aren't written in that, so new
language time.

2) The library's selection of mobile devices consists of 2 iPads and a
Galaxy tablet.  We don't have phones for testing.  My personal phone is
a 12-year-old flip phone which doesn't run apps.  Can I get by with
emulators?  What are some good ones?  The budget for the project is
zero, so I don't think dedicated testing devices are in the cards unless
I upgrade my own phone, which I probably ought to anyway.

3) What are some best practices for library app design?  We were
thinking the key functionality would be personal account management
(what have I got checked out, renew my stuff, etc), hours, lab
availability, search the catalog, and ask a librarian.  Anything
missing?  Too much stuff?

Will Martin

Web Services Librarian
Chester Fritz Library

P.S.  I sent this a couple days ago and wondered why it hadn't shown up
-- only to realize I accidently sent it to j...@code4lib.org rather than
the actual list serv address.  Whoops, embarrassing!




Re: [CODE4LIB] What is the real impact of SHA-256? - Updated

2014-10-02 Thread Jonathan Rochkind
For checksums for ensuring archival integrity, are cryptographic flaws 
relavent? I'm not sure, is part of the point of a checksum to ensure against 
_malicious_ changes to files?  I honestly don't know. (But in most systems, I'd 
guess anyone who had access to maliciously change the file would also have 
access to maliciously change the checksum!)

Rot13 is not suitable as a checksum for ensuring archival integrity, however, 
because its output is no smaller than its input, when a compact digest is kind 
of what you're looking for. 
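The fixity-checking role of a checksum comes down to: hash the bytes, store the digest, re-hash later and compare. A small sketch with Python's hashlib; the file content here is invented.

```python
import hashlib

# Both algorithms reduce arbitrary bytes to a short fixed-size digest;
# any change to the bytes almost certainly changes the digest.
data = b"some archival file content"

md5 = hashlib.md5(data).hexdigest()        # 128-bit digest; fast, but
                                           # broken for security purposes
sha256 = hashlib.sha256(data).hexdigest()  # 256-bit, collision-resistant

print(len(md5), len(sha256))  # 32 64
```

For detecting accidental bit rot either works; the cryptographic weakness of MD5 only matters if an adversary can substitute a deliberately crafted file.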


From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Cary Gordon 
[listu...@chillco.com]
Sent: Thursday, October 02, 2014 5:51 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] What is the real impact of SHA-256? - Updated

+1

MD5 is little better than ROT13. At least with ROT13, you have no illusions.

We use SHA 512 for most work. We don't do finance or national security, so it 
is a good fit for us.

Cary

On Oct 2, 2014, at 12:30 PM, Simon Spero sesunc...@gmail.com wrote:

 Intel skylake processors have dedicated sha instructions.
 See: https://software.intel.com/en-us/articles/intel-sha-extensions

 Using a tree hash approach (which is inherently embarrassingly parallel)
 will leave io time dominant. This approach is used by Amazon glacier - see
 http://docs.aws.amazon.com/amazonglacier/latest/dev/checksum-calculations.html
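A tree hash of the kind the Glacier documentation linked above describes can be sketched as follows. This is an assumption-laden illustration, not Glacier's exact implementation: the chunk size and the promotion of an odd leftover digest follow that documentation, and each leaf chunk could be hashed in parallel, which is the point being made.

```python
import hashlib

def tree_hash(data: bytes, chunk: int = 1024 * 1024) -> str:
    """SHA-256 tree hash: hash fixed-size chunks, then combine digests
    pairwise up to a single root."""
    level = [hashlib.sha256(data[i:i + chunk]).digest()
             for i in range(0, len(data), chunk)] or [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        nxt = [hashlib.sha256(level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:            # odd digest out is promoted unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0].hex()

# For data smaller than one chunk this reduces to a plain SHA-256:
print(tree_hash(b"hello") == hashlib.sha256(b"hello").hexdigest())  # True
```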

 MD5 is broken, and cannot be used for any security purposes. It cannot be
 used for deduplication if any of the files are in the directories of
 security researchers!

 If security is not a concern then there are many faster hashing algorithms
 that avoid the costs imposed by the need to defend against adversaries.
 See siphash, murmur, cityhash, etc.

 Simon
 On Oct 2, 2014 11:18 AM, Alex Duryee a...@avpreserve.com wrote:

 Despite some of its relative flaws, MD5 is frequently selected over SHA-256
 in archives as the checksum algorithm of choice. One of the primary factors
 here is the longer processing time required for SHA-256, though there have
 been no empirical studies calculating that time difference and its overall
 impact on checksum generation and verification in a preservation
 environment.

 AVPreserve Consultant Alex Duryee recently ran a series of tests comparing
 the real time and cpu time used by each algorithm. His newly updated white
 paper "What Is the Real Impact of SHA-256?" presents the results and comes
 to some interesting conclusions regarding the actual time difference
 between the two and what other factors may have a greater impact on your
 selection decision and file monitoring workflow. The paper can be
 downloaded for free at

 http://www.avpreserve.com/papers-and-presentations/whats-the-real-impact-of-sha-256/
 .
 __

 Alex Duryee
 *AVPreserve*
 350 7th Ave., Suite 1605
 New York, NY 10001

 office: 917-475-9630

 http://www.avpreserve.com
 Facebook.com/AVPreserve http://facebook.com/AVPreserve
 twitter.com/AVPreserve



Re: [CODE4LIB] Reconciling corporate names?

2014-09-29 Thread Jonathan Rochkind
For yet another data set and API that may or may not meet your needs, 
consider VIAF -- Virtual International Authority File, operated by OCLC.


The VIAF's dataset includes the LC NAF as well as other national 
authority files. I'm not sure whether the API is suitable for limiting matches 
to the LC NAF -- I haven't done much work with it -- but I know it has an API.


http://oclc.org/developer/develop/web-services/viaf.en.html

On 9/29/14 10:18 AM, Trail, Nate wrote:

The ID.loc.gov site has a good known-label service, described here under "known
label retrieval":
http://id.loc.gov/techcenter/searching.html

Use curl and content negotiation to avoid screen scraping; for example, for LC
Name Authorities:

curl -L -H "Accept: application/rdf+xml" \
  "http://id.loc.gov/authorities/names/label/Library%20of%20Congress"

Nate

==
Nate Trail
LS/TECH/NDMSO
Library of Congress
n...@loc.gov


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon 
Brown
Sent: Monday, September 29, 2014 9:38 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Reconciling corporate names?

You could always web scrape, or download and then search the LCNAF with some 
script that looks like:

#Build query for webscraping
query = paste("http://id.loc.gov/search/?q=", URLencode("corporate name here"),
  "&q=cs%3Ahttp%3A%2F%2Fid.loc.gov%2Fauthorities%2Fnames", sep="")

#Make the call
result = readLines(query)

#Find the lines containing "Corporate Name"
lines = grep("Corporate Name", result)

#Alternatively, use approximate string matching on the downloaded LCNAF data:
#lines = agrep("corporate name here", LCNAF_data_here)

#Parse for whatever info you want
...

My native programming language is R so I hope the functions like paste, 
readLines, grep, and URLencode are generic enough for other languages to have 
some kind of similar thing.  This can just be wrapped up into a for
loop:
for(i in 1:4){...}

Web scraping the results of one name at a time would be SLOW and obviously 
using an API is the way to go, but it didn't look like the OCLC LCNAF API 
handled "Corporate Name" searches.  However, it sounds like in the previous 
message someone found a workaround.  Best of luck! -Simon






On Mon, Sep 29, 2014 at 8:45 AM, Matt Carruthers mcarr...@umich.edu wrote:


Hi Patrick,

Over the last few weeks I've been doing something very similar.  I was
able to figure out a process that works using OpenRefine.  It works by
searching the VIAF API first, limiting results to anything that is a
corporate name and has an LC source authority.  OpenRefine then
extracts the LCCN and puts that through the LCNAF API that OCLC has to
get the name.  I had to use VIAF for the initial name search because
for some reason the LCNAF API doesn't really handle corporate names as
search terms very well, but works with the LCCN just fine (there is
the possibility that I'm just doing something wrong, and if that's the
case, anyone on the list can feel free to correct me).  In the end,
you get the LC name authority that corresponds to your search term and
a link to the authority on the LC Authorities website.
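The VIAF-first filtering step Matt describes can be sketched outside OpenRefine too. Assuming you have already fetched and parsed a JSON response from VIAF's AutoSuggest endpoint (https://viaf.org/viaf/AutoSuggest?query=...), a filter like the following keeps corporate names that carry an LC identifier. The response field names used here ('nametype', 'lc', 'displayForm') are assumptions to verify against a live response:

```python
def lc_corporate_matches(results):
    """Filter parsed VIAF AutoSuggest entries down to corporate names
    that carry an LC source record, returning (display form, LCCN) pairs.

    NOTE: the keys 'nametype', 'lc', and 'displayForm' are assumptions
    about the AutoSuggest JSON shape -- check them against an actual
    response before relying on this.
    """
    return [
        (r.get("displayForm"), r.get("lc"))
        for r in results
        if r.get("nametype") == "corporate" and r.get("lc")
    ]
```

The LCCNs this yields could then be fed to the LCNAF lookup, mirroring the two-step VIAF-then-LCNAF process described above.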

Anyway, the process is fairly simple to run (just prepare an Excel
spreadsheet and paste JSON commands into OpenRefine).  The only
reservation is that I don't think it will run all 40,000 of your names
at once.  I've been using it to run 300-400 names at a time.  That
said, I'd be happy to share what I did with you if you'd like to try
it out.  I have some instructions written up in a Word doc, and the
JSON script is in a text file, so just email me off list and I can send them to 
you.

Matt

Matt Carruthers
Metadata Projects Librarian
University of Michigan
734-615-5047
mcarr...@umich.edu

On Fri, Sep 26, 2014 at 7:03 PM, Karen Hanson
karen.han...@ithaka.org
wrote:


I found the WorldCat Identities API useful for an institution name
disambiguation project that I worked on a few years ago, though my
goal wasn't to confirm whether names mapped to LCNAF.  The API
response

includes

a LCCN, and you can set it to fuzzy or exact matching, but you would
need to write a script to pass each term in and process the results:



http://oclc.org/developer/develop/web-services/worldcat-identities.en.html


I also can't speak to whether all LC Name Authorities are
represented, so there may be a chance of some false negatives.

OCLC has another API, but not sure if it covers corporate names:
https://platform.worldcat.org/api-explorer/LCNAF

I suspect there are others on the list that know more about the
inner workings of these APIs if this might be an option for you...
:)

Karen

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
Of Ethan Gruber
Sent: Friday, September 26, 2014 3:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Reconciling corporate names?

I would check with the developers of SNAC (
http://socialarchive.iath.virginia.edu/), as they've spent a lot of
time developing 

Re: [CODE4LIB] LibGuides v2 - Templates and Nav

2014-09-24 Thread Jonathan Rochkind

On 9/24/14 10:27 AM, Joshua Welker wrote:

For instance, right now in 2.0 you can create templates (these are great
btw). But there's no way to my knowledge to limit editors to using several
of them. I'd like to create official one, two, and three column templates
for our library.


Honestly, this seems like asking for a technological solution to a 
social/organizational problem.


I mean, all these editors are employees of your library, right? If you 
can't get them to follow an official policy of only using certain 
templates, that's not a tech problem. (And if it's not an official 
policy, what gives you the right to restrict it technically?)


Believe me, I know this isn't as simple as it seems (oh, believe me, I 
know). But trying to work around organizational failings with 
technological solutions, in my experience, usually just kicks the can 
down the road: the problems will keep popping up over and over again, 
and your attempted technological workarounds won't work around them.


And I'd rather SpringShare spent their time on some of the other wishes 
you outline, which are things you really can't do without software 
enhancement, not just workarounds for organizational failings.


Jonathan


Re: [CODE4LIB] LibGuides v2 - Templates and Nav

2014-09-24 Thread Jonathan Rochkind

Who has the ability to make policies about content styling?

If no such policy exists, and no person or entity in your organization 
has such an ability, then what would give you, the IT person, the right 
to make your own policies and enforce them with a LibGuides feature? 
Wouldn't that result in just as many political problems and enmity?


If some entity has that ability, does such a policy exist?  If not, 
what would it take to make such a policy? Including consultation with 
necessary stakeholders, etc.


If such a policy exists, but librarians are ignoring it, is it 
appropriate to talk to their boss? And/or have a discussion about why 
the policy exists, and why it's important we all follow organizational 
policies to have a consistent experience for our users?


Now, really, I know this can take literally _years_ to deal with, or be 
impossible to deal with in some organizations.


And since this is obviously a problem at nearly every library using 
LibGuides, _maybe_ it makes sense to put technical features in LibGuides 
allowing you to restrict editing variation etc. But if you only do the 
technical fix without the organizational issues, it is going to keep 
coming up again and again -- and you're maybe just going to get in a 
fight about why did you have the right to configure those restrictions?


Jonathan

On 9/24/14 12:56 PM, Joshua Welker wrote:

I lol'ed several times reading your message. I feel the pain. Well, it is
nice to know I am not alone. You are right that this in particular is an
organizational problem and not a LibGuides problem. But unfortunately it
has been an organizational problem at both of the universities where I've
worked that use LibGuides, and it sounds like it is a problem at many
other libraries. I'm not sure what it is about LibGuides that brings out
the most territorial and user-marginalizing aspects of the librarian
psyche.

Does anyone have any positive experience in dealing with this? I am on the
verge of just manually enforcing good standards even though it will create
a lot of enmity. LibGuides CMS has a publishing workflow feature that
would force all guide edits to be approved by me so that I could stamp
this stuff out each time it happens.

To enforce, or not to enforce, that is the question--
Whether 'tis nobler in the mind to suffer the slings and arrows of
outrageously poor usability,
Or to take arms against a sea of ugly guides,
And by forcing compliance with standards and best practices, end them?

Josh Welker


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Will Martin
Sent: Wednesday, September 24, 2014 11:34 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] LibGuides v2 - Templates and Nav


4. Admin controls are not very granular. With most aspects of editing
a guide, you either have the option of locking down styles and
templates completely (and oh your colleagues will howl) or allowing
everything (and oh your eyeballs will scream). Some of these things
could very well be improved in the future, and some probably will not.


This!  My librarians have successfully resisted every attempt to impose
any kind of standardization.  Visual guidelines?  Nope.  Content
guidelines?  Nope.  Standard system settings?  Nope.  Anything less than
100% free rein appears to be anathema to them.

The result, predictably, is chaos.  Our guides run the gamut.  We have
everything:

- Giant walls of text that no one ever reads.

- Lovingly crafted lists of obscure library sources that rarely (if
ever) bear any relation to what the patron is actually trying to do.

- A thriving ecosystem of competing labels.  Is it "Article Indexes,"
"Article Databases," just plain "Databases," or something more exotic?
Depends which apex predator rules this particular neck of the jungle.

- Green text on pink backgrounds with maroon borders.  Other pages in the
same guide might go with different, equally eye-twisting color schemes.
I'm not even sure how he's doing that without access to the style sheet,
but he's probably taught himself just enough HTML to mangle things in an
effort to use friendly colors.

- Some guides have three or even FOUR rows of tabs.  With drop-down
submenus on most of them, naturally.

- A few are nicely curated and easy to use, but they're in a distinct
minority.

I've tried.  I've pushed peer-reviewed usability studies at them.  I've
reported on conference sessions explaining exactly why all these things
are bad.  I've brought them studies of our own analytics.  I've had
students sit down and get confused in front of them.  Nothing has gotten
through, and being the only web type at the library, I'm outnumbered.
Just the thought of it makes me supremely tired.

I'm sorry if this has digressed.  LibGuides is not at fault, really.
It's an organizational problem.  LibGuides just seems to be the flash
point for it.

Will




Re: [CODE4LIB] LibGuides v2 - Templates and Nav

2014-09-17 Thread Jonathan Rochkind
However I'd also point out that if that class, instead of being simply 
'hidden', had been named similarly to Bootstrap's 'sr-only', or even a more 
fully spelled out 'screen-reader-only', the later developer would have 
been more likely to wonder "Hmm, maybe that's not simply hidden but 
means something else; maybe I should try to look up or ask someone what 
it means if I'm not sure."


Attempting self-documenting code always matters for successor 
developers, not just in issues of accessibility. And labelling something 
simply 'hidden' that is not in fact always hidden is misleading to your 
successors.


I mean, in your example they left a comment with their thought process 
-- the thing was labelled 'hidden' after all.


Jonathan

On 9/17/14 5:03 PM, Will Martin wrote:

To digress a bit from LibGuides ...

The biggest problem with accessibility is not technical: it's cultural.
Producing HTML that meets basic accessibility tests is not all THAT
difficult.  The harder part is setting up a culture where everyone --
everyone! -- who writes content for the web is trained on how to do it
accessibly.  A content editor who is clueless about accessibility can
very easily screw up their pages without even knowing they're doing so.

The same applies to developers.  Once while reviewing a library site's
code, I came across a chunk of HTML that looked like this (roughly):

<!--
I don't know why this was here?  It's invisible!  Disabling.

<a href="#top" class="hidden">Return to top</a>
-->

An earlier developer had put that in to assist screen reader users in
getting back to the top of the page if they wanted.  The hidden class
was a correctly written class for hiding content while leaving it
available for screen reader users.  But the next person to fill that job
wasn't trained on WHY and took it out again.
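For reference, the "visually hidden but screen-reader-available" technique usually looks something like Bootstrap 3's .sr-only class (reproduced roughly from memory -- check the current Bootstrap source for the canonical version). A class written and, crucially, named this way is much harder to mistake for dead code:

```css
.sr-only {
  position: absolute;
  width: 1px;
  height: 1px;
  margin: -1px;
  padding: 0;
  overflow: hidden;
  clip: rect(0, 0, 0, 0);
  border: 0;
}
```

Unlike display: none or visibility: hidden, this keeps the element in the accessibility tree, so screen readers still announce it.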

If you really want to commit to accessibility, it needs to be a
criterion in the job description for your developers, and there needs to
be a training process in place for anyone who produces content for your
site.  Probably with refreshers at intervals.

Will




Re: [CODE4LIB] LibGuides v2 - Templates and Nav

2014-09-17 Thread Jonathan Rochkind
Mouse hover is not available to anyone using a touch device rather than 
a mouse, as well as being problematic for keyboard access.


While there might be ways to make the on-hover UI style keyboard 
accessible (perhaps in some cases activating on element focus in 
addition to on-hover), there aren't really any good ones I can think of 
for purely touch devices (which don't really trigger focus state either).


An increasing amount of web use, of course, is mobile touch devices, and 
probably will continue to be and to increase for some time, including on 
library properties.


So I think on-hover UI should probably simply be abandoned at this 
point; even if some people love it, it will be inaccessible to an 
increasing portion of our users, with no good accommodations.


Jonathan

On 9/17/14 4:25 PM, Jesse Martinez wrote:

By the same token, we're making it a policy not to use mouse hover-over
effects to display database/asset descriptions in LG2 until this can become
keyboard accessible. This is a beloved feature from LG1 so I'm hoping
SpringShare read my pestering emails about this...

Jesse

On Wed, Sep 17, 2014 at 3:38 PM, Brad Coffield bcoffield.libr...@gmail.com
wrote:


Jonathan,

That point is well taken. Accessibility, to me, shouldn't be a tacked-on
we'll do the best we can sort of thing. It's an essential part of being a
library being open to all users. Unfortunately I know our site has a lot of
work to be done regarding accessibility. I'll also pay attention to that
when/if I make mods to the v2 templates.

On Wed, Sep 17, 2014 at 1:49 PM, Jonathan LeBreton lebre...@temple.edu
wrote:


I might mention here that we (Temple University) found LibGuides 2.0 to
offer some noteworthy improvements in Section 508 accessibility
when compared with version 1.0.  Accessibility is a particular point of
concern for the whole institution as we look across the city, state, and
country at other institutions that have been called out and settled with
various disability advocacy groups.
So we moved to v. 2.0 during the summer in order to have those
improvements in place for the fall semester, as well as to get the value
from some other developments in v. 2.0 that benefit all customers.

When I see email on list about making modifications to templates and
such, it gives me a bit of concern on this score: by doing so, one
might easily begin to make the CMS framework for content less accessible.
I thought I should voice that.  This is not to say that one shouldn't
customize and explore enhancements etc., but one should do so with some
care if you are operating with similar mandates or concerns.  Unless I am
mistaken, several of the examples noted are now throwing 508 errors that
are not in the out-of-the-box LibGuides templates and which are not the
result of an individual content contributor/author inserting bad stuff
like images without alt tags.




Jonathan LeBreton
Senior Associate University Librarian
Editor: Library & Archival Security
Temple University Libraries
Paley M138,  1210 Polett Walk, Philadelphia PA 19122
voice: 215.204.8231
fax: 215.204.5201
mobile: 215.284.5070
email:  lebre...@temple.edu
email:  jonat...@temple.edu

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Cindi Blyberg
Sent: Wednesday, September 17, 2014 12:03 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] LibGuides v2 - Templates and Nav

Hey everyone!

Not to turn C4L into Support4LibGuides, but... :)

The infrastructure for all the APIs is in place; currently, the Guides
API and the Subjects API are functioning.  Go to Tools > API > Get Guides
to see the general structure of the URL.  Replace "guides" with
"subjects" to retrieve your subjects.  You will need your LibGuides site
ID, which you can get from the LibApps Dashboard screen.

Word is that it will not take long to add other API calls on the back
end; if you need these now, please do email supp...@springshare.com and
reference this conversation.

As for v1, we are planning on supporting it for 2 more years--that said,
we would never leave anyone hanging, so if it takes longer than that to
get everyone moved over, we're ready for that.

Best,
  -Cindi

On Wed, Sep 17, 2014 at 10:46 AM, Nadaleen F Tempelman-Kluit 

n...@nyu.edu



wrote:


Hi all-
While we're on the topic of LibGuides V2, when will the GET subjects
API (and other API details) be in place? We're in a holding pattern
until we get those details and we've not been able to get any timeline
as to when those assets will be in place. So we're deciding between
building out LibGuides CMS Global landing pages using the V1
platform, or waiting until some future date which, very soon, will
mean abandoning this project till next summer. If we go the former
route, it would also be great to know how long V1 will be supported.
Thanks



On Wed, Sep 17, 2014 at 10:29 AM, Cindi Blyberg cindi...@gmail.com
wrote:


On Tue, Sep 16, 2014 at 7:15 PM, Michael 

Re: [CODE4LIB] Enabling both local and remoteAuth (Shibboleth) in ILLiad?

2014-08-20 Thread Jonathan Rochkind
I haven't done this, but have been thinking about it. I _think_ what the 
docs mean, is you can create an Alias in IIS (is that what IIS calls 
it?) for the same directory, so you have two different paths appearing 
to IIS (one of which can be protected with Shibboleth, the other one 
not) but you don't actually need to copy the directories, it's the same 
files on disk.


I haven't tried this yet though. I agree the documentation is 
unfortunately parsimonious.


On 8/18/14 3:52 PM, Kim, Bohyun wrote:

Hi all,

Looking for some advice from those who are familiar with either Shibboleth 
and/or ILLiad.

Did anyone implement both remoteAuth through Shibboleth and basic local ILLiad 
login for different groups of users? The sparse documentation on this on ILLiad 
site seems to suggest two separate directories (with two separate 
illiad.dll(s)?? ) and one directory to be the value of ‘RemoteAuthWebPath’ as 
well as the value of the Shibboleth configuration.xml ‘path’ field. We are not 
sure what each of the two directories is supposed to contain and whether they 
are supposed to be the exact duplicate of the other.

https://prometheus.atlas-sys.com/display/illiad/RemoteAuth+Authentication
“You can enable RemoteAuth authentication for a particular web directory while still 
keeping a separate web directory for users to register themselves via Basic ILLiad 
authentication. The RemoteAuthWebPath would be the directory controlled by remote 
authentication while the WebPath key (Web Interface | System | WebPath) would have 
the directory not controlled by remote authentication. RemoteAuthSupport being set 
to Yes would tell ILLiad to check the directory and then know if the user should be 
authenticated remotely or by ILLiad.”

Any advice from those who have tried this would be greatly appreciated.

Thanks!
Bohyun




[CODE4LIB] very large image display?

2014-07-25 Thread Jonathan Rochkind
Does anyone have a good solution to recommend for display of very large images 
on the web?  I'm thinking of something that supports pan and scan, as well as 
loading only certain tiles for the current view to avoid loading an entire 
giant image.

A URL to more info to learn about things would be another way of answering this 
question, especially if it involves special server-side software.  I'm not sure 
where to begin. Googling around I can't find any clearly good solutions.

Has anyone done this before and been happy with a solution?

Thanks for any info!

Jonathan


Re: [CODE4LIB] very large image display?

2014-07-25 Thread Jonathan Rochkind
Thanks for all the recommendations!

I've been reading and understanding the problem space better.  Here's my 
summary of what I've figured out. 

For this project, there is really only a handful of big images, and simplicity 
of server-side is a priority -- so I think it's actually okay to pre-render all 
the tiles in advance, and avoid an actual image server -- to the extent tools 
can work with this. 

At first, I thought "Oh gee, this is actually kind of like a mapping problem," 
and wound up at OpenLayers. I think OpenLayers could be used for this 
non-geographical purpose -- with "units: pixels" -- but it's definitely a 
complicated product (without particularly extensive documentation), and beyond 
feeling pretty confident that it would be possible to use it like this, I 
hadn't actually managed to arrive at a demo. 

Then I eventually found OpenSeadragon, which a couple other people in this 
thread suggested, which looks like a pretty good fit. It looks like it possibly 
can work with entirely pre-rendered tiles served statically with no image 
server, using the DZI format. 
(http://openseadragon.github.io/examples/tilesource-dzi/).  I haven't actually 
gotten to a proof of concept here, but I think it'll work. 
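For anyone weighing the pre-rendered-tiles route, the DZI pyramid geometry is simple enough to sanity-check in a few lines. This Python sketch is illustrative only (real tile generation would use a tool such as VIPS or a deepzoom script); it computes the level dimensions and tile counts, assuming the conventional 254px Deep Zoom tile size:

```python
import math

def dzi_levels(width, height, tile_size=254):
    """Deep Zoom (DZI) pyramid geometry: the top level is the full image,
    and each lower level halves the dimensions (rounding up) down to a
    1x1 image at level 0. Returns (level, width, height, tiles_across,
    tiles_down) tuples from the largest level down to level 0.

    Tile size 254 (plus a 1px overlap in real DZI output) is the usual
    convention; this is a sketch of the math, not a tile generator.
    """
    max_level = max(width, height).bit_length() - 1
    if 2 ** max_level < max(width, height):
        max_level += 1  # i.e. ceil(log2(max dimension))
    levels = []
    w, h = width, height
    for level in range(max_level, -1, -1):
        levels.append((level, w, h,
                       math.ceil(w / tile_size), math.ceil(h / tile_size)))
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    return levels
```

Summing tiles_across * tiles_down over all levels gives the number of static files you would pre-render and serve, which is a useful estimate when deciding whether skipping an image server is practical.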

I didn't mention that the next phase requirement/desire was annotations on the 
image. It looks like there's a tool called Annotorious which has some (beta) 
support for annotations in both OpenSeadragon and OpenLayers. 

So my current plan is trying to pursue a proof of concept using OpenSeadragon 
and Annotorious. There are some potential future phase requirements which might 
require multiple layers, which I guess might require trying OpenLayers after 
all. (My sense is that Annotorious' OpenLayers support is currently a lot 
buggier than the OpenSeadragon support though). 

Thanks again for the suggestions! Very helpful. I may be back with more 
questions. 

Jonathan 

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Esmé Cowles 
[escow...@ticklefish.org]
Sent: Friday, July 25, 2014 4:44 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] very large image display?

We previously used the Zoomify Flash applet, but now use Leaflet.js with the 
Zoomify tileset plugin:

https://github.com/turban/Leaflet.Zoomify

One thing I like about this approach is that it minimizes the amount of 
Javascript code the clients have to load, since we use Leaflet.js for our maps 
and it's already loaded.

-Esme

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
 Jonathan Rochkind
 Sent: Friday, July 25, 2014 10:36 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] very large image display?

 Does anyone have a good solution to recommend for display of very large 
 images on the web?  I'm thinking of something that supports pan and scan, as 
 well as loading only certain tiles for the current view to avoid loading an 
 entire giant image.

 A URL to more info to learn about things would be another way of answering 
 this question, especially if it involves special server-side software.  I'm 
 not sure where to begin. Googling around I can't find any clearly good 
 solutions.

 Has anyone done this before and been happy with a solution?

 Thanks for any info!

 Jonathan


Re: [CODE4LIB] very large image display?

2014-07-25 Thread Jonathan Rochkind
Thanks Christina, can you tell me more about Scripto, or provide a URL? I'm 
not sure what that refers to, and my googling is not finding the right one. Is 
Scripto an Omeka plugin? 

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of George, 
Christina Rose [georg...@umsystem.edu]
Sent: Friday, July 25, 2014 12:03 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] very large image display?

Jonathan,

We use Scripto with Omeka to have volunteers transcribe manuscripts which are 
high resolution images. It has the option of using OpenLayers (which is the 
setting we use) or Zoom.it for image display.

OpenLayers: http://openlayers.org/
Zoomit: http://zoom.it/

I have no idea the level of complexity it takes to implement either of these 
since these options came bundled in Scripto but I approve of the results.

-Christina

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
Jonathan Rochkind
Sent: Friday, July 25, 2014 10:36 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] very large image display?

Does anyone have a good solution to recommend for display of very large images 
on the web?  I'm thinking of something that supports pan and scan, as well as 
loading only certain tiles for the current view to avoid loading an entire 
giant image.

A URL to more info to learn about things would be another way of answering this 
question, especially if it involves special server-side software.  I'm not sure 
where to begin. Googling around I can't find any clearly good solutions.

Has anyone done this before and been happy with a solution?

Thanks for any info!

Jonathan


[CODE4LIB] access to code4lib server for planet maintenance?

2014-07-14 Thread Jonathan Rochkind
Hi all. The Planet Code4lib aggregator runs on a server operated by 
Oregon State.


128.193.168.90  poseidon.library.oregonstate.edu

I've been sort of caretaking the Planet, but really just editing the 
config for feed sources. Which is really all I have the time/inclination 
for at the moment, I've basically just been barely care-taking it.


Meanwhile, the planet is running on really old software, and maybe has 
some issues with some things.


Bill Denton has kindly volunteered to take a look at upgrading and/or 
debugging the software.


Does anyone know who we'd talk to in order to get him an account on the 
server so he can do so?


(Long term, if anyone is interested in taking over management of the 
planet including tech stuff, interest is welcome! I'm not sure if Bill 
is interested in dealing with it long-term, or just taking a look at the 
moment).


Jonathan


[CODE4LIB] MarcEdit Tasks power?

2014-06-24 Thread Jonathan Rochkind

Hi code4libbers,

I don't have much experience with MarcEdit, I'm hoping someone else 
does, especially with creating automated MarcEdit tasks, and can advise:


Would it be possible to create a MarcEdit task that:

= IF there is a 338 field with subfield $a "online resource", THEN erase 
all existing subfield $b's in that field, and add a single subfield $b 
"cr".



We have records loaded from certain sources that have inconsistent 338$a 
and $b, where the $a is the reliable one. I'm curious if I can send 
records from these sources through a MarcEdit Task to correct this known 
pattern of error.
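I can't speak to MarcEdit's task syntax, but the transformation itself is trivial to state in code. A toy Python sketch, operating on a single 338 field represented as (code, value) pairs -- a real script would use something like pymarc on full MARC records, and the field/subfield values here just mirror the example above:

```python
def fix_338(subfields):
    """Given one 338 field as a list of (code, value) pairs: if any $a is
    'online resource', drop every existing $b and append a single $b 'cr'.

    A stand-in for the MarcEdit task described above, not real MARC
    handling -- indicators, full records, and writing the file back out
    are all out of scope for this sketch.
    """
    if any(code == "a" and value == "online resource"
           for code, value in subfields):
        subfields = [(c, v) for c, v in subfields if c != "b"]
        subfields.append(("b", "cr"))
    return subfields
```

Fields whose $a is anything else pass through untouched, matching the IF/THEN condition in the question.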


Anyone know if this is possible in a MarcEdit task? And if you could 
supply some hints to a complete newbie to MarcEdit on how to do it, that 
would be quite kind of you!


Jonathan


Re: [CODE4LIB] Jobs Digest - I definitely didn't rip off someone else's job posting

2014-05-29 Thread Jonathan Rochkind
Is there anyone that found the original job postings to the list 
actually MORE distracting and inconveniencing than the incessant 
discussion of what to do about them?


Jonathan

On 5/29/14 7:44 AM, Ross Singer wrote:

THIS IS NOT EXACTLY WHAT WE AGREED TO
On May 29, 2014 7:38 AM, Andreas Orphanides akorp...@ncsu.edu wrote:


YAY FULL JOB POSTINGS


On Wed, May 28, 2014 at 11:40 PM, BWS Johnson abesottedphoe...@yahoo.com

wrote:



Research Analyst I
Royt's Treehouse

The prestigious Tennant's Treehouse is accepting applications for the
position of Research Analyst I for the Juniper Club Library. A
collaborative position in nature, the Research Analyst I will indenture
themselves to the library duhrector artisanally collecting redundant data
via Diebold-O-Tron. The Research Analyst I will be abused at any given
opportunity, be paid only in hard liquor, maintain all digital object
collections, regardless of relevance or irrelevance of said collection

and

shepherd digital humanities projects, whatevertheheckthoseare.


The successful candidate will have 17 years experience in Koha despite
this being an entry level position that only freshly minted graduates may
apply to and that proficiency not possibly existing in this reality,
archiving meaningless discussion threads, ragging on royt at any given
opportunity, and collating mimeographs since we forgot to take this out

of

our job description sometime when MARC was merely a glimmer in a data
nerd's eye. None of these skills relate in the slightest to counting

votes,

but that's what HR told us, and ours is not to reason why.

We will not tell you where Royt's Treehouse is located since you are

meant

to already know. As with conference, you were meant to apply for this

post

prior to it making the rounds in your hemisphere, so if you are located
outside of the continental United States, too damn bad.

For further information, feel free to contact abesottedphoe...@yahoo.com

,

where your email will fester in a pile since your résumé will be thrown

out

for having a funny name or not matching spurious keywords.

All applicants are REQUIRED to have a beating a dead horse Code{4}Lib
t-shirt.








Re: [CODE4LIB] getting URIs, was: [CODE4LIB] barriers to open metadata?

2014-04-30 Thread Jonathan Rochkind
If you want libraries to spend money on adding URIs to their data, 
there is going to need to be some clear benefit they get from doing it 
-- and it needs to be a pretty near-term benefit, not "Well, some day 
all these awesome things might happen, because linked data."



On 4/30/14 1:34 PM, Karen Coyle wrote:

Thanks, Richard. I ask because it's one of the most common questions
that I get -- often about WorldCat, but in general about any source of
URIs -- "How do I connect my data (text forms) to their URIs?" And these
questions usually come from library or archive projects with little or
no programming staff. So it seems like we need to be able to answer that
question so that people can get linked up. In fact, it seems to me that
the most pressing need right now is an easy way (or one that someone
else can do for you at a reasonable cost) to connect the text string
identifiers that we have to URIs. I envision something like what we
went through when we moved from AACR name forms to AACR2 name forms, and
libraries were able to send their MARC records to a service that
returned the records with the new name form. In this case, though, such
a service would return the data with the appropriate URIs added. (In the
case of MARC, in the $0 subfield.)

It's great that the big guys like LC and OCLC are providing URIs for
resources. But at the moment I feel like it's grapes dangling just
beyond the reach of the folks we want to connect to. Any ideas on how to
make this easy are welcome. And I do think that there's great potential
for an enterprising start-up to provide an affordable service for
libraries and archives. Of course, an open source "pass in your data in
x or y format and we'll return it with URIs embedded" would be great,
but I think it would be reasonable to charge for such a service.

kc


On 4/30/14, 9:59 AM, Richard Wallis wrote:

To unpack the several questions lurking in Karen’s question.

As to being able to use the WorldCat Works data/identifiers there is no
difference between a or b - it is ODC-BY licensed data.

Getting a Work URI may be easier for a), as they should be able to
identify the OCLC Number and hence use the linked data from its URI 
http://worldcat.org/oclc/{ocn} to pick up the link to its work.

Tools such as xISBN http://xisbn.worldcat.org/xisbnadmin/doc/api.htm
can
step you towards identifier lookups and are openly available for low
volume
usage.

Citation lookup is more of a bib lookup feature from which you could get an
OCLC Number. One of my colleagues may be helpful on the particulars of this.

Apologies for being WorldCat specific, but Karen did ask.

~Richard.


On 30 April 2014 17:15, Karen Coyle li...@kcoyle.net wrote:


My question has to do with discoverability. Let's say that I have a
bibliographic database and I want to add the OCLC work identifiers to
it.
Obviously I don't want to do it by hand. I might have ISBNs, but in some
cases I will have a regular author/title-type citation.

and let's say that I am asking this for two different institutions:
a) is an OCLC member institution
b) is not

Thanks,
kc




On 4/30/14, 8:47 AM, Dan Scott wrote:


On Tue, Apr 29, 2014 at 11:37 PM, Roy Tennant roytenn...@gmail.com
wrote:


This has now instead become a reasonable recommendation

concerning ODC-BY licensing [3] but the confusion and uncertainty
about which records an OCLC member may redistribute remains.

[3] http://www.oclc.org/news/releases/2012/201248.en.html


Allow me to try to put this confusion and uncertainty to rest once and
for
all:

ALL THE THINGS. ALL.

At least as far as we are concerned. I think it's well past time to
put
the
past in the past.


That's great, Roy. That's a *lot* simpler than parsing the
recommendations, WCRR, community norms, and such at [A, B] :)

  Meanwhile, we have just put nearly 200 million works records up as
linked

open data. [1], [2], [3]. If that doesn't rock the library open linked
data
world, then no one is paying attention.
Roy

[1] http://oclc.org/en-US/news/releases/2014/201414dublin.html
[2]
http://dataliberate.com/2014/04/worldcat-works-197-million-nuggets-of-linked-data/
[3] http://hangingtogether.org/?p=3811


Yes, that is really awesome. But Laura was asking about barriers to
open metadata, so damn you for going off-topic with PR around a lack
of barriers to some metadata (which, for those who have not looked
yet, have a nice ODC-BY licensing statement at the bottom of a given
Works page) :)

A. http://oclc.org/worldcat/community/record-use.en.html
B. http://oclc.org/worldcat/community/record-use/data-licensing/questions.en.html


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet








[CODE4LIB] online book price comparison websites?

2014-02-26 Thread Jonathan Rochkind
Anyone have any recommendations of online sites that compare online 
prices for purchasing books?


I'm looking for recommendations of sites you've actually used and been 
happy with.


They need to be searchable by ISBN.

Bonus is if they have good clean graphic design.

Extra bonus is if they manage to include shipping prices in their price 
comparisons.


Thanks!

Jonathan


Re: [CODE4LIB] online book price comparison websites?

2014-02-26 Thread Jonathan Rochkind
Thanks to everyone who made a suggestion, Raymond's suggestion of 
allbookstores.com seems best suited for my purposes. Thanks!


On 2/26/14 3:24 PM, Schwartz, Raymond wrote:

I usually use http://allbookstores.com/

It even includes a local bookstore I visit--The Strand.

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
Stephanie P Hess
Sent: Wednesday, February 26, 2014 3:19 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] online book price comparison websites?

Try http://www.addall.com/. I used it all the time in my former incarnation as 
an Acquisitions Librarian.

Cheers,

Stephanie


On Wed, Feb 26, 2014 at 3:14 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


Anyone have any recommendations of online sites that compare online
prices for purchasing books?

I'm looking for recommendations of sites you've actually used and been
happy with.

They need to be searchable by ISBN.

Bonus is if they have good clean graphic design.

Extra bonus is if they manage to include shipping prices in their
price comparisons.

Thanks!

Jonathan







Re: [CODE4LIB] Proquest search api?

2014-02-17 Thread Jonathan Rochkind

Interesting, thanks for the additional information, very useful!

I don't like relying on the 'free text' in subfield 3, because it seems 
fragile: who knows whether I know all the possible values, or whether 
they will change them in the future, breaking my code.


But your example with two 'full text' links is enlightening.

I think what I'm liking as an algorithm for my needs (any full text is 
better than none, but PDF is best) is: first look for 856 fields with 
second indicator 0. If there's only one, use it. If there is more than 
one, try to find one that includes the substring PDF; if none do, just 
use the first one.


Jonathan

On 2/17/14 11:16 AM, Andrew Anderson wrote:

The document you want to request from ProQuest support was called 
Federated-Search.docx when they sent it to me.  This will address many of your 
documentation needs.

ProQuest used to have an Excel spreadsheet with all of the product codes for the databases 
available for download from 
http://support.proquest.com/kb/article?ArticleId=3698&source=article&c=12&cid=26,
 but it appears to no longer be available from that source.  ProQuest support should be 
able to answer where it went when you request the federated search document.

You may receive multiple 856 fields for Citation/Abstract, Full Text, and 
Scanned PDF:

=856  41$3Citation/Abstract$uhttp://search.proquest.com/docview/...
=856  40$3Full Text$uhttp://search.proquest.com/docview/...
=856  40$3Scanned PDF$uhttp://search.proquest.com/docview/...

I would suggest that rather than relying on the 2nd indicator, you should parse 
subfield 3 instead to find the format that you prefer.  You see the multiple 
856 fields in the MARC records for ProQuest holdings as well, as that is how 
ProQuest handles coverage gaps in titles, so if you have ever processed 
ProQuest MARC records before, you should already be prepared for this.



[CODE4LIB] Proquest search api?

2014-02-12 Thread Jonathan Rochkind
I feel like at some point I heard there was a search API for the 
Proquest content/database platform.


I can find no evidence of it on google, but that's not really unusual 
for most of our vendor's apis.


Does anyone know if such an API exists, usable by Proquest customers, 
and if so how to get more info about it?


I am particularly interested in a search API to the Proquest 
Dissertations  Theses database, but most Proquest content seems to be 
on a standard consistent platform, so I imagine any API's that exist 
would not be special to that one database.


Jonathan


Re: [CODE4LIB] Proquest search api?

2014-02-12 Thread Jonathan Rochkind
Aha, thinking to Google "proquest z3950" actually got me some 
additional clues!


"Sites that are currently using Z39.50 to search ProQuest are advised to 
consider moving to the XML gateway."


in Google snippets for:

http://www.proquest.com/assets/downloads/products/techrequirements_np.pdf

Also: "If you are using the previous XML gateway for access other than 
with a federated search vendor, please contact our support center at 
www.proquest.com/go/migrate and we can get you the new XML gateway 
implementation documentation."


Okay, so now I at least know that something called the XML Gateway 
exists, and that's what I want info on or ask about!  (Why are our 
vendors so reluctant to put info on their services online?)


I am not a huge fan of Z39.50, and am not ordinarily optimistic about 
its ability to actually do what I need, but I'd use it if it was all 
that was available; in this case, it seems like ProQuest is recommending 
you do NOT use it, but use this mysterious 'XML gateway'.




On 2/12/14 3:29 PM, Eric Lease Morgan wrote:

On Feb 12, 2014, at 3:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


I feel like at some point I heard there was a search API for the
Proquest content/database platform.



While it may not be the coolest, I’d be willing to bet ProQuest supports 
Z39.50. I used it lately to do some interesting queries against the New York 
Times Historical Newspapers Database (index). [1] Okay. I know. Z39.50 and 
its Reverse Polish Notation query language. Yuck. Moreover, the bibliographic 
data is probably downloadable as MARC records, but hey.

[1] Z39.50 hack - http://blogs.nd.edu/emorgan/2013/11/fun/

—
Eric Lease Morgan




Re: [CODE4LIB] Academic Library Website Question

2013-12-17 Thread Jonathan Rochkind

On 12/17/13 1:46 PM, Lisa Rabey wrote:

I'm with Lisa in that when checking out other institutions, I check to
see how many clicks it takes to get to the library, and if it is not
immediately on the landing page of the college OR at least a drop down
link from a parent portal, I start becoming Judgey McJudgepants on
that institution. Because if I'm a librarian and I can't find it, I
cannot even begin to imagine how their students can get to their own
library.



Hmm, this sounds weird to say, but it never occurred to me that most 
students would start from the institutional home page, or really ever 
visit the institutional home page at all. Largely because most 
institutional home pages are nearly useless for current affiliates of 
the institution, but are instead perhaps marketing brochures for 
prospective students.


I wonder how many students or other current affiliates actually start at 
institutional home pages, and how often.


[CODE4LIB] new umlaut listserv location

2013-12-16 Thread Jonathan Rochkind
For those who use or are interested in Umlaut[1], there is a new 
listserv for the project, now hosted on Google Groups:


https://groups.google.com/forum/#!forum/umlaut-software

The previous listserv was hosted on rubyforge, which appears to be down 
and it's unclear if it's coming back. So I've taken the plunge and 
created the Google Group, something I had been meaning to do for a while.


Thanks!

Jonathan


Re: [CODE4LIB] Mapping LCSH to DDC

2013-12-11 Thread Jonathan Rochkind
I was looking for this last month (there may even be a thread in the 
archives from me on it); I didn't find anything very suitable.


The only potentially useful thing I found is that id.loc.gov does 
include mappings for _some but not all_ LCSH authority records. I think 
these mappings were included in the original LCSH authority records; 
they are not value added by id.loc.gov. So any source of LCSH authority 
records would also have this info.


For example, look under "LC Classification" at 
http://id.loc.gov/authorities/subjects/sh85004812.html


I think some LCSH authority records may have more than one suggested LCC 
too (which is reasonable and makes sense).


You can look up records on id.loc.gov by known term by doing, say: 
http://id.loc.gov/authorities/subjects/label/anarchism


A pre-coordinated LCSH string in a MARC record may not exactly 
correspond to an authority record, so you may have trouble matching 
there; you may have to figure out which part of the LCSH string matches 
the auth record.  And then, not every LCSH auth record has a suggested 
corresponding LCC; only some of them do.


But that's pretty much the only thing i was able to find when I looked 
last month.


In theory, I think a promising approach would be to take an existing 
corpus of bib records and assemble statistical correlations between 
LCSH and LCC.  For instance, of all the records with a certain LCSH 
assigned, maybe there are 12 LCCs represented, some more than 
others. This would give you some pretty likely candidates for 
appropriate LCCs for a given LCSH.  It would take some algorithmic 
tweaking, probably.  It would work best on a huge corpus like OCLC's or 
LC's (and I think I recall a non-free product that does exactly this), 
but it could possibly bear fruit on a reasonably large corpus like an 
individual research library's too. It would be an interesting project 
for someone to undertake; if it resulted in open data on LCSH-LCC 
correspondences, it would be useful to many.
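As a toy illustration of that correlation idea (all headings, classifications, and counts here are invented, not real corpus data): tally which classifications co-occur with each subject heading across a record corpus, then rank candidates by frequency.

```python
# Sketch of corpus-based LCSH/LCC correlation with invented sample data.
from collections import Counter, defaultdict

# Assumed input: (LCSH, LCC) pairs pulled from bib records.
pairs = [
    ("Anarchism", "HX833"),
    ("Anarchism", "HX833"),
    ("Anarchism", "JC571"),
    ("Disease management", "RA399.5"),
]

# heading -> Counter of classifications seen with it
correlations = defaultdict(Counter)
for lcsh, lcc in pairs:
    correlations[lcsh][lcc] += 1

# Likeliest LCC candidates for a given heading, best first.
print(correlations["Anarchism"].most_common())  # [('HX833', 2), ('JC571', 1)]
```

The real work would be in the "algorithmic tweaking": thresholds for low-frequency noise, handling pre-coordinated strings, and deciding how many candidates to surface.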


Jonathan

On 12/10/13 8:18 AM, Irina Arndt wrote:

Hi CODE4LIB,

we would like to add DDC classes to a bunch of MARC records, which contains 
only LoC Subject Headings.
Does anybody know, if a mapping between LCSH and DDC is anywhere existent (and 
available)?

I understood, that WebDewey http://www.oclc.org/dewey/versions/webdewey.en.html 
 might provide such a service, but

· we are no OCLC customers or subscribers to WebDewey

· even if we were, I'm not sure, if the service matches our needs

I'm thinking of a tool, where I can upload my list of subject headings and get 
back a list, where the matching Dewey classes have been added (but a 'simple' 
csv file with LCSH terms and DDC classes would be helpful as well- I am fully 
aware, that neither LCSH nor DDC are simple at all...) . Naïve idea...?

Thanks for any clues,
Irina


---

Irina Arndt
Max Planck Digital Library (MPDL)
Library System Coordinator
Amalienstr. 33
D-80799 Muenchen, Germany

Tel. +49 89 38602-254
Fax +49 89 38602-290

Email: ar...@mpdl.mpg.demailto:ar...@mpdl.mpg.de
http://www.mpdl.mpg.de




Re: [CODE4LIB] Mapping LCSH to DDC

2013-12-11 Thread Jonathan Rochkind
Ah right, it's Classification Web that has this. Alas, Classification Web 
is both not open (requires a subscription) and also, as far as I know, 
offers no machine API; it's purely manual human access.


This would definitely be an interesting project for someone to do to 
create a source of open data on LCSH/LCC correlations. Somewhat 
challenging, but definitely not 'rocket science' as they say. I think 
the data would be desirable to many.


Also, as I mentioned in another email, some but not all LCSH authority 
records already identify suggested corresponding LCCs; LCSH authority 
data can be obtained from id.loc.gov among other places.


On 12/10/13 5:02 PM, Bryan Baldus wrote:

On Tuesday, December 10, 2013 7:18 AM, Irina Arndt wrote:

we would like to add DDC classes to a bunch of MARC records, which contains 
only LoC Subject Headings. Does anybody know, if a mapping between LCSH and DDC 
is anywhere existent (and available)?

...

I'm thinking of a tool, where I can upload my list of subject headings and get 
back a list, where the matching Dewey classes have been added (but a 'simple' 
csv file with LCSH terms and DDC classes would be helpful as well- I am fully 
aware, that neither LCSH nor DDC are simple at all...) . Naïve idea...?


Classification Web offers a correlations feature between Dewey and the 1st 
LCSH, based on usage in LC's database (as well as correlations between LCC and 
LCSH, and DDC and LCC). It is of some use in helping the cataloger determine 
possible classifications or subject headings to use. Unfortunately, I don't 
believe ClassWeb is easily accessible by automated processes (even for 
subscribers). Even if it were, I doubt it is possible to automate a process of 
assigning Dewey based on 1st LCSH. As mentioned, the 1st LCSH and 
classification are generally supposed to be similar/linked, but that applies 
more to LCC/LCSH than DDC to LCSH, due to the way Dewey works. For example, 
ClassWeb correlation between LCSH "Disease management" (chosen while looking at 
"Health", then "Disease", then looking for an example showing a better variety of 
Deweys than the first two) shows DDCs used by LC (counts of records in parentheses):

Disease management [Topical]
  362.1 (4)
  610.285 (1)
  615.1 (1)
  615.5071 (1)
  616.89142 (1)



That said, as Ed mentioned, given a large set of records for training, you 
should be able to develop something to help local catalogers determine possible 
Deweys record-by-record.

I hope this helps,

Bryan Baldus
Senior Cataloger
Quality Books Inc.
The Best of America's Independent Presses
1-800-323-4241x402
bryan.bal...@quality-books.com




Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind
Yeah, I'm going to disagree a bit with the original post in this thread, 
and with Richard's contribution too. Or at least qualify it.


My experience is that folks trying to be pure and avoid an API do _not_ 
make it easier for me to consume as a developer writing clients. It's 
just not true that one always leads to the other.


The easiest APIs I have to deal with are those where the developers 
really understand the use cases clients are likely to have, and really 
make APIs that conveniently serve those use cases.


The most difficult APIs I have to deal with are those where the 
developers spent a lot of time thinking about very abstract and 
theoretical concerns of architectural purity, whether in terms of REST, 
linked data, HATEOAS, or, god forbid, all of those and more at once (and 
then realizing that sometimes they seem to conflict) -- and neglected to 
think about actual use cases and making them smooth.


Seriously, think about the most pleasant, efficient, and powerful APIs 
you have used (github's? Something else?). How many of them are 'pure' 
non-API APIs, and how many of them are actually APIs?


I'm going to call it an 'API' even if it does what the original post 
says; I'm using 'API' in the sense of how software is meant to deal 
with this -- in the base case, the so-called API can be screen-scraped 
HTML, okay.


I am going to agree that aligning the API with the user-visible web app 
as much as possible -- what the original post is saying you should 
always and only do -- does make sense.  But slavish devotion to avoiding 
any API distinct from the human web UI leads to theoretically pure but 
difficult-to-use APIs.


Sometimes the 'information architecture' that makes sense for humans 
differs from what makes sense for machine access. Sometimes the human UI 
needs lots of JS, which complicates things.  Even without this, an API 
which lets me choose representations based on different URIs instead of 
_only_ conneg (say, /widget/18.json instead of only /widget/18 with 
conneg) ends up being significantly easier to develop against and debug.
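As an illustration of that point, here is a minimal, framework-free sketch (the extension map and defaults are assumptions, not any particular product's API) of resolving a response format from a URL suffix first, with the Accept header only as a fallback:

```python
# Sketch: URL suffix wins over content negotiation, so /widget/18.json
# works in a plain browser with no special headers.
EXTENSIONS = {".json": "application/json", ".xml": "application/xml"}

def pick_format(path, accept_header=""):
    # 1. Explicit suffix on the URI is the easiest thing to debug.
    for ext, mime in EXTENSIONS.items():
        if path.endswith(ext):
            return mime
    # 2. Crude Accept parsing: take the first recognized media type.
    for part in accept_header.split(","):
        mime = part.split(";")[0].strip()
        if mime in EXTENSIONS.values():
            return mime
    # 3. Default to the human-facing representation.
    return "text/html"

print(pick_format("/widget/18.json"))                      # application/json
print(pick_format("/widget/18", "application/xml;q=0.9"))  # application/xml
```

A real conneg implementation also has to honor q-values, wildcards, and Vary headers for caching, which is exactly the complexity the suffix route sidesteps.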


Spend a bit of time understanding what people consider theoretically 
pure, sure, because it can give you more tools in your toolbox.  But 
simply slavishly sticking to it does not, in my experience, result in a 
good 'developer experience' for your developer clients.  And when you 
start realizing that different people from different schools have 
different ideas of what 'theoretically pure' looks like, when you start 
spending many hours going over httpRange-14 and just getting more 
confused -- realize that what matters in the end is being easy to use 
for your developers' use cases, and just do it.


Personally, I'd spend more time making sure I understand my developers' 
use cases and getting feedback from developers, and less time 
architecting castles in the sky that are theoretically pure.


On 12/2/13 9:56 AM, Bill Dueber wrote:

On Sun, Dec 1, 2013 at 7:57 PM, Barnes, Hugh hugh.bar...@lincoln.ac.nzwrote:


+1 to all of Richard's points here. Making something easier for you to
develop is no justification for making it harder to consume or deviating
from well supported standards.




I just want to point out that as much as we all really, *really* want
"easy to consume" and "following the standards" to be the same
thing...they're not. Correct content negotiation is one of those things
that often follows the phrase "all they have to do...", which is always a
red flag, as in "Why give the user different URLs when *all they have to
do is*..." Caching, json vs javascript vs jsonp, etc. all make this
harder. If all *I have to do* is know that all the consumers of my data
are going to do content negotiation right, and then I need to get deep into
the guts of my caching mechanism, then set up an environment where it's all
easy to test...well, it's harder.

And don't tell me how lazy I am until you invent a day with a lot more
hours. I'm sick of people telling me I'm lazy because I'm not pure. I
expose APIs (which have their own share of problems, of course) because I
want them to be *useful* and *used*.

   -Bill, apparently feeling a little bitter this morning -






Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind
There are plenty of non-free APIs that need some kind of access 
control. A different side discussion is which forms of access control are 
the least barrier to developers while still being secure (a lot of 
services mess this up in both directions!).


However, there are also some free APIs which still require API keys, 
perhaps because the owners want to track usage or throttle usage or what 
have you.


Sometimes you need to do that too, and you need to restrict access, so 
be it. But it is probably worth recognizing that you are sometimes 
adding barriers to successful client development here -- it seems like a 
trivial barrier from the perspective of the developers of the service, 
because they use the service so often. But to a client developer working 
with a dozen different APIs, the extra burden to get and deal with the 
API key and the access control mechanism can be non-trivial.


I think the best compromise is what Google ends up doing with many of 
their APIs: allow access without an API key, but with a fairly minimal 
number of accesses per time period allowed (a couple hundred a day is 
what I think Google often does). This allows the developer to evaluate 
the API, explore/debug the API in the browser, and write automated tests 
against the API, without worrying about API keys -- but still requires an 
API key for 'real' use, so the host can do whatever tracking or throttling 
they want.
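A hedged sketch of that tiered pattern -- the quota, the key store, and the in-memory counters are all placeholder assumptions (a real service would reset counts daily, persist them, and throttle keyed traffic per key too):

```python
# Tiered access: small anonymous quota for evaluation, keyed access for real use.
from collections import defaultdict

ANON_DAILY_LIMIT = 200
VALID_KEYS = {"abc123"}          # assumption: issued keys live here
anon_counts = defaultdict(int)   # ip -> anonymous requests today

def allow_request(ip, api_key=None):
    """Keyed callers always pass (and can be tracked/throttled per key);
    anonymous callers get a small daily quota for testing and debugging."""
    if api_key in VALID_KEYS:
        return True
    anon_counts[ip] += 1
    return anon_counts[ip] <= ANON_DAILY_LIMIT

# The 200 free anonymous calls succeed; the 201st is refused.
results = [allow_request("10.0.0.1") for _ in range(201)]
print(results.count(True), results[-1])  # 200 False
```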


Jonathan

On 12/2/13 12:18 PM, Ross Singer wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.

There may be alternative ways to implement that, but you can't just hand
wave away the rather large use case for API keys.

-Ross.


On Mon, Dec 2, 2013 at 12:15 PM, Kevin Ford k...@3windmills.com wrote:


Though I have some quibbles with Seth's post, I think it's worth drawing
attention to his repeatedly calling out API keys as a very significant
barrier to use, or at least entry.  Most of the posts here have given
little attention to the issue API keys present.  I can say that I have
quite often looked elsewhere or simply stopped pursuing my idea the moment
I discovered an API key was mandatory.

As for the presumed difficulty with implementing content negotiation (and,
especially, caching on top), it seems that if you can implement an entire
system to manage assignment of and access by API key, then I do not
understand how content negotiation and caching are significantly harder to
implement.

In any event, APIs and content negotiation are not mutually exclusive. One
should be able to use the HTTP URI to access multiple representations of
the resource without recourse to a custom API.

Yours,
Kevin





On 11/29/2013 02:44 PM, Robert Sanderson wrote:


(posted in the comments on the blog and reposted here for further
discussion, if interest)


While I couldn't agree more with the post's starting point -- URIs identify
(concepts) and use HTTP as your API -- I couldn't disagree more with the
"use content negotiation" conclusion.

I'm with Dan Cohen in his comment regarding using different URIs for
different representations for several reasons below.

It's harder to implement Content Negotiation than your own API, because
you
get to define your own API whereas you have to follow someone else's rules
when you implement conneg.  You can't get your own API wrong.  I agree
with
Ruben that HTTP is better than rolling your own proprietary API, we
disagree that conneg is the correct solution.  The choice is between
conneg
or regular HTTP, not conneg or a proprietary API.

Secondly, you need to look at the HTTP headers and parse quite a complex
structure to determine what is being requested.  You can't just put a file
in the file system, unlike with separate URIs for distinct representations
where it just works, instead you need server side processing.  This also
makes it much harder to cache the responses, as the cache needs to
determine whether or not the representation has changed -- the cache also
needs to parse the headers rather than just comparing URI and content. For
large-scale systems like DPLA and Europeana, caching is essential for
quality of service.

How do you find out which formats are supported by conneg? By reading the
documentation. Which could just say "add .json on the end." The Vary header
tells you that negotiation in the format dimension is possible, just not
what to do to actually get anything back. There isn't a way to find this
out from HTTP automatically, so now you need to read both the site's docs
AND the HTTP docs.  APIs can, on the other hand, do this.  Consider
OAI-PMH's ListMetadataFormats and SRU's Explain response.

Instead you can have a separate URI for each representation and link them
with Link headers, or just a simple rule like add '.json' on the end. No
need for complicated content negotiation at all.  Link headers can be
added
with a simple apache configuration rule, and as they're static are easy to
cache. So the 

Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind

I do frequently see API keys in headers; it is a frequent pattern.

Anything that requires things in the header, in my experience, makes the 
API more 'expensive' to develop against. I'm not sure it is okay to 
require headers.


Which is why I suggested allowing format specification in the URL, not 
just conneg headers. And is also, actually, why I expressed admiration 
for google's pattern of allowing X requests a day without an api key. 
Both things allow you to play with the api in a browser without headers.


If you are requiring a cryptographic signature (a la HMAC) for your 
access control, you can't feasibly play with it in a browser anyway; it 
doesn't matter whether it's supplied in headers or query params. And 
HMAC, inconvenient as it is, probably is the only actually secure way to 
do API access control, depending on what level of security is called for.


On 12/2/13 1:03 PM, Robert Sanderson wrote:

To be (more) controversial...

If it's okay to require headers, why can't API keys go in a header rather
than the URL.
Then it's just the same as content negotiation, it seems to me. You send a
header and get a different response from the same URI.

Rob



On Mon, Dec 2, 2013 at 10:57 AM, Edward Summers e...@pobox.com wrote:


On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.


A key (haha) thing that keys also provide is an opportunity to have a
conversation with the user of your API: who are they, how could you get in
touch with them, what are they doing with the API, what would they like to
do with the API, what doesn’t work? These questions are difficult to ask if
they are just an IP address in your access log.

//Ed






Re: [CODE4LIB] calibr: a simple opening hours calendar

2013-11-27 Thread Jonathan Rochkind
It's true, sometimes there's a reason for what seems weird that you 
don't know unless you ask. Other times, there's no good reason for it 
anymore, but nobody noticed because nobody asked.


But I'm actually suggesting something else too.

For instance, to take one of your examples: "Did a previous librarian 
have some regularly scheduled thing every Tuesday afternoon, and that's 
why one section closes down early on Tuesdays?"


What I am suggesting is that, even if that librarian isn't "previous" 
and the reason is still active -- there is a real cost to your users of 
having confusing hours.


And it's additive: at first it seems "oh, we'll close half an hour early 
Tuesday, what's the big deal." Then you add in "and open an hour early 
on Wednesday for some other reason, except during January," etc. etc.


I am suggesting that The Library (whoever makes decisions) should think 
carefully about whether idiosyncratic hours are really neccesary, or if 
there's a way to avoid it.


Maybe that librarian's regularly scheduled thing could have its 
schedule changed, or maybe a sub could be found. Making these changes to 
try and avoid idiosyncratic hours is going to be inconvenient and may 
have a resource cost.  Whether they are looked into or done depends on 
how much of a 'cost' decision-makers think there is to having 
idiosyncratic hours -- is it worth it to change that regularly scheduled 
thing? It depends on how bad it is to have idiosyncratic hours.


I am suggesting that many decision makers may be severely 
under-estimating the 'cost' to our effectiveness of having idiosyncratic 
hours.


Jonathan

On 11/27/13 1:36 PM, Joe Hourcle wrote:

On Nov 27, 2013, at 11:01 AM, Jonathan Rochkind wrote:


Many of our academic libraries have very byzantine 'hours' policies.

Developing UI that can express these sensibly is time-consuming and difficult; 
by doing a great job at it (like Sean has), you can make the byzantine hours 
logic a lot easier for users to understand... but you can still only do so much 
to make convoluted complicated library hours easy to deal with and understand 
for users.

If libraries can instead simplify their hours, it would make things a heck of a 
lot easier on our users. Synchronize the hours of the different parts of the 
library as much as possible. If some service points aren't open the full hours 
of the library, if you can make all those service points open the _same_ 
reduced hours, not each be different. Etc.

To some extent, working on hours displays to convey byzantine hours structures 
can turn into the familiar case of people looking for technological magic 
bullet solutions to what are in fact business and social problems.


I agree up to a point.

When I was at GWU, we were running what was the most customized
version of Banner (a software system for class registration, HR,
etc.)  Some of the changes were to deal with rules that no one
could come up with a good reason for, and they should have been
simplified.  Other ones were there for a legitimate reason.*

You should take these sorts of opportunities to ask *why* the
hours are so complicated, and either document the reason for it,
or look to simplify it.

Did a previous librarian have some regularly scheduled thing
every Tuesday afternoon, and that's why one section closes
down early on Tuesdays?  If they're not there anymore, you can
change that.

Does one station require some sort of shutdown / closing
procedure that takes a significant amount of time, so they
close early to be done by closing time?  Or do they open
late because they have a similar issue setting up in the
morning, and it's unrealistic to have them come in earlier
than everyone else?  Maybe there's something else that could
be done to improve and/or speed up the procedures.**

Has there been historically less demand for certain types of
books at different times of the day?  Well, that's going to be
hard to verify, as people have now adjusted to the library's
hours, rather than vice versa ... but it's a legitimate reason
to not keep service points open if no one's using them.

... but I would suggest that you don't use criteria like the
US Postal Service's recommendation to remove postboxes -- they
based it on number of pieces of mail, and ended up removing
them all in some areas.

...

Anyway, the point I'm making -- libraries are about service.
Simplification might make it easier to keep track of things,
but it doesn't necessarily make for better service.

-Joe

* Well, legitimate to someone, at least.  For instance, the
development office had a definition of alumni that included
donors who might not've actually attended the university.

** When I worked for the group that ran GW's computer labs,
some days I staffed a desk that we had over in the library ...
but I had to clock in at the main office, then walk over to the
other building, and once the shift was over, walk back to the
main office.

Re: [CODE4LIB] Tab delimited file with Python CSV

2013-11-25 Thread Jonathan Rochkind

Ah, but what if the data itself has tabs!  Doh!

It can be a mess either way.  There are standards (or conventions?) for 
escaping internal commas in CSV -- which doesn't mean the software that 
was used to produce the CSV, or the software you are using to read it, 
actually respects them.


But I'm not sure if there are even standards/conventions for escaping 
tabs in a tab-delimited text file?


Really, the lesson to me is that you should always use an 
existing well-tested library for both reading and writing these files, 
whether CSV or tab-delimited -- even if you think "Oh, it's so simple, 
why bother with that?"  There will be edge cases that you will discover 
only when they cause bugs, possibly after somewhat painful debugging. A 
well-used third-party library is less likely to have such edge case bugs.


I am more ruby than python; in ruby there is a library for reading and 
writing CSV in the stdlib. 
http://ruby-doc.org/stdlib-1.9.3/libdoc/csv/rdoc/CSV.html
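Those conventions are easy to see by round-tripping a row through a well-tested library. A minimal Python sketch (Python only because the thread's example code is Python; Ruby's stdlib CSV behaves equivalently):

```python
import csv
import io

# Round-trip a row whose fields contain the troublesome characters.
# The library quotes such fields and doubles embedded quotes (the
# usual CSV convention), so you never hand-roll escaping yourself.
row = ['He said, "hello"', "a,b", "has\ttab", "plain"]

buf = io.StringIO()
csv.writer(buf).writerow(row)
wire = buf.getvalue()  # the quoted/escaped serialized form

buf.seek(0)
restored = next(csv.reader(buf))
assert restored == row  # every field survives intact
```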


On 11/25/13 12:57 PM, Roy Tennant wrote:

Also, just to be clear, the data file is a tab-delimited text file, not a
CSV (comma-separated quoted values) file. Whenever processing data it's
important to be clear about what format you are working with. I happen to
prefer tab-delimited text files over CSV myself, as in this case like in
many others, the data itself can have quotes, which can play havoc on a
program expecting them only as delimiters.
Roy


On Mon, Nov 25, 2013 at 9:49 AM, Joshua Gomez gome...@usc.edu wrote:


If all you want to do is add a tab to the beginning of each line, then you
don't need to bother using the csv library.  Just open your file, read it
line by line, prepend a tab to each line and write it out again.

src = open('noid_refworks.txt','rU')
tgt = open('withid.txt', 'w')

for line in src.readlines():
    line = '\t%s' % line
    tgt.write(line)

-Joshua


From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Bohyun
Kim k...@fiu.edu
Sent: Monday, November 25, 2013 9:10 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Tab delimited file with Python CSV

Hi all,

I am new to Python and was wondering if I can get some help with my short
script. What I would like the script to do is:
(1) Read the tab delimited file generated by Refworks
(2) Output exactly the same file but the blank column added in front.
(This is for prepping the exported tab delimited file from refworks so
that it can be imported into MySQL; so any suggestions in the line of
timtoady would be also appreciated.)

This is what I have so far. It works, but then in the output file, I end
up getting some weird character in each line in the second column (first
column in the original input file). I also don't really get what
escapechar=' ' does or what I am supposed to put in there.

import csv
with open('noid_refworks.txt','rU') as csvinput:
    with open('withid.txt', 'w') as csvoutput:
        dialect = csv.Sniffer().sniff(csvinput.read(1024))
        csvinput.seek(0)
        reader = csv.reader(csvinput, dialect)
        writer = csv.writer(csvoutput, dialect, escapechar='\'',
                            quoting=csv.QUOTE_NONE)
        for row in reader:
            writer.writerow(['\t']+row)

A row in the original file is like this (Tab delimited and no quotations,
some fields have commas and quotation marks inside.):

Reference TypeAuthors, PrimaryTitle PrimaryPeriodical Full
  Periodical AbbrevPub YearPub Date Free FromVolumeIssue
  Start PageOther PagesKeywordsAbstractNotesPersonal
NotesAuthors, SecondaryTitle SecondaryEditionPublisher
  Place Of PublicationAuthors, TertiaryAuthors, Quaternary
  Authors, QuinaryTitle, TertiaryISSN/ISBNAvailability
  Author/AddressAccession NumberLanguageClassificationSub
file/DatabaseOriginal Foreign TitleLinksDOICall Number
  DatabaseData SourceIdentifying PhraseRetrieved Date
  Shortened TitleUser 1User 2User 3User 4User 5User
6User 7User 8User 9User 10User 11User 12User 13
User 14User 15

A row in the output file is like this:
(The tab is successfully inserted. But I don't get why I have L inserted
after no matter what I put in escapechar)

 LReference TypeAuthors, PrimaryTitle PrimaryPeriodical
FullPeriodical AbbrevPub YearPub Date Free FromVolume
  IssueStart PageOther PagesKeywordsAbstractNotes
  Personal NotesAuthors, SecondaryTitle SecondaryEdition
  PublisherPlace Of PublicationAuthors, TertiaryAuthors,
QuaternaryAuthors, QuinaryTitle, TertiaryISSN/ISBN
  AvailabilityAuthor/AddressAccession NumberLanguage
  ClassificationSub file/DatabaseOriginal Foreign TitleLinks
  DOICall NumberDatabaseData SourceIdentifying Phrase
  Retrieved DateShortened 
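For what it's worth, here is a sketch of one way to do the prepend in Python 3 (`prepend_blank_column` is my own name, and `io.StringIO` stands in for the thread's real files). Prepending an empty field -- `''` rather than `'\t'`, which creates a field *containing* a tab -- plus an explicit tab dialect instead of the Sniffer avoids stray escape characters:

```python
import csv
import io

def prepend_blank_column(src, tgt):
    # Read tab-delimited rows and write them back with one empty
    # field prepended. QUOTE_NONE matches Refworks exports, which
    # use no quoting; escapechar is required by the writer for the
    # (unlikely) case of a tab appearing inside a field.
    reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
    writer = csv.writer(tgt, delimiter="\t", quoting=csv.QUOTE_NONE,
                        escapechar="\\", lineterminator="\n")
    for row in reader:
        writer.writerow([""] + row)  # "" = the new blank column

src = io.StringIO("Reference Type\tAuthors, Primary\tTitle Primary\n")
tgt = io.StringIO()
prepend_blank_column(src, tgt)
# tgt now starts with a tab: one empty column, then the original row
```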

Re: [CODE4LIB] Tab delimited file with Python CSV

2013-11-25 Thread Jonathan Rochkind

On 11/25/13 1:38 PM, Joe Hourcle wrote:

On Nov 25, 2013, at 1:05 PM, Jonathan Rochkind wrote:


Ah, but what if the data itself has tabs!  Doh!

It can be a mess either way.  There are standards (or conventions?)
for escaping internal commas in CSV -- which doesn't mean the
software that was used to produce the CSV, or the software you are
using to read it, actually respects them.


You don't have to escape the commas, you just have to double-quote
the string.  If you want to have a double quote, you put two in a
row, e.g.:

"He said, ""hello"""


Right, I would call that a form of escaping.

I can stay blissfully ignorant of what form of escaping is required by
CSV and if I'm doing it right by just using a library, heh.

Out of curiosity, let's see what the ruby stdlib csv parser/writer 
writes for various things.


Yep, just doublequotes anything with internal commas or quotes.

###
ordinary,row,with a value with spaces
a row with,"several, internal, commas"
or even,"internal ""quotes"", as the kids say"


Re: [CODE4LIB] Code4lib 2014 Diversity Scholarships: Call for Applications

2013-11-25 Thread Jonathan Rochkind

Finances are a limiting factor on conference attendance for people of all
demographic groups, and I would endorse plans to surmount that.


Code4Lib is, of course, one of the least expensive conferences you'll 
find. And the community and organizers care a lot about keeping it so -- 
there are sometimes disputes in a given year about whether the 
organizers could have kept it even less expensive. But it's still, every 
year, one of the most affordable conferences around.


Which is pretty darn awesome, and important.

That's pretty much what we do: try to increase financial accessibility 
for people of all demographic groups. We also try to switch the regional 
location around the country every year, to even out transportation costs 
for people in different parts of the country.


If you can afford to go to any conference at all, you can afford for 
Code4Lib to be that conference. Of course, there are people who can't 
afford to go to any conference.  Which is unfortunate. But I'm not sure 
what, if anything, is being suggested we could do about that?


If you have or can find a source of funding willing to pay registration, 
hotel, and transportation for anyone who can't afford it, then please 
feel free to organize it to happen.


That's what the people who organized, and continue to organize, the 
diversity scholarships did. They just organized it.


Jonathan


Re: [CODE4LIB] ruby-marc api design feedback wanted

2013-11-20 Thread Jonathan Rochkind
I am not sure how you ran into this problem on Monday with ruby-marc, 
since ruby-marc doesn't currently handle Marc8 conversion to UTF-8 at 
all -- how could you have run into a problem with Marc8 to UTF8 
conversion?  But that is what I am adding.


But yeah, using a preprocessor is certainly one option, that will not be 
taken away from people. Although hopefully adding Marc8-UTF8 conversion 
to ruby-marc might remove the need for a preprocessor in many cases.


So again, we have a bit of a paradox, that I have in my own head too. 
Scott suggests that "In either case, what we DON'T want is to halt the 
processing altogether."  And yet, still, that the default behavior 
should be raising an exception -- that is, halting processing 
altogether, right?


So hardly anyone hardly ever is going to want the default behavior, but 
everyone thinks it should be default anyway, to force people to realize 
what they're doing? I am not entirely objecting to that -- it's why I 
brought it up here, but it does seem odd, doesn't it?  To say something 
should be default that hardly anyone hardly ever will want?
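(For what it's worth, this strict-by-default versus recover-by-default tension shows up elsewhere too. As an analogy only -- not the ruby-marc API -- Python's bytes.decode makes strict the default and replacement an opt-in:)

```python
data = b"Caf\xe9"  # 0xE9 is a latin-1 byte, illegal as UTF-8

halted = False
try:
    data.decode("utf-8")  # default errors='strict': raises, halting processing
except UnicodeDecodeError:
    halted = True

# opt-in recovery: the bad byte becomes U+FFFD and processing continues
text = data.decode("utf-8", errors="replace")
# text == 'Caf\ufffd'
```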



On 11/20/13 10:10 AM, Scott Prater wrote:

We run into this problem fairly regularly, and in fact, ran into it on
Monday with ruby-marc.

The way we've traditionally handled it is to put our marc stream through
a cleanup preprocessor before passing it off to a marc parser (ruby marc
or marc4j).

The preprocessor can do one of two things:

   1)  Skip the bad record in the marc stream and move on; or
   2)  Substitute the bad characters with some default character, and
write it out.

In both cases we log the error as a warning, and include a byte offset
where the bad character occurs, and the record ID, if possible.  This
allows us to go back and fix the errors in a stream in a batch;
generally, the bad encoding errors fall into four or five common errors
(cutting and pasting data from Windows is a typical cause).

In either case, what we DON'T want is to halt the processing altogether.
  Generally, we're dealing with thousands, sometimes millions, of MARC
records in a stream;  it's very frustrating to get halfway through the
stream, then have the parser throw an exception and halt.  Halting the
processing should be the strategy of last resort, to be called only when
the stream has become so corrupted you can't go on to the next record.

I'd want the default to be option 1.  Let the user determine what
changes need to be made to the data;  the parser's job is to parse, not
infer and create.  Overwriting data could also lead to the misperception
that everything is okay, when it really isn't.

-- Scott

On 11/20/2013 08:32 AM, Jon Stroop wrote:

Coming from nowhere on this...is there a place where it would be
convenient to flag which behavior the user (of the library) wants? I
think you're correct that most of the time you'd just want to blow
through it (or replace it), but for the situation where this isn't the
case, I think the Right Thing to do is raise the exception. I don't
think you would want to bury it in some assumption made internal to the
library unless that assumption can be turned off.

-Jon


On 11/19/2013 07:51 PM, Jonathan Rochkind wrote:

ruby-marc users, a question.

I am working on some Marc8 to UTF-8 conversion for ruby-marc.

Sometimes, what appears to be an illegal byte will appear in the Marc8
input, and it can not be converted to UTF8.

The software will support two alternatives when this happens: 1)
Raising an exception. 2) Replacing the illegal byte with a replacement
char and/or omitting it.

I feel like most of the time, users are going to want #2.  I know
that's what I'm going to want nearly all the time.

Yet, still, I am feeling uncertain whether that should be the default.
Which should be the default behavior, #1 or #2?  If most people most
of the time are going to want #2 (is this true?), then should that be
the default behavior?   Or should #1 still be the default behavior,
because by default bad input should raise, not be silently recovered
from, even though most people most of the time won't want that, heh.

Jonathan





Re: [CODE4LIB] ruby-marc api design feedback wanted

2013-11-20 Thread Jonathan Rochkind
 think the Right Thing to do is raise the exception. I don't
think you would want to bury it in some assumption made internal to the
library unless that assumption can be turned off.

-Jon


On 11/19/2013 07:51 PM, Jonathan Rochkind wrote:

ruby-marc users, a question.

I am working on some Marc8 to UTF-8 conversion for ruby-marc.

Sometimes, what appears to be an illegal byte will appear in the Marc8
input, and it can not be converted to UTF8.

The software will support two alternatives when this happens: 1)
Raising an exception. 2) Replacing the illegal byte with a replacement
char and/or omitting it.

I feel like most of the time, users are going to want #2.  I know
that's what I'm going to want nearly all the time.

Yet, still, I am feeling uncertain whether that should be the default.
Which should be the default behavior, #1 or #2?  If most people most
of the time are going to want #2 (is this true?), then should that be
the default behavior?   Or should #1 still be the default behavior,
because by default bad input should raise, not be silently recovered
from, even though most people most of the time won't want that, heh.

Jonathan








Re: [CODE4LIB] ruby-marc api design feedback wanted

2013-11-20 Thread Jonathan Rochkind

On 11/20/13 11:40 AM, Scott Prater wrote:

Not sure what the details of our issue was on Monday -- but we do have
records that are supposedly encoded in UTF-8, but nonetheless contain
invalid characters.


Oh, and I'd clarify, if you haven't figured it out already, if those are 
ISO 2709 binary records, you can ask the reader to do different things 
there in that case (already avail in current ruby-marc release):


# raise:
MARC::Reader.new("something.marc", :validate_encoding => true)

# replace with unicode replacement char:
MARC::Reader.new("something.marc", :invalid => :replace)

This is already available in present ruby-marc release.

I would suggest one or the other -- the default of leaving bad bytes in 
your ruby strings is asking for trouble, and you probably don't want to 
do it, but was made the default for backwards compat reasons with older 
versions of ruby-marc. (See why I am reluctant to add another default 
that we don't think hardly anyone would actually want? :) )


Oh, and you may also want to explicitly specify the expected encoding to 
avoid confusion:


MARC::Reader.new("something.marc", :external_encoding => "UTF-8", 
:validate_encoding => true)


(It will also work with any other encoding recognized by ruby, for those 
with legacy, possibly international, data).


This stuff is confusing to explain, there are so many permutations and 
combinations of circumstances involved.  But I'll try to improve the 
ruby-marc docs on this stuff, as part of adding the yet more options for 
MARC8 handling.


Re: [CODE4LIB] ruby-marc api design feedback wanted

2013-11-20 Thread Jonathan Rochkind

On 11/20/13 12:51 PM, Scott Prater wrote:

I think the issue comes down to a distinction between a stream and a
record.  Ideally, the ruby-marc library would keep pointers to which
record it is in, where the record begins, and where the record ends in
the stream.  If a valid header and end-of-record delimiter are in place,
then the library should be able to reject the record if it contains
garbage in between those two points, without compromising the integrity
of the entire stream.


I understand what you're saying, and why it's attractive. I am not sure 
if ruby-marc can do that right now. I am not personally interested in 
adding that at this particular time -- I just spent a couple days adding 
Marc8 support in the first place, and that's enough for me for now.  I 
was just soliciting some feedback on a point I wasn't sure about with 
the new MARC8 api, honestly.


But pull requests are always welcome!  Also feel free to check out 
ruby-marc to see if it accommodates your desired usage already or not, 
and let us know, even without a pull request!


If you (or anyone) are interested in checking out the MARC8 support 
added to ruby-marc, it's currently in a branch not yet merged in or 
released, but probably will be soon.


https://github.com/ruby-marc/ruby-marc/tree/marc8
https://github.com/ruby-marc/ruby-marc/pull/23


[CODE4LIB] ruby-marc api design feedback wanted

2013-11-19 Thread Jonathan Rochkind

ruby-marc users, a question.

I am working on some Marc8 to UTF-8 conversion for ruby-marc.

Sometimes, what appears to be an illegal byte will appear in the Marc8 
input, and it can not be converted to UTF8.


The software will support two alternatives when this happens: 1) Raising 
an exception. 2) Replacing the illegal byte with a replacement char 
and/or omitting it.


I feel like most of the time, users are going to want #2.  I know that's 
what I'm going to want nearly all the time.


Yet, still, I am feeling uncertain whether that should be the default. 
Which should be the default behavior, #1 or #2?  If most people most of 
the time are going to want #2 (is this true?), then should that be the 
default behavior?   Or should #1 still be the default behavior, because 
by default bad input should raise, not be silently recovered from, even 
though most people most of the time won't want that, heh.


Jonathan


Re: [CODE4LIB] yaz-client

2013-11-14 Thread Jonathan Rochkind
When we have patrons that try to download tens or hundreds of thousands 
of pages -- not uncommonly, the vendor has software that notices the 
'excessive' use, sends us an email reminding us that bulk downloading 
violates our terms of service, and temporarily blacklists the IP address 
(which could become more of a problem as we move to NAT/PAT where 
everyone appears to the external internet as one of only a few external 
IPs).


Granted, these users are usually downloading actual PDFs, not just 
citations.  I'm not really sure if when they are doing it for personal 
research of some kind, or when they are doing it to share with off-shore 
'pirate research paper' facilities (I'm not even making that up), but 
the volume of use that triggers the vendors notices is such that it's 
definitely an automated process of some kind, not just someone clicking 
a lot.


Bulk downloading from our content vendors is usually prohibited by their 
terms of service. So, beware.


On 11/14/13 10:30 AM, Eric Lease Morgan wrote:

Thank you for the replies, and after a bit of investigation I learned that I 
don’t need to do authentication because the vendor does IP authentication. 
Nice! On the other hand, I was still not able to resolve my original problem.

I needed/wanted to download tens of thousands, if not hundreds of thousands, of 
citations for text mining analysis. The Web interface to the database/index limits 
output to 4,000 items and selecting the set of these items is beyond tedious — it 
is cruel and unusual punishment. I then got the idea of using EndNote’s z39.50 
client, and after a bit of back & forth I got it working, but the downloading 
process was too slow. I then got the bright idea of writing my own z39.50 client 
(below). Unfortunately, I learned that the 4,000 record limit is more than that. A 
person can only download the first 4,000 records in a found set. Requests for 
record 4001, 4002, etc. fail. This is true in my locally written client as well as 
in EndNote.

Alas, it looks as if I am unable to download the data I need/require, unless 
somebody at the vendor gives me a data dump. On the other hand, since my locally 
written client is so short and simple, I think I can create a Web-based 
interface to query many different z39.50 targets and provide on-the-fly text 
mining analysis against the results.

In short, I learned a great many things.

—
Eric Lease Morgan
University of Notre Dame


#!/usr/bin/perl

# nytimes-search.pl - rudimentary z39.50 client to query the NY Times

# Eric Lease Morgan emor...@nd.edu
# November 13, 2013 - first cut; Happy Birthday, Steve!

# usage: ./nytimes-search.pl > nytimes.marc


# configure
use constant DB     => 'hnpnewyorktimes';
use constant HOST   => 'fedsearch.proquest.com';
use constant PORT   => 210;
use constant QUERY  => '@attr 1=1016 trade or tariff';
use constant SYNTAX => 'usmarc';

# require
use strict;
use ZOOM;

# do the work
eval {

    # connect; configure; search
    my $conn = new ZOOM::Connection( HOST, PORT, databaseName => DB );
    $conn->option( preferredRecordSyntax => SYNTAX );
    my $rs = $conn->search_pqf( QUERY );

    # requests > 4000 return errors
    # print $rs->record( 4001 )->raw;

    # retrieve; will break at record 4,000 because of vendor limitations
    for my $i ( 0 .. $rs->size - 1 ) {

        print STDERR "\tRetrieving record #$i\r";
        print $rs->record( $i )->raw;

    }

};

# report errors
if ( $@ ) { print STDERR "Error ", $@->code, ": ", $@->message, "\n" }

# done
exit;




[CODE4LIB] anyone know how to properly do a marc4j release?

2013-11-13 Thread Jonathan Rochkind

I am a committer, but I have no idea how to do a marc4j release.

There are some fixes in master repo for marc4j. I can find all the parts 
of the source code that seem to have a version number and change them. I 
can make a git tag with the version number.


But what else is entailed, how do people actually get marc4j?  I need to 
update maven repo somehow or something? Anyone?


Re: [CODE4LIB] anyone know how to properly do a marc4j release?

2013-11-13 Thread Jonathan Rochkind

Okay, thanks Bob.

I guess for the moment I might just keep using Marc4J by building it 
myself from master without doing an official release.


I think that's probably better than halfway doing a release, like 
tagging it in the repo with a release tag, but without doing a proper 
maven release, just confusing everyone.


If anyone wants to take on a bit of release management for Marc4J -- 
sounds like we could use you!  Ie, figuring out how to do it, 
documenting it in the marc4j repo, making any changes to Marc4J source 
repo to make it easier, etc.  Tod Olsen tells me another part of this 
puzzle is making sure the javadocs get re-generated (to where? I have no 
idea!)


Anyone want to help? A bunch of people use marc4j, often through 
downstream dependencies, you'd have many thanks!


Jonathan

On 11/13/13 5:15 PM, Robert Haschart wrote:

I believe that is one of the open issues for Marc4j.   I do not know how
to push a jar or a new version of a jar to a Maven repo.
I believe Bill Dueber was looking into this just last month when he
wrote the following to the Solrmarc list:

I'm trying to get marc4j into maven central, and I don't know who
owns the domain. If it's one of us, then we can use it. If not,
well, I'm not sure what we do (except maybe use the github location?)

--
Bill Dueber
Library Systems Programmer
University of Michigan Library

The last release I did was to merely create the jar in the releases
sub-directory, and reference it in the README.textile file. That
emulates the way the releases had been done from the tigris.org site,
but its not the right way to do a release.

-Bob Haschart

On 11/13/2013 4:43 PM, Jonathan Rochkind wrote:

I am a committer, but I have no idea how to do a marc4j release.

There are some fixes in master repo for marc4j. I can find all the
parts of the source code that seem to have a version number and change
them. I can make a git tag with the version number.

But what else is entailed, how do people actually get marc4j?  I need
to update maven repo somehow or something? Anyone?




Re: [CODE4LIB] anyone know how to properly do a marc4j release?

2013-11-13 Thread Jonathan Rochkind

Aha, Kevin!

I'm not sure, would we?  Is that your advice?  Do you have any interest 
in taking this on?


There's possibly no current marc4j committers who understand how it's 
set up now, it's kind of just grown under various people's stewardship, 
I think it's possible nobody has strong opinions as long as it works and 
doens't make marc4j any harder to work with for developers.


(If someone _does_ have understanding and/or strong feelings about how 
marc4j source code is set up (maybe Bob?) then definitely correct me! 
And maybe work with Kevin on figuring out how to do a release?)


Otherwise, Kevin, you interested in getting committer privs and figuring 
out what needs to be done?


On 11/13/13 5:23 PM, Kevin S. Clarke wrote:

I have experience pushing projects into Maven's central repo through
Sonatype.  Maven has a standard structure (that you don't have to use, but
it makes things easier/more-Maven-ish).  Would you want the project
reorganized into that structure in the process?

Kevin



On Wed, Nov 13, 2013 at 5:15 PM, Robert Haschart rh...@virginia.edu wrote:


I believe that is one of the open issues for Marc4j.   I do not know how
to push a jar or a new version of a jar to a Maven repo.
I believe Bill Dueber was looking into this just last month when he wrote
the following to the Solrmarc list:

I'm trying to get marc4j into maven central, and I don't know who
owns the domain. If it's one of us, then we can use it. If not,
well, I'm not sure what we do (except maybe use the github location?)

--Bill Dueber
Library Systems Programmer
University of Michigan Library

The last release I did was to merely create the jar in the releases
sub-directory, and reference it in the README.textile file. That emulates
the way the releases had been done from the tigris.org site, but its not
the right way to do a release.

-Bob Haschart


On 11/13/2013 4:43 PM, Jonathan Rochkind wrote:


I am a committer, but I have no idea how to do a marc4j release.

There are some fixes in master repo for marc4j. I can find all the parts
of the source code that seem to have a version number and change them. I
can make a git tag with the version number.

But what else is entailed, how do people actually get marc4j?  I need to
update maven repo somehow or something? Anyone?








Re: [CODE4LIB] anyone know how to properly do a marc4j release?

2013-11-13 Thread Jonathan Rochkind
That would be super awesome if you wanted to do that, and see if you can 
come up with something that Bob is okay with, but makes it possible for 
us to actually do releases to maven, so people expecting to find 
releases there can find them there.


I'm not sure I, or any of the other committers but Bob, understand any 
of that workflow either. I know that right now it is (thankfully) in a 
state where you can use ant to succesfully build it to a .jar, and use 
ant to run tests. Hooray. But yeah, there are, I had noticed, some 
generated .java files that are not in the source repo, but are in fact 
generated by the build process.


On 11/13/13 6:21 PM, Kevin S. Clarke wrote:

On Wed, Nov 13, 2013 at 5:52 PM, Jonathan Rochkind rochk...@jhu.edu
mailto:rochk...@jhu.edu wrote:


I'm not sure, would we?  Is that your advice?


That would be my advice, yes, but I understand Robert's perspective.
  People have strong feelings one way or the other about Maven.

Otherwise, Kevin, you interested in getting committer privs and
figuring out what needs to be done?


Maybe I should just tinker in my own fork of it for a bit and see what
comes from it... I'll need to understand the workflow Robert is
describing a bit more.

Kevin


[CODE4LIB] a note on MARC8 to UTF8 transcoding: Character references

2013-11-05 Thread Jonathan Rochkind
Do you do sometimes deal with MARC in the MARC8 character encoding?  Do 
you deal with software that converts from MARC8 to UTF8?


Maybe sometimes you've seen weird escape sequences that look like HTML 
or XML character references, like, say, &#x200F;.


You, like me, might wonder what the heck that is about -- is it 
cataloger error, a cataloger manually entered this or something in 
error? Is it a software error, some software accidentally stuck this in 
at some part in the pipeline?


You can't, after all, just put HTML/XML character references wherever 
you want -- there's no reason &#x200F; would mean anything other than 
&, #, x, 2, etc., when embedded in MARC ISO 2709 binary, right?


Wrong, it turns out!

There is actually a standard that says you _can_ embed XML/HTML-style 
character references in MARC8, for glyphs that can't otherwise be 
represented in MARC8: "Lossless conversion [from unicode] to MARC-8 
encoding."


http://www.loc.gov/marc/specifications/speccharconversion.html#lossless

Phew, who knew?!

Software that converts from MARC8 to UTF-8 may or may not properly 
un-escape these character references though. For instance, the Marc4J 
AnselToUnicode class, which converts from Marc8 to UTF8 (or other 
unicode serializations), won't touch these "lossless conversions" (i.e., 
HTML/XML character references); it leaves them alone in the output, 
as is.


yaz-marcdump also will NOT un-escape these entities when converting from 
Marc8 to UTF8.


So, then, the system you then import your UTF8 records into will now 
just display the literal HTML/XML-style character reference; it won't 
know to un-escape them either, since those literals in UTF8 really _do_ 
just mean & followed by a # followed by an x, etc. It only means 
something special as a literal in HTML, or in XML -- or, it turns out, in 
MARC8, as a "lossless character conversion".


So, for instance, in my own Traject software that uses Marc4J to convert 
from Marc8 to UTF8 -- I'm going to have to go add another pass, that 
converts HTML/XML-character entities to actual UTF8 serializations.  Phew.


So be warned, you may need to add this to your software too.
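A sketch of the shape of that extra pass in Python (the thread's actual tools are Ruby and Java; `unescape_ncrs` is my own name, not an API from any of them): replace each numeric character reference with the code point it names.

```python
import re

def unescape_ncrs(s):
    # Replace XML/HTML-style numeric character references,
    # e.g. &#x200F; (hex) or &#8207; (decimal), with the character.
    def repl(m):
        body = m.group(1)
        cp = int(body[1:], 16) if body[0] in "xX" else int(body)
        return chr(cp)
    return re.sub(r"&#([xX][0-9a-fA-F]+|[0-9]+);", repl, s)

unescape_ncrs("Tel Aviv&#x200F;")  # trailing U+200F right-to-left mark
```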


Re: [CODE4LIB] ANNOUNCEMENT: Traject MARC-Solr indexer release

2013-10-15 Thread Jonathan Rochkind
Yep, what Bill said, I have had thoughts of extending it to other types 
of input too, it was part of my original design goals.


In particular, I was thinking of extending it to arbitrary XML.

Unlike MARC, there are many other options for indexing XML into Solr 
(assuming that's your end goal), so you may or may not find traject to 
be better than those, although for myself there might be some benefit in 
using the same tool accross formats too.


There are a number of built-in 'macros' that are MARC-specific; you 
wouldn't use those. And might need some others that are, say, 
XML-specific. (Probably just a single one, extract_xpath, for XML).


Same could be done for MODS, sure -- or you could handle MODS with a 
(hypothetical) generic XML setup.


But yeah, if you want to take input records, and transform them into 
hash-like data structures -- I was thinking from the start of 
structuring traject to support such use cases, yep. (If you want to go 
to something other than a hash-like data structure, well, it might still 
be possible, but it's straying from traject's target a bit more).


[Oh, and I just made up 'traject'. I was looking for a word (made up or 
real) not already being used for any popular software, and thinking 
about 'projections' in the sense of mathematical transformations; and 
about 'trajectory' in the sense of things sent through outer space, with 
the Solr/Solar connection. I actually had originally decided to call it 
"transject", but then accidentally wrote "traject" when I created the 
github project, and then figured that was easier to pronounce and write 
anyhow.]


On 10/15/13 1:02 PM, Bill Dueber wrote:

'traject' means to transmit (e.g., trajectory) -- or at least it did,
when people still used it, which they don't.

The traject workflow is incredibly general: *a reader* sends *a record* to *an
indexing routine* which stuffs...stuff...into a context object which is
then sent to *a writer*. We have a few different MARC readers, a few useful
writers (one of which, obviously, is the solr writer), and a bunch of
shipped routines (which we're calling macros but which are just well-formed
ruby lambdas or blocks) for extracting and transforming common MARC data.

[see
http://robotlibrarian.billdueber.com/announcing-traject-indexing-software/
for more explanation and some examples]

But there's no reason why a reader couldn't produce a MODS record which
would then be worked on. I'm already imagining readers and writers that
target databases (RDBMS or NoSQL), or a queueing system like Hornet, etc.

If there are people at Stanford that want to talk about how (easy it is) to
extend traject, I'd be happy to have that conversation.



On Tue, Oct 15, 2013 at 12:28 PM, Tom Cramer tcra...@stanford.edu wrote:


++ Jonathan and Bill.

1.) Do you have any thoughts on extending traject to index other types of
data--say MODS--into solr, in the future?

2.) What's the etymology of 'traject'?

- Tom


On Oct 14, 2013, at 8:53 AM, Jonathan Rochkind wrote:


Jonathan Rochkind (Johns Hopkins) and Bill Dueber (University of 
Michigan), are happy to announce a robust, feature-complete beta release of 
traject, a tool for indexing MARC data to Solr.

traject, in the vein of solrmarc, allows you to define your indexing 
rules using simple macro and translation files. However, traject runs under 
JRuby and is ruby all the way down, so you can easily provide additional 
logic by simply requiring ruby files.

There's a sample configuration file to give you a feel for traject[1].

You can view the code[2] on github, and easily install it as a (jruby) 
gem using gem install traject.

traject is in a beta release hoping for feedback from more testers prior 
to a 1.0.0 release, but it is already being used in production to generate 
the HathiTrust (metadata-lookup) Catalog (http://www.hathitrust.org/). 
traject was developed using a test-driven approach and has undergone both 
continuous integration and an extensive benchmarking/profiling period to 
keep it fast. It is also well covered by high-quality documentation.

Feedback is very welcome on all aspects of traject including 
documentation, ease of getting started, features, any problems you have, 
etc.

What we think makes traject great:

* It's all just well-crafted and documented ruby code; easy to program, 
easy to read, easy to modify (the whole code base is only 6400 lines of 
code, more than a third of which is tests)
* Fast. Traject by default indexes using multiple threads, so you can 
use all your cores!
* Decoupled from specific readers/writers, so you can use ruby-marc or 
marc4j to read, and write to solr, a debug file, or anywhere else you'd 
like with little extra code.
* Designed so it's easy to test your own code and distribute it as a gem

We're hoping to build up an ecosystem around traject and encourage 
people to ask questions and contribute code (either directly to the project 
or via releasing plug-in gems).

[1] https://github.com/traject

[CODE4LIB] ANNOUNCEMENT: Traject MARC-Solr indexer release

2013-10-14 Thread Jonathan Rochkind
Jonathan Rochkind (Johns Hopkins) and Bill Dueber (University of 
Michigan), are happy to announce a robust, feature-complete beta release 
of traject, a tool for indexing MARC data to Solr.


traject, in the vein of solrmarc, allows you to define your indexing 
rules using simple macro and translation files. However, traject runs 
under JRuby and is ruby all the way down, so you can easily provide 
additional logic by simply requiring ruby files.


There's a sample configuration file to give you a feel for traject[1].

You can view the code[2] on github, and easily install it as a (jruby) 
gem using gem install traject.
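For a feel of what those indexing rules look like, here is a two-line sketch in the style of a traject configuration file. extract_marc is one of traject's shipped macros; the output field names and MARC specs chosen here are just illustrative (the linked sample configuration[1] is a real, complete example):

```ruby
# Sketch in the style of a traject configuration file.
to_field "id",    extract_marc("001", first: true)
to_field "title", extract_marc("245abk")
```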


traject is in a beta release hoping for feedback from more testers prior 
to a 1.0.0 release, but it is already being used in production to 
generate the HathiTrust (metadata-lookup) Catalog 
(http://www.hathitrust.org/). traject was developed using a test-driven 
approach and has undergone both continuous integration and an extensive 
benchmarking/profiling period to keep it fast. It is also well covered 
by high-quality documentation.


Feedback is very welcome on all aspects of traject including 
documentation, ease of getting started, features, any problems you have, 
etc.


What we think makes traject great:

* It's all just well-crafted and documented ruby code; easy to program, 
easy to read, easy to modify (the whole code base is only 6400 lines of 
code, more than a third of which is tests)
* Fast. Traject by default indexes using multiple threads, so you can 
use all your cores!
* Decoupled from specific readers/writers, so you can use ruby-marc or 
marc4j to read, and write to solr, a debug file, or anywhere else you'd 
like with little extra code.

* Designed so it's easy to test your own code and distribute it as a gem

We're hoping to build up an ecosystem around traject and encourage 
people to ask questions and contribute code (either directly to the 
project or via releasing plug-in gems).


[1] 
https://github.com/traject-project/traject/blob/master/test/test_support/demo_config.rb

[2] http://github.com/traject-project/traject


Re: [CODE4LIB] Ruby on Windows

2013-10-01 Thread Jonathan Rochkind
So, when my desktop workstation was Windows, I developed ruby by actually 
running it on a separate box which was a linux box. I'd just ssh in for a 
command line, and I used ExpanDrive[1] to mount the linux box's file system as 
a G: drive on Windows, so I could still edit files there with the text 
editor of my choice. 




So it barely mattered that it was a separate machine, right?  Even if it had 
somehow been on my local machine, I'd still be opening up some kind of shell 
(whether CMD.exe or more likely some kind of Cygwin thing) to start up my app 
or run the automated tests etc.  It's a window with a command line in it, what 
does it matter if it's actually running things on my local machine, or is a 
putty window to a linux machine?




So, if you don't have a separate linux machine available, you might be able to 
do something very similar using VirtualBox[2] to run a linux machine in a VM on 
your windows machine.  With VirtualBox, you can share file systems so you can 
just open up files 'in' your linux VM on your Windows machine. There's probably 
a way to ssh into the local linux VM, from the Windows host, even if the linux 
VM doesn't have its own externally available IP address.  




It would end up being quite similar to what I did, which worked fine for me for 
many years (eventually I got an OSX box cause I just like it better, but my 
development process is not _substantially_ different). 




But here's the thing, even if you manage to do actual Windows ruby development 
without a linux VM... assuming you're writing a web app... what the heck are 
you going to actually deploy it on?  If you're planning on deploying it on a  
Windows server, I think you're in for a _world_ of hurt; deploying a production 
ruby web app on a Windows server is going to be much _more_ painful than 
getting a ruby dev environment going on a Windows server. And really that's not 
unique to ruby, it's true of just about any non-Microsoft 
interpreted/virtual-machine language, or compiled language not supported by 
Microsoft compilers.  There are reasons that almost everyone running non-MS 
languages deploys on linux (and a virtuous/vicious circle where since most 
people deploy on linux, most open source deployment tools are for linux). 




If you really have to deploy on a Windows server, you should probably stick to 
MS languages. Or, contrarily, if you want to develop in non-MS languages, you 
should find a way to get linux servers into your infrastructure. 








[1] http://www.expandrive.com/
[2] https://www.virtualbox.org/

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ross Singer 
[rossfsin...@gmail.com]
Sent: Tuesday, October 01, 2013 7:06 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Ruby on Windows




If you absolutely must have a Windows development environment, you may want
to consider a JVM-based scripting language, like Groovy or JRuby. All the
cross-platform advantages, none of the woe. Or, not as much, at
least (there's always a modicum of woe with anything you decide on).




-Ross.




On Tuesday, October 1, 2013, Joshua Welker wrote:




 I'm using Windows 7 x64 SP1. I am using the most recent RubyInstaller
 (2.0.0-p247 x64) and DevKit (DevKit-mingw64-64-4.7.2-2013022-1432-sfx).

 That's disappointing to hear that most folks use Ruby exclusively in *nix
 environments. That really limits its utility for me. I am trying Ruby
 because dealing with HTTP in Java is a huge pain, and I was having
 difficulties setting up a Python environment in Windows, too (go figure).

 Josh Welker


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU]
 On Behalf Of
 David Mayo
 Sent: Tuesday, October 01, 2013 3:44 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Ruby on Windows

 DevKit is a MingW/MSYS wrapper for Windows Ruby development.  It might not
 be finding it, but he does have a C dev environment.

 I know you cut them out earlier, but would you mind sending some of the C
 Header Blather our way?  It's probably got some clues as to what's going
 on.

 Also - which versions of Windows, RubyInstaller, and DevKit are you using?




 On Tue, Oct 1, 2013 at 4:38 PM, Ross Singer 
 rossfsin...@gmail.com
 wrote:

  It's probably also possible to get these working within Cygwin.
  Assuming the libraries you need to compile against are available in
  Cygwin, of course.
 
  -Ross.
 
  On Oct 1, 2013, at 4:28 PM, Michael J. Giarlo 
   leftw...@alumni.rutgers.edu wrote:
 
   Our Windows-based devs all do their Ruby work on Ubuntu and Fedora
   VMs, FWIW.
  
   -Mike
  
  
  
   On Tue, Oct 1, 2013 at 1:12 PM, Justin Coyne
  jus...@curationexperts.com
  wrote:
  
   If you see something about C-extensions, it's because the library
   is not written in pure Ruby, it is a wrapper around a library written
 in C.
   Your
   system may 

[CODE4LIB] google scholar link resolver links broken?!

2013-07-17 Thread Jonathan Rochkind
Google Scholar has a feature where it will provide links to your local 
institutional OpenURL link resolver -- by user preference or IP address 
recognition.


It will then present these links sometimes in a right column in Google 
Scholar results, other times in the row of links under each hit.


The hyperlinks in the right column seem to be broken in current Google 
interface. Clicking on them has no effect. It seems like some kind of 
javascript failure, although no javascript errors are raised in console.


If you "open in new tab" or "open in new window", thus skipping the 
Google javascript -- it does work.


Many of our users use Google Scholar and count on link resolver links 
working. This is awfully inconvenient.


Anyone have any idea of any ways to report this to Google in such a way 
that they might actually care? Anyone got any internal contacts?


Jonathan


Re: [CODE4LIB] google scholar link resolver links broken?!

2013-07-17 Thread Jonathan Rochkind
It is failing with any link resolver at all. It doesn't matter where the 
link is going to; the link does not work.


The more 'hidden' link that's presented for certain records underneath 
the hit still works. But the right-column link is not working -- 
probably because of a javascript bug of some kind, I'd say. It's got 
nothing to do with what link resolver you use.


On 7/17/13 12:39 PM, Ken Varnum wrote:

Is it failing only with 360 Link, or with SFX, too? (We're 360 Link here.)


--
Ken Varnum | Web Systems Manager | MLibrary - University of Michigan - Ann
Arbor
var...@umich.edu | @varnum | http://www.lib.umich.edu/users/varnum |
734-615-3287


On Wed, Jul 17, 2013 at 12:36 PM, Braun Hamilton, Michael R 
michael.braunhamil...@ccv.edu wrote:


I just got word that Serials Solutions is advocating with Google to
resolve it, whatever that means. Maybe they have an actual contact there.

-Michael
__

Michael Braun Hamilton
Public Services Librarian
Hartness Library Community College of Vermont
(802) 828-0125
michael.braunhamil...@ccv.edu

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@listserv.nd.edu] On Behalf Of
Andreas Orphanides
Sent: Wednesday, July 17, 2013 12:31 PM
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] google scholar link resolver links broken?!

Maybe they've got the same plans for Google Scholar as they did for Reader
and other much-adored Google products: to slowly crapify it until it
becomes nearly useless, then retire it on short notice.

On Wed, Jul 17, 2013 at 12:28 PM, Sarah Lester sles...@stanford.edu
wrote:


Hi Jonathan,
I found a place for feedback, but I don't know if that will get to the
right folks at Google. Try:
https://support.google.com/scholar/contact/general

I just tried using the FindIt@ links and they don't work for me
either.  Click and nothing happens.  I also had to re-add my library to
the choices for the link resolver but they still don't work.

Sarah

On Jul 17, 2013, at 9:16 AM, Jonathan Rochkind wrote:


Google Scholar has a feature where it will provide links to your
local

institutional OpenURL link resolver -- by user preference or IP
address recognition.


It will then present these links sometimes in a right column in
Google

Scholar results, other times in the row of links under each hit.


The hyperlinks in the right column seem to be broken in current
Google

interface. Clicking on them has no effect. It seems like some kind of
javascript failure, although no javascript errors are raised in console.


If you "open in new tab" or "open in new window", thus skipping the
Google javascript -- it does work.


Many of our users use Google Scholar and count on link resolver
links

working. This is awfully inconvenient.


Anyone have any idea of any ways to report this to Google in such a
way

that they might actually care? Anyone got any internal contacts?


Jonathan



Privacy & Confidentiality Notice: This message is for the designated
recipient only and may contain privileged, confidential or otherwise
private information. If you have received it in error, please notify the
sender immediately and delete the original. Any other use of an email
received in error is prohibited.






Re: [CODE4LIB] google scholar link resolver links broken?!

2013-07-17 Thread Jonathan Rochkind

And it is now fixed. I didn't do anything, other people were on it. :)

On 7/17/13 12:16 PM, Jonathan Rochkind wrote:

Google Scholar has a feature where it will provide links to your local
institutional OpenURL link resolver -- by user preference or IP address
recognition.

It will then present these links sometimes in a right column in Google
Scholar results, other times in the row of links under each hit.

The hyperlinks in the right column seem to be broken in current Google
interface. Clicking on them has no effect. It seems like some kind of
javascript failure, although no javascript errors are raised in console.

If you "open in new tab" or "open in new window", thus skipping the
Google javascript -- it does work.

Many of our users use Google Scholar and count on link resolver links
working. This is awfully inconvenient.

Anyone have any idea of any ways to report this to Google in such a way
that they might actually care? Anyone got any internal contacts?

Jonathan


Re: [CODE4LIB] LibraryBox 2.0 Kickstarter

2013-07-08 Thread Jonathan Rochkind

I still don't understand how this project differs from PirateBox.

What features are you adding in your fork? What has been added to your 
fork over PirateBox in the current release, and what do you plan to add 
that differs from PirateBox in the 2.0 release you are funding? And why 
are you adding these features in a fork, instead of contributing them 
back to PirateBox?


Or are there no new features, it's feature-identical, but just with a 
different name and different branding?  In which case, what is the 
kickstarter actually paying for?


I'm also very confused about how you are budgetting, how you are 
determining how much money raised will fund how many new features of 
what sort:


You say in the kickstarter that the money raised will "help me find and 
pay them to make LibraryBox more awesome" -- but then you also say that 
"Anything raised here on Kickstarter will also be used to purchase 
hardware" -- this seems to be contradictory. Will the money be used to 
pay developers, or will it be used to purchase hardware?


If it was being used to purchase hardware, then it wouldn't be obvious 
that more money raised could lead to more feature development -- since 
you don't need more hardware for more feature development. But you 
repeat later that the more money raised, the more features will be 
delivered: "If we raise a ton of money, the v2.0 will have a ton a 
features!" -- so I'm thinking your earlier assertion that the money will 
be used for hardware was in error (and you should correct it to avoid 
being dishonest and/or self-contradictory) -- you do plan to use the 
money to pay developers?


But then the question is, what methods have you used to estimate how 
much it will cost to pay developers for each of the new features or 
improvements you plan, how do you know the amount of money you are 
raising is sufficient for the development you are telling people you'll 
do with it -- including the 'stretch features' you already have in mind 
but have not revealed yet (you say will be revealed 'as soon as the 
project is funded').


Also, do you plan to use any of the money to pay yourself for your time, 
in addition to paying other developers, and buying hardware?


Those are my questions, since you asked.

I think these are questions that need to be answered for code4libbers -- 
or really anyone that has enough understanding of software development 
to know what to ask -- to be interested in giving you money.


Frankly, I have some serious reservations about contributing to your 
project, and would share these reservations with anyone else you asked. 
It is not clear to me that you have a clear plan for what you're 
actually going to do; that you have adequately done homework to make 
sure you can do what you want to do for the amount of money you expect; 
and you have not provided the argument for why what you want to do (a 
fork of PirateBox) is actually a useful thing to want to do in the first 
place.


Jonathan



On 7/8/13 2:14 PM, Jason Griffey wrote:

In case people hadn't seen this, at ALA Annual last week I launched a
Kickstarter for the development of LibraryBox 2.0 (http://librarybox.us),
an open source fork of the PirateBox project. I had originally budgeted
for $3K for the Kickstarter, hoping to make a bit more than that in order
to pay a developer to do the bits of the release that I can't do.

Well, it sort of blew up.

http://www.kickstarter.com/projects/griffey/librarybox-20

Take a look, let me know if you have questions. I'm really excited about
the project, and the opportunities for development that I have now.

Jason




Re: [CODE4LIB] phone app for barcode-to-textfile?

2013-06-07 Thread Jonathan Rochkind
If you are interested in doing some development, this project description and 
code may be of interest to you:

http://journal.code4lib.org/articles/5014

ISBN and QR Barcode Scanning Mobile App for Libraries

This article outlines the development of a mobile application for the Ryerson 
University Library. The application provides for ISBN barcode scanning that 
results in a lookup of library copies and services for the book scanned, as 
well as QR code scanning. Two versions of the application were developed, one 
for iOS and one for Android. The article includes some details on the free 
packages used for barcode scanning functionality. Source code for the Ryerson 
iOS and Android applications are freely available, and instructions are 
provided on customizing the Ryerson application for use in other library 
environments. Some statistics on the number of downloads of the Ryerson mobile 
app by users are included.

by Graham McCarthy and Sally Wilson

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ken Irwin 
[kir...@wittenberg.edu]
Sent: Thursday, June 06, 2013 1:40 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] phone app for barcode-to-textfile?

Hi all,

Does anyone have a phone app (pref. iOS) that will just scan barcodes to a 
textfile? All the apps I'm finding are shopping oriented or other special uses. 
I just want to replace our antique barcode scanner that spits out a list of 
barcodes as a text file.

Anyone have such a thing? Or advice on where to assemble the building blocks to 
create one?

Thanks
Ken


Re: [CODE4LIB] Visualizing (public) library statistics

2013-06-05 Thread Jonathan Rochkind
I recently saw a great example of exactly what you're talking about... but now 
I can't find it!

I think it might have been a public library somewhere in michigan, but I could 
be misremembering that. It was pointed out on the #code4lib IRC channel, 
whoever was responsible for it was on channel at the time, and someone 
congratulated them because their public statistics dashboard had been featured 
on some web page somewhere. 

Bah, this probably isn't too helpful! How frustrating, I'm certain I saw an 
example of exactly what you are asking for! (I encouraged them to submit to 
the code4lib journal on it, because I knew people would want to know about it!)

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Cab Vinton 
[bibli...@gmail.com]
Sent: Wednesday, June 05, 2013 3:40 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Visualizing (public) library statistics

Come budget time, I invariably find myself working with the most
recent compilation of public library statistics put out by our State
Library -- comparing our library to peer institutions along a variety
of measures (support per capita, circulation per capita, staffing
levels, etc.) so I can make the best possible case for increasing/
maintaining our funding.

The raw data is in an Excel spreadsheet --
http://www.nh.gov/nhsl/lds/public_library_stats.html -- so this seems
ripe for mashing up, data visualization, online charting, etc.

Does anyone know of any examples where these types of library stats
have been made available online in a way that meets my goals of being
user-friendly, visually informative/ clear, and just plain cool?

If not, examples from the non-library world and/ or pointers to
dashboards of note would be equally welcome, particularly if there's
an indication of how things work on the back end.

Cheers,

Cab Vinton, Director
Sanbornton Public Library
Sanbornton, NH


Re: [CODE4LIB] Visualizing (public) library statistics

2013-06-05 Thread Jonathan Rochkind
Aha, I found it! I was right, it was Michigan. 

http://www.tadl.org/stats/
http://www.tadl.org/about/stats

I can't remember the name of the code4libber responsible, but they were on the 
#code4lib IRC channel, they are around in our community! 


From: Jonathan Rochkind
Sent: Wednesday, June 05, 2013 5:45 PM
To: Code for Libraries
Subject: RE: [CODE4LIB] Visualizing (public) library statistics

I recently saw a great example of exactly what you're talking about... but now 
I can't find it!

I think it might have been a public library somewhere in michigan, but I could 
be misremembering that. It was pointed out on the #code4lib IRC channel, 
whoever was responsible for it was on channel at the time, and someone 
congratulated them because their public statistics dashboard had been featured 
on some web page somewhere.

Bah, this probably isn't too helpful! How frustrating, I'm certain I saw an 
example of exactly what you are asking for! (I encouraged them to submit to 
the code4lib journal on it, because I knew people would want to know about it!)

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Cab Vinton 
[bibli...@gmail.com]
Sent: Wednesday, June 05, 2013 3:40 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Visualizing (public) library statistics

Come budget time, I invariably find myself working with the most
recent compilation of public library statistics put out by our State
Library -- comparing our library to peer institutions along a variety
of measures (support per capita, circulation per capita, staffing
levels, etc.) so I can make the best possible case for increasing/
maintaining our funding.

The raw data is in an Excel spreadsheet --
http://www.nh.gov/nhsl/lds/public_library_stats.html -- so this seems
ripe for mashing up, data visualization, online charting, etc.

Does anyone know of any examples where these types of library stats
have been made available online in a way that meets my goals of being
user-friendly, visually informative/ clear, and just plain cool?

If not, examples from the non-library world and/ or pointers to
dashboards of note would be equally welcome, particularly if there's
an indication of how things work on the back end.

Cheers,

Cab Vinton, Director
Sanbornton Public Library
Sanbornton, NH


Re: [CODE4LIB] Visualizing (public) library statistics

2013-06-05 Thread Jonathan Rochkind
And to triple post myself, if you google around (I tried "public library 
benefit statistics dashboard") you can find some other examples too, such as:

http://www.library.appstate.edu/about/planning

And there is in fact a Code4Lib Journal article on one implementation of 
library statistic visualization:

http://journal.code4lib.org/articles/7812



From: Jonathan Rochkind
Sent: Wednesday, June 05, 2013 5:47 PM
To: Code for Libraries
Subject: RE: [CODE4LIB] Visualizing (public) library statistics

Aha, I found it! I was right, it was Michigan.

http://www.tadl.org/stats/
http://www.tadl.org/about/stats

I can't remember the name of the code4libber responsible, but they were on the 
#code4lib IRC channel, they are around in our community!


From: Jonathan Rochkind
Sent: Wednesday, June 05, 2013 5:45 PM
To: Code for Libraries
Subject: RE: [CODE4LIB] Visualizing (public) library statistics

I recently saw a great example of exactly what you're talking about... but now 
I can't find it!

I think it might have been a public library somewhere in michigan, but I could 
be misremembering that. It was pointed out on the #code4lib IRC channel, 
whoever was responsible for it was on channel at the time, and someone 
congratulated them because their public statistics dashboard had been featured 
on some web page somewhere.

Bah, this probably isn't too helpful! How frustrating, I'm certain I saw an 
example of exactly what you are asking for! (I encouraged them to submit to 
the code4lib journal on it, because I knew people would want to know about it!)

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Cab Vinton 
[bibli...@gmail.com]
Sent: Wednesday, June 05, 2013 3:40 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Visualizing (public) library statistics

Come budget time, I invariably find myself working with the most
recent compilation of public library statistics put out by our State
Library -- comparing our library to peer institutions along a variety
of measures (support per capita, circulation per capita, staffing
levels, etc.) so I can make the best possible case for increasing/
maintaining our funding.

The raw data is in an Excel spreadsheet --
http://www.nh.gov/nhsl/lds/public_library_stats.html -- so this seems
ripe for mashing up, data visualization, online charting, etc.

Does anyone know of any examples where these types of library stats
have been made available online in a way that meets my goals of being
user-friendly, visually informative/ clear, and just plain cool?

If not, examples from the non-library world and/ or pointers to
dashboards of note would be equally welcome, particularly if there's
an indication of how things work on the back end.

Cheers,

Cab Vinton, Director
Sanbornton Public Library
Sanbornton, NH


Re: [CODE4LIB] Wordpress: Any way to selectively control caching for content areas on a page?

2013-05-31 Thread Jonathan Rochkind
 What is most useful for me is very general conceptual directions on how to
force certain pages to refresh within a CMS, and a sanity check as to
whether it is possible to force a refresh for only certain content areas on
a page with several content areas.

 My feeling is that it would be possible to force a refresh of certain
pages, but that needs to be done from the html header.  My feeling is that
it's not possible to force a refresh for specific content areas only, but
if anyone knows conceptually how to do this, then I would love to be
pleasantly surprised.

If you're talking about HTTP-level caching, yes. It's controlled by headers on 
the HTTP response, and thus is page-by-page, meaning both a whole page (URL) at 
a time, and that for pages to be cached differently they need different HTTP 
headers delivered with them, by the CMS or web app or web server. 

Some CMS's have their own internal caching, that is not HTTP-level caching and 
is invisible to the client or user-agent, it's done just inside the 'black box' 
of the CMS. So even in cases where the browser will not cache the page, where 
the browser will make a request to the server for the page -- the server may 
then serve the page from its own internal cache, for instance to save the time 
of going to the database and rendering the HTML, and just serve already-rendered 
HTML out of an inside-the-server cache. This kind of cache can possibly 
operate on a portion of the page; it depends on how the hypothetical CMS is 
written. 

So that's a conceptual overview. 
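As a concrete sketch of the HTTP-level case: caching is set per-response via headers, which is why it is inherently page-by-page. Here it is as a pair of Rack-style ruby endpoints (the Cache-Control values are the standard HTTP directives; the endpoints themselves are just an illustration, not any particular CMS):

```ruby
# Two Rack-style endpoints: each response carries its own Cache-Control
# header, so cacheability is decided one whole page (URL) at a time.
cacheable = lambda do |env|
  [200,
   {"Content-Type" => "text/html", "Cache-Control" => "public, max-age=3600"},
   ["<p>may be cached for an hour</p>"]]
end

no_cache = lambda do |env|
  [200,
   {"Content-Type" => "text/html", "Cache-Control" => "no-store"},
   ["<p>never cached</p>"]]
end

status, headers, _body = cacheable.call({})
headers["Cache-Control"]  # => "public, max-age=3600"
```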

With WordPress specifically? People have suggested some WordPress plugins that 
do caching in various different ways. But when you don't have any control over 
the WordPress installation? I guess it depends on if they have any such plugins 
installed, which only they know. 

What is your motivation here?  Just curiosity?  Or are you _wanting_ your pages 
to be cached, when they are not already? -- if so, why?   Or are things being 
cached that you do not want cached, and you need to fix it? Or what?


[CODE4LIB] shibboleth listserv?

2013-05-09 Thread Jonathan Rochkind
Can anyone provide instructions as to the current address of the 
shibboleth-users listserv?


I think I've managed to subscribe to the shibboleth-users listserv 
mentioned at: http://shibboleth.net/community/lists.html


Actually, I was already subscribed, but had delivery turned off. I 
managed to recover my password and turn delivery back on just now. I think.


But I can't figure out what email address to send to to send an email to 
the list.


Sending email to shibboleth-us...@shibboleth.net bounces back as 
undeliverable.


How do I send an email to the shibboleth-users listserv I think I am 
successfully subscribed to? Have I even found the right shibboleth-users 
listserv? (Googling there's like 5 of them).


Anyone have any idea? Thanks for any advice!  I have been unsuccessful 
figuring this out on the shibboleth.net website or googling, apologies 
if i'm missing something obvious.


Jonathan


Re: [CODE4LIB] shibboleth listserv?

2013-05-09 Thread Jonathan Rochkind

Nevermind, the answer is us...@shibboleth.net

Thanks jeff_ on #code4lib.

On 5/9/2013 1:52 PM, Jonathan Rochkind wrote:

Can anyone provide instructions as to the current address of the
shibboleth-users listserv?

I think I've managed to subscribe to the shibboleth-users listserv
mentioned at: http://shibboleth.net/community/lists.html

Actually, I was already subscribed, but had delivery turned off. I
managed to recover my password and turn delivery back on just now. I think.

But I can't figure out what email address to send to to send an email to
the list.

Sending email to shibboleth-us...@shibboleth.net bounces back as
undeliverable.

How do I send an email to the shibboleth-users listserv I think I am
successfully subscribed to? Have I even found the right shibboleth-users
listserv? (Googling, there's like 5 of them.)

Anyone have any idea? Thanks for any advice!  I have been unsuccessful
figuring this out on the shibboleth.net website or by googling; apologies
if I'm missing something obvious.

Jonathan


Re: [CODE4LIB] password lockboxes

2013-03-05 Thread Jonathan Rochkind
There are cryptographic algorithms that can do that. It seems like 
overkill for departmental root passwords though.


On 3/5/2013 1:35 PM, Joe Hourcle wrote:

On Mar 5, 2013, at 8:29 AM, Adam Constabaris wrote:


An option is to use a password management program (KeepassX is good because
it is cross platform) to store the passwords on the shared drive, although
of course you need to distribute the passphrase for it around.


So years ago, when I worked for a university, they wanted us to put all of the 
root passwords into an envelope, and give them to management to hold.  (we were 
a Solaris shop, so there actually were root passwords on the boxes, but you had 
to connect from the console or su to be able to use 'em).

We managed to drag our heels on it, and management forgot about it*, but I had 
an idea ...

What if there were a way to store the passwords similar to the secret formula 
in Knight Rider?

Yes, I know, it's an obscure geeky reference, and probably dates me.  The story 
went that the secret bullet-proof spray-on coating wasn't held by any one 
person; there were three people who each knew part of the formula, and any 
two of them had enough knowledge to make it.

For needing 2 of 3 people, the process is simple -- divide it up into 3 parts, 
and each person has a different missing bit.  This doesn't work for 4 people, 
though (either needing 2 people, or 3 people to complete it).

You could probably do it for two or three classes of people (e.g., you need 1 sysadmin + 1 
manager to unlock it), but I'm not sure if there's some method to get an arbitrary 
X of Y people required to unlock.

If anyone has ideas, send 'em to me off-list.  (If other people want the 
answer, I can aggregate / summarize the results, so I don't end up starting yet 
another inappropriate out-of-control thread)

...

Oh, and I was assuming that you'd be using PGP, using the public key to encrypt 
the passwords, so that anyone could insert / update a password into whatever 
drop box you had; it'd only be taking stuff out that would require multiple 
people to combine efforts.

-Joe


* or at least, they didn't bring it up again while I was still employed there.
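The arbitrary X-of-Y unlock Joe asks about does exist: Shamir's secret sharing (1979) splits a secret into Y shares such that any X of them reconstruct it and fewer than X reveal nothing. A minimal Python sketch follows; the field prime and the byte-encoding of the password are illustrative choices, and a real deployment should use a vetted implementation rather than toy code like this.

```python
# Shamir's Secret Sharing: any k of n shares recover the secret; fewer
# than k reveal nothing. Toy sketch over a prime field.
import random

P = 2**127 - 1  # prime field modulus; must exceed the encoded secret

def split(secret, k, n):
    """Split an integer secret (< P) into n shares; any k recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 yields the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P  # Python 3.8+
    return secret

pw = int.from_bytes(b"root-password", "big")
shares = split(pw, k=3, n=5)            # 5 holders, any 3 can unlock
assert recover(shares[:3]) == pw
assert recover([shares[0], shares[2], shares[4]]) == pw
```

The "1 sysadmin + 1 manager" variant also falls out of this: give each class its own split, or nest one scheme inside another.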




[CODE4LIB] what do you do: API accounts used by library software, that assume an individual is registered

2013-03-04 Thread Jonathan Rochkind
Whether it's Amazon AWS, or Yahoo BOSS, or JournalTOCs, or almost 
anything else -- there are a variety of API's that library software 
wants to use, which require registering an account to use.


They may or may not be free, sometimes they require a credit card 
attached too.


Most of them assume that an individual person is creating an account, 
the account will be in that individual's name, with an email address, etc.


This isn't quite right for a business or organization, like the library, 
right?  What if that person leaves the organization? But all this 
existing software is using API keys attached to 'their' account? Or what 
if the person doesn't leave, but responsibilities for monitoring emails 
from the vendor (sent to that account) change?  And even worse if 
there's an institutional credit card attached to that account.


I am interested in hearing solutions or approaches that people have 
ACTUALLY tried to deal with this problem, and how well they have worked.


I am NOT particularly interested in "Well, you could try X or Y"; I can 
think of a bunch of things I _could_ try myself, each with their 
potential strengths and weaknesses. I am interested in hearing about 
what people actually HAVE tried or done, and how well it has worked.


Has anyone found a way to deal with this issue, other than having each 
API registered to an account belonging to whatever individual staff 
happened to be dealing with it that day?


Thanks for any advice.


Re: [CODE4LIB] what do you do: API accounts used by library software, that assume an individual is registered

2013-03-04 Thread Jonathan Rochkind
Makes sense, thanks!  Although leaving an account/password list unencrypted 
on a shared drive seems potentially dangerous.


On 3/4/2013 1:20 PM, Laura Robbins wrote:

We have a shared email account that we use for these situations.  As
well, we have a master account/password list for all of the different
accounts that get created that is in a shared network folder.  That
way if someone is out sick or on sabbatical, the information is
available to all of our full-time librarians.

Laura Pope Robbins
Associate Professor/Reference Librarian
Dowling College Library

Phone: 631.244.5023
Fax: 631.244.3374

A mind needs books as a sword needs a whetstone, if it is to keep its
edge.  --Tyrion Lannister in A Game of Thrones by George R.R. Martin

On Mar 4, 2013, at 11:11 AM, Jonathan Rochkind rochk...@jhu.edu wrote:


Whether it's Amazon AWS, or Yahoo BOSS, or JournalTOCs, or almost anything else 
-- there are a variety of API's that library software wants to use, which 
require registering an account to use.

They may or may not be free, sometimes they require a credit card attached too.

Most of them assume that an individual person is creating an account, the 
account will be in that individual's name, with an email address, etc.

This isn't quite right for a business or organization, like the library, right? 
 What if that person leaves the organization? But all this existing software is 
using API keys attached to 'their' account? Or what if the person doesn't 
leave, but responsibilities for monitoring emails from the vendor (sent to that 
account) change?  And even worse if there's an institutional credit card 
attached to that account.

I am interested in hearing solutions or approaches that people have ACTUALLY 
tried to deal with this problem, and how well they have worked.

I am NOT particularly interested in "Well, you could try X or Y"; I can think 
of a bunch of things I _could_ try myself, each with their potential strengths and 
weaknesses. I am interested in hearing about what people actually HAVE tried or done, and 
how well it has worked.

Has anyone found a way to deal with this issue, other than having each API 
registered to an account belonging to whatever individual staff happened to be 
dealing with it that day?

Thanks for any advice.





Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-22 Thread Jonathan Rochkind
Can you two take your argument somewhere else? This thread is REALLY boring. 

(Am I going to make it worse by posting this? Are people going to start flaming 
me for being intolerant? Would I deserve it? Possibly.  I am willing to take 
that risk in a last ditch hope that the Code4Lib listserv can somehow return to 
having actual discussions of code and technical matters again, instead of being 
like most other non-technical 'library technology' listservs. Unlikely, eh?  
Should we start a Code4LibDiscussingCodeAgain separate listserv? Please don't 
answer any of these questions in this thread, or on this listserv at all, 
really. )

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of MJ Ray 
[m...@phonecoop.coop]
Sent: Friday, February 22, 2013 3:55 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

Shaun Ellis sha...@princeton.edu
 If you read my email, I don't tell anyone what to use, but simply
 attempt to clear up some fallacies.  Distributed version control is new
 to many, and I want to make sure that folks are getting accurate
 information from this list.

As would I.  I don't think spreading misinformation about the products
of GitHub, Inc, is helping people to get accurate information.

 Unfortunately, this statement is not accurate either:

 // There's a sneaky lock-in effect of having one open tool (git hosting)
 which is fairly easy to move in and out and interoperate with, linked to
 other closed tools (such as their issues tracker and their non-git pull
 requests system) which are harder to move out or interoperate. //

Nothing written below points out any inaccuracy.

 GitHub's API allows you to easily export issues if you want to move them
 somewhere else: http://developer.github.com/v3/issues/

So what's the equivalent command to git clone  to do that, then?
I said "harder", not "impossible".  You try putting the sausagemeat you get
from that API into any other issue tracker.  Also, that API is only
available to registered users and it's unique as far as I've seen.

 Pull-requests are used by repository hosting platforms to make it easier
 to suggest patches.  GitHub and BitBucket both use the pattern,

Well, the pattern comes from the git request-pull tool.  GitHub just
disconnects it from that.

 and I don't understand what you mean by it being a closed tool.
 If you're concerned about barriers to entry, suggesting a patch
 using only git or mercurial can be done, but I wouldn't say it's
 easy.

git send-email and git request-pull are both pretty easy, aren't they?

and what Erik said about open/closed.

 ... and what Devon said.

Which was: "If you're not willing to provide even your name to make use
of a free service, then I dare say you are erecting your own
barriers."

I'm willing to provide my name.  I'm not willing to provide my full
legal name to them.  They have no need for my full legal name.  Even
if they want to come after me legally, the legal system will either
accept my common alias or convert it for them (I have to tell it both,
for that reason).

Hope that explains,
--
MJ Ray (slef), member of www.software.coop, a for-more-than-profit co-op.
http://koha-community.org supporter, web and library systems developer.
In My Opinion Only: see http://mjr.towers.org.uk/email.html
Available for hire (including development) at http://www.software.coop/


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Jonathan Rochkind
Probably a mistake for me to post at all, but I'm full of mistakes. You 
know what, if someone wants to set up a spot for nerd poetry, I think 
they should do so. If someone else wants to set up a different spot 
using different tech, I think they should do so too.


I think it's mistaken to think anyone needs to reach consensus on what 
the 'right' technology or spot for this is; and I also think it's 
mistaken to think that any one platform or technology is going to be 
thought to be the best by everyone involved; different people will 
always have different opinions.   I also think it's a mistake to get 
offended because what someone else feels like experimenting with setting 
up is not something you think is the best way to do it.


Most pieces of code4lib 'social' tech, going back to this mailing list 
itself, were created by someone who felt like creating them because they 
thought it would be fun and rewarding, and they did so.


Now, if you want to discuss the technical pros and cons of different 
technical options, or even some technical how-tos, I think that would 
be a great use of the code4lib listserv.


Re: [CODE4LIB] A Responsibility to Encourage Better Browsers ( ? )

2013-02-19 Thread Jonathan Rochkind

On 2/19/2013 10:22 AM, Michael Schofield wrote:

Now that Google, jQuery, and others will soon drop support for IE8 -
its time to politely join-in and make luddite patrons aware. IMHO,
anyway.


I would like a cite for this. I think you are mis-informed. It is a 
misconception that JQuery is dropping support for IE8 anytime soon. And 
I'm not sure what you mean about 'Google' dropping support for IE8.


[The misconception comes from the fact that JQuery 1.9 will not support 
IE 9, HOWEVER, JQuery 1.8 will be supported indefinitely as 
feature-complete-compatible with JQuery 1.9, and supporting IE 9. 
JQuery 1.9 is just an alternate smaller JQuery without IE 8 support, 
yeah, but JQuery 1.8 has no EOL and will be supported indefinitely, 
feature-complete with 1.9].


Anyway, I think it's clear that web developers with our level of 
resources cannot afford to support every browser that may possibly exist.


We have to decide on our list of browsers we will actually spend time 
ensuring work with our code.  (You can also, like JQuery-mobile, have a 
list that's supported as 'first class', and another list that is 
supported with graceful degradation -- and then others which you don't 
look at at all, and may fail miserably/unusably).


That decision is generally based on a combination of popularity of 
browsers among your users as well as difficulty (expense) to support.


If you can politically get away with no longer supporting IE8 even 
though it's popular among your users, I guess that could be legit. It 
depends on your 'business needs', right?


Once you've decided to stop supporting a browser, especially one that 
may be popular anyway, a secondary question is whether to let it just 
silently potentially fail (you generally aren't spending time analyzing 
whether it will in fact fail, work as intended, or degrade gracefully -- 
that's part of the point), or actually sniff user agents and give the 
user some sort of warning that your site may not work with your browser.


If you are going to give a warning, I'd recommend it be a relatively 
unobtrusive warning that still lets them proceed to use your site anyway 
if they want to ignore your warning, rather than one that locks them out.


Re: [CODE4LIB] A Responsibility to Encourage Better Browsers ( ? )

2013-02-19 Thread Jonathan Rochkind

On 2/19/2013 12:19 PM, Michael Schofield wrote:

Hey world,

I suppose I could start appending footnotes to my ranty emails.
Johnathan is definitely right regarding jQuery while I was
generalizing. Yes, jq1.8 will be supported - but, if you wanted to,
you could still run a site using jq1.4.


Dude, you are still spreading FUD.  (Although I confused things more by 
talking about 1.8 when I didn't mean 1.8 -- I meant 1.9, which will keep 
supporting older IE (6/7/8), and 2.0, which will not).


You COULD run a site on JQuery 1.4, but JQuery 1.4 is no longer 
supported, and does not have the features that recent JQueries do.


This IS NOT THE SITUATION WITH JQUERY 1.9 VS 2.0. (NOTE WELL I had the 
version numbers wrong before. It's 1.9 and 2.0 I am talking about, not 
1.8 and 1.9)  I put that in all caps because people seem to be 
frequently misunderstanding this, and you are spreading misinformation 
by making this comparison.


JQuery 1.9 will continue to be officially supported for the indefinite 
future; it has no End of Life. And it will be supported with _feature 
parity_ with 2.0: if they add new features to 2.0, they will add them to 
1.9.  1.9 and 2.0 will be _simultaneous alternatives_, with 1.9 
supporting older IE (8 and earlier), and 2.0 not.  It is even encouraged 
that if you want to support older IE, you deliver a JQ 1.9 to those IE 
browsers, and a JQ 2.0 to everyone else -- and all your actual app code 
built on JQ should work on either; the commitment is to make them 
identically compatible.


This has not happened with JQuery before, which is why it's confusing people. 
It is NOT NOT NOT comparable to "you could still run a site using jquery 
1.4".



http://blog.jquery.com/2013/01/14/the-state-of-jquery-2013/

First of all, let’s be very clear: The jQuery team does “worry about” IE 
6/7/8, with jQuery 1.9. We’ve created a legacy-free jQuery 2.0 in order 
to address the many situations where older versions of IE aren’t needed. 
Some glorious day in the future, jQuery 2.0 will be the only version 
you’ll need; until then we’ll continue to maintain jQuery 1.9.



 The jQuery team is moving

beyond LT IE9 because losing the bloat is certainly more performant,
especially as the web scurries further away from high-speed
connections. Even now, many of us are supporting old IE by pulling in
additional stylesheets or scripts. The practice doesn't change if on
detection you choose to load jq1.8 instead of 2+.  As the web moves
forward, the experience for old browser users will increasingly suck
- polyfills bust performance budget.

Google Apps / Play pulled support for IE8 on November 15. Link to
Techcrunch below. It's not fatal, but it's the same premise - IE8
users will get the "you should consider upgrading" message. It's the
beginning of the trend, but it's definitely a trend.  I just
browserstacked my dusty G+ profile and there is a polite message. I
didn't see it on Calendar or Gmail. It's more in-your-face on Play.
It's there and it's not.  It works, it's gradual, but it's goading.

John's right, too, when he makes the point that the decision has a
lot to do with the difficulty / expense to support. The question to
me is if a library website is built by taxes and tuition, is there a
point where the redundant work for backward compatibility becomes a
disservice?

Michael // ns4lib.com

http://techcrunch.com/2012/09/14/google-apps-says-goodbye-to-internet-explorer-pulls-support-for-the-browser/

 -Original Message- From: Jonathan Rochkind
[mailto:rochk...@jhu.edu] Sent: Tuesday, February 19, 2013 11:57 AM
To: Code for Libraries Cc: Michael Schofield Subject: Re: [CODE4LIB]
A Responsibility to Encourage Better Browsers ( ? )

On 2/19/2013 10:22 AM, Michael Schofield wrote:

Now that Google, jQuery, and others will soon drop support for IE8
- its time to politely join-in and make luddite patrons aware.
IMHO, anyway.


I would like a cite for this. I think you are mis-informed. It is a
misconception that JQuery is dropping support for IE8 anytime soon.
And I'm not sure what you mean about 'Google' dropping support for
IE8.

[The mis-conception comes from the fact that JQuery 1.9 will not
support IE 9, HOWEVER, JQuery 1.8 will be supported indefinitely as
feature-complete-compatible with JQuery 1.9, and supporting IE 9.
JQuery 1.9 is just an alternate smaller JQuery without IE 8 support,
yeah, but JQuery 1.8 has no EOL and will be supported indefinitely
feature-complete with 1.9].

Anyway, I think it's clear that the web developer with our level of
resources can not afford to support every browser that may possibly
exist.

We have to decide on our list of browsers we will actually spend time
ensuring work with our code.  (You can also, like JQuery-mobile, have
a list that's supported as 'first class', and another list that is
supported with graceful degredation -- and then others which you
don't look at at all, and may fail miserably/unusably).

That decision is generally based on a combination of popularity of
browsers among your users as well

Re: [CODE4LIB] Getting started with Ruby and library-ish data (was RE: [CODE4LIB] You *are* a coder. So what am I?)

2013-02-18 Thread Jonathan Rochkind

On 2/18/2013 2:04 PM, Jason Stirnaman wrote:

I've been thinking a lot about how to introduce not only my kids, but
some of our cataloging/technical staff to thinking programmatically
or computationally[1] or whatever you want to call it.


Do you have an opinion of the google 'computational thinking' curriculum 
pieces linked off of that page you cite? For instance, at:


http://www.google.com/edu/computational-thinking/lessons.html

Or at:

http://www.iste.org/learn/computational-thinking/ct-toolkit


[CODE4LIB] ruby code for Chicago Transit Authority API

2013-02-08 Thread Jonathan Rochkind
Coincidentally, I happened to notice this today. In case some code4libber in 
chicago for the conference feels like playing with it, someone has just 
released a ruby gem with ruby wrapper for CTA API (who even knew there was 
one?), apparently allowing realtime tracking of busses, among other things.

https://github.com/fbonetti/cta-api


[CODE4LIB] Delivery services preconference location and schedule

2013-02-06 Thread Jonathan Rochkind
Info on the delivery service pre-conf below, from Ted Lawless who's 
organizing it.


If you'd like to try to install Umlaut, please come with a unix 
(including OSX) computer with ruby 1.9.3 already successfully installed 
on it. It can be a laptop, or it can be a remote machine you can ssh to, 
either way. (although for the latter, you might want to have a way to 
easily edit files on that remote machine that you are comfortable with).



 Original Message 
Date: Tue, 5 Feb 2013 20:43:47 -0500
From: Ted Lawless tlawl...@brown.edu


Hello,

I've included below the location and a schedule for the preconference
on Monday.  Please let me or the group know -
c4lib13dels...@yahoogroups.com - if you have questions or
suggestions.

Location
  - IDEA Commons of the Richard J. Daley Library.  Francis says
someone minding the door will be able to point us in the right
direction if we have questions.  We will have access to a projector.
  - http://lanyrd.com/venues/chicago/vcdtp/


Schedule
  9:10 - 9:25   -  Intros
  9:25 - 9:40   -  Umlaut - Jonathan Rochkind
  9:40 - 9:55   -  Umlaut implementation plans at Princeton - Kevin Reiss
  9:55 - 10:10  -  GWU Launchpad - Rosalyn Metz
  10:10 - 10:25 -  easyArticle and easyBorrow at Brown - Birkin Diana
and Ted Lawless
  10:25 - 10:35 -  break
  10:35 - 10:50 -  Cal State Get It Now - Aaron Collier
  10:50 - 11:05 -  Dealing with change at VCU - Erin White
  11:05 - 11:55 -  Installation sessions and break out discussions
  11:55 - 12:00 -  Wrap up


Breakout discussion ideas
  - gathering data on transactions and rates of successful full text

Installation sessions
  - Umlaut
- Please come with access to a machine with Ruby 1.9.3 installed.
Probably not Windows.
- https://github.com/team-umlaut/umlaut/wiki/Installation

  - easyA utilities from Brown
- access to a machine with Python 2.6 or 2.7.
- to make things easier install virtualenv,
http://pypi.python.org/pypi/virtualenv, and git.

Ted


[CODE4LIB] interesting link resolver layout for title-level links?

2013-02-05 Thread Jonathan Rochkind
So, many of us have a 'link resolver' product, which among other things 
will give you a screen for a journal title (say, JAMA), which lists 
several different licensed full text platforms offering access.


These platforms are usually listed with a vendor/platform name (which is 
a hyperlink), along with a coverage statement.


In most of the UI's I've seen, including most of the out-of-the-box UI's 
from the link resolver products, the vendor/platform name is the most 
prominent/scannable part of the item, while the dates of coverage are 
actually graphically subsidiary and hard to scan.


Whereas, in fact, the coverage statement is the thing most patrons are 
probably most interested in (not always, but most of the time), and which 
it's most important the user notice before clicking on the link to find 
out that coverage was only until 1995 when they wanted recent coverage.


Can anyone show me examples of link resolver UI's that change the 
emphasis in the graphic design to make the coverage statement the 
prominent part?  Either customized local UI's, or different vendor 
products that do this differently, etc.


One thing that makes this especially challenging is that while the 
coverage statement is _sometimes_ as simple as "1990 to present", 
sometimes it can include month and even day on both end points, as well 
as volume/issue statements on both endpoints. Which is a lot of 
information. I'm not sure how/if to split it up, and generally need some 
ideas from looking at prior art here, if there is any.


Thanks for any pointers!


[CODE4LIB] conf presenters: a kind request

2013-02-04 Thread Jonathan Rochkind
We are all very excited about the conference next week, to speak to our 
peers and to hear what our peers have to say!


I would like to suggest that those presenting be considerate to your 
audience, and actually prepare your talk in advance!


You may think you can get away with making some slides that morning 
during someone else's talk and winging it; nobody will notice, right? Or 
they won't care if they do?


From past years, I can say that for me at least, yeah, I can often tell 
who hasn't actually prepared their talk. And I'll consider it 
disrespectful to the time of the audience, who voted for your talk and 
then got on airplanes to come see it, and you didn't spend the time to 
plan it in advance and make it as high quality for them as you could.


I don't mean to make people nervous about public speaking. The code4lib 
audience is a very kind and generous audience, they are a good audience. 
It'll go great! Just maybe repay their generosity by actually preparing 
your talk in advance, you know?  Do your best, it'll go great!


If you aren't sure how to do this, the one thing you can probably do to 
prepare (maybe this is obvious) is practice your presentation in 
advance, with a timer, just once.  In front of a friend or just by 
yourself. Did you finish on time, and get at least half of what was 
important in? Then you're done preparing, that was it!  Yes, if you're 
going to have slides, this means making your slides or notes/outline in 
advance so you can practice your delivery just once!


Just practice it once in advance (even the night before, as a last 
resort!), and it'll go great!


Jonathan


[CODE4LIB] kickstarter for a homebrew project, from mistym

2013-02-04 Thread Jonathan Rochkind

homebrew is what many of us use to install code packages on OSX.

mistym (Misty De Meo) is a frequent hanger outer in #code4lib, and has 
helped many of us with stuff (including homebrew). She's also one of the 
homebrew maintainers.


Here's a kickstarter from mistym, to help homebrew with some automated 
testing infrastructure, to better handle user code submissions.


It's notable that they're asking for money for actual equipment, not 
developer time reimbursement or anything. They're not asking for much.


Consider kicking in some bucks if you feel like it!

http://www.kickstarter.com/projects/homebrew/brew-test-bot


Re: [CODE4LIB] Why we need multiple discovery services engine?

2013-01-31 Thread Jonathan Rochkind
So, there are two categories of solutions here -- 1) local indexes, where you 
create the index yourself, like blacklight or vufind (both based on a local 
Solr).  2) vendor-hosted indexes, where the vendor includes all sorts of things 
in their index that you the customer don't have local metadata for, mostly 
including lots and lots of scholarly article citations. 

If you want to include scholarly article citations, you probably can't do that 
with a local index solution. Although some consortia have done some 
interesting stuff in that area, let's just say it takes a lot of resources to 
do. For most people, if you want to include article search in your index, it's 
not feasible to do so with a local index. So VuFind/Blacklight with only a local 
Solr is out, if you want article search. 

You _can_ load local content in a vendor-hosted index like EDS/Primo/Summon. So 
plenty of people do choose a vendor-hosted index product as their only 
discovery tool, including both local metadata and vendor-provided metadata. As 
you suggest. 

But some people want the increased control that a locally controlled Solr index 
gives you, for the local metadata where it's feasible. So use a local index 
product. But still want the article search you can get with a vendor-hosted 
index product. So they use both.  

There are also at least some reasons to believe that our users don't mind and 
may even prefer having local results and hosted metadata results presented 
separately (although preferably in a consistent UI), rather than 
merged. 

A bunch more discussion of these issues is included in my blog post at: 
http://bibwild.wordpress.com/2012/10/02/article-search-improvement-strategy/

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Wayne Lam 
[wing...@gmail.com]
Sent: Thursday, January 31, 2013 9:31 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Why we need multiple discovery services engine?

Hi all,

I saw in numerous of library website, many of them would have their own
based discovery services (e.g. blacklight / vufind) and at the same time
they will have vendor based discovery services (e.g. EDS / Primo / Summon).
Instead of having to maintain 2 separate systems, why not put everything
into just one? Any special reason or concern?

Best

Wayne


[CODE4LIB] metadata vocab re-use question?

2013-01-29 Thread Jonathan Rochkind
Hello fine code4libbers, I have a technical question about metadata 
vocab reuse, and the best way to do something I'm doing.


I'm working on an API for returning a list of scholarly articles.

I am trying to do as much as I can with already existing technical 
metadata devices.


In general, I am going to do this as an Atom XML response, with some 
'third party' XML namespaces in use too for full expression of what I 
want to express.  Using already existing vocabularies, identified by URI.


In general, this is working fine -- especially using the PRISM 
vocabulary for some scholarly citation-specific metadata elements. Also 
some things that were already part of Atom, and may be a bit of DC here 
or there.


I am generally happy with this approach, and plan to stick to it.

But there are a few places where I am not sure what to do. In general, 
there's a common pattern where I need to express a certain 'element' 
using _multiple_ vocabularies simultaneously (and/or no vocabulary at 
all, free text).


For instance, let's take the (semantically vague, yes) concept of 
type/genre.  I have a schema.org type URI that expresses the 'type'.  I 
can _also_ express the 'type' using the dcterms 'type' vocabulary. I 
could theoretically have a couple more format/type vocabularies I'd like 
to expose, but let's stop there as an example. And on top of this, I 
_also_ have a free text 'type' string (which may or may not be derivable 
from the controlled vocabs), which I'd like to make available to API 
consumer too.


Any individual item may have some, all, or none of these data associated 
with it.


Now, the dcterms 'type' element is capable of holding any or all of 
these. http://dublincore.org/documents/dcmi-terms/#terms-type


Recommended best practice is to use a controlled vocabulary such as the 
DCMI Type Vocabulary [DCMITYPE]. To describe the file format, physical 
medium, or dimensions of the resource, use the Format element.


See, _recommended_ is to use a controlled vocab _such as_ DCMI Type 
Vocab, but this makes it clear you can also use the 'type' element for 
another controlled vocab, or no controlled vocab at all.


So it's _legal_ to simply do something like this:

<!-- schema.org: -->
<dcterms:type>http://schema.org/ScholarlyArticle</dcterms:type>

<!-- dcterms type vocab: -->
<dcterms:type>http://purl.org/dc/dcmitype/Text</dcterms:type>

<!-- free text not from a controlled vocab: -->
<dcterms:type>Scholarly Book Review</dcterms:type>



And I've been the _consumer_ of API's which do something like that: just 
throw a grab bag of different things into repeated dcterms:type 
elements, including URIs representing values from different vocabs, and 
free text.  They figure, hey, it's legal to use dcterms:type that way 
according to the docs for the dcterms vocab.


And as a consumer of services that do that... I do not want to do it. It 
is too difficult to work with as a consumer, when you don't know what 
the contents of a dcterms:type element might be, from any vocab, or none 
at all. It kind of ruins the utility of the controlled vocabs in the 
first place, or requires unreasonably complex logic on the client side.


So. Another idea that occurs is just to add some custom attributes to 
the dcterms:type element.


<dcterms:type vocab="schema.org">http://schema.org/ScholarlyArticle</dcterms:type>
<dcterms:type vocab="dcterms">http://purl.org/dc/dcmitype/Text</dcterms:type>

<dcterms:type>Scholarly Book Review</dcterms:type>

Now at least the client can much more easily write logic for "Is there 
a dcterms value? If so, what is it?"
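A sketch of what that client logic might look like in Python, using the (again hypothetical, non-standard) vocab attribute:

```python
import xml.etree.ElementTree as ET

DCTERMS = "{http://purl.org/dc/terms/}"

# Same hypothetical record, but with the proposed vocab attribute
# on each controlled-vocab dcterms:type:
doc = ET.fromstring(
    '<record xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dcterms:type vocab="schema.org">'
    'http://schema.org/ScholarlyArticle</dcterms:type>'
    '<dcterms:type vocab="dcterms">'
    'http://purl.org/dc/dcmitype/Text</dcterms:type>'
    '<dcterms:type>Scholarly Book Review</dcterms:type>'
    '</record>'
)

# Dispatch on the attribute instead of sniffing values; anything
# without a vocab attribute is treated as uncontrolled free text.
by_vocab = {}
for el in doc.findall(DCTERMS + "type"):
    by_vocab.setdefault(el.get("vocab", "uncontrolled"), []).append(el.text)

print(by_vocab)
```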


But I can't really tell if this is legal or not -- attributes are 
handled kind of inconsistently by various XML validators. Maybe I'd need 
to namespace the attribute with a custom namespace too:


... xmlns:mine="http://example.org/vocab" ...

<dcterms:type mine:vocab="schema.org">http://schema.org/ScholarlyArticle</dcterms:type>


But namespaces on attributes are handled _very_ inconsistently and 
buggily by various standard XML parsing libraries I've used, so I don't 
really want to do that, it's going to make things too hard on the client 
to use namespaced attributes.
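For what it's worth, here is how one standard library, Python's ElementTree, happens to expose a namespaced attribute: you have to ask for it by the expanded "{uri}localname" form, and the bare local name finds nothing. Other parsers behave differently, which is exactly the inconsistency at issue. (Sample record is hypothetical.)

```python
import xml.etree.ElementTree as ET

MINE = "{http://example.org/vocab}"

doc = ET.fromstring(
    '<record xmlns:dcterms="http://purl.org/dc/terms/" '
    'xmlns:mine="http://example.org/vocab">'
    '<dcterms:type mine:vocab="schema.org">'
    'http://schema.org/ScholarlyArticle</dcterms:type>'
    '</record>'
)

el = doc.find("{http://purl.org/dc/terms/}type")

# A namespaced attribute must be fetched by its expanded name;
# the bare local name returns nothing.
print(el.get("vocab"))           # None
print(el.get(MINE + "vocab"))    # schema.org
```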


But I kind of like the elegance of that 'add attributes to dcterms:type' 
approach. I suppose you could even use full URIs instead of random terms 
to identify the vocab, for the elegance of it:


<dcterms:type vocab="http://schema.org">http://schema.org/ScholarlyArticle</dcterms:type>
<dcterms:type vocab="http://dublincore.org/documents/dcmi-terms">http://purl.org/dc/dcmitype/Text</dcterms:type>


But another option, especially if that isn't legal,  is to give up 
dcterms entirely and use only my own custom namespace/vocab for 'type' 
elements:


<mine:schema-type>http://schema.org/ScholarlyArticle</mine:schema-type>
<mine:dcterms-type>http://purl.org/dc/dcmitype/Text</mine:dcterms-type>
<mine:uncontrolled-type>Scholarly Book Review</mine:uncontrolled-type>


Which is kind of 'inelegant', but would probably work fine too. 
Realistically, any consumer of my response is 

Re: [CODE4LIB] Group Decision Making (was Zoia)

2013-01-25 Thread Jonathan Rochkind
  "The best way, in my mind, 
is to somehow create a culture where someone can say: 'you know, I'm not 
ok with that kind of remark' and the person spoken to can respond 'OK, 
I'll think about that.'" 

I think that's a really good culture to try to create; Karen says it just right.  Note 
that "OK, I'll think about it" is neither "No, you must be mistaken" nor "Okay, 
I will immediately do whatever you ask of me."  But it does need to be a 
legitimate actual "I'll think about it," seriously. 

The flip side is that the culture is also one where, when someone says "you 
know, I'm not ok with that kind of remark," it often means "And I'd like you to 
think about that, in a real serious way" rather than "And I expect you to 
immediately change your behavior to accede to my demands."

Of course, what creates that, from both ends, is a culture of trust.  Which I 
think code4lib actually has a pretty decent dose of already; let's try 
to keep it that way. (In my opinion, one way we keep it that way is by 
continuing to resist becoming a formal rules-based bureaucratic organization, 
rather than a community based on social ties and good faith). 

Now, at some times it might really be necessary to say "And I expect you to 
immediately stop what you're doing and do it exactly like I say."  Other times 
it's not.  But in our society as a whole, we are so trained to think that 
everything must be rules-based rather than based on good faith trust between 
people who care about each other, that we're likely to assume that "you know, 
I'm not ok with that remark" ALWAYS implies "And therefore I think you are an 
awful person, and your only hope of no longer being an awful person is to 
immediately do exactly what I say."  Rather than "And I expect you to think 
about this seriously, and maybe get back to me on what you think."  So if you 
do mean the second one when saying "you know, I'm not ok with that remark," it 
can be helpful to say so, to elicit the self-reflection you want, rather than 
defensiveness.  And of course, on the flip-side, it is obviously helpful if you 
can always respond to "you know, I'm really not okay with that!" 
with reflection, rather than defensiveness. 

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen Coyle 
[li...@kcoyle.net]
Sent: Friday, January 25, 2013 12:22 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Group Decision Making (was Zoia)

On 1/24/13 3:09 PM, Shaun Ellis wrote:


 To be clear, I am only uncomfortable with "uncomfortable" being used
 in the policy because I wouldn't support it being there. Differing
 opinions can make people uncomfortable.  Since I am not going to stop
 sharing what may be a dissenting opinion, should I be banned?

I can't come up with a word for it that is unambiguous, but I can
propose a scenario. Imagine a room at a conference full of people -- and
that there are only a few people of color. A speaker gets up and shows
or says something racist. It may be light-hearted in nature, but the
people of color in that almost-all-white audience feel
uncomfortable/insulted/discriminated against.

I had a great example that I can no longer find -- I think it came
through on Twitter. It showed a fake ad with an image of border patrol
agents rounding up illegal aliens in the desert, and used the ad copy
"We can take care of all of your papers" as the ad line for a business
computing company. It's a joke that you can almost imagine someone
actually doing. Any latinos in the audience would be within their rights
to jump up and shout at the speaker, but in fact sexism and racism
work precisely because people struggling for equal status are least
likely to gain that status if they speak up against the status quo. What
I think we want to change is the social acceptance of speaking up.

There's a difference between an intellectual disagreement (I think the
earth is round/I think the earth is flat) and insulting who a person is
as a person. The various *isms* (sexism, racism, homophobia) have a
demeaning nature, and there is an inherent lowering of status of the
targeted group. Booth babes at professional conferences are demeaning to
women because they present women as non-professional sex objects, and
that view generally lowers the social and intellectual status of women
in the eyes of attendees, including the professional women who are
attending. Because of this, many conferences now ban booth babes. No
conference has banned discussion of alternate views of the universe.

It's hard to find a balance between being conscious of other people's
sensibilities and creating a chilling effect. The best way, in my mind,
is to somehow create a culture where someone can say: "you know, I'm not
ok with that kind of remark" and the person spoken to can respond "OK,
I'll think about that." If, however, every "I'm not ok" becomes a
battle, then we aren't doing it right. The reason why it shouldn't be a
battle is that there is no absolute right or 

Re: [CODE4LIB] Zoia

2013-01-24 Thread Jonathan Rochkind
 "Any group decision in the past has been done via diebold-o-tron." 

No, this is not true, that any group decision has been done via online vote. 
Or it's true only in the sense that one only considers it a 'group decision' if 
it was done by online vote. 

The ONLY decisions that have been done by online vote are about the conference, 
and specifically: which presentations to include on the program, which keynote 
speakers are preferred, and which hosting proposal gets the conference. 

To my knowledge, no other decision about code4lib has ever been made by online 
vote.  

I suppose you could say that this means that no other 'group decisions' have 
ever been made, and yet still a healthy (?) community was formed, which many 
have found rewarding to participate in, and which some find so valuable that 
they think it's worth spending their time on improving it.  Just don't 
improve it into something that's no longer what people found rewarding and 
valuable in the first place, maybe. 


Re: [CODE4LIB] Group Decision Making (was Zoia)

2013-01-24 Thread Jonathan Rochkind

On 1/24/2013 5:32 PM, Gary McGath wrote:

A non-organization without a defined membership can't have votes on
anything.


Sure it can, we've DONE it. How can we have done something impossible?

But we do it when we think it's the best way to proceed, the most 
efficient way of arriving at the best decisions we can.  It's, to 
many/most of us, clearly not here. I agree with Deborah Fitchett:


 There's a code of conduct which has been developed the way Code4Lib 
develops things: i.e., the work's been done by people who're interested in 
doing the work. What's special about anti-harassment that it alone 
should bear the burden of bureaucracy?


People who think nothing exists unless it's formally/legally organized 
with a defined membership think Code4Lib doesn't even EXIST.  But 
obviously we do exist!   And obviously we do things!


And we have some problems, like any community, and we're trying to 
address some of them. But I don't think I've seen anyone suggest that we 
as a community are so fundamentally problematic that our very nature 
needs to be fundamentally changed to address it.  Generally, most of the 
people, even those pointing out problems, like Code4lib  -- otherwise, 
why would they care to spend time fixing it?


Re: [CODE4LIB] Zoia

2013-01-22 Thread Jonathan Rochkind

I agree with Ed.

Thanks to whoever removed the 'poledance' plugin (REALLY? that existed? 
if it makes you feel any better, I don't think anyone who hangs out in 
#code4lib even knew it existed, and it never got used).


It's certainly possible that there are or will be other individual 
features that are, well, just plain rude and offensive, and should be 
removed.


But in general, I think it would be a HUGE mistake to think that all 
personality, frivolity, or 'subcultural' elements should be removed from 
all things #code4lib in the name of 'accessibility'.  Whatever it is 
about code4lib that has made it 'successful' is in large part the 
fact that it IS a social community with cultural features. If you 
try to remove all those, you are removing what makes code4lib what it 
is, you are removing whatever you liked about it in the first place.


If you want online or offline venues that are all-business-all-the-time 
with no social subcultural aspects, there are plenty of others already, 
you don't need to make code4lib into one. If you find those plenty of 
others not as useful or rewarding as code4lib -- well, I suggest the 
reason for that has a lot to do with the social community aspects of 
code4lib. YES, the social subcultural aspects WILL turn some people off, 
it's true, but by trying to remove them, you wind up with something that 
doesn't rub people the wrong way and doesn't rub anyone the right way 
either.


On 1/22/2013 1:25 PM, Edward M. Corrado wrote:

On Fri, Jan 18, 2013 at 5:37 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:

In every noisy forum that I participate in (BTW, none of them are tech or
even work related), there are always people who dislike the noise. The
concerns are analogous to the ones expressed here -- irritation  factor, it
keeps people away, it's all about the in crowd, etc. Likewise, the
proposed solutions are similar to ones that have been floated here like
directing the noisemaking from the main group elsewhere or silencing it.

For things to work, everyone needs a reason to be there. People with less
experience need access to those who have been around the block. But a diet
of repetitive shop talk isn't very interesting for people who have a decent
handle on what they're doing. They need something else to keep them there,
and in the final analysis, many come for entertainment -- this normally
manifests itself in the form of high noise levels. But even if people spend
the vast bulk of the time playing around, nuggets of wisdom are shared. And
if something's truly serious, it gets attention.

It's far better to help people learn to tune out what they don't like, and
this is much easier to do in c4l than in communities where interaction is
primarily physical. All communities have their own character and
communication norms. It's important for people to be mindful of the
environment they're helping create, but reducing communication to help
avoid exposing people to annoyances screws things up.

In all honesty, I think the silliness on the sidelines is far more
important than the formal stuff. I know I learn a lot more while goofing
off than in formal channels for pretty much everything I do.

kyle


+1

I'm all for removing specific offensive responses and commands as some
others have suggested, but I agree that trying to remove some of the
lighter stuff will, in the long term, be more likely to be detrimental
than positive.




Re: [CODE4LIB] ulrich's api?

2013-01-02 Thread Jonathan Rochkind
Ah, did you find docs in the SS Support Center that cover how to access 
the API and what its functionality is?  Have any direct links to such?


Yeah, last time I asked SerSol (a couple years ago), the XML data 
service was all that was available -- and not only does it cost extra, 
it is actually VERY expensive (I think it's targeted at other 
vendor-like users, who will basically be reselling the data).


So yeah, I'm curious about this newer one too!

On 1/2/2013 10:53 AM, Van Mil, James (vanmiljf) wrote:

I just checked the SS Support Center and it is included with a sub to 
Ulrichsweb.

I had confused this with the XML data service (which *does* cost extra), so I 
never followed up. Thanks for mentioning!

-James

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ranti 
Junus
Sent: Thursday, December 27, 2012 4:34 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] ulrich's api?

Huh, that's a good question. We happen to subscribe to Ulrich and I don't know 
if it's included by default. I didn't even know about this API until Andrew 
Nagy mentioned it when I visited their booth at ALA conference to discuss 
something else.

In my experience, after I signed the document and sent it back, they sent me 
back a link to the documentation page along with the login information.

I have not been using it, regrettably. This is something I put on my goals and 
objectives for the coming year.



On Thu, Dec 27, 2012 at 4:15 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


Thanks! It is indeed something included with library's ulrich's
subscription?

Do they send you documentation too?

Have you been using it? Feel like giving us a brief review of what it
does and how well it works?


On 12/27/2012 3:48 PM, Ranti Junus wrote:


Hi Jonathan,

The Ulrich XML API is already in place.  You just need to contact
their support team through their support form to get the access. They
will send you a Terms of Use document to sign and send back to them
(it might involve a fax machine. ;-) )


ranti.


On Thu, Dec 27, 2012 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu

wrote:


  Hi Code4lib'ers.


The SerSol Ulrich's marketing page at:

http://www.serialssolutions.com/en/services/ulrichs/ulrichsweb





Says:


New API for Easy Integration
A new API with XML and JSON options allow librarians and technical
staff to easily integrate Ulrich’s data into their library’s web
pages and discovery services in order to provide researchers and
staff with reliable, continuously updated information about
electronic and print serials.
*

This implies that there may be an Ulrich's API that comes with
library licensing of Ulrich's? (And that was 'new' whenever this
online brochure was written, heh, who knows how new that is now,
there's no date on the page).

Does anyone know anything about this? Or where more info about this
might be found?  Or a good contact at SerSol/Ulrich's to ask about it?

Jonathan









--
Bulk mail.  Postage paid.




Re: [CODE4LIB] directing users to mobile DBs, was RE: [CODE4LIB] Responsive Web Site Live

2013-01-02 Thread Jonathan Rochkind

What method do you use to detect mobile-or-not?

On 1/2/2013 3:33 PM, Ken Irwin wrote:

Sarah asks about how to direct users to mobile versions of databases where 
appropriate.

The way I'm doing it is:
1. All database links are served up from a database table, so the link on our 
website is http://$OUR_LIBRARY/redirect?$db_id
2. The db-of-dbs knows if there is a mobile specific url (because we put it 
there...)
3. Detect mobile-or-not as a binary value
4. Serve up the right one as an HTTP header redirect

One big exception: EBSCO (which provides a really large number of our 
databases) handles their mobile access by using the same URL with a different 
profile name in the url. The redirect script has a special case that says if 
($mobile = true and $ebsco = true) { do string replace on the url to change 
from the desktop url to the mobile url } -- so I don't have to list both 
versions of the URL in the database.

It seems to work out pretty well.

Ken

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Sarah 
Dooley
Sent: Wednesday, January 02, 2013 3:25 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Responsive Web Site Live

Very cool--congratulations!

In addition to Dave's questions, I'd be curious to know (can't see it since I 
don't have a login) how you handled directing people to databases that have 
mobile versions. This is something I've been wondering about for our site down 
the road and library sites in general--from a responsive site, how to 
effectively link people out to vendor-provided resources that are either mobile 
or non-mobile.

-Sarah Dooley
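Ken's scheme above (the four steps plus the EBSCO special case) might sketch out roughly like this in Python; the table contents, profile names, and URLs here are all made up for illustration, since his actual implementation isn't shown:

```python
# Hypothetical db-of-dbs table: each entry records a desktop URL, an
# optional mobile URL, and whether the resource is an EBSCO database.
DB_OF_DBS = {
    7:  {"url": "http://example.com/desktop",
         "mobile_url": "http://m.example.com/", "is_ebsco": False},
    42: {"url": "http://search.example.com/login.aspx?profile=ehost",
         "mobile_url": None, "is_ebsco": True},
}

def redirect_target(db_id, is_mobile):
    """Pick the URL to send in the HTTP redirect for this database."""
    db = DB_OF_DBS[db_id]
    if not is_mobile:
        return db["url"]
    if db["is_ebsco"]:
        # EBSCO uses the same URL with a different profile name, so
        # string-replace rather than storing both URLs in the table.
        return db["url"].replace("profile=ehost", "profile=mobile")
    # Fall back to the desktop URL when no mobile version is recorded:
    return db["mobile_url"] or db["url"]

print(redirect_target(7, True))
print(redirect_target(42, True))
```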




Re: [CODE4LIB] directing users to mobile DBs, was RE: [CODE4LIB] Responsive Web Site Live

2013-01-02 Thread Jonathan Rochkind

Ah, but this still doesn't answer my question on your part, Mark!

How do you detect browser width, especially on the server-side?

If it's with Javascript... the method Ken describes, it's not clear to 
me how javascript logic could get in there exactly.


Thus my question.

On 1/2/2013 3:51 PM, Mark Pernotto wrote:

I'd be curious to hear the response to Jonathan's question.  For the
longest time, I used to determine mobile  displays by browser, but it
just got too cluttered.  Now I detect browser width to determine
mobile versions.  This little trick doesn't play nice with all
frameworks, however, so it's not bullet-proof, but so far, it has
worked well.  And on a high level, easy to troubleshoot.

It wasn't immediately apparent to me if this was a part of a CMS or
not - it's awfully clean, and the usual Joomla/Drupal/Wordpress
identities weren't visible in the source.  Really nice work!

Thanks,
Mark


On Wed, Jan 2, 2013 at 12:36 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

What method do you use to detect mobile-or-not?


On 1/2/2013 3:33 PM, Ken Irwin wrote:


Sarah asks about how to direct users to mobile versions of databases where
appropriate.

The way I'm doing it is:
1. All database links are served up from a database table, so the link on
our website is http://$OUR_LIBRARY/redirect?$db_id
2. The db-of-dbs knows if there is a mobile specific url (because we put
it there...)
3. Detect mobile-or-not as a binary value
4. Serve up the right one as an HTTP header redirect

One big exception: EBSCO (which provides a really large number of our
databases) handles their mobile access by using the same URL with a
different profile name in the url. The redirect script has a special case
that says if ($mobile = true and $ebsco = true) { do string replace on the
url to change from the desktop url to the mobile url } -- so I don't have to
list both versions of the URL in the database.

It seems to work out pretty well.

Ken

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Sarah Dooley
Sent: Wednesday, January 02, 2013 3:25 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Responsive Web Site Live

Very cool--congratulations!

In addition to Dave's questions, I'd be curious to know (can't see it
since I don't have a login) how you handled directing people to databases
that have mobile versions. This is something I've been wondering about for
our site down the road and library sites in general--from a responsive site,
how to effectively link people out to vendor-provided resources that are
either mobile or non-mobile.

-Sarah Dooley









Re: [CODE4LIB] directing users to mobile DBs, was RE: [CODE4LIB] Responsive Web Site Live

2013-01-02 Thread Jonathan Rochkind
I don't want to do the registration just to see how it works... but I 
assume it's doing user-agent detection?


Have you had issues with newly invented mobile browsers not being 
caught? Do you ever update your PHP script with a new updated copy from 
the author or anything?


On 1/2/2013 3:55 PM, Ken Irwin wrote:

I use the PHP code from: http://detectmobilebrowsers.mobi/
(free for personal and non-profit use)

Ken

-Original Message-
From: Jonathan Rochkind [mailto:rochk...@jhu.edu]
Sent: Wednesday, January 02, 2013 3:36 PM
To: Code for Libraries
Cc: Ken Irwin
Subject: Re: [CODE4LIB] directing users to mobile DBs, was RE: [CODE4LIB] 
Responsive Web Site Live

What method do you use to detect mobile-or-not?




[CODE4LIB] ulrich's api?

2012-12-27 Thread Jonathan Rochkind

Hi Code4lib'ers.

The SerSol Ulrich's marketing page at:

http://www.serialssolutions.com/en/services/ulrichs/ulrichsweb

Says:


New API for Easy Integration
A new API with XML and JSON options allow librarians and technical staff 
to easily integrate Ulrich’s data into their library’s web pages and 
discovery services in order to provide researchers and staff with 
reliable, continuously updated information about electronic and print 
serials.

*

This implies that there may be an Ulrich's API that comes with library 
licensing of Ulrich's? (And that was 'new' whenever this online brochure 
was written, heh, who knows how new that is now, there's no date on the 
page).


Does anyone know anything about this? Or where more info about this 
might be found?  Or a good contact at SerSol/Ulrich's to ask about it?


Jonathan


Re: [CODE4LIB] ulrich's api?

2012-12-27 Thread Jonathan Rochkind
Thanks! It is indeed something included with library's ulrich's 
subscription?


Do they send you documentation too?

Have you been using it? Feel like giving us a brief review of what it 
does and how well it works?


On 12/27/2012 3:48 PM, Ranti Junus wrote:

Hi Jonathan,

The Ulrich XML API is already in place.  You just need to contact their
support team through their support form to get the access. They will send
you a Terms of Use document to sign and send back to them (it might involve
a fax machine. ;-) )


ranti.


On Thu, Dec 27, 2012 at 12:22 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


Hi Code4lib'ers.

The SerSol Ulrich's marketing page at:

http://www.serialssolutions.com/en/services/ulrichs/ulrichsweb

Says:


New API for Easy Integration
A new API with XML and JSON options allow librarians and technical staff
to easily integrate Ulrich’s data into their library’s web pages and
discovery services in order to provide researchers and staff with reliable,
continuously updated information about electronic and print serials.
*

This implies that there may be an Ulrich's API that comes with library
licensing of Ulrich's? (And that was 'new' whenever this online brochure
was written, heh, who knows how new that is now, there's no date on the
page).

Does anyone know anything about this? Or where more info about this might
be found?  Or a good contact at SerSol/Ulrich's to ask about it?

Jonathan







Re: [CODE4LIB] Question abt the code4libwomen idea

2012-12-18 Thread Jonathan Rochkind

On 12/18/2012 12:27 PM, MJ Ray wrote:

Is there clarity that deliberately-discriminatory groups should have
no platform in code4lib?


If what you mean is "if everyone agrees with you that a group created for 
women in tech is bad", then, no, pretty much nobody else here agrees with 
you.


I am not sure if I'd call such a group "deliberately discriminatory," 
nor am I sure what qualifies as "platform in code4lib," but for what 
you're really getting at, no, there is no clarity there, pretty much 
nobody else agrees with you there.


Re: [CODE4LIB] code4lib.org domain

2012-12-18 Thread Jonathan Rochkind
I definitely see what you're saying, but think there are pro's and con's 
both ways.


OSU is already responsible for the bulk of our infrastructure too, 
adding the DNS would be minor.


But there are definitely pro's (as well as con's) to individual and/or 
non-institutional ownership/responsibility/management, compared to 
institutional.


In the end, as with much of Code4Lib, as with many volunteer projects -- 
what it comes down to is who's offering to volunteer to do it. OSU is 
offering to volunteer to do it (and pay for it, apparently?), and we 
obviously find OSU to be generally responsible, since they host the rest 
of our infrastructure.


Someone offering to do it right now, someone we find generally 
responsible -- always beats the hypothetical other solution that has 
nobody actually volunteering to do it.


So, Wilhelmina, are you volunteering to run the DNS instead? :) (and pay 
for it, or fundraise to pay for it)  If you are, then we might have two 
options. Otherwise, we've got one, and no reason to reject it unless we 
thought OSU was not trustworthy with the responsibility or something 
(which if we did, would be a big problem, since they are already responsible 
for a lot more than that).


On 12/18/2012 4:34 PM, Wilhelmina Randtke wrote:

I'm for individual ownership and management over organizational.
Organizations tend to not have written documentation, and to rely on
institutional memory.  I see two things going wrong:  Contact at OSU leaves
OSU and no one thinks to renew domain, or OSU doesn't have a dedicated
contact and at some point they don't renew because they don't see the value.

Also important:  OSU is on state funding cycles, so may have some rule
against renewing for more than a year at a time.  So, the deadline to renew
will come more frequently than it would with unrestricted funds and the
ability to renew for 5 or 10 years at a time.

When the domain expires, it will go into a redemption period of about a
month.  I remember what the whois record looks like for domains in the
redemption period, and whois does give the contact information.  Does the
URL stop working during this period?  If so, then that's great because if
there is a problem with a renewal then many people will notice the URL not
working, and be able to check the status of the domain and get on it.

-Wilhelmina Randtke


On Tue, Dec 18, 2012 at 2:32 PM, Ed Summers e...@pobox.com wrote:


HI all,

I've owned the code4lib.org domain since 2005 and have been thinking it might
be wise to transfer ownership of it to someone else. Sometimes I
forget to pay bills, and miss emails, and it seems like the domain
means something to a larger group of people.

With Ryan Ordway's help Oregon State University indicated they would
be willing to take over administration of the domain. They also have
been responsible for running the Drupal instance at code4lib.org and
the Mediawiki instance at wiki.code4lib.org -- so it seems like a
logical move.

But I thought I would bring it up here first in the interests of
transparency, community building and whatnot, to see if there were any
objections or ideas.

//Ed






[CODE4LIB] Chrome browser preserves wrong text box on back button

2012-12-18 Thread Jonathan Rochkind

Okay, this problem is hard to explain.

Let's say I have a search box, with results under it.

I enter "Monkey" in the search box, I hit search, I get a new page 
with results for "Monkey", and the word "Monkey" pre-filled in
the search box (using <input value="Monkey">) in the HTML.

I decide I'm not happy with this search, I change the words in the 
search box to "Baboon", and I hit the submit button again.


Then I press the browser back button to go back to "Monkey".

Now I'm looking at a page that has results for "Monkey", has an input 
with default value="Monkey" -- but where Chrome has 'helpfully' filled 
the textbox in with "Baboon" instead, trying to help by 'remembering' 
what I had entered there when I was first looking at the page and 
pressed submit.


This is not helpful.

I am not sure if other browsers will do this too.

Does anyone know any way to get the browser to NOT do this? If you 
understand what I'm talking about? (autocomplete="false" does not seem to 
have an effect).


Re: [CODE4LIB] Chrome browser preserves wrong text box on back button

2012-12-18 Thread Jonathan Rochkind
You know how you figure out the answer right after you write out the 
question?


autocomplete="false" didn't work, because that's not the proper value.

autocomplete="off" does work. While also of course disabling actual 
auto-complete, which may or may not be helpful, but at least it keeps 
Chrome from doing this weird thing where it remembers your entry on 
browser back button when it ought not to.


On 12/18/2012 6:48 PM, Jonathan Rochkind wrote:

Okay, this problem is hard to explain.

Let's say I have a search box, with results under it.

I enter "Monkey" in the search box, I hit search, I get a new page
with results for "Monkey", and the word "Monkey" pre-filled in
the search box (using <input value="Monkey">) in the HTML.

I decide I'm not happy with this search, I change the words in the
search box to "Baboon", and I hit the submit button again.

Then I press the browser back button to go back to "Monkey".

Now I'm looking at a page that has results for "Monkey", has an input
with default value="Monkey" -- but where Chrome has 'helpfully' filled
the textbox in with "Baboon" instead, trying to help by 'remembering'
what I had entered there when I was first looking at the page and
pressed submit.

This is not helpful.

I am not sure if other browsers will do this too.

Does anyone know any way to get the browser to NOT do this? If you
understand what I'm talking about? (autocomplete="false" does not seem to
have an effect).



Re: [CODE4LIB] Question abt the code4libwomen idea

2012-12-18 Thread Jonathan Rochkind
So far some brave folks have indeed indicated that, but without 
specifying any particular incidents.


It seems to me it might be helpful if the actual incidents were related 
in some anonymous way (perhaps anonymous both to reporter and to 
'offenders' involved)... because if the rest of us knew what was going 
on, we could be more alert to seeing it and stopping it (including 
possibly observing such behavior in ourselves and stopping ourselves from 
doing it, now that we realize how hurtful it can be).


I realize some people have related incidents that happened at places 
other than code4lib, and perhaps that ought to be sufficient, but, 
clearly, many of us can think "Oh, but that probably doesn't happen at 
Code4Lib," even if it does.


I also realize that this can quickly turn into a giant mess, which is 
why I'd suggest that any such stories be very vague and entirely 
anonymous as to all parties involved, to make this not a tribunal about 
particular incidents but just information sharing about "Here are some 
things that have happened at code4lib related to gendery stuff, that 
made some people uncomfortable, just so you know what we're talking about."


There doesn't need to be ANY discussion of the issues, and I think 
probably best if there isn't actually.


But honestly, I've been scratching my head since Bess first brought this 
up, and Bess mentioned that harassment-y incidents have happened at 
code4lib, and I'm thinking "Really? I haven't heard of them or seen 
them. Am I just really unobservant? Or am I seeing things but not 
realizing they are offensive? Or what?"


I think it would be helpful to all of us wanting to stop such things 
from happening to know a _bit_ more specifically what sorts of things 
have happened.


Is this a good idea, or just a disaster trainwreck lying in wait? If 
it's a good idea, we could easily set up a wiki page where people can 
easily anonymously describe incidents (again, what I'm going for is NOT 
calling specific people out, but just giving us an idea of what it is 
that has happened that we're trying to stop from happening, you know?)




On 12/18/2012 6:41 PM, BWS Johnson wrote:

Salvete!



because they can't find an SO are outliers. C4l is a tech event. Do
women really get treated that shabbily there?



I'm guessing this is a yes, since several brave folks have indicated
it. It doesn't mean that *you* are an offender, but it's clearly
happening, or at least known to have happened in the past.

Cheers, Brooke




Re: [CODE4LIB] Blacklight implementation at United States Holocaust Memorial Museum

2012-12-11 Thread Jonathan Rochkind
Just curious, did you use Hydra for this project, or just straight 
Blacklight without Hydra?


Especially if not Hydra, what tools did you end up using for indexing your 
content into Solr? (Only SolrMarc, since all your content was already 
available in MARC?)


On 12/11/2012 11:10 AM, Levy, Michael wrote:

I posted the message below on the Blacklight Development group, and I was
encouraged to share with code4lib, so I'm reposting with some minor edits:

I'd like to share a Blacklight implementation at the United States
Holocaust Memorial Museum that is available at
http://collections.ushmm.org/search It's been in use in-house for about a
year, with constant improvements and additions.

First, a tremendous thanks and kudos to all of the people involved in the
Blacklight project. I'm so grateful to everyone who worked on the project
and to those who have helped me with Blacklight, Ruby on Rails, and
SolrMarc.

The various collecting units at the Museum use very different fields,
labels, vocabularies, and spellings. I had a lot of fun mapping them and
thinking about what sorts of fields might work together for searching. The
catalog records sources include: a commercial ILS; a commercial collections
management system; two completely custom desktop database applications; a
spreadsheet; and a custom MSSQL database application. In addition, we have
a system that manages digitized assets that supplies some data.

Selecting a project based on Ruby on Rails came at a cost, including the
learning curve involved with RoR and, more so, due to the process of having
RoR established with our IT infrastructure group. (Thanks go to our IT
group as well!)

I looked at some other really fine open source projects as well as
commercial products. Blacklight seemed optimal for our case because it
easily deals with any kind of metadata sources and it was a mature system
with a vibrant user/developer community.

I'll highlight a few interesting features.

Our collections management system supports relationships between records
including parent/child type relationships, e.g. between collection and the
items that comprise it. Here is a collection that has one archival
(document) collection plus several objects:
http://collections.ushmm.org/search/catalog/irn508676
We also have another parent/child type of relationship, where a group at
the Museum catalogs victim or survivor lists. I could import those, and
because there's enough metadata to link to the archival collection they are
part of, I can link them together. For example, this archival collection
http://collections.ushmm.org/search/catalog/irn508286 is linked to a number
of names source catalog records at the bottom, and each of those is linked
to the archival record as its source. These are done by doing a separate
Solr search for each item to see whether it's got a parent or children to
display near the bottom of the record.
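
For readers curious how such a per-record lookup might be wired up, here is a minimal sketch. This is not the actual USHMM code; the field names (`parent_irn`, `id`, `title_display`) are assumptions made up for illustration, and the real schema will differ:

```ruby
# Hypothetical helpers that build the Solr query params for the
# parent/child lookups described above. Pure functions: they only
# construct the params hash; the app's existing Solr connection
# (e.g. via Blacklight) would actually execute the search.

# Params asking Solr for all children that point back at this record.
def child_query_params(parent_id)
  {
    q: "parent_irn:#{parent_id}",  # children carry their parent's id
    fl: 'id,title_display',        # only the fields needed for display
    rows: 100
  }
end

# Params asking Solr for the single parent record, given the parent id
# stored on a child record.
def parent_query_params(parent_id)
  { q: "id:#{parent_id}", rows: 1 }
end
```

In a Blacklight app these extra searches would typically be issued from the catalog controller or a view helper when rendering the detail page, so the linked records can be shown near the bottom of the record.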

Many years ago the Museum developed a geographic database. One area where
the various collecting units catalog disparately is in location naming. I
simply turned the names into a Solr synonyms file and then I highlight the
snippets in the index/list view. So that way, if you searched for L'viv and
you got a hit on Lemberg or Lwow or L'vov, you'd know why you got it. Same
with Munich, München, Muenchen, Munchen, and for Lodz/Litzmannstadt. (Some
day would be nice to have the name expansion be switchable on or off.)
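
For anyone unfamiliar with Solr synonym files, the setup described would look roughly like this. This is a sketch, not the museum's actual configuration; the filename and analyzer placement are assumptions. Each line of the synonyms file groups equivalent terms, and a synonym filter in the field's analyzer chain expands them at index or query time:

```
# synonyms.txt -- each comma-separated line groups equivalent terms
lviv, l'viv, lvov, l'vov, lwow, lemberg
munich, münchen, muenchen, munchen
lodz, łódź, litzmannstadt
```

```
<!-- schema.xml analyzer fragment (illustrative) -->
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
```

With `expand="true"`, a search for any variant matches all the others, which is why a query for L'viv can hit a record cataloged as Lemberg, and highlighting the matched snippet shows the user why.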

Thumbnail (and larger) images from the archival records and objects come
from the collections management system for the Museum objects. Also finding
aids for archival (Document) records are currently managed in the CMS
system as doc, docx, or xls files and are delivered through Blacklight on
the detail page. For the photos and the historical film, the thumbnails
come from other sources based on the two custom desktop databases mentioned
above.

We have thousands of hours of oral history testimony in many languages
viewable from the Blacklight detail page as mp4 or mp3 files. The easiest
way to get to those is by limiting Record Type to Oral History, and Online
to Yes:
http://collections.ushmm.org/search/catalog?f[di_available][]=Yes&f[record_type_facet][]=Oral+History

I welcome feedback regarding the user interface, bug reports, and any other
ideas you have, on the list or offline. (Plus I hope to meet some of you at
code4lib 2013.)

Cheers!


