[CODE4LIB] Code4Lib 2014 Conference: call for t-shirt designs!

2013-12-02 Thread Josh Wilson
Just once a year, an opportunity comes along to emblazon your vision for a
spiffy t-shirt design onto the torsi of the entire Code4Lib community--and
to bask in the fleeting, minimal fame that accompanies the honor of being
selected. That opportunity has come.

We are now accepting design ideas for the Official Code4Lib 2014 Conference
T-Shirt! Submit yours at:
http://wiki.code4lib.org/index.php/2014_t-shirt_design_proposals

All themes and concepts welcome. A great design might reflect our
profession, the Code4Lib community, or the culture of North Carolina's
Research Triangle. Pandering for votes with puns or pop culture references
has also worked splendidly in past years.

Submissions are due January 3. The winning design will be selected via a
community-wide vote in mid-January. Further instructions and information
can be found on the wiki page.

Thanks,
Charlie Morris & Josh Wilson, C4L 2014 T-Shirt Committee


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Bill Dueber
On Sun, Dec 1, 2013 at 7:57 PM, Barnes, Hugh hugh.bar...@lincoln.ac.nz wrote:

 +1 to all of Richard's points here. Making something easier for you to
 develop is no justification for making it harder to consume or deviating
 from well supported standards.



I just want to point out that as much as we all really, *really* want
"easy to consume" and "following the standards" to be the same
thing... they're not. Correct content negotiation is one of those things
that often follows the phrase "all they have to do...", which is always a
red flag, as in "Why give the user different URLs when *all they have to
do is*..." Caching, json vs javascript vs jsonp, etc. all make this
harder. If *all* *I have to do* is know that all the consumers of my data
are going to do content negotiation right, and then I need to get deep into
the guts of my caching mechanism, then set up an environment where it's all
easy to test... well, it's harder.

And don't tell me how lazy I am until you invent a day with a lot more
hours. I'm sick of people telling me I'm lazy because I'm not pure. I
expose APIs (which have their own share of problems, of course) because I
want them to be *useful* and *used. *

  -Bill, apparently feeling a little bitter this morning -




-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library


Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Alexander Duryee
Is it out of the question to extract technical metadata from the
audiovisual materials themselves (via MediaInfo et al)?  It would minimize
the amount of MARC that needs to be processed and give more
accurate/complete data than relying on old cataloging records.
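
For what it's worth, that kind of extraction is only a few lines. A rough
sketch, assuming the pymediainfo wrapper around MediaInfo is installed (the
file path is made up):

from pymediainfo import MediaInfo

def describe(path):
    # pull duration/format details straight from the file itself
    info = MediaInfo.parse(path)
    for track in info.tracks:
        if track.track_type == "General":
            print(path, track.format, track.duration)   # duration in ms
        elif track.track_type == "Video":
            print("  video:", track.width, track.height, track.frame_rate)
        elif track.track_type == "Audio":
            print("  audio:", track.format, track.sampling_rate)

describe("/data/video/access_copy.mp4")   # hypothetical path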


On Mon, Dec 2, 2013 at 12:37 AM, Kelley McGrath kell...@uoregon.edu wrote:

 I wanted to follow up on my previous post with a couple points.

 1. This is probably too late for anybody thinking about applying, but I
 thought there may be some general interest. I have put up some more
 detailed specifications about what I am hoping to do at
 http://pages.uoregon.edu/kelleym/miw/. Data extraction overview.doc is
 the general overview and the other files contain supporting documents.

 2. I replied some time ago to Heather's offer below about her website that
 will connect researchers with volunteer software developers. I have to
 admit that looking for volunteer software developers had not really
 occurred to me. However, I do have additional things that I would like to
 do for which I currently have no funding so if you would be interested in
 volunteering in the future, let me know.

 Kelley
 kell...@uoregon.edu


 On Tue, Nov 12, 2013 at 6:33 PM, Heather Claxton claxt...@gmail.com wrote:
 Hi Kelley,

 I might be able to help in your search.   I'm in the process of starting a
 website that connects academic researchers with volunteer software
 developers.  I'm looking for people to post programming projects on the
 website once it's launched in late January.   I realize that may be a
 little late for you, but perhaps the project you mentioned in your PS
 (clustering based on title, name, date etc.) would be perfect?  The
 one caveat is that the website is targeting software developers who wish to
 volunteer.   Anyway, if you're interested in posting, please send me an
 e-mail at sciencesolved2...@gmail.com. I would greatly appreciate it.
 Oh and of course it would be free to post  :)  Best of luck in your
 hiring process,

 Heather Claxton-Douglas


 On Mon, Nov 11, 2013 at 9:58 PM, Kelley McGrath kell...@uoregon.edu wrote:

  I have a small amount of money to work with and am looking for two people
  to help with extracting data from MARC records as described below. This
 is
  part of a larger project to develop a FRBR-based data store and discovery
  interface for moving images. Our previous work includes a consideration
 of
  the feasibility of the project from a cataloging perspective (
  http://www.olacinc.org/drupal/?q=node/27), a prototype end-user
 interface
  (https://blazing-sunset-24.heroku.com/,
  https://blazing-sunset-24.heroku.com/page/about) and a web form to
  crowdsource the parsing of movie credits (
  http://olac-annotator.org/#/about).
  Planned work period: six months beginning around the second week of
  December (I can be somewhat flexible on the dates if you want to wait and
  start after the New Year)
  Payment: flat sum of $2500 upon completion of the work
 
  Required skills and knowledge:
 
*   Familiarity with the MARC 21 bibliographic format
*   Familiarity with Natural Language Processing concepts (or
  willingness to learn)
*   Experience with Java, Python, and/or Ruby programming languages
 
  Description of work: Use language and text processing tools and provided
  strategies to write code to extract and normalize data in existing MARC
  bibliographic records for moving images. Refine code based on feedback
 from
  analysis of results obtained with a sample dataset.
 
  Data to be extracted:
  Tasks for Position 1:
  Titles (including the main title of the video, uniform titles, variant
  titles, series titles, television program titles and titles of contents)
  Authors and titles of related works on which an adaptation is based
  Duration
  Color
  Sound vs. silent
  Tasks for Position 2:
  Format (DVD, VHS, film, online, etc.)
  Original language
  Country of production
  Aspect ratio
  Flag for whether a record represents multiple works or not
  We have already done some work with dates, names and roles and have a
  framework to work in. I have the basic logic for the data extraction
  processes, but expect to need some iteration to refine these strategies.
 
  To apply please send me an email at kelleym@uoregon explaining why you
  are interested in this project, what relevant experience you would bring
  and any other reasons why I should hire you. If you have a preference for
  position 1 or 2, let me know (it's not necessary to have a preference).
 The
  deadline for applications is Monday, December 2, 2013. Let me know if you
  have any questions.
 
  Thank you for your consideration.
 
  Kelley
 
  PS In the near future, I will also be looking for someone to help with
  work clustering based on title, name, date and identifier data from MARC
  records. This will not involve any direct 

Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Robert Sanderson
Hi Richard,

On Sun, Dec 1, 2013 at 4:25 PM, Richard Wallis 
richard.wal...@dataliberate.com wrote:

  It's harder to implement Content Negotiation than your own API, because
  you get to define your own API whereas you have to follow someone else's
  rules

 Don't wish your implementation problems on the consumers of your data.
 There are [you would hope] far more of them than of you ;-)

 Content-negotiation is an already established mechanism - why invent a
 new, and different, one just for *your* data?


I should have been clearer here that I was responding to the original blog
post.  I'm not advocating arbitrary APIs, but instead just to use link
headers between the different representations.

The advantages are that the caching issues (both browser and intermediate
caches) go away as the content is static, you don't need to invent a way to
find out which formats are available (eg no arbitrary content in a 300
response), and you can simply publish the representations as any other
resource without server side logic to deal with conneg.

The disadvantages are ... none.  There's no invention of APIs, it's just
following a simpler route within the HTTP spec.
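
For example, a minimal client-side sketch of picking up the alternates from
Link headers rather than negotiating (Python requests; the URI is made up):

import requests

resp = requests.get("http://example.org/record/1")   # hypothetical URI

# requests parses the Link header into resp.links, keyed by rel
for rel, link in resp.links.items():
    print(rel, link["url"])

# follow a JSON alternate directly -- a plain, cacheable GET, no Accept header
alt = resp.links.get("alternate")
if alt:
    data = requests.get(alt["url"]).json()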


Put yourself in the place of your consumer having to get their head
 around yet another site specific API pattern.


As a consumer of my own data, I would rather do a simple GET on a URI than
mess around constructing the correct Accept header.



 As to discovering then using the (currently implemented) URI returned from
 a content-negotiated call  - The standard http libraries take care of that,
 like any other http redirects (301,303, etc) plus you are protected from
 any future backend server implementation changes.


No they don't, as there's no way to know which representations are
available via conneg, and hence no automated way to construct the Accept
header.

Rob


[CODE4LIB] marc4j setId/getId

2013-12-02 Thread Kevin S. Clarke
Hi,

I was poking around in MARC4J over the break and I was intrigued by the
setId()/getId() functions on many of the classes.  The documentation says
they're intended to provide a unique number for persistence (vs. hashCode()
which wouldn't necessarily be unique).  I see it was added to the codebase
back in 2006 -
http://marc4j.tigris.org/ds/viewMessage.do?dsForumId=606&dsMessageId=773835

I was just curious if people are using these functions and what they are
using them for.  I'm still working my way through the code, but the methods
I've seen don't include an implementation of how to generate those Longs
(it may be there and I just haven't stumbled across it yet).

Just curious about actual real world uses...

Thanks,
Kevin


[CODE4LIB] Encouraging Innovation and Technology: HHLib9

2013-12-02 Thread Amy Vecchione
Please excuse any duplication as this is being sent to multiple lists. If
you have any questions, please email me! I'm happy to chat about this
amazing online conference - Amy



CALL FOR PRESENTERS

*Encouraging Innovation and Technology: HHLib 9*



LearningTimes invites librarians, library staff, vendors, graduate
students, and developers to submit program proposals related to the topic
of innovative library services for the online conference Encouraging
Innovation and Technology: HHLib 9, to be held February 26-27, 2014.



*Proposals are due December 11, 2013.*



Go to http://www.handheldlibrarian.org/proposal-submissions to submit a
proposal



The Encouraging Innovation and Technology conference will feature
interactive, live online sessions. We are interested in a broad range of
submissions that highlight current, evolving and future issues in library
services, including gamification in libraries, mobile library services,
technological changes, GIS, etc.  We (and the conference attendees) want to
hear about:

o   Your most innovative ideas

o   Your most innovative projects

o   Super successes and super failures

o   Super plans

o   What was learned from your success or failure



Online presentations may be conducted in one of two formats:

 *   a 45-minute live online session (i.e. synchronous webcast)
 *   a 15-minute lightning round presentation

Conference registration fees are waived for speakers. Presenters Are
Expected To:


1. Conduct your session using Adobe Connect (computer,
Internet, mic required)
2. Provide a digital photo of yourself for the conference
website
3. Respond to questions from attendees
4. Attend an online 30-60 minute training on Adobe Connect
prior to the conference



Proposal Submissions:  Submit your proposal by completing the web form at
http://www.handheldlibrarian.org/proposal-submissions



Information about Registration for the conference will be available in
early December.




Amy Vecchione, Digital Access Librarian/Assistant Professor
http://works.bepress.com/amy_vecchione/
Albertsons Library, Boise State University, L212
http://library.boisestate.edu
(208) 426-1625


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Robert Sanderson
On Sun, Dec 1, 2013 at 5:57 PM, Barnes, Hugh hugh.bar...@lincoln.ac.nz wrote:

 +1 to all of Richard's points here. Making something easier for you to
 develop is no justification for making it harder to consume or deviating
 from well supported standards.


I'm not suggesting deviating from well supported standards, I'm suggesting
choosing a different approach within the well supported standard that makes
it easier for both consumer and producer.



 [Robert]
   You can't
  just put a file in the file system, unlike with separate URIs for
  distinct representations where it just works, instead you need server
  side processing.

 If we introduce languages into the negotiation, this won't scale.


Sure, there's situations where the number of variants is so large that
including them all would be a nuisance.  The number of times this actually
happens is (in my experience at least) vanishingly small.  Again, I'm not
suggesting an arbitrary API, I'm saying that there's easier ways to
accomplish the 99% of cases than conneg.



 [Robert]
  This also makes it much harder to cache the
  responses, as the cache needs to determine whether or not the
  representation has changed -- the cache also needs to parse the
  headers rather than just comparing URI and content.

 Don't know caches intimately, but I don't see why that's algorithmically
 difficult. Just look at the Content-type of the response. Is it harder for
 caches to examine headers than content or URI? (That's an earnest, perhaps
 naïve, question.)

 If we are talking about caching on the client here (not caching proxies),
 I would think in most cases requests are issued with the same Accept-*
 headers, so caching will work as expected anyway.


I think Joe already discussed this one, but there's an outstanding conneg
caching bug in Firefox, and it took even Squid a long time to implement
content-negotiation-aware caching.  Also note, "much harder" not
"impossible" :)

No Conneg:
* Check if we have the URI. Done. O(1) as it's a hash.

Conneg:
* Check if we have the URI. Parse the Accept headers from the request.
 Check if they match the cached content and don't contain wildcards.
 O(quite a lot more than 1)
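
As a toy illustration of the difference (Python, purely illustrative, not any
real cache's code):

def cache_key_plain(uri):
    # no conneg: the URI is the whole cache key, one hash lookup
    return uri

def cache_key_conneg(uri, accept_header):
    # conneg: the Accept header has to be parsed and normalised first,
    # and wildcards defeat a simple equality check altogether
    media_ranges = []
    for item in accept_header.split(","):
        media_range, _, _params = item.strip().partition(";")
        if "*" in media_range:
            return None   # can't build a stable key from a wildcard
        media_ranges.append(media_range.strip())
    return (uri, tuple(sorted(media_ranges)))

print(cache_key_plain("/record/1.json"))
print(cache_key_conneg("/record/1", "application/json;q=0.9, */*;q=0.1"))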



 [Robert]
  Link headers
  can be added with a simple apache configuration rule, and as they're
  static are easy to cache. So the server side is easy, and the client
 side is trivial.

 Hadn't heard of these. (They are on Wikipedia so they must be real.) What
 do they offer over HTML link elements populated from the Dublin Core
 Element Set?


Nothing :) They're link elements in a header so you can use them in
non-HTML representations.


My two cents, whatever it's worth ... great topic, though, thanks Robert :)


Welcome :)

Rob


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Simeon Warner

On 12/2/13 10:50 AM, Robert Sanderson wrote:

On Sun, Dec 1, 2013 at 4:25 PM, Richard Wallis 
richard.wal...@dataliberate.com wrote:

As to discovering then using the (currently implemented) URI returned from
a content-negotiated call  - The standard http libraries take care of that,
like any other http redirects (301,303, etc) plus you are protected from
any future backend server implementation changes.


No they don't, as there's no way to know which representations are
available via conneg, and hence no automated way to construct the Accept
header.


To me this is the biggest issue with content negotiation for machine 
APIs. What you get may be influenced by the Accept headers you send, but 
without detailed knowledge of the particular system you are interacting 
with you can't predict what you'll actually get.


Cheers,
Simeon


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind
Yeah, I'm going to disagree a bit with the original post in this thread, 
and with Richard's contribution too. Or at least qualify it.


My experience is that folks trying to be pure and avoid an API do _not_ 
make it easier for me to consume as a developer writing clients. It's 
just not true that one always leads to the other.


The easiest API's I have to deal with are those where the developers 
really understand the use cases clients are likely to have, and really 
make API's that conveniently serve those use cases.


The most difficult API's I have to deal with are those where the 
developers spent a lot of time thinking about very abstract and 
theoretical concerns of architectural purity, whether in terms of REST, 
linked data, HATEOS, or, god forbid, all of those and more at once (and 
then realizing that sometimes they seem to conflict) -- and neglected to 
think about actual use cases and making them smooth.


Seriously, think about the most pleasant, efficient, and powerful API's 
you have used.  (github's?  Something else?).  How many of them are 
'pure' non-API API's, how many of them are actually API's?


I'm going to call it an API even if it does what the original post 
says, I'm going to say API in the sense of how software is meant to 
deal with this -- in the base case, the so-called API can be screen 
scrape HTML, okay.


I am going to agree that aligning the API with the user-visible web app 
as much as possible -- what the original post is saying you should 
always and only do -- does make sense.  But slavish devotion to avoiding 
any API as distinct from the human web UI at all leads to theoretically 
pure but difficult to use API's.


Sometimes the 'information architecture' that makes sense for humans 
differs from what makes sense for machine access. Sometimes the human UI 
needs lots of JS which complicates things.  Even without this, an API 
which lets me choose representations based on different URI's instead of 
_only_ conneg (say, /widget/18.json instead of only /widget/18 with 
conneg) ends up being significantly easier to develop against and debug.
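
For example, a minimal sketch of serving both from the same app (Flask-style;
the widget lookup is a made-up stand-in):

from flask import Flask, jsonify, render_template, request

app = Flask(__name__)

def load_widget(wid):
    return {"id": wid, "name": "example"}   # stand-in for a real lookup

@app.route("/widget/<int:wid>.json")
def widget_json(wid):
    # explicit URI: easy to link, email, test and debug
    return jsonify(load_widget(wid))

@app.route("/widget/<int:wid>")
def widget(wid):
    # same resource, negotiated for clients that prefer it
    data = load_widget(wid)
    best = request.accept_mimetypes.best_match(["text/html", "application/json"])
    if best == "application/json":
        return jsonify(data)
    return render_template("widget.html", widget=data)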


Spend a bit of time understanding what people consider theoretically 
pure, sure, because it can give you more tools in your toolbox.  But 
simply slavishly sticking to it does not, in my experience, result in a 
good 'developer experience' for your developer clients.  And when you 
start realizing that different people from different schools have 
different ideas of what 'theoretically pure' looks like, when you start 
spending many hours going over httpRange-14 and just getting more 
confused -- realize that what matters in the end is being easy to use 
for your developers use cases, and just do it.


Personally, I'd spend more time making sure i understand my developers
use cases and getting feedback from developers, and less time on 
architecting castles in the sky that are theoretically pure.


On 12/2/13 9:56 AM, Bill Dueber wrote:

On Sun, Dec 1, 2013 at 7:57 PM, Barnes, Hugh hugh.bar...@lincoln.ac.nz wrote:


+1 to all of Richard's points here. Making something easier for you to
develop is no justification for making it harder to consume or deviating
from well supported standards.




I just want to point out that as much as we all really, *really* want
"easy to consume" and "following the standards" to be the same
thing... they're not. Correct content negotiation is one of those things
that often follows the phrase "all they have to do...", which is always a
red flag, as in "Why give the user different URLs when *all they have to
do is*..." Caching, json vs javascript vs jsonp, etc. all make this
harder. If *all* *I have to do* is know that all the consumers of my data
are going to do content negotiation right, and then I need to get deep into
the guts of my caching mechanism, then set up an environment where it's all
easy to test... well, it's harder.

And don't tell me how lazy I am until you invent a day with a lot more
hours. I'm sick of people telling me I'm lazy because I'm not pure. I
expose APIs (which have their own share of problems, of course) because I
want them to be *useful* and *used. *

   -Bill, apparently feeling a little bitter this morning -






Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Kyle Banerjee
 Is it out of the question to extract technical metadata from the
 audiovisual materials themselves (via MediaInfo et al)?


One of the things that absolutely blows my mind is the widespread practice
of hand typing this stuff into records. Aside from an obvious opportunity
to introduce errors/inconsistencies, many libraries record details for the
archival versions rather than the access versions actually provided. So
patrons see a description for what they're not getting...

Just for the heck of it, sometime last year I scanned thousands of objects
and their descriptions to see how close they were. Like an idiot, I didn't
write up what I learned because I was just trying to satisfy my own
curiosity. However, the takeaway I got from the exercise was that the
embedded info is so much better than the hand keyed stuff that you'd be
nuts to consider the latter as authoritative. Curiously, I did find cases
where the embedded info was clearly incorrect. I can only guess that was
manually edited.

kyle


[CODE4LIB] Website back up services

2013-12-02 Thread Wilhelmina Randtke
Does anyone have a recommendation for a website backup service?

I would like something where I provide FTP and MySQL connection info, and
they do something like make a monthly backup, keep backups for about a
year, and will do a roll back from their end.  How frequently they do a
backup doesn't really matter.  The biggest factors are pricing and having
an interface that someone who doesn't understand databases could use.

-Wilhelmina Randtke


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Kevin Ford
Though I have some quibbles with Seth's post, I think it's worth 
drawing attention to his repeatedly calling out API keys as a very 
significant barrier to use, or at least entry.  Most of the posts here 
have given little attention to the issue API keys present.  I can say 
that I have quite often looked elsewhere or simply stopped pursuing my 
idea the moment I discovered an API key was mandatory.


As for the presumed difficulty with implementing content negotiation 
(and, especially, caching on top), it seems that if you can implement an 
entire system to manage assignment of and access by API key, then I do 
not understand how content negotiation and caching are significantly 
harder to implement.


In any event, APIs and content negotiation are not mutually exclusive. 
One should be able to use the HTTP URI to access multiple 
representations of the resource without recourse to a custom API.


Yours,
Kevin




On 11/29/2013 02:44 PM, Robert Sanderson wrote:

(posted in the comments on the blog and reposted here for further
discussion, if interest)


While I couldn't agree more with the post's starting point -- URIs identify
(concepts) and use HTTP as your API -- I couldn't disagree more with the
use content negotiation conclusion.

I'm with Dan Cohen in his comment regarding using different URIs for
different representations for several reasons below.

It's harder to implement Content Negotiation than your own API, because you
get to define your own API whereas you have to follow someone else's rules
when you implement conneg.  You can't get your own API wrong.  I agree with
Ruben that HTTP is better than rolling your own proprietary API, we
disagree that conneg is the correct solution.  The choice is between conneg
or regular HTTP, not conneg or a proprietary API.

Secondly, you need to look at the HTTP headers and parse quite a complex
structure to determine what is being requested.  You can't just put a file
in the file system, unlike with separate URIs for distinct representations
where it just works, instead you need server side processing.  This also
makes it much harder to cache the responses, as the cache needs to
determine whether or not the representation has changed -- the cache also
needs to parse the headers rather than just comparing URI and content.  For
large scale systems like DPLA and Europeana, caching is essential for
quality of service.

How do you find out which formats are supported by conneg? By reading the
documentation. Which could just say "add .json on the end". The Vary header
tells you that negotiation in the format dimension is possible, just not
what to do to actually get anything back. There isn't a way to find this
out from HTTP automatically, so now you need to read both the site's docs
AND the HTTP docs.  APIs can, on the other hand, do this.  Consider
OAI-PMH's ListMetadataFormats and SRU's Explain response.

Instead you can have a separate URI for each representation and link them
with Link headers, or just a simple rule like add '.json' on the end. No
need for complicated content negotiation at all.  Link headers can be added
with a simple apache configuration rule, and as they're static are easy to
cache. So the server side is easy, and the client side is trivial.
  Compared to being difficult at both ends with content negotiation.

It can be useful to make statements about the different representations,
and especially if you need to annotate the structure or content.  Or share
it -- you can't email someone a link that includes the right Accept headers
to send -- as in the post, you need to send them a command line like curl
with -H.

An experiment for fans of content negotiation: Have both .json and 302
style conneg from your original URI to that .json file. Advertise both. See
how many people do the conneg. If it's non-zero, I'll be extremely
surprised.

And a challenge: Even with libraries there's still complexity to figuring
out how and what to serve. Find me sites that correctly implement * based
fallbacks. Or even process q values. I'll bet I can find 10 that do content
negotiation wrong, for every 1 that does it correctly.  I'll start:
dx.doi.org touts its content negotiation for metadata, yet doesn't
implement q values or *s. You have to go to the documentation to figure out
what Accept headers it will do string equality tests against.

Rob



On Fri, Nov 29, 2013 at 6:24 AM, Seth van Hooland svhoo...@ulb.ac.be
wrote:


Dear all,

I guess some of you will be interested in the blogpost of my colleague

and co-author Ruben regarding the misunderstandings on the use and abuse of
APIs in a digital libraries context, including a description of both good
and bad practices from Europeana, DPLA and the Cooper Hewitt museum:


http://ruben.verborgh.org/blog/2013/11/29/the-lie-of-the-api/

Kind regards,

Seth van Hooland
Président du Master en Sciences et Technologies de l'Information et de la

Communication (MaSTIC)

Université Libre de Bruxelles
Av. F.D. 

Re: [CODE4LIB] User Registration and Authentication for a Tomcat webapp?

2013-12-02 Thread Simon Spero
Atlassian Crowd (if you have jira or confluence you may already have this
licensed).

SocialAuth (supports, but relies on oauth sources).

Apache Shiro (requires external account provider- see demo).

Shibboleth?
 On Dec 1, 2013 4:09 PM, LeVan,Ralph le...@oclc.org wrote:

 OCLC Research is building an ILL Cost Calculator.  We'll be asking
 institutions to enter information about their ILL practices and costs and
 then supporting mechanisms for generating reports based on that data.
  (Dennis Massie is leading this work in Research and he really should have
  a page about this project that I could point you at.)

 I'm writing the part that collects the information.  I need a light-weight
 framework that will let users register themselves and then subsequently
 authenticate themselves while engaged in an iterative process of entering
 the necessary data for the calculator.

 I'm looking for suggestions for that framework.  I'm hoping for something
 in java that can be integrated into a tomcat webapp environment, but it
 wouldn't hurt me to stretch a little if there's something else out there
 you think I should be trying.

 Thanks!

 Ralph

 Ralph LeVan
 Sr. Research Scientist
 OCLC Research



[CODE4LIB] Extended Deadline CfP Semantic Digital Archives (Special Issue of Int. J. on Digital Libraries)

2013-12-02 Thread Livia Predoiu

---
Call for Papers

Special Issue on Semantic Digital Archives
International Journal on Digital Libraries

*** Extended Deadline until December 31, 2013 ***


---

Archival Information Systems (AIS) are becoming increasingly important. For
decades, the amount of content created digitally has been growing, and its
complete life cycle nowadays tends to remain digital. A selection of this
content is expected to be of value for the future and can thus be
considered part of our cultural heritage. As soon as these digital
publications become obsolete, but are still deemed to be of value in the
future, they have to be transferred smoothly into appropriate AIS where
they need to be kept accessible even through changing technologies.

This focused issue arises from issues covered by the SDA workshop series (
http://sda2013.dke-research.de/) and invites submissions from all
researchers.  The workshop series has shown that both the library and the
archiving community have made valuable contributions to the management of
huge amounts of knowledge and data. However, both are approaching this
topic from different views which shall be brought together to
cross-fertilize each other. The Semantic Web is another research area that
provides promising technical solutions for knowledge representation and
management. At the forefront of making the semantic web a mature and
applicable reality is the linked data initiative, which already has started
to be adopted by the library community. Semantic representations of
contextual knowledge about cultural heritage objects will enhance
organization and access of data and knowledge. In order to achieve a
comprehensive investigation, the information seeking and document triage
behaviors of users (an area also classified under the field of Human
Computer Interaction) are also important to provide a comprehensive
investigation of the research topic.

This special issue will solicit high quality papers that demonstrate
exceptional achievements on Semantic Digital Archives, including but not
limited to:
- Archival Information Systems (AIS) and Archival Information
Infrastructures (AII) in general
- Architectures and Frameworks for AIS and AII
- Contextualization of digital archives, museums and digital libraries
- Ontologies & linked data for AIS, AII, museums and digital libraries
- Logical theories for digital archives & digital preservation
- Knowledge evolution
- Semantic temporal analytics
- (Semantic) provenance models
- CIDOC CRM and extensions
- Semantic long-term storage & hardware organization for AIS & AII &
digital libraries
- Semantic extensions of emulation/virtualization methodologies tailored
for AIS & AII & digital libraries
- Implementations & evaluations of (semantic) AIS, AII, semantic digital
museums & semantic digital libraries
- Preservation of scientific and research data
- Preservation of work flow processes
- Appraisal and selection of content
- Semantic search & information retrieval in digital archives, digital
museums and digital libraries
- User studies focusing on end-user needs and information seeking behavior
of end-users
- User interfaces for (semantic) AIS, AII, digital museums & semantic
digital libraries
- Formalizations for changes in (designated) user communities
- Semantic multimedia AIS, AII, multimedia museums & multimedia libraries
- Web Archives
- Specialized AIS & AII for specific services like Twitter, etc.
- (Semantic) Preservation Processes and Protocols
- Semantic (Web) services implementing AIS & AII
- Information integration/semantic ingest (e.g. from digital libraries)
- Trust for ingest & data security/integrity check for long-term storage of
archival records
- Migration strategies based on Semantic Web technologies
- Legal issues

SUBMISSION DETAILS

Important Dates
Paper Submission deadline: December 31, 2013 (Deadline Extended)
First notification:        March 31, 2014
Revision submission:       May 31, 2014
Second notification:       July 31, 2014
Final version submission:  August 31, 2014

GUEST EDITORS

Thomas Risse, University of Hannover & L3S Research Center, Germany
(contact person)
Livia Predoiu, University of Oxford, UK
Annett Mitschick, University of Dresden, Germany
Andreas Nürnberger, University of Magdeburg, Germany
Seamus Ross, University of Toronto, Canada

PAPER SUBMISSION

Papers submitted to this special issue for possible publication must be
original and must not be under consideration for publication in any other
journal or conference. Previously published or accepted conference papers
must contain at least 30% new material to be considered for the special
issue. All papers are to be submitted by referring to
http://www.springer.com/799. At the beginning of the 

Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Roy Tennant
I would have to agree with this where the data exists. The data captured by
digital cameras these days can be incredibly extensive and thorough. Given
this, I recently started exposing this data for all of the 8,000 photos I
now have on my photos web site http://FreeLargePhotos.com/ . There is now a
link on the page for an individual photo that a user can click on that will
pull out the data dynamically from the image file and display it in plain
text. Here is a random example:

http://freelargephotos.com/photos/003171/exif.txt
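
Generating that kind of dump takes only a few lines. For instance, a sketch
with Pillow (the filename is invented):

from PIL import Image, ExifTags

def dump_exif(path):
    # read the EXIF block from the image and print it as plain text
    exif = Image.open(path).getexif()
    for tag_id, value in exif.items():
        name = ExifTags.TAGS.get(tag_id, tag_id)
        print(f"{name}: {value}")

dump_exif("photos/003171.jpg")   # hypothetical filename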

The tricky bit is of course where the photo is actually scanned from a
slide, which of course plays havoc with items such as the creation date. So
depending on the exact situation your mileage may vary, but the basic
principle stands -- if you can allow a machine to capture the metadata then
by all means let it.
Roy


On Mon, Dec 2, 2013 at 9:06 AM, Kyle Banerjee kyle.baner...@gmail.com wrote:

  Is it out of the question to extract technical metadata from the
  audiovisual materials themselves (via MediaInfo et al)?


 One of the things that absolutely blows my mind is the widespread practice
 of hand typing this stuff into records. Aside from an obvious opportunity
 to introduce errors/inconsistencies, many libraries record details for the
 archival versions rather than the access versions actually provided. So
 patrons see a description for what they're not getting...

 Just for the heck of it, sometime last year I scanned thousands of objects
 and their descriptions to see how close they were. Like an idiot, I didn't
 write up what I learned because I was just trying to satisfy my own
 curiosity. However, the takeaway I got from the exercise was that the
 embedded info is so much better than the hand keyed stuff that you'd be
 nuts to consider the latter as authoritative. Curiously, I did find cases
 where the embedded info was clearly incorrect. I can only guess that was
 manually edited.

 kyle



Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Ross Singer
I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.

There may be alternative ways to implement that, but you can't just hand
wave away the rather large use case for API keys.

-Ross.


On Mon, Dec 2, 2013 at 12:15 PM, Kevin Ford k...@3windmills.com wrote:

 Though I have some quibbles with Seth's post, I think it's worth drawing
 attention to his repeatedly calling out API keys as a very significant
 barrier to use, or at least entry.  Most of the posts here have given
 little attention to the issue API keys present.  I can say that I have
 quite often looked elsewhere or simply stopped pursuing my idea the moment
 I discovered an API key was mandatory.

 As for the presumed difficulty with implementing content negotiation (and,
 especially, caching on top), it seems that if you can implement an entire
 system to manage assignment of and access by API key, then I do not
 understand how content negotiation and caching are significantly harder to
 implement.

 In any event, APIs and content negotiation are not mutually exclusive. One
 should be able to use the HTTP URI to access multiple representations of
 the resource without recourse to a custom API.

 Yours,
 Kevin





 On 11/29/2013 02:44 PM, Robert Sanderson wrote:

 (posted in the comments on the blog and reposted here for further
 discussion, if interest)


 While I couldn't agree more with the post's starting point -- URIs
 identify
 (concepts) and use HTTP as your API -- I couldn't disagree more with the
 use content negotiation conclusion.

 I'm with Dan Cohen in his comment regarding using different URIs for
 different representations for several reasons below.

 It's harder to implement Content Negotiation than your own API, because
 you
 get to define your own API whereas you have to follow someone else's rules
 when you implement conneg.  You can't get your own API wrong.  I agree
 with
 Ruben that HTTP is better than rolling your own proprietary API, we
 disagree that conneg is the correct solution.  The choice is between
 conneg
 or regular HTTP, not conneg or a proprietary API.

 Secondly, you need to look at the HTTP headers and parse quite a complex
 structure to determine what is being requested.  You can't just put a file
 in the file system, unlike with separate URIs for distinct representations
 where it just works, instead you need server side processing.  This also
 makes it much harder to cache the responses, as the cache needs to
 determine whether or not the representation has changed -- the cache also
 needs to parse the headers rather than just comparing URI and content.
  For
 large scale systems like DPLA and Europeana, caching is essential for
 quality of service.

 How do you find out which formats are supported by conneg? By reading the
 documentation. Which could just say add .json on the end. The Vary
 header
 tells you that negotiation in the format dimension is possible, just not
 what to do to actually get anything back. There isn't a way to find this
 out from HTTP automatically,so now you need to read both the site's docs
 AND the HTTP docs.  APIs can, on the other hand, do this.  Consider
 OAI-PMH's ListMetadataFormats and SRU's Explain response.

 Instead you can have a separate URI for each representation and link them
 with Link headers, or just a simple rule like add '.json' on the end. No
 need for complicated content negotiation at all.  Link headers can be
 added
 with a simple apache configuration rule, and as they're static are easy to
 cache. So the server side is easy, and the client side is trivial.
   Compared to being difficult at both ends with content negotiation.

 It can be useful to make statements about the different representations,
 and especially if you need to annotate the structure or content.  Or share
 it -- you can't email someone a link that includes the right Accept
 headers
 to send -- as in the post, you need to send them a command line like curl
 with -H.

 An experiment for fans of content negotiation: Have both .json and 302
 style conneg from your original URI to that .json file. Advertise both.
 See
 how many people do the conneg. If it's non-zero, I'll be extremely
 surprised.

 And a challenge: Even with libraries there's still complexity to figuring
 out how and what to serve. Find me sites that correctly implement * based
 fallbacks. Or even process q values. I'll bet I can find 10 that do
 content
 negotiation wrong, for every 1 that does it correctly.  I'll start:
 dx.doi.org touts its content negotiation for metadata, yet doesn't
 implement q values or *s. You have to go to the documentation to figure
 out
 what Accept headers it will do string equality tests against.

 Rob



 On Fri, Nov 29, 2013 at 6:24 AM, Seth van Hooland svhoo...@ulb.ac.be
 wrote:


 Dear all,

 I guess some of you will be interested in the blogpost of my colleague

 and co-author Ruben regarding the misunderstandings on the use and 

Re: [CODE4LIB] User Registration and Authentication for a Tomcat webapp?

2013-12-02 Thread Tod Olson
On using Shib to protect a servlet under Tomcat:

https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPJavaInstall

-Tod


On Dec 2, 2013, at 11:11 AM, Simon Spero sesunc...@gmail.com
 wrote:

 Atlassian Crowd (if you have jira or confluence you may already have this
 licensed).
 
 SocialAuth (supports, but relies on oauth sources).
 
 Apache Shiro (requires external account provider- see demo).
 
 Shibboleth?
 On Dec 1, 2013 4:09 PM, LeVan,Ralph le...@oclc.org wrote:
 
 OCLC Research is building an ILL Cost Calculator.  We'll be asking
 institutions to enter information about their ILL practices and costs and
 then supporting mechanisms for generating reports based on that data.
 (Dennis Massie is leading this work in Research and he really should have
  a page about this project that I could point you at.)
 
 I'm writing the part that collects the information.  I need a light-weight
 framework that will let users register themselves and then subsequently
 authenticate themselves while engaged in an iterative process of entering
 the necessary data for the calculator.
 
 I'm looking for suggestions for that framework.  I'm hoping for something
 in java that can be integrated into a tomcat webapp environment, but it
 wouldn't hurt me to stretch a little if there's something else out there
 you think I should be trying.
 
 Thanks!
 
 Ralph
 
 Ralph LeVan
 Sr. Research Scientist
 OCLC Research
 


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind
There are plenty of non-free API's, that need some kind of access 
control. A different side discussion is what forms of access control are 
the least barrier to developers while still being secure (a lot of 
services mess this up in both directions!).


However, there are also some free API's which still require API keys, 
perhaps because the owners want to track usage or throttle usage or what 
have you.


Sometimes you need to do that too, and you need to restrict access, so 
be it. But it is probably worth recognizing that you are sometimes 
adding barriers to successful client development here -- it seems like a 
trivial barrier from the perspective of the developers of the service, 
because they use the service so often. But to a client developer working 
with a dozen different API's, the extra burden to get and deal with the 
API key and the access control mechanism can be non-trivial.


I think the best compromise is what Google ends up doing with many of 
their APIs. Allow access without an API key, but with a fairly minimal 
number of accesses-per-time-period allowed (couple hundred a day, is 
what I think google often does). This allows the developer to evaluate 
the api, explore/debug the api in the browser, and write automated tests 
against the api, without worrying about api keys. But still requires an 
api key for 'real' use, so the host can do what tracking or throttling 
they want.
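
A rough sketch of that compromise (in-memory counters only; the limits and
names are invented):

import datetime
from collections import defaultdict

ANON_DAILY_LIMIT = 200        # invented numbers
KEYED_DAILY_LIMIT = 100000
_counts = defaultdict(int)

def allow(client_ip, api_key=None):
    # keyless callers get a small daily quota, keyed callers a large one
    today = datetime.date.today().isoformat()
    who = api_key or client_ip
    limit = KEYED_DAILY_LIMIT if api_key else ANON_DAILY_LIMIT
    _counts[(who, today)] += 1
    return _counts[(who, today)] <= limit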


Jonathan

On 12/2/13 12:18 PM, Ross Singer wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.

There may be alternative ways to implement that, but you can't just hand
wave away the rather large use case for API keys.

-Ross.


On Mon, Dec 2, 2013 at 12:15 PM, Kevin Ford k...@3windmills.com wrote:


Though I have some quibbles with Seth's post, I think it's worth drawing
attention to his repeatedly calling out API keys as a very significant
barrier to use, or at least entry.  Most of the posts here have given
little attention to the issue API keys present.  I can say that I have
quite often looked elsewhere or simply stopped pursuing my idea the moment
I discovered an API key was mandatory.

As for the presumed difficulty with implementing content negotiation (and,
especially, caching on top), it seems that if you can implement an entire
system to manage assignment of and access by API key, then I do not
understand how content negotiation and caching are significantly harder to
implement.

In any event, APIs and content negotiation are not mutually exclusive. One
should be able to use the HTTP URI to access multiple representations of
the resource without recourse to a custom API.

Yours,
Kevin





On 11/29/2013 02:44 PM, Robert Sanderson wrote:


(posted in the comments on the blog and reposted here for further
discussion, if interest)


While I couldn't agree more with the post's starting point -- URIs
identify
(concepts) and use HTTP as your API -- I couldn't disagree more with the
use content negotiation conclusion.

I'm with Dan Cohen in his comment regarding using different URIs for
different representations for several reasons below.

It's harder to implement Content Negotiation than your own API, because
you
get to define your own API whereas you have to follow someone else's rules
when you implement conneg.  You can't get your own API wrong.  I agree
with
Ruben that HTTP is better than rolling your own proprietary API, we
disagree that conneg is the correct solution.  The choice is between
conneg
or regular HTTP, not conneg or a proprietary API.

Secondly, you need to look at the HTTP headers and parse quite a complex
structure to determine what is being requested.  You can't just put a file
in the file system, unlike with separate URIs for distinct representations
where it just works, instead you need server side processing.  This also
makes it much harder to cache the responses, as the cache needs to
determine whether or not the representation has changed -- the cache also
needs to parse the headers rather than just comparing URI and content.
  For
large scale systems like DPLA and Europeana, caching is essential for
quality of service.

How do you find out which formats are supported by conneg? By reading the
documentation. Which could just say add .json on the end. The Vary
header
tells you that negotiation in the format dimension is possible, just not
what to do to actually get anything back. There isn't a way to find this
out from HTTP automatically,so now you need to read both the site's docs
AND the HTTP docs.  APIs can, on the other hand, do this.  Consider
OAI-PMH's ListMetadataFormats and SRU's Explain response.

Instead you can have a separate URI for each representation and link them
with Link headers, or just a simple rule like add '.json' on the end. No
need for complicated content negotiation at all.  Link headers can be
added
with a simple apache configuration rule, and as they're static are easy to
cache. So the 

Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Kevin Ford

 I think the best compromise is what Google ends up doing with many of
 their APIs. Allow access without an API key, but with a fairly minimal
 number of accesses-per-time-period allowed (couple hundred a day, is
 what I think google often does).
-- Agreed.

I certainly didn't mean to suggest that there were not legitimate use 
cases for API keys.  That said, my gut (plus experience sitting in 
multiple meetings during which the need for an access mechanism landed 
on the table as a primary requirement) says people believe they need an 
API key before alternatives have been fully considered and even before 
there is an actual, defined need for one.  Server logs often reveal most 
types of usage statistics service operators are interested in and 
there are ways to throttle traffic at the caching level (the latter can 
be a little tricky to implement, however).


Yours,
Kevin


On 12/02/2013 12:38 PM, Jonathan Rochkind wrote:

There are plenty of non-free API's, that need some kind of access
control. A different side discussion is what forms of access control are
the least barrier to developers while still being secure (a lot of
services mess this up in both directions!).

However, there are also some free API's which still require API keys,
perhaps because the owners want to track usage or throttle usage or what
have you.

Sometimes you need to do that too, and you need to restrict access, so
be it. But it is probably worth recognizing that you are sometimes
adding barriers to successful client development here -- it seems like a
trivial barrier from the perspective of the developers of the service,
because they use the service so often. But to a client developer working
with a dozen different API's, the extra burden to get and deal with the
API key and the access control mechanism can be non-trivial.

I think the best compromise is what Google ends up doing with many of
their APIs. Allow access without an API key, but with a fairly minimal
number of accesses-per-time-period allowed (couple hundred a day, is
what I think google often does). This allows the developer to evaluate
the api, explore/debug the api in the browser, and write automated tests
against the api, without worrying about api keys. But still requires an
api key for 'real' use, so the host can do what tracking or throttling
they want.

Jonathan

On 12/2/13 12:18 PM, Ross Singer wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.

There may be alternative ways to implement that, but you can't just hand
wave away the rather large use case for API keys.

-Ross.


On Mon, Dec 2, 2013 at 12:15 PM, Kevin Ford k...@3windmills.com wrote:


Though I have some quibbles with Seth's post, I think it's worth drawing
attention to his repeatedly calling out API keys as a very significant
barrier to use, or at least entry.  Most of the posts here have given
little attention to the issue API keys present.  I can say that I have
quite often looked elsewhere or simply stopped pursuing my idea the
moment
I discovered an API key was mandatory.

As for the presumed difficulty with implementing content negotiation
(and,
especially, caching on top), it seems that if you can implement an
entire
system to manage assignment of and access by API key, then I do not
understand how content negotiation and caching are significantly
harder to
implement.

In any event, APIs and content negotiation are not mutually
exclusive. One
should be able to use the HTTP URI to access multiple representations of
the resource without recourse to a custom API.

Yours,
Kevin





On 11/29/2013 02:44 PM, Robert Sanderson wrote:


(posted in the comments on the blog and reposted here for further
discussion, if interest)


While I couldn't agree more with the post's starting point -- URIs
identify
(concepts) and use HTTP as your API -- I couldn't disagree more with
the
use content negotiation conclusion.

I'm with Dan Cohen in his comment regarding using different URIs for
different representations for several reasons below.

It's harder to implement Content Negotiation than your own API, because
you
get to define your own API whereas you have to follow someone else's
rules
when you implement conneg.  You can't get your own API wrong.  I agree
with
Ruben that HTTP is better than rolling your own proprietary API, we
disagree that conneg is the correct solution.  The choice is between
conneg
or regular HTTP, not conneg or a proprietary API.

Secondly, you need to look at the HTTP headers and parse quite a
complex
structure to determine what is being requested.  You can't just put
a file
in the file system, unlike with separate URIs for distinct
representations
where it just works, instead you need server side processing.  This
also
makes it much harder to cache the responses, as the cache needs to
determine whether or not the representation has changed -- the cache
also
needs to parse the headers rather than just comparing 

Re: [CODE4LIB] Website back up services

2013-12-02 Thread Benjamin Stewart
Happy Monday

If you have your own Linux server with Apache, MySQL and BIND, and have
another Linux server for a backup and replication of web services, the
attached open source script will work great.
I used it for years; it will back up both data and SQL on a 12-month
rotation (1st of each month) plus a rolling 30 days of daily backups.
Very simple, easy to use and restore.

Cheers

~Ben 

Thank you,
Ben Stewart
 
Ben Stewart
System Administrator
Geoffrey R. Weller library
 University Way
Prince George, British Columbia | V2N 4Z9
PH (250) 960-6605
benjamin.stew...@unbc.ca

On 12/2/2013, 9:10 AM, Wilhelmina Randtke rand...@gmail.com wrote:

Does anyone have a recommendation for a website backup service?

I would like something where I provide FTP and MySQL connection info, and
they do something like make a monthly backup, keep backups for about a
year, and will do a roll back from their end.  How frequently they do a
backup doesn't really matter.  The biggest factors are pricing and having
an interface that someone who doesn't understand databases could use.

-Wilhelmina Randtke



rbackup
Description: rbackup
### Editable Variables for Customization
# Edit /etc/cron.d/rbackup to edit time of execution
# Include starting forward slash, but omit trailing forward slash
BACKUPHOST=   # x.x.x.x or hostname, leave blank for non remote
SOURCES="/clandata/local/share/OFFICE \
 /clandata/local/share/PHOTOS \
 /clandata/local/share/TECH_SHARE \
 /var/www \
 /var/mail \
 /etc \
 /srv \
 /home"   # list separated by spaces
SQLDB="willowriver pgipers"   # list separated by spaces
TARGET=/mnt/backup/   # Full path to target location - mounted /dev/sde1
MONTHROTATION=12
DAYROTATION=30

Re: [CODE4LIB] Looking for two coders to help with discoverability of videos - Embedded Metadata

2013-12-02 Thread Kari R Smith
I've been working with embedded metadata for some years and there are great 
tools out there for embedding, extracting and reusing metadata (technical, 
administrative, and descriptive).  The tools allow for batch data entry and use of 
metadata schemas or standards.  As a digital archivist whose job is to take in 
lots of this digitized content that generally has no context or that context is 
lost or misplaced, I wholly advocate for embedding metadata.  There are 
consumer products that can then expose this metadata so that it doesn't have to 
be retyped again and again.

What gets my goat is when I hear folks belabor the effort but don't talk about 
the rewards and opportunities that embedding metadata can bring.  Use cases are 
forthcoming from The Royal Library in Denmark about mass digitization and 
embedding metadata, as well as using the Exif / IPTC Extension for describing 
the content in image files.  There's also work being done with video and audio 
and CAD files.  

Check out these resources on Embedded Metadata from the VRA Embedded Metadata 
Working Group (Greg Reser, Chair):
About Embedded Metadata:  
http://metadatadeluxe.pbworks.com/w/page/62407805/Concepts
http://metadatadeluxe.pbworks.com/w/page/20792256/Other%20Organizations
Case Studies:  http://metadatadeluxe.pbworks.com/w/page/62407826/Communities

Okay, I'll step off my soap box now...
Kari

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@listserv.nd.edu] On Behalf Of Kyle 
Banerjee
Sent: Monday, December 02, 2013 12:06 PM
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of 
videos

 Is it out of the question to extract technical metadata from the 
 audiovisual materials themselves (via MediaInfo et al)?


One of the things that absolutely blows my mind is the widespread practice of 
hand typing this stuff into records. Aside from an obvious opportunity to 
introduce errors/inconsistencies, many libraries record details for the 
archival versions rather than the access versions actually provided. So patrons 
see a description for what they're not getting...

Just for the heck of it, sometime last year I scanned thousands of objects and 
their descriptions to see how close they were. Like an idiot, I didn't write up 
what I learned because I was just trying to satisfy my own curiosity. However, 
the takeaway I got from the exercise was that the embedded info is so much 
better than the hand keyed stuff that you'd be nuts to consider the latter as 
authoritative. Curiously, I did find cases where the embedded info was clearly 
incorrect. I can only guess that was manually edited.

kyle


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Robert Sanderson
To be (more) controversial...

If it's okay to require headers, why can't API keys go in a header rather
than the URL.
Then it's just the same as content negotiation, it seems to me. You send a
header and get a different response from the same URI.
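
From the client side the two are nearly identical anyway -- a quick sketch in 
Python with the requests library (the endpoint and header name are invented for 
illustration):

    import requests  # pip install requests

    # Key (and format) in the URL:
    r1 = requests.get("https://api.example.org/record/123",
                      params={"api_key": "abc123", "format": "json"})

    # Key and format in headers, same URI:
    r2 = requests.get("https://api.example.org/record/123",
                      headers={"X-API-Key": "abc123",
                               "Accept": "application/json"})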

Rob



On Mon, Dec 2, 2013 at 10:57 AM, Edward Summers e...@pobox.com wrote:

 On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:
  I'm not going to defend API keys, but not all APIs are open or free.  You
  need to have *some* way to track usage.

 A key (haha) thing that keys also provide is an opportunity to have a
 conversation with the user of your api: who are they, how could you get in
 touch with them, what are they doing with the API, what would they like to
 do with the API, what doesn’t work? These questions are difficult to ask if
 they are just an IP address in your access log.

 //Ed



[CODE4LIB] Job: Asst Head of Archives Research Center at Atlanta University Center

2013-12-02 Thread jobs
Asst Head of Archives Research Center
Atlanta University Center
Atlanta

The Atlanta University Center - Robert W. Woodruff Library supports the
teaching and learning missions of four institutions of higher learning that
comprise the world's largest consortium of HBCUs--Clark Atlanta University,
the Interdenominational Theological Center, Morehouse College, and Spelman
College. Conveniently located and easily accessible to the campuses, the
Woodruff Library is the center of the intellectual and social life at the
Atlanta University Center.

  
The Archives Research Center at the Atlanta University Center Robert W.
Woodruff Library has a rich collection documenting the African American
experience and the African Diaspora. It features manuscript writings and
books, including first edition titles, limited printings, autographed and rare
publications. The extensive collection of books, personal papers,
organizational and institutional records supports research in education,
literature, the arts, religion, politics and society and community
empowerment. Additionally, the AUC Woodruff Library serves as the custodian of
the Morehouse College Martin Luther King and Tupac Amaru Shakur Collections.

  
POSITION SUMMARY:

  
The Atlanta University Center - Robert W. Woodruff Library is committed to
displaying excellence in our delivery of service. This is evidenced by our
2012 national recognition with the receipt of two awards in the 28th Annual
Educational Advertising Awards. To continue our excellence in program and
services, the library is seeking a highly motivated, energetic, knowledgeable,
and efficient archivist to serve as the Library's Assistant Head of Archives
Research Center and expand access to, and promotion of its archives,
manuscripts and special collections. The individual will assist the Head of
the Archives Research Center in carrying out the strategic goals of the
department, with a focus on the department's vision of becoming a premier
destination archives. The successful candidate will be responsible for the
physical and intellectual control of the archives, manuscripts and special
collections through appraisal, arrangement, description, and creation of
finding aids; supervision of two FTE professional staff, interns, and
processing and digitization projects; and assistance in developing policies,
and the provision of reference and instructional services. The Assistant Head
will also liaise with the Digital Services Unit, and participate in library-
wide initiatives through committees, taskforces, and projects, including
exhibitions. In the absence of the Department Head, the Assistant Head
provides oversight and coordination of daily services and operations. This
position participates in evening and weekend work schedule rotations as
necessary.

  
Requirements:

  * Accredited graduate degree in an appropriate discipline (Archives 
Management, Librarianship, History, or related area)
  * Formal archival training or certification as an archivist
  * Minimum of 3 years experience processing archival and/or manuscript 
collections
  * Minimum of 1 year supervisory experience
  * Experience in project management, including planning, organizing and 
evaluating the work of other staff
  * Experience with applications of technology including archival management 
systems
  * Demonstrated knowledge of current theories, trends, standards, and 
practices of archival services in academic libraries, including DACS, MPLP, LCSH
  * Demonstrated understanding of digitization efforts and knowledge of digital 
formats and standards, including XML, EAD and Dublin Core
  * Strong interpersonal skills, including the ability to work within a 
collegial work environment where change and innovation are encouraged
  * Evidence of scholarship and/or professional activity
  * Demonstrated commitment to working in a culturally diverse environment
  * Ability to thrive in a team setting and handle multiple responsibilities in 
a team environment
  * Strong oral and written communication skills
  * Strong organizational and analytical skills
  * Experience providing reference and instructional services
  * Ability to perform physical activities associated with the archival 
environment
  * Experience with applications of technology including archival management 
systems and content repositories
  
Desired:

  * Demonstrated knowledge and/or education in African American studies and 
history
  * Experience utilizing Archivists' Toolkit, CONTENTdm, and Omeka
  * Experience working in a university or academic setting
  * Grant writing and/or implementation experience
  
SALARY & BENEFITS:

  
Salary commensurate with experience; benefits include medical, dental, vision,
life, company paid disability plans, relocation assistance, company match
retirement plan (TIAA-CREF).

  
APPLICATION PROCEDURE:

  
Interested applicants should submit a letter of application and resume online
to the Human Resources Department at 

Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Jonathan Rochkind

I do frequently see API keys in header, it is a frequent pattern.

Anything that requires things in the header, in my experience, makes the 
API more 'expensive' to develop against. I'm not sure it is okay to 
require headers.


Which is why I suggested allowing format specification in the URL, not 
just conneg headers. And is also, actually, why I expressed admiration 
for google's pattern of allowing X requests a day without an api key. 
Both things allow you to play with the api in a browser without headers.


If you are requiring a cryptographic signature (ala HMAC) for your 
access control, you can't feasibly play with it in a browser anyway, it 
doesn't matter whether it's supplied in headers or query params. And 
(inconvenient) HMAC probably is the only actually secure way to do api 
access control, depending on what level of security is called for.
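
For anyone who hasn't had the pleasure: an HMAC scheme generally means computing a 
signature over some canonical string with a shared secret and sending it along in 
headers. A rough sketch of the general shape (the string-to-sign and header names 
here are made up; every service defines its own):

    import hashlib
    import hmac
    import time

    def sign_request(method, path, access_key, secret):
        # Hypothetical canonical string; real services each specify their own.
        timestamp = str(int(time.time()))
        string_to_sign = "\n".join([method, path, timestamp])
        signature = hmac.new(secret.encode("utf-8"),
                             string_to_sign.encode("utf-8"),
                             hashlib.sha256).hexdigest()
        return {"X-Access-Key": access_key,
                "X-Timestamp": timestamp,
                "X-Signature": signature}

    headers = sign_request("GET", "/api/record/123", "my-key-id", "my-secret")

None of which you can type into a browser's location bar, which is the point.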


On 12/2/13 1:03 PM, Robert Sanderson wrote:

To be (more) controversial...

If it's okay to require headers, why can't API keys go in a header rather
than the URL.
Then it's just the same as content negotiation, it seems to me. You send a
header and get a different response from the same URI.

Rob



On Mon, Dec 2, 2013 at 10:57 AM, Edward Summers e...@pobox.com wrote:


On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.


A key (haha) thing that keys also provide is an opportunity to have a
conversation with the user of your api: who are they, how could you get in
touch with them, what are they doing with the API, what would they like to
do with the API, what doesn’t work? These questions are difficult to ask if
they are just an IP address in your access log.

//Ed






Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Edward Summers
Amazon Web Services (which is probably the most heavily used API on the Web) 
use HTTP headers for authentication. But I guess developers typically use 
software libraries to access AWS rather than making the HTTP calls directly.

//Ed


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Kevin Ford

 A key (haha) thing that keys also provide is an opportunity
 to have a conversation with the user of your api: who are they,
 how could you get in touch with them, what are they doing with
 the API, what would they like to do with the API, what doesn’t
 work? These questions are difficult to ask if they are just a
 IP address in your access log.
-- True, but, again, there are other ways to go about this.

I've baulked at doing just this in the past because it reveals the raw 
and primary purpose behind an API key: to track individual user 
usage/access.  I would feel a little awkward writing (and receiving, 
incidentally) a message that began:



--

Hello,

I saw you using our service.  What are you doing with our data?

Cordially,
Data service team

---


And, if you cringe a little at the ramifications of the above, then why 
do you need user-specific granularity?   (That's really not meant to be 
a rhetorical question - I would genuinely be interested in whether my 
notions of open and free are outmoded and based too much in a 
theoretical purity in which unnecessary tracking is a violation of privacy).


Unless the API key exists to control specific, user-level access 
precisely because this is a facet of the underlying service, I feel 
somewhere in all of this the service has violated, in some way, the 
notion that it is open and/or free, assuming it has billed itself as 
such.  Otherwise, it's free and open as in Google or Facebook.


All that said, I think a data service can smooth things over greatly by 
not insisting on a developer signing a EULA (which is essentially what 
happens when one requests an API key) before even trying the service or 
desiring the most basic of data access.  There are middle ground solutions.


Yours,
Kevin





On 12/02/2013 12:57 PM, Edward Summers wrote:

On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.


A key (haha) thing that keys also provide is an opportunity to have a 
conversation with the user of your api: who are they, how could you get in 
touch with them, what are they doing with the API, what would they like to do 
with the API, what doesn’t work? These questions are difficult to ask if they 
are just an IP address in your access log.

//Ed



[CODE4LIB] Jobs: Two Technology Specialists at Digital Public Library of America

2013-12-02 Thread Amy Rudersdorf
The Digital Public Library of America (http://dp.la) seeks *two
Technology Specialists* to join its growing team and to further DPLA’s
mission to bring together the riches of America’s libraries, archives,
and museums, and make them freely available to all. A belief in
this mission and the drive to accomplish it over time in a
collaborative spirit both within and beyond the organization is essential.

Full job descriptions at
http://dp.la/info/2013/08/20/opportunity-tech-specialist/


Re: [CODE4LIB] Jobs: Two Technology Specialists at Digital Public Library of America

2013-12-02 Thread Mark A. Matienzo
Yes! Please apply for this job, and if you have any questions, please don't
hesitate to contact me off-list. (These two positions will report to me.)

Mark

On Mon, Dec 2, 2013 at 1:33 PM, Amy Rudersdorf a...@dp.la wrote:

 The Digital Public Library of America (http://dp.la) seeks *two
 Technology Specialists* to join its growing team and to further DPLA’s
 mission to bring together the riches of America’s libraries, archives,
 and museums, and make them freely available to all. A belief in
 this mission and the drive to accomplish it over time in a
 collaborative spirit both within and beyond the organization is essential.

 Full job descriptions at
 http://dp.la/info/2013/08/20/opportunity-tech-specialist/



Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Robert Haschart

Kelley,

The work you are proposing is interesting and overlaps somewhat both 
with work I have already done and with a new project I'm looking into 
here at UVa.
I have been the primary contributor to the Marc4j java project for the 
past several years and am the creator of the project SolrMarc which 
extracts data from Marc records based on a customizable specification, 
to build Solr index records to facilitate rich discovery.


Much of my work on creating and improving these projects has been in 
service of my actual job of creating and maintaining the Solr Index 
behind our Blacklight-based discovery interface.   As a part of that 
work I have created custom SolrMarc routines that extract the format of 
items similar to what is described in Example 3, including looking in 
the leader, 006, 007 and 008 to determine the format as-coded but 
further looking in the 245 h, 300 and 538 fields to heuristically 
determine when the format as-coded is incorrect and ought to be 
overridden.   Most of the heuristic determination is targeted towards 
Video material, and was initiated when I found an item that due to a 
coding error was listed as a Video in Braille format.


Further I have developed a set of custom routines that look more closely 
at Video items, one of which already extracts the runtime from the 
008[18-20] field,
To modify it from its current form, which returns the runtime in 
minutes, to instead return it as HH:MM as specified in your xls file, 
and to further handle the edge case of 008[18-20] = 000 to return 
over 16:39, would literally take about 15 minutes.
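
(Not the SolrMarc routine itself, but for anyone following along, the conversion is 
roughly this -- a quick Python sketch of the same idea, assuming the 008 is present 
and coded for visual materials:)

    def runtime_from_008(field_008):
        """Convert MARC 008/18-20 (visual materials running time) to HH:MM."""
        raw = field_008[18:21]
        if raw == "000":                   # coded 000 = running time exceeds 999 minutes
            return "over 16:39"
        if not raw.strip().isdigit():      # "---", "nnn", "|||" = unknown / n.a. / not coded
            return None
        minutes = int(raw)
        return "%d:%02d" % (minutes // 60, minutes % 60)

    # e.g. an 008 with "120" in positions 18-20 comes back as "2:00"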


Another of these custom routines, one that is more fully formed, is code for 
extracting the Director of a video from the Marc record.  It examines 
the contents of the fields 245c, 508a, 500a, 505a, 505t, employing 
heuristics and targeted natural language processing techniques, to 
attempt to correctly extract the Director.   At this point I believe 
it achieves better results than a careful cataloger would achieve, even 
one who specializes in film and video.


The other project I have just started investigating is an effort to 
create and/or flesh out Marc records for video items based on heuristic 
matching of title and director and date with data returned from 
publicly-accessible movie information sites.


This more recent work may not be relevant to your needs but the custom 
extraction routines seem directly applicable to your goals, and may also 
provide a template that may make your other goals more easily achievable.


-Robert Haschart

On 12/2/2013 12:37 AM, Kelley McGrath wrote:

I wanted to follow up on my previous post with a couple points.

1. This is probably too late for anybody thinking about applying, but I thought 
there may be some general interest. I have put up some more detailed 
specifications about what I am hoping to do at 
http://pages.uoregon.edu/kelleym/miw/. Data extraction overview.doc is the 
general overview and the other files contain supporting documents.

2. I replied some time ago to Heather's offer below about her website that will 
connect researchers with volunteer software developers. I have to admit that 
looking for volunteer software developers had not really occurred to me. 
However, I do have additional things that I would like to do for which I 
currently have no funding so if you would be interested in volunteering in the 
future, let me know.

Kelley
kell...@uoregon.edu


On Tue, Nov 12, 2013 at 6:33 PM, Heather 
Claxton claxt...@gmail.com wrote:
Hi Kelley,

I might be able to help in your search.   I'm in the process of starting a
website that connects academic researchers with volunteer software
developers.  I'm looking for people to post programming projects on the
website once it's launched in late January.   I realize that may be a
little late for you, but perhaps the project you mentioned in your PS
(clustering based on title, name, date etc.) would be perfect?  The
one caveat is that the website is targeting software developers who wish to
volunteer.   Anyway, if you're interested in posting, please send me an
e-mail at sciencesolved2...@gmail.com 
I would greatly appreciate it.
Oh and of course it would be free to post  :)  Best of luck in your
hiring process,

Heather Claxton-Douglas


On Mon, Nov 11, 2013 at 9:58 PM, Kelley 
McGrath kell...@uoregon.edu wrote:


I have a small amount of money to work with and am looking for two people
to help with extracting data from MARC records as described below. This is
part of a larger project to develop a FRBR-based data store and discovery
interface for moving images. Our previous work includes a consideration of
the feasibility of the project from a cataloging perspective (
http://www.olacinc.org/drupal/?q=node/27), a prototype end-user interface
(https://blazing-sunset-24.heroku.com/,

[CODE4LIB] Job: Web and Mobile User Support Librarian at Northwestern University

2013-12-02 Thread jobs
Web and Mobile User Support Librarian
Northwestern University
Evanston

Northwestern University Library seeks to recruit a creative, dynamic,
customer-service focused librarian to join the Public Service Division's new
and innovative User Experience Department. Reporting to the Web and Mobile
Services Librarian, the Web and Mobile User Support Librarian works directly
and intensively with library users to ensure that the library's public-facing
web and mobile technologies, applications, and platforms meet the needs of a
diverse clientele; develops, implements, and assesses educational programming
for students, faculty and staff in the use of web and mobile technologies; and
assists in the assessment, quality control, and maintenance of the library's
public web site. For more information about this position, go to
www.library.northwestern.edu/about/library-administration/jobs.

  
Required Qualifications: Master's degree from an ALA-accredited program in
library and information science or the equivalent combination of education and
relevant library experience. Strong public service and outreach orientation;
excellent communication and analytical skills. Experience or coursework in
library-based instruction. Familiarity with various mobile technology
platforms (iOS, Android, Kindle, Nook, etc.); familiarity with various social
media platforms (Twitter, Facebook, Google+, etc.). Ability to work well with
people and technology; ability to work independently and as part of a team.

  
Preferred Qualifications: Work experience in a library. Experience or
coursework in designing and administering surveys and questionnaires.
Experience conducting interviews and/or focus groups. Experience doing
usability testing. Experience using Google analytics. Experience working in a
Drupal environment. Experience using Springshare products (LibGuides,
LibAnalytics, LibAnswers, LibChat). Experience in creating online learning
tools.

  
To Apply: Send PDF formatted letter of application, resume or vita, and names
of three references to the attention of Jan Hayes, Personnel Librarian, to
libsearc...@northwestern.edu. Applications received by November 29, 2013 will
receive first consideration.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10807/


[CODE4LIB] Job: Head of Digital Scholarship Strategy at University of Nevada, Las Vegas

2013-12-02 Thread jobs
Head of Digital Scholarship Strategy
University of Nevada, Las Vegas
Las Vegas

The University of Nevada, Las Vegas invites applications for Head of Digital
Scholarship Strategy.

  
The Head of Digital Scholarship Strategy is a newly created role tasked with
developing content strategies aligned with our priorities to provide access to
the latest digital scholarly resources, capture unique UNLV scholarly output,
and integrate born-digital scholarly products into our collecting portfolio.
This role will work through a library wide collections team that includes
faculty from across the Libraries with designated collecting responsibilities
such as Library Liaisons, the Director of Special Collections and the Head of
Collection Management. The team will clarify collecting scope and
responsibilities across all collecting areas and will identify and engage
cross- organizational expertise in the collection, infrastructure, description
and delivery of digital content.

  
Reporting to the Director of LRDS, this position will:

  * Serve as a member of the LRDS leadership team, participating in overall 
management, strategic planning and operation of the division, its staff and its 
functions.
  * Lead the investigation and implementation of new strategies for collecting, 
managing, and preserving digital scholarly content and new forms of scholarly 
output, including but not limited to data sets, multimedia, blogs, images, and 
websites.
  * Identify, acquire, preserve, manage rights for, and provide discovery of 
UNLV-created research outputs and scholarly products, including managing the 
staff and resources charged with the creation and growth of the institutional 
repository, Digital Scholarship@UNLV
  * Lead the Libraries' investigation of data acquisition with a special focus on 
determining the kinds of data appropriate for libraries acquisitions and the 
role of the Libraries in data lifecycle management for the campus.
  * Assist in collecting content in all formats that documents the region.
  * Stay abreast of emerging technologies, alternative publishing models, 
scholarly communication developments, and related legislative initiatives. 
Propose new initiatives as appropriate.
  
For more information, please visit https://hrsearch.unlv.edu.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10863/


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Miles Fidelman
umm... it's called HTTP-AUTH, and if you really want to be cool, use an 
X.509 client cert for authorization (see geoserver as an example that 
works very cleanly - 
http://docs.geoserver.org/latest/en/user/security/tutorials/cert/index.html; 
the freebxml registry-repository also uses X.509 based authentication in 
a reasonably clean manner)
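
From the client side a cert-authenticated call isn't much more work than a keyed 
one, for what it's worth -- e.g. with Python's requests library (the URL and file 
paths below are placeholders):

    import requests

    r = requests.get(
        "https://geoserver.example.org/geoserver/rest/layers",   # made-up endpoint
        cert=("/path/to/client.crt", "/path/to/client.key"),     # client certificate + private key
        verify="/path/to/ca-bundle.pem",                          # CA bundle used to verify the server
    )
    print(r.status_code)

The setup cost is mostly on the server/PKI side, which is what the geoserver 
tutorial above walks through.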


Robert Sanderson wrote:

To be (more) controversial...

If it's okay to require headers, why can't API keys go in a header rather
than the URL.
Then it's just the same as content negotiation, it seems to me. You send a
header and get a different response from the same URI.

Rob



On Mon, Dec 2, 2013 at 10:57 AM, Edward Summers e...@pobox.com wrote:


On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:

I'm not going to defend API keys, but not all APIs are open or free.  You
need to have *some* way to track usage.

A key (haha) thing that keys also provide is an opportunity to have a
conversation with the user of your api: who are they, how could you get in
touch with them, what are they doing with the API, what would they like to
do with the API, what doesn’t work? These questions are difficult to ask if
they are just an IP address in your access log.

//Ed



[CODE4LIB] Job: Digital Librarian at University of Wisconsin-Madison

2013-12-02 Thread jobs
Digital Librarian
University of Wisconsin-Madison
Madison

Official Title: SR INFORM PROC CONSLT(S44BN) or INFORM PROCESS CONSLT(S44DN)
or ASSOC INF PROC CONSLT(S44FN)

  
Degree and area of specialization:

BA/BS required, master's preferred, in Library Science, Computer Science,
Information Technology, Business, Social Science or related field.

  
License/certification: Project management certification or formal project
management training is a plus but not required.

  
Minimum number of years and type of relevant work experience:

Successful candidates will have at least two years of work experience in these
areas:

  * Experience applying research-based practices of information management in a 
scholarly organization;
  * Advanced organizational skills: ability to visualize and implement ideal 
arrangements of information in order to suit the needs of different groups of 
system users;
  * Familiarity with hierarchical and normalized organization of information;
  * Experience with metadata standards and tagging;
  * Experience researching and using online tools for project management 
including task assignment, calendaring, and documentation;
  * Demonstrated ability to work independently to meet deadlines as part of a 
cross-functional and multi-site project team;

Successful candidates will have some familiarity and knowledge in these 
areas:
  * Knowledge management or decision support systems;
  * Knowledge collaboration systems such as Microsoft SharePoint;
  * Knowledge of web based technologies including html, css, and JavaScript; 
creating and maintaining blogs and wikis; experience evaluating, integrating 
and implementing new and emerging technologies and services;
  * Knowledge of interface design, development, and testing including 
principles of user-centered web design and usability
Principal duties:

The Wisconsin Center for Education Research (WCER) seeks a highly motivated and
uniquely-skilled individual to fill a critical role as a knowledge manager and
collaboration facilitator for a complex set of research projects housed in one
of the oldest and largest university-based education research centers. The
Digital Librarian will be available to all WCER research projects and will be
vital to WCER's effort to encourage, facilitate, manage, and document
collaboration within an online collaborative environment. This position will
interface with partners and sites across the country that need to contribute
to and access a centralized, enterprise-grade, web-based repository of
information.

The Digital Librarian will also assist in the development of ongoing knowledge
management strategy by integrating the latest research in knowledge management
and available technologies, working proactively to encourage the research-
based uses of our system; provide support with patience and enthusiasm to a
variety of users; maintain a complex knowledge system; and assist researchers
and other staff members of varying technical abilities to use collaborative
technologies.

  
Duties:

  
80% Knowledge Management:

  * Develop and implement a knowledge management strategy by working with 
principal investigators and staff to evaluate research and corporate trends on 
knowledge management systems and supporting technologies. Translate that 
knowledge into the design and implementation of a knowledge management 
framework;
  * Administer an enterprise-grade knowledge management system, including 
providing design specifications to developers;
  * Provide a stable, scalable, and sustainable platform for the delivery and 
long-term management of digital content;
  * Utilize knowledge management system for management of tasks and 
deliverables of WCER projects;
  * Provide training to principal investigators and staff in the effective use 
of the knowledge management system;
  * Provide application support via phone, e-mail, IM and in person.
  * Evaluate other collaborative tools; maintain awareness of trends and 
advances in the field.
  
10% Trend Analysis:

  * Monitor trends in information management systems by tracking and analyzing 
newspapers, magazines, blogs, and forums.
  
10% User Interface Design:

  * Assist Web Design and Application Development team with the design, 
implementation, and testing of user interfaces.
  
A criminal background check will be conducted prior to hiring.

A period of evaluation will be required.

Employee Class: Academic Staff

Department(s): EDUC/WCER

Full Time Salary Rate: Minimum $55,000 ANNUAL (12 months) Depending on
Qualifications

Term: This is a renewable appointment.

Appointment percent: 100%

Anticipated begin date: JANUARY 01, 2014

Number of Positions: 1

  
To Ensure Consideration:

Application must be received by: DECEMBER 13, 2013

  
How To Apply:

To apply for this position please send a letter of interest and a resume to:
pvl78...@workspace.wcer.wisc.edu

Mac users: Please send the above materials to: 

Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Joe Hourcle
On Dec 2, 2013, at 1:25 PM, Kevin Ford wrote:

  A key (haha) thing that keys also provide is an opportunity
  to have a conversation with the user of your api: who are they,
  how could you get in touch with them, what are they doing with
  the API, what would they like to do with the API, what doesn’t
  work? These questions are difficult to ask if they are just an
  IP address in your access log.
 -- True, but, again, there are other ways to go about this.
 
 I've baulked at doing just this in the past because it reveals the raw and 
 primary purpose behind an API key: to track individual user usage/access.  I 
 would feel a little awkward writing (and receiving, incidentally) a message 
 that began:
 
 --
 Hello,
 
 I saw you using our service.  What are you doing with our data?
 
 Cordially,
 Data service team
 --

It's better than posting to a website:

We can't justify keeping this API maintained / available,
because we have no idea who's using it, or what they're
using it for.

Or:

We've had to shut down the API because we'd had people
abusing the API and we can't easily single them out as
it's not just coming from a single IP range.

We don't require API keys here, but we *do* send out messages
to our designated community every couple of years with:

If you use our APIs, please send a letter of support
that we can include in our upcoming Senior Review.

(Senior Review is NASA's peer-review of operating projects,
where they bring in outsiders to judge if it's justifiable to
continue funding them, and if so, at what level)


Personally, I like the idea of allowing limited use without
a key (be it number of accesses per day, number of concurrent
accesses, or some other rate limiting), but as someone who has
been operating APIs for years and is *not* *allowed* to track
users, I've seen quite a few times when it would've made my
life so much easier.
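
(The no-key option isn't much code, either. A toy sketch of the sort of per-IP 
throttle I mean -- in-memory and single-process, so purely illustrative:)

    import time
    from collections import defaultdict

    WINDOW = 86400        # one day, in seconds
    LIMIT = 1000          # anonymous requests allowed per IP per window

    _hits = defaultdict(list)    # ip -> timestamps of recent requests

    def allow_request(ip):
        """Return True if this IP is still under its daily anonymous quota."""
        now = time.time()
        _hits[ip] = [t for t in _hits[ip] if now - t < WINDOW]
        if len(_hits[ip]) >= LIMIT:
            return False
        _hits[ip].append(now)
        return True

Anything real would keep the counters in the front-end proxy or a shared cache, 
but the idea is the same.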



 And, if you cringe a little at the ramifications of the above, then why do 
 you need user-specific granularity?   (That's really not meant to be a 
 rhetorical question - I would genuinely be interested in whether my notions 
 of open and free are outmoded and based too much in a theoretical purity 
 that unnecessary tracking is a violation of privacy).

You're assuming that you're actually correlating API calls
to the users ... it may just be an authentication system
and nothing past that.


 Unless the API key exists to control specific, user-level access precisely 
 because this is a facet of the underlying service, I feel somewhere in all of 
 this the service has violated, in some way, the notion that it is open 
 and/or free, assuming it has billed itself as such.  Otherwise, it's free 
 and open as in Google or Facebook.

You're also assuming that we've claimed that our services
are 'open'.  (mine are, but I know of plenty of them that
have to deal with authorization, as they manage embargoed
or otherwise restricted items).

Of course, you can also set up some sort of 'guest'
privileges for non-authenticated users so they just wouldn't
see the restricted content.


 All that said, I think a data service can smooth things over greatly by not 
 insisting on a developer signing a EULA (which is essentially what happens 
 when one requests an API key) before even trying the service or desiring the 
 most basic of data access.  There are middle ground solutions.

I do have problems with EULAs ... one in that we have to
get things approved by our legal department, second in that
they're often written completely one-sided and third in
that they're often written assuming personal use.

Twitter and Facebook had to make available alternate EULAs
so that governments could use them ... because you can't
hold the person who signed up for the account responsible
for it.  (and they don't want it 'owned' by that person
should they be fired, etc.)

... but sometimes they're less restrictive ... more TOS
than EULA.  Without it, you've got absolutely no sort of
SLA ... if they want to take down their API, or block you,
you've got no recourse at all.

-Joe


[CODE4LIB] Job: Systems and Digital Content Librarian at Gwinnett technical college

2013-12-02 Thread jobs
Systems and Digital Content Librarian
Gwinnett technical college
Lawrenceville

Job Summary/Basic Function: Founded in 2005, Georgia Gwinnett College (GGC) is
the 31st member of the University System of Georgia. GGC is a premier 21st
century four-year liberal arts institution accredited by the Southern
Association of Colleges & Schools. With a current enrollment of over 9,000
students, enrollment is projected to exceed 13,000 students within three
years, including both residential and commuter students. Located in the
greater Atlanta metropolitan area, GGC provides a student centered,
technology-enriched learning environment. Gwinnett County (pop. 850,000+) is
home to a variety of businesses, including organizations involved in health
care, education and information technology.

  
The Systems and Digital Content Librarian (SDCL) administers and provides
dependable and high quality electronic information services, both on campus
and remotely accessible, for the students, faculty, and staff of Georgia
Gwinnett College. These information services include the library management
system and specialized software used in libraries. The SDCL is responsible for
coordinating library technology activities and services. This is a faculty
position and as such will be expected to serve on college-wide committees and
provide liaison and instructional support as needed. This position reports to
the Library Assistant Dean.

  
Duties and Responsibilities:

  * Operates and maintains the library management system;
  * Provides technical support and oversight of the Library's electronic 
services and data processes;
  * Acts as primary technical contact for the Office of Educational Technology, 
Center for Teaching Excellence, GGC Webmaster, GIL, and GALILEO
  * Creates documentation on various systems.
  * Maintains policies and procedures for Georgia Gwinnett College Library 
information systems;
  * Serves as the main library contact for the Georgia Gwinnett College Website;
  * Designs, implements and maintains the library portal pages;
  * Assists in managing library social media outlets;
  * Provides graphics support for library;
  * Assists with statistics gathering and reporting;
  * Provides limited in-house computer support as needed in concert with the 
Office of Education Technology;
  * Provides liaison and instructional support as needed;
  * Performs other duties as assigned.
Due to the volume of applications, applicants may not receive a reply from the
College unless an applicant is selected for an interview. Review of
applications will continue until positions are filled. Hiring is contingent
upon eligibility to work in the United States and proof of eligibility will be
contemporaneously required upon acceptance of an employment offer. Any
resulting employment offers are contingent upon successful completion of a
background investigation, as determined by Georgia Gwinnett College in its
sole discretion. Georgia Gwinnett College, a unit of the University System of
Georgia, is an Affirmative Action/Equal Opportunity employer and does not
discriminate on the basis of race, color, gender, national origin, age,
disability or religion. Georgia is an open records state.

  
Minimum Qualifications: Education:

ALA-accredited master's degree in library or information science.

  
Required:

  * Working knowledge of automated library procedures;
  * Experience with all modules of Ex Libris' Voyager library management system;
  * Experience using EZProxy;
  * Experience managing remotely accessible databases;
  * Experience working in a library consortium environment;
  * Some familiarity with Microsoft Access and Oracle databases; some knowledge 
of SQL preferred;
  * Familiarity with circulation and interlibrary loan procedures;
  * Basic understanding of cataloging rules and MARC records;
  * Experience in web design and maintenance;
  * Demonstrated understanding of instructional technology;
  * Working knowledge of basic reference sources and demonstrated familiarity 
with search strategies for a variety of electronic information resources, such 
as GALILEO, GIL, and the Internet;
  * Knowledge and experience with Microsoft Office, other typical personal 
computer applications, and basic networking technology;
  * Strong organizational and analytical skills;
  * Excellent interpersonal and communication skills
  * Strong service orientation



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10931/


[CODE4LIB] Job: Web Services Librarian at Boston College

2013-12-02 Thread jobs
Web Services Librarian
Boston College
Chestnut Hill

As a member of the Library Systems Department, the Web Services Librarian will
collaborate with Public Services managers and staff to ensure the smooth,
reliable operation and usability of the libraries' key public-facing web
content systems. He/she administers library web content management systems
(e.g. LibGuides CMS and Drupal), working closely with web content owners and
authors to make certain that library web pages are optimized to conform to
indexing, design and stylistic standards. He/she conducts individual
consultations, creates documentation, tutorials and other training materials
to support staff users of Drupal, LibGuides CMS and other public-facing
library web applications as required. He/She maintains CMS asset/shared
content databases and ensures their continued accuracy and usability.

  
The successful candidate will combine an understanding of both web content
management systems and issues and trends in public services in academic
libraries. This position works closely with the Learning Commons Manager, the
Head of Access Services, and the Head of Instruction Services to ensure that
existing library web applications and services meet the needs and expectations
of library patrons and staff. He/She also collaborates with public services
staff and other constituents to plan and implement new library web services
and to continually evaluate and assess the impact and usability of existing
library web services.

  
This position reports to the Manager of Library Web Services.

  
Requirements

  * MLS from an ALA accredited school
  * Experience in HTML/CSS and/or Web Content Management Systems
  * 2+ years' experience working in an academic library preferred
  * Experience in Drupal/Wordpress preferred
  * Experience in JavaScript/PHP/Python/Ruby preferred



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10934/


[CODE4LIB] Digital Collections Contexts Workshop at iConference 2014

2013-12-02 Thread Senseney, Megan Finn
Digital Collection Contexts:

Intellectual and Organizational Functions at Scale

Full-day workshop at iConference

Berlin, Germany

March 4, 2014


Registration is now open for a full-day workshop that examines conceptual and 
practical aspects of collections and the context they provide in the digital 
environment, especially in large-scale cultural heritage aggregations. 
Collections will be considered in relation to the information needs of 
scholars, roles of cultural institutions, and international interoperability. 
The workshop aims to:

  *   Broaden the conversation across an international community
  *   Further the research and development agenda for digital aggregations
  *   Relate conceptual advances to implementation goals
  *   Identify realistic approaches for collection representation, 
contextualization, and interoperability at scale


Sessions will be led by European and North American experts from iSchools and 
projects developing large-scale digital cultural heritage collections.

  *   Morning session: Conceptual Foundations of Digital Collections
 *   Carole L. Palmer and Karen Wickett (CIRSS, University of Illinois)
 *   Hur-li Lee (School of Information Studies, University of 
Wisconsin-Milwaukee)
 *   Martin Doerr (Institute of Computer Science, Foundation for Research 
and Technology – Hellas)
 *   Carlo Meghini (Istituto di Scienza e Tecnologie dell’Informazione, 
Consiglio Nazionale delle Ricerche).
  *   Afternoon session: Practical Implications for Digital Collections
 *   Antoine Isaac (Europeana Foundation)
 *   Emily Gore and Amy Rudersdorf (Digital Public Library of America)
 *   Sheila Anderson (Centre for e-Research, King’s College London)
 *   Shenghui Wang (OCLC Research)
 *   Mark Stevenson and Paul Clough (Department of Computer Science, 
University of Sheffield)


For a complete program and additional information about the workshop, please 
visit http://bit.ly/collectionsworkshop2014.


Early bird registration deadline is Sunday, December 15, 2013.  Workshops are 
included in the cost of conference registration. For more details, please see 
http://ischools.org/the-iconference/registration/.


--

Megan Finn Senseney
Project Coordinator, Research Services
Graduate School of Library and Information Science
University of Illinois at Urbana-Champaign
501 East Daniel Street
Champaign, Illinois 61820
Phone: (217) 244-5574
Email: mfsen...@illinois.edu
http://www.lis.illinois.edu/research/services/


[CODE4LIB] Job: Metadata Integration and Delivery Specialist at University of Virginia

2013-12-02 Thread jobs
Metadata Integration and Delivery Specialist
University of Virginia
Charlottesville

The University of Virginia Library seeks a Metadata Integration and Delivery
Specialist. Metadata Management Services facilitates access to library managed
content, participates in partnerships to share expertise and innovate in data
management and engages fully in the work of the University of Virginia. The
employee in this position supports that mission by collaborating with
colleagues within and outside of the department and Library to determine and
document best practices and workflows for non-MARC metadata creation. This
position is charged with finding creative, collaborative and sustainable
solutions for providing and managing metadata for a variety of physical and
digital resources. This position will also have responsibility for describing
resources in a variety of formats, following local and national standards. The
employee in this position is encouraged to be current with the community of
practice for non-MARC metadata and stay abreast of developments within the
broader field of information organization.

  
Qualifications:

Required: Bachelor's degree. Experience with library metadata standards and their
application, and a demonstrated ability to work independently and
collaboratively across groups to achieve objectives. Demonstrated ability to
create metadata according to established rules and standards. Demonstrated
ability to work collaboratively across groups to achieve objectives. Ability
to communicate effectively orally and in writing. Excellent interpersonal
skills.

  
To be considered for this position please visit our web site and apply on line
at the following link: jobs.virginia.edu



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10940/


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Fitchett, Deborah
Environment Canterbury has a click-through screen making you accept their terms 
and conditions before you get access to the API, and they use that as an 
opportunity to ask some questions about your intended use. Then once you've 
answered those you get direct access to the API as beautiful plain XML. (Okay, 
XML which possibly overuses attributes to carry data instead of tags, but I 
eventually figured out how to make my server's version of PHP happy with that.) 
It's glorious.  It made me so happy that I went back to their click-through 
screen to give them some more information about what I was doing.
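
(If anyone else runs into attribute-heavy XML like that, it's only one extra step in 
most parsers. A Python illustration with a made-up element, since I don't have their 
schema in front of me:)

    import xml.etree.ElementTree as ET

    doc = ET.fromstring('<observation site="SQ123" units="m3/s" value="1.234"/>')
    # The data rides in attributes rather than child elements:
    print(doc.get("site"), float(doc.get("value")), doc.get("units"))

PHP's SimpleXML has an attributes() call that does the same job.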

When I had to try and navigate Twitter's API and authentication models, 
however... Well, I absolutely understand the need for it, but it'll be a long 
time before I ever try that again.

Deborah

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Edward 
Summers
Sent: Tuesday, 3 December 2013 6:57 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] The lie of the API

On Dec 3, 2013, at 4:18 AM, Ross Singer rossfsin...@gmail.com wrote:
 I'm not going to defend API keys, but not all APIs are open or free.  
 You need to have *some* way to track usage.

A key (haha) thing that keys also provide is an opportunity to have a 
conversation with the user of your api: who are they, how could you get in 
touch with them, what are they doing with the API, what would they like to do 
with the API, what doesn't work? These questions are difficult to ask if they 
are just an IP address in your access log.

//Ed


Please consider the environment before you print this email.
The contents of this e-mail (including any attachments) may be confidential 
and/or subject to copyright. Any unauthorised use, distribution, or copying of 
the contents is expressly prohibited. If you have received this e-mail in 
error, please advise the sender by return e-mail or telephone and then delete 
this e-mail together with all attachments from your system.


[CODE4LIB] note about pre-conf proposal page

2013-12-02 Thread Rosalyn Metz
i was looking to see if yo_bj and i had any sign ups for Managing Projects
(brief plug: join us!) and i noticed a bunch of the proposals were missing
(like half of them).

I trolled through the history and put back the page as I think it should be
(best guess really).  so if you've made changes recently, take a look at
the page and make sure your edits aren't missing.

here is the page to make it easier for anyone to check up on it:
http://wiki.code4lib.org/index.php/2014_preconference_proposals


Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Kelley McGrath
Well, that would be much easier, but most of what I am working with are records 
for physical items (DVD, VHS, film) or licensed streaming video. The sample 
records are also not all UO records so I don't necessarily even have access to 
the source material (our goal is to build a general purpose tool). So I think I 
am stuck with extracting from MARC.

We should be able to get data for some resources by matching the MARC up with 
external data sources. That won't work for everything, though, so we want to 
make the process of extracting data from MARC as effective as possible.

Kelley


On Mon, Dec 2, 2013 at 7:03 AM, Alexander Duryee 
alexanderdur...@gmail.com wrote:
Is it out of the question to extract technical metadata from the
audiovisual materials themselves (via MediaInfo et al)?  It would minimize
the amount of MARC that needs to be processed and give more
accurate/complete data than relying on old cataloging records.


Re: [CODE4LIB] Looking for two coders to help with discoverability of videos

2013-12-02 Thread Kelley McGrath
Robert,

Your work also sounds very interesting and definitely overlaps with some of 
what we want to do. It seems like a lot of people are trying to get useful 
format information out of MARC records and it's unfortunate that it is so 
complicated. I would be very interested to see your logic for determining 
format and dealing with self-contradictory records. Runtime from the 008 is, as 
you say, pretty straightforward, but not always filled out, and useless if the 
resource is longer than 999 minutes.

It's interesting that you mention identifying directors. We have also been 
working on a similar, although more generalized, process. We're trying to 
identify all of the personal and organizational names mentioned in video 
records and, where possible, their roles. Our existing process is pretty 
accurate for personal names and for roles in English. It tends to struggle with 
credits involving multiple corporate bodies and we're working on building a 
lexicon of non-English terms for common roles. We're also trying to get people 
to hand-annotate credits to build a corpus to help us improve our process. 
(Help us out at http://olac-annotator.org/. And if you're willing to be on call 
to help with translating non-English credits, email me with the language(s) 
you'd be able to help out with. We also just started a mailing list at 
https://lists.uoregon.edu/mailman/listinfo/olac-credits)

Matching MARC records for moving images with external data sources is also on 
our radar. Most feature film type material can probably be identified by the 
attributes you mention: title, original date and director (probably 2 out of 3 
would work in most cases). We are also hoping to use these attributes (and 
possibly others) to cluster records for the same FRBR work.

It would be great to talk with you more about this off-list.

Kelley
kell...@uoregon.edu

From: Robert Haschart [rh...@virginia.edu]
Sent: Monday, December 02, 2013 10:49 AM
To: Code for Libraries
Cc: Kelley McGrath
Subject: Re: [CODE4LIB] Looking for two coders to help with discoverability of 
videos

Kelley,

The work you are proposing is interesting and overlaps somewhat both
with work I have already done and with a new project I'm looking into
here at UVa.
I have been the primary contributor to the Marc4j java project for the
past several years and am the creator of the project SolrMarc which
extracts data from Marc records based on a customizable specification,
to build Solr index records to facilitate rich discovery.

Much of my work on creating and improving these projects has been in
service of my actual job of creating and maintaining the Solr Index
behind our Blacklight-based discovery interface.   As a part of that
work I have created custom SolrMarc routines that extract the format of
items similar to what is described in Example 3, including looking in
the leader, 006, 007 and 008 to determine the format as-coded but
further looking in the 245 h, 300 and 538 fields to heuristically
determine when the format as-coded is incorrect and ought to be
overridden.   Most of the heuristic determination is targeted towards
Video material, and was initiated when I found an item that due to a
coding error was listed as a Video in Braille format.

Further I have developed a set of custom routines that look more closely
at Video items, one of which already extracts the runtime from the
008[18-20] field,
To modify it from its current form that currently returns the runtime in
minutes, to instead return it as   HH:MM as specified in your xls file,
and to further handle the edge case of  008[18-20] = 000  to return
over 16:39 would literally take about 15 minutes.

Another of these custom routines that is more fully-formed, is code for
extracting the Director of a video from the Marc record.  It examines
the contents of the fields 245c, 508a, 500a, 505a, 505t, employing
heuristics and targeted natural language processing techniques, to
attempt to correctly extract the Director.   At this point I believe
it achieves better results than a careful cataloger would achieve, even
one who specializes in film and video.

The other project I have just started investigating is an effort to
create and/or flesh out Marc records for video items based on heuristic
matching of title and director and date with data returned from
publicly-accessible movie information sites.

This more recent work may not be relevant to your needs but the custom
extraction routines seem directly applicable to your goals, and may also
provide a template that may make your other goals more easily achievable.

-Robert Haschart