[CODE4LIB] JHOVE2 version 2.0.0 released

2011-04-21 Thread Stephen Abrams
*** Cross posted ***

 

The JHOVE2 project team has released the 2.0.0 version of the open
source software on the project's Mercurial source code repository at
BitBucket.

 

This release supports all the major technical objectives of the project,
including a more sophisticated, modular architecture; signature-based
file identification; policy-based assessment of objects; recursive
characterization of objects comprising aggregate files and files
arbitrarily nested in containers; and extensive configuration and
reporting options.  It provides a stable interface against which
developers can code new format modules.

 

Format modules included in this release are:

 

* ICC color profiles

* SGML

* Shapefile

* TIFF

* UTF-8

* WAVE

* XML

* Zip

 

Please note that the ZIP module is a non-validating partial module: it
performs recursive JHOVE2 descent on the contents of the ZIP file, but
does not yet validate the ZIP file itself against the standard.
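
As a rough illustration of what "recursive descent without validation"
means, here is a minimal Python sketch.  It is not JHOVE2 code (JHOVE2 is
a Java framework) and the function names walk_zip and make_zip are made up
for this example.  It simply lists every file in a ZIP, descending into
any nested ZIPs it finds, and performs no conformance checking of the
container itself.

```python
import io
import zipfile


def make_zip(entries):
    """Build an in-memory ZIP from a {name: bytes} mapping (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buf.getvalue()


def walk_zip(data, prefix=""):
    """Yield (path, size) for each file, descending into nested ZIPs.

    No validation against the ZIP specification is attempted; the
    container is only opened and its members are visited.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            payload = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(payload)):
                # Nested container: recurse rather than report it as a leaf.
                yield from walk_zip(payload, prefix=path + "!/")
            else:
                yield path, len(payload)


inner = make_zip({"a.txt": b"hello"})
outer = make_zip({"inner.zip": inner, "b.txt": b"hi!"})
for path, size in walk_zip(outer):
    print(path, size)
```

The "!/" separator for nested paths is only a display convention chosen
for this sketch.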

 

Modules to be delivered in future releases include:

 

* ARC  (developed by BnF/Atos)

* Gzip (developed by BnF/Atos)

* NetCDF (provided by Wegener Institute for Polar and Marine
Research)

* JPEG-2000

* PDF

 

More information on JHOVE2 can be found at the project wiki
or via email on the
"JHOVE2-Announce-L" email list.  Please direct any comments or
suggestions to the "JHOVE2-TechTalk-L" mailing list for community
discussion.  

 

Sincerely, 

 

Stephen Abrams, California Digital Library

Tom Cramer, Stanford University

Sheila Morrissey, Portico

 

on behalf of the JHOVE2 project team

http://jhove2.org/

 


Re: [CODE4LIB] LCSH and Linked Data

2011-04-21 Thread Kyle Banerjee
The short version of this lengthy post is that there's really no value in
worrying about how to handle precoordinated strings except for purposes of
busting them up.

The Rube Goldberg style precoordination rules that cause so many headaches
were developed to address challenges brought about by paper card catalogs.
The physicality of paper required a mechanism to ensure a limited number of
cards would file together. Unless you still use a paper catalog, they're as
relevant as spurs are to race car drivers.

The order you see in the MARC record mimics the paper rules exactly (because
MARC was used mostly for card printing for decades) and has also led to
literally tens of millions of unique subject strings as there are so many
permutations.  As a practical matter, even highly trained librarians cannot
guess how these were put together without going through a substantial
research process.

I hate to dig up stuff written in the 1920s that's rammed down the throats
of first semester library school students. However, in the case at hand,
logic from these works has direct application for purposes of making MARC
data usable.

To summarize, the concept is that subjects can be broken down into aspects
(i.e. facets) with the primary ones time, place, action, material, and
personality -- you can think of this last category as natural groupings of
the type that standardized subdivisions can be applied to such as materials,
animals, corporate entities, diseases, body parts, etc.

It's much better to think of the facets (time, place, etc) as attributes
rather than as occurring in any particular order, as this allows interactive and
relatively precise drilling through huge amounts of data. You'll notice that
good search engines effectively do just that.
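
To make the "busting up" concrete, here is a minimal Python sketch.  It
assumes headings written the way they appear in this thread ("Dwellings
$z Australia $x History") and uses the standard MARC 21 subject subfield
codes ($a topic, $v form, $x general subdivision, $y chronological, $z
geographic).  The facet names and the function name bust are my own
choices for illustration, not part of any standard.

```python
import re

# MARC 21 subject subfield codes mapped to illustrative facet names.
FACETS = {"a": "topic", "v": "form", "x": "general", "y": "time", "z": "place"}


def bust(heading):
    """Split a precoordinated heading into an order-free facet map."""
    text = heading.strip()
    if not text.startswith("$"):
        # The leading term carries no explicit code here; treat it as $a.
        text = "$a " + text
    # Splitting on "$<code>" with a capture group alternates codes and values.
    parts = re.split(r"\s*\$(\w)\s*", text)
    facets = {}
    for code, value in zip(parts[1::2], parts[2::2]):
        if value:
            facets.setdefault(FACETS.get(code, code), set()).add(value)
    return facets


current = bust("Dwellings $z Australia $x History $y 20th century")
airlie = bust("Dwellings $x History $z Australia $y 20th century")
print(current == airlie)
print(sorted(current.items()))
```

Once the strings are reduced to attributes like this, the two competing
subdivision orders become the same record, which is the point: the order
only mattered for filing cards.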

kyle




> One of the challenges for pre-coordinated strings at least as currently
> implemented (that facets evade) is that no order will suit everyone. Which
> of the following is better?
>
> Dwellings $z Australia $x History $y 20th century
> Dwellings $z Indonesia $x Economic aspects
> Dwellings $z Indonesia $x Psychological aspects
> Dwellings $z Indonesia $x Social aspects
> Dwellings $z Ireland $x Economic aspects
> Dwellings $z Ireland $x Psychological aspects
> Dwellings $z Ireland $x Social aspects
> Dwellings $z Japan $x Economic aspects
> Dwellings $z Japan $x Psychological aspects
> Dwellings $z Japan $x Social aspects
>
> OR (mostly current practice)
>
> *Dwellings $z Australia $x History $y 20th century  **Current practice
> Dwellings $x Economic aspects $z Indonesia
> Dwellings $x Economic aspects $z Ireland
> Dwellings $x Economic aspects $z Japan
> *Dwellings $x History $z Australia $y 20th century  **Airlie recommendation
> Dwellings $x Psychological aspects $z Indonesia
> Dwellings $x Psychological aspects $z Ireland
> Dwellings $x Psychological aspects $z Japan
> Dwellings $x Social aspects $z Indonesia
> Dwellings $x Social aspects $z Ireland
> Dwellings $x Social aspects $z Japan
>


[CODE4LIB] Special Library - Direct Hire Job Opportunity - SF Bay Area (South Bay)

2011-04-21 Thread Gloria Elia
*AIM Library & Information Staffing (AIM)* www.aimusa.com has a fulltime,
direct hire job opportunity in the San Francisco Bay Area (South Bay) for a
systems librarian with the following background, including but not limited
to:  IT support, managing ILSs, knowledge of Linux, Apache, MySQL, Perl/PHP;
cloud computing and mobile platforms; sourcing, recommending, and
implementing new software solutions to improve and enhance IT processes for
users.

For immediate consideration and more information, please email resume to
ge...@aimusa.com

Thank you for your interest in AIM job opportunities.

Regards,

Gloria

-- 
Gloria Elia
AIM Library & Information Staffing
Toll Free: 877-965-7900 ext. 100 . Fax: 650-965-7774
Email: ge...@aimusa.com
Web: www.aimusa.com

*Full Time *Part Time *Direct Hire *Temporary *Contract
*Library Maintenance *Executive Recruitment



Re: [CODE4LIB] Code4Lib Virtual Lightning Talks Rescheduled to April 29th

2011-04-21 Thread Peter Murray
All,

A week and a day until the first run of the Code4Lib Virtual Lightning Talks.  
Three presenters have signed up so far, with room for more:

 CodaBox: Using E-Prints for a small scale personal repository
 Edward M. Corrado

 MARC-DM: a JavaScript API for indexing MARC-JSON records in CouchDB
 Luciano Ramalho

 Extending VuFind for cross-collection search
 Michael Appleby and Youn Noh

Plenty of room for people to watch, too.  Please sign up at

  http://wiki.code4lib.org/index.php/Virtual_Lightning_Talks


Peter

On Apr 4, 2011, at 2:47 PM, Peter Murray wrote:
> 
> Thanks for the feedback, everyone (both on and off list).  The 
> one-week-to-prepare was, in hindsight, too aggressive.  Okay, lesson learned 
> (I hope).
> 
> The Virtual Lightning Talks session has been rescheduled to April 29th from 
> 1pm to 2pm Eastern U.S. time.  I've reset and updated the sign-up form:
> 
> http://wiki.code4lib.org/index.php/Virtual_Lightning_Talks
> 
> Peter

-- 
Peter Murray   peter.mur...@lyrasis.org   tel:+1-678-235-2955
 
Ass't Director, Technology Services Development   http://dltj.org/about/
Lyrasis   --Great Libraries. Strong Communities. Innovative Answers.
The Disruptive Library Technology Jester   http://dltj.org/ 
Attrib-Noncomm-Share   http://creativecommons.org/licenses/by-nc-sa/2.5/ 


Re: [CODE4LIB] Fwd: [Air-L] Using archives of the web for research

2011-04-21 Thread Robert Sanderson
Our work on Memento comes to mind, of course.

http://www.mementoweb.org/

And in particular, regarding the second point, our papers about the
use of Memento for non-traditional interactions with web archives:

* http://arxiv.org/abs/1003.3661
Using Memento to recover the state of a web resource at the point in
time it was annotated, to ensure that the annotation is displayed with
the correct representation.

* http://arxiv.org/abs/1003.2643
Using Memento with Linked Data to perform time series analysis.

And hopefully a paper at Open Repositories, describing initial and
ongoing research, briefly summarized in:

* 
http://public.lanl.gov/herbertv/papers/Papers/2011/MementoPoster_IKS_201104.pdf
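
For anyone who wants to see the mechanics: in Memento, a client asks for
a past state of a resource by sending an Accept-Datetime request header
to a TimeGate.  The Python sketch below only builds that header for a
given moment; it sends nothing over the network, and the function name
memento_headers is just something made up for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_headers(when):
    """Return request headers asking for the resource state near `when`."""
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    # Accept-Datetime takes an HTTP-style date expressed in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": stamp}


# Example: the moment an annotation was made (an arbitrary date here).
annotated_at = datetime(2010, 4, 20, 12, 0, 0, tzinfo=timezone.utc)
print(memento_headers(annotated_at))
```

Those headers would then go on a GET or HEAD request to whichever
TimeGate you are using for the resource.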


Hope that helps!

Rob


On Thu, Apr 21, 2011 at 8:58 AM, Jodi Schneider  wrote:
> Code4Lib, any thoughts for Eric? -Jodi
>
> -- Forwarded message --
> From: Eric Meyer 
> Date: Wed, Apr 20, 2011 at 4:46 PM
> Subject: [Air-L] Using archives of the web for research
> To: "ai...@listserv.aoir.org" 
> Cc: Ralph Schroeder , "
> a...@proteus-associates.com" 
>
>
> Dear AoIR,
>
> OII is currently doing some work for the IIPC (International Internet
> Preservation Consortium: http://www.netpreserve.org), and part of the work
> involves identifying current and cutting edge research techniques and tools
> that are available for research on the live web, but that are currently
> either difficult or impossible to use with web archives such as the Internet
> Archive (http://www.archive.org/) or other IIPC member organisations.
>
> The short version of what we are hoping to get from this group:
> - do you know of any innovative uses of web archives for research?
> - what techniques for researching the live web should be adapted for use
> with web archives?
> - can you envisage any innovative uses of web archives (or other archived
> Internet data) for research that you would ideally like to be able to do?
>
> The longer version:
> What we are hoping is that members of AoIR will respond to us (off list)
> with your ideas about ways you research the live web that could potentially
> be enhanced using either snapshots from the web at different time points or
> longitudinal data about the web over time, but which would need additional
> support, training, tools, or infrastructure to be able to accomplish.  Your
> responses will be used to influence the IIPC community to add web archive
> support for the kinds of cutting edge research that AoIR members are doing.
>
> Also, if you have any types of research or research questions you have been
> hoping to be able to do with archived internet data but have not been able
> to do for whatever reason, and you are willing to share the ideas and the
> barriers to researching them with us for possible inclusion in our
> discussion paper, that would be appreciated as well.
>
> Responses before 1 May will be most helpful. We will post the draft report
> back to the list in May, and the final report in the summer.  Those
> interested in web archives may also find two reports we wrote last autumn to
> be of interest:
> Dougherty, M., Meyer, E.T., Madsen, C., van den Heuvel, C., Thomas, A.,
> Wyatt, S. (2010). Researcher Engagement with Web Archives: State of the Art.
> London: JISC. Online: http://ssrn.com/abstract=1714997 or
> http://ie-repository.jisc.ac.uk/544/
> Thomas, A., Meyer, E.T., Dougherty, M., van den Heuvel, C., Madsen, C.,
> Wyatt, S. (2010). Researcher Engagement with Web Archives: Challenges and
> Opportunities for Investment. London: JISC. Online:
> http://ssrn.com/abstract=1715000 or http://ie-repository.jisc.ac.uk/543/
>
>
> Eric T. Meyer
> Research Fellow, Oxford Internet Institute
> University of Oxford
> eric.me...@oii.ox.ac.uk
> http://people.oii.ox.ac.uk/meyer
>
>
>
> ___
> The ai...@listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/
>


[CODE4LIB] Fwd: [Air-L] LAST CALL - Out of the Box: Building and Using Web Archiving Collections - Registration closes May 2, 2011.

2011-04-21 Thread Jodi Schneider
-- Forwarded message --
From: Grotke, Abigail 


The International Internet Preservation Consortium (IIPC) is holding a
day-long public event May 9, 2011 at the KB, the National Library of the
Netherlands, in The Hague. "Out of the Box: Building and Using Web Archive
Collections" will feature library and museum curators highlighting their web
archives and researchers showcasing projects that utilize web archives.

Program Highlights

Collection Treasures:

* Online Revolutions in the Arabic World: Event Harvesting in
Perspective from the Library of Congress, National Library of France,
Internet Archive and the Bibliotheca Alexandrina.

Web Archiving short stories:

* Why is the Web Archive Switzerland like a Swiss Cheese?

* Grow your own diamonds: increasing value of the KB web archive

Using Web Archives:

* Mapping the Dutch Blogosphere with the Internet Archive

* Researchers at the British Library and the UK Web Archive

Attendance at this one-day event is free and open to all.

Who should attend? Curators, researchers, administrators, and others who are
interested in the collection, preservation and access of web archives for
academic, cultural heritage or research purposes.

Registration (http://www.netpreserve.org/events/hague_reg.php) is required
for entry. Registration closes May 2, 2011.

The full program and logistical information are available at:
http://www.netpreserve.org/events/2011GAoutofthebox.php

If you have any questions please contact Abbey Potter, IIPC Communications
Officer a...@loc.gov or Frederique Vijftigschild at the
KB frederique.vijftigsch...@kb.nl.



~~
Abbie Grotke
Web Archiving Team Lead
Office of Strategic Initiatives
National Digital Information and Infrastructure Preservation Program
Library of Congress
http://www.loc.gov/webarchiving/
http://www.digitalpreservation.gov
202-707-2833
a...@loc.gov


[CODE4LIB] Fwd: [Air-L] Using archives of the web for research

2011-04-21 Thread Jodi Schneider
Code4Lib, any thoughts for Eric? -Jodi

-- Forwarded message --
From: Eric Meyer 
Date: Wed, Apr 20, 2011 at 4:46 PM
Subject: [Air-L] Using archives of the web for research
To: "ai...@listserv.aoir.org" 
Cc: Ralph Schroeder , "
a...@proteus-associates.com" 


Dear AoIR,

OII is currently doing some work for the IIPC (International Internet
Preservation Consortium: http://www.netpreserve.org), and part of the work
involves identifying current and cutting edge research techniques and tools
that are available for research on the live web, but that are currently
either difficult or impossible to use with web archives such as the Internet
Archive (http://www.archive.org/) or other IIPC member organisations.

The short version of what we are hoping to get from this group:
- do you know of any innovative uses of web archives for research?
- what techniques for researching the live web should be adapted for use
with web archives?
- can you envisage any innovative uses of web archives (or other archived
Internet data) for research that you would ideally like to be able to do?

The longer version:
What we are hoping is that members of AoIR will respond to us (off list)
with your ideas about ways you research the live web that could potentially
be enhanced using either snapshots from the web at different time points or
longitudinal data about the web over time, but which would need additional
support, training, tools, or infrastructure to be able to accomplish.  Your
responses will be used to influence the IIPC community to add web archive
support for the kinds of cutting edge research that AoIR members are doing.

Also, if you have any types of research or research questions you have been
hoping to be able to do with archived internet data but have not been able
to do for whatever reason, and you are willing to share the ideas and the
barriers to researching them with us for possible inclusion in our
discussion paper, that would be appreciated as well.

Responses before 1 May will be most helpful. We will post the draft report
back to the list in May, and the final report in the summer.  Those
interested in web archives may also find two reports we wrote last autumn to
be of interest:
Dougherty, M., Meyer, E.T., Madsen, C., van den Heuvel, C., Thomas, A.,
Wyatt, S. (2010). Researcher Engagement with Web Archives: State of the Art.
London: JISC. Online: http://ssrn.com/abstract=1714997 or
http://ie-repository.jisc.ac.uk/544/
Thomas, A., Meyer, E.T., Dougherty, M., van den Heuvel, C., Madsen, C.,
Wyatt, S. (2010). Researcher Engagement with Web Archives: Challenges and
Opportunities for Investment. London: JISC. Online:
http://ssrn.com/abstract=1715000 or http://ie-repository.jisc.ac.uk/543/


Eric T. Meyer
Research Fellow, Oxford Internet Institute
University of Oxford
eric.me...@oii.ox.ac.uk
http://people.oii.ox.ac.uk/meyer



___
The ai...@listserv.aoir.org mailing list
is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at:
http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org

Join the Association of Internet Researchers:
http://www.aoir.org/


Re: [CODE4LIB] 2012 Conference Dates

2011-04-21 Thread Tom Keays
Apparently no date set as yet.

http://code4lib.org/node/405
http://sites.google.com/site/code4lib2012seattle/



On Thu, Apr 21, 2011 at 8:23 AM, Richard, Joel M  wrote:

> Good morning,
>
> I know that Seattle has been chosen for the next code4lib conference, but I
> can't find any info on dates. I'm really hoping it doesn't fall on the week
> of Mardi Gras (Feb 21, 2012). Does anyone have info on this?
>
> Thanks!
> --Joel
>
>
> Joel Richard
> IT Specialist, Web Services Department
> Smithsonian Institution Libraries | http://www.sil.si.edu/
> (202) 633-1706 | richar...@si.edu
>


[CODE4LIB] 2012 Conference Dates

2011-04-21 Thread Richard, Joel M
Good morning,

I know that Seattle has been chosen for the next code4lib conference, but I 
can't find any info on dates. I'm really hoping it doesn't fall on the week of 
Mardi Gras (Feb 21, 2012). Does anyone have info on this?

Thanks!
--Joel


Joel Richard
IT Specialist, Web Services Department
Smithsonian Institution Libraries | http://www.sil.si.edu/
(202) 633-1706 | richar...@si.edu