Re: [CODE4LIB] Code4Lib 2015 Newcomer Dinner Question

2015-01-28 Thread Becky Yoose
Technically, the "official" dinners can start anytime; the 6 pm start time
is only a guideline. People can have the dinner whenever it is a good time
for them to meet, either earlier or later that night.

You can either lead a group, as Cary suggested, or stop by one of the
brewpubs that is hosting a few groups, since those groups might stay
longer than the others.

Thanks,
Becky

On Wed, Jan 28, 2015 at 6:30 PM, Cary Gordon  wrote:

> You will likely miss the "official" dinners, as pretty much all of those
> start at 6-6:30. Of course, you could just claim a restaurant on the list
> and have it start whenever you want.
>
> Cary
>
> > On Jan 28, 2015, at 12:15 PM, Matthew Sherman wrote:
> >
> > Hi all,
> >
> > This question is directed at folks attending Code4Lib 2015 in almost a week
> > and a half.  Are any of the groups for the dinner leaving after 7pm?  I ask
> > as sadly my flight doesn't land until about 6:30 pm that day.  If anyone is
> > eating a little later it would be great to join you guys.  Thanks for any
> > info people can give.
> >
> > Matt Sherman
>


Re: [CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Conal Tuohy
Laura, is it an option to migrate the literary content into a TEI form? You
could consolidate the objects that make up a single text into a single
complex object, with embedded metadata (at whatever level you like), and
then wheel in some existing TEI content management / presentation system.


On 29 January 2015 at 10:11, Laura Buchholz  wrote:

> The short answer is that what we have right now is document-type items
> (pages of books or letters, front and back of a map), but that might grow
> in the future to include video or multiple views of art objects. The
> documents are the main concern right now.
>
> Most of our compound objects in contentdm are really just items that are
> made up of multiple files (10 tiffs corresponding to 10 pages of a book),
> so those are easy enough to deal with in the new system--there is no part
> level metadata to display. There are others where there was a desire to
> provide a method of navigation based on titles, similar to bookmarks in a
> pdf, so that navigation has to go somewhere. And still others (a couple of
> rare books) where there are keywords or descriptions that are unique to
> each page, and it is necessary to display that page level metadata. There
> might also be something like a literary magazine, where there is a desire
> to record the titles and authors of each poem/story/etc and to display that
> info somewhere. We have one publication where we crop out articles to add
> as single items, in addition to adding and displaying the whole issue. It
> can get tedious to crop those.
>
> The easiest would be to just avoid recording part level metadata, or to add
> it in the main record, but since it is provided now in the current system,
> we can't really take it away. And cropping things is no fun.
>
> Thanks!
>
> On Wed, Jan 28, 2015 at 3:30 PM, Kyle Banerjee 
> wrote:
>
> > The best way to display compound objects really depends on the nature of
> > the compound objects. For example, the optimal display for a book stored as
> > a compound object will be very different than an art object taken from
> > various vantage points or a dataset. Likewise, whether you can get away
> > with not creating/displaying metadata for components of compound objects
> > depends on the use case. If you could say a bit more about what kind of
> > compound objects you have and what system(s) you are migrating to, people
> > could probably give you better advice.
> >
> > kyle
> >
> >
> > On Wed, Jan 28, 2015 at 1:43 PM, Laura Buchholz wrote:
> >
> > > We're migrating from CONTENTdm and trying to figure out how to display
> > > compound objects (or the things formerly known as compound objects) and
> > > metadata for the end user. Can anyone point me to really good examples of
> > > displaying items like this, especially where the user can see metadata for
> > > parts of the whole? I'm looking more for examples of the layout of all the
> > > different components on the page (or pages) rather than specific image
> > > viewers. Our new system is homegrown, so we have a lot of flexibility in
> > > deciding where things go.
> > >
> > > We essentially have:
> > > -the physical item (multiple files per item of images of text, plain
> > > text, pdf)
> > > -metadata about the item
> > > -possibly metadata about a part of the item (think title/author/subjects
> > > for a newspaper article within the whole newspaper issue), of which the
> > > titles might be used for navigation through the whole item.
> > >
> > > I think Hathi Trust has a good example of all these components coming
> > > together (except viewing non-title metadata for parts), and I'm curious if
> > > there are others. Or do most places just skip creating/displaying any kind
> > > of metadata for the parts of the whole?
> > >
> > > Thanks for any help!
> > >
> > > --
> > > Laura Buchholz
> > > Digital Assets Specialist
> > > Reed College
> > > 503-517-7629
> > > laura.buchh...@reed.edu
> > >
> >
>
>
>
> --
> Laura Buchholz
> Digital Assets Specialist
> Reed College
> 503-517-7629
> laura.buchh...@reed.edu
>


Re: [CODE4LIB] Code4Lib 2015 Newcomer Dinner Question

2015-01-28 Thread Cary Gordon
You will likely miss the "official" dinners, as pretty much all of those start 
at 6-6:30. Of course, you could just claim a restaurant on the list and have it 
start whenever you want.

Cary

> On Jan 28, 2015, at 12:15 PM, Matthew Sherman  
> wrote:
> 
> Hi all,
> 
> This question is directed at folks attending Code4Lib 2015 in almost a week
> and a half.  Are any of the groups for the dinner leaving after 7pm?  I ask
> as sadly my flight doesn't land until about 6:30 pm that day.  If anyone is
> eating a little later it would be great to join you guys.  Thanks for any
> info people can give.
> 
> Matt Sherman


Re: [CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Laura Buchholz
The short answer is that what we have right now is document-type items
(pages of books or letters, front and back of a map), but that might grow
in the future to include video or multiple views of art objects. The
documents are the main concern right now.

Most of our compound objects in contentdm are really just items that are
made up of multiple files (10 tiffs corresponding to 10 pages of a book),
so those are easy enough to deal with in the new system--there is no part
level metadata to display. There are others where there was a desire to
provide a method of navigation based on titles, similar to bookmarks in a
pdf, so that navigation has to go somewhere. And still others (a couple of
rare books) where there are keywords or descriptions that are unique to
each page, and it is necessary to display that page level metadata. There
might also be something like a literary magazine, where there is a desire
to record the titles and authors of each poem/story/etc and to display that
info somewhere. We have one publication where we crop out articles to add
as single items, in addition to adding and displaying the whole issue. It
can get tedious to crop those.

The easiest would be to just avoid recording part level metadata, or to add
it in the main record, but since it is provided now in the current system,
we can't really take it away. And cropping things is no fun.

Thanks!

On Wed, Jan 28, 2015 at 3:30 PM, Kyle Banerjee 
wrote:

> The best way to display compound objects really depends on the nature of
> the compound objects. For example, the optimal display for a book stored as
> a compound object will be very different than an art object taken from
> various vantage points or a dataset. Likewise, whether you can get away
> with not creating/displaying metadata for components of compound objects
> depends on the use case. If you could say a bit more about what kind of
> compound objects you have and what system(s) you are migrating to, people
> could probably give you better advice.
>
> kyle
>
>
> On Wed, Jan 28, 2015 at 1:43 PM, Laura Buchholz 
> wrote:
>
> > We're migrating from CONTENTdm and trying to figure out how to display
> > compound objects (or the things formerly known as compound objects) and
> > metadata for the end user. Can anyone point me to really good examples of
> > displaying items like this, especially where the user can see metadata for
> > parts of the whole? I'm looking more for examples of the layout of all the
> > different components on the page (or pages) rather than specific image
> > viewers. Our new system is homegrown, so we have a lot of flexibility in
> > deciding where things go.
> >
> > We essentially have:
> > -the physical item (multiple files per item of images of text, plain
> > text, pdf)
> > -metadata about the item
> > -possibly metadata about a part of the item (think title/author/subjects
> > for a newspaper article within the whole newspaper issue), of which the
> > titles might be used for navigation through the whole item.
> >
> > I think Hathi Trust has a good example of all these components coming
> > together (except viewing non-title metadata for parts), and I'm curious if
> > there are others. Or do most places just skip creating/displaying any kind
> > of metadata for the parts of the whole?
> >
> > Thanks for any help!
> >
> > --
> > Laura Buchholz
> > Digital Assets Specialist
> > Reed College
> > 503-517-7629
> > laura.buchh...@reed.edu
> >
>



-- 
Laura Buchholz
Digital Assets Specialist
Reed College
503-517-7629
laura.buchh...@reed.edu


Re: [CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Peter Murray
Islandora has a compound image model that allows for objects in the repository 
to be related to each other.  An example in the Islandora Foundation's sandbox:

  http://sandbox.islandora.ca/islandora/object/islandora%3A105

This is made up of two large image objects:

  http://sandbox.islandora.ca/islandora/object/islandora%3A103
  http://sandbox.islandora.ca/islandora/object/islandora%3A104

Through theming you can pick which metadata you want to appear on the 
collection object page.  By default, it displays the metadata of the first 
item.  Although the two component items are large images, there is not a 
restriction on the types of objects that can be related in a collection object, 
nor is there a limit on the number of objects that can be related to one 
collection object.


Peter

> On Jan 28, 2015, at 4:43 PM, Laura Buchholz  wrote:
> 
> We're migrating from CONTENTdm and trying to figure out how to display
> compound objects (or the things formerly known as compound objects) and
> metadata for the end user. Can anyone point me to really good examples of
> displaying items like this, especially where the user can see metadata for
> parts of the whole? I'm looking more for examples of the layout of all the
> different components on the page (or pages) rather than specific image
> viewers. Our new system is homegrown, so we have a lot of flexibility in
> deciding where things go.
> 
> We essentially have:
> -the physical item (multiple files per item of images of text, plain
> text, pdf)
> -metadata about the item
> -possibly metadata about a part of the item (think title/author/subjects
> for a newspaper article within the whole newspaper issue), of which the
> titles might be used for navigation through the whole item.
> 
> I think Hathi Trust has a good example of all these components coming
> together (except viewing non-title metadata for parts), and I'm curious if
> there are others. Or do most places just skip creating/displaying any kind
> of metadata for the parts of the whole?
> 
> Thanks for any help!


--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
peter.mur...@lyrasis.org
+1 678-235-2955
800.999.8558 x2955


Re: [CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Kyle Banerjee
The best way to display compound objects really depends on the nature of
the compound objects. For example, the optimal display for a book stored as
a compound object will be very different than an art object taken from
various vantage points or a dataset. Likewise, whether you can get away
with not creating/displaying metadata for components of compound objects
depends on the use case. If you could say a bit more about what kind of
compound objects you have and what system(s) you are migrating to, people
could probably give you better advice.

kyle


On Wed, Jan 28, 2015 at 1:43 PM, Laura Buchholz 
wrote:

> We're migrating from CONTENTdm and trying to figure out how to display
> compound objects (or the things formerly known as compound objects) and
> metadata for the end user. Can anyone point me to really good examples of
> displaying items like this, especially where the user can see metadata for
> parts of the whole? I'm looking more for examples of the layout of all the
> different components on the page (or pages) rather than specific image
> viewers. Our new system is homegrown, so we have a lot of flexibility in
> deciding where things go.
>
> We essentially have:
> -the physical item (multiple files per item of images of text, plain
> text, pdf)
> -metadata about the item
> -possibly metadata about a part of the item (think title/author/subjects
> for a newspaper article within the whole newspaper issue), of which the
> titles might be used for navigation through the whole item.
>
> I think Hathi Trust has a good example of all these components coming
> together (except viewing non-title metadata for parts), and I'm curious if
> there are others. Or do most places just skip creating/displaying any kind
> of metadata for the parts of the whole?
>
> Thanks for any help!
>
> --
> Laura Buchholz
> Digital Assets Specialist
> Reed College
> 503-517-7629
> laura.buchh...@reed.edu
>


Re: [CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Esmé Cowles
Laura-

At UCSD, we have complex objects which range from a flat list of files (e.g. 
page images):

http://library.ucsd.edu/dc/object/bb59054559

all the way up to pretty involved hierarchy modeling a filesystem:

http://library.ucsd.edu/dc/object/bb9796611k

Many of these have a hierarchy with files attached, but not much metadata for 
the individual parts.  But there are also some objects with more metadata for 
each part:

http://library.ucsd.edu/dc/object/bb0479301d?

-Esme

> On 01/28/15, at 4:43 PM, Laura Buchholz  wrote:
> 
> We're migrating from CONTENTdm and trying to figure out how to display
> compound objects (or the things formerly known as compound objects) and
> metadata for the end user. Can anyone point me to really good examples of
> displaying items like this, especially where the user can see metadata for
> parts of the whole? I'm looking more for examples of the layout of all the
> different components on the page (or pages) rather than specific image
> viewers. Our new system is homegrown, so we have a lot of flexibility in
> deciding where things go.
> 
> We essentially have:
> -the physical item (multiple files per item of images of text, plain
> text, pdf)
> -metadata about the item
> -possibly metadata about a part of the item (think title/author/subjects
> for a newspaper article within the whole newspaper issue), of which the
> titles might be used for navigation through the whole item.
> 
> I think Hathi Trust has a good example of all these components coming
> together (except viewing non-title metadata for parts), and I'm curious if
> there are others. Or do most places just skip creating/displaying any kind
> of metadata for the parts of the whole?
> 
> Thanks for any help!
> 
> -- 
> Laura Buchholz
> Digital Assets Specialist
> Reed College
> 503-517-7629
> laura.buchh...@reed.edu


[CODE4LIB] examples of displays for compound objects and metadata

2015-01-28 Thread Laura Buchholz
We're migrating from CONTENTdm and trying to figure out how to display
compound objects (or the things formerly known as compound objects) and
metadata for the end user. Can anyone point me to really good examples of
displaying items like this, especially where the user can see metadata for
parts of the whole? I'm looking more for examples of the layout of all the
different components on the page (or pages) rather than specific image
viewers. Our new system is homegrown, so we have a lot of flexibility in
deciding where things go.

We essentially have:
-the physical item (multiple files per item of images of text, plain
text, pdf)
-metadata about the item
-possibly metadata about a part of the item (think title/author/subjects
for a newspaper article within the whole newspaper issue), of which the
titles might be used for navigation through the whole item.
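
To make the three pieces above concrete, here is a rough sketch of one way they could be modeled. This is only an illustration, not a description of our system or of CONTENTdm: the class names (`Part`, `CompoundItem`), the field names, and the idea of tying each part to the index of its first file are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    """One described part of an item, e.g. an article within a newspaper issue."""
    title: str
    first_file: int  # index into CompoundItem.files where this part begins (assumed convention)
    metadata: dict = field(default_factory=dict)  # e.g. author, subjects


@dataclass
class CompoundItem:
    """The whole item: its files, item-level metadata, and optional part records."""
    files: list      # page images, plain text, PDF, in reading order
    metadata: dict   # metadata about the item as a whole
    parts: list = field(default_factory=list)

    def _ordered_parts(self):
        return sorted(self.parts, key=lambda p: p.first_file)

    def navigation(self):
        """Return (title, file index) pairs, similar to bookmarks in a PDF."""
        return [(p.title, p.first_file) for p in self._ordered_parts()]

    def metadata_for_file(self, index):
        """Return the metadata of the part covering a file, else the item's metadata."""
        current = None
        for p in self._ordered_parts():
            if p.first_file <= index:
                current = p
        return current.metadata if current else self.metadata
```

With something like this, a display page could call `navigation()` to build the title list and `metadata_for_file()` to decide which metadata to show beside the page being viewed. Items with no part-level metadata simply fall back to the item record.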

I think Hathi Trust has a good example of all these components coming
together (except viewing non-title metadata for parts), and I'm curious if
there are others. Or do most places just skip creating/displaying any kind
of metadata for the parts of the whole?

Thanks for any help!

-- 
Laura Buchholz
Digital Assets Specialist
Reed College
503-517-7629
laura.buchh...@reed.edu


[CODE4LIB] Job: Associate University Librarians at George Washington University

2015-01-28 Thread jobs
Associate University Librarians
George Washington University
Washington, D.C.

GW is seeking two associate university librarians to complete a dynamic senior
leadership team. Diverse management portfolios will play to individual
strengths and allow for maximum flexibility in a collaborative, innovative
organization in the heart of the nation's capital.

  
More details at http://www.gwu.jobs/postings/25802.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/19071/
To post a new job please visit http://jobs.code4lib.org/


[CODE4LIB] Code4Lib 2015 Newcomer Dinner Question

2015-01-28 Thread Matthew Sherman
Hi all,

This question is directed at folks attending Code4Lib 2015 in almost a week
and a half.  Are any of the groups for the dinner leaving after 7pm?  I ask
as sadly my flight doesn't land until about 6:30 pm that day.  If anyone is
eating a little later it would be great to join you guys.  Thanks for any
info people can give.

Matt Sherman


[CODE4LIB] Job: Library Systems Migration Expert at Contra Costa Community College District

2015-01-28 Thread jobs
Library Systems Migration Expert
Contra Costa Community College District
Pittsburg, CA

The Contra Costa Community College District (4CD), which comprises Contra
Costa College (CCC) in San Pablo, CA, Diablo Valley College (DVC) in Pleasant
Hill, CA, and Los Medanos College (LMC) in Pittsburg, CA, is seeking a
three-year, part-time Library Systems Migration Expert to begin March 15,
2015. The primary responsibility of this position is to lead the three 4CD
colleges in their migration from their current integrated library system,
Innovative Interfaces, Inc.'s Millennium, to a new system. This position
serves as a member of the 4CD library team and will work closely with the
library faculty and staff at all three colleges on planning, scheduling and
managing all aspects of the migration. For a job description, qualifications
and further information, please go to the link provided in this posting.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/19069/
To post a new job please visit http://jobs.code4lib.org/


Re: [CODE4LIB] state of the art in virtual shelf browse?

2015-01-28 Thread Jenn Riley
Thanks, everyone, for the links to interesting implementations. It's
definitely given me some inspiration as we start to think about this
possibility.

I'll give my 2 cents (Canadian, that's $0.016 US today, sorry!) on a few
of Sean's questions below - the ones we've actually given any thought to
yet. We're at the 'hey, this is probably something we should look into'
phase right now, so naturally we haven't covered all of these things.



On 2015-01-28 9:29 AM, "Sean Hannan"  wrote:

>Where is the feature demand originating? Staff? Faculty? Students? Grad
>students? Undergrad students? (Not to exclude publics or special
>libraries, but this seems to be an academic catalog feature, when it shows
>up.)

This has come up for us as we start (as other academic libraries are) to
think about reimagined libraries of the future. One possibility under
consideration is a smaller on-site collection, with the rest in remote or
on-site but non-browsable storage, and what that would mean for faculty
(especially) who find value in browsing shelves. We're not committed to
any of this yet - right now it's just a thought experiment of what it
would mean.

>What is the level of familiarity with library/library services/library
>systems for those that request this feature?

I try very hard to encourage my tech folks not to worry about that. If
it's a documented user need or helps us strategically in some other way
then we need to have it on the table of things we work on. I can
communicate how hard it will be as part of the priorities discussion, and
if it's a priority, it's a priority, and we move forward. I encourage and
reward 'cool, let's work on that and solve an interesting problem' over
'is this really necessary, do those folks asking us know what we're
talking about?' My apologies if I've read too much into your question,
though!

>Is implementing shelf browse an attempt to work around some other catalog
>deficiency (e.g. weak subject cataloging)?
>
>Does the corpus have the cataloging data to support such a feature? (A lot
>of ebook packages do not have call numbers, for example.) What's the
>percentage? Is that reasonable?

I'm actually wondering if there are better ways to do thematic browsing
than call number, but I know most (all?) do implement this as a literal
shelf/call # browse. But there are probably other possibilities that could
meet the serendipity need that could be worth exploring.

>How do you plan on tracking use of the feature? What would you consider to
>be a success rate? 20% of sessions? 5%? 1%?
>
>At what point do you sunset the feature? Expand upon it?

I struggle with questions like this because I think they're unfair -
frankly, our organizations don't typically ask questions like this about
e.g. an advanced search or title browse or journal a-z list, so asking it
for THIS feature puts a standard up that we don't use for other things.
Now, I'm all about assessment and collecting lots of data and ongoing
review, but we work on that for *everything*. Sure, we'll put some thought
into this but it's very unlikely something we decide is a priority is
going to get a sunset clause put into it at the beginning when we have all
sorts of legacy stuff that's limping along but of less utility. We always
look at our offerings to decide what stays and what goes, and we'd do that
for this too. But it's not in the culture of our organization to set
strict metrics like this before implementation, and frankly I think we
shouldn't do that. I want the flexibility going forward to shift
priorities as the landscape changes. For better or for worse, library
services are more than math problems. :-)

Jenn

---
Jenn Riley
Associate Dean, Digital Initiatives | Vice Doyenne, Initiatives numériques

McGill University Library | Bibliothèque Université McGill
3459 McTavish Street | 3459, rue McTavish
Montreal, QC, Canada H3A 0C9 | Montréal (QC) Canada  H3A 0C9

(514) 398-3642
jenn.ri...@mcgill.ca


[CODE4LIB] Recording of DPLA's Aggregation Webinar

2015-01-28 Thread Gretchen Gueguen
Colleagues,

You can now find the video for DPLA's recent Metadata Aggregation webinar,
held on January 22, 2015, on the DPLA blog at:
http://dp.la/info/2015/01/28/metadata-aggregation-webinar-video-and-extended-qa


The webinar featured two DPLA Service Hubs and DPLA's Metadata Coordinator
talking about the ins and outs of aggregation in the DPLA context.

*Speakers*

   - Lisa Gregory and Stephanie Williams of the North Carolina Digital
     Heritage Center
   - Heather Gilbert and Tyler Mobley of the South Carolina Digital Library
   - Gretchen Gueguen of DPLA


Links to download each presenter’s slides are included in this post as
well, along with answers to a few questions that we didn't get to during
the webinar.

Best,

-- 
Gretchen Gueguen
Data Services Coordinator
Digital Public Library of America
http://dp.la


[CODE4LIB] HydraCamp and Blacklight Workshop - Yale University Library - March 9th-13th, 2015

2015-01-28 Thread Mark Bussey
Apologies for cross posting.

We've just finalized details for the Spring 2015 Hydra Camp with our generous 
hosts, Yale University Library, and registration is now open. 

Hydra Camp
Yale University Library
March 9th-12th, 2015
New Haven, CT
More Info + Registration

The registration fee includes an optional Advanced Blacklight Workshop on 
Friday, March 13th. (Standalone registration for the Blacklight workshop 
without HydraCamp is $100.)

Advanced Blacklight Workshop
Yale University Library
March 13th, 2015
New Haven, CT
More Info + Registration

Topics for the Blacklight workshop will include:
• Customized item-level views (by content type)
• Overriding default behaviors and helpers
• Supporting thumbnails
• Running multiple catalog controller instances
• Dynamically adding and removing catalog filters
• Search field customization
• Facet customization
• Search across heterogeneous objects
• Maximizing upgrade compatibility when customizing
• Participant Q&A

If you plan to attend both events, please register for HydraCamp first to 
receive your link for complimentary registration for the Blacklight workshop. 
If you will only be attending the Blacklight workshop, please use the link 
above for stand-alone registration.

For more information about Hydra Camp in general along with a high level 
syllabus, visit our Hydra Camp information page.  

Please feel free to e-mail me at m...@curationexperts.com with any questions 
about Hydra Camp or the Blacklight Workshop.

Cheers
- Mark


Mark Bussey
Data Curation Experts
m...@curationexperts.com
612.524.8484


Re: [CODE4LIB] Checksums for objects and not embedded metadata

2015-01-28 Thread Ronald Houk
Also just stumbled across this on stackoverflow.

http://stackoverflow.com/questions/12115824/compute-the-hash-of-only-the-core-image-data-of-a-tiff
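
To give a feel for the general idea of hashing only the image-bearing part of a file, here is a minimal sketch. Note that it swaps in PNG rather than TIFF, because PNG's chunk layout can be walked with the standard library alone. The function name `png_image_digest` and the choice of which chunks count as "image data" (IHDR, PLTE, IDAT) are assumptions for illustration. It hashes the compressed bytes as stored, so it only stays stable when a tool rewrites metadata without re-encoding the pixels, and it ignores other chunks (such as gamma or transparency) that can affect rendering.

```python
import hashlib
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunks treated as image data in this sketch; text and timestamp chunks are skipped.
IMAGE_CHUNKS = {b"IHDR", b"PLTE", b"IDAT"}


def png_image_digest(data: bytes) -> str:
    """SHA-256 over the image-bearing chunks of a PNG, ignoring metadata chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    digest = hashlib.sha256()
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype in IMAGE_CHUNKS:
            digest.update(ctype)
            digest.update(body)
        pos += 8 + length + 4
        if ctype == b"IEND":
            break
    return digest.hexdigest()
```

Two copies of the same PNG, one with an added text chunk (say, an embedded identifier) and one without, should give the same digest here even though their whole-file checksums differ.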

On Wed, Jan 28, 2015 at 10:32 AM, Ronald Houk <
rh...@ottumwapubliclibrary.org> wrote:

> Hello,
>
> I like Danielle's idea.  I wonder if it wouldn't be a good idea to
> decouple the metadata from the data permanently.  Exiftool allows you to
> export the metadata in lots of different formats like JSON.  You could
> export the metadata into JSON, run the checksums and then store the photo
> and the JSON file in a single tar-ball. From there you could use a JSON
> editor to modify/add metadata.
>
>  It would be simple to reintroduce the metadata into the file when needed.
>
> On Mon, Jan 26, 2015 at 10:27 AM, danielle plumer 
> wrote:
>
>> Kyle,
>>
>> It's a bit of a hack, but you could write a script to delete all the
>> metadata from images with ExifTool and then run checksums on the resulting
>> image (see
>> http://u88.n24.queensu.ca/exiftool/forum/index.php?topic=4902.0).
>> exiv2 might also work. I don't think you'd want to do that every time you
>> audited the files, though; generating new checksums is a faster approach.
>>
>> I haven't tried this, but I know that there's a program called ssdeep
>> developed for the digital forensics community that can do piecewise
>> hashing
>> -- it hashes chunks of content and then compares the hashes for the
>> different chunks to find matches, in theory. It might be able to match
>> files with embedded metadata vs. files without; the use cases described on
>> the forensics wiki is finding altered (truncated) files, or reuse of
>> source
>> code.  http://www.forensicswiki.org/wiki/Ssdeep
>>
>> Danielle Cunniff Plumer
>>
>> On Sun, Jan 25, 2015 at 9:44 AM, Kyle Banerjee 
>> wrote:
>>
>> > On Sat, Jan 24, 2015 at 11:07 AM, Rosalyn Metz 
>> > wrote:
>> >
>> > >
>> > >- How is your content packaged?
>> > >- Are you talking about the SIPs or the AIPs or both?
>> > >- Is your content in an instance of Fedora, a unix file structure, or
>> > >something else?
>> > >- Are you generating checksums on the whole package, parts of it, both?
>> > >
>> >
>> > The quick answer to this is that this is a low tech operation. We're
>> > currently on regular filesystems where we are limited to feeding md5
>> > checksums into a list. I'm looking for a low tech way that makes it easier
>> > to keep track of resources across a variety of platforms in a decentralized
>> > environment and which will easily adapt to future technology transitions.
>> > For example, we have a bunch of stuff in Bepress and Omeka. Neither of
>> > those is good for preservation, so authoritative files live elsewhere as do
>> > a huge number of resources that aren't in these platforms. Filenames are
>> > terrible identifiers and things get moved around even if people don't mess
>> > with the files.
>> >
>> > We also are trying to come up with something that deals with different
>> > kinds of datasets (we're focusing on bioimaging at the moment) and fits in
>> > the workflow of campus units, each of which needs to manage tens of
>> > thousands of files with very little metadata on regular filesystems. Some
>> > of the resources are enormous in terms of size or number of members.
>> >
>> > Simply embedding an identifier in the file is a really easy way to tell
>> > which files have metadata and which metadata is there. In the case at hand,
>> > I could just do that and generate new checksums. But I think the generic
>> > problem of making better use of embedded metadata is an interesting one as
>> > it can make objects more usable and understandable once they're removed.
>> > For example, just this past Friday I received a request to use an image
>> > someone downloaded for a book. Unfortunately, he just emailed me a copy of
>> > the image, described what he wanted to do, and asked for permission but he
>> > couldn't replicate how he found it. An identifier would have been handy as
>> > would have been embedded rights info as this is not the same for all of our
>> > images. The reason we're using DOIs is that they work well for anything
>> > and can easily be recognized by syntax wherever they may appear.
>> >
>> > On Sat, Jan 24, 2015 at 7:06 PM, Joe Hourcle <
>> > onei...@grace.nascom.nasa.gov>
>> >  wrote:
>> >
>> > >
>> > > The problems with 'metadata' in a lot of file formats is that they're
>> > > just arbitrary segments -- you'd have to have a program that knew
>> > > which segments were considered 'headers' vs. not.  It might be easier
>> > > to have it be able to compute a separate checksum for each segment,
>> > > so that should the modifications change their order, they'd still
>> > > be considered valid.
>> > >
>> >
>> > This is what I seemed to be bumping up against so I was hoping there was an
>> > easy workaround. But this is helpful information. Thanks,
>> >
>> > kyle
>> >
>>
>
>
>

Re: [CODE4LIB] Checksums for objects and not embedded metadata

2015-01-28 Thread Ronald Houk
Hello,

I like Danielle's idea.  I wonder if it wouldn't be a good idea to decouple
the metadata from the data permanently.  Exiftool allows you to export the
metadata in lots of different formats like JSON.  You could export the
metadata into JSON, run the checksums and then store the photo and the JSON
file in a single tar-ball. From there you could use a JSON editor to
modify/add metadata.

 It would be simple to reintroduce the metadata into the file when needed.
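
Here is a rough sketch of the packaging step, in case it helps. It assumes the metadata has already been exported to a dictionary (for instance by parsing the output of `exiftool -json photo.tif`) and that `image_bytes` is the copy you want to fix a checksum for, ideally with the embedded metadata already stripped. The function names `bundle` and `sha256_hex`, the member names `metadata.json` and `manifest.json`, and the use of SHA-256 are all my own choices for illustration, not anything ExifTool prescribes.

```python
import hashlib
import io
import json
import tarfile


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def bundle(image_name: str, image_bytes: bytes, metadata: dict) -> bytes:
    """Return a tar archive holding the image, its metadata as JSON, and a manifest."""
    meta_bytes = json.dumps(metadata, indent=2, sort_keys=True).encode("utf-8")
    manifest = {
        image_name: sha256_hex(image_bytes),      # stays fixed while metadata is edited
        "metadata.json": sha256_hex(meta_bytes),  # changes whenever the metadata does
    }
    members = {
        image_name: image_bytes,
        "metadata.json": meta_bytes,
        "manifest.json": json.dumps(manifest, indent=2, sort_keys=True).encode("utf-8"),
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()
```

The point of the separate manifest entries is that editing the JSON changes only the `metadata.json` checksum, so an audit can still confirm the image itself is untouched.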

On Mon, Jan 26, 2015 at 10:27 AM, danielle plumer 
wrote:

> Kyle,
>
> It's a bit of a hack, but you could write a script to delete all the
> metadata from images with ExifTool and then run checksums on the resulting
> image (see http://u88.n24.queensu.ca/exiftool/forum/index.php?topic=4902.0
> ).
> exiv2 might also work. I don't think you'd want to do that every time you
> audited the files, though; generating new checksums is a faster approach.
>
> I haven't tried this, but I know that there's a program called ssdeep
> developed for the digital forensics community that can do piecewise hashing
> -- it hashes chunks of content and then compares the hashes for the
> different chunks to find matches, in theory. It might be able to match
> files with embedded metadata vs. files without; the use cases described on
> the forensics wiki are finding altered (truncated) files and reuse of source
> code.  http://www.forensicswiki.org/wiki/Ssdeep
>
> Danielle Cunniff Plumer
>
> On Sun, Jan 25, 2015 at 9:44 AM, Kyle Banerjee 
> wrote:
>
> > On Sat, Jan 24, 2015 at 11:07 AM, Rosalyn Metz 
> > wrote:
> >
> > >
> > >- How is your content packaged?
> > >- Are you talking about the SIPs or the AIPs or both?
> > >- Is your content in an instance of Fedora, a unix file structure, or
> > >something else?
> > >- Are you generating checksums on the whole package, parts of it, both?
> > >
> >
> > The quick answer to this is that this is a low tech operation. We're
> > currently on regular filesystems where we are limited to feeding md5
> > checksums into a list. I'm looking for a low tech way that makes it easier
> > to keep track of resources across a variety of platforms in a decentralized
> > environment and which will easily adapt to future technology transitions.
> > For example, we have a bunch of stuff in Bepress and Omeka. Neither of
> > those is good for preservation, so authoritative files live elsewhere as do
> > a huge number of resources that aren't in these platforms. Filenames are
> > terrible identifiers and things get moved around even if people don't mess
> > with the files.
> >
> > We also are trying to come up with something that deals with different
> > kinds of datasets (we're focusing on bioimaging at the moment) and fits in
> > the workflow of campus units, each of which needs to manage tens of
> > thousands of files with very little metadata on regular filesystems. Some
> > of the resources are enormous in terms of size or number of members.
> >
> > Simply embedding an identifier in the file is a really easy way to tell
> > which files have metadata and which metadata is there. In the case at hand,
> > I could just do that and generate new checksums. But I think the generic
> > problem of making better use of embedded metadata is an interesting one as
> > it can make objects more usable and understandable once they're removed.
> > For example, just this past Friday I received a request to use an image
> > someone downloaded for a book. Unfortunately, he just emailed me a copy of
> > the image, described what he wanted to do, and asked for permission but he
> > couldn't replicate how he found it. An identifier would have been handy as
> > would have been embedded rights info as this is not the same for all of our
> > images. The reason we're using DOIs is that they work well for anything
> > and can easily be recognized by syntax wherever they may appear.
> >
> > On Sat, Jan 24, 2015 at 7:06 PM, Joe Hourcle <
> > onei...@grace.nascom.nasa.gov>
> >  wrote:
> >
> > >
> > > The problem with 'metadata' in a lot of file formats is that they're
> > > just arbitrary segments -- you'd have to have a program that knew
> > > which segments were considered 'headers' vs. not.  It might be easier
> > > to have it be able to compute a separate checksum for each segment,
> > > so that should the modifications change their order, they'd still
> > > be considered valid.
> > >
> >
> > This is what I seemed to be bumping up against so I was hoping there was an
> > easy workaround. But this is helpful information. Thanks,
> >
> > kyle
> >
>



-- 
Ronald Houk
Assistant Director
Ottumwa Public Library
102 W. Fourth Street
Ottumwa, IA 52501
(641)682-7563x203
rh...@ottumwapubliclibrary.org


Re: [CODE4LIB] state of the art in virtual shelf browse?

2015-01-28 Thread Joshua Welker
+1 to Sean's questions. I've considered implementing a shelf browse system
myself, but I am wary. It's a huge amount of work, and I have no idea who
it will benefit or how much. It's one of those things that certainly seems
cool to me, but unfortunately I am not the target audience of our website
(but it would be much easier if I were). Any usage stats would be greatly
appreciated.

Josh Welker


-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Sean Hannan
Sent: Wednesday, January 28, 2015 8:29 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?

For those investigating a shelf browse (and for those that have
implemented one), I have a few questions:

Where is the feature demand originating? Staff? Faculty? Students? Grad
students? Undergrad students? (Not to exclude publics or special
libraries, but this seems to be an academic catalog feature, when it shows
up.)

What is the level of familiarity with library/library services/library
systems for those that request this feature?

Is implementing shelf browse an attempt to work around some other catalog
deficiency (e.g. weak subject cataloging)?

Does the corpus have the cataloging data to support such a feature? (A lot
of ebook packages do not have call numbers, for example.) What's the
percentage? Is that reasonable?

How do you plan on tracking use of the feature? What would you consider to
be a success rate? 20% of sessions? 5%? 1%?

At what point do you sunset the feature? Expand upon it?

How long will the feature take to implement? How many staff will be
involved? What is the ROI?

Will all of your users understand the visual implementation on the page?
How do you plan on testing it?

Does the shelf metaphor still hold for your users? How do you know?

-Sean

On 1/28/15, 8:30 AM, "Darylyne Provost"  wrote:

>We're interested in implementing a virtual browse feature as well, so I
>was glad to find this post.
>
>Since we have a shared catalog and the feature is currently under
>discussion by our partner institutions, we're also considering
>implementing it for our installation of Summon first. I've seen U of
>Huddersfield, but am wondering if there are additional examples?
>
>Thanks,
>
>Darylyne
>
>**
>Darylyne Provost
>Assistant Director for Systems, Web, & Emerging Technologies
>Colby College
>207.859.5117
>dprov...@colby.edu
>
>On Tue, Jan 27, 2015 at 3:48 PM, Gerritsma, Wouter wrote:
>
>> Beautiful to see that the meticulously recorded book height is put
>> into use.
>>
>> -----Original Message-----
>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
>> Of Harper, Cynthia
>> Sent: Tuesday, 27 January 2015 21:27
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>>
>> What testimony to what a difference presentation can make!  So much
>> better than basically the same functionality, but in a text list, as
>> shown in our old III Webpac.
>>
>> -----Original Message-----
>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
>> Of Cole Hudson
>> Sent: Tuesday, January 27, 2015 9:57 AM
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>>
>> Hi Jenn,
>>
>> Just to add one example more to the mix, we've built a shelf browser
>> based on Harvard's Stackview/Stacklife project--adding to it a Z39.50
>> connector and organizing results by call number. This search works
>> across all of our holdings, regardless of the books' locations. (Click
>> the link, then under the Books and Media box, click See on Shelf to
>> look at our shelf browser.)
>>
>> http://library.wayne.edu/quicksearch/#q=the%20hobbit
>>
>> Also, our code is on Github: https://github.com/WSULib/SVCatConnector
>>
>> Cole
>>


[CODE4LIB] Announcement: ruby-marc 0.8.2 re-released as version 1.0.0

2015-01-28 Thread Bill Dueber
The ruby-marc team is happy to
announce that we’ve decided to release the current code as version 1.0.0.

This release contains only cosmetic changes relative to version 0.8.2,
which was the current version until now.

The jump to version 1.0.0 reflects the *de facto* use of the marc gem in
production at dozens of institutions and allows further development to more
easily adhere to semantic versioning.

In that vein, please begin the process of updating your gem directives in
Gemfiles and .gemspec files to something like

gem 'marc', '~>1'

…to be sure you have the latest backwards-compatible version for your
projects.

Thanks to everyone involved, from committers to folks who file bugs, for
the progress ruby-marc has made over the years. Special thanks for the most
recent releases go to Jonathan Rochkind, whose work on encodings (including
MARC-8!!) has been relentless.

-Bill Dueber, for the ruby-marc contributors-
-- 
Bill Dueber
Library Systems Programmer
University of Michigan Library


[CODE4LIB] CALL: Islandora Conference CFP Open until April 30

2015-01-28 Thread Islandora Community
The world's first Islandora Conference is taking place this summer, August
3 - 7, in Charlottetown, PEI. We are now welcoming proposals for conference
presentations.

The theme of the conference is Community - the Islandora community, the
community of people our institutions serve, the community of researchers
and librarians and developers who work together to curate digital assets,
and the community of open source projects that work together and in
parallel.

We are asking for your presentations, panels, and posters around this theme
and welcome your interpretations of what "community" means to you. Please
give us your contact details, a working title, and a brief synopsis of what
you would like to present. We will allow for updates to reflect changes as
your work develops, so just give us your best pitch with what you have now.

Please submit your ideas here:
http://islandora.ca/content/islandora-conference-call-proposals

Submissions are open until April 30th and presenters will be notified
shortly thereafter.

Thank you,

The Islandora Team
commun...@islandora.ca
http://islandora.ca


Re: [CODE4LIB] state of the art in virtual shelf browse?

2015-01-28 Thread Sean Hannan
For those investigating a shelf browse (and for those that have
implemented one), I have a few questions:

Where is the feature demand originating? Staff? Faculty? Students? Grad
students? Undergrad students? (Not to exclude publics or special
libraries, but this seems to be an academic catalog feature, when it shows
up.)

What is the level of familiarity with library/library services/library
systems for those that request this feature?

Is implementing shelf browse an attempt to work around some other catalog
deficiency (e.g. weak subject cataloging)?

Does the corpus have the cataloging data to support such a feature? (A lot
of ebook packages do not have call numbers, for example.) What's the
percentage? Is that reasonable?

How do you plan on tracking use of the feature? What would you consider to
be a success rate? 20% of sessions? 5%? 1%?

At what point do you sunset the feature? Expand upon it?

How long will the feature take to implement? How many staff will be
involved? What is the ROI?

Will all of your users understand the visual implementation on the page?
How do you plan on testing it?

Does the shelf metaphor still hold for your users? How do you know?

-Sean

On 1/28/15, 8:30 AM, "Darylyne Provost"  wrote:

>We're interested in implementing a virtual browse feature as well, so I was
>glad to find this post.
>
>Since we have a shared catalog and the feature is currently under
>discussion by our partner institutions, we're also considering implementing
>it for our installation of Summon first. I've seen U of Huddersfield, but
>am wondering if there are additional examples?
>
>Thanks,
>
>Darylyne
>
>**
>Darylyne Provost
>Assistant Director for Systems, Web, & Emerging Technologies
>Colby College
>207.859.5117
>dprov...@colby.edu
>
>On Tue, Jan 27, 2015 at 3:48 PM, Gerritsma, Wouter wrote:
>
>> Beautiful to see that the meticulously recorded book height is put into
>> use.
>>
>> -----Original Message-----
>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
>> Harper, Cynthia
>> Sent: Tuesday, 27 January 2015 21:27
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>>
>> What testimony to what a difference presentation can make!  So much better
>> than basically the same functionality, but in a text list, as shown in our
>> old III Webpac.
>>
>> -----Original Message-----
>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
>> Cole Hudson
>> Sent: Tuesday, January 27, 2015 9:57 AM
>> To: CODE4LIB@LISTSERV.ND.EDU
>> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>>
>> Hi Jenn,
>>
>> Just to add one example more to the mix, we've built a shelf browser based
>> on Harvard's Stackview/Stacklife project--adding to it a Z39.50 connector
>> and organizing results by call number. This search works across all of
>> our holdings, regardless of the books' locations. (Click the link, then under
>> the Books and Media box, click See on Shelf to look at our shelf browser.)
>>
>> http://library.wayne.edu/quicksearch/#q=the%20hobbit
>>
>> Also, our code is on Github: https://github.com/WSULib/SVCatConnector
>>
>> Cole
>>


Re: [CODE4LIB] state of the art in virtual shelf browse?

2015-01-28 Thread Darylyne Provost
We're interested in implementing a virtual browse feature as well, so I was
glad to find this post.

Since we have a shared catalog and the feature is currently under
discussion by our partner institutions, we're also considering implementing
it for our installation of Summon first. I've seen U of Huddersfield, but
am wondering if there are additional examples?

Thanks,

Darylyne

**
Darylyne Provost
Assistant Director for Systems, Web, & Emerging Technologies
Colby College
207.859.5117
dprov...@colby.edu

On Tue, Jan 27, 2015 at 3:48 PM, Gerritsma, Wouter 
wrote:

> Beautiful to see that the meticulously recorded book height is put into
> use.
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Harper, Cynthia
> Sent: Tuesday, 27 January 2015 21:27
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>
> What testimony to what a difference presentation can make!  So much better
> than basically the same functionality, but in a text list, as shown in our
> old III Webpac.
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Cole Hudson
> Sent: Tuesday, January 27, 2015 9:57 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] state of the art in virtual shelf browse?
>
> Hi Jenn,
>
> Just to add one example more to the mix, we've built a shelf browser based
> on Harvard's Stackview/Stacklife project--adding to it a Z39.50 connector
> and organizing results by call number. This search works across all of
> our holdings, regardless of the books' locations. (Click the link, then under
> the Books and Media box, click See on Shelf to look at our shelf browser.)
>
> http://library.wayne.edu/quicksearch/#q=the%20hobbit
>
> Also, our code is on Github: https://github.com/WSULib/SVCatConnector
>
> Cole
>