Re: [CODE4LIB] searching metadata vs searching content

2016-01-26 Thread Shaun D. Ellis
Hi Laura,
Great question.  Unfortunately, I think you’re going to be fairly limited when 
it comes to having granular control over fields and facet indexing in ContentDM 
(someone correct me if I’m wrong).

But to answer your question about general steps involved with indexing the 
metadata AND full text of a METS document…

To have the most control over how your data is indexed, you will want to use a 
search platform.  Apache Solr is used in a 
majority of library-related software, so I’ll use that in my examples, although 
there are several others.  Solr doesn’t have a concept of “metadata” and 
“content”, just “fields” that you can use to search both.

In the case of your METS data, you will need to first transform it into a more 
simplified document (Solr XML) containing the fields that matter for a 
particular search interface and are defined in the 
schema.  This transform step can be 
done in any number of ways, but XSLT is fairly common.  To index the full-text 
content that your METS document points to, you can build that into your 
transform script/stylesheet, or you can run a separate script/process later 
that updates the record with the full-text.  In the case of a “compound object” 
you may need to have a script iterate over lots of separate content files and 
add them to the Solr document that represents a yearbook.
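
As a rough illustration of that transform step, here is a minimal sketch in 
Python rather than XSLT.  It assumes a METS record with Dublin Core embedded in 
a dmdSec and OCR files listed in the fileSec.  The Solr field names (id, title, 
author, full_text), the sample record, and the read_text callback are all 
hypothetical, and full_text is assumed to be a multi-valued field in your 
schema so that each page can be added separately.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Made-up METS record for a two-page yearbook (illustration only).
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xlink="http://www.w3.org/1999/xlink" OBJID="yearbook-1950">
  <mets:dmdSec ID="dmd1"><mets:mdWrap MDTYPE="DC"><mets:xmlData>
    <dc:title>Yearbook 1950</dc:title>
    <dc:creator>Senior Class</dc:creator>
  </mets:xmlData></mets:mdWrap></mets:dmdSec>
  <mets:fileSec><mets:fileGrp USE="ocr">
    <mets:file ID="f1"><mets:FLocat LOCTYPE="URL" xlink:href="page1.txt"/></mets:file>
    <mets:file ID="f2"><mets:FLocat LOCTYPE="URL" xlink:href="page2.txt"/></mets:file>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""


def mets_to_solr_xml(mets_xml, read_text):
    """Flatten one METS record into a Solr <add><doc> update message.

    read_text is a callable that takes a file pointer (the xlink:href value)
    and returns the OCR text for that page.
    """
    root = ET.fromstring(mets_xml)
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    def field(name, value):
        el = ET.SubElement(doc, "field", name=name)
        el.text = value

    # Hypothetical field names; they must match whatever your schema defines.
    field("id", root.get("OBJID"))
    for title in root.iterfind(".//dc:title", NS):
        field("title", title.text)
    for creator in root.iterfind(".//dc:creator", NS):
        field("author", creator.text)
    # One full_text value per page file the METS record points to.
    for flocat in root.iterfind(".//mets:FLocat", NS):
        field("full_text", read_text(flocat.get(XLINK_HREF)))
    return ET.tostring(add, encoding="unicode")


# Stand-in for reading OCR files from disk.
pages = {"page1.txt": "Glee club photo", "page2.txt": "Doyle Owl sighting"}
print(mets_to_solr_xml(SAMPLE_METS, pages.get))
```

Keeping the metadata and every page's text in a single Solr document is one 
way to let a search on author AND full text match the same record.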

There are a few ways to add data to a Solr index, but a common one in 
library-land is to POST your freshly “transformed” data to Solr over HTTP, 
which both adds new records and updates existing ones (here’s the Solr 
quickstart tutorial).
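
A minimal sketch of that POST using only Python's standard library.  The base 
URL and the core name "yearbooks" are assumptions for illustration; the 
/update path and the commit=true parameter are Solr's standard update 
endpoint.  The request is only built here, not sent, so nothing touches a 
server until you call post_to_solr yourself.

```python
import urllib.request


def build_update_request(solr_base_url, core, solr_xml):
    """Build (but do not send) an HTTP POST carrying a Solr XML update."""
    url = f"{solr_base_url.rstrip('/')}/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=solr_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def post_to_solr(request):
    """Send the request; needs a running Solr instance at the request URL."""
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical local Solr and core name.
request = build_update_request(
    "http://localhost:8983/solr",
    "yearbooks",
    '<add><doc><field name="id">yearbook-1950</field></doc></add>',
)
print(request.get_method(), request.full_url)
```

Posting a document whose id already exists in the index replaces the old 
version, which is why the same call covers both adds and updates.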

Customizing your search results (weighting, rows per page, etc.) can be 
handled in the Solr config file, while stemming is set up per field type in the 
schema.  For example, you can tweak the weight/relevance of the query based on 
which fields it matches.
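
To make the field weighting concrete, here is a sketch that passes the boosts 
as request parameters instead of as defaults in the config file.  It assumes 
the eDisMax query parser and the hypothetical fields title, author, and 
full_text; the boost values are arbitrary and would need tuning against your 
own data.

```python
from urllib.parse import urlencode


def build_select_url(solr_base_url, core, user_query, rows=10):
    """Build a Solr /select URL that weights matches by field."""
    params = {
        "q": user_query,
        "defType": "edismax",  # query parser that understands qf boosts
        # A title match counts most, then author, then full text.
        "qf": "title^5 author^3 full_text",
        "rows": rows,  # results per page
        "wt": "json",  # ask for a JSON response
    }
    return f"{solr_base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Hypothetical local Solr and core name.
url = build_select_url("http://localhost:8983/solr", "yearbooks", "doyle owl")
print(url)
```

The same qf and rows values could instead be set as defaults on a request 
handler in the config file, so that the discovery interface only has to send q.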

When you query Solr over HTTP, it will return results in XML or JSON that you 
can then render in a display or discovery interface. 
Blacklight is one example of a discovery 
interface.
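
For a sense of what a display layer receives, here is a small sketch that 
reads a Solr JSON response.  The sample response is made up, and the id and 
title fields are hypothetical; the responseHeader / response / numFound / docs 
structure is Solr's standard JSON response shape.

```python
import json

# Made-up response of the kind Solr returns for a select query with wt=json.
SAMPLE_RESPONSE = """{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {"numFound": 2, "start": 0, "docs": [
    {"id": "yearbook-1950", "title": ["Yearbook 1950"]},
    {"id": "pamphlet-12", "title": "Campus Guide"}
  ]}
}"""


def summarize_results(json_text):
    """Return the total hit count and a list of (id, title) pairs."""
    response = json.loads(json_text)["response"]
    hits = []
    for doc in response["docs"]:
        title = doc.get("title", "(untitled)")
        # Multi-valued fields come back as lists; join them for display.
        if isinstance(title, list):
            title = "; ".join(title)
        hits.append((doc["id"], title))
    return response["numFound"], hits


total, hits = summarize_results(SAMPLE_RESPONSE)
print(f"{total} results")
for doc_id, title in hits:
    print(f"{doc_id}: {title}")
```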

Sorry if I’ve covered stuff you already know.  There are lots of tools, 
applications, and frameworks that will simplify the process (perhaps too much 
in some cases!), but the best give you the most control over how you index and 
retrieve your data.  I think that covers the basics and hopefully answers your 
question.

Cheers,
Shaun
P.S. -  I’m not sure that even Solr will help you locate the Doyle Owl. ;)

On Jan 26, 2016, at 7:30 PM, Laura Buchholz wrote:

Hi all,

I'm trying to understand how digital library systems work when there is a
need to search both metadata and item text content (plain text/full text),
and when the item is made up of more than one file (so, think a digitized
multi-page yearbook or newspaper). I'm not looking for answers to a
specific problem, really, just looking to know what is the current state of
community practice.

In our current system (ContentDM), the "full text" of something lives in
the metadata record, so it is indexed and searched along with the metadata,
and essentially treated as if it were metadata. (Correct?). This causes
problems in advanced searching and muddies the relationship between what is
typically a descriptive metadata record and the file that is associated
with the record. It doesn't seem like a great model for the average digital
library. True? I know the answer is "it depends", but humor me... :)

If it isn't great, and there are better models, what are they? I was taught
METS in school, and based on that, I'd approach the metadata in a METS or
METS-like fashion. But I'm unclear on the steps from having a bunch of METS
records that include descriptive metadata and pointers to text files of the
OCR (we don't, but if we did...) to indexing and providing results to
users. I think another way of phrasing this question might be: how is the
full text of a compound object (in the sense of a digitized yearbook or
similar) typically indexed?

The user requirements for this situation are essentially:
1. User can search for something and get a list of results. If something
(let's say a pamphlet) appears in results based on a hit in full text, the
user selects the pamphlet which opens to the file (or page of the pamphlet)
that contains the text that was matched. This is pretty normal and does
work in our current system.
2. In an advanced search, a user might search for a name in the "author"
field and a phrase in the "full text" field, and say they want both
conditions to be fulfilled. In our current system, this won't provide
results when it should, because the full text content is in one record and
the author's name is in another record, so the AND condition can't be met.
3. Librarians can link description metadata records (DC in our case) to
particular files, 

[CODE4LIB] ITHAKA Tech Open House in Ann Arbor Feb 4

2016-01-26 Thread Alex Humphreys
[With apologies for cross-posting.]

If anyone’s in the Ann Arbor area, ITHAKA is holding a Tech Open House on Feb 
4.  Come chat with us, see our pretty-awesome office, chat with JSTOR Labs and 
the folks who just rebuilt the entire JSTOR platform from the ground up, and, 
if you’re interested, learn what it’s like to work here.

http://ithaka.org/news/door-says-open-at-ithaka?cid=soc_annarborprJan2016

See you there!

Alex
—
Alex Humphreys
Associate Vice President, JSTOR and Director, JSTOR Labs
twitter: @abhumphreys
email: alex.humphr...@ithaka.org
web: http://labs.jstor.org


[CODE4LIB] EMPLOYMENT OPPORTUNITY: Department Head, Office of Digital Innovation and Stewardship (ODIS)

2016-01-26 Thread Han, Yan - (yhan)

Please share the posting with interested parties.  Tucson has mild winters and 
dry, warm summers. The person will be working with engaged and nice colleagues.



EMPLOYMENT OPPORTUNITY

Department Head, Office of Digital Innovation and Stewardship
The University of Arizona Libraries, Digital Innovation/Stewardship (Dept. 1705)
Classification: Administrator/Appointed Professional; Full-Time; Exempt
Location: Main Campus, Tucson

Position Summary:
The University Libraries seek a dynamic, innovative Head of the Office of 
Digital Innovation and Stewardship (ODIS), a position with the primary 
responsibility of providing leadership and strategic direction for digital 
innovation and stewardship within the broader context of the strategic plans of 
the University Libraries and the University of Arizona. ODIS provides a broad 
range of services including digital collections, data management, campus 
repository, metadata, journal hosting and publishing, copyright and scholarly 
communication, open access, and geospatial data. In overseeing several areas of 
strategic importance, the Department Head must be forward thinking and willing 
to take strategic risks in the development of services. The Department Head 
will be a member of the Libraries Cabinet (leadership, policy and management 
team) and reports to the Vice Dean of Libraries.

The Department Head of ODIS will be responsible for leadership, management, and 
planning for the services and functions of the Office of Digital Innovation and 
Stewardship, which includes 8 FTE permanent professionals and a large team of 
students and temporary employees. ODIS members work collaboratively, engaging 
the strengths and knowledge of all members of the department. The Department 
Head will coordinate and facilitate leadership currently in place among ODIS 
faculty and staff. As UA librarians have faculty status, the Department Head is 
responsible for coaching and guiding librarians through the promotion and 
continuing status process. The Department Head will also be responsible for 
ensuring that department planning furthers the strategic goals for the 
Libraries and campus.

This is a continuing-eligible, academic professional position. Incumbents are 
members of the general faculty and are entitled to all accompanying rights and 
privileges granted by the Arizona Board of Regents and the University of 
Arizona. Retention and promotion are earned through achievement of a record of 
excellence in position effectiveness, scholarship, and service.

The Office of Digital Innovation and Stewardship (ODIS) at the University of 
Arizona Libraries engages and innovates across a range of services and content 
in support of the University’s mission and strategic plan. ODIS provides 
services to the University community that encompass data management, campus 
repository, metadata, journal hosting and publishing, copyright and scholarly 
communication, open access, and geospatial data. ODIS is responsible for 
programmatic planning and oversight of the Libraries’ digital collections and 
digitization activities, including digital preservation and digital asset 
management efforts. ODIS coordinates strategies for exposing unique and local 
digital collections. ODIS also leads and contributes to a variety of national 
and international collaborative efforts, including TRAIL (Technical Report 
Archive and Image Library) and the Afghanistan Digital Collections. ODIS is 
active in campus-wide efforts related to scholarly activity and research data, 
participates in the University’s Research Computing Governance Committee, leads 
the institution’s faculty activity reporting efforts, and collaborates with the 
University’s Office of Research and Discovery, and University Information 
Technology Services. In this process, ODIS collaborates with faculty and staff 
throughout the University Libraries and across campus.


The University of Arizona has been recognized on Forbes’ 2015 list of America’s 
Best Employers and has been awarded the 2015 Work-Life Seal of Distinction by 
the Alliance for Work-Life Progress! For more 
information about working at the University Libraries, see 
http://www.library.arizona.edu/about/employment/why.


Diversity Commitment: At the University of Arizona, we value our inclusive 
climate because we know that diversity in experiences and perspectives is vital 
to advancing innovation, critical thinking, solving complex problems, and 
creating an inclusive academic community. Diversity in our environment embraces 
the acceptance of a multiplicity of cultural heritages, lifestyles and 
worldviews. We translate these values into action by seeking individuals who 
have experience and expertise working with diverse students, colleagues and 
constituencies, as we believe that such experiences are both institutional and 
service imperatives. Because we seek a workforce with diverse perspectives and 
experiences, we encourage applications 

[CODE4LIB] The 2016 TDL Awards: Nominations due February 7th!

2016-01-26 Thread Vacek, Rachel E
The Texas Digital Library (TDL) Awards Committee invites nominations for the 
2016 TDL Awards.

TDL Awards acknowledge outstanding digital library work at academic libraries 
throughout Texas. The TDL Awards Committee encourages individuals to nominate 
themselves, as well as other individuals and groups, for recognition of the 
innovative and important work happening across the state in digital libraries.

Awards will be given in six categories:

 *   Innovative Outreach Award: Honors an individual or team at a Texas 
academic library that demonstrates a creative and successful approach to 
reaching new users and building awareness of an organization's digital library.
 *   Scholarly Communications Award: Honors the work of an individual or group 
in a Texas academic library who has made significant advances in our 
understanding of the issues surrounding scholarly communication and/or in 
developing innovative solutions to address current academic publishing.
 *   TDL Service Award: Honors a Texas Digital Library member (individual or 
group) who has made significant contributions to the TDL consortium and/or a 
member who has used TDL services to their fullest potential.
 *   Trailblazer Award: Honors the work of individuals or groups within Texas 
academic libraries who have used limited resources in innovative ways to 
create, maintain, or support digital collections.
 *   Leadership in Digital Libraries Award: Honors an individual at a Texas 
academic library who has made significant contributions and improvements 
related to digital libraries.
 *   Excellence in Digital Libraries Award: Honors an institution, group, or 
project at a Texas academic library that has demonstrated overall excellence in 
one or more areas of digital library practice.

Awardees will be honored at the 2016 Texas Conference on Digital Libraries 
in Austin, May 24-26, 2016. With the exception of the TDL Service Award, all 
awards are open to nominations from any academic institution in Texas 
regardless of affiliation with the TDL. Nominations are also open 
to groups in partnership with Texas academic libraries for all awards except 
the Leadership in Digital Libraries Award and the TDL Service Award. 
Self-nominations are accepted.

Important Dates:

 *   February 7, 2016: Deadline for submissions
 *   March 18, 2016: Notification of award recipients
 *   May 24-26: Dates for the Texas Conference on Digital Libraries, at which 
awards will be distributed. Attendance at the conference is not required to 
receive an award.

More information about criteria and processes for selection, and submitting a 
nomination, is available at the TDL Awards website: http://tdl.org/awards/

For questions about the awards, please email: i...@tdl.org

2016 TDL Awards Committee:

 *   Billie Peterson-Lugo (Chair), Baylor University
 *   Mark Phillips, University of North Texas
 *   Rachel Vacek, University of Houston
 *   Nerissa Lindsey, Texas A&M International University
 *   Lauren Goodley, Texas State University
 *   Laura Waugh, Texas Digital Library


[CODE4LIB] Job: TRLN Technology Initiatives Coordinator (At Will Appointment) at University of North Carolina at Chapel Hill

2016-01-26 Thread jobs
TRLN Technology Initiatives Coordinator (At Will Appointment)
University of North Carolina at Chapel Hill
Chapel Hill

The Triangle Research Libraries Network (TRLN) seeks a collaborative,
tech-savvy, and highly motivated individual to join our team as a Technology
Initiatives Coordinator. A newly-envisioned position, the TRLN Technology
Initiatives Coordinator will work closely with talented librarians,
technologists, and developers across four academic institutions to lead the
development of a variety of applications and systems that support the
discovery and delivery of library materials owned by the four member
institutions in support of research and instruction for faculty, staff and
students at all four universities. The position will apply expert skills to
help solve some of the complex issues facing academic libraries today and to
anticipate future opportunities.

  
Reporting to the TRLN Executive Director, the person in this position will
play a major role in the support of both established and emerging
technology-focused projects. The TRLN Technology Initiatives Coordinator will provide
leadership in the development, implementation and maintenance, as well as the
design and evaluation, of these projects that support cooperative collection
development and collaborative use of research library materials. A key
component of the position is working with TRLN staff and programmatic councils
to set priorities and to evaluate and communicate progress to campus
communities.

  
The TRLN staff includes an Executive Director, a Program Officer, and an
Administrative Assistant, all of whom work closely with colleagues across the
consortium. Our central staff provides support for the broad spectrum of
consortial initiatives and activities. Our organizational values include
collaboration, flexibility, creativity, and responsiveness to the evolving
role of academic libraries and consortia.

  
The focus of the work initially will be leading a major enhancement of our
shared library materials discovery index and interface, Search TRLN, which is
built on the Oracle Endeca platform. The shared discovery index and interface
will be transitioning from Endeca to a new platform (such as Blacklight or
VuFind), and the Technology Initiatives Coordinator will take primary
responsibility for advising on and implementing the new system in consultation
with librarians and administrators from the four campuses.

  
Separate from but related to the shared discovery index and interface, we also
are exploring tools and systems to support direct patron borrowing across our
institutions, and the Technology Initiatives Coordinator will be responsible
for implementing applications to support this service. These and other
projects will require collection and analysis of data for implementation as
well as monitoring and assessment.

  
The position requires close collaboration with technical staff at our four
member institutions, not only to ensure seamless integration with local
systems, but also to ensure shared responsibility for the process and the
products.

  
The person in this position will also be expected to contribute to other
technology-focused activities and initiatives as they arise, such as
management and sharing of electronic materials and development of components
of a new TRLN web site.

  
**QUALIFICATIONS**  
  
**Required:**  
● ALA-accredited master's degree or an advanced degree in a related field

● Relevant experience developing software applications

● Proficient in multiple programming languages (e.g., Ruby, PHP, Python, SQL,
Java)

● Experience working with XML, including XML transformations

● Working knowledge of various metadata standards such as MARC, MODS, and METS

● Experience using software development tools for version control

● Demonstrated experience with APIs and web services

● Excellent communication skills and ability to work in a collaborative team
environment

  
**Preferred:**  
● Work experience in a library, academic or research environment

● Experience participating in open source software projects

● Familiarity with integrated library systems

● Working knowledge of relational databases (e.g., Oracle, MySQL)

  
**Triangle Research Libraries Network**  
Triangle Research Libraries Network (TRLN) is a collaborative organization of
Duke University, North Carolina Central University, North Carolina State
University, and The University of North Carolina at Chapel Hill, the purpose
of which is to marshal the financial, human, and information resources of
their research libraries through cooperative efforts in order to create a rich
and unparalleled knowledge environment that furthers the teaching, research,
and service missions of the TRLN universities. The libraries of the four
institutions encompass collections of professional schools in business, law,
and health sciences, in addition to the major research resources in the
humanities, social science, engineering, sciences, and