[CODE4LIB] Job: Developer Needed for Omeka/Scripto + Wordpress Newman website archive at The National Institute for Newman Studies

2015-08-05 Thread jobs
Developer Needed for Omeka/Scripto + Wordpress Newman website archive
The National Institute for Newman Studies
Pittsburgh

Please follow the link below to see all the project details.

https://docs.google.com/document/d/1poSC-A7V_TPhlx2uXKSx2zUdXJl_i4Dtz3q0jiIBuVc/edit

Please contact me ASAP; we need to get this project started and finished.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/22120/
To post a new job please visit http://jobs.code4lib.org/


[CODE4LIB] Job: RESEARCH DATA MANAGEMENT LIBRARIAN at Indiana University Bloomington

2015-08-05 Thread jobs
RESEARCH DATA MANAGEMENT LIBRARIAN
Indiana University Bloomington
Bloomington

**RESEARCH DATA MANAGEMENT LIBRARIAN  
ASSISTANT LIBRARIAN OR ASSOCIATE LIBRARIAN

INDIANA UNIVERSITY BLOOMINGTON LIBRARIES**

  
Founded in 1820, Indiana University Bloomington has grown from a small state
seminary into the flagship campus of a great public university with over
42,000 students and almost 3,000 faculty. Innovation, creativity, and academic
freedom are hallmarks of IU Bloomington and its world-class contributions in
research and the arts. The campus covers over 1,800 wooded acres and is
distinctive for both its park-like beauty and an architectural heritage
inspired by local craftsmanship in limestone.

  
The Indiana University Bloomington Libraries (http://www.libraries.iub.edu)
are among the leading academic research library systems in North America,
having recently been named the top university library by the Association of
College and Research Libraries. The IUB Libraries provide strong collections,
quality service and instructional programs, and leadership in the application
of information technologies. The collections support every academic discipline
on campus and include more than 6.6 million books, journals, maps, films, and
audio/visual materials in over 900 languages. Users can access more than 400
databases, 43,000 electronic journals, and 224,000 electronic books, as well
as locally developed digital content. Of particular note are the 8-million
volume high-density Auxiliary Library Facility (ALF) for preservation and
access to the libraries' collections and archives, and the Lilly Library, the
rare books, manuscripts, and special collections library of the Indiana
University Libraries, Bloomington.

  
The IUB Libraries are active members of regional and national associations and
consortia, including the Committee on Institutional Cooperation (CIC), the
Association of Research Libraries (ARL), the Digital Library Federation (DLF),
and the Hydra community, and are a founding member of HathiTrust, a shared digital
repository. IU is the principal investigator for the Kuali Open Library
Environment (OLE) and is working with academic library partners to develop a
next generation open source library management system.
Indiana University is an organizational member of the Research Data Alliance,
working internationally to bridge research data use and sharing across domains
and disciplines.

  
The Indiana University Bloomington Libraries seek a Research Data Management
Librarian to be part of a collaborative team to plan and develop new services
and promote existing services for research data management -- consultation,
outreach and training, and repository services -- to meet the diverse needs of
all scholars across the Bloomington campus. Working across
units within the Libraries -- especially Library Technologies, Scholarly
Communications, Digital Collections Services, and the Office of Scholarly
Publishing -- and with subject librarians, the Research Data Management
Librarian will provide data management expertise for both the libraries and
individual researchers as part of the Scholars' Commons suite of digital
scholarship services. In addition to working with library units and scholars,
this position will foster collaborations and relationships that complement
the Libraries' capacity to support the University's interdisciplinary research
and technology initiatives, building upon a foundation of successful library-
campus collaborations to date including partnerships with Indiana University's
Office of Research Administration, University Information Technology Services,
Pervasive Technology Institute-Data to Insight Center, and Office of the Vice-
Provost for Research. These larger partnerships are
instrumental in ensuring cohesion and collaboration in data management
resources at the institutional level.

  
Reporting to the Associate Dean for Library Technologies, this librarian will
consult with faculty, graduate students, and other researchers on data
management planning and data curation activities; develop instructional
programming and documentation to support scholars in this area; and work with
colleagues in Library Technologies and University Information Technology
Services to adapt, design, and develop tools and repository services for
storing and sharing research data. The successful candidate will demonstrate a
clear vision of the services, infrastructure, and skills required to provide
high quality assistance and tools to IU researchers.

  
RESPONSIBILITIES

* Contribute to university- and campus-wide initiatives to develop and design 
policies, services, and infrastructure to enable faculty and students to 
preserve and make available, and thus maximize the utility of, their research 
data.  
* Develop, enhance, deliver, and assess research data workflows for IUB 
faculty, students and staff.  
* Serve as a library consultant to IUB faculty, researchers and project teams 
on the development of 

[CODE4LIB] Processing Circ data

2015-08-05 Thread Harper, Cynthia
Hi all. What are you using to process circ data for ad-hoc queries? I usually
extract csv or tab-delimited files - one row per item record, with identifying
bib record data, then total checkouts over the given time period(s). I have
been importing these into Access and then grouping them by bib record. I think
that I've reached the limits of scalability for Access for this project now,
with 250,000 item records. Does anyone do this in R? My other go-to software
for data processing is the RapidMiner free version. Or do you just use MySQL or
another SQL database? I was looking into doing it in R with RSQLite (I just
read about this and sqldf: http://www.r-bloggers.com/make-r-speak-sql-with-sqldf/)
because I'm sure my IT department will be skeptical of letting me have MySQL on
my desktop. (I've moved into a much more users-don't-do-real-computing kind of
environment.) I'm rusty enough in R that if anyone will give me some start-off
data import code, that would be great.

Cindy Harper
E-services and periodicals librarian
Virginia Theological Seminary
Bishop Payne Library
3737 Seminary Road
Alexandria VA 22304
char...@vts.edu
703-461-1794
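The workflow described above (one row per item record, grouped by bib record to total checkouts) can be prototyped with plain SQLite -- the same engine RSQLite and sqldf sit on top of. A minimal sketch using Python's standard library, with invented column names standing in for a real ILS export:

```python
import csv
import io
import sqlite3

# Toy rows standing in for an ILS export -- one row per item record with
# bib number, title, and checkouts (all column names here are invented).
sample = """bib_no,title,checkouts
b100,Intro to Theology,3
b100,Intro to Theology,0
b200,Church History,5
b200,Church History,2
b300,Hymnal,0
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (bib_no TEXT, title TEXT, checkouts INTEGER)")

# For a real export, swap io.StringIO(sample) for open('circ.csv').
reader = csv.DictReader(io.StringIO(sample))
con.executemany("INSERT INTO items VALUES (:bib_no, :title, :checkouts)", reader)

# The Access grouping step: collapse item rows by bib record, totalling checkouts.
totals = con.execute(
    "SELECT bib_no, title, SUM(checkouts) AS total "
    "FROM items GROUP BY bib_no, title ORDER BY bib_no").fetchall()
for bib_no, title, total in totals:
    print(bib_no, title, total)
```

SQLite comfortably handles a few hundred thousand rows on a desktop, and needs no server install for IT to approve.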


Re: [CODE4LIB] Processing Circ data

2015-08-05 Thread Owen Stephens
Another option might be to use OpenRefine (http://openrefine.org) - this should 
easily handle 250,000 rows. I find it good for basic data analysis, and there 
are extensions which offer some visualisations (e.g. the VIB BITs extension, 
which will plot simple data using d3: 
https://www.bits.vib.be/index.php/software-overview/openrefine).

I’ve written an introduction to OpenRefine, available at 
http://www.meanboyfriend.com/overdue_ideas/2014/11/working-with-data-using-openrefine/

Owen

Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: o...@ostephens.com
Telephone: 0121 288 6936

 On 5 Aug 2015, at 21:07, Harper, Cynthia char...@vts.edu wrote:
 [...]


Re: [CODE4LIB] Processing Circ data

2015-08-05 Thread Kevin Ford

Hi Cindy,

This doesn't quite address your issue, but, unless you've hit the 2 GB 
Access size limit [1], Access can handle a good deal more than the 250,000 
item records (rows, yes?) you cited.


What makes you think you've hit the limit?  Slowness, something else?

All the best,
Kevin

[1] 
https://support.office.com/en-us/article/Access-2010-specifications-1e521481-7f9a-46f7-8ed9-ea9dff1fa854






On 8/5/15 3:07 PM, Harper, Cynthia wrote:

[...]



Re: [CODE4LIB] Processing Circ data

2015-08-05 Thread Harper, Cynthia
Well, I guess it could be bad data, but I don't know how to tell. I think I've 
done more than this before.

I have a Find Duplicates query that groups by bib record number. That query 
seemed to take about 40 minutes to process. Then I added a criterion to limit 
to only records that had 0 circs this year. That query displays the rotating 
cursor, then says Not Responding, then the cursor, and loops through that for 
hours. Maybe I can find the Access bad data, but I'd be glad to find more 
modern data analysis software. My db is 136,256 KB, but adding that extra 
query will probably put it over the 2 GB mark. I've tried extracting to a csv, 
and that didn't work. Maybe I'll try a Make Table query to a separate db.

Or the OpenRefine suggestion sounds good too.
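For what it's worth, the "duplicates with zero circs this year" criterion that hangs in Access collapses to a single GROUP BY ... HAVING clause in SQL, so it runs as one aggregation rather than a query layered on a query. A sketch using Python's built-in sqlite3 (table and column names invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (bib_no TEXT, circs_this_year INTEGER)")
con.executemany("INSERT INTO items VALUES (?, ?)",
                [("b100", 0), ("b100", 0), ("b200", 4), ("b300", 0)])

# Bib records none of whose items circulated this year: aggregate once and
# filter the *groups* with HAVING, instead of stacking a second query on top.
zero_circ = [bib for (bib,) in con.execute(
    "SELECT bib_no FROM items "
    "GROUP BY bib_no HAVING SUM(circs_this_year) = 0 ORDER BY bib_no")]
print(zero_circ)
```

The same HAVING clause works in Access SQL view, MySQL, or sqldf, since it is standard SQL.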

Cindy Harper

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
Ford
Sent: Wednesday, August 05, 2015 4:23 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Processing Circ data

[...]



Re: [CODE4LIB] Processing Circ data

2015-08-05 Thread Kevin Ford
On the surface, your difficulties suggest you may need to look at a few 
optimization tactics. Apologies if these are things you've already 
considered and addressed - just offering a suggestion.


This page [1] is for Access 2003, but the items under Improve query 
performance should apply - I think - to newer versions also. I'll draw 
specific attention to 1) compacting the database; 2) making sure you 
have an index set up on the bib record number field and the number-of-circs 
field; and 3) making sure you are using the GROUP BY SQL syntax [2].


Now, I'm not terribly familiar with Access, so I can't actually help you 
with point-and-click instructions, but the above are common 'gotchas' that 
could be a problem regardless of RDBMS.


Yours,
Kevin

[1] https://support.microsoft.com/en-us/kb/209126
[2] http://www.w3schools.com/sql/sql_groupby.asp
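The index-plus-GROUP-BY advice above is indeed RDBMS-agnostic. A small illustration in Python's sqlite3 (toy table and invented names): indexing the grouping column leaves the results unchanged, while letting the engine walk rows in pre-sorted order instead of sorting them all to form the groups.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (bib_no TEXT, checkouts INTEGER)")
# 10,000 item rows spread over 500 bib records.
con.executemany("INSERT INTO items VALUES (?, ?)",
                [("b%05d" % (i % 500), i % 7) for i in range(10000)])

query = "SELECT bib_no, SUM(checkouts) FROM items GROUP BY bib_no"

# Without an index, the engine must sort the rows to form the groups.
before = con.execute(query).fetchall()

# Point 2 above: index the grouping column.
con.execute("CREATE INDEX idx_items_bib ON items (bib_no)")
after = con.execute(query).fetchall()

# Same answer either way; the index only changes how it is computed.
print(len(after))  # prints 500
```

On a quarter-million rows the difference between a sorted scan and an indexed scan is often the difference between seconds and a hung cursor, whatever the database.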



On 8/5/15 4:01 PM, Harper, Cynthia wrote:

[...]



[CODE4LIB] Fwd: Survey on embedded metadata in digital objects

2015-08-05 Thread Edward M. Corrado
Dear Colleagues,



You are invited to participate in a survey designed to collect information
on the practice of embedding metadata into digital objects.

The purpose of the survey is to explore the cost and benefit of embedding
additional (i.e., LAM-generated) metadata into digital objects, with the aim
of evaluating current practice and defining best practices.



The survey consists of a mix of closed and open-ended questions.
Participation should take 15-20 minutes.



*Please follow this link to complete the survey:  *
http://goo.gl/forms/okWuTIyTcN



Rachel Jaffe, Metadata Librarian, UC Santa Cruz and Edward Corrado,
Associate Dean, Library Technology Planning and Policy, University of
Alabama are conducting this survey.



*Participation is voluntary; participants will have the right to
discontinue the survey at any point without penalty.*



Information obtained from the online survey will be collected in a manner
such that human subjects cannot be identified, directly or through identifiers
linked to the subject. Data will be made available to the profession, along
with analysis of current practice and possibilities for future research.



The University of California, Santa Cruz Institutional Review Board has
determined that this survey qualifies as exempt from full IRB oversight.



No human subjects harm is expected to occur during the online survey.



*Deadline for completing the survey is September 15, 2015.*



Contact Rachel Jaffe at 831-502-7291 or jaf...@ucsc.edu, or Edward Corrado
at 205-348-0266 or emcorr...@ua.edu with questions or concerns about this
study. If you have questions about your rights as a participant in this
research, please contact the University of California, Santa Cruz Office of
Research Compliance Administration, at 831-459-1473 or o...@ucsc.edu.



Regards,



Rachel Jaffe

Metadata Librarian

Metadata Services, University Library

University of California, Santa Cruz

1156 High Street

Santa Cruz, CA 95064
(831) 502-7291

jaf...@ucsc.edu



Edward M. Corrado

Associate Dean

Library Technology Planning and Policy, University Libraries

University of Alabama

Box 870266

Tuscaloosa, AL 35487-0266

(205) 348-0266

emcorr...@ua.edu