Re: [CODE4LIB] Library Juice - thoughts?

2015-10-27 Thread davesgonechina
I've not taken any classes with Library Juice, mainly because I find their
course descriptions too thin. The Data Management course has a better
description than most, but perhaps I've been spoiled by Coursera where I
can see a syllabus, schedule, and materials before deciding to pay any
fees. I'm wondering, those of you who have taken a Library Juice course,
what attracted you to it and how did the experience match or differ from
your expectations?

Dave

On Wed, Oct 28, 2015 at 2:58 AM, Folds, Dusty  wrote:

> Yes, I concur with these comments. Just be aware of the time commitment
> that will be involved. That's where I ran into problems, too.
>
> Dusty
>
> --
> Dusty Folds, MLIS
> Information Literacy and Digital Learning Librarian
> Assistant Professor
> University of Montevallo
> Carmichael Library
> Station 6108
> Montevallo, AL 35115
> P: 205-665-6108
> F: 205-665-6112
> E: dfo...@montevallo.edu
>
>
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> REESE-HORNSBY, TWYLA
> Sent: Tuesday, October 27, 2015 1:47 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Library Juice - thoughts?
>
> I started a course (Introduction to XML) through Library Juice a year
> ago.  I wasn't able to finish it due to some personal challenges but I
> still have access to the archived class which is great.  Like Patricia, I
> found the content very useful but underestimated how much time I needed to
> read and study the material.  Four weeks goes fast! The instructor also
> scheduled times to meet online for questions.
>
> I did have trouble getting used to the Moodle platform but I think it has
> since been upgraded to be more user-friendly.
>
> I am seriously considering taking another course in the near future.
>
> Best,
>
> Twyla Reese-Hornsby
> Public Service Librarian | J. Ardis Bell Library Tarrant County College
> Northeast Campus | Office: NLIB 2127A
> 828 W. Harwood Rd. |Hurst, TX 76054
> 817-515-6365 | Fax: 817-515-6275
> twyla.reese-horn...@tccd.edu | www.tccd.edu
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> Patricia Farnan
> Sent: Monday, October 26, 2015 9:40 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Library Juice - thoughts?
>
> I recently did a course through Library Juice on PHP & APIs, and I found
> it really useful and easy to follow (well, easy for my poor brain to
> follow. I still had to re-read my notes and re-listen to certain parts of
> each video, to really let things sink in). The instructor was very good at
> staying in touch with students and interacting.
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> BWS Johnson
> Sent: Tuesday, 27 October 2015 4:14 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] Library Juice - thoughts?
>
> Salvete!
>
>  I'm going to be exceedingly naughty in replying to this. I used to
> teach a course on Koha for Rory, so obviously I'm heavily biased.
>
>  I taught twice, and as a fringe perq, he let instructors take certain
> courses gratis.
>
>  I would say overall that you're in for a treat. When it first started
> it was a small experimental thing. Students' experiences varied widely
> depending on how much they participated and which instructor they
> selected. Rory has gone out of his way over the years to solidify the
> lineup so that you get a good instructor. Compared to my University, they
> are WAY cheaper. They weren't as comprehensive as my University, but hey,
> that would be a really high bar. Also, they're designed with someone who's
> working full time in mind.
>
>
>  As far as I know, they're still using Moodle, so if you're familiar
> with that platform, you'll be right at home.
>
>  The time commitment will vary by course, as well. I bet that Rory
> would give you your instructor's email in advance to feel things out and
> see how heavy the workload might be.
>
>  So yeah, go for it!
>
> Hope this helped,
> Brooke
>


[CODE4LIB] Thomson Reuters and Impact Factors

2015-08-03 Thread davesgonechina
Hi all,

If I wanted to subscribe to up-to-date impact factor information from
Thomson Reuters, which product would I need to purchase (JCR, InCites, ESI,
etc.) and is there a general ballpark for price?

Thanks!
Dave


Re: [CODE4LIB] Definitional Question

2015-07-02 Thread davesgonechina
"How many humanities scholars does it take to define digital humanities?
Good question, to which there is no good answer, but rather just more
questions..." - Paul Spence, courtesy of http://whatisdigitalhumanities.com/

I can't help but think that defining digital humanities is overthinking
it. Scholarship is practiced by non-humanities disciplines such as the
natural sciences as well as the humanities. Simply appending "digital"
doesn't clearly refer to anything except the vague notion that somewhere,
somehow, computers and/or fingers play an important role.

DL



On Fri, Jul 3, 2015 at 4:13 AM, Nick Szydlowski nick.szydlow...@bc.edu
wrote:

 I like Bryan's answer as well.  I've heard a lot of comments and jokes
 about the difficulty of defining digital humanities; this site gives a
 different definition each time you refresh the page:

 http://whatisdigitalhumanities.com/

 Nick


 Nick Szydlowski
 Digital Initiatives and Scholarly Communication Librarian
 Boston College Law School
 617 552-4474

 On Thu, Jul 2, 2015 at 3:04 PM, McAulay, Elizabeth 
 emcau...@library.ucla.edu wrote:

  Bryan's answer is very well thought out and jibes with my understanding
 of
  this topic, too.
 
  
  From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Bryan
  Brown bjbr...@fsu.edu
  Sent: Thursday, July 02, 2015 11:49 AM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] Definitional Question
 
  Hi Matt,
 
  I work in the Technology & Digital Scholarship department of Florida
 State
  University Libraries, and I spent my first few months trying to come up
  with answers to those exact questions. Here's what I came up with:
 
  Digital humanities is the act of doing humanities scholarship using
  research methods enabled by new technology. The archetypical digital
  humanities project in my mind is text mining. If you are coming up with
  humanities data and using data analysis tools on it, you are probably
  doing DH work (IMHO).
 
  Digital scholarship is the idea of DH, but extended outside of DH to all
  scholarship. How does new technology affect scholarship in psychology?
  biochemistry? law? A big problem that I see with digital scholarship is
  that I have yet to hear anyone outside of libraries or DH communities use
  it. The humanities haven't always been so digital, so the term "Digital
  Humanities" is a semi-useful term to differentiate this specific form of
  research from more traditional methods. The "digital" prefix has less
  utility outside of humanities; science has always been pretty digital out
  of necessity, and other fields have adopted digital methods as they go.
 I've
  heard librarians use the term "e-science" sometimes, and it reminds me of
 the
  term "e-business" back in the '90s, but now almost all business is
  e-business, so the term no longer makes much sense. Most scholarship these
  days is digital, which makes defining digital scholarship as something
  special a bit difficult.
 
  In our department we use "digital scholarship" to refer to the more
  technology-oriented parts of the scholarship process, where faculty might
  not be aware of general best practices. Data management, research
 metadata,
  altmetrics, web publishing and licensing are some areas where we try to
  focus on supporting faculty. We aren't a huge department and we're
 learning
  as we go, so discussing what digital scholarship means and how we can
  provide value to faculty members is a big point of discussion (although
 I'm
  sure we all have our own definitions and ideas).
 
  Just one person's opinion, I hope that doesn't confuse things further.
  -Bryan Brown
 
  On Thu, Jul 2, 2015 at 2:13 PM, Natalie Meyers natalie.mey...@nd.edu
  wrote:
 
   this title may be of interest:
   Defining Digital Humanities: A Reader, edited by Melissa Terras, Julianne
   Nyhan and Edward Vanhoutte, December 2013, 978-1-4094-6963-6, $44.95
  
   On Thu, Jul 2, 2015 at 1:58 PM, Matt Sherman matt.r.sher...@gmail.com
 
   wrote:
  
Hi all,
   
This is a bit more philosophical question which might only apply to a
  few
people but I am trying to work out some definitions for my own
edification.  So for those in the digital scholarship and digital
humanities subset I would be interested in getting some thoughts on
  these
three questions:
   
1) How would you define digital scholarship?
   
2) How would you define digital humanities?
   
3) Are they the same thing and why or why not?
   
Any thoughts are appreciated as I am trying to think through this
  myself.
   
Matt Sherman
   
  
  
  
   --
   Natalie K. Meyers

   E-Research & VecNet Digital Librarian

   Hesburgh Libraries

   University of Notre Dame
   1136A Hesburgh Library
   Notre Dame, IN 46556
   o: 574-631-1546
   f: 574-631-6772
   e: natalie.mey...@nd.edu
  
   http://library.nd.edu/
  
 



Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread davesgonechina
If your *institutional* email address is not on their whitelist (not sure
if it is limited to subscribing ones, they don't say) you cannot register
using the signup form; instead you can only request an account by briefly
explaining why you want one. Weird, because they'd have potentially learned
more about me if they just let me put my gmail address in the signup form.

I don't get it - can all users download public domain content? If they give
me an account, will I be indistinguishable from a subscribing institution?
If not, why the extra hoops?

On Fri, May 29, 2015 at 1:51 AM, Eric Lease Morgan emor...@nd.edu wrote:

 On May 27, 2015, at 6:33 PM, Karen Coyle li...@kcoyle.net wrote:

  In my copious spare time I have hacked together a thing I’m calling the
 HathiTrust Research Center Workset Browser, a (fledgling) tool for doing
 “distant reading” against corpora from the HathiTrust. [0, 1] ...
 
  'Want to give it a try? For a limited period of time, go to the
 HathiTrust Research Center Portal, create (refine or identify) a collection
 of personal interest, use the Algorithms tool to export the collection's
 rsync file, and send the file to me. I will feed the rsync file to the
 Browser, and then send you the URL pointing to the results.
 
  [0] introduction in a blog posting - http://ntrda.me/1FUGP2g
  [1] HTRC Workset Browser - http://bit.ly/workset-browser
 
  Eric, what happens if you access this from a non-HT institution? When I
 go to HT I am often unable to download public domain titles because they
 aren't available to members of the general public.


 The short answer is, “Nothing”.

 The long answer is… longer. The HathiTrust proper is accessible to
 anybody, but the downloading of public domain content is only available to
 subscribing institutions.

 On the other hand, the “Workset Browser” is designed to work off the
 HathiTrust Research Center Portal, not the HathiTrust proper. The Portal is
 located at http://sharc.hathitrust.org. From there anybody can search the
 collection of public domain content, create collections, and apply various
 algorithms against collections. One of the algorithms is “create RSYNC
 file” which, in turn, allows you to download bunches o’ metadata describing
 the items in your collection. (There is also a “download as MARC”
 algorithm.) This rsync file is the root of the Workset Browser. Feed the
 Browser an rsync file, and the Browser will mirror content locally, index
 it, and generate reports describing the collection.
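
 If you want to experiment with the exported rsync file itself, here is a
 minimal sketch in Python (the source host below is a placeholder, not the
 real HTRC server):

   # feed an exported rsync file list to rsync and mirror it locally;
   # the exported file is just a list of remote paths, one per line
   import subprocess

   RSYNC_SOURCE = 'rsync://example.hathitrust.org/files/'  # hypothetical

   subprocess.check_call(['rsync', '-av', '--files-from=collection.txt',
                          RSYNC_SOURCE, 'mirror/'])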

 Thank you for asking. Many people do not know there is a HathiTrust
 Research Center.

 —
 Eric Morgan



Re: [CODE4LIB] hathitrust research center workset browser

2015-06-01 Thread davesgonechina
They just informed me I need a .edu address. Having trouble understanding
the use of the term "public domain" here.

On Mon, Jun 1, 2015, 9:58 PM Eric Lease Morgan emor...@nd.edu wrote:

 On Jun 1, 2015, at 4:33 AM, davesgonechina davesgonech...@gmail.com
 wrote:

  If your *institutional* email address is not on their whitelist (not sure
  if it is limited to subscribing ones, they don't say) you cannot register
  using the signup form, instead you can only request an account by briefly
  explaining why you want one. Weird, because they'd have potentially
 learned
  more about me if they just let me put my gmail address in the signup
 form.
 
  I don't get it - can all users download public domain content? If they
 give
  me an account, will I be indistinguishable from a subscribing
 institution?
  If not, why the extra hoops?


 Dave, you are the second person to bring this “white listing” issue to my
 attention. Bummer! Yes, apparently, unless your email address is part of
 some wider something-or-other, you need to be authorized to use the
 Research Center. Weird! While the Research Center’s tools work, I believe
 the site suffers from usability issues.

 In any event, I have enhanced the auto-generated reports created by my
 “Browser”, and while they are very textual, I also believe they are
 insightful. For example, the complete works of:

   * William Ellery Channing - http://bit.ly/browser-channing-about
   * Jane Austen - http://bit.ly/browser-austen-about
   * Ralph Waldo Emerson - http://bit.ly/browser-emerson-about
   * Henry David Thoreau - http://bit.ly/browser-thoreau-about

 —
 Eric “Beginning To Suffer From ‘Creeping Featuritis’” Morgan



Re: [CODE4LIB] Library Hours

2015-05-07 Thread davesgonechina
I contacted the group behind the Indiegogo campaign on Twitter:

https://twitter.com/davesgonechina/status/596148115465371649


Caravan Studios @caravanstudios (May 2):
Help us raise $10K to put #libraries locations & hours in #Rangeapp &
help youth find free #summermeals & #safeplaces http://bit.ly/rangecampaign

davesgonechina @davesgonechina (May 7, 11:02 AM):
@caravanstudios also, library hours change often, budgets get cut. Is $10K
enuff 2 run regular scrapes for years, or is this a one-off?

Caravan Studios @caravanstudios (May 7):
.@davesgonechina this is a one time push for this summer. We'll open up the
system so librarians can update their own data next year.

davesgonechina @davesgonechina (May 7):
@caravanstudios that presumes librarians have the bandwidth/inclination to
update ur $10K DB. Just sayin.



On Thu, May 7, 2015 at 1:33 AM, Dan Scott deni...@gmail.com wrote:

 On Wed, May 6, 2015 at 8:15 AM, Ethan Gruber ewg4x...@gmail.com wrote:

  +1 on the RDFa and schema.org. For those that don't know the library URL
  off-hand, it is much easier to find a library website by Googling than it
  is to go through the central university portal, and the hours will show
 up
  at the top of the page after having been harvested by search engines.


 Hi, so this is an area where I've done, and am doing, a fair bit of work.
 See http://stuff.coffeecode.net/2015/ola_white_hat_seo/#/1/10 for some fun
 slides from a presentation I gave in January at the Ontario Library
 Association SuperConference that show some ways data gets into
 Google/Yahoo/Bing and concludes that the OCLC Registry "manually maintain
 yet another copy of your data elsewhere" approach isn't working. (Hit "s"
 to get speaker notes.)

 The rest of the presentation goes into depth on how to use RDFa to mark up
 a real library web page with location, contact info, opening hours, and
 event info. And I've posited that crawling library sites to pull
 single-sourced data (e.g. you update your website to provide updated hours
 to humans, and the machines automatically benefit) would be a much more
 effective, accurate, and usable approach than maintaining copies of the
 data in Google+, OCLC Registry, etc. We could produce results like
 http://cwrc.ca/rsc-src/ that stay accurate, rather than being one-off
 efforts that decay over time. (It would be great if the OCLC Registry had a
 "crawl this URL" option so that it could keep all of its data up-to-date
 and incentivize libraries to publish the data in a machine-readable format
 such as RDFa + schema.org.)
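
 To give a flavor of the markup, here is a minimal sketch (a made-up
 library, but real schema.org properties):

   <div vocab="http://schema.org/" typeof="Library">
     <span property="name">Example Branch Library</span>
     <span property="telephone">555-555-5555</span>
     <meta property="openingHours" content="Mo-Fr 09:00-17:00">
   </div>

 Anything that parses RDFa can then lift the opening hours straight out of
 the page you already maintain for humans, which is the single-sourcing
 idea.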

 On the "but that's technically challenging" front, I tried pursuing some
 grant funding to produce templates for publishing that structured info in
 Drupal, Joomla, and other commonly used CMSs. Sadly, my application was
 recently denied, but that will only slow me down; I'm not going to give up
 on the goal. I have a paper in the works that will expand on the content of
 the presentation for those sites that have the ability (technical and
 administrative) to modify their own web pages.

 Sites running the Evergreen library system already generate a page for each
 of their libraries that contains this structured data (e.g.
 https://laurentian.concat.ca/eg/opac/library/OSUL), which is
 single-sourced
 from the data that has to be maintained in the library system anyway.

 I'll happily acknowledge that getting search engines to harvest the right
 data is not easy, though: right now, for example, if you search for "J.N.
 Desmarais Library" it currently shows that the library is open 24 hours a
 day, which is completely false--probably maliciously
 submitted--information. *sigh* I've edited that info in the Google+ page

Re: [CODE4LIB] Protagonists

2015-04-23 Thread davesgonechina
Hey thanks everybody, I've been too busy to dig into any of your
suggestions but hugely appreciated. This group is awesome.

@Amanda, I actually remember signing up for Small Demons in beta and it
died before I got a chance to really explore it.
@Thomas, LibraryThing's characternames field looks very promising if the
list consistently gives main characters first billing.
@Shaun Trajectory is definitely interesting, though I've not thought of a
use case yet.
@Karen true about the authority problem - unless publishers wrap this sort
of info in ebook metadata?
@Joshua Like LibraryThing, it's unclear if the character lists are actually
prioritized by significance.
@Joel Shame those resources look rather dusty. As for an IMDB for books, I
think LibraryThing or Amazon are better positioned than anyone.
@Brooke I'm absolutely certain it's doable, but as @Amy points out it's a
pain in the ass. Even if I simply take @Alexander's suggestion of the Le
Monde list, I have to scrape and scan and scrub for something that, in a
world where we can have nice things, would already exist as a
rough-and-ready, incomplete, but off-the-shelf dataset. It kinda blows my
mind that it doesn't.

Not to mention there's the other step I mentioned, which is matching them
up with Gutenberg.org pages.

I'll keep you guys updated as I dig into all your ideas. Cheers!

Dave



On Wed, Apr 15, 2015 at 4:17 AM, Thomas Guignard thomas.guign...@gmail.com
wrote:

 The LibraryThing API could also be used to retrieve what they call "Common
 Knowledge" tags, including character names but also place names, etc.

 Example:

 https://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&id=2773690&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4
 (using the example API key)
 Look for the characternames field.
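
 A rough Python sketch of pulling that field (untested, and I am guessing
 at the exact shape of the response XML, so treat the element details as
 assumptions):

   import urllib.request
   import xml.etree.ElementTree as ET

   url = ('https://www.librarything.com/services/rest/1.1/'
          '?method=librarything.ck.getwork&id=2773690&apikey=YOUR_KEY')
   tree = ET.parse(urllib.request.urlopen(url))

   # hunt for the Common Knowledge field named 'characternames' and print
   # whatever text its child elements carry, in listed order
   for elem in tree.iter():
       if elem.get('name') == 'characternames':
           for fact in elem.iter():
               if fact.text and fact.text.strip():
                   print(fact.text.strip())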

 As far as I can tell, however, there is no way to determine which of the
 characters are the "lead male" and "lead female" characters short of
 assuming that the top listed characters are in effect the lead ones. Also,
 the API calls are limited to 1000 a day. But maybe an avenue to consider.

 t.

 On Tue, Apr 14, 2015 at 2:15 PM, Shaun Ellis sha...@princeton.edu wrote:

  Another interesting startup in this area is Trajectory.
 
  Here's a list of Classics/Fiction via their JSON API (doc=isbn):
  http://api.trajectory.com/api/v1/search/?q=&c=Fiction%20%2F%
  20Classics&limit=568
 
  Here's a human readable view:
  http://www.trajectory.com/search/?q=facets&c=Fiction%
  20%2F%20Classics&limit=568
 
  -Shaun
 
 
  On 4/14/15 11:07 AM, Amanda French wrote:
 
  What you *did* need for this interesting project was Small Demons, which
  was a for-profit company that was creating linked data from books --
 here's
  an article about it: http://www.theverge.com/2013/
  3/1/4043298/building-an-atlas-for-books-with-small-demons
 
  But it shut down in 2013, and I have no idea what happened to the data.
  It might all have been commercial and proprietary, anyway. Article on
 its
  closure: http://www.latimes.com/books/jacketcopy/la-et-jc-small-
  demons-to-close-unless-buyer-appears-20131106-story.html
 
  Amanda
 
 
  On 4/13/15 10:12 PM, davesgonechina wrote:
 
  So I have this idea I'd like to do for a hobby project, but it requires
  finding a table that lists a classic novel, a Gutenberg.org link to an
 
   snip
 
 



[CODE4LIB] Protagonists

2015-04-13 Thread davesgonechina
So I have this idea I'd like to do for a hobby project, but it requires
finding a table that lists a classic novel, a Gutenberg.org link to an
instance of that work (first listed, one with most downloads, whichever),
the lead female character, and the lead male character (can be null). E.g.
Pride and Prejudice, http://www.gutenberg.org/ebooks/42671, Elizabeth
Bennet, Mr. Darcy. Even leaving the Gutenberg part for another day, this
has been really difficult to find.

I've had no success with DBpedia/Wikidata since there's no real
standardized format for novels, characters are often associated more
strongly with films or video games than original works (Cheshire Cat), and
when characters are listed they are neither prioritized nor linked to a
record that clearly states gender. And then there's how to select some sort
of Western Canon list. ISBNs are nowhere to be found, nor any other
identifier that might help to corral a fair chunk of results.
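
For the record, here is the kind of query I was attempting, as a Python
sketch (the Wikidata property numbers are from memory - P31 instance of,
Q7725634 literary work, P674 characters, P21 gender - so double-check
them):

  import requests

  QUERY = '''
  SELECT ?novelLabel ?charLabel ?genderLabel WHERE {
    ?novel wdt:P31 wd:Q7725634 .   # instance of: literary work
    ?novel wdt:P674 ?char .        # characters
    ?char wdt:P21 ?gender .        # sex or gender
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  } LIMIT 100
  '''

  r = requests.get('https://query.wikidata.org/sparql',
                   params={'query': QUERY, 'format': 'json'})
  for row in r.json()['results']['bindings']:
      print(row['novelLabel']['value'], row['charLabel']['value'],
            row['genderLabel']['value'])

Even when that returns rows, nothing tells you which character gets top
billing, which is the crux of the problem.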

I looked at OCLC, but WorldCat Works is still an experiment and frankly
looks like too much work to query for too little return even if it had good
coverage. Amazon? Librarything? Goodreads? No luck yet.

I raise this partly because a) I would like to make some toys with that
list, and b) I feel this is a good test case for what developers might
want from library data, linked or otherwise. It is the sort of request
that includes many unspoken assumptions (that there is a canon, and it is
well-defined) that app users, product managers, and developers typically
want even if it is woefully incomplete or imperfect, so long as it matches
expectations. While I appreciate what it takes to make such a list, I feel
like this really ought to be a solved problem in the library space. Not "in
the process of being solved, hopefully, by new emerging standards" solved,
but "we solved this ages ago, here ya go" solved.

I'm posting this basically in the hopes that someone will say, "No, doofus,
there's an easy way to do this, you just aren't very good at this - look:"
and show me where I'm wrong.

D


Re: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

2015-03-18 Thread davesgonechina
@John - Thanks, I'd be interested to learn more about the supportable
pattern you mentioned if there are any readings you'd recommend.

@Joe - Cheers, Andreas Rauber's presentation sounds particularly relevant.
Do you have a link?

@Colin - Thanks for the feedback, I do plan to take a closer look at JIRA.

Dave

On Fri, Mar 13, 2015 at 11:49 PM, Joe Hourcle onei...@grace.nascom.nasa.gov
 wrote:



 On Wed, 11 Mar 2015, davesgonechina wrote:

  Hi John,

 Good question - we're taking in XLS, CSV, JSON, XML, and on a bad day PDF
 of varying file sizes, each requiring different transformation and audit
 strategies, on both regular and irregular schedules. New batches often
 feature schema changes requiring modification to ingest procedures, which
 we're trying to automate as much as possible but obviously require a human
 chaperone.

 Mediawiki is our default choice at the moment, but then I would still be
 looking for a good workflow management model for the structure of the
 wiki,
 especially since in my experience wikis are often a graveyard for the best
 intentions.



 A few places that you might try asking this question again, to see if you
 can find a solution that better answers your question:


 The American Society for Information Science & Technology's Research Data
 Access & Preservation group.  It has a lot of librarians & archivists in
 it, as well as people from various research disciplines:

 http://mail.asis.org/mailman/listinfo/rdap
 http://www.asis.org/rdap/

 ...

 The Research Data Alliance has a number of groups that might be relevant.
 Here are a few that I suspect are the best fit:

 Libraries for Research Data IG
 https://rd-alliance.org/groups/libraries-research-data.html

 Reproducibility IG
 https://rd-alliance.org/groups/reproducibility-ig.html

 Research Data Provenance IG
 https://rd-alliance.org/groups/research-data-provenance.html

 Data Citation WG
 (as this fits into their 'dynamic data' problem)
 https://rd-alliance.org/groups/data-citation-wg.html

 ('IG' is 'Interest Group'; these are long-lived.  'WG' is 'Working Group';
 these are formed to solve a specific problem and then disband.)

 The group 'Publishing Data Workflows' might seem to be appropriate but
 it's actually 'Workflows for Publishing Data' not 'Publishing of Data
 Workflows' (which falls under 'Data Provenance' and 'Data Citation')

 There was a presentation at the meeting earlier this week by Andreas
 Rauber in the Data Citation group on workflows using git or SQL databases
 to track appends and modifications to CSV and similar ASCII
 files.

 ...

 Also, I would consider this to be on-topic for Stack Exchange's Open
 Data site  (and I'm one of the moderators for the site):

 http://opendata.stackexchange.com/

 -Joe






  On Tue, Mar 10, 2015 at 8:10 PM, Scancella, John j...@loc.gov wrote:

  Dave,

 How are you getting the metadata streams? Are they actual stream objects,
 or files, or database dumps, etc?

 As for the tools, I have used a number of the ones you listed below. I
 personally prefer JIRA (and it is free for non-profits). If you are ok with
 editing in wiki syntax I would recommend MediaWiki (it is what powers
 Wikipedia). You could also take a look at continuous deployment
 technologies like virtual machines (VirtualBox), Linux containers
 (Docker),
 and rapid deployment tools (Ansible, Salt). Of course if you are doing
 lots
 of code changes you will want to test all of this continually (Jenkins).

 John Scancella
 Library of Congress, OSI

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 davesgonechina
 Sent: Tuesday, March 10, 2015 6:05 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

 Hi all,

 One of my projects involves harvesting, cleaning and transforming steady
 streams of metadata from numerous publishers. It's an infinite loop but
 every cycle can be a little bit or significantly different. Many issue
 tracking tools are designed for a linear progression that ends in
 deployment, not a circular workflow, and I've not hit upon a tool or use
 strategy that really fits.

 The best illustration I've found so far of the type of workflow I'm
 talking about is the DCC Curation Lifecycle Model 
 http://www.dcc.ac.uk/sites/default/files/documents/
 publications/DCCLifecycle.pdf


  .

 Here are some things I've tried or thought about trying:

- Git comments
- Github Issues
- MySQL comments
- Bash script logs
- JIRA
- Trac
- Trello
- Wiki
- Unfuddle
- Redmine
- Zendesk
- Request Tracker
- Basecamp
- Asana

 Thoughts?

 Dave





[CODE4LIB] Data Lifecycle Tracking & Documentation Tools

2015-03-10 Thread davesgonechina
Hi all,

One of my projects involves harvesting, cleaning and transforming steady
streams of metadata from numerous publishers. It's an infinite loop but
every cycle can be a little bit or significantly different. Many issue
tracking tools are designed for a linear progression that ends in
deployment, not a circular workflow, and I've not hit upon a tool or use
strategy that really fits.

The best illustration I've found so far of the type of workflow I'm talking
about is the DCC Curation Lifecycle Model
http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf
.

Here are some things I've tried or thought about trying:

   - Git comments
   - Github Issues
   - MySQL comments
   - Bash script logs
   - JIRA
   - Trac
   - Trello
   - Wiki
   - Unfuddle
   - Redmine
   - Zendesk
   - Request Tracker
   - Basecamp
   - Asana

Thoughts?

Dave


Re: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

2015-03-10 Thread davesgonechina
Hi John,

Good question - we're taking in XLS, CSV, JSON, XML, and on a bad day PDF
of varying file sizes, each requiring different transformation and audit
strategies, on both regular and irregular schedules. New batches often
feature schema changes requiring modification to ingest procedures, which
we're trying to automate as much as possible but obviously require a human
chaperone.

Mediawiki is our default choice at the moment, but then I would still be
looking for a good workflow management model for the structure of the wiki,
especially since in my experience wikis are often a graveyard for the best
intentions.
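
One idea we're toying with (just a sketch in Python, file names invented)
is a dumb schema fingerprint check that compares each new batch's field
names against the last accepted batch, so a human only gets pulled in when
something actually changed:

  import csv

  def fieldnames(path):
      # read the header row of a CSV batch as a set of field names
      with open(path, newline='') as f:
          return set(next(csv.reader(f)))

  old = fieldnames('batch_2015_02.csv')
  new = fieldnames('batch_2015_03.csv')
  added, dropped = new - old, old - new
  if added or dropped:
      print('schema change! added:', added, 'dropped:', dropped)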

Dave




On Tue, Mar 10, 2015 at 8:10 PM, Scancella, John j...@loc.gov wrote:

 Dave,

 How are you getting the metadata streams? Are they actual stream objects,
 or files, or database dumps, etc?

 As for the tools, I have used a number of the ones you listed below. I
 personally prefer JIRA (and it is free for non-profits). If you are ok with
 editing in wiki syntax I would recommend MediaWiki (it is what powers
 Wikipedia). You could also take a look at continuous deployment
 technologies like virtual machines (VirtualBox), Linux containers (Docker),
 and rapid deployment tools (Ansible, Salt). Of course if you are doing lots
 of code changes you will want to test all of this continually (Jenkins).

 John Scancella
 Library of Congress, OSI

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 davesgonechina
 Sent: Tuesday, March 10, 2015 6:05 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

 Hi all,

 One of my projects involves harvesting, cleaning and transforming steady
 streams of metadata from numerous publishers. It's an infinite loop but
 every cycle can be a little bit or significantly different. Many issue
 tracking tools are designed for a linear progression that ends in
 deployment, not a circular workflow, and I've not hit upon a tool or use
 strategy that really fits.

 The best illustration I've found so far of the type of workflow I'm
 talking about is the DCC Curation Lifecycle Model 
 http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf
 
 .

 Here are some things I've tried or thought about trying:

- Git comments
- Github Issues
- MySQL comments
- Bash script logs
- JIRA
- Trac
- Trello
- Wiki
- Unfuddle
- Redmine
- Zendesk
- Request Tracker
- Basecamp
- Asana

 Thoughts?

 Dave



Re: [CODE4LIB] Streaming Copyrighted material

2014-12-11 Thread davesgonechina
Hi all,

Agreed with Brent regarding a cease and desist letter coming long before any
legal action, and agreed with Simon that under Aereo, and previous
decisions, streaming is a performance and not distribution. FWIW I'm fairly
certain that between the educational exemption for performance and display
(Section 110(1)) and the fair use test (Section 107) libraries are on solid
ground when streaming video through a third-party app (aren't you always
through a browser?). The portion of the video saved temporarily on your
local device is not infringement - the Cablevision decision carved out
space for that and I don't believe Aereo changed that. Aereo was more about
how transparently and frankly cynically they were adhering to the letter
and not the spirit of the law to get around transmission fees, not whether
caching is an act of piracy.

If Kodi is simply providing a platform for accessing streaming content that
is either made free on the web or for which the library has purchased an
appropriate license/subscription (i.e. institutional not individual use),
using Kodi or XBMC or some other tool as a discovery tool seems
non-problematic.

Dave


On Mon, Dec 8, 2014 at 2:30 AM, Brent Hanner behan...@mediumaevum.com
wrote:

 Sorry this took so long but been having a bunch of computer problems.

 Instead of trying to reply to bits of this I’m going to try to be more
 comprehensive.




 First thing is to understand a few things.

 The streaming aspect is far less important than where you are transferring
 it from and to.

 You have far more flexibility within the building than you do publicly
 over the internet, just as an individual acting for personal use has more
 flexibility than a public corporation.  This is where Aereo tried to slide
 in and the Court disagreed with them.

 And libraries tend to fall somewhere in between, having special exemptions
 to copyright granted by Congress, though the laws don’t cover modern
 technical details.




 As long as you act in good faith, you or your library will not get sued,
 for three reasons.

 Firstly, standard operating procedure is to send a cease and desist
 letter.  So if you do skirt the limits, realize it can happen; comply, and
 then tell us what you did and what it said so the broader library community
 can decide where it stands.

 Secondly, one of the last things a major content company wants is to sue a
 library.  One thing that was clearly shown during surveys of people over
 the last few years is that while lots of people don’t actively use their
 library, public support for libraries is still very high.

 Thirdly, they don’t want to sue a library because if they lose, every
 library in the country will know what it can and cannot implement.  And if
 they win they will face a legislative fight to expand what libraries can
 do.  They are served far better by there not being clear rules, especially
 because librarians fear far more than they should.




 Part of this sort of thing in the long run is about managing bandwidth;
 with streaming video sucking up more and more bandwidth, finding ways of
 controlling it will be useful.
 appliance to help everyone with this but I’d imagine it will be a few years
 before it gets down to a library level unless someone comes up with a
 completely open source solution we can implement ourselves.


 Someone mentioned network TV, which brings up the really interesting
 space.  There is an argument to be made for providing access to
 content freely available to the public.  While you clearly could not stream
 it to other locations, the software and hardware is readily available.  So
 the question is: does anyone know of any court cases or LOC/copyright
 guidelines from back in the days of the VCR about libraries recording shows
 on video tape and providing access to those tapes?


 The other thing to consider as a community is developing a catalog of
 videos that would be good to keep on servers in libraries, where they can
 be downloaded so they are more readily accessible without killing the
 library’s bandwidth.





 Brent






 Sent from Windows Mail





 From: Cornel Darden Jr.
 Sent: ‎Tuesday‎, ‎December‎ ‎2‎, ‎2014 ‎8‎:‎59‎ ‎PM
 To: CODE4LIB@LISTSERV.ND.EDU





 Hello,

 Is streaming (viewing online) copyrighted material illegal for
 individuals? According to the copyright.gov website this seems to be
 completely legal for the viewer when there isn't a copy of the work on the
 viewer's computer. It only mentions hosting streams as being a misdemeanor,
 even if there isn't any profit.

 This is becoming a huge issue as more content consumers become cord
 cutters. Have any librarians faced these questions?

 I am planning on implementing Kodi in my library, but will only make
 public domain material accessible. Kodi provides an excellent user
 interface for organizing and viewing public domain material.

 Thanks,

 Cornel Darden Jr.
 MSLIS
 Library Department Chair
 South Suburban College
 

Re: [CODE4LIB] Anybody using pinboard?

2014-11-20 Thread davesgonechina
I like the platform, but I think I really paid for Maciej's wit.

http://idlewords.com/bt14.htm

On Thu, Nov 20, 2014 at 10:27 PM, Rogan Hamby rogan.ha...@yclibrary.net
wrote:

 I've been using it since fairly early days.  I like it but don't get
 exceptionally fancy beyond my own esoteric taxonomy for defining my
 bookmarks.

 On Thu, Nov 20, 2014 at 9:19 AM, Daniel Lovins daniel.lov...@nyu.edu
 wrote:

  I've been using it for years as a personal bookmarking tool, and
  think it's excellent. Jason may be doing more complex things with it,
  though.
 
  - Daniel.
 
  On Thu, Nov 20, 2014 at 9:11 AM, Brad Coffield
  bcoffield.libr...@gmail.com wrote:
   https://pinboard.in/
  
   First saw this in a webinar led by Jason Clark and thought it was cool.
   Thinking about it again and feel like I should do it. But I'm worried
  it's
   just my tendency to want it because its something neato.
  
   Anybody using it and recommend it? (or signed up and regret it?) I
  already
   work evernote hard so I'm wondering if it's useful enough separate from
   that.
  
   Thanks!
  
   --
   Brad Coffield, MLIS
   Assistant Information and Web Services Librarian
   Saint Francis University
   814-472-3315
   bcoffi...@francis.edu
 
 
 
  --
  Daniel Lovins
  Head of Knowledge Access, Design  Development
  Knowledge Access  Resource Management Services
  New York University, Division of Libraries
  20 Cooper Square, 3rd floor
  New York, NY 10003-7112
  daniel.lov...@nyu.edu
  212-998-2489
 



 --

 Rogan Hamby, MLS, CCNP, MIA
 Manager, Headquarters Library and Reference Services,
 York County Library System

 “You can never get a cup of tea large enough or a book long enough to suit
 me.”
 ― C.S. Lewis http://www.goodreads.com/author/show/1069006.C_S_Lewis



[CODE4LIB] International CODEN Service

2014-07-30 Thread davesgonechina
Does anyone use it, and how? Also, how much?

Dave Lyons


Re: [CODE4LIB] 'automation' tools

2014-07-07 Thread davesgonechina
+1 to OpenRefine. Some extensions, like RDF Refine http://refine.deri.ie/,
currently only work with the old Google Refine (still available here:
https://code.google.com/p/google-refine/). There are a good number of
interesting projects for OpenRefine on GitHub and GitHub Gist.

Google Docs Spreadsheets also has a surprising amount of functionality,
such as importXML if you're willing to get your hands dirty with regular
expressions.
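
For example (a made-up target page, but real Sheets functions), the first
formula below pulls every link text off a page, and the second regexes a
year out of a cell:

  =IMPORTXML("http://example.org/resources", "//a")
  =REGEXEXTRACT(A1, "\d{4}")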

Dave


On Tue, Jul 8, 2014 at 3:12 AM, Tillman, Ruth K. (GSFC-272.0)[CADENCE GROUP
ASSOC] ruth.k.till...@nasa.gov wrote:

 Definite cosign on Open Refine. It's intuitive and spreadsheet-like enough
 that a lot of people can understand it. You can do anything from
 standardizing state names you get from a patron form to normalizing
 metadata keywords for a database, so I think it'd be useful even for
 non-techies.

 Ruth Kitchin Tillman
 Metadata Librarian, Cadence Group
 NASA Goddard Space Flight Center Library, Code 272
 Greenbelt, MD 20771
 Goddard Library Repository: http://gsfcir.gsfc.nasa.gov/
 301.286.6246


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Terry Brady
 Sent: Monday, July 07, 2014 1:35 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] 'automation' tools

 I learned about Open Refine http://openrefine.org/ at the Code4Lib
 conference, and it looks like it would be a great tool for normalizing
 data.  I worked on a few projects in the past in which this would have been
 very helpful.



Re: [CODE4LIB] Web Therapy full-day preconference at ALA Annual

2014-06-05 Thread davesgonechina
I can't help but point out that the examples for Web Therapy are mostly
organizational and not Web-specific problems.

   - “Our summer reading guides are totally out of control! How do we rein
   them in?”
   -  “I was put in charge of our cataloging when a colleague left the
   organization. How do I find time to manage it well in addition to my normal
   work?”
   - “Our Board wants patron statistics, but I don’t know what reports to
   provide in a way that will make sense and tell them objectively what they
   want to know. Help!”
   - “Campus security won’t let us install book drops; how are we supposed
   to provide a convenient way to return items for our students and faculty?”

Maybe not the best substitutions, but I read the originals as issues with
staff coordination, role management, internal reporting, and
inter-departmental conflicts. None of them, except maybe the LibGuides, is
really a technical problem, and I'm wondering if this panel will actually
be a forum for talking about organizational dysfunction that often results
when new technologies are integrated rather than discussing the
technologies themselves.

Dave



On Thu, Jun 5, 2014 at 11:55 PM, McHale, Nina nina.mch...@rrcc.edu wrote:

 **apologies for cross-posting**

 Do any of these scenarios sound painfully familiar to you?

  “Our LibGuides are totally out of control! How do we rein them in?”

  “I was put in charge of our Drupal site when a colleague left the
 organization. How do I find time to manage it well in addition to my normal
 work?”

 “Our Board wants web statistics, but I don’t know what reports to provide
 in a way that will make sense and tell them objectively what they want to
 know. Help!”

 “Campus IT won’t let us install a CMS; how are we supposed to develop a
 robust library web site for our students and faculty?”

 Take comfort in knowing that you are not alone! If you are headed to Las
 Vegas for ALA in June, come join Chris Evjy (Jefferson County Public
 Library) and Nina McHale (Red Rocks Community College) and others who work
 on or manage library web sites for some Web Therapy!

 Bring your web woes to the table--specific topics will be determined by a
 survey sent in advance to attendees--and we’ll put our 20+ years of
 combined experience managing public, academic, and special library web
 sites to work to develop solutions. This is a great opportunity to work
 through complex issues in a small group setting.

 Free hugs!

 To register:

   *   Register online (http://ala14.ala.org/register-now) through June 20
   *   Call ALA Registration at 1 (800) 974-3084
   *   Onsite registration will also be accepted in Las Vegas.

 Nina McHale, MA, MA/MSLS
 Library Director
 Red Rocks Community College
 Buckels Library
 13300 W. 6th Ave.
 Lakewood, CO 80228-1255
 303.914.6747
  http://rrcc.colibraries.org/
  nina.mch...@rrcc.edu



Re: [CODE4LIB] convert MODS XML into CSV or tab-delimted text

2014-04-22 Thread davesgonechina
LoC has XSLT stylesheets to convert MODS to DC, HTML, and MARCXML.

http://www.loc.gov/standards/mods/mods-conversions.html

There are also XML to CSV XSLT scripts out there, and there's this app which
I tested on a MODS 3.0 record and it didn't look too bad:

https://code.google.com/p/xml2csv-conv/
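
If you'd rather skip XSLT altogether, here's a quick-and-dirty Python
sketch (file names invented; it only pulls title and first author, so
extend the paths to taste):

  import csv
  import xml.etree.ElementTree as ET

  NS = {'m': 'http://www.loc.gov/mods/v3'}
  root = ET.parse('records.xml').getroot()

  with open('records.csv', 'w', newline='') as out:
      w = csv.writer(out)
      w.writerow(['title', 'author'])
      # one row per mods record in the collection
      for mods in root.iter('{http://www.loc.gov/mods/v3}mods'):
          title = mods.find('m:titleInfo/m:title', NS)
          name = mods.find('m:name/m:namePart', NS)
          w.writerow([title.text if title is not None else '',
                      name.text if name is not None else ''])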




On Wed, Apr 23, 2014 at 5:04 AM, Bryan Baldus 
bryan.bal...@quality-books.com wrote:

 On Tuesday, April 22, 2014 1:36 PM, Eben English wrote:
 Does anyone out there have an XSL stylesheet to transform MODS XML into a
 CSV or tab-delimited text file?
 Even if it's highly localized to your own institution/project, it would
 probably still be useful.

 I'm not sure how well it would work, but MarcEdit [1] has a MODS-to-MARCXML
 conversion option, and an option to Export Tab Delimited Records.

 [1] http://marcedit.reeset.net/

 I hope this helps,

 Bryan Baldus
 Senior Cataloger
 Quality Books Inc.
 The Best of America's Independent Presses
 1-800-323-4241x402
 bryan.bal...@quality-books.com
 eij...@cpan.org
 http://home.comcast.net/~eijabb/



Re: [CODE4LIB] LibGuides: I don't get it

2013-08-12 Thread davesgonechina
You guys are awesome, this is great stuff, really helpful. My impression of
LibGuides has been fairly negative for many of the reasons mentioned, but
Sean has a good point about content strategy and training, and Wilhemina
has a good point about the costs of open source not always being
appreciated.

Has anyone tried the two platforms Andrew Darby mentioned, SubjectsPlus and
Library a la Carte? That's the sort of thing I've been looking for but
never found until now.

Dave


On Mon, Aug 12, 2013 at 9:57 PM, Sean Hannan shan...@jhu.edu wrote:

 Again, this is not a technical issue. It's a content strategy issue.

 Believe me, I was where you were. I was using all kinds of javascript and
 CSS hacks to try to prevent people from getting creative with color. I was
 getting to the point of setting up Capybara tests to run against the guides
 to alert me to abusive uses of bold and italics.

 The folks creating guides are content people, not web people. Take the web
 out of it. Focus on the content. Pick a couple heuristics to educate them
 on
 (we picked 7 +/- 2, above the fold/below the fold, and F-shaped reading
 patterns). Above all, show them statistics. And not the built-in LibGuides
 stats, either.

 New vs. returning. Average time on page. Pageviews over the course of a
 year. Very, very, very quickly our librarians realized what content is
 important, what content is superfluous, and that the time they spend
 carefully manicuring and maintaining their guides would (and could) be
 better spent elsewhere.

 -Sean

 On 8/12/13 9:35 AM, Joshua Welker wel...@ucmo.edu wrote:

  I just have to say I have been thinking the exact same thing about
 LibGuides
  for the two years I've been using it. I feel vindicated knowing others
 feel
  the same way.
 
  At UCMO, we will be migrating to Drupal in the next several months, and
 I am
  hoping very much that I can convince people to use less LibGuides.
 
  LibGuides is great in its ease of use, but fails on just about every
 design
  principle I can think of. There have been several studies on tab
 blindness
  in LibGuides, and don't get me started on the sub-tab links that are
  hidden
  and require the user to mouse over a tab to even see what is there. I've
  tried telling people so many times to have just a few tabs and always to
 use
  a table of contents for the main page, but they rarely do. And it becomes
  just about impossible to have a consistent look and feel across your
 website
  when LibGuides allows guide creators to modify every element on the page
 as
  they see fit. People will do crazy things like putting page content in a
  sidebar element, something you'd never ever ever see on any website on
 the
  Internet. I tried to enforce uniform colors and column sizes across all
 the
  guides, but I was told to let it go because my coworkers wanted to be
 able
  to decide those things on a guide-by-guide basis.
 
  I've worked at two institutions that use LibGuides, and what inevitably
  happens is that librarians create one Uber Guide for entire subject areas
  (biology, religion, etc) and then create sub-pages for all the dozens of
  specific disciplines within those subject areas. And then, assuming the
 user
  somehow manages to find these pages, they are typically not much more
 than a
  list of links that could have easily been included on the main library
  website.
 
   Okay, sorry for the rant. It has been building up for several years and
   I never had a chance to voice it.
 
  Josh Welker
  Information Technology Librarian
  James C. Kirkpatrick Library
  University of Central Missouri
  Warrensburg, MO 64093
  JCKL 2260
  660.543.8022
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
  Robert Sebek
  Sent: Sunday, August 11, 2013 11:21 AM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] LibGuides: I don't get it
 
  On Sun, Aug 11, 2013 at 9:54 AM, Heather Rayl 23e...@gmail.com wrote:
 
  I have to say that I loathe LibGuides. My library makes extensive use
  of them, too. Need a web solution? The first thing out of someone's
   mouth is "Let's put it in a LibGuide!"
 
  Shudder
 
  This fall, I'll be moving our main site over to Drupal, and I'm hoping
  that eventually I can convince people to re-invent their LibGuides
  there. I can use the saving money card, and the content silos are
  bad card and
  *maybe* I will be successful.
 
  Anyone fought this particular battle before?
 
  ~heather
 
  I'm fighting that battle right now. We have an excellent CMS into
  which I
   have set up all our database URLs, descriptions, etc. Anytime we need to
  refer to a database on a page, we use one of those entries. That database
  just changed platforms? No problem. I change the URL in one place and
  everything automatically updates (hooray CMSs!).
 
  All of our subject guides (http://www.lib.vt.edu/subject-guides/) are
 in the
  CMS using the exact same database entries. I converted from our 

[CODE4LIB] LibGuides: I don't get it

2013-08-10 Thread davesgonechina
I've not had an opportunity to use LibGuides, but I've seen a few and read
the features list on the SpringShare site. All I see is a less flexible
WordPress at a higher price point. What advantages am I not seeing? If
there aren't any, is it the case that once signed up, migration to an open
source platform is just not worth it for most institutions?


Re: [CODE4LIB] Schema for Continuing (web) Resources

2013-08-01 Thread davesgonechina
Sorry for taking a while to respond Matt, busy week.

Initially the resources would be journals, databases, galleries, digital
collections, language learning tools, dictionaries, statistical yearbooks,
and similar online resources for China Studies. I have a Pinboard list for
the sorts of things I plan to add:
http://pinboard.in/u:davesgonechina/t:zongmu/

The goal is to have a curated collection of links to collections (not crawl
every item, that can be a later project), with faceted search so that users
can narrow down on resources of a particular format, time period,
geographic region, etc.
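
To make that concrete, here's a sketch of the kind of record I have in
mind, in simple Dublin Core (the values are invented for illustration):

  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Qing Gazetteer Database</dc:title>
    <dc:type>Dataset</dc:type>
    <dc:format>text/html</dc:format>
    <dc:coverage>China--Qing dynasty, 1644-1912</dc:coverage>
    <dc:identifier>http://example.org/gazetteers</dc:identifier>
  </oai_dc:dc>

Format, coverage, and type map straight onto the facets I want.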

Dave


On Sat, Jul 27, 2013 at 3:03 AM, Matthew Sherman
matt.r.sher...@gmail.com wrote:

  Just to move your discussion along a bit, plus I think it sounds
  pretty interesting: what sort of resources are you talking about?
  Knowing what you are working with can give everyone a better idea of what
  schemas would work best.  I know MARC is not so friendly for online
  resources, but it depends on what the item is.  Just off the cuff,
  Dublin Core is probably your best bet due to its extensibility, but
  again it depends what you are working with.

 Matt

 On Fri, Jul 26, 2013 at 10:10 AM, davesgonechina
 davesgonech...@gmail.com wrote:
  I'm trying to develop a curated site listing online resources for China
  scholars. Ideally I'd like to use a metadata schema that other libraries
  export as MARC, DC, or other standards they may use, and maybe also
 linked
  data-capable. Any suggestions? I'm experimenting with Drupal but my
  platform choice will probably be driven by my schema.
 
  Dave



[CODE4LIB] Schema for Continuing (web) Resources

2013-07-26 Thread davesgonechina
I'm trying to develop a curated site listing online resources for China
scholars. Ideally I'd like to use a metadata schema that other libraries
export as MARC, DC, or other standards they may use, and maybe also linked
data-capable. Any suggestions? I'm experimenting with Drupal but my
platform choice will probably be driven by my schema.

Dave


Re: [CODE4LIB] Libraries and IT Innovation

2013-07-18 Thread davesgonechina
Some thoughts. BTW, new to the list - librarian working for a study-abroad
program in Beijing here, building a new catalog with Koha these days and
previously did competitive intelligence for investors looking at China's IT
industries. I appreciate Matt trying to start an open-ended conversation
about innovation and thought I'd toss my own rant in the ring.

One of the things that really struck me about libraries when studying for
my MLIS was how much library systems were designed primarily for the
backend, not becoming consumer-facing until post-Internet, and were built
and maintained by third parties that aren't practicing or even trained
librarians (and charge a pretty penny for it). There's a lot of catch-up
going on by a profession that outsourced these skill sets and is now
rebuilding them through groups like CODE4LIB, hence we may be behind the
curve on innovation for a long time.

I'm not sure how much Big Data really comes into play for most libraries.
You might need terabytes of cloud storage for a digital preservation
project, but considering the bulk of that would be the digitized
images/videos/recordings themselves, each with a metadata record, you don't
necessarily have a very large or complex data structure. How many library
projects are beyond "the ability of commonly used software tools to
capture, curate, manage, and process the data within a tolerable elapsed
time"? I'm honestly not sure, and I wonder about the nebulous definition.
What is "commonly used"? Hadoop? On the other hand, preserving Big Data,
say from the Large Hadron Collider, and creating discovery tools for future
researchers, is something that librarians could potentially be involved in,
but if CERN already built the database and discovery tools before it
reached the library, did we miss the game? Do Big Data projects say to
themselves in the planning stage, "We need a librarian"? Should they? If so,
are we ready?

Then there's the privacy issue: Even before Snowden, the ALA Code of Ethics
bumped up against the power of crunching user data for recommendation
systems and the like. Even if you adequately anonymize your data, taking it
only in aggregate, it goes against the grain of traditional library
culture. Any discussion of retaining user social profiles, search history,
or activity tracking means talking about patron rights to anonymity.

The goal I've been fixated on for library software development has been to
deliver staff and patron-friendly open-source cataloging, discovery, and
curation tools for libraries that take back control of our systems from
closed corporate vendors, provide a user experience that matches or exceeds
expectations created in the marketplace, and remain committed to the
ethical standards and social contract traditionally held by libraries in
our society. When you consider that most of the professional news industry
delivers information discovery services using Drupal, Django, or Wordpress,
why can't there be robust ecosystems like these for libraries?

Hope I didn't bore anyone.

Dave Lyons
Digital Librarian
The Beijing Center for Chinese Studies