Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Chad Fennell
On Wed, Nov 6, 2013 at 8:49 PM, Ross Singer  wrote:

> I guess I just don't see why http and https can't coexist.
>
>
They can definitely coexist, but there is a corresponding maintenance cost
and a slightly higher risk profile (e.g. session hijacking is still
possible in a variety of mixed http/https configurations). I noticed a
pretty good, if a bit dated, run-down of the tradeoffs for various secure
setups in Drupal at
http://drupalscout.com/knowledge-base/drupal-and-ssl-multiple-recipes-possible-solutions-https.
Even if the specific solutions have changed somewhat, it does get at the
tradeoffs between security, usability, and maintenance.

Just today, I noticed a security alert (https://drupal.org/node/2129381)
for the Drupal 6 Secure Pages module where theoretically secured pages and
forms could be transmitted in the clear. This is the module you'd most
likely use to achieve a mixed http/https site in Drupal.
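The session-hijacking risk on a mixed http/https site mostly comes down to cookie scope. A minimal stdlib sketch (not Drupal-specific; cookie name and value are made up):

```python
from http.cookies import SimpleCookie

# A session cookie issued without the "Secure" attribute is sent by the
# browser over plain http as well as https, so on a mixed site anyone
# sniffing the network can capture and replay it -- even if the login
# form itself was served over https.
insecure = SimpleCookie()
insecure["SESSID"] = "abc123"

secure = SimpleCookie()
secure["SESSID"] = "abc123"
secure["SESSID"]["secure"] = True    # only ever transmitted over https
secure["SESSID"]["httponly"] = True  # not readable from page JavaScript

print(insecure.output())
print(secure.output())
```

Forcing the whole site to https (plus Secure/HttpOnly cookies) sidesteps the problem rather than managing it page by page.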

I have personally tended to just put everything behind https because of the
added work/modules/maintenance associated with running it alongside http
(in Drupal, specifically), but I am a lazy person with access to free certs
and fancier servers.

HTH
-- 
Chad Fennell
Web Developer
University of Minnesota Libraries
(612) 626-4186


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Ross Singer
I guess I just don't see why http and https can't coexist.

-Ross.
On Nov 6, 2013 9:39 PM, "Cary Gordon"  wrote:

> This conversation is heading into the "draining the swamp" category.
>
> Bill Denton started this thread with the suggestion that we use HTTPS
> everywhere. He did not make a specific case for it. I am just guessing that
> an argument for going that route would include security.
>
> Regardless of whether this is a good idea, or whether there is a
> compelling reason for doing it, it seems to me that the possibility of its
> making it difficult for older scraping tools to scrape the site does not
> seem like a compelling reason not to do it.
>
> The cost issue, on the other hand, would be a more compelling
> consideration.
>
> Thanks,
>
> Cary
>
> On Nov 6, 2013, at 6:17 PM, Ross Singer  wrote:
>
> > How is security getting thrown under the bus?
> >
> > -Ross.
> >
> > On Wednesday, November 6, 2013, Cary Gordon wrote:
> >
> >> It sounds like we are willing to throw security under the bus for an
> edge
> >> case, although I am sure that I am missing some subtlety
> >>
> >> Cary
> >>
> >> On Nov 5, 2013, at 10:27 AM, Ross Singer  >
> >> wrote:
> >>
> >>> On Tue, Nov 5, 2013 at 12:07 PM, William Denton  >
> >> wrote:
> >>>
> 
>  (Question:  Why does HTTPS complicate screen-scraping?  Every decent
> >> tool
>  and library supports HTTPS, doesn't it?)
> 
> >>>
> >>> Birkin asked me this same question, and I realized I should clarify
> what
> >> I
> >>> meant.  I was mostly referring to existing screen scrapers/existing web
> >>> sites.  If you redirect every request from http to https, this will
> >>> probably break things.  I think the Open Library example that Karen
> >>> mentioned is a good case study.
> >>>
> >>> And it's pretty different for a library or tool to support HTTPS and a
> >>> specific app to be expecting it.  If you follow the thread around that
> OL
> >>> change, it appears there are issues with Java (as one example)
> >> arbitrarily
> >>> consuming HTTPS (from what I understand, you need to have the cert
> >>> locally?), but I don't know enough about it to say for certain.  I
> think
> >>> there would also probably be potential issues around mashups (AJAX, for
> >>> example), but seeing as code4lib.org doesn't support CORS, not really
> a
> >>> current issue.  Does apply more generally to your question about
> library
> >>> websites at large, though.
> >>>
> >>> Anyway, I agree with you that the option for both should be there.  I'm
> >> not
> >>> just not convinced that HTTPS-all-the-time is necessary for all web use
> >>> cases.
> >>>
> >>> -Ross.
> >>
>


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Cary Gordon
This conversation is heading into the "draining the swamp" category.

Bill Denton started this thread with the suggestion that we use HTTPS 
everywhere. He did not make a specific case for it. I am just guessing that an 
argument for going that route would include security.

Regardless of whether this is a good idea, or whether there is a compelling 
reason for doing it, it seems to me that the possibility of its making it 
difficult for older scraping tools to scrape the site is not a compelling 
reason not to do it.

The cost issue, on the other hand, would be a more compelling consideration.

Thanks,

Cary

On Nov 6, 2013, at 6:17 PM, Ross Singer  wrote:

> How is security getting thrown under the bus?
> 
> -Ross.
> 
> On Wednesday, November 6, 2013, Cary Gordon wrote:
> 
>> It sounds like we are willing to throw security under the bus for an edge
>> case, although I am sure that I am missing some subtlety
>> 
>> Cary
>> 
>> On Nov 5, 2013, at 10:27 AM, Ross Singer 
>> >
>> wrote:
>> 
>>> On Tue, Nov 5, 2013 at 12:07 PM, William Denton 
>>> >
>> wrote:
>>> 
 
 (Question:  Why does HTTPS complicate screen-scraping?  Every decent
>> tool
 and library supports HTTPS, doesn't it?)
 
>>> 
>>> Birkin asked me this same question, and I realized I should clarify what
>> I
>>> meant.  I was mostly referring to existing screen scrapers/existing web
>>> sites.  If you redirect every request from http to https, this will
>>> probably break things.  I think the Open Library example that Karen
>>> mentioned is a good case study.
>>> 
>>> And it's pretty different for a library or tool to support HTTPS and a
>>> specific app to be expecting it.  If you follow the thread around that OL
>>> change, it appears there are issues with Java (as one example)
>> arbitrarily
>>> consuming HTTPS (from what I understand, you need to have the cert
>>> locally?), but I don't know enough about it to say for certain.  I think
>>> there would also probably be potential issues around mashups (AJAX, for
>>> example), but seeing as code4lib.org doesn't support CORS, not really a
>>> current issue.  Does apply more generally to your question about library
>>> websites at large, though.
>>> 
>>> Anyway, I agree with you that the option for both should be there.  I'm
>> not
>>> just not convinced that HTTPS-all-the-time is necessary for all web use
>>> cases.
>>> 
>>> -Ross.
>> 


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Ross Singer
How is security getting thrown under the bus?

-Ross.

On Wednesday, November 6, 2013, Cary Gordon wrote:

> It sounds like we are willing to throw security under the bus for an edge
> case, although I am sure that I am missing some subtlety
>
> Cary
>
> On Nov 5, 2013, at 10:27 AM, Ross Singer >
> wrote:
>
> > On Tue, Nov 5, 2013 at 12:07 PM, William Denton 
> > >
> wrote:
> >
> >>
> >> (Question:  Why does HTTPS complicate screen-scraping?  Every decent
> tool
> >> and library supports HTTPS, doesn't it?)
> >>
> >
> > Birkin asked me this same question, and I realized I should clarify what
> I
> > meant.  I was mostly referring to existing screen scrapers/existing web
> > sites.  If you redirect every request from http to https, this will
> > probably break things.  I think the Open Library example that Karen
> > mentioned is a good case study.
> >
> > And it's pretty different for a library or tool to support HTTPS and a
> > specific app to be expecting it.  If you follow the thread around that OL
> > change, it appears there are issues with Java (as one example)
> arbitrarily
> > consuming HTTPS (from what I understand, you need to have the cert
> > locally?), but I don't know enough about it to say for certain.  I think
> > there would also probably be potential issues around mashups (AJAX, for
> > example), but seeing as code4lib.org doesn't support CORS, not really a
> > current issue.  Does apply more generally to your question about library
> > websites at large, though.
> >
> > Anyway, I agree with you that the option for both should be there.  I'm
> not
> > just not convinced that HTTPS-all-the-time is necessary for all web use
> > cases.
> >
> > -Ross.
>


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Riley Childs
SSL certs are expensive because of the administrative work associated with them. 

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

> On Nov 6, 2013, at 8:28 PM, Cary Gordon  wrote:
> 
> It sounds like we are willing to throw security under the bus for an edge 
> case, although I am sure that I am missing some subtlety
> 
> Cary
> 
>> On Nov 5, 2013, at 10:27 AM, Ross Singer  wrote:
>> 
>>> On Tue, Nov 5, 2013 at 12:07 PM, William Denton  wrote:
>>> 
>>> 
>>> (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
>>> and library supports HTTPS, doesn't it?)
>> 
>> Birkin asked me this same question, and I realized I should clarify what I
>> meant.  I was mostly referring to existing screen scrapers/existing web
>> sites.  If you redirect every request from http to https, this will
>> probably break things.  I think the Open Library example that Karen
>> mentioned is a good case study.
>> 
>> And it's pretty different for a library or tool to support HTTPS and a
>> specific app to be expecting it.  If you follow the thread around that OL
>> change, it appears there are issues with Java (as one example) arbitrarily
>> consuming HTTPS (from what I understand, you need to have the cert
>> locally?), but I don't know enough about it to say for certain.  I think
>> there would also probably be potential issues around mashups (AJAX, for
>> example), but seeing as code4lib.org doesn't support CORS, not really a
>> current issue.  Does apply more generally to your question about library
>> websites at large, though.
>> 
>> Anyway, I agree with you that the option for both should be there.  I'm not
>> just not convinced that HTTPS-all-the-time is necessary for all web use
>> cases.
>> 
>> -Ross.


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Riley Childs
Why? HTTPS is used when there is sensitive data involved, and code4lib.org (at 
least to my knowledge) does not have sensitive data.

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

> On Nov 6, 2013, at 8:28 PM, Cary Gordon  wrote:
> 
> It sounds like we are willing to throw security under the bus for an edge 
> case, although I am sure that I am missing some subtlety
> 
> Cary
> 
>> On Nov 5, 2013, at 10:27 AM, Ross Singer  wrote:
>> 
>>> On Tue, Nov 5, 2013 at 12:07 PM, William Denton  wrote:
>>> 
>>> 
>>> (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
>>> and library supports HTTPS, doesn't it?)
>>> 
>> 
>> Birkin asked me this same question, and I realized I should clarify what I
>> meant.  I was mostly referring to existing screen scrapers/existing web
>> sites.  If you redirect every request from http to https, this will
>> probably break things.  I think the Open Library example that Karen
>> mentioned is a good case study.
>> 
>> And it's pretty different for a library or tool to support HTTPS and a
>> specific app to be expecting it.  If you follow the thread around that OL
>> change, it appears there are issues with Java (as one example) arbitrarily
>> consuming HTTPS (from what I understand, you need to have the cert
>> locally?), but I don't know enough about it to say for certain.  I think
>> there would also probably be potential issues around mashups (AJAX, for
>> example), but seeing as code4lib.org doesn't support CORS, not really a
>> current issue.  Does apply more generally to your question about library
>> websites at large, though.
>> 
>> Anyway, I agree with you that the option for both should be there.  I'm not
>> just not convinced that HTTPS-all-the-time is necessary for all web use
>> cases.
>> 
>> -Ross.


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Cary Gordon
It sounds like we are willing to throw security under the bus for an edge case, 
although I am sure that I am missing some subtlety.

Cary

On Nov 5, 2013, at 10:27 AM, Ross Singer  wrote:

> On Tue, Nov 5, 2013 at 12:07 PM, William Denton  wrote:
> 
>> 
>> (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
>> and library supports HTTPS, doesn't it?)
>> 
> 
> Birkin asked me this same question, and I realized I should clarify what I
> meant.  I was mostly referring to existing screen scrapers/existing web
> sites.  If you redirect every request from http to https, this will
> probably break things.  I think the Open Library example that Karen
> mentioned is a good case study.
> 
> And it's pretty different for a library or tool to support HTTPS and a
> specific app to be expecting it.  If you follow the thread around that OL
> change, it appears there are issues with Java (as one example) arbitrarily
> consuming HTTPS (from what I understand, you need to have the cert
> locally?), but I don't know enough about it to say for certain.  I think
> there would also probably be potential issues around mashups (AJAX, for
> example), but seeing as code4lib.org doesn't support CORS, not really a
> current issue.  Does apply more generally to your question about library
> websites at large, though.
> 
> Anyway, I agree with you that the option for both should be there.  I'm not
> just not convinced that HTTPS-all-the-time is necessary for all web use
> cases.
> 
> -Ross.
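The breakage Ross describes in the quoted message can be sketched in a few lines of Python (a simulation with a hypothetical URL, not a real HTTP client):

```python
from urllib.parse import urlparse

def fetch(url, follow_redirects=False):
    """Stand-in for a scraping client written before a site's https switch.

    Simulates a server that now 301-redirects every http:// request to the
    https:// equivalent, as an HTTPS-everywhere rollout typically does.
    """
    if urlparse(url).scheme == "http":
        redirect = url.replace("http://", "https://", 1)
        if not follow_redirects:
            # An old scraper that never followed redirects gets no page body.
            return {"status": 301, "location": redirect, "body": ""}
        url = redirect
    return {"status": 200, "location": url, "body": "<html>...</html>"}

old_scraper = fetch("http://example.org/page")                         # breaks
new_scraper = fetch("http://example.org/page", follow_redirects=True)  # works
```

A library that "supports HTTPS" is not the same thing as a deployed scraper that expects `http://` URLs and 200 responses.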


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Ed Summers
On Mon, Nov 4, 2013 at 11:31 PM, Kevin Hawkins
 wrote:
> b) Modify whatever code sends formatted job postings to this list so that it
> includes the location of the position.

That would be shortimer, and I think it should be doing what you suggest now?


https://github.com/code4lib/shortimer/commit/acb57090d4842920c9f92c684810f3c618f0a21e

If not let me know, create a github issue, or send a pull request :-)

//Ed


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Riley Childs
For C, directing people to the list would be best, but you could point the 
email to a gmail box and set up forwarding rules.

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

> On Nov 6, 2013, at 8:05 PM, "Wick, Ryan"  wrote:
> 
> Hi Kevin,
> 
> Thank you for the suggestions.
> 
> a) is done. (looks like someone already changed the links on the About page).
> 
> c) I'm torn on. I understand what you mean, but this list or IRC (or even 
> Twitter) might be better. I don't know of a way to have a message go to all 
> people with admin rights on Drupal.
> 
> Ryan Wick
> 
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
> Hawkins
> Sent: Monday, November 04, 2013 8:31 PM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: [CODE4LIB] more suggestions for code4lib.org
> 
> While we're making suggestions for improving the infrastructure of 
> code4lib.org, here are some things I'd like to see improved:
> 
> a) Change the "email" link in the navbar (and in the text at 
> http://code4lib.org/about ) from
> 
> https://listserv.nd.edu/cgi-bin/wa?SUBED1=CODE4LIB&A=1
> 
> to
> 
> https://listserv.nd.edu/cgi-bin/wa?A0=CODE4LIB
> 
> so that people can easily find the list archives and poke around recent 
> messages before deciding whether to join.
> 
> b) Modify whatever code sends formatted job postings to this list so that it 
> includes the location of the position.
> 
> c) Add a contact link so people have a clear place to go to 
> report administrivia like point (a) above or broken links.  It might go to 
> whichever users have admin privileges on the Drupal instance behind 
> code4lib.org.
> 
> Thanks for your consideration,
> 
> Kevin


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Wick, Ryan
Hi Kevin,

Thank you for the suggestions.

a) is done. (looks like someone already changed the links on the About page).

c) I'm torn on. I understand what you mean, but this list or IRC (or even 
Twitter) might be better. I don't know of a way to have a message go to all 
people with admin rights on Drupal.

Ryan Wick

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
Hawkins
Sent: Monday, November 04, 2013 8:31 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] more suggestions for code4lib.org

While we're making suggestions for improving the infrastructure of 
code4lib.org, here are some things I'd like to see improved:

a) Change the "email" link in the navbar (and in the text at 
http://code4lib.org/about ) from

https://listserv.nd.edu/cgi-bin/wa?SUBED1=CODE4LIB&A=1

to

https://listserv.nd.edu/cgi-bin/wa?A0=CODE4LIB

so that people can easily find the list archives and poke around recent 
messages before deciding whether to join.

b) Modify whatever code sends formatted job postings to this list so that it 
includes the location of the position.

c) Add a contact link so people have a clear place to go to report administrivia 
like point (a) above or broken links.  It might go to whichever users have 
admin privileges on the Drupal instance behind code4lib.org.

Thanks for your consideration,

Kevin


[CODE4LIB] Citing source code in high-profile academic journals

2013-11-06 Thread Heather Claxton-Douglas
Hello,

I need some advice about referencing source code in an academic journal.  I 
rarely see it happen and I don’t know why.

Background:  
I’m building a website that connects academic researchers with software 
developers interested in helping scientists write code.  My goal is for these 
researchers to be able to reference any new source code in the articles they 
publish -- much like a “gene accession number” or a “PDB code”.

Unfortunately, I don’t see any code repositories referenced in high-profile 
journals like Science or PNAS.  I’m guessing it’s because the code in the 
repositories isn’t permanent and may be deleted anytime? Or perhaps a DOI needs 
to be assigned?

So my question to the group is:
What criteria are necessary for a code repository or database to be eligible for 
referencing in scientific academic journals?

Some ideas I have, based on looking at the Protein Data Bank and GenBank, are:
1) The entry is permanent -- we can’t delete articles once they’ve been 
published, and the same is true for entries in the PDB and GenBank
2) The entry gives credit to all authors and contributors
3) The entry has a DOI 
4) The entry has a simple accession number -- a PDB ID is a four-character 
code, and a GenBank accession number is six characters.

Is there anything I’m missing?  Any advice would be greatly appreciated.

Thank you
Heather Claxton-Douglas, PhD
www.sciencesolved.com

http://igg.me/at/ScienceSolved


[CODE4LIB] Canadian WordPress Hosting

2013-11-06 Thread Cynthia Ng
Hi Everyone,

Apologies for cross-posting, but code4lib is much more active, and has more
Canadians, than the other lists I've seen.

I was wondering if anyone had recommendations for a WordPress hosting
solution? And yes, it needs to be in Canada. I can do most of my own
dev-type work, so really it just needs to be set up to run WordPress
(preferably with 1-click install) and, most of all, be reliable, hopefully
with good customer service for when we need to contact the company.

Okay, also preferable is that they do daily backups for us and have
excellent security (considering it's WordPress).

Too many hosting solutions include email and a bunch of other stuff, and I
need it only for WordPress and nothing else.

A name, plus at least 1-2 reasons on the recommendation would be great!

Thanks in advance,
Cynthia


[CODE4LIB] How to generate a Word document which displays full text links in the output

2013-11-06 Thread Paul James Albert
For those of you who do literature searches for patrons, here is a custom 
EndNote style that can generate a Word document which displays full text links 
in the output.
https://dl.dropboxusercontent.com/u/2014679/customlinktodoi.ens

To make this work, customize the style so that it follows your local 
institution's OpenURL syntax, and, of course, be sure to get bibliographic 
records from authoritative sources like MEDLINE or Web of Knowledge. (Those are 
the only two I've tried this out on so far.)
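For anyone adapting the style to their own resolver, an OpenURL 1.0 (KEV) link generally has this shape (the resolver hostname and DOI below are placeholders, not real values):

```
https://resolver.example.edu/openurl
    ?url_ver=Z39.88-2004
    &rft_val_fmt=info:ofi/fmt:kev:mtx:journal
    &rft.genre=article
    &rft_id=info:doi/10.1000/example
```

The base URL before the `?` is the part that varies by institution; the `rft.*` keys carry the citation metadata EndNote fills in from the record.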

If anyone has ideas for improving this further, please let me know, and I'll 
update the file.

thanks,
Paul


Paul Albert
Project Manager, VIVO
Weill Cornell Medical Library
646.962.2551


[CODE4LIB] catqc / marclib

2013-11-06 Thread Jay, Michael
I posted our shelf-ready record analyzer and a small C library (on which it 
depends) on sourceforge.

If someone could build and test the utility in a non-Windows environment, I 
would greatly appreciate it. 

If anyone is interested in using it or has any questions, let me know. 

https://sourceforge.net/projects/marclib
https://sourceforge.net/projects/catqc

mj

Michael Jay, Library IT
Suite 1250
2046 Waldo Road
Gainesville, FL 32609

352.273.2678
em...@ufl.edu


[CODE4LIB] Job: Associate University Librarian for Library Information Technology, University of Michigan at University of Michigan

2013-11-06 Thread jobs
Associate University Librarian for Library Information Technology, University 
of Michigan
University of Michigan
Ann Arbor

The **University of Michigan Library** is transforming the way libraries
organize, preserve, and share access to knowledge in service of the mission of
one of the world's leading research universities. We seek a forward-thinking,
collaborative, mission-driven, and innovative Associate University Librarian
(AUL) to join the library's leadership team, reporting to the Dean of
Libraries.

  
**Associate University Librarian for Library Information Technology (LIT)**  
The AUL for LIT will lead the development of information technology in support
of the university's current and emerging research needs, and the advancement
of scholarly literacy and instructional technologies. To direct the
development, management, and maintenance of a flexible and reliable technology
environment, the AUL for LIT will lead 60 talented staff members in six units:
Core Services, Digital Library Production Services, Learning Technology
Incubation Group, Library Systems, User Experience, and Web Systems. The AUL
for LIT must possess the technical and conceptual knowledge to represent the
library in broad conversations about IT, and advance the campus-wide
development of emerging instructional technologies as well as systems to
enable emerging research needs, including the management and preservation of
data.

  
We are searching for professionals with a deep understanding of the myriad and
changing roles of the library, who view publishing and information technology
as integral to our mission, and who can excel within the context of a world-
class research university. Because we are committed to diversity, we ask our
leaders to develop and nurture the individual and collective skills to
recognize, celebrate, and deploy difference as a path to engagement,
innovation, and the generation of new ideas. More information is available at:
[http://tinyurl.com/UMLib-AUL-LIT](http://tinyurl.com/UMLib-AUL-LIT). Submit
nominations or questions to: aulsea...@umich.edu.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10610/


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I'm skeptical of this in a usable application or interface.

Applications have constraints.  There are predicates you care about, there
are values you display in specific ways.  There are expectations, based on
the domain, in the data that are either driven by the interface or the
needs of the consumers.

I have yet to see an example of "arbitrary and unexpected data" exposed in
an application that people actually use.

-Ross.


On Wed, Nov 6, 2013 at 11:39 AM, Hugh Cayless  wrote:

> The answer is purely because the RDF data model and the technology around
> it looks like it would almost do what we need it to.
>
> I do not, and cannot, assume a closed world. The open world assumption is
> one of the attractive things about RDF, in fact :-)
>
> Hugh
>
> On Nov 6, 2013, at 11:11 , Ross Singer  wrote:
>
> > My question for you, however, is why are you using a triple store for
> this?
> > That is, why bother with the broad and general model in what I assume is
> a
> > closed world assumption in your application?
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ethan Gruber
I think that the answer to #1 is that if you want or expect people to use
your endpoint, you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.
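A documented example query of the kind described above might look like this (generic SKOS predicates; the endpoint and data are hypothetical):

```sparql
# Example for an endpoint's documentation: list 25 concepts with their
# English labels, kept cheap enough to avoid the timeouts that plague
# many open endpoints.
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept ?label
WHERE {
  ?concept skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 25
```

Publishing a ladder of such queries, from a bare triple pattern up to the joins your visualizations actually run, is what makes an endpoint usable to someone who has never seen the model.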

Ethan


On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer  wrote:

> Hey Karen,
>
> It's purely anecdotal (albeit anecdotes borne from working at a company
> that offered, and has since abandoned, a sparql-based triple store
> service), but I just don't see the interest in arbitrary SPARQL queries
> against remote datasets that I do against linking to (and grabbing) known
> items.  I think there are multiple reasons for this:
>
> 1) Unless you're already familiar with the dataset behind the SPARQL
> endpoint, where do you even start with constructing useful queries?
> 2) SPARQL as a query language is a combination of being too powerful and
> completely useless in practice: query timeouts are commonplace, endpoints
> don't support all of 1.1, etc.  And, going back to point #1, it's hard to
> know how to optimize your queries unless you are already pretty familiar
> with the data
> 3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
> reason we don't offer a public SQL interface to our RDBMSes
>
> Which isn't to say it doesn't have its uses or applications.
>
> I just think that in most cases domain/service-specific APIs (be they
> RESTful, based on the Linked Data API [0], whatever) will likely be favored
> over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
> sure the answer is "no", but that's the future I foresee, personally.
>
> -Ross.
> 0. https://code.google.com/p/linked-data-api/wiki/Specification
>
>
> On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:
>
> > Ross, I agree with your statement that data doesn't have to be "RDF all
> > the way down", etc. But I'd like to hear more about why you think SPARQL
> > availability has less value, and if you see an alternative to SPARQL for
> > querying.
> >
> > kc
> >
> >
> >
> > On 11/6/13 8:11 AM, Ross Singer wrote:
> >
> >> Hugh, I don't think you're in the weeds with your question (and, while I
> >> think that named graphs can provide a solution to your particular
> problem,
> >> that doesn't necessarily mean that it doesn't raise more questions or
> >> potentially more frustrations down the line - like any new power, it can
> >> be
> >> used for good or evil and the difference might not be obvious at first).
> >>
> >> My question for you, however, is why are you using a triple store for
> >> this?
> >>   That is, why bother with the broad and general model in what I assume
> >> is a
> >> closed world assumption in your application?
> >>
> >> We don't generally use XML databases (Marklogic being a notable
> >> exception),
> >> or MARC databases, or <transmission format of choice>-specific
> >> databases because usually transmission formats are designed to account
> for
> >> lots and lots of variations and maximum flexibility, which generally is
> >> the
> >> opposite of the modeling that goes into a specific app.
> >>
> >> I think there's a world of difference between modeling your data so it
> can
> >> be represented in RDF (and, possibly, available via SPARQL, but I think
> >> there is *far* less value there) and committing to RDF all the way down.
> >>   RDF is a generalization so multiple parties can agree on what data
> >> means,
> >> but I would have a hard time swallowing the argument that
> domain-specific
> >> data must be RDF-native.
> >>
> >> -Ross.
> >>
> >>
> >> On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
> >> wrote:
> >>
> >>  Does that work right down to the level of the individual triple though?
> >>> If
> >>> a large percentage of my triples are each in their own individual
> graphs,
> >>> won't that be chaos? I really don't know the answer, it's not a
> >>> rhetorical
> >>> question!
> >>>
> >>> Hugh
> >>>
> >>> On Nov 6, 2013, at 10:40 , Robert Sanderson 
> wrote:
> >>>
> >>>  Named Graphs are the way to solve the issue you bring up in that post,
>  in
>  my opinion.  You mint an identifier for the graph, and associate the
>  provenance and other information with that.  This then gets ingested
> as
> 
> >>> the
> >>>
4th URI into a quad store, so you don't lose the provenance information.

Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed "API interface" from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is "no", but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle  wrote:

> Ross, I agree with your statement that data doesn't have to be "RDF all
> the way down", etc. But I'd like to hear more about why you think SPARQL
> availability has less value, and if you see an alternative to SPARQL for
> querying.
>
> kc
>
>
>
> On 11/6/13 8:11 AM, Ross Singer wrote:
>
>> Hugh, I don't think you're in the weeds with your question (and, while I
>> think that named graphs can provide a solution to your particular problem,
>> that doesn't necessarily mean that it doesn't raise more questions or
>> potentially more frustrations down the line - like any new power, it can
>> be
>> used for good or evil and the difference might not be obvious at first).
>>
>> My question for you, however, is why are you using a triple store for
>> this?
>>   That is, why bother with the broad and general model in what I assume
>> is a
>> closed world assumption in your application?
>>
>> We don't generally use XML databases (MarkLogic being a notable exception),
>> or MARC databases, or other transmission-format-specific databases, because
>> usually transmission formats are designed to account for lots and lots of
>> variations and maximum flexibility, which generally is the opposite of the
>> modeling that goes into a specific app.
>>
>> I think there's a world of difference between modeling your data so it can
>> be represented in RDF (and, possibly, available via SPARQL, but I think
>> there is *far* less value there) and committing to RDF all the way down.
>>   RDF is a generalization so multiple parties can agree on what data
>> means,
>> but I would have a hard time swallowing the argument that domain-specific
>> data must be RDF-native.
>>
>> -Ross.
>>
>>
>> On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless 
>> wrote:
>>
>>  Does that work right down to the level of the individual triple though?
>>> If
>>> a large percentage of my triples are each in their own individual graphs,
>>> won't that be chaos? I really don't know the answer, it's not a
>>> rhetorical
>>> question!
>>>
>>> Hugh
>>>
>>> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>>>
>>>> Named Graphs are the way to solve the issue you bring up in that post, in
>>>> my opinion.  You mint an identifier for the graph, and associate the
>>>> provenance and other information with that.  This then gets ingested as the
>>>> 4th URI into a quad store, so you don't lose the provenance information.
>>>>
>>>> In JSON-LD:
>>>> {
>>>>   "@id" : "uri-for-graph",
>>>>   "dcterms:creator" : "uri-for-hugh",
>>>>   "@graph" : [
>>>>     // ... triples go here ...
>>>>   ]
>>>> }
>>>>
>>>> Rob
>>>>
>>>>
>>>> On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless  wrote:
>>>>
>>>>> I wrote about this a few months back at
>>>>> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
>>>>>
>>>>> I'd be very interested to hear what the smart folks here think!
>>>>>
>>>>> Hugh
>>>>>
>>>>> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
>>>>> alexander.johanne...@gmail.com> wrote:
>>>>>
>>>>>> But the
>>>>>> question to every piece of meta data is *authority*, which is the part
>>>>>> of RDF that sucks.
> --
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
The answer is purely that the RDF data model and the technology around it 
look like they would almost do what we need them to.

I do not, and cannot, assume a closed world. The open world assumption is one 
of the attractive things about RDF, in fact :-)

Hugh

On Nov 6, 2013, at 11:11 , Ross Singer  wrote:

> My question for you, however, is why are you using a triple store for this?
> That is, why bother with the broad and general model in what I assume is a
> closed world assumption in your application?


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ross, I agree with your statement that data doesn't have to be "RDF all 
the way down", etc. But I'd like to hear more about why you think SPARQL 
availability has less value, and if you see an alternative to SPARQL for 
querying.


kc


On 11/6/13 8:11 AM, Ross Singer wrote:

Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
  That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (MarkLogic being a notable exception),
or MARC databases, or other transmission-format-specific databases, because
usually transmission formats are designed to account for lots and lots of
variations and maximum flexibility, which generally is the opposite of the
modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
  RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless  wrote:


Does that work right down to the level of the individual triple though? If
a large percentage of my triples are each in their own individual graphs,
won't that be chaos? I really don't know the answer, it's not a rhetorical
question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:


Named Graphs are the way to solve the issue you bring up in that post, in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as

the

4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
  "@id" : "uri-for-graph",
  "dcterms:creator" : "uri-for-hugh",
  "@graph" : [
   // ... triples go here ...
  ]
}

Rob



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 

wrote:

I wrote about this a few months back at


http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen <
alexander.johanne...@gmail.com> wrote:


But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks.


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] HathiTrust Bib Api - JSONP

2013-11-06 Thread Adam Constabaris
Sara,

I have not used the "file extension" version to tell HT I want JSON
results, but if I take your first URL and change it to

http://catalog.hathitrust.org/api/volumes/brief/json/oclc:3967141?callback=mycallbackfunction

(insert /json/ after /brief instead of tacking it on as an extension) I get
a proper JavaScript/JSONP response.
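The difference between the two URL shapes is easy to see if you just build the strings (this sketch only constructs URLs; the endpoint behavior is as described above):

```python
# The fix: the response-format selector lives in the path (/json/), and the
# JSONP callback is a query parameter introduced with "?", not a bare "&".
base = "http://catalog.hathitrust.org/api/volumes/brief"
record = "oclc:3967141"

broken = base + "/oclc/3967141.json&callback=mycallbackfunction"   # "&" with no "?"
working = f"{base}/json/{record}?callback=mycallbackfunction"

print(working)
```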

For your second example, it looks like there's a PHP error message being
output that prevents the result from being
properly formatted JavaScript.  But afaict, everything on your end with
that request is OK.

cheers,

AC



On Wed, Nov 6, 2013 at 11:08 AM, sara amato  wrote:

> Does anyone have a working example of getting jsonp from the HathiTrust
> bib API?
>
> I can get straight json (it seems to ignore the callback parameter)
>
> http://catalog.hathitrust.org/api/volumes/brief/oclc/3967141.json&callback=mycallbackfunction
>
> or jsonp with some unfortunate notices at the top (and yes, I just emailed
> their 'feedback' address and asked about this.)
>
> http://catalog.hathitrust.org/api/volumes/json/oclc:3967141&callback=mycallbackfunction
>
>
> I'm wondering if I'm just missing the correct url/syntax.
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (MarkLogic being a notable exception),
or MARC databases, or other transmission-format-specific databases, because
usually transmission formats are designed to account for lots and lots of
variations and maximum flexibility, which generally is the opposite of the
modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
 RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless  wrote:

> Does that work right down to the level of the individual triple though? If
> a large percentage of my triples are each in their own individual graphs,
> won't that be chaos? I really don't know the answer, it's not a rhetorical
> question!
>
> Hugh
>
> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>
> > Named Graphs are the way to solve the issue you bring up in that post, in
> > my opinion.  You mint an identifier for the graph, and associate the
> > provenance and other information with that.  This then gets ingested as
> the
> > 4th URI into a quad store, so you don't lose the provenance information.
> >
> > In JSON-LD:
> > {
> >  "@id" : "uri-for-graph",
> >  "dcterms:creator" : "uri-for-hugh",
> >  "@graph" : [
> >   // ... triples go here ...
> >  ]
> > }
> >
> > Rob
> >
> >
> >
> > On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 
> wrote:
> >
> >> I wrote about this a few months back at
> >>
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
> >>
> >> I'd be very interested to hear what the smart folks here think!
> >>
> >> Hugh
> >>
> >> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
> >> alexander.johanne...@gmail.com> wrote:
> >>
> >>> But the
> >>> question to every piece of meta data is *authority*, which is the part
> >>> of RDF that sucks.
> >>
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
In the kinds of data I have to deal with, who made an assertion, or what 
sources provide evidence for a statement, are vitally important bits of 
information, so it's not just a data-source integration problem where you're 
taking batches of triples from different sources and putting them together. 
It's a question of how to encode "scholarly", messy, humanities data.

The answer of course, might be "don't use RDF for that" :-). I'd rather not 
invent something if I don't have to though.

Hugh

On Nov 6, 2013, at 10:56 , Robert Sanderson  wrote:

> A large number of triples that all have different provenance? I'm curious
> as to how you get them :)
> 
> Rob
> 
> 
> On Wed, Nov 6, 2013 at 8:52 AM, Hugh Cayless  wrote:
> 
>> Does that work right down to the level of the individual triple though? If
>> a large percentage of my triples are each in their own individual graphs,
>> won't that be chaos? I really don't know the answer, it's not a rhetorical
>> question!
>> 
>> Hugh
>> 
>> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>> 
>>> Named Graphs are the way to solve the issue you bring up in that post, in
>>> my opinion.  You mint an identifier for the graph, and associate the
>>> provenance and other information with that.  This then gets ingested as
>> the
>>> 4th URI into a quad store, so you don't lose the provenance information.
>>> 
>>> In JSON-LD:
>>> {
>>> "@id" : "uri-for-graph",
>>> "dcterms:creator" : "uri-for-hugh",
>>> "@graph" : [
>>>  // ... triples go here ...
>>> ]
>>> }
>>> 
>>> Rob
>>> 
>>> 
>>> 
>>> On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 
>> wrote:
>>> 
 I wrote about this a few months back at
 
>> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
 
 I'd be very interested to hear what the smart folks here think!
 
 Hugh
 
 On Nov 5, 2013, at 18:28 , Alexander Johannesen <
 alexander.johanne...@gmail.com> wrote:
 
> But the
> question to every piece of meta data is *authority*, which is the part
> of RDF that sucks.
 
>> 


[CODE4LIB] HathiTrust Bib Api - JSONP

2013-11-06 Thread sara amato
Does anyone have a working example of getting jsonp from the HathiTrust bib API?

I can get straight json (it seems to ignore the callback parameter)
http://catalog.hathitrust.org/api/volumes/brief/oclc/3967141.json&callback=mycallbackfunction

or jsonp with some unfortunate notices at the top (and yes, I just emailed 
their 'feedback' address and asked about this.)
http://catalog.hathitrust.org/api/volumes/json/oclc:3967141&callback=mycallbackfunction


I'm wondering if I'm just missing the correct url/syntax.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Robert Sanderson
A large number of triples that all have different provenance? I'm curious
as to how you get them :)

Rob


On Wed, Nov 6, 2013 at 8:52 AM, Hugh Cayless  wrote:

> Does that work right down to the level of the individual triple though? If
> a large percentage of my triples are each in their own individual graphs,
> won't that be chaos? I really don't know the answer, it's not a rhetorical
> question!
>
> Hugh
>
> On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:
>
> > Named Graphs are the way to solve the issue you bring up in that post, in
> > my opinion.  You mint an identifier for the graph, and associate the
> > provenance and other information with that.  This then gets ingested as
> the
> > 4th URI into a quad store, so you don't lose the provenance information.
> >
> > In JSON-LD:
> > {
> >  "@id" : "uri-for-graph",
> >  "dcterms:creator" : "uri-for-hugh",
> >  "@graph" : [
> >   // ... triples go here ...
> >  ]
> > }
> >
> > Rob
> >
> >
> >
> > On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless 
> wrote:
> >
> >> I wrote about this a few months back at
> >>
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
> >>
> >> I'd be very interested to hear what the smart folks here think!
> >>
> >> Hugh
> >>
> >> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
> >> alexander.johanne...@gmail.com> wrote:
> >>
> >>> But the
> >>> question to every piece of meta data is *authority*, which is the part
> >>> of RDF that sucks.
> >>
>


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
Does that work right down to the level of the individual triple though? If a 
large percentage of my triples are each in their own individual graphs, won't 
that be chaos? I really don't know the answer, it's not a rhetorical question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson  wrote:

> Named Graphs are the way to solve the issue you bring up in that post, in
> my opinion.  You mint an identifier for the graph, and associate the
> provenance and other information with that.  This then gets ingested as the
> 4th URI into a quad store, so you don't lose the provenance information.
> 
> In JSON-LD:
> {
>  "@id" : "uri-for-graph",
>  "dcterms:creator" : "uri-for-hugh",
>  "@graph" : [
>   // ... triples go here ...
>  ]
> }
> 
> Rob
> 
> 
> 
> On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless  wrote:
> 
>> I wrote about this a few months back at
>> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
>> 
>> I'd be very interested to hear what the smart folks here think!
>> 
>> Hugh
>> 
>> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
>> alexander.johanne...@gmail.com> wrote:
>> 
>>> But the
>>> question to every piece of meta data is *authority*, which is the part
>>> of RDF that sucks.
>> 


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Robert Sanderson
Named Graphs are the way to solve the issue you bring up in that post, in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as the
4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
  "@id" : "uri-for-graph",
  "dcterms:creator" : "uri-for-hugh",
  "@graph" : [
   // ... triples go here ...
  ]
}

Rob
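The "4th URI" idea above can be sketched in a few lines of plain Python. This is a toy quad store, not a real triple store API; the URIs are hypothetical stand-ins for "uri-for-graph" and "uri-for-hugh":

```python
# Toy quad store: every triple carries its graph's URI as a 4th element,
# and provenance is asserted about the graph URI itself, not per triple.
quads = set()

def add(s, p, o, graph):
    quads.add((s, p, o, graph))

G = "http://example.org/graph/1"  # mint an identifier for the graph
add("http://example.org/stmt/1", "ex:assertion", "some value", G)
# provenance about the graph, stored in a (hypothetical) metadata graph:
add(G, "dcterms:creator", "http://example.org/hugh", "urn:x-meta:graphs")

def creators_of(graph):
    """Recover who made the assertions in a given graph."""
    return {o for s, p, o, g in quads if s == graph and p == "dcterms:creator"}

print(creators_of(G))
```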



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless  wrote:

> I wrote about this a few months back at
> http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
>
> I'd be very interested to hear what the smart folks here think!
>
> Hugh
>
> On Nov 5, 2013, at 18:28 , Alexander Johannesen <
> alexander.johanne...@gmail.com> wrote:
>
> > But the
> > question to every piece of meta data is *authority*, which is the part
> > of RDF that sucks.
>


[CODE4LIB] Free LITA Post-Conference Tutorial on Forthcoming NISO ResourceSync Standard

2013-11-06 Thread Peter Murray
FYI.

Begin forwarded message:

From: Cynthia Hodgson <chodg...@niso.org>
Subject: [lita-l] Free LITA Post-Conference Tutorial on Forthcoming NISO 
ResourceSync Standard
Date: November 6, 2013 at 9:26:30 AM EST
To: LITA-L <lit...@ala.org>, lita-st...@ala.org
Reply-To: chodg...@niso.org

Participants at the 2013 LITA Forum in Louisville are invited to stay a few 
hours longer on Sunday, November 10 to attend the ResourceSync 
Tutorial, which will be 
held after the close of the main conference from 1:30-4:30 p.m. Herbert van de 
Sompel, Co-chair of the ResourceSync 
Working Group, will lead this 3-hour session where attendees can learn about 
how the forthcoming ResourceSync standard 
can be used to synchronize web resources between servers.
ResourceSync, begun in late 2011, is a joint project between NISO and the Open 
Archives Initiative (OAI) team, with funding from the Sloan Foundation. The 
standard, currently in final editing for approval, describes a synchronization 
framework for the web consisting of various capabilities that allow third-party 
systems to remain synchronized with a server's evolving resources. The 
capabilities can be combined in a modular manner to meet local or community 
requirements. This specification also describes how a server can advertise the 
synchronization capabilities it supports and how third-party systems can 
discover this information. The specification repurposes the document formats 
defined by the Sitemap protocol and introduces extensions for them.
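Because the framework repurposes the Sitemap document formats, a basic resource list can be read with nothing more than the standard library. The document and URLs below are a minimal, hypothetical example; only the Sitemap namespace is real:

```python
import xml.etree.ElementTree as ET

# A minimal sitemap-style resource list of the kind ResourceSync builds on.
doc = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/resource1</loc></url>
  <url><loc>http://example.org/resource2</loc></url>
</urlset>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
locs = [loc.text for loc in ET.fromstring(doc).findall("sm:url/sm:loc", ns)]
print(locs)
```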
This LITA post-conference tutorial is available at no cost. As we would 
appreciate knowing how many people are coming, please select the post 
conference checkbox on the registration 
form.
You can also view the beta version of the specification and provide feedback on 
the ResourceSync Google Group. Visit the ResourceSync workroom webpage 
(http://www.niso.org/workrooms/resourcesync/) for more information about 
the project.


Cynthia Hodgson
Technical Editor / Consultant
National Information Standards Organization
chodg...@niso.org
301-654-2512

--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
peter.mur...@lyrasis.org
+1 678-235-2955
800.999.8558 x2955


[CODE4LIB] databases/indexes with well-structured output

2013-11-06 Thread Eric Lease Morgan
What are some of the more popular and useful bibliographic databases/indexes 
with well-structured output?

If it were easy (trivial) for our readers to get sets of well-structured data 
out of our bibliographic databases, then it would be relatively easy for us to 
write software enabling readers to use and understand — evaluate — their data. 
What databases/indexes lend themselves to this solution? Let me elaborate.

JSTOR’s Data For Research service provides complete access to the totality of 
JSTOR, sans the articles themselves, unless you are authorized. [1] A person 
can search JSTOR and then request a data dump complete with citations, keyword 
frequencies, and n-grams. This data can then be used to create a report — like 
a timeline or tag clouds or concordances — illustrating the characteristics of 
the found set. About six months ago I wrote a program, the beginnings of such a 
report. [2]

Suppose a reader diligently used something like EndNote, Zotero, or RefWorks to 
save and manage their bibliographic citations of interest. If the reader were 
to export some or all of their bibliographic data to a file, then the result 
would be well-structured and computer readable. Things like titles, authors, 
keywords/subjects, maybe abstracts, and citations would be neatly delimited. If 
this file were read by a second computer program new views of the data could be 
manifested. Again, a timeline could be created. Wordclouds could be created. An 
analysis could be done against the data to determine frequent authors. 
Relationships between authors might be able to be exposed. All of this would 
assist the reader in evaluating their found set. 
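As a sketch of that kind of supplementary analysis, here are a few lines run over a hypothetical tab-delimited citation export (the field layout and the rows are invented purely for illustration):

```python
from collections import Counter

# Hypothetical export: author <TAB> title <TAB> year, one row per citation.
export = [
    "Smith, A.\tOn triples\t2012",
    "Jones, B.\tLinked data in practice\t2013",
    "Smith, A.\tMore on triples\t2013",
]

authors = Counter(row.split("\t")[0] for row in export)
years = Counter(row.split("\t")[2] for row in export)

print(authors.most_common(1))  # most frequent author in the found set
print(sorted(years))           # crude timeline of the results
```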

Through the use of APIs I can search things like WorldCat, the HathiTrust, or 
the Internet Archive. The result could be (for better or for worse) MARC 
records. Again, analysis could be done against this data not to find 
information (that has already been done), but rather to evaluate the data — 
look for patterns and anomalies.

Put another way, instead of trying to force people to do the best and most 
perfect bibliographic search, allow them to do broad searches and then provide 
supplementary tools enabling the reader to examine the results. It is not about 
find. It is about use & understand.

I prefer XML to other data structures, but I will not necessarily limit myself 
to XML. What information sources would you suggest I use? Here is a short, 
unordered list:

  * JSTOR Data For Research Data
  * Zotero (RDF) XML output
  * WorldCat, HathiTrust, Internet Archive

After I write the “search results evaluation tool”, I will then go to the next 
step and provide tools for the “distant reading” of individual items à la my 
PDF2TXT application. [3]

We here in libraries can no longer just give people access to information 
because people have more access than they know what to do with. Instead, I 
think an opportunity exists for us to provide tools for evaluating the 
information they have so they can use & understand it. Call it “scalable, 
computer-supplemented information literacy”.


[1] Data For Research - http://dfr.jstor.org
[2] JSTOR Tool — http://dh.crc.nd.edu/sandbox/jstor-tool/
[3] PDF2TXT - http://dh.crc.nd.edu/sandbox/pdf2txt.cgi

—
Eric Morgan
University of Notre Dame


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
I wrote about this a few months back at 
http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen 
 wrote:

> But the
> question to every piece of meta data is *authority*, which is the part
> of RDF that sucks.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
I could have known it was a test! ;)

Thanks Karen :)

On 06-11-13 15:20, "Karen Coyle"  wrote:

>I guess if I want anyone to answer my emails, I need to post mistakes.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ben, yes, I copied the URIs from the browser, and that was sloppy. However, 
it was the quickest thing to do, plus it was addressed to a human, not a 
machine. The URI for the LC entry is there on the page. Unfortunately, 
the VIAF URI is called "Permalink" -- which isn't obvious.


I guess if I want anyone to answer my emails, I need to post mistakes. 
When I post correct information, my mail goes unanswered (not even a 
"thanks"). So, thanks, guys.


kc

On 11/6/13 12:47 AM, Ben Companjen wrote:

Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by "better" I guess you meant
"probably more widely used" and "probably longer lasting"? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC <- me
http://companjen.name/id/BC.html <- about me


On 05-11-13 19:03, "Karen Coyle"  wrote:


Eric, I found an even better URI for you for the Declaration of
Independence:

http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the
LC Name Authority, but the related VIAF record, as per the VIAF
definition of itself, represents the real world thing itself. That URI is:

http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
will add that.

kc


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


[CODE4LIB] Job: Digital Library Applications Developer

2013-11-06 Thread Katherine Lynch
** Please excuse any cross-posting **

The Temple University Libraries are seeking a creative and energetic
individual to fill the position of Digital Library Applications Developer.
Temple’s federated library system serves an urban research university with
over 1,800 full-time faculty and a student body of 36,000 that is among the
most diverse in the nation.  For more information about Temple and
Philadelphia, visit http://www.temple.edu.

Description

Reporting to the Senior Digital Library Applications Developer and working
closely with others in the Digital Library Initiatives Department, help
develop and maintain the technological infrastructure for Temple
University’s digital library initiatives and services, which includes
preserving and delivering large collections of digital objects, and
supporting digital humanities and scholarly communication initiatives
throughout the Library. Under the guidance of supervisor, architect,
implement, test and deploy new tools and services primarily based on open
source project software, such as Omeka, Fedora Commons, Hydra, and Open
Journal Systems (OJS), potentially contributing code to those projects.
Perform other duties as assigned.

Required Education and Experience

* BS in Computer Science or related field, or an equivalent combination of
education and experience.

Required Skills and Abilities

* Demonstrated experience with application development in at least one
major programming language or framework, such as Ruby on Rails, PHP, or Java
* Demonstrated experience with MySQL or other database management systems.
* Demonstrated knowledge of the LAMP stack or similar technology stacks.
* Demonstrated ability to perform effective code testing.
* Experience with project requirements gathering.
* Strong organizational and interpersonal skills, demonstrated ability to
work in a collaborative team-based environment, and to communicate well
with IT and non-IT staff. Commitment to responsive and innovative service.
* Demonstrated ability to write clear documentation.

Preferred

* Experience with a repository system such as Fedora/Hydra,
Fedora/Islandora, or DSpace.
* Familiarity with a Content Management System like Drupal or an exhibit
curation system like Omeka would be a plus.
* Experience working with Open Source software; experience with version
control, test-driven development, and continuous integration techniques.
* Experience with QA testing of web applications.
* Experience with Linux/Unix operating systems, including scripting and
commands.
* Experience working with authentication and authorization protocols,
including LDAP.
* Knowledge of XML/XSLT.
* Familiarity with digital library standards, such as Dublin Core, MARC,
METS, EAD, and OAI-PMH.

To apply:

To apply for this position, please visit
http://www.temple.edu/hr/departments/employment/jobs_within.htm, click on
"Non-Employees Only," and search for job number TU-17222.  For full
consideration, please submit your completed electronic application, along
with a cover letter and resume. Review of applications will begin
immediately and will continue until the position is filled.

Temple University is an Affirmative Action/Equal Opportunity Employer with
a strong commitment to cultural diversity.

-- 

Katherine Lynch, Senior Digital Library Applications Developer
Temple University Library (http://library.temple.edu)
Samuel L. Paley Library, Room 113, 1210 Polett Walk, Philadelphia, PA 19122
Tel: 215-204-2821 | Fax: 215-204-5201 | Email: katherine.ly...@temple.edu


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Eric Lease Morgan
> Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
> sake...

ROTFL, because that is exactly what I was thinking. “Vi is better. No, emacs. 
You are both wrong; it is all about BBEdit!” Every tool, whether editor, 
email client, or RDF serialization, has its own strengths and weaknesses. 
Like religions, none of them are perfect, but they all have some value. —ELM


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ed Summers
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen
 wrote:
> The URIs you gave get me to webpages *about* the Declaration of
> Independence. I'm sure it's just a copy/paste mistake, but in this context
> you want the exact right URIs of course. And by "better" I guess you meant
> "probably more widely used" and "probably longer lasting"? :)
>
> LOC URI for the DoI (the work) is without .html:
> http://id.loc.gov/authorities/names/n79029194
>
> VIAF URI for the DoI is without trailing /:
> http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the
W3C's approach to educating the world about URIs for real-world things
hasn't quite caught on, while the RESTful one (promoted by the IETF)
has. If someone as knowledgeable as Karen can make that slip, what does
it say about our ability as practitioners to use URIs this way, and
about our ability to write software that does it as well?

In a REST world, when you get a 200 OK it doesn't mean the resource is
a Web Document. The resource can be anything, you just happened to
successfully get a representation of it. If you like you can provide
hints about the nature of the resource in the representation, but the
resource itself never goes over the wire, the representation does.
It's a subtle but important difference in two ways of looking at Web
architecture.
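At the HTTP level the distinction is invisible in code, which is part of why it trips people up. This sketch only builds a request for one of the VIAF URIs discussed above (nothing is sent over the network): the resource identified is a real-world thing, and content negotiation selects which representation of it you would receive.

```python
from urllib.request import Request

# The resource itself never goes over the wire; the Accept header asks
# for a particular *representation* of it (here, RDF/XML).
req = Request("http://viaf.org/viaf/179420344",
              headers={"Accept": "application/rdf+xml"})

print(req.full_url, req.get_header("Accept"))
```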

If you find yourself interested in making up your own mind about this
you can find the RESTful definitions of resource and representation in
the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1].
You can find language about Web Documents (or at least its more recent
variant, Information Resource) in the W3C's Architecture of the World
Wide Web [2].

Obviously I'm biased towards the IETF's position on this. This is just
my personal opinion from my experience as a Web developer trying to
explain Linked Data to practitioners, looking at the Web we have, and
chatting with good friends who weren't afraid to tell me what they
thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by "better" I guess you meant
"probably more widely used" and "probably longer lasting"? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC <- me
http://companjen.name/id/BC.html <- about me


On 05-11-13 19:03, "Karen Coyle"  wrote:

>Eric, I found an even better URI for you for the Declaration of
>Independence:
>
>http://id.loc.gov/authorities/names/n79029194.html
>
>Now that could be seen as being representative of the name chosen by the
>LC Name Authority, but the related VIAF record, as per the VIAF
>definition of itself, represents the real world thing itself. That URI is:
>
>http://viaf.org/viaf/179420344/
>
>I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
>will add that.
>
>kc