Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by better I guess you meant
probably more widely used and probably longer lasting? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC - me
http://companjen.name/id/BC.html - about me
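
Ben's corrections amount to a simple normalization rule for these two authority services: drop the `.html` suffix (which names the page *about* the thing) and any trailing slash. A quick illustrative sketch; the rule is inferred from this thread, not from official LOC/VIAF documentation:

```python
def normalize_authority_uri(uri: str) -> str:
    """Return the 'thing' URI: strip an .html suffix (the page about
    the thing) and any trailing slash, per Ben's corrections above."""
    if uri.endswith(".html"):
        uri = uri[: -len(".html")]
    return uri.rstrip("/")

print(normalize_authority_uri("http://id.loc.gov/authorities/names/n79029194.html"))
print(normalize_authority_uri("http://viaf.org/viaf/179420344/"))
```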


On 05-11-13 19:03, Karen Coyle li...@kcoyle.net wrote:

Eric, I found an even better URI for you for the Declaration of
Independence:

http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the
LC Name Authority, but the related VIAF record, as per the VIAF
definition of itself, represents the real world thing itself. That URI is:

http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
will add that.

kc


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ed Summers
On Wed, Nov 6, 2013 at 3:47 AM, Ben Companjen
ben.compan...@dans.knaw.nl wrote:
 The URIs you gave get me to webpages *about* the Declaration of
 Independence. I'm sure it's just a copy/paste mistake, but in this context
 you want the exact right URIs of course. And by better I guess you meant
 probably more widely used and probably longer lasting? :)

 LOC URI for the DoI (the work) is without .html:
 http://id.loc.gov/authorities/names/n79029194

 VIAF URI for the DoI is without trailing /:
 http://viaf.org/viaf/179420344

Thanks for that Ben. IMHO it's (yet another) illustration of why the
W3C's approach to educating the world about URIs for real world things
hasn't quite caught on, while RESTful ones (promoted by the IETF)
have. If someone as knowledgeable as Karen can make that slip, what does it
say about our ability as practitioners to use URIs this way, and about
our ability to write software that does the same?

In a REST world, when you get a 200 OK it doesn't mean the resource is
a Web Document. The resource can be anything, you just happened to
successfully get a representation of it. If you like you can provide
hints about the nature of the resource in the representation, but the
resource itself never goes over the wire, the representation does.
It's a subtle but important difference in two ways of looking at Web
architecture.
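
In practice that distinction surfaces as content negotiation: one resource URI, several possible representations. A minimal sketch (no request is actually sent here; the LOC URI is the one from earlier in the thread, and the RDF media type is an assumption for illustration):

```python
import urllib.request

# One resource; ask for an RDF representation rather than HTML.
uri = "http://id.loc.gov/authorities/names/n79029194"
req = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})

# The URI identifies the resource; the Accept header only selects
# which representation of it goes over the wire.
print(req.full_url, req.get_header("Accept"))
```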

If you find yourself interested in making up your own mind about this
you can find the RESTful definitions of resource and representation in
the IETF HTTP RFCs, most recently as of a few weeks ago in draft [1].
You can find language about Web Documents (or at least its more recent
variant, Information Resource) in the W3C's Architecture of the World
Wide Web [2].

Obviously I'm biased towards the IETF's position on this. This is just
my personal opinion from my experience as a Web developer trying to
explain Linked Data to practitioners, looking at the Web we have, and
chatting with good friends who weren't afraid to tell me what they
thought.

//Ed

[1] http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-24#page-7
[2] http://www.w3.org/TR/webarch/#id-resources


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Eric Lease Morgan
 Yes, I'm going to get sucked into this vi vs emacs argument for nostalgia's
 sake...

ROTFL, because that is exactly what I was thinking. “Vi is better. No, emacs. 
You are both wrong; it is all about BBEdit!” Every tool, whether it be an editor, 
an email client, or an RDF serialization, has its own strengths and 
weaknesses. Like religions, none of them are perfect, but they all have some 
value. —ELM


[CODE4LIB] Job: Digital Library Applications Developer

2013-11-06 Thread Katherine Lynch
** Please excuse any cross-posting **

The Temple University Libraries are seeking a creative and energetic
individual to fill the position of Digital Library Applications Developer.
Temple’s federated library system serves an urban research university with
over 1,800 full-time faculty and a student body of 36,000 that is among the
most diverse in the nation.  For more information about Temple and
Philadelphia, visit http://www.temple.edu.

Description

Reporting to the Senior Digital Library Applications Developer and working
closely with others in the Digital Library Initiatives Department, help
develop and maintain the technological infrastructure for Temple
University’s digital library initiatives and services, which includes
preserving and delivering large collections of digital objects, and
supporting digital humanities and scholarly communication initiatives
throughout the Library. Under the guidance of a supervisor, architect,
implement, test and deploy new tools and services primarily based on open
source project software, such as Omeka, Fedora Commons, Hydra, and Open
Journal Systems (OJS), potentially contributing code to those projects.
Perform other duties as assigned.

Required Education and Experience

* BS in Computer Science or related field, or an equivalent combination of
education and experience.

Required Skills and Abilities

* Demonstrated experience with application development in at least one
major programming language or framework such as Ruby on Rails, PHP, or Java
* Demonstrated experience with MySQL or other database management systems.
* Demonstrated knowledge of the LAMP stack or similar technology stacks.
* Demonstrated ability to perform effective code testing.
* Experience with project requirements gathering.
* Strong organizational and interpersonal skills, demonstrated ability to
work in a collaborative team-based environment, and to communicate well
with IT and non-IT staff. Commitment to responsive and innovative service.
* Demonstrated ability to write clear documentation.

Preferred

* Experience with a repository system such as Fedora/Hydra,
Fedora/Islandora, or DSpace.
* Familiarity with a Content Management System like Drupal or an exhibit
curation system like Omeka would be a plus.
* Experience working with Open Source software; experience with version
control, test-driven development, and continuous integration techniques.
* Experience with QA testing of web applications.
* Experience with Linux/Unix operating systems, including scripting and
commands.
* Experience working with authentication and authorization protocols,
including LDAP.
* Knowledge of XML/XSLT.
* Familiarity with digital library standards, such as Dublin Core, MARC,
METS, EAD, and OAI-PMH.

To apply:

To apply for this position, please visit
http://www.temple.edu/hr/departments/employment/jobs_within.htm, click on
Non-Employees Only, and search for job number TU-17222.  For full
consideration, please submit your completed electronic application, along
with a cover letter and resume. Review of applications will begin
immediately and will continue until the position is filled.

Temple University is an Affirmative Action/Equal Opportunity Employer with
a strong commitment to cultural diversity.

-- 

Katherine Lynch, Senior Digital Library Applications Developer
Temple University Library (http://library.temple.edu)
Samuel L. Paley Library, Room 113, 1210 Polett Walk, Philadelphia, PA 19122
Tel: 215-204-2821 | Fax: 215-204-5201 | Email: katherine.ly...@temple.edu


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ben, Yes, I copied from the browser URIs, and that was sloppy. However, 
it was the quickest thing to do, plus it was addressed to a human, not a 
machine. The URI for the LC entry is there on the page. Unfortunately, 
the VIAF URI is labeled "Permalink" -- which isn't obvious.


I guess if I want anyone to answer my emails, I need to post mistakes. 
When I post correct information, my mail goes unanswered (not even a 
thanks). So, thanks, guys.


kc

On 11/6/13 12:47 AM, Ben Companjen wrote:

Karen,

The URIs you gave get me to webpages *about* the Declaration of
Independence. I'm sure it's just a copy/paste mistake, but in this context
you want the exact right URIs of course. And by better I guess you meant
probably more widely used and probably longer lasting? :)

LOC URI for the DoI (the work) is without .html:
http://id.loc.gov/authorities/names/n79029194


VIAF URI for the DoI is without trailing /:
http://viaf.org/viaf/179420344

Ben
http://companjen.name/id/BC - me
http://companjen.name/id/BC.html - about me


On 05-11-13 19:03, Karen Coyle li...@kcoyle.net wrote:


Eric, I found an even better URI for you for the Declaration of
Independence:

http://id.loc.gov/authorities/names/n79029194.html

Now that could be seen as being representative of the name chosen by the
LC Name Authority, but the related VIAF record, as per the VIAF
definition of itself, represents the real world thing itself. That URI is:

http://viaf.org/viaf/179420344/

I noticed that this VIAF URI isn't linked from the Wikipedia page, so I
will add that.

kc


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ben Companjen
I should have known it was a test! ;)

Thanks Karen :)

On 06-11-13 15:20, Karen Coyle li...@kcoyle.net wrote:

I guess if I want anyone to answer my emails, I need to post mistakes.


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
I wrote about this a few months back at 
http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen 
alexander.johanne...@gmail.com wrote:

 But the
 question to every piece of meta data is *authority*, which is the part
 of RDF that sucks.


[CODE4LIB] databases/indexes with well-structured output

2013-11-06 Thread Eric Lease Morgan
What are some of the more popular and useful bibliographic databases/indexes 
with well-structured output?

If it were easy (trivial) for our readers to get sets of well-structured data 
out of our bibliographic databases, then it would be relatively easy for us to 
write software enabling readers to use and understand — evaluate — their data. 
What databases/indexes lend themselves to this solution? Let me elaborate.

JSTOR’s Data For Research service provides complete access to the totality of 
JSTOR, sans the articles themselves, unless you are authorized. [1] A person 
can search JSTOR and then request a data dump complete with citations, keyword 
frequencies, and n-grams. This data can then be used to create a report — like 
a timeline or tag clouds or concordances — illustrating the characteristics of 
the found set. About six months ago I wrote a program, the beginnings of such a 
report. [2]

Suppose a reader diligently used something like Endnote, Zotero, or RefWorks to 
save and manage their bibliographic citations of interest. If the reader were 
to export some or all of their bibliographic data to a file, then the result 
would be well-structured and computer readable. Things like titles, authors, 
keywords/subjects, maybe abstracts, and citations would be neatly delimited. If 
this file were read by a second computer program new views of the data could be 
manifested. Again, a timeline could be created. Wordclouds could be created. An 
analysis could be done against the data to determine frequent authors. 
Relationships between authors might be exposed. All of this would 
assist the reader in evaluating their found set. 
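
As a sketch of the kind of analysis described above, here is a toy author-frequency count and timeline over exported citation data. The records and field names are invented for illustration; a real script would read the citation manager's actual export format:

```python
from collections import Counter

# Hypothetical records, as might come from a Zotero/EndNote/RefWorks export.
citations = [
    {"title": "Paper A", "authors": ["Smith, J.", "Jones, K."], "year": 1998},
    {"title": "Paper B", "authors": ["Smith, J."], "year": 2001},
    {"title": "Paper C", "authors": ["Lee, M.", "Smith, J."], "year": 2001},
]

# Frequent authors in the found set.
author_counts = Counter(a for c in citations for a in c["authors"])
print(author_counts.most_common(3))

# A crude timeline: publications per year.
timeline = Counter(c["year"] for c in citations)
print(sorted(timeline.items()))
```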

Through the use of APIs I can search things like WorldCat, the HathiTrust, or 
the Internet Archive. The result could be (for better or for worse) MARC 
records. Again, analysis could be done against this data not to find 
information (that has already been done), but rather to evaluate the data — 
look for patterns and anomalies.

Put another way, instead of trying to force people to do the best and most 
perfect bibliographic search, allow them to do broad searches and then provide 
supplementary tools enabling the reader to examine the results. It is not about 
find. It is about use & understand.

I prefer XML to other data structures, but I will not necessarily limit myself 
to XML. What information sources would you suggest I use? Here is a short, 
unordered list:

  * JSTOR Data For Research Data
  * Zotero (RDF) XML output
  * WorldCat, HathiTrust, Internet Archive

After I write the “search results evaluation tool”, I will then go to the next 
step and provide tools for the “distant reading” of individual items à la my 
PDF2TXT application. [3]

We here in libraries can no longer just give people access to information 
because people have more access than they know what to do with. Instead, I 
think an opportunity exists for us to provide tools for evaluating the 
information they have so they can use & understand it. Call it “scalable, 
computer-supplemented information literacy”.


[1] Data For Research - http://dfr.jstor.org
[2] JSTOR Tool — http://dh.crc.nd.edu/sandbox/jstor-tool/
[3] PDF2TXT - http://dh.crc.nd.edu/sandbox/pdf2txt.cgi

—
Eric Morgan
University of Notre Dame


[CODE4LIB] Free LITA Post-Conference Tutorial on Forthcoming NISO ResourceSync Standard

2013-11-06 Thread Peter Murray
FYI.

Begin forwarded message:

From: Cynthia Hodgson chodg...@niso.org
Subject: [lita-l] Free LITA Post-Conference Tutorial on Forthcoming NISO 
ResourceSync Standard
Date: November 6, 2013 at 9:26:30 AM EST
To: LITA-L lit...@ala.org, lita-st...@ala.org
Reply-To: chodg...@niso.org

Participants at the 2013 LITA Forum in Louisville are invited to stay a few 
hours longer on Sunday, November 10 to attend the ResourceSync Tutorial 
(http://www.ala.org/lita/conferences/forum/2013/postcon), which will be 
held after the close of the main conference from 1:30-4:30 p.m. Herbert van de 
Sompel (http://public.lanl.gov/herbertv/home/), Co-chair of the ResourceSync 
Working Group, will lead this 3-hour session where attendees can learn about 
how the forthcoming ResourceSync standard 
can be used to synchronize web resources between servers.

ResourceSync, begun in late 2011, is a joint project between NISO and the Open 
Archives Initiative (OAI) team, with funding from the Sloan Foundation. The 
standard, currently in final editing for approval, describes a synchronization 
framework for the web consisting of various capabilities that allow third-party 
systems to remain synchronized with a server's evolving resources. The 
capabilities can be combined in a modular manner to meet local or community 
requirements. This specification also describes how a server can advertise the 
synchronization capabilities it supports and how third-party systems can 
discover this information. The specification repurposes the document formats 
defined by the Sitemap protocol and introduces extensions for them.

This LITA post-conference tutorial is available at no cost. As we would 
appreciate knowing how many people are coming, please select the post 
conference checkbox on the registration form 
(http://www.ala.org/lita/conferences/forum/2013/registration).

You can also view the beta version of the specification 
(http://www.openarchives.org/rs/0.9.1/toc) and provide feedback on the 
ResourceSync Google Group (https://groups.google.com/d/forum/resourcesync). 
Visit the ResourceSync workroom webpage 
(http://www.niso.org/workrooms/resourcesync/) for more information about 
the project.


Cynthia Hodgson
Technical Editor / Consultant
National Information Standards Organization
chodg...@niso.org
301-654-2512

--
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
peter.mur...@lyrasis.org
+1 678-235-2955
800.999.8558 x2955


[CODE4LIB] HathiTrust Bib Api - JSONP

2013-11-06 Thread sara amato
Does anyone have a working example of getting jsonp from the HathiTrust bib API?

I can get straight json (it seems to ignore the callback parameter):
http://catalog.hathitrust.org/api/volumes/brief/oclc/3967141.json?callback=mycallbackfunction

or jsonp with some unfortunate notices at the top (and yes, I just emailed 
their 'feedback' address and asked about this):
http://catalog.hathitrust.org/api/volumes/json/oclc:3967141?callback=mycallbackfunction


I'm wondering if I'm just missing the correct url/syntax.
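
One way to rule out URL/syntax mistakes is to build the query string programmatically rather than by hand. A sketch; the `callback` parameter name is taken from the message above, not confirmed against HathiTrust's API documentation:

```python
from urllib.parse import urlencode

# Build the brief-record request with the callback as a proper query parameter.
base = "http://catalog.hathitrust.org/api/volumes/brief/oclc/3967141.json"
url = base + "?" + urlencode({"callback": "mycallbackfunction"})
print(url)
```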


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
In the kinds of data I have to deal with, who made an assertion, or what 
sources provide evidence for a statement, are vitally important bits of 
information, so it's not just a data-source integration problem, where you're 
taking batches of triples from different sources and putting them together. 
It's a question of how to encode scholarly, messy, humanities data.

The answer, of course, might be "don't use RDF for that" :-). I'd rather not 
invent something if I don't have to, though.

Hugh

On Nov 6, 2013, at 10:56 , Robert Sanderson azarot...@gmail.com wrote:

 A large number of triples that all have different provenance? I'm curious
 as to how you get them :)
 
 Rob
 
 
 On Wed, Nov 6, 2013 at 8:52 AM, Hugh Cayless philomou...@gmail.com wrote:
 
 Does that work right down to the level of the individual triple though? If
 a large percentage of my triples are each in their own individual graphs,
 won't that be chaos? I really don't know the answer, it's not a rhetorical
 question!
 
 Hugh
 
 On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com wrote:
 
 Named Graphs are the way to solve the issue you bring up in that post, in
 my opinion.  You mint an identifier for the graph, and associate the
 provenance and other information with that.  This then gets ingested as
 the
 4th URI into a quad store, so you don't lose the provenance information.
 
 In JSON-LD:
 {
   "@id": "uri-for-graph",
   "dcterms:creator": "uri-for-hugh",
   "@graph": [
     // ... triples go here ...
   ]
 }
 
 Rob
 
 
 
 On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless philomou...@gmail.com
 wrote:
 
 I wrote about this a few months back at
 
 http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
 
 I'd be very interested to hear what the smart folks here think!
 
 Hugh
 
 On Nov 5, 2013, at 18:28 , Alexander Johannesen 
 alexander.johanne...@gmail.com wrote:
 
 But the
 question to every piece of meta data is *authority*, which is the part
 of RDF that sucks.
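
Rob's named-graph suggestion above can be sketched as a toy quad store: each triple carries a fourth element naming its graph, and provenance attaches to the graph URI. Pure Python with made-up URIs for illustration; a real system would use an actual quad store:

```python
# Toy quad store: (subject, predicate, object, graph) tuples.
quads = [
    ("urn:doc1", "dc:title", "Declaration of Independence", "urn:graph1"),
    ("urn:doc1", "dc:date", "1776", "urn:graph2"),
]

# Provenance attaches to the graph URIs, not to individual triples.
provenance = {
    "urn:graph1": {"dcterms:creator": "urn:hugh"},
    "urn:graph2": {"dcterms:creator": "urn:someone-else"},
}

def triples_asserted_by(creator):
    """All triples whose containing graph was asserted by `creator`."""
    graphs = {g for g, p in provenance.items()
              if p.get("dcterms:creator") == creator}
    return [(s, p, o) for s, p, o, g in quads if g in graphs]

print(triples_asserted_by("urn:hugh"))
```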
 
 


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or insert your transmission format of choice-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
 RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote:

 Does that work right down to the level of the individual triple though? If
 a large percentage of my triples are each in their own individual graphs,
 won't that be chaos? I really don't know the answer, it's not a rhetorical
 question!

 Hugh

 On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com wrote:

  Named Graphs are the way to solve the issue you bring up in that post, in
  my opinion.  You mint an identifier for the graph, and associate the
  provenance and other information with that.  This then gets ingested as
 the
  4th URI into a quad store, so you don't lose the provenance information.
 
  In JSON-LD:
  {
    "@id": "uri-for-graph",
    "dcterms:creator": "uri-for-hugh",
    "@graph": [
      // ... triples go here ...
    ]
  }
 
  Rob
 
 
 
  On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless philomou...@gmail.com
 wrote:
 
  I wrote about this a few months back at
 
 http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/
 
  I'd be very interested to hear what the smart folks here think!
 
  Hugh
 
  On Nov 5, 2013, at 18:28 , Alexander Johannesen 
  alexander.johanne...@gmail.com wrote:
 
  But the
  question to every piece of meta data is *authority*, which is the part
  of RDF that sucks.
 



Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Karen Coyle
Ross, I agree with your statement that data doesn't have to be RDF all 
the way down, etc. But I'd like to hear more about why you think SPARQL 
availability has less value, and if you see an alternative to SPARQL for 
querying.


kc


On 11/6/13 8:11 AM, Ross Singer wrote:

Hugh, I don't think you're in the weeds with your question (and, while I
think that named graphs can provide a solution to your particular problem,
that doesn't necessarily mean that it doesn't raise more questions or
potentially more frustrations down the line - like any new power, it can be
used for good or evil and the difference might not be obvious at first).

My question for you, however, is why are you using a triple store for this?
  That is, why bother with the broad and general model in what I assume is a
closed world assumption in your application?

We don't generally use XML databases (Marklogic being a notable exception),
or MARC databases, or insert your transmission format of choice-specific
databases because usually transmission formats are designed to account for
lots and lots of variations and maximum flexibility, which generally is the
opposite of the modeling that goes into a specific app.

I think there's a world of difference between modeling your data so it can
be represented in RDF (and, possibly, available via SPARQL, but I think
there is *far* less value there) and committing to RDF all the way down.
  RDF is a generalization so multiple parties can agree on what data means,
but I would have a hard time swallowing the argument that domain-specific
data must be RDF-native.

-Ross.


On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com wrote:


Does that work right down to the level of the individual triple though? If
a large percentage of my triples are each in their own individual graphs,
won't that be chaos? I really don't know the answer, it's not a rhetorical
question!

Hugh

On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com wrote:


Named Graphs are the way to solve the issue you bring up in that post, in
my opinion.  You mint an identifier for the graph, and associate the
provenance and other information with that.  This then gets ingested as

the

4th URI into a quad store, so you don't lose the provenance information.

In JSON-LD:
{
  "@id": "uri-for-graph",
  "dcterms:creator": "uri-for-hugh",
  "@graph": [
    // ... triples go here ...
  ]
}

Rob



On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless philomou...@gmail.com

wrote:

I wrote about this a few months back at


http://blogs.library.duke.edu/dcthree/2013/07/27/the-trouble-with-triples/

I'd be very interested to hear what the smart folks here think!

Hugh

On Nov 5, 2013, at 18:28 , Alexander Johannesen 
alexander.johanne...@gmail.com wrote:


But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks.


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Hugh Cayless
The answer is purely because the RDF data model and the technology around it 
looks like it would almost do what we need it to.

I do not, and cannot, assume a closed world. The open world assumption is one 
of the attractive things about RDF, in fact :-)

Hugh

On Nov 6, 2013, at 11:11 , Ross Singer rossfsin...@gmail.com wrote:

 My question for you, however, is why are you using a triple store for this?
 That is, why bother with the broad and general model in what I assume is a
 closed world assumption in your application?


Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hey Karen,

It's purely anecdotal (albeit anecdotes borne from working at a company
that offered, and has since abandoned, a sparql-based triple store
service), but I just don't see the interest in arbitrary SPARQL queries
against remote datasets that I do against linking to (and grabbing) known
items.  I think there are multiple reasons for this:

1) Unless you're already familiar with the dataset behind the SPARQL
endpoint, where do you even start with constructing useful queries?
2) SPARQL as a query language is a combination of being too powerful and
completely useless in practice: query timeouts are commonplace, endpoints
don't support all of 1.1, etc.  And, going back to point #1, it's hard to
know how to optimize your queries unless you are already pretty familiar
with the data
3) SPARQL is a flawed API interface from the get-go (IMHO) for the same
reason we don't offer a public SQL interface to our RDBMSes

Which isn't to say it doesn't have its uses or applications.

I just think that in most cases domain/service-specific APIs (be they
RESTful, based on the Linked Data API [0], whatever) will likely be favored
over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
sure the answer is no, but that's the future I foresee, personally.

-Ross.
0. https://code.google.com/p/linked-data-api/wiki/Specification
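
For concreteness on point #1: this is the sort of generic discovery query a stranger to a dataset ends up sending to a SPARQL endpoint just to learn which predicates exist. The endpoint URL is illustrative and no request is actually sent; the sketch only builds the GET URL:

```python
from urllib.parse import urlencode

# A typical "what is even in here?" discovery query.
query = """
SELECT DISTINCT ?p (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?n)
LIMIT 20
"""

endpoint = "http://example.org/sparql"  # illustrative endpoint
url = endpoint + "?" + urlencode({"query": query})
print(url[:60] + "...")
```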


On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote:

 Ross, I agree with your statement that data doesn't have to be RDF all
 the way down, etc. But I'd like to hear more about why you think SPARQL
 availability has less value, and if you see an alternative to SPARQL for
 querying.

 kc



 On 11/6/13 8:11 AM, Ross Singer wrote:

 Hugh, I don't think you're in the weeds with your question (and, while I
 think that named graphs can provide a solution to your particular problem,
 that doesn't necessarily mean that it doesn't raise more questions or
 potentially more frustrations down the line - like any new power, it can
 be
 used for good or evil and the difference might not be obvious at first).

 My question for you, however, is why are you using a triple store for
 this?
   That is, why bother with the broad and general model in what I assume
 is a
 closed world assumption in your application?

 We don't generally use XML databases (Marklogic being a notable
 exception),
 or MARC databases, or insert your transmission format of choice-specific
 databases because usually transmission formats are designed to account for
 lots and lots of variations and maximum flexibility, which generally is
 the
 opposite of the modeling that goes into a specific app.

 I think there's a world of difference between modeling your data so it can
 be represented in RDF (and, possibly, available via SPARQL, but I think
 there is *far* less value there) and committing to RDF all the way down.
   RDF is a generalization so multiple parties can agree on what data
 means,
 but I would have a hard time swallowing the argument that domain-specific
 data must be RDF-native.

 -Ross.


 On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com
 wrote:

  Does that work right down to the level of the individual triple though?
 If
 a large percentage of my triples are each in their own individual graphs,
 won't that be chaos? I really don't know the answer, it's not a
 rhetorical
 question!

 Hugh

 On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com wrote:

  Named Graphs are the way to solve the issue you bring up in that post,
 in
 my opinion.  You mint an identifier for the graph, and associate the
 provenance and other information with that.  This then gets ingested as

 the

 4th URI into a quad store, so you don't lose the provenance information.

 In JSON-LD:
 {
   "@id": "uri-for-graph",
   "dcterms:creator": "uri-for-hugh",
   "@graph": [
     // ... triples go here ...
   ]
 }

 Rob



 On Wed, Nov 6, 2013 at 7:42 AM, Hugh Cayless philomou...@gmail.com

 wrote:

 I wrote about this a few months back at

  http://blogs.library.duke.edu/dcthree/2013/07/27/the-
 trouble-with-triples/

 I'd be very interested to hear what the smart folks here think!

 Hugh

 On Nov 5, 2013, at 18:28 , Alexander Johannesen 
 alexander.johanne...@gmail.com wrote:

  But the
 question to every piece of meta data is *authority*, which is the part
 of RDF that sucks.


 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet



Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ross Singer
Hugh, I'm skeptical of this in a usable application or interface.

Applications have constraints.  There are predicates you care about, there
are values you display in specific ways.  There are expectations, based on
the domain, in the data that are either driven by the interface or the
needs of the consumers.

I have yet to see an example of arbitrary and unexpected data exposed in
an application that people actually use.

-Ross.


On Wed, Nov 6, 2013 at 11:39 AM, Hugh Cayless philomou...@gmail.com wrote:

 The answer is purely because the RDF data model and the technology around
 it looks like it would almost do what we need it to.

 I do not, and cannot, assume a closed world. The open world assumption is
 one of the attractive things about RDF, in fact :-)

 Hugh

 On Nov 6, 2013, at 11:11 , Ross Singer rossfsin...@gmail.com wrote:

  My question for you, however, is why are you using a triple store for
 this?
  That is, why bother with the broad and general model in what I assume is
 a
  closed world assumption in your application?



Re: [CODE4LIB] rdf serialization

2013-11-06 Thread Ethan Gruber
I think that the answer to #1 is that if you want or expect people to use
your endpoint, you should document how it works: the ontologies, the
models, and a variety of example SPARQL queries, ranging from simple to
complex.  The British Museum's SPARQL endpoint (
http://collection.britishmuseum.org/sparql) is highly touted, but how many
people actually use it?  I understand your point about SPARQL being too
complicated for an API interface, but the best examples of services built
on SPARQL are probably the ones you don't even realize are built on SPARQL
(e.g., http://numismatics.org/ocre/id/ric.1%282%29.aug.4A#mapTab).  So on
one hand, perhaps only the most dedicated and hardcore researchers will
venture to construct SPARQL queries for your endpoint, but on the other,
you can build some pretty visualizations based on SPARQL queries conducted
in the background from the user's interaction with a simple html/javascript
based interface.

Ethan


On Wed, Nov 6, 2013 at 11:54 AM, Ross Singer rossfsin...@gmail.com wrote:

 Hey Karen,

 It's purely anecdotal (albeit anecdotes borne from working at a company
 that offered, and has since abandoned, a sparql-based triple store
 service), but I just don't see the interest in arbitrary SPARQL queries
 against remote datasets that I do in linking to (and grabbing) known
 items.  I think there are multiple reasons for this:

 1) Unless you're already familiar with the dataset behind the SPARQL
 endpoint, where do you even start with constructing useful queries?
 2) SPARQL as a query language is a combination of being too powerful and
 completely useless in practice: query timeouts are commonplace, endpoints
 don't support all of 1.1, etc.  And, going back to point #1, it's hard to
 know how to optimize your queries unless you are already pretty familiar
 with the data.
 3) SPARQL is a flawed API interface from the get-go (IMHO) for the same
 reason we don't offer a public SQL interface to our RDBMSes

 Which isn't to say it doesn't have its uses or applications.

 I just think that in most cases domain/service-specific APIs (be they
 RESTful, based on the Linked Data API [0], whatever) will likely be favored
 over generic SPARQL endpoints.  Are n+1 different APIs ideal?  I am pretty
 sure the answer is no, but that's the future I foresee, personally.

 -Ross.
 0. https://code.google.com/p/linked-data-api/wiki/Specification


 On Wed, Nov 6, 2013 at 11:28 AM, Karen Coyle li...@kcoyle.net wrote:

  Ross, I agree with your statement that data doesn't have to be RDF all
  the way down, etc. But I'd like to hear more about why you think SPARQL
  availability has less value, and if you see an alternative to SPARQL for
  querying.
 
  kc
 
 
 
  On 11/6/13 8:11 AM, Ross Singer wrote:
 
  Hugh, I don't think you're in the weeds with your question (and, while I
  think that named graphs can provide a solution to your particular
 problem,
  that doesn't necessarily mean that it doesn't raise more questions or
  potentially more frustrations down the line - like any new power, it can
  be
  used for good or evil and the difference might not be obvious at first).
 
  My question for you, however, is why are you using a triple store for
  this?
That is, why bother with the broad and general model in what I assume
  is a
  closed world assumption in your application?
 
  We don't generally use XML databases (MarkLogic being a notable
  exception),
  or MARC databases, or "insert your transmission format of
 choice"-specific
  databases because usually transmission formats are designed to account
 for
  lots and lots of variations and maximum flexibility, which generally is
  the
  opposite of the modeling that goes into a specific app.
 
  I think there's a world of difference between modeling your data so it
 can
  be represented in RDF (and, possibly, available via SPARQL, but I think
  there is *far* less value there) and committing to RDF all the way down.
RDF is a generalization so multiple parties can agree on what data
  means,
  but I would have a hard time swallowing the argument that
 domain-specific
  data must be RDF-native.
 
  -Ross.
 
 
  On Wed, Nov 6, 2013 at 10:52 AM, Hugh Cayless philomou...@gmail.com
  wrote:
 
   Does that work right down to the level of the individual triple though?
  If
  a large percentage of my triples are each in their own individual
 graphs,
  won't that be chaos? I really don't know the answer, it's not a
  rhetorical
  question!
 
  Hugh
 
  On Nov 6, 2013, at 10:40 , Robert Sanderson azarot...@gmail.com
 wrote:
 
   Named Graphs are the way to solve the issue you bring up in that post,
  in
  my opinion.  You mint an identifier for the graph, and associate the
  provenance and other information with that.  This then gets ingested
 as
 
  the
 
  4th URI into a quad store, so you don't lose the provenance
 information.
 
  In JSON-LD:
  {
    "@id" : "uri-for-graph",
    "dcterms:creator" : "uri-for-hugh",
    "@graph" : [
     // ... 
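[Editor's note: the named-graph idea can be sketched without any RDF library — a quad store is just triples carrying a fourth element naming their graph, and provenance statements then use the graph URI as their subject. A minimal illustration, with all URIs hypothetical:]

```python
# Minimal sketch of a quad store: each statement carries a graph name,
# and provenance triples attach to the graph URI rather than the data.
quads = []

def add(subject, predicate, obj, graph):
    quads.append((subject, predicate, obj, graph))

# A minted identifier for the graph holding Hugh's data.
GRAPH = "http://example.org/graph/hugh-1"

# The data itself, ingested into the named graph.
add("http://example.org/doc/1", "dcterms:title", "Some document", GRAPH)

# Provenance *about* the graph, kept in a separate provenance graph,
# so it is not lost when the data is ingested.
add(GRAPH, "dcterms:creator", "http://example.org/people/hugh",
    "urn:x-local:provenance")

def creator_of_graph(graph):
    """Recover who asserted a graph from its provenance statements."""
    return [o for (s, p, o, g) in quads if s == graph and p == "dcterms:creator"]

print(creator_of_graph(GRAPH))
```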

[CODE4LIB] Job: Associate University Librarian for Library Information Technology, University of Michigan at University of Michigan

2013-11-06 Thread jobs
Associate University Librarian for Library Information Technology, University 
of Michigan
University of Michigan
Ann Arbor

The **University of Michigan Library** is transforming the way libraries
organize, preserve, and share access to knowledge in service of the mission of
one of the world's leading research universities. We seek a forward-thinking,
collaborative, mission-driven, and innovative Associate University Librarian
(AUL) to join the library's leadership team, reporting to the Dean of
Libraries.

  
**Associate University Librarian for Library Information Technology (LIT)**  
The AUL for LIT will lead the development of information technology in support
of the university's current and emerging research needs, and the advancement
of scholarly literacy and instructional technologies. To direct the
development, management, and maintenance of a flexible and reliable technology
environment, the AUL for LIT will lead 60 talented staff members in six units:
Core Services, Digital Library Production Services, Learning Technology
Incubation Group, Library Systems, User Experience, and Web Systems. The AUL
for LIT must possess the technical and conceptual knowledge to represent the
library in broad conversations about IT, and advance the campus-wide
development of emerging instructional technologies as well as systems to
enable emerging research needs, including the management and preservation of
data.

  
We are searching for professionals with a deep understanding of the myriad and
changing roles of the library, who view publishing and information technology
as integral to our mission, and who can excel within the context of a world-
class research university. Because we are committed to diversity, we ask our
leaders to develop and nurture the individual and collective skills to
recognize, celebrate, and deploy difference as a path to engagement,
innovation, and the generation of new ideas. More information is available at:
[http://tinyurl.com/UMLib-AUL-LIT](http://tinyurl.com/UMLib-AUL-LIT). Submit
nominations or questions to: aulsea...@umich.edu.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/10610/


[CODE4LIB] catqc / marclib

2013-11-06 Thread Jay, Michael
I posted our shelf-ready record analyzer and a small C library (on which it 
depends) on sourceforge.

If someone could build and test the utility in a non-Windows environment, I 
would greatly appreciate it. 

If anyone is interested in using it or has any questions let me know. 

https://sourceforge.net/projects/marclib
https://sourceforge.net/projects/catqc

mj

Michael Jay, Library IT
Suite 1250
2046 Waldo Road
Gainesville, FL 32609

352.273.2678
em...@ufl.edu


[CODE4LIB] How to generate a Word document which displays full text links in the output

2013-11-06 Thread Paul James Albert
For those of you who do literature searches for patrons, here is a custom 
EndNote style that can generate a Word document which displays full text links 
in the output.
https://dl.dropboxusercontent.com/u/2014679/customlinktodoi.ens

To make this work, customize the style so that it follows your local 
institution's OpenURL syntax, and, of course, be sure to get bibliographic 
records from authoritative sources like MEDLINE or Web of Knowledge. (Those are 
the only two I've tried this out on so far.)
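[Editor's note: for readers unfamiliar with OpenURL, the "local syntax" being customized is just a key/value query string appended to an institutional resolver. A sketch, with the resolver base URL made up and the DOI purely illustrative:]

```python
from urllib.parse import urlencode

# Sketch of an OpenURL 1.0 link for an article identified by DOI.
# The resolver base URL is institution-specific (this one is invented);
# the key/value pairs are what a custom EndNote style has to emit.
resolver = "https://resolver.example.edu/openurl"
params = {
    "url_ver": "Z39.88-2004",            # OpenURL 1.0 version tag
    "rft_id": "info:doi/10.1000/xyz123",  # the referent's DOI
    "rft.genre": "article",
}
link = resolver + "?" + urlencode(params)
print(link)
```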

If anyone has ideas for improving this further, please let me know, and I'll 
update the file.

thanks,
Paul


Paul Albert
Project Manager, VIVO
Weill Cornell Medical Library
646.962.2551


[CODE4LIB] Canadian WordPress Hosting

2013-11-06 Thread Cynthia Ng
Hi Everyone,

Apologies for cross-posting, but code4lib is much more active, and has more
Canadians than I've seen elsewhere.

I was wondering if anyone had recommendations for a WordPress hosting
solution? And yes, it needs to be in Canada. I can do most of my own
dev-type work, so really it just needs to be set up to run WordPress
(preferably with 1-click install), and most of all, reliable, hopefully
with good customer service for when we need to contact the company.

Okay, also preferable is that they do daily backups for us and have
excellent security (considering it's WordPress).

Too many hosting solutions include email and a bunch of other stuff, and I
need it only for WordPress and nothing else.

A name, plus at least 1-2 reasons on the recommendation would be great!

Thanks in advance,
Cynthia


[CODE4LIB] Citing source code in high-profile academic journals

2013-11-06 Thread Heather Claxton-Douglas
Hello,

I need some advice about referencing source code in an academic journal.  I 
rarely see it happen and I don’t know why.

Background:  
I’m building a website that connects academic researchers with software 
developers interested in helping scientists write code.  My goal is for these 
researchers to be able to reference any  new source code in the articles they 
publish -- much like a “gene accession number” or a “PDB code”.

Unfortunately, I don’t see any code repositories referenced in high profile 
journals like Science or PNAS.  I’m guessing it’s because the code in the 
repositories isn’t permanent and may be deleted anytime? Or perhaps a DOI needs 
to be assigned?

So my question to the group is:
What criteria are necessary for a code repository or database to be eligible for 
referencing in scientific academic journals?

Some ideas I have, based on looking at the Protein Data Bank and GenBank, are:
1) The entry is permanent -- we can’t delete articles once they’ve been 
published; the same is true for entries in the PDB and GenBank
2) The entry gives credit to all authors and contributors
3) The entry has a DOI 
4) The entry has a simple accession number -- a PDB ID is a four-character code,  
a GenBank accession number is six characters.

Is there anything I’m missing?  Any advice would be greatly appreciated.

Thank you
Heather Claxton-Douglas, PhD
www.sciencesolved.com

http://igg.me/at/ScienceSolved


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Wick, Ryan
Hi Kevin,

Thank you for the suggestions.

a) is done. (looks like someone already changed the links on the About page).

c) I'm torn on. I understand what you mean, but this list or IRC (or even 
Twitter) might be better. I don't know of a way to have a message go to all 
people with admin rights on Drupal.

Ryan Wick

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
Hawkins
Sent: Monday, November 04, 2013 8:31 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] more suggestions for code4lib.org

While we're making suggestions for improving the infrastructure of 
code4lib.org, here are some things I'd like to see improved:

a) Change the email link in the navbar (and in the text at 
http://code4lib.org/about ) from

https://listserv.nd.edu/cgi-bin/wa?SUBED1=CODE4LIB&A=1

to

https://listserv.nd.edu/cgi-bin/wa?A0=CODE4LIB

so that people can easily find the list archives and poke around recent 
messages before deciding whether to join.

b) Modify whatever code sends formatted job postings to this list so that it 
includes the location of the position.

c) Add a contact link so people have a clear place to go to report administrivia 
like point (a) above or broken links.  It might go to whichever users have 
admin privileges on the Drupal instance behind code4lib.org.

Thanks for your consideration,

Kevin


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Riley Childs
For C, directing people to the list would be best, but you could point the 
email to a Gmail box and set up forwarding rules.

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

 On Nov 6, 2013, at 8:05 PM, Wick, Ryan ryan.w...@oregonstate.edu wrote:
 
 Hi Kevin,
 
 Thank you for the suggestions.
 
 a) is done. (looks like someone already changed the links on the About page).
 
 c) I'm torn on. I understand what you mean, but this list or IRC (or even 
 Twitter) might be better. I don't know of a way to have a message go to all 
 people with admin rights on Drupal.
 
 Ryan Wick
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Kevin 
 Hawkins
 Sent: Monday, November 04, 2013 8:31 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] more suggestions for code4lib.org
 
 While we're making suggestions for improving the infrastructure of 
 code4lib.org, here are some things I'd like to see improved:
 
 a) Change the email link in the navbar (and in the text at 
 http://code4lib.org/about ) from
 
 https://listserv.nd.edu/cgi-bin/wa?SUBED1=CODE4LIB&A=1
 
 to
 
 https://listserv.nd.edu/cgi-bin/wa?A0=CODE4LIB
 
 so that people can easily find the list archives and poke around recent 
 messages before deciding whether to join.
 
 b) Modify whatever code sends formatted job postings to this list so that it 
 includes the location of the position.
 
 c) Add a contact link so people have a clear place to go to 
 report administrivia like point (a) above or broken links.  It might go to 
 whichever users have admin privileges on the Drupal instance behind 
 code4lib.org.
 
 Thanks for your consideration,
 
 Kevin


Re: [CODE4LIB] more suggestions for code4lib.org

2013-11-06 Thread Ed Summers
On Mon, Nov 4, 2013 at 11:31 PM, Kevin Hawkins
kevin.s.hawk...@ultraslavonic.info wrote:
 b) Modify whatever code sends formatted job postings to this list so that it
 includes the location of the position.

That would be shortimer, and I think it should be doing what you suggest now?


https://github.com/code4lib/shortimer/commit/acb57090d4842920c9f92c684810f3c618f0a21e

If not let me know, create a github issue, or send a pull request :-)

//Ed


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Cary Gordon
It sounds like we are willing to throw security under the bus for an edge case, 
although I am sure that I am missing some subtlety

Cary

On Nov 5, 2013, at 10:27 AM, Ross Singer rossfsin...@gmail.com wrote:

 On Tue, Nov 5, 2013 at 12:07 PM, William Denton w...@pobox.com wrote:
 
 
 (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
 and library supports HTTPS, doesn't it?)
 
 
 Birkin asked me this same question, and I realized I should clarify what I
 meant.  I was mostly referring to existing screen scrapers/existing web
 sites.  If you redirect every request from http to https, this will
 probably break things.  I think the Open Library example that Karen
 mentioned is a good case study.
 
 And it's pretty different for a library or tool to support HTTPS and a
 specific app to be expecting it.  If you follow the thread around that OL
 change, it appears there are issues with Java (as one example) arbitrarily
 consuming HTTPS (from what I understand, you need to have the cert
 locally?), but I don't know enough about it to say for certain.  I think
 there would also probably be potential issues around mashups (AJAX, for
 example), but seeing as code4lib.org doesn't support CORS, not really a
 current issue.  Does apply more generally to your question about library
 websites at large, though.
 
 Anyway, I agree with you that the option for both should be there.  I'm
 just not convinced that HTTPS-all-the-time is necessary for all web use
 cases.
 
 -Ross.


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Riley Childs
Why? HTTPS is used when there is sensitive data involved, code4lib.org (at 
least to my knowledge) does not have sensitive data?

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

 On Nov 6, 2013, at 8:28 PM, Cary Gordon listu...@chillco.com wrote:
 
 It sounds like we are willing to throw security under the bus for an edge 
 case, although I am sure that I am missing some subtlety
 
 Cary
 
 On Nov 5, 2013, at 10:27 AM, Ross Singer rossfsin...@gmail.com wrote:
 
 On Tue, Nov 5, 2013 at 12:07 PM, William Denton w...@pobox.com wrote:
 
 
 (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
 and library supports HTTPS, doesn't it?)
 
 
 Birkin asked me this same question, and I realized I should clarify what I
 meant.  I was mostly referring to existing screen scrapers/existing web
 sites.  If you redirect every request from http to https, this will
 probably break things.  I think the Open Library example that Karen
 mentioned is a good case study.
 
 And it's pretty different for a library or tool to support HTTPS and a
 specific app to be expecting it.  If you follow the thread around that OL
 change, it appears there are issues with Java (as one example) arbitrarily
 consuming HTTPS (from what I understand, you need to have the cert
 locally?), but I don't know enough about it to say for certain.  I think
 there would also probably be potential issues around mashups (AJAX, for
 example), but seeing as code4lib.org doesn't support CORS, not really a
 current issue.  Does apply more generally to your question about library
 websites at large, though.
 
 Anyway, I agree with you that the option for both should be there.  I'm
 just not convinced that HTTPS-all-the-time is necessary for all web use
 cases.
 
 -Ross.


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Riley Childs
SSL certs are expensive because of the administrative work associated with it. 

Riley Childs
Library Director and IT Admin
Junior
Charlotte United Christian Academy
P: 704-497-2086 (Anytime)
P: 704-537-0331 x101 (M-F 7:30am-3pm ET)

Sent from my iPhone 
Please excuse mistakes

 On Nov 6, 2013, at 8:28 PM, Cary Gordon listu...@chillco.com wrote:
 
 It sounds like we are willing to throw security under the bus for an edge 
 case, although I am sure that I am missing some subtlety
 
 Cary
 
 On Nov 5, 2013, at 10:27 AM, Ross Singer rossfsin...@gmail.com wrote:
 
 On Tue, Nov 5, 2013 at 12:07 PM, William Denton w...@pobox.com wrote:
 
 
 (Question:  Why does HTTPS complicate screen-scraping?  Every decent tool
 and library supports HTTPS, doesn't it?)
 
 Birkin asked me this same question, and I realized I should clarify what I
 meant.  I was mostly referring to existing screen scrapers/existing web
 sites.  If you redirect every request from http to https, this will
 probably break things.  I think the Open Library example that Karen
 mentioned is a good case study.
 
 And it's pretty different for a library or tool to support HTTPS and a
 specific app to be expecting it.  If you follow the thread around that OL
 change, it appears there are issues with Java (as one example) arbitrarily
 consuming HTTPS (from what I understand, you need to have the cert
 locally?), but I don't know enough about it to say for certain.  I think
 there would also probably be potential issues around mashups (AJAX, for
 example), but seeing as code4lib.org doesn't support CORS, not really a
 current issue.  Does apply more generally to your question about library
 websites at large, though.
 
 Anyway, I agree with you that the option for both should be there.  I'm
 just not convinced that HTTPS-all-the-time is necessary for all web use
 cases.
 
 -Ross.


Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Ross Singer
How is security getting thrown under the bus?

-Ross.

On Wednesday, November 6, 2013, Cary Gordon wrote:

 It sounds like we are willing to throw security under the bus for an edge
 case, although I am sure that I am missing some subtlety

 Cary

 On Nov 5, 2013, at 10:27 AM, Ross Singer rossfsin...@gmail.comjavascript:;
 wrote:

  On Tue, Nov 5, 2013 at 12:07 PM, William Denton 
  w...@pobox.comjavascript:;
 wrote:
 
 
  (Question:  Why does HTTPS complicate screen-scraping?  Every decent
 tool
  and library supports HTTPS, doesn't it?)
 
 
  Birkin asked me this same question, and I realized I should clarify what
 I
  meant.  I was mostly referring to existing screen scrapers/existing web
  sites.  If you redirect every request from http to https, this will
  probably break things.  I think the Open Library example that Karen
  mentioned is a good case study.
 
  And it's pretty different for a library or tool to support HTTPS and a
  specific app to be expecting it.  If you follow the thread around that OL
  change, it appears there are issues with Java (as one example)
 arbitrarily
  consuming HTTPS (from what I understand, you need to have the cert
  locally?), but I don't know enough about it to say for certain.  I think
  there would also probably be potential issues around mashups (AJAX, for
  example), but seeing as code4lib.org doesn't support CORS, not really a
  current issue.  Does apply more generally to your question about library
  websites at large, though.
 
  Anyway, I agree with you that the option for both should be there.  I'm
  just not convinced that HTTPS-all-the-time is necessary for all web use
  cases.
 
  -Ross.



Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Ross Singer
I guess I just don't see why http and https can't coexist.

-Ross.
On Nov 6, 2013 9:39 PM, Cary Gordon listu...@chillco.com wrote:

 This conversation is heading into the "draining the swamp" category.

 Bill Denton started this thread with the suggestion that we use HTTPS
 everywhere. He did not make a specific case for it. I am just guessing that
 an argument for going that route would include security.

 Regardless of whether this is a good idea, or whether there is a
 compelling reason for doing it, it seems to me that the possibility of its
 making it difficult for older scraping tools to scrape the site does not
 seem like a compelling reason not to do it.

 The cost issue, on the other hand, would be a more compelling
 consideration.

 Thanks,

 Cary

 On Nov 6, 2013, at 6:17 PM, Ross Singer rossfsin...@gmail.com wrote:

  How is security getting thrown under the bus?
 
  -Ross.
 
  On Wednesday, November 6, 2013, Cary Gordon wrote:
 
  It sounds like we are willing to throw security under the bus for an
 edge
  case, although I am sure that I am missing some subtlety
 
  Cary
 
  On Nov 5, 2013, at 10:27 AM, Ross Singer rossfsin...@gmail.com
 javascript:;
  wrote:
 
  On Tue, Nov 5, 2013 at 12:07 PM, William Denton w...@pobox.com
 javascript:;
  wrote:
 
 
  (Question:  Why does HTTPS complicate screen-scraping?  Every decent
  tool
  and library supports HTTPS, doesn't it?)
 
 
  Birkin asked me this same question, and I realized I should clarify
 what
  I
  meant.  I was mostly referring to existing screen scrapers/existing web
  sites.  If you redirect every request from http to https, this will
  probably break things.  I think the Open Library example that Karen
  mentioned is a good case study.
 
  And it's pretty different for a library or tool to support HTTPS and a
  specific app to be expecting it.  If you follow the thread around that
 OL
  change, it appears there are issues with Java (as one example)
  arbitrarily
  consuming HTTPS (from what I understand, you need to have the cert
  locally?), but I don't know enough about it to say for certain.  I
 think
  there would also probably be potential issues around mashups (AJAX, for
  example), but seeing as code4lib.org doesn't support CORS, not really
 a
  current issue.  Does apply more generally to your question about
 library
  websites at large, though.
 
  Anyway, I agree with you that the option for both should be there.  I'm
  just not convinced that HTTPS-all-the-time is necessary for all web use
  cases.
 
  -Ross.
 



Re: [CODE4LIB] We should use HTTPS on code4lib.org

2013-11-06 Thread Chad Fennell
On Wed, Nov 6, 2013 at 8:49 PM, Ross Singer rossfsin...@gmail.com wrote:

 I guess I just don't see why http and https can't coexist.


They can definitely coexist, but there is a corresponding maintenance cost
and a slightly higher risk profile (e.g. session hijacking is still
possible in a variety of mixed http/https configurations). I noticed a
pretty good, if a bit dated, run-down of the tradeoffs for various secure
setups in Drupal
http://drupalscout.com/knowledge-base/drupal-and-ssl-multiple-recipes-possible-solutions-https.
Even if the solutions have somewhat changed, it does get at the idea of
what some of the tradeoffs are between security, usability and maintenance.

Just today, I noticed a security alert (https://drupal.org/node/2129381)
for the Drupal 6 Secure Pages module where theoretically secured pages and
forms could be transmitted in the clear. This is the module you'd most
likely use to achieve a mixed http/https site in Drupal.
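[Editor's note: outside Drupal's own handling, coexistence at the web-server level is just two listeners sharing a document root. A sketch in nginx, with the hostname, paths, and certificate locations all illustrative:]

```nginx
# Two server blocks, one per scheme, same docroot: http and https coexist
# with no forced redirect. All names and paths below are made up.
server {
    listen 80;
    server_name example.org;
    root /var/www/site;
}

server {
    listen 443 ssl;
    server_name example.org;
    root /var/www/site;
    ssl_certificate     /etc/ssl/certs/example.org.pem;
    ssl_certificate_key /etc/ssl/private/example.org.key;
}
```

The maintenance cost Chad describes shows up when the application behind these blocks has to decide, per page or per form, which scheme is required — the web server itself is the easy part.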

I have personally tended to just put everything behind https because of the
added work/modules/maintenance associated to running it along side of http
(in Drupal, specifically), but I am a lazy person with access to free certs
and fancier servers.

HTH
-- 
Chad Fennell
Web Developer
University of Minnesota Libraries
(612) 626-4186