Re: [CODE4LIB] looking for free hosting for html code

2015-05-22 Thread Joe Hourcle

On Fri, 22 May 2015, Sarles Patricia (18K500) wrote:

[trimmed]


I plan to teach coding to my 6th and 12th grade students next school year and 
our lab has a mixture of old (2008) and new Macs (2015) so I want to make all 
the Macs functional for writing code in an editor.

My next question is this:

I am familiar with free Web creation and hosting sites like Weebly, Wix, 
Google sites, Wikispaces, WordPress, and Blogger, but do you know of any 
free hosting sites that will allow you to plug in your own code, i.e. 
host your own HTML files?


If it's straight HTML, and doesn't need any sort of text pre-processing 
(SSI, ASP, JSP, PHP, ColdFusion, etc.), I think that you can use Google 
Drive.  This help page seems to suggest that's true:


https://support.google.com/drive/answer/2881970?hl=en

With all static files it might also be possible to lay things out so that 
you could serve it through github or similar.  (and teaching them about 
version control isn't a bad idea, either)
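If the students want to preview their pages locally before uploading them anywhere, Python's built-in web server is enough.  A minimal sketch (the `serve_static` helper name is mine, not part of any of the services mentioned):

```python
import http.server
import socketserver
import threading
from functools import partial

def serve_static(directory, port=0):
    """Serve a folder of static HTML on localhost; port=0 picks a free port.
    Returns the server object (its server_address[1] is the chosen port)."""
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    httpd = socketserver.TCPServer(("127.0.0.1", port), handler)
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    return httpd
```

Run it against the folder of HTML files and point a browser at the reported port.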


-Joe


Re: [CODE4LIB] free html editors

2015-05-16 Thread Joe Hourcle

On Sat, 16 May 2015, Nathan Rogers wrote:

If you do not need all the bells and whistles I would recommend 
TextWrangler. Free versions should still be available online and its 
bigger brother BBEdit is overkill for basic web editing.


Actually, the significant difference between TextWrangler and BBEdit is 
that BBEdit has a number of features that are specifically for web 
design, that don't exist in TextWrangler.


Looking at the copy of BBEdit 9.1 that I have installed, the majority 
of them are in the 'Markup' menu:


* Close current tag / Balance tags
* Check syntax
* Check links
* Check accessibility
* Cleaners for GoLive/PageMill/HomePage/DreamWeaver
* Convert to HTML / XHTML
* Menu items to insert tags (which then give what attributes are allowed)
* Menu item to insert CSS
* Preview in ... (gives a list of installed web browsers)

...

That said, TextWrangler is still a good free editor -- and I personally 
rarely ever use the insert tags/CSS items (as I've been writing HTML for 
... crap ... I feel old ... 20+ years).


But to say that BBEdit is overkill for web editing is just wrong -- the 
majority of the feature differences are *specifically* for web editing.


-Joe

(disclaimer: for a decade or so, I was a beta tester for BareBones.  I 
haven't been using the latest-and-greatest version in a while, as I prefer 
not to install newer versions of Mac OS X on my personal systems ... 
basically, since Apple decided to bring all of the iOS annoyances into the 
desktop.  As such, I can't install BBEdit 10 or 11 to see what the 
differences are in more recent versions)




-Original Message-
From: Sarles Patricia (18K500) psar...@schools.nyc.gov
Sent: ?5/?16/?2015 10:21 AM
To: CODE4LIB@LISTSERV.ND.EDU CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] free html editors

I just this minute subscribed to this list after reading Andromeda Yelton's 
column in American Libraries from yesterday with great interest since I would 
like to teach coding in my high school library next year.

I purchased Andy Harris' HTML5 and CSS3 All-in-One For Dummies for my summer 
reading and the free HTML editors he mentions in the book are either not really 
free or are not compatible with my lab's 2008 Macs.

Can anyone recommend a free HTML editor for older Macs?

Many thanks and happy to be on this list,
Patricia



Patricia Sarles, MA (Anthropology), MLS
Librarian
Jerome Parker Campus Library
100 Essex Drive
Staten Island, NY 10314
718-370-6900 x1322
psar...@schools.nyc.gov
http://jeromeparkercampus.libguides.com/home

You can tell whether a man is clever by his answers. You can tell whether a man 
is wise by his questions. - Naguib Mahfouz

As a general rule the most successful man in life is the man who has the best 
information. - Benjamin Disraeli



Re: [CODE4LIB] pdf and web publishing question

2015-04-29 Thread Joe Hourcle

On Wed, 29 Apr 2015, Sergio Letuche wrote:


Dear all,

we have a PDF, taken from a to-be-printed PDF, full of tables. The
text is split in two columns. How would you suggest we upload this PDF to
the web? We would like to keep the structure, and split each section taken
from the table of contents into a page, but also keep the format, and if
possible, serve the content both in an HTML view and in a PDF view, based
on the preference of the user.


The last time I spoke to someone from AAS about how they extracted their 
'Data Behind the Table' (aka 'DbT'), it was mostly dependent upon getting 
something from the author while it was still in a useful format.




The document is made with InDesign CS6, and I do not know which format I
could transform it into.


There are a few ways to do tables in InDesign, as it's page layout 
software.  If it's in a single table within a text block, and there's 
nothing strange within each cell, you should be able to just select the 
table, copy it, and paste it out into a text editor.  You'll get line 
returns between each row, and tabs between each cell.


If they've placed line returns within the cells, those will get pasted in 
the middle of the cell, which can really screw you up.


For cases like that, it's sometimes easiest to go through the file, and 
paste HTML elements at the beginning of each cell to mark table cells 
(<td>), so when you export, you have markers as to which are legitimate 
changes in cells, and which are line returns within the file.


I then do post-processing to add in the closing tags and the row markers.

If I were using BBEdit, I'd do:

Find:
\t<td>
Replace:
</td><td>

Find:
\r<td>
Replace:
</td></tr>\r<tr><td>

If you're doing it in some other editor that supports search/replace, you 
should be able to do something similar, but you might need to figure out 
how to specify tabs & line returns in your program.


... and then fix the initial & final lines.  (and maybe convert some of 
the <td>s into <th>s)
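For anyone who'd rather script the whole conversion than do it in an editor, here's the same recipe as a short Python sketch (the function name is mine):

```python
import re

def tsv_to_html_table(text):
    """Tab-delimited rows (CR, LF, or CRLF line endings) -> a bare HTML
    table -- the scripted equivalent of the editor search/replace above."""
    rows = re.split(r"\r\n|[\r\n]", text.strip())
    html = ["<table>"]
    for row in rows:
        html.append("<tr><td>" + "</td><td>".join(row.split("\t")) + "</td></tr>")
    html.append("</table>")
    return "\n".join(html)
```

It makes the same assumption as the search/replace: tabs separate cells, line returns separate rows.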


-Joe


ps.  after getting in trouble last week, I should mention that all
 statements are my own, and I don't represent NASA or any other
 organizations in this matter.


Re: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

2015-03-13 Thread Joe Hourcle

On Wed, 11 Mar 2015, davesgonechina wrote:


Hi John,

Good question - we're taking in XLS, CSV, JSON, XML, and on a bad day PDF
of varying file sizes, each requiring different transformation and audit
strategies, on both regular and irregular schedules. New batches often
feature schema changes requiring modification to ingest procedures, which
we're trying to automate as much as possible but obviously require a human
chaperone.

Mediawiki is our default choice at the moment, but then I would still be
looking for a good workflow management model for the structure of the wiki,
especially since in my experience wikis are often a graveyard for the best
intentions.



A few places that you might try asking this question again, to see if you 
can find a solution that better answers your question:



The American Society for Information Science & Technology's Research Data 
Access & Preservation group.  It has a lot of librarians & archivists in 
it, as well as people from various research disciplines:


http://mail.asis.org/mailman/listinfo/rdap
http://www.asis.org/rdap/

...

The Research Data Alliance has a number of groups that might be relevant. 
Here are a few that I suspect are the best fit:


Libraries for Research Data IG
https://rd-alliance.org/groups/libraries-research-data.html

Reproducibility IG
https://rd-alliance.org/groups/reproducibility-ig.html

Research Data Provenance IG
https://rd-alliance.org/groups/research-data-provenance.html

Data Citation WG
(as this fits into their 'dynamic data' problem)
https://rd-alliance.org/groups/data-citation-wg.html

('IG' is 'Interest Group', which are long-lived.  'WG' is 'Working Group' 
which are formed to solve a specific problem and then disband)


The group 'Publishing Data Workflows' might seem to be appropriate but 
it's actually 'Workflows for Publishing Data' not 'Publishing of Data 
Workflows' (which falls under 'Data Provenance' and 'Data Citation')


There was a presentation at the meeting earlier this week by Andreas 
Rauber, in the Data Citation group, on workflows that use git or SQL 
databases to track appends and modifications to CSV and similar ASCII 
files.
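I haven't seen Rauber's implementation, but the core append-vs-modify distinction can be illustrated in a few lines of Python (hashing each row, as one might before committing a CSV to git):

```python
import hashlib

def row_hashes(csv_text):
    """One SHA-256 per line; the line (row) is the unit of change."""
    return [hashlib.sha256(line.encode("utf-8")).hexdigest()
            for line in csv_text.splitlines()]

def classify_change(old_csv, new_csv):
    """'append-only' if the old rows survive as a prefix of the new file,
    otherwise some existing row was modified or removed."""
    old_h, new_h = row_hashes(old_csv), row_hashes(new_csv)
    if new_h[:len(old_h)] == old_h:
        return "append-only" if len(new_h) > len(old_h) else "unchanged"
    return "modified"
```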


...

Also, I would consider this to be on-topic for Stack Exchange's Open 
Data site (and I'm one of the moderators for the site):


http://opendata.stackexchange.com/

-Joe






On Tue, Mar 10, 2015 at 8:10 PM, Scancella, John j...@loc.gov wrote:


Dave,

How are you getting the metadata streams? Are they actual stream objects,
or files, or database dumps, etc?

As for the tools, I have used a number of the ones you listed below. I
personally prefer JIRA (and it is free for non-profits). If you are ok with
editing in wiki syntax I would recommend MediaWiki (it is what powers
Wikipedia). You could also take a look at continuous deployment
technologies like Virtual Machines (virtualbox), linux containers (docker),
and rapid deployment tools (ansible, salt). Of course if you are doing lots
of code changes you will want to test all of this continually (Jenkins).

John Scancella
Library of Congress, OSI

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
davesgonechina
Sent: Tuesday, March 10, 2015 6:05 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Data Lifecycle Tracking & Documentation Tools

Hi all,

One of my projects involves harvesting, cleaning and transforming steady
streams of metadata from numerous publishers. It's an infinite loop but
every cycle can be a little bit or significantly different. Many issue
tracking tools are designed for a linear progression that ends in
deployment, not a circular workflow, and I've not hit upon a tool or use
strategy that really fits.

The best illustration I've found so far of the type of workflow I'm
talking about is the DCC Curation Lifecycle Model:
http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf

Here are some things I've tried or thought about trying:

   - Git comments
   - Github Issues
   - MySQL comments
   - Bash script logs
   - JIRA
   - Trac
   - Trello
   - Wiki
   - Unfuddle
   - Redmine
   - Zendesk
   - Request Tracker
   - Basecamp
   - Asana

Thoughts?

Dave





Re: [CODE4LIB] Get It Services / Cart

2015-03-06 Thread Joe Hourcle

On Fri, 6 Mar 2015, Smith, Steelsen wrote:


Hi All,

I'm new to this list, so if there are any conventions I'm ignoring I'd 
appreciate someone letting me know.


I'm working on a project to allow requests that will go to multiple 
systems to be aggregated in a requesting interface. It would be 
implemented as an independent application, allow a shopping list of 
items to be added, and be able to perform some back end business logic 
(availability checking, metadata enrichment, etc.).


This seems like a very common use case, so I'm surprised that I've had 
trouble finding anyone who has published an application that works like 
this -- the closest I've found being Umlaut, which doesn't seem to support 
multiple simultaneous requests (although I couldn't get as far as a 
request in any sample system to be certain).  Is anyone on the list 
aware of such a project?



I'm aware of such a project.  And it's been the bane of my existence for 
5+ years.  I've actually asked my boss to fire me a few times so that I 
don't have to support it, as it's more like babysitting than anything 
else.


However, it's for science archives, not libraries, and only really 
supports objects that are stored in FITS (Flexible Image Transport 
System).


I cannot in good faith recommend that anyone use it.  I've even started up 
a mailing list for IT people in solar physics archives so that I can try 
to make sure that we fight against implementing it for any new scientific 
missions.


-Joe

ps. it's not an independent application ... it's the service that does 
the 'metadata enrichment', because they store all of the data without any 
metadata, so anyone not running their custom software can't actually make 
use of it ... and then I manage the system that does the aggregation, and 
someone else wrote the logic for availability checking (which seems to 
have decided to crap itself last month, shortly after the programmer who 
wrote it 5+ years ago moved on to another job).


pps.  if you're going to implement something like this, I'd recommend 
using Metalink for the 'shopping cart' sort of stuff, and handing off to 
some dedicated download manager.  For our community, an even better option 
would be BagIt with a fetch.txt file, but the client-side tool support 
just isn't out there.
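For reference, a fetch.txt in a BagIt bag is just one line per remote file: the URL, the length in bytes (or '-' if unknown), and the path the file should land at inside the bag.  Something like (the URLs below are made up):

```
https://example.org/archive/img_001.fits 1049760 data/img_001.fits
https://example.org/archive/img_002.fits - data/img_002.fits
```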


[CODE4LIB] Fwd: [CNI-ANNOUNCE] Call for Participation: Security and Privacy Agenda Workshop, March 3, 2015

2015-02-06 Thread Joe Hourcle
I saw 'hardening OAI-PMH', and thought this might be of interest to this group.

-Joe



Begin forwarded message:

 From: Clifford Lynch cni-annou...@cni.org
 Date: February 6, 2015 4:16:15 PM EST
 To: CNI-ANNOUNCE -- News from the Coalition cni-annou...@cni.org
 Subject: [CNI-ANNOUNCE] Call for Participation: Security and Privacy Agenda 
 Workshop, March 3, 2015
 Reply-To: CNI-ANNOUNCE -- News from the Coalition cni-annou...@cni.org
 
 On March 3 CNI is going to host a small workshop to develop a near term 
 agenda for work needed to improve security and privacy in systems related to 
 scholarly communication and access to scholarly information resources. The 
 focus will be largely technical, and will emphasize setting an agenda for 
 various groups to address needs and problems, rather than details of how to 
 solve specific problems. I've deliberately left the agenda scoped rather 
 broadly, and I want to look at everything from encouraging wider and more 
 routine use of HTTPS to hardening some popular protocols like OAI-PMH. 
 Technical identity management related issues are also in scope, as are some 
 discussions about appropriate levels of assurance.
 
 We'll meet in Washington DC from 10AM-3PM on Tuesday, March 3. CNI will 
 provide refreshments and lunch, but we will not cover travel expenses.
 
 This will be a small workshop, and we will do our best to balance for 
 different perspectives. If you are interested in attending, please send an 
 email to Joan Lippincott (j...@cni.org) with a brief summary of the 
 expertise and perspective you would bring to the meeting. Given that the 
 meeting is only about a month away, I'll send out a first batch of 
 acceptances by Feb 13, and after that respond to later applications as they 
 come in. I'll provide more detailed logistical information with acceptances.
 
 There will be a public report from the meeting, and for those who cannot 
 attend, suggestions and comments are welcome going into the meeting.
 
 If you have questions, please be in touch with me by email.
 
 Clifford Lynch
 Director, CNI
 cl...@cni.org
 
 


Re: [CODE4LIB] Plagiarism checker

2015-01-23 Thread Joe Hourcle
On Jan 23, 2015, at 9:44 AM, Mark A. Matienzo wrote:

 I believe Turnitin and SafeAssign both compare the text of submissions to
 against external sources (e.g., SafeAssign uses ABI/INFORM, among others).
 I am not certain if they compare submissions against each other.

My understanding of TurnItIn, at least initially, was that they
built their corpus on existing submissions.  

(they had some deals with universities back when they started up
to use their service for free or cheap, so that they could build
up their corpus).


 However, if you're looking for something along the lines of what Dre
 suggests, you could use ssdeep, which is an implementation of a piecewise
 hashing algorithm [0]. The issue with that is that you would have to assume
 that all students would probably be using the same file format.
 
 You could also use something like Tika to extract the text content from
 all the submissions, and then compare them against each other.

I'd agree on extracting the text.  MS Word used to store documents
as strings of edits, making it difficult to compare two
documents for similarity without parsing the format.

(I don't know if they still do this in .docx)
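For the 'extract the text, then compare' approach, the Python standard library will get you a crude first pass (the function names and the 0.8 threshold are my own choices):

```python
import difflib

def similarity(a, b):
    """Crude 0..1 similarity between two extracted-text documents."""
    return difflib.SequenceMatcher(None, a, b).ratio()

def suspicious_pairs(docs, threshold=0.8):
    """All pairs of document names whose texts meet the threshold.
    `docs` maps a name (e.g. student) to that document's extracted text."""
    names = sorted(docs)
    return [(x, y)
            for i, x in enumerate(names)
            for y in names[i + 1:]
            if similarity(docs[x], docs[y]) >= threshold]
```

SequenceMatcher is slow on long inputs, so for large classes you'd want something smarter (that's where piecewise hashes like ssdeep come in), but it's fine for one assignment's worth of submissions.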

-Joe


Re: [CODE4LIB] Lost thread - centrally hosted global navbar

2015-01-10 Thread Joe Hourcle
On Jan 10, 2015, at 8:37 PM, Jason Bengtson wrote:

 Do you have access to the server-side? Server side scripting languages (and
 the frameworks and CMSes built with them) have provisions for just this
 sort of thing. Include statements in PHP and cfinclude tags in coldfusion,
 for example. Every Content Management System I've used has had a provision
 to create reusable content that can be added to multiple pages as blocks or
 via shortcodes. If you can use server-side script I recommend it; that's
 really the cleaner way to do this sort of thing. Another option you could
 use that avoids something like iframes is to create a javascript file that
 dynamically creates the navbar in your pages. Just include the
 javascript file in any page you want the toolbar to appear in. That method
 adds some overhead to your pages, but it's perfectly workable if
 server-side script is out of reach.


The javascript trick works pretty well when you have people 
mirroring your site via wget (as they won't run the js, and
thus won't try to retrieve all of the images that are used 
to make the page pretty every time they run their mirror job).

You can see it in action at:

http://stereo-ssc.nascom.nasa.gov/data/ins_data/

The drawback is that some browsers have a bit of a flash
when they first hit the page.  It might be possible to
mitigate the problem by having the HTML set the background
to whatever color the background will be changed to, but I
don't quite have the flexibility to do that in my case, due
to how the page is being generated.

-Joe

ps.  It's been years since I've done ColdFusion, but I
remember there being a file that you could set that would
automatically get inserted into every page in that
directory, or in sub-directories.  I want to say it was
often used for authentication and such, but it might be
possible to use for this.  If nothing else, you could load
the header into a variable, and have the pages just print
the variable in the right location.


Re: [CODE4LIB] linked data and open access

2014-12-19 Thread Joe Hourcle
On Dec 19, 2014, at 9:48 AM, Eric Lease Morgan wrote:

 I don’t know about y’all, but it seems to me that things like linked data and 
 open access are larger trends in Europe than here in the United States. Is 
 there a larger commitment to sharing in Europe when compared to the United 
 States? If so, is this a factor based on the nonexistence of a national 
 library in the United States? Is this your perception too? —Eric Morgan


I can't comment on the linked data side of things so much, but in following all 
of the comments from the US's push for opening up access to federally funded 
research, I'd have to say that capitalism and protectionist attitudes from 
'publishers' seem to be a major factor in the fight against open access.

I've placed 'publishers' in quotes, because groups that I would've considered 
to have been 'scientific societies' submitted comments against the opening up 
of the research, and in the case of AGU, referred to themselves multiple times 
as a 'publisher' and never as a 'society'.[1]  I dropped my membership when I 
realized that.


Statements from the 2011 RFI from OSTP:

http://www.whitehouse.gov/administration/eop/ostp/library/publicaccess


Statements from the 2013 NAS meetings:

http://sites.nationalacademies.org/DBASSE/CurrentProjects/DBASSE_082378

(note that I made statements at the National Academies meeting on opening 
access to federally funded research data)



[1] 
http://www.whitehouse.gov/sites/default/files/microsites/ostp/scholarly-pubs-(%23065).pdf

-Joe



ps. I still haven't seen what any of the official policies are (last year's 
government shutdown delayed the white house response to their submissions, and 
I have no idea if they've finally publicized anything) ... but I hosted a 
session at the AGU last year, where we had representatives from NOAA, NASA and 
USGS speak about what they were doing, and the NASA policy seemed to be heavily 
influenced by the more senior scientists ... who were more likely to be editors 
of journals.  They haven't updated their 'Data & Information Policy' 
(http://science.nasa.gov/earth-science/earth-science-data/data-information-policy/)
 page in over three years.


Re: [CODE4LIB] linked data and open access

2014-12-19 Thread Joe Hourcle
On Dec 19, 2014, at 12:28 PM, Kyle Banerjee wrote:

 On Fri, Dec 19, 2014 at 7:57 AM, Joe Hourcle onei...@grace.nascom.nasa.gov
 wrote:
 
 
 I can't comment on the linked data side of things so much, but in
 following all of the comments from the US's push for opening up access to
 federally funded research, I'd have to say that capitalism and
 protectionist attitudes from 'publishers' seem to be a major factor in the
 fight against open access.
 
 
 That definitely doesn't help. But quite a few players own this problem.
 
 Pockets where there is a culture of openness can be found but at least in
 my neck of the woods, researchers as a group fear being scooped and face
 incentive structures that discourage openness. You get brownie points for
 driving your metrics up as well as being first and novel, not for investing
 huge amounts of time structuring your data so that everyone else can look
 great using what you created.

There's been a lot of discussion of this problem over the last ~5 years or
so.  The general consensus is that:

1. We need better ways for people to acknowledge data being re-used.

a. The need for standards for citation so that we can use 
   bibliometric tools to extract the relationships
b. The need for a citation specifically to the data, and not
   a proxy (eg, the first results or instrument papers), to show
   that maintaining the data is still important.
c. Shift the work of determining how to acknowledge the data
   from the re-user back to the distributor of the data.

2. We need standards to make it easier for researchers to re-use data.

Findability, accessibility of the file formats, documentation of
data, etc.

3. We need institutions to change their culture to acknowledge that 
   producing really good data is as important for the research ecosystem
   as writing papers.  This includes decisions regarding awarding grants,
   tenure & promotion, etc.


Much of this is covered by the Joint Declaration of Data Citation
Principles:

https://force11.org/datacitation

There are currently two sub-groups; one working on dissemination, to
make groups aware of the issues & the principles, and another (that I'm
on) working on issues of implementation.  We actually just submitted
something to PeerJ this week, on how to deal with 'machine actionable'
landing pages:

https://peerj.com/preprints/697/

(I've been pushing for one of the sections to be clarified, so feel
free to comment ... if enough other people agree w/ me, maybe I can
get my changes into the final paper)


 Libraries face their own challenges in this regard. Even if we ignore that
 many libraries and library organizations are pretty tight with what they
 consider their intellectual property, there is still the issue that most of
 us are also under pressure to demonstrate impact, originality, etc. As a
 practical matter, this means we are rewarded for contributing to churn,
 imposing branding, keeping things siloed and local, etc. so that we can
 generate metrics that show how relevant we are to those who pay our bills
 even if we could do much more good by contributing to community initiatives.

But ... one of the other things that libraries do is make stuff available
to the public.  So as most aren't dealing with data, getting that into
their IRs means that they've then got more stuff that they can serve
to possibly help push up their metrics.

(not that I think those metrics are good ... I'd rather *not* transfer
data that people aren't going to use, but the bean counters like those
graphs of data transfer going up ... we just don't mention that it's
groups in China attempting to mirror our entire holdings)



 With regards to our local data initiatives, we don't push the open data
 aspect because this has practically no traction with researchers. What does
 interest them is meeting funder and publisher requirements as well as being
 able to transport their own research from one environment to another so
 that they can use it. The takeaway from this is that leadership from the
 top does matter.

The current strategy is to push for the scientific societies to implement
policies requiring the data be opened if it's to be used as evidence in
a journal article.  There are some exceptions*, but the recommendations
so far are to still set up the landing page to make the data citable,
but instead of linking directly to the data, provide an explanation of
what the procedures are to request access.

Through this, we have the requirement be that if the researcher wants
to publish their paper ... they have to provide the data, too.

We've run into a few interesting snags, though.  For instance, some are
only requiring the data that directly supports the paper to be published;
this means that we have no way of knowing if they cherry-picked their
data and the larger collection might have evidence to refute their
findings.

The 'publishers' seem

Re: [CODE4LIB] looking for a good PHP table-manipulating class

2014-12-11 Thread Joe Hourcle
On Dec 11, 2014, at 4:32 PM, Ken Irwin wrote:

 Hi folks,
 
 I'm hoping to find a PHP class that is designed to display data in tables, 
 preferably able to do two things:
 1. Swap the x- and y-axis, so you could arbitrarily show the table with 
 y=Puppies, x=Kittens or y=Kittens,x=Puppies
 2. Display the table either using plain text columns or formatted html
 
 I feel confident that in a world of 7 billion people, someone must have 
 wanted this before.


There's much more work being done in javascript tables these days than in 
backend software.

Unfortunately, I've never found a good matrix to compare features between the 
various 'data table' or 'data grid' implementations.

I did start evaluating a lot of them a while back, but the problem is that 
you have to go through them all to figure out what the different features 
might be, and then go back through a second time to see which ones might 
implement those features.

The second problem is that some are implemented as part of a given JS framework 
(eg, ExtJS), while other toolkits might have a dozen different 'data table' 
implementations (eg, jQuery).

-Joe


ps.  and as this wasn't a feature that I was looking for, this wasn't something 
that I tracked when I did my analysis.  I was looking for things like scaling 
to a thousand rows w/ 20 columns, rearranging/hiding columns, etc.
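The original question was about PHP, but the axis-swapping part is small enough to sketch in a few lines of Python (purely illustrative; a PHP class would wrap the same idea):

```python
def transpose(table):
    """Swap the x- and y-axes of a rectangular list-of-lists table."""
    return [list(row) for row in zip(*table)]

def to_text(table):
    """Plain-text columns, padded to the widest cell in each column."""
    widths = [max(len(str(row[i])) for row in table)
              for i in range(len(table[0]))]
    return "\n".join("  ".join(str(c).ljust(w) for c, w in zip(row, widths))
                     for row in table)

def to_html(table):
    """The same table as (unstyled) HTML."""
    return ("<table>"
            + "".join("<tr>" + "".join("<td>%s</td>" % c for c in row) + "</tr>"
                      for row in table)
            + "</table>")
```

Calling `to_text(transpose(t))` versus `to_text(t)` gives the y=Puppies/x=Kittens flip from the original question.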


[CODE4LIB] Fwd: [Rdap] Call for Editors for IMLS-funded DataQ Project (Due 1/30/15)

2014-12-11 Thread Joe Hourcle
A few months ago, there was a discussion of trying to make a libraries 
site on Stack Exchange.

For those that were interested, this might be an interesting project to 
participate in, although their scope isn't necessarily all library questions.

-Joe


Begin forwarded message:

 From: Andrew Johnson andrew.m.john...@colorado.edu
 Date: December 11, 2014 5:34:37 PM EST
 To: r...@mail.asis.org r...@mail.asis.org
 Subject: [Rdap] Call for Editors for IMLS-funded DataQ Project (Due 1/30/15)
 Reply-To: Research Data, Access and Preservation r...@asis.org
 
 Call for Editors for the DataQ Project
 
 The University of Colorado Boulder Libraries, the Greater Western Library 
 Alliance, and the Great Plains Network are excited to announce that we have 
 received funding from the Institute of Museum and Library Services to develop 
 an online resource called DataQ, which will function as a collaborative 
 knowledge-base of research data questions and answers curated for and by the 
 library community. Library staff from any institution may submit questions on 
 research data topics to the DataQ website, where questions will then be both 
 crowd-sourced and reviewed by an Editorial Team of experts. Answers to these 
 questions, from both the community and the Editorial Team, will be posted to 
 the DataQ website and will include links to resources and tools, best 
 practices, and practical approaches to working with researchers to address 
 specific research data issues.
 
 We are currently seeking applications for our Editorial Team. If you are 
 interested in becoming a DataQ Editor, please fill out the application form 
 here by January 30, 2015: http://bit.ly/DataQApp.
 
 DataQ Editors will be responsible for helping to identify initial content, 
 providing expert feedback on questions from DataQ users, and developing 
 policies and procedures for answering questions. The Editorial Team will 
 participate in regular virtual meetings and attend one in-person meeting in 
 Kansas City, MO in late May. Each Editor will receive a $1000 stipend to help 
 cover travel costs and time contributed to the project.
 
 The initial term for each Editor will last until October 31, 2015 when the 
 grant period ends, but there may be opportunities to continue serving beyond 
 the life of the grant based on the outcome of the project.
 
 Additional opportunities to contribute to DataQ will be announced soon. For 
 all of the latest information about DataQ, please follow 
 @ResearchDataQ (http://twitter.com/researchdataq) on Twitter. Please send any 
 questions about DataQ to the project Co-PIs Andrew Johnson at 
 andrew.m.john...@colorado.edu and Megan Bresnahan at 
 megan.bresna...@colorado.edu.
 
 -
 Andrew Johnson
 Assistant Professor; Research Data Librarian
 University of Colorado Boulder Libraries
 Phone: 303-492-6102
 Website: https://data.colorado.edu/
 ORCID iD: -0002-7952-6536 (http://orcid.org/-0002-7952-6536)
 Impactstory Profile: https://impactstory.org/AndrewJohnson


Re: [CODE4LIB] Balancing security and privacy with EZproxy

2014-11-20 Thread Joe Hourcle
On Nov 19, 2014, at 11:47 PM, Dan Scott wrote:

 On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:
 
 There are a number of technical approaches that could be used to identify
 which accounts have been compromised.
 
 But it's easier to just make the problem go away by setting usage limits so
 EZP locks the account out after it downloads too much.
 
 
 But EZProxy still doesn't let you set limits based on the type of download.
 You therefore have two very blunt sledge hammers with UsageLimit:
 
 - # of downloads (-transfers)
 - # of megabytes downloaded (-MB)


[trimmed]

I'm not familiar with EZProxy, but if it's running on an OS that you have 
control of (and not some vendor locked appliance), you likely have other tools 
that you can use for rate limiting.

For instance, I have a CGI on a webserver that's horribly resource intensive 
and takes quite a while to run.  Most people wonder what's taking so long, and 
reload multiple times, thinking the process is stuck ... or they know what's 
going on, and will open up multiple instances in different tabs to reduce their 
wait.

So I have the following iptables rule:

-A INPUT -p tcp -m tcp --dport 80 --tcp-flags FIN,SYN,RST,ACK SYN -m 
connlimit --connlimit-above 5 --connlimit-mask 32 -j REJECT --reject-with 
tcp-reset

I can't remember if it starts blocking at the 5th connection, or only once 
they're above 5, but it keeps us from having one IP address with 20+ copies 
running at once.
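Spelled out with comments, that rule reads roughly as follows (a sketch of the same rule as above; this is a config fragment that requires root to apply, and option order may vary slightly between iptables versions):

```shell
# Reject new HTTP connections from any single IPv4 address that already
# has more than 5 concurrent connections.
#   --tcp-flags FIN,SYN,RST,ACK SYN   match only new (SYN-only) packets
#   --connlimit-above 5               per-source concurrent-connection cap
#   --connlimit-mask 32               /32 = count each address individually
#   --reject-with tcp-reset           actively reset instead of silently dropping
iptables -A INPUT -p tcp -m tcp --dport 80 \
  --tcp-flags FIN,SYN,RST,ACK SYN \
  -m connlimit --connlimit-above 5 --connlimit-mask 32 \
  -j REJECT --reject-with tcp-reset
```

The tcp-reset reject is a deliberate choice here: a plain DROP would leave the client's extra connections hanging until they time out, while a reset fails them immediately.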

...

And back from my days of managing directory servers -- brute forcing was a 
horrible problem with single sign-on.  We didn't have a good way to temporarily 
lock accounts for repeatedly failing passwords at the directory server (which 
would also cause a denial of service, as you could lock someone else) ... so it 
had to be up to each application to implement ... which of course, they didn't.

... so you'd have something like a webpage that required authentication that 
someone could brute force ... and then they'd also get access to a shell 
account and whatever else that person had authorization for.

-Joe


(and on that 'wow, I feel old' note ... it's been 10+ years since I've had to 
manage an LDAP server ... it's possible that they've gotten better about that 
issue since then)


Re: [CODE4LIB] Wednesday afternoon reverie

2014-11-06 Thread Joe Hourcle
On Nov 6, 2014, at 5:17 PM, Karen Coyle wrote:

 Cynthia, it's been a while but I wanted to give you feedback...
 
 Ranking on importance based on library ownership and/or circulation is 
 something that I've seen discussed but not implemented -- mainly due to the 
 difficulty of gathering the data from library systems. But it seems like an 
 obvious way to rank results, IMO.
 
 Too bad that one has to pay for BISAC headings. They tend to mirror the 
 headings in bookstores (and ebook stores) that people might be familiar with. 
 They capture fiction topics, especially, in a way that resonates with some 
 users (topics like 'Teen Paranormal Romance').


I believe that they were created specifically for bookstores.

The problem is that the publishers (likely with support of the authors) get to 
decide where stuff should be filed.

As I help manage the Friends' bookstore at my local library branch, I've seen 
'Creation Science' (on an 'E' book with archaeologists & dinosaur bones on the 
cover) and a few others that make me cringe.

-Joe

ps.  I haven't seen 'Teen Paranormal Romance' specifically as a heading 
(although yes, I've seen those books) ... I'm waiting for 'Amish Paranormal 
Romance' (although I don't know if 'Amish Romance' is an official BISAC 
heading).

pps.  The nature of the BISAC headings make them less useful for determining if 
a book's actually on the shelves.  It's fine for general browsing, but it 
reminds me of the filing system from Black Books (from 0:40 to ~1:45):

https://www.youtube.com/watch?v=RZVDr4r9HEw




 On 10/22/14 1:25 PM, Harper, Cynthia wrote:
 So I'm deleting all the Bisac subject headings (650_7|2bisacsh) from our 
 ebook records - they were deemed not to be useful, especially as it would 
 entail a for-fee indexing change to make them clickable.  But I'm thinking 
 if we someday have a discovery system, they'll be useful as a means for 
 broader-to-narrower term browsing that won't require translation to English, 
 as would call number ranges.
 
 As I watch the system slowly chunk through them, I think about how library 
 collections and catalogs facilitate jumping to the most specific subjects, 
 but browsing is something of an afterthought.
 
 What if we could set a ranking score for the importance of an item in 
 browsing, based on circulation data - authors ranked by the relative 
 circulation of all their works, same for series, latest edition of a 
 multi-edition work given higher ranking, etc.?  Then have a means to set the 
 threshold importance value you want to look at, and browse through these 
 general Bisac terms, or the classification?  Or have a facet for 
 importance threshold.  I see Bisac sometimes has a broadness/narrowness 
 facet (overview) - wonder how consistently that's applied, enough to be 
 useful?
 
 Guess those rankings would be very expensive in compute time.
 
 Well, back to the deletions.
 
 Cindy Harper
 Electronic Services and Serials Librarian
 Virginia Theological Seminary
 3737 Seminary Road
 Alexandria VA 22304
 703-461-1794
 char...@vts.edu
 
 -- 
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: +1-510-435-8234
 skype: kcoylenet/+1-510-984-3600


Re: [CODE4LIB] Past Conference T-Shirts?

2014-11-06 Thread Joe Hourcle
On Nov 6, 2014, at 8:11 PM, Josh Wilson wrote:

 The Code4Lib version is clearly of superior quality, design, and
 provenance, but I actually thought this was an internet thing of unknown
 origin? e.g.,
 
 http://www.cafepress.com/mf/17182533/metadata_tshirt
 http://www.redbubble.com/people/charlizeart/works/1280530-metadata?p=t-shirt
 
 Perhaps a case of multiple discovery.

I don't know when I first saw it, but I know variations of the Helvetica shirt 
were first:

http://welovetypography.com/post/10993/

-Joe


ps.  being a font snob ... the Cafe Press shirt just has horrible kerning
 between the 'T' and 'A'.  The Code4Lib one is better, but misses the
 little bevel on the 'T' to have it mate up tight to the 'A', and the
 kerning between 'A' and 'D' could be a bit tighter.  The Helvetica
 shirt I linked to clearly slid the letters around (as the 'T' has
 the beveled edge to mate up to the now-missing 'A'.)





 
 On Thu, Nov 6, 2014 at 5:37 PM, Goben, Abigail ago...@uic.edu wrote:
 
 Joshua Gomez did the original, correct? 
 http://wiki.code4lib.org/index.php/2013_t-shirt_design_proposals
 
 Thanks for working on this Riley! I know several people who will be very
 happy to be able to purchase it.
 
 
 On 11/6/2014 2:48 PM, Riley Childs wrote:
 
 Some one sent me the design, if you did it please let me know so I can
 give attribution.
 //Riley
 
 Sent from my Windows Phone
 
 --
 Riley Childs
 Senior
 Charlotte United Christian Academy
 Library Services Administrator
 IT Services Administrator
 (704) 537-0331x101
 (704) 497-2086
 rileychilds.net
 @rowdychildren
 I use Lync (select External Contact on any XMPP chat client)
 
 From: todd.d.robb...@gmail.com
 Sent: 11/6/2014 3:41 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Past Conference T-Shirts?
 
 Joshua,
 
 That is so gnarly!!!
 
 On Thu, Nov 6, 2014 at 1:13 PM, Riley Childs rchi...@cucawarriors.com
 wrote:
 
 Ok, will do, I didn't actually design it may take a little time while I
 dig though download folders from my backups, I will try and get to it
 next
 week
 //Riley
 
 
 --
 Riley Childs
 Senior
 Charlotte United Christian Academy
 IT Services Administrator
 Library Services Administrator
 https://rileychilds.net
 cell: +1 (704) 497-2086
 office: +1 (704) 537-0331x101
 twitter: @rowdychildren
 Check out our new Online Library Catalog: https://catalog.cucawarriors.com
 
 Proudly sent in plain text
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Jason Stirnaman
 Sent: Thursday, November 06, 2014 2:46 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Past Conference T-Shirts?
 
 Riley,
 Could you fix the spelling on More then just books in the store? Should
 be More than just books
 
 Thanks,
 Jason
 
 Jason Stirnaman
 Lead, Library Technology Services
 University of Kansas Medical Center
 jstirna...@kumc.edu
 913-588-7319
 
 On Nov 6, 2014, at 1:04 PM, Riley Childs rchi...@cucawarriors.com
 wrote:
 
 Yes, but I have been unsuccessful thus far in getting a vector file/high
 
 res transparent image.
 
 If you have one and can send please do so and I will put it up on the
 
 code4lib store (code4lib.spreadshirt.com).
 
 Thanks
 //Riley
 
 
 --
 Riley Childs
 Senior
 Charlotte United Christian Academy
 IT Services Administrator
 Library Services Administrator
 https://rileychilds.net
 cell: +1 (704) 497-2086
 office: +1 (704) 537-0331x101
 twitter: @rowdychildren
 Checkout our new Online Library Catalog:
 
 https://catalog.cucawarriors.com
 
 Proudly sent in plain text
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 
 Goben, Abigail
 
 Sent: Thursday, November 06, 2014 1:10 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Past Conference T-Shirts?
 
 My Metadata t-shirt, from C4L 2013, has been getting some
 
 interest/requests of where others can purchase. I thought we'd talked
 about
 that here.  Was there a store ever finally set up that I could refer
 people
 to?
 
 Abigail
 
 --
 Abigail Goben, MLS
 Assistant Information Services Librarian and Assistant Professor Library
 
 of the Health Sciences University of Illinois at Chicago
 
 1750 W. Polk (MC 763)
 Chicago, IL 60612
 ago...@uic.edu
 
 
 
 --
 Tod Robbins
 Digital Asset Manager, MLIS
 todrobbins.com | @todrobbins http://www.twitter.com/#!/todrobbins
 
 
 --
 Abigail Goben, MLS
 Assistant Information Services Librarian and Assistant Professor
 Library of the Health Sciences
 University of Illinois at Chicago
 1750 W. Polk (MC 763)
 Chicago, IL 60612
 ago...@uic.edu
 


Re: [CODE4LIB] Stack Overflow

2014-11-04 Thread Joe Hourcle
On Nov 4, 2014, at 9:12 AM, Schulkins, Joe wrote:

 Presumably I'm not alone in this, but I find Stack Overflow a valuable 
 resource for various bits of web development and I was wondering whether 
 anyone has given any thought about proposing a Library Technology site to 
 Stack Exchange's Area 51 (http://area51.stackexchange.com/)? Doing a search 
 of the proposals shows there was one for 'Libraries and Information Science' 
 but this closed 2 years ago as it didn't reach the required levels during the 
 beta phase.

Some history on the Stack Exchange site:

1. Before 'Stack Exchange 2.0', they used to let other sites pay them to host 
Q&A sites.  There had been a library-focused site on Unshelved:

http://www.unshelved.com/2010-7-15

2. We got *hundreds* of people from Unshelved Answers to sign up on Area 51 ... 
but they wouldn't start up the site unless enough people with high enough 
reputation on existing 'Stack Exchange 2.0' sites expressed interest, claiming 
that they needed sufficient people with knowledge of the system.  I tried 
lobbying for them to count people w/ experience from Unshelved Answers, but 
they wouldn't do it.

3. It took over a year for the 'Libraries' proposal to get enough support to be 
accepted; by then, I assume most library folks had moved on.

4. They then named the site 'Library and Information Science', not 'Libraries'.

http://discuss.area51.stackexchange.com/q/3846/5710

   After my complaining, they changed it to 'Libraries and Information 
Science', but there was still a major problem:

5. As if all of the rest wasn't bad enough, we then had a bunch of non-library 
people closing questions because there wasn't a single definite answer, which 
described a large number of the questions on Unshelved Answers ... and most of 
the 'example' questions were in that category as well:


https://web.archive.org/web/20120325030045/http://area51.stackexchange.com/proposals/12432/libraries-information-science



 The reason I think this might be useful is that instead of individual places 
 to go for help or raise questions (i.e. various mailing lists) there could be 
 a 'one-stop' shop approach from which we could get help with LMSs, discovery 
 layers, repository software etc. I appreciate though that certain vendors 
 aren't particularly open (yes, Innovative I'm looking at you here) and might 
 not like these things being discussed on an open forum.
 
 Does anybody else think this might be useful? Would such a forum be shot down 
 by all the vendors legalese wrapped up in their Terms and Conditions? Or are 
 you happy with the way you go about getting help?


I think that the Stack Exchange culture & policies make it a bad fit for our 
community.  I think that yes, there is a need for such a site, but that the 
issues with immediately closing questions without a clear answer are a *huge* 
problem.  If questions were easily answered, we'd have done the research and 
answered them ourselves (most of us have LIS degrees and know how to research 
things!).

You might also be able to get support from Unshelved again, and if we, the 
community, can put together a site, have them brand it as 'Unshelved Answers' 
again.

-Joe

ps.  I'm currently the moderator of OpenData.StackExchange.com; I was 
previously the moderator of Seasoned Advice (aka. cooking.stackexchange.com)

pps.  I also objected when they changed the name of the 'databases' proposal to 
'database administrators', which many of us felt narrowed the scope 
dramatically ( http://meta.dba.stackexchange.com/q/1/51 ; 
http://meta.dba.stackexchange.com/q/11/51 ).  I don't even bother with the site 
these days.


Re: [CODE4LIB] Stack Overflow

2014-11-04 Thread Joe Hourcle
On Nov 4, 2014, at 1:33 PM, Mark Pernotto wrote:

 I think all of this is really useful. I'd be lying if I said I didn't get a
 lot of great ideas and results from StackOverflow.
 
 However, I've been burned quite a bit as well - deprecated code, inaccurate
 results, or just the wrong answer gets accepted. There seems to be such a
 push to 'accept as answer' that no one gives a second thought to
 alternative solutions. Because one size doesn't fit all - I think we all
 know that.

I hate it when I answer something 15-20 min after someone posts a
question, and they flag it as the 'correct' answer.  Someone else
might have some better response. *

I made the mistake of accepting an answer without fully testing it:

http://dba.stackexchange.com/q/30/51

Notice how no one else gave an alternative, as it works ... but I just
added the comment that the performance was much, much worse than when
I started.



... and we run into issues where what might have once been the correct
answer no longer is (because there's a new, better alternative, or
because some tool's no longer available, or no longer recommended because
of a horrible security flaw).


 I guess I'm trying to advocate not to rely on this type of resource
 completely when resolving your coding challenges.  While it can certainly
 be a tremendous learning tool, keep an objective mind for what tool best
 fits your institution's purpose.

What I'd like to see is some place where we can have a summary
of recommended practices for various problems ... lots of people
can contribute, and it can be kept up-to-date.  Basically, a
crowd-sourced FAQ.  The problem is, you can't just set up a wiki
and expect people to contribute.

Say what you will about StackExchange's herd-mentality about the
'right type of questions'**, their system gets people to contribute.



* for the people who complain about the grubbing for reputation:
  it's gamification.  I just hate the people who can manage to pop
  out reasonable sounding responses 10 min after the question was
  asked that are clearly just internet research because I *know*
  the answer is wrong. ... one person on Seasoned Advice was gaming
  the system; if you started downvoting their answers, they'd
  just delete them, but they were getting almost all upvotes
  due to their 'early and plausible' strategy.

** Yet, I still have the 4th-rated question on Seasoned Advice
  for "Translating cooking terms between US / UK / AU / CA / NZ"
  ( http://cooking.stackexchange.com/q/784/67 ), simply because
  I got it in back when 'community wiki' was considered an option.
  Lots of other interesting questions have gotten closed as their
  community cracked down on 'em, though.  (eg, cookbook
  recommendations)


-Joe

ps.  Nothing frustrates me more than scouring the internet due to a
 problem you've run into ... and you *finally* find a 2 year old
 post on some forum that is the *exact* symptoms you have ... and
 you scroll through all of the replies of things you've already
 tried ... and get to the last post, from the person with the
 problem and they've posted 'nevermind, I fixed it'.


Re: [CODE4LIB] Terrible Drupal vulnerability

2014-10-31 Thread Joe Hourcle
On Oct 31, 2014, at 11:46 AM, Lin, Kun wrote:

 Hi Cary,
 
 I don't know from whom. But for the heartbeat vulnerability earlier this 
 year, they as well as some other big providers like Google and Amazon were 
 notified and patched before it was announced. 

If they have an employee who contributes to the project, it's possible that this
was discussed on development lists before it was sent down to user level mailing
lists.  

Odds are, there's also  some network of people who are willing to give things a
cursory review / beta test in a more controlled manner before they're officially
released (and might break thousands of websites).  It would make sense that
companies who derive a good deal of their profits in supporting software would
participate in those programs, as well.

I could see categorizing either of those as 'ahead of the *general* public',
which was Kun's assertion.

-Joe



 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cary 
 Gordon
 Sent: Friday, October 31, 2014 11:10 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Terrible Drupal vulnerability
 
 How do they receive vulnerability report ahead of general public? From whom?
 
 Cary
 
 On Friday, October 31, 2014, Lin, Kun l...@cua.edu wrote:
 
 If you are using drupal as main website, consider using Cloudflare Pro.
 It's just $20 a month and worth it. They'll help block most attacks. 
 And they usually receive vulnerability report ahead of general public.
 
 Kun
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU 
 javascript:;] On Behalf Of Cary Gordon
 Sent: Friday, October 31, 2014 9:59 AM
 To: CODE4LIB@LISTSERV.ND.EDU javascript:;
 Subject: Re: [CODE4LIB] Terrible Drupal vulnerability
 
 This is what I posted to the Drupal4Lib list:
 
 
 
 By now, you should have seen https://www.drupal.org/PSA-2014-003 and 
 heard about the Drupageddon exploits, and you may be wondering if 
 you were vulnerable or, if you were hit by this, how you can tell and 
 what you should do. Drupageddon affects Drupal 7, Drupal 8 and, if you 
 use the DBTNG module, Drupal 6.
 
 The general recommendation is that if you do not know or are unsure of 
 your server's security and you did not either update to Drupal 7.32 or 
 apply the patch within a few hours of the notice, you should assume 
 that your site (and server) was hacked and you should restore 
 everything to a backup from before October 15th or earlier. If you 
 manage your server and you have any doubts about your file security, 
 you should restore that to a pre-10/15 image as well, or do a reinstall of 
 your server software.
 
 I know this sounds drastic, and I know that not everyone will do that.
 There are some tests you can run on your server, but they can only 
 verify the hacks that have been identified.
 
 At MPOW, we enforce file security on our production servers. Our 
 deployments are scripted in our continuous integration system, and 
 only that system can write files outside of the temporal file directory (e.g.
 /sites/site-name/files). We also forbid executables in the temporal 
 file system. This prevents many exploits related to this issue.
 
 Of course, the attack itself is on the database, so even if the file 
 system is not compromised, the attacker could, for example, get admin 
 access to the site by creating an account, making it an admin, and 
 sending themselves a password. While they need a valid email address 
 to set the password, they would likely change that as soon as they were in.
 
 Some resources:
 https://www.drupal.org/PSA-2014-003
 
 https://www.acquia.com/blog/learning-hackers-week-after-drupal-sql-injection-announcement
 
 http://drupal.stackexchange.com/questions/133996/drupal-sa-core-2014-005-how-to-tell-if-my-server-sites-were-compromised
 
 I won't attempt to outline every audit technique here, but if you have 
 any questions, please ask them.
 
 The takeaway from this incident, is that while Drupal has a great 
 security team and community, it is incumbent upon site owners and 
 admins to pay attention. Most Drupal security issues are only 
 exploitable by privileged users, and admins need to be careful and 
 read every security notice. If a vulnerability is publicly exploitable, you 
 must take action immediately.
 
 Thanks,
 
 Cary
 
 On Thu, Oct 30, 2014 at 5:24 PM, Dan Scott deni...@gmail.com 
 javascript:; wrote:
 
 Via lwn.net, I came across https://www.drupal.org/PSA-2014-003 and 
 my heart
 sank:
 
 
 Automated attacks began compromising Drupal 7 websites that were not 
 patched or updated to Drupal 7.32 within hours of the announcement of 
 SA-CORE-2014-005 (Drupal core - SQL injection, 
 https://www.drupal.org/SA-CORE-2014-005). You should proceed under 
 the assumption that every Drupal 7 website was compromised unless 
 updated or patched 

Re: [CODE4LIB] Why learn Unix?

2014-10-28 Thread Joe Hourcle
On Oct 28, 2014, at 10:07 AM, Joshua Welker wrote:

 There are 2 reasons I have learned/am learning Linux:
 
 1. It is cheaper as a web hosting platform. Not substantially, but enough to
 make a difference. This is a big deal when you are a library with a
 barebones budget or an indie developer (I am both).  Note that if you are
 looking for enterprise-level support, the picture is quite different.
 
 1a. A less significant reason is that Linux is much less resource-intensive
 on computers and works well on old/underpowered computers and embedded
 systems. If you want to hack an Android device or Chromebook to expand its
 functionality, Linux is what you want. I am running Ubuntu on my Acer C720
 Chromebook using Crouton, and now it has all the functionality of a
 full-fledged laptop at $200.

When I worked for an ISP in the late 1990s, our two FreeBSD servers that
handled *everything* were 75MHz Pentiums that another company had discarded.
Our network admin bridged my apartment to his using a 386 w/ a picoBSD install
that booted from a 3.5 floppy (to drive a WaveLAN card, before the days of
802.11).  I think one of the P75s was running fark.com for a while before
they added all of the commenting functionality.

It's amazing just how much functionality you can get out of hardware
that people have discarded by putting a system on it that doesn't
have a lot of cruft.


 2. Many scripting languages and application servers were born in *nix and
 have struggled to port over to non-*nix platforms. For example, Python and
 Ruby both are a major pain to set up in Windows. Setting up a
 production-level Rails or Django server is stupidly overcomplicated in
 Windows to the point where it is probably easier just to use Linux. It's
 much easier to sudo apt-get install in Ubuntu than to spend hours tweaking
 environment variables and config files in Windows to achieve the same
 effect.

If you're going to run Python on Windows, it used to be easier to download
a full 'WAMP' build (Windows, Apache, MySQL, Perl/PHP/Python).  I don't
know what the current state of Python installers is ... except on
the Mac, where they're still a bit of a pain.

I have no idea on Ruby.


 I will go out on a limb here and say that *nix isn't inherently better than
 Windows except perhaps the fact that it is less resource-intensive (which
 doesn't apply to OSX, the most popular *nix variant). #1 and #2 above are
 really based on historical circumstances rather than any inherent
 superiority in Linux. Back when the popular scripting languages, database
 servers, and application servers were first developed in the 90s, Windows
 had  a very sucktastic security model and was generally not up to the task
 of running a server. Windows has cleaned up its act quite a bit, but the
 ship has sailed, at this point.
 
 If you compare Windows today to Linux today, they are on very equal footing
 in terms of server features. The only real advantage Linux has at this point
 is that the big distros like Ubuntu have a much more robust package
 ecosystem that makes it much easier to install common server-side
 applications through the command line. But when you look at actually using
 and managing the OS, Linux is at a clear disadvantage. And if you compare
 the two as desktop environments, Windows wins hands-down except for a very
 few niche use cases. I say this as someone who uses a Ubuntu laptop every
 day.

For managing OSes, I admit that I haven't played with Windows 8, but I'm
still in the FreeBSD camp for servers.  (and not what Apple's done to it)

Windows might have an advantage if you're doing Active Directory w/
group policies, but I've heard horror stories from my neighbor about his
co-worker, who decided to 'hide' the changes he made for individual people (eg,
blocking what websites they can get to), making it difficult for someone
else to go in and clear them out when he was too over-zealous.


 (Anyone who has read this far might be interested to know that Windows 10 is
 going to include an official MS-supported command line package management
 suite called OneGet that will build on the package ecosystem of the
 third-party Chocolatey suite.)

Very interesting.

-Joe


Re: [CODE4LIB] Subject: Re: Why learn Unix?

2014-10-28 Thread Joe Hourcle
On Oct 28, 2014, at 8:11 PM, Alex Berry wrote:

 And that is why alias rm='rm -I' was invented.


Do not *ever* set this to be a default for new users.

During my undergrad, I worked at the helpdesk for the group that managed the 
computer labs and the general-use unix & cms systems (not a content management 
system ... an IBM mainframe ... one of the last bitnet-to-internet gateways).

The engineering school set up a bunch of default aliases for their systems... 
including rm='rm -i'.

This meant that when people came to *our* servers ... they'd decide to 
interactively clean out their home directory by typing:

rm *

... and then wonder why it didn't prompt them.

... and then come to the computer lab to complain.  

... and then complain some more when we wouldn't immediately restore their 
files for them.  (our policy was technically disaster recovery only, but it was 
effectively disaster recovery, upper level management, or members of the 
faculty senate ... because restores from tape really, really sucked.)
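The underlying gotcha is easy to demonstrate: the prompting comes from a per-shell alias, not from rm itself, so any shell without the alias deletes silently. A minimal sketch (it only touches a scratch directory; `sh -c` stands in for "a system that never defined the alias"):

```shell
# Sketch: 'rm -i' protection lives in the shell's alias table, not in rm.
# A fresh shell has no aliases, so the glob deletes without prompting --
# exactly the surprise described above.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
sh -c "rm $tmp/*"        # no alias in this shell: no prompt, files removed
ls -A "$tmp" | wc -l     # prints 0
rmdir "$tmp"
```

This is why the better habit to teach is typing `rm -i` explicitly (or pausing before hitting return on any `rm *`), rather than trusting a safety net that only exists on one machine.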

-Joe


Re: [CODE4LIB] Why learn Unix?

2014-10-27 Thread Joe Hourcle
On Oct 27, 2014, at 12:38 PM, Bigwood, David wrote:

 Learning UNIX is fine. However, I do think learning SQL might be a better 
 investment. So many of our resources are in databases. Understanding 
 indexing, sorting and relevancy ranking of our databases is also crucial. 
 With linked data being all the rage knowing about sparql endpoints is 
 important.  The presentation of the information from databases under our 
 control  needs work. Is the information we present actionable or just strings?

Quite likely.  I wouldn't teach people SQL (and I've done plenty of pl/sql and 
t/sql programming) unless:

1. They had data they wanted to use that's already on an SQL server.
2. They had a (read-only) account on that server, so they could 
actually use it.

If they had to go about setting up a server (even if it's an installable 
application) and ingesting their data to be able to analyze, you can get 
frustrated before you even start to see any useful results.

If they have some scenario where they need multiple tables and joins, then 
sure, teach them SQL ... but over the years, I've had weeks of SQL-related 
training*, and I don't know that I'd want to make anyone go through all of that 
if they're just trying to do some simple reports that could be done in other 
ways.  I wouldn't even suggest teaching people about indexing until they've 
tried doing stuff in SQL and wondered why it's so slow.

Likewise, if there were some sort of non-SQL database for them to play with 
(even an LDAP server) that might have information of use to them, I'd teach 
them that first ... but I'd likely start w/ unix command line stuff (see below).


 Or maybe I just like those topics better and find the work being done there 
 fascinating?


Quite likely.  I still haven't found a good reason to wrap my head 
around sparql ... I guess in part because the stuff I'm dealing with isn't 
served as linked data.


...


On Oct 27, 2014, at 11:15 AM, Tod Olson wrote:

 There’s also something to be said for the Unix pipeline/filter model of 
 processing. That way of breaking down a task into small steps, wiring little 
 programs to filter the data for each step, building up the solution 
 iteratively, essentially a form of function composition. Immediately, you 
 can do a lot of powerful one-off or scripting tasks right from the command 
 line. More generally, it’s a very powerful model to have in your head, can 
 transform your thinking.


I 100% agree.

If I were to try to teach unix to a group, I'd come up with some scenarios
where command-line tools can actually help them, and show them how to automate
things that they'd have to do anyway (or tried to do, and gave up on).

For instance, if there's some sort of metric that they need, you can show
how simple `cut | sort | uniq | wc` can be used...

eg, if I have a 'common' or 'common+' webserver log file, I can get a quick
count of today's unique hosts via:

cut -d' ' -f1 /var/log/httpd/access_log-2014.10.27 | sort | uniq | wc -l

If I wanted to see the top 10 hosts hitting us:

cut -d' ' -f1 /var/log/httpd/access_log-2014.10.27 | sort | uniq -c | 
sort -rn | head -10

If you're lazy, and want to alias this so it doesn't have to hard-code today's 
date:

cut -d' ' -f1 `ls -1t /var/log/httpd/access_log* | head -1` | sort | 
uniq | wc -l

If your log files are rolled weekly, and you need to extract just today:
(note that it's easier if you're sure that something looking like today's date 
won't show up in requests)

cut -d' ' -f1,4 `ls -1t /var/log/httpd/access_log* | head -1` | grep 
`date '+%d/%b/%Y'` | cut -d' ' -f1 | sort | uniq | wc -l

If you just wanted a quick report of hits per day, and your log files aren't 
rolled and compressed:

cat `ls -1tr /var/log/httpd/access_log*` | cut -d\[ -f2 | cut -d: -f1 | 
uniq -c | more

(note that that last one isn't always clean ... the dates logged are when the 
request started, but they're logged when the script finishes, so sometimes 
you'll get something strange like:

12354 23/Oct/2014 
3 24/Oct/2014
1 23/Oct/2014
14593 24/Oct/2014

... but if you try to use `sort`, and you cross months, it'll sort 
alphabetically, not chronologically)

You could probably dedicate another full day to sed & awk, if you wanted ... or 
teach them enough perl to be dangerous.


-Joe


* I've taken all of the Oracle DBA classes back in the 8i days (normally 4 
weeks if taken as full-day classes), plus Oracle's data modeling and sql tuning 
classes (4-5 days each?)


Re: [CODE4LIB] CrossRef/DOI content-negotiation for metadata lookup?

2014-10-23 Thread Joe Hourcle
On Oct 23, 2014, at 11:19 AM, Jonathan Rochkind wrote:

 Hi, the DOI system supports some metadata lookup via HTTP content-negotiation.
 
 I found this blog post talking about CrossRef's support:
 
 http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html
 
 But I know DataCite supports it to some extent too.
 
 Does anyone know if there's overall registrar-agnostic documentation from DOI 
 for this service?

None that I'm aware of.  We've actually been discussing this issue in a 
breakout from 'Data Citation Implementors Group', and I think we're currently 
leaning towards not relying solely on content negotiation, but also using HTTP 
Link headers or HTML link elements to make it possible to discover the other 
formats in which the metadata may be available.

If you dig into the OAI-ORE documentation, they specifically mention one of the 
problems of using Content Negotiation is that you can't tell exactly what 
someone's asking for solely based on the Accept header ... do they want a 
resource map to the content, or just the metadata from the splash / landing 
page?
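To illustrate the Link-header idea: a response advertising alternate metadata 
formats could be picked apart with an ordinary pipeline.  (The header value 
below is invented for illustration; none of the registrars necessarily emit 
exactly this.)

```shell
# Hypothetical Link header advertising alternate metadata formats for a
# DOI landing page -- the URLs and rel/type values are made up.
link='<https://example.org/doi/meta.rdf>; rel="describedby"; type="application/rdf+xml", <https://example.org/doi/meta.bib>; rel="describedby"; type="application/x-bibtex"'

# One entry per line, then pull out "media-type URL" pairs:
echo "$link" | tr ',' '\n' | sed -E 's/^ *<([^>]*)>.*type="([^"]*)".*/\2 \1/'
# application/rdf+xml https://example.org/doi/meta.rdf
# application/x-bibtex https://example.org/doi/meta.bib
```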



 Or, if there's kept-updated documentation from CrossRef and/or DataCite on it?

It looks like the one for CrossRef is :

http://www.crosscite.org/cn/

... if you go to the documentation for DataCite, it still has 'Beta' in the 
title:

DataCite Content Resolver Beta
http://data.datacite.org/static/index.html  
(note : http://data.datacite.org/ redirects here, which is linked from 
https://www.datacite.org/services )


 From that blog post, it says rdf+xml, turtle, and atom+xml should all be 
 supported as response formats.
 
 But atom+xml seems to not be supported -- if I try the very example from that 
 blog post, I just get a 406 No Acceptable Resource Available.
 
 I am not sure if this is a bug, or if CrossRef at a later point than that 
 blog post decided not to support atom+xml. Anyone know how I'd find out, or 
 get more information?


The link I gave to CrossRef documentation has three e-mail addresses at the 
bottom, if you wanted to ask them if the documentation is still current*:

6 Getting Help
Please contact l...@crossref.org, t...@datacite.org or 
medrast...@cineca.it for support.


-Joe


* and this is why when I used to maintain documentation, every document had 
both 'last revised' and 'last reviewed' date on 'em, so you had a clue how 
likely they were to be out of date.


Re: [CODE4LIB] Citation Publication Tool

2014-10-22 Thread Joe Hourcle
On Oct 22, 2014, at 4:10 PM, Bigwood, David wrote:

 Any suggestions for publishing citations on the Web? We have a department 
 that has lots of publications with citations at the end of each. Keeping the 
 citations up-to-date is a chore.
 
 Many here use Endnotes, and I know that can publish to the Web. Any examples 
 I can view? Would Libguides be something to consider? Any other suggestions 
 for easily getting different groups of citations up in multiple places?
 
 Some examples of the pages involved:
 http://www.lpi.usra.edu/education/explore/LifeOnMars/resources/
 http://www.lpi.usra.edu/education/explore/solar_system/resources/
 http://www.lpi.usra.edu/education/explore/space_health/resources/


Based on the pages that you've linked to, I wouldn't call those 'citations'*

I've seen them called different things, depending on the reason for creating 
the lists, and the intended audience.

For instance, if they're lists of scholarly resources (books & journal 
articles, maybe presentations & theses) that make use of your group's data, 
then it's either an 'Observatory Bibliography' or 'Telescope Bibliography' 
depending on the scope, and sometimes just 'Publication List'.  Those are 
actually an easy case in our field, as The Astrophysics Data System indexes the 
main journals in our field, so you just need software that can look up metadata 
from bibcodes:


http://adsabs.harvard.edu/cgi-bin/nph-bib_query?bibcode=2004SPIE.5493..163H&data_type=BIBTEX&db_key=AST&nocookieset=1

'Publication Lists' are a little harder, as they often include Public Press 
coverage (ie, media intended for non-scientists such as newspaper / website / 
tv news / magazines), which ADS doesn't index.  (and you often want to grab a 
snapshot of them, in case it disappears)

For what you have ... although you have some links to formally published items, 
it looks to be more a set of links to various websites with more information on 
a topic.  I've heard them informally referred to as 'EPO Resource Pages' (EPO 
== Education & Public Outreach) or, if specifically for teachers, 'Educator 
Resource Pages'.  I've typically seen them organized first by intended age 
level, then 
by the type of resource.  (organizing how you have it is generally for 
bean-counting when it comes time for senior reviews).

...

As for software recommendations ... if you're already using a CMS, I'd look to 
see if it has any add-ons for managing either bibliographies or just lists of 
external links.  

If you're looking for stand alone software, I'd look for 'Reference Manager' or 
'Bibliography Manager' software that can generate HTML to post online.  There 
are some that allow you to manage everything online, but then you have to be 
worried about securing it** :

http://en.wikipedia.org/wiki/Comparison_of_reference_management_software

I'm not aware of any that have specifically been built for EPO purposes, but 
many of them have ways to add extra fields, so you could handle intended 
audience and your current classification that way.

-Joe

* There was actually an issue that came up during the work on the 'Joint 
Declaration of Data Citation Principles' that makes me believe that there are 
at least 6 different things that people may mean by 'citation', and yours would 
likely be a 7th.  See http://docs.virtualsolar.org/wiki/CitationVocabulary

** We had to drop the one we were using after a SQL injection, and my boss 
decided to ban all PHP on our network, so we rolled back to use 10+ year old 
software that had been written for another mission.


Re: [CODE4LIB] Linux distro for librarians

2014-10-19 Thread Joe Hourcle
On Oct 19, 2014, at 3:20 PM, Francis Kayiwa wrote:

[trimmed]

  I'm willing to bet it would be much less effort to fix this Ubuntu problem 
 dealing with the Ubuntu devs (I've found them reasonable to work with) than 
 trying to herd the cats around yet another debian fork


Another alternative would be to pick an existing OS, and make sure that all of 
the requisite packages are in their package manager.

-Joe

ps.  'OS for librarians' was never defined as being (1) for servers at 
libraries, (2) for librarian workstations, or (3) for public-use machines.  
Things that make a good client machine don't always make for a good server.  
And what makes a good personally managed desktop doesn't necessarily make it a 
good desktop when you're managing dozens or hundreds.  (take MacOSX ... 
replacing bits to make it 'easier' for users, but harder to manage remotely in 
bulk)


[CODE4LIB] Citation hackathon tomorrow at PLOS

2014-10-17 Thread Joe Hourcle
I was just looking at the PLOS website, and noticed they had a banner:

PLOS is hosting a hackathon on Saturday, October 18th, 2014 at our SF 
office.

So, if you're in the San Francisco area, and are interested in citations (the 
theme of the hackathon), and don't have plans for tomorrow, see their website 
for more details and to RSVP:

http://www.ploslabs.org/citation-hackathon

-Joe


Re: [CODE4LIB] Requesting a Little IE Assistance

2014-10-14 Thread Joe Hourcle
It sounds like the issue already has a solution, but ...



On Oct 13, 2014, at 10:13 PM, Matthew Sherman wrote:

 The DSpace angle also complicates things a bit
 as they do not have any built in CSS that I could edit for this purpose.  I
 am hoping they will be amenable to the suggestions to right click and open
 in notepad because txt files are darn preservation friendly and readable
 with almost anything since they are some of the simplest files in
 computing.  Thanks for the input folks.


I'm not a DSpace user, but my understanding is that it's not a stand-alone
webserver ... which means that you may still have ways to re-write what
gets served out of it.

For instance, if you're running Apache you can build an 'output filter'.

I've only done them via mod_perl, but some quick research points to 
mod_ext_filter to call any command as a filter: 
http://httpd.apache.org/docs/2.2/mod/mod_ext_filter.html

You'd then set up a 'smart filter' to trigger this when you
had a text/plain response and the UserAgent is IE ... but the syntax
is ... complex, to put it nicely:

http://httpd.apache.org/docs/2.2/mod/mod_filter.html

(I've never configured a smart filter myself, and searching for
useful examples isn't really panning out for me).
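For the archives, here's roughly what I'd expect such a configuration to look 
like — untested, so treat it as a sketch of the Apache 2.2 directives rather 
than a known-working config.  The filter name 'wraptext', the Location path, 
and the use of `fold` as the wrapping command are all my own invention:

```apache
# Untested sketch (Apache 2.2): define an external filter that wraps long
# lines, then only apply it when the response is text/plain and the client
# claims to be MSIE.  Assumes mod_ext_filter and mod_filter are loaded.
ExtFilterDefine wraptext mode=output intype=text/plain \
    cmd="/usr/bin/fold -s -w 80"

<Location /repository>
    FilterDeclare IEWRAP
    FilterProvider IEWRAP wraptext req=User-Agent $MSIE
    FilterChain IEWRAP
</Location>
```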

... but I thought I'd mention this as an option for anyone who
might have similar problems in the future, as it lets you mess
with images and other types of content, too.

-Joe


Re: [CODE4LIB] Requesting a Little IE Assistance

2014-10-13 Thread Joe Hourcle
On Oct 13, 2014, at 9:59 AM, Matthew Sherman wrote:

 For anyone who knows Internet Explorer, is there a way to tell it to use
 word wrap when it displays txt files?  This is an odd question but one of
 my supervisors exclusively uses IE and is going to try to force me to
 reupload hundreds of archived permissions e-mails as text files to a
 repository in a different, less preservable, file format if I cannot tell
 them how to turn on word wrap.  Yes it is as crazy as it sounds.  Any
 assistance is welcome.

If there's a way to do it, it likely wouldn't be something that 
you could send from the server.

Depending on the web server that you're using, you might be able to
use client detection, and then pass requests from IE through a CGI
(or similar) that does the line-wrapping ... or wraps it in HTML.

If you go the HTML route, you might be able to just put the whole
thing in a textarea element.


If you *do* have to modify all of the text files, as you specifically
mention that they're e-mails, I'd recommend looking at 'flowed'
formatting, which uses 79 character lines, but SP CRLF to mark
'soft' returns:

https://www.ietf.org/rfc/rfc2646.txt
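For the wrapping step itself, `fold -s` gets you most of the way there, since 
it breaks at blanks and leaves the blank at the end of the wrapped line — 
which is exactly the flowed 'soft return' marker.  (A full RFC 2646 
implementation also needs space-stuffing and quote-depth handling; this is 
just a sketch of the line-breaking part.)

```shell
# Wrap a long logical line at blanks, keeping the trailing space that
# format=flowed uses as its soft-return marker.  (Sketch only -- real
# flowed text also needs space-stuffing of lines starting with ' ' or '>'.)
printf 'This is one very long logical line that should be wrapped with soft returns.\n' \
    | fold -s -w 30
```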

You could also try setting the Content-Type to 'text/plain; format=flowed'
and see if IE will handle it from there.  (I'd test myself, but
I don't have IE to test with)

-Joe


Re: [CODE4LIB] Requesting a Little IE Assistance

2014-10-13 Thread Joe Hourcle
On Oct 13, 2014, at 5:15 PM, Kyle Banerjee wrote:

 You could encode it quotable-printable or mess with content disposition
 http headers.

Oh, please not quoted-printable.  That's=
the one that makes you think that something=
is wrong with your mail client because=
there are strange equals signs (=3D) all=
over the place.

-Joe


Re: [CODE4LIB] Informal survey regarding library website liberty

2014-09-02 Thread Joe Hourcle
On Sep 2, 2014, at 11:39 AM, Brad Coffield wrote:

 Hi all,
 
 I would love to hear from people about what sort of setup they have
 regarding linkage/collaboration/constrictions/freedom regarding campus-wide
 IT practices and CMS usage and the library website.

[trimmed]


 I'm hoping that I can get some responses from you all that way I can
 informally say of x libraries that responded y of them are not firmly tied
 to IT. (or something to that effect) I'm also very curious to read
 responses because I'm sure they will be educational and help me to make our
 site better.
 
 THE QUESTION:
 
 What kind of setup does your library have regarding servers, IT dept
 collaboration, CMS restrictions, anything else? I imagine that there are
 many unique situations. Any input you're willing to provide will be very
 welcome and useful.


So, rather than answer the question directly (as I don't work for a library), 
I'll note that I worked in central IT for a university for ~7 years:

If you're going to consider using central IT for your infrastructure,
ask them what sort of service guarantees they're willing to provide.
This is typically called a 'Service Level Agreement', where they
spell out who's responsible for what, response times, acceptable
downtime / maintenance windows, etc.  It may include costs, but that
may be a separate document.

Typically, the hosted solutions are best when you've just got a few pages
that rarely get updated (once a year or so); if you're pulling info from a
database to display on a website, most shared solutions fall flat on their
face.  They might have a database where you could store stuff to make
data-driven web pages, but they rarely are flexible enough to interface
with some external server.

So, anyway ... it doesn't matter what other schools do if your IT dept.
can't provide the services you need.  If they *can* provide it, you need
to weigh costs vs. level of service ... the cost savings may not be
worth it if they regularly take the server down for maintenance at times
when you need it.

-Joe


Re: [CODE4LIB] Hiring strategy for a library programmer with tight budget - thoughts?

2014-08-15 Thread Joe Hourcle
On Aug 15, 2014, at 12:44 PM, Kim, Bohyun wrote:

 I am in a situation in which a university has a set salary guideline for 
 programmer position classifications and if I want to hire an entry-level dev, 
 the salary is too low to be competitive and if I want to hire a more 
 experienced dev in a higher classification, the competitive salary amount 
 exceeds what my library can afford. So as a compromise I am thinking about 
 going the route of posting a half-time position in a higher classification so 
 that the salary would be at least competitive. It will get full-time benefits 
 on a pro-rated basis. But I am wondering if this strategy would be viable or 
 not.
 
 Also anyone has a experience in hiring a developer to telework completely 
 from another state when you do not have previous experience working with 
 her/him? This seems a bit risky strategy to me but I am wondering if it may 
 attract more candidates particularly when the position is half time.
 
 As a current/past/future library programmer or hiring manager in IT or both, 
 if you have any thoughts, experience, or ideas, I would really appreciate it.


Salary's not the only factor when it comes to hiring ... convenience and work 
environment are a factor, too.

If I were you, I'd look to hire a half-time employee, and let them have 
flexible hours, so you could pick up a current student.  Offering them 
reduced tuition or parking (it matters at some campuses ... for College Park, 
just getting 'em in a lot that's closer to their classes) might make up for a 
less-competitive salary.

You should also check with the university's legal department, as you have a 
class of students who specifically *can't* work full time (foreigners on 
student visas), so you might be able to hire a grad student who would otherwise 
have problems getting hired.  Especially in the D.C. area, they have a hard time 
finding jobs (as so many companies are tied to the federal government, they 
don't want to hire non-US citizens).

...

As for the telework aspect -- it's a pain to get set up from nothing.  If you 
have someone that you're comfortable with and they move away, that's completely 
different from bringing in someone who doesn't have a vested relationship in 
the group.  At the very least, I'd recommend bring them in for an orientation 
period (2-8 weeks), where you can get a feel for their work ethic  such.

Most of the people on the project I'm on are remote ... but we keep an IM group 
chat window up all the time, and we have meetings 1-3 times per year where we 
all get together for a week to hash out various issues and keep the 
relationships strong.

-Joe


Re: [CODE4LIB] Hiring strategy for a library programmer with tight budget - thoughts?

2014-08-15 Thread Joe Hourcle
On Aug 15, 2014, at 2:49 PM, BWS Johnson wrote:

 Salvete!
 
 
 My first thought was a project-based contract, too. But there are few
 programmer projects that would require zero maintenance once finished. As
 someone who has had to pick up projects completed by others, there 
 are
 always bugs, gaps in documentation, and difficult upgrade paths.
 
 There could be follow up contracts for those problems, or they might be 
 less of a hassle for in house staff to handle than trying to do absolutely 
 errything from scratch.


That actually made me think of something -- 

I've worked in places where we've had issues with people brought in
as short-term contract developers.  The problem is ... the code was
crap.  As they didn't have to maintain it for the long run, they
wrote some really sloppy code.

I know of one group who brought someone in, they poo-pooed all of
the code, and insisted it had to be re-written (so they did ... in 
ksh ... without quoting anything ... and loading config files by
sourcing them)

... but of course, he was on an hourly contract, so he had a vested
interest in making more work for himself.  (and for me, as I was
then responsible for integrating their system w/ one that I maintain).

You also get cases where every change in the specs requires new
negotiation of payment.  (like the whole healthcare.gov thing)

...

so to sum up ... if you don't already have an established
relationship with the person, I'd avoid bringing in someone to
telework.

-Joe




 So I have no solutions to offer. Enticing people with telework is a good
 idea. It's disappointing to see libraries (and higher ed more generally)
 continuing to not invest in software development. We need developers. If we
 cannot find the money for them, perhaps we should re-evaluate our
 (budgetary?) priorities.
 
 
 
 Anytime I see things which I think more than one Library would like to 
 have I think Caw, innit that what a Consortium is for? One member alone 
 might not be able to afford a swank techie, but perhaps pooling resources 
 across Libraries would let you hire someone at an attractive salary for the 
 long haul while getting all of the members' projects knocked out. It would 
 also mean that you don't have to do any of those nasty follow up contracts 
 since the person that made it would still be about.


I'm pretty sure that there was someone on this list a few years back who made a 
comment that if every library contributed 10% of an FTE of funding, we could fund a 
lot of developers.


Re: [CODE4LIB] Dewey code

2014-08-11 Thread Joe Hourcle
On Aug 8, 2014, at 10:13 PM, Riley Childs wrote:

 Ok, so you want to access LC data to get Dewey decimal numbers? You need to 
 use a z39.50 client to pull the record, you can do it with marc edit but it 
 is labor intensive.  You would need to roll your own solution for this or use 
 classify.oclc.org to get book info (this doesn't give you API access). Your 
 best bet is classify.oclc.org.
 
 That aside:
 Honestly you might be better off running with something like Koha, writing a 
 home brew library system is no cake walk, trust me I know from 2 years of 
 experience trying to code one and ultimately moving to koha. Koha can be run 
 on a VPS (Digital Ocean is what i would use) or on an old PC in the corner. I 
 am in a situation similar to yours if you want to contact me off list I can 
 give you some advice.


I 100% agree -- you'd be better off going with something intended for personal 
libraries (eg Delicious Library) and give it a dedicated machine before trying 
to roll your own.

oss4lib hasn't been updated in a while, but Lyrasis is maintaining foss4lib.org 
as a catalog of free & open source library software, and has an 'ILS feature 
comparison tool' which lists feature differences between Koha and Evergreen:

http://ils.foss4lib.org/

-Joe


Re: [CODE4LIB] very large image display?

2014-07-26 Thread Joe Hourcle
On Jul 25, 2014, at 11:36 AM, Jonathan Rochkind wrote:

 Does anyone have a good solution to recommend for display of very large 
 images on the web?  I'm thinking of something that supports pan and scan, as 
 well as loading only certain tiles for the current view to avoid loading an 
 entire giant image.
 
 A URL to more info to learn about things would be another way of answering 
 this question, especially if it involves special server-side software.  I'm 
 not sure where to begin. Googling around I can't find any clearly good 
 solutions.
 
 Has anyone done this before and been happy with a solution?


If you store the images in JPEG2000, you can pull tiles or different 
resolutions out via JPIP (JPEG 2000 Interactive Protocol)

Unfortunately, most web browsers don't support JPIP directly, so you have to 
set up a proxy for it.

For an example, see Helioviewer:

http://helioviewer.org/

Documentation and links to their JPIP server are available at:

http://wiki.helioviewer.org/wiki/JPIP_Server

-Joe


Re: [CODE4LIB] Publishing large datasets

2014-07-23 Thread Joe Hourcle
On Jul 23, 2014, at 5:29 PM, Kyle Banerjee wrote:

 We've been facing increasing requests to help researchers publish datasets.
 There are many dimensions to this problem, but one of them is applying
 appropriate metadata and mounting them so they can be explored with a
 regular web browser or downloaded by expert users using specialized tools.
 
 Datasets often are large. One that we used for a pilot project contained
 well over 10,000 objects with a total size of about 1 TB. We've been asked
 to help with much larger and more complex datasets.
 
 The pilot was successful but our current process is neither scalable nor
 sustainable. We have some ideas on how to proceed, but we're mostly making
 things up. Are there methods/tools/etc you've found helpful? Also, where
 should we look for ideas? Thanks,


The tools I use are too customized for our field to be of much use to anyone 
else, so can't help on that part of the question.


I'd really recommend trying to reach out to someone working in data informatics 
in the field that the data is from, as they would have recommendations on 
specific metadata that should be captured.


For the general 'data publication' community, it's coalescing, but still a bit 
all over the place.  Here are some of the ones that I know about:

JISC has a 'Data Publication' mailing list:

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=DATA-PUBLICATION

ASIST runs a 'Research Data Access & Preservation' conference and 
mailing list:

http://www.asis.org/rdap/
http://mail.asis.org/mailman/listinfo/rdap

... and they put most of the presentations up on slideshare:

http://www.slideshare.net/asist_org/

The Research Data Alliance has two working groups on the topic, 
Publishing Services and Publishing Data Workflows:

https://rd-alliance.org/group/rdawds-publishing-services-wg.html

https://rd-alliance.org/group/rdawds-publishing-data-workflows-wg.html


I'm also one of the moderators of the Open Data site on Stack Exchange, which 
has some questions that might be relevant:

Let's suppose I have potentially interesting data. How to distribute?
http://opendata.stackexchange.com/q/768/263

Benefits of using CC0 over CC-BY for data
http://opendata.stackexchange.com/q/26/263

... or just ask a new question.


I'd also recommend that when you catalog your data, that you also consider 
adding DataCite metadata, so that we can try to make it easier for others to 
cite your data.   (specific implementation recommendations for data citation 
are still evolving, but general principles have been released; if you have 
questions, feel free to ask me, as I think we need to add some clarification to 
what we mean on some of the items).

http://www.datacite.org/
https://www.force11.org/datacitation


As I see it, you're dealing with data that's in the problem range -- if it were 
larger, the department collecting the data would have a system in place 
already; if it were smaller, it's easier to manage as a single item for deposit.


-Joe


Re: [CODE4LIB] net.fun

2014-07-14 Thread Joe Hourcle
On Jul 14, 2014, at 8:21 AM, Riley Childs wrote:

 My MOTDs are not as fun...
 
 RUN GET OUT OF HERE
 YOU ARE NOT WELCOME TODAY
 RESTRICTED ACCESS HERE.

I would expect that in the banner, not the motd:

$ more /etc/banner

This US Government computer is for authorized users only. By accessing  
this system you are consenting to complete monitoring with no  
expectation of privacy. Unauthorized access or use may subject you to  
disciplinary action and criminal prosecution.


The banner gets displayed before the login prompt, the motd gets displayed 
after ... there's also an assumption that the motd changes regularly, as it's 
'message of the day' ... although most people have it be completely random and 
just call fortune or never bother changing it.

-Joe


Re: [CODE4LIB] net.fun

2014-07-14 Thread Joe Hourcle
On Jul 14, 2014, at 10:44 AM, Cary Gordon wrote:

 I remember when system administrators would change the MOTD daily. The '80s
 were so pastoral.

0 0 * * * /bin/fortune > /etc/motd

or, for those running Vixie cron (which most people weren't in the 80s) :

@daily /bin/fortune > /etc/motd


... but then, everyone went the way of 'web portals' and the like, rather than 
assuming everyone was going to be (telnet|tn3270)ing into a (unix|cms) system 
so they could check their e-mail, nntp, gopher, etc.

-Joe

ps. is it disturbing that the talk of motd is making me nostalgic for ASCII art?





 On Monday, July 14, 2014, Joe Hourcle onei...@grace.nascom.nasa.gov wrote:
 
 On Jul 14, 2014, at 8:21 AM, Riley Childs wrote:
 
 My MOTDs are not as fun...
 
 RUN GET OUT OF HERE
 YOU ARE NOT WELCOME TODAY
 RESTRICTED ACCESS HERE.
 
 I would expect that in the banner, not the motd:
 
$ more /etc/banner
 
This US Government computer is for authorized users only. By
 accessing
this system you are consenting to complete monitoring with no
expectation of privacy. Unauthorized access or use may subject you
 to
disciplinary action and criminal prosecution.
 
 
 The banner gets displayed before the login prompt, the motd gets displayed
 after ... there's also an assumption that the motd changes regularly, as
 it's 'message of the day' ... although most people have it be completely
 random and just call fortune or never bother changing it.
 
 -Joe
 
 
 
 -- 
 Cary Gordon
 The Cherry Hill Company
 http://chillco.com


Re: [CODE4LIB] net.fun

2014-07-14 Thread Joe Hourcle
On Jul 14, 2014, at 11:56 AM, Riley Childs wrote:

 I know I might be a little young but code4lib needs a bbs

I can see it now ... someone re-writing TradeWars 2000 so you're an 
intergalactic bookmobile.

-Joe


Re: [CODE4LIB] net.fun

2014-07-14 Thread Joe Hourcle
On Jul 14, 2014, at 5:25 PM, Lisa Rabey wrote:

 The cause of the problem is:
 /dev/clue was linked to /dev/null
 
 Teehee.
 
 http://pages.cs.wisc.edu/~ballard/bofh/bofhserver.pl


It's difficult to use the excuse 'solar flares' when your boss is (1) a solar 
physicist and (2) reads BOFH.

http://bofh.ntk.net/BOFH//bastard06.php


 On Mon, Jul 14, 2014 at 1:02 PM, Kyle Banerjee kyle.baner...@gmail.com 
 wrote:
 The only problem is that some people might have difficulty obtaining audio
 modems that could be made to work with their cell phones...


So in um ... spring of 1995, I think it was ... we managed to get a car phone 
from Bell Atlantic (might've been Bell Atlantic-NYNEX at that point), and the 
phone had an RJ-11 jack on it.

... which of course meant that we had to see if we could hold up a modem 
connection while on the road.  Unfortunately, the best that we could manage to 
get was about 2400 baud for any extended periods.  We had our best transfer 
rates (9600 baud?) up near the NSA campus along the BW Parkway.

Mind you, this was in the days when modems topped out at 33.6k ('x2' and 
'kFlex' didn't come along 'til a year or two later, then finally v90) ... the 
modem banks we were dialing into might've only been 14.4k or 28.8k.

This was also in the days of analog cell service, as PCS didn't come out 'til 
even later ... once it did, the sysadmin for the ISP I worked at got cables so 
he could dial out from what today we'd call a 'netbook' (back then it was just 
a really tiny laptop ... those were also the days when you could keep a 
computer on your lap without it crushing you (the 'portable' aka 'luggable' 
era) or burning your crotch (the current 'notebook' era)).

... but I still think we could pull off 1200 baud w/ an acoustic coupler over a 
cell phone, which is about the bare minimum for MUDs in the mid 1990s, and 
would've been fine for BBSes, as long as you weren't dealing in warez.

-Joe

ps.  wow, this whole conversation is making me feel old ... doesn't help that I 
blew my back out last week, so I was already feeling old before the day started.


Re: [CODE4LIB] Why not Sharepoint?

2014-07-11 Thread Joe Hourcle
On Jul 11, 2014, at 10:33 AM, Thomas Kula wrote:

 On Fri, Jul 11, 2014 at 10:10:40AM -0400, Jacob Ratliff wrote:
 Hi Ned,
 
 The biggest case for SP is boiled down to 2 things in my mind.
 1) its terrible at preservation. If you are just using it as a digital
 asset mgmt system its fine, but if you need the preservation component go
 with something else.
 
 I've never used Sharepoint, but really it boils down to coming up with a
 list of requirements for a digital preservation storage system:
 
 - It must have an audit log of who did what to what when
 - It must do fixity checking of digital assets
   - At minimum, it must tell you when a fixity check fails
   - It really should be able to recover from fixity check
 failures when an object is read
   - Ideally it should discover these *before* an object is
 accessed, recover, and notify someone
 - It must support rich enough metadata for your objects
 - It must meet your preservation needs (N copies distributed over
   X distance within Y hours)
 - It must be scalable to handle anticipated future growth.
 
 I'm sure there are more, I haven't had much coffee yet this morning so
 I'm missing some. And honestly, you have to scale your requirements to
 what your specific needs are.
 
 *Only* then can you evaluate solutions. If you've got a list of
 requirements, you can then ask I need this. How well does SP (or any
 other possible solution) meet this need?

So that it doesn't look like you're just coming up with cases that
Sharepoint doesn't do, you might consider something like the
TRAC checklist:

2007 version, from CRL:
http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf
2011 update from CCSDS:
http://public.ccsds.org/publications/archive/652x0m1.pdf

The 2011 update should mirror what's in ISO 16363.

Most of the other certifications that I've seen look more at the 
organization, and don't have specific portions for technology.

-Joe


ps.  A quick search for 'SharePoint' and 'OAIS' led me to :

http://www.eprints.org/events/or2011/hargood.pdf

... which as best I can tell is the abstract for a poster at OR2011.


Re: [CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?

2014-06-20 Thread Joe Hourcle
On Jun 20, 2014, at 4:30 PM, Karen Coyle wrote:

 On 6/20/14, 11:38 AM, Richard Wallis wrote:
 In what ways does ISNI support linked data?
 
 See: http://www.isni.org/how-isni-works#HowItWorks_LinkedData
 
  "accessible by a persistent URI in the form 
 isni-url.oclc.nl/isni/000134596520 (for example) and soon also in the 
 form isni.org/isni/000134596520."
 
 Odd. I assume that whoever wrote that on their page just forgot the "http://" 
 part of those strings. Right?

People think I'm being pedantic when I bitch about the protocol
missing for printed materials (flyers, business cards, etc) ...
but in this case, it's a definite violation of RFC 3986:

  1.1.1.  Generic Syntax
 Each URI begins with a scheme name, as defined in Section 3.1, that
 refers to a specification for assigning identifiers within that
 scheme.  As such, the URI syntax is a federated and extensible naming
 system wherein each scheme's specification may further restrict the
 syntax and semantics of identifiers using that scheme.

Now, it's possible that this whole "we don't need to bother with
http://" thing has spilled into the CMS building community, and
they're actively stripping it out.  From their page, I think they're
using Drupal, but the horrible block of HTML that this was in is
blatantly MS Word's 'save as HTML' foulness:

  <h2><span lang=EN-US><a name="HowItWorks_LinkedData"></a>Linked 
Data</span></h2>
  <p class=MsoNormal><span lang=EN-US>Linked data is part of the ISNI-IA’s 
strategy to make ISNIs freely available and widely diffused.&nbsp; Each 
assigned ISNI is accessible by a persistent URI in the form 
isni-url.oclc.nl/isni/000134596520 (for example) &nbsp;and soon also in the 
form isni.org/isni/000134596520.&nbsp;</span></p>
  <p class=MsoNormal><span lang=EN-US>Coming soon:&nbsp; ISNI core metadata 
in RDF triples.&nbsp; The RDF triples will be embedded in the public web pages 
and the format will be available via the persistent URI and the SRU search 
API.</span></p>
  <p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

-Joe



 
 On 20 June 2014 18:57, Eric Lease Morgan emor...@nd.edu wrote:
 
 On Jun 20, 2014, at 10:56 AM, Richard Wallis 
 richard.wal...@dataliberate.com wrote:
 
          | authority control | simple identifier | Linked Data capability |
  --------+-------------------+-------------------+------------------------+
   VIAF   |         X         |         X         |           X            |
  --------+-------------------+-------------------+------------------------+
   ORCID  |                   |         X         |                        |
  --------+-------------------+-------------------+------------------------+
   ISNI   |         X         |         X         |           X            |
  --------+-------------------+-------------------+------------------------+
 
 Increasingly I like linked data, and consequently, here is clarification
 and a question. ORCID does support RDF, but only barely. It can output
 FOAF-like data, but not bibliographic. Moreover, it is experimental, at
 best:
 
   curl -L -H 'accept: application/rdf+xml'
 http://orcid.org/0000-0002-9952-7800
 
 In what ways does ISNI support linked data?
 
 ---
 Eric Morgan
 
 
 
 
 -- 
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet


Re: [CODE4LIB] College Question!

2014-05-29 Thread Joe Hourcle
On May 28, 2014, at 11:17 PM, Riley Childs wrote:

 I was curious about the type of degrees people had. I am heading off to 
 college next year (class of 2015) and am trying to figure out what to major 
 in. I want to be a systems librarian, but I can't tell what to major in! I 
 wanted to hear about what paths people took and how they ended up where they 
 are now.

What paths we took?   Well, I'm in the mood for procrastinating, so here goes.

...

Mine started well before college.

My dad got our family a computer (Apple IIe) when I was in 3rd or 4th grade ... 
so I learned Basic back in the days when you'd copy program listings from 
magazines.

In middle school, I learned Logo, and in 8th grade was an aide for the computer 
lab.  One summer I went to a two week camp, and learned Pascal, and the 
difference between Basic and Basica.  During this time, my mom worked for a 
computer company, and we upgraded to an Apple ][gs.

My high school was a 'science and tech' school.  I had 2.5 years of drafting, 2 
years of commercial graphics, and by senior year I was working as a TA in the 
computer lab, and had an independent study in the school's print shop.  Through 
this time, we upgraded to a Macintosh SE/30 and then a Macintosh IIci.

For summers in high school, I was working as an intern for an office of the 
Department of Defense (my dad was military), and I learned a few other OSes, 
including ALIS (a window manager for Sun UNIX boxes).  I was also calling into 
BBSes quite regularly (had started back in middle school w/ a 1200 baud modem).

In college, I had planned to work towards a degree in Architectural 
Engineering, but my dad taught at a school that didn't offer it ... so I 
started a degree in Civil Engineering.

After my freshman year, I started working in the university's academic 
computing center.  (They managed the computer labs & the general use UNIX & CMS 
machines).  I started off doing general helpdesk support, but by my junior year 
that whole 'world wide web' thing was getting popular.

As I had experience with computer programming, databases, desktop publishing, 
graphics, etc ... so I ended up splitting my time between the helpdesk, and the 
newly formed 'web development team' ... which was two of us (both students), 
working half time.  And I was getting to be a fairly fast typist from mudding.

After my sophomore year, Tim, the other member of our 'web development team' 
graduated, and went to work full time, while I was half time.  We grew to four 
people (3 half time, as we were full time students), and we did some cutting 
edge stuff to get all of the university's course information online (required 
parsing quark xpress files to generate HTML, parsing dumps from the 
university's course registration system, and generating HTML, etc) ... and so 
Tim got offered a job to go work for Harvard.

Through this time, I helped out on the university's solar car team, and got 
distracted and never got around to switching to a school for architecture.

I ended up taking over in managing the university's web server while they tried 
to find a new manager for our group.  (this was back when 'webmaster' meant 
'web server administrator' and not 'person who designs web pages')  I learned 
Perl, to go along with the various shell scripting that I had already learned.  
I picked up the 'UNIX System Administration Handbook' and learned from our 
group's sysadmins until I was trusted to manage that server.

While all of this was going on, as I had taken enough classes to be 1/2 a 
semester off from my classmates, I never realized that I was supposed to take 
the EIT (Engineer in Training test) ... so I was a bit screwed if I wanted to 
be an engineer.  After graduation, I went to resign, as I wanted to look for a 
full time job, but the director said that they were putting in for a new 
position for me.

By the middle of summer, my new manager told me that the director had told her 
that under no circumstances was she to hire me for the job that was being 
created.  He really didn't like guys with long hair.

... but through this time, I spent some of my savings to help one of the folks 
on the mud to start an ISP  (so they could host the mud which was getting 
kicked out of the university it was at).  I was working as their webmaster, 
remotely.  After all of this crap went down at my university, I got offered to 
do some contract work at that ISP, so I moved out to Kentucky.  The first 
contract fell through, but I kept doing various coding projects for them, did 
tech support (phone and still the days when we'd drive out to people's houses 
to set up their modems).  I learned mysql in the process.

The contracting side of our company merged with another contracting company, 
but then everything fell through ... and oddly I was the only employee that 
suddenly found themselves working for a different company.  Through this time, 
I did mostly web & database work ... the ISP that I worked for 

Re: [CODE4LIB] jobs digest for 2014-05-16

2014-05-16 Thread Joe Hourcle
On May 16, 2014, at 3:46 PM, Andreas Orphanides wrote:

 THIS IS SLIGHTLY DIFFERENT THAN WHAT WE DISCUSSED.

Agreed, but there's no need for shouting.

It looks to me like it's a change in the messages that 'jobs.code4lib.org' 
generates and sends to the list ... *not* the change that Eric made to the 
mailing list.

(I'm basing that on what a LISTSERV(tm) digest looks like, and the fact that 
it's archived this as a single message).

... and whoever made the change should at the very least put 'JOBS:' in the 
subject, so LISTSERV(tm) assigns it to the right topic for people to then 
ignore it.

-Joe




 
 On Fri, May 16, 2014 at 3:44 PM, j...@code4lib.org wrote:
 
 Library Electronic Resources Specialist
 Raritan Valley Community College
 Branchburg Township, New Jersey
 ColdFusion, EZproxy, JavaScript, Personal computer hardware
 http://jobs.code4lib.org/job/13115
 
 Digital Scholarship Specialist
 University of Oklahoma
 Norman, Oklahoma
 Digital humanities, University of Oklahoma
 http://jobs.code4lib.org/job/14593
 
 Research Data Consultant
 Virginia Polytechnic Institute and State University
 Blacksburg, Virginia
 Data curation, Data management, Digital library, Informatics
 http://jobs.code4lib.org/job/14591
 
 Systems Librarian
 Central Michigan University
 Mount Pleasant, Michigan
 CONTENTdm, Ex Libris, Innovative Interfaces, MARC standards, Proxy server,
 Resource Description and Access, SFX
 http://jobs.code4lib.org/job/14590
 
 To post a new job please visit http://jobs.code4lib.org/
 


Re: [CODE4LIB] separate list for jobs

2014-05-15 Thread Joe Hourcle

On Thu, 15 May 2014, Jodi Schneider wrote:


elm++


people still use elm?

I'm personally using the 'patterns-filters2' rule in alpine for managing 
my mailing lists.


I've considered switching to mutt, but I haven't used elm or its 
derivatives in over a decade.  (elm didn't have good MIME support, and I 
was getting tired of jumping through hoops for every attachment... 
although, it was *much* better than pine if you were connecting at 1200 
baud, as it didn't redraw the screen constantly)


-Joe




On Thu, May 15, 2014 at 6:09 PM, Eric Lease Morgan emor...@nd.edu wrote:


I have done my initial best to configure the mailing list to support a
jobs topic, and I've blogged about how you can turn off or turn on the jobs
listings. [1] From the blog:

  The Code4Lib community has also spawned job postings. Sometimes
  these job postings flood the mailing list, and while it is
  entirely possible use mail filters to exclude such postings,
  there is also more than one way to skin a cat. Since the
  mailing list uses the LISTSERV software, the mailing list has
  been configured to support the idea of topics, and through this
  feature a person can configure their subscription preferences to
  exclude job postings. Here's how. By default every subscriber to
  the mailing list will get all postings. If you want to turn off
  getting the jobs postings, then email the following command to
  lists...@listserv.nd.edu:

SET code4lib TOPICS: -JOBS

  If you want to turn on the jobs topic and receive the notices,
  then email the following command to lists...@listserv.nd.edu:

SET code4lib TOPICS: +JOBS

  Sorry, but if you subscribe to the mailing list in digest mode,
  then the topics command has no effect; you will get the job
  postings no matter what.

  Special thanks go to Jodi Schneider and Joe Hourcle who pointed
  me in the direction of this LISTSERV functionality. Thank you!

The LISTSERV topics feature is new to me, and I hope it works as
advertised. I think it will.

[1] blog posting - http://bit.ly/1nSCG2u

--
Eric Lease Morgan, Mailing List Owner





Re: [CODE4LIB] statistics for image sharing sites?

2014-05-14 Thread Joe Hourcle
On May 13, 2014, at 10:16 PM, Stuart Yeates wrote:

 On 05/14/2014 01:39 PM, Joe Hourcle wrote:
 On May 13, 2014, at 9:04 PM, Stuart Yeates wrote:
 
 We have been using google analytics since October 2008 and by and large 
 we're pretty happy with it.
 
 Recently I noticed that we're getting 100 hits a day from the 
 "Pinterest/0.1 +http://pinterest.com/" bot which I understand is a 
 reasonably reliable indicator of activity from that site. Much of this 
 activity is pure-jpeg, so there is no HTML and no opportunity to execute 
 javascript, so google analytics doesn't see it.
 
 pinterest.com is absent from our referrer logs.
 
 My main question is whether anyone has an easy tool to report on this kind 
 of use of our collections?
 
 Set your webserver logs to include user agent (I use 'combined' logs), then 
 use:
 
  grep Pinterest /path/to/access/logs
 
 You could also use any analytic tools that work directly off of your log 
 files.  It might not have all of the info that the javascript analytics 
 tools pull (window size, extensions installed, etc.), but it'll work for 
 anything, not just HTML files.
 
 When I visit http://www.pinterest.com/search/pins/?q=nzetc I see a whole lot 
 of our images, but absolutely zero traffic in my log files, because those 
 images are cached by pinterest.

You could also go the opposite route, and deny Pinterest your images, so they 
can't cache them.

You could either use robots.txt rules, or matching rules w/in Apache to deny 
their agents absolutely.

I have no idea if they'd then link straight to your images (so that you could 
get useful stats), or if they'd just not allow it to be used on their site at 
all.
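
If you do decide to block them, a hypothetical sketch of what that
could look like -- the agent string is the one from the logs above,
but the directory path is a placeholder, and the Apache directives
shown are the 2.2-style syntax (2.4 would use "Require not env=..."),
so test before deploying:

# robots.txt (only helps if their crawler honors it):
#   User-agent: Pinterest
#   Disallow: /

# Apache httpd fragment (mod_setenvif + 2.2-style access control):
SetEnvIfNoCase User-Agent "Pinterest" pinterest_bot
<Directory "/path/to/images">
    Order Allow,Deny
    Allow from all
    Deny from env=pinterest_bot
</Directory>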


-Joe


Re: [CODE4LIB] statistics for image sharing sites?

2014-05-13 Thread Joe Hourcle
On May 13, 2014, at 9:04 PM, Stuart Yeates wrote:

 We have been using google analytics since October 2008 and by and large we're 
 pretty happy with it.
 
 Recently I noticed that we're getting 100 hits a day from the "Pinterest/0.1 
 +http://pinterest.com/" bot which I understand is a reasonably reliable 
 indicator of activity from that site. Much of this activity is pure-jpeg, so 
 there is no HTML and no opportunity to execute javascript, so google 
 analytics doesn't see it.
 
 pinterest.com is absent from our referrer logs.
 
 My main question is whether anyone has an easy tool to report on this kind of 
 use of our collections?

Set your webserver logs to include user agent (I use 'combined' logs), then use:

grep Pinterest /path/to/access/logs

You could also use any analytic tools that work directly off of your log files. 
 It might not have all of the info that the javascript analytics tools pull 
(window size, extensions installed, etc.), but it'll work for anything, not 
just HTML files.
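If you want a per-day tally rather than raw grep output, something
like this works against a 'combined' format log.  (A sketch: the
sample log here is made up for illustration -- point LOG at your
real access log instead.)

```shell
# Tally Pinterest bot hits per day from an Apache 'combined' log.
# The sample log below is fabricated -- use your real log file.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
1.2.3.4 - - [13/May/2014:21:04:00 +0000] "GET /a.jpg HTTP/1.1" 200 10 "-" "Pinterest/0.1 +http://pinterest.com/"
1.2.3.4 - - [13/May/2014:21:05:00 +0000] "GET /b.jpg HTTP/1.1" 200 10 "-" "Pinterest/0.1 +http://pinterest.com/"
5.6.7.8 - - [14/May/2014:01:00:00 +0000] "GET /c.jpg HTTP/1.1" 200 10 "-" "Mozilla/5.0"
EOF

# Splitting fields on '[' / ']' puts the timestamp in $2;
# keep only the date part in front of the first ':'.
grep 'Pinterest' "$LOG" \
  | awk -F'[][]' '{ split($2, t, ":"); print t[1] }' \
  | sort | uniq -c
rm -f "$LOG"
```

The same pipeline works for any user agent you want to track, since
it reads the raw log rather than relying on javascript.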


 My secondary question is whether any httpd gurus have recipes for redirecting 
 by agent string from low quality images to high quality. So when AGENT = 
 "Pinterest/0.1 +http://pinterest.com/" and the URL matches a pattern redirect 
 to a different pattern. For example:
 
 http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg
 
 to
 
 http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg


Perfectly possible w/ Apache's mod_rewrite, but you didn't say what http server 
you're using.

If Apache, you'd do something like:

RewriteCond %{HTTP_USER_AGENT} ^Pinterest
RewriteRule (^/etexts/MakOldT/.*)\(.*\)\.jpg $1.jpg [L]

You might need to adjust the regex to match your URLs ... I just assumed the 
stuff in parens got stripped out of stuff in that directory.


Re: [CODE4LIB] separate list for jobs

2014-05-08 Thread Joe Hourcle
On May 8, 2014, at 11:35 AM, Ben Brumfield wrote:

 I suspect I'm not the only mostly-lurker who subscribes to CODE4LIB in digest 
 mode, finding value in a glance over the previous day's discussions each 
 morning, then (very) occasionally weighing in on individual threads via the 
 web interface.  I find this to be more effective and efficient than 
 filtering-and-foldering individual messages, at least for my goal of  having 
 some idea of the content of the conversations here, although--not being a 
 full-time library technologist--I'm really just skimming.
 
 I also suspect that I'm also not the only digest-mode subscriber who would 
 see value in a digest-mode option that excluded job postings.  


As this is an actual LISTSERV(tm) mailing list, it's possible for the list 
owner to define 'topics', and then for people to set up their subscription to 
exclude those they wish to ignore:


http://www.lsoft.com/manuals/16.0/htmlhelp/list%20owners/ModeratingEditingLists.html#2338132

I would suspect it would be honored even in digest mode, but I've never tried 
it.

-Joe


Re: [CODE4LIB] separate list for Jobs

2014-05-08 Thread Joe Hourcle
On May 8, 2014, at 3:54 PM, Coral Sheldon-Hess wrote:

 I have another, maybe minor, point to add to this: I've posted a job to
 Code4Lib, and I did it wrong. I have no idea how I'm supposed to make a job
 show up correctly, and now that I have realized I've done it wrong, I
 probably won't send another job to this list. (Or maybe I'll look it up in
 ... where? the wiki?)
 
 A second list would make this a lot clearer, I think.


So, from my 'knowing way too much about LISTSERV(tm) brand
mailing lists' background, having been the primary support person
at a university for a couple of years:

There's another feature for 'sub-lists', where you can set
up parent/child relationships between lists ... so you can
have a separate address to send to for job postings
specifically:


http://www.lsoft.com/manuals/16.0/htmlhelp/list%20owners/StartingMailingLists.html#2337469



I've never tried it, but it might be possible to set the
SUBJECTHDR on the sub-list so the parent list assigns a topic
for a given sub-list.

-Joe


Re: [CODE4LIB] separate list for discussing a separate list for jobs

2014-05-06 Thread Joe Hourcle
On May 6, 2014, at 12:34 PM, Dan Chudnov wrote:

 Is it time to reconsider:  should we start a separate list for Job: 
 postings?  code4lib-jobs, perhaps?

I think the real question here is if we should have a separate list for 
discussing if we need a separate list for jobs.  I propose 
'code4lib-jobs-list-discuss'.

-Joe


Re: [CODE4LIB] CD auto-loader machine and/or services to rip CD's to disk

2014-04-30 Thread Joe Hourcle
On Apr 30, 2014, at 11:31 AM, Derek Merleaux wrote:

 I have few thousand CD's and DVD's of images scanned back in the days of
 more expensive server storage. I want the files on these transferred to a
 hard-drive or cloud storage where I can get at the them and sort out the
 keepers etc.
 I have seen a lot of great home-built auto-loader machines, but sadly do
 not have time/energy right now to build my own. Looking for recommendations
 for machines and/or for a reliable service who will take my discs and put
 them a server.


Summer interns.

Well, I guess it depends on just how many thousands it is.

I'm actually surprised that there aren't any groups renting these
sorts of things out -- most efforts like this (or film scanning,
book scanning, etc), are generally an effort that might run for
a year or two, and the gear isn't needed anymore.*

You'd think there'd be a market for folks to share the costs...
find three groups looking to do the scanning, share the up-front
costs and then pass it from place to place.

I think that IMLS has given grants for these sorts of efforts...
but if they could help match up equipment to groups that needed
it, they might be able to get better results for each dollar
spent.

-Joe

* Unless some item isn't discovered 'til later.


Re: [CODE4LIB] CFP: A Librarian's Introduction to Programming Languages

2014-03-27 Thread Joe Hourcle
On Mar 26, 2014, at 9:32 AM, Simon Spero wrote:

 I would structure the book by task, showing how different languages would
 implement the same task.
 
 For example,
 
 using a marc parsing library in java, groovy, python, ruby, perl,
 c/c++/objective c, Haskell.
 
 Implementing same.
 
 Using a rest API
 
 Implementing a rest API
 
 Doing statistical analysis of catalog records, circulation data , etc.
 
 Doing knowledge based analysis of same
 --
 Treatment of each topic and language is likely to be cursory at best, and I
 am not sure who the audience would be.
 
 A series of  language for librarians books would seem more useful and
 easier to produce.


If you tried to put it all into a book, you'd have two issues:

1. It'd be horribly long.  (anyone remember the 'Encyclopedia
   of Graphical File Formats'?)

2. Tools change over time, and books don't.

... so instead, perhaps the code4lib community would want to try to
put some of these together on the code4lib wiki.  Eg, for the Marc one:

http://wiki.code4lib.org/index.php/Working_with_MARC

... people could contribute recipes of how they use the various
libraries that are linked in.  (or just say, "look, it's outdated; we
listed it, but we recommend (x) instead").

Think of it like a code golf challenge -- someone throws out a
problem, and members of the community (if they have the time)
submit their various solutions in different languages or using
different libraries.


... another possibility would be to organize something over
on stackexchange ... if you set some 'scoring criteria', we could
run them as 'code-challenges' on the codegolf site:

http://codegolf.stackexchange.com/questions/tagged/code-challenge

-Joe


Re: [CODE4LIB] CFP: A Librarian's Introduction to Programming Languages

2014-03-25 Thread Joe Hourcle
On Mar 25, 2014, at 9:03 AM, Miles Fidelman wrote:

 Come to think of it, there's nothing there to frame the intent and scope of 
 the book - is it aimed at librarians who write code, or at librarians who are 
 trying to guide people to topical material?

An excellent question, so I'm cc'ing the editors for the book, so maybe they 
can answer.

(I suspect by the languages listed that it's the first one; the second would be 
so broad that it might not be useful ... I'm having a difficult time coming up 
with justifications for using Logo, IDL or Brainfuck in a library [1]).  And 
the mention of how a specific language can be used to enhance library services 
and resources might be a clue, too)


 Either way, it sure seems like at least three framing topics are missing:
 - a general overview of programming language types and characteristics (i.e., 
 context for reading the other chapters)
 - a history of programming languages (the family tree, if you will)
 - programming environments, platforms, tools, libraries and repositories - a 
 language's ecosystem probably influences choice of language use as much as 
 the language itself

Agreed on all three ... in some cases, the main justification for using a 
language is the ecosystem (eg, CPAN for Perl).

In some cases, it might be worth just assuming a library -- eg, do you want to 
teach people (ECMA|J(ava)?|Live)Script, or just assume jQuery, so they can get 
up to speed faster?  (yes, I know, you then bring in the jQuery vs. MooTools 
vs. every other JS library, but I think it's safe to say that jQuery is a 
defacto standard these days)


 - non-language languages - e.g., sql/nosql, spreadsheet macros and other 
 platforms that one builds on

Agreed on the need for SQL.  NoSQL isn't really a language on its own; I'm not 
aware of any specific general API, so I'd go with XPath & XSLT for discussing 
non-relational data.  Macro languages would be useful (and I'd assume the 
'Basic' proposal was actually for VBA, so you could create more complex MS 
Access databases)

-Joe


[1] okay, maybe Logo in the context of MakerSpaces, but still nothing on the 
other two.

ps.  I haven't trimmed this, so the editors can see some of the other comments 
made.



 Miles Fidelman
 p.s. I wrote a book for ALA Editions, they were great to work with.  The 
 acquisitions editor I worked with is now a Sr. Editor, so I expect they're 
 still good folks to work with.
 
 Jason Bengtson wrote:
 I'm also surprised not to see anything about the sql/nosql end of the 
 equation. Integral to a lot of apps and tools . . . at least from a web 
 perspective (and probably from others too).
 
 Best regards,
 
 Jason Bengtson, MLIS, MA
 Head of Library Computing and Information Systems
 Assistant Professor, Graduate College
 Department of Health Sciences Library and Information Management
 University of Oklahoma Health Sciences Center
 405-271-2285, opt. 5
 405-271-3297 (fax)
 jason-bengt...@ouhsc.edu
 http://library.ouhsc.edu
 www.jasonbengtson.com
 
 NOTICE:
 This e-mail is intended solely for the use of the individual to whom it is 
 addressed and may contain information that is privileged, confidential or 
 otherwise exempt from disclosure. If the reader of this e-mail is not the 
 intended recipient or the employee or agent responsible for delivering the 
 message to the intended recipient, you are hereby notified that any 
 dissemination, distribution, or copying of this communication is strictly 
 prohibited. If you have received this communication in error, please 
 immediately notify us by replying to the original message at the listed 
 email address. Thank You.
 
 On Mar 25, 2014, at 7:39 AM, Ian Ibbotson ian.ibbot...@k-int.com wrote:
 
 Going in the other direction from cobol and fortran -Fair warning - Putting
 on java evangelist hat- :) I wonder if it might be worth suggesting to the
 authors that they change java into JVM Languages and cover off Java,
 Scala, Groovy,...(others). We've had lots of success in the GoKB(
 http://gokb.org/) and KB+(https://www.jisc-collections.ac.uk/News/kbplus/)
 Knowledge Base projects using groovy on grails - Essentially all the
 pre-built libraries and enterprise gubbins of Java, but with a more
 ruby-esq idiom making it much more readable / less verbose / more
 expressive, and integrating nicely with all that existing enterprise
 infrastructure to boot.
 
 The use of embedded languages in JVMs (Including javascript) means that the
 use of Domain Specific Languages are becoming more and more widespread
 under JVMs, and this seems (To me) an area where there is some real
 advantage to having practitioners with real coding skills - Maybe not the
 hardcore systems development stuff but certainly ability to tune and
 configure software. Expressing things like business rules in DSLs (EG How
 to choose a supplier for an item, or how to deduplicate a title) gives
 librarians an opportunity to tune the behaviour of systems dynamically
 without system 

Re: [CODE4LIB] Usability resources

2014-03-25 Thread Joe Hourcle
On Mar 25, 2014, at 4:07 PM, Coral Sheldon-Hess wrote:

 Some things that came up in the UX discussion (well, the third of it I was
 in) at the breakout session, about how to get your library to be more open
 to UX:

[trimmed, although, I agree on the Steve Krug books]

 I apologize for the self promotion, but not all libraries' cultures allow
 for the big public test approach. Mine ... might, now, but probably
 wouldn't have, a couple of years ago.


There's been a recommendation for years that big public tests are a 
waste of people's time ... you don't do that until it's effectively
a release candidate.

Here are the problems:

(1) there's going to be one or two problems that are the majority
of the problem reports.

(2) once everyone's tested out the buggy version, they're tainted
so can't be a clean slate when testing the next version.

Most recommendations that I've seen call for 3-5 testers for each
iteration, with 2-3 being preferred if you're doing fast cycles. [1]

Yes, you can run into the one tester with completely unreasonable
demands about how things should be done, but if your programmers
don't see how stupid the ideas are, they should be shown to be
horrible in the next test cycle.

If you run too large of tests, you've got to leave some long 
time window for people to test, someone has to correlate all of
the comments ... it's just a drag.  Small test groups mean you
can run a day of testing once a week and keep moving forward.

-Joe

[1] I'll probably out myself as an old fogey here, but :
http://www.useit.com/articles/why-you-only-need-to-test-with-5-users/


Re: [CODE4LIB] Job: PERL PROGRAMMER at The Center for Research Libraries

2014-03-10 Thread Joe Hourcle
For those looking to hire a Perl programmer, two suggestions:

1. Don't put it in all caps:

http://www.perl.org/about/style-guide.html

2. Make sure you post on the Perl jobs board:

http://jobs.perl.org/

-Joe

ps. I have no idea how the Java folks like their language
capitalized, but I suspect it's similar.

pps. On the plus side, it makes it really easy to weed out
 resumes of those who are only dabbling and not active in
 the community.


On Mar 10, 2014, at 11:35 AM, j...@code4lib.org wrote:

 PERL PROGRAMMER
 The Center for Research Libraries
 Chicago
 
 Center for Research Libraries (CRL) is a membership
 consortium consisting of the leading academic and research libraries in the
 U.S. and abroad, with a unique and historic collection. A
 recently awarded grant from the Andrew W. Mellon Foundation has enabled the
 CRL to continue and expand its efforts to shape a data-centered international
 strategy for archiving and digitizing historical journals and newspapers.
 
 
 We are seeking a PERL Programmer to work with our existing team of librarians
 to further develop and maintain data projects critical to meeting our
 objective. Work primarily involves analyzing and
 manipulating data sets from library and commercial sources to pull out needed
 data and transform it into additional formats for ingest into existing
 databases or tools used for presentation of the data.
 
 
 Duties and Responsibilities:
 
 • Working closely with librarians to analyze and manipulate data sets
 
 • Creating optimized scalable code
 
 • Design, build and test tools to analyze data, extract patterns, and
 transform data among various formats as required by project demands.
 
 • Design and build user-friendly interface for tools.
 
 
 Requirements:
 
 • Strong analytical skills, with experience analyzing dataflow, data patterns
 and work flow
 
 • Minimum of 1 year of PERL programming experience
 
 • Experience using PERL, JAVA or other programming languages to normalize text
 and applying API's to harvest or capture data.
 
 • Ability to collaborate and contribute to a team and work independently
 
 • Ability to document and explain standards
 
 • Related degree required
 
 
 In addition to professional challenge and the chance to make a creative
 contribution, the CRL offers a competitive salary and exceptional benefits
 package.
 
 
 Respond with the title of the position in the subject line
 to: resu...@crl.edu. You may also respond by mail or fax,
 indicating the position you are applying for to:
 
 
 Human Resources
 
 Center for Research Libraries
 
 6050 S. Kenwood Ave.
 
 Chicago, IL 60637
 
 Fax: 773-955-4545
 
 
 An Equal Opportunity Employer m/f/d/v
 
 
 
 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/12932/


Re: [CODE4LIB] Job: PERL PROGRAMMER at The Center for Research Libraries

2014-03-10 Thread Joe Hourcle
On Mar 10, 2014, at 12:19 PM, Lisa Rabey wrote:

 On Mon, Mar 10, 2014 at 11:46 AM, Joe Hourcle
 onei...@grace.nascom.nasa.gov wrote:
 For those looking to hire a Perl programmer, two suggestions:
 
 1. Don't put it in all caps:
 
http://www.perl.org/about/style-guide.html
 
 
 This is a fair point if they only all-capped Perl, which they didn't;
 they capped the title of the job. I'm assuming they did this for
 formatting reasons in the email, which should have no bearing on 
 who's dabbling in the community and who is not.

And although I'm normally a fan of trimming down message text
to the relevant parts, you conveniently removed the other
three occurrences of 'PERL' in the posting:

We are seeking a PERL Programmer to work ...

Minimum of 1 year of PERL programming experience

Experience using PERL, JAVA or other ...



 But it also raises the point that if I were a Perl programmer, someone
 nitpicking about email formatting is probably not someone I would want
 to work with.

Right ... I should apologize for top-posting in my last message.
I'm sorry, and I'll try not to do it again.

Thank you for not continuing the top-posting in your reply.

-Joe


Re: [CODE4LIB] Book scanner suggestions redux

2014-03-04 Thread Joe Hourcle
On Mar 3, 2014, at 10:54 AM, Aaron Rubinstein wrote:

 Hi all, 
 
 We’re looking to purchase a book scanner and I was hoping to get some 
 recommendations from those who’ve had experience.

I don't have experience, but a couple of years back, a group started selling 
kits to make book scanners:

http://diybookscanner.myshopify.com/products/diy-book-scanner-kit


It's $500+shipping, and missing some parts (glass, cameras, paint), but it 
means that instead of carpentry skills, you just need experience assembling 
things.

-Joe


Re: [CODE4LIB] online book price comparison websites?

2014-02-26 Thread Joe Hourcle
On Feb 26, 2014, at 3:14 PM, Jonathan Rochkind wrote:

 Anyone have any recommendations of online sites that compare online prices 
 for purchasing books?
 
 I'm looking for recommendations of sites you've actually used and been happy 
 with.
 
 They need to be searchable by ISBN.
 
 Bonus is if they have good clean graphic design.
 
 Extra bonus is if they manage to include shipping prices in their price 
 comparisons.


Might be too late, but :

http://isbn.nu/

It doesn't include the shipping prices in their results, though.

API is just appending the ISBN to the end, either the 10- or 13-digit form:

http://isbn.nu/0060853980
http://isbn.nu/9780060853983
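
Since both forms resolve, a tiny shell helper can normalize the
10-digit form to the 13-digit one before building the URL.  (A
sketch; the check-digit math is the standard ISBN-13 / EAN-13
algorithm, and the function name is my own.)

```shell
# Convert an ISBN-10 to ISBN-13: prefix "978", drop the old check
# digit, and recompute the EAN-13 check digit (weights 1,3,1,3,...).
isbn10_to_13() {
  echo "978${1%?}" | awk '{
    s = 0
    for (i = 1; i <= 12; i++)
      s += substr($0, i, 1) * (i % 2 ? 1 : 3)
    print $0 ((10 - s % 10) % 10)
  }'
}

isbn10_to_13 0060853980   # -> 9780060853983, matching the URLs above
```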

-Joe


Re: [CODE4LIB] Question about OAI Harvesting via Perl

2014-01-14 Thread Joe Hourcle
On Jan 14, 2014, at 3:01 PM, Eka Grguric wrote:

 Hi,
 
 I am a complete newbie to Perl (and to Code4Lib) and am trying to set up a 
 harvester to get complete metadata records from oai-pmh repositories. My 
 current approach is to use things already built as much as possible - 
 specifically the Net::Oai::Harvester 
 (http://search.cpan.org/~esummers/OAI-Harvester-1.0/lib/Net/OAI/Harvester.pm).
  The code I'm using is located in the synopsis and specific parts of it seem 
 to work with some samples I've tried. For example, if I submit a request for 
 a list of sets to the oai url for arXiv.org (http://arXiv.org/oai2) I get the 
 correct list.
 
 The error I run into reads "can't call listRecords() on an undefined value in 
 *filename* line *#*". listRecords() seems to have been an issue in past 
 iterations but I'm not sure how to get around it. 
 
 At the moment it looks like this: 
 ## list all the records in a repository
 my $list = $harvester->listRecords(
   metadataPrefix => 'oai_dc'
 );
 
 Any help (or Perl resources) would be appreciated!

The error message you're getting is a sign that '$harvester' (the item that you 
tried calling 'listRecords' on) hasn't been set up properly.

The typical scenarios are that either the object was never created in the
first place, or the call that tried to create it returned undef (an undefined
value) to indicate that something had gone wrong.

How did you initialize it?
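
In case it helps, here's a minimal sketch of the usual pattern, based on the
module's synopsis (the arXiv URL is just the example from your message):
create the harvester, check that creation succeeded, and only then call
listRecords().

```perl
use strict;
use warnings;
use Net::OAI::Harvester;

# Build the harvester first; new() needs the repository's base URL.
my $harvester = Net::OAI::Harvester->new(
    baseURL => 'http://arXiv.org/oai2',
);

# If $harvester is undef here, listRecords() fails exactly as you saw.
die "couldn't create harvester\n" unless defined $harvester;

my $records = $harvester->listRecords(
    metadataPrefix => 'oai_dc',
);
```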

-Joe


Re: [CODE4LIB] The lie of the API

2013-12-02 Thread Joe Hourcle
On Dec 2, 2013, at 1:25 PM, Kevin Ford wrote:

  A key (haha) thing that keys also provide is an opportunity
  to have a conversation with the user of your api: who are they,
  how could you get in touch with them, what are they doing with
  the API, what would they like to do with the API, what doesn’t
  work? These questions are difficult to ask if they are just a
  IP address in your access log.
 -- True, but, again, there are other ways to go about this.
 
 I've baulked at doing just this in the past because it reveals the raw and 
 primary purpose behind an API key: to track individual user usage/access.  I 
 would feel a little awkward writing (and receiving, incidentally) a message 
 that began:
 
 --
 Hello,
 
 I saw you using our service.  What are you doing with our data?
 
 Cordially,
 Data service team
 --

It's better than posting to a website:

We can't justify keeping this API maintained / available,
because we have no idea who's using it, or what they're
using it for.

Or:

We've had to shut down the API because we'd had people
abusing the API and we can't easily single them out as
it's not just coming from a single IP range.

We don't require API keys here, but we *do* send out messages
to our designated community every couple of years with:

If you use our APIs, please send a letter of support
that we can include in our upcoming Senior Review.

(Senior Review is NASA's peer-review of operating projects,
where they bring in outsiders to judge if it's justifiable to
continue funding them, and if so, at what level)


Personally, I like the idea of allowing limited use without
a key (be it number of accesses per day, number of concurrent
accesses, or some other rate limiting), but as someone who has
been operating APIs for years and is *not* *allowed* to track
users, I've seen quite a few times when it would've made my
life so much easier.



 And, if you cringe a little at the ramifications of the above, then why do 
 you need user-specific granularity?   (That's really not meant to be a 
 rhetorical question - I would genuinely be interested in whether my notions 
 of open and free are outmoded and based too much in a theoretical purity 
 that unnecessary tracking is a violation of privacy).

You're assuming that you're actually correlating API calls
to the users ... it may just be an authentication system
and nothing past that.


 Unless the API key exists to control specific, user-level access precisely 
 because this is a facet of the underlying service, I feel somewhere in all of 
 this the service has violated, in some way, the notion that it is open 
 and/or free, assuming it has billed itself as such.  Otherwise, it's free 
 and open as in Google or Facebook.

You're also assuming that we've claimed that our services
are 'open'.  (mine are, but I know of plenty of them that
have to deal with authorization, as they manage embargoed
or otherwise restricted items).

Of course, you can also set up some sort of 'guest'
privileges for non-authenticated users so they just wouldn't
see the restricted content.


 All that said, I think a data service can smooth things over greatly by not 
 insisting on a developer signing a EULA (which is essentially what happens 
 when one requests an API key) before even trying the service or desiring the 
 most basic of data access.  There are middle ground solutions.

I do have problems with EULAs ... one in that we have to
get things approved by our legal department, second in that
they're often written completely one-sided and third in
that they're often written assuming personal use.

Twitter and Facebook had to make available alternate EULAs
so that governments could use them ... because you can't
hold the person who signed up for the account responsible
for it.  (and they don't want it 'owned' by that person
should they be fired, etc.)

... but sometimes they're less restrictive ... more TOS
than EULA.  Without it, you've got absolutely no sort of
SLA ... if they want to take down their API, or block you,
you've got no recourse at all.

-Joe


Re: [CODE4LIB] The lie of the API

2013-12-01 Thread Joe Hourcle
On Dec 1, 2013, at 3:51 PM, LeVan,Ralph wrote:

 I'm confused about the supposed distinction between content negotiation and 
 explicit content request in a URL.  The reason I'm confused is that the 
 response to content negotiation is supposed to be a content location header 
 with a URL that is guaranteed to return the negotiated content.  In other 
 words, there *must* be a form of the URL that bypasses content negotiation.  
 If you can do content negotiation, then you should have a URL form that 
 doesn't require content negotiation.

There are three types of content negotiation discussed in HTTP/1.1.  The
one that most gets used is 'transparent negotiation' which results in
there being different content served under a single URL.

Transparent negotiation schemes do *not* redirect to a new URL to allow
the cache or browser to identify the specific content returned.  (this
would require an extra round trip, as you'd have to send a Location:
header to redirect, then have the browser request the new page)

So that you don't screw up web proxies, you have to specify the 'Vary'
header to tell which parameters you consider significant so that it
knows what is or isn't cacheable.  So if you might serve different
content based on the Accept and Accept-Encoding would return:

Vary: Accept, Accept-Encoding

(Including 'User-Agent' is problematic because of some browsers
that pack in every module + the version in there, making there be so
many permutations that many proxies will refuse to cache it)
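
In Apache, with mod_headers enabled, that Vary header is a one-liner (a
sketch; adjust the header list to whatever your negotiation actually keys on):

```apache
# Tell caches that responses at this URL differ by these request headers
Header append Vary "Accept, Accept-Encoding"
```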

-Joe

(who has been managing web servers since HTTP/0.9, and gets 
annoyed when I have to explain to our security folks each year
why I don't reject pre-HTTP/1.1 requests or follow the rest of
the CIS benchmark recommendations that cause our web services to
fail horribly)


Re: [CODE4LIB] The lie of the API

2013-12-01 Thread Joe Hourcle
On Dec 1, 2013, at 7:57 PM, Barnes, Hugh wrote:

 +1 to all of Richard's points here. Making something easier for you to 
 develop is no justification for making it harder to consume or deviating from 
 well supported standards.
 
 [Robert]
 You can't 
 just put a file in the file system, unlike with separate URIs for 
 distinct representations where it just works, instead you need server 
 side processing.
 
 If we introduce languages into the negotiation, this won't scale.

It depends on what you qualify as 'scaling'.  You can configure
Apache and some other servers so that you pre-generate files such
as :

index.en.html
index.de.html
index.es.html
index.fr.html

... It's even the default for some distributions.

Then, depending on what the Accept-Language header is sent,
the server returns the appropriate response.  The only issue
is that the server assumes that the 'quality' of all of the
translations are equivalent.
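
A sketch of the Apache configuration for that layout (the language codes and
priority order here are just assumptions):

```apache
# Negotiate among pre-generated index.en.html, index.de.html, etc.
Options +MultiViews
AddLanguage en .en
AddLanguage de .de
AddLanguage es .es
AddLanguage fr .fr
# Tie-breaking order when Accept-Language doesn't decide
LanguagePriority en de es fr
ForceLanguagePriority Prefer Fallback
```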

You know that 'q=0.9' stuff?  There's actually a scale in
RFC 2295 that equates the different qualities to how much
content is lost in that particular version:

  Servers should use the following table as a guide when assigning source
  quality values:

 1.000  perfect representation
 0.900  threshold of noticeable loss of quality
 0.800  noticeable, but acceptable quality reduction
 0.500  barely acceptable quality
 0.300  severely degraded quality
 0.000  completely degraded quality





 [Robert]
 This also makes it much harder to cache the 
 responses, as the cache needs to determine whether or not the 
 representation has changed -- the cache also needs to parse the 
 headers rather than just comparing URI and content.  
 
 Don't know caches intimately, but I don't see why that's algorithmically 
 difficult. Just look at the Content-type of the response. Is it harder for 
 caches to examine headers than content or URI? (That's an earnest, perhaps 
 naïve, question.)

See my earlier response.  The problem is without a 'Vary' header or
other cache-control headers, caches may assume that a URL is a fixed
resource.

If it were to assume that was static, then it wouldn't matter what
was sent for the Accept, Accept-Encoding or Accept-Language ... and
so the first request proxied gets cached, and then subsequent
requests get the cached copy, even if that's not what the server
would have sent.


 If we are talking about caching on the client here (not caching proxies), I 
 would think in most cases requests are issued with the same Accept-* headers, 
 so caching will work as expected anyway.

I assume he's talking about caching proxies, where it's a real
problem.


 [Robert]
 Link headers 
 can be added with a simple apache configuration rule, and as they're 
 static are easy to cache. So the server side is easy, and the client side is 
 trivial.
 
 Hadn't heard of these. (They are on Wikipedia so they must be real.) What do 
 they offer over HTML link elements populated from the Dublin Core Element 
 Set?

Wikipedia was the first place you looked?  Not IETF or W3C?
No wonder people say libraries are doomed, if even people who work
in libraries go straight to Wikipedia.


...


oh, and I should follow up to my posting from earlier tonight --
upon re-reading the HTTP/1.1 spec, it seems that there *is* a way to
specify the authoritative URL returned without an HTTP round-trip,
Content-Location :

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.14

Of course, it doesn't look like my web browser does anything with
it:

http://www.w3.org/Protocols/rfc2616/rfc2616
http://www.w3.org/Protocols/rfc2616/rfc2616.html
http://www.w3.org/Protocols/rfc2616/rfc2616.txt

... so you'd still have to use Location: if you wanted it to 
show up to the general public.

-Joe


Re: [CODE4LIB] The lie of the API

2013-12-01 Thread Joe Hourcle
On Dec 1, 2013, at 9:36 PM, Barnes, Hugh wrote:

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joe 
 Hourcle
 
 (They are on Wikipedia so they must be real.)
 
 Wikipedia was the first place you looked?  Not IETF or W3C?
 No wonder people say libraries are doomed, if even people who work in 
 libraries go straight to Wikipedia.
 
 It was a humorous aside, regrettably lacking a smiley.

Yes, a smiley would have helped.

It also doesn't help that there used to be a website out there
named 'ScoopThis'.  They started as a wrestling parody site, but
my favorite part was their advice column from 'Dusty the Fat,
Bitter Cat'.

I bring this up because their slogan was "cuz if it's on the net,
it's got to be true" ... so I twitch a little whenever someone
says something similar to that phrase.

(unfortunately, the site's gone, and archive.org didn't cache
them, so you can't see the photoshopped pictures of Dusty
at Woodstock '99 or the Rock's cooking show.  They started up
a separate website for Dusty, but when they closed that one
down, they put up a parody of a porn site, so you probably
don't want to go looking for it)


 I think that comment would be better saved to pitch at folks who cite and 
 link to w3schools as if authoritative. Some of them are even in libraries.

Although I wish that w3schools would stop showing up so highly
in searches for javascript methods & css attributes, they
did have a time when they were some of the best tutorials out
there on web-related topics.  I don't know if I can claim that
to be true today, though.


 Your other comments were informative, though. Thank you :)

I try ... especially when I'm procrastinating on doing posters
that I need to have printed by Friday.

(but if anyone has any complaints about data.gov or other
federal data dissemination efforts, I'll be happy to work
them in)

-Joe


Re: [CODE4LIB] The lie of the API

2013-12-01 Thread Joe Hourcle
On Dec 1, 2013, at 11:12 PM, Simon Spero wrote:

 On Dec 1, 2013 6:42 PM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote:
 
 So that you don't screw up web proxies, you have to specify the 'Vary'
 header to tell which parameters you consider significant so that it knows
 what is or isn't cacheable.
 
 I believe that if a Vary isn't specified, and the content is not marked as
 non cachable,  a cache must assume Vary:*, but I might be misremembering

That would be horrible for caching proxies to assume that nothing's
cacheable unless it said it was.  (as typically only the really big
websites or those that have seen some obvious problems bother with
setting cache control headers.)

I haven't done any exhaustive tests in many years, but I was noticing
that proxies were starting to cache GET requests with query strings,
which bothered me -- it used to be that anything that was an obvious
CGI wasn't cached.  (I guess that enough sites use it, it has to make
the assumption that the sites aren't stateful, and that the parameters
in the URL are enough information for hashing)



 (who has been managing web servers since HTTP/0.9, and gets annoyed when
 I have to explain to our security folks each year  why I don't reject
 pre-HTTP/1.1 requests or follow the rest of  the CIS benchmark
 recommendations that cause our web services to fail horribly)
 
 Old school represent (0.9 could out perform 1.0 if the request headers were
 more than 1 MTU or the first line was sent in a separate packet with nagle
 enabled). [Accept was a major cause of header bloat].

Don't even get me started on header bloat ... 

My main complaint about HTTP/1.1 is that it requires clients to support
chunked encoding, and I've got to support a client that's got a buggy
implementation.  (and then my CGIs that serve 2GB tarballs start
failing, and it's calling a program that's not smart enough to look
for SIGPIPE, so I end up with a dozen of 'em going all stupid and
sucking down CPU on one of my servers)

Most people don't have to support a community written HTTP client,
though.  (and the one alternative HTTP client in IDL doesn't let me
interact w/ the HTTP headers directly, so I can't put a wrapper
around it to extract the tarball's filename from the Content-Disposition
header)

-Joe

ps.  yep, still having writer's block on posters.


Re: [CODE4LIB] calibr: a simple opening hours calendar

2013-11-27 Thread Joe Hourcle
On Nov 27, 2013, at 11:01 AM, Jonathan Rochkind wrote:

 Many of our academic libraries have very byzantine 'hours' policies.
 
 Developing UI that can express these sensibly is time-consuming and 
 difficult; by doing a great job at it (like Sean has), you can make the 
 byzantine hours logic a lot easier for users to understand... but you can 
 still only do so much to make convoluted complicated library hours easy to 
 deal with and understand for users.
 
 If libraries can instead simplify their hours, it would make things a heck of 
 a lot easier on our users. Synchronize the hours of the different parts of 
 the library as much as possible. If some service points aren't open the full 
 hours of the library, if you can make all those service points open the 
 _same_ reduced hours, not each be different. Etc.
 
 To some extent, working on hours displays to convey byzantine hours 
 structures can turn into the familiar case of people looking for 
 technological magic bullet solutions to what are in fact business and social 
 problems.

I agree up to a point.

When I was at GWU, we were running what was the most customized
version of Banner (a software system for class registration, HR,
etc.)  Some of the changes were to deal with rules that no one
could come up with a good reason for, and they should have been
simplified.  Other ones were there for a legitimate reason.*

You should take these sorts of opportunities to ask *why* the
hours are so complicated, and either document the reason for it,
or look to simplify it.

Did a previous librarian have some regularly scheduled thing
every Tuesday afternoon, and that's why one section closes
down early on Tuesdays?  If they're not there anymore, you can
change that.

Does one station requiring some sort of a shutdown / closing
procedure that takes a significant amount of time, and they
close early so they're done by closing time?  Or do they open
late because they have similar issue setting up in the morning,
and it's unrealistic to have them come in earlier than everyone
else?  Maybe there's something else that could be done to
improve and/or speed up the procedures.**

Has there been historically less demand for certain types of
books at different times of the day?  Well, that's going to be
hard to verify, as people have now adjusted to the library's
hours, rather than vice versa ... but it's a legitimate reason
to not keep service points open if no one's using them.

... but I would suggest that you don't use criteria like the
US Postal Service's recommendation to remove postboxes -- they
based it on number of pieces of mail, and ended up removing
them all in some areas.

...

Anyway, the point I'm making -- libraries are about service.
Simplification might make it easier to keep track of things,
but it doesn't necessarily make for better service.

-Joe

* Well, legitimate to someone, at least.  For instance, the
development office had a definition of alumni that included
donors who might not've actually attended the university.

** When I worked for the group that ran GW's computer labs,
some days I staffed a desk that we had over in the library ...
but I had to clock in at the main office, then walk over to
other building, and once the shift was over, walk back to the
main office to clock out.  I got them to designate one of the
phones in the library computer lab as being allowed to call
into the time clock system, so I could stop wasting so much
time ... then they decided to just stop having staff over
there.



 On 11/27/13 9:25 AM, Sean Hannan wrote:
 I'd argue that library hours are nothing but edge cases.
 
 Staying open past midnight is actually a common one. But how do you deal
 with multiple library locations? Multiple service points at multiple
 library locations? Service points that are 'by appointment only' during
 certain days/weeks/months of the year? Physical service points that are
 under renovation (and therefore closed) but their service is being carried
 out from another location?
 
 When you have these edge cases sorted out, how do you display it to users
 in a way that makes any kind of sense? How do you get beyond shoehorning
 this massive amount of data into outmoded visual paradigms into something
 that is easily scanned and processed by users? How do you make this data
 visualization work on tablets and phones?
 
 The data side of calendaring is one thing (and for as standard and
 developed as they are, iCal and Google Calendar's data formats don't get it
 100% correct as far as I'm concerned). Designing the interaction is wholly
 another.
 
 It took me a good two or three weeks to design the interaction for our new
 hours page (http://www.library.jhu.edu/hours.html) over the summer. There
 were lots of iterations, lots of feedback, lots of user testing. "User
 testing? Just for an hours page?" Yes. It's one of our most highly sought
 pieces of information on our website (and yours too, probably). Getting it
 right pays off 

Re: [CODE4LIB] Faculty publication database

2013-10-25 Thread Joe Hourcle
On Oct 25, 2013, at 11:35 AM, Alevtina Verbovetskaya wrote:

 Hi guys,
 
 Does your library maintain a database of faculty publications? How do you do 
 it?
 
 Some things I've come across in my (admittedly brief) research:
 - RSS feeds from the major databases
 - RefWorks citation lists
 
 These options do not necessarily work for my university, made up of 24 
 colleges/institutions, 6,700+ FT faculty, and 270,000+ degree-seeking 
 students.
 
 Does anyone have a better solution? It need not be searchable: we are just 
 interested in pulling a periodical report of articles written by our 
 faculty/students without relying on them self-reporting 
 days/weeks/months/years after the fact.

If you're forced to rely on self-reporting, one of the solutions
that I've seen is to add a few more features and introduce it as a
'CV Builder' or some sort of 'Faculty Directory' ... so the faculty
members get some benefit back out of it, and it's more public so they
have an interest in keeping it updated.

I'd also recommend talking to the individual colleges -- it's possible
that some of them already maintain databases, either for the whole
college or at the departmental level.  They might be willing to keep
the data populated if you provide the hosted service.

(and the tenure-track folks have a vested interest in making sure
their records kept up-to-date).

In looking through the other recommendations -- I didn't see ORCID or
ResearcherID mentioned ... I know they're not exhaustive, but it might
be possible to have a way to automate dumps from them -- so the faculty
member keeps ORCID up-to-date, and you periodically generate dumps from
ORCID for all of your faculty.  The last time I checked it, ORCID 
found all of my ASIST work ... but missed all of the stuff that I've
published in space physics and data informatics.  (admittedly, those
weren't peer-reviewed, but neither were most of the ASIST ones)

-Joe


[CODE4LIB] Please use HTTP 503 (was: Library of Congress)

2013-10-01 Thread Joe Hourcle
On Oct 1, 2013, at 9:52 AM, Nick Ruest wrote:

 Welp. XSDs are redirecting. See[1].
 
 -nruest
 
 [1] http://www.loc.gov/standards/mods/v3/mods-3-4.xsd

(*@#!@#%

I tried telling people around here to use HTTP 503 ... but GSA sent out 
advice to use 302s ... 

If there are any people who are still in the process of their 'orderly
shutdown' ... please send HTTP 503 (Service Unavailable) for requests,
so that search engines don't completely screw things up while we're
shut down, or ignorant systems try to process the error page as if it
were real content.

-Joe


Apache : http://stackoverflow.com/q/622466/143791
IIS : http://serverfault.com/q/483145/14119
Nginx : http://stackoverflow.com/q/5984270/143791
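
For anyone doing this in Apache, the first link above boils down to something
like the following sketch (mod_rewrite and mod_headers assumed; the filename
and Retry-After value are placeholders):

```apache
# Answer 503 for everything except the maintenance page itself
RewriteEngine On
RewriteCond %{REQUEST_URI} !=/shutdown.html
RewriteRule ^ - [R=503,L]
ErrorDocument 503 /shutdown.html
# Hint to crawlers when to come back (seconds)
Header always set Retry-After "86400"
```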



 On 13-10-01 09:36 AM, John Palmer wrote:
 Furloughs don't officially start until noon local time Tuesday, so they may
 be in the process of receiving instructions for shutdown.
 
 
 On Tue, Oct 1, 2013 at 6:21 AM, Doran, Michael D do...@uta.edu wrote:
 
 As far as I can tell the LOC is up and the offices are closed.
 HORRAY!!
 Let's celebrate!
 
 Before we start celebrating, let's consider our friends and colleagues at
 the LOC (some of who are code4lib people) who aren't able to work and
 aren't getting paid starting today.
 
 -- Michael
 
 # Michael Doran, Systems Librarian
 # University of Texas at Arlington
 # 817-272-5326 office
 # 817-688-1926 mobile
 # do...@uta.edu
 # http://rocky.uta.edu/doran/
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Riley Childs
 Sent: Tuesday, October 01, 2013 5:28 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Library of Congress
 
 As far as I can tell the LOC is up and the offices are closed.
 HORRAY!!
 Let's celebrate!
 
 Riley Childs
 Junior and Library Tech Manager
 Charlotte United Christian Academy
 +1 (704) 497-2086
 Sent from my iPhone
 Please excuse mistakes
 


Re: [CODE4LIB] Way to record usage of tables/rooms/chairs in Library

2013-08-16 Thread Joe Hourcle
On Aug 16, 2013, at 9:52 AM, Ian Walls wrote:

 Suma is the most practical and reliable way to do this right now, I think.
 
 I've been investigating using a sensor network, but there are a lot of
 limits on the accuracy of PIR, and trip-lasers are low enough and require
 enough power that they'd be troublesome to maintain in a busy undergraduate
 environment.
 
 One idea was to use an array of sensors:  PIR for motion, microphone for
 noise level and piezo/something similar for vibration.  The thought is that
 elevated levels of these 3 measurements should correspond to high
 activity.  The placement and calibration of the sensors, though, would be
 key, and you'd need to do some thorough spot checking with Suma or something
 similar in order to be confident that what you're measuring (motion, noise
 and vibration) actually correlate to number of people.
 
 The sensors would also need to be made out of cheap enough materials and use
 low-congestion wireless frequencies in order to be practical.  Balancing
 this with accuracy may never happen... but it would certainly be a fun
 experiment!


If you're going to take the sensor approach, and it's just a matter of
if there are bodies in specific places, you *might* be able to do it by
modifying cheap webcams.

Many are sensitive in infrared, so you take the IR filter out, and then
add a visible filter.

Position the cameras so that you have coverage of the area you care about,
have them take a picture at whatever times you care about, and then it's
just looking for hot spots.

(although of course, if you do this, it'd be just as easy for someone
to review security camera footage, if you have coverage in the places
you care about; the IR might be easier to automate the counting, though,
if you have someone who's good with automated image analysis)

And if it's just a matter of activity counting -- you might be able
to see if your wireless access points can tell how many items they're
in contact with, and use that as a proxy.

-Joe


Re: [CODE4LIB] locking app for iPads

2013-07-25 Thread Joe Hourcle
On Jul 25, 2013, at 3:52 PM, Cheryl Kohen wrote:

 Dear Fellow Techs,
 
 We're looking to create a circulation policy for iPads (gen 4) in the
 Learning Commons, and were wondering about an app that will lock the device
 after a specific amount of time (3-4 hours).  The idea is if a student
 does, in fact, steal the device, they will be locked out of actually
 utilizing it.  Has anyone heard of something like this?

I don't know of a time-sensitive one, but Apple's Find My iPad (or iPhone), 
has an option to remotely lock a device:

https://www.apple.com/icloud/features/find-my-iphone.html

I suspect it needs a network connection to send the signal to lock.

I don't know if it'll stop anyone who can jailbreak the device, but it would 
hopefully stop the person attempting to 'borrow' it long-term.  (and you can 
track where it is, if it's a device with GPS)

-Joe


Re: [CODE4LIB] Lightweight Autocomplete Application

2013-07-08 Thread Joe Hourcle
On Jul 8, 2013, at 10:37 AM, Anderson, David (NIH/NLM) [E] wrote:

 I'm looking for a lightweight autocomplete application for data entry. Here's 
 what I'd like to be able to do:
 
 
 * Import large controlled vocabularies into the app
 
 * Call up the app with a macro wherever I'm entering data
 
 * Begin typing in a term from the vocabulary, get a list of 
 suggestions for terms
 
 * Select a term from the list and have it paste automatically into my 
 data entry field
 
 Ideally it would load and suggest terms quickly. I've looked around, but 
 nothing really stands out. Anyone using anything like this?


Is this web-based?

If not, do you have control of the software that you're entering the data into?

If so, what language is it in?

If not, what OS are you using?


-Joe


Re: [CODE4LIB] StackExchange reboot?

2013-07-08 Thread Joe Hourcle
On Jul 8, 2013, at 3:50 PM, Christie Peterson wrote:

 I agree with both Shaun and Galen's points; when you're asking a how to do X 
 with tool Y type of question, SE is a great forum. Like Christina, I've 
 mostly encountered SE when Googling for answers to these types of questions.
 
 However, for the reasons that Henry and Gary mentioned, I was disappointed in 
 the Digital Preservation SE experience. At the request of one of the SE 
 organizers, I posted a question there that I had also posted to a listserv. 
 It was flagged for not being in the proper form, but I have no idea how I 
 could have framed it properly for SE because it simply wasn't a question that 
 had a single answer. I wanted discussion. Digital Preservation in particular 
 is a developing field and I was trying to gague opinions and currently 
 evolving best practices. Somewhat ironically given the potential value of the 
 commenting and upvoting mechanism, SE did not prove to be a good forum for 
 this.
 
 There may be some value to having a code4lib SE instance that answers 
 questions of the how to do X with tool Y type and similar for the reasons 
 that Shaun and Galen state. But unless the community standards about what 
 makes a good SE question change radically, I don't see it being an 
 attractive or useful forum for the more open-ended, discussion/opinion type 
 questions that people often post to library, digital preservation and other 
 listservs.


I actually just responded to this issue the other day on the Open Data SE site:

http://meta.opendata.stackexchange.com/q/126/263

Back when Cooking SE started (~2.5 years ago), multiple possible answers was 
considered a valid question.  They didn't tend to like polls ('what's the best 
...') but questions about possibilities of how to deal with problems were 
acceptable.  I'd link to some of them, but there have since been a few people 
who go around and vote to close every question they don't like, even if they're 
gotten a dozen or more upvotes.

Here's one instead that's not even a question that's ranked in the top 10 
'questions' on the cooking site:

http://cooking.stackexchange.com/q/784/67

Personally, I'm of the opinion that there are *very* few problems that only 
have a single solution, or a 'best' solution.  What they really tend to reward 
people for is coming up with a plausible, moderately detailed answer quick 
enough.  I've seen a number get marked as the 'best answer' within 30 min of 
the question being asked where the answer from my point of view was just plain 
wrong.

I do see a use for the sort of things that might've once been considered 
'community wiki' ... what books can I recommend to a 3rd grader who is 
interested in science fiction?  (I've cheated before and worded them like 
'where can I find a list of books to recommend ...')

It *might* be possible to get enough like-minded people involved to ensure that 
if anyone attempts to close reasonable questions we can get them re-opened 
quickly ... but I'd like to recommend changing the scope up front to museums, 
libraries  archives.  I don't know that the more practical 'library' and the 
abstract/academic 'library science' communities really mesh all that well.

And I should probably go get some sleep as I write e-mail that's even more 
incoherent than typical when I've only gotten ~8hrs sleep over the last 3 days.

-Joe


Re: [CODE4LIB] Code4Lib 2014: Save the dates!

2013-06-29 Thread Joe Hourcle
On Jun 29, 2013, at 7:16 AM, BWS Johnson wrote:

 Salvete!
 
 
 I am happy to announce that we have secured the venue and dates for
 Code4Lib 2014!  The conference will be held at the Sheraton Raleigh Hotel
 in downtown Raleigh, NC on March 24 - 27, 2014.  Preconferences will be
 held Monday March 24, and the main conference on Tuesday March 25 - 27.
 
 
  Hooray, that's sort of close. Maybe I'll be able to pit fight my own 
 place next year.
 
 
 Finally, the hotel has the capacity to host all of the attendees, and we've
 negotiated a rate of $159/night that includes wireless access in the hotel
 rooms.  Hotel reservations will be able to be made after you register using
 the information provided in your registration confirmation.  We will be
 publishing more details as they become available.
 
 
  Ruh oh. This was rather shocking. Perhaps you might wish to show them a 
 hotels.com search, which puts your $159 just over the Hilton and about double 
 other places in the vicinity. I'm sure it's nice and all that, but uh, 
 perhaps they would be willing to come down seeing as how we're sending a 
 boatload of traffic their way.

The government per-diem rate for Raleigh is $91 per night:

http://www.gsa.gov/portal/category/100120

I have no idea if that can be used for negotiations at all.

For some reason, they're not showing any federal government rates when I 
searched, but they're offering state government employees rooms at $64/night 
(you might have to pay extra for the wifi, though).  I highly suggest that 
people who work for public universities or libraries inquire about getting 
that rate.



-Joe

(even though the state of maryland makes me pay into the state retirement 
system because I'm a municipal elected official, they won't issue me a state ID 
card, so I can't get the state rates when traveling, which are typically better 
than the federal rates ... I've actually debated about if it makes sense to 
work for 3 years in a real state job, then claim a pension based on my 'top 3 
years' of pay times the number of years worked)


Re: [CODE4LIB] DOI scraping

2013-05-21 Thread Joe Hourcle
On May 21, 2013, at 9:40 PM, Fitchett, Deborah wrote:

 Joe and Owen--
 
 Thanks for the ideas!
 
 It's a bit of the opposite goal to LibX, in that rather than having a 
 title/DOI/whatever from some random site and wanting to get to  the full-text 
 article, I'm looking at the use case of academics who are already viewing the 
 full-text article and want a link that they can share with students.  Even 
 aside from the proxy prefix, the url in their browser may include (or consist 
 entirely of) session gunk.
 
 I'll try a regexp and see how far that gets me. I'm a bit trepidatious about 
 the way the DOI standard allows just about any character imaginable, but at 
 least there's the 10. prefix. Am also considering that if DOIs also appear in 
 the article's bibliography I'll need to make sure the javascript can 
 distinguish between them and the DOI for the article itself; but a lot of 
 this might be 'cross that bridge if I come to it' stuff.


Crap.  I just remembered:

http://shortdoi.org/

... I don't know if any publishers are actually using them, or if they're just 
for people to use on Twitter & other social media.

The real problem with them is that they don't have the '10.' string in them.

You can probably get away with just tracking the resolving form of them:

http://doi[.]org/(\w+)

And ignore the 

10/(\w+)

form.
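If you do end up needing to catch shortDOIs, a minimal sketch of tracking only 
the resolver form (the helper name, and the assumption that suffixes are plain 
alphanumerics, are mine rather than anything from the shortDOI spec):

```python
import re

# Match only the resolver form of a shortDOI (http://doi.org/aabbe), and
# ignore the bare '10/aabbe' form, as suggested above.  The assumption that
# suffixes are plain alphanumerics is mine; adjust if real ones differ.
SHORTDOI_RE = re.compile(r'https?://(?:www\.)?doi\.org/([a-z0-9]+)\b', re.IGNORECASE)

def find_shortdois(text):
    hits = SHORTDOI_RE.findall(text)
    # doi.org also resolves full DOIs (http://doi.org/10.1000/182); those
    # show up here as the bare prefix '10', so drop all-digit hits.
    return [h for h in hits if not h.isdigit()]

print(find_shortdois("See http://doi.org/aabbe for details."))
```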

-Joe


Re: [CODE4LIB] Policies for 3D Printers

2013-05-20 Thread Joe Hourcle
On May 20, 2013, at 4:47 PM, Bigwood, David wrote:

 That's a question every library will have to answer for themselves. 
 
 For us it makes perfect sense. Our scientists are sending out files to
 have 3D models of craters. When the price drops enough it will become
 more cost effective to do that in-house. It will just be an extension of
 maps and remote sensing data we already have in the collection. I can
 see a limit being fabrication related to the mission of the Institute,
 same as the large-format printer.
 
 A public library might have other concerns. If it is unlimited and free,
 is printing out 100 Hulk statues to sell at a comic convention
 acceptable? How about Barbie dolls to sell at a flea market? Or maybe
 Barbee dolls to side-step trademarks? Lots of unanswered questions, but
 each library will have to decide based on local conditions.

Actually, this made me think back to my undergrad, when I worked
in our school's 'Academic Computing' department.  We had a big problem
with students printing out multiple copies of their theses on the
printers in the computer labs, because they'd:

1. tie up the printers for a rather long time.
2. burn through all of the paper

The result was that one or two bad actors kept everyone else from being
able to use the services, because they were taking advantage of our
'free' printing.

Our typical process, when we found someone who needed to print their
thesis, was to print one copy from the printer in our staff offices;
they then had to go to one of the local copy shops to make the
additional copies that they needed.  (The policy of only one copy
had been established for years, but was only really enforced when
people came in and complained about people printing whole books.)


Although I can appreciate some of the arguments for making library
services free, there needs to be some sort of a line drawn so that
one or two people don't end up monopolizing a service.

Just as I left, they ended up going to a system of some number of
free pages per semester per student, with them having to pay if
they wanted to print more than their gratis quota.  I don't know
if something like that would work, but you'd have to work out how
to handle it.  (number of objects?  time spent on the printer?
amount of material used?)

-Joe


Re: [CODE4LIB] DOI scraping

2013-05-17 Thread Joe Hourcle
On May 17, 2013, at 12:32 AM, Fitchett, Deborah wrote:

 Kia ora koutou,
 
 I’m wanting to create a bookmarklet that will let people on a journal article 
 webpage just click the bookmarklet and get a permalink to that article, 
 including our proxy information so it can be accessed off-campus.
 
 Once I’ve got a DOI (or other permalink, but I’ll cross that bridge later), 
 the rest is easy. The trouble is getting the DOI. The options seem to be:


 Can anyone think of anything else I should be looking at for inspiration?

4. Look for any strings that look like a DOI:

\b((?:http://dx\.doi\.org/|doi:)?10\.[\d.]+/\S+)

(as it sucks to code special things for each database, in case they change or 
you add a new one)

You can then fall back to #1 if necessary.
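Sketched in Python, that scan might look like the following (the regex is the 
one above with the dots escaped; trimming trailing punctuation is my own 
assumption about how DOIs tend to appear in running text):

```python
import re

# A sketch of option 4: scan the whole page text for DOI-shaped strings.
DOI_RE = re.compile(r'\b(?:http://dx\.doi\.org/|doi:)?(10\.[\d.]+/\S+)')

def find_dois(text):
    dois = []
    for match in DOI_RE.finditer(text):
        # A DOI at the end of a sentence often drags punctuation with it.
        dois.append(match.group(1).rstrip('.,;)'))
    return dois

print(find_dois("As shown (doi:10.1000/182) and http://dx.doi.org/10.5555/12345678."))
```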


 Also on a more general matter: I have the general level of Javascript that 
 one gets by poking at things and doing small projects and then getting 
 distracted by other things and then coming back some months later for a 
 different small project and having to relearn it all over again. I’ve long 
 had jQuery on my “I guess I’m going to have to learn this someday but, um, 
 today I just wanna stick with what I know” list. So is this the kind of thing 
 where it’s going to be quicker to learn something about jQuery before I get 
 started, or can I just as easily muddle along with my existing limited 
 Javascript? (What really are the pros and cons here?)

It depends on what you're going to do with the output -- I'd likely look 
through the <a href=""> values for http://dx.doi.org DOIs first, then just look 
at the text displayed on the page.  I don't think you'd need jQuery for that.
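A rough sketch of that href-first scanning order -- in Python rather than the 
bookmarklet's JavaScript, just to illustrate the idea; the class name is made 
up:

```python
from html.parser import HTMLParser

# Collect DOIs from <a href> values pointing at dx.doi.org, before falling
# back to scanning the visible page text.
class DOILinkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.dois = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        for name, value in attrs:
            if name == 'href' and value and 'dx.doi.org/' in value:
                # Everything after the resolver hostname is the DOI itself.
                self.dois.append(value.split('dx.doi.org/', 1)[1])

finder = DOILinkFinder()
finder.feed('<p><a href="http://dx.doi.org/10.1000/182">Full text</a></p>')
print(finder.dois)
```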

-Joe


Re: [CODE4LIB] On-going support for DL projects

2013-05-17 Thread Joe Hourcle
On May 17, 2013, at 9:51 AM, Tim McGeary wrote:

 I'm interested in starting or joining discussions about best practices for
 on-going support for digital library projects.  In particular, I'm looking
 at non-repository projects, such as projects built on applications like
 Omeka.  In the repository context, there are initiatives like APTrust and
 DPN that are addressing on-going and long term collaborative support.  But,
 as far as I know, we aren't having the same types of discussions for DL
 projects that are application driven.

If you're asking about funding issues, most of those discussions that I've
seen lump it into 'governance'.


 There is no easy answer for this, so I'm looking for discussion.
 
   - Should we begin considering a cooperative project that focuses on
   emulation, where we could archive projects that emulate the system
   environment they were built?

I know that there are projects using emulation when it'd be too expensive
to port the software (and validate / vet it).  There are some that are
setting up VMs for new software being written, so that they can
archive the whole environment to ensure that the proper versions of the
OS, libraries, etc. are captured.

Most of the ones that I've seen have been focusing on scientific
workflows, but that's likely because that's the field I'm in, so I tend
to see more of those talks at conferences than other subjects.


   - Do we set policy that these types of projects last for as long as they
   can, and once they break they are pulled down?

I wouldn't recommend that directly ... like anything, the stuff being
archived has a value, and if someone's willing to pay for it to be
continued, then you do it.  Maybe you just need to have a policy on
cost-recovery for when this happens.  (And then you need to look at
the various 'governance' discussions.)

   - Do we set policy that supports these projects for a certain period of
   time and then deliver the application, files, and databases to the faculty
   member to find their own support?

The ultimate decision might be at a higher pay grade -- you may want
to come up with the list of options, estimated costs, and have the
provost or deans decide what makes sense for the budget.

   - Do we look for a solution like the Way Back Machine of the Internet
   Archive to try to present some static / flat presentation of these projects?

Again, it likely depends on what's being archived.  An online database
that you can search / filter / interact with would be mostly useless
as static pages.

-Joe


Re: [CODE4LIB] makerspaces in libraries workshp

2013-05-15 Thread Joe Hourcle
On May 15, 2013, at 8:30 AM, Edward Iglesias wrote:

 Hello All,
 
 I have the unlikely distinction of getting to offer a 1 day workshop on
 Makerspaces in libraries.  I have a general idea of how it's going to go
 --morning theory afternoon hands on -- but am a little overwhelmed by the
 possibilities.  My first thought was to show them how to use a Raspberry Pi
 but that would require them all to buy a Raspberry Pi.  I am open to
 suggestions on what would be worth learning that is hands on and preferably
 cheap for a group of around 20.  What would you teach/learn in an afternoon
 given the chance?
 
 Edward Iglesias


I'd make sure to mention that this does *not* have to be high-tech.

Our library runs jewelry-making workshops, and some of the local
churches have knitting circles / quilting bees so there can be a 
social component of 'making'.  They've never considered this to be
'makerspaces', but it fits the description.

If it were me, depending on how much time you had, I'd try to come
up with some sort of a project that people could build & take home
with them (and so the Raspberry Pi idea is likely out).  Depending
on where you are, it might be a good time of year to make bird or
bat houses, or maybe something decorative.

Have them leave with a physical item that they can take and show
off to others.

Depending on how soon you'll get kicked out after your class ends,
you might be able to plan for building something, and then let
people stay later if they wanted to paint or otherwise decorate
it.

I'd plan on having someone cut all of the pieces in advance unless
it can be done w/ hand tools and you have a sufficient number of
the necessary tools ... ideally, you'd want something that could
be assembled with press-fit and glue, or maybe a few nails or screws
(if you had to add hinges).

-Joe


If you really need an idea of something to make -- I can give you
plans for gift boxes that I make ... it's shadow-box that says
'in case of emergency, break glass', and you can then put whatever
you want in them.  (typically, I give 'em with pacifiers to 
friends having their first child ... but I've done other stuff,
like gave one w/ a box of kosher salt, peppercorns and whole
cumin to Alton Brown when he was doing a book signing back in
2004 or so)

It's simple pine, a plexiglass front, etc.  You'll need a table
saw, a miter box or chop saw and a label maker, and then it's just
a matter of glue, a few nails, and some sanding.

(you could also borrow a pneumatic brad nailer + a power sander,
so that once everyone has made the item, you can show that it
can all be done in 1/10th the time w/ the proper tools ... which
is part of the reason for building out these spaces)


[CODE4LIB] FW: Digital Forensics Hackathon - June 3-5

2013-05-01 Thread Joe Hourcle
I thought this was something that might interest people in code4lib.

-Joe


 -Original Message-
 From: Cal Lee [mailto:cal...@email.unc.edu] 
 Sent: Wednesday, May 01, 2013 11:36 AM
 Subject: Digital Forensics Hackathon - June 3-5
 
 We'll be running a hackathon in Chapel Hill on June 3-5 that will focus on 
 applying digital forensics methods to born-digital collections. 
 We're running this with the Open Planets Foundation, who have done a terrific 
 job in the past of running these events.
 
 The format is one in which people bring real technical challenges (including 
 the associated data from their collections) to the event and pair up with 
 developers who can provide substantive solutions to those challenges by the 
 end of the three days.
 
 http://wiki.opf-labs.org/display/KB/2013-06-03+OPF+Hackathon+-+Tackling+Real-World+Collection+Challenges+with+Digital+Forensics+Tools+and+Methods+%28Chapel+Hill%29
  
 
 
 I'm very excited that we're running this event in Chapel Hill.  It's the 
 first time that this very successful OPF model has made it to the US. 
 It should be a great opportunity for all involved.
 
 I would really appreciate any efforts you could take to help us with getting 
 the word out about it.  Broadcasts through mailing lists, Twitter and such 
 are all helpful.  Even better is pointing it out to specific individuals who 
 you think would be interested and would benefit from the event.
 
 The deadline for booking a hotel room at the block rate is May 19.  But it's 
 even better if people sign up well before then, so we can make the 
 appropriate pairings and plan for the event.
 
 - Cal


Re: [CODE4LIB] Tool to highlight differences in two files

2013-04-23 Thread Joe Hourcle
On Apr 23, 2013, at 4:37 PM, Alexander Duryee wrote:

 The absolute simplest way to do this would be to fire up a terminal
 (OSX/Linux) and:
 
 diff page1.html page2.html | less
 
 Unfortunately, this will also catch changes made in other markup, and
 may or may not be terribly readable.

At the very least, I'd suggest adding a '-b' which will ignore changes to 
whitespace.

Also see:

http://www.w3.org/wiki/HtmlDiff
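If you'd rather do this in a script than at the shell, Python's difflib module 
covers both styles: unified_diff() for a terse report, and HtmlDiff for a 
side-by-side HTML table.  The sample page lines below are invented:

```python
import difflib

# Two scraped versions of the same page, split into lines.
old = ["<p>Hours: 9-5</p>", "<p>Closed Sundays</p>"]
new = ["<p>Hours: 9-6</p>", "<p>Closed Sundays</p>"]

# Terse, diff-style report of what changed.
for line in difflib.unified_diff(old, new, fromfile='before', tofile='after', lineterm=''):
    print(line)

# HtmlDiff renders a complete HTML page with the two versions side by side,
# which is handy for eyeballing which paragraphs changed.
html = difflib.HtmlDiff().make_file(old, new)
```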

-Joe


 On Tue, Apr 23, 2013 at 4:31 PM, Alevtina Verbovetskaya
 alevtina.verbovetsk...@mail.cuny.edu wrote:
 I've recently begun to use Beyond Compare: http://www.scootersoftware.com/ 
 It's not free or OSS, though.
 
 There's also a plugin for Notepad++ that does something similar: 
 http://sourceforge.net/projects/npp-compare/ This is free, of course.
 
 Thanks!
 Allie
 
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of 
 Wilhelmina Randtke
 Sent: Tuesday, April 23, 2013 4:24 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Tool to highlight differences in two files
 
 I would like to compare versions of a website scraped at different times to 
 see what paragraphs on a page have changed.  Does anyone here know of a tool 
 for holding two files side by side and noting what is the same and what is 
 different between the files?
 
 It seems like any simple script to note differences in two strings of text 
 would work, but I don't know a tool to use.
 
 -Wilhelmina Randtke


Re: [CODE4LIB] Tool to highlight differences in two files

2013-04-23 Thread Joe Hourcle
On Apr 23, 2013, at 8:12 PM, Genny Engel wrote:

 There's a list here that may be more along the lines of what you're seeking.
 
 http://webapps.stackexchange.com/questions/11547/diff-for-websites


Hmm ... I guess I should actually accept the answer as it was the only one ever 
given.

-Joe


[CODE4LIB] password lockboxes (was: what do you do: API accounts used by library software, that assume an individual is registered)

2013-03-05 Thread Joe Hourcle
On Mar 5, 2013, at 8:29 AM, Adam Constabaris wrote:

 An option is to use a password management program (KeepassX is good because
 it is cross platform) to store the passwords on the shared drive, although
 of course you need to distribute the passphrase for it around.

So years ago, when I worked for a university, they wanted us to put all of the 
root passwords into an envelope, and give them to management to hold.  (we were 
a Solaris shop, so there actually were root passwords on the boxes, but you had 
to connect from the console or su to be able to use 'em).

We managed to drag our heels on it, and management forgot about it*, but I had 
an idea ...

What if there were a way to store the passwords similar to the secret formula 
in Knight Rider?

Yes, I know, it's an obscure geeky reference, and probably dates me.  The story 
went that the secret bullet-proof spray-on coating wasn't held by any one 
person; there were three people who each knew part of the formula, and that any 
two of them had enough knowledge to make it.

For needing 2 of 3 people, the process is simple -- divide it up into 3 parts, 
and each person has a different missing bit.  This doesn't work for 4 people, 
though (either needing 2 people, or 3 people to complete it).

You could probably do it for two or three classes of people (eg, you need 1 
sysadmin + 1 manager to unlock it), but I'm not sure if there's some method to 
get an arbitrary X of Y people required to unlock.

If anyone has ideas, send 'em to me off-list.  (If other people want the 
answer, I can aggregate / summarize the results, so I don't end up starting yet 
another inappropriate out-of-control thread.)

...

Oh, and I was assuming that you'd be using PGP, using the public key to encrypt 
the passwords, so that anyone could insert / update a password into whatever 
drop box you had; it'd only be taking stuff out that would require multiple 
people to combine efforts.

-Joe


* or at least, they didn't bring it up again while I was still employed there.


Re: [CODE4LIB] what do you do: API accounts used by library software, that assume an individual is registered

2013-03-04 Thread Joe Hourcle
On Mar 4, 2013, at 11:11 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

 Whether it's Amazon AWS, or Yahoo BOSS, or JournalTOCs, or almost anything 
 else -- there are a variety of API's that library software wants to use, 
 which require registering an account to use.

[trimmed]

 Has anyone found a way to deal with this issue, other than having each API 
 registered to an account belonging to whatever individual staff happened to 
 be dealing with it that day?

The government actually has a program for this.


http://www.howto.gov/web-content/resources/tools/terms-of-service-agreements

If you work for the feds, there are some alternate terms of service for 
various social media providers (it actually covers more than what I think of 
as 'social media').  So far, they've only really looked at 'free' services.

It's a little bit tricky to use them, as you have to find out if your 
government agency has yet agreed to the terms that a company is offering.  If 
they don't have an agreement ... well, it takes some time to get the approval, 
as it's got to go through the agency's legal counsel.

If you're with a state government (and most state universities are considered 
state government), then there are alternate TOSes available for Twitter, 
Facebook and YouTube.

-Joe


Re: [CODE4LIB] A newbie seeking input/suggestions

2013-02-21 Thread Joe Hourcle
On Feb 21, 2013, at 11:20 AM, Paul Butler (pbutler3) wrote:

 For something like this I would go the hardware route.  A walkie-talkie on a 
 charging stand at each service point. The walkie-talkies would always be on 
 and tuned to the same channel. That way the staff person is not tied to the 
 PC itself, they can grab the walkie-talkie and still do what they need to do 
 - like head to the stacks or look for that reserve material. No phone number 
 to remember. This solution could help with other issues, like security and 
 system/network outages. 

I admit, I've never worked as a librarian, but I did work at a computer help 
desk during undergrad.

We had a policy of trying our best *not* to go into the computer labs, because 
if you did, you'd get 6+ people who suddenly had questions they wanted to ask 
... but couldn't be bothered to actually come to the office to ask.  When 
I first started, someone who went to add paper to a printer might not come 
back for 30+ minutes.

(I realize that this policy likely won't work for a library, though)

Our follow-up policy was to not answer questions in the labs, and make people 
go to the office so they didn't cut in line if there were people queued up.

... so I completely agree about needing something that's not fixed to a single 
location.  If you can make it beep on demand, that's even better.  (oops, 
sorry, I've got to go, I've been summoned back to the desk)

If you're going to do something that's computer-based, I'd be inclined to think 
about some sort of phone app, or even part of a more comprehensive tool to 
assist in other things that you might need while you're in the stacks trying to 
help someone.

-Joe


Re: [CODE4LIB] A newbie seeking input/suggestions

2013-02-21 Thread Joe Hourcle
On Feb 21, 2013, at 2:28 PM, Cab Vinton wrote:

 This seems like a good application for text messaging -- as long as
 all librarians have smartphones, which they surely would at Yale :-)


The problem is that you'd have to have it dynamically generate the list of who 
to text based on who's currently on duty.

Otherwise, you have it harassing people on their days off, when they're home 
sick, etc.
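That on-duty lookup can be a small schedule filter -- a minimal sketch, where 
the schedule format and staff names are invented for illustration:

```python
from datetime import datetime

# Hypothetical desk schedule: weekday (0=Mon) -> (start_hour, end_hour, who).
SCHEDULE = {
    0: [(9, 13, 'amy'), (13, 17, 'bob')],
    1: [(9, 17, 'carol')],
}

def on_duty(now):
    """Return only the staff scheduled at this moment, so nobody gets
    texted on their day off."""
    return [who for start, end, who in SCHEDULE.get(now.weekday(), [])
            if start <= now.hour < end]

print(on_duty(datetime(2013, 2, 18, 10, 30)))  # a Monday morning
```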

-Joe


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-18 Thread Joe Hourcle
On Feb 18, 2013, at 11:17 AM, John Fereira wrote:

 I suggested PHP primarily because I find it easy to read and understand and 
 that's it's very commonly used.  Both Drupal and Wordpress are written in PHP 
 and if we're talking about building web pages there are a lot of sites that 
 use one of those as a CMS.

And if you're forced to maintain one of those, then by all means, learn PHP ... 
but please don't recommend that anyone learn it as a first language.

... and I'd like to say that in my mention of Perl, it was only because there's 
going to be the workshop ... not that I'd necessarily recommend it as a first 
language for all people ... I'd look at what they were interested in trying to 
do, and make a recommendation on what would best help them do what they're 
interested in.



 I've looked at both good and bad perl code, some written by some very 
 accomplished software developers, and I still don't like it.   I am not 
 personally interested in learning to make web pages (I've been making them 
 for 20 years) and have mostly dabbled in Ruby but suspect that I'll be doing 
 a lot more programming in Ruby (and will be attending the LibDevConX workshop 
 at Stanford next month where I'm sure we'll be discussing Hydra).   I'm also 
 somewhat familiar with Python but I just haven't found that many people are 
 using it in my institution (where I've worked for the past 15 years) to spend 
 any time learning more about it.  If you're going to suggest mainstream 
 languages I'm not sure how you can omit Java (though just mentioning the word 
 seems to scare people).

It's *really* easy to omit Java:

http://www.recursivity.com/blog/2012/10/28/ides-are-a-language-smell/

... not to mention all of the security vulnerabilities and memory headaches 
associated with anything that runs in a VM.

You might as well ask why I didn't suggest C or assembler for beginners.  
That's not to say that I haven't learned things from programming in those 
languages (and I've even applied tricks from Fortran and IDL in other 
languages), but I wouldn't recommend any of those languages to someone who's 
just learning to program.

-Joe

(ps. I'm grumpier than usual today, as I've been trying to get hpn patched 
openssh to compile under centos 6 ... so that it can be called by a java daemon 
that is called by another C program that dynamically generates python and shell 
scripts ... and executes them but doesn't always check the exit status ... this 
is one of those times when I wish some people hadn't learned to program, so 
they'd just hire someone else to write it)


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-17 Thread Joe Hourcle
On Feb 17, 2013, at 11:43 AM, John Fereira wrote:

 I have been writing software professionally since around 1980 and first 
 encountered perl in the early 1990s or so and have *always* disliked it.   
 Last year I had to work on a project that was mostly developed in perl and it 
 reminded me how much I disliked it.  As a utility language, and one that I 
 think is good for beginning programmers (especially for those working in a 
 library) I'd recommend PHP over perl every time.  

I'll agree that there are a few aspects of Perl that can be confusing, as some 
functions will change behavior depending on context, and there were a lot of 
bad code examples out there.* 

... but I'd recommend almost any current mainstream language before 
recommending that someone learn PHP.

If you're looking to make web pages, learn Ruby.

If you're doing data cleanup, Perl if it's lots of text, Python if it's mostly 
numbers.

I should also mention that in the early 1990s would have been Perl 4 ... and 
unfortunately, most people who learned Perl never learned Perl 5.  It's changed 
a lot over the years.  (just like PHP isn't nearly as insecure as it used to be 
... and actually supports placeholders so you don't end up with SQL injections)

-Joe


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-15 Thread Joe Hourcle
On Feb 15, 2013, at 8:22 AM, Kyle Banerjee wrote:

 On Thu, Feb 14, 2013 at 7:40 AM, Jason Griffey grif...@gmail.com wrote:
 
 The vast, vast, vast, vast majority of people have absolutely no clue how
 code translates into instructions for the magic glowing screen they look at
 all day. Even a tiny bit of empowerment in that arena can make huge
 differences in productivity and communication abilities
 
 
 This is what it boils down to.
 
 C4l is dominated by linux based web apps. For people in a typical office
 setting, the technologies these involve are a lousy place to start learning
 to program. What most of them need is very different than what is discussed
 here and it depends heavily on their use case and environment.
 
 A bit of VBA, vbs, or some proprietary scripting language that interfaces
 with an app they use all the time to help with a small problem is a more
 realistic entry point for most people. However, discussion of such things
 is practically nonexistent here.

Well, as you mention that ... I'm one of the organizers of the 
DC-Baltimore Perl Workshop:

http://dcbpw.org/dcbpw2013/

Last year, we targeted the beginner's track as a sort of 'Perl
as a second language', assuming that you already knew the basic
concepts of programming (what's a variable, an array, a function,
etc.)

Would it be worth us aiming for an even lower level of expertise?

-Joe

ps.  Students & the unemployed are free ... $25 before March 1st,
 $50 after; will be April 20th at U. Baltimore.  We're also
 in talks with a training company to have either another track
 of paid training or a separate day (likely Sunday); they
 wouldn't necessarily be Perl-specific.


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-15 Thread Joe Hourcle
On Feb 15, 2013, at 9:00 AM, Lin, Kun wrote:

 Wow, interesting. But I am not a fan of Perl. Are there other workshops?

I don't know of any full workshops in the area, but there are plenty
of monthly or semi-monthly meetings of different groups:

Python: http://dcpython.org/

R : http://www.meetup.com/R-users-DC/

Groovy: http://www.dcgroovy.org/

Drupal: http://groups.drupal.org/washington-dc-drupalers

Hadoop: http://www.meetup.com/Hadoop-DC/

Ruby:   http://www.dcrug.org/

ColdFusion: http://www.cfug-md.org/


For those not in this area, see:

http://www.pm.org/groups/
http://wiki.python.org/moin/LocalUserGroups
http://r-users-group.meetup.com/
http://groups.drupal.org/
http://www.ruby-lang.org/en/community/user-groups/
http://www.haskell.org/haskellwiki/User_groups
http://coldfusion.meetup.com/

-Joe


[CODE4LIB] Learning programming data (was: You *are* a coder. So what am I?)

2013-02-15 Thread Joe Hourcle
On Feb 15, 2013, at 10:26 AM, Chris Gray wrote:

 Yes.  Exactly.  It's like saying you can't go to the doctor or hire a lawyer 
 without a bit of medical or law school.  Doctors and lawyers need to be able 
 to explain what they're doing.
 
 Another skill that would be useful is understanding databases, by which I do 
 not mean learning SQL.  Too many people's idea of working with data is Excel, 
 which provides no structure for data. Type in any data in any box.  There is 
 none of the data integrity that a database requires.  Here my ideal is 
 "Database Design for Mere Mortals", which teaches no SQL at all but teaches 
 how to work from data you know and use and arrive at a structure that could 
 easily be put into a database.  It's not just data, but data structure that 
 needs to be understood.  I've seen plenty of evidence that people who build 
 commercial database-backed software don't understand database structure.


I don't know of one specifically for the library community, but there are some 
courses for the science community on learning how to use scientific databases, 
or how to develop their own.

Two that I know well are Kirk Borne at GMU and Peter Fox and his cohorts at 
RPI, and there's been an effort from the Federation of Earth Science 
Information Partners (ESIP) to put together short presentations on various 
related topics: 

http://classweb.gmu.edu/kborne/ 
http://tw.rpi.edu/wiki/Peter_Fox
http://wiki.esipfed.org/index.php/Data_Management_Short_Course


With the need for expertise in data management, there's also been a push to 
teach librarians data curation & data management at Syracuse*, UIUC, and, 
recently, UNC.

http://eslib.ischool.syr.edu/
http://cirss.lis.illinois.edu/CollMeta/dcep.html

http://sils.unc.edu/programs/graduate/post-masters-certificates/data-curation

And, another conference that I'm helping to organize, the Research Data Access 
and Preservation (RDAP) Summit, also being held in Baltimore this year (April 
4-5, co-located with the IA Summit)**.  It's been a place for the science, 
library and archives community to discuss issues (and solutions) that we're 
facing; it can be an interesting overview for librarians who are starting to 
look into the management of data.  See the 'Resources' page for links to 
articles summarizing past years & videos of the talks from last year.***

http://www.asis.org/rdap/


-Joe


* disclaimer : I gave an invited talk to one of the Syracuse eScience classes a 
couple of years back.

** I know, you're thinking, 'what idiot would be involved with organizing two 
events being held weeks apart?' ... but I'm not ...  I'm organizing three, so 
if you know any craft vendors who might be interested in participating in a 
street festival in Upper Marlboro, Maryland the day before Mother's Day: 
http://MarlboroughDay.org/ .  (yes, it's the Marlboro of tobacco & horse fame, 
but we don't have cowboys)

*** although, my talk's particularly bad, as I wasn't expecting to actually 
give it 'til two of my three speakers bowed out at the last minute.  But both 
Peter Fox  Kirk Borne spoke in other sessions, and lots of other interesting 
people.

ps. and um ... the thing about people making database software who don't 
understand data structures ... that's also part of my complaint about that 
project with people writing software that they shouldn't have: storing 
journaled data in the same table, with no indexes, so an RDBMS becomes a 
document store, as there are only two useful accessors (one of which has to be 
checked to see if it's been deprecated by another record because of the 
journaling).
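That journaling complaint can be sketched concretely (field names here are invented for illustration, not the actual project's schema): when revisions are appended to the same table as current records, every read has to resolve which row is the latest.

```javascript
// Hypothetical journaled table: new revisions are appended rather than
// updated in place, so readers must resolve the latest revision per id.
const rows = [
  { id: 'rec1', rev: 1, value: 'draft' },
  { id: 'rec1', rev: 2, value: 'final' },  // supersedes rev 1
  { id: 'rec2', rev: 1, value: 'only' },
];

// Without an index, this is a full scan on every single access.
function latest(rows, id) {
  return rows
    .filter(r => r.id === id)
    .reduce((a, b) => (b.rev > a.rev ? b : a));
}

console.log(latest(rows, 'rec1').value);  // 'final'
```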


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-15 Thread Joe Hourcle
On Feb 15, 2013, at 12:27 PM, Kyle Banerjee wrote:
 On Fri, Feb 15, 2013 at 6:45 AM, Diane Hillmann 
 metadata.ma...@gmail.com wrote:
 
 I'm all for people learning to code if they want to and think it will help
 them. But it isn't
 the only thing library people need to know, and in fact, the other
 key skill needed is far rarer: knowledge of library data...
 
 ...More useful, I think, is for each side of that skills divide to value
 the skills of the other, and learn to work together
 
 
 Well put. No amount of technical skill substitutes for understanding what
 people are actually doing -- it's very easy to write apps that nail any set
 of specifications and then some but are still totally useless.
 
 Even if you never intend to do any programming, it's still useful to know
 how to code because it will help you know what is feasible, what questions
 to ask, and how to interpret responses.
 
 That doesn't mean you need to know any particular language. It does mean
 you need to grok the fundamental methodologies and constraints.

And the vocabulary (which Alison also mentioned; those who have read
Stranger in a Strange Land know that 'grok' was also associated with
understanding the language well enough to explain what something was.)

I've had *way* too many incidents where the problem was simply
mis-communication: one group was using a term, with some other intended
meaning, that had a specific meaning to the other group.  I even gave a
talk last year on the problem:


http://www.igniteshow.com/videos/polysemous-terms-did-everyone-understand-your-message

And one of the presenters earlier that night touched on the issue,
for scientists talking to politicians and the public:


http://www.igniteshow.com/videos/return-jedis-so-what-making-your-science-matter


It takes more than just people skills to coordinate between the 
customers & the software people.*  Being able to translate between
the problem domain's jargon and the programmers' (possibly via some
requirements language, like UML), or even just normalizing metadata
between the sub-communities, is probably 25-50% of my work.

As a quick example, there's 'data' ... it means something completely
different if you're dealing with scientists, programmers, or
information scientists.  For the scientists, metadata vs. data is
a legitimate distinction as not all of what programmers would
consider 'data' is considered to be 'scientific data'.

-Joe

* http://www.youtube.com/watch?v=mGS2tKQhdhY


Re: [CODE4LIB] editing code4lib livestream - preferred format

2013-02-15 Thread Joe Hourcle
On Feb 15, 2013, at 2:30 PM, Matthew Sherman wrote:

 Not to be snarky, but wouldn't the session on HTML5 video tell you what you
 need to know?

Code it in 3+ different formats, and stack your tags in hope that
you've used enough different codecs that the browser actually
supports one of them?

http://caniuse.com/#feat=video,ogv,webm,mpeg4

... then fail back to a synchronized slide show / audio:

http://caniuse.com/#feat=audio,svg-smil

... then fail back to Flash or some other security risk.
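The format-stacking just described can be sketched like this (file names and codec list are placeholders, not a recommendation); the browser plays the first `source` it supports, and anything else inside the element renders only when nothing matches:

```javascript
// Build markup for an HTML5 video element with stacked <source> tags.
function videoMarkup(basename, formats) {
  const sources = formats
    .map(([ext, type]) => `  <source src="${basename}.${ext}" type="${type}">`)
    .join('\n');
  // Fallback content renders only if no listed codec is supported.
  return `<video controls>\n${sources}\n` +
         `  <p>No supported codec; <a href="${basename}.mp4">download</a>.</p>\n` +
         `</video>`;
}

const html = videoMarkup('talk', [
  ['webm', 'video/webm'],
  ['ogv',  'video/ogg'],
  ['mp4',  'video/mp4'],
]);
console.log(html);
```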


(or did they have some other solution?)

-Joe



 
 On Fri, Feb 15, 2013 at 1:20 PM, Tara Robertson 
 trobert...@langara.bc.cawrote:
 
 Hi,
 
 I'm editing the video from code4lib into the sesison chunks.
 
 What format should I export the videos as? Anything else I should be aware
 of?
 
 Thanks,
 Tara
 --
 
 Tara Robertson
 
 Accessibility Librarian, CILS 
 http://www2.langara.bc.ca/cils/
 
 T  604.323.5254
 F  604.323.5954
 trobert...@langara.bc.ca
 
 Langara. http://www.langara.bc.ca
 
 100 West 49th Avenue, Vancouver, BC, V5Y 2Z6
 


Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-14 Thread Joe Hourcle
On Feb 14, 2013, at 8:57 AM, Karen Coyle wrote:

 EVERYONE should know some code. see:
 http://laboratorium.net/archive/2013/01/16/my_career_as_a_bulk_downloader
 
 But it's hard to find the classes that teach coding for everyone. This 
 would be a good thing for c4l'ers to do in their institutions. How to write 
 the short script you need to do something practical. Also, how to throw a few 
 things into a database so you can re-munge it or explore some connections. We 
 need those classes. We need to turn a room in the library into a hacker space 
 for the staff. A learning lab.


I just realized that the e-mails from Chris Erdmann a couple of weeks back were 
*not* on code4lib ... he's running a class on programming for librarians 
(specifically for processing data), and in a couple of weeks, they're going to 
have a workshop on interfaces at Harvard.  See below.  Also, a blog post from 
last month arguing that all librarians should know how to program:

http://altbibl.io/dst4l/109/

-Joe

ps. personally, I *hate* the term coder ... one, it makes me think 'code 
monkey', and what I do is much more involved than that (analyst, architect, 
sysadmin, dba, programming, debugging, tech support, etc.).  If I had an MLS, I 
might be a 'Systems Librarian', but I have a MIM (Info. Management ... still an 
LIS degree, but not the same accreditation).  It's still easier to tell the 
library community that's what I am, and it's easier to explain what I do to the 
science community by telling them I'm a 'data librarian'.*

Two, 'coding' is a relatively minor skill.  It's like putting 'typist' as a job 
title, because you use your keyboard a lot at work.  Figuring out what needs to 
be written/typed/coded is more important than the actual writing aspect of it.  
As for titles, over the years, I've had the job title of :

Programmer/Analyst
Systems Analyst
Software Engineer
UNIX Engineer
Multimedia Applications Analyst
Short Guy with Beard (which was only funny because there was a much 
shorter guy with a more impressive beard)
Web Developer
Webmaster (back when it meant the person who administered the service, 
not the person who made the website)
System Administrator
... etc.

(I've had a lot, as the university I worked at tied titles to pay rate, so every 
promotion required getting new business cards; right now, I work for a 
contractor, and the contractor gives me different titles than what NASA has me 
down as ... what matters is the roles I play and the work I do, not what 
category someone's lumped me into.  If you're going to insist on a title, I'd 
rather it be broad, like 'techie', than just 'coder'.)

* and to make it more confusing, my company's title for me is 'Principal 
Software Engineer', but I don't meet the requirements to be an engineer.  I 
went to an ABET accredited engineering program, but never took the EIT/FE or PE 
tests.  So I try to avoid the 'engineer' titles, too.



Begin forwarded message:

 From: cerdm...@cfa.harvard.edu
 Date: February 7, 2013 6:57:37 AM EST
 To: pam...@listserv.nd.edu
 Subject: [PAMNET] Liberact Workshop and Data Scientist Training for Librarians
 Reply-To: cerdm...@cfa.harvard.edu
 
 Good morning!
 
 Just a reminder to those thinking about interactive technologies in 
 libraries, this workshop may be of interest:
 http://altbibl.io/liberact/
 
 Also, we just started a course called Data Scientist Training for Librarians. 
 Follow along here:
 http://altbibl.io/dst4l/blog/
 
 Please forward to interested colleagues.
 
 Best regards,
 Christopher Erdmann, Head Librarian
 Harvard-Smithsonian Center for Astrophysics



Begin forwarded message:

 From: cerdm...@cfa.harvard.edu
 Date: January 25, 2013 5:06:58 PM EST
 To: pam...@listserv.nd.edu
 Subject: [PAMNET] Liberact Workshop Feb 28 - Mar 1 @ Harvard
 Reply-To: cerdm...@cfa.harvard.edu
 
 To individuals interested in interactive technologies in libraries, this
 event is for you.
 
 The Liberact Workshop aims to bring librarians and developers together
 to discuss and brainstorm interactive, gesture-based systems for library
 settings. An array of gesture-based technologies will be demonstrated on
 the first day with presentations, brainstorming and discussions taking
 place on the second day. The workshop will be held at the Radcliffe
 Institute of Advanced Study at Harvard University in Cambridge,
 Massachusetts, and takes place February 28 - March 1.
 
 Visit the Liberact Workshop website to learn more:
 
 http://altbibl.io/liberact
 
 To register, visit the Eventbrite page for the workshop:
 
 https://liberact.eventbrite.com
 
 We hope you will join us!
 
 Christopher Erdmann, Martin Schreiner, Lynn Schmelz, Susan Berstler,
 Paul Worster, Enrique Diaz, Lynn Sayers, Michael Leach 


[CODE4LIB] Comparison of JavaScript 'data grids'?

2013-02-14 Thread Joe Hourcle
A couple of weeks ago, I posted to Stack Exchange's 'Webmasters' site, asking 
if there were any good feature comparisons of different Javascript 'data grid' 
implementations.*

The response has been ... lacking, to put it mildly:**

http://webmasters.stackexchange.com/q/42847/22457

I can find all sorts of comparisons of databases, javascript frameworks, web 
browsers, etc ... but I just haven't been able to find anything on tabular data 
presentation other than the sort of 'top 10 list'-type stuff that doesn't go 
into detail about why you might select one over another.

Is anyone aware of such a comparison, or should I just put something half-assed 
up on wikipedia in hopes that the different implementations will fill it in?

-Joe

* ie, the ones that let you play with tabular data ... not the 'grid' stuff 
that the web designers use for layout, nor the 'data grid' stuff that the 
comp.sci & scientific community use for distributed data storage.

** maybe I should've just asked on Stack Overflow, rather than post to the 
correct topical place


Re: [CODE4LIB] Comparison of JavaScript 'data grids'?

2013-02-14 Thread Joe Hourcle

On Thu, 14 Feb 2013, Cary Gordon wrote:


I have used Flexigrid, but there are several choices, and one of the
others might better suit your needs.

I have informally tiered them by my (based on very little) perception
of their popularity.

Flexigrid: http://flexigrid.info/

Ingrid: http://reconstrukt.com/ingrid/
jQuery Grid: http://github.com/tonytomov/jqGrid

jqGridView: http://plugins.jquery.com/project/jqGridView
SlickGrid: http://github.com/mleibman/SlickGrid
DataTables: http://www.datatables.net/index
jTable: http://www.jtable.org/


Thanks for the effort, but that's the sort of thing that I *don't* need.

I'm concerned about what features they have, and which browsers they 
support.


For instance:
How can you feed data into it?
HTML tables (progressive enhancement)
XML
JSON
some other API
Can it cache data locally, and if so, how?
localStorage
webDB
indexedDB
How is it licensed?
commercial
BSD
GPLv2
GPLv3
LGPL

Does it do sorting / filtering / pagination locally, or does it
require a server component?

Can you extend the datatypes? (to support abnormal sorting)

Can you specify a function for rendering?
(eg, show negative numbers in red, wrapped in parens;
display alternate info when null)

Does it support ...
tree views?
dynamic groupings?
column re-ordering?
automatic table sizing (to fill the view)?
shift-clicking ranges of records?
alt/ctrl-clicking multiple records?
selecting checkboxes (so the table's a form input)
adding new rows?
hiding columns?
infinite scrolling?
editing of cells?
adding / deleting records?

Does it meet Section 508 requirements?

What's the realistic maximum for:
number of columns
number of rows displayed
number of records total (including not displayed)

... and the list goes on ... that's just some of the significant 
discriminators I've noticed when looking at the different implementations.
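To make one of those discriminators concrete — whether sorting / filtering / pagination happens locally — here's a minimal sketch of what "local" means (plain in-memory data, no actual grid library involved):

```javascript
// Client-side filter + sort + paginate over an in-memory row set; a grid
// doing this locally needs no server round-trip per user interaction.
function view(rows, { filter = () => true, key = null, desc = false,
                      pageSize = 10, pageNum = 0 } = {}) {
  const out = rows.filter(filter);          // copy, so rows is untouched
  if (key) {
    out.sort((a, b) =>
      (a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0) * (desc ? -1 : 1));
  }
  return out.slice(pageNum * pageSize, (pageNum + 1) * pageSize);
}

const rows = [
  { title: 'B', year: 2001 },
  { title: 'A', year: 1999 },
  { title: 'C', year: 2010 },
];
// Rows after 2000, sorted by year ascending, first page of two.
console.log(view(rows, { key: 'year', filter: r => r.year > 2000, pageSize: 2 }));
```

A server-backed grid does the same operations per request instead; that's the question the checklist is really asking.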


-Joe





On Thu, Feb 14, 2013 at 9:48 AM, Joe Hourcle
onei...@grace.nascom.nasa.gov wrote:

A couple of weeks ago, I posted to Stack Exchange's 'Webmasters' site, asking 
if there were any good feature comparisons of different Javascript 'data grid' 
implementations.*

The response has been ... lacking, to put it mildly:**

http://webmasters.stackexchange.com/q/42847/22457

I can find all sorts of comparisons of databases, javascript frameworks, web 
browsers, etc ... but I just haven't been able to find anything on tabular data 
presentation other than the sort of 'top 10 list'-type stuff that doesn't go 
into detail about why you might select one over another.

Is anyone aware of such a comparison, or should I just put something half-assed 
up on wikipedia in hopes that the different implementations will fill it in?

-Joe

* ie, the ones that let you play with tabular data ... not the 'grid' stuff that 
the web designers use for layout, nor the 'data grid' stuff that the comp.sci & 
scientific community use for distributed data storage.

** maybe I should've just asked on Stack Overflow, rather than post to the 
correct topical place




--
Cary Gordon
The Cherry Hill Company
http://chillco.com



Re: [CODE4LIB] You *are* a coder. So what am I?

2013-02-14 Thread Joe Hourcle

On Thu, 14 Feb 2013, Jason Griffey wrote:


On Thu, Feb 14, 2013 at 10:30 AM, Joe Hourcle onei...@grace.nascom.nasa.gov

wrote:



Two, 'coding' is a relatively minor skill.  It's like putting 'typist' as
a job title, because you use your keyboard a lot at work.  Figuring out
what needs to be written/typed/coded is more important than the actual
writing aspect of it.



Any skill is minor if you already have it. :-)

As others have pointed out, learning even a tiny, tiny bit of code is a
huge benefit for librarians. The vast, vast, vast, vast majority of people
have absolutely no clue how code translates into instructions for the magic
glowing screen they look at all day. Even a tiny bit of empowerment in that
arena can make huge differences in productivity and communication
abilities. Just understanding the logic behind code means that librarians
have a better understanding of what falls into the 'possible' and
'impossible' categories for doing stuff with a computer and anything that
grounds decision making in the possible is AWESOME.


It's true ... and learning lots of different programming languages makes 
you think about the problem in different ways*


But equally important is knowing that's it's just one tool.  It's like the 
quote, 'when you have a hammer, everything's a nail'.


... and more often than people realize, the correct answer is not to write 
code, or to write less of it.


I remember once, I had inherited a project where they were doing this 
really complex text parsing, and we'd spend a month or so of man-hours on 
it each year.  My manager quit, so I got to meet with the 'customer'.** 
I told her some of the more problematic bits, and some of them were things 
that she hadn't liked, so she used them to push back and get things changed
upstream.  The next year, I was able to shave a week off the turn-around 
time.


For the last few years, I've been dealing with software that someone 
wrote when what they *should* have done was survey what was out there, and 
figure out which one met their needs, and if necessary, adapt it slightly. 
Instead, they wrote massive, complex systems that were unnecessary.  And now 
we've got to support them, as there isn't the funding to convert it all over 
to something that has a broad community of support.


(and I guess that's one of my issues against 'coders' ... anyone who 
writes code should be required to support it, too ... I've done the 
'developer', 'sysadmin' and 'helpdesk' roles individually ... and when 
some developer makes a change that causes you to get 2am wakeup calls 
because the server crashes every night for two weeks straight,*** they of 
course can't roll back, because 'it's in production now, as it passed 
our testing'.)


-Joe

ps.  I like Stuart's 'Library Systems Specialist' title for those who
 actually work in libraries.

pps. Yes, I should actually be writing code right now.


* procedural, functional, OO, ... I still haven't wrapped my head around
  this whole 'noSQL' movement, and I used to manage LDAP servers and
  *love* hierarchical databases.  (even tried to push for its use in our
  local registry ... I got shot down by the others on the project).

** we were generating an HTML version of the schedule of classes based on
   the export generated from QuarkXPress, which was used to typeset the
   book.  The biggest problem was dealing with a department code that had
   an ampersand in it, and the hack that we did to the lexer to deal with
   it doubled the time of each run.  (and they made enough changes
   year-to-year that the previous year's script never worked right off the
   bat, so we'd have to run it, verify, tweak the code, re-run, etc.)

*** they never actually fixed the problem.  I put in (coded?) a watchdog
script that'd check every 60 sec. whether ColdFusion was down, and if
so, start it back up again.  So only when the config got corrupted did
I have to manually intervene.  By the time I was fired (long story,
unrelated), it was crashing 5-10 times a day.
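The watchdog logic is trivial to sketch; the check and restart actions below are stand-ins (the original was presumably a cron/shell script around ColdFusion, not Node):

```javascript
// One tick of a watchdog: if the health check fails, run the restart
// action.  In production this would fire on a schedule, e.g. every
// 60 seconds via setInterval or cron.
function watchdogTick(isUp, restart) {
  if (isUp()) return 'ok';
  restart();
  return 'restarted';
}

// Stub check/restart to show the flow (real ones would shell out to
// something like `ps` and an init script).
let serverUp = false;
const status = watchdogTick(
  () => serverUp,        // health check
  () => { serverUp = true; }  // restart action
);
console.log(status, serverUp);  // prints: restarted true
```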


Re: [CODE4LIB] post your presentation slides before your talk, please!

2013-02-13 Thread Joe Hourcle
On Feb 13, 2013, at 2:10 PM, Cynthia Ng wrote:

 Adding it to lanyrd is super easy too!

http://xkcd.com/949/


 On Wed, Feb 13, 2013 at 10:14 AM, James Stuart james.stu...@gmail.com wrote:
 If our entirely awesome presenters can, just drop an email on this thread
 or link into the IRC with your slides right before you go up. That way the
 talks that use code and small text can be followable without squinting.
 
 If you use dropbox, dropping the share link into IRC is super easy.
 
 Thanks!


ps.  If you're using a tablet and can't see the title text (aka. 'alt text') 
for the image, save this as a bookmark, then select it when you're on an xkcd page 
(I hope it'll work on tablets ... I use it for printing out the comics)


javascript:function%20hide(item){item.style.setProperty('display','none')};hide(document.getElementById('bottom'));hide(document.getElementById('topContainer'));Array.prototype.slice.call(document.getElementById('middleContainer').getElementsByTagName('ul'),0).forEach(hide);document.getElementById('ctitle').style.fontSize='3em';img=document.getElementById('comic').getElementsByTagName('img')[0];img.insertAdjacentHTML('afterend','<p%20style="padding:0em%201em%200em%201em">'+comic.getElementsByTagName('img')[0].title+'</p>');document.getElementById('ctitle').style.fontSize='3em';

 


Re: [CODE4LIB] On-the-fly Closed Captioning

2013-02-06 Thread Joe Hourcle
On Feb 6, 2013, at 4:16 PM, John Wynstra wrote:

 I have been asked to find out whether there are software or hardware
 solutions for on-the-fly closed captioning.  We currently work with
 University IT production house on campus to perform this task.  I'm not
 involved in any aspect of this at this time, but have been asked to
 investigate.
 
 Workflow is like this:
 1) purchase a separate VHS copy of movie for captioning purpose (license
 issues I believe)
 2) view show and write a transcript (probably time consuming)
 3) Campus IT production creates a closed captioned digital copy using
 transcript and movie.

[trimmed]

 Thoughts?

It would never get you full closed captioning.  It might get you
subtitles, but true closed captioning also includes comments about
other audio (background music, dogs barking, singing vs.
mumbling vs. normal speech)

Some of the more elaborate ones that I've seen will specifically
move the text around on the screen so that they're not blocking
important visual items.  (that might've been on a DVD, and not
standard closed captioning; much of my experience was in hanging
out with folks from Gallaudet during undergrad and one of my dad's
ex-girlfriends who only had partial hearing, but this was all in
the mid to late 1990s)

If you watch most news programs these days, they seem to use some
sort of automatic closed captioning, as it's just awful.  Lots of 
homophone confusion, random phonetic misspellings, etc.

...

I would think that something more productive, as you're dealing
with existing published media and not stuff that you're generating
yourself, would be to see if there exists some cooperative library
of closed captioning, and if there isn't, to make one.

(so that people can submit time-tagged text to go with a 
given ISBN for a VHS or DVD)
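A shared record of that kind might look something like this (an illustrative shape only, not the AMX database's actual schema; the ISBN is a placeholder):

```javascript
// Hypothetical shared caption record: time-tagged cues keyed by the
// ISBN of a specific VHS/DVD release, including non-speech audio notes.
const captionRecord = {
  isbn: '0-000-00000-0',  // placeholder identifier
  cues: [
    { start: 1.0, end: 3.5, text: '[dog barking]' },
    { start: 4.0, end: 6.0, text: 'Hello there.' },
  ],
};

// Return the cue text active at playback time t (in seconds).
function cuesAt(record, t) {
  return record.cues
    .filter(c => t >= c.start && t < c.end)
    .map(c => c.text);
}

console.log(cuesAt(captionRecord, 2.0));  // [ '[dog barking]' ]
```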

... and a quick search seems to suggest that one exists; the
Alternate Media eXchange Database:

http://www.amxdb.net/


It seems there's also an 'OpenSubtitles' player which isn't
restricted to educational institutions, but as it's all torrent
files and looks like many other torrent trackers, I'm afraid to
download them (for fear they've got the video included).

-Joe


Re: [CODE4LIB] conf presenters: a kind request

2013-02-05 Thread Joe Hourcle
On Feb 5, 2013, at 9:42 AM, Wilhelmina Randtke wrote:

 If your university or any local professional groups have brown bag lunches
 with presentations, or anything informal and about the same amount of time
 as the conference presentation, then you can ask the group if you can do a
 dry run there.

And if you want to get critiques on the manner of presentation, rather
than the content, you might consider checking to see if there's a
Toastmasters group in your area:

http://www.toastmasters.org/

(there are some dues associated with the club, though ... but for those
with a fear of public speaking, they can help you through it)

-Joe


Re: [CODE4LIB] conf presenters: a kind request

2013-02-04 Thread Joe Hourcle
On Feb 4, 2013, at 11:25 AM, Bill Dueber wrote:

[trimmed (and agreed with all of that)]

 As Jonathan said: this is a great, great audience. We're all forgiving,
 we're all interested, we're all eager to lean new things and figure out how
 to apply them to our own situations. We love to hear about your successes.
 We *love* to hear about failures that include a way for us to avoid them,
 and you're going to be well-received no matter what because a bunch of
 people voted to hear you!

I'd actually be interested in people's complaints about bad presentations;
I've been keeping notes for years, with the intention of making a
presentation on giving better presentations.  (but it's much harder than
it sounds, as I plan on making all of the mistakes during the presentation)


 On Mon, Feb 4, 2013 at 10:47 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 
 We are all very excited about the conference next week, to speak to our
 peers and to hear what our peers have to say!
 
 I would like to suggest that those presenting be considerate to your
 audience, and actually prepare your talk in advance!

[trimmed]

 Just practice it once in advance (even the night before, as a last
 resort!), and it'll go great!


I did one of those 'Ignite' talks this year; because it's auto-
advancing slides, I went over it multiple times.  My recommendation
is that you try to get various co-workers as guinea pigs.  I even
subjected one of my neighbors to it, even though he wasn't necessarily
part of the intended audience.

They gave me a lot of feedback -- asking for clarification on bits,
we realized I could trim down a couple of slides, giving me more
slides to expand other bits.  I still screwed up the presentation,
but it would have been much worse if I hadn't practiced.

My local ASIST chapter used to run 'preview' events before the
annual meeting, where the local folks presenting at annual were
invited to give their talks.  If nothing else, it forced you to
have it done a couple of weeks early, but more importantly, it
gave me a chance to have a similar audience to what would be
at the main meeting ... one of my talks bombed hard; it was on
standards & protocols for scientific data, and I hadn't considered
just how badly a talk that's 50% acronyms would go over.  I was
able to change how I presented the material so it wasn't quite
so painful the second time around.

There's only been once when practicing in advance made for a worse
presentation ... and that's because when I finished, PowerPoint asked
me if I wanted to save the timings ... whatever you do, do *not*
tell it yes.  Because then it'll auto-advance your slides, so if
you skipped over one slide during the practice, it won't let you
keep it up during the real talk.

(There's a setting to turn off use of timings ... and the audience
laughed when I kept scolding the computer, but it still felt
horrible when I was up there)

And it's important that you *must* practice in front of other
people.  How long you think it's going to take, or how fast you
go when talking to yourself, is nothing like talking in
front of other people.

...

So, all of that being said, here are some of the things I've made
a note of over the years.  (it's incomplete, as I still take notes
by hand, and there are more items on the back pages of the 
various memo books I've had over the years)

* Get there before the session, and test your presentation on the
  same hardware as it's going to be presented from.  This is
  especially important if you're a Mac user, and presenting from
  a PC, or visa-versa.  Look for odd fonts, images that didn't
  load, videos, abnormal gamma, bad font sizes (may result in
  missing text), missing characters, incorrect justification, etc.

* If you're going to be presenting from your own machine, still
  test it out, to make sure that you have all of the necessary
  adaptors, that you know what needs to be done to switch the
  monitor, that the machine detects the projector at a reasonable
  size and the gamma's adjusted correctly.  (and have it loaded
  in advance; you're wasting enough time switching machines).
  And start switching machines while the last presenter's doing
  Q&A ... and if you lose 5 min because of switching, prepare
  to cut your talk short rather than force the following
  presenters to lose time.

* Have a backup plan, with the presentation stashed on a website
  that you've memorized the URL to, *and* on a USB stick.
  (website is safer vs. virus transfer, only use the USB stick
  if there's no internet)  And put the file at the top level of
  the USB stick, not buried 12 folders deep.

* If they have those clip-on microphones, put it on your lapel
  on the same side as the screen is to you.  (so whenever you
  turn to look at the screen, it still picks up your voice)

* If you have a stationary mic, you have to actually stay near
  it or it doesn't work.

* Hand-held mics suck unless you're used to them, as most of us
  aren't used to holding our hand up 

Re: [CODE4LIB] Linked data [was: Why we need multiple discovery services engine?]

2013-02-04 Thread Joe Hourcle
On Feb 4, 2013, at 10:34 AM, Donna Campbell wrote:

 In mentioning pushing to break down silos more, it brings to mind a
 question I've had about linked data.
 
 From what I've read thus far, the idea of breaking down silos of
 information seems like a good one in that it makes finding information
 easier but doesn't it also remove some of the markers of finding credible
 sources? Doesn't it blend accurate sources and inaccurate sources?

Yes, yes it does.

The 'intelligence' community has actually been talking about this
problem with RDF for years.  My understanding is that they use
RDF quads (not triples) so that they have an extra parameter to
track the source.  (it might be that they use something larger
than a quad).

From what I remember (the conversation was years ago), they
have to be able to mark information as suspect (eg, if they find
that one of the sources is unreliable, they re-run all of
the analysis without that source's contribution to determine
whether they come to the same result).

I don't know enough about the implementation of linked data
systems to say whether there's some way to filter which sources
are considered for input, or whether there's any tracking of the
RDF triples once they're parsed out.
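A toy version of the quad idea (field names invented for illustration; real stores use named graphs / N-Quads for this):

```javascript
// Triples extended with a 'source' field -- the extra element of a
// quad -- so an analysis can be re-run with a suspect source excluded.
const quads = [
  { s: 'doc1', p: 'creator', o: 'Smith', source: 'catalogA' },
  { s: 'doc1', p: 'creator', o: 'Jones', source: 'catalogB' },
  { s: 'doc1', p: 'date',    o: '1999',  source: 'catalogA' },
];

// Drop statements from sources now considered unreliable.
function trusted(quads, suspectSources) {
  return quads.filter(q => !suspectSources.includes(q.source));
}

// Re-running the 'analysis' without catalogB drops its contribution,
// so results can be compared with and without the suspect source.
console.log(trusted(quads, ['catalogB']).length);  // 2
```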


-Joe


Re: [CODE4LIB] Tablets to help with circulation services

2013-01-23 Thread Joe Hourcle
On Jan 23, 2013, at 12:34 PM, Stephen Francoeur wrote:

 We're looking into ways that tablets might be used by library staff
 assisting patrons in a long line at the circ desk. With a tablet, an
 additional staff person could pick folks off the line who might have things
 that can be handled on a properly outfitted tablet.

[trimmed]

I have two thoughts on the matter --

1. Trying to take a picture with a tablet is pretty awkward.  It might
   be better on a smaller form-factor device.  (eg, an iPod Touch or an
   Android phone w/out a service plan) ... but this might be less
   useful for other tasks.

2. It might be worthwhile to look at what tasks can be handled by staff
   without a computer, or without a specially outfitted computer.
   (eg, can you answer reference questions using the publicly
   available website?)

-Joe


Re: [CODE4LIB] anti-harassment policy for code4lib?

2012-11-27 Thread Joe Hourcle
On Nov 26, 2012, at 7:47 PM, Michael J. Giarlo wrote:

 Hi Kyle,
 
 IMO, this is less an instrument to keep people playing nice and more an
 instrument to point to in the event that we have to take action against an
 offender.

That was the reasoning for the DCBPW code of conduct ... covering ourselves
if we had to eject someone.

And it's not just a diversity thing -- 

One of the concerns for the DCBPW one was that there had been a guy at
some previous Perl workshop who seemed to think that the presentations
were personal conversations between him and the speaker, and kept
interjecting.

The sad reality is, there seem to be an abnormally high number of
people in the technology fields who have gotten as far as they have
with little to no understanding of social etiquette.

(I've been told that I can cite myself as an example ... if you
don't believe me, do a `whois annoying.org`)

-Joe

