Re: [CODE4LIB] Canvas Fingerprinting by AddThis

2014-08-14 Thread Keith Jenkins
http://www.addthis.com/privacy/opt-out

Is this satire?


Re: [CODE4LIB] book cover api

2013-12-04 Thread Keith Jenkins
I was a bit surprised that these techniques from 2005 still work...
http://aaugh.com/imageabuse.html

Basically, Amazon cover images can be manipulated via the URL.  Of
course, you'll probably want to check Amazon's terms of use.

Keith


Re: [CODE4LIB] book cover api

2013-12-04 Thread Keith Jenkins
So, any bets on which book cover image provider will be the first to
implement IIIF?

http://www-sul.stanford.edu/iiif/image-api/1.1/
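
For what it's worth, here is a minimal Python sketch of how a request for an exact cover size might look under IIIF Image API 1.1. The server name and the use of an ISBN as the identifier are hypothetical; only the URL pattern (identifier/region/size/rotation/quality.format) is taken from the spec.

```python
def iiif_image_url(base, identifier, width, height, best_fit=True):
    """Build an IIIF Image API 1.1 URL for the full image at a given size."""
    # "!w,h" scales the image to fit inside the box and keeps the aspect ratio;
    # "w,h" forces the exact dimensions and may distort the image.
    size = f"{'!' if best_fit else ''}{width},{height}"
    # region=full, rotation=0, quality=native
    return f"{base.rstrip('/')}/{identifier}/full/{size}/0/native.jpg"


# Hypothetical server and identifier, with the 114x166 size from the original question.
print(iiif_image_url("http://covers.example.org/iiif", "9780316227940", 114, 166))
```

The point is that the size becomes a parameter chosen by the client, rather than a fixed S, M or L decided by the provider.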

Keith


On Wed, Dec 4, 2013 at 2:41 PM, Karen Coyle li...@kcoyle.net wrote:
 Open Library book covers come in S, M and L -

 https://openlibrary.org/dev/docs/api/covers

 Of course, if what you want isn't exactly one of those...

 kc


 On 12/4/13 9:34 AM, Kaile Zhu wrote:

 A while ago, we had a discussion about book cover APIs.  I tried some of
 those mentioned and found they are working to some degree, but none of them
 would offer the size I want.  The flexibility of the size is just not there.
 The size I am looking for is like this:
 http://img1.imagesbn.com/p/9780316227940_p0_v2_s114x166.JPG

 Has anybody found a way of implementing a book cover API to your
 specifications successfully, and are you willing to share it with me?  Off-line
 if you want.  Much appreciated.  Thanks.

 Kelly Zhu
 405-974-5957
 kz...@uco.edu



 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet




Re: [CODE4LIB] LOC Subject Headings API

2013-06-05 Thread Keith Jenkins
 the LCSH master file is so big that it basically crashes the server.

Do you really want to use the full LCSH, or just the subset that
exists in your local catalog?

Or, to put it another way: do you really want to provide the user with
search suggestions that will result in zero hits?

Keith


Re: [CODE4LIB] Stand Up Desks

2013-03-04 Thread Keith Jenkins
I just learned that the UNC Gillings School of Global Public Health
actually has public walkstations that can be reserved by faculty,
staff, and students.  The walkstations are treadmills with
adjustable-height desks.

http://www.sph.unc.edu/weekly_news/august_27_2012_23670_13974.html#new_walkstation

I'm not sure which will happen first: treadmills in the library, or
desks in the gym.

Keith


Re: [CODE4LIB] Lib or Libe

2013-02-13 Thread Keith Jenkins
Code4'brary

On Wed, Feb 13, 2013 at 11:40 AM, Jessie Keck jk...@stanford.edu wrote:
 Wait, you're telling me it's not Code4Liberty?

 - Jessie

 On Feb 13, 2013, at 10:18 AM, Thomas Bennett wrote:

 After voting I am surprised at the results, it's a library as in libe, not a 
 leebrary as in lib, ryght or is that reeght or rit?

 Thomas or is it Thoomas

 you say tomato I say tomato
 pecan or pecan
 In these two examples maybe pronounce it as you wish or weesh or woosh, what 
 ever…..


 
 Support Request: http://portal.support.appstate.edu

 Thomas McMillan Grant Bennett
 Operations & Systems Analyst, University Library
 Appalachian State University
 P O Box 32026, Boone, North Carolina 28608
 (828) 262 6587
 Library Systems: http://www.library.appstate.edu
 


 On Feb 13, 2013, at 11:08 AM, Fleming, Declan wrote:

 Hi - at the conference, there has been much foment about how to pronounce 
 the end of code4lib.

 Please go to:
 https://docs.google.com/forms/d/1lseCc2gwQUXL6oC8aLB7N8YMRnjsl90SfPHAmX5EA_w/viewform

 and vote.

 D


Re: [CODE4LIB] Rdio playlist

2013-02-04 Thread Keith Jenkins
Can't believe no one has yet mentioned Chicago band I Fight Dragons
-- they mix NES-controller chiptunes with electric guitars, playing
rockin' covers of Mario Bros., Legend of Zelda, Contra, etc. in
addition to some originals that have been remixed by other Chicago
musicians.  They just got back from their War of Cyborg Liberation
Tour.


Re: [CODE4LIB] Responsive Web Site Live

2013-01-03 Thread Keith Jenkins
Does anyone here have any experience with browser emulators such as
BrowserStack?  http://www.browserstack.com/

If so, have you come across any significant differences between the
emulators and the real thing?

Keith


On Wed, Jan 2, 2013 at 5:34 PM, Ron Gilmour rgilmou...@gmail.com wrote:
 Ideally, of course, one would have a mobile device lab
 http://mobile.smashingmagazine.com/2012/09/24/establishing-an-open-device-lab/
 where one could test a site on all kinds of devices, but that's not likely
 at a small college library.


Re: [CODE4LIB] Metrics for measuring digital library production

2012-12-17 Thread Keith Jenkins
 Just wondering who might be willing to share what kind of stats they
 produce to justify their continued existence?

Although it's more anecdotal (rather than statistical), fan mail can
help make a convincing argument that certain services are worth
continuing.  Here at our library, I've seen both e-mails and
snail-mailed letters gushing with thanks for help at the reference
desk, thanks for a specific digital collection, thanks for an in-depth
research consultation.  Some of these are sent to specific staff, and
some to the library as a whole.

Our library director usually reads one or two of these aloud at our
all-staff meetings each semester, so I know she sees value in them.

Keith


Re: [CODE4LIB] code4lib online study group - GIS anyone?

2012-12-03 Thread Keith Jenkins
Hi, Bess.

Check out the online trainings from OpenGeo, a company based in New
York City that has many of the major geospatial open source developers
on their payroll (so they really know what they are talking about):
http://opengeo.org/products/training/

There are so many open source geospatial packages that it can be
difficult to understand how they are all related, but you would do
well to focus on PostGIS and GeoServer.  (OpenLayers is good too, but
now has some healthy competition from lighter-weight javascript
libraries like Leaflet.)

OSGeo (the foundation, not to be confused with OpenGeo, the company)
has a bootable DVD that will let you run most of the available open
source geospatial software without having to install anything first:
http://live.osgeo.org/

The FOSS4G conference is a great place to learn more; there are
usually plenty of pre-conference workshops for the major projects.
The 2013 conference is going to be in the UK, but there will also be a
regional conference in Minneapolis, May 22-24:
http://foss4g-na.org/

We are currently planning a complete rewrite of our geospatial data
repository, to be based on OpenGeoportal, which originated at Tufts
and now has several other deployments elsewhere.  Its major strength
is its excellent spatial relevance ranking (something which fails
miserably in nearly every other geodata portal I've ever seen)
implemented using Solr.  It also uses GeoServer, OpenLayers, etc.
http://opengeoportal.org/

I'd be interested in knowing how that online course turns out, and
would be happy to try to help out if you run into any stumbling
blocks.

Cheers,
Keith

Keith Jenkins
GIS/Geospatial Applications Librarian, Cornell University

On Sat, Dec 1, 2012 at 6:28 PM, Bess Sadler bess.sad...@gmail.com wrote:
 There's an interesting thread going on around code4lib study groups for a 
 given MOOC (or, presumably, other kinds of online training).

 I am currently attempting to educate myself in the subject of how to design, 
 build, and maintain a spatial data infrastructure for a library[1]. This will 
 serve out our local GIS resources, enabling them to be incorporated into 
 online mapping programs. I know that this is something that many academic 
 libraries are going to have to tackle eventually. Luckily there are some 
 great open source tools out there for tackling this job, but unluckily there 
 is not a lot of training that I have been able to find.

 I have uncovered one online course that looks pretty good: 
 http://www.geospatialtraining.com/index.php?option=com_catalog&view=node&id=71%3Aopen-source-gis-bootcamp&Itemid=108

 I signed up, but I haven't got very far yet. I wonder if having code4lib 
 collaborators would help? If anyone else is undertaking a project like this 
 and would like to form a support / study group, please let me know.

 Cheers,
 Bess

 [1]We are also hiring a GIS developer: http://goo.gl/PURkZ


Re: [CODE4LIB] Seeking examples of outstanding discovery layers

2012-09-19 Thread Keith Jenkins
It's not a library, but the McMaster-Carr product catalog is a classic:
http://www.mcmaster.com/

Keith


On Wed, Sep 19, 2012 at 3:00 PM, Tania Fersenheim tan...@brandeis.edu wrote:
 Got a favorite discovery interface?  Send me the URL

 I am doing some quick & dirty investigation into libraries that have
 successfully and elegantly integrated discovery of various resources,
 e.g.:

  - library catalog
  - federated indexing service such as  Serials Solutions or Primo
 Central, or a federated search system like Metalib
  - ejournals
  - ebooks
  - libguides
  - library web site
  - worldcat local
  - that kind o' stuff

 I am looking for sites that are both nice to look at and seem easy to
 use.  I will assume that if you're touting your own site it is
 technologically sophisticated.  :-D  Got any faves?

 Tania

 --

 Tania Fersenheim
 Manager of Library Systems

 Brandeis University
 Library and Technology Services

 415 South Street, (MS 017/P.O. Box 549110)
 Waltham, MA 02454-9110
 Phone: 781.736.4698
 Fax: 781.736.4577
 email: tan...@brandeis.edu


Re: [CODE4LIB] Storing lat / long

2012-06-28 Thread Keith Jenkins
On Thu, Jun 28, 2012 at 2:57 PM, Mark Jordan mjor...@sfu.ca wrote:
 What's the best (i.e., most standardized and flexible) format for storing 
 single-point geocoordinates?

Definitely stick with decimal degrees (-122.61458), because dealing
with minutes and seconds (122° 36' 52.5" W) is a real nuisance and
unnecessarily complicates everything.

If you are looking to embed coordinates within a data format like
JSON, you might want to look at GeoJSON:
http://www.geojson.org/geojson-spec.html#id2
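
If you already have coordinates in degrees, minutes and seconds, the conversion is simple enough. Here is a rough Python sketch that converts the example above and wraps it in a GeoJSON Point. The function names and the latitude value are made up for illustration.

```python
import json


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def geojson_point(lon, lat):
    """Return a GeoJSON Point. GeoJSON orders coordinates as [longitude, latitude]."""
    return {"type": "Point", "coordinates": [lon, lat]}


lon = round(dms_to_decimal(122, 36, 52.5, "W"), 5)
lat = 45.5  # hypothetical latitude, only for illustration
print(json.dumps(geojson_point(lon, lat)))
```

Note the longitude-first ordering in GeoJSON, which trips up a lot of people who are used to saying "lat / long".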

Cheers,
Keith


Re: [CODE4LIB] archiving a wiki

2012-05-23 Thread Keith Jenkins
Many organizations are using Archive-It, the Internet Archive's
service for harvesting and preserving specific websites.  I think it
can be used to produce public or private archives.

http://www.archive-it.org/

Keith


On Tue, May 22, 2012 at 5:04 PM, Carol Hassler
carol.hass...@wicourts.gov wrote:
 My organization would like to archive/export our internal wiki in some
 kind of end-user friendly format. The concept is to copy the wiki
 contents annually to a format that can be used on any standard computer
 in case of an emergency (i.e. saved as an HTML web-style archive, saved
 as PDF files, saved as Word files).


Re: [CODE4LIB] Lift the Flap books

2012-02-15 Thread Keith Jenkins
The Massachusetts Historical Society had to deal with the flap issue
when presenting Thomas Jefferson's Notes on the State of Virginia.
Jefferson had inserted blocks of text into the manuscript by gluing a
flap at the point of insertion.  Some of the flaps have text on both
sides, others on one side only.

Here's an example of how this is presented in the document view:
http://masshist.org/thomasjeffersonpapers/notes/nsvviewer.php?page=5

I'm not certain about this particular document, but most of the other
documents in the collection were marked up using TEI.

Keith


On Tue, Feb 14, 2012 at 8:34 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 On 15/02/12 13:43, Sara Amato wrote:

 If you were to have a 'lift the flap' type book that you wanted to
 digitize, for web display and use, what technology would you use for markup
 and display?



Re: [CODE4LIB] marc in json

2011-12-01 Thread Keith Jenkins
On Thu, Dec 1, 2011 at 11:56 AM, Gabriel Farrell gsf...@gmail.com wrote:
 I suspect newline-delimited will win this race.

Yes.  Everyone please cast a vote for newline-delimited JSON.

Is there any consensus on the appropriate mime type for ndj?
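
For anyone who hasn't used it: newline-delimited JSON is just one complete JSON document per line, so a large file can be streamed record by record. A minimal Python sketch follows. The record structure is a made-up simplification, not any agreed MARC-in-JSON layout.

```python
import io
import json


def write_ndjson(records, stream):
    """Write one compact JSON document per line."""
    for record in records:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        stream.write(json.dumps(record) + "\n")


def read_ndjson(stream):
    """Yield records one at a time, without loading the whole file."""
    for line in stream:
        if line.strip():  # tolerate blank lines
            yield json.loads(line)


# Hypothetical, heavily simplified records.
records = [
    {"id": 1, "title": "Title one"},
    {"id": 2, "title": "Title two,\nwith a line break"},
]
buffer = io.StringIO()
write_ndjson(records, buffer)
buffer.seek(0)
print(list(read_ndjson(buffer)) == records)
```

The main attraction is that one bad record does not make the rest of the file unparseable, and ordinary line-oriented tools still work.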

Keith


[CODE4LIB] conference voting and registration

2011-12-01 Thread Keith Jenkins
On Thu, Dec 1, 2011 at 11:57 AM, Ross Singer rossfsin...@gmail.com wrote:
 Last year we had 129
 unique voters for the proposals, roughly unchanged from Asheville
 (119).  Both cases FAR fewer than the number of delegates (and more
 importantly, the number of people that wanted to be delegates).

Just a thought: If we ever wanted to move to a lottery-based
registration for the conference, perhaps those who take time to cast
votes for presentation proposals could be weighted slightly.

Keith (who sadly missed out on the whole Black Wednesday rush for
Code4Lib 2012)


Re: [CODE4LIB] Job Posting: Digital Library Repository Developer, Boston Public Library (Boston, MA)

2011-09-28 Thread Keith Jenkins
Unless it has changed, I think the official posting policy is here:
https://listserv.nd.edu/cgi-bin/wa?A2=ind0311&L=CODE4LIB&D=0&T=0&P=3396


Re: [CODE4LIB] source of marc geographic code?

2011-06-23 Thread Keith Jenkins
On Thu, Jun 23, 2011 at 10:59 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 On 6/22/2011 11:25 PM, Ross Singer wrote:
 Can't you use:
 http://www.loc.gov/standards/codelists/gacs.xml

 Yes, I can! I didn't know about/hadn't found that one either; it hadn't been
 mentioned until now. Thanks! Where did you find that?

That XML file is linked from near the bottom of this page:
http://www.loc.gov/marc/geoareas/

Keith


Re: [CODE4LIB] stemming in author search?

2011-06-14 Thread Keith Jenkins
Does Solr support Soundex?  (Soundex was originally developed to
assist with alternate spellings of names)
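
To show what I mean, here is a rough Python sketch of classic American Soundex. It is my own illustration of the algorithm and says nothing about how Solr would implement or expose it.

```python
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in group:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character American Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(letter, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code is kept
    return (result + "000")[:4]


for name in ("Smith", "Smythe", "Jenkins", "Jenkyns"):
    print(name, soundex(name))
```

Variant spellings of the same surname collapse to one code, which is the kind of matching that stemming is not designed to give you for author names.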

Keith


On Mon, Jun 13, 2011 at 8:08 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 In a Solr-based search, stemming is done at indexing time, into fields with 
 stemmed tokens.

 It seems typical in library-catalog type applications based on Solr to have 
 the default (or even only) searches be over these stemmed fields, thus 
 'auto-stemming' to the user. (Search for 'monkey', find 'monkeys' too, and 
 vice versa).

 I am curious how many people, who have Solr based catalogs (that is, I'm 
 interested in people who have search engines with majority or only content 
 originally from MARC), use such stemmed fields ('auto-stemming') over their 
 _author_ fields as well.

 There are pros and cons to this. There are certainly some things in an 
 author field that would benefit from stemming (mostly various kinds of 
 corporate authors, some of whose endings end up looking like english language 
 phrases). There are also very many things in an author field that would not 
 benefit from stemming, and thus when stemming is done it sometimes(/often?) 
 results in false matches, pluralizing an author's last name in an 
 inappropriate way for instance.

 So, wanna say on the list, if you are using a Solr-based catalog, are you 
 using stemmed fields for your author searches? Curious what people end up 
 doing.  If there are any other more complicated clever things you've done 
 than just stem-or-not, let us know that too!

 Jonathan



Re: [CODE4LIB] Same CMS for both Intranet and Public websites?

2011-06-10 Thread Keith Jenkins
We are using Confluence here, campus-wide -- not just for the
libraries.  The campus installation has over 200 spaces (projects),
some of which are public, some private.  You can also have private
pages within a public space, and vice versa.  You can change
permissions on any page, and pages inherit permissions by hierarchy,
so it is pretty easy to set up a private section of a public space,
etc.

However, there were some issues I encountered a while back, when
trying to move a private page into the public part of the page
hierarchy: it was telling me that the page had no restrictions, but
apparently no one else could see it.  This problem may have been fixed
in our recent upgrade, although I've just gotten used to the
workaround we found (adding another restriction and then removing it).

Keith


On Thu, Jun 9, 2011 at 10:21 PM, Cary Gordon listu...@chillco.com wrote:
 FWIW, we prefer Confluence for documentation-centric intranets.


Re: [CODE4LIB] RDF for opening times/hours?

2011-06-08 Thread Keith Jenkins
schema.org may have some potential, but it's not clear to me how the
LocalBusiness/openingHours is supposed to work with anything but
regular hours...

Here's how openingHours are described at http://schema.org/LocalBusiness


The opening hours for a business. Opening hours can be specified as a
weekly time range, starting with days, then times per day. Multiple
days can be listed with commas ',' separating each day. Day or time
ranges are specified using a hyphen '-'.
- Days are specified using the following two-letter combinations: Mo,
Tu, We, Th, Fr, Sa, Su.
- Times are specified using 24:00 time. For example, 3pm is specified as 15:00.
Here is an example: <time itemprop="openingHours" datetime="Tu,Th
16:00-20:00">Tuesdays and Thursdays 4-8pm</time>


So how would one indicate different hours on different days?  Is there
more documentation that I'm missing?

If I wanted to create a service to take this data and calculate whether
the business is open right now, I'm still stuck having to parse and
interpret a text string, which defeats the purpose of having this
information encoded as data.
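
To make the complaint concrete, here is roughly the parsing such a service would need, sketched in Python. It assumes that different hours on different days are expressed by repeating the property with several strings, which the documentation quoted above does not actually say. The function names are my own.

```python
from datetime import datetime

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]  # Monday is 0, as in datetime.weekday()


def parse_opening_hours(spec):
    """Parse e.g. "Tu,Th 16:00-20:00" into (set of weekdays, opening time, closing time)."""
    day_part, time_part = spec.split()
    weekdays = set()
    for item in day_part.split(","):
        if "-" in item:
            first, last = (DAYS.index(d) for d in item.split("-"))
            # Assumes a day range never wraps past Sunday (no "Sa-Mo").
            weekdays.update(range(first, last + 1))
        else:
            weekdays.add(DAYS.index(item))
    # Assumes closing time is later the same day and never written as 24:00.
    opens, closes = (datetime.strptime(t, "%H:%M").time() for t in time_part.split("-"))
    return weekdays, opens, closes


def is_open(specs, when):
    """True if `when` falls inside any of the given openingHours strings."""
    for spec in specs:
        weekdays, opens, closes = parse_opening_hours(spec)
        if when.weekday() in weekdays and opens <= when.time() < closes:
            return True
    return False


hours = ["Mo-Fr 09:00-17:00", "Sa 10:00-14:00"]  # hypothetical library hours
print(is_open(hours, datetime(2011, 6, 8, 15, 30)))
```

Even this small sketch has to guess about wrap-around ranges, late-night closing, holidays and exceptions, none of which the format addresses.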

Keith


Re: [CODE4LIB] A right way for recording a place name?

2011-05-31 Thread Keith Jenkins
1 and 2 probably represent two different geographic levels with the
same name.  There is a township (county subdivision) called
Springfield, which also contains a city called Springfield.

If you are planning to generate LCSH placenames, one thing to note is
that LCSH typically uses old-style state abbreviations (Mass.,
Pa.) instead of the more common postal abbreviations (MA, PA).

I thought there might be some way to use id.loc.gov but for some
reason none of your example LCSH headings show up in a search for
springfield -- maybe place headings are not comprehensively included
in id.loc.gov?

Keith


On Tue, May 31, 2011 at 11:02 AM, Ethan Gruber ewg4x...@gmail.com wrote:
 Hi all,

 I've just about completed a new XForms-based interface for querying
 geonames.org to populate the geogname element in EAD.  An XML
 representation of a geographical place returned by the geonames APIs
 includes its name, e.g., Springfield, country name, and several levels of
 administrative names (Sangamon County, Illinois).  Is there some sort of
 official way of textually representing a place?  In LCSH, one finds:

 1 Springfield (Bucks County, Pa.)
 2 Springfield (Bucks County, Pa. : Township)
 3 Springfield (Burlington County, N.J.)

 Why 1 and 2 are distinct terms in LCSH, I don't know.  The mode for dealing
 with American place names seems to be [name of place] ([administrative name
 - lower level], [administrative name - upper level]).  For a European city,
 we find Berlin (Germany)

 Are these examples in LCSH the most common way to textually record places,
 or are there other examples I should look at?

 Thanks,
 Ethan



[CODE4LIB] exposing website visitor IP addresses to webcrawlers

2011-05-20 Thread Keith Jenkins
Just out of curiosity, does anyone on this list have any opinions
about whether website owners should publicly post lists of their
visitors' IP addresses (or hostnames) and to also allow such lists to
be indexable by search engines?

For example:
https://www3.ietf.org/usagedata/site_201104.html

Keith


Re: [CODE4LIB] exposing website visitor IP addresses to webcrawlers

2011-05-20 Thread Keith Jenkins
Thanks for all the responses so far.  My thoughts are pretty much
summed up by Mike and Nate, although I would suggest that no one is
going out of their way to make these IPs accessible -- rather, they
aren't going out of their way to make them inaccessible.

Luckily, most websites don't make their stats accessible, or else the
problem would be much larger -- anyone could get a list of websites
you have ever visited, using a simple google search, like this (to
pick a hostname at random from that IETF page):
http://www.google.com/search?q=%22tge1lba9.emirates.net.ae%22

While not all IP addresses are linked to individuals, some are, and I
think this is mainly a privacy problem for those individuals who have
static IPs.

Keith


On Fri, May 20, 2011 at 10:51 AM, Mike Taylor m...@indexdata.com wrote:
 My computer at home has a static IP address.  If I visit
 www.wikileaks.ch, I might not want the world to know that my IP
 address is in its access logs.

 So this is potentially a gross invasion of privacy.

On Fri, May 20, 2011 at 11:26 AM, Nate Vack njv...@wisc.edu wrote:
 Strikes me as roughly analogous to publicly posting the caller ID of
 everyone who calls you.

 It's not a big risk to your visitors. It's not very polite. It's
 probably not very useful.

 Unless you've got a good reason to do it... why bother?


Re: [CODE4LIB] Seth Godin on The future of the library

2011-05-17 Thread Keith Jenkins
I always get suspicious when an author converts current prices into
1962 dollars for no apparent reason, and without explanation.

Keith


On Tue, May 17, 2011 at 11:22 AM, Roy Zimmer roy.zim...@wmich.edu wrote:
 I think 50 cents would be right in the ballpark. My earliest scifi
 paperbacks cost me that much, mid-60's.

 Roy Zimmer
 Waldo Library
 Western Michigan University


 On 5/17/2011 11:18 AM, Jonathan Rochkind wrote:

 On 5/16/2011 7:52 PM, Luciano Ramalho wrote:

   And then we need to consider the rise of the Kindle. An ebook costs
   about $1.60 in 1962 dollars. A thousand ebooks can fit on one device,

 1) Why quote the ebook price in 1962 dollars? The reality in 2011 is
 that Kindle books in general are too expensive, particularly when

 Yeah, how much did a paperback book cost in 1962?  50 cents? $1?  I wasn't
 alive then, but I bet $1.60 is expensive in 1962 dollars!



Re: [CODE4LIB] What's the descriptive technical terminology?... pdf image of a page. pdf format used with cut paste.

2011-04-28 Thread Keith Jenkins
I've also heard many people use the term searchable PDF for a text-based PDF.

Keith


On Thu, Apr 28, 2011 at 12:43 PM, Peter Murray peter.mur...@lyrasis.org wrote:
 That is the same terminology I use as well -- image-based versus text-based. 
 I find that works most times because people can visually see if something 
 looks like a scanned image.


Re: [CODE4LIB] What do you wish you had time to learn?

2011-04-27 Thread Keith Jenkins
* Google App Engine
* PostGIS
* Drums


Re: [CODE4LIB] regexp for LCC?

2011-03-31 Thread Keith Jenkins
The Google Code regex looks like it will accept any 1-3 letters at the
start of the call number.  But LCC has no I, O, W, X, or Y
classifications.

So you might want to use something more like ^[A-HJ-NP-VZ] at the
start of the regex.

Also, there are only a few major classifications that use three
letters.  Like DJK, and several in the Ks.  I'm not sure, but there
might be others.
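
As a rough illustration in Python, the check below looks only at the start of the call number: the restricted first letter, up to two more letters, then a class number. It is my own simplified pattern, not the regex from the Google Code project, and it does not validate which second and third letters actually exist in the schedules.

```python
import re

# First letter excludes I, O, W, X and Y; up to two more letters may follow,
# then a 1-4 digit class number with an optional decimal part.
LCC_START = re.compile(r"^[A-HJ-NP-VZ][A-Z]{0,2}\s*\d{1,4}(\.\d+)?")


def looks_like_lcc(call_number):
    """Cheap test of whether a string starts like an LC classification number."""
    return LCC_START.match(call_number.strip()) is not None


for candidate in ("QA76.73 .P98", "DJK4 .B35", "MLCS 83/5180 (P)",
                  "Microfilm 19072 E", "W 100"):
    print(candidate, looks_like_lcc(candidate))
```

Limiting the prefix to three letters happens to reject the four-letter MLCS example from the original question, although I would not rely on that as the only test for MLC numbers.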

Keith


On Thu, Mar 31, 2011 at 1:11 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Except now I wonder if those annoying MLCS call numbers might actually be
 properly MATCHED by this regex, when I need em excluded. They are annoyingly
 _similar_ to a classified call number. Well, one way to find out.

 And the reason this matters is to try and use an LCC to map to a
 'discipline' or other broad category, either directly from the LCC schedule
 labels, or using a mapping like umich's:
 http://www.lib.umich.edu/browse/categories/

 But if it's not really an LCC at all, and you try to map it, you'll get bad
 postings.

 On 3/31/2011 1:03 PM, Jonathan Rochkind wrote:

 Thanks, that looks good!

 It's hosted on Google Code, but I don't think that code is anything
 Google uses, it looks like it's from our very own Bill Dueber.

 On 3/31/2011 12:38 PM, Tod Olson wrote:

 Check the regexp that Google uses in their call number normalization:

        http://code.google.com/p/library-callnumber-lc/wiki/Home

 You may want to remove the prefix part, and allow for a fourth cutter.

 The folks at UNC pointed me to this a few months ago.

 -Tod

 On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:

 Does anyone have a good regular expression that will match all legal LC
 Call Numbers from the LC Classified Schedule, but will generally not
 match things that could not possibly be an LC Call Number from the LC
 Classified Schedule?

 In particular, I need it to NOT match an MLC call number, which is an
 LC assigned call number that shows up in an 050 with no way to
 distinguish based on indicators, but isn't actually from the LC
 Schedules.  Here's an example of an MLC call number:

 MLCS 83/5180 (P)

 Hmm, maybe all MLC call numbers begin with MLC, okay I guess I can
 exclude them just like that. But it looks like there are also OTHER
 things that can show up in the 050 but aren't actually from the
 classified schedule, the OCLC documentation even contains an example of
 Microfilm 19072 E.

 What a mess, huh?  So, yeah, regex anyone?

 [You can probably guess why I care if it's from the LC Classified
 Schedule or not].

 Tod Olson t...@uchicago.edu
 Systems Librarian
 University of Chicago Library




[CODE4LIB] OpenRoom daylight saving time

2011-03-25 Thread Keith Jenkins
On Fri, Mar 25, 2011 at 6:14 AM, graham gra...@theseamans.net wrote:
 We've just gone live with OpenRoom, got a lot of student response - and
 have run into a problem I think is due to incorrect handling of the
 change to daylight saving time (this is in the UK). The problem is the
 display is out of sync with the data in the database, and as a result
 the system is letting students make duplicate bookings causing further
 problems. We've turned it off for now - but I wonder if anyone else has
 run into this problem and has a quick fix?

I'm not sure about the quick fix, but the ideal long-term solution is
to abolish daylight saving time worldwide.  It has certainly caused
more problems than it has solved... [remainder of rant omitted, due to
gmail message size limitations]

Some questions to help diagnose the problem:
  1. Which dates are affected?  Just those after the time change? (UK
changes this weekend, right?)
  2. How are the dates actually stored? (UTC, local time, Unix timestamp?)
  3. Are the stored times correct?  (i.e. is the error occurring on
input, or on output?)

If the dates are being stored as text strings like
"2011-03-28 09:00:00" that are getting processed by the PHP strtotime()
function (OpenRoom runs on PHP, right?), then you can force a particular
timezone interpretation by appending the timezone info
("Europe/London" or whatever) to the end of the string before
processing with strtotime().
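
I don't have the OpenRoom code in front of me, so here is the same idea sketched in Python rather than PHP: attach an explicit zone to the stored local string before converting it. The function name is made up, and it assumes the system has timezone data available to the zoneinfo module.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+


def to_utc(local_text, zone_name="Europe/London"):
    """Interpret a naive 'YYYY-MM-DD HH:MM:SS' string in the given zone and return UTC."""
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    local = naive.replace(tzinfo=ZoneInfo(zone_name))
    return local.astimezone(timezone.utc)


print(to_utc("2011-03-26 09:00:00"))  # the day before the UK clock change
print(to_utc("2011-03-28 09:00:00"))  # the day after the UK clock change
```

If the display code and the database code disagree about which zone applies, bookings on either side of the change will be an hour apart, which would look exactly like a display that is out of sync with the data.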

Good luck,
Keith


Re: [CODE4LIB] geo-locating email domains

2011-03-25 Thread Keith Jenkins
Hi, Eric.

While you're at it... what's the percentage overlap between those lists?

Keith


On Thu, Mar 24, 2011 at 12:41 PM, Eric Lease Morgan emor...@nd.edu wrote:
 For a good time I geo-located the email domains of Code4Lib subscribers, 
 plotted them on a Google map, and discovered that us Code4Libbers use Gmail 
 in greater proportions than a couple of my other mailing lists (NGC4Lib and 
 Usability4Lib) -- http://bit.ly/hdL55U  Interesting!?

 Fun with Perl, the Google Maps API, and mailing lists.

 --
 Eric Lease Morgan
 University of Notre Dame



Re: [CODE4LIB] Which O'Reilly books should we give away at Code4Lib 2011?

2010-12-08 Thread Keith Jenkins
Hopefully the 4th edition of Programming Python will be out in time
for the conference.

Keith


On Tue, Dec 7, 2010 at 9:31 PM, Kevin S. Clarke kscla...@gmail.com wrote:
 If you have particular O'Reilly titles that you'd like for us to ask
 O'Reilly for, send them to me and I'll put them in our request.


Re: [CODE4LIB] how 'great' are the great books

2010-11-04 Thread Keith Jenkins
Hi, Eric.

I suspect that many of us have only read a (small) fraction of these
books.  (But since your survey links directly to the full text, we can
now only claim the limits of time as our excuse.)

Are you tracking how many times people choose I don't know?  I'm
sure that is the most popular book.

Maybe you could first ask which of the books we've read, and then have
us vote just amongst those titles?

Cheers,
Keith


On Thu, Nov 4, 2010 at 9:12 AM, Eric Lease Morgan emor...@nd.edu wrote:
 In an effort to answer the question, How 'great' are the Great Books?, I 
 have created the beginnings of a crowd sourced survey, and it would be 
 great if y'all were to beta test it for me -- http://bit.ly/bPQHIg

 I'm also looking for ways to make the survey more fun to use. If y'all could 
 give me any suggestions, then that would be... great.


Re: [CODE4LIB] how 'great' are the great books

2010-11-04 Thread Keith Jenkins
Roberto Hoyle roberto.j.ho...@dartmouth.edu wrote:
 If you haven't read one of the books, doesn't that argue for its lack of 
 'greatness?'

Elizabeth Winter elizabeth.win...@library.gatech.edu wrote:
 Gosh, I hope not.  I think it argues for better literature programs in our 
 K-12 and universities

It also argues for a moratorium on publishing new books until we all
have time to catch up.  (I'm still working my way through the 5th
century B.C.)

Keith





 --
 Elizabeth L. Winter
 Electronic Resources Coordinator
 Collection Acquisitions & Management
 Library and Information Center
 Georgia Institute of Technology
 email: elizabeth.win...@library.gatech.edu
 phone: 404.385.0593
 fax: 404.894.1723

 - Original Message -
 From: Roberto Hoyle roberto.j.ho...@dartmouth.edu
 To: CODE4LIB@LISTSERV.ND.EDU
 Sent: Thursday, November 4, 2010 4:03:12 PM
 Subject: Re: [CODE4LIB] how 'great' are the great books

 On Nov 4, 2010, at 11:24 AM, McAulay, Elizabeth wrote:

 i agree with keith's comments about having a 'what have you read?' portion 
 first. I had to answer i don't know to most of the questions because if I 
 hadn't read both of the works, i didn't want to choose one over the other. i 
 have a master's in English and i think only one out of 20 comparisons i 
 answered included two works i had read.


 r.



Re: [CODE4LIB] locator

2010-06-30 Thread Keith Jenkins
Tom,

Before spending too much time trying to integrate building floorplans
with Google Maps, I would consider whether the maximum zoom level
(currently 20, which is around 3 pixels per foot) will allow you to
provide the detail needed for your floorplan.
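
For the curious, the "around 3 pixels per foot" figure can be estimated from the usual Web Mercator ground-resolution formula for 256-pixel tiles. This Python sketch is my own back-of-envelope calculation, and the latitudes for Ithaca and Leuven are approximate.

```python
import math

EQUATOR_M_PER_PX_Z0 = 156543.03392  # meters per pixel at zoom 0 on the equator (256-px tiles)
FEET_PER_METER = 3.28084


def pixels_per_foot(latitude_deg, zoom):
    """Approximate map pixels per foot on the ground for Web Mercator tiles."""
    meters_per_pixel = EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(latitude_deg)) / 2 ** zoom
    return 1 / (meters_per_pixel * FEET_PER_METER)


for place, lat in (("equator", 0.0), ("Ithaca", 42.44), ("Leuven", 50.88)):
    print(f"{place}: {pixels_per_foot(lat, 20):.1f} px/ft at zoom 20")
```

The resolution improves slightly as you move away from the equator, but a three-foot-wide aisle is still only a handful of pixels across at the maximum zoom.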

Although this might only be an issue if you want the floorplan to
display as an overlay over the regular GMaps basemaps.

Keith

Keith Jenkins
GIS/Geospatial Applications Librarian
Mann Library, Cornell University
Ithaca, New York 14853


On Wed, Jun 30, 2010 at 8:24 AM, Tom Vanmechelen
tom.vanmeche...@libis.kuleuven.be wrote:
 We're considering expanding our service with an item locator. Mapping the 
 library (http://mashedlibrary.com/wiki/index.php?title=Mapping_the_library) 
 describes how to build this with Google Maps. But is this really the way to 
 go?  Does anyone have any experience with this? Does anyone have some best 
 practices for this kind of project, knowing that we have about 20 buildings 
 spread all over the town?

 Tom

 ---
 Tom Vanmechelen

 K.U.Leuven / LIBIS
 W. De Croylaan 54 bus 5592
 BE-3001 Heverlee
 Tel  +32 16 32 27 93



Re: [CODE4LIB] audio transcription software

2010-05-12 Thread Keith Jenkins
I tried Dragon Naturally Speaking a couple of years ago.  (After
breaking a wrist in a cycling accident, I figured it might be easier
than one-hand typing, which wasn't true in the case of typing
programming code with lots of curly brackets, indentation, etc.)

Speech-to-text software works best after a training session, in which
the software asks the speaker to read a known text, to calibrate the
software.  I'm not sure how it might work to calibrate for voices on
recordings, but it may be that the software can learn during a
proof-reading process.  Your success for oral history recordings may
depend on the uniqueness of each speaker's voice, and the length of
each recording.  (Lots of short recordings of many different speakers
would tend to be harder.)

Keith


On Wed, May 12, 2010 at 2:18 PM, Eric Lease Morgan emor...@nd.edu wrote:
 Does anybody here use or know of any audio transcription software?

 We have a growing number of projects here at Notre Dame that include oral 
 histories. How can these digital files be converted into plain text? Audio 
 transcription software may be the answer?

 --
 Eric Lease Morgan
 University of Notre Dame



Re: [CODE4LIB] Next-generation policy for WorldCat records - open for community review

2010-04-08 Thread Keith Jenkins
On Thu, Apr 8, 2010 at 9:53 AM, Karen Coyle li...@kcoyle.net wrote:
 My question about WorldCat records has to do with whole v. parts -- I can
 understand that a full MARC record, with holdings, downloaded from WC could
 be considered a WC record. After that, there is a lot of distance between
 the full MARC and, say, a citation with an author, title, publisher and
 date. Where is the line drawn? When does it cease to be a WC record and
 become just another chunk of bibliographic data floating around cyberspace?

This reminds me of when dewey.info released RDF data under a Creative
Commons No Derivative Works license, which doesn't really make sense
to me.  Data (as opposed to literary texts or music, for example) is
always going to be manipulated for processing or display.  It seemed
to me that in order to ingest and use the data in any way (for
example, in a web interface) you have to use a derivative, unless you
are simply re-displaying the original data verbatim.  But I don't
think many users would want to look at raw RDF/XML.

Keith


Re: [CODE4LIB] PHP bashing (was: newbie)

2010-03-26 Thread Keith Jenkins
Who is presenting at the ducttape4lib conference this year?

Keith


On Fri, Mar 26, 2010 at 12:51 PM, Jason Stirnaman jstirna...@kumc.edu wrote:
 Oh, and please never use duck tape for stage applications like taping
 extension cords and mic cables to the floor. Gaff tape is tougher and
 leaves no sticky residue.


Re: [CODE4LIB] Q: MARC formats in XML

2010-02-16 Thread Keith Jenkins
A few years ago, there was some work being done in Portugal for
UNIMARC and MARC21 validation via XML schemas:
http://www.bookmarc.pt/unimarc/
http://www.bookmarc.pt/documentation/marcdoc.html

Cheers,
Keith


On Mon, Feb 15, 2010 at 2:14 PM, Houghton,Andrew hough...@oclc.org wrote:
 Does anybody know whether the MARC formats ... are encoded in
 an XML format that one might use for processing/validation of the
 leader, field tags, indicator codes, subfield codes, field and subfield
 repeatability, and field and subfield requiredness.


Re: [CODE4LIB] Auto-suggest and the id.loc.gov LCSH web service

2009-12-08 Thread Keith Jenkins
On Mon, Dec 7, 2009 at 5:56 PM, Ed Summers e...@pobox.com wrote:
 It would be great to have some external dataset to use in
 ranking LCSH suggestions at id.loc.gov. But at the moment it's a
 simple mysql db loaded up with some MARC LCSH data. I guess it could
 do something smart with PageRank-like ranking of 'super-concepts'
 (concepts that are linked to a lot)...but that would've taken longer
 than 20 minutes :-)

The frequency of an LCSH term within the LC catalog could also be
useful for ranking, although I'm not sure if such data would be
readily available.

Another possibility would be a simple count of broader terms +
narrower terms + related terms or something like that.  Although
PageRank would probably be better, since even some important terms
might have a relatively small number of immediately-adjacent links.
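
A minimal Python sketch of the simple link-count idea, in case it helps. The headings and their relationships below are invented sample data (not real LCSH relationships), and the function names are hypothetical.

```python
# Invented sample data: each heading maps to its broader, narrower and
# related headings. Real data would come from the LCSH authority records.
concepts = {
    "Maps": {
        "broader": ["Geography"],
        "narrower": ["Atlases", "Road maps", "World maps"],
        "related": ["Cartography"],
    },
    "Map reading": {"broader": ["Maps"], "narrower": [], "related": []},
    "Map projection": {
        "broader": ["Cartography"],
        "narrower": [],
        "related": ["Maps"],
    },
}

def link_count(entry):
    """Count broader + narrower + related links for one heading."""
    return sum(len(entry[kind]) for kind in ("broader", "narrower", "related"))

def rank_suggestions(prefix, concepts):
    """Return headings starting with prefix, most-linked first."""
    wanted = prefix.lower()
    matches = [label for label in concepts if label.lower().startswith(wanted)]
    # Sort by descending link count, then alphabetically to break ties.
    return sorted(matches, key=lambda label: (-link_count(concepts[label]), label))

suggestions = rank_suggestions("map", concepts)
```

This only looks at immediately adjacent links, which is exactly the weakness mentioned above; a PageRank-style score would also take into account how well-linked the neighbours are.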

Keith


Re: [CODE4LIB] Journal Usage Statistical collection software - suggestions?

2009-10-30 Thread Keith Jenkins
There's also SUSHI (Standardized Usage Statistics Harvesting Initiative):
http://www.niso.org/workrooms/sushi

Keith


On Fri, Oct 30, 2009 at 11:19 AM, Brandon Dudley bran...@discontent.com wrote:
 Apologies for cross-posting. My institution is currently evaluating methods
 of collecting COUNTER stats in a comprehensive way. We currently use Excel
 spreadsheets to calculate cost-per-use and gather all the stats together,
 but I am hoping that there's a better way. In today's climate, justifying
 our spending decisions grows ever more important.

 I am aware of JURO and JURO4c, and of the Swets Scholarly Stats commercial
 packages - are there any other options worth consideration? Anybody devised
 their own slick homegrown method of collecting such stats?

 Many thanks,
 Brandon Dudley



Re: [CODE4LIB] Bookmarking web links - authoritativeness or focused searching

2009-09-30 Thread Keith Jenkins
On Wed, Sep 30, 2009 at 7:56 AM, Tim Cornwell tc...@cornell.edu wrote:
 41,000 sites and 21 million pages (http://www.ablegrape.com/en/about.html)
 is a lot of vetting.
...
 Authoritative vetting of a large volume of resources is a hard problem.  I
 haven't seen any good solutions, but am leaning toward crowd-sourcing with
 an authoritative crowd. :-)

 Do you have any additional information on how AbleGrape vets these?

I can only guess, but I would think it's probably a combination of
automatic and manual vetting: crawl the links from known good sites,
filter out bad sites, filter out off-topic sites, manually add
newly-discovered sites not already in the index, manually remove
inappropriate sites that somehow made it into the index, adjust the
algorithms, try to build a user community and solicit feedback.  (I
once reported inappropriate results coming from a wine producer's
website that had been taken over by vandals, and AbleGrape removed it
from the index almost immediately.)

Keith


Re: [CODE4LIB] Bookmarking web links - authoritativeness or focused searching

2009-09-29 Thread Keith Jenkins
AbleGrape.com is a good example of a focused search engine that aims
to index only authoritative sources within a particular discipline --
in this case it's wine, enology, and viticulture.  It currently crawls
about 40,000 vetted websites.

It's a great search engine for the subject area it serves, and it
probably helped that the creator was a VP at Inktomi.

Keith


On Tue, Sep 29, 2009 at 10:53 AM, Cindy Harper char...@colgate.edu wrote:
 So that led me to speculate about a search engine that ranked just by links
 from .edu's, libraries sites, and a librarian-vetted list of .orgs,
 scholarly publishers, etc.  I think you can limit by .edu in the linked-from
 in Google - I haven't tried that much. If anyone here has experience at
 using that technique, I'd like to hear about it.  But I'm thinking now about
 the possibility of a search engine limited to sites cooperatively vetted by
 librarians, that would incorporate ranking by # links.  Something more
 responsive than cataloging websites in our catalogs.

 Is anyone else thinking about these ideas?  or do you know of projects that
 approach this goal of leveraging librarian's vetting of authoritative
 sources?


Re: [CODE4LIB] help on displaying lcsh from MARC

2009-09-23 Thread Keith Jenkins
Using -- before subfields $v, $x, $y, and $z should work well for
all the standard MARC 6XX fields.

The only exception might be in any locally-defined fields 690-699.
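
Here is a rough Python sketch of that rule. It assumes the subfields of one 6XX field are already available as (code, value) pairs in record order; the function name and the sample headings are just for illustration and are not tied to any particular MARC library.

```python
# Subdivision subfields that get a "--" display constant in front of them.
SUBDIVISION_CODES = {"v", "x", "y", "z"}

def format_heading(subfields):
    """Build a display string from (code, value) pairs in record order."""
    parts = []
    for code, value in subfields:
        if not parts:
            # The first subfield never gets a separator in front of it.
            parts.append(value)
        elif code in SUBDIVISION_CODES:
            parts.append("--" + value)
        else:
            # Other subfields (e.g. $d dates) are joined with a space.
            parts.append(" " + value)
    return "".join(parts)

topical = format_heading(
    [("a", "Wine and wine making"), ("z", "France"), ("x", "History")]
)
personal = format_heading(
    [("a", "Shakespeare, William,"), ("d", "1564-1616"),
     ("x", "Criticism and interpretation")]
)
```

The first-subfield check covers Jonathan's worry about a stray leading "--", although in practice an $a should always come first.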

Keith


On Wed, Sep 23, 2009 at 10:19 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Thanks!  So $v, $x, $y, and $z should always get a -- before them --
 that's sufficient logic to do it 'right'?  I guess an LCSH 6xx always needs
 an $a first, so I don't need to worry about if a $v or $x happens to come
 first, and shouldn't get a preceding --.

 Should I do this only for 6xx with 2nd indicator 0 indicating LCSH, or do
 people generally just do this for all 6xx?

 Tod Olson wrote:

 Only for certain subfields:

 Dash (-) that precedes a subdivision in an extended 600 subject heading
 is not carried in the MARC record. It may be system generated as a display
 constant associated with the content of subfield $v, $x, $y, and $z.

  From http://www.loc.gov/marc/bibliographic/bd600.html

 -Tod

 On Sep 23, 2009, at 9:09 AM, Jonathan Rochkind wrote:



 Hi all, I'm writing some marc record display code, and I have a question
 about the 'right' way to display LCSH headings.

 LCSH headings are typically displayed with -- between components.  But
 looking at the MARC, it looks like the -- punctuation isn't actually in
 the MARC field. (A rare instance where display punctuation isn't in the
 marc!).

 Is it correct for any LCSH 6xx field (which you know because the 2nd
 indicator is 0, right?), to add -- between ALL present subfields on
 display?   Do I have the right logic there?

 Thanks for any advice!

 Jonathan






Re: [CODE4LIB] Book recommendation

2009-09-09 Thread Keith Jenkins
I haven't read any of them yet, but O'Reilly has a new series of books
that might be of interest. They all have titles like Beautiful
Teams, Beautiful Architecture, Beautiful Data, Beautiful
Testing, etc.

Maybe someone else has read one and can comment on their usefulness?

Keith


On Wed, Sep 9, 2009 at 12:12 PM, Robert Fox rf...@nd.edu wrote:
 Since this list has librarians, hard core programmers and hybrid librarian 
 programmers on it, this is probably a good place to ask this sort of question.

 I'm looking for some book recommendations. I've read a lot of technical books 
 on how to work with specific kinds of technology, read a lot of online 
 technical how tos and that has been good as far as it goes. But, technology 
 changes too fast to be wed to one particular programming language, database 
 technology, metadata standard, etc. I'm interested in finding books that 
 speak to the issues of programming methodology, design principles, lessons 
 learned, etc. that transcend any particular programming technology. Are there 
 good books that distill the wisdom and experience of veteran developers and 
 /or communicate best practices for things like design patterns, overall 
 software architecture, learning from mistakes, the developer mindset and such 
 things?

 Could you recommend perhaps the top three or four books you've read in these 
 areas?

 Rob Fox
 Hesburgh Libraries
 University of Notre Dame



Re: [CODE4LIB] GPO PURLs

2009-09-01 Thread Keith Jenkins
On Tue, Sep 1, 2009 at 11:53 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Of course, one failure in X (10?) years is fairly good reliability...
 depending on how long it takes them to get everything back working 100%. If
 it's back by tomorrow, one outage in 10 years pretty good. If it takes a
 week to get back, not so good.

It's been 8 days so far... hopefully it will be back to normal soon.
I'm not sure how long GPO has been serving PURLs, but if we assume
this is the first failure in 10 years, then that's still 99.8% uptime,
which isn't bad.  If the 0.2% were spread evenly across ten years, it
would hardly be noticeable, but when it happens all at once, it
certainly does seem worse.
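
The arithmetic behind that figure, as a small Python sketch. The 10-year service period is the same assumption made above, and the function name is just for illustration.

```python
def uptime_percent(outage_days, period_years):
    """Percentage of the period during which the service was available."""
    total_days = period_years * 365.25  # average year length, leap years included
    return 100 * (1 - outage_days / total_days)

# 8 days down, assuming 10 years of service with no other outages.
gpo_uptime = round(uptime_percent(8, 10), 1)
```

Each additional week of downtime costs roughly another 0.2 percentage points over the same period.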

Keith


Re: [CODE4LIB] find in page, diacritics, etc

2009-08-28 Thread Keith Jenkins
Hi, Tim.

Are you referring to a find in page, where a user presses CTRL-F
in the browser?

If so, it will depend on the browser.  Google Chrome 2.0 will find
matches regardless of the diacritics (i.e. user can type placa and
it matches plaça, and vice versa).  This doesn't seem to work in
Firefox 3.0.13 or IE8.
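
If you end up doing the matching yourself (in your own search box rather than the browser's), one common approach is to fold diacritics on both sides before comparing. A minimal Python sketch using the standard unicodedata module follows; the function names are mine, and this is not a description of how Chrome does it.

```python
import unicodedata

def fold_diacritics(text):
    """Strip combining marks and case so 'plaça' and 'placa' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def matches(needle, haystack):
    """True if needle occurs in haystack, ignoring diacritics and case."""
    return fold_diacritics(needle) in fold_diacritics(haystack)
```

For example, matches("placa", "Plaça de Catalunya") and matches("plaça", "placa") both come out true. Letters that have no decomposition (such as ø) pass through unchanged, so they would need an explicit mapping.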

Keith


On Fri, Aug 28, 2009 at 12:17 PM, Tim Shearer sh...@ils.unc.edu wrote:
 Hi Folks,

 Looking for help/perspectives.

 Anyone got any clever solutions for allowing folks to find a word with
 diacritics in a rendered web page regardless of whether or not the user
 tries with or without diacritics.

 In indexes this is usually solved by indexing the word with and without, so
 the user gets what they want regardless of how they search.

 Thanks in advance for any ideas/enlightenment,
 Tim



Re: [CODE4LIB] GPO PURLs

2009-08-27 Thread Keith Jenkins
Thanks to everyone who helped me confirm that the GPO PURL server is
down.  An official announcement on the GPO Listserv said:
  The PURL Server is currently inaccessible. GPO is working with IT
staff to restore service as soon as possible. We regret any
inconvenience caused by the server problems. An updated listserv will
be sent once service is restored.

While the server is down, here is one workaround (thanks to Patricia Duplantis):
  1. Go to http://catalog.gpo.gov/
  2. Click Advanced Search
  3. Search for word in URL/PURL, enter the PURL
  4. Click Go
  5. The original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access
is broken when the PURL server breaks, even though the documents are
still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there
any way to harvest data from a PURL server, so that a backup/mirror
can be available?

Keith


[CODE4LIB] GPO PURLs

2009-08-25 Thread Keith Jenkins
Is it just me, or is the GPO PURL resolver down?  I keep getting a
timeout error...

For example:
http://purl.access.gpo.gov/GPO/LPS4468

Does anyone know of any alternate place to lookup the real URL for GPO PURLs?

Keith


Re: [CODE4LIB] [Fwd: [ol-tech] Modified RDF/XML api]

2009-08-14 Thread Keith Jenkins
I agree with Ed.  It would be best to omit the statement about the
cover image if it doesn't actually exist.

Keith


On Thu, Aug 13, 2009 at 9:39 PM, Ed Summers e...@pobox.com wrote:
 On Thu, Aug 13, 2009 at 3:10 PM, Karen Coyle li...@kcoyle.net wrote:
 Does it work for folks if this returns either a cover OR a blank? (1x1 jpg).
 It may be awkward to test first for an actual cover. Also, if it's ok to not
 test for a cover, does anyone have a preference over the blank or a 404
 error? I think the API can do both.

 My personal preference would be to include the assertion about the
 cover image only if it's true. It might be better to say nothing than
 having to live with the false positives. But maybe some other people
 feel differently about it.

 //Ed



Re: [CODE4LIB] OT(?) - Historical journal value data?

2009-05-21 Thread Keith Jenkins
On Thu, May 21, 2009 at 9:38 AM, Nate Vack njv...@wisc.edu wrote:
 Anyone know of a source of historical journal value data?

I know that, for historical prices of specific journal titles, a
former colleague would consult our old paper copies of Ulrich's.  (The
online version of Ulrich's only has current pricing info.)

Keith


Re: [CODE4LIB] registering info: uris?

2009-04-01 Thread Keith Jenkins
On Wed, Apr 1, 2009 at 8:37 AM, Mike Taylor m...@indexdata.com wrote:
 Worse, consider how the actionable-identifier approach would translate
 to other non-actionable identifiers like ISBNs.  If I offer the
 non-actionable identifier
        info:isbn/025490
 which identified Farlow and Brett-Surman's edited volume The Complete
 Dinosaur, it's obvious that you have a choice of methods for
 resolving the ISBN

... but the identifier gives no indication of what those choices might
be, and I wouldn't even be able to find out anything more about the
info:isbn scheme unless I happened to know that http://info-uri.info/
is the registry for info: URIs (or could Google my way to it).

An http: identifier could at least take you to general information
about the scheme (perhaps with options for resolution), if not
directly to some description of the identified thing itself.

Keith


Re: [CODE4LIB] PHP Frameworks: An informal survey.

2009-02-12 Thread Keith Jenkins
On Thu, Feb 12, 2009 at 1:24 PM, Cloutman, David
dclout...@co.marin.ca.us wrote:

 The results were pretty lopsided:

 Unnamed: 8
 Zend: 11
 CakePHP: 4
 Symfony: 4
 Code Igniter: 2

Interestingly, these numbers match up pretty well with what Google
Trends shows for 2008:

http://www.google.com/trends?q=zend,symfony,cakephp,code+igniter&date=ytd

Keith


Re: [CODE4LIB] amazon s3?

2008-11-11 Thread Keith Jenkins
Relatedly, just today Fedora Commons and DSpace have announced a
project called DuraSpace:

'''Over the next six months funding from the planning grant will allow
the organizations to jointly specify and design DuraSpace, a new
web-based service that will allow institutions to easily distribute
content to multiple storage providers, both cloud-based and
institution-based.  The idea behind DuraSpace is to provide a trusted,
value-added service layer to augment the capabilities of generic
storage providers by making stored digital content more durable,
manageable, accessible and sharable.'''

Full press release here:
  
http://www.dspace.org/index.php?option=com_contenttask=blogcategoryid=74Itemid=175

Keith


Re: [CODE4LIB] creating call number browse

2008-10-01 Thread Keith Jenkins
I think that one advantage of browsing a physical shelf is that the
shelf is linear, so it's very easy to methodically browse from the
left end of the shelf to the right, and have a sense that you haven't
accidentally missed anything.  (Ignore, for the moment, all the books
that happen to be checked out and not on the shelf...)

Online, linearity is no longer a constraint, which is a very good
thing, but it does have some drawbacks as well.  There is usually no
clear way to follow a series of more like this links and get a sense
that you have seen all the books that the library has on a given
subject.  Yes, you might get lucky and discover some great things, but
it usually involves a lot of aimless wandering, coming back to the
same highly-related items again and again, while missing some
slightly-more-distantly-related items.

Ideally, the user should be able to run a query, retrieve a set of
items, sort them however he wants (by author, date, call number, some
kind of dynamic clustering algorithm, whatever), and be able to
methodically browse from one end of that sort order to the other
without any fear of missing something.

Keith


On Tue, Sep 30, 2008 at 6:08 PM, Stephens, Owen
[EMAIL PROTECTED] wrote:
 I think we need to understand the
 way people use browse to navigate resources if we are to successfully bring
 the concept of collection browsing to our navigation tools. David suggests
 that we should think of a shelf browse as a type of 'show me more like this'
 which is definitely one reason to browse - but is it the only reason?


Re: [CODE4LIB] New England code4lib gathering

2008-10-01 Thread Keith Jenkins
On Wed, Oct 1, 2008 at 2:00 PM, Jay Luker [EMAIL PROTECTED] wrote:
 The reasons I threw the Northampton/Amherst area out there are a) it's
 central to a lot of NE and is on or near the major highways (91 and
 90)

...and if you are willing to bend the interpretation of NE to mean
not just New England, but North East, there might be a few of us
across the border in New York state who might be tempted to join in
the fun.  In which case the Northampton/Amherst locale would have
extra appeal.

Keith


Re: [CODE4LIB] a brief summary of the Google App Engine

2008-07-18 Thread Keith Jenkins
Thanks for sharing that, Doug.  It's not mentioned at all in the
Developer's Guide (contains everything you need to know):
http://code.google.com/appengine/docs/

I'll have to take a closer look at the src docs...

Keith


On Thu, Jul 17, 2008 at 3:01 PM, Doug Chestnut [EMAIL PROTECTED] wrote:
 There is a Searchable Entity with GAE.  Refer to the src for docs.  It is
 fairly straight forward, it takes the text of the properties, removes stop
 words, creates a new list property that contains the words.  An index on
 this property allows fast retrieval.  It is fairly limited, you don't want
 list properties to get too large (this is what the devs told me at google
 io).  From the docs:

 Don't expect too much. First, there's no ranking, which is a killer
 drawback. There's also no exact phrase match, substring match, boolean
 operators, stemming, or other common full text search features. Finally,
 support for stop words (common words that are not indexed) is currently
 limited to English.

 I have been playing with reverse indexes in GAE with some potential success
 for search and faceted browse.

 --Doug



Re: [CODE4LIB] a brief summary of the Google App Engine

2008-07-16 Thread Keith Jenkins
On Wed, Jul 16, 2008 at 12:21 AM, Godmar Back [EMAIL PROTECTED] wrote:
 Aside from the limitations imposed by the index model, the problem
 then is fundamentally similar to how you index MARC data for use in
 any discovery system.

I think Godmar is referring to GAE's lack of keyword searching.  To
elaborate, the following is from
http://code.google.com/appengine/docs/datastore/queriesandindexes.html


Tip: Query filters do not have an explicit way to match just part of a
string value, but you can fake a prefix match using inequality
filters:

db.GqlQuery("SELECT * FROM MyModel WHERE prop >= :1 AND prop < :2",
            "abc", "abc" + "\xEF\xBF\xBD")

This matches every MyModel entity with a string property prop that
begins with the characters abc. The byte string \xEF\xBF\xBD
represents the largest possible Unicode character. When the property
values are sorted in an index, the values that fall in this range are
all of the values that begin with the given prefix.
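
The range trick in that tip can be tried out on any sorted list of strings. Here is a small Python sketch using the standard bisect module in place of the datastore index; the function name and sample values are made up, and "\ufffd" is used as the upper sentinel on the assumption that no stored value contains a higher character.

```python
import bisect

def prefix_matches(sorted_values, prefix):
    """Return every value in a sorted list that starts with prefix."""
    # Lower bound: first value >= prefix.
    lo = bisect.bisect_left(sorted_values, prefix)
    # Upper bound: first value >= prefix + sentinel (assumed to sort last).
    hi = bisect.bisect_left(sorted_values, prefix + "\ufffd")
    return sorted_values[lo:hi]

values = sorted(["abacus", "abc", "abcde", "abd", "xyz"])
found = prefix_matches(values, "abc")
```

Here found holds "abc" and "abcde": everything between the two bounds shares the prefix, which is why two inequality filters on one sorted index are enough.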


So it's a bit of a hack just to get a left-anchored search.  Querying
for a particular keyword anywhere within a string value would be even
more work.  For small datasets, I guess you could iterate through
every record.  But for anything larger, you'd probably want to figure
out a way to manually build an index within the Google datastore, or
else keep the indexing outside GAE, and just use GAE for fetching
specified records.  Any ideas on how that might work?
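
To make the "manually build an index" idea a bit more concrete, here is a plain-Python sketch of a keyword index. It uses ordinary dicts and sets rather than any GAE datastore API, and the stop-word list, function names and sample titles are all placeholders.

```python
from collections import defaultdict

# Placeholder stop-word list; a real one would be longer.
STOP_WORDS = {"a", "an", "and", "of", "the"}

def keywords(text):
    """Lower-case the words in text, dropping punctuation and stop words."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}

def build_index(records):
    """Map each keyword to the set of record ids whose title contains it."""
    index = defaultdict(set)
    for record_id, title in records.items():
        for word in keywords(title):
            index[word].add(record_id)
    return index

def search(index, query):
    """Return ids of records containing every keyword in the query."""
    terms = keywords(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))

records = {
    1: "The Complete Dinosaur",
    2: "Analytic mechanics",
    3: "Mechanics of the dinosaur",
}
index = build_index(records)
```

With this sample data, search(index, "dinosaur mechanics") returns only record 3. In the datastore, each keyword set would presumably be stored as a list property on the entity (or as separate index entities), with the same caveat about keeping those lists small.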

Keith


Re: [CODE4LIB] free movie cover images?

2008-05-19 Thread Keith Jenkins
On Mon, May 19, 2008 at 1:12 PM, Peter Murray [EMAIL PROTECTED] wrote:
 IMDB has cover art for films, but I haven't looked to see if they
 provide an API to get to them /a la/ Google Books.

I don't think IMDB is an option...

All pictures and videos found on our site (including movie stills,
headshots, photo galleries and trailers) are licensed to IMDb only and
we are not permitted to sublicense them onwards or grant permission
for other to use them, sorry.

From: http://www.imdb.com/help/show_leaf?usephotos

Keith


Re: [CODE4LIB] poll of javascript libraries

2008-03-26 Thread Keith Jenkins
As of right now, the results of the informal poll of Javascript
libraries stands as follows:

jQuery = 23 votes
Prototype = 17 votes
Scriptaculous = 10 votes
YUI = 9 votes
ExtJS = 5 votes
Dojo = 2 votes
MooTools = 2 votes
MochiKit = 1 vote
LowPro = 0 votes

Note that these poll results are completely unscientific and
necessarily incomplete (superdelegates have not been counted yet...),
but hopefully not entirely uninformative.

If you still want to add your input, the poll is here:
   http://doodle.ch/sr5z4vusiwi4yssi

<subliminalMessage>Maybe someone wants to write an article for the
code4lib journal, or present at next year's conference about their
favorite javascript library...</subliminalMessage>

Cheers,
Keith


[CODE4LIB] poll of javascript libraries

2008-03-20 Thread Keith Jenkins
I've seen various Javascript libraries mentioned on this list from
time to time, and would like to get a better sense of which major
Javascript libraries are being used within the Code4Lib community.

So I've set up a quick, informal poll here:
http://doodle.ch/sr5z4vusiwi4yssi

Just enter your name (or any string, really) and vote for the
libraries that you are currently using.

Since the libraries listed have different scopes, and some are used in
conjunction, please vote for any and all that you are actively using.
(If you've used a library in the past, but don't expect to use it
again, don't vote for it.)

I'll collect the results and report back to the list after a week or so.

Keith


Re: [CODE4LIB] httpRequest javascript.... grrr

2007-11-29 Thread Keith Jenkins
jQuery++

I like to do things from scratch, but have never regretted moving to
jQuery.  Whatever time it takes you to check it out will be paid back
a thousand times, at least.

Keith


On 11/29/07, Ewout Van Troostenberghe [EMAIL PROTECTED] wrote:
 To point out why the use of a Javascript framework is important, let me
 put your code into jQuery (http://jquery.com)

 $.get('index.cgi', {cmd:'add_tag', username:'username'}, function(html) {
   // do whatever you want here
 })


Re: [CODE4LIB] weird worldcat results?

2007-09-04 Thread Keith Jenkins
 http://www.worldcat.org/issn/00253154

This link brings up an article record, apparently from ArticleFirst.
(FYI, worldcat.org is including results from the GPO, ArticleFirst,
Medline and ERIC databases -- I just found that out now.)

On 9/4/07, Jonathan Rochkind [EMAIL PROTECTED] wrote:
 Would that ISSN assigned to a record that's really for an article not
 for a title be considered bad data? Are you supposed to have an ISSN on
 such a record?

In the ArticleFirst database, the ISSN of the source journal is
recorded as a standard number.  It makes sense within an article
database to include the ISSN as an unambiguous reference to the source
journal.

But I agree that it makes absolutely no sense to deliver this single
article record in response to the URL above.  I'm sure OCLC would
agree, too...  So, Jonathan, I'd suggest reporting this via the
feedback link at the bottom of the worldcat.org page.  Hopefully
they can correct this odd behaviour (while it's still in beta).

Keith


Re: [CODE4LIB] LCC classifications in XML

2007-08-28 Thread Keith Jenkins
On 8/28/07, Roy Tennant [EMAIL PROTECTED] wrote:
 the files at http://www.loc.gov/catdir/cpso/lcco/

The files Roy mentions are a great start, and might be enough for your purposes.

But they don't go into the full gory detail of the classification.
For example, if you are interested in QA934, you can only get as close
as:
QA801-939  Analytic mechanics

If you do need the full, detailed classification, LC's Classification
Web appears to be based on a set of MARC records.  Here's an example
for QA934:


ID: CF 94037268    Rec Stat: n    Entrd: 930427    Used: 19960528
CNID: DLC    Rec Type: w    KOR: a    NumType: a    Val: a
Opt: a    Enc Levl: n    Syn: a    Dis: a

010     $a CF 94037268
040     $a DLC $c DLC
084 0   $a lcc
153     $a QA934 $h Mathematics $h Analytic mechanics
        $h Elasticity. Plasticity $j Torsion
753     $a Torsion $b Analytic mechanics

Record:     85693
Added:      Tue Apr 27 00:00:00 1993
Modified:   Tue May 28 10:09:38 1996


The drawback is that these records only appear to be available through
the subscription-accessed Classification Web.  I couldn't find any
mention of the availability of the actual MARC records on the
Cataloging Distribution Service website.

Maybe LC will make them available via a free RESTful web service... :)

-Keith


Re: [CODE4LIB] 2007 Conference Attendee List

2007-02-23 Thread Keith Jenkins

I'm with you, Peter, in not being there.  But I renamed the section to
lurkers, because stalkers just sounded too creepy...

-Keith

On 2/23/07, Binkley, Peter [EMAIL PROTECTED] wrote:

I've taken the liberty of adding a stalkers section at the end, for
those of us who will be there in spirit but not in the flesh.


Re: [CODE4LIB] LC MARC records?

2006-09-05 Thread Keith Jenkins

From looking at the LC website, it looks like there is a one-time cost
of $30,450 for all the retrospective records from 1968-2005.  But then
it's $21,025 for weekly updates throughout 2006, or $24,025 for
daily updates.  (One year is estimated to include 1,000,000 records,
including 387,000 new records.)

So it's the keeping up-to-date that is the real expense, since it's
$20,000+ every year.
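
To put numbers on that, here is a small Python sketch of the cumulative cost. It assumes the 2006 prices quoted above stay flat in later years, which is purely my assumption; the names are just for illustration.

```python
RETROSPECTIVE = 30450   # one-time cost, records from 1968-2005
WEEKLY_UPDATES = 21025  # per year, weekly updates
DAILY_UPDATES = 24025   # per year, daily updates

def cumulative_cost(years, yearly_fee, one_time=RETROSPECTIVE):
    """Total spend after a number of subscription years (flat prices assumed)."""
    return one_time + years * yearly_fee

first_year = cumulative_cost(1, WEEKLY_UPDATES)
five_years = cumulative_cost(5, WEEKLY_UPDATES)
```

On those assumptions the first year with weekly updates comes to $51,475 and five years to $135,575, so by the second year the update fees have already passed the one-time retrospective cost.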

http://www.loc.gov/cds/mds.html#complete

-Keith

p.s. If you could wait until 2007, then 2006 would be included in the
retrospective records.  Or you could even wait a decade and really
save money... and by that time maybe they'll be free at last :-)


On 9/5/06, Houghton,Andrew [EMAIL PROTECTED] wrote:

It is my understanding that OCLC *pays* LC several thousands
of dollars, per year, for access to their MARC records...