Re: [CODE4LIB] neo4j

2012-02-13 Thread Brian Tingle
My proposal for code4lib on this topic was not selected, but I was invited
to give the same talk at the Berkeley Information School Friday afternoon
seminar last week (though I had about 40 mins rather than 20).

Here are the notes from my talk last Friday:

http://tingletech.github.com/296a-1-2012/

Also, I did some quick screenrs of what I would have talked about (I didn't
really practice; I would have prepared more for a real talk, so these are
sort of phoning it in):
http://www.screenr.com/1lws
http://www.screenr.com/pfws
http://www.screenr.com/Pg9s

Here is a page that is powered by Tinkerpop/Neo4J/rexster in production

http://socialarchive.iath.virginia.edu/xtf/view?mode=RGraph&docId=franklin-benjamin-1706-1790-cr.xml

I've found tinkerpop, gremlin, and rexster to be very easy to work with,
and the tinkerpop list is very helpful.
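
Rexster is the part that makes the graph easy to get at from the front end --
anything that can speak HTTP and JSON can use it.  Just as a rough sketch (the
host, port, graph name, and vertex id below are made up, and the exact URL
layout and response keys depend on your Rexster version and configuration):

# minimal sketch of hitting a Rexster REST endpoint from Python;
# host, port, graph name ("snac"), and vertex id are placeholders
import json
import urllib.request

BASE = "http://localhost:8182/graphs/snac"

def get_json(url):
    """Fetch a URL and decode the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

# look up one vertex by id
print(get_json(BASE + "/vertices/1"))

# walk its outgoing edges; the key names follow GraphSON conventions
# and may differ between Rexster versions
for edge in get_json(BASE + "/vertices/1/outE").get("results", []):
    print(edge.get("_label"), "->", edge.get("_inV"))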

I'm also using a triple store to power a SPARQL interface:
http://socialarchive.iath.virginia.edu/sparql/


On Mon, Feb 13, 2012 at 2:23 PM, Chris Fitzpatrick
chrisfitz...@gmail.com wrote:

 Hey Kent,

 Awesome. thanks for the info. So, using gremlin, are you using some of
 the other Tinkerpop technologies?

 And, haha, in researching stuff this weekend, I actually saw an email
 you sent to the neo4j google group about the lucene boosting issue…

 I started playing around with RDF.rb , and was really impressed,
 although using that doesn't give you all the stuff tinkerpop does.

 b,chris.

 On Sat, Feb 11, 2012 at 12:32 AM, Kent Fitch kent.fi...@gmail.com wrote:
  Hi,
 
  AustLit ( http://www.austlit.edu.au ) is in the early stages of a
  migration from javaServlets/xslt/oracle to java/neo4j/gremlin.  The
  web version of AustLit was developed in 2000 based on FRBR with a
  strong emphasis on events realised with a topic map model, so the sql
  implementation is close to a triple-store.  More information on the
  details is here: http://www.austlit.edu.au/about,
  http://www.austlit.edu.au/about/metadata and
  http://www.austlit.edu.au:/DataModel/index.html (ALEG was the
  working name for AustLit redevelopment in 2000).
 
  Last year a decision was taken to move AustLit from a subscription
  service to open access, and from updates being performed solely by
  dedicated bibliographers and researchers (members of various AustLit
  teams distributed across Australia) to include community
  contributions, so rather than work these changes into a 12 year old
  system, it was decided to start afresh with an approach which would
  more naturally support the AustLit data model.
 
  So, we experimented with Neo4j, and were impressed with its
  performance.  For example, loading our current data from Oracle into
  an empty neo4j database takes about 30 minutes (using a
  run-of-the-mill 3 year-old server), producing a graph of 14m nodes and
  20m relationships.  Performing custom indexing of this data using the
  built-in Lucene integration takes about 2.5 hours, but that's a
  function of the extensive indexing we're performing.
 
  As you'd probably expect, we do have some issues we're working
  through, such as
 
  - integration with Lucene is abstracted by the neo4j index
  interface, so it is difficult or impossible to use some native Lucene
  features.  For example, boosting index nodes based on their inherent
  importance and using this boost in lucene to determine relevance
  cannot be done.
 
  - our data model is complex, and added to the requirements to version
  every node and relationship (ie, record changes, allow rollback), our
  graph traversals are correspondingly complex, but I suspect as we
  become more familiar with graph traversal idioms in gremlin and cypher,
  they'll become as normal as sql
 
  But so far, neo4j seems fast and robust, and we're optimistic!
 
  Kent Fitch
 
  On Sat, Feb 11, 2012 at 9:42 AM, Chris Fitzpatrick
  chrisfitz...@gmail.com wrote:
  Hej hej,
 
  Is anyone using neo4j in their library projects?
 
  If the answer is ja, I would be very interested in hearing how it's
 going.
  How are you using it?
  Is it something that is in production and is adding value or is it
  more a skunkworks-type effort?
  What languages are you using? Are you using an ORM (like Rails or
 Django)?
 
  I would also be really interested in hearing thoughts, stories, and
  opinions about the idea of using a graph db or triple store in their
  stack.
 
  tack!
 
  b, fitz.



Re: [CODE4LIB] Google Analytics w/ Sub-sub-domains

2012-02-06 Thread BRIAN TINGLE
This can be really tricky to get right when you have a more complicated site 
with lots of domains.  Since you are all on .yale.edu it should be easier than 
crossing .cdlib.org to .universityofcalifornia.edu.  If I understand correctly, 
you should be able to call
_gaq.push(['_setDomainName', '.yale.edu']); on every page and it should work.

http://code.google.com/apis/analytics/docs/tracking/gaTrackingSite.html#domainSubDomains
 

This debugging plugin for chrome is pretty useful 

https://chrome.google.com/webstore/detail/jnkmfdileelhofjcijamephohjechhna

It will help you confirm what is getting sent to google.

-- Brian

On Feb 6, 2012, at 11:53 AM, Predmore, Andrew wrote:

 I have been tasked with updating the Analytics for the Yale University 
 Library, and I am having quite a bit of trouble.
 
 Specifically, I was hoping to only track domain names that included 
 library.yale.edu, like www.library.yale.edu,  resources.library.yale.edu, but 
 the instructions don't seem to cover sub-sub-domains like this.
 
 Also, I was hoping to set up a profile/filter that would show me the 
 sub-domains in the reports.  Again, I followed the directions but I am not 
 getting any results.  Well, that's not entirely true: the reports are showing 
 about 30 visitors a day (and no page hits -- how is that possible?).  The main 
 profile is showing 5,000 – 10,000 visitors a day.
 
 Does anyone have experience with this that could help me out?  Maybe there is 
 even someone from Google at the conference?
 
 --
 Clayton Andrew Predmore
 Manager, Web Operations
 Yale University Library
 andrew.predm...@yale.edu


Re: [CODE4LIB] Google Analytics w/ Sub-sub-domains

2012-02-06 Thread Brian Tingle
Henry, that is what you need to do if you want to track the same page in
two different google analytics properties and you are using the
legacy synchronous code.  It sounds like Yale wants to collect all this use
under one UA- google analytics property (it is just that the property spans
multiple subdomains).

I think the link I sent to

http://code.google.com/apis/analytics/docs/tracking/gaTrackingSite.html

addresses the Yale case; and the way I read it, adding this:

 _gaq.push(['_setDomainName', '.yale.edu']);

Or  _gaq.push(['_setDomainName', 'library.yale.edu']);

on _every_ page should  work.

Right now, I only see _setDomainName on the home page.  If this is not the
_same_ on all the pages, the cookies won't be shared as users move between
the sites.

For example;

view-source:http://www.library.yale.edu/researcheducation/

This page is missing

_gaq.push(['_setDomainName', '.yale.edu']);

It will only work prospectively (it won't change the past), but once all
pages share the same _setDomainName they should all share the
same cookies, and the links between pages should be counted correctly.

But google analytics can get tricky; just when I think I understand
something it changes.  I find I have to double check things a lot with the
debug toolbar to make sure the right stuff is getting sent to google (esp.
when setting up custom events or setting up multiple trackers on the same
page).  You should be able to use it to verify that the same session and
cookies are being used as you go from page to page.  In the Chrome debugger
nowadays you can right-click on the console log and select Preserve Log
upon Navigation, which makes this a lot easier.


Re: [CODE4LIB] Q: best practices for *simple* contributor IP/licensing management for open source?

2011-12-15 Thread BRIAN TINGLE
 
 Does something along those lines end up working legally, or is it worthless, 
 no better than just continuing to ignore the problem, so you might as well 
 just continue to ignore the problem? Or if it is potentially workable, does 
 anyone have examples of projects using such a system, ideally with some 
 evidence some lawyer has said it’s worthwhile, including a lawyer-vetted 
 digital contributor agreement?



I'm not sure of the extent of this risk for most small projects, esp. if you don't 
think you will ever want to relicense it.

If I send you a pull request, and it is like a couple of characters different 
because there was a syntax error, or I add a couple of lines, and I don't 
bother to change the license and copyright statement, I don't think it is too 
unreasonable to accept the patch.  If I write dozens of files for a new module 
that is sort of unrelated to the original code and I don't include the 
project's copyright statement in the code, then I could see you saying hey, 
can you clarify if you are granting me the copyright, or maybe you want to slap 
a copyright notice on that yourself before you accept the contribution.

But maybe you could just make a statement in your README along the lines of "If 
you send a pull request to this project or otherwise make a contribution of 
code, you and your employer agree to grant a non-exclusive, royalty-free, perpetual 
redistribution license to Acme Inc, and you represent that you have the rights 
to do so."

Maybe you could argue that the act of submission of the modified code is an 
implicit license grant consistent with the terms of the license of the 
original code.

The cool thing about revision control, and accepting pull requests, is that it 
keeps a line by line record of who committed the code, so if there were a 
problem you might have a chance at extracting and re-writing the tainted 
contribution.

Of course, I am not a lawyer; you probably need to talk to your contracts and 
grants people or OGC.

   THIS EMAIL IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS AS IS
   AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   POSSIBILITY OF SUCH DAMAGE.


Re: [CODE4LIB] automatic greeking of sample files

2011-12-13 Thread BRIAN TINGLE
On Dec 12, 2011, at 6:35 PM, Michael B. Klein wrote:

 I've altered my previous function (https://gist.github.com/1468557) into
 something that's pretty much a straight letter-substitution cipher.

This is what I ended up using
https://github.com/tingletech/greeker.py/blob/3ba1e84bc1ea51fa501c1a479f8758593bac5ffd/greeker.py#L131-150
it uses a different straight letter-substitution for every unique word, using 
the input word as the seed for the random generator.
It does not look as pretty as your code.

 But if you really want it to index
 realistically, it would need to be altered to leave common stems (-s, -ies,
 -ed, -ing, etc.) alone (assuming the indexer uses some sort of stemming
 algorithm).

I'm only doing nouns, and I'm matching inflection.  I guess I could investigate 
stemming as well.

I'd still like to play with substituting nouns using a dictionary of nouns of 
the same length, but I have not found a dictionary of nouns to use.  I thought I 
would find one in NLTK somewhere, but I did not figure out how to use WordNet 
when I looked at it.


Re: [CODE4LIB] automatic greeking of sample files

2011-12-12 Thread Brian Tingle
On Mon, Dec 12, 2011 at 10:56 AM, Michael B. Klein mbkl...@gmail.com wrote:

 Here's a snippet that will completely randomize the contents of an
 arbitrary string while preserving the general flow (vowels replaced with
 vowels, consonants replaced with consonants (with case retained in both
 instances), digits replaced with digits, and everything else is left alone).

 https://gist.github.com/1468557


I like the way the output looks; but one problem with the random output is
that the same word might come out as different values.  The distribution of
unique words would also be affected; I'm not sure if that would
impact relevance/searching/index size.  Also, I was sort of hoping to be
able to have some sort of browsing, so I'm looking for something that is
like a pronounceable one-way hash.  Maybe if I take the md5 of the
word, use that as the seed for random, and then run
your algorithm, then NASA would always hash to the same thing?
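
Roughly the sketch I have in mind, just to test whether seeding off the word
gives a stable result (this is not your gist or greeker.py, only an
illustration of the md5-as-seed idea):

import hashlib
import random
import string

VOWELS = "aeiou"
CONSONANTS = "".join(c for c in string.ascii_lowercase if c not in VOWELS)

def scramble_word(word):
    # seed the RNG with the md5 of the word so the same input word
    # always scrambles to the same output
    seed = int(hashlib.md5(word.lower().encode("utf-8")).hexdigest(), 16)
    rng = random.Random(seed)
    out = []
    for ch in word:
        lower = ch.lower()
        if lower in VOWELS:
            new = rng.choice(VOWELS)
        elif lower in CONSONANTS:
            new = rng.choice(CONSONANTS)
        else:
            out.append(ch)
            continue
        out.append(new.upper() if ch.isupper() else new)
    return "".join(out)

print(scramble_word("NASA"))  # prints the same scrambled word both times,
print(scramble_word("NASA"))  # so "NASA" always hashes to the same thing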

Potential contributors of specimens would have to be okay with the fact
that a determined person could recreate their original records.  The goal
is that an end user who might stumble across a random XTF tutorial
installation would not mistake what they are seeing for a real collection
description.

Hopefully nothing transforms to a swear word; I guess that is a problem
with pig latin as well...

Thanks for the feedback and the suggestion.  I'll play with this some
tonight and see if setting the seed based on the input word works to get
the same pseudo-random result; seems like it should.


Re: [CODE4LIB] copyright/fair use considerations for re-using Seattle World's Fair images

2011-12-09 Thread BRIAN TINGLE
These guys might own the copyright:
http://seattlecenter.org/

https://www.facebook.com/pages/1962-Seattle-Worlds-Fair/106938462090

On Dec 9, 2011, at 12:53 PM, Doran, Michael D wrote:

 Hi Trish,
 
 Thank you for the referral.  I looked through that but I don't think my 
 intended use (an unofficial code4lib conference t-shirt) can be categorized 
 as teaching, research, or study. ;-)  I may do a one-off copy for myself.
 
 -- Michael
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Trish Rose-Sandler
 Sent: Friday, December 09, 2011 1:56 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] copyright/fair use considerations for re-using
 Seattle World's Fair images
 
 Michael,
 
 If you think your use falls under Fair Use you may find the recently
 released document from the Visual Resources Association useful
 
 *Statement on the Fair Use of Images for Teaching, Research, and Study*. *
 http://www.vraweb.org/organization/pdf/VRAFairUseGuidelinesFinal.pdf*.
 
 Trish Rose-Sandler
 Data Analyst, Biodiversity Heritage Library Project
 http://www.biodiversitylibrary.org/
 
 
 On Fri, Dec 9, 2011 at 1:45 PM, Beanworks beanwo...@gmail.com wrote:
 
  I think what Cary is trying to say is 'welcome to the fun world of
  copyright!'
 
 No, you shouldn't assume copyright was not renewed. You will need to
 determine (1) who the copyright holder is/was and (2) whether the
 copyright
 has lapsed. This is not always an easy task, which is why you need to
 document your good faith efforts (which will, of course, be exhaustive).
 
 Carol
 
 On Dec 9, 2011, at 2:26 PM, Cary Gordon listu...@chillco.com wrote:
 
 Copyright law requires that you make a good-faith effort to find the
 copyright owners. If you document such effort and they sue you, this
 can weigh heavily in your favor. There are two obvious caveats: a) You
 can still get sued, not to mention annoying cease-and-desist letters;
 and 2) They could still win.
 
 Being that we are, for the most part, not art critics, you could
 consider creating original art. You might get mocked, particularly
 after a few beers, but that's just the way we roll. Of course, if you
 buy beer, that will reduce any mock risk.
 
 Cary
 
 On Fri, Dec 9, 2011 at 12:34 PM, Doran, Michael D do...@uta.edu
 wrote:
 I was hoping to re-use/re-purpose a couple of 1962 Seattle World's
 Fair
 images found on the interwebs [1][2].  Both images were originally
 created
 for souvenir decals.
 
 According to the U.S. Copyright Office's Copyrights Basics [3]
 section on works originally created and published or registered before
 January 1, 1978, copyright endured for a first term of 28 years from the
 date it was secured -- i.e. for these images, from 1962 to 1990.  It
 goes
 on to say that During the last (28th) year of the first term, the
 copyright was eligible for renewal.  This however, was *not* an
 automatic
 renewal.
 
 So, unless the copyright was explicitly renewed in 1990, the images
 are
 in the public domain.  Since these images were for souvenir decals
 (rather
 than something like a poster), I'm inclined to think the original
 copyright
 owner probably didn't renew the copyright.  However, I don't know who the
 original copyright owner is and really have no way of finding out, and
 therefore I can't ascertain whether or not the copyright was renewed.
 
 For those with more experience in copyright, any thoughts regarding
 situations like this?
 
 I realize this isn't a coding question, but figured I might get some
 helpful responses from those of y'all working in archives and various
 digital projects where copyright issues regularly come up.
 
 ps  I've eliminated the Century 21 Exposition logo in my proposed
 reuse, if that matters (on one image, there is a registered trademark
 symbol next to the logo).  I'm also not retaining the original Seattle
 World's Fair text.
 
 -- Michael
 
 [1] http://www.flickr.com/photos/hollywoodplace/6007390480/
 
 [2]
 
 http://media.photobucket.com/image/seattle%20world%2527s%20fair%20monorail/
 bananaphone5000/NEWGORILLA/SeattleWFDecal.jpg
 
 [3] http://www.copyright.gov/circs/circ1.pdf
 
 # Michael Doran, Systems Librarian
 # University of Texas at Arlington
 # 817-272-5326 office
 # 817-688-1926 mobile
 # do...@uta.edu
 # http://rocky.uta.edu/doran/
 
 
 
 --
 Cary Gordon
 The Cherry Hill Company
 http://chillco.com
 


[CODE4LIB] automatic greeking of sample files

2011-12-09 Thread BRIAN TINGLE
Hi,

I'm now in the group that produces XTF, and for XTF4.0, I'm thinking about 
updating the EAD XSLT based on the Online Archive of California's stylesheets.

For our EAD samples that we distribute with the XTF tutorial, we are using 6 
EAD files from the Library of Congress (which presumably are public domain).

I'd like to start off a collection of pathological EAD examples that we have the 
rights to redistribute with the XTF tutorials and to use for testing.

Anticipating that potential contributors might not want to release their actual 
records for inclusion in an open source project, I hacked a little script to 
systematically change names and nouns to pig latin:

https://gist.github.com/1429538
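
The transformation itself is nothing fancy; the heart of it is roughly this (a
simplified sketch, not the exact code in the gist):

VOWELS = "aeiou"

def pig_latin(word):
    # simplified pig latin: move leading consonants to the end and add "ay";
    # vowel-initial words just get "way"; keep a leading capital and a
    # trailing "s" so plurals still look plural
    stem, plural = (word[:-1], "s") if word.lower().endswith("s") else (word, "")
    first_upper = stem[:1].isupper()
    lower = stem.lower()
    if lower[:1] in VOWELS:
        result = lower + "way"
    else:
        for i, ch in enumerate(lower):
            if ch in VOWELS:
                result = lower[i:] + lower[:i] + "ay"
                break
        else:
            result = lower + "ay"
    if first_upper:
        result = result.capitalize()
    return result + plural

print(pig_latin("Shuttle"))  # Uttleshay
print(pig_latin("deaths"))   # eathdays
print(pig_latin("apart"))    # apartway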

Here is a sample run:

Input: (from http://www.oac.cdlib.org/findaid/ark:/13030/kt3580374v/ )

The NASA Space Shuttle Challenger disaster occurred on January 28, 1986 when 
Space Shuttle Challenger broke apart 73 seconds into its flight, leading to the 
deaths of its seven crew members. Disintegration of the entire vehicle began 
after an O-ring seal in its right solid rocket booster failed at liftoff. The 
disaster resulted in the formation of the Rogers Commission, a special 
commission appointed by United States President Ronald Reagan to investigate 
the accident. The Presidential Commission found that NASA's organizational 
culture and decision-making processes had been a key contributing factor to the 
accident. NASA managers had known that contractor Morton Thiokol's design of 
the solid rocket boosters contained a potentially catastrophic flaw in the 
O-rings, but they failed to address it properly. They also disregarded warnings 
from engineers about the dangers of launching posed by the low temperatures of 
that morning.

output:

The Nasaay Acespay Uttleshay Allengerchay isasterday occurred on Anuaryjay 28, 
1986 when Acespay Uttleshay Allengerchay okebray apartway 73 econdsays into its 
flight, leading to the eathdays of its seven ewcray embermays. Isintegrationday 
of the entire ehiclevay began after an O-ring ealsay in its ightray solid 
ocketray oosterbay failed at iftofflay. The isasterday resulted in the 
ormationfay of the Ogersray Ommissioncay, a special ommissioncay appointed by 
Itedunay States Esidentpray Onaldray Eaganray to investigate the accidentway. 
The Esidentialpray Ommissioncay found that Nasaay's organizational ulturecay 
and decision-making ocessprays had been a key ontributingcay actorfay to the 
accidentway. Nasaay anagermays had known that ontractorcay Ortonmay Iokolthay's 
esignday of the solid ocketray oosterbays contained a potentially catastrophic 
awflay in the ingO-rays, but they failed to addressway it properly. They also 
disregarded arningways from engineerways about the angerdays of launching posed by the low emperaturetays of that orningmay.

Does anyone have any thoughts or feedback on this?  Is this totally silly?  Is 
there something besides pig latin that I could transform the words to?  Any 
obvious ways I could improve the python?


Re: [CODE4LIB] What software for a digital library

2011-12-09 Thread BRIAN TINGLE
On Dec 9, 2011, at 9:05 PM, Lars Aronsson wrote:

 in particular I didn't like these steps:
5. Shut down tomcat.
6. Do an incremental re-index (2) to include the new document.
7. Start up tomcat.
...
I'm not sure why this step is in the tutorial -- XTF does not normally require 
tomcat to be shut down/restarted for indexing.  (There is a tutorial 
version of XTF that comes with a bundled tomcat; maybe there is something about 
the way that tomcat is configured that makes this step required?)

 If I built this website today and not in 1994,
 http://runeberg.org/irescan/0014.html

 [...] which open source framework would I use? Greenstone?
 XTF? DSpace? Mediawiki? Django? WordPress?
 ... To be clear: I need a platform where regular users, logged
 in or not, can upload new books through a web interface.
 Does that leave me with anything else than Mediawiki?


Is that your most important requirement?

Are you expecting to just install something without doing a lot of development, 
or looking to have the most fun hacking?

What format is the book in?  PDF?  Individual page images?  Some ebook format? 
 Something downloaded from the Internet Archive?

The Open Monograph Press from the Public Knowledge Project might be something 
to look at when it comes out, but it may be more focused on editorial workflows 
than you would need?
http://pkp.sfu.ca/omp

Django is nice if you want to use an SQL database and an ORM.  Flask (a python 
microframework) also looks interesting.
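
For the requirement that regular users can upload new books through a web
interface, the Django side would not be much more than a model with a
FileField plus the admin or a ModelForm on top of it.  Just a sketch -- the
model and field names here are made up:

# sketch of the kind of model a user-upload book site might start from
from django.db import models
from django.contrib.auth.models import User

class Book(models.Model):
    title = models.CharField(max_length=255)
    uploader = models.ForeignKey(User, on_delete=models.CASCADE)
    scan = models.FileField(upload_to="books/%Y/%m/")
    uploaded_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.title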

 I would probably use some open source
 content management (CMS) or digital asset managment (DAM)
 software rather than a Perl script that generates static
 HTML files.


I would not give up on text files and generation scripts.  Check out this 
presentation from the last code4lib about using http://tinytree.info/ to run a 
lot of command line tools to generate static HTML.

http://www.slideshare.net/MrDys/lets-get-small-a-microservices-approach-to-library-websites
http://www.indiana.edu/~video/stream/launchflash.html?format=MP4&folder=vic&filename=C4L2011_session_3b_20110209.mp4
 


Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)

2011-12-08 Thread BRIAN TINGLE
returning JSONP is the cool hipster way to go (well, not hipster cool 
anymore, but the hipsters were doing it before it went mainstream), but I'm not 
convinced it is inherently a problem to return HTML for use in AJAX-type 
development in a non-ironic-retro way.

On Dec 7, 2011, at 2:19 PM, Robert Sanderson wrote:

 * Lax Security -- It's easier to get into trouble when you're simply
 inlining HTML received, compared to building the elements.  Getting
 into the same bad habits as SQL injection. It might not be a big deal
 now, but it will be later on.

I've been scratching my head about this one.  Can someone elaborate on this?

Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)

2011-12-08 Thread Brian Tingle
On Thu, Dec 8, 2011 at 9:11 AM, Godmar Back god...@gmail.com wrote:


 Let me give you an example for why returning HTML is a difficult
 approach, to say the least, when it comes to rich AJAX applications. I
 had in my argument referred to a trend, connected to increasing
 richness and interactivity in AJAX applications being developed today.


don't get me wrong; I love hipsters and JSON.  JSONP callbacks are esp.
handy.  I still would not consume it from a source I did not trust.

 ...



 If we tell newbies (no offense meant by that term) that AJAX means
 send a request and then insert a chunk of HTML in your DOM, we're
 short-changing their view of the type of Rich Internet Application
 (RIA) AJAX today is equated with.

Sure, fair point -- I just don't think there is anything wrong with
generating HTML on the server and injecting it into the DOM if that makes the
most sense for what you are trying to do.  And for things that work that
way now, I don't see a need to rush and change it all to JSONP callbacks
because of some vague security concern.


  - Godmar



Re: [CODE4LIB] Models of MARC in RDF

2011-12-06 Thread BRIAN TINGLE
On Dec 6, 2011, at 5:52 PM, Montoya, Gabriela wrote:

 ...
 I'd much rather see resources invested in data synching than spending it in 
 saving text dumps that will most likely not be referred to again.
 ...

In a MARC-as-the-record-of-record scenario, storing the original raw MARC might 
be helpful for the syncing -- when a sync was happening, the new MARC of record 
could be compared against the old MARC of record to know which RDF triples 
needed to be updated?
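
Something along these lines, for example -- a sketch using pymarc, with made-up
file names; a (tag, value) set comparison is crude, but it is enough to flag
which parts of the record changed and therefore which triples to regenerate:

from pymarc import MARCReader

def fields_of(path):
    """Return the set of (tag, value) pairs for the first record in a file."""
    with open(path, "rb") as fh:
        record = next(iter(MARCReader(fh)))
    return {(f.tag, f.value()) for f in record.get_fields()}

# compare the stored "MARC of record" against the freshly synced copy
old = fields_of("old-record.mrc")
new = fields_of("new-record.mrc")

for tag, value in sorted(old - new):
    print("removed:", tag, value)
for tag, value in sorted(new - old):
    print("added:  ", tag, value)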


[CODE4LIB] Library News (à la ycombinator's hackernews)

2011-11-23 Thread BRIAN TINGLE
I'm not sure how many of y'all read hackernews (news.ycombinator.com, I'm 
addicted to it) but I just saw on there that there is a similar style site for 
Library News that somebody launched.

http://news.librarycloud.org/news


Re: [CODE4LIB] Plea for help from Horowhenua Library Trust to Koha Community

2011-11-22 Thread BRIAN TINGLE
FWIW, the discussion on hackernews

http://news.ycombinator.com/item?id=3264378

On Nov 21, 2011, at 4:51 PM, Joann Ransom wrote:

 Horowhenua Library Trust is the birth place of Koha and the longest serving
 member of the Koha community. Back in 1999 when we were working on Koha,
 the idea that 12 years later we would be having to write an email like this
 never crossed our minds. It is with tremendous sadness that we must write
 this plea for help to you, the other members of the Koha community.
 
 The situation we find ourselves in, is that after over a year of battling
 against it, PTFS/Liblime have managed to have their application for a
 Trademark on Koha in New Zealand accepted. We now have 3 months to object,
 but to do so involves lawyers and money. We are a small semi rural Library
 in New Zealand and have no cash spare
 in our operational budget to afford this, but we do feel it is something we
 must fight.
 
 For the library that invented Koha to now have to have a legal battle to
  prevent a US company trademarking the word in NZ seems bizarre, but it is at
 this point that we find ourselves.
 
 So, we ask you, the users and developers of Koha, from the birth place of
 Koha, please if you can help in anyway, let us know.
 
 Background reading:
 
   - Code4Lib article http://journal.code4lib.org/articles/1638: How hard
   can it be : developing in Open Source [history of the development of Koha]
   by Joann Ransom and Chris Cormack.
    - Timeline http://koha-community.org/about/history/ of Koha development
   - Koha history visualization http://www.youtube.com/watch?v=Tl1a2VN_pec
 
 
 Help us
 If you would like to help us fund legal costs please use the paypal donate
 button below.
 
 
 
 
 Otherwise, any discussion, public support and ideas on how to proceed would
 be gratefully received.
 
 Regards
 
 
 Jo.
 
 -- 
 Joann Ransom RLIANZA
 Head of Libraries,
 Horowhenua Library Trust.


[CODE4LIB] Open Position at California Digital Library Access Group

2011-10-12 Thread Brian Tingle
The California Digital Library’s Access Group is seeking a programmer
analyst to help build and provide access to a world-class collection of
scholarly publications and datasets, and historical and primary source
materials.  We need your expertise to develop platforms and tools for
distributing the scholarly work generated by the UC academic community and
for surfacing the remarkable, one-of-a-kind collections held in the
University of California’s campus libraries and California’s cultural
institutions.

About the position: Programmer/Analyst II

The programmer analyst will provide development and operations support to
our core services—eScholarship, the Online Archive of California, and
Calisphere—with a focus on further developing systems for contributing
digital content.  We are committed to helping institutions of varying sizes
and technical infrastructure expand access to their collections within
California and across the world.  The programmer analyst will also help
enhance end-user services to provide innovative access to those collections.

This position will play a major role in CDL’s contribution to the Open
Journal Systems (OJS) project; additional projects may include:

   - Developing a statistics reporting system  that meets internal and
   contributor needs
   - Building out a user interface for contributing content
   - Supporting the ingest and display of new content streams, such as audio
   and video
   - Creating new tools for both content contributors and end-users to meet
   expanding user expectations

The programmer analyst will also be responsible for keeping “an ear to the
ground” for potential new tools and services by monitoring technology trends
and investigating their viability and potential incorporation into Access
Group services.  The CDL continually seeks new and creative ideas, and this
position has the potential to make a real difference in how we work and what
kinds of features we offer our constituents.

For more information, and to apply, visit
https://jobs.ucop.edu/applicants/Central?quickFind=54764


Re: [CODE4LIB] Job Posting: Digital Library Repository Developer, Boston Public Library (Boston, MA)

2011-09-27 Thread BRIAN TINGLE
I know I should not take the bait... but if anything we say on this list -- 
however stupid or pedantic -- is taken as representing our employers and not 
our personal opinions, then I'm not sure this is a list I can participate in.  
It is chilling to see veiled legal threats thrown around on this list.   I 
mostly lurk here anyways.  But if everything I say is going to be taken to be 
the official word of my employer, then basically I can't say anything at all as 
far as I understand, except maybe if I cut and paste from press releases / get 
everything I say vetted through a communications officer.

I read the announcement in a way more similar to the way Ya'aqov did than the 
way Roy did, but I don't see how Roy's comments were uncalled for.  As far as 
interfering with a recruitment (?), if anything this increased the visibility of 
this position.  I know I would not have bothered to read the position 
description (on a vacation day even) if I had not been curious to see why it 
had attracted so much attention.

Are there any ground rules or terms of use for this list...  All I can find is 
this:

https://listserv.nd.edu/cgi-bin/wa?A2=ind0312&L=CODE4LIB&T=0&F=&S=&P=61

If it is official policy that we don't speak for ourselves, I'm out of here.


On Sep 27, 2011, at 7:14 PM, Ya'aqov Ziso wrote:

  The posting's sentence 'the successful candidate will develop and
  maintain' does NOT say 'developing its own digital repository system
  ... throwing anything else at it beyond this one developer' as Roy put it.
 
 In a community where any comma or space makes a world of a difference I pay
 attention to all words and their consequences.
 
 Roy, the wording of your question and intervention in BPL's search (as
 someone representing OCLC and its monopoly) were uncalled for. Yes, let's
 move on, Ya'aqov
 
 
 
 
 On Tue, Sep 27, 2011 at 12:40 PM, Roy Tennant roytenn...@gmail.com wrote:
 
 Phew! That's a relief! I saw the word develop instead of
 implement. Thanks for the clarification,
 Roi
 
 2011/9/27 Colford, Scot scolf...@bpl.org:
 Not developing from scratch, mind you.
 
 This position will be working closely with the other position posted for
 Web Services Developer, the rest of the Web Services and Digital Projects
 teams already at the BPL, and the staffs of other Massachusetts libraries
 participating the Digital Commonwealth project.
 
 Don't you worry about us, Roy. ;-)
 
 \-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/
 
 Scot Colford
 Web Services Manager
 Boston Public Library
 
 scolf...@bpl.org
 Phone 617.859.2399
 Mobile 617.592.8669
 Fax 617.536.7558
 
 
 
 
 
 
 
 On 9/27/11 11:58 AM, Roy Tennant roytenn...@gmail.com wrote:
 
 So BPL is developing its own digital repository system? Mind if I ask
 why? And are you throwing anything else at it beyond this one
 developer?
 Roy
 
 On Tue, Sep 27, 2011 at 8:52 AM, Colford, Scot scolf...@bpl.org wrote:
 The Boston Public Library is accepting applications for the Digital
 Library Repository Developer position. The successful candidate will
 develop and maintain the core technical infrastructure for a digital
 object repository and library system that will be used by Massachusetts
 libraries, archives, historical societies, and museums to store and
 deliver digital resources to users across the State and beyond.
 Competitive benefits. Salary:  $62,053 - 83,770, DOQ.
 
 
 MINIMUM QUALIFICATIONS:
 
 
 EDUCATION
 
  Bachelor's Degree in Computer Science from an accredited college or
 university with a focus on programming, applications development, and
 scripting languages. Preferred degree or coursework in
 Library/Information
 Science.
 
 
 EXPERIENCE
 
 · A minimum of 4 years experience of significant development
 experience
 in an object oriented environment such as Ruby, Python, or Java.
 
 ·Strong working knowledge of XML/XSLT.
 
 ·   Demonstrated familiarity with image, audio, video, and text
 file
 formats - especially as they relate to digital library standards,
 encoding/decoding/transcoding, and related metadata schemas.
 
 ·   Demonstrated familiarity with semantic web/RDF components such
 as SPARQL, FOAF, and OWL.
 
 ·   Demonstrated familiarity and comfort working with various
 operating systems such as UNIX/Linux, Windows, and Mac OSX.
 
 ·  Significant experience working in LAMP and/or WAMP stacks,
 preferably on virtualized and/or cloud-computing platforms.
 
 ·   Experience with open-source repository systems such as Fedora,
 Greenstone, or D-Space and affiliated projects and service providers
 such
 as Hydra, Islandora, and Duraspace.
 
 ·   Demonstrated project management experience.
 
 
 
  REQUIREMENTS - Ability to exercise good judgment and focus on detail as
 required by the job
 
 
  RESIDENCY - Must be a resident of the City of Boston upon the first day
 of
 hire.
 
 
  CORI - Must successfully clear a Criminal Offenders Record Information
 check with the City of Boston
 
 
 Complete job description 

[CODE4LIB] json4lib / API for Calisphere

2011-05-09 Thread Brian Tingle
I've started work on a project that I'm envisioning partly as sort of
a JSON profile of METS optimized for access.

I've implemented this JSON format for search results in calisphere by
adding an rmode=json parameter to the calisphere/oac branch of xtf.

Here is some documentation about it
  http://json4lib.readthedocs.org/en/latest/

And here is the code
   http://code.google.com/p/json4lib/source/browse/

This provides an API of sorts to calisphere which is being used to
power a slide show widget we are beta testing with contributors which
should be released in June.

Only simple image objects support full file access at this time, but I
have an idea of how to support complex objects in search results using
json references.  My intention is to use the same JSON format for
either a set of search results or a flattened set of nodes of a
complex object.

To see the JSON results, it is easiest to install JSONview for Firefox or Chrome

https://addons.mozilla.org/en-us/firefox/addon/jsonview/
https://chrome.google.com/extensions/detail/chklaanhfefbnpoihckbnefhakgolnmc

Do a search or browse in calisphere, say to

http://www.calisphere.universityofcalifornia.edu/browse/azBrowse/National+parks

add rmode=json

http://www.calisphere.universityofcalifornia.edu/browse/azBrowse/National+parks?rmode=json

the callback parameter is also supported to allow cross domain access via JSONP
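
If you want to grab the results from a script rather than the browser, it is
just an HTTP GET.  A quick sketch (I'm only printing the top-level keys rather
than assuming anything about the structure; see the readthedocs link above for
the actual format):

import json
import urllib.request

# the National Parks browse example from above, with rmode=json added
url = ("http://www.calisphere.universityofcalifornia.edu"
       "/browse/azBrowse/National+parks?rmode=json")

with urllib.request.urlopen(url) as resp:
    data = json.loads(resp.read().decode("utf-8"))

# poke around the top level of what came back
for key in data:
    print(key)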

Here is a working example of the slideshow widget

http://cdn.calisphere.org/json4lib/slideshow/example.html

The slideshow widget is built using jQuery UI Dialog and the
PikaChoose slideshow library.

Here is how the JSONP is generated out of XTF (based on xml2json-xslt)
https://bitbucket.org/btingle/dsc-xtf/src/92c8607e3fed/style/crossQuery/resultFormatter/json/

The API and the widget are provided for the use of OAC/Calisphere
contributors on their own websites.  Nothing technical would stop
someone from using the JSON in other ways.  (I figure I could
whitelist referrers if unauthorized use becomes a problem).

I'm interested to hear any feedback; especially about the approach of
creating a javascript API (do_api.js) that provides methods to the
underlying JSON digital library object -- or my crazy arbitrarily
qualifiable dublin core implementation -- or any advice or concerns
about leaving the JSON feed wide open w/o any sort of API key.

Thanks -- Brian


Re: [CODE4LIB] Blogs/news you follow

2011-05-03 Thread Brian Tingle
This pearltree is called daily reading but some are more like weekly or monthly:

http://pear.ly/tSgr
{
  http://highscalability.com/
  http://slashdot.org/
  http://planet.code4lib.org/
  http://planetdjango.org/ -- currently down
  http://news.ycombinator.com/
  http://thedailywtf.com/
}

plus I try to scan

http://groups.google.org/

On Mon, May 2, 2011 at 1:51 PM, Lovins, Daniel daniel.lov...@yale.edu wrote:
 I get a daily digest from slashdot.org.
 / Daniel

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ed 
 Summers
 Sent: Monday, May 02, 2011 2:02 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Blogs/news you follow

 On Mon, May 2, 2011 at 12:56 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 programming.reddit.com

 Similar, but different:

    http://news.ycombinator.com/

 which also has a daily edition:

    http://www.daemonology.net/hn-daily/

 //Ed



Re: [CODE4LIB] graphML of a social network in archival context

2011-02-18 Thread Brian Tingle
 How have you been liking neo4j so far? Is the neo4j
 graph database something that you have been using in SNAC? Have you
 been interacting with it mainly via gremlin, the REST API, and/or
 Java?

I've been using the tinkerpop graph processing stack, and the first
example I found in the documentation used neo4j, and it seemed to be
installed already with gremlin by its pom.xml so I just went with it.
 The code should work with any graph database that blueprints
supports.

tinkerpop has something called rexster which gives a JSON/REST
interface to the graph.  This is what I'm planning to use as the
backend for an interactive javascript visualization of the graph.


 Just as an aside, I noticed that there are 66 edges that lack labels,
 and 8332 'associateWith' labels that probably should be
 'associatedWith'?

Thanks for catching this error; there must be a typo somewhere in the
EAC to EAC XSLT.

 I'm also kind of curious to hear more about what
 'associatedWith' means, is that something from EAC? I noticed that it
 can connect people, corporate bodies and families.

This is something I think Daniel Pitti came up with for the project.
Any named identity mentioned in the EAD is presumed to be
associatedWith the named identity of the collection creator -- but
if the named identity is in a correspondence series, or there is
another clue that there was correspondence between the identities, they
are tagged as correspondedWith, a stronger connection.


[CODE4LIB] graphML of a social network in archival context

2011-02-17 Thread Brian Tingle
Hi,

As a part of our work on the Social Networks and Archival Context
Project [1], the SNAC team is pleased to release more early results of
our ongoing research.

A property graph [2] of correspondedWith and associatedWith
relationships between corporate, personal, and family identities is
made available under the Open Data Commons Attribution License [3] in
the form of a graphML file [4].  The graph expresses 245,367
relationships between 124,152 named entities.

The graphML file, as well as the scripts to create and load a graph
database from EAC or graphML, are available on google code [5]
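
If you just want to poke at the graphML without standing up a graph database,
something like networkx will read it directly.  A quick sketch -- the file name
is a placeholder for whatever the graphML file in the download is called, and
the node attribute names depend on how the graphML was written:

import networkx as nx

G = nx.read_graphml("snac.graphml")
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")

# the ten most-connected identities; "label" is a guess at the attribute name
top = sorted(G.degree(), key=lambda pair: pair[1], reverse=True)[:10]
for node, degree in top:
    print(degree, G.nodes[node].get("label", node))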

We are still researching how to map from the property graph model to
RDF, but this graph processing stack will likely power the interactive
visualization of the historical social networks we are developing.

Please let us know if you have any feedback about the graph, how it is
licensed, or if you create something cool with the data.

-- Brian

[1] http://socialarchive.iath.virginia.edu/

[2] http://engineering.attinteractive.com/2010/12/a-graph-processing-stack/

[3] http://www.opendatacommons.org/licenses/by/

[4] http://graphml.graphdrawing.org/

[5] 
http://code.google.com/p/eac-graph-load/downloads/detail?name=eac-graph-load-data-2011-02.tar

Research funded by the National Endowment for the Humanities http://www.neh.gov/


Re: [CODE4LIB] Apache URL redirect

2011-02-03 Thread Brian Tingle
Redirect does not look at the hostname, just the path.

I think you have two options;

1) Set up a name-based virtual host for www.partnersinreading.org.
In that name-based virtual host, set up your Redirect /
http://www.sjpl.org/par

2) If you are using mod_rewrite, you could do something like this:

RewriteCond %{HTTP_HOST} ^.*partnersinreading\.org$
RewriteRule ^/(.*)$ http://www.sjpl.org/par$1 [R,NE]

hope that helps,

-- Brian

On Thu, Feb 3, 2011 at 1:42 PM, Nate Hill nathanielh...@gmail.com wrote:
 Hi - I'm new to Apache and hope that someone out there might be able to help
 me with a configuration issue over at San Jose Public Library.

 I need to have the URL www.partnersinreading.org redirect to
 http://www.sjpl.org/par
 Right now if you go to www.partnersinreading.org it takes you to the root,
 sjpl.org and then if you navigate through the site all the urls are
 rewritten with partnersinreading as the root.
 That's no good.

 I went into Apache's httpd.conf file and added in the mod_alias area:
 Redirect permanent http://www.partnersinreading.org/ http://www.sjpl.org/par

 I'm assuming this is not a DNS issue...

 This isn't the right approach.

 Any input would be appreciated, its rather unnerving to have no experience
 with this and be expected to make it work.


 --
 Nate Hill
 nathanielh...@gmail.com
 http://www.natehill.net



[CODE4LIB] METS Editorial Board accepting applications

2011-02-01 Thread Brian Tingle
forwarded...

Apologies for any duplication!

The METS Editorial Board (MEB) is seeking applicants to fill two
positions on the Board. Board membership criteria and procedures
for filling the Board vacancies are documented on the METS website:

METS Editorial Board Membership Criteria:
http://www.loc.gov/standards/mets/mets-boardcriteria.html

Procedures for Filling METS Editorial Board Vacancies:
http://www.loc.gov/standards/mets/mets-boardprocedures.html

Given the MEB agenda for the next several years, we are seeking
applicants who are interested both in education/training to further
expand the established METS usage around the world, but also working
with the rest of the Board in adapting the METS-related schemas and
tools for use with the Semantic Web and related technologies.

Deadline for applications is February 21st, and should be sent to
nhoe...@kmotifs.com or rbeau...@library.berkeley.edu. Questions
about Board membership, goals, and processes can also be directed
to any Board member.  Please consider joining the Board if you're
interested in working with a congenial, committed group of experienced
digital library and preservation partners.

Regards,
Nancy Hoebelheinrich
Administrative Co-chair, METS Editorial Board
--
Nancy Hoebelheinrich
Information Analyst  Principal
Knowledge Motifs LLC
San Mateo, CA  94401
nhoe...@kmotifs.com
njhoe...@gmail.com
(m) 650-302-4493
(f) 650-745-


Re: [CODE4LIB] best persistent url system

2011-01-14 Thread Brian Tingle
On Fri, Jan 14, 2011 at 11:20 AM, Michael J. Giarlo
leftw...@alumni.rutgers.edu wrote:

 Has anyone thought through, or put into practice, using Apache
 mod_rewrite tables for this simple redirect one URL to another use
 case?

I do mod_rewrite redirects with a RewriteMap

https://bitbucket.org/btingle/dsc-role-account/src/c81c543848a7/servers/front/conf/common-rewrite.conf#cl-26
https://bitbucket.org/btingle/dsc-role-account/src/c81c543848a7/servers/front/conf/UCPEE.txt

But I prefer to use mod_rewrite with mod_proxy in a reverse proxy
configuration -- this lets the URL NOT redirect so the canonical URL
stays in the browser window (at least for the first page).  The
persistent URL and the real URL are the same this way.

http://content.cdlib.org/ark:/13030/tf387002bh/
https://bitbucket.org/btingle/dsc-role-account/src/c81c543848a7/servers/front/conf/common-rewrite.conf#cl-58

http://www.oac.cdlib.org/findaid/ark:/13030/kt0f59q75v/
https://bitbucket.org/btingle/dsc-role-account/src/c81c543848a7/servers/front/conf/vhosts/001-findaid.conf.in#cl-69

By using mod_proxy with mod_rewrite; the URLs for the front pages of
the EADs did not change as we moved from DLXS to XTF (at least for
front or home pages for the finding aids).


Re: [CODE4LIB] javascript testing?

2011-01-12 Thread BRIAN TINGLE
Mark Redar at CDL has some selenium tests for calisphere.org/mapped but they 
are not automatically run

I've been wanting to play around with selenium grid on EC2 but never had the 
time / real reason to -- but if it is as advertised it might speed up running 
the tests by executing them in parallel

http://selenium-grid.seleniumhq.org/
http://selenium-grid.seleniumhq.org/run_the_demo_on_ec2.html

-- Brian

On Jan 12, 2011, at 8:45 AM, Demian Katz wrote:

 For what it's worth, the VuFind community has recently been playing with 
 Selenium (not an especially new or exciting technology, I realize...  and 
 probably one of the things you were thinking of for approach #1).  The good 
 news is that it plays well with Hudson, and we have been able to get it to 
 successfully automatically test AJAXy code in Firefox as part of our 
 continuous integration process.  The bad news is that it's incredibly slow -- 
 that successful test takes ten minutes to execute, and all it does is load 
 one web page and confirm that a lightbox opens when a button is clicked.  I 
 wouldn't realistically expect this sort of thing to be FAST, but the current 
 performance we are experiencing stretches belief a bit -- we're still 
 investigating to see if we're doing something wrong that can be improved, but 
 the general consensus seems to be that Selenium is just really slow on 
 certain platforms.  It's a shame, because I think we could potentially write 
 a very comprehensive and powerful test suite with Selenium...  but tests are significantly less
 valuable if they can't give you reasonably quick feedback while you're in the 
 midst of coding!
 
 In any case, I'm happy to share my limited experience with Selenium if it's 
 of any use (some VuFind-specific notes are here: 
 http://vufind.org/wiki/unit_tests#selenium and more can probably be gleaned 
 by looking at VuFind's test-related configuration and scripts).  I'd also be 
 very interested to hear if anyone has overcome the speed problems (which I've 
 encountered under both RedHat and Ubuntu, possibly related to using a virtual 
 frame buffer) or if there is a better, equivalent solution.
 
 - Demian
 
 -Original Message-
 From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of
 Jonathan Rochkind
 Sent: Wednesday, January 12, 2011 11:32 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] javascript testing?
 
 As far as I can tell, while there are several, there are none that are
 actually Just Work good.  It seems to be an area still in flux, people
 coming up with an open source way to do that that is reliable and easy
 to use and just works.
 
 The main division in current approaches seems to be between: 1) Trying
 to automate _actual browsers_ so you know you've tested it in the real
 browsers you care about (the headaches of this are obvious, but people
 are doing it!), and 2) Using a headless javascript browser that can be
 run right on the server, to test general javascriptyness but without
 testing idiosyncracies of particular browsers (I would lean towards
 this
 one myself, I'm willing to give up what it gives up for something that
 works a lot simpler with less headaches).
 
 Jonathan
 
 On 1/11/2011 7:21 PM, Bess Sadler wrote:
 Can anyone recommend a javascript testing framework? At Stanford, we
 know we need to test the js portions of our applications, but we
 haven't settled on a tool for that yet. I've heard good things about
 celerity (http://celerity.rubyforge.org/) but I believe it only works
 with jruby, which has been a barrier to getting started with it so far.
 Anyone have other tools to suggest? Is anyone doing javascript testing
 in a way they like? Feel like sharing?
 
 Thanks!
 
 Bess
 


[CODE4LIB] graph processing stack

2010-12-20 Thread BRIAN TINGLE
I saw an interesting article on hackernews (news.ycombinator.com) yesterday 
published by AT&T Interactive.

-- article summary -- 

A Graph Processing Stack
http://engineering.attinteractive.com/2010/12/a-graph-processing-stack/

[...] ATTi along with other collaborators (see acknowledgments), have been 
working on an open-source, graph processing stack. This stack depends on the 
use of a graph database. There are numerous graph databases in the market 
today. To name a few, there exist Neo4j, OrientDB, DEX, InfiniteGraph, Sones, 
HyperGraphDB, and others.

Blueprints can be considered the 'JDBC' for the graph database community.  
There is a driver for the Sesame Sail quad store:
https://github.com/tinkerpop/blueprints/wiki/

Pipes is low-level access to a graph in the database; it looks almost like DOM 
or SAX, but for graphs.
https://github.com/tinkerpop/pipes/wiki/

Gremlin is a higher level access API and looks more like XPath
https://github.com/tinkerpop/gremlin/wiki

Finally, at the top of the stack, there exists Rexster. Rexster exposes any 
Blueprints-enabled graph database as a RESTful server.
https://github.com/tinkerpop/rexster/wiki/

-- end of article summary --

We just released the first public prototype for a Social Networks and Archival 
Context project; right now you can search 123,920 EAC-CPF records with XTF.
http://socialarchive.iath.virginia.edu/prototype.html

So there is full text search of names and keywords, and some faceted browsing 
and basic search stuff.  Some of the really interesting data is in all the 
correspondedWith and associatedWith relationships we have in the EAC 
records, but putting those into XTF facet values does not seem to be too useful.

Now I think that loading these correspondedWith and associatedWith 
relationships into a graph database with a simple model and then plopping this 
graph processing stack on top of it might be the best way to index and search 
these relationships.  I had been trying to figure out RDF and how FOAF would 
map to our data, but I'm stuck on that and don't know how to start.  

graph processing stack on top of a graph database resonates with me more than 
RDF store with SPARQL access, but I guess they are basically/functionally 
saying the same thing?  Maybe the graph database way of thinking about it is a 
potentially less interoperable, less open-data-linking way? -- but I've always 
believed you have to operate before you can interoperate.

Anyway, I hope the recycled hackernews was interesting, and if anyone has any 
ideas, suggestions, criticism or advice on how to expose access to the social 
graph in the SNAC project prototype please let me know.


[CODE4LIB] collengine, the collection engine; runs on django-nonrel / app engine

2010-12-16 Thread BRIAN TINGLE
It has been several months since I've tried to run django on the google app 
engine, so I took a crack at it today with Django appengine:
http://www.allbuttonspressed.com/projects/djangoappengine

Since it is based on django-nonrel, in theory it does not have vendor lock in 
to app engine, so you could start to develop there and move in house if you 
need to.

I set up a very simple little app, and it deployed to appspot okay, here is the 
code and a short screen cast on my blog

screen cast:
http://tingletech.tumblr.com/post/2334189882/
demonstrates the django admin interface running in the google app engine 
editing the super basic models

The super basic models:
https://github.com/tingletech/collengine/blob/master/items/models.py

code repository: 
https://github.com/tingletech/collengine

Does anyone know of any other django or app engine based digital library 
metadata collection tools?  Seems like being able to run for free on app engine 
(if things fit in google quotas) would be an advantage for small libraries and 
short term grant funded projects.  Also, the django-nonrel looks like it has 
some interesting search features that could be used in access systems.

Anyway, just throwing this out there in case it might be useful for the hackfest

-- Brian


Re: [CODE4LIB] HTML Load Time

2010-12-06 Thread Brian Tingle
 Does anyone have any tricks or tips to decrease
 the load time?

You could try server side gzip compression
https://github.com/paulirish/html5-boilerplate/blob/master/.htaccess#L101

At a certain point, all you can do is try to split it up into multiple pages.


On Mon, Dec 6, 2010 at 11:49 AM, Nathan Tallman ntall...@gmail.com wrote:
 Hi Cod4Libers,

 I've got a LARGE finding aid that was generated from EAD.  It's over 5 MB
 and has caused even Notepad++ and Dreamweaver to crash.  My main concern is
 client-side load time.  The collection is our most heavily used and the
 finding aid will see a lot of traffic.  I'm fairly adept with HTML, but I
 can't think of anything.  Does anyone have any tricks or tips to decrease
 the load time?  The finding aid can be viewed at 
 http://www.americanjewisharchives.com/aja/FindingAids/ms0361.html.

 Thanks,
 Nathan Tallman
 Associate Archivist
 American Jewish Archives



[CODE4LIB] Reimagining METS

2010-10-27 Thread Brian Tingle
The METS Editorial Board is starting to think about what a METS 2.0
might look like / assess the need for a METS 2.0.

To that end, we have put together a little White Paper:

Reimagining METS: An Exploration
http://bit.ly/cySIM1

suggested supplemental reading for Reimagining METS:
http://bit.ly/96vaFO

From Nancy's message to the METS list:
[T]here is a session planned to discuss the White Paper at the CLIR /
DLF Fall Forum (http://www.clir.org/dlf/forums/fall2010/index.html)
on Tuesday, November 2nd, from 4 - 5:30 pm PDT at the DLF Meeting site
in Palo Alto, California.  While registration for that event is now
closed, discussions by anyone interested including METS Board members
will also occur on Wednesday afternoon at the open Board meeting, from
1:30 - 5 pm, PDT.  An agenda for the open Board meeting on Wednesday
and Thursday can be found on the METS wiki at:
https://www.socialtext.net/mim-2006/index.cgi?agenda_3_4_november_2010_dlf_fall_forum.
 If you are interested in attending this meeting in person, please
contact any of the Board members (see the METS website at
http://www.loc.gov/standards/mets/mets-board.html).  We would like to
make the meeting available via web conferencing as well, if possible,
so please let a Board member know if you are interested in
participation in the meeting by that means.

I personally have questions about the need for a new XML Schema (w3c)
for METS, and I'd like to understand what the goal of the new METS is
before deciding what the form of a new METS is.  Once, Mackenzie Smith
was suggesting investigating METS as an RDF schema.  Maybe if METS
could be expressed in JSON, then they would be easy to work with from
javascript web apps?  Could METS become a metamodel of digital object
existence that transcends the physical information serialization?  Or,
do we just try to harmonize with MODS/MADS and EAD/TEI/DDI etc. as
they evolve as XML schema and wave at OAI-ORE?  With alternatives to
the fileSec like bagIt out there, could METS stand to be more modular
so one maybe could keep the structMap in METS but point to files in a
bag?

Also, what about interoperability?  This still seems like a good goal
to me, an interoperable standard for digital object where I can
download an object out of a repository and put it my own system
without having to worry about customizing my systems to work with your
stuff.  METS sort of helps here, but ... not really that much more
than having stuff in XML.

What are your ideas about what direction METS should develop in to
best meet the needs of the code4lib community?  What annoys you about
METS needs to be fixed?  What about METS puzzles your?  What do you
love about METS and would hate to see changed?

Thanks for any thoughts on this subject,

-- Brian Tingle, METS Editorial Board

some more METS related thoughts and links
http://tingletech.tumblr.com/tagged/mets


Re: [CODE4LIB] open source proxy packages?

2010-08-14 Thread Brian Tingle
apache httpd has a mod_proxy module can let apache act as a proxy server.

http://httpd.apache.org/docs/current/mod/mod_proxy.html

You should be able to use this with htpasswd files you would use to
secure a web directory with apache.

On Sat, Aug 14, 2010 at 10:05 AM, phoebe ayers phoebe.w...@gmail.com wrote:
 Hello all,

 Are there any open source proxies for libraries that have been
 developed, e.g. an open source alternative to EZProxy or similar? I'm
 working with a non-profit tech foundation that is interested in
 granting access to a few licensed resources to a few hundred people
 who are scattered around the world.

 thanks,
 Phoebe

 --
 * I use this address for lists; send personal messages to phoebe.ayers
 at gmail.com *



[CODE4LIB] cpf2html

2010-08-09 Thread Brian Tingle
Hi,

A couple of you on the list may have been at the EAC-CPF: Moving
forward with Authority thing at NARA today.

I can't post a link to the demo site yet, but all the source for my
part of the EAC-CPF XTF prototype for the Social Networking in
Archival Context project http://socialarchive.iath.virginia.edu/ is at
http://bitbucket.org/btingle/cpf2html/wiki/Home

These are libraries I'm investigating for visualization of the social
network. - http://tingletech.tumblr.com/tagged/visualization

Protovis' Arc Diagrams and Matrix Diagrams look the most interesting
to me right now, but I'm also interested in making the network graph
visualization interactive so one can explore dynamically the areas
they are interested in.

http://vis.stanford.edu/protovis/ex/arc.html
http://vis.stanford.edu/protovis/ex/matrix.html

Right now there is limited support for the creation of a graphviz .dot
file from search results that can be fed to neato.
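
Conceptually the .dot generation is nothing fancy -- something along these
lines, a sketch with made-up relationship tuples (in the prototype they come
out of the search results):

# dump correspondedWith/associatedWith relationships to a graphviz .dot
# file that neato can lay out
relations = [
    ("Ben Franklin", "correspondedWith", "John Adams"),
    ("Ben Franklin", "associatedWith", "American Philosophical Society"),
]

def quote(name):
    return '"%s"' % name.replace('"', r'\"')

with open("snac.dot", "w") as out:
    out.write("graph snac {\n")
    for source, label, target in relations:
        out.write("  %s -- %s [label=%s];\n"
                  % (quote(source), quote(target), quote(label)))
    out.write("}\n")

# then: neato -Tsvg snac.dot -o snac.svg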

Ed Summers was asking me during a break today about support for
embedded linked data in the HTML view of the EAC record.  I have to
admit I'm a bit of a linked data skeptic, but I'd be interested to
explore how we could better support interoperability with linked data
initiatives in the prototype.

If anyone has EAC records they are playing with and would like to try
this out I'd love to hear any feedback you might have on the code and
learn of any issues there might be with your EAC records (I've tried
to base it off the tag library as much as possible, but I've only
tested it with the EAC Daniel has been creating from EAD).

-- Brian