Re: [CODE4LIB] Software used in Panama Papers Analysis

2016-04-12 Thread Chris Fitzpatrick
The Knight News Challenge grant also funded DocumentCloud (an open source
newsroom platform for uploading, annotating, and sharing documents), which
actually created Backbone.js and Underscore.js. They have a pretty active
community.




On Tue, Apr 12, 2016 at 6:21 PM, Chad Nelson  wrote:

> Tom,
>
> The Knight-funded OpenNews project is not exactly a community but certainly
> is working along those lines. Their upcoming SRCCON conference seems focused
> on the same kinds of things code4lib is, but for journalism.
>
> Another upcoming conference, 'csv,conf', is bringing together journalists and
> open source tech folks, as well as civic hackers, and even a few cultural
> heritage folks, to talk about the technology they are using to work with
> open data.
>
> Chad
>
> On Tue, Apr 12, 2016 at 11:07 AM Tom Cramer  wrote:
>
> > The IJNet article is particularly interesting—thanks for posting this.
> > Excerpts like the one below make me wonder if there is a “Code4News”
> > community, and if so, how do we find and connect with them. It seems we
> > have a lot in common, and maybe a lot to offer each other.
> >
> >
> > MC: What we’ve achieved is pretty remarkable. Newsrooms are in an economic
> > crisis. No newsroom right now--except for maybe The New York Times and a
> > few others--have the capability to do something major like this at a global
> > scale. But we’re showing it’s possible. We share data, we produce tools for
> > communication, we share our stories and our interactives, to make it happen.
> >
> > - Tom
> >
> >
> >
> >
> >
> >
> > On Apr 7, 2016, at 7:24 AM, Gregory Markus wrote:
> >
> > Hey Sebastian,
> >
> > They go into a lot of detail in this article
> >
> > https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
> >
> > Indeed this is pretty interesting stuff and a good shout out for Blacklight
> > and other OS tools!
> >
> > -greg
> >
> > On Thu, Apr 7, 2016 at 4:21 PM, Sebastian Karcher <
> > karc...@u.northwestern.edu> wrote:
> >
> > Hi everyone,
> >
> > from one of the New York Times stories on the Panama Papers:
> > "The ICIJ made a number of powerful research tools available to the
> > consortium that the group had developed for previous leak investigations.
> > Those included a secure, Facebook-type forum where reporters could post the
> > fruits of their research, as well as database search program called
> > “Blacklight” that allowed the teams to hunt for specific names, countries
> > or sources."
> >
> > http://www.nytimes.com/2016/04/06/business/media/how-a-cryptic-message-interested-in-data-led-to-the-panama-papers.html
> >
> > I assume this is http://projectblacklight.org/, which is pretty cool to see
> > used that way. Does anyone know or have read anything about the other tools
> > they used? What did they use for OCR? Did they use qualitative data
> > analysis software? Some type of annotation tools? It seems like there's a
> > lot to learn from this effort.
> >
> > Thanks,
> >
> > --
> > Sebastian Karcher, PhD
> > Qualitative Data Repository, Syracuse University
> > qdr.syr.edu
> >
> >
> >
> >
> > --
> >
> > *Gregory Markus*
> >
> > Project Assistant
> >
> > *Netherlands Institute for Sound and Vision*
> > *Media Parkboulevard 1, 1217 WE  Hilversum | Postbus 1060, 1200 BB
> > Hilversum | *
> > *beeldengeluid.nl* 
> > *T* 0612350556
> >
> > *Aanwezig:* - ma, di, wo, do, vr
> >
> >
>


Re: [CODE4LIB] Islandora & Vagrant - Development use only?

2016-04-06 Thread Chris Fitzpatrick
Vagrant is not a good idea for production. It's really for people to work
against a copy of the production environment.
For example, you can work in Vagrant, update an Ansible, Puppet, or Chef
script, then deploy that to your VM.
HashiCorp is making something called Otto which is supposed to replace
Vagrant for end-to-end deployments like this, but that's in alpha now.

Vagrant isn't like virtualenv at all. Virtualenv is a way to maintain
Python dependencies by mucking around with some environment variables. It's
more like Ruby's Bundler.

It's kinda more like Docker. Docker makes Linux containers. Nobody knows
what those are, but they work great.

I've seen Vagrant used in production and it supposedly worked well, but the
guy who set it up left and things went bad. It wasn't a performance issue;
it was just really hard for the replacement to figure out what was going on.
Use Vagrant with Ansible/Puppet/Chef. Or use Docker. Or use all of that,
for the win.



On Wed, Apr 6, 2016 at 3:55 PM, Francis Kayiwa  wrote:

> On 4/6/16 9:49 AM, Annamarie C Klose wrote:
>
>> Hi, all,
>>
>> Can anyone provide a technical explanation as to why it is not
>> appropriate to install Islandora on a public server with Vagrant? Despite
>> all the documentation instructing that Vagrant is for development only, my
>> university's IT department thinks Vagrant makes Islandora more secure for
>> production use. They have also stated "Vagrant is used to keep dependencies
>> separate on machines in the same way Pythons Virtualenv or Ruby's Docker
>> is." Unfortunately, secure networking is outside of my expertise. I'm
>> concerned that Vagrant's virtualization is a poor substitute for the real
>> thing. Before I add hundreds of records to Islandora, I'd like to make sure
>> that I'm building my library's digital collections on a steady foundation.
>> Any advice and/or explanations to give IT is welcome.
>>
>
>
> If we agree that your University IT are the Operations people, find the
> nicest way to tell them how the developers of Vagrant view the tool below
>
> https://www.vagrantup.com/docs/why-vagrant/
>
> Specifically. "...If you are an operations engineer, Vagrant gives you a
> disposable environment and consistent workflow for developing and testing
> infrastructure management scripts..."
>
> You are also correct in being wary about having a production application
> running on Vagrant. A part of me wants to test that just for laughs, but it
> will be painful to set up for them and the performance will be horrible for
> you.
>
> Cheers,
> ./fxk
>
> --
> "Anyone attempting to generate random numbers by deterministic means is,
> of course, living in a state of sin."
> -- John Von Neumann
>


Re: [CODE4LIB] Archives Spaces - Internal Server Error 500

2015-07-23 Thread Chris Fitzpatrick
Hi,

Yes, what version are you running?
b,chris.

On Tue, Jul 21, 2015 at 9:18 AM, KNOWLES Claire claire.know...@ed.ac.uk
wrote:

 Hi Michael,

 Have you posted to the ArchivesSpace Google Group
 https://groups.google.com/forum/#!forum/archivesspace? I've found it
 really helpful when I've had issues with ArchivesSpace. What version of
 ArchivesSpace are you running?

 Claire

 --
 Claire Knowles
 Library and University Collections
 University of Edinburgh



 On 16/07/2015 18:44, Code for Libraries on behalf of Bobak, Michael -
 HPL <CODE4LIB@LISTSERV.ND.EDU on behalf of
 michael.bo...@houstontx.gov> wrote:

 Good afternoon,

 I am trying to troubleshoot an issue we are having with our Archives Space
 server. Currently, whenever someone tries to export a resource by
 Downloading the EAD as a PDF we get an Internal Server 500 response back,
 with a not so descriptive error text of (Error) Timeout::Error. Here is a
 link to a screenshot of the error message. http://i.imgur.com/hLBJXRY.png

 I am led to believe that this has to do with a timeout issue with the
 server and some script that is generating the PDF. I am just not certain
 where the script or configuration file reside within the AS structure in
 order to increase the timeout value. Hopefully someone has some idea where
 I should look. If I am way off with my assessment perhaps someone else
 could point me in the direction as to what is causing this error.

 I appreciate any responses anyone may have!

 Regards,
 Michael Bobak






Re: [CODE4LIB] Scanned PDF to text

2014-12-11 Thread Chris Fitzpatrick
Tesseract is going to be slow, and there might not be much you can do about
that.

You can do a couple of things, like set up processes that run on AWS EC2
spot instances, so you can put a standing bid order on AWS instances and
only run your OCR when the price drops.

Or you can buy ABBYY, which is much faster.
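
Wherever the OCR ends up running, one cheap way to claw back some wall-clock time is to run several Tesseract processes side by side, one per page image. What follows is only a rough sketch under a few assumptions: the PDF pages have already been rasterized to TIFFs, the `tesseract` binary is on the PATH, and the `pages` and `ocr_text` directory names are made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def tesseract_command(image: Path, out_dir: Path) -> list[str]:
    """Build the command line; Tesseract appends ".txt" to the output base."""
    return ["tesseract", str(image), str(out_dir / image.stem)]


def ocr_page(image: Path, out_dir: Path) -> Path:
    """OCR one page image and return the path of the text file it wrote."""
    subprocess.run(tesseract_command(image, out_dir), check=True)
    return out_dir / (image.stem + ".txt")


def ocr_directory(in_dir: Path, out_dir: Path, workers: int = 4) -> list[Path]:
    """OCR every TIFF in in_dir, running several Tesseract processes at once."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = sorted(in_dir.glob("*.tif"))
    # Threads are enough here: each one just waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda page: ocr_page(page, out_dir), pages))


if __name__ == "__main__":
    # Hypothetical directory names; adjust to your own layout.
    for text_file in ocr_directory(Path("pages"), Path("ocr_text")):
        print(text_file)
```

The worker count is a guess; something close to the number of cores is a reasonable starting point, and the same script would run unchanged on a rented instance.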

b,chris.


On Tue, Dec 9, 2014 at 5:45 PM, Kyle Banerjee kyle.baner...@gmail.com
wrote:

  I’m not quite sure if I understand the question, but if all you want to
 do is pull the text out of an OCR’ed PDF file, then I have found both Tika
 and PDFtotext to be useful tools
 
  On the other hand, if you need to do the OCR itself, then employing
 Tesseract is probably the way to go.

 For clarity, I have to do the OCR itself. I've been using CAM::PDF to
 extract existing text.

 Kyle



[CODE4LIB] ArchivesSpace House @ code4lib 2015

2014-12-11 Thread Chris Fitzpatrick
Hi everyone,

The ArchivesSpace project is considering hosting an open house during
code4lib this year in Portland, with the idea of having some meetups and
get-togethers planned.

We are looking at renting a house close to the conference and it's possible
that there's an extra bed for accommodation.

Maybe you're a graduate student focusing on digital archives thinking of
attending C4L? Or you're self-funded and trying to make the numbers work?

If you're interested, contact me directly. No promises yet, but we can see
what is possible.

Thanks!


Re: [CODE4LIB] Forwarding blog post: Apple, Android and NFC – how should libraries prepare? (RFID stuffs)

2014-10-08 Thread Chris Fitzpatrick
Oh I definitely agree. Some of my best friends are narcissists, so I get
it.

On Wed, Oct 8, 2014 at 1:55 PM, Riley Childs rchi...@cucawarriors.com
wrote:

 I like c4l because there are limited standards... Just sayin'

 Riley Childs
 Senior
 Charlotte United Christian Academy
 Library Services Administrator
 IT Services
 (704) 497-2086
 rileychilds.net
 @rowdychildren
 
 From: Chris Fitzpatrick <chrisfitz...@gmail.com>
 Sent: 10/8/2014 7:53 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Forwarding blog post: Apple, Android and NFC – how
 should libraries prepare? (RFID stuffs)

 So this thread started from talking about RFID ( i'm interested! ) to
 talking about augmented reality ( uh, ok, now less interested...) to
 talking about standards ( oh no, not again.. ) to talking about c4l (
 yep. )

 So, are people using RFID? A lot? Is it working, or did it make life
 hellish?

 b,chris.



 On Wed, Oct 8, 2014 at 10:54 AM, Ross Singer rossfsin...@gmail.com
 wrote:

  I guess there’s “what do you mean by ‘C4L'” and “what do you mean by
  ‘standards’” that need to be clarified here.
 
  Cary is right, this list/community/whatever is definitely well represented
  by people who sit on formal standards committees or are involved in the
  organizations that create them, etc.

  But I think more important is the “what do you mean by ‘standards’”
  question: C4L has definitely spawned several specifications (COinS, UnAPI,
  etc.) and (in my mind) has been under-utilized in this arena for a few
  years.  You’ve got a gathering of smart, like-minded people: if you want to
  create a spec, solicit your idea, start a mailing list, follow the ROGUE
  ’05 rules [1], and let a thousand specifications bloom.
 
  We’re generally in need of a spec, not a standard, I’ve found (although
  they’re definitely not mutually exclusive!).
 
  -Ross.
  1. http://wiki.code4lib.org/Rogue
 
  On Oct 7, 2014, at 7:17 PM, Salazar, Christina 
  christina.sala...@csuci.edu wrote:
 
   OH NO! (shudder) I’m pretty sure no one is suggesting a formalized
 c4l
  AGAIN - we've been there done that, relatively recently too.
  
   I think what we're talking about is a way to represent c4l interests in
  standards making bodies.
  
   And just for my own edification, if you're saying c4l IS represented in
  standards making bodies, please tell me who do I talk to? For instance on
  the RFID thing, who can I talk to in order to find out HOW and IF this
  conversation is happening with American standards making bodies?
  
   Or do you mean INDIVIDUALS who participate in c4l are represented in
  standards making bodies?
  
   Christina
  
   -Original Message-
   From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
 Of
  Francis Kayiwa
   Sent: Tuesday, October 07, 2014 11:07 AM
   To: CODE4LIB@LISTSERV.ND.EDU
   Subject: Re: [CODE4LIB] Forwarding blog post: Apple, Android and NFC –
  how should libraries prepare? (RFID stuffs)
  
   On 10/07/2014 02:03 PM, Cary Gordon wrote:
  
  
  
NISO (and LITA, ASIST,
   etc.) are quite well represented on this list, and I don't believe
   that a formalized c4l would give us any more say in standards that we
  have already.
  
   +1
  
  
   ./fxk
  
  
   --
   You single-handedly fought your way into this hopeless mess.
 



Re: [CODE4LIB] jobs digest for 2014-05-16

2014-05-23 Thread Chris Fitzpatrick
more cowbell


On Thu, May 22, 2014 at 9:05 PM, Wilhelmina Randtke rand...@gmail.com wrote:

 I prefer full ads also.

 -Wilhelmina Randtke


 On Wed, May 21, 2014 at 7:53 AM, Dunn, Katie dun...@rpi.edu wrote:

  On Fri, May 16, 2014 at 10:06 PM, Joe Hourcle wrote:
   It looks to me like it's a change in the messages that
   'jobs.code4lib.org' generates and sends to the list ...

  I much preferred receiving the full ads in separate messages, because they
  were easy to archive and search in my email without having to copy/paste
  from the website, but I can just subscribe to the Atom feed instead.
 
  Katie
 



[CODE4LIB] ArchivesSpace v1.0.9 released

2014-05-13 Thread Chris Fitzpatrick
:
   Add REST endpoint to suspend indexer
   - BUG #59985522 http://www.pivotaltracker.com/story/show/59985522:
   Windows: application cannot resolve gem paths if pathname has any spaces in
   it
   - FEATURE #64190840 http://www.pivotaltracker.com/story/show/64190840:
   As an archivist, I don't want an agent to have two identical alternative
   name records.
   - BUG #68437254 http://www.pivotaltracker.com/story/show/68437254:
   Cannot save resource component that has attached digital objects.
   - BUG #64855666 http://www.pivotaltracker.com/story/show/64855666:
   Importing a set of MARCXML records results in the date statement for each
   MARC record being recorded in all of the MARC records imported from the set.
   - BUG #66596734 http://www.pivotaltracker.com/story/show/66596734: As
   a repository manager, I do not want suppressed records (accession,
   resource, resource component, digital object, and digital object
   components) to be cast to the public UI.
   - FEATURE #65477970 http://www.pivotaltracker.com/story/show/65477970:
   As a repository manager, I would like to change to Indirect  the term
   Inverted now present in the Name Order enum list in the Name Form template
   - FEATURE #65475942 http://www.pivotaltracker.com/story/show/65475942:
   As a repository manager, I want personal names to import in Indirect order
   when occurring as such in the source record.
   - FEATURE #64336538 http://www.pivotaltracker.com/story/show/64336538:
   As an archivist, I want to be able to view digital objects in the staff UI
   the same way they render in the public UI.
   - FEATURE #62787364 http://www.pivotaltracker.com/story/show/62787364:
   Public UI should render index items in a way that allows users to navigate
   to the designated note
   - BUG #68450066 http://www.pivotaltracker.com/story/show/68450066:
   Event records are unlinked from context record when context record is
   updated.
   - FEATURE #68863136 http://www.pivotaltracker.com/story/show/68863136:
   Add Shutdown Hook that can be invoked in JVM
   - BUG #65282410 http://www.pivotaltracker.com/story/show/65282410: Am
   not able to suppress an accession record that contains either a linked
   agent or a linked location.
   - FEATURE #65515468 http://www.pivotaltracker.com/story/show/65515468:
   As a repository manager, I want to remove the Dates field from the Name
   Form template for Persons, Families, and Corporations.
   - FEATURE #68989280 http://www.pivotaltracker.com/story/show/68989280:
   Only display dates in agent records if data is present. If there is not
   data already existing, don't display the field.
   - BUG #68543726 http://www.pivotaltracker.com/story/show/68543726:
   Column control-group values are not correct (for rows 2 and higher) after a
   user has used the Reorder Columns feature
   - BUG #69259490 http://www.pivotaltracker.com/story/show/69259490:
   ARCHIVESSPACE_VERSION not found in Tomcat deployments
   - FEATURE #65515152 http://www.pivotaltracker.com/story/show/65515152:
   As a repository manager, I want to restrict date type in the Exist and
   Usage date templates to 1) Single date and 2) Date Range.
   - FEATURE #68863168 http://www.pivotaltracker.com/story/show/68863168:
   Improved LCNAF query
   - BUG #70784798 http://www.pivotaltracker.com/story/show/70784798: Fix
   performance issues with migration
   - CHORE #66974808 http://www.pivotaltracker.com/story/show/66974808:
   For digital object file display, remove the link between xlink attributes
   and display in public UI.
   - CHORE #63339564 http://www.pivotaltracker.com/story/show/63339564:
   Update JRuby
   - CHORE #66533444 http://www.pivotaltracker.com/story/show/66533444:
   Look at using saxerator for EAD converter

ArchivesSpace 1.0.9 is open source software; the application and its
source code are available on GitHub. Build instructions and technical
documentation http://archivesspace.github.com/archivesspace/doc/ are also
provided for the more technically inclined.

ArchivesSpace is also a member-supported community. Lists of current
members are posted at http://www.archivesspace.org/members. If you are
interested in becoming a member, please send a request to
archivesspaceh...@lyrasis.org.


  Chris Fitzpatrick | Developer, ArchivesSpace
Skype: chrisfitzpat  | Phone: 918.236.6048
http://archivesspace.org/


[CODE4LIB] ArchivesSpace hackfest/meetup TODAY!

2014-03-27 Thread Chris Fitzpatrick
Hi everyone,

In Raleigh for the afternoon? If so, please feel free to stop by the
ArchivesSpace hackfest/meetup that we're hosting today at 1pm. We will be
in Magnolia 1, which is on the first floor of the Sheraton.

We will be going over some install and deployment for ArchivesSpace, as
well as going over migration and the plugin architecture.

All are welcome.

See you this afternoon!

best, chris fitzpatrick

ArchivesSpace Developer
http://archivesspace.org


[CODE4LIB] ArchivesSpace Post-Code4Lib Hackfest + Meetup

2014-03-11 Thread Chris Fitzpatrick
Hi All,

ArchivesSpace will be hosting a Post-Code4Lib hackfest and meetup on *Thursday
March 27 between 1-5pm at the Sheraton Raleigh*. This will be a great way
to cap off the Code4Lib week. If you're interested in archives and/or
awesome code, we'd love to meet you.

Everyone is invited, no matter what your archives or technical experience
is. We will be giving a brief overview of the application, showing off
its key features and functions, as well as covering
installation/deployment options.
We will also have a session on migrating to ArchivesSpace from Archon and
Archivists' Toolkit, which will be led by Nathan Stevens from NYU.
In addition, we will go over how to extend and customize ArchivesSpace, and
I will show how to write a plugin.

Refreshments will be provided.

Please let me know if you have any questions or if there's anything
specific you'd like to see in the session.

Hope to see you in Raleigh!

best regards, Chris Fitzpatrick



Chris Fitzpatrick | chris.fitzpatr...@lyrasis.org
Developer, ArchivesSpace
http://archivesspace.org/



About ArchivesSpace
Built for archives by archivists, ArchivesSpace is the open source archives
information management application for managing and providing web access to
archives, manuscripts and digital objects.


[CODE4LIB] ArchivesSpace v1.0.7 Released

2014-03-06 Thread Chris Fitzpatrick
://www.pivotaltracker.com/s/projects/386247/stories/39402005): As a
Repository Manager or an Administrator, I want to suppress a (Digital
Object | Digital Object Component | Resource | Resource Component)
* BUG [#65864630](
https://www.pivotaltracker.com/s/projects/386247/stories/65864630): Link
statements using HTTPS are not active in the Digital Object view in Public
UI.
* FEATURE [#65826472](
https://www.pivotaltracker.com/s/projects/386247/stories/65826472): There
should be a delete option on the location record staff view and edit modes.

* BUG [#66535718](
https://www.pivotaltracker.com/s/projects/386247/stories/66535718): merge
button in edit controlled value lists not working in v 1.0.7 rc1
* BUG [#66589136](
https://www.pivotaltracker.com/s/projects/386247/stories/66589136): Cannot
search or browse for Agents to be merged
* BUG [#66125918](
https://www.pivotaltracker.com/s/projects/386247/stories/66125918): CSV
importer errors out on UTF-8 file with a BOM
* BUG [#54842798](
https://www.pivotaltracker.com/s/projects/386247/stories/54842798): When
tabbing through tree nodes, there is no indication which node has focus

ArchivesSpace 1.0.7 is open source software; the application and its source
code are available on GitHub. Build instructions and technical
documentation http://archivesspace.github.com/archivesspace/doc/ are
also provided for the more technically inclined.

ArchivesSpace is also a member-supported community.  Lists of current
members are posted at http://www.archivesspace.org/members.  If you are
interested in becoming a member, please send a request to
archivesspaceh...@lyrasis.org.
 


Chris Fitzpatrick | chris.fitzpatr...@lyrasis.org
Developer, ArchivesSpace
http://archivesspace.org/


Re: [CODE4LIB] Book scanner suggestions redux

2014-03-05 Thread Chris Fitzpatrick
I've used one of the DIY Bookscanners kits. Worked great and I didn't have
to go into the dumpster.  They did a good job on the components and
assembly was rather easy.

However, it is all very much a manual process. An operator has to work the
machine to scan all the pages.
In addition, there's a post-processing part using a bit of software called
ScanTailor where you assemble your captured images (cropping, adjusting,
etc.). Then, you have to run all those through Tesseract OCR and an image
compressor to get PDFs, if that's what you want.

The scanner probably cannot do large format stuff like maps and posters,
but since it's just using standard cameras, you can just take the cameras
off and use them to capture images with a standard tripod.

OCR to ePub is not really possible. Or at least, I don't think it's
possible.

Lots of work...like actual work someone has to do. But the quality is very
good, much better than the Xerox MFP scanners we used to scan loose leaf
articles.





On Wed, Mar 5, 2014 at 3:33 PM, Pikas, Christina K. 
christina.pi...@jhuapl.edu wrote:

 I think this is what the class at Wisconsin built:
 http://lis644bookscanner.wordpress.com/

 Christina

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@listserv.nd.edu] On Behalf Of
 Joe Hourcle
 Sent: Tuesday, March 04, 2014 6:44 PM
 To: CODE4LIB@listserv.nd.edu
 Subject: Re: [CODE4LIB] Book scanner suggestions redux

 On Mar 3, 2014, at 10:54 AM, Aaron Rubinstein wrote:

  Hi all,
 
  We're looking to purchase a book scanner and I was hoping to get some
 recommendations from those who've had experience.

 I don't have experience, but a couple of years back, a group started
 selling kits to make book scanners:

 http://diybookscanner.myshopify.com/products/diy-book-scanner-kit


 It's $500+shipping, and missing some parts (glass, cameras, paint), but it
 means that instead of carpentry skills, you just need experience assembling
 things.

 -Joe



[CODE4LIB] ArchivesSpace Meetup Hackfest @ C4L14

2014-02-11 Thread Chris Fitzpatrick
Hi everybody,

Thanks to everyone who helped out with our poll for our ArchivesSpace
meetup. We had a really great response!

ArchivesSpace is going to be hosting a meetup and hackfest on Thursday
March 27. This will be held between 1-4PM, following the closing of
Code4Lib 2014. The location is TBA.

If you're interested in software and/or archives, we'd love to meet you.
We're going to be going over the overall architecture, introducing the UI,
showing how to get yourself setup with AS, as well as doing some coding
projects that show what is possible.

If you have any questions, please feel free to contact me directly.

Thanks and look forward to seeing you all soon!

best regards,

Chris Fitzpatrick
Developer
ArchivesSpace
http://archivesspace.org


Re: [CODE4LIB] Public transport from RDU to Sheraton Raleigh and how safe is it?

2014-01-14 Thread Chris Fitzpatrick
Also, the last few times I was in LA I took the Metro to/from the airport
and it was great.
I think the Green Line goes to LAX and the Red Line goes to North Hollywood
and Burbank.

But you would run the danger of running into Ed Begley Jr., so there's
that.



On Tue, Jan 14, 2014 at 12:48 AM, Andreas Orphanides akorp...@ncsu.edu wrote:

 There's a pretty reliable bus that will take you straight from the airport
 to the center of downtown. Clean and safe, if a little infrequent. And $2.


 http://www.triangletransit.org/sites/default/files/maps-and-schedules/RoutesAndSchedules-100.pdf


 On Mon, Jan 13, 2014 at 6:31 PM, Salazar, Christina 
 christina.sala...@csuci.edu wrote:

  (Am I the only one who hears James Brown's Night Train in my head when I
  type Raleigh, North Carolina?)
 
  I'm just wondering if there's any public transportation from RDU to the
  conference hotel and if so, how safe is it? I have opted out of public
  transport at some places that I later found out were very safe (e.g.,
  Boston) because I'm from Los Angeles and we don't do public transportation,
  so I just thought I'd ask now and plan in advance.
 
  Christina Salazar
  Systems Librarian
  John Spoor Broome Library
  California State University, Channel Islands
  805/437-3198
 
 



[CODE4LIB] ArchivesSpace Meetup + Hackfest at c4l14

2014-01-09 Thread Chris Fitzpatrick
Hello Everyone,

Are you going to Code4Lib 2014? We want to meet you!

ArchivesSpace would like to hold an informal hackfest and meetup to
introduce ourselves, show off our code, and get you involved with the next
generation of archives management software.

We are thinking of setting up time(s) either Wednesday evening, Thursday
afternoon, Friday morning following c4l, or some combination of available
times.
If you are interested, please take a moment to indicate your preferences
here = http://doodle.com/px8h778xz4f6rrau

If you have any questions, suggestions, or just general comments, please
feel free to contact me directly.

Thanks again and look forward to seeing you all soon.


best,

Chris Fitzpatrick
Developer
ArchivesSpace


Re: [CODE4LIB] mass convert jpeg to pdf

2013-11-08 Thread Chris Fitzpatrick
Do you need OCR?
This script =
http://bookscanner.pbworks.com/w/page/45609343/Homer%20bash%20script
will OCR a directory of TIFFs (using Tesseract) and build a PDF (using
PDFBeads).

It's a little old, but I still use it pretty much every day. I think you'll
need to have Ruby 1.9 installed, since the PDFBeads library uses Hpricot.

There are lots of document viewers/book widgets/page turners... the Internet
Archive one is good. I also really like the NYTimes Document Viewer (
https://github.com/documentcloud/document-viewer ). The DocumentCloud
people also have something to rip your PDFs apart and put them into the
viewer ( https://github.com/documentcloud/docsplit ).







On Fri, Nov 8, 2013 at 8:23 PM, Karen Coyle li...@kcoyle.net wrote:

 +1 for the viewer concept, and I'll add that viewing and downloading meet
 different needs and should both be offered if possible. (said because of
 recently having had to download huge PDFs just to glance at a few pages).

 kc


 On 11/8/13 11:10 AM, Edward Summers wrote:

 It is sad to me that converting to PDF for viewing off the Web seems like
 the answer. Isn’t there a tiling viewer (like Leaflet) that could be used
 to render jpeg derivatives of the original tif files in Omeka?

 For an example of using Leaflet (usually used for working with maps) in
 this way checkout NYTimes Machine Beta:

  http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html

 //Ed

 On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com
 wrote:

 We are in the process of migrating our digital collections from CONTENTdm
 to Omeka and are trying to figure out what to do about the compound objects
 -- the vast majority of which are digitized books.

 The source files are actually hi res tiffs but since ginormous objects
 broken into hundreds of pieces (each of which can be well over 100MB in
 size) aren't exactly friendly to use, we'd like to stitch them into
 individual pdf's that can be viewed more conveniently.

 My game plan is to simply have a script pull the files down as jpegs, which
 can be fed to imagemagick which can theoretically do everything I need.
 However, I've never actually done anything like this before, so I wanted to
 see if there's a method that people have used for combining lots of images
 into pdfs that works particularly well. Thanks,

 kyle
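
 For the plain "stitch the jpegs into one PDF" plan described above, a minimal sketch might look like the following. It assumes ImageMagick is installed and exposes the classic `convert` command (newer releases may call it `magick`), and that page order can be recovered from numbers in the file names; the `book/` directory and file names are invented for illustration. The one non-obvious bit is the sort: a plain alphabetical sort puts page10 before page2.

```python
import re
import subprocess
from pathlib import Path


def natural_key(path: Path) -> list:
    """Sort key that orders page2.jpg before page10.jpg."""
    # re.split with a capture group alternates text, digits, text, ...
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def convert_command(pages: list[Path], pdf: Path) -> list[str]:
    """Build an ImageMagick command that writes one PDF page per input image."""
    ordered = sorted(pages, key=natural_key)
    return ["convert", *[str(page) for page in ordered], str(pdf)]


def build_pdf(book_dir: Path, pdf: Path) -> None:
    """Combine every JPEG in book_dir into a single multi-page PDF."""
    pages = list(book_dir.glob("*.jpg"))
    if not pages:
        raise ValueError(f"no JPEGs found in {book_dir}")
    subprocess.run(convert_command(pages, pdf), check=True)


if __name__ == "__main__":
    # Hypothetical paths; one directory of page images per book.
    build_pdf(Path("book"), Path("book.pdf"))
```

 With hundreds of large page images in one call, memory use may become a problem, so it is probably worth trying this on a single book before looping over a whole collection.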


 --
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 m: 1-510-435-8234
 skype: kcoylenet



Re: [CODE4LIB] display book covers

2013-11-07 Thread Chris Fitzpatrick
Hi,


I think you can do this all with JS or CoffeeScript.

Here's a fiddle:

http://jsfiddle.net/chrisfitzpat/t69Xs/
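
The same lookup can also be done server-side, which makes it easy to cache the result or store the URL in Solr, as the original question suggests. Below is a hedged Python sketch against the public Google Books volumes endpoint; it assumes the response carries cover links under `items[].volumeInfo.imageLinks`, which is how the API has behaved for me, and the helper names are my own. It has nothing to do with what the fiddle does internally.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_URL = "https://www.googleapis.com/books/v1/volumes"


def volumes_url(isbn: str) -> str:
    """Build a volumes search URL restricted to a single ISBN."""
    query = urllib.parse.urlencode({"q": "isbn:" + isbn})
    return API_URL + "?" + query


def thumbnail_from_response(payload: dict) -> Optional[str]:
    """Return the first cover image link in a volumes response, if any."""
    for item in payload.get("items", []):
        links = item.get("volumeInfo", {}).get("imageLinks", {})
        url = links.get("thumbnail") or links.get("smallThumbnail")
        if url:
            return url
    return None


def cover_url(isbn: str) -> Optional[str]:
    """Fetch the volumes record for an ISBN and pull out a cover link."""
    with urllib.request.urlopen(volumes_url(isbn), timeout=10) as response:
        return thumbnail_from_response(json.load(response))
```

Returning None when there is no cover lets the caller fall back to a placeholder image or another source, and only the returned URL needs to be cached.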




On Tue, Nov 5, 2013 at 10:12 PM, Daryl Grenz grenzda...@hotmail.com wrote:

 Powell's Books provides an API (http://api.powells.com/stable) and direct
 links to their book covers by ISBN13 only.
 Regarding the limit on daily use of the Google Books API, I think from
 when I used it before that if you access cover links through the Dynamic
 Links API (https://developers.google.com/books/docs/dynamic-links) there
 is no daily limit.
 - Daryl

  Date: Tue, 5 Nov 2013 15:13:35 +
  From: aw...@rockhall.org
  Subject: [CODE4LIB] display book covers
  To: CODE4LIB@LISTSERV.ND.EDU
 
  Hi all,
 
  Anyone have some good resources about tools for gathering book cover
 images?  I'm building that into our next catalog update, which uses
 Blacklight, but I'm not necessarily looking for Rails-only approaches.  My
 questions are more general:
 
  What sources are out there?  (ex. Google Books, amazon)
 
  Making it work?
  I'm trying out Google Books at the moment, just making a call to their
 API.  This can be done asynchronously and loaded after the rest of the page, or
 cached, perhaps even store the url in solr or a database table?
 
  Tools?
  I am trying out a Google Books gem[1], which is just a wrapper for the
 api.
 
  Other thoughts?
 
  Thanks in advance,
 
  …adam
 
  __
  Adam Wead
  Systems and Digital Collections Librarian
  Library + Archives
  Rock and Roll Hall of Fame and Museum
  216.515.1960
  aw...@rockhall.org
 
  [1] https://github.com/zeantsoi/GoogleBooks




Re: [CODE4LIB] solr computation field norm problem

2013-09-25 Thread Chris Fitzpatrick
Yeah...I think you're running into this:

http://lucene.472066.n3.nabble.com/field-length-normalization-tp495308p495311.html

TL;DR:
Jay Hill says fields with 3 terms and 4 terms both score at .5 in the
lengthNorm.
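
To see why that happens, here is a small Python sketch of the arithmetic. It assumes the default similarity in Lucene/Solr releases of that era: lengthNorm is 1/sqrt(number of terms), with no index-time boost, and the result is squeezed into a single byte with a 3-bit mantissa. The two conversion functions are my own port of Lucene's `SmallFloat.floatToByte315` and `byte315ToFloat` as I understand them, not code from the linked thread.

```python
import math
import struct


def float_to_byte315(value: float) -> int:
    """Encode a float into one byte (3 mantissa bits, zero exponent at 15)."""
    if value <= 0:
        return 0
    # Raw IEEE-754 bits of the value rounded to single precision.
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    small = bits >> (24 - 3)
    floor = (63 - 15) << 3
    if small <= floor:
        return 1    # underflow: smallest positive value
    if small >= floor + 0x100:
        return 255  # overflow: largest representable value
    return small - floor


def byte315_to_float(encoded: int) -> float:
    """Decode a byte written by float_to_byte315 back into a float."""
    if encoded == 0:
        return 0.0
    bits = ((encoded & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def stored_length_norm(num_terms: int) -> float:
    """Length norm as it comes back after the lossy one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


if __name__ == "__main__":
    for terms in range(1, 9):
        exact = 1.0 / math.sqrt(terms)
        print(terms, round(exact, 4), stored_length_norm(terms))
```

Under those assumptions the exact norms for 3 and 4 terms (about 0.577 and 0.5) both come back as 0.5, which would match the fieldNorm reported for the two titles. If the synonym filter injects extra tokens, the term counts, and therefore the norms, could differ from this simple picture.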







On Wed, Sep 25, 2013 at 4:21 PM, Nicolas Franck nicolas.fra...@ugent.be wrote:

 Hi there,

 I have a question about the way Lucene computes the length norm (field
 norm) for its documents.
 My documents are indexed using Solr.
 These are the documents that were indexed (ignore 'score', that is not
 part of the document itself)

 <doc>
   <float name="score">1.00711</float>
   <str name="_id">ejn01:25675596</str>
   <str name="title">Journal of neurology research</str>
 </doc>
 <doc>
   <float name="score">1.00711</float>
   <str name="_id">ejn01:954925518616</str>
   <str name="title">Journal of neurology</str>
 </doc>


 The field title has the following definition in schema.xml:

 <fieldType name="utf8text" class="solr.TextField"
            positionIncrementGap="100" omitNorms="false">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"
                maxTokenLength="1024"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ASCIIFoldingFilterFactory"/>
     <filter class="solr.SynonymFilterFactory"
             synonyms="index_synonyms.txt" format="solr" ignoreCase="false"
             expand="true" tokenizerFactory="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"
                maxTokenLength="1024"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ASCIIFoldingFilterFactory"/>
     <filter class="solr.SynonymFilterFactory"
             synonyms="index_synonyms.txt" format="solr" ignoreCase="false"
             expand="true" tokenizerFactory="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
 </fieldType>


 If I use the query journal of neurology, both documents have the same
 score, although the second document is more exact. Supplying a phrase query
 does not fix the issue. I also see that the computed fieldNorm is 0.5 for
 both documents. Does this have something to do with the loss of precision
 when storing the length norm into one byte?

 These are all the supplied parameters (defaults in solrconfig.xml):

 <str name="lowercaseOperators">false</str>
 <str name="mm">-10%</str>
 <str name="pf">author^3 title^2</str>
 <str name="sort">score desc</str>
 <arr name="bq">
   <str>source:ser01^10</str>
   <str>source:ejn01^10</str>
   <str>(*:* -type:article)^999</str>
 </arr>
 <str name="echoParams">all</str>
 <str name="df">all</str>
 <str name="tie">0</str>
 <str name="qf">
 author^15 title^10 subject^1 summary^1 library^1 location^1 publisher^1
 place_published^1 issn^1 isbn^1
 </str>
 <str name="q.alt">*:*</str>
 <str name="ps">2</str>
 <str name="defType">edismax</str>
 <str name="q">journal of neurology</str>
 <str name="echoParams">all</str>
 <str name="sort">score desc</str>

 Looking at the computation of the score, I see no difference at all between
 them (see below).
 Any idea why the fieldNorm is the same for both documents?


 Thanks in advance!

 Greetings,

 Nicolas




 <str name="ejn01:25675596">
 1.0071099 = (MATCH) sum of:
   0.0053001107 = (MATCH) sum of:
 0.0017667036 = (MATCH) max of:
   0.0017667036 = (MATCH) weight(title:journal^10.0 in 0), product of:
 0.005943145 = queryWeight(title:journal^10.0), product of:
   10.0 = boost
   0.5945349 = idf(docFreq=2, maxDocs=2)
   9.996294E-4 = queryNorm
 0.29726744 = (MATCH) fieldWeight(title:journal in 0), product of:
   1.0 = tf(termFreq(title:journal)=1)
   0.5945349 = idf(docFreq=2, maxDocs=2)
   0.5 = fieldNorm(field=title, doc=0)
 0.0017667036 = (MATCH) max of:
   0.0017667036 = (MATCH) weight(title:of^10.0 in 0), product of:
 0.005943145 = queryWeight(title:of^10.0), product of:
   10.0 = boost
   0.5945349 = idf(docFreq=2, maxDocs=2)
   9.996294E-4 = queryNorm
 0.29726744 = (MATCH) fieldWeight(title:of in 0), product of:
   1.0 = tf(termFreq(title:of)=1)
   0.5945349 = idf(docFreq=2, maxDocs=2)
   0.5 = fieldNorm(field=title, doc=0)
 0.0017667036 = (MATCH) max of:
   0.0017667036 = (MATCH) weight(title:neurology^10.0 in 0), product of:
 0.005943145 = queryWeight(title:neurology^10.0), product of:
   10.0 = boost
   0.5945349 = idf(docFreq=2, maxDocs=2)
   9.996294E-4 = queryNorm
 0.29726744 = (MATCH) fieldWeight(title:neurology in 0), product of:
   1.0 = tf(termFreq(title:neurology)=1)
   0.5945349 = idf(docFreq=2, maxDocs=2)
   0.5 = fieldNorm(field=title, doc=0)
   0.0031800664 = (MATCH) max of:
 0.0031800664 = (MATCH) weight(title:journal of neurology~2^2.0 in
 0), product of:
   0.0035658872 = queryWeight(title:journal of neurology~2^2.0),
 product of:
 2.0 = boost
 1.7836046 = idf(title: journal=2 of=2 neurology=2)
 9.996294E-4 = queryNorm
   0.8918023 = fieldWeight(title:journal of neurology in 0), product
 of:
 1.0 = tf(phraseFreq=1.0)
 1.7836046 = idf(title: journal=2 

Re: [CODE4LIB] XML split and transform in Java

2013-09-08 Thread Chris Fitzpatrick
Hi,

Would something like this work?

https://github.com/marc4j/marc4j/blob/master/src/org/marc4j/samples/StylesheetChainExample.java
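
For the streaming side of the question quoted below, here is a rough sketch using only the JDK's StAX and JAXP APIs: it reads one record at a time and hands just that record to a Transformer, so the whole file never sits in a DOM. The class name, the "record" element name, and the identity transform are assumptions for illustration; a real run would build the Transformer from a stylesheet. It also does not re-declare namespaces inherited from ancestor elements, which namespaced MARCXML-like input would need.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; names are illustrative only.
final class RecordSplitterSketch {

    // Streams the input and transforms each <record> subtree on its own.
    static List<String> splitAndTransform(Reader input, Transformer transformer)
            throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        List<String> results = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent next = reader.peek();
            if (next.isStartElement()
                    && "record".equals(next.asStartElement().getName().getLocalPart())) {
                String recordXml = copyElement(reader, outFactory);
                StringWriter out = new StringWriter();
                transformer.transform(new StreamSource(new StringReader(recordXml)),
                        new StreamResult(out));
                results.add(out.toString());
            } else {
                reader.nextEvent(); // skip everything outside a record
            }
        }
        return results;
    }

    // Copies events from the current start tag through its matching end tag.
    private static String copyElement(XMLEventReader reader, XMLOutputFactory outFactory)
            throws Exception {
        StringWriter buffer = new StringWriter();
        XMLEventWriter writer = outFactory.createXMLEventWriter(buffer);
        int depth = 0;
        do {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) depth++;
            if (event.isEndElement()) depth--;
            writer.add(event);
        } while (depth > 0);
        writer.flush();
        writer.close();
        return buffer.toString();
    }

    // Convenience wrapper for experiments: identity transform, unchecked errors.
    static List<String> splitWithIdentityTransform(String xml) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            return splitAndTransform(new StringReader(xml), identity);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Buffering one record as a string keeps the sketch short; since individual records are small, that seems a reasonable trade against wiring StAX events directly into the transformer.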



On Sun, Sep 8, 2013 at 6:22 PM, Tod Olson t...@uchicago.edu wrote:

 code4lib,

 I'm looking for some advice on splitting and transforming XML data using
 Java. The context is writing a mixin for SolrMARC to enhance our bib data,
 bringing in table of contents and summary data. The data is in XML,
 isomorphic to MARCXML. I need to split it up, transform it, and store it
 for use at import time. I expect the input XML to be up to a few GB, so
 slurping the whole thing into a DOM seems questionable. I've done one
 implementation for a split-only version of the problem, but the transform
 requirement is causing me to re-think.

 And maybe someone out there has already done this exact thing.

 I think the basic approach is to read a record from start tag to end tag,
 and create a reader/stream/whatever to hand exactly that record to the
 transform API. Lots of options for this: SAX, StAX events, or what have
 you. Any thoughts of what seems the most straightforward for this
 split-and-transform scenario would be welcome.

 On a related note, any thoughts on your favorite light-weight key/value
 pair persistent storage for Java would be welcome. I expect the data to be
 a little large for a serialized HashMap.

 Best,

 -Tod


 Tod Olson t...@uchicago.edu
 Systems Librarian
 University of Chicago Library



Re: [CODE4LIB] Python and Ruby

2013-07-31 Thread Chris Fitzpatrick
UNSUBCRIBE


On Wed, Jul 31, 2013 at 3:30 PM, Joshua Welker wel...@ucmo.edu wrote:

 Ah you got me. Shame on me for not checking the link first. I haven't had
 to dodge Rickrolls since 2010 so I am out of practice.

 Josh Welker
 Information Technology Librarian
 James C. Kirkpatrick Library
 University of Central Missouri
 Warrensburg, MO 64093
 JCKL 2260
 660.543.8022

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Michael J. Giarlo
 Sent: Tuesday, July 30, 2013 3:21 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Python and Ruby

 It's extremely eerie how this thread has played out almost exactly like a
 similar one in 2010: http://bit.ly/4kb77v

 Creatures of habit, we are.


 On Tue, Jul 30, 2013 at 12:55 PM, Levy, Michael ml...@ushmm.org wrote:

  Has anyone tried coding using one of these?
  http://www.youtube.com/watch?v=E3keLeMwfHY
 



Re: [CODE4LIB] Python and Ruby

2013-07-29 Thread Chris Fitzpatrick
One thing to factor in is that if you learn Ruby you run the risk of
becoming one of those people who constantly talks, tweets, blogs, and posts
to this mailing list about how great Ruby is. This can have a very negative
impact on your work productivity.

On Monday, July 29, 2013, Dana Pearson wrote:

 Josh,

 I work exclusively with XSLT but specialize in metadata only no need for
 content display choices

 maybe a candidate for library programming language...XSLT 2.0 has useful
 analyze-string element to cover Roy's point

 by the way, Josh, live just down the road in Leeton

 regards,
 dana


 On Mon, Jul 29, 2013 at 12:04 PM, Roy Tennant roytenn...@gmail.com wrote:

  On Mon, Jul 29, 2013 at 9:57 AM, Peter Schlumpf pschlu...@earthlink.net
  wrote:
   Imagine if the library community had its own programming/scripting
  language, at least one that is domain relevant.
   What would it look like?
 
  Whatever else it had, it would have to have a sophisticated way to
  inspect text for patterns -- that is, regular expressions.
  Roy
 



 --
 Dana Pearson
 dbpearsonmlis.com



Re: [CODE4LIB] Python and Ruby

2013-07-29 Thread Chris Fitzpatrick
Hi,

My first email was an attempt at humour. Sorry, I didn't mean to jack your
thread.

Ruby is my language of choice, but I have done some work in Python.

For all the things you listed, there are libraries in both languages that
are probably as good as each other.

Python has lxml, which is as good as Ruby's Nokogiri for XML stuff. Python
has Sunburnt for Solr stuff, although I do really like Sunspot (and Tire
for ElasticSearch is even better). Both Python and Ruby have mechanize for
screen scraping, which was actually based on Perl's WWW::Mechanize
library...

I will say that while Ruby has more web application building tools, I
think Python is still more popular with science-y type people. Python seems
to be what all "Programming 101 for Non-CS Students" classes use now, so I
think Python has more data processing/science libraries, especially for
things like Natural Language Processing and statistics. I went to a
Semantic Web workshop and everyone was using Python or Java, although there
are some Ruby libraries out there...

That said, JRuby has really come a long way in the past year, so now it's
easier to use the bad-ass Java libraries (like Marc4j, CoreNLP, and Java's
XML libraries) without actually having to put up with all the crap Java
makes you submit to.

In terms of speed/performance both Ruby and Python are equally terrible.

I guess I'd just recommend instead of learning both languages, I would push
myself to learn one really really well. That was something I learned the
hard way when I was younger...always learning a language just well enough
to get comfortable then getting bored and trying something else.

good luck!



On Mon, Jul 29, 2013 at 10:03 PM, Jon P. Stroop jstr...@princeton.edu wrote:

 s/ruby/any_language/

 Why not learn both? As with spoken languages, knowing more than one makes
 it easier for you to think at a higher level of abstraction, and therefore
 makes you a better developer, and, as others have alluded to, will allow you
 to choose the 'right tool [framework, library, etc] for the right job'.

 Plus, as Giarlo said, they're not really that different.

 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Chris
 Fitzpatrick [chrisfitz...@gmail.com]
 Sent: Monday, July 29, 2013 1:39 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Python and Ruby

 One thing to factor in is that if you learn Ruby you run the risk of
 becoming one of those people who constantly talks, tweets, blogs, and posts
 to this mailing list about how great Ruby is. This can have a very negative
 impact on your work productivity.

 On Monday, July 29, 2013, Dana Pearson wrote:

  Josh,
 
  I work exclusively with XSLT but specialize in metadata only no need for
  content display choices
 
  maybe a candidate for library programming language...XSLT 2.0 has useful
  analyze-string element to cover Roy's point
 
  by the way, Josh, live just down the road in Leeton
 
  regards,
  dana
 
 
  On Mon, Jul 29, 2013 at 12:04 PM, Roy Tennant roytenn...@gmail.com
  wrote:
 
   On Mon, Jul 29, 2013 at 9:57 AM, Peter Schlumpf pschlu...@earthlink.net
   wrote:
Imagine if the library community had its own programming/scripting
   language, at least one that is domain relevant.
What would it look like?
  
   Whatever else it had, it would have to have a sophisticated way to
   inspect text for patterns -- that is, regular expressions.
   Roy
  
 
 
 
  --
  Dana Pearson
  dbpearsonmlis.com
 



Re: [CODE4LIB] Image de-duping and file identification

2013-03-20 Thread chris fitzpatrick
inline: compose-unknown-contact.jpg

Re: [CODE4LIB] web-based ocr

2013-03-13 Thread chris fitzpatrick
I recommend looking at pdfbeads. It's in Ruby and the documentation is 
mostly in Russian ( 
http://rubyforge.org/docman/view.php/9752/10692/pdfbeads.ru.html ), but 
it provides both a library and an easy-to-use executable to build PDFs 
out of hOCR files and images. You literally just point it at a directory 
with page images and hOCR files and it spits out a PDF. Very handy.


Also, the DIY Book Scanner forum (diybookscanner.org ) is a great 
resource if you're into these sorts of things...




Eric Lease Morgan wrote:


On Mar 13, 2013, at 8:07 AM, Ben Brumfield benwb...@gmail.com wrote:



https://github.com/idigbio-aocr/RESTAPI/tree/master/doc



Interesting. Printed for future reference. Thank you.

BTW, I did finally get Image::OCR::Tesseract to make, make test, and 
make install correctly. I did not have the proper libraries 
installed for Tesseract's supporting Leptonica library. Now I need to 
find a PDF library similar to libtiff and libpng.


--
Eric Morgan


Re: [CODE4LIB] web-based ocr

2013-03-12 Thread chris fitzpatrick

Hi,

I recently looked into similar services...

There are some cloud-based vendors that do this. Abbyy, for example, 
offers one. But the cost seems rather high when working in bulk. I did 
the math and it didn't make sense for us... I think they market it 
towards people building mobile apps, not scanning books.


Luckily, the Internet Archive OCRs documents uploaded to it for free. 
And the OCR results are pretty good (or better than I ever got with 
Tesseract). So I use that a lot. However, you have to upload your 
document in a specific zipped-up package... I don't think there's a 
generic web form.


For something like that, I'd suggest ... Google Drive. It OCRs documents 
fairly well, although they have a size limit. We're using Google Apps 
for Education as our Digital Repository, so that works pretty well for a 
lot of our smaller documents...


b,chris.


Eric Lease Morgan wrote:


Does anybody here know of a Web-based OCR program or Web service?

Many people want to do OCR against digitized texts. We all know of 
various OCR applications (Adobe Acrobat, ABBYY FineReader, Google's 
Tesseract, etc.), but they are not necessarily Web-based. As a service 
to my university, I thought it might be cool (or kewl) to support an 
image to text application. Go to Web form. Submit one or more image 
files. Have OCR done against them no matter how dirty the output. 
Return plain text. As a bonus, the application would support a 
REST-ful API.


Does anybody know of something like this that exists already?

--
Eric Lease Morgan
University of Notre Dame


Re: [CODE4LIB] web-based ocr

2013-03-12 Thread chris fitzpatrick

Hi,

In regards to handwriting, you could always train an OCR library to do 
this and there are several OCR libraries that attempt to do this 
out-of-the-box (probably most notable is Evernote) ...but yeah, the 
results vary greatly depending on the style of writing. Most focus on 
just hand printed things like post-its.


And a quick thing I found out recently about Tesseract. It is pretty 
good if all you want is the text extracted. It does not do layout 
recognition very well, so output will look funky if there are layout 
oddities...like footnotes. But it really depends on what you have and 
what you're trying to do. For example, I did not have much success 
making EPUBs with Tesseract, but it worked great with our theses (which 
have mandatory layout requirements). So another big bonus for using the 
Internet Archive (who, I think, use Abbyy).




b,chris.


Eric Lease Morgan wrote:


Thank you for the prompt replies.

Call me cheap or unable to navigate the political/fiscal landscape, 
but I don't see myself subscribing to a service. Instead I see putting 
a wrapper around Tesseract, but alas, the wrappers are written in 
languages that I don't know. [1] Hmmm… On the Perl side, I am having 
problems installing Image::OCR::Tesseract.


[1] Wrappers - http://code.google.com/p/tesseract-ocr/wiki/AddOns

--
Eric Still Cogitating Morgan


[CODE4LIB] Google Drive as an IR

2013-03-12 Thread chris fitzpatrick

So, yeah, new thread. Sorry (I'm not sorry).

tl;dr = it's not perfect but you'll never get access 
control/revision/fulltext searching functionality even if you spend 
~1000x more.


About using Google Drive... yeah, we're very small (115 students!), so 
we're very interested in keeping our overhead nice and low.
I guess I'm old enough to think that 100 GB for $5 a month is a pretty 
good deal, so we started saying "Google Drive is our IR" as a joke, but 
it's actually turned into a really nice IR-type thingy.


We just added a generic library user in our domain and bought extra 
drive space for it. We try to keep things orderly by putting them in 
various folders (Dissertations, Articles, UN Documents), since that 
makes it easier to recursively apply ACLs. GDrive is like AWS in that 
the folders are not really folders like we're used to on a file 
system, but more like tags... so if you move a file around, it keeps its 
UUID (and therefore URL), which is pretty nice.


The best part is that since Google Apps uses OAuth, access control is 
really simple both in Google Apps and with external web apps.  We can 
make a document open to the world, grant access to groups/individuals, 
only allow access if they have the URL, etc. This works if they search 
in Google Drive or if they're trying to access a document embedded on 
another site.


The bad news is that there's not much (i.e. no) support in the way of 
descriptive metadata, which is kind of huge. To work around this, we 
currently keep descriptive metadata records either in our ILS 
(Koha) or in our group Mendeley account. This adds a bit of complexity 
to managing the metadata and also means there's not a discovery 
interface that allows for both full-text searching (which Google 
provides) and metadata searching (which Koha mostly provides).
I wrote an app last summer that indexes some of this content into a 
discovery interface, which I'm actually in the process of merging back 
into our Blacklight OPAC so we'll have a unified DS9DE (Deep Space 9 
Discovery Environment). So, there's that...


And slightly less bad news: OCR and the document viewer only support 
files under 20MB. We have a lot of very large PDFs, so it's a bit of a 
drag, but the students just have to download the PDF, so it's not so bad.


And the Google Drive desktop client can be buggy and crashes if you try 
to sync a large collection. And you still have to figure out preservation 
(or not).


But yeah, despite all that BS it's been pretty great. And since Google 
gives the CIA unlimited warrantless access, I assume that someone out 
there (i.e. DC metro area)  is reading our content.


Any questions, please feel free to ask me...

b,chris.


Re: [CODE4LIB] jobs.code4lib.org and job locations

2013-02-24 Thread Chris Fitzpatrick
Hi,
Has anyone volunteered for the mapping feature? If not, I'd like to take a
crack at it, as I want to get more practical Django experience under
my belt. And since this list has gotten me two jobs, I would love to give
some payback. I just don't want to duplicate any work someone else has
started. b, chris.
On 24 Feb 2013 20:08, Gary McGath develo...@mcgath.com wrote:

 It works very nicely with Sage, which is what I use to follow feeds.
 Thanks!

 On 2/24/13 1:45 PM, Ed Summers wrote:
  Hi Gary,
 
  Great idea, and it was easy to implement. For example you can now get
  tag related feeds:
 
  http://jobs.code4lib.org/feed/tag/digital-preservation/
  http://jobs.code4lib.org/feed/tag/python/
  http://jobs.code4lib.org/feed/tag/web-archiving/
  http://jobs.code4lib.org/feed/tag/fedora-repository-architecture/
  etc ...
 



 --
 Gary McGath, Professional Software Developer
 http://www.garymcgath.com



Re: [CODE4LIB] You are a *pedantic* coder. So what am I?

2013-02-21 Thread Chris Fitzpatrick
"pedantic" and Ruby go together about as well as "brevity" and this
mailing list

class Foo
  private

  # Private method: cannot be called with an explicit receiver.
  def bar
    "Calling a private method is foobar"
  end
end

$ irb
1.9.3p286 :009 > Foo.new.bar
NoMethodError: private method `bar' called for #<Foo:0x007f9e9184b8b8>

1.9.3p286 :010 > Foo.new.send(:bar)
 => "Calling a private method is foobar"

They've been saying they're going to remove this in the next version for
about 5 years now...



On Thu, Feb 21, 2013 at 9:37 PM, Ian Walls iwa...@library.umass.edu wrote:

 Justin,


 I certainly agree that to become a better coder, it's good to experiment
 with many languages and applications.  I'm not advocating that any given
 shop should always rule out a project in a new (to them) language.  What
 I'm saying is that the context of what you already know and what your
 environment supports is an equally important part of the conversation when
 choosing a language to develop in.

 -Ian

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Justin Coyne
 Sent: Thursday, February 21, 2013 12:59 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] You are a *pedantic* coder. So what am I?

 Ian, I have to caution against taking the attitude we only code in what we
 already know.  Of course you are going to be able to hit the ground
 running faster in what you are expert in.  Putting on the blinders is a
 great way to become irrelevant in the technology sphere.  If you want to be
 a better coder, there is no better way than to learn a new language, and
 actually do a project in it. The insights you find in doing this will make
 you a better coder when you go back to doing whatever it was you were doing
 before.

 -Justin


 On Thu, Feb 21, 2013 at 11:53 AM, Ian Walls iwa...@library.umass.edu
 wrote:

  Agreed.  Each language has its own strengths and weaknesses.  Pick the
  one that works best for your situation, factoring in not only what the
  application needs to do, but your and your team's level of experience,
  and the overall community context in which the project will live.  The
  peculiarities of a given language's truth tables, for example, can
  easily get washed out of the calculation when you consider what
  languages you know and what platforms your institution supports.
 
 
  -Ian
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
  Of Ethan Gruber
  Sent: Thursday, February 21, 2013 12:45 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] You are a *pedantic* coder. So what am I?
 
  Look, I'm sure we can list the many ways different languages fail to
  meet our expectations, but is this really a constructive line of
 conversation?
 
  -1
 
 
  On Thu, Feb 21, 2013 at 12:40 PM, Justin Coyne
  jus...@curationexperts.com wrote:
 
   I did misspeak a bit.  You can override static methods in Java.  My
   major issue is that there is no getClass() within a static method,
   so when the static method is being run in the context of the
   inheriting class it is unaware of its own run context.
  
   For example: I want the output to be "Hi from bar", but it's "Hi
   from foo":
  
   class Foo {
     public static void sayHello() {
       hi(); // resolved at compile time to Foo.hi()
     }
     public static void hi() {
       System.out.println("Hi from foo");
     }
   }
  
   class Bar extends Foo {
     // Hides Foo.hi(); static methods are not dispatched dynamically.
     public static void hi() {
       System.out.println("Hi from bar");
     }
   }
  
   class Test {
     public static void main(String[] args) {
       Bar.sayHello(); // prints "Hi from foo"
     }
   }
  
  
   -Justin
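
   A small sketch of the behavior described above, assuming the usual Java
   rule that a static method in a subclass hides rather than overrides the
   parent's, while instance methods are dispatched at run time. The class
   and method names are invented for illustration, and the methods return
   strings instead of printing so the two cases are easy to compare.

```java
// Hypothetical illustration; names are not from the thread.
final class StaticHidingSketch {

    static class Foo {
        static String hi() {
            return "Hi from foo";
        }

        // The call to hi() is bound to Foo.hi() at compile time.
        static String sayHello() {
            return hi();
        }

        String hiInstance() {
            return "Hi from foo";
        }

        // The call to hiInstance() is dispatched on the runtime type.
        String sayHelloInstance() {
            return hiInstance();
        }
    }

    static class Bar extends Foo {
        // Hides Foo.hi(); it does not override it.
        static String hi() {
            return "Hi from bar";
        }

        @Override
        String hiInstance() {
            return "Hi from bar";
        }
    }

    public static void main(String[] args) {
        System.out.println(Bar.sayHello());               // static: Foo's version
        System.out.println(new Bar().sayHelloInstance()); // instance: Bar's version
    }
}
```

   If the goal is the "Hi from bar" result, making the methods instance
   methods (or passing the class in explicitly) is one way to get it.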
  
  
  
   On Thu, Feb 21, 2013 at 11:18 AM, Eric Hellman e...@hellman.net
 wrote:
  
OK, pedant, tell us why you think methods that can be over-ridden
are static.
Also, tell us why you think classes in Java are not instances of
java.lang.Class
   
   
On Feb 18, 2013, at 1:39 PM, Justin Coyne
jus...@curationexperts.com
wrote:
   
     To be pedantic, Ruby and JavaScript are more Object Oriented
     than Java because they don't have primitives and (in Ruby's
     case) because classes are themselves objects. Unlike Java, both
     Python and Ruby can properly override static methods on
     sub-classes. The Java language made many compromises as it was
     designed as a bridge to Object Oriented programming for
     programmers who were used to writing C and C++.

 -Justin

   
  
 



[CODE4LIB] Tesseract hOCR to ePub conversion

2013-02-16 Thread Chris Fitzpatrick
Hi,

I am wanting to add epub output to our scanning workflow...just like the
Internet Archive does. However, looking at their code, it appears they are
using Abbyy FineReader for OCR.

We're using Tesseract to make hOCR files, which we combine with the
images to make PDFs. Has anyone done the conversion of hOCR files to ePub?

I want to avoid the PDF or DjVu to ePub conversion, since the output from
this is usually very bad.

Thanks.. b,chris.


Re: [CODE4LIB] tech vs. nursing

2012-11-30 Thread Chris Fitzpatrick
Ah, so I am a bit delusional. I really need to cut back on these bath
salts (but at least I'm not ALA delusional...$90 for a PDF of librarian
salary statistics...really?)

Thinking about this, I guess things have changed quite a bit and I hadn't
realized it...at my first 3 library/archive jobs as a student, I was the
only man in the departments I was working in. But recently, it seems women
are making up more of the library staff that are a) retiring, b) being laid
off or forced out, or c) being put into marginalized middle management
positions (which usually leads to scenario b). I don't think this is some
sort of evil plan that is being hatched in some boardroom somewhere...but it
does seem to be happening, right?

But, this leads into another trend I've noticed... recent MLIS graduates
are constantly lamenting the lack of jobs...meanwhile this list is flooded
with jobs. It's a really odd disconnect.  MLIS programs probably have a
good mix of genders (or has that changed too?), so maybe being more active
with current MLIS students will not only get more women in code4lib but
also get more women working in the newer technology departments?




On Thu, Nov 29, 2012 at 11:08 PM, Karen Coyle li...@kcoyle.net wrote:

 ALA does salary surveys every year. This is from the ALA-APA toolkit [1]:

 Pay inequity also exists within librarianship. The Association of
 Research Libraries, in its Annual Salary Survey
 2005-6, reported that the average salary for male academic librarians in
 member libraries was $63,984, while
 the average for female academic librarians was $61,083.5

 Library Journal reported that new library school graduates finally crossed
 the $40,000 mark as an average salary,
 but the gender split had women below that point with $39,587 and men at
 $42,143.

 And there's more if you go through the literature.

 kc

 [1] http://ala-apa.org/improving-salariesstatus/resources/ala-apa-librarian-and-library-worker-salary-surveys/


 On 11/29/12 1:19 PM, Chris Fitzpatrick wrote:

 Hm. This all has been a long and really interesting conversation...but I
 gotta ask if  men really outweigh women in the higher paying library jobs
 as much as they do in banks and K-12? I guess it depends on the definition
 of tech vs. non-tech jobs in the library setting, which I'll leave to
 that other email thread...but since I started working in libraries, 3 of my
 last 5 managers (hi, Bess!) were women. I always thought one of the best
 things about working at libraries was that there are way more women
 working
 in higher positions than there are in most private for-profit companies.
   And I'd be willing to bet my life savings that libraries have
 a significantly higher percentage of women executives than Fourtune 500
 companies. But maybe I'm delusional about this? I don't have any figures
 or
 anything...

 What I have noticed is that academic libraries have been trying harder to
 emulate the Valley and the general tech field. Not only is Thinking Like
 A
 Startup a mantra, but libraries are flocking to flashier cutting edge
 technologies. This is probably not a bad thing, but communities like
 Rails,
 Drupal, Django, Hadoop, and Node are all overloaded with particular
 chromosome. So maybe a side-effect is that we're now emulating some of
 their bad habits along with the good ones?

 Another thing that Karen Coyle's comments about coders vs. helpers
 made
 me think of is that academic libraries tend to be reorganizing their
   departments in kinda interesting ways. There now seems to be things like
 Metadata or Systems groups that are distinct from Digital Repository
 or Applications groups. Catalogers and the people who work on the ILS
 are
 often completely segregated from the people who work on the new flashy
 grant-funded projects. The former, it kinda seems to me, tends to have
 more
 women members while the latter is often lacking. Code4Lib draws mostly
 from
 people working in these new-ish groups, which the others get sent to
 things
 like ALA...maybe we can significantly improve our ratio by trying to
 involve and interact more with our colleagues sitting on the other side of
 the cubical partition? Although the last time I did that I learned the
 hard
 way why turning off the Zebra index is a bad idea, so maybe on second
 thought it's better if we don't get in each other's hair

 best,fitz.


 On Thu, Nov 29, 2012 at 9:10 PM, Bess Sadler bess.sad...@gmail.com
 wrote:

  The challenges around getting women into male-dominated professions is a
 little different from the challenges of getting men into women-dominated
 professions. For one thing, professions that are female-dominated are
 notoriously low-paying and low-status (think K-12 teachers, nursing,
 social
 workers, etc). These professions do have major recruiting problems,
 largely
 because they are low-paying, often considered

Re: [CODE4LIB] tech vs. nursing

2012-11-29 Thread Chris Fitzpatrick
Hm. This all has been a long and really interesting conversation...but I
gotta ask if men really outweigh women in the higher paying library jobs
as much as they do in banks and K-12? I guess it depends on the definition
of tech vs. non-tech jobs in the library setting, which I'll leave to
that other email thread...but since I started working in libraries, 3 of my
last 5 managers (hi, Bess!) were women. I always thought one of the best
things about working at libraries was that there are way more women working
in higher positions than there are in most private for-profit companies.
And I'd be willing to bet my life savings that libraries have
a significantly higher percentage of women executives than Fortune 500
companies. But maybe I'm delusional about this? I don't have any figures or
anything...

What I have noticed is that academic libraries have been trying harder to
emulate the Valley and the general tech field. Not only is "Thinking Like A
Startup" a mantra, but libraries are flocking to flashier cutting-edge
technologies. This is probably not a bad thing, but communities like Rails,
Drupal, Django, Hadoop, and Node are all overloaded with a particular
chromosome. So maybe a side-effect is that we're now emulating some of
their bad habits along with the good ones?

Another thing that Karen Coyle's comments about coders vs. helpers made
me think of is that academic libraries tend to be reorganizing their
departments in kinda interesting ways. There now seem to be things like
Metadata or Systems groups that are distinct from Digital Repository
or Applications groups. Catalogers and the people who work on the ILS are
often completely segregated from the people who work on the new flashy
grant-funded projects. The former, it kinda seems to me, tend to have more
women members while the latter are often lacking. Code4Lib draws mostly from
people working in these new-ish groups, while the others get sent to things
like ALA...maybe we can significantly improve our ratio by trying to
involve and interact more with our colleagues sitting on the other side of
the cubicle partition? Although the last time I did that I learned the hard
way why turning off the Zebra index is a bad idea, so maybe on second
thought it's better if we don't get in each other's hair...

best,fitz.


On Thu, Nov 29, 2012 at 9:10 PM, Bess Sadler bess.sad...@gmail.com wrote:

 The challenges around getting women into male-dominated professions are a
 little different from the challenges of getting men into women-dominated
 professions. For one thing, professions that are female-dominated are
 notoriously low-paying and low-status (think K-12 teachers, nursing, social
 workers, etc). These professions do have major recruiting problems, largely
 because they are low-paying, often considered to be undesirable, and they
 have high levels of stress and burnout. When men choose to enter these fields,
 they often are promoted more quickly and paid more than women. There are
 many professions where this is true. Women outnumber men as K-12 teachers,
 but men outnumber women as K-12 principals and school superintendents.
 Women make up the majority of bank tellers, but men make up the majority of
 bank managers. Women make up the majority of librarians, but men make up
 the majority of the higher-paying technology jobs in libraries. Sensing a
 pattern yet? THAT is what we are trying to disrupt.

 Don't get me wrong, getting more men into nursing is a good thing too! The
 fact that men are less likely to put up with low wages, bad working
 conditions, or disrespectful colleagues can work in everyone's favor, and
 the field of nursing in particular has faced such problems with recruiting
 that they are trying to undergo a major cultural shift. Male nurses have
 been a part of that. Obviously I am not a nurse, but I do have a close
 relative who authored a study on this subject for a nursing school, so I
 have heard a bit about it.

 I highly recommend the book Women Don't Ask (http://www.womendontask.com),
 which is a great book for anyone who wants to know more about effective
 negotiating. (Read it before your next salary negotiation!) The book
 discusses why men tend to ask for better treatment, better salaries, more
 opportunities, etc, while women more often accept whatever they are given.
 This is learned behavior that we can learn to change, though. I think a
 place like code4lib, where there is so much opportunity to speak up or
 spark initiatives without any hierarchy or bureaucracy getting in the way,
 can be a fertile ground for women who want to develop their negotiation and
 leadership skills, as well as their technical capacity. My entire career
 has been shaped around stuff I learned in code4lib, and only some of it was
 about code.

 Bess

 On Nov 27, 2012, at 7:56 AM, Huwig,Steve huw...@oclc.org wrote:

 I'm just the peanut gallery (having never attended Code4Lib) but it
  seems to me that a useful analogue to programming/tech 

Re: [CODE4LIB] Proposed Changes to Future Conference Program Choosing

2012-11-28 Thread Chris Fitzpatrick
maybe i'm just being naive, but i have the feeling if we:
a) strongly stated that we support and encourage diversity and would like
to see that reflected in our presentation lineup
b) allowed people to include some information about themselves in the
proposal that increases voter awareness ( like newcomer or
diverse perspective or something... god, really hard not to put a joke in
here. ). The designation would be the presenter's choice.
c) simply reiterated the goals and code of conduct right before voting time
so everyone remembers that we had this discussion.

I kinda think if we did that, we'd meet our goals and would avoid having to
make a bunch of voting rule changes or form a committee.



On Wed, Nov 28, 2012 at 6:58 PM, Kyle Banerjee kyle.baner...@gmail.com wrote:

 On Tue, Nov 27, 2012 at 6:30 PM, Cynthia Ng cynthia.s...@gmail.com
 wrote:

  I'm really glad to see this discussion continuing. It seems like
  there's a good amount of support for at least giving a certain amount
  of sessions over for the program committee to decide.
 

 Frankly, I'd favor letting them decide *all* of the sessions, the logic
 being that the only reason for a program committee to exist in the first place
 is to put together a program.

 Don't get me wrong. I like approval voting. I like the idea of putting on
 what people want. But that's not the same as putting on what people ask
 for.

 When you ask a decent sized population what they want, they'll ask for
 things they know they want to learn and people they want to hear from.
 What's wrong with that? For starters, it encourages intellectual
 inbreeding. Problems, technologies, etc, that affect more people are
 favored while things with a more select appeal get deemphasized. But IMO,
 the reason to go to c4l is not to learn about X or Y, but to expose
 yourself to people and things that were totally off your radar.

 Secondly, the program should be a coherent whole, not a collection of
 parts. People choose sessions individually without any knowledge of what
 else will be on the program. Balance can only be achieved by accident or if
 someone is making it happen (i.e. the program committee). People shouldn't
 just be submitting things -- the committee should identify talented
 individuals who aren't already known and actively recruit them. They should
 directly suggest topics to people who know something but have trouble
 recognizing how much their ideas would benefit the community. By taking a
 much more active role in recruiting presentations, the program committee
 can mitigate the self selection issue as well as tackle the diversity issue
 head on. It's not like the process wouldn't still be community driven since
 anyone can be on the program committee.

 As far as the 15% target goes, I think that's a decent goal but would hope
 it would be much higher in practice. This conference is all about
 participation and sharing. At the first c4l, 100% of the sessions were by
 first time attendees. I seem to remember that the vast majority of the
 people attending were on the stage at some time. Besides, a lot of people
 do their best work early in their careers.

 And to all the people reading this who feel shy/intimidated about jumping
 in, you're too respectful of the status quo. There are a lot of dedicated
 people who really know what they're doing. But you should never be afraid
 to call things as you see them. If everyone in a group you like thinks one
 thing, and you think another, that doesn't make you wrong -- to believe
 otherwise is a substitute for thinking. Creative spark rather than
 technical skill is what moves us forward and many of the people who appear
 very established were regarded as yahoos not that long ago.

 To summarize, I favor having the program committee decide the whole program
 and think their process should be informed by voting and goals of the
 community.

 kyle



Re: [CODE4LIB] desensitization

2012-11-20 Thread Chris Fitzpatrick
Wow. Thanks Thomas. That helps a lot.

Looking over this, I started wondering if it might be possible to actually
trigger the relay using voltage from tablet's audio jack. I've seen people
do this with cell phones and camera flash triggers, although I doubt I can
get 5V DC out of the audio jack without amplifying it...or I find a relay
that can go off 600mV or whatever I can get from the jack.  I'm probably
stuck using an iPad, since my wife has one i can use to build the prototype
and iOS has a pretty good barcode reader library.

b,chris.


On Mon, Nov 19, 2012 at 4:38 PM, Thomas Bennett bennet...@appstate.edu wrote:

 On our 3M self checkout, the desensitizer is activated when the barcode is
 read (by a laser scanner), if I remember correctly; the patron is already
 logged in to the system.  You might be able to get something from an
 electronics store to first replace the manual switch with an electronic
 switch that operates on 5 volts (I think it is 5 on USB), then somehow have
 this connected, maybe with a USB hub that the scanner is on.  There may be
 other



 From reef central forum:
 The voltage supplied by a usb host or powered hub is between 4.75 and
 5.25VDC. USB 2.0 specifies 5VDC @ 500 mA max.
 The relay you would need would be a 5VDC relay, with the contacts rated
 for 110 -125VAC. These are available, however the load rating is often low
 ~1 amp or less. (not all inclusive)
 Small 1A SPDT Relay, 5v, OMRON
 http://www.allspectrum.com/store/small-1a-spdt-relay-5v-omron-p-512.html

 Also from reef central:

 I have all the parts to build a USB AC power center, but haven't gotten
 around to trying it.
 The problem here is that a USB port is a serial port.
 While you MIGHT be able to get away with just wiring up one USB serial
 line to a relay and forcing that pin high, you can only do one device.
 My design uses the DALLAS 1-wire switches and a USB adapter.
 You can string together hundreds of these devices onto just TWO wires and
 drive and query all of them using a Linux file system called OWFS ( One
 wire file system ).
 All of the devices on the interface show up in the linux filesystem as
 files.
 To read the status, you just read the file, to change the device status (
 closed or open ) you just write to the file.
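
 A minimal sketch of the read-a-file / write-a-file idea described above, in
 Python. It is only an illustration, not anything from the original post: the
 mount point, the device ID and the property file name (PIO) are assumptions
 that depend on the actual 1-wire switch and on how OWFS is mounted, and the
 helper names are hypothetical.

```python
from pathlib import Path

# ASSUMPTION: OWFS is mounted somewhere like /mnt/1wire and the switch shows
# up as a directory containing a "PIO" file holding "1" or "0".
# Example (hypothetical) path: /mnt/1wire/05.4AEC29CDBAAB/PIO


def read_switch(pio_file):
    """Return True if the switch file currently holds '1', else False."""
    return Path(pio_file).read_text().strip() == "1"


def set_switch(pio_file, on):
    """Change the switch state by writing '1' or '0' to its file."""
    Path(pio_file).write_text("1" if on else "0")
```

 With a relay wired to such a switch, the desensitizer could in principle be
 pulsed with set_switch(path, True), a short sleep, then set_switch(path,
 False); whether that suits a given desensitizer is untested here.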

 Honestly the simplest way to experiment with this stuff is to use a
 computer parallel port.
 You have a LOT more pins and they can be set via peeks/poke from the OS.

 Another option is a USB to parallel port converter or a USB relay board.
 http://www.virtualvillage.com/usb-po...stamp-bs2.html

 Also check these guys out:
 http://bb-elec.com/welcome.asp


 Hope this helps,

 Thomas

 
 Support Request: http://portal.support.appstate.edu
 
 Thomas McMillan Grant Bennett
 Operations Systems Analyst
 Library Systems, University Library
 Appalachian State University
 P O Box 32026, Boone, North Carolina 28608
 (828) 262 6587
 http://www.library.appstate.edu
 


 On Nov 19, 2012, at 4:09 AM, Chris Fitzpatrick wrote:

  Hi,
 
  I'm working on designs to build a self-checkout kiosk for our Koha system.
  Seems pretty straight-forward except the book desensitizer part.  All the
  desensitizers I've ever used only had an on/off switch.
  Has anyone ever seen or used a desensitizer that can
  be programautomagically triggered?
 
  Hoping to use an iPad or Nexus, so something that's not windows only would
  be ideal, but looking for anything right now...
 
  thanks for any pointers/suggestions
 
  b,chris.



Re: [CODE4LIB] Code4lib 2013 Presentation Election now open!

2012-11-13 Thread Chris Fitzpatrick
I'm also having issues. I'm using WebTV.


On Tue, Nov 13, 2012 at 7:27 PM, Michael B. Klein mbkl...@gmail.com wrote:

 Results brought to you by @zalgo.


 On Tue, Nov 13, 2012 at 12:15 PM, Becky Yoose b.yo...@gmail.com wrote:

  Not a voting problem per se, but the results page in IE9 [1] in Win7 threw
  up everywhere: http://screencast.com/t/lUnwFl8h
 
  Otherwise, yay new design :cD
 
  Thanks,
  Becky
 
  [1] Related: don't ask why I was in IE.
 
  On Mon, Nov 12, 2012 at 11:03 PM, Ross Singer rossfsin...@gmail.com
  wrote:
 
   http://vote.code4lib.org/election/24
  
   Vote early, vote often, but most importantly, vote soon:  the polls close
   sometime on the night of Monday the 19th of November (looking at the host
   that the diebold-o-tron runs on, I think it will be around 11 PM EST, but
   when they close, they close!).
  
   -Ross.
   p.s. given the new design, let me know if there are any voting
 problems.
  
 



Re: [CODE4LIB] Crappy AJAX

2012-10-25 Thread Chris Fitzpatrick
http://en.m.wikipedia.org/wiki/Sayre's_law
On Oct 25, 2012 3:49 PM, Ross Singer rossfsin...@gmail.com wrote:

 On Oct 25, 2012, at 9:20 AM, Gary McGath develo...@mcgath.com wrote:

  Which is exactly the point I was about to make before I read your second
  paragraph; the server, not the web page, should be fixed up to make
  things work sensibly.

 http://www.google.com/search?q=Postel's+law

 -Ross.



Re: [CODE4LIB] new server

2012-07-17 Thread Chris Fitzpatrick
If you're just wanting a web server for a single site, having a
physical dedicated server is probably not really needed. But if it's a
requirement to have physical hardware, I'd look to buy something on which
I can set up a small VM environment and deploy multiple web servers as
needed, in which case you probably could do worse than a Dell PowerEdge
running VMware.

An alternative would be to buy a Mac Mini with OS X Server
installed...you could run Ubuntu on that too.


Having said that... I can understand why some would see using
cloud-based systems as outsourcing, but there is more to it than
just getting out of physical server management. There are a lot of
development platforms coming together now that offer a set of services
that make developing and managing web applications way easier.
For Drupal, I'd suggest looking at Acquia, as they have a pretty
good platform for Drupal development and hosting.

b,chris.







On Tue, Jul 17, 2012 at 6:37 AM, Cary Gordon listu...@chillco.com wrote:
 When you look at everything that goes into the TCO, it is hard to make
 a case for a physical server.

 We have about 17 years experience running systems starting with the
 California State Library's DEC Alpha. We won't miss running to the
 datacenter on the weekend to deal with a drive failure.

 Amazon has gone from a metric-less, expensive and difficult to manage
 system to a solid infrastructure with better performance per dollar
 than we can get in our datacenter. The bonus is that we can scale at
 will.

 Cary

 On Mon, Jul 16, 2012 at 6:18 PM, Nate Hill nathanielh...@gmail.com wrote:
 I should have anticipated a lot of folks would be pushing AWS or Rackspace
 or something off-site.

 At my last job in San Jose I would have *loved* to have outsourced all of
 this because of the complications working with both city and University IT
 and network.
 I would have loved to have kissed those Windows servers goodbye and brushed
 up on my Linux and had the 24 hour support and zero downtime guarantee that
 came with such a solution.

 In Chattanooga, the situation is different.

 We've got the 1 gig connection, and it is a big piece of this wonderful
 city's identity.  I definitely don't know enough about network architecture
 to speak meaningfully about it, but we are moving from an antiquated setup
 to the fastest public internet in the country.  It's pretty cool.  I don't
 think outsourcing is really part of that plan, you know?  I'm really
 looking forward to engaging the local geek community in creating local
 solutions.

 I do imagine that in the future as we do one-off apps we'll experiment with
 AWS.  For now, I'm awfully excited to set up some hardware, have control of
 that hardware (that cannot be taken for granted in public libraries) and do
 some tinkering.

 Yes... I do need more than just a production server, but I've got some
 reconditioned boxes coming from the city that I can play with for testing
 and staging (for now).

 For now, this server is going to run/host a Drupal website for the library.

 Please, anybody, do speak up if you think my approach is flawed...

 N

 On Mon, Jul 16, 2012 at 8:43 PM, Ross Singer rossfsin...@gmail.com wrote:

 This answer segues well into my question: why, exactly, do you want a
 physical server?

 I realize that there are plenty of arguments for running your own hardware
 (and bandwidth is cheap and plentiful in Chattanooga -- which deals with
 the main carrying cost), but, presumably you'll need more than one (for
 replication and whatnot), right?

 What exactly do you plan to run/host on this server?

 -Ross.

 On Monday, July 16, 2012, Cary Gordon wrote:

  We currently use Dell in our datacenter, but we are moving almost all
  of our servers to AWS over the next 10 months.
 
  Thanks,
 
  Cary
 
  On Mon, Jul 16, 2012 at 11:52 AM, Nate Hill nathanielh...@gmail.com wrote:
   I'm shopping for a new dedicated server for our public library website.
   I'd like to run Ubuntu.
   Does anyone have any hardware suggestions/guidance they'd like to
 offer?
   I'd like to not spend a zillion dollars.
   Thanks-
  
   --
   Nate Hill
   nathanielh...@gmail.com
   http://www.natehill.net
 
 
 
  --
  Cary Gordon
  The Cherry Hill Company
  http://chillco.com
 




 --
 Nate Hill
 nathanielh...@gmail.com
 http://www.natehill.net



 --
 Cary Gordon
 The Cherry Hill Company
 http://chillco.com


Re: [CODE4LIB] LoC job opening ???

2012-07-09 Thread Chris Fitzpatrick
This just seems like some sort of trap. The fact that it's a craigslist ad
in all caps makes me pretty sure this person is working on a librarian
centipede in their basement.
On Jul 9, 2012 7:56 PM, Simon Spero sesunc...@gmail.com wrote:

 On Jul 9, 2012 1:27 PM, Joshua Gomez jngo...@gwu.edu wrote:

  WE NEED A CAT LOVER WHO IS ALSO A FEDERAL EMPLOYEE TO DO THIS JOB!

 Must have active TS/SCI clearance with FS Poly.

 All applicants must complete the attached 20 page KSA.



Re: [CODE4LIB] LoC job opening ???

2012-07-09 Thread Chris Fitzpatrick
I think it has to be a federal employee because the SCOTUS ruling let
the experimentation on federal employees part of Obamacare stand. I
think that was just to placate Scalia or something (didn't work).

And they're probably looking for librarians because they'll come in
droves if there are cats that need feeding. Especially DARLING, SUPER
FLUFFY, SHY HIMALAYANS.

But doing the LC thing isn't as bad as it sounds...I did it for a few
months when I first got out of school. The pay is lousy, but you do
get pretty nice benefits (although it's hard to find a dentist that
will actually see you when you're in that condition).

Recent MLIS graduates could do worse...just don't expect any
promotions. It's a very flat organization structure and everyone is
really just stuck in the middle...it's kinda sad b/c there's people
who've been there forever just waiting around to become the guy in
front. But you literally have to know Dick Cheney to be the front of
the pede...literally.

b,chris.




On Mon, Jul 9, 2012 at 8:39 PM, BWS Johnson abesottedphoe...@yahoo.com wrote:
 Salvete!


 Despite the lawful and prudent endorsement of this thread by our
 official designee to the OCLC Off-Topic Cat Discussion Moderation
 Division, I feel it necessary to point out that @mjgiarlo's post was
 in error. The K Street lobbyist does not handle kitty litter. The
 Congressperson the K Street lobbyist controls, does.
 Roy


 Psha! The K Street lobbyist's Congressperson's aide does, amateur! You're 
 so naive when it comes to the inner workings of the Capitol.

 Cheers,
 Brooke


Re: [CODE4LIB] triple stores ???

2012-05-30 Thread Chris Fitzpatrick
Hey Ravi,

I actually learned about TinkerPop from a posting on this list from
Brian Tingle and I started playing with it and eventually started
working with it for the digital repository we're building. I actually
began with the Sail extension, but scaled back to a non-RDF model
after realizing it wasn't really a requirement that our users
were asking for...simply using it as a graph db makes things go a lot
quicker. Of course, we're planning on revisiting implementing the
RDF-based model in the future.

In TinkerPop/Neo4j, there's ACID transaction support and there's a bulk
loading utility.  However, there is no transaction support in the
bulk loader...also, I think Sail has kinda different/weird transaction
support, but I think you can override that in TinkerPop somehow...

b,chris.



On Wed, May 30, 2012 at 2:29 AM, Simon Spero sesunc...@gmail.com wrote:
 The latest version of Jena TDB adds atomic transactions (version 0.9.0+)

 See http://jena.apache.org/documentation/tdb/tdb_transactions.html for
 documentation:

 The following limitations are listed:


   - Bulk loads: the TDB bulk loader is not transactional
   - Nested transactions are not supported.
   - Some active transaction state is held exclusively in-memory, limiting
   scalability.
   - Long-running transactions. Read-transactions cause a build-up of
   pending changes;
   - If a single read transaction runs for a long time when there are many
   updates, the system will consume a lot of temporary resources.




 On Tue, May 29, 2012 at 7:00 PM, Fleming, Declan dflem...@ucsd.edu wrote:

 Hi Ravi - I'll let some of my more technical folks chime in, but we do a
 bunch with RDF and have found every triplestore we've tried very limited in
 handling transactions.  Reading and writing at the same time causes a
 deadlock that's a mess to keep clean.  So, we went back where we started
 and created a triplestore using SQL with big tables of triples.  We cheat a
 little bit with a fourth column for ID and a fifth that helps speed up
 blank node searching.  This has helped us avoid these transactional
 problems we were having, and the performance is quite good for ingest.
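
 To make the "big table of triples" idea above concrete, here is a minimal
 sketch in Python using the standard sqlite3 module. It is not the UCSD
 implementation: the table name, the column names, and the guess that the
 fifth column is a simple flag marking rows that mention a blank node are all
 assumptions made for illustration.

```python
import sqlite3


def make_store():
    """Create an in-memory triples table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE triples (
            id        INTEGER PRIMARY KEY,         -- the extra ID column
            subject   TEXT NOT NULL,
            predicate TEXT NOT NULL,
            object    TEXT NOT NULL,
            has_bnode INTEGER NOT NULL DEFAULT 0   -- assumed blank-node flag
        )
        """
    )
    # Index the flag so blank-node lookups avoid scanning every row.
    conn.execute("CREATE INDEX idx_triples_bnode ON triples (has_bnode)")
    return conn


def add_triple(conn, s, p, o):
    """Insert one triple inside an ordinary SQL transaction."""
    has_bnode = int(s.startswith("_:") or o.startswith("_:"))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO triples (subject, predicate, object, has_bnode) "
            "VALUES (?, ?, ?, ?)",
            (s, p, o, has_bnode),
        )


def blank_node_triples(conn):
    """Return every triple that mentions a blank node, in insert order."""
    rows = conn.execute(
        "SELECT subject, predicate, object FROM triples "
        "WHERE has_bnode = 1 ORDER BY id"
    )
    return rows.fetchall()
```

 The point of the sketch is only that reads and writes go through the
 database's normal transaction handling, which is the property described
 above as missing from the triplestores that were tried.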

 Most of our searching is done by stuffing the triples into solr in a JSON
 format, so we don't rely on the backend data store for that much.  We also
 sync the SQL triples to AllegroGraph in case we need deeper SPARQL things,
 but we're thinking of shedding this from our architecture.

 Declan

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Ravi Shankar
 Sent: Tuesday, May 29, 2012 12:12 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] triple stores ???

 We (DLSS at Stanford Libraries) are planning to use a triple store for
 storing and retrieving annotations (in RDF) on digital objects. We are
 currently looking at open-source triple stores such as 4store, Virtuoso,
 Jena SDB and Mulgara. Are you currently using a triple store or
 contemplating using one? How would you evaluate 'your' triple store
 along the lines of 1) ease of setup, 2) scalability, 3) query performance,
 4) bulk load performance, 5) access API, 6) documentation and 7) community
 support?

 Highly appreciate your thoughts, ideas and suggestions.

 Thanks,
 Ravi Shankar



Re: [CODE4LIB] triple stores ???

2012-05-29 Thread Chris Fitzpatrick
Hi Ravi,

Yeah, if you haven't seen it yet, take a look at the first link
(http://www.w3.org/wiki/LargeTripleStores) in the search results that
Stefano included.

A big question is if you're going to need reasoning capabilities. If
that's the case, you'll probably want to look at the first 3 in that
list (Allegro from Franz, BigOWLIM from Ontotext, and Virtuoso from
OpenLink). These are kinda the big 3 in terms of addressing the
reasoning, large scalability, and performance issues.

If you don't really need the built-in reasoning or have huge needs in
scalability/performance, I'd really recommend having a look at
TinkerPop.
For RDF capabilities, you'll probably need to use the TinkerPop Sail
implementation (check out this blog post:
http://architects.dzone.com/articles/visualizing-rdf-schema).
TinkerPop has a pretty good community around it and Neo4j has an
excellent community. The setup is fairly easy, although the Sail stuff
does have a learning curve. Performance seems very good, and using
TinkerPop/Neo4j is great because it offers a variety of querying
options. That all said, I've found performance for triple stores can
be really hard to measure, since how you store your data and how
you're querying can make all the difference...so often it's not the
software, but the way the data model has been designed.

good luck!
b,chris.







On Tue, May 29, 2012 at 10:01 AM, Stefano Bargioni bargi...@pusc.it wrote:
 Maybe a G search can help to find comparisons:
 http://www.google.com/search?sugexp=chrome,mod=4&sourceid=chrome&ie=UTF-8&q=4store+Virtuoso+Jena+SDB++Mulgara
 The result includes your post... added 8 minutes ago.
 Stefano

 On 29/mag/2012, at 09.12, Ravi Shankar wrote:

 We (DLSS at Stanford Libraries) are planning to use a triple store for 
 storing and retrieving annotations (in RDF) on digital objects. We are 
 currently looking at open-source triple stores such as 4store, Virtuoso, 
 Jena SDB and Mulgara. Are you currently using a triple store or 
 contemplating using one? How would you evaluate 'your' triple store along 
 the lines of 1) ease of setup, 2) scalability, 3) query performance, 4) bulk 
 load performance, 5) access API, 6) documentation and 7) community support?

 Highly appreciate your thoughts, ideas and suggestions.

 Thanks,
 Ravi Shankar





Re: [CODE4LIB] Bootstrap vs Foundation

2012-05-18 Thread Chris Fitzpatrick
just to mention, I don't think Less works with jruby, so if you use
Bootstrap, you have to use the static assets and can't use the
generators...




On Fri, May 11, 2012 at 4:06 PM, Shaun Ellis sha...@princeton.edu wrote:
 I have not used Foundation, but from what I can see, it offers a subset of
 the features that you get with Bootstrap.  I suppose that's what they mean
 by light framework.  The idea that it is designed to be overridden is a
 bit of a strange claim as I don't see how it's any different from overriding
 any other base stylesheet. I've been overriding styles in Bootstrap simply
 by creating an override.css file from the beginning.

 We are currently in the last stages of the prototype phase for our Finding
 Aids site and will be going into beta soon.  It currently looks like a
 Bootstrap site, hence the samification that the List Apart article
 mentions, and I will soon need to Princeton-ify it (aka tiger style).

 I think that the transition to a custom site that stands out from other
 Bootstrap sites is not particularly easy if you've been using Bootstrap out
 of the box and overriding it like I've been doing.  This is because there
 are standard/shared colors and styles that are set as variables in Less.
  It's a lot more laborious to go through and override these manually than
 simply change the variables in Less.

 If you are interested in using Bootstrap, I would recommend designing a
 style guide (or UI pattern library, as Matthew called it) for your own
 institution and building it with Less, which is my next step.  This guide
 will provide me and my colleagues custom variations on components, but I
 plan to maintain the architecture of the Bootstrap site.  I just love how
 organized it is, and how easy it is to simply copy code from the examples.

 Furthermore, it will be easier to keep such a style guide in sync with
 future Bootstrap versions.  I'm currently putting off upgrading to Bootstrap
 2.0 because they changed the default grid and I didn't start the project
 using Less.  Finally, other developers at your institution can use the same
 custom guide as easily as they would the Bootstrap site for grabbing and
 quickly implementing their design conventions.

 I don't regret not using Less out of the gate since it was pretty foreign to
 me at the time, and I really just wanted to get going quickly with
 prototyping the architecture.

 Cheers,
 Shaun


 On 5/11/12 9:27 AM, Joseph Gilbert wrote:

 Hi Jessie,

 I've used Bootstrap more than Foundation, but both are solid choices.
 There are some relatively minor differences: Bootstrap uses LESS while
 Foundation is CSS with an officially supported SASS version; Bootstrap
 has a few more JS widgets thrown in.

 One philosophical distinction seems to lie in the "it's designed to be
 overridden" line in the article Tom mentions.  Bootstrap looks good
 right out of the box, but the underlying styles are also a bit more
 complex and therefore sometimes require a little more effort to tweak.
  Bootstrap out-of-the-box and without customizations--a bit like
 jQueryUI before it--is already starting to seem hackneyed, but
 assuming you all will be doing institutional customizations, either
 library, I think, will give you a good starting point.

 Best,
 Joe


 --
 Joseph Gilbert
 User Experience Web Developer
 University of Virginia Library


 On Fri, May 11, 2012 at 7:01 AM, Tom Keays tomke...@gmail.com wrote:

 I read this awhile back. It's by someone associated with the
 Foundation project.

  http://www.alistapart.com/articles/dive-into-responsive-prototyping-with-foundation/
 Both look good. Like you, I looked hard at Bootstrap after the
 conference, but haven't really done anything with it. I'd be
 interested which framework you settle on.



 On Thu, May 10, 2012 at 7:17 PM, Jessie Keck jk...@stanford.edu wrote:

 Hi all,
 We are about to develop a set of style guides and templates for our
 locally developed applications that will have a unified look and feel.  One
 manifestation of this will be a ruby gem that we will use for all of our
 rails apps (including Blacklight and Hydra applications).

 As we were discussing the approaches we may take for this, the question
 of basing our designs on a library such as Bootstrap or Foundation came up.
  I have heard a lot about Bootstrap in the C4L community, but very little
 about Foundation.  Does anybody here have extensive experience w/ both
 libraries and would recommend one over the other?

 We are already leaning towards Bootstrap as many in the Blacklight and
 Hydra communities have expressed interest or are using it already.  Also,
 some folks locally who have used or investigated both libraries have had
 positive experiences in either case.

 Understanding that this may boil down to a simple matter of taste, I
 wonder what opinions you all have.

 Thank you,
 - Jessie Keck
 Stanford University


 --
 Shaun D. Ellis
 Digital Library Interface Developer
 Firestone Library, 

Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?

2012-03-29 Thread Chris Fitzpatrick
Hi,

I've deployed Blacklight on both Heroku and Elastic BeanStalk.

Heroku is still a much better choice. The only issue I had was I
needed to make sure the sass-rails gem is installed in the :production
gem group and not just development.

I still have an issue of getting Heroku to compile all my
sass/coffeescript/etc assets on update, but it actually doesn't seem
to make much of an impact on performance. The minor issue is that it
would be nice to figure out a way to slim down BL's slug size. The
lowest I've been able to get it is about 30mb and Heroku recommends
having it be below 25mb.

I have not used Heroku's solr service (I still use EC2 for my solr
deployments).
EngineYard would also be another option.

There is also an AMI for DSpace, so deploying that to EC2 should be
pretty easy.

b,chris.



On Thu, Mar 29, 2012 at 3:55 PM, Rosalyn Metz rosalynm...@gmail.com wrote:
 Erik,

 I haven't tried it (recently) on PaaS providers, but I have on IaaS.  The
 AMIs I've created in association with start up scripts (if you're
 interested in seeing those let me know, I'd have to look for them somewhere
 or other) mean that the application automagically starts up on its own, all
 you need to do is go to the URL.  I've used this as a back up method in the
 past and I think would be a great way for people to be able to play with
 the different apps before committing.

 To this end, I created an AMI for Blacklight a while back:
 http://www.rosalynmetz.com/ami-3c10f255/  I guarantee you it is grossly out
 of date.  I also have instructions on creating an EBS backed AMI:
 http://rosalynmetz.com/ideas/2011/04/14/creating-an-ebs-backed-ami/ which
 is the method I used for creating the Blacklight AMI. These instructions
 are also fairly old, but I still get comments on my blog now and then that
 the method works.

 I also played around with it on Heroku, but that was so long ago I don't
 think any of the things I learned still apply (this was when Heroku was
 fairly new to the scene).  Hope some of this helps.

 Rosalyn



 On Thu, Mar 29, 2012 at 8:34 AM, Seth van Hooland svhoo...@ulb.ac.be wrote:

 Dear Erik,

 Bram Wiercx and myself have given a talk on how to put together a package
 to install CollectiveAccess on Red Hat's OpenShift:
 http://www.dish2011.nl/sessions/open-source-software-platform-collectiveacces-as-a-service-solution
 .

 My students are currently happily playing around with CollectiveAccess,
 which they have installed on OpenShift. My teaching assistant Max De Wilde
 has developed clear guidelines on how to run the installation procedure:
 http://homepages.ulb.ac.be/~svhoolan/redhat_ca_install.pdf.

 It would be wonderful to aggregate these kinds of installation procedures
 for other types of LIS applications...

 Kind regards and looking forward to your book!

 Seth van Hooland
 Président du Master en Sciences et Technologies de l'Information et de la
 Communication (MaSTIC)
 Université Libre de Bruxelles
 Av. F.D. Roosevelt, 50 CP 123  | 1050 Bruxelles
 http://homepages.ulb.ac.be/~svhoolan/
 http://twitter.com/#!/sethvanhooland
 http://mastic.ulb.ac.be
 0032 2 650 4765
 Office: DC11.113

 Le 29 mars 2012 à 14:10, Erik Mitchell a écrit :

  Hi all,
 
  I have been toying with the process of implementing common LIS
  applications (e.g. Vufind, Dspace, Blacklight. .  .) on PaaS providers
  like Heroku and Amazon Elastic Beanstalk.  I have just tried out of
  the box distributions so far and have not made much progress but was
  wondering if someone else had tried this or had ideas about what
  issues I might run into.
 
  Thanks,
 
  Erik
 
  Erik Mitchell
  Assistant Professor
  College of Information Studies
  University of Maryland, College Park
  http://ischool.umd.edu



Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?

2012-03-29 Thread Chris Fitzpatrick
Hey Sean,

Jah, I did that...my .slugignore is:
tmp/*
log/*
coverage/*
spec/*
koha/*
jetty/*

That dropped it down to 30 from ~50mb, so that's good.
(koha has some scripts I wrote to pull from our ILS).

I think the slug size is a really minor issue. Heroku says under 25mb
is good, but over 50mb is not so good.  Not Good,  but not Chaotic
Evil . Neutral Good.
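
For anyone wanting to see how much a .slugignore like the one above might save before pushing, here is a rough Python sketch. It is not Heroku's own logic: the matching uses Python's fnmatch as a stand-in for whatever rules Heroku actually applies, and the helper names load_patterns and estimate_size are made up for illustration.

```python
import fnmatch
import os


def load_patterns(text):
    """Return the non-empty, non-comment lines of a .slugignore-style file."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def estimate_size(root, patterns):
    """Sum file sizes (in bytes) under root, skipping paths that match a pattern.

    Assumption: patterns are matched against the path relative to root with
    fnmatch, where '*' also matches '/' (so 'tmp/*' covers nested files).
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if any(fnmatch.fnmatch(rel, pat) for pat in patterns):
                continue
            total += os.path.getsize(full)
    return total
```

Comparing estimate_size(".", []) with estimate_size(".", patterns) gives a ballpark for what the ignore file removes. The real slug also contains installed gems and is compressed, so the number will not match what Heroku reports.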



On Thu, Mar 29, 2012 at 6:26 PM, Sean Hannan shan...@jhu.edu wrote:
 If you already have everything indexed in Solr elsewhere, a way to cut down
 the BL slug size is to remove/ignore the SolrMarc.jar. It's pretty sizable.

 -Sean


 On 3/29/12 12:16 PM, Chris Fitzpatrick chrisfitz...@gmail.com wrote:

 Hi,

 I've deployed Blacklight on both Heroku and Elastic BeanStalk.

 Heroku is still a much better choice. The only issue I had was I
 needed to make sure the sass-rails gem is installed in the :production
 gem group and not just development.

  I still have an issue of getting heroku to compile all my
 sass/coffeescript/etc assets on update, but it actually doesn't seem
 to make much of an impact on performance. The minor issue is that it
 would be nice to figure out a way to slim down BL's slug size. The
 lowest I've been able to get it is about 30mb and Heroku recommends
 having it be below 25mb.

 I have not used Heroku's solr service (I still use EC2 for my solr
 deployments).
 EngineYard would also be another option.

 There is also an AMI for DSpace, so deploying that to EC2 should be
 pretty easy

 b,chris.



 On Thu, Mar 29, 2012 at 3:55 PM, Rosalyn Metz rosalynm...@gmail.com wrote:
 Erik,

 I haven't tried it (recently) on PaaS providers, but I have on IaaS.  The
 AMIs I've created in association with start up scripts (if you're
 interested in seeing those let me know, I'd have to look for them somewhere
 or other) mean that the application automagically starts up on its own, all
 you need to do is go to the URL.  I've used this as a back up method in the
 past and I think would be a great way for people to be able to play with
 the different apps before committing.

 To this end, I created an AMI for Blacklight a while back:
 http://www.rosalynmetz.com/ami-3c10f255/  I guarantee you it is grossly out
 of date.  I also have instructions on creating an EBS backed AMI:
 http://rosalynmetz.com/ideas/2011/04/14/creating-an-ebs-backed-ami/ which
 is the method I used for creating the Blacklight AMI. These instructions
 are also fairly old, but I still get comments on my blog now and then that
 the method works.

 I also played around with it on Heroku, but that was so long ago I don't
 think any of the things I learned still apply (this was when Heroku was
 fairly new to the scene).  Hope some of this helps.

 Rosalyn



 On Thu, Mar 29, 2012 at 8:34 AM, Seth van Hooland svhoo...@ulb.ac.be wrote:

 Dear Erik,

 Bram Wiercx and myself have given a talk on how to put together a package
 to install CollectiveAccess on Red Hat's OpenShift:
 http://www.dish2011.nl/sessions/open-source-software-platform-collectiveacces-as-a-service-solution
 .

 My students are currently happily playing around with CollectiveAccess,
 which they have installed on OpenShift. My teaching assistant Max De Wilde
 has developed clear guidelines on how to run the installation procedure:
 http://homepages.ulb.ac.be/~svhoolan/redhat_ca_install.pdf.

 It would be wonderful to aggregate these kinds of installation procedures
 for other types of LIS applications...

 Kind regards and looking forward to your book!

 Seth van Hooland
 Président du Master en Sciences et Technologies de l'Information et de la
 Communication (MaSTIC)
 Université Libre de Bruxelles
 Av. F.D. Roosevelt, 50 CP 123  | 1050 Bruxelles
 http://homepages.ulb.ac.be/~svhoolan/
 http://twitter.com/#!/sethvanhooland
 http://mastic.ulb.ac.be
 0032 2 650 4765
 Office: DC11.113

 Le 29 mars 2012 à 14:10, Erik Mitchell a écrit :

 Hi all,

 I have been toying with the process of implementing common LIS
 applications (e.g. Vufind, Dspace, Blacklight. .  .) on PaaS providers
 like Heroku and Amazon Elastic Beanstalk.  I have just tried out of
 the box distributions so far and have not made much progress but was
 wondering if someone else had tried this or had ideas about what
 issues I might run into.

 Thanks,

 Erik

 Erik Mitchell
 Assistant Professor
 College of Information Studies
 University of Maryland, College Park
 http://ischool.umd.edu



Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?

2012-03-29 Thread Chris Fitzpatrick
Hey Erik,

I used this AMI for solr =
http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/

Note : You will have to change the schema and solrconfig files on the image...

b,chris.


On Thu, Mar 29, 2012 at 9:44 PM, Erik Mitchell mitch...@gmail.com wrote:
 Neat!  Thanks Mark,

 Erik

 On Thu, Mar 29, 2012 at 2:19 PM, Mark A. Matienzo m...@matienzo.org wrote:
 Like Chris, I've deployed Blacklight on Heroku, and this thread
 (particularly Rosalyn's message) has gotten me to write up a quick
 HOWTO on the Blacklight wiki [0].

 For Solr hosting I've used both a VM that I run (on Slicehost) and EC2.

 Mark

  [0] 
 https://github.com/projectblacklight/blacklight/wiki/Blacklight-on-Heroku


Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?

2012-03-29 Thread Chris Fitzpatrick
Oh, yeah...Mark's wiki page also points out that you'll need to change
to postgres for production. Very important.

From what I understand, the compiled slug is the app + all its
dependencies. So, yeah,  I guess since the gem does have the
SolrMarc.jar in there, it would have the jar in the slug somewhere.

However, I did just see a page on heroku that says:
Generally speaking any slug under 15MB is small and nimble; 30MB is
average; and 50MB or above is weighty.

So maybe I was being slightly over sensitive to my slug's size? I was
just worried all the other slugs would call it names.

I am going to try and figure out why my pipeline assets aren't
compiling when I push. The solution is probably just to compile them
locally and push them rather than rely on Heroku to precompile them
(currently when I push, Heroku's precompile fails, so it reverts to
compile at runtime mode). If anyone has insight into this, please
lemme know...I believe having them compile at runtime does slow down
the application...

b,chris.

On Thu, Mar 29, 2012 at 10:49 PM, Chris Fitzpatrick
chrisfitz...@gmail.com wrote:
 Hey Erik,

 I used this AMI for solr =
 http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/

 Note : You will have to change the schema and solrconfig files on the image...

 b,chris.


 On Thu, Mar 29, 2012 at 9:44 PM, Erik Mitchell mitch...@gmail.com wrote:
 Neat!  Thanks Mark,

 Erik

 On Thu, Mar 29, 2012 at 2:19 PM, Mark A. Matienzo m...@matienzo.org wrote:
 Like Chris, I've deployed Blacklight on Heroku, and this thread
 (particularly Rosalyn's message) has gotten me to write up a quick
 HOWTO on the Blacklight wiki [0].

 For Solr hosting I've used both a VM that I run (on Slicehost) and EC2.

 Mark

  [0] 
 https://github.com/projectblacklight/blacklight/wiki/Blacklight-on-Heroku


Re: [CODE4LIB] Job: Web Developer Ninja at Springshare

2012-03-21 Thread Chris Fitzpatrick
I figured it was in Paris since that's where all the ninjas seem to be
 these days.

On Wed, Mar 21, 2012 at 4:46 PM, Lisa H Kurt lk...@unr.edu wrote:
 Cary,

 It looks like this is a telecommuting job- location would be anywhere:

 * Working from home (yes, you heard it right, though slackers need not
 apply - see the point above about needing to be a self-starter and
 self-motivator)




 On 3/21/12 6:49 AM, Cary Gordon listu...@chillco.com wrote:

It would be great if job listings could include location, particularly
where the work is to be performed onsite.

Thanks,

Cary

On Tue, Mar 20, 2012 at 2:02 PM,  j...@code4lib.org wrote:
 Howdy, code4lib-ers! Springshare (http://springshare.com) is looking for web
 developers with mad skills and thirst for innovation. We create web tools
 that libraries love, and we need your help to carry out our mission of
 creating awesome web software and providing even awesome-r service to our
 libraries.


 This is what we'd need from you:

  * LAMP skills of the ninja caliber, including:
    * 3+ years PHP / MySQL experience
    * Unix / Apache skills
  * Experience in scaling web infrastructure
  * Front-end JS programming experience (e.g. jQuery or dojo)
  * Bonus: worked with Nginx, Mobile tech, or Solr? Experience with any
of these is a plus. Worked with all three? Where have you been all our
lives??
  * You need to be a self-starter and self-motivating type. We work in a
typical startup fashion so you'll be wearing many hats and doing a lot
of things - at once - hence having great organizational and multitasking
skills is essential
 In a typical week, you'll:

  * Create front- and back-end interfaces for new or existing products,
letting your creative juices run free
  * Work with our partners (other library-centric companies) to
integrate their tools with Springshare and vice versa
  * Dream up new ideas that will rock the library (software) world
  * Every one of us (including our CEO himself) also helps with support and
making sure our customers' needs are taken care of, so you'll be talking
with our customers regularly, troubleshooting bug fixes and such
 We offer:

  * Great pay and benefits (health, dental, 401K, etc.)
  * Very flexible vacations/time off policy
  * Working from home (yes, you heard it right, though slackers need not
apply - see the point above about needing to be a self-starter and
self-motivator)
  * A very supportive, library-centric environment (half of our team is
librarians).
 If this sounds like your dream gig, please send your resume to
 sa...@springshare.com and let us know what makes you awesome.



 Brought to you by code4lib jobs: http://jobs.code4lib.org/job/864/



--
Cary Gordon
The Cherry Hill Company
http://chillco.com


Re: [CODE4LIB] neo4j

2012-02-13 Thread Chris Fitzpatrick
Hey Kent,

Awesome. thanks for the info. So, using gremlin, are you using some of
the other Tinkerpop technologies?

And, haha, in researching stuff this weekend, I actually saw an email
you sent to the neo4j google group about the lucene boosting issue…

I started playing around with RDF.rb , and was really impressed,
although using that doesn't give you all the stuff tinkerpop does.

b,chris.

On Sat, Feb 11, 2012 at 12:32 AM, Kent Fitch kent.fi...@gmail.com wrote:
 Hi,

 AustLit ( http://www.austlit.edu.au ) is in the early stages of a
 migration from javaServlets/xslt/oracle to java/neo4j/gremlin.  The
 web version of AustLit was developed in 2000 based on FRBR with a
 strong emphasis on events realised with a topic map model, so the sql
 implementation is close to a triple-store.  More information on the
 details are here: http://www.austlit.edu.au/about ,
 http://www.austlit.edu.au/about/metadata and
 http://www.austlit.edu.au:/DataModel/index.html (ALEG was the
 working name for AustLit redevelopment in 2000).

 Last year a decision was taken to move AustLit from a subscription
 service to open access, and from updates being performed solely by
 dedicated bibliographers and researchers (members of various AustLit
 teams distributed across Australia) to include community
 contributions, so rather than work these changes into a 12 year old
 system, it was decided to start afresh with an approach which would
 more naturally support the AustLit data model.

 So, we experimented with Neo4j, and were impressed with its
 performance.  For example, loading our current data from Oracle into
 an empty neo4j database takes about 30 minutes (using a
 run-of-the-mill 3 year-old server), producing a graph of 14m nodes and
 20m relationships.  Performing custom indexing of this data using the
 built-in Lucene integration takes about 2.5 hours, but that's a
 function of the extensive indexing we're performing.

 As you'd probably expect, we do have some issues we're working
 through, such as

 - integration with Lucene is abstracted by the neo4j index
 interface, so it is difficult or impossible to use some native Lucene
 features.  For example, boosting index nodes based on their inherent
 importance and using this boost in lucene to determine relevance
 cannot be done.

 - our data model is complex, and added to the requirements to version
 every node and relationship (ie, record changes, allow rollback), our
 graph traversals are correspondingly complex, but I suspect as we
 become more familiar with graph traversal idioms in gremlin and cypher,
 they'll become as normal as sql

 But so far, neo4j seems fast and robust, and we're optimistic!

 Kent Fitch

 On Sat, Feb 11, 2012 at 9:42 AM, Chris Fitzpatrick
 chrisfitz...@gmail.com wrote:
 Hej hej,

 Is anyone using neo4j in their library projects?

 If the answer is ja, I would be very interested in hearing how it's going.
 How are you using it?
 Is it something that is in production and is adding value or is it
 more a skunkworks-type effort?
 What languages are you using? Are you using an ORM (like Rails or Django)?

 I would also be really interested in hearing thoughts, stories, and
 opinions about the idea of using a graph db or triple store in their
 stack.

 tack!

 b, fitz.


[CODE4LIB] neo4j

2012-02-10 Thread Chris Fitzpatrick
Hej hej,

Is anyone using neo4j in their library projects?

If the answer is ja, I would be very interested in hearing how it's going.
How are you using it?
Is it something that is in production and is adding value or is it
more a skunkworks-type effort?
What languages are you using? Are you using an ORM (like Rails or Django)?

I would also be really interested in hearing thoughts, stories, and
opinions about the idea of using a graph db or triple store in their
stack.

tack!

b, fitz.


Re: [CODE4LIB] Metadata war stories...

2012-01-25 Thread Chris Fitzpatrick
I was part of a particularly long siege during the METS offensive back in '08. 
It was brutal. We pretty much ran out of everything and were fighting 
hand-to-hand before the whole thing was over.

I remember toward the end, while out on requirement gathering patrol, my team 
came up on a group of rogue library staff who had separated from their 
cataloging unit. They were just sitting there, literally a few feet away, 
taking a chow break. We were heavily outnumbered and out-gunned, but it was a 
dark night, so I hoped we could just lie low and let them pass. But they 
started talking about how they were plotting a move to take out our dmdSec with 
some kind of RDF improvised explosive device. I knew this would set us back 
months and would result in a great loss of many of my fellow developers and 
librarians. So, I ordered my team into action…since we had surprise on our 
side, we were able to even the numbers by taking out several of their squad. 
Their manager ordered them to fall back and they retreated up a hill. Several of 
my team started whooping and hollering like we'd won something, but I knew they 
were just regrouping to hit back at us.

And, boy, did they ever hit back. We had a prolonged shoot out.  I knew the 
longer this went, the more likely they'd be able to call in reinforcements or 
possibly get us with a Faculty-led napalm strike. So, I made the quick 
decision to charge their position. We bounced up the hill, taking cover behind 
trees, rocks, corpses, and whatever we could. We took heavy fire, but we got to 
the top. And that's when all hell broke loose. 

I've killed my fair share of people. In combat, you just learn to live with 
that. But there's something about strangling someone with your bare hands that 
just leaves a lasting impression. What happened on that hill comes back to me 
like nothing else. The screams and the faces and the smell. I talked to that 
doc and went to some ALA conferences, but whiskey seems to be the only thing 
that helps. 

They say we won that war, but most of the time I'm not sure we did….war's not 
over for me. It's never over. 



On Jan 25, 2012, at 10:13 AM, Kyle Banerjee wrote:

 For our preconference, “Digging into Metadata,” we’d like to get a little
 discussion going to build on once the preconference rolls around.
 
 ...
 - Dealing with free text in MARC records and how to parse them w/o too much
 heartache
 
 
 You can find horrendous stories even with data that's fully structured.
 Multiple libraries have had call numbers not migrated (or the wrong one
 migrated due to the unfortunate practice of most libraries to retain
 multiple call numbers) during an ILS migration -- as you can imagine, that
 would make books much harder to find on the shelves. I can't remember the
 names of institutions this happened to, but you could probably find someone
 who can give you precise details on the autocat list.
 
 There is the constant problem that in any migration, the data is not
 structured/used the same way in the new system as in the old -- some fields
 exist in one system but not the other, different numbers/types of fields
 are used to represent concepts, etc.
 
 I've personally encountered cases where the data that comes out of a system
 is outright invalid or gets mangled in bizarre ways by the export routine
 itself. For example, there's a system used for many digital archives that
 splits a field in two anytime a field that needs to be represented by an
 XML entity is encountered. Name withheld to protect the guilty.
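
 One plausible cause of that kind of split (an assumption; the system and its export code are not named here) is a streaming XML reader that hands text to the application in several pieces when it hits an entity such as &amp;, with the consumer treating each piece as a whole field. A minimal Python sketch of the defensive pattern, using the standard xml.sax module and a hypothetical <field> element, accumulates pieces until the closing tag:

```python
import xml.sax


class FieldCollector(xml.sax.ContentHandler):
    """Collect the text of every <field> element, one string per element."""

    def __init__(self):
        super().__init__()
        self.fields = []
        self._chunks = None  # None means we are outside a <field>

    def startElement(self, name, attrs):
        if name == "field":  # "field" is a made-up element name
            self._chunks = []

    def characters(self, content):
        # The parser may call this several times for one element,
        # so accumulate instead of treating each call as a full value.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "field" and self._chunks is not None:
            self.fields.append("".join(self._chunks))
            self._chunks = None


def parse_fields(xml_text):
    """Return the text of each <field> in document order."""
    handler = FieldCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.fields
```

 With this handler, a value like "Smith &amp; Sons" comes back as the single field "Smith & Sons" no matter how many pieces the parser delivers.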
 
 kyle


Re: [CODE4LIB] Linux Laptop

2011-12-14 Thread Chris Fitzpatrick
Thanks everyone for all the recommendations. I knew this would be the list to 
ask. 

Sounds like Ubuntu is the overwhelming favorite. In the past when I've used a 
linux in a non-server computer, there are always some annoying problems... 
things like the laptop not waking from sleep mode, power consumption problems, 
or the microphone not working.  

So, I'm wondering about specific laptop brands/models and linux 
distributions/versions that people have found to work really well. A Dell or 
ThinkPad with Ubuntu seems to be the popular choice? 

But, yeah, I know I started it, but I'm going to avoid going deeper into my 
opinions on Apple vs. Windows vs. Linux and the implications vis-à-vis 
productivity, copyright, social justice, and the plight of the polar bear. If 
only out of concern that introducing this discussion might cause the poor mail 
server at ND to melt down…..

b,chris. 


 


On Dec 14, 2011, at 9:25 AM, Dave Caroline wrote:

 You just cannot do the technical futzing easily on mac or doze, I too
 am a Ubuntu user on my desktop and servers
 getting stuff done web wise is faster that way. I expect to run the
 apache,php,mysql and replicate systems that are servers
 windows and mac screw with stupid things like case in the file system!
 
 Dave Caroline


Re: [CODE4LIB] Linux Laptop

2011-12-14 Thread Chris Fitzpatrick
Thanks! 

Evil doesn't really concern me. 

If I could run Ubuntu on a laptop made in the pits of hell by the dark lord 
himself, I would certainly do it. 

Just as long as I never have to say the words Genius Bar, I will be happy. 


On Dec 14, 2011, at 11:30 AM, Birkin Diana wrote:

 Possible bridges to whatever you decide to get:
 
 - Use Ubuntu on VirtualBox on your Mac (yes, I did see the using 'linux in a 
 non-server computer' concern -- thus the bridge qualification).
 
 - Depending on the issues and languages you're using, use in-language 
 package-managers. For example, we have a library of python repository code 
 with three python-package dependencies: fcrepo, lxml, and solrpy. Using 'pip' 
 (a python package-manager), we've created a simple requirements file, so 
 that, when I run, on my mac, a pip-install command for our repo_utils, it 
 automatically web-downloads and installs those packages if necessary (even 
 specific versions if desired). And this'll work the same on your Mac, VBox, 
 and a possible eventual linux laptop.
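
 To make the requirements-file idea concrete: such a file is plain text with one package per line, optionally pinned as name==version, and it is installed with pip install -r. The Python sketch below only reads the simplest form of that file to show its shape. The parse_requirements helper is hypothetical, the pinned version in the usage note is invented for illustration, and real requirements files allow much more (other specifiers, URLs, options) than this handles.

```python
def parse_requirements(text):
    """Parse simple 'name' or 'name==version' lines.

    Blank lines and '#' comments are ignored. Anything fancier than an
    exact '==' pin is out of scope for this sketch.
    """
    parsed = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, _, version = line.partition("==")
        parsed.append((name.strip(), version.strip() or None))
    return parsed
```

 Fed the text "fcrepo\nlxml==2.3\nsolrpy\n", this yields the three package names from the message above, with only lxml carrying a pin.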
 
 -Birkin
 
 ---
 Birkin James Diana
 Programmer, Digital Technologies
 Brown University Library
 birkin_di...@brown.edu
 
 
 On Dec 14, 2011, at 12:54 PM, Chris Fitzpatrick wrote:
 
 Thanks everyone for all the recommendations. I knew this would be the list 
 to ask. 
 
 Sounds like Ubuntu is the overwhelming favorite. In the past when I've used 
 a linux in a non-server computer, there are always some annoying problems... 
 things like the laptop not waking from sleep mode, power consumption 
 problems, or the microphone not working.  
 
 So, I'm wondering about specific laptop brands/models and linux 
 distributions/versions that people have found to work really well. A Dell or 
 ThinkPad with Ubuntu seems to be the popular choice? 
 
 But, yeah, I know I started it, but I'm going to avoid going deeper into my 
 opinions on Apple vs. Windows vs. Linux and the implications vis-à-vis 
 productivity, copyright, social justice, and the plight of the polar bear. 
 If only out of concern that introducing this discussion might cause the poor 
 mail server at ND to melt down…..
 
 b,chris. 


Re: [CODE4LIB] Voting is open for code4lib 2012 presentations.

2011-11-22 Thread Chris Fitzpatrick
Why do I have the feeling that Mahmoud Ahmadinejad's presentation on test 
driven development is going to get 99% of the vote?


On Nov 22, 2011, at 12:34 PM, David Uspal wrote:

 Of course, literally two seconds after sending my last email, my vote finally 
 goes through...
 
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Tom 
 Keays
 Sent: Tuesday, November 22, 2011 3:19 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Voting is open for code4lib 2012 presentations.
 
 Mine are being remembered from this morning when I filled it out at home.
 I'm now on a different network/OS/browser.
 
 Tom
 
 On Tue, Nov 22, 2011 at 2:22 PM, Andrew Nagy asn...@gmail.com wrote:
 
 My votes are not showing after returning to the voting page.  I thought I
 remembered being able to modify my votes from previous years.  I went
 through the first 30 or so, and wanted to come back to it to go through
 more, but my votes are not persisting.  Is this a bug, a change, or a
 failure in my memory?
 
 Andrew
 
 On Tue, Nov 22, 2011 at 2:14 PM, Michael J. Giarlo 
 leftw...@alumni.rutgers.edu wrote:
 
 POWERED BY DIEBOLD
 
 
 On Tue, Nov 22, 2011 at 14:08, Michael B. Klein mbkl...@gmail.com
 wrote:
 Hmm. 404'ing for me now.
 
 On Nov 22, 2011, at 4:22 AM, Ross Singer rossfsin...@gmail.com
 wrote:
 
 Ok, the results screen should no longer be throwing an error.
 
 Vote early, vote often,
 -Ross.
 
 On Tue, Nov 22, 2011 at 6:57 AM, Ross Singer rossfsin...@gmail.com
 wrote:
 Mark, I'm only getting that for the results page.  Are you getting it
 somewhere else?
 
 I'll fix the results page as soon as I can.
 
 -Ross.
 
 On Monday, November 21, 2011, Mark Diggory mdigg...@atmire.com
 wrote:
 The ever popular...Internal Server Error
 On Mon, Nov 21, 2011 at 7:34 PM, Anjanette Young
 youn...@u.washington.edu wrote:
 
 Voting for code4lib 2012 talks is now open.
 
 Voting will close at 5pm (PST) on December 9, 2011.
 
 Presentation criteria to keep in mind
 
 
   - Usefulness
   - Newness
   - Geekiness
   - Diversity of topics
 
 http://vote.code4lib.org/election/21 -- You will need your
 code4lib.org login in order to vote. If you do not have one you can
 create one at
 http://code4lib.org/
 
 Presentation proposal descriptions can be found on the wiki
 
 http://wiki.code4lib.org/index.php/2012_talks_proposals
 
 Thank you to Ross Singer for keying in all 72 proposals!
 
 --Anjanette
 
 --
 You received this message because you are subscribed to the Google
 Groups
 code4libcon group.
 To post to this group, send email to code4lib...@googlegroups.com.
 To unsubscribe from this group, send email to
 code4libcon+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/code4libcon?hl=en.
 
 
 
 
 --
 [image: @mire Inc.]
 *Mark Diggory*
 *2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010*
 *Esperantolaan 4, Heverlee 3001, Belgium*
 http://www.atmire.com
 
 
 
 


Re: [CODE4LIB] We are now on a wait list... DON'T PANIC!!!

2011-11-16 Thread Chris Fitzpatrick
I think that we should start over. 


On Nov 16, 2011, at 9:28 AM, Frumkin, Jeremy wrote:

 Hi Elizabeth - 
 
 The message you sent is confusing - could you clarify what you mean by
 there are a multitude of reasons why you will be contacted and be able to
 get into code4lib?
 
 -- jaf
 
 
 Jeremy Frumkin
 Assistant Dean / Chief Technology Strategist
 University of Arizona Libraries
 
 +1 520.626.7296
 frumk...@u.library.arizona.edu
 
 Hanlon's Razor: Never attribute to malice that which is adequately
 explained by stupidity
 
 
 On 11/16/11 10:01 AM, Elizabeth Duell edu...@uoregon.edu wrote:
 
 DO NOT PANIC!
 
 Please DO REGISTER for Code4Lib... you will not be directed to
 the payment window, but you WILL be put on the wait list.
 
 What good does that do you?
 
 There are a multitude of reasons why you will be contacted and be
 able to get into Code4Lib National.
 
 Continue to register and remember...
 
 Keep Calm and Code on!
 
 
 
 Elizabeth Duell
 Orbis Cascade Alliance
 edu...@uoregon.edu
 (541) 346-1883
 
 -- 
 You received this message because you are subscribed to the Google Groups
 code4libcon group.
 To post to this group, send email to code4lib...@googlegroups.com.
 To unsubscribe from this group, send email to
 code4libcon+unsubscr...@googlegroups.com.
 For more options, visit this group at
 http://groups.google.com/group/code4libcon?hl=en.
 


Re: [CODE4LIB] Hotel registration - This was a test, right?

2011-11-16 Thread Chris Fitzpatrick
UNFOLLOW

On Nov 16, 2011, at 2:36 PM, Joe Atzberger wrote:

 The site you are trying to access does not exist. Please contact the event
 organizer to report this problem.


Re: [CODE4LIB] two open positions at Stanford

2011-10-13 Thread Chris Fitzpatrick
It's days like today I wish I was Amish. 


On Oct 13, 2011, at 2:16 PM, Ross Singer wrote:

 On Thu, Oct 13, 2011 at 4:21 PM, Blake, Tom tbl...@bpl.org wrote:
 ...or, you could take advantage of our extended application deadline and 
 reconsider one of the two developer positions open at the Boston Public 
 Library. Salaries are more in line with municipal government than Silicon 
 Valley - but benefits and job security are too. You'd be one of the 
 brightest stars on our team of several whip smart librarians, and you really 
 cannot beat the variety of weather out here. Please feel free to contact me if 
 you have any questions or want to discuss the positions - posted and 
 Advanced Searchable under Job Opening ID #341652 (Web Developer) and #341650 
 (Repository Developer) here:
 
 www.cityofboston.gov/OHR/careercenter.asp
 
 Hmm, I think we really need to run this by OCLC first.
 
 -Ross.
 
 
 
 Thomas Blake
 Digital Projects Manager
 Boston Public Library
 700 Boylston St.
 Boston, MA 02116
 617 859-2039
 http://www.bpl.org/online/
 Free To All
 
 
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Bess 
 Sadler
 Sent: Wednesday, October 05, 2011 4:23 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] two open positions at Stanford
 
 We are looking for two software developers to work on a four year grant 
 funded digital library project. The department, Digital Library Systems and 
 Services (DLSS) is part of the Stanford Library, and it's a great place to 
 work. Salaries are more in line with Silicon Valley than with academia, 
 you'd be part of a team of whip smart library programmers, and you really 
 cannot beat the weather out here. Please feel free to contact me if you have 
 any questions or want to discuss the positions.
 
 To see more details on any of the positions, search for the Job ID at 
 http://jobs.stanford.edu/find_a_job.html.
 
 Bess Sadler
 Manager for Application Development, Digital Library Systems and Services
 
 p.s. Resistance is futile. ;)
 
 
 Feel free to browse other great jobs at 
 http://jobs.stanford.edu/find_a_job.html
 Digital Library Software Engineer, Stanford University Libraries
 
 Job ID
  43757
 Job Location
  University Libraries
 Job Category
  Library
 Salary
  4P3
 Date Posted
  Jul 29, 2011
 
 Working Title: Revs Infrastructure Developer
 Job Classification: System Software Developer
 
 This position is double posted at the 4P3 and 4P4 levels.
 
 Job Objective:
 
 Stanford University Libraries is seeking a talented software engineer to 
 support the digitization, collection delivery and collaboration components 
 of the Revs Program at Stanford. This is a four-year, grant-funded position.
 
 This position is part of the Revs Program at Stanford 
 (http://revs.stanford.edu). This program is dedicated to developing an 
 understanding of the impact of the automobile on society, culture and 
 technology. The Revs Program at Stanford was founded to inspire a new 
 trans-disciplinary field connecting the past, present and future of the 
 automobile. The Revs Program fosters an intellectual community bridging the 
 humanities and fine arts, social sciences, design, science and engineering, 
 and the professions. As a part of that effort, SULAIR will support the 
 dissemination of scholarly research on the automobile; provide digital 
 access to a collection of over two million items relating to automotive 
 history, racing and technology; and, develop a system and service to develop 
 and sustain an online automotive community. Members of the team will play a 
 role in building the world's leading center on the study of the impact of 
 the automobile on the 20th and 21st century.
 
 The Repository Developer will primarily develop digital library software to 
 enable management, preservation, and online discovery of Revs materials. 
 This will involve deployment of a new repository and web application using 
 the Hydra technology stack (http://projecthydra.org). The Repository 
 Developer will be a core contributor to the open source Hydra project in the 
 process of building the Revs digital repository.
 
 The Repository Developer will be a member of a core team dedicated to the 
 successful completion of this project, and will work closely with the 
 project manager, Revs web developer, the information architect, digital 
 library infrastructure developers, the user experience designer and other 
 developers involved in digital library initiatives. This particular project 
 is highly collaborative, and will involve interactions with developers, 
 scholars and staff across Stanford and from other institutions. As a member 
 of SULAIR's digital library application development team, the Repository 
 Developer will contribute to the overall development of the Stanford 
 Library's web and digital library infrastructure, and help plan, specify, 
 and build the technologies needed to support the University's goal of 

Re: [CODE4LIB] two open positions at Stanford

2011-10-13 Thread Chris Fitzpatrick
Alas, I can't grow one, which is why the Amish won't have me. 

This is the life of a grammy-winning teen pop superstar. 

On Oct 13, 2011, at 3:04 PM, Frumkin, Jeremy wrote:

 Is that the official OCLC policy?
 
 -- jaf
 
 
 Jeremy Frumkin
 Assistant Dean / Chief Technology Strategist
 University of Arizona Libraries
 
 +1 520.626.7296
 frumk...@u.library.arizona.edu
 
 I dream of a world where chickens can cross the road without having their
 motives questioned
 
 
 
 
 
 
 
 
 
 
 
 
 
 On 10/13/11 3:02 PM, Roy Tennant roytenn...@gmail.com wrote:
 
 Ross, hold him down, would you, while I cut his beard off?
 Roy
 
 On Thu, Oct 13, 2011 at 3:00 PM, Chris Fitzpatrick cf...@stanford.edu
 wrote:
 It's days like today I wish I was Amish.
 
 
 On Oct 13, 2011, at 2:16 PM, Ross Singer wrote:
 
 On Thu, Oct 13, 2011 at 4:21 PM, Blake, Tom tbl...@bpl.org wrote:
 ...or, you could take advantage of our extended application deadline
 and reconsider one of the two developer positions open at the Boston
 Public Library. Salaries are more in line with municipal government
 than Silicon Valley - but benefits and job security are too. You'd be
 one of the brightest stars on our team of several whip smart
 librarians, and you really cannot beat the variety of weather out here.
 Please feel free to contact me if you have any questions or want to
 discuss the positions - posted and Advanced Searchable under Job
 Opening ID #341652 (Web Developer) and #341650 (Repository Developer)
 here:
 
 www.cityofboston.gov/OHR/careercenter.asp
 
 Hmm, I think we really need to run this by OCLC first.
 
 -Ross.
 
 
 
 Thomas Blake
 Digital Projects Manager
 Boston Public Library
 700 Boylston St.
 Boston, MA 02116
 617 859-2039
 http://www.bpl.org/online/
 Free To All
 
 
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
 Of Bess Sadler
 Sent: Wednesday, October 05, 2011 4:23 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] two open positions at Stanford
 
 We are looking for two software developers to work on a four year
 grant funded digital library project. The department, Digital Library
 Systems and Services (DLSS) is part of the Stanford Library, and it's
 a great place to work. Salaries are more in line with Silicon Valley
 than with academia, you'd be part of a team of whip smart library
 programmers, and you really cannot beat the weather out here. Please
 feel free to contact me if you have any questions or want to discuss
 the positions.
 
 To see more details on any of the positions, search for the Job ID at
 http://jobs.stanford.edu/find_a_job.html.
 
 Bess Sadler
 Manager for Application Development, Digital Library Systems and
 Services
 
 p.s. Resistance is futile. ;)
 
 
 Feel free to browse other great jobs at
 http://jobs.stanford.edu/find_a_job.html
 Digital Library Software Engineer, Stanford University Libraries
 
 Job ID
 43757
 Job Location
 University Libraries
 Job Category
 Library
 Salary
 4P3
 Date Posted
 Jul 29, 2011
 
 Working Title: Revs Infrastructure Developer
 Job Classification: System Software Developer
 
 This position is double posted at the 4P3 and 4P4 levels.
 
 Job Objective:
 
 Stanford University Libraries is seeking a talented software engineer
 to support the digitization, collection delivery and collaboration
 components of the Revs Program at Stanford. This is a four-year,
 grant-funded position.
 
 This position is part of the Revs Program at Stanford
 (http://revs.stanford.edu). This program is dedicated to developing an
 understanding of the impact of the automobile on society, culture and
 technology. The Revs Program at Stanford was founded to inspire a new
 trans-disciplinary field connecting the past, present and future of
 the automobile. The Revs Program fosters an intellectual community
 bridging the humanities and fine arts, social sciences, design,
 science and engineering, and the professions. As a part of that
 effort, SULAIR will support the dissemination of scholarly research on
 the automobile; provide digital access to a collection of over two
 million items relating to automotive history, racing and technology;
 and, develop a system and service to develop and sustain an online
 automotive community. Members of the team will play a role in building
 the world's leading center on the study of the impact of the
 automobile on the 20th and 21st century.
 
 The Repository Developer will primarily develop digital library
 software to enable management, preservation, and online discovery of
 Revs materials. This will involve deployment of a new repository and
 web application using the Hydra technology stack
 (http://projecthydra.org). The Repository Developer will be a core
 contributor to the open source Hydra project in the process of
 building the Revs digital repository.
 
 The Repository

Re: [CODE4LIB] Archivists' Toolkit, Timeouts and Hibernate

2011-10-06 Thread Chris Fitzpatrick
Hi Cindy,

I'm getting deja vu from this... we had a similar problem over a year ago. What 
happened to us is that our IST dept (who run a MySQL service) made some 
changes to the load balancer and pooling configuration on their servers.
You might be running into a similar problem. The solution a coworker figured 
out was the following: 

1. Edit the hibernate.cfg.xml in the /Archivists Toolkit 2.0/src folder.
Starting about line 35:

<property name="hibernate.c3p0.min_size">5</property>
<!--This maxsize was changed from a 100 to 15 on 8/30/2010-->
<property name="hibernate.c3p0.max_size">5</property>
<property name="hibernate.c3p0.timeout">299</property>
<!--<property name="hibernate.c3p0.max_statements">50</property>-->
<property name="hibernate.c3p0.idle_test_period">30</property>


2. Add a file named c3p0.properties to the src folder with these lines:

c3p0.automaticTestTable=c3p0
c3p0.testConnectionOnCheckin=true



You'll have to compile this and replace ArchivistToolkit.jar in all of your 
AT clients with the new version. 
You might be able to avoid this by using your own MySQL database, which might 
be easier for you to deploy than compiling the code and updating all the AT 
clients.
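
For anyone checking their own settings, here is a minimal Python sketch (not part of AT) that reads the c3p0 properties from a fragment shaped like the one in step 1 and checks whether the pool could hold an idle connection longer than the database server allows. As I understand it, hibernate.c3p0.timeout is the pool's maximum idle time in seconds. The wrapper element, the function names, and the 300-second server timeout are all assumptions for illustration; ask whoever runs your MySQL service for the real idle timeout.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the <property> lines in step 1.
# The <session-factory> wrapper is assumed so the fragment parses on its own.
CONFIG = """
<session-factory>
  <property name="hibernate.c3p0.min_size">5</property>
  <property name="hibernate.c3p0.max_size">5</property>
  <property name="hibernate.c3p0.timeout">299</property>
  <property name="hibernate.c3p0.idle_test_period">30</property>
</session-factory>
"""


def c3p0_settings(xml_text):
    """Return the hibernate.c3p0.* properties as a dict of ints."""
    root = ET.fromstring(xml_text.strip())
    prefix = "hibernate.c3p0."
    return {
        prop.get("name")[len(prefix):]: int(prop.text)
        for prop in root.iter("property")
        if prop.get("name", "").startswith(prefix)
    }


def pool_outlives_server(settings, server_idle_timeout):
    """True if the pool may keep an idle connection at least as long as
    the server does, which is when stale connections can show up."""
    return settings["timeout"] >= server_idle_timeout


settings = c3p0_settings(CONFIG)
print(settings)
# 300 seconds is an assumed server-side idle timeout, not a known value.
print(pool_outlives_server(settings, server_idle_timeout=300))
```

If the second check comes back True for your server's real timeout, lowering hibernate.c3p0.timeout below that value is probably the first thing to try.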

best,chris. 



On Oct 6, 2011, at 12:05 PM, Cindy Harper wrote:

 I'm asking you all because it's not clear to me how to interact with the AT
 developers directly - the response back from the ATUG list is rather slow,
 and I'm hoping you can give me a technical explanation a la "no, because..."
 rather than just a "no."
 
 We're trying to adopt Archivists Toolkit at Colgate. We don't have a Java
 developer in-house, but I'm exploring whether I can learn to address minor
 issues myself.
 
 We're a small liberal arts college, so library policy is to out-source as
 much infrastructure as possible (meaning open source is generally avoided).
 So the MySQL database is hosted on a Lunarpages server, and I can't adjust
 the timeout at the server level. But I'm suspecting that the timeout we're
 seeing is not a timeout of the given MySQL transaction, but instead a
 problem with Hibernate persistence.  The symptom - we edit a record, proceed
 to child records that require much editing - the chunk of data that my
 people are trying to enter at one time takes over 10 minutes to edit.
 During their editing the child records, an error occurs.  AT has added error
 code to sense that when this is a JDBCConnectionError, then it forces you to
 restart.
   if(errorText.contains("JDBCConnectionException")) {
     String message = "Database connection has been lost due to a server timeout.\n\n" +
       "Please RESTART the program to continue.  If the problem persists, consult your System Administrator.";
 
 So what I did was add a connectTimeout=3600 parameter to the
 SessionFactory database URL.  But I still seem to have trouble with the
 timeout.
 
 Now, I acknowledge that understanding Hibernate and how it interacts with
 JDBC and altering code in AT may be getting over my head, and that what I
 probably should try next is either putting the database on my local MS SQL
 Server instance, or my test-server instance of MySQL (I don't have a local
 production instance of MySQL), and abandon the hosted server.
 
 But can any of you add to my knowledge base here, and tell me:
 - is it possible to correct this problem easily in the AT code?
 - is the JDBCConnectionException due to the MySQL server timeout that is
 set by connectTimeout?
 - is simply adding a parameter to the database URL an effective way of
 making sure that that parameter is used in each opensession instance?
 - I know I have a lot to learn about hibernate - I've located a book to skim
 in Books24x7 - I'll try wikipedia to get a briefer initial grounding. Any
 other advice?
 
 
 Cindy Harper, Systems Librarian
 Colgate University Libraries
 char...@colgate.edu
 315-228-7363


Re: [CODE4LIB] Job Posting: Digital Library Repository Developer, Boston Public Library (Boston, MA)

2011-09-28 Thread Chris Fitzpatrick
I do think it's pretty funny that the person shrieking about some kind of 
corporate conspiracy to intervene into library sovereignty is writing from a 
Google Gmail account.   


On Sep 28, 2011, at 1:31 PM, Karen Coyle wrote:

 That you left out Jeff Young, the only real RealWorldObject at gmail, just 
 shows that the conspiracy continues, unabated, since we have no proof that 
 the others actually exist, semantically.
 
 kc
 
 Quoting LeVan,Ralph le...@oclc.org:
 
 I demand that the investigation be expanded to include Andy Houghton,
 Karen Coombs, Ralph LeVan, Devon Smith, Thom Hickey, Doug Loynes and
 Michael Panzer!  These people have clearly had a pernicious impact on
 this list and the general business of libraries!
 
 Ralph
 
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
 Of
 Suchy, Daniel
 Sent: Wednesday, September 28, 2011 3:07 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: Job Posting: Digital Library Repository Developer, Boston
 Public
 Library (Boston, MA)
 
 I demand a full investigation of TennantGate!
 -dan
 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
 Of
 Shearer, Timothy J
 Sent: Wednesday, September 28, 2011 12:05 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Job Posting: Digital Library Repository
 Developer,
 Boston Public Library (Boston, MA)
 
 Don't you mean tennant4oclc?  He cannot be 4lib.
 
 -t
 
 On 9/28/11 3:02 PM, Peter Murray peter.mur...@lyrasis.org wrote:
 
 On Sep 28, 2011, at 2:32 PM, Michael B. Klein wrote:
  On Wed, Sep 28, 2011 at 11:29 AM, Michael J. Giarlo 
  leftw...@alumni.rutgers.edu wrote:
  P.S. Perhaps those who take issue with Mr. Tennant's listserv
  etiquette and ethics can take this up privately?
 
 
  WHY IS PENN STATE SO INTERESTED IN SUPPRESSING DISCUSSION
 OF THIS
  TOPIC??!?!!
 
 
 Clearly we need a mailing list to discuss this matter.  tennant4lib
 anyone?
 
 
 Peter
 --
 Peter Murray peter.mur...@lyrasis.org
 tel:+1-678-235-2955
 
 Ass't Director, Technology Services Development
 http://dltj.org/about/
 LYRASIS   --Great Libraries. Strong Communities. Innovative
 Answers.
 The Disruptive Library Technology Jester
 http://dltj.org/
 Attrib-Noncomm-Share
 http://creativecommons.org/licenses/by-nc-sa/2.5/
 
 
 
 
 -- 
 Karen Coyle
 kco...@kcoyle.net http://kcoyle.net
 ph: 1-510-540-7596
 m: 1-510-435-8234
 skype: kcoylenet


Re: [CODE4LIB] Job Posting: Web Services Developer, Boston Public Library (Boston, MA)

2011-09-27 Thread Chris Fitzpatrick
That's just because we need young virgins to sacrifice in order to appease our sun 
god, who blesses us with perfect weather.

We used to just hold fake Jonas Brothers concerts on campus, but this method 
has turned out to be so much easier, since developers really don't put up much of 
a fight.



On Sep 27, 2011, at 12:03 PM, Ross Singer wrote:

 On Tue, Sep 27, 2011 at 2:54 PM, Michael B. Klein mbkl...@gmail.com wrote:
 I'm curious why someone who works for a primarily Mellon-funded organization
 would be interested in influencing the direction the BPL takes with its
 staff-wide web development.
 
 I'm curious why someone from a west coast private university would be
 interested in influencing somebody from a Mellon-funded organization
 meddling in the direction the BPL takes with its staff-wide web
 development.
 
 And let's face it, Stanford will just hire the person away in 3 or 4
 months anyway, so BPL might as well not even bother.
 
 YOU WILL ALL BE HIRED AWAY BY STANFORD IN THREE OR FOUR MONTHS,
 RESISTANCE IS FUTILE.
 
 -Ross.
 
 On Tue, Sep 27, 2011 at 11:53 AM, Mark A. Matienzo m...@matienzo.org wrote:
 
 So BPL is developing its own public and staff-side web portal for a
 repository from scratch? Do you mind if I ask why?
 
 Mark (not affiliated with OCLC)
 
 
 On Tue, Sep 27, 2011 at 11:55 AM, Colford, Scot scolf...@bpl.org wrote:
 The Boston Public Library is accepting applications for the Web Services
 Developer position. The successful candidate will develop and maintain the
 public and staff-side web portal for a digital object repository that will
 be used by Massachusetts libraries, archives, historical societies, and
 museums to store, make discoverable, and deliver digital resources to
 users across the State and beyond. Competitive benefits. Salary: $62,053
 - 83,770, DOQ.
 
 
 MINIMUM QUALIFICATIONS:
 
 
 EDUCATION
 
  * Bachelor's Degree in an Information Technology field from an
 accredited institution with a focus on programming, web
 development/design, and scripting languages.
 
  * An Associate's Degree or higher degree in another field plus 4 years
 programming experience in a library setting may be substituted in lieu of
 Bachelor's Degree.
 
  * Degree or coursework in Library/Information Science preferred.
 
  * Experience working in a library environment preferred.
 
 
 
 EXPERIENCE
 
  * A minimum of 3 years of significant experience developing and
 maintaining database-driven web applications.
 
  * Thorough knowledge of and experience with web technologies including
 (X)HTML, DOM, CSS, XML, XSLT, and RSS.
 
  * Experience developing and coding interactive web applications using
 scripting languages, including JavaScript and PHP.
 
  * Experience in web programming frameworks such as JQuery, Zend, Rails,
 and/or other AJAX-compliant services.
 
  * 3 years experience with relational database modeling on systems such
 as MySQL, Sybase ASE, and MS SQL Server.
 
  * Demonstrated familiarity and comfort working in UNIX/Linux and Windows
 operating systems, related software, and basic system administration
 utilities.
 
  * Significant experience working in LAMP and/or WAMP stacks, preferably
 on virtualized and/or cloud-computing platforms. Experience with Apache
 Tomcat and Geronimo desirable.
 
  * Experience in web programming frameworks such as PHP, Rails, or
 Django.
 
  * Familiarity with an object-oriented programming language such as Ruby,
 Python, or Java is highly desirable.
 
  * Demonstrated project management experience.
 
 
 
 Requirements - Ability to exercise good judgment and focus on detail as
 required by the job
 
 
 Residency - Must be a resident of the City of Boston upon the first day
 of hire.
 
 
 CORI - Must successfully clear a Criminal Offenders Record Information
 check with the City of Boston
 
 
 Complete job description and application available at:
 www.cityofboston.gov/OHR/careercenter.asp
 
 
 Deadline for application: October 9, 2011
 
 
 In compliance with Federal and State Equal Employment Laws, Equal
 opportunity will be afforded to all applicants regardless of race, color,
 sex, age, religious creed, disability, national origin, ancestry, sexual
 orientation, marital status, ex-offender status, prior psychiatric
 treatment or military status.
 
 
 
 \-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/
 
 Scot Colford
 Web Services Manager
 Boston Public Library
 
 scolf...@bpl.org
 Phone 617.859.2399
 Mobile 617.592.8669
 Fax 617.536.7558
 
 
 


Re: [CODE4LIB] EADitor: XForms for EAD beta .1105 released

2011-05-16 Thread Chris Fitzpatrick
On a related project,  I also just pushed some major code updates to the Orbeon 
xforms application we use at Stanford for MODS and TEI editing. 

It's at https://github.com/cfitz/orbeon-forms . 
It uses the Orbeon Form Runner forms environment, which you can read about 
here: http://www.orbeon.com/forms/orbeon-form-runner . 

If anyone has any questions/comments/etc., feel free to ping me...

best,chris. 


On May 16, 2011, at 6:02 AM, Ethan Gruber wrote:

 Apologies to those who may also be on the EAD list who would have already
 received this email.  EADitor is one of several active XForms projects
 detailed in "XForms for Libraries: An Introduction," an article in the 11th
 issue of the code4lib journal (http://journal.code4lib.org/articles/3916)
 
 I'm pleased to announce a new, much overdue, EADitor
 (http://code.google.com/p/eaditor/) beta, .1105.
 
 EADitor is an XForms framework for the creation and editing of Encoded
 Archival Description (EAD, http://www.loc.gov/ead/) finding aids using
 Orbeon (http://www.orbeon.com/), an enterprise-level XForms Java
 application, which runs in Apache Tomcat.  Although the web form is
 certainly the most important aspect of the application since it can be
 integrated with existing content management and dissemination systems,
 EADitor also includes an easily customizable public interface for searching,
 sorting, and browsing collections of finding aids. This enables institutions
 to use a single application for content creation and publication.
 
 FEATURES
 * Create and edit EAD finding aids adhering to the EAD 2002 schema (elements
 are represented at almost every level in the finding aid, with the notable
 exception of mixed content at the paragraph level).
 * Import EAD 2002 schema or DTD-compliant finding aids into EADitor
 * An administrative user interface for publishing/unpublishing finding aids
 * Simple component reordering interface
 * Controlled vocabulary integration with auto-suggest, including LCSH terms
 and local vocabularies in subject, persname, famname, corpname, geogname,
 and genreform.  Languages refer to controlled vocabulary also.
 * Set default templates for the EAD core and components
 * A form for setting agency codes
 * Public interface for searching, browsing, and viewing finding aids (based
 on Solr).
 * Atom feed for published finding aids
 
 EADitor is still a work in progress, but will advance more consistently now
 that it is officially supported by the American Numismatic Society.
 Ultimately, I would like to integrate other controlled vocabulary services.
 One of the most important issues to address moving forward is better
 documentation.
 
 MORE INFORMATION
 
 EADitor project site (Google Code): http://code.google.com/p/eaditor/
 Installation instructions (specific for Ubuntu but broadly applies to all
 Unix-based systems):
 http://code.google.com/p/eaditor/wiki/UbuntuInstallation
 Google Group: http://groups.google.com/group/eaditor
 EADitor blog (will be the primary medium for providing information and
 progress on the project): http://eaditor.blogspot.com/


Re: [CODE4LIB] data export help: line breaks on tab-delimited download

2011-01-11 Thread Chris Fitzpatrick

Hey Ken,

When you changed it to Content-Disposition = attachment, did you keep  
the content-type still set to text/plain ?

You might try setting Content-Type to one of these (or try each in turn):

Content-Type = application/octet-stream
Content-Type = application/force-download
Content-Type = application/download

Also, you can try setting Content-Transfer-Encoding to binary.
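
I don't know what your export script looks like, so here is only a rough Python sketch of the idea: build the download headers suggested above and write the rows with a writer that quotes any field containing a line break, so an embedded newline stays inside its field instead of starting a new row. The function name, the default filename, and the sample rows are made up for illustration.

```python
import csv
import io


def tab_delimited_download(rows, filename="export.txt"):
    """Build (headers, body) for a tab-delimited file download."""
    buffer = io.StringIO()
    # With the default minimal quoting, the csv module quotes any field that
    # contains a tab, a quote character, or a line-break character.
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\r\n")
    writer.writerows(rows)
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": 'attachment; filename="%s"' % filename,
    }
    return headers, buffer.getvalue()


# Hypothetical sample data with a line break inside the second field.
headers, body = tab_delimited_download(
    [["id", "note"], ["1", "line one\nline two"]]
)
print(headers)
print(repr(body))
```

Whether the program opening the file honors the quoting is a separate question, so replacing line breaks with a space before export may still be the simpler route.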

best,chris.


On Jan 11, 2011, at 12:29 PM, Ken Irwin wrote:


Content-Disposition: attachment;


Re: [CODE4LIB] collengine, the collection engine; runs on django-nonrel / app engine

2010-12-16 Thread Chris Fitzpatrick

Hey Brian,

This is awesome.
A while back I took a stab at doing something kinda similar with JRuby  
and Google App Engine. I think I still have a half-finished blog post  
floating around somewhere on that... finishing that might be a good  
Christmas break project.


For other ruby-based projects, I've had great success with Heroku.  
They also have a solr hosting service...


This is what we did for the OLAC project. Rails hosting costs were  
way too much for a pilot project, so we're using the free version of  
Heroku.


Also, while I happen to work for a larger university library with VMs  
coming out the wazoo, in my experience these types of  
development services often really help with collaboration projects, since  
you're not having to rely on one institutional partner to provide the  
support for the development environment. It also kinda makes the  
collaborators more equal at the get-go, since nobody has their  
employer's name etched into the URL and server names. Also, it  
might make managers a little less spooked about having to support  
things long term.


best,chris


On Dec 16, 2010, at 12:11 AM, BRIAN TINGLE wrote:

Having been several months since I've tried to run django on the  
google app engine, I took a crack at it today with Django appengine http://www.allbuttonspressed.com/projects/djangoappengine


Since it is based on django-nonrel, in theory it does not have  
vendor lock in to app engine, so you could start to develop there  
and move in house if you need to.


I set up a very simple little app, and it deployed to appspot okay,  
here is the code and a short screen cast on my blog


screen cast:
http://tingletech.tumblr.com/post/2334189882/
demonstrates the django admin interface running in the google app  
engine editing the super basic models


The super basic models:
https://github.com/tingletech/collengine/blob/master/items/models.py

code repository:
https://github.com/tingletech/collengine

Does anyone know of any other django or app engine based digital  
library metadata collection tools?  Seems like being able to run for  
free on app engine (if things fit in google quotas) would be an  
advantage for small libraries and short term grant funded projects.   
Also, django-nonrel looks like it has some interesting search  
features that could be used in access systems.


Anyway, just throwing this out there in case it might be useful for  
the hackfest


-- Brian


Re: [CODE4LIB] Code4Lib 2011 Registration Closed

2010-12-16 Thread Chris Fitzpatrick
A couple of years ago I missed registration and had to get my  
Code4Lib ticket on StubHub.


The only downsides were I had to pay twice face value and tell  
everyone my name was Naomi Dushay all week.



On Dec 16, 2010, at 8:47 AM, Birkin James Diana wrote:


Kevin wrote:

...we did have lots of folks drop their registrations... opening up  
spots for waitlisters...


Same for the 2009 Providence conference, which filled very quickly.  
Another factor is that we initially held back a small buffer of  
spots to make sure that our calculations for keynoters/presenters/ 
etc were correct (due to a firm cap) and then allotted those to  
waitlisters.


-Birkin

---
Birkin James Diana
Programmer, Integrated Technology Services
Brown University Library
birkin_di...@brown.edu


On Dec 16, 2010, at 10:30 AM, Kevin S. Clarke wrote:


I believe the cap is the same this year as last year (250).  It did
stay open a couple of weeks last year.  In years before, it's sold out
even quicker than this year.  Probably lots of factors for how quickly
it sells out (location, talks, etc.).  Regardless, it's popular.

Richard, if last year is any indication of this year, we did have lots
of folks drop their registrations... opening up spots for waitlisters
(so people at the top of the list have a good chance, I think).

Kevin


On Thu, Dec 16, 2010 at 10:08 AM, Richard, Joel M  
richar...@si.edu wrote:
Woah, that was fast. I guess I'll go on the waiting list. *fingers  
crossed*


Is this code4lib larger or smaller than last year? I seem to  
remember registering weeks after the registration opened. Maybe  
it's getting popular, eh?


Thanks,
--Joel

Joel Richard
IT Specialist, Web Services Department
Smithsonian Institution Libraries | http://www.sil.si.edu/
(202) 633-1706 | (202) 786-2861 (f) | richar...@si.edu


Re: [CODE4LIB] T-Shirt Voting is Open

2010-12-16 Thread Chris Fitzpatrick

Hi everyone,

I found out this morning not everyone watched WGN on Saturday  
afternoon in the 1980s


So here are two links to contextualize tshirt option #1:

http://www.youtube.com/watch?v=J1jzs6dk4bs
http://thisdistractedglobe.com/2007/04/03/breaking-away-1979/

All great designs...thanks!



On Dec 16, 2010, at 8:24 AM, Durbin, Michael R wrote:

We've gotten four lovely submissions for the t-shirt design  
contest.  Please cast your vote now.


http://vote.code4lib.org/election/index/18

Closes at 12:00 AM EST on 2010-12-23.

-Mike


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Chris Fitzpatrick

+1 to the "this discussion is really depressing me" camp.


On Oct 27, 2010, at 12:53 PM, Jonathan Rochkind wrote:


Alexander Johannesen wrote:


Is it to throttle spam or something? 50 seems rather low, and it's
rather depressing to have a lively discussion throttled like that.  
Not






Pretty sure it wasn't depressing to the vast majority of the  
listserv audience.  That was/is a discussion that benefited from a  
timeout period, like you give the pre-schoolers.


Re: [CODE4LIB] Batch loading in fedora

2010-07-29 Thread Chris Fitzpatrick

Hey,

I've written a few ingest scripts for Fedora... too many,  
actually... but I posted a more generic version of one I did somewhat  
recently in Ruby here:

http://worldonawire.info/2010/07/ruby-fedora-ingestor/

I tried to clean it up a bit and add a little bit of explanation as to  
what it's doing (or trying to do).
Let me know if you have any questions or if you notice something  
awry.


I also have a bash ingest script, but I can't find it at the moment.  
It pretty much does what Adam described using curl. I'll try to find  
that and post it.


thanks.
best,chris.

On Jul 29, 2010, at 7:00 AM, Pottinger, Hardy J. wrote:

Are you okay with using ruby? I've been using active fedora lately and I
love it. Here is some pseudo-code. If you want something that I've
actually syntax checked let me know.


Hi, Bess, I'm not Kyle, but I'd love to see the syntax checked code,  
if you're going to share it. Thanks!


--Hardy


[CODE4LIB] Google buys Metaweb (freebase)

2010-07-16 Thread Chris Fitzpatrick

Dunno if this is a good thing or a bad thing

http://www.sfgate.com/cgi-bin/article.cgi?f=/n/a/2010/07/16/financial/f132526D44.DTL


best,chris.


Re: [CODE4LIB] XForms EAD editor sandbox available

2009-11-13 Thread Chris Fitzpatrick
I too have written a metadata editor in Orbeon xforms, using their new  
Form Runner framework.


I put a semi-up-to-date beta demo version of it here, if anyone is  
curious: https://mdtoolkit-dev.stanford.edu/ops/fr/mods/mlm/
(Feel free to edit/delete records, as this is just a dev instance.  
You'll probably have to accept a self-signed cert, though.)


(Records probably look a little weird because they were just blindly  
imported from MARC records from our ILMS.)


I've written a version that back-ends into Fedora and Solr, but we're  
still using the default eXist database in production.

Some features this version has:

1. The Import Record from Catalog Key is based on a RESTful service  
written by my coworker Richard Anderson that pulls MARC XML records  
from our Solr db and converts them into MODS.
   You can try it out by entering 8257892 and hitting the plus...
2. The language section has the ability to do a real-time autosuggest  
lookup of a value list. In this case, it's from this XML file:
http://www.loc.gov/standards/codelists/languages.xml
   If you want to try this out in a record, add a new language node  
(hit the green plus), type something like "Ger" into the box (bug: it  
has to start with an uppercase letter), and wait a couple of seconds.  
Not too long...
   I've also done demo versions that query value lists from Solr and  
from LCSH genre RDF in Mulgara, as well as ones that query the OCLC grid  
naming authority service to add nodes from their authority file. So,  
there are a lot of possibilities there.
3. When you create a new record, the UUIDs are generated by a REST  
request to our UUID generator.
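
For the autosuggest lookup in item 2, the matching step could be sketched roughly as below. This is not the Orbeon code; it is a small Python illustration of a case-insensitive prefix match, which would also sidestep the uppercase-first-letter bug. The sample names and the function name are made up, and a real list would be loaded from the languages.xml file mentioned above.

```python
# Hypothetical sample of language names, standing in for the real value list.
LANGUAGES = [
    "English",
    "French",
    "German",
    "Germanic (Other)",
    "Greek, Modern (1453-)",
]


def suggest(prefix, names, limit=10):
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    needle = prefix.strip().casefold()
    if not needle:
        # An empty box should not suggest the whole list.
        return []
    matches = [name for name in names if name.casefold().startswith(needle)]
    return matches[:limit]


print(suggest("ger", LANGUAGES))
print(suggest("Gre", LANGUAGES))
```

Normalizing both sides with casefold() keeps the stored values untouched, so whatever the user picks is still the exact string from the value list.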


The performance seems ok, but I haven't done any heavy stress  
testing on it. It's a little slow, I guess. This really is just a way  
for our catalogers/project managers to create records to be loaded  
into Solr, so it gets very light traffic. It does run into PermGen  
space problems if you're running things like Mulgara, Solr, or  
multiple Orbeon applications in the same container, especially on a VM.


And, yes, it is very ugly and a little weird, but so are most of us  
in the library business, so I've been comfortable with it.


Any suggestions,comments, and barbs are welcome...
best,chris.



On Nov 13, 2009, at 9:49 AM, [Your Name] wrote:

In discussion with colleagues around this topic, the question of  
controlled vocabularies has been prominent. We're looking to move  
away from list instances that are packed into the XForm at render  
time to lists that are exposed from other services through REST  
interfaces, which can be dynamically coupled into a form.


On the other hand, 4 seconds is really not terribly long. {grin}

---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library



On Nov 13, 2009, at 12:45 PM, Ford, Kevin wrote:

We've been using Orbeon forms for about a year now for cataloging  
our digital collections.  We use Fedora Commons, so using the XML  
as input and outputting to XML seemed a no brainer.  It has worked  
very nicely for editing VRA Core4 records. But, instead of doing  
anything terribly fancy with Orbeon, we simply use the little  
sandbox application that comes with Orbeon (there's an online demo  
[1]).  The URL to the XForm is part of the query string. This  
solution has greatly reduced our time investment in making Orbeon  
part of our workflow and, more importantly, getting Orbeon to work  
for us.  All that being said, Ethan's sharp looking EAD editor  
makes me jealous that we haven't created our own custom editor.


As for Orbeon's performance, once we worked out some quirks, we've  
been quite happy with Orbeon.  Orbeon hosts a useful performance  
and tuning page [2].  We also learned that it is helpful to stop  
the Orbeon app and restart it about once every two weeks as  
performance can become progressively slower.  It seems to need a  
little reboot.  In any event, a typical XForm for us is about 200k,  
with a number of authority lists, one of which includes nearly 1500  
items.  Orbeon loads and renders the XForm fairly quickly (less  
than 4 seconds) and editing performance hasn't been an issue  
either, which is great considering that a 1500-item
subject-authority drop-down list is created for each subject being added to
a record.


Moving such a large XForm to a server-based solution was  
necessary.  Our XForm cataloging application, which began with a  
simple DC record and focused on producing a viable XForm, initially  
used the Mozilla XForm add-on [3].  The Firefox add-on, which of  
course runs on the client, easily scaled for a VRA Core4 record,  
but it couldn't handle a burgeoning subject authority file.  Hence  
the need for an alternative solution, quick.


-Kevin

[1] http://www.orbeon.com/ops/xforms-sandbox/
[2] http://wiki.orbeon.com/forms/doc/developer-guide/performance-tuning
[3] http://www.mozilla.org/projects/xforms/

--
Kevin 

Re: [CODE4LIB] Another nail in the coffin

2009-05-04 Thread Chris Fitzpatrick
My favorite part is when he asks the software to return a bibliographic
record matching 245 10$aFaust.$nPart one

and the computer literally catches fire.

Artificial intelligence is no match for the MARC format.



On May 4, 2009, at 12:40 PM, Frumkin, Jeremy wrote:

Seems more like a conversation for web4lib than code4lib, however, I  
always find it intriguing that somehow there is this notion that  
technology and libraries are at odds. While Alexander talks about  
this being another nail in the coffin, I look at this (and other  
emerging technologies) as new support and information tools that
can be used by libraries. Hmmm.


-- jaf

==
Jeremy Frumkin
Assistant Dean / Chief Technology Strategist
University of Arizona Libraries

frumk...@u.library.arizona.edu
+1 520.307.4548
==

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf  
Of Ranti Junus

Sent: Monday, May 04, 2009 12:26 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Another nail in the coffin

I'm always intrigued by everything that has a knowledge management or
information management aspect to it. I'm more intrigued by the
possibility that this tool might be able to present
information based on context.

Scouring the Twine site gave me this:

http://www.readwriteweb.com/archives/see_wolfram_alpha_in_action_-_video_and_screenshots.php

I personally think this is darn cool stuff, from the computing point
of view.  I'll wait until it's out sometime this month to see what
kind of results I would get.


ranti.


On Sun, May 3, 2009 at 9:13 PM, Alexander Johannesen
alexander.johanne...@gmail.com wrote:

Another nail in the library coffin, especially the academic ones:

  http://www.youtube.com/watch?v=5TIOH80Qg7Q

Organisations and people are slowly turning into data producers, not
book producers.


Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian,  
Topic Maps
-- http://shelter.nu/blog/  







--
Bulk mail.  Postage paid.