Re: [CODE4LIB] unwanted (bogus) characters in marc

2010-10-08 Thread Ere Maijala

On 7.10.2010 15:17, Thomas Krichel wrote:

   Ere Maijala writes


# Fix non-UTF-8 characters with two highest bits set (we assume they
# are actually ISO-8859-1)


   What about

use Encode;
use Encode::Guess qw/latin-1/;
$decoded = decode("Guess", $dodgy_input);

   $decoded then should be a utf-8 string with utf8 flag on.


Would that work for a predominantly proper utf-8 input with some 
mistakes thrown in?


--Ere


Re: [CODE4LIB] unwanted (bogus) characters in marc

2010-10-08 Thread Thomas Krichel
  Ere Maijala writes

 On 7.10.2010 15:17, Thomas Krichel wrote:

  ...

 use Encode;
 use Encode::Guess qw/latin-1/;
 $decoded = decode("Guess", $dodgy_input);
 
$decoded then should be a utf-8 string with utf8 flag on.

 Would that work for a predominantly proper utf-8 input with some
 mistakes thrown in?

  It will try to guess between UTF-8 and ISO-8859-1. This can be done
  because UTF-8 has many invalid byte sequences, so input that is not
  valid UTF-8 can be told apart. But if, say, you wanted to guess
  between ISO-8859-1 and ISO-8859-2, you'd be out of luck, since almost
  any byte sequence decodes without error in both. The module seems to
  do a good job for me.
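
  For input that is mostly proper UTF-8 with a few ISO-8859-1 lines
  mixed in, a rough sketch of the same "invalid UTF-8 means it must be
  Latin-1" idea, applied line by line, could look like the following.
  This is not what Encode::Guess does internally. The helper names
  decode_utf8_or_latin1 and decode_mixed are made up for illustration,
  and it assumes that each line is entirely in one encoding.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper (not from Encode): decode one chunk of octets,
# trying strict UTF-8 first and falling back to ISO-8859-1.
sub decode_utf8_or_latin1 {
    my ($octets) = @_;
    my $copy = $octets;    # decode() with a CHECK value may modify its argument
    my $text = eval { decode('UTF-8', $copy, FB_CROAK) };
    return $text if defined $text;

    # Every byte is defined in ISO-8859-1, so this fallback cannot fail.
    return decode('ISO-8859-1', $octets);
}

# Hypothetical helper: apply the fallback line by line, so that one bad
# line does not force the whole input to be read as ISO-8859-1.
sub decode_mixed {
    my ($octets) = @_;
    return join '', map { decode_utf8_or_latin1($_) } split /(?<=\n)/, $octets;
}
```

  The limitation is the granularity: a line that contains both valid
  UTF-8 sequences and stray ISO-8859-1 bytes is read entirely as
  ISO-8859-1, so its UTF-8 characters would come out garbled.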

  I use it for a robot on CrossRef's sigg API. The engine is reliable,
  but the data there is poorly character coded and marked up. I'd be
  happy to share the robot with anyone who wants to go out there and get
  the character creeps. After all, we have Halloween coming up. ;-)


  Cheers,

  Thomas Krichel                    http://openlib.org/home/krichel
http://authorclaim.org/profile/pkr1
   skype: thomaskrichel


[CODE4LIB] FCLA Discovery Service (Mango) is now Solr powered

2010-10-08 Thread Joshua Greben
Hello Code4Lib-ers,

I just wanted to make a general announcement about the development of the 
Florida University Library System's Discovery service.

Florida Center for Library Automation (FCLA), supporting the 11 State 
Universities of Florida, has switched search engines from Endeca to Solr. We 
went live with the Solr version of our locally developed discovery application 
(which we call Mango) in August. The switch to Solr was driven primarily by 
cost-cutting efforts: not only to save the ongoing maintenance fees, but also 
to avoid additional fees that would have been incurred as our database, 
currently 11 million records, grows.

We used Blacklight as a jumping-off point for our implementation of Solr, 
changing it quite a bit to work with our existing Mango discovery application, 
and Mango required some modification to adapt the calls to the Solr search 
engine. Mango is a Tomcat application that also uses various APIs and data 
service layers to bring in outside content such as Google Book covers, journal 
article metadata, and real-time ILS (Ex Libris Aleph) item availability.

We work closely with public and technical service librarians at the eleven State 
University Libraries in Florida to develop new features and services in Mango 
that make it a useful and informative service for our users. The State 
University Union Catalog can be found here with links to the eleven University 
Library catalogs:

http://union.catalog.fcla.edu/



Joshua Greben
Systems Librarian/Analyst
Florida Center for Library Automation
5830 NW 39th Ave, 
Gainesville, FL 32606
352-392-9020 ext 246
jgre...@ufl.edu


Re: [CODE4LIB] FCLA Discovery Service (Mango) is now Solr powered

2010-10-08 Thread Joshua Greben
Ya'aqov,

Thanks for the kudos. FCLA already has Ex Libris' Metalib integrated in Mango 
as a Quick Articles search. Several of the State University Libraries have 
opted to make this one of their discovery services. You can see how this looks 
and works in the FIU, UCF and UWF catalogs:

http://fiu.catalog.fcla.edu/fi.jsp?
http://ucf.catalog.fcla.edu/cf.jsp?
http://uwf.catalog.fcla.edu/wf.jsp?

We are also looking at different ways to bring the Primo Central mega-aggregate 
index into the mix as a faster (non-federated search) alternative.


Josh



On Oct 8, 2010, at 12:44 PM, Ya'aqov Ziso wrote:

 Josh,
 
 Kudos to FCLA's team. It seems that the savings from moving from ENDECA (not a
 native bibliographic search engine) to Solr were most significant, probably
 in the six-figure range per year.
 
 Integration of SFX seems seamless. How is FCLA also planning to integrate
 MetaLib or any other subject database cluster engine?
 
 Ya'aqov
 
 
 
 


[CODE4LIB] Job Opening at the Harvard Library Innovation Laboratory

2010-10-08 Thread Jeff Goldenson
Hey out there, 
 
We're writing from the Harvard Library Innovation Laboratory.  We're new.
We have a job up, and we're all about innovating for libraries.

We've got a great team that works really well together and we're looking to
bring somebody else on.
 
To see some of what we're working on, check out the website:
http://www.librarylab.law.harvard.edu/
 
Link to job description:
https://jobs.brassring.com/1033/asp/tg/cim_jobdetail.asp?partnerID=25240&siteID=5341&AReq=22156BR

Feel free to pass along and thanks,
 
The team at the Harvard Library Innovation Lab

Questions? Mail Jeff:
jgolden...@law.harvard.edu