Re: [CODE4LIB] unwanted (bogus) characters in marc
On 7.10.2010 15:17, Thomas Krichel wrote:
> Ere Maijala writes
>
>> # Fix non-UTF-8 characters with two highest bits set (we assume they are actually ISO-8859-1)
>
> What about
>
>   use Encode;
>   use Encode::Guess qw/latin-1/;
>   $decoded = decode("Guess", $dodgy_input);
>
> $decoded then should be a utf-8 string with utf8 flag on.

Would that work for a predominantly proper utf-8 input with some mistakes thrown in?

--Ere
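For the mixed case Ere asks about, one option is to decide sequence by sequence rather than for the whole string. The following is a minimal sketch, not the code from either poster. It assumes the input is a raw byte string, keeps every well-formed UTF-8 sequence, and reinterprets any other byte as ISO-8859-1. The name decode_mostly_utf8 is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a byte string that is mostly UTF-8,
# treating any byte that is not part of a well-formed UTF-8 sequence
# as ISO-8859-1 (an assumption, as in Ere's original comment).
sub decode_mostly_utf8 {
    my ($bytes) = @_;
    my $out = '';
    while ($bytes =~ /\G(?:
            (                                   # $1: well-formed UTF-8
                [\x00-\x7F]+
              | [\xC2-\xDF][\x80-\xBF]
              | \xE0[\xA0-\xBF][\x80-\xBF]
              | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}
              | \xED[\x80-\x9F][\x80-\xBF]
              | \xF0[\x90-\xBF][\x80-\xBF]{2}
              | [\xF1-\xF3][\x80-\xBF]{3}
              | \xF4[\x80-\x8F][\x80-\xBF]{2}
            )
          | (.)                                 # $2: any other single byte
        )/gsx)
    {
        my ($utf8, $other) = ($1, $2);
        $out .= defined $utf8
            ? decode('UTF-8', $utf8)
            : decode('ISO-8859-1', $other);
    }
    return $out;
}

# "café" is proper UTF-8 here, while the i-diaeresis in "naïve" is a stray Latin-1 byte.
my $fixed = decode_mostly_utf8("caf\xC3\xA9 na\xEFve");
```

The caveat is that a genuine ISO-8859-1 run which happens to form a valid UTF-8 sequence (for example the two bytes C3 A9) will be read as UTF-8, so this only suits data that is predominantly UTF-8.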
Re: [CODE4LIB] unwanted (bogus) characters in marc
Ere Maijala writes

> On 7.10.2010 15:17, Thomas Krichel wrote:
> ...
>>   use Encode;
>>   use Encode::Guess qw/latin-1/;
>>   $decoded = decode("Guess", $dodgy_input);
>>
>> $decoded then should be a utf-8 string with utf8 flag on.
>
> Would that work for a predominantly proper utf-8 input with some mistakes thrown in?

It will try to guess between UTF-8 and ISO-8859-1. This can be done because UTF-8 has many invalid byte sequences. But if you wanted, say, to guess between ISO-8859-1 and ISO-8859-2, you'd be out of luck.

The module seems to do a good job for me. I use it for a robot on CrossRef's sigg API. The engine is reliable, but the data there is poorly character coded and marked up. I'd be happy to share the robot with anyone who wants to go out there and get the character creeps. After all, we have Halloween coming up. ;-)

Cheers,

Thomas Krichel
http://openlib.org/home/krichel
http://authorclaim.org/profile/pkr1
skype: thomaskrichel
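A slightly more defensive way to call the module is to ask it for the guess first and check the result before decoding. This is a minimal sketch, not Thomas's robot code. guess_encoding is the function Encode::Guess exports, and it returns an encoding object on success or an error-message string when it cannot settle on one encoding. The wrapper name guess_and_decode is hypothetical.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding

# Hypothetical wrapper: returns (decoded string, encoding name) on success,
# or (undef, error message) when the guess fails or is ambiguous.
sub guess_and_decode {
    my ($bytes) = @_;

    # 'latin1' is added to the default suspects (ASCII, UTF-8, BOM-marked UTF-16/32).
    my $enc = guess_encoding($bytes, 'latin1');

    # A plain string instead of an object means the module could not decide.
    return (undef, $enc) unless ref $enc;

    return ($enc->decode($bytes), $enc->name);
}

# The byte EF followed by "ve" is not valid UTF-8, so only Latin-1 fits.
my ($text, $name) = guess_and_decode("na\xEFve");
if (defined $text) {
    print "decoded as $name\n";
}
else {
    warn "could not guess: $name\n";
}
```

Note that the guess applies to the whole input, so a string that is mostly UTF-8 with a few stray Latin-1 bytes is decoded entirely under whichever single encoding survives.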
[CODE4LIB] FCLA Discovery Service (Mango) is now Solr powered
Hello Code4Lib-ers,

I just wanted to make a general announcement about the development of the Florida University Library System's Discovery service. Florida Center for Library Automation (FCLA), supporting the 11 State Universities of Florida, has switched search engines from Endeca to Solr. We went live with the Solr version of our locally developed discovery application (which we call Mango) in August.

The motivation to switch to Solr was driven primarily by cost-cutting efforts, not only to save the ongoing maintenance fees but also the additional fees that would have been incurred with the increasing size of our database, currently 11 million records.

We used Blacklight as a jumping-off point for our implementation of Solr, changing it quite a bit to work with our existing Mango discovery application, and Mango required some modification to adapt the calls to the Solr search engine. Mango is a Tomcat application that also uses various APIs and data service layers to bring in outside content such as Google Book covers, journal article metadata, and real-time ILS (Ex Libris Aleph) item availability.

We work closely with public and technical service librarians at the eleven State University Libraries in Florida to develop new features and services in Mango that make it a useful and informative service for our users.

The State University Union Catalog can be found here with links to the eleven University Library catalogs: http://union.catalog.fcla.edu/

Joshua Greben
Systems Librarian/Analyst
Florida Center for Library Automation
5830 NW 39th Ave, Gainesville, FL 32606
352-392-9020 ext 246
jgre...@ufl.edu
Re: [CODE4LIB] FCLA Discovery Service (Mango) is now Solr powered
Ya'aqov,

Thanks for the kudos. FCLA already has Ex Libris' MetaLib integrated in Mango as a Quick Articles search. Several of the State University Libraries have opted to make this one of their discovery services. You can see how this looks and works in the FIU, UCF and UWF catalogs:

http://fiu.catalog.fcla.edu/fi.jsp?
http://ucf.catalog.fcla.edu/cf.jsp?
http://uwf.catalog.fcla.edu/wf.jsp?

We are also looking at different ways to bring the Primo Central mega-aggregate index into the mix as a faster (non-federated search) alternative.

Josh

On Oct 8, 2010, at 12:44 PM, Ya'aqov Ziso wrote:
> Josh,
>
> Kudos to FCLA's team. It seems that the savings from moving from ENDECA (not a native bibliographic search engine) to Solr were most significant, probably in the range of 6 digits per year. Integration of SFX seems seamless. How is FCLA planning to integrate MetaLib or any other subject database cluster engine as well?
>
> Ya'aqov
>
> On Fri, Oct 8, 2010 at 11:01 AM, Joshua Greben jgre...@ufl.edu wrote:
>> Hello Code4Lib-ers,
>>
>> I just wanted to make a general announcement about the development of the Florida University Library System's Discovery service. Florida Center for Library Automation (FCLA), supporting the 11 State Universities of Florida, has switched search engines from Endeca to Solr. We went live with the Solr version of our locally developed discovery application (which we call Mango) in August.
>>
>> The motivation to switch to Solr was driven primarily by cost-cutting efforts, not only to save the ongoing maintenance fees but also the additional fees that would have been incurred with the increasing size of our database, currently 11 million records.
>>
>> We used Blacklight as a jumping-off point for our implementation of Solr, changing it quite a bit to work with our existing Mango discovery application, and Mango required some modification to adapt the calls to the Solr search engine. Mango is a Tomcat application that also uses various APIs and data service layers to bring in outside content such as Google Book covers, journal article metadata, and real-time ILS (Ex Libris Aleph) item availability.
>>
>> We work closely with public and technical service librarians at the eleven State University Libraries in Florida to develop new features and services in Mango that make it a useful and informative service for our users.
>>
>> The State University Union Catalog can be found here with links to the eleven University Library catalogs: http://union.catalog.fcla.edu/
>>
>> Joshua Greben
>> Systems Librarian/Analyst
>> Florida Center for Library Automation
>> 5830 NW 39th Ave, Gainesville, FL 32606
>> 352-392-9020 ext 246
>> jgre...@ufl.edu
[CODE4LIB] Job Opening at the Harvard Library Innovation Laboratory
Hey out there,

We're writing from the Harvard Library Innovation Laboratory. We're new. We have a job up, and we're all about innovating for libraries. We've got a great team that works really well together and we're looking to bring somebody else on.

To see some of what we're working on, check out the website: http://www.librarylab.law.harvard.edu/

Link to job description: https://jobs.brassring.com/1033/asp/tg/cim_jobdetail.asp?partnerID=25240sit eID=5341AReq=22156BR

Feel free to pass along, and thanks,

The team at the Harvard Library Innovation Lab

Questions? Mail Jeff: jgolden...@law.harvard.edu