Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
I was going to comment that some of the Encore shortcomings mentioned in the PDF do seem to be addressed in current Encore versions, although some of these issues have to be addressed - for instance, there is a spell-check, but it can give some surprising suggestions, though suggestions do clue the user in to the fact that they might have a misspelling/typo.

III's reaction to studies reporting that users ignore the right-side panel of search options was to provide a skin that has only two columns - the facets on the left, and the search results on the middle-to-right. This pushes important facets like the tag cloud very far down the page and causes a lot of scrolling, so I don't like this skin much.

I recently asked a question on the Encore users' list about how the tag cloud could be improved - currently it suggests the most common subfield a of the subject headings. I would think it should include the general, chronological, and geographical subdivisions - subfields x, y, z. For instance, it doesn't provide good suggestions for improving the search "civil war" without these; a chronological subdivision would help a lot there. But then again, I haven't seen a prototype of how many relevant subdivisions this would result in - would the subdivisions drown out the main headings in the tag cloud?

Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Wed, Sep 5, 2012 at 5:30 PM, Jonathan LeBreton lebre...@temple.edu wrote: Lucy Holman, Director of the U Baltimore Library and a former colleague of mine at UMBC, got back to me about this. Her reply puts this particular document into context. It is an interesting reminder that not everything you find on the web is as it seems, and it certainly is not necessarily the final word. We gotta go buy the book! Lucy is off-list, but asked me to post this on her behalf.
Her contact information is below, though. Very interesting discussion! This issue of what is right and feasible in discovery services, and how to configure them, is important stuff for many of our libraries, and we should be able to build on the findings and experiences of others rather than re-inventing the wheel locally. (We use Summon.) - Jonathan LeBreton

-- begin Lucy's explanation --

The full study and analysis are included in Chapter 14 of a new book, Planning and Implementing Resource Discovery Tools in Academic Libraries, Mary P. Popp and Diane Dallis (Eds). The project was part of a graduate Research Methods course in the University of Baltimore's MS in Interaction Design and Information Architecture program. Originally, groups within the course conducted task-based usability tests on EDS, Primo, Summon, and Encore. Unfortunately, the test environment of Encore led to many usability issues that we believed were more a result of the test environment than of the product itself; therefore we did not report on Encore in the final analysis. The study (and chapter) does offer findings on the other three discovery tools.

There were six student groups in the course; each group studied two tools with the same user population (undergrad, graduate, and faculty), so that overall each tool was compared against the other three with each user population. The .pdf that you found was the final report of one of those six groups, so it only addresses two of the four tools. The chapter is the only document that pulls the six portions of the study together. I would be happy to discuss this with any of you individually if you need more information. Thanks for your interest in the study.

Lucy Holman, DCD Director, Langsdale Library University of Baltimore 1420 Maryland Avenue Baltimore, MD 21201 410-837-4333

-- end insert --

Jonathan LeBreton Sr.
Associate University Librarian Temple University Libraries Paley M138, 1210 Polett Walk, Philadelphia PA 19122 voice: 215-204-8231 fax: 215-204-5201 mobile: 215-284-5070 email: lebre...@temple.edu email: jonat...@temple.edu

-----Original Message----- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of karim boughida Sent: Tuesday, September 04, 2012 5:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?

Hi Tom, the top players are EDS, Primo, and Summon; the only reason I see Encore in the mix is if you have other III products, which is not the case at the UBalt library. Do they now have WorldCat? Encore vs. Summon is an easy win for Summon. Let's wait for Jonathan LeBreton (thanks, BTW). Karim Boughida

On Tue, Sep 4, 2012 at 4:26 PM, Tom Pasley tom.pas...@gmail.com wrote: Yes, I'm curious to know too! Due to database/resource matching or coverage perhaps (anyone's guess). Tom

On Wed, Sep 5, 2012 at 7:50 AM, karim boughida kbough...@gmail.com
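Cindy's tag-cloud suggestion earlier in the thread - pairing the $a main heading with its $x/$y/$z subdivisions - can be sketched roughly as below. This is a minimal illustration that assumes, for simplicity, a one-line textual rendering of a 650 field with "$" delimiters; a real implementation would read actual MARC records with a MARC library rather than split strings.

```java
import java.util.ArrayList;
import java.util.List;

public class SubjectSubdivisions {
    // Split a simplified textual subject field ("$a...$x...$y...$z...")
    // into tag-cloud entries: the main heading ($a) plus one entry per
    // general/chronological/geographic subdivision ($x/$y/$z).
    static List<String> tagCloudEntries(String field) {
        List<String> entries = new ArrayList<>();
        String main = null;
        for (String part : field.split("\\$")) {
            if (part.isEmpty()) continue;
            char code = part.charAt(0);
            String value = part.substring(1).trim();
            if (code == 'a') {
                main = value;
                entries.add(main);
            } else if ((code == 'x' || code == 'y' || code == 'z') && main != null) {
                entries.add(main + " -- " + value);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // The "civil war" case from the discussion: the chronological
        // subdivision ($y) becomes its own suggestion.
        System.out.println(tagCloudEntries(
            "$aUnited States$xHistory$yCivil War, 1861-1865"));
    }
}
```

Whether such subdivision entries would drown out the main headings, as Cindy wonders, is something only a prototype over real data could answer.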
Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
I meant to say: some of these issues have to be addressed in configuration. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Thu, Sep 6, 2012 at 9:06 AM, Cindy Harper char...@colgate.edu wrote: I was going to comment that some of the Encore shortcomings mentioned in the PDF do seem to be addressed in current Encore versions, although some of these issues have to be addressed ...
[CODE4LIB] Fwd: [LIBR-FAC] bad news about ERIC documents
Sarah Park - This is from the message our govdocs librarian sent out last week... Cindy Harper, Systems Librarian, Colgate

This bad news about full-text ERIC documents came through govdoc-l today:

"The full text documents for ERIC have been temporarily disabled due to a privacy concern. We apologize for the inconvenience and are currently working to isolate the affected documents and return full text access to users as quickly as possible. Please stay tuned to eric.ed.gov for an update on when they will become available again."

Not sure what the privacy concern is - Mary Jane Walsh, Head of Government Documents, Maps, Microforms and Interim Head of Reference, Colgate University, mwalsh at colgate dot edu
[CODE4LIB] Gadgeteers
I didn't know there were so many gadgeteers on this list. The latest item on my wishlist is this: http://wimm.com/. Now, I'm not a smartphone user, because I'm always losing my cellphone, and I can't justify the cost of a data plan. And I've looked into a wearable notepad, but I think the shoulder holster would not send quite the right message. But my ideal watch device would have the time, alarms and calendars synced to my Google calendar, and a voice recorder for voice memos to my absent-minded self. I think, with the right Android programming, this device could do it. Has anyone seen one of these? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] CAS authentication with ILLiad
How opportune! Colgate wants to do this, but I've been given a one-week timeframe. We have CAS all set up. Does it look like it's doable in that time? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Thu, Jan 12, 2012 at 12:51 PM, Friscia, Michael michael.fris...@yale.edu wrote: Is anyone still interested in the topic of remote authentication for ILLiad using CAS (for sites that host their own ILLiad instance)? I just completed the integration this morning without using the various UofA or UC Davis ISAPI filters out there. If there's interest, I'd be happy to share how it was done. ___ Michael Friscia Manager, Digital Library Programming Services Yale University Library (203) 432-1856
Re: [CODE4LIB] CAS authentication with ILLiad
We're running 2008 w/ IIS7. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Thu, Jan 12, 2012 at 1:11 PM, Friscia, Michael michael.fris...@yale.edu wrote: Took me 4 hours start to finish: 10 minutes to make it work, 3 hours 50 minutes to convert 24k user accounts to work with it. So yes, I think it is doable. I'll see what I can put together for documentation. It will be written assuming Windows Server 2008 with IIS7. It can be done with IIS6 on Server 2003, but that would require someone who knows both pretty well. ___ Michael Friscia Manager, Digital Library Programming Services Yale University Library (203) 432-1856
[CODE4LIB] Data Mining / Business Analytics in libraries
Are there any listservs, blogs, or forums addressing data mining in libraries? I've taken some courses and am now exploring software - I just tried out RapidMiner, which integrates with R and Weka and has facilities for data cleaning and storage. I'm interested to see if anyone is sharing their experiences with business-analytics-type products in libraries. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] Examples of visual searching or browsing
A couple years ago, I used a crossmap of LC call numbers to subject headings (admittedly out of date) to provide a subject-labeled sort by call number on an experimental catalog search: http://lisv06.colgate.edu/profound/

The mapping came from Mona Scott, Conversion Tables (1999): http://encore.colgate.edu/iii/encore/search/C%7CSmona+subject+scott%7COrightresult%7CU1?lang=eng&suite=def

I don't know how robust this is, but try searching a word that will appear across subject areas, like "brown", to see the classification/subject labels. I read the tables into a database and, in a batch process, coded each call number division by how deep into the hierarchy it was linked - the number of indents, from 1 to 6. My ambition was then to find the most frequently used subject headings at each step of the hierarchy (limited to a workable range), to try to generate some semantic-net-like set of links between subject headings and classification. But I never was able to pursue that goal.

Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Sun, Oct 30, 2011 at 5:58 PM, David Friggens frigg...@waikato.ac.nz wrote: Clicking on one of Ben Shneiderman's treemapping projects reminded me that I've always thought treemaps [1] would serve well as a browsing interface for library and archive collections, because they work well with hierarchical data. I played around with this earlier in the year, wanting to provide a drill-down into our collections by call number.
For our Education Library's Teaching Collection, I used a three-level visualisation of items based on the Dewey hierarchy, coloured by the proportion of new (post-2006) items. I never put it online anywhere, so have attached it here. For Dewey it was pretty easy to get labels for the first three levels, and that seemed reasonable enough for most areas. But the majority of our items are LCC, and that's where I ran aground. The labels for the first two letters are readily available, but far too general to make this interesting. I couldn't seem to find any useful data in machine-readable format. Sourcing another level down from LoC [1] or Wikipedia [2] seems tantalisingly close, but there's a whole lot of manual effort in turning these (incomplete) ranges into something usable. Cheers, David

[1] http://www.loc.gov/catdir/cpso/lcco/
[2] http://en.wikipedia.org/wiki/Library_of_Congress_Classification

-- oʇɐʞıɐʍ ɟo ʎʇısɹǝʌıun uɐıɹɐɹqıן sɯǝʇsʎs
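The indent-depth coding Cindy describes (and the outline parsing David would need for the LCC ranges) might be sketched like this. It is a rough illustration only, under the assumption - purely hypothetical - that the classification outline is plain text with two spaces of indentation per hierarchy level, capped at the six levels Cindy mentions.

```java
public class CallNumberDepth {
    // Estimate the hierarchy depth of one line of a classification
    // outline from its leading indentation. Assumes (hypothetically)
    // two spaces per level; depth is capped at 6, matching the
    // "number of indents from 1 to 6" coding described above.
    static int depth(String line) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        return Math.min(spaces / 2 + 1, 6);
    }

    public static void main(String[] args) {
        System.out.println(depth("QA Mathematics"));              // top level
        System.out.println(depth("    QA76 Computer science"));   // two levels down
    }
}
```

The real effort, as David notes, is getting the outline into machine-readable form in the first place; this only covers the easy step after that.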
Re: [CODE4LIB] Examples of visual searching or browsing
Oh - it looks like the item display didn't survive the transition to IIS 7 - I'll look into that. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Mon, Oct 31, 2011 at 11:49 AM, Cindy Harper char...@colgate.edu wrote: A couple years ago, I used a crossmap of LC call numbers to subject headings (admittedly out of date) to provide a subject-labeled sort by call number on an experimental catalog search ...
[CODE4LIB] Archivists' Toolkit, Timeouts and Hibernate
I'm asking you all because it's not clear to me how to interact with the AT developers directly - the response from the ATUG list is rather slow - and I'm hoping you can give me a technical explanation a la "no, because..." rather than just a "no."

We're trying to adopt Archivists' Toolkit at Colgate. We don't have a Java developer in-house, but I'm exploring whether I can learn to address minor issues myself. We're a small liberal arts college, so library policy is to outsource as much infrastructure as possible (meaning open source is generally avoided). So the MySQL database is hosted on a Lunarpages server, and I can't adjust the timeout at the server level. But I suspect that the timeout we're seeing is not a timeout of the given MySQL transaction, but instead a problem with Hibernate persistence.

The symptom: we edit a record, then proceed to child records that require much editing - the chunk of data that my people are trying to enter at one time takes over 10 minutes to edit. During their editing of the child records, an error occurs. AT has added error code to sense when this is a JDBCConnectionException, and then it forces you to restart:

    if (errorText.contains("JDBCConnectionException")) {
        String message = "Database connection has been lost due to a server timeout.\n\n"
            + "Please RESTART the program to continue. If the problem persists, consult your System Administrator.";
    }

So what I did was add a connectTimeout=3600 parameter to the SessionFactory database URL. But I still seem to have trouble with the timeout.

Now, I acknowledge that understanding Hibernate and how it interacts with JDBC, and altering code in AT, may be getting over my head, and that what I probably should try next is either putting the database on my local MS SQL Server instance or on my test-server instance of MySQL (I don't have a local production instance of MySQL), and abandoning the hosted server. But can any of you add to my knowledge base here, and tell me:

- Is it possible to correct this problem easily in the AT code?
- Is the JDBCConnectionException due to the MySQL server timeout that is set by connectTimeout?
- Is simply adding a parameter to the database URL an effective way of making sure that that parameter is used in each openSession instance?

I know I have a lot to learn about Hibernate - I've located a book to skim in Books24x7, and I'll try Wikipedia to get a briefer initial grounding. Any other advice?

Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
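One detail worth checking: MySQL Connector/J's connectTimeout and socketTimeout properties are specified in milliseconds, so connectTimeout=3600 asks for a 3.6-second timeout, not an hour. A minimal sketch of building the JDBC URL with explicit values follows; the host and database names are made up for illustration.

```java
public class JdbcUrlExample {
    // Build a MySQL Connector/J URL with explicit timeout parameters.
    // Both connectTimeout (initial connection) and socketTimeout (reads
    // on an established connection) are in MILLISECONDS; 0 means no timeout.
    static String withTimeouts(String host, String db, long connectMs, long socketMs) {
        return "jdbc:mysql://" + host + "/" + db
            + "?connectTimeout=" + connectMs
            + "&socketTimeout=" + socketMs;
    }

    public static void main(String[] args) {
        // Hypothetical host/database: 10 s to connect, 1 h of socket idle.
        String url = withTimeouts("dbhost.example.com", "archivists_toolkit",
                                  10_000, 3_600_000);
        System.out.println(url);
    }
}
```

Whether a URL parameter actually reaches every connection Hibernate opens depends on how the SessionFactory is configured, so verifying the value in the Hibernate config file as well would be prudent.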
Re: [CODE4LIB] Archivists' Toolkit, Timeouts and Hibernate
Hi - I tried socketTimeout (I don't believe it's set in the SessionFactory code, but it may be in the Hibernate config), and then got a record-lock error after 5 minutes:

    java.lang.NullPointerException
        at org.archiviststoolkit.mydomain.DomainAccessObjectImpl.update(DomainAccessObjectImpl.java:228)
        at org.archiviststoolkit.util.RecordLockUtils.updateRecordLocksTime(RecordLockUtils.java:170)
        at org.archiviststoolkit.Main$1.run(Main.java:526)
        at java.lang.Thread.run(Unknown Source)

I'll try Chris's solution next. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Thu, Oct 6, 2011 at 4:12 PM, Cowles, Esme escow...@ucsd.edu wrote: Cindy - I think connectTimeout is used for making the initial connection to the database, but the error you describe sounds more like the initial connection succeeds and then there is a timeout afterwards. I think the socketTimeout parameter is what would control the timeout during an editing session, though the docs say both connectTimeout and socketTimeout default to 0 (no timeout): http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html Is socketTimeout specified in the JDBC config, by any chance? -Esme

-- Esme Cowles escow...@ucsd.edu "In the old days, an operating system was designed to optimize the utilization of the computer's resources. In the future, its main goal will be to optimize the user's time." -- Jakob Nielsen

On 10/6/2011, at 3:05 PM, Cindy Harper wrote: I'm asking you all because it's not clear to me how to interact with the AT developers directly ...
Re: [CODE4LIB] Usage and financial data aggregation
We're a III library (yes, I know), and we're looking into the new Sierra product's API, which promises to release some of this data to us for use in a third-party product such as you describe. III does have its proprietary Encore Reporter product, but I'm predicting some Sierra sites will look for an open-source product. I'd be very interested in working with others on such an effort. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Tue, Sep 13, 2011 at 5:08 PM, Jason Stirnaman jstirna...@kumc.edu wrote: Does anyone have suggestions or recommendations for platforms that can aggregate usage data from multiple sources, combine it with financial data, and then provide some analysis, graphing, data views, etc.? From what I can tell, something like Ex Libris' Alma would require all fulfillment transactions to occur within the system. I'm looking instead for something like Splunk that would accept log data, circulation data, usage reports, costs, and Sherpa/Romeo authority data, but then schematize it for data analysis and maybe push out reporting dashboards (nods to Brown Library: http://library.brown.edu/dashboard/widgets/all/). I'd also want to automate the data retrieval, so that might consist of scraping, web services, and FTP, but that could easily be handled separately. I'm aware there are many challenges, such as comparing usage stats, shifts in journal aggregators, etc. Does anyone have any cool homegrown examples or ideas they've cooked up for this? Pie in the sky? Jason

Jason Stirnaman Biomedical Librarian, Digital Projects A.R. Dykes Library, University of Kansas Medical Center jstirna...@kumc.edu 913-588-7319
Re: [CODE4LIB] Apps to reduce large file on the fly when it's requested
So I take it this would need a fast connection between Google and your server, but would tolerate a slow connection between the user and Google? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Thu, Aug 4, 2011 at 6:03 AM, Richard Wallis richard.wal...@talis.com wrote: Why not let someone else, such as Google, do the heavy lifting for you: https://docs.google.com/viewer ~Richard.

On 4 August 2011 07:39, Dave Caroline dave.thearchiv...@gmail.com wrote: One method is to dispense with PDF and just view the scanned pages online as images or OCR'd text, or point the user to a directory with the scans for the document. He then only needs an image viewer, using a lot less of his machine's memory. Large PDFs also cause problems on the viewing computer. I was reviewing someone's 25 MB PDF the other day and it peaked at 3.3 GB of memory use, which on a box with 2.5 GB of memory meant it went into swap and slowed to a crawl. The viewer used there was Evince. I scan to JPG and only produce a PDF if nagged: http://www.collection.archivist.info/archive/manuals/IS44_Tektronix_602_display_unit/ As I serve from home and the upload is on the slow side, individual pages help there too. And when in a good mood I finish off a document thus: http://www.collection.archivist.info/searchv13.php?searchstr=lucas+tp1 where all pages are web-viewable. Been too lazy to write a page-to-page link on the page view so far (need a round tuit). Dave Caroline

-- Richard Wallis Technology Evangelist, Talis Tel: +44 (0)7767 886 005 Linkedin: http://www.linkedin.com/in/richardwallis Skype: richard.wallis1 Twitter: @rjw IM: rjw3...@hotmail.com
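Richard's suggestion boils down to handing the Google viewer page a URL-encoded link to the remotely hosted document, so the heavy rendering happens on Google's side. A minimal sketch of building such a link; the document URL here is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class GDocsViewerLink {
    // Build a Google Docs Viewer link for a remotely hosted document.
    // The target URL must be percent-encoded before being passed as
    // the url= query parameter.
    static String viewerUrl(String docUrl) throws UnsupportedEncodingException {
        return "https://docs.google.com/viewer?url="
            + URLEncoder.encode(docUrl, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical scanned manual hosted on the library's own server.
        System.out.println(viewerUrl("http://example.org/scans/manual.pdf"));
    }
}
```

This only works for documents Google's servers can fetch, which is why the Google-to-server connection speed matters and the user-to-Google leg is the forgiving one.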
Re: [CODE4LIB] Access 2011 Conference - Early Bird Reminder
For those of us unable to attend, will handouts/video be posted? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363

On Wed, Jun 29, 2011 at 12:48 AM, Mark Jordan mjor...@sfu.ca wrote: It's only 112 days until the Access 2011 conference this fall. A friendly reminder to look at the schedule and start making plans to come to Vancouver! We're also still accepting Hackfest project suggestions: http://access2011.library.ubc.ca/hackfest/ The Early Bird registration rate is only available until August 1st: http://access2011.library.ubc.ca/registration/ At only $169/night, the conference hotel is quickly booking up. The rate is available for several days before and after the conference, but you must book through the link on the conference website: http://access2011.library.ubc.ca/hotel/ Open data, open source development and community building, digital preservation, and artful data visualization! We hope to see you here for all this and more, October 19-22, 2011 in Vancouver. Mark Jordan Access 2011 Conference Planning Committee Follow us on Twitter - @access_2011
Re: [CODE4LIB] ajaxy CRUD / weeding helper
The weeding project that we've started this year involves identifying only unneeded added copies and outdated editions. Rather than have the professional librarians examine every book on every shelf, I've suggested we prepare some lists of candidates that students can pull - titles with fewer than x uses in the past y years and more than one copy. We haven't historically recorded copy numbers on our records, so we can't tell from the data whether two item records are for the same entity, but I think student workers could probably handle checking for that. The next category - superseded editions - is more difficult to check for: they may have the same call number with a different date appended, etc. Has anyone done any work to match on author/title and identify the series of editions based on that? Or is there any other automation that would help with this weeding project? Our collection managers are skeptical that it can be automated in any way. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu
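On the superseded-editions question: a crude first pass is to normalize each record's author/title into a match key and flag keys that occur with more than one date. A sketch under the assumption that punctuation and leading English articles are the main sources of variation between editions' records; the sample records are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EditionMatcher {
    // Normalize an author/title pair into a rough match key:
    // lowercase, drop a leading article from the title, strip punctuation.
    static String matchKey(String author, String title) {
        String t = title.toLowerCase()
                        .replaceAll("^(the|a|an)\\s+", "")
                        .replaceAll("[^a-z0-9 ]", "").trim();
        String a = author.toLowerCase()
                         .replaceAll("[^a-z0-9 ]", "").trim();
        return a + "|" + t;
    }

    public static void main(String[] args) {
        // Invented records: {author, title, edition year}.
        String[][] recs = {
            {"Smith, John", "The Economics of Education", "1998"},
            {"Smith, John.", "Economics of education", "2005"},
            {"Doe, Jane", "Organic Chemistry", "2001"},
        };
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] r : recs) {
            groups.computeIfAbsent(matchKey(r[0], r[1]),
                                   k -> new ArrayList<>()).add(r[2]);
        }
        // A key seen with more than one year is a candidate edition cluster
        // for a student worker to verify in hand.
        groups.forEach((k, years) -> {
            if (years.size() > 1) System.out.println(k + " -> " + years);
        });
    }
}
```

The output is only a list of candidates to check at the shelf, which may be all the skeptical collection managers would concede to automating anyway.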
Re: [CODE4LIB] Group-sourced Google custom search site?
That's right. I see that Google didn't provide a -1 button on their +1 button experiment. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Wed, May 11, 2011 at 2:32 PM, Peter Noerr pno...@museglobal.com wrote: Just curious - what do you mean by "Some way to avoid the site-scrapers who populate the troubleshooting pages" (last sentence below)? I presume you are wishing to avoid the troubleshooting sites which consist of nothing more than pages copied from other sites, and look only at the prime source pages for information? Peter -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cindy Harper Sent: Monday, May 02, 2011 2:15 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Group-sourced Google custom search site? That reminds me - I was looking last week into the possibility of making a Google custom search site with either a whitelist of trusted technology sites, or a blacklist of sites to exclude. I haven't looked into whether the management of that could be group-sourced, but maybe someone else here has thought about this. I haven't looked into the terms of service of custom search sites, either. But of course slashdot was high on the whitelist. I was thinking about sites for several purposes - general technology news and opinion, or specific troubleshooting / programming sites. Some way to avoid the site-scrapers who populate the troubleshooting pages. Cindy Harper, Colgate U.
[CODE4LIB] Group-sourced Google custom search site?
That reminds me - I was looking last week into the possibility of making a Google custom search site with either a whitelist of trusted technology sites, or a blacklist of sites to exclude. I haven't looked into whether the management of that could be group-sourced, but maybe someone else here has thought about this. I haven't looked into the terms of service of custom search sites, either. But of course slashdot was high on the whitelist. I was thinking about sites for several purposes - general technology news and opinion, or specific troubleshooting / programming sites. Some way to avoid the site-scrapers who populate the troubleshooting pages. Cindy Harper, Colgate U.
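[Ed.: for what it's worth, Google's Custom Search Engine supported bulk include/exclude lists at the time via an annotations XML file uploaded through the CSE control panel. A hedged sketch only - the `Label` names are per-engine identifiers copied from one's own control panel, and "_cse_yourengineid" below is a placeholder:]

```xml
<Annotations>
  <!-- Whitelist: include everything under a trusted site. -->
  <Annotation about="slashdot.org/*">
    <Label name="_cse_yourengineid"/>
  </Annotation>
  <!-- Blacklist: engines also get an exclude label for filtering out
       scraper sites (the exact name comes from the control panel;
       the site below is a placeholder). -->
  <Annotation about="scraper-site.example.com/*">
    <Label name="_cse_exclude_yourengineid"/>
  </Annotation>
</Annotations>
```

Group-sourcing would then amount to version-controlling this file and re-uploading it.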
[CODE4LIB] Fwd: online courses- Sentiment Analysis, Text Mining
I recommend the courses from statistics.com - price reductions for educators... Do you see any possible applications for libraries? -- Forwarded message -- From: Peter Bruce ourcour...@statistics.com Date: Tue, Apr 26, 2011 at 4:09 PM Subject: online courses- Sentiment Analysis, Text Mining To: char...@mail.colgate.edu Dear ... : How are you (your organization/product/service) regarded in cyberspace? Thrifty with money and time, people are unsparing with their opinions using Twitter, Facebook, Yelp, Flixster, blogs, web forums, product reviews... Sentiment analysis is the relatively new art and science of distilling useful data from this mass of unstructured text. The first annual conference on this subject was just held in NYC (google "Sentiment Analysis Symposium"); one of the main presenters was Nitin Indurkhya and his staff from eBay. He will present two online courses at statistics.com in June and July: Jun 3 - Jul 1: Text Mining (4 weeks) Jul 8 - Jul 29: Sentiment Analysis (3 weeks) Text Mining will introduce the essential techniques of text mining - the extension of data mining's standard predictive methods to unstructured text. This course will discuss these standard predictive modeling techniques (some familiarity with these methods will help), and will devote considerable attention to the data preparation and handling methods that are required to transform unstructured text into a form in which it can be mined. Access to software is provided with the course text. Sentiment Analysis introduces you to the algorithms, techniques and software used in sentiment analysis. Their use will be illustrated by reference to existing applications, particularly product reviews and opinion mining. The course will try to make clear both the capabilities and the limitations of these applications. For real-world applications, sentiment analysis draws heavily on work in computational linguistics and text-mining.
At the completion of the course, a student will have a good idea of the field of sentiment analysis, the current state-of-the-art and the issues and problems that are likely to be the focus of future systems. Nitin Indurkhya is co-author of Text Mining (Springer), and co-editor of the Handbook of Natural Language Processing (CRC). Dr. Indurkhya is Principal Research Scientist at eBay. Previously, he was a Professor at the School of Computer Science and Engineering, University of New South Wales (Australia), as well as the founder and president of Data-Miner Pty Ltd, an Australian company engaged in data-mining consulting and education. Participants can ask questions and exchange comments directly with Dr. Indurkhya via a private discussion forum throughout each course. For details and to register: http://www.statistics.com/courses/data-mining-2/textmining/ http://www.statistics.com/courses/data-mining-2/sentiment-analysis/ The courses take place online at statistics.com in a series of weekly lessons and assignments, and require about 15 hours per week. Participate at your own convenience; there are no set hours when you must be online. Peter Bruce ourcour...@statistics.com P.S. Just let me know if you no longer wish to receive our course announcements. statistics.com 612 N. Jackson St. Arlington VA 22201 USA
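[Ed.: the "transform unstructured text into a form in which it can be mined" step the blurb describes usually starts with tokenization and term-frequency vectors. A minimal sketch, with an assumed toy stopword list - real pipelines use proper stemming, n-grams, and weighting such as tf-idf:]

```python
import re
from collections import Counter

STOPWORDS = frozenset({'the', 'a', 'an', 'is', 'of', 'and', 'to'})

def tokenize(text):
    """Lowercase and split on anything that isn't a letter."""
    return re.findall(r'[a-z]+', text.lower())

def term_vector(text):
    """Turn free text into a term-frequency vector (a Counter) -
    the basic form standard predictive methods can consume."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)
```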
[CODE4LIB] Semantic web introduction to tools
This article came in via email this morning - it may be the kind of pointers I needed to read about open-source tools to get started using the SW. *Computerworld First Look* -- *Semantic Web: Tools you can use* http://cwonline.computerworld.com/t/7258117/240182/376767/0/ Standards, tools, platforms, prewritten components and services are available to help make semantic deployments less time-consuming, less technically complex and (somewhat) less costly. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
Maybe I shouldn't be trolling code4lib for my personal interests, but I'm asking not about a mission-critical application, but a platform for keeping my personal skills up, and that would be accorded the proportionate amount of time. So I'd rather that not be a time sink for management, and I don't want to create a hacker-cracker's delight. My college is not enthused about librarians creating code or platforms that the college becomes responsible for maintaining - we're very abstemious in that regard. So I'm seeing how I can do this personally, spending my own cash, without burdening my college. Sorry to bother you all with it. Everyone's happy family is different, to hash a quote, but I hope I'm still welcome in Code4Lib, even if I'm not hired to be a library coder. Just a library (Windows) sys admin. Or maybe we need a spin-off code4lib for the amateurs among us. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Wed, Mar 23, 2011 at 10:55 AM, Bill Dueber b...@dueber.com wrote: On Wed, Mar 23, 2011 at 10:44 AM, Cary Gordon listu...@chillco.com wrote: You can probably find a curious intern to do it. Oh, for the love of god, please don't go this route. This is why libraries tend to be a huge mishmash of unsupported, one-off crap that some outgoing student did for extra credit six years ago. To ask the obvious question: You're at a real, honest-to-god prestigious college. Why are you trolling code4lib for cheap hosting environments? If IT won't give you a piece of a machine somewhere, or at least set up a Mac running OSX, they're failing to support a critical mission of the college and someone needs to be up in arms about it. If you haven't even asked them, well, maybe you should. -Bill, who spent his first two years in a library dealing with crappy old PHP code from long-gone students -- Bill Dueber Library Systems Programmer University of Michigan Library
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
Sorry not to give more info on my first request. But I'm a little shy about owning up to my untried idea, since there are a lot of IFs and unknowns about the whole project. By the time I get anywhere on this project, III will have made it possible to link Subject Guides from Encore, and my idea will have no chance of adoption at my own school. And I must admit, my goals are equal parts wanting to bring my idea to reality with my own hands and wanting to provide a possible aid toward addressing the problem that students don't always make the transition from Google to the library resources. My idea is/was this: I created a Firefox extension that fires upon a Google search. The idea is to identify the general subject area of the search, and pop up a notice about the pertinent library subject guides, reminding the user that these resources are paid for and selected by the libraries. My idea for identifying the general subject area was to use a catalog search, mapping the call numbers of the resulting hits to the given subject guides. So I prototyped this in ASP .NET, which is the platform I have most experience with, using YAZ/VB-Zoom to perform the Z39.50 search. This is an example of my prototype pop-up. A warning: I assumed it was a pop-up, so if you're seeing it in tabs, it's going to resize your browser. That needs work. http://lisv06.colgate.edu/aftergoogle/default.aspx?searchargs=iran+nuclear There are a lot of questions to be answered, though. What proportion of Google searches consist of phrases that could be found in the catalog? What proportion of Google searches (in our computer labs, for instance) need scholarly information? To answer those questions, I intended to log the searches and the success of the mapping. Well, my administration rejected my proposal to test the app in their reference area. So I showed my app to Andrew Darby, author of Subjects Plus, the app we use for our subject guides, and he was interested - if I could port it over to PHP.
And since it would need to support a variety of ILSs, Z39.50 still seems to be the most likely technology. That was last October, and I have yet to get PHP_YAZ working. Of course, this is a squeeze-in in my own spare time, so the time devoted to it is sporadic. And the idea of offering it to other small libs is also why I would want to have the app hosted. If Andrew were to offer it as a part of Subjects Plus, it would have to be something that a library like mine could support without a lot of in-house support. So I need to know what hosting service could make it easiest for a small library with a small staff. I know there are other questions - what kind of burden on the ILS would this be? I know Z39.50 is old technology, and there are probably other problems you all can predict. So that's what I'm up to. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Wed, Mar 23, 2011 at 11:28 AM, Jon Gorman jonathan.gor...@gmail.com wrote: On Wed, Mar 23, 2011 at 10:13 AM, Cindy Harper char...@colgate.edu wrote: Sorry to bother you all with it. Everyone's happy family is different, to hash a quote, but I hope I'm still welcome in Code4Lib, even if I'm not hired to be a library coder. Just a library (Windows) sys admin. Or maybe we need a spin-off code4lib for the amateurs among us. I think Bill meant why are you coming down here with us trolls when you're at such a nice place? You're quite welcome, although you've certainly got my curiosity up about why you want to run php_yaz in the first place. You didn't have much in the way of details in your initial email. It might change some people's advice if you're not intending the system to be a long-term production system. (And I'm still curious what systems are even using php_yaz) Jon Gorman
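[Ed.: the core step of the "after Google" idea - mapping the call numbers of catalog hits onto subject guides - can be sketched independently of Z39.50 or PHP. A hedged Python illustration: the guide names and LC ranges below are made up, and a real deployment would load the table from the subject-guide system rather than hard-coding it.]

```python
import re
from collections import Counter

# Hypothetical LC-class-to-guide table (illustrative only).
GUIDES = {
    'QA': 'Mathematics & Computer Science',
    'QC': 'Physics',
    'JZ': 'International Relations',
    'E':  'American History',
}

def guide_for(call_number):
    """Map one LC call number to a guide via its leading class
    letters, preferring the longest known prefix."""
    m = re.match(r'([A-Z]+)', call_number.upper())
    if not m:
        return None
    prefix = m.group(1)
    for length in range(len(prefix), 0, -1):
        if prefix[:length] in GUIDES:
            return GUIDES[prefix[:length]]
    return None

def suggest_guides(call_numbers, top=2):
    """Tally guides across the call numbers of the first hits and
    return the most common ones - the pop-up's suggestion list."""
    votes = Counter(g for g in map(guide_for, call_numbers) if g)
    return [guide for guide, _ in votes.most_common(top)]
```

The Z39.50 search itself (via php_yaz or any ZOOM binding) would just supply the `call_numbers` list.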
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
Sorry - what do you mean by "triggers their usage monitor" - CPU usage above a certain threshold? Or they don't allow compiles? I spoke with Bluehost, and they indicated that if I got SSH access, I could try to compile it myself. I'll check to see if this is possible with Lunarpages, which we now have accounts with. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Mon, Mar 7, 2011 at 1:58 PM, Ross Singer rossfsin...@gmail.com wrote: Cindy, I think this might be possible, depending on the provider. I have a site on Site5 and this seems pretty doable (it looks like I might have even tried this at some point, since I seem to have a compiled version of yaz in my home directory). It would probably take some rooting around in the forums to see how people are successfully installing PECL extensions, and it might take a few tries to compile yaz successfully (since if it triggers their usage monitor, they'll kill the process), but I think it would be worth a shot. I would definitely recommend this before jumping to a VPS (and let's be realistic, everybody, if you're being this blasé about running a VPS, you are either investing some time/expertise sysadmining it or you have an insecure server waiting to be exploited). Good luck! -Ross. On Mon, Mar 7, 2011 at 1:17 PM, Cindy Harper char...@colgate.edu wrote: I guess I was hoping to have service such as that provided by my current hosting service, where security, etc., updates for LAMP are all taken care of by the host. Any recommendations along those lines? One that provides that and still lets me install what I want? My service suggested that I go to a VPS account, where I'd have to do my own updates. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Mon, Mar 7, 2011 at 11:00 AM, Han, Yan h...@u.library.arizona.edu wrote: You can just buy a node from a variety of cloud providers such as Amazon EC2, Linode etc.
(It is very easy to build anything you want). Yan -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cindy Harper Sent: Sunday, March 06, 2011 10:54 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] LAMP Hosting service that supports php_yaz? At the risk of exhausting my quota of messages for the month - Our LAMP hosting service does not support PECL extension php_yaz. Does anyone know of a service that does? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
Thanks, Ross. So that's why they call it nice? As usual I have much to learn. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Tue, Mar 8, 2011 at 10:50 AM, Ross Singer rossfsin...@gmail.com wrote: Cindy, sorry, I realize that was vague. I have shell access on Site5, but since you're using shared resources, they monitor your CPU/memory usage. During high volume on a particular server, they'll kill processes that are running to make sure they can meet demands. This *could* happen when you're trying to compile something, which tends to be CPU-intensive, although it just depends. I've had their trigger kick in while trying to install ruby gems, although it's completely unpredictable (that is, based on all sorts of variables) - sometimes the gems install with no problem, other times they're killed. Compiling yaz is probably less of an issue (the makefile calls lots of things that run intensely, but quickly) than the pecl install of php/yaz. Running things in nice (http://linux.die.net/man/2/nice) probably helps your chances, but YMMV. I don't think this policy is exclusive to Site5; pretty much all of the major shared web hosting providers will have something similar in place, otherwise users could constantly have processes running in shells. Like I said, though, it shouldn't be a problem, it just might take a few tries (which will be less work, in the long run, than running your own VPS). -Ross. On Tue, Mar 8, 2011 at 10:05 AM, Cindy Harper char...@colgate.edu wrote: Sorry - what do you mean by "triggers their usage monitor" - CPU usage above a certain threshold? Or they don't allow compiles? I spoke with Bluehost, and they indicated that if I got SSH access, I could try to compile it myself. I'll check to see if this is possible with Lunarpages, which we now have accounts with.
Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Mon, Mar 7, 2011 at 1:58 PM, Ross Singer rossfsin...@gmail.com wrote: Cindy, I think this might be possible, depending on the provider. I have a site on Site5 and this seems pretty doable (it looks like I might have even tried this at some point, since I seem to have a compiled version of yaz in my home directory). It would probably take some rooting around in the forums to see how people are successfully installing PECL extensions, and it might take a few tries to compile yaz successfully (since if it triggers their usage monitor, they'll kill the process), but I think it would be worth a shot. I would definitely recommend this before jumping to a VPS (and let's be realistic, everybody, if you're being this blasé about running a VPS, you are either investing some time/expertise sysadmining it or you have an insecure server waiting to be exploited). Good luck! -Ross. On Mon, Mar 7, 2011 at 1:17 PM, Cindy Harper char...@colgate.edu wrote: I guess I was hoping to have service such as that provided by my current hosting service, where security, etc., updates for LAMP are all taken care of by the host. Any recommendations along those lines? One that provides that and still lets me install what I want? My service suggested that I go to a VPS account, where I'd have to do my own updates. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Mon, Mar 7, 2011 at 11:00 AM, Han, Yan h...@u.library.arizona.edu wrote: You can just buy a node from a variety of cloud providers such as Amazon EC2, Linode etc. (It is very easy to build anything you want). Yan -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cindy Harper Sent: Sunday, March 06, 2011 10:54 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] LAMP Hosting service that supports php_yaz?
At the risk of exhausting my quota of messages for the month - Our LAMP hosting service does not support PECL extension php_yaz. Does anyone know of a service that does? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
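[Ed.: Ross's `nice` suggestion in the thread above can be demonstrated from Python, too. The sketch below is Unix-specific and purely illustrative: `os.nice(n)` adds `n` to the process's "niceness" (higher = lower scheduling priority, which is why low-priority compiles are less likely to trip a shared host's usage monitor) and returns the new value.]

```python
import os

# os.nice(increment) adds to the current niceness and returns the
# new value; an increment of 0 just reads it without changing it.
before = os.nice(0)
after = os.nice(5)   # five points "nicer" = lower scheduling priority
print(before, after)
```

The shell equivalent for a build would be running each step under `nice`, e.g. `nice ./configure` and `nice make`.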
Re: [CODE4LIB] LAMP Hosting service that supports php_yaz?
I guess I was hoping to have service such as that provided by my current hosting service, where security, etc., updates for LAMP are all taken care of by the host. Any recommendations along those lines? One that provides that and still lets me install what I want? My service suggested that I go to a VPS account, where I'd have to do my own updates. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Mon, Mar 7, 2011 at 11:00 AM, Han, Yan h...@u.library.arizona.edu wrote: You can just buy a node from a variety of cloud providers such as Amazon EC2, Linode etc. (It is very easy to build anything you want). Yan -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cindy Harper Sent: Sunday, March 06, 2011 10:54 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] LAMP Hosting service that supports php_yaz? At the risk of exhausting my quota of messages for the month - Our LAMP hosting service does not support the PECL extension php_yaz. Does anyone know of a service that does? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] online course on the semantic web?
The JHU course is a semester-long equivalent, and is in the $3000 range. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Sat, Mar 5, 2011 at 7:51 PM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: On Mar 5, 2011, at 3:01 PM, Cindy Harper wrote: Well, I just walked my 80-year-old mother through setting up her wireless router and wireless on her desktop and laptop via telephone NY-to-VA, and now I feel like I can think about another challenge for the coming season(s). Does anyone know of a good online course that's an introduction to semantic web technology that they could recommend? My goals are simply to understand more and be able to code a little, and afterward apply it to linked data. I know of one course this summer at Johns Hopkins Engineering for Professionals program http://ep.jhu.edu/course-homepages/viewpage.php?homepage_id=2993, but it's rather pricey. Anyone know of cheaper options or creative ideas for funding? I don't know how introductory it'd be, but ASIST has been doing a lot of 'webinars' this year, and there are ones coming up on the 9th and 13th on linked data, and the first one sounds like it'll cover some semantic web issues: http://asis.org/Conferences/webinars/2011/linked-data.html (I can't compare prices to the JHU one, as I didn't see any pricing on the JHU site; this round of ASIST webinars are $25 for members, $59 for non-members; some in the past have been free for ASIST members) Also, looking at MIT's Open Courseware catalog, I see a few individual lessons that might be applicable: http://ocw.mit.edu/index.htm In the past, I've looked at some of the courses from W3schools (not affiliated with W3C, but has some tutorials on various things related to the web).
They tend to be fairly introductory, but they have two that might be of interest: http://www.w3schools.com/rdf/default.asp http://www.w3schools.com/semweb/default.asp -Joe - Joe Hourcle Programmer/Analyst Solar Data Analysis Center Goddard Space Flight Center
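[Ed.: for anyone who wants the ten-second version of what those RDF tutorials cover - linked data boils down to subject-predicate-object triples, with URIs as names. A tiny hand-written example in Turtle syntax; the record URI is illustrative (`example.org`), while the `dc:` namespace is the real Dublin Core elements vocabulary:]

```turtle
@prefix dc: <http://purl.org/dc/elements/1.1/> .

# One bibliographic resource, described by two triples.
<http://example.org/catalog/record/1234>
    dc:title   "Walden" ;
    dc:creator "Thoreau, Henry David" .
```

The "linked" part comes from replacing literal strings like the creator with URIs (e.g. an authority-file identifier) that other datasets also use.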
[CODE4LIB] LAMP Hosting service that supports php_yaz?
At the risk of exhausting my quota of messages for the month - Our LAMP hosting service does not support PECL extension php_yaz. Does anyone know of a service that does? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] online course on the semantic web?
Thanks to Jerry and to Joe and Karen - these links look good! Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Sun, Mar 6, 2011 at 12:59 PM, Jerry Persons jpers...@stanford.edu wrote: A couple of things you might look at: [1] just published, free HTML version ... Heath and Bizer http://linkeddatabook.com/editions/1.0/ Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition). Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool. [2] the materials for the 2-day intro to Web of Data from Talis http://dynamicorange.com/2010/11/03/web-of-data/ leading to: http://api.talis.com/stores/training/items/training.html http://api.talis.com/stores/training/items/exercises.html -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cindy Harper Sent: Sunday, March 06, 2011 9:32 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] online course on the semantic web? The JHU course is a semester-long equivalent, and is in the $3000 range. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Sat, Mar 5, 2011 at 7:51 PM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: On Mar 5, 2011, at 3:01 PM, Cindy Harper wrote: Well, I just walked my 80-year-old mother through setting up her wireless router and wireless on her desktop and laptop via telephone NY-to-VA, and now I feel like I can think about another challenge for the coming season(s). Does anyone know of a good online course that's an introduction to semantic web technology that they could recommend? My goals are simply to understand more and be able to code a little, and afterward apply it to linked data. I know of one course this summer at Johns Hopkins Engineering for Professionals program http://ep.jhu.edu/course-homepages/viewpage.php?homepage_id=2993, but it's rather pricey.
Anyone know of cheaper options or creative ideas for funding? I don't know how introductory it'd be, but ASIST has been doing a lot of 'webinars' this year, and there are ones coming up on the 9th and 13th on linked data, and the first one sounds like it'll cover some semantic web issues: http://asis.org/Conferences/webinars/2011/linked-data.html (I can't compare prices to the JHU one, as I didn't see any pricing on the JHU site; this round of ASIST webinars are $25 for members, $59 for non-members; some in the past have been free for ASIST members) Also, looking at MIT's Open Courseware catalog, I see a few individual lessons that might be applicable: http://ocw.mit.edu/index.htm In the past, I've looked at some of the courses from W3schools (not affiliated with W3C, but has some tutorials on various things related to the web). They tend to be fairly introductory, but they have two that might be of interest: http://www.w3schools.com/rdf/default.asp http://www.w3schools.com/semweb/default.asp -Joe - Joe Hourcle Programmer/Analyst Solar Data Analysis Center Goddard Space Flight Center
[CODE4LIB] online course on the semantic web?
Well, I just walked my 80-year-old mother through setting up her wireless router and wireless on her desktop and laptop via telephone NY-to-VA, and now I feel like I can think about another challenge for the coming season(s). Does anyone know of a good online course that's an introduction to semantic web technology that they could recommend? My goals are simply to understand more and be able to code a little, and afterward apply it to linked data. I know of one course this summer at Johns Hopkins Engineering for Professionals program http://ep.jhu.edu/course-homepages/viewpage.php?homepage_id=2993, but it's rather pricey. Anyone know of cheaper options or creative ideas for funding? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] online course on the semantic web?
Now that I think about it, this may be an opportunity to apply another idea that I was exploring in another context: I had written to syslib-l looking for anyone interested in collaborating on a staff technology training wiki that would link staff to free and authoritative web-based resources on a range of technology training subjects. Would anyone be interested in applying that idea to code4lib technology learning? How much effort would be required for someone who's well acquainted with the Semantic Web to contribute to a site that lists texts or curriculum for those who are interested in learning? I don't know if this is doable. Anyone interested? Or should I just find myself a text and wade through it? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Sat, Mar 5, 2011 at 3:01 PM, Cindy Harper char...@colgate.edu wrote: Well, I just walked my 80-year-old mother through setting up her wireless router and wireless on her desktop and laptop via telephone NY-to-VA, and now I feel like I can think about another challenge for the coming season(s). Does anyone know of a good online course that's an introduction to semantic web technology that they could recommend? My goals are simply to understand more and be able to code a little, and afterward applying it to linked data? I know of one course this summer at Johns Hopkins Engineering for Professionals program http://ep.jhu.edu/course-homepages/viewpage.php?homepage_id=2993, but it's rather pricey. Anyone know of cheaper options or creative ideas for funding? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] A suggested role for text mining in library catalogs?
Sorry - it's more reflective of me and my amateur status Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Tue, Feb 22, 2011 at 9:46 AM, Rob Casson rob.cas...@gmail.com wrote: And I probably should have added to your thread on NGC4LIB, rather than Code4lib - I tend to conflate them. i'm offended ;)
Re: [CODE4LIB] A suggested role for text mining in library catalogs?
It's not ironic - my post was musing inspired by your work. I guess I wasn't sure if I understood your results. You were looking at the overall POS usage in the entire texts as a possible way of ranking the texts. I was wondering about the POS of particular search terms - those that could take on several POS. A related question - does Solr use stemming to widen the search to various POS? Then would it be meaningful to rank the given texts by the POS of the actual search terms? And has anyone looked at samples of user search terms - are they almost always noun phrases? Just wanting to understand what you have explored. And I probably should have added to your thread on NGC4LIB, rather than Code4lib - I tend to conflate them. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Sat, Feb 19, 2011 at 5:42 PM, Eric Lease Morgan emor...@nd.edu wrote: On Feb 19, 2011, at 11:26 AM, Cindy Harper wrote: I just was testing our discovery engine for any technical issues after a reboot. I was just using random single words, and one word I used was correct. Looking at the first ranked items, I wondered if there's some role for parts-of-speech in ranking hits - are nouns and, in this case, adjectives more indicative of aboutness than verbs? The first items were Miss Manners' ... Excruciatingly Correct Behavior, then a bunch of govdocs on an act to correct. I don't think there's any reason to prefer nouns over verbs, but I thought I'd throw the thought at you anyway. Ironically, I was playing with parts-of-speech (POS) analysis the other day. [1] Using a pseudo-random sample of texts, I found there to be surprisingly similar POS usage between texts. With such similarity, I thought it would be difficult to use general POS as a means for ranking or sorting. On the other hand, specific POS may be useful. For example, Thoreau was dominated by first-person male pronouns but Austen was dominated by third-person female pronouns.
I think there is something to be explored here. [1] POS - http://bit.ly/hsxD2i -- Eric Still Counting Tweets and Chats Morgan
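[Ed.: on the stemming question in the thread above - a toy illustration of why POS distinctions tend to vanish at index time. A naive suffix-stripping stemmer (loosely Porter-flavored, vastly simplified; the suffix list is an assumption for illustration) maps the noun, verb, adjective, and adverb forms of "correct" onto a single index term:]

```python
# Longest suffixes first, so "correction" hits "ion" and not "s" etc.
SUFFIXES = ('ations', 'ation', 'ings', 'ing', 'ion', 'ly', 'ed', 'es', 's')

def stem(word):
    """Strip the first matching suffix, keeping a stem of at least
    three letters. A real engine would use Porter or Snowball."""
    word = word.lower()
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[:-len(suf)]
    return word
```

Once "correct", "correctly", "corrected", and "correction" all index as one stem, ranking by the POS of the user's exact search term would require keeping POS information that stemming has already thrown away.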
[CODE4LIB] A suggested role for text mining in library catalogs?
I just was testing our discovery engine for any technical issues after a reboot. I was just using random single words, and one word I used was correct. Looking at the first ranked items, I wondered if there's some role for parts-of-speech in ranking hits - are nouns and, in this case, adjectives more indicative of aboutness than verbs? The first items were Miss Manners' ... Excruciatingly Correct Behavior, then a bunch of govdocs on an act to correct. I don't think there's any reason to prefer nouns over verbs, but I thought I'd throw the thought at you anyway. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] simple, flexible ILS for a small library
Joanne indicated there's a negative scanner on Level 5 - is that true? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Wed, Oct 6, 2010 at 8:42 AM, Jon Gorman jonathan.gor...@gmail.com wrote: On Wed, Oct 6, 2010 at 12:05 AM, Susan Kane adarconsult...@gmail.com wrote: I wonder if this person might be better served by some kind of bartering software. I wasn't sure there was such a thing -- but of course, there is. http://www.curomuto.com/ http://www.barter-blog.com/?p=51 There's also some book-swap and book exchange sites and projects out there. It would be interesting to try to merge the three. The OP seems to really want to use VuFind, so if you could use a book swap software on the administrative side and have that update the catalog, that would be useful. Or maybe even see if it's possible to create drivers for some book swap. The problem with the book-swap is it is usually one person to one person, end to end. There's no in-between library. Book Mooch is the one I've heard a lot about, although I haven't ever used it. Jon Gorman
Re: [CODE4LIB] [NGC4LIB] CSU library finds 40% of collection hasn't circulated
My mistake - wrong list! On Tue, Oct 5, 2010 at 10:59 AM, Cindy Harper char...@colgate.edu wrote: Colgate University built an on-site ASRS in 2005 as part of renovating our entire main library. During the 2 years of construction on
[CODE4LIB] Fwd: [NGC4LIB] CSU library finds 40% of collection hasn't circulated
Colgate University built an on-site ASRS in 2005 as part of renovating our entire main library. During the 2 years of construction on the building, our services were dispersed among several buildings on campus, and the high-use portion of the collection that remained available to our students during that time was entirely housed in the ASRS, requested through our online catalog, and delivered to our circulation point in utility-vehicle loads. Of course, we also made major use of the ConnectNY user-initiated resource sharing and traditional ILL. There was user dissatisfaction at first, but one thing we learned is that patrons were greatly pleased when we ran a public awareness campaign to show them how to virtually browse the stacks in call-number order using the OPAC. The other thing we heard when we moved back into our renovated building was that students were disappointed that they had to go to the stacks and find the books themselves! And faculty were disappointed when we stopped delivering directly to their offices, of course - but we want them to come to the library :) . When we opened the new building, we brought up the Encore discovery system and blended it into the classic OPAC site as our keyword search (classic indexes are still available in other tabs). Encore doesn't have a virtual call-number browse feature, but we have asked for this as an enhancement - either a linear browse of the shelves, or a hierarchical call-number facet drill-down. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363 On Tue, Oct 5, 2010 at 10:01 AM, Emily Lynema emily_lyn...@ncsu.edu wrote: I agree with Dan that it is a bit of a moot point to argue about the benefits of moving materials to off-site storage. It is absolutely going to happen. But here's the thing: it's been happening for years as we buy more and more e-books and digital collections.
If the argument is that users need to be able to 'browse' the physical stacks, they've already been unable to discover digital materials in this way for some time now. But here's where I think this topic does tie in with NGC4LIB. The question we should be asking ourselves is: What are our patrons losing when we move our physical print materials off-site? Are there tools we can build to help them recover that usefulness in new ways? It's for that exact reason that we are continuing to explore different, enriched ways to browse the collection virtually at NCSU, in addition to thinking about what enhanced delivery services we can offer to our patrons to make it easier and more reliable to get a book out of an automated retrieval system than it was to go find it in the stacks. I bet there are a lot of cool new discovery tools we could think about that would make both digital collections AND materials stored off-site accessible to our patrons. As for the use case that Tim pointed out, it seems like those materials should have been part of a reference collection of some sort. It goes without saying that as libraries contemplate major changes like these, our job is to be listening to our patrons so that we can learn what mistakes we might have made and remedy them. An interesting idea that has been tossed around here is to retain on open browsing shelves the materials most recently pulled from the ARS. Perhaps that would need to include materials most frequently pulled from the ARS, too. -emily -- Date: Fri, 1 Oct 2010 13:46:24 -0400 From: Dan Scott d...@coffeecode.net Subject: Re: CSU library finds 40% of collection hasn't circulated On Fri, Oct 1, 2010 at 7:29 AM, Kyle Banerjee baner...@uoregon.edu wrote: We're going to move out the books that are never checked out, the ones that are never used anymore. I hope they're not relying exclusively on circ transaction data to discover what is never used.
I realize this may sound insane, but a lot of materials are actually used *in the library* without being checked out. The nature of the resource and the people who use it have a lot to do with this. Years ago, a place I worked at did a major weeding and storage project. Just to be safe, we had the shelvers look at our proposed list, which contained tens of thousands of items, to see if any of them jumped out as things they recognized as materials that were used. While most were not, there were certainly a number that were. That's not at all insane. In fact, we use our next-generation ILS (Evergreen - did y'all catch that valiant attempt to link this thread to the supposed topic of the mailing list?) to record in-house uses, and when we did our own PR-free move of items from the stuffed circulation stacks into storage this summer, we used a combination of lack of circulation since 1985 and lack of recorded in-house uses since 2003 to determine likely suspects for movement into storage.
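The selection rule Dan describes - no circulation since 1985 and no recorded in-house use since 2003 - is easy to express as a filter over item records. A minimal sketch; the record layout and field names here are invented for illustration and are not Evergreen's actual schema:

```python
from datetime import date

# Hypothetical item records; field names are illustrative only.
# None means no use of that kind was ever recorded.
items = [
    {"barcode": "A1", "last_circ": date(1979, 5, 1), "last_inhouse": date(1999, 3, 2)},
    {"barcode": "A2", "last_circ": date(2009, 1, 15), "last_inhouse": None},
    {"barcode": "A3", "last_circ": None, "last_inhouse": date(2008, 6, 30)},
]

CIRC_CUTOFF = date(1985, 1, 1)
INHOUSE_CUTOFF = date(2003, 1, 1)

def storage_candidates(items):
    """Items with no circulation since 1985 AND no in-house use since 2003."""
    def stale(last_use, cutoff):
        return last_use is None or last_use < cutoff
    return [i["barcode"] for i in items
            if stale(i["last_circ"], CIRC_CUTOFF)
            and stale(i["last_inhouse"], INHOUSE_CUTOFF)]

print(storage_candidates(items))  # ['A1']
```

Note the conjunction: item A3 above has no circulation at all but a 2008 in-house use, so it stays in the open stacks - which is exactly the point of recording in-house uses before trusting circ data alone.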
[CODE4LIB] Innovative's Synergy
Hi All - III is touting their web-services based Synergy product as having the efficiency of a pre-indexed service and the timeliness of a just-in-time service. Does anyone know if the agreements they have made with database vendors to use these web services preclude an organization developing an open-source client to take advantage of those web services? Just curious. I suppose I should direct my question to EBSCO and Proquest directly. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
[CODE4LIB] Google Book Search staff member?
Hi - I wonder - if the Google Book Search staff member who attended C4L10 is monitoring this list, could he contact me off-list? I didn't get a chance to continue the conversation that we almost started while waiting for dinner transportation Tuesday night, and I wonder what Google thinks of some ideas I have for using GBS data. I didn't even get your name! Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
[CODE4LIB] Female roommate wanted for Code4libcon
Hi - I've booked a room at the Marriott. I would like to share with a female roommate to cut costs. Must mention that I snore, alas. Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
[CODE4LIB] Bookmarking web links - authoritativeness or focused searching
I've been thinking about the role of libraries as promoters of authoritative works - helping to select and sort the plethora of information out there. And I heard another presentation about social media this morning. So I thought I'd bring up for discussion here some of the ideas I've been mulling over. Last week I sent this message to the Suggestions and Ideas forum at Delicious: http://support.delicious.com/forum/comments.php?DiscussionID=3237&page=1#Item_0 The basic idea is to develop a Delicious network of librarians, or a network of faculty members, then have one login whose network included those users, and share that login so that lots of people could share that network. Delicious responded that we could have a wiki where people posted their Delicious names so that others could add them to their personal networks, but that doesn't scale up very well. Or another project I've toyed with, involving focused searching: I started with Robert Teeter's index to Great Books lists, http://www.interleaves.org/~rteeter/grtalphaa.html. I've almost completed pulling them into a MySQL database so that I can sort the titles by the number of Great Books lists that mention each title. Then I thought about how one could do focused searching of the web, collecting pages with a title containing (best and books) or (great and books), and screen-scraping title lists (you'd have to have some heuristic method of identifying the data, of course, and I'm aware of what problems might arise there). But my test searches on that idea showed that one runs into a lot of commercial, ephemeral lists and spurious lists. Now, you could rely on crowd-sourcing to filter out the consensus by ranking by the number of sites/cites. But I thought you might want to differentiate between the sources - .edus, libraries, etc.
So that led me to speculate about a search engine that ranked just by links from .edus, library sites, and a librarian-vetted list of .orgs, scholarly publishers, etc. I think you can limit by .edu in the linked-from in Google - I haven't tried that much. If anyone here has experience using that technique, I'd like to hear about it. But I'm thinking now about the possibility of a search engine limited to sites cooperatively vetted by librarians, which would incorporate ranking by number of links - something more responsive than cataloging websites in our catalogs. Is anyone else thinking about these ideas? Or do you know of projects that approach this goal of leveraging librarians' vetting of authoritative sources? Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
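The "rank by links from vetted sources" idea could be prototyped by counting only inbound links whose source host passes a whitelist check. A small sketch - the vetted hosts and the link data below are made up for illustration:

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical vetted sources: .edu hosts plus a librarian-curated host list.
VETTED_SUFFIXES = (".edu",)
VETTED_HOSTS = {"www.ala.org", "www.loc.gov"}

def is_vetted(url):
    """True if the URL's host is on the curated list or under a vetted TLD."""
    host = urlparse(url).netloc
    return host in VETTED_HOSTS or host.endswith(VETTED_SUFFIXES)

def rank_by_vetted_links(links):
    """links: iterable of (from_url, to_url) pairs, e.g. from a crawl.
    Count a link toward the target only when the citing page is vetted."""
    counts = Counter(to for frm, to in links if is_vetted(frm))
    return counts.most_common()

links = [
    ("https://library.colgate.edu/lists", "http://example.org/great-books"),
    ("https://www.ala.org/reads", "http://example.org/great-books"),
    ("https://spammy.biz/best-books", "http://example.org/best-sellers"),
]
print(rank_by_vetted_links(links))
```

In the example, the commercial list gets no score at all because its only citer fails the whitelist - which is the filtering behavior the post is after, at the cost of maintaining the whitelist cooperatively.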
Re: [CODE4LIB] indexing pdf files
We're just talking about creating an index, not a separate copy of the works, right? Because I imagine that copyright has a lot to do with why this type of thing doesn't already exist. On Wed, Sep 16, 2009 at 3:08 PM, Eric Lease Morgan emor...@nd.edu wrote: Eric Morgan wrote: http://infomotions.com/highlights/ Rosalyn Metz wrote: I have librarians that would kill for this. In fact I was talking to one about it the other day. She felt there must be a way to handle active reading and make it portable. This would be great in conjunction with RefWorks or Zotero or something along those lines. Yep, when I was creating this application for myself I wondered what it would be like if a whole group, say, an academic department, were to systematically contribute to such a thing. I thought the output would be pretty exciting. Mark A. Matienzo wrote: Have you considered using Solr's ExtractingRequestHandler [1] for the PDFs? We're using it at NYPL with pretty great success. [1] http://wiki.apache.org/solr/ExtractingRequestHandler Nope, never saw that previously. Thanks for the pointer. Peter Kiraly wrote: I would like to suggest an API for extracting text (including highlighted or annotated text) from PDF: iText (http://www.lowagie.com/iText/). This is a Java API (it has a C# port), and it helped me a lot when we worked with extraordinary PDF files. More tools! Thank you. danielle plumer wrote: My (much more primitive) version of the same thing involves reading and annotating articles using my Tablet PC. Although I do get a variety of print publications, I find I don't tend to annotate them as much anymore. I used to use EndNote to do the metadata, then I switched to Zotero. I hadn't thought to try to create a full-text search of the articles -- hmm. Yes, for a growing number of the tools I create I need to be thinking about Zotero as a way of remembering content. Thanks for... reminding me.
Erik Hatcher wrote: Here's a post on how easy it is to send PDF documents to Solr from Java: http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/ I'm looking forward to the arrival of my Solr books any day now. After reading it I hope to have a better handle on the guts of Solr as well as increase my abilities to do the sorts of things discussed at the URL above. Thank you, one and all for your replies. -- Eric Morgan -- Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
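For anyone wanting to script the Solr Cell route Mark and Erik mention: the extract handler takes the raw document plus `literal.*` parameters for fields you want attached to the indexed record. A minimal stdlib sketch that only builds the HTTP request - the host and core name (`mycore`) are assumptions, so adjust for your installation:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Solr location and core name - change for your installation.
SOLR_BASE = "http://localhost:8983/solr/mycore"

def extract_request(pdf_bytes, doc_id):
    """Build a POST to Solr's ExtractingRequestHandler (Solr Cell), which
    runs Tika over the document and indexes the extracted text under doc_id."""
    qs = urlencode({"literal.id": doc_id, "commit": "true"})
    return Request(f"{SOLR_BASE}/update/extract?{qs}",
                   data=pdf_bytes,
                   headers={"Content-Type": "application/pdf"},
                   method="POST")
```

Passing the result to `urllib.request.urlopen` would send it to a running Solr instance; error handling, batching, and multipart upload variants are left out of the sketch.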
Re: [CODE4LIB] R?
I took some online courses in data mining last year at statistics.com, some of which featured R. I was pleased with it, although I haven't tried to integrate it into any programming project, and I only scratched the surface. I also would highly recommend the courses at statistics.com. Now if I could just work out the data collection to make use of the data-mining techniques on our library data. On Thu, Sep 10, 2009 at 9:59 AM, Glen Newton - NRC/CNRC CISTI/ICIST Research glen.new...@nrc-cnrc.gc.ca wrote: William == William Denton w...@pobox.com writes: William Are any of you using R? http://www.r-project.org/ I use R for a number of things, including the multidimensional scaling (512 -> 2) I do here: http://zzzoot.blogspot.com/2009/07/project-torngat-building-large-scale.html It is fast, backed by the stats brainiacs, and has a huge number of domain-specific modules (biology, genomics, geology, engineering, ...). It is great. Slices bread, juliennes fries, casts my votes, does my taxes, feeds my dogs, and submits my postings to code4lib. ;-) -glen William == William Denton w...@pobox.com writes: William Are any of you using R? http://www.r-project.org/ William Blog about R, info viz, etc.: William http://blog.revolution-computing.com/ William I have something in mind I'm going to try fooling around William with in R, but I wondered if anyone was using it for William visualizing searches, usage, networks of information, William that kind of thing.
William Bill -- William Denton, Toronto : miskatonic.org William www.frbr.org openfrbr.org -- Glen Newton | glen.new...@nrc-cnrc.gc.ca Researcher, Information Science, CISTI Research NRC W3C Advisory Committee Representative http://tinyurl.com/yvchmu tel/tél: 613-990-9163 | facsimile/télécopieur: 613-952-8246 Canada Institute for Scientific and Technical Information (CISTI) National Research Council Canada (NRC) | M-55, 1200 Montreal Road http://www.nrc-cnrc.gc.ca/ Institut canadien de l'information scientifique et technique (ICIST) Conseil national de recherches Canada | M-55, 1200 chemin Montréal Ottawa, Ontario K1A 0R6 Government of Canada | Gouvernement du Canada -- -- Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
Re: [CODE4LIB] A little Google Book Search project: GoogleBSCites - Ranking by Google Book Search
I thought someone out there might be interested in a poster session I just did at the Innovative Users Group Conference 2009. I undertook the project because I was personally interested in the outcome, and because I look forward to the day when these data will be available - from Google, from the Internet Archive, from HathiTrust, from ... It's fraught with problems, with both recall and precision errors, but I call it an approximation of citation searching for the books in the Colgate collection, ranking them by the number of hits. I took about 688,000 monographic records that had both an author and a title from the Colgate library catalog and constructed a search in Google Book Search. Since I wanted to find citations - or other books that mentioned the book in question - I didn't restrict by field. The query was the title phrase from 245 subfields a and b, up to 10 words long, plus: the first two words of the author (if a personal author); the author phrase (if a conference author); or the first 6 words of the author (if a corporate author). I searched these over the course of 3/1/2009 - 4/27/2009 at less than 380 searches an hour (it took 3 machines to get the job done in 6 weeks), then screen-scraped Google's reported "1 to 8 of # hits" counts. The results rank these by the number of citations. http://lisv06.colgate.edu/GBSCites/default.aspx My results omit GovDocs for the time being, since I forgot to download the 086s into the records - I could add that later. Those corporate bodies are problems in my search strategy anyway. I did include them in the search portion of the project. I don't know how many users this MySQL site will support - it's entirely un-stress-tested, but I trust you won't all go searching it at once. -- Cindy Harper, Systems Librarian Colgate University Libraries char...@colgate.edu 315-228-7363
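For the curious, the query-construction rules described above (245 $a $b title phrase capped at ten words, plus an author fragment whose length depends on the author type) might look roughly like this in code. This is a reconstruction from the post's description, not the actual scripts used for the project:

```python
def gbs_query(title, author, author_type):
    """Build a search string per the rules described in the post:
    a quoted title phrase (245 $a $b, max 10 words) plus an author
    fragment sized by author type (personal/conference/corporate)."""
    title_phrase = " ".join(title.split()[:10])
    words = author.split()
    if author_type == "personal":
        author_part = " ".join(words[:2])       # first two words of the author
    elif author_type == "conference":
        author_part = f'"{author}"'             # the whole author phrase
    else:                                       # corporate author
        author_part = " ".join(words[:6])       # first six words of the author
    return f'"{title_phrase}" {author_part}'

print(gbs_query("Walden, or Life in the Woods", "Thoreau, Henry David", "personal"))
```

Screen-scraping the result counts and the politeness throttling (under 380 searches an hour) would sit on top of this; they are omitted here.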