Re: [CODE4LIB] code4lib services and https
SSL is security theatre unless people start doing it better. SSL adds a layer of complexity that is easy to get wrong, and the library community is systematically getting it wrong (picking on some big names because they're tough enough to take it, not because they noticeably do it any better or worse):

https://www.ssllabs.com/ssltest/analyze.html?d=viaf.org
https://www.ssllabs.com/ssltest/analyze.html?d=code4lib.org
https://www.ssllabs.com/ssltest/analyze.html?d=loc.gov

I'd implore you to check a couple of sites local to you and ping the administrators if they don't get the all clear. In some cases there are reasons why security might be lagging on a particular site (third-party hosting, third-party clients connecting using out-of-date SSL libraries, the need to support browsers many years out of the patch cycle, etc), but that's the kind of thing that needs to be an explicit policy.

cheers
stuart
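For checking a batch of sites, SSL Labs also exposes the scan data as JSON. A minimal sketch -- the v3 `analyze` endpoint and `fromCache` parameter are assumptions from memory, so check their current API docs and usage terms before leaning on it:

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the SSL Labs assessment API (v3); verify against
# their published documentation before use.
API = "https://api.ssllabs.com/api/v3/analyze"

def analyze_url(host):
    """Build the analyze URL for a host; fromCache=on avoids forcing a rescan."""
    return API + "?" + urllib.parse.urlencode({"host": host, "fromCache": "on"})

def grade(host):
    """Fetch the grade of the first endpoint, or None if no scan is available yet."""
    with urllib.request.urlopen(analyze_url(host)) as resp:
        report = json.load(resp)
    endpoints = report.get("endpoints") or []
    return endpoints[0].get("grade") if endpoints else None

# e.g. grade("viaf.org") -- needs network access, and a fresh scan can
# take a couple of minutes before a grade appears.
url = analyze_url("viaf.org")
```

A cron job over your local institutions' hostnames, mailing you when a grade drops, would make the "ping the administrators" step routine rather than heroic.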
Re: [CODE4LIB] Open Journal Systems - experiences?
On 17/12/14 04:23, Tania Fersenheim wrote: I have some staff interested in a pilot of Open Journal Systems. http://openjournalsystems.com/ Anyone here have experiences with the software they'd like to share, either installed locally or hosted by the OJS folks? I'm especially interested in how responsive the developers and the user community are.

The software is OJS and can be found at https://pkp.sfu.ca/ojs/ The link you used points to a third-party host of that software.

My experience with a locally-installed, out-of-the-box OJS running half a dozen journals (some doing retrospective digitisation, so reasonably large) has been great. Two upgrades have gone very painlessly. As with most open source software, most of the development is left to individual users to do (or to pay third parties to do). For example, I reported that the MARCXML output is completely useless and got this response: http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=9019 On the positive side, I have the tools to provide better MARCXML; on the downside, I have to wrangle the time to do that.

cheers
stuart
Re: [CODE4LIB] MARC reporting engine
Thank you to all who responded with software suggestions. https://github.com/ubleipzig/marctools is looking like the most promising candidate so far. The more I read through the recommendations, the more it dawned on me that I don't want to have to configure yet another Java toolchain (yes, I know, that may be personal bias).

Thank you also to all who responded about the challenges of authority control in such collections. I'm aware of these issues. The current project is about marshalling resources for editors to make informed decisions about, rather than automating the creation of articles; because there is human judgement involved in the last step, I can afford to take a few authority control 'risks'.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of raffaele messuti raffaele.mess...@gmail.com
Sent: Monday, 3 November 2014 11:39 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] MARC reporting engine

Stuart Yeates wrote: Do any of these have built-in indexing? 800k records isn't going to fit in memory and if building my own MARC indexer is 'relatively straightforward' then you're a better coder than I am.

you could try marcdb[1] from marctools[2]

[1] https://github.com/ubleipzig/marctools#marcdb
[2] https://github.com/ubleipzig/marctools

--
raffaele
Re: [CODE4LIB] MARC reporting engine
Apologies, I should have used plain English for an international audience: 'sundry' means 'miscellaneous' or 'other'.

Ideally, for each person I'd generate a range of dates for mentions and a check to see whether they have obituaries in the index. I'll also generate URLs into the search engines of various external systems (WorldCat, VIAF, ORCID, DigitalNZ, etc) because these are useful to the editor who makes the decisions about using the content to make the wikipedia stub.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Jean-Claude Dauphin jc.daup...@gmail.com
Sent: Tuesday, 4 November 2014 7:40 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] MARC reporting engine

Hi Stuart, I made some experiments with the innz-metadata in the J-ISIS software, and you may be interested to read the summary, which is attached. Thank you for informing the CODE4LIB list about the innz-metadata dataset; it is very useful for testing and improving J-ISIS. But now, I would like to see if it's easy to achieve what you wish with J-ISIS. Please excuse my ignorance, but could you please explain from which MARC fields or subfields you wish to extract the person authorities, and explain to me what the sundry metadata are and how they are related to MARC records. I googled 'sundry metadata' but didn't find any satisfactory information.

Best wishes, Jean-Claude

On Mon, Nov 3, 2014 at 4:24 PM, Brian Kennison kennis...@wcsu.edu wrote: On Nov 2, 2014, at 9:29 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Do any of these have built-in indexing? 800k records isn't going to fit in memory and if building my own MARC indexer is 'relatively straightforward' then you're a better coder than I am.

I think the XMLDB idea is the way to go, but I'd use BaseX (http://basex.org). BaseX has query and indexing capabilities. If you know XSLT (and SQL) then you'd at least have a start with XQuery.
—Brian -- Jean-Claude Dauphin jc.daup...@gmail.com jc.daup...@afus.unesco.org http://kenai.com/projects/j-isis/ http://www.unesco.org/isis/ http://www.unesco.org/idams/ http://www.greenstone.org
[CODE4LIB] MARC reporting engine
I have ~800,000 MARC records from an indexing service (http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am trying to generate:

(a) a list of person authorities (and sundry metadata), sorted by how many times they're referenced, in wikimedia syntax

(b) a view of a person authority, with all the records by which they're referenced, processed into a wikipedia stub biography

I have established that this is too much data to process in XSLT or with multi-line regexps in vi. What other MARC engines are out there? The two options I'm aware of are learning multi-line processing in sed or learning enough Koha to write reports in whatever their reporting engine is.

Any advice?

cheers
stuart

--
I have a new phone number: 04 463 5692
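For what it's worth, the counting half of (a) doesn't strictly need an engine at all. A stdlib-only sketch over MARCXML -- the choice of 100/600/700 as the personal-name fields that count as a 'reference', and $a as the name, is my assumption about the data:

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {"marc": "http://www.loc.gov/MARC21/slim"}
MARC = "{http://www.loc.gov/MARC21/slim}"
PERSON_TAGS = {"100", "600", "700"}  # personal-name fields (assumed scope)

def count_persons(xml_text):
    """Count occurrences of each personal name ($a) across a MARCXML collection."""
    counts = Counter()
    root = ET.fromstring(xml_text)
    for field in root.iter(MARC + "datafield"):
        if field.get("tag") in PERSON_TAGS:
            for sub in field.findall("marc:subfield[@code='a']", NS):
                if sub.text:
                    counts[sub.text.strip()] += 1
    return counts

# Tiny illustrative sample (hypothetical records, not innz data):
sample = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Mansfield, Katherine,</subfield>
    </datafield>
  </record>
  <record>
    <datafield tag="700" ind1="1" ind2=" ">
      <subfield code="a">Mansfield, Katherine,</subfield>
    </datafield>
  </record>
</collection>"""

counts = count_persons(sample)
# counts.most_common() then gives names sorted by reference count,
# ready to be templated into wikimedia table syntax.
```

This parses the whole file into memory, which is the problem with 800k records; a streaming variant is the next step.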
Re: [CODE4LIB] MARC reporting engine
Do any of these have built-in indexing? 800k records isn't going to fit in memory, and if building my own MARC indexer is 'relatively straightforward' then you're a better coder than I am.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Jonathan Rochkind rochk...@jhu.edu
Sent: Monday, 3 November 2014 1:24 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] MARC reporting engine

If you are, can become, or know, a programmer, that would be relatively straightforward in any programming language using the open source MARC processing library for that language (ruby marc, pymarc, perl marc, whatever). Although you might find more trouble than you expect around authorities, with them being less standardized in your corpus than you might like.

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Stuart Yeates [stuart.yea...@vuw.ac.nz]
Sent: Sunday, November 02, 2014 5:48 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] MARC reporting engine

I have ~800,000 MARC records from an indexing service (http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am trying to generate: (a) a list of person authorities (and sundry metadata), sorted by how many times they're referenced, in wikimedia syntax (b) a view of a person authority, with all the records by which they're referenced, processed into a wikipedia stub biography. I have established that this is too much data to process in XSLT or with multi-line regexps in vi. What other MARC engines are out there? The two options I'm aware of are learning multi-line processing in sed or learning enough Koha to write reports in whatever their reporting engine is. Any advice?

cheers stuart -- I have a new phone number: 04 463 5692
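One way to sidestep the memory problem without a Java toolchain is to stream the MARCXML record by record and push names into SQLite as you go -- roughly the marcdb idea, though the schema below is my own illustrative one, not theirs:

```python
import sqlite3
import xml.etree.ElementTree as ET
from io import StringIO

MARC = "{http://www.loc.gov/MARC21/slim}"

def index_names(xml_file, db):
    """Stream MARCXML and index each 100$a into SQLite, one row per mention."""
    db.execute("CREATE TABLE IF NOT EXISTS names (name TEXT, recno INTEGER)")
    recno = 0
    for event, elem in ET.iterparse(xml_file, events=("end",)):
        if elem.tag == MARC + "record":
            recno += 1
            for field in elem.iter(MARC + "datafield"):
                if field.get("tag") == "100":
                    for sub in field.iter(MARC + "subfield"):
                        if sub.get("code") == "a" and sub.text:
                            db.execute("INSERT INTO names VALUES (?, ?)",
                                       (sub.text.strip(), recno))
            elem.clear()  # discard the parsed record so memory stays flat
    db.commit()

# Tiny in-memory demonstration (hypothetical records):
sample = StringIO("""<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record><datafield tag="100"><subfield code="a">Hector, James</subfield></datafield></record>
  <record><datafield tag="100"><subfield code="a">Hector, James</subfield></datafield></record>
</collection>""")

db = sqlite3.connect(":memory:")
index_names(sample, db)
rows = db.execute(
    "SELECT name, COUNT(*) FROM names GROUP BY name ORDER BY COUNT(*) DESC").fetchall()
```

On disk rather than `:memory:`, the same table gives you both (a) via GROUP BY and the record numbers needed to assemble (b), without ever holding 800k records at once.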
Re: [CODE4LIB] content inventory of mediawiki site?
All but the categories should be available through the standard API, I believe. https://www.mediawiki.org/wiki/API:Main_page

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Shearer, Timothy tshea...@email.unc.edu
Sent: Friday, 31 October 2014 4:49 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] content inventory of mediawiki site?

Hi Folks, My google fu isn't working. Does anyone know of an extension, something native, or a methodology to get a content inventory of a mediawiki site? We're trying to get a report that includes: page name, most recent editor, date created, date last modified, categories. Thanks for any advice, Tim
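A sketch of what the query might look like (parameter names as I remember them from the MediaWiki API docs, so treat as assumptions; 'date created' would need a second pass asking each page for its earliest revision with rvdir=newer):

```python
import json
import urllib.parse
import urllib.request

def inventory_url(api="https://en.wikipedia.org/w/api.php", apfrom=""):
    """Build a query for a batch of pages with latest revision and categories."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "allpages",   # walk every page, batch by batch
        "gapfrom": apfrom,         # continuation point for the next batch
        "gaplimit": "50",
        "prop": "revisions|categories",
        "rvprop": "timestamp|user",  # date last modified + most recent editor
        "cllimit": "max",
    }
    return api + "?" + urllib.parse.urlencode(params)

def fetch_pages(url):
    """Fetch one batch; needs network access against a live wiki."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["query"]["pages"]

u = inventory_url()
```

Looping on the continuation value the API hands back, then flattening pages into CSV rows, gets most of the inventory Tim asked for.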
Re: [CODE4LIB] Subject: Re: Why learn Unix?
-- Because you can delete everything on the system with a very short command.

This is actually a misconception. The very short command doesn't delete everything on the system. The integrity of files which are currently open (including things like the kernel image, executable files for currently-running programs, etc) is protected until they are closed (or the next reboot, whichever comes first). These files vanish from the directory structure on the filesystem but can still be accessed by interacting with the running processes which have them open (or /proc/ for the very desperate). This is the POSIX alternative to the Windows "That file is currently in use" scenario, and it explains why, when a runaway log file fills up a disk, you have to both delete the log file and restart the service to get the disk space back.

cheers
stuart
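The behaviour is easy to demonstrate on any POSIX system in a few lines:

```python
import os
import tempfile

# Create a file, keep it open, then delete its directory entry.
fd, path = tempfile.mkstemp()
f = os.fdopen(fd, "w+")
f.write("still here")
f.flush()

os.unlink(path)                             # the name is gone...
assert not os.path.exists(path)
assert os.fstat(f.fileno()).st_nlink == 0   # ...no directory links remain...

f.seek(0)
data = f.read()     # ...but the open handle still reads the data
f.close()           # the disk space is reclaimed only at this point
```

This is exactly the runaway-log scenario: until the service holding the log open is restarted (closing the handle), the space stays allocated.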
Re: [CODE4LIB] Subject: Re: Why learn Unix?
'alias' is a non-portable bash-ism. Of course, this matters less now that Oracle has declared Solaris dead.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Alex Berry ber...@indiana.edu
Sent: Wednesday, 29 October 2014 1:11 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Subject: Re: Why learn Unix?

And that is why alias rm='rm -I' was invented.

Quoting Roy Tennant roytenn...@gmail.com: I agree. I've done serious damage to my own server this way. Anyone who knows me knows that I'm completely capable of this. Unlike others, who are both more intelligent and more cautious. Down the path of the wild-carded, recursive delete command lies DANGER. Having a little bit of knowledge is more dangerous, in most cases, than none at all. In Unix and in whitewater rafting. Roy

On Oct 28, 2014, at 6:46 PM, Cary Gordon listu...@chillco.com wrote: Well, you can do a lot of damage quickly using very short commands. Deleting the master boot record can be quite effective, but I will demur from giving specific examples.

On Tue, Oct 28, 2014 at 3:22 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: -- Because you can delete everything on the system with a very short command. This is actually a misconception. The very short command doesn't delete everything on the system. The integrity of files which are currently open (including things like the kernel image, executable files for currently-running programs, etc) is protected until they are closed (or the next reboot, whichever is first). These files vanish from the directory structure on the filesystem but can still be accessed by interacting with the running processes which have them open (or /proc/ for the very desperate). This is the POSIX alternative to the Windows "That file is currently in use" scenario and explains why, when a runaway log file fills up a disk, you have to both delete the log file and restart the service to get the disk back.
cheers stuart -- Cary Gordon The Cherry Hill Company http://chillco.com
Re: [CODE4LIB] Why learn Unix?
Learning UNIX is a dreadful idea. If you think you want to learn UNIX, you probably should learn POSIX. Implementations are transient; if we're lucky, standards are durable.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Siobhain Rivera siori...@indiana.edu
Sent: Tuesday, 28 October 2014 3:02 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Why learn Unix?

Hi everyone, I'm part of the ASIST Student Chapter at Indiana University, and we're putting together a series of workshops on Unix. We've noticed that a lot of people don't seem to have a good idea of why they should learn Unix, particularly the reference/non-technology types. We're going to do some more research to make a fact sheet about the uses of Unix, but I thought I'd pose the question to the list - what do you think are reasons librarians need to know Unix, even if they aren't in particularly tech-heavy jobs? I'd appreciate any input. Have a great week!

Siobhain Rivera
Indiana University Bloomington
Library Science, Digital Libraries Specialization
ASIST-SC, Webmaster
Re: [CODE4LIB] Linux distro for librarians
Turning this question on its head: is there any group / page / etc doing coordination of which library software is packaged for which distros, and the chasing of distro-level bugs? At least some interoperability issues would be mitigated if all the appropriate libraries installed and worked reliably out of the box on whatever platform our colleagues were being forced by local precedent to use.

cheers
stuart
[CODE4LIB] Wikipedia in teaching
Some of you may know of teaching staff using, or looking to use, wikipedia in their courses; if you do, I implore you to forward them https://en.wikipedia.org/wiki/Wikipedia:Education_program Wikipedia has active assistance available in such cases, but assistance is less useful once egg has connected with face. Alternatively, to see whether we're already providing assistance to courses at your institution, you can go to https://en.wikipedia.org/wiki/Special:Courses

cheers
stuart

--
I have a new phone number: 04 463 5692
[CODE4LIB] ISSN lists?
My understanding is that there is no universal ISSN list, but that WorldCat allows querying of their database by ISSN. Which method of sampling the ISSN namespace is going to cause the least pain? http://www.worldcat.org/ISSN/ seems to be the one talked about, but is there another that's less resource intensive? Maybe someone's already exported this data?

cheers
stuart

--
I have a new phone number: 04 463 5692
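Whatever the source, candidates from a sample can at least be validated locally before spending lookups on them: the ISSN check digit is a mod-11 sum with weights 8 down to 2, with 'X' standing for 10. A sketch:

```python
def issn_check_digit(first7):
    """Compute the ISSN check digit for the first 7 digits (mod-11, weights 8..2)."""
    total = sum(int(d) * w for d, w in zip(first7, range(8, 1, -1)))
    r = (11 - total % 11) % 11
    return "X" if r == 10 else str(r)

def is_valid_issn(issn):
    """Validate a full ISSN written like '0378-5955'."""
    digits = issn.replace("-", "").upper()
    if len(digits) != 8 or not digits[:7].isdigit():
        return False
    return issn_check_digit(digits[:7]) == digits[7]
```

Filtering random probes of the namespace down to checksum-valid ISSNs cuts the query load on whichever service you end up sampling by roughly a factor of ten.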
Re: [CODE4LIB] Digitization Project from Scratch
Others in this thread have all made useful comments, but I think it would pay to take a step back first and ask yourself some questions about your situation:

(*) What's your volume of material? Do you have a single book? A shelf of content? A room of content? A multi-site organisation full of content?
(*) What are your resources? Do you have techies? Do you have cataloguers? Do you have volunteers? Do you have machine-readable catalog records for the books? Is there good authority control for the people in the archive? Do you have existing finding aids? Do you have a book scanner?
(*) Are you working as part of an enduring institution with a demonstrated commitment to archives?
(*) Have you looked around for possible consortia to join?
(*) Have you looked around to see who else has already digitised closely-related materials?
(*) Which languages are the archives in?
(*) Do you have a collections policy?
...

The more detailed the answers, the better we'll be able to give you advice rather than just push our prejudices at you...

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of P.G. booksbyp...@gmail.com
Sent: Wednesday, 15 October 2014 9:55 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Digitization Project from Scratch

Hello, Does anyone have experience digitizing archival materials? I need your recommendations/suggestions on how we can start with our digitization. We need to build a searchable website so the public can access our materials of images, publications and media files. What platform did you use? Open-source or fee-based? What is your experience using it? Basically, we started using SharePoint but at this point I believe it is only good for sharing internal documents. We are on a limited budget so we may need to host it on our own server as well. Any feedback or persons to contact for more info is highly appreciated. Thanks. Chris
Re: [CODE4LIB] Digitization Project from Scratch
Once the physical embodiments of books become self-aware, they might seriously look at building a consortium. That may or may not be what triggers the transition to a post-apocalyptic world.

cheers
stuart

--
I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of Cary Gordon listu...@chillco.com
Sent: Wednesday, 15 October 2014 1:54 p.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Digitization Project from Scratch

OK, I am now obsessed with the idea of a post-apocalyptic consortium of 100,000 libraries, each with one book. Cary

On Oct 14, 2014, at 4:57 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Others in this thread have all made useful comments, but I think it would pay to take a step back first and ask yourself some questions about your situation: (*) what's your volume of material? Do you have a single book? a shelf of content? a room of content? a multi-site organisation full of content? (*) what are your resources? Do you have techies? Do you have cataloguers? Do you have volunteers? Do you have machine-readable catalog records for the books? Is there good authority control for the people in the archive? Do you have existing finding aids? Do you have a book scanner? (*) Are you working as part of an enduring institution with a demonstrated commitment to archives? (*) Have you looked around for possible consortia to join? (*) Have you looked around to see who else has already digitised closely-related materials? (*) Which languages are the archives in? (*) Do you have a collections policy? ... The more detailed the answers, the better we'll be able to give you advice rather than just push our prejudices at you... cheers stuart -- I have a new phone number: 04 463 5692

From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of P.G. booksbyp...@gmail.com
Sent: Wednesday, 15 October 2014 9:55 a.m.
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Digitization Project from Scratch

Hello, Does anyone have experience digitizing archival materials? I need your recommendations/suggestions on how we can start with our digitization. We need to build a searchable website so the public can access our materials of images, publications and media files. What platform did you use? Open-source or fee-based? What is your experience using it? Basically, we started using SharePoint but at this point I believe it is only good for sharing internal documents. We are on a limited budget so we may need to host it on our own server as well. Any feedback or persons to contact for more info is highly appreciated. Thanks. Chris
[CODE4LIB] Wikipedia notability (was Re: [CODE4LIB] Official #teamharpy Statement on the case of Joseph Hawley Murphy vs. nina de jesus and Lisa Rabey)
I'm currently spending a chunk of time attempting to balance https://en.wikipedia.org/wiki/Category:New_Zealand_academics for recentism, gender imbalance and racial imbalance, having created 100 biographies so far. I can tell you that Google Scholar is a really crappy measure once you move outside the modern hard sciences, particularly in fields where the monograph is still revered. I can also tell you that there are people who have won the Hector Medal (for a long time the highest science prize in the country) who are in neither Google Scholar nor VIAF (and yes, we're a country hooked up to the feeder system for VIAF). As much as we like to think that libraries are central to academia and research, sometimes the world is not as it appears from our hallowed windows.

[Anyone struggling to prove notability for a non-straight-white-male is welcome to ping me on wikipedia for help.]

cheers
stuart

On 21/09/14 06:12, Karen Coyle wrote: I also was interested because I've recently joined the hardworking group of Wikipedians who work to distinguish between notable persons and able self-promoters. In doing so, I've learned a lot about how self-promotion works, especially in social media. In Wikipedia, to be considered notable, there needs to be some reliable proof - that is, third-party references, not provided by the individual in question. In terms of accomplishments, for example for academics, there is a list of measures, albeit not measurable in the scientific sense. [1] Just for a lark, look at the Google Scholar profiles for Joe Murphy, RoyT, and for myself:

http://scholar.google.com/citations?user=zW1lb04J&hl=en&oi=ao
http://scholar.google.com/citations?user=LJw73cAJ&hl=en
http://scholar.google.com/citations?user=m4Tx73QJ&hl=en&oi=ao

The h-index, while imprecise, is about as close as you get to something one can cite as a measure. It's not a decision, but it is an indication.
I put this forward not as proof of anything, but to offer that reputation is extremely hard to quantify and should be looked at with a critical eye rather than taken for granted. It also fits in with what we already know, which is that men promote themselves in the workplace more aggressively than women do. In fact, in the Wikipedia group, we mainly find articles about men whose notability is over-stated. (You can see my blog post on the problems of notability for women. [2]) I greatly admire your stand for free speech. Beyond this, I will contact you offline with other thoughts. kc

[1] http://en.wikipedia.org/wiki/Wikipedia:Notability_%28academics%29
[2] http://kcoyle.blogspot.com/2014/09/wpnotability-and-women.html

On 9/20/14, 9:16 AM, Lisa Rabey wrote: Friends: I know many of you have already been boosting the signal, and we thank you profusely for the help. For those who do not know, Joe Murphy is currently suing nina and me in a $1.25M defamation case. From our official statement (http://teamharpy.wordpress.com/why-are-we-being-sued/): Mr. Murphy claims that Ms. Rabey “posted the following false, libelous and highly damaging tweet accusing the plaintiff of being a ‘sexual predator'”3. He further claims that Ms. de jesus wrote a blog post that “makes additional false, libelous, highly damaging, outrageous, malicious statements against the plaintiff alleging the commission of sexual harassment and sexual abuse of women and other forms of criminal and unlawful behaviour”4. Both Ms. Rabey and Ms. de jesus maintain that our comments are fair and truthful, which we intend to establish in our defense. Neither of us made the claims maliciously nor with any intent to damage Mr. Murphy’s reputation. Right now we need the following most importantly: 1. We have a call out for additional witnesses (http://teamharpy.wordpress.com/call-for-witnesses/), which has started to bring in more accounts of harassment.
Please, PLEASE, if you have seen or heard anything about the plaintiff, or know someone who might -- please have them get in touch. 2. Share our site (http://teamharpy.wordpress.com), which includes details of the case and updates. Please help us get the word out to as many people as possible about the plaintiff's attempt to silence those speaking up against sexual harassment and why you won't stand for it. 3. Donations: Many, many of you have asked to help donate to fund our mounting legal costs. We will have a donation page up soon. Even if you cannot help financially, please share across your social networks. We will not be silenced. We will not be shamed. Thank you again. The outpouring of support that has been happening has made this all very much worthwhile. Best, Lisa
Re: [CODE4LIB] REST vs ODBC
On 23/09/14 10:01, Fitchett, Deborah wrote: Morning, all, We have a small dilemma: 1. Our brand new Alma system provides access to a bunch of data via RESTful API. It’s on The Cloud so we’re not going to be getting direct access to the database anytime soon.

Is there a reason that you can't take your nightly backups of MARC data, load them into a disposable install of Koha and use Koha's ODBC connection?

cheers
stuart
[CODE4LIB] handle and doi HTTPS infrastructure
First up, I've got to say that I'm unaware of anyone using these over HTTPS in production, so the issues are forward-looking and largely hypothetical.

The good news is that both use DNSSEC:

http://dnssec-debugger.verisignlabs.com/hdl.handle.net
http://dnssec-debugger.verisignlabs.com/dx.doi.org

The bad news is that some servers in the dx.doi.org DNS rotation don't appear to be listening on 443 at all, and those that do have variable configuration that gets them a 'C': https://www.ssllabs.com/ssltest/analyze.html?d=dx.doi.org

Further, a number of doi.org-native links redirect from HTTPS to HTTP without warning. For example, https://dx.doi.org/ links to https://dx.doi.org/help.html, but that's just a redirect to http://www.doi.org/factsheets/DOIProxy.html and www.doi.org isn't listening on port 443. Testing DOI resolution over HTTPS gives occasional very long timeouts (presumably those non-443 servers?).

All of the servers in the hdl.handle.net DNS rotation are listening on 443, but again with variable security config and low scores: https://www.ssllabs.com/ssltest/analyze.html?d=hdl.handle.net Note that some of the servers have 'test' in their server names, which makes me wonder... Again, the home site and help pages are HTTP only and there are HTTPS-to-HTTP redirects. Testing handle resolution over HTTPS seemed to work reliably when I tested it.

Anyone have ideas as to who needs to lobby whom to get this improved?

cheers
stuart
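The dead 443 ports and downgrade redirects are at least easy to test for mechanically; a sketch (helper names are my own, nothing official):

```python
import socket
from urllib.parse import urlparse

def listens_on_443(host, timeout=5):
    """True if the host accepts a TCP connection on port 443 within the timeout.

    Needs network access; a resolver rotation means repeated calls may hit
    different servers, which is exactly the flakiness described above.
    """
    try:
        with socket.create_connection((host, 443), timeout=timeout):
            return True
    except OSError:
        return False

def is_downgrade(from_url, to_url):
    """True if a redirect steps down from https to http."""
    return (urlparse(from_url).scheme == "https"
            and urlparse(to_url).scheme == "http")

# The hop observed above is a downgrade:
assert is_downgrade("https://dx.doi.org/help.html",
                    "http://www.doi.org/factsheets/DOIProxy.html")
```

Run in a loop against a sample of handles and DOIs, this would give the operators hard numbers rather than anecdotes when lobbying for fixes.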
Re: [CODE4LIB] metadata for free ebook repositories
Authors in OL have already been linked to Wikipedia, Wikipedia has been linked to VIAF, and the OCLC number, when present, has been taken from the MARC record. Therefore the OL record in some cases already has these connections.

It's not just about authors. It's also about the work (+manifestation, +... ), the subject (particularly when the subject is a separate work), the publisher, the illustrator, etc, etc.

cheers
stuart
[CODE4LIB] metadata for free ebook repositories
There are a stack of great free ebook repositories available on the web, things like:

https://unglue.it/
http://www.gutenberg.org/
https://en.wikibooks.org/wiki/Main_Page
http://www.gutenberg.net.au/
https://www.smashwords.com/books/category/1/newest/0/free/any

etc, etc. What there doesn't appear to be is high-quality AACR2 / RDA records for them. There are things like https://ebooks.adelaide.edu.au/meta/pg/ which are elaborate Dublin Core to MARC converters, but these lack standardisation of names, authority control (people, entities, places, etc), interlinking, etc.

It seems to me that quality metadata would greatly increase the value / findability / use of these projects and thus their visibility and available sources. Are there any projects working in this space already? Are there suitable tools available?

cheers
stuart
Re: [CODE4LIB] metadata for free ebook repositories
I think what I'm looking for is a crowd-sourcing platform to add:

https://en.wikipedia.org/wiki/Willa_Cather
http://viaf.org/viaf/182113193/
https://en.wikipedia.org/wiki/My_%C3%81ntonia
http://www.worldcat.org/title/my-antonia/oclc/809034

... to https://archive.org/download/myantonia00cathrich/myantonia00cathrich_marc.xml

cheers
stuart

On 19/08/14 11:57, Karen Coyle wrote: About 1/3 of the 1M ebooks on OpenLibrary.org have full MARC records, and you can retrieve the record via the API. There is also a secret record format that returns not the full MARC for the hard copy (which is what the records represent, because these are digitized books) but a record that has been modified to represent the ebook. The MARC records for the hard copy follow the pattern:

https://archive.org/download/[archive identifier]/[archive identifier]_marc.[xml|mrc]

Download MARC XML: https://archive.org/download/myantonia00cathrich/myantonia00cathrich_marc.xml
Download MARC binary: https://archive.org/download/myantonia00cathrich/myantonia00cathrich_meta.mrc

To get the one that represents the ebook, do: https://archive.org/download/[archive identifier]/[archive identifier]_archive_marc.xml e.g. https://archive.org/download/myantonia00cathrich/myantonia00cathrich_archive_marc.xml This one has an 007, the 245 $h, and a few other things.

Tom Morris did some code that helps you search for books by author and title and retrieve a MARC record. I don't recall where his github archive is, but I'll find out and post it here. The code is open source. We used it for a project that added ebook records to a public library catalog. You can also use the OpenLibrary API to select all open access ebooks. What I'd like to see is a way to create a list or bibliography in OL that is then imported into a program that will find MARC records for those books. The list function is still under development, though.
kc On 8/18/14, 3:04 PM, Stuart Yeates wrote: There are a stack of great free ebook repositories available on the web, things like https://unglue.it/ http://www.gutenberg.org/ https://en.wikibooks.org/wiki/Main_Page http://www.gutenberg.net.au/ https://www.smashwords.com/books/category/1/newest/0/free/any etc, etc What there doesn't appear to be, is high-quality AACR2 / RDA records available for these. There are things like https://ebooks.adelaide.edu.au/meta/pg/ which are elaborate dublin core to MARC converters, but these lack standardisation of names, authority control (people, entities, places, etc), interlinking, etc. It seems to me that quality metadata would greatly increase the value / findability / use of these projects and thus their visibility and available sources. Are there any projects working in this space already? Are there suitable tools available? cheers stuart
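The URL patterns quoted in this thread are easy to wrap in a tiny helper; a sketch following them as described (the patterns themselves come from the thread, not from any official archive.org documentation):

```python
def marc_urls(identifier):
    """Build the archive.org MARC download URLs for an item, per the
    patterns described in the thread above."""
    base = "https://archive.org/download/{0}/{0}".format(identifier)
    return {
        "marcxml": base + "_marc.xml",                # MARC for the hard copy, XML
        "binary": base + "_meta.mrc",                 # MARC for the hard copy, binary
        "ebook_marcxml": base + "_archive_marc.xml",  # record modified for the ebook
    }

urls = marc_urls("myantonia00cathrich")
```

A crowd-sourcing workflow could start from these: fetch the record, present it alongside the Wikipedia / VIAF / WorldCat candidates, and let a human confirm the links.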
Re: [CODE4LIB] EZProxy ssl security
Thank you, that helped greatly.

cheers
stuart

On 13/08/14 10:09, Will Martin wrote: I can't offer a comprehensive guide, but I can give you some tips gleaned from the EZproxy mailing list and my own experimentation. There are some configuration settings you can adjust to improve its security. Here are the ones from mine:

# Disable old, insecure SSL methods
Option DisableSSL56bit
Option DisableSSL40bit
Option DisableSSLv2

Those go before setting the LoginPortSSL -- in my config.txt, they're the first thing after the Name directive at the top of the file. Doing that will help a good bit. Here's the report for my server on SSL Labs: https://www.ssllabs.com/ssltest/analyze.html?d=ezproxy.library.und.edu A marked improvement. Not perfect, but much better.

EZproxy embeds a statically linked copy of the SSL libraries, so SSL upgrades to it only happen when you update EZproxy itself. I'm on version 5.7.32, which still suffers from some old security vulnerabilities, as you can see in the SSL Labs report. I believe the next version of EZproxy is supposed to update the SSL to support newer protocols. But I'm not sure, and I'm unlikely to find out on my own. OCLC recently changed their pricing model to a yearly subscription fee if you want to receive continued updates, and my university has not chosen to pay for that at this time. So we won't be getting any further updates until we can find the money for the yearly fee.

Hope this helps.

Will Martin

On 2014-08-12 16:38, Stuart Yeates wrote: So I just ran my EZproxy through an SSL checker and was shocked by the outcome: https://www.ssllabs.com/ssltest/analyze.html?d=login.helicon.vuw.ac.nz Finding other EZproxy installs in Google and checking them gave a range of answers, some MUCH better, some MUCH worse. Clearly a secure EZproxy is possible, but coverage is patchy. Is there a decent guide to securing EZproxy anywhere? I'm hoping that it might be as simple as dropping a new OpenSSL library into a directory within the EZproxy install?
cheers stuart
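Will's tips can be collected into a sketch of the top of an EZproxy config.txt. The hostname and ports below are placeholders; only the Option lines come from the thread:

```
# config.txt (top of file) -- hypothetical values
Name ezproxy.example.edu

# Disable old, insecure SSL methods (must come before LoginPortSSL)
Option DisableSSL56bit
Option DisableSSL40bit
Option DisableSSLv2

LoginPort 80
LoginPortSSL 443
```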
[CODE4LIB] EZProxy ssl security
So I just ran my EZproxy through an SSL checker and was shocked by the outcome: https://www.ssllabs.com/ssltest/analyze.html?d=login.helicon.vuw.ac.nz Finding other EZproxy installs in google and checking them gave a range of answers, some MUCH better, some MUCH worse. Clearly secure EZproxy is possible, but patchy. Is there a decent guide to securing EZproxy anywhere? I'm hoping that it might be as simple as dropping a new openssl library into a directory within the EZproxy install? cheers stuart
Re: [CODE4LIB] Bandwidth control
We had complaints from students about other students using the limited resource (in this case student computers) to do facebook / youtube. We negotiated with the students union that certain sites would be blocked from those machines for a certain busy period during the day. Negotiation with the students union appeared to be hugely important in deflating any protests. cheers stuart On 05/08/14 02:20, Carol Bean wrote: A quick and dirty search of the list archives turned up this topic from 5 years ago. I am wondering what libraries (especially those with limited resources) are doing today to control or moderate bandwidth, e.g., where viewing video sites uses up excessive amounts of bandwidth? Thanks for any help, Carol Carol Bean beanwo...@gmail.com
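The post doesn't say what mechanism was used for the time-limited block; one common approach is a time-based ACL in a caching proxy such as Squid. A sketch, with placeholder domains and hours:

```
# squid.conf fragment: deny the listed domains during the busy midday period
acl busyhours time MTWHF 11:00-14:00
acl blocked dstdomain .facebook.com .youtube.com
http_access deny blocked busyhours
```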
[CODE4LIB] ANNOUNCEMENT REMINDER (Was: Re: [CODE4LIB] NOW AVAILABLE: VIVO Release 1.7)
I'd just like to remind posters of announcements that it's not really helpful to post announcements that don't actually say what the product / service / event you're announcing is. The following case is particularly egregious, since the project home page at https://wiki.duraspace.org/display/VIVO/VIVO+Main+Page also fails to contain this important information. For the record, VIVO is an open source, semantic-web tool for research discovery -- finding people and the research they do. https://wiki.duraspace.org/display/VIVO/Short+Tour%3A+VIVO+Overview cheers stuart On 07/03/2014 02:53 AM, Carol Minton Morris wrote: FOR IMMEDIATE RELEASE July 2, 2014 Contact: Layne Johnson ljohn...@duraspace.org Read it online: http://bit.ly/1sXbXHE VIVO Release 1.7 is Now Available! Key Features Include Enhanced ORCID Functionality and Simplified Data Handling Winchester, MA: The VIVO Project is pleased to announce the release of VIVO 1.7. The software can be installed by downloading either a zip or tar.gz file located on the download page at VIVOweb.org and deploying it to your web server for production use. Installation Instructions and an Upgrade Guide v1.6 to 1.7 are also available. VIVO is a DuraSpace project. The VIVO 1.7 release combines new features with improvements to existing features and services and continues to leverage the VIVO-Integrated Semantic Framework (VIVO-ISF) ontology introduced in VIVO 1.6. No data migration or changes to local data ingest procedures, visualization, or analysis tools drawing directly on VIVO data will be required to upgrade to VIVO 1.7. VIVO 1.7 notably includes the results of an ORCID Adoption and Integration Grant to support the creation and verification of ORCID iDs. VIVO now offers the opportunity for a researcher to add and/or confirm his or her global, unique researcher identifier directly with ORCID without the necessity of applying through other channels and re-typing the 16-digit ORCID identifier.
We anticipate that this facility will help promote ORCID iDs more widely and expand adoption for the benefit of the entire research community. VIVO 1.7 also incorporates several updates to key software libraries in VIVO, including the Apache Jena libraries that provide the default VIVO triple store from Jena 2.6.4 to Jena 2.10.1. This Jena upgrade does require existing VIVO sites to run an automated migration procedure for user accounts prior to upgrading VIVO itself. The Apache Solr search library used by VIVO has been updated to Solr 4.7.2 and the programming interface to Solr has been modularized to allow substitution of alternative search indexing libraries to benefit from specific desired features. The SPARQL web services introduced in VIVO 1.6 have been extended to support full read-write capability and content negotiation through a single interface. The ability to export or dump the entire VIVO knowledge base for analysis by external tools has also been improved to scale better with triple store size, as has the ability to request lists of RDF by type to facilitate linked data applications. The VIVO 1.7 release also reflects feedback from the VIVO Leadership Group requesting a predictable pattern of one minor release and one major release each year. We anticipate releases in late spring/early summer and again in late fall to help adopters plan for release schedules and new features, and anticipate any changes that may affect local data ingest processes, visualizations, reporting, and/or data analysis. Learn More at the VIVO Conference There’s still time to register for the upcoming VIVO Conference that will be held in Austin, TX August 6-8, 2014. The program is designed to help you harness the full potential of research networking, discovery, and open research. • Program available here • Register here How Does DuraSpace Help? VIVO is a DuraSpace project. 
The DuraSpace (http://duraspace.org) organization is an independent 501(c)(3) not-for-profit providing leadership and innovation for open technologies that promote durable, persistent access and discovery of digital data. Our values are expressed in our organizational byline, Committed to our digital future. DuraSpace works collaboratively with organizations that use VIVO to advance the design, development and sustainability of the project. As a non-profit, DuraSpace provides technical leadership, sustainability planning, fundraising, community development, marketing and communications, collaborations and strategic partnerships, and administration.
Re: [CODE4LIB] Community anti-harassment policy
There exists a code at: https://github.com/code4lib/antiharassment-policy/blob/master/code_of_conduct.md I believe it applies here. cheers stuart On 07/03/2014 12:54 PM, Coral Sheldon-Hess wrote: I was under the impression that we had a code of conduct/anti-harassment policy in place for IRC and the mailing lists. Was this an incorrect impression? I am definitely in favor of adopting one, if there isn't one in place! Logistically, Geek Feminism is also not a formal organization--they were recently described as an anarchist collective--so I think we could follow their lead pretty easily. We could make a mail alias that goes to a ROTATING team/committee (this is very important; people burn out, dealing with these things for too long), for reporting purposes. IRC aliases are a thing, too, right? -coral
Re: [CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?
In Wikipedia, the principal representation for alternative names for entities is the 'redirect'. The redirect from Catherine Sefton to Martin Waddell can be found at https://en.wikipedia.org/w/index.php?title=Catherine_Sefton&redirect=no (and yes, being a wiki it's editable). That redirect is annotated as a redirect 'From an alternative name' (as opposed to a common spelling mistake or something else) and 'From a printworthy page title' (which says to use this redirect when building (cross-) indexes etc.). To create a link from the Catherine Sefton redirect to an authority control system (as distinct from the Martin Waddell link), the redirect can be edited to include an Authority control template (see https://en.wikipedia.org/wiki/Template:Authority_control ), which is the same template used for full articles. cheers stuart On 06/19/2014 08:53 PM, Owen Stephens wrote: An aside but interesting to see how some of this identity stuff seems to be playing out in the wild now. Google for Catherine Sefton: https://www.google.co.uk/search?q=catherine+sefton The Knowledge Graph displays information about Martin Waddell. Catherine Sefton is a pseudonym of Martin Waddell. It is impossible to know, but the most likely source of this knowledge is Wikipedia which includes the ISNI for Catherine Sefton in the Wikipedia page for Martin Waddell (http://en.wikipedia.org/wiki/Martin_Waddell) (although oddly not the ISNI for Martin Waddell under his own name). Owen Owen Stephens Owen Stephens Consulting Web: http://www.ostephens.com Email: o...@ostephens.com Telephone: 0121 288 6936 On 18 Jun 2014, at 23:28, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: My reading of that suggests that http://isni-url.oclc.nl/isni/000122816316 shouldn't have both Bell, Currer and Brontë, Charlotte, which it clearly does... Is this a case where one of our sources of truth doesn't distinguish between identities and entities, and we're allowing it to pollute our data?
If that source of truth is Wikipedia, we can fix that. cheers stuart On 06/19/2014 12:11 AM, Richard Wallis wrote: Hi all, Seeing this thread I checked with the ISNI team and got the following answer from Janifer Gatenby who asked me to post it on her behalf: ISNI identifies “public identities”. The scope as stated in the standard is “This International Standard specifies the International Standard name identifier (ISNI) for the identification of public identities of parties; that is, the identities used publicly by parties involved throughout the media content industries in the creation, production, management, and content distribution chains.” The relevant definitions are: 3.1 party: natural person or legal person, whether or not incorporated, or a group of either. 3.3 public identity: identity of a party (3.1) or a fictional character that is or was presented to the public. 3.4 name: character string by which a public identity (3.3) is or was commonly referenced. A party may have multiple public identities and a public identity may have multiple names (e.g. pseudonyms). ISNI data is available as linked data. There are currently 8 million ISNIs assigned and 16 million links. Example: [image: image001.png] ~Richard. On 16 June 2014 10:54, Ben Companjen ben.compan...@dans.knaw.nl wrote: Hi Stuart, I don't have a copy of the official standard, but from the documents on the ISNI website I remember that there are name variations and 'public identities' (as the lemma on Wikipedia also uses). I'm not sure where the borderline is or who decides when different names are different identities. If it were up to me: pseudonyms are definitely different public identities, name changes after marriage probably not, name change after gender change could mean a different public identity. Different public identities get different ISNIs; the ISNI organisation says the ISNI system can keep track of connected public identities.
Discussions about name variations or aliases are not new, of course. I remember the discussions about 'aliases' vs 'Artist Name Variations' that are/were happening on Discogs.com, e.g. 'is J Dilla an alias or an ANV of Jay Dee?' It appears the users on Discogs finally went with aliases, but VIAF put the names/identities together: http://viaf.org/viaf/32244000 - and there is no ISNI (yet). It gets more confusing when you look at Washington Irving who had several pseudonyms: they are just listed under one ISNI. Maybe because he is dead, or because all other databases already know and connected the pseudonyms to the birth name? (I just sent a comment asking about the record at http://isni.org/isni/000121370797 ) [Here goes the reference list…] Hope this helps :) Groeten van Ben On 15-06-14 23:11, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Could someone with access to the official text of ISO 27729:2012 tell me whether an ISNI is a name identifier or an entity identifier
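The redirect-plus-template pattern described in this thread looks roughly like this in wikitext (a sketch; the exact redirect-category templates and Authority control parameters on the live page may differ):

```
#REDIRECT [[Martin Waddell]]

{{R from alternative name}}
{{R printworthy}}
{{Authority control}}
```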
Re: [CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?
Thank you (and Janifer Gatenby) for this answer. My reading of this is that people who change their name when they marry don't get a new ISNI, but those who change it when they transition gender do, because that's a new identity. That's useful to know. cheers stuart On 06/19/2014 12:11 AM, Richard Wallis wrote: Hi all, Seeing this thread I checked with the ISNI team and got the following answer from Janifer Gatenby who asked me to post it on her behalf: ISNI identifies “public identities”. The scope as stated in the standard is “This International Standard specifies the International Standard name identifier (ISNI) for the identification of public identities of parties; that is, the identities used publicly by parties involved throughout the media content industries in the creation, production, management, and content distribution chains.” The relevant definitions are: 3.1 party: natural person or legal person, whether or not incorporated, or a group of either. 3.3 public identity: identity of a party (3.1) or a fictional character that is or was presented to the public. 3.4 name: character string by which a public identity (3.3) is or was commonly referenced. A party may have multiple public identities and a public identity may have multiple names (e.g. pseudonyms). ISNI data is available as linked data. There are currently 8 million ISNIs assigned and 16 million links. Example: [image: image001.png] ~Richard. On 16 June 2014 10:54, Ben Companjen ben.compan...@dans.knaw.nl wrote: Hi Stuart, I don't have a copy of the official standard, but from the documents on the ISNI website I remember that there are name variations and 'public identities' (as the lemma on Wikipedia also uses). I'm not sure where the borderline is or who decides when different names are different identities.
If it were up to me: pseudonyms are definitely different public identities, name changes after marriage probably not, name change after gender change could mean a different public identity. Different public identities get different ISNIs; the ISNI organisation says the ISNI system can keep track of connected public identities. Discussions about name variations or aliases are not new, of course. I remember the discussions about 'aliases' vs 'Artist Name Variations' that are/were happening on Discogs.com, e.g. 'is J Dilla an alias or an ANV of Jay Dee?' It appears the users on Discogs finally went with aliases, but VIAF put the names/identities together: http://viaf.org/viaf/32244000 - and there is no ISNI (yet). It gets more confusing when you look at Washington Irving who had several pseudonyms: they are just listed under one ISNI. Maybe because he is dead, or because all other databases already know and connected the pseudonyms to the birth name? (I just sent a comment asking about the record at http://isni.org/isni/000121370797 ) [Here goes the reference list…] Hope this helps :) Groeten van Ben On 15-06-14 23:11, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Could someone with access to the official text of ISO 27729:2012 tell me whether an ISNI is a name identifier or an entity identifier? That is, if someone changes their name (adopts a pseudonym, changes their name due to marriage, transitions gender, etc), should they be assigned a new identifier? If the answer is 'No', why is this called a 'name identifier'? Ideally someone with access to the official text would update the article at https://en.wikipedia.org/wiki/International_Standard_Name_Identifier with a brief quote referencing the standard, with a page number. [The context of this is ORCID, which is being touted as an entity identifier, while not being clear on whether it's a name or entity identifier.] cheers stuart
Re: [CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?
My reading of that suggests that http://isni-url.oclc.nl/isni/000122816316 shouldn't have both Bell, Currer and Brontë, Charlotte, which it clearly does... Is this a case where one of our sources of truth doesn't distinguish between identities and entities, and we're allowing it to pollute our data? If that source of truth is Wikipedia, we can fix that. cheers stuart On 06/19/2014 12:11 AM, Richard Wallis wrote: Hi all, Seeing this thread I checked with the ISNI team and got the following answer from Janifer Gatenby who asked me to post it on her behalf: ISNI identifies “public identities”. The scope as stated in the standard is “This International Standard specifies the International Standard name identifier (ISNI) for the identification of public identities of parties; that is, the identities used publicly by parties involved throughout the media content industries in the creation, production, management, and content distribution chains.” The relevant definitions are: 3.1 party: natural person or legal person, whether or not incorporated, or a group of either. 3.3 public identity: identity of a party (3.1) or a fictional character that is or was presented to the public. 3.4 name: character string by which a public identity (3.3) is or was commonly referenced. A party may have multiple public identities and a public identity may have multiple names (e.g. pseudonyms). ISNI data is available as linked data. There are currently 8 million ISNIs assigned and 16 million links. Example: [image: image001.png] ~Richard. On 16 June 2014 10:54, Ben Companjen ben.compan...@dans.knaw.nl wrote: Hi Stuart, I don't have a copy of the official standard, but from the documents on the ISNI website I remember that there are name variations and 'public identities' (as the lemma on Wikipedia also uses). I'm not sure where the borderline is or who decides when different names are different identities.
If it were up to me: pseudonyms are definitely different public identities, name changes after marriage probably not, name change after gender change could mean a different public identity. Different public identities get different ISNIs; the ISNI organisation says the ISNI system can keep track of connected public identities. Discussions about name variations or aliases are not new, of course. I remember the discussions about 'aliases' vs 'Artist Name Variations' that are/were happening on Discogs.com, e.g. 'is J Dilla an alias or an ANV of Jay Dee?' It appears the users on Discogs finally went with aliases, but VIAF put the names/identities together: http://viaf.org/viaf/32244000 - and there is no ISNI (yet). It gets more confusing when you look at Washington Irving who had several pseudonyms: they are just listed under one ISNI. Maybe because he is dead, or because all other databases already know and connected the pseudonyms to the birth name? (I just sent a comment asking about the record at http://isni.org/isni/000121370797 ) [Here goes the reference list…] Hope this helps :) Groeten van Ben On 15-06-14 23:11, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Could someone with access to the official text of ISO 27729:2012 tell me whether an ISNI is a name identifier or an entity identifier? That is, if someone changes their name (adopts a pseudonym, changes their name due to marriage, transitions gender, etc), should they be assigned a new identifier? If the answer is 'No', why is this called a 'name identifier'? Ideally someone with access to the official text would update the article at https://en.wikipedia.org/wiki/International_Standard_Name_Identifier with a brief quote referencing the standard, with a page number. [The context of this is ORCID, which is being touted as an entity identifier, while not being clear on whether it's a name or entity identifier.] cheers stuart
Re: [CODE4LIB] Does 'Freedom to Read' require us to systematically privilege HTTPS over HTTP?
Anyone thinking about these things is encouraged to read the thread [CODE4LIB] EZProxy changes / alternatives ? in the archives of this list. cheers stuart On 06/19/2014 05:28 AM, Andrew Anderson wrote: EZproxy already handles HTTPS connections for HTTPS enabled services today, and on modern hardware (i.e. since circa 2005), cryptographic processing far surpasses the speed of most network connections, so I do not accept the “it’s too heavy” argument against it supporting the HTTPS to HTTP functionality. Even embedded systems with 500MHz CPUs can terminate SSL VPNs at over 100Mb/s these days. All I am saying is that the model where you expose HTTPS to the patron and still continue to use HTTP for the vendor is not possible with EZproxy today, and there is no technical reason why it could not do so, but rather a policy decision. While HTTPS to HTTP translation would not completely solve the entire point of the original posting, it would be a step in the right direction until the rest of the world caught up. As an aside, the lightweight nature of EZproxy seems to be becoming its Achilles Heel these days, as modern web development methods seem to be pushing the boundaries of its capabilities pretty hard. The stance that EZproxy only supports what it understands is going to be a problem when vendors adopt HTTP/2.0, SDCH encoding, web sockets, etc., just as AJAX caused issues previously. Most vendor platforms are Java based, and once Jetty starts supporting these features, the performance chasm between dumbed-down proxy connections and direct connections is going to become even more significant than it is today.
Re: [CODE4LIB] Does 'Freedom to Read' require us to systematically privilege HTTPS over HTTP?
On 06/17/2014 08:49 AM, Galen Charlton wrote: On Sun, Jun 15, 2014 at 4:03 PM, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: As I read it, 'Freedom to Read' means that we have to take active steps to protect the rights of our readers to read what they want and in private. [snip] * building HTTPS Everywhere-like functionality into LMSs (such functionality may already exist, I'm not sure) Many ILSs can be configured to require SSL to access their public interfaces, and I think it would be worthwhile to encourage that as a default expectation for discovery interfaces. However, I think that's only part of the picture for ILSs. Other parts would include: * staff training on handling patron and circulation data * ensuring that the ILS has the ability to control (and let users control) how much circulation and search history data gets retained * ensuring that the ILS backup policy strikes the correct balance between having enough for disaster recovery while not keeping individually identifiable circ history forever * ensuring that contracts with ILS hosting providers and services that access patron data from the ILS have appropriate language concerning data retention and notification of subpoenas. Compared to other contributors to this thread, I appear to be (a) less worried about state actors than our commercial partners and (b) keener to see relatively straightforward technical fixes that just work 'for free' across large classes of library systems. Things like: * An ILS module that pulls the HTTPS Everywhere ruleset from https://gitweb.torproject.org/https-everywhere.git/tree/HEAD:/src/chrome/content/rules and applies those rules as a standard data-cleanup step on all imported data (MARC, etc). * A plugin to the CMS that drives the library's websites / blogs / whatever and uses the same rulesets to default all links to HTTPS. * An EzProxy plugin (or howto) on silently redirecting users to the HTTPS versions of HTTP sites. cheers stuart
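The first of those fixes can be sketched as follows. This is a minimal illustration of applying an HTTPS Everywhere-style ruleset as a data-cleanup step on URLs found in imported records; the ruleset XML here is a made-up example in the shape of the real rules, and a production module would load the whole rules directory and handle target hosts, exclusions, and $1-style backreferences:

```python
# Sketch: upgrade http:// links in imported records using an
# HTTPS Everywhere-style ruleset. The ruleset below is a made-up
# minimal example, not a real rule from the project.
import re
import xml.etree.ElementTree as ET

RULESET_XML = r"""
<ruleset name="Example">
  <target host="example.org" />
  <rule from="^http://example\.org/" to="https://example.org/" />
</ruleset>
"""

def load_rules(xml_text):
    # Each <rule> element becomes a (compiled regex, replacement) pair
    root = ET.fromstring(xml_text)
    return [(re.compile(rule.get("from")), rule.get("to"))
            for rule in root.iter("rule")]

def upgrade_url(url, rules):
    # Apply the first matching rule; otherwise leave the link alone
    for pattern, replacement in rules:
        if pattern.search(url):
            return pattern.sub(replacement, url)
    return url
```

In a MARC cleanup pass, `upgrade_url` would be run over every 856$u (or similar) field before the record is loaded.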
Re: [CODE4LIB] Does 'Freedom to Read' require us to systematically privilege HTTPS over HTTP?
On 06/18/2014 12:36 PM, Brent E Hanner wrote: Stuart Yeates wrote: Compared to other contributors to this thread, I appear to be (a) less worried about state actors than our commercial partners and (b) keener to see relatively straightforward technical fixes that just work 'for free' across large classes of library systems. Things like: * An ILS module that pulls the HTTPS Everywhere ruleset from https://gitweb.torproject.org/https-everywhere.git/tree/HEAD:/src/chrome/content/rules and applies those rules as a standard data-cleanup step on all imported data (MARC, etc). * A plugin to the CMS that drives the library's websites / blogs / whatever and uses the same rulesets to default all links to HTTPS. * An EzProxy plugin (or howto) on silently redirecting users to the HTTPS versions of HTTP sites. So let me see if I understand this. Your concern is that commercial partners are putting HTTP links in their systems rather than HTTPS. Because HTTPS only protects from a third party, the partner will still have access to all the information about what the user read. IPv6 will improve the HTTPS issue but something like HTTPS Everywhere ( https://www.eff.org/https-everywhere ) is actually the simplest solution, especially as you can't be sure every link will have HTTPS. My concern is that by referring users to resources and services via HTTP rather than HTTPS, we are encouraging users to leak more personal information (reading habits, location, language settings, etc) to third parties. These third parties include our networking providers, our hosting providers, our content providers, the next person who uses the users' public computer, etc., etc. HTTPS protects in multiple ways. Firstly it protects the data 'on the wire' (but that is rarely a problem in practice). Secondly HTTPS protects from web caching attacks.
Thirdly the fact that a connection is HTTPS causes almost all tools and applications to use a more secure set of options and preferences, covering everything from cookie handling, to not remembering passwords, not storing local caches, using shorter timeouts, etc. This last category is where the real protection is. There are lots of privacy breaches that HTTPS won't deter (a thorough compromise of the users' machine, a thorough compromise of the content provider's machine, etc.), but it raises the bar and protects against a significant number of breaches that become impossible or much, much harder / less likely. My understanding is that HTTPS and EzProxy can potentially protect readers' identities very effectively (assuming the library systems are secure and no one turns up with a warrant). And having just read the Freedom to Read Statement, this issue has no bearing on it. Freedom to Read is about accessibility to materials, not privacy. While no doubt there is some statement somewhere about that, Freedom to Read is a statement about diversity of materials and not the ability to read them without anyone knowing about it. If materials are only available at the cost of personal privacy, are they really available? In repressive regimes all across the world people are actively discriminated against (or worse) for reading the wrong book, being in the wrong place or communicating in the wrong language. How many of us live in countries where currently (or in living memory) people have been derided for speaking a non-English language? cheers stuart
[CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?
Could someone with access to the official text of ISO 27729:2012 tell me whether an ISNI is a name identifier or an entity identifier? That is, if someone changes their name (adopts a pseudonym, changes their name due to marriage, transitions gender, etc), should they be assigned a new identifier? If the answer is 'No', why is this called a 'name identifier'? Ideally someone with access to the official text would update the article at https://en.wikipedia.org/wiki/International_Standard_Name_Identifier with a brief quote referencing the standard, with a page number. [The context of this is ORCID, which is being touted as an entity identifier, while not being clear on whether it's a name or entity identifier.] cheers stuart
[CODE4LIB] Does 'Freedom to Read' require us to systematically privilege HTTPS over HTTP?
As I read it, 'Freedom to Read' means that we have to take active steps to protect the rights of our readers to read what they want and in private. Triggered by discussions at a bar-camp at NLNZ on Friday, I'm thinking that in a digital world this means systematically privileging HTTPS over HTTP. Things like: * serving our websites and content over HTTPS * installing HTTPS Everywhere on public-access desktops * preferring HTTPS links in EZProxy / MARC / etc (basically in our catalogued materials) * building HTTPS Everywhere-like functionality into LMSs (such functionality may already exist, I'm not sure) * providing user-education materials. Thoughts? cheers stuart
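For the first item, serving everything over HTTPS usually means keeping a plain-HTTP listener whose only job is to redirect. A sketch for Apache, with a placeholder hostname:

```
<VirtualHost *:80>
    ServerName library.example.edu
    # Send every plain-HTTP request to the HTTPS site
    Redirect permanent / https://library.example.edu/
</VirtualHost>
```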
Re: [CODE4LIB] orcid
On a similar note, https://en.wikipedia.org/wiki/User:Pigsonthewing has just (today / yesterday depending on timezone) been appointed Wikipedian in residence at ORCID. He has tons of experience in museums, galleries and archives and is a great person to get in touch with in this kind of area. cheers stuart On 06/11/2014 08:11 AM, todd.d.robb...@gmail.com wrote: On a related-new-things-at-Wikipedia note: https://en.wikipedia.org/wiki/Wikipedia:ORCID On Tue, Jun 10, 2014 at 2:07 PM, Masamitsu, Pam masamit...@llnl.gov wrote: http://orcid.org/faq-page#n110 ORCID is an acronym, short for Open Researcher and Contributor ID. pam Pam Masamitsu Reference and Systems Phone: 925.424.4299 Email: masamit...@llnl.gov Lawrence Livermore National Laboratory Main Library -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Eric Lease Morgan Sent: Tuesday, June 10, 2014 1:05 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] orcid Is ORCID an acronym, and if it is then what does it stand for? -ELM
[CODE4LIB] Selenium testing for outsourced library services?
Has anyone had any success using Selenium or other web testing systems for testing and monitoring of complex outsourced web services? I'm thinking of a system that is the integration of a website, an authentication service, a discovery service and a repository, and monitoring that end-to-end use of the system is working. cheers stuart
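A hedged sketch of the kind of end-to-end check being asked about, using Selenium WebDriver from Python. The URL and the health heuristics are placeholders, not a real service's markup, and a fuller monitor would step through login, search and repository download in sequence. The pass/fail decision is split into a pure function so it can be exercised without driving a browser:

```python
# Sketch of an end-to-end monitor for an outsourced discovery stack.
# discovery.example.edu and the "results" heuristic are assumptions.

def looks_healthy(page_title, page_source):
    # Pure pass/fail decision, kept separate so it is testable
    # without a real browser session
    return ("error" not in page_title.lower()
            and "results" in page_source.lower())

def run_monitor(url="https://discovery.example.edu/search?q=test"):
    # Selenium is imported lazily so the check above works without it
    from selenium import webdriver
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return looks_healthy(driver.title, driver.page_source)
    finally:
        driver.quit()
```

Run from cron, `run_monitor` returning False (or raising, e.g. on an authentication timeout) would trigger an alert.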
Re: [CODE4LIB] orcid and researcherid and scopus, oh my
If there is really an appetite to continue DAIs going forward, the Wikipedia support for identifiers is modular and there's no reason not to add more identifiers. cheers stuart On 06/05/2014 11:06 PM, Ben Companjen wrote: Hi, Of course there are more identifier systems (or domains, if you will). Most/many authors in The Netherlands have a Digital Author Identifier (DAI), which is the record number in the GGC (Gemeenschappelijk Geautomatiseerd Catalogiseersysteem), or Shared Automated Catalogue system. The DAIs are assigned by (university) libraries and in the case of university libraries assigning/finding DAIs for their researchers, the DAI is usually linked to the employee in the repository. Following EduStandaard agreements [0] among all Dutch universities and some service providers like my employer DANS and the National Library of the Netherlands (KB), we can harvest the IRs and link publications to researcher profiles and show them in NARCIS [1]. [0]: http://www.edustandaard.nl/afspraken%20en%20architectuur/beheerde-afspraken / [1]: http://www.narcis.nl Set up as a service by a company called Pica, the GGC is now hosted by OCLC after Pica merged into OCLC [2]. The authority files for authors together are called the NTA ([Dutch Thesaurus Author names]). [2]: http://www.oclc.org/nl-NL/ggc.html OCLC is also hosting the ISNI database and VIAF (of course). VIAF, as you know, was set up as a crosswalk of authority files (including the NTA). OCLC are working on crosswalking identifiers, AFAIK. Please be aware that ISNI is a /name/ identifier. Pseudonyms and birth names for the same person (should) get different ISNIs. And, as said before, not only people can get ISNIs. Also, the business models for ORCID and ISNI are different. As a Linked Data aside, Eric, be aware of what an identifier identifies - and then how you make assertions using them.
For example, ORCID doesn't use the hash or 303 pattern, so if you resolve http://orcid.org/-0002-9952-7800 you get a webpage, i.e. http://orcid.org/-0002-9952-7800 identifies a webpage (the same goes for DOIs, btw). That is why I say about myself (in Turtle): http://companjen.name/id/BC dct:identifier http://orcid.org/-0002-7023-9047; . instead of http://companjen.name/id/BC owl:sameAs http://orcid.org/-0002-7023-9047 . … for I am not a website. Linking me to things I make is done like so (Qualified DC): #thing dct:creator http://companjen.name/id/BC . In your example you used the identifiers as names for the creator(s); it is as meaningful as saying (in unqualified/simple DC): #thing dc:creator Eric Lease Morgan . Hope this helps :) Groeten van Ben On 05-06-14 00:14, Stuart Yeates stuart.yea...@vuw.ac.nz wrote: Others have made excellent contributions to this thread, which I won't repeat, but I feel it's worth asking the question: Who is systematically cross walking these identifiers? The only party I'm aware of doing this in a large-scale fashion is Wikipedia, via https://en.wikipedia.org/wiki/Template:Authority_control cheers stuart On 06/05/2014 06:34 AM, Eric Lease Morgan wrote: ORDID and ResearcherID and Scopus, oh my! It is just me, or are there an increasing number of unique identifiers popping up in Library Land? A person can now be identified with any one of a number of URIs such as: * ORCID - http://orcid.org/-0002-9952-7800 * ResearcherID - http://www.researcherid.com/rid/F-2062-2014 * Scopus - http://www.scopus.com/authid/detail.url?authorId=25944695600 * VIAF - http://viaf.org/viaf/26290254 * LC - http://id.loc.gov/authorities/names/n94036700 * ISNI - http://isni.org/isni/35290715 At least these identifiers are (for the most part) “cool”. I have a new-to-me hammer, and these identifiers can play a nice role in linked data. For example: @prefix dc: http://purl.org/dc/elements/1.1/ . 
<http://dx.doi.org/10.1108/07378831211213201> dc:creator <http://orcid.org/0000-0002-9952-7800> , <http://id.loc.gov/authorities/names/n94036700> , <http://isni.org/isni/35290715> , <http://viaf.org/viaf/26290254> . How have any of y’all used these sorts of identifiers, and what problems do you think you will be able to solve by doing so? For example, I know of a couple of instances where these sorts of identifiers are being put into MARC records. — Eric Morgan
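Ben's dct:identifier versus owl:sameAs distinction can be sketched in a few lines of Python. This is a toy triple set, not rdflib, and the shortened names (`ben:BC`, `orcid:profile`) are illustrative placeholders: naive owl:sameAs reasoning copies every statement across the sameAs link, so the person "becomes" a webpage, which is exactly the conclusion Ben is avoiding.

```python
# Toy triple store: naive owl:sameAs "smushing" merges the descriptions
# of the two resources; dct:identifier would cause no such merge.
triples = {
    ("orcid:profile", "rdf:type", "foaf:Document"),  # resolving an ORCID yields a webpage
    ("ben:BC", "rdf:type", "foaf:Person"),
    ("ben:BC", "owl:sameAs", "orcid:profile"),
}

def saturate(ts):
    """One pass of naive sameAs reasoning: copy statements both ways."""
    out = set(ts)
    for s, p, o in list(out):
        if p == "owl:sameAs":
            for s2, p2, o2 in list(out):
                if s2 == s:
                    out.add((o, p2, o2))
                if s2 == o:
                    out.add((s, p2, o2))
    return out

inferred = saturate(triples)
# The unwanted conclusion that sameAs licenses: the person is a Document.
print(("ben:BC", "rdf:type", "foaf:Document") in inferred)  # True
```

Swapping the sameAs triple for a dct:identifier triple leaves `saturate` with nothing to merge, which is the point.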
Re: [CODE4LIB] orcid and researcherid and scopus, oh my
On 06/06/2014 12:51 AM, Gary Thompson wrote: I also hope to convince our campus Shibboleth IdP to add ORCID as a new attribute. If I understand correctly, what we need is ISNI added to the next release of EduPerson, per http://software.internet2.edu/eduperson/internet2-mace-dir-eduperson-201310.html Then non-trivial numbers of service providers and identity providers can interoperate using ORCID and/or ISNI. cheers stuart
Re: [CODE4LIB] orcid and researcherid and scopus, oh my
Others have made excellent contributions to this thread, which I won't repeat, but I feel it's worth asking the question: Who is systematically crosswalking these identifiers? The only party I'm aware of doing this in a large-scale fashion is Wikipedia, via https://en.wikipedia.org/wiki/Template:Authority_control cheers stuart On 06/05/2014 06:34 AM, Eric Lease Morgan wrote: ORCID and ResearcherID and Scopus, oh my! Is it just me, or are there an increasing number of unique identifiers popping up in Library Land? A person can now be identified with any one of a number of URIs such as: * ORCID - http://orcid.org/0000-0002-9952-7800 * ResearcherID - http://www.researcherid.com/rid/F-2062-2014 * Scopus - http://www.scopus.com/authid/detail.url?authorId=25944695600 * VIAF - http://viaf.org/viaf/26290254 * LC - http://id.loc.gov/authorities/names/n94036700 * ISNI - http://isni.org/isni/35290715 At least these identifiers are (for the most part) “cool”. I have a new-to-me hammer, and these identifiers can play a nice role in linked data. For example: @prefix dc: <http://purl.org/dc/elements/1.1/> . <http://dx.doi.org/10.1108/07378831211213201> dc:creator <http://orcid.org/0000-0002-9952-7800> , <http://id.loc.gov/authorities/names/n94036700> , <http://isni.org/isni/35290715> , <http://viaf.org/viaf/26290254> . How have any of y’all used these sorts of identifiers, and what problems do you think you will be able to solve by doing so? For example, I know of a couple of instances where these sorts of identifiers are being put into MARC records. — Eric Morgan
[CODE4LIB] 2014 National Digital Forum conference
I'm sending this on the off-chance that people happen to have the travel budget for it: http://www.ndf.org.nz/programme/ It's a great cross-GLAM event hosted by the Museum of New Zealand Te Papa Tongarewa, which you may be familiar with for their kōiwi tangata Māori repatriation program. If I were a foreign visitor looking to have a paper accepted, I'd (a) get in touch with the organisers and (b) present on an indigenous topic. cheers stuart
[CODE4LIB] statistics for image sharing sites?
We have been using Google Analytics since October 2008 and by and large we're pretty happy with it. Recently I noticed that we're getting 100 hits a day from the "Pinterest/0.1 +http://pinterest.com/" bot, which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure JPEG, so there is no HTML and no opportunity to execute JavaScript, so Google Analytics doesn't see it. pinterest.com is absent from our referrer logs. My main question is whether anyone has an easy tool to report on this kind of use of our collections? My secondary question is whether any httpd gurus have recipes for redirecting by agent string from low quality images to high quality. So when AGENT = "Pinterest/0.1 +http://pinterest.com/" and the URL matches a pattern, redirect to a different pattern. For example: http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg to http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg cheers stuart
Re: [CODE4LIB] statistics for image sharing sites?
On 05/14/2014 01:23 PM, Barnes, Hugh wrote: -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Stuart Yeates Sent: Wednesday, 14 May 2014 1:04 p.m. To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] statistics for image sharing sites? [snip] My secondary question is whether any httpd gurus have recipes for redirecting by agent string from low quality images to high quality. So when AGENT = "Pinterest/0.1 +http://pinterest.com/" and the URL matches a pattern, redirect to a different pattern. For example: http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg to http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg This sounds totally doable, but what are you trying to achieve? To my mind, it has unintended consequences and chaos writ all over it. My naive thought was to make the images that appear on Pinterest as high-quality as possible, rather than the thumbnails that we often use on search pages, etc. cheers stuart
Re: [CODE4LIB] statistics for image sharing sites?
On 05/14/2014 01:39 PM, Joe Hourcle wrote: On May 13, 2014, at 9:04 PM, Stuart Yeates wrote: We have been using Google Analytics since October 2008 and by and large we're pretty happy with it. Recently I noticed that we're getting 100 hits a day from the "Pinterest/0.1 +http://pinterest.com/" bot, which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure JPEG, so there is no HTML and no opportunity to execute JavaScript, so Google Analytics doesn't see it. pinterest.com is absent from our referrer logs. My main question is whether anyone has an easy tool to report on this kind of use of our collections? Set your webserver logs to include user agent (I use 'combined' logs), then use: grep Pinterest /path/to/access/logs You could also use any analytic tools that work directly off of your log files. It might not have all of the info that the javascript analytics tools pull (window size, extensions installed, etc.), but it'll work for anything, not just HTML files. When I visit http://www.pinterest.com/search/pins/?q=nzetc I see a whole lot of our images, but absolutely zero traffic in my log files, because those images are cached by Pinterest. My secondary question is whether any httpd gurus have recipes for redirecting by agent string from low quality images to high quality. So when AGENT = "Pinterest/0.1 +http://pinterest.com/" and the URL matches a pattern, redirect to a different pattern. For example: http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg to http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg Perfectly possible w/ Apache's mod_rewrite, but you didn't say what http server you're using. If Apache, you'd do something like:
RewriteCond %{HTTP_USER_AGENT} ^Pinterest
RewriteRule (^/etexts/MakOldT/.*)\(.*\)\.jpg $1.jpg [L]
That is pretty much exactly what I was after, thanks. As discussed elsewhere on the thread, I plan on using it judiciously. cheers stuart
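Joe's grep-the-combined-logs suggestion can be taken one small step further with a short script that counts bot hits per day. This is a sketch, not a finished reporting tool: the sample log line and the assumption that the Pinterest bot always carries "Pinterest" in its user agent both come from the thread, and real combined-log lines may vary.

```python
import re
from collections import Counter

# Count hits per day from any user agent containing "Pinterest" in an
# Apache "combined" format access log. Group 1 is the date part of the
# timestamp, group 2 the quoted user-agent string at the end of the line.
LINE = re.compile(r'\[(\d+/\w+/\d+)[^\]]*\].*"([^"]*)"$')

def pinterest_hits_per_day(lines):
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if m and "Pinterest" in m.group(2):
            counts[m.group(1)] += 1
    return counts

sample = [
    '1.2.3.4 - - [14/May/2014:13:04:00 +1200] "GET /etexts/MakOldT/MakOldTP022a.jpg HTTP/1.1" 200 512 "-" "Pinterest/0.1 +http://pinterest.com/"',
    '5.6.7.8 - - [14/May/2014:13:05:00 +1200] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]
print(pinterest_hits_per_day(sample))  # Counter({'14/May/2014': 1})
```

This only sees the bot's own fetches; as noted above, views of copies cached at pinterest.com never reach the origin logs at all.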
Re: [CODE4LIB] Extracting Text From .tiff Files
Your first step is to pin down the format. TIFF is a container format (like zip) and can contain pretty much anything. Likely candidates for your format include https://en.wikipedia.org/wiki/IPTC_Information_Interchange_Model and https://en.wikipedia.org/wiki/Extensible_Metadata_Platform Your second step is to find a library / tool for your platform that supports your format. Cheers stuart -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Gavin Spomer Sent: Tuesday, 13 May 2014 10:01 a.m. To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Extracting Text From .tiff Files Hello folks, I'm in the process of migrating a student newspaper collection, currently implemented with ResCarta, into our new bepress institutional repository. ResCarta has each page of a newspaper stored as a tiff file. Not only does the tiff file contain the graphics data, but it has some metadata in xml format and the fulltext of the page. I know this because I opened up some of the tiffs with a plain-text editor (Vim). Although I can see the text in the file, I've only been about 90% accurate in extracting it with a script. Some of those weird characters seem to do some wonky things when doing file IO for some reason. Is there a more reliable way to extract text stored in a tiff file? I've Googled and Googled and have pulled up almost nothing. But there's got to be a way, since ResCarta stores it there and can extract it. Any ideas? Gavin Spomer Systems Programmer Brooks Library Central Washington University
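The "weird characters doing wonky things" usually means the file is being read in text mode; reading the TIFF as raw bytes sidesteps that. A minimal sketch, assuming the embedded metadata is an XMP packet with the standard `<?xpacket ...?>` framing (ResCarta's actual wrapper may differ, so treat this as a probe rather than a parser):

```python
import re

# Read the TIFF as bytes and pull out any embedded XMP packet(s) with a
# regex on the standard <?xpacket begin= ... <?xpacket end= ... ?> framing.
def extract_xml_packets(data: bytes):
    return [m.decode("utf-8", errors="replace")
            for m in re.findall(rb"<\?xpacket begin=.*?<\?xpacket end.*?\?>",
                                data, re.DOTALL)]

# A fake little-endian TIFF header plus an embedded packet, for demonstration.
fake_tiff = (b"II*\x00" + b"\x00" * 16 +
             b'<?xpacket begin="\xef\xbb\xbf"?><x:xmpmeta>full text here'
             b'</x:xmpmeta><?xpacket end="w"?>' + b"\xff" * 8)
packets = extract_xml_packets(fake_tiff)
print(len(packets))             # 1
print("full text here" in packets[0])  # True
```

If the text turns out to live in IPTC IIM datasets rather than XMP, a proper TIFF tag walker or an existing metadata library is the better second step, as above.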
Re: [CODE4LIB] Withdraw my post was: Re: [CODE4LIB] separate list for jobs
Context: http://www.youtube.com/watch?v=dIYvD9DI1ZA Cheers stuart -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Fitchett, Deborah Sent: Monday, 12 May 2014 9:53 a.m. To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Withdraw my post was: Re: [CODE4LIB] separate list for jobs I can't help with the Python, but a test case for the script would obviously be "You know I can't subscribe to your ghost jobs list." Deborah -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Susan Kane Sent: Friday, 9 May 2014 2:44 a.m. To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Withdraw my post was: Re: [CODE4LIB] separate list for jobs Obviously, we must now task someone in CODE4LIB with writing a Python script to convert New Zealand English to International English. Or, I guess we could solve this on the user side with a sarcasm filter or a humor pipe, but you might lose some data that way. :-) -- Susan Kane Boston(ish), MA
Re: [CODE4LIB] Withdraw my post was: Re: [CODE4LIB] separate list for jobs
On 05/09/2014 02:44 AM, Susan Kane wrote: Obviously, we must now task someone in CODE4LIB with writing a Python script to convert New Zealand English to International English. Yes, because tasking people with AI-complete programming tasks (see https://en.wikipedia.org/wiki/AI-complete ) is only slightly worse than systematically malfunctioning sarcasm filters. Or, I guess we could solve this on the user side with a sarcasm filter or a humor pipe, but you might lose some data that way. Or we could acknowledge code4lib's role as a safe place for people to tune their sarcasm detectors. cheers stuart
Re: [CODE4LIB] how to post jobs (was Re: [CODE4LIB] separate list for Jobs)
On 05/09/2014 10:04 AM, Jodi Schneider wrote: On Thu, May 8, 2014 at 9:54 PM, Coral Sheldon-Hess wrote: I have another, maybe minor, point to add to this: I've posted a job to Code4Lib, and I did it wrong. I have no idea how I'm supposed to make a job show up correctly, and now that I have realized I've done it wrong, I probably won't send another job to this list. (Or maybe I'll look it up in ... where? the wiki?) You post them at http://jobs.code4lib.org/ Could that information please be added to the footer that's added when posting jobs? cheers stuart
[CODE4LIB] Withdraw my post was: Re: [CODE4LIB] separate list for jobs
The fact that the only person who has given any acknowledgement of understanding my message was someone else in .ac.nz suggests that despite my best efforts my message content was effectively shredded by the implicit conversion from New Zealand English to International English. My apologies; I withdraw my original email. To translate explicitly into International English, my point was: I have observed that an individual's position on mail filtering vs separate mailing lists appears to be an implicit marker of group membership in this group (i.e. a shibboleth). Note that I do not endorse this or any other marker of group membership, but my understanding of the psychology of groups suggests that all functional groups have markers of group membership and that attempting to eliminate markers of group membership in an attempt at inclusiveness (a) can in itself be a marker of group membership and (b) is only likely to drive a shift from relatively explicit markers to relatively implicit markers. cheers stuart On 05/08/2014 10:17 AM, David Friggens wrote: This is a pretty terrible reply. I thought it was a great reply. obscure words (seriously, shibboleth?) Somewhat obscure, but not so much in Code4Lib. http://en.wikipedia.org/wiki/Shibboleth http://en.wikipedia.org/wiki/Shibboleth_(Internet2) Unless you're trying to be sarcastic...in which case ignore this. He most definitely was. I believe Stuart's point was to suggest that when the multiple requests for a separate list for job notices get immediately shot down with "no - use an email filter" or "are you stupid?" [1] it doesn't help to create an inclusive and good learning environment. [1] NB the respondents aren't explicitly saying "are you stupid" but that's how it may be taken by some people. And to answer the original question - job listings help more people than they annoy so they should be kept as-is. My view is that it would make more sense to have separate discussion and job notice lists, as I see in other places.
But I'm not that bothered personally, as I would subscribe to both and filter them into the same folder in my mail client. :-) Cheers David
Re: [CODE4LIB] separate list for jobs
On 05/07/2014 04:59 AM, Richard Sarvas wrote: Not to be a jerk about this, but why is the answer always No? There seem to be more posts on this list relating to job openings than there are relating to code discussions. Are job postings a part of why this list was originally created? If so, I'll stop now. The answer is always no because we are collectively using the possession of an email client with filtering capability and the personal knowledge of how to use it as a Shibboleth for group membership. Those who find it easier to complain than write a filter mark themselves as members of the outgroup intruding on the ingroup. cheers stuart
Re: [CODE4LIB] barriers to open metadata?
On 04/30/2014 09:38 AM, David Friggens wrote: Hi Laura I'd like to find out from as many people as are interested what barriers you feel exist right now to you releasing your library's bibliographic metadata openly. One issue is that we pay for enrichments (tables of contents etc) for records, and I believe the licence restricts us from giving them to other people. We send our records to the national union catalogue and OCLC before adding the enrichments, and we'd need to take them out before we could release records elsewhere. Note that this is primarily a problem because MARC assumes that all versioning is done at the record level; there's no easy way to say the core bib item is from X, the TOC is from Y and the cover image is from Z. cheers stuart
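The workaround David describes - strip the licensed enrichments before releasing records - can be sketched as a filter over field tags. This is a toy using plain (tag, value) tuples rather than a real MARC library, and the choice of tags (505 for tables of contents, 856 for cover links) is a common convention, not a rule: which fields hold purchased enrichments is specific to each vendor agreement.

```python
# Toy record: MARC fields as (tag, value) pairs. Fields carrying licensed
# enrichments (assumed here to be 505 and 856) are dropped before release.
ENRICHMENT_TAGS = {"505", "856"}

def strip_enrichments(record):
    """Return a copy of the record without the enrichment fields."""
    return [(tag, value) for tag, value in record
            if tag not in ENRICHMENT_TAGS]

record = [
    ("245", "An example title"),
    ("505", "Licensed table of contents"),
    ("856", "http://example.org/cover.jpg"),
]
print(strip_enrichments(record))  # [('245', 'An example title')]
```

Which is exactly the problem Stuart names: because MARC versions at the record level, the only safe move is to drop whole fields rather than record where each piece came from.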
Re: [CODE4LIB] New Zealand Chapter
Nice. The real question is whether that's U+2163, like it should be. cheers stuart On 04/10/2014 07:17 AM, Jay Gattuso wrote: Hi all, Long time listener, first time caller. We don't have a C4L chapter over here in New Zealand, and I wondered what we would need to do to align the small group of Lib / GLAM coders with the broader C4L group. One of my colleagues did make this: http://i.imgur.com/XgGP9vX.jpg We are also setting up a two day code/hack fest, focusing on our Digital Preservation concerns, in June. I'd also really like to run the hackfest under a C4L banner. Any thoughts? J Jay Gattuso | Digital Preservation Analyst | Preservation, Research and Consultancy National Library of New Zealand | Te Puna Mātauranga o Aotearoa PO Box 1467 Wellington 6140 New Zealand | +64 (0)4 474 3064 jay.gatt...@dia.govt.nz
[CODE4LIB] Cataloguing Telugu
Currently there is a funding proposal for cataloguing Telugu works up before the Wikimedia foundation. If anyone has experience with Telugu or knows of any tools that are likely to be useful, please give your input: https://meta.wikimedia.org/wiki/Grants:IEG/Making_telugu_content_accessible cheers stuart
Re: [CODE4LIB] Open Publication Distribution System
We use OPDS at http://nzetc.victoria.ac.nz and have for a while now. Basically it's an extra namespace that you can use in your RSS with extra information for ebook readers / consumers. Other RSS readers / consumers silently ignore the namespace, so done right you only need one RSS feed. We do ours on the back of our solr search, so that suddenly you can browse by anything you can facet or search by. At the bottom of every search page is the result set as RSS / OPDS. cheers stuart On 06/02/14 11:06, Bigwood, David wrote: I recently became aware of Open Publication Distribution System (OPDS) Catalog format, a syndication format for e-pubs based on Atom HTTP. It is something like an RSS feed for e-books. People are using it to find and acquire books. It sounds like a natural fit for library digitization projects. An easy way for folks to know what's new and grab a copy if they like. So is anyone using this? Is it built into Omeka, Greenstone, DSpace or any of our tools? If you do use it do you have separate feeds for different projects. Say, one for dissertations, another for the local history project and another for books by state authors. Or do you have just one large feed? Is it being used by the DPLA or Internet Archive? How's it working for you? We have plenty of documents we have scanned as well as our own publications. Might this be a good way to make them more discoverable? Or is this just a tool no one is using? Thanks, David Bigwood dbigw...@hou.usra.edu Lunar and Planetary Institute https://twitter.com/Catalogablog -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
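The "extra namespace on top of your feed" idea can be shown in a few lines: a plain Atom entry with an OPDS acquisition link added. The namespace URI and the `http://opds-spec.org/acquisition` link relation are the registered Atom/OPDS ones; the book title and URL are placeholders.

```python
import xml.etree.ElementTree as ET

# Build a minimal Atom entry and add an OPDS acquisition link to it.
# Readers that don't know OPDS see an ordinary Atom link and ignore it.
ATOM = "http://www.w3.org/2005/Atom"
OPDS_REL = "http://opds-spec.org/acquisition"

entry = ET.Element(f"{{{ATOM}}}entry")
ET.SubElement(entry, f"{{{ATOM}}}title").text = "An Example Ebook"
ET.SubElement(entry, f"{{{ATOM}}}link",
              rel=OPDS_REL, type="application/epub+zip",
              href="http://example.org/books/1.epub")

xml = ET.tostring(entry, encoding="unicode")
print(OPDS_REL in xml)  # True
```

In practice such entries would be generated per search result, which is how the one-feed-for-both-audiences setup described above works.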
Re: [CODE4LIB] EZProxy changes / alternatives ?
On 04/02/14 05:09, Andrew Anderson wrote: There exists a trivial DoS attack against EZproxy that I reported to OCLC about 2 years ago, and has not been addressed yet. ... and as soon as that gets a CVE (see http://cve.mitre.org/), corporate IT departments will force libraries to upgrade to the latest version or turn the software off. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EZProxy changes / alternatives ?
On 01/02/14 08:34, Mosior, Benjamin wrote: Does anyone have any thoughts on how to move forward with organizing the development and adoption of an alternative proxy solution? A collaborative Google Doc? Perhaps a LibraryProxy GitHub Organization? I'd say that more than anything else what is needed is for techies to do experiments, document and share the results. These could either follow on from the example of Andrew Anderson earlier in this thread or strike out in different directions. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
[CODE4LIB] Southeastern Library Association
Someone has put a great deal of time into https://en.wikipedia.org/wiki/Southeastern_Library_Association but it's going to get deleted unless it acquires some independent secondary sources. [This also serves to illustrate why wikipedia has issues as an authority control system.] cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Southeastern Library Association
On 03/02/14 13:35, BWS Johnson wrote: [This also serves to illustrate why wikipedia has issues as an authority control system.] I went ahead and strongarmed the templates away. Feel free to add your thoughts on the talk page. :) Wikimedians are very cool in person, and there's acknowledgement inside of the community that there are several bad actors that end up making for lots of bad experiences. So any time you run into this, revert the changes, add more sources if possible, and add to the talk page so that editors that aren't in the know should be able to read the whys of things. I'm the wikimedian who added the templates there in the first place to give the newbie author some guidance as to what needed to happen; when the newbie editor ran out of steam I appealed for input from here. Wikipedia is in many ways as structured as cataloguing, but you can get away with pretty much everything if you have secondary sources. The fact that anyone on this list thinks that a single-column contemporary eye-witness account qualifies as a secondary source staggers me. Maybe that makes me a bad actor. [and yes, the article is still in need of secondary sources] cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EZProxy changes / alternatives ?
It's worse than that. The price we were quoted for hosting seems to have been picked so it can be offered with a 90% discount when bundled with a package deal with other OCLC products; buying into the on-going balkanization of the industry. cheers stuart On 01/02/14 16:24, Roy Tennant wrote: When it comes to hedging bets, I'd sure rather hedge my $50,000 bet than my $500 one. Just sayin'. Roy On Fri, Jan 31, 2014 at 6:04 PM, BWS Johnson abesottedphoe...@yahoo.com wrote: Salvete! Tisn't necessarily Socialist to hedge one's bets. Look at what Wall St. experts advise when one is unsure of whether to hold or sell. Monopoly is only ever in the interest of those that hold it. Short term the aquarium is enticing, but do you enjoy your collapsed dorsal fin? Cheers, Brooke -- On Fri, Jan 31, 2014 6:10 PM EST Salazar, Christina wrote: I think though that razor thin budgets aside, the EZProxy using community is vulnerable to what amounts to a monopoly. Don't get any ideas, OCLC peeps (just kiddin') but now we're so captive to EZProxy, what are our options if OCLC wants to gradually (or not so gradually) jack up the price? Does being this captive to a single product justify community developer time? I think so but I'm probably just a damn socialist. On Jan 31, 2014, at 1:36 PM, Tim McGeary timmcge...@gmail.com wrote: Even with razor thin budgets, this is a no brainer. Might they need to decide between buying 10 new books or licensing EZProxy? Possibly, but if they have a need for EZProxy, that's still a no brainer - until a solid OSS replacement that includes as robust a developer/support community comes around. But again, at $500/year, I don't see a lot of incentive to invest in such a project.
On Fri, Jan 31, 2014 at 3:55 PM, Riley Childs rchi...@cucawarriors.com wrote: But there are places on a razor thin budget, and things like this throw them off balance Sent from my iPhone On Jan 31, 2014, at 3:32 PM, Tim McGeary timmcge...@gmail.com wrote: So what's the price point that EZProxy needs to climb to make it more realistic to put resources into an alternative? At $500/year, I don't even have to think about justifying it. At 1% (or less) of the cost of a position with little to no prior experience needed, it doesn't make a lot of sense to invest in an open source alternative, even on a campus that heavily uses Shibboleth. Tim On Fri, Jan 31, 2014 at 1:36 PM, Ross Singer rossfsin...@gmail.com wrote: Not only that, but it's also expressly designed for the purpose of reverse proxying subscription databases in a library environment. There are tons of things vendors do that would be incredibly frustrating to get working properly in Squid, nginx, or Apache that have already been solved by EZProxy. Which is self-fulfilling: vendors then cater to what EZProxy does (rather than improving access to their resources). Art Rhyno used to say that the major thing that was inhibiting the widespread adoption of Shibboleth was how simple and cheap EZProxy was. I think there is a lot of truth to that. -Ross. On Fri, Jan 31, 2014 at 1:23 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: EZproxy is a self-installing statically compiled single binary download, with a built-in administrative interface that makes most common administrative tasks point-and-click, that works on Linux and Windows systems, and requires very little in the way of resources to run. It also has a library of a few hundred vendor stanzas that can be copied and pasted and work the majority of the time.
To successfully replace EZproxy in this setting, it would need to be packaged in such a way that it is equally easy to install and maintain, and the library of vendor stanzas would need to be developed as apache conf.d files. This. The real gain with EZProxy is that configuring it is crazy easy. You just drop it in and run it -- it's feasible for someone with no experience in proxying or systems administration to get it operational in a few minutes. That is why I think virtualizing a system that makes accessing the more powerful features of EZProxy easy is a good alternative. kyle -- Tim McGeary timmcge...@gmail.com GTalk/Yahoo/Skype/Twitter: timmcgeary 484-294-7660 (cell) -- Tim McGeary timmcge...@gmail.com GTalk/Yahoo/Skype/Twitter: timmcgeary 484-294-7660 (cell) -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Southeastern Library Association
[and yes, the article is still in need of secondary sources] Thanks to User:Eveross1 (who may or may not be a code4libber) for stepping up and fixing this. cheers stuart
Re: [CODE4LIB] EZProxy changes / alternatives ?
The text I've seen talks about "[e]xpanded reporting capabilities to support management decisions" in forthcoming versions and encourages towards the hosted solution. Since we're in .nz, they'd put our hosted proxy server in .au, but the network connection between .nz and .au is via the continental .us, which puts an extra trans-pacific network loop in 99% of our proxied network connections. cheers stuart On 30/01/14 03:14, Ingraham Dwyer, Andy wrote: OCLC announced in April 2013 the changes in their license model for North America. EZProxy's license moves from requiring a one-time purchase of US$495 to an *annual* fee of $495, or through their hosted service, with the fee depending on scale of service. The old one-time purchase license is no longer offered for sale as of July 1, 2013. I don't have any details about pricing for other parts of the world. An important thing to recognize here is that they cannot legally change the terms of a license that is already in effect. The software you have purchased under the old license is still yours to use, indefinitely. OCLC has even released several maintenance updates during 2013 that are available to current license-holders. In fact, they released V5.7 in early January 2014, and made that available to all license-holders. However, all updates after that version are only available to holders of the yearly subscription. The hosted product is updated to the most current version automatically. My recommendation is: If your installation of EZProxy works, don't change it. Yet. Upgrade your installation to the last version available under the old license, and use that for as long as you can. At this point, there are no world-changing new features that have been added to the product. There is speculation that IPv6 support will be the next big feature-add, but I haven't heard anything official. Start planning and budgeting for a change, either to the yearly fee, or the cost of hosted, or to some as-yet-undetermined alternative.
But I see no need to start paying now for updates you don't need. -Andy Andy Ingraham Dwyer Infrastructure Specialist State Library of Ohio 274 E. 1st Avenue Columbus, OH 43201 library.ohio.gov -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of stuart yeates Sent: Tuesday, January 28, 2014 10:03 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] EZProxy changes / alternatives ? I probably should have been more specific. Does anyone have experience switching from EzProxy to anything else? Is anyone else aware of the coming OCLC changes and considering switching? Does anyone have a worked example like: My EzProxy config for site Y looked like A; after the switch, my X config for site Z looked like B? I'm aware of this good article: http://journal.code4lib.org/articles/7470 cheers stuart On 29/01/14 15:24, stuart yeates wrote: We've just received notification of forth-coming changes to EZProxy, which will require us to pay an arm and a leg for future versions to install locally and/or host with OCLC AU with a ~ 10,000km round trip. What are the alternatives? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/ -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EZProxy changes / alternatives ?
this in some cases as it rewrites the page. There may be a better way to do this, but this is what I threw together for testing:
<Location "/badpath">
  ProxyHTMLEnable Off
  SetOutputFilter INFLATE;dummy-html-to-plain
  ExtFilterOptions LogStdErr Onfail=remove
</Location>
ExtFilterDefine dummy-html-to-plain mode=output intype=text/html outtype=text/plain cmd="/bin/cat -"
So what's currently missing in the Apache HTTPd solution? - Services that use an authentication token (predominantly ebook vendors) need special support written. I have been entertaining using mod_lua for this to make this support relatively easy for someone who is not hard-core technical to maintain. - Services that are not IP authenticated, but use one of the Form-based authentication variants. I suspect that an approach that injects a script tag into the page pointing to javascript that handles the form fill/submission might be a sane approach here. This should also cleanly deal with the ASP.net abominations that use __PAGESTATE to store sessions client-side instead of server-side. - EZproxy's built-in DNS server (enabled with the "DNS" directive) would need to be handled using a separate DNS server (there are several options to choose from). - In this setup, standard systems-level management and reporting tools would be used instead of the /admin interface in EZproxy - In this setup, the functionality of the EZproxy /menu URL would need to be handled externally. This may not be a real issue, as many academic sites already use LMS or portal systems instead of the EZproxy to direct students to resources, so this feature may not be as critical to replicate. - And of course, extensive testing. While the above ProQuest stanza works for the main ProQuest search interface, it won't work for everyone, everywhere just yet.
Bottom line: Yes, Apache HTTPd is a viable EZproxy alternative if you have a system administrator who knows their way around Apache HTTPd, and are willing to spend some time getting to know your vendor services intimately. All of this testing was done on Fedora 19 for the 2.4 version of HTTPd, which should be available in RHEL7/CentOS7 soon, so about the time that hard decisions are to be made regarding EZproxy vs something else, that something else may very well be Apache HTTPd with vendor-specific configuration files. -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
[CODE4LIB] EZProxy changes / alternatives ?
We've just received notification of forthcoming changes to EZProxy, which will require us to pay an arm and a leg for future versions to install locally and/or host with OCLC AU with a ~ 10,000km round trip. What are the alternatives? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EZProxy changes / alternatives ?
EZProxy is a proxy for use with vendors that have products gateway'd by IP address. It allows users who are off-campus to access resources that are locked down by IP address as though the user was on campus. It does deep-packet inspection to rewrite URLs and javascript, DNS stuff, etc. It's a product from OCLC, see http://www.oclc.org/en-US/ezproxy.html cheers stuart On 29/01/14 15:05, Riley Childs wrote: Ok, what exactly is EZProxy, I could never figure that out, if I knew I could help :) Sent from my iPhone On Jan 28, 2014, at 9:04 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote: We've just received notification of forthcoming changes to EZProxy, which will require us to pay an arm and a leg for future versions to install locally and/or host with OCLC AU with a ~ 10,000km round trip. What are the alternatives? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/ -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EZProxy changes / alternatives ?
I probably should have been more specific. Does anyone have experience switching from EzProxy to anything else? Is anyone else aware of the coming OCLC changes and considering switching? Does anyone have a worked example like: My EzProxy config for site Y looked like A; after the switch, my X config for site Z looked like B? I'm aware of this good article: http://journal.code4lib.org/articles/7470 cheers stuart On 29/01/14 15:24, stuart yeates wrote: We've just received notification of forth-coming changes to EZProxy, which will require us to pay an arm and a leg for future versions to install locally and/or host with OCLC AU with a ~ 10,000km round trip. What are the alternatives? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Expressing negatives and similar in RDF
On 13/09/13 23:32, Meehan, Thomas wrote: However, it would be more useful, and quite common at least in a bibliographic context, to say This book does not have a title. Ideally (?!) there would be an ontology of concepts like none, unknown, or even something, but unspecified: This book has no title: example:thisbook dc:title hasobject:false . It is unknown if this book has a title (sounds undesirable but I can think of instances where it might be handy[2]): example:thisbook dc:title hasobject:unknown . This book has a title but it has not been specified: example:thisbook dc:title hasobject:true . The root of the cure here is having a model that defines the exact semantics of the RDF tags you're using. For example, in the FRBRoo model, asserting that an F1 (Work) exists logically implies the existence of an E39 (Creator), an F27 (Work Conception), an F28 (Expression Creation), an F4 (Manifestation Singleton) and an F2 (Expression), as well as two E52 (TimeSpan)s and two E53 (Place)s. See http://www.cidoc-crm.org/frbr_graphical_representation/graphical_representation/work_time.html The bibliographer / cataloguer need not mention any of these, unless they wish to use them to add metadata to the F1 or to connect them with other items in the collection. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
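For readability, the hypothetical hasobject: pattern quoted above could be written in Turtle like this; note that the hasobject: vocabulary does not exist, it is the proposal under discussion, and the prefix URIs are invented:

```turtle
# Turtle rendering of the hypothetical "hasobject" pattern under
# discussion; the hasobject: vocabulary and example: URIs are invented.
@prefix dc:        <http://purl.org/dc/elements/1.1/> .
@prefix example:   <http://example.org/book/> .
@prefix hasobject: <http://example.org/hasobject#> .

example:thisbook dc:title hasobject:false .    # this book has no title
example:thisbook dc:title hasobject:unknown .  # unknown whether it has one
example:thisbook dc:title hasobject:true .     # has a title, unspecified
```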
Re: [CODE4LIB] Subject Terms in Institutional Repositories
That's handled by staff with cataloguing training and disposition. cheers stuart On 03/09/13 05:24, McAulay, Elizabeth wrote: Hi Stuart, For bullet point #2 below, how do you manage the workflow of the creative spelling correction? Is the correction handled manually or automatically, or somewhere in between? Thanks, Lisa - Elizabeth Lisa McAulay Librarian for Digital Collection Development UCLA Digital Library Program http://digital.library.ucla.edu/ email: emcaulay [at] library.ucla.edu From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of stuart yeates [stuart.yea...@vuw.ac.nz] Sent: Sunday, September 01, 2013 1:36 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Subject Terms in Institutional Repositories I run the techie side of http://researcharchive.vuw.ac.nz/ and we use dc.subject: (*) We ask for at least three depositor-supplied keywords (*) When a depositor uses creative spelling in any of the depositor-supplied fields, we add standard spelling as a dc.subject (*) When any field uses non-English language terms we add an English term as a dc.subject (*) When any field uses English language terms to refer to non-English subjects, we add a dc.subject with the native-language term (*) We have some hacky stuff in vuwschema.subject.* which the DSpace development team have told us to keep hacky while they migrate to http://dublincore.org/documents/dcmi-terms/ in the next couple of releases. We'd love to have the resources to do proper subject classification, because it would be a huge enabler of deep interoperability. cheers stuart On 31/08/13 01:36, Matthew Sherman wrote: Sorry, I probably should have provided a bit more depth. It is a University Institutional Repository so we have a rather varied collection of materials from engineering to education to computer science to chiropractic to dental to some student theses and posters. So I guess I need to find something that is extensible. 
Does that provide a better idea or should I provide more info? On Fri, Aug 30, 2013 at 9:32 AM, Jacob Ratliff jaratlif...@gmail.comwrote: Hi Matt, It depends on the subject area of your repository. There are dozens of controlled vocabularies that exist (not including specific Enterprise Content Management controlled vocabularies). If you can describe your collection, people might be able to advise you better. Jacob Ratliff Archivist/Taxonomy Librarian National Fire Protection Association On Fri, Aug 30, 2013 at 9:26 AM, Matthew Sherman matt.r.sher...@gmail.comwrote: Hello Code4Libbers, I am working on cleaning up our institutional repository, and one of the big areas of improvement needed is the list of terms from the subject fields. It is messy and I want to take the subject terms and place them into a much better order. I was contemplating using Library of Congress Subject Headings, but I wanted to see what others have done in this area to see if there is another good controlled vocabulary that could work better. Any insight is welcome. Thanks for your time everyone. Matt Sherman Digital Content Librarian University of Bridgeport -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/ -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Subject Terms in Institutional Repositories
I run the techie side of http://researcharchive.vuw.ac.nz/ and we use dc.subject: (*) We ask for at least three depositor-supplied keywords (*) When a depositor uses creative spelling in any of the depositor-supplied fields, we add standard spelling as a dc.subject (*) When any field uses non-English language terms we add an English term as a dc.subject (*) When any field uses English language terms to refer to non-English subjects, we add a dc.subject with the native-language term (*) We have some hacky stuff in vuwschema.subject.* which the DSpace development team have told us to keep hacky while they migrate to http://dublincore.org/documents/dcmi-terms/ in the next couple of releases. We'd love to have the resources to do proper subject classification, because it would be a huge enabler of deep interoperability. cheers stuart On 31/08/13 01:36, Matthew Sherman wrote: Sorry, I probably should have provided a bit more depth. It is a University Institutional Repository so we have a rather varied collection of materials from engineering to education to computer science to chiropractic to dental to some student theses and posters. So I guess I need to find something that is extensible. Does that provide a better idea or should I provide more info? On Fri, Aug 30, 2013 at 9:32 AM, Jacob Ratliff jaratlif...@gmail.comwrote: Hi Matt, It depends on the subject area of your repository. There are dozens of controlled vocabularies that exist (not including specific Enterprise Content Management controlled vocabularies). If you can describe your collection, people might be able to advise you better. Jacob Ratliff Archivist/Taxonomy Librarian National Fire Protection Association On Fri, Aug 30, 2013 at 9:26 AM, Matthew Sherman matt.r.sher...@gmail.comwrote: Hello Code4Libbers, I am working on cleaning up our institutional repository, and one of the big areas of improvement needed is the list of terms from the subject fields. 
It is messy and I want to take the subject terms and place them into a much better order. I was contemplating using Library of Congress Subject Headings, but I wanted to see what others have done in this area to see if there is another good controlled vocabulary that could work better. Any insight is welcome. Thanks for your time everyone. Matt Sherman Digital Content Librarian University of Bridgeport -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] The Wikipedia Library
If I understand what you're saying, what you need is an EZproxy http://www.oclc.org/ezproxy.en.html install configured to authenticate against Unified login https://meta.wikimedia.org/wiki/Help:Unified_login and specific user rights https://www.mediawiki.org/wiki/Manual:User_rights or some other user grouping mechanism. EZproxy is the de facto standard for making paywalled resources available to institutional users from off campus. The only downsides are (a) that it requires you to have full control over DNS, because a proxy at https://proxy.wikimedia.org/ would also answer https://some.subscription.resource.example.com.proxy.wikimedia.org/ and forward to https://some.subscription.resource.example.com/ and (b) as a proxy, it touches all traffic; streaming video and very large PDFs will be redirected through the proxy. Alternatively, there is a slight chance of joining a Shibboleth federation http://en.wikipedia.org/wiki/Shibboleth_%28Internet2%29 but that's a big ball of policy that is likely to be incompatible. Shibboleth is preferred for streaming because there is no proxying. cheers stuart http://en.wikipedia.org/wiki/User:Stuartyeates On 29/08/13 03:49, Jake Orlowitz wrote: Hi folks, My name is Jake Orlowitz and I coordinate Wikipedia's open research hub, The Wikipedia Library. Wikimedia Foundation board member Phoebe Ayers recommended that I reach out to you to see if we might be able to collaborate in some way. The Wikipedia Library has several different platforms, several of which would benefit from better technical integration. One of our primary goals is to get active, experienced Wikipedia editors access to paywalled sources and university libraries. We have received donations from several publishers and interest from several libraries. The challenge for us is managing those partnerships at scale and in a secure fashion. 
We're also working towards more functional research desks, programs that let reference librarians field research queries from editors or the public, remote 'visiting scholar' or 'research affiliate' positions at institutional libraries, University partnerships with online library access, open access awareness programs, and other related activities. I'd love to talk more about these projects with you, either through email or voice chat. Best, Jake Orlowitz Wikipedia: Ocaasi http://enwp.org/User:Ocaasi Facebook: Jake Orlowitz http://www.facebook.com/jorlowitz Twitter: JakeOrlowitz https://twitter.com/JakeOrlowitz LinkedIn: Jake Orlowitzhttp://www.linkedin.com/profile/view?id=197604531 Email: jorlow...@yahoo.com Skype: jorlowitz Cell: (484) 684-2104 Home: (484) 380-3940 -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] text mining software
There have been some great software recommendations in this thread that I really don't want to quibble with. What I'd like to quibble with is the software-first approach. We've all tried the software-first approach; how many of us were happy with it? There is a standard in this area and that standard appears to have at least two non-trivial implementations, including one from a software distributor whose name we all recognise. SPEC: http://docs.oasis-open.org/uima/v1.0/uima-v1.0.html APACHE UIMA: http://uima.apache.org/ GATE: http://gate.ac.uk/ Does anyone have experience using the standard or these two implementations? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Way to record usage of tables/rooms/chairs in Library
Many buildings have IR sensors already installed for burglar alarms / fire detection. If you can get a read-only feed from that system you may be able to piggyback. Of course, these kinds of sensors are tripped by staff making regular rounds of all spaces and similar non-patron activity. cheers stuart On 16/08/13 06:33, Brian Feifarek wrote: Motion sensors might be the ticket. For example, https://www.sparkfun.com/products/8630 Brian - Original Message - From: Andreas Orphanides akorp...@ncsu.edu To: CODE4LIB@LISTSERV.ND.EDU Sent: Thursday, August 15, 2013 11:12:02 AM Subject: Re: [CODE4LIB] Way to record usage of tables/rooms/chairs in Library Oh, that's a much better idea than light sensors. One challenge with that might be difficulty in determining what vacant looks like authoritatively, especially if people move chairs, walk through room, etc. But much more accessible than actually bolting stuff to the table, I would think. On Thu, Aug 15, 2013 at 1:03 PM, Schwartz, Raymond schwart...@wpunj.eduwrote: Hey Dre, Perhaps a video camera with some OpenCV? -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andreas Orphanides Sent: Thursday, August 15, 2013 8:55 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Way to record usage of tables/rooms/chairs in Library If I were feeling really ambitious -- and fair warning, I'm a big believer that any solution worth engineering is worth over-engineering -- I'd come up with something involving light sensors (a la a gate counter) mounted on the table legs, just above seat height. Throw in some something something Arduino or Raspberry Pi, and Bob's your uncle. I find myself more intimidated by the practicality of maintaining such a system (batteries, cord management etc) than about the practicality of this implementation, actually. -dre. 
On Wed, Aug 14, 2013 at 7:59 PM, Thomas Misilo misi...@fit.edu wrote: Hi, I was wondering if anyone has been asked before to come up with a way to record usage of tables. The ideal solution would be a web app, that we can create floor plans with where all the tables/chairs are and select the reporting time, say 9PM at night. Go around the library and select all the seats/tables/rooms that are currently being used/occupied for statistical data. We would be wanting to go around probably multiple times a day. The current solution I have seen is a pen and paper task, and then someone will have to manually put the data into a spreadsheet for analysis. Thanks! Tom -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] TemaTres 1.7 released: now with meta-terms and SPARQL endpoint
I'm glad to see development of this continuing. It's been on my list of potential stand-alone authority control systems. cheers stuart On 14/08/13 13:35, diego ferreyra wrote: We are glad to announce the public release of TemaTres 1.7. Here the changelog: - Now you can have a SPARQL endpoint for your TemaTres vocabulary. Many thanks to Enayat Rajabi!!! - Capability to create and manage meta-terms. A meta-term is a term that describes other terms (e.g. guide terms, facets, categories, etc.). They can't be used in the indexing process. - New standard reports: all the terms with their UF terms and all the terms with their RT terms. - Capability to define custom fields in alphabetical export - New capabilities for the TemaTres API: suggest and suggestDetails - Fixed bugs and improved several functional aspects. Many thanks to the feedback provided by the TemaTres community :) Some HOWTOs: How to update to TemaTres 1.7: - Login as admin and go to: Menu - Administration - Database maintenance - Update 1.6 to 1.7 How to enable the SPARQL endpoint: 1) Login as admin and go to Menu - Administration - Configuration - Click on your vocabulary: Set as ENABLE SPARQL endpoint (disabled by default). 2) Login as admin and go to: Menu - Administration - Database maintenance - Update SPARQL endpoint. Best regards and apologies for cross-posting diego ferreyra temat...@r020.com.ar http://www.vocabularyserver.com -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] LibGuides: I don't get it
On 12/08/13 12:20, Andrew Darby wrote: I don't get this argument at all. Why is it counter productive to try to look at open source alternatives if the vendor's option is relatively cheap? Why wouldn't you investigate all options? If you have no in-house technical capability, the cost of looking at an open source alternative can easily outweigh the multi-year licensing fee. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] WorldCat Implements Content-Negotiation for Linked Data
On 04/06/13 11:18, Karen Coyle wrote: Ta da! That did it, Kyle. Why on earth do we call them smart quotes?! Because they look damn sexy when printed on pulp-of-murdered-tree, which we all know is the authoritative form of any communication. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] On-going support for DL projects
On 18/05/13 01:51, Tim McGeary wrote: There is no easy answer for this, so I'm looking for discussion. - Should we begin considering a cooperative project that focuses on emulation, where we could archive projects that emulate the system environment they were built in? - Do we set policy that these types of projects last for as long as they can, and once they break they are pulled down? - Do we set policy that supports these projects for a certain period of time and then deliver the application, files, and databases to the faculty member to find their own support? - Do we look for a solution like the Wayback Machine of the Internet Archive to try to present some static / flat presentation of these projects? Actually, there is an easy answer to this. Make sure that the collection is aligned with broader institutional priorities to ensure that if/when staff and funding priorities move elsewhere there is some group / community with a clear interest and/or mandate in keeping the collection at least on life support, if not thriving. Google 'collections policy' for what written statements of this might look like. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Tool to highlight differences in two files
Automating your favourite browser to load and screenshot each version and then using http://www.imagemagick.org/Usage/compare/ should work. Note that this will also catch the scenario where someone has changed the page by changing an image on the page. cheers stuart On 24/04/13 10:18, Wilhelmina Randtke wrote: That helps a lot, because it's for websites which is what I want to compare. I am looking for changes in a site, and I have some archives, but tools for merging code are too labor intensive and don't give a good visual report that I can show to a supervisor. This is good moving forward, but doesn't cover historical pages. I was hoping for something where I could call up two pages and get a visual display of differences for the display version of html, not the code. -Wilhelmina On Tue, Apr 23, 2013 at 5:14 PM, Pottinger, Hardy J. pottinge...@missouri.edu wrote: Hi, I'm not sure if you're really looking for a diff tool, so I'll just shout an answer to a question that I think you might be asking. I use a variation of the script posted here: http://stackoverflow.com/questions/1494488/watch-a-web-page-for-changes for watching a web page for changes. I mostly only ever use this for watching for new artifacts to appear in Maven Central (because refreshing a web page is pretty dull work). Hope this helps. -- HARDY POTTINGER pottinge...@umsystem.edu University of Missouri Library Systems http://lso.umsystem.edu/~pottingerhj/ https://MOspace.umsystem.edu/ Do you love it? Do you hate it? There it is, the way you made it. --Frank Zappa On 4/23/13 3:24 PM, Wilhelmina Randtke rand...@gmail.com wrote: I would like to compare versions of a website scraped at different times to see what paragraphs on a page have changed. Does anyone here know of a tool for holding two files side by side and noting what is the same and what is different between the files? It seems like any simple script to note differences in two strings of text would work, but I don't know a tool to use. 
-Wilhelmina Randtke -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
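The browser-screenshot-plus-ImageMagick approach above can be sketched as a short shell script. This demo generates two stand-in "screenshots" with `convert` so it is self-contained; in practice those two PNGs would come from your scripted browser loading the old and new versions of the page:

```shell
# Generate two stand-in "screenshots" that differ in one region
# (in practice: automate a browser to screenshot each page version).
convert -size 100x40 xc:white old.png
convert -size 100x40 xc:white -fill black -draw 'rectangle 10,10 30,30' new.png

# 'compare' writes diff.png with the changed regions highlighted and
# prints the count of differing pixels (AE = absolute error) to stderr.
# It exits non-zero when the images differ, hence the '|| true'.
compare -metric AE old.png new.png diff.png 2>&1 || true
```

The resulting diff.png is the kind of visual report that can be shown to a supervisor without them needing to read HTML.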
[CODE4LIB] RDA software for managing authorities
I'm looking for recommendations for software for managing authorities. Currently we're using a somewhat antiquated version of EATS https://code.google.com/p/eats/ but we're looking for something different. Our needs / wants are: (*) Sane import/export to RDA (leaning towards RDA native) (*) Sane import from legacy formats (*) Sane export to sundry RDF formats + legacy formats (*) Web based (*) Out of the box rather than highly customised software (*) Good support for bi-lingual / multi-lingual entries (*) Ability to host multiple entirely separate authority groups with separate policies and practices. (*) Explicit support for VIAF / wikidata / LoC It occurs to me that conceivably the best software for the job is actually an LMS with all the item-level stuff suppressed in favour of work-level and authority records, in which case the question becomes: is there an RDA-based LMS that can be customised to remove all the item-level stuff? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] wiki page about the code4lib irc bot created
On 25/01/13 09:47, Bohyun Kim wrote: Hi all~ I was not familiar with the code4lib IRC bot (or IRC bots in general for that matter), and the recent discussion on the listserv made me curious. BTW I fully support the idea of removing offensive content, and big thanks to those who have been working on cleaning up that stuff. In any case, I figured there might be others who are new to code4lib and were somewhat aware of zoia but not sure what exactly it does or will do. So I created a wiki page with a bunch of examples this morning. It's far from comprehensive but I think it would be cool if others - who care about the bot - add more content to this page. http://wiki.code4lib.org/index.php/Zoia_or_the_Code4Lib_IRC_bot Looking at that, the only absolutely library-specific content there appears to be the MARC plugin (which isn't documented in detail). cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Metrics for measuring digital library production
On 18/12/12 10:20, Kyle Banerjee wrote: Howdy all, Just wondering who might be willing to share what kind of stats they produce to justify their continued existence? Of course we do the normal (web activity, items and metadata records created, stuff scanned, etc), but I'm trying to wrap my mind around ways to describe work where there's not a built in assumption that more is better. I recently gave a seven minute rant at NDF about what statistics we aren't collecting. The video requires silverlight, alas: http://webcast.gigtv.com.au/Mediasite/Catalog/catalogs/NDF/ (second page, 'Lightning talks Session 2') Capsule summary: we claim to value user engagement. Making that claim and then failing to attempt to measure it is unprofessional. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] basic IRC question/comments
If you decide you'll be hanging out on #libtechwomen a lot (or on irc.freenode.net in general), it might be a good idea to register your nick as explained here http://freenode.net/faq.shtml#userregistration because the short answer is, sometimes people are assholes. Another important option for complete newbies is to pick a guest username (these are often generated automatically by different clients, Guest1234, etc) so that you can try it out completely anonymously. I, for one, am certainly happy to interact with people on an anonymous basis if that's what they want. Cheers stuart
Re: [CODE4LIB] Gender Survey Summary and Results
On 06/12/12 09:05, Sara Amato wrote: I'd been staying out of this discussion, but the thought occurs to me that someone with access to the list of subscribers might run that against a list of traditional boy/girl names, and be able to make some guesses…. That idea runs into problems both with non-western names (there is more than one kind of diversity) and those people whose experience of gender in the workplace have led them to use non-gender-specific identifiers. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] EPUB and ILS indexing update : Question on CIP Usage for e-books
But who is deciding the LCC or Dewey Classification code? Should it be the publisher's initiative? Is there a way to get that information automatically? For books published in the US except for those categories listed at http://www.loc.gov/publish/cip/about/ineligible.html , the Library of Congress creates all of the CIP data based on information provided before publication by the publisher. It's not up to the publisher to suggest these. If you want to find classification data for existing titles (as above, mostly only those published in print), you could query the Library of Congress catalog or WorldCat. Conveniently, those ineligible categories include: Books published in electronic format Cheers stuart
Re: [CODE4LIB] Google Analytics/Do Not Track
On 31/10/12 09:51, Nathan Tallman wrote: After all the hoopla this year it looks as if all the major browsers plan to implement a do not track feature that users can enable. Does anyone know if this will block Google Analytics It's probably too early to tell, but my guess is yes... Let me get this right. You want to track users after they have expressed an explicit desire not to be tracked? The link you're after is http://www.ala.org/offices/oif/statementspols/ftrstatement/freedomreadstatement cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Seeking examples of outstanding discovery layers
On 21/09/12 12:52, Penelope Campbell wrote: It may not be what you are thinking of, but see http://trove.nla.gov.au/ the best way to see it in action is to do a search. http://www.digitalnz.org/ and its skins such as http://nzresearch.org.nz/ are also pretty good, not that I'm trying to encourage trans-Tasman rivalry. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] visualize website
On 03/09/12 12:59, David Friggens wrote: If you don't have a GUI on the server, it looks like philesight will provide a ring chart view through the web server (haven't tried it myself). If you don't have X11 on the server, pipe (or copy) the output of 'du' via 'ssh' to 'xdu' on your desktop. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
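Concretely, the du-over-ssh trick works because du's output is plain text; this sketch shows the local half, with the hostname and path in the remote command being placeholders:

```shell
# du -k prints one "size-in-KB<TAB>path" line per directory, parent last:
du -k /etc 2>/dev/null | tail -2

# the same plain-text stream pipes over ssh to xdu on your X11 desktop
# ("webserver" and /var/www are placeholders for your host and path):
#   ssh webserver "du -k /var/www" | xdu
```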
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
On 29/08/12 19:46, Michael Hopwood wrote: Thanks for this pointer Owen. It's a nice illustration of the fact that what users actually want (well, I know I did back when I actually worked in large information services departments!) is something more like an intranet where the content I find is weighted towards me, the audience e.g. the intranet knows I'm a 2nd year medical student and one of my registered preferred languages is Mandarin already, or it knows that I'm a rare books cataloguer and I want to see what nine out of ten other cataloguers recorded for this obscure and confusing title. Yet another re-invention of content negotiation, AKA RFC 2295. These attempts fail because 99% of data publishers care in the first instance about the single use before them and in the second instance the precedent has already been set. The exception, of course, is legally mandated multi-lingual bureaucracies (Canadian government for en/fr; EU organs for various languages etc) and on-the-wire formatting (for which it works very well). However, this stuff is quite intense for linked data, isn't it? I understand that it would involve lots of quads, named graphs or whatever... In a parallel world, I'm currently writing up recommendations for aggregating ONIX for Books records. ONIX data can come from multiple sources who potentially assert different things about a given book (i.e. something with an ISBN to keep it simple). This is why *every single ONIX data element* can have optional attributes of @datestamp @sourcename @sourcetype [e.g. publisher, retailer, data aggregator... library?] ...and the ONIX message as a whole is set up with header and product record segments that each include some info about the sender/recipient/data record in question. Do you have any stats for how many ONIX data elements in the wild actually use these attributes in non-trivial ways? I've never seen any. cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
These have to be named graphs, or at least collections of triples which can be processed through workflows as a single unit. In terms of LD, their versioning needs to be defined in terms of: (a) synchronisation with the non-bibliographic real world (i.e. Dataset Z version X was released at time Y) (b) correction/augmentation of other datasets (i.e. Dataset F version G contains triples augmenting Dataset H versions A, B, C and D) (c) mapping between datasets (i.e. Dataset I contains triples mapping between Dataset J version K and Dataset L version M (and vice versa)) Note that a 'Dataset' here could be a bibliographic dataset (records of works, etc), a classification dataset (a version of the Dewey Decimal Scheme, a version of the Māori Subject Headings, a version of Dublin Core Scheme, etc), a dataset of real-world entities to do authority control against (a dbpedia dump, an organisational structure in an institution, etc), or some arbitrary mapping between some arbitrary combination of these. Most of these are going to be managed and generated using current systems with processes that involve periodic dumps (or drops) of data (the dbpedia drops of wikipedia data are a good model here). git makes little sense for this kind of data. github is most likely to be useful for smaller niche collaborative collections (probably no more than a million triples) mapping between the larger collections, and scripts for integrating the collections into a sane whole. cheers stuart On 28/08/12 08:36, Karen Coyle wrote: Ed, Corey - I also assumed that Ed wasn't suggesting that we literally use github as our platform, but I do want to remind folks how far we are from having people-friendly versioning software -- at least, none that I have seen has felt intuitive. 
The features of git are great, and people have built interfaces to it, but as Galen's question brings forth, the very *idea* of versioning doesn't exist in library data processing, even though having central-system based versions of MARC records (with a single time line) is at least conceptually simple. Therefore it seems to me that first we have to define what a version would be, both in terms of data but also in terms of the mind set and work flow of the cataloging process. How will people *understand* versions in the context of their work? What do they need in order to evaluate different versions? And that leads to my second question: what is a version in LD space? Triples are just triples - you can add them or delete them but I don't know of a way that you can version them, since each has an independent T-space existence. So, are we talking about named graphs? I think this should be a high priority activity around the new bibliographic framework planning because, as we have seen with MARC, the idea of versioning needs to be part of the very design or it won't happen. kc On 8/27/12 11:20 AM, Ed Summers wrote: On Mon, Aug 27, 2012 at 1:33 PM, Corey A Harper corey.har...@nyu.edu wrote: I think there's a useful distinction here. Ed can correct me if I'm wrong, but I suspect he was not actually suggesting that Git itself be the user-interface to a github-for-data type service, but rather that such a service can be built *on top* of an infrastructure component like GitHub. Yes, I wasn't saying that we could just plonk our data into Github, and pat ourselves on the back for a good days work :-) I guess I was stating the obvious: technologies like Git have made once hard problems like decentralized version control much, much easier...and there might be some giants shoulders to stand on. //Ed -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
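Point (a) of the versioning model sketched at the top of this message could look roughly like this in TriG (the named-graph serialisation of RDF); the dataset URIs and version literals are invented for illustration:

```trig
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The dataset's triples live in a named graph...
GRAPH <http://example.org/datasetZ/vX> {
    <http://example.org/book/1> dct:title "Some title" .
}

# ...and the graph itself is described: Dataset Z version X,
# released at time Y.
<http://example.org/datasetZ/vX>
    dct:isVersionOf <http://example.org/datasetZ> ;
    dct:issued "2012-08-01"^^xsd:date .
```

Because the version metadata hangs off the graph URI rather than off any individual triple, a whole release can be accepted, superseded or rolled back as a single unit, which is the workflow property the message argues for.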
Re: [CODE4LIB] Corrections to Worldcat/Hathi/Google
On 28/08/12 12:07, Peter Noerr wrote: They are not descendants of the same original, they are independent entities, whether they are recorded as singular MARC records or collections of LD triples.

That depends on which end of the stick one grasps. Conceptually these are descendants of the abstract work in question; textually these are independent (or likely to be). In practice it doesn't matter: since git/svn/etc are all textual in nature, they're not good at handling these. The reconciliation is likely to be a good candidate for temporal versioning. It's interesting to ponder which of the many datasets is going to prove to be the hub for reconciliation. My money is on LibraryThing, because their merge-ist approach to cataloguing means they have lots and lots of different versions of the work information to match against. See for example: https://www.librarything.com/work/683408/editions/11795335 Wikipedia / dbpedia have redirects which tend in the same direction, but only for titles and not ISBNs.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Wikis
The wiki software with the largest user base is undoubtedly MediaWiki (i.e. the software behind Wikipedia). We're moving to it as a platform precisely to leverage the skills that implies. We're not far enough into our roll-out to tell whether it's going to be a success.

cheers
stuart

Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nathan Tallman Sent: Wednesday, 25 July 2012 8:34 a.m. To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Wikis There are a plethora of options for wiki software. Does anyone have any recommendations for a platform that's easy to use and has a low learning curve for users? I'm thinking of starting a wiki for internal best practices, etc. and wondered what people who've done the same had success with. Thanks, Nathan
Re: [CODE4LIB] The history of Code4Lib and MediaWiki development.
On 09/06/12 06:18, Klein,Max wrote: I was just wondering if there have been any efforts from Code4Lib into MediaWiki development? I know that there have been some Wikipedia templates and bots designed to interface with library services. Yet what about cold hard MediaWiki extensions? Has there been any discussion on this, any ideas raised?

Do you have any specific examples of things that can't be done with templates (good for holding information), bots (good for adding, curating and maintaining information) or CSS (good for displaying information) that can only be done using a MediaWiki extension? The road to getting a MediaWiki extension stable enough, tested enough, trusted enough and needed enough for it to get rolled out on Wikipedia is long and hard, and I wouldn't recommend it unless you have the most cast-iron of use-cases. If ISBN support were proposed today, I strongly suspect it wouldn't make it in. Of course it's now grandfathered in, and removing support seems very, very unlikely.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Best way to process large XML files
On 09/06/12 06:36, Kyle Banerjee wrote: How do you guys deal with large XML files?

There have been a number of excellent suggestions from other people, but it's worth pointing out that sometimes low tech is all you need. I frequently use sed to do things such as replace one domain name with another when a website changes its URL. Short for Stream EDitor, sed is a core part of POSIX and should be available on pretty much every UNIX-like platform imaginable. Because it streams, it runs at essentially disk speed (i.e. about as fast as a naive file copy) no matter how large the file. Full regexp support is available (note that the case-insensitive I flag below is a GNU sed extension, not POSIX).

sed 's/www.example.net/example.com/gI' IN_FILE > OUT_FILE

will stream IN_FILE to OUT_FILE, replacing all instances of www.example.net with example.com.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
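When sed isn't available, or the substitution needs real logic, the same constant-memory stream edit is easy to sketch in Python. The function name, filenames and pattern below are placeholders for illustration:

```python
import re

def stream_replace(src, dst, pattern, replacement):
    """Rewrite src into dst line by line, so memory use stays constant
    even for multi-gigabyte XML files (the same property sed relies on)."""
    rx = re.compile(pattern, re.IGNORECASE)  # mirrors sed's GNU 'I' flag
    for line in src:
        dst.write(rx.sub(replacement, line))

# Typical use with real files:
# with open("IN_FILE") as src, open("OUT_FILE", "w") as dst:
#     stream_replace(src, dst, r"www\.example\.net", "example.com")
```

Note the escaped dots in the pattern: unlike the one-liner above, this won't accidentally match `wwwXexampleYnet`.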
Re: [CODE4LIB] Studying the email list
On 06/06/12 06:11, Doran, Michael D wrote: Without asking permission of the list, I hereby assign this new category of things requiring OCLC oversight as salami on the charcuterie spectrum. Bologna == Seal of Disapproval There appears to be a typo here: Soylent Green == Seal of Disapproval cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
[CODE4LIB] OCLC / British Library / VIAF / Wikipedia
There's a discussion going on on Wikipedia that may be of interest to subscribers of this list: https://en.wikipedia.org/wiki/Wikipedia_talk:Authority_control#More_VIAF_Integration cheers stuart
Re: [CODE4LIB] MARC Magic for file
On 24/05/12 07:14, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread).

A few warnings about the unix file command:
(a) it only looks at the start of the file. This is great because it works fast on big files. This is dreadful because it can't warn you that everything after the first 10k of a 2GB file is corrupt, or that a 1k MARC file is prepended to a 400GB astronomy data file.
(b) it is not uncommon for a file to match multiple file types. This can cause problems when using file to check whether inputs to a program are actually the type the program is expecting.
(c) some platforms have been notoriously slow to add new definitions; ubuntu is not such a platform.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
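Point (a) can be made concrete in a few lines of Python. Like file(1), this inspects only the leading bytes and says nothing about the rest of the file; it is a loose heuristic of my own, not the actual magic database rule that shipped:

```python
def looks_like_marc21(leader: bytes) -> bool:
    """Loose heuristic: a MARC21 record opens with a 24-byte leader whose
    first 5 bytes are the record length as ASCII digits, and whose
    positions 10-11 are '22' (indicator count and subfield code length)."""
    return (len(leader) >= 24
            and leader[:5].isdigit()
            and leader[10:12] == b"22")
```

Everything after byte 24 could be corrupt, or an astronomy catalogue, and a check like this would never know.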
Re: [CODE4LIB] whimsical homepage idea
The catalog is also a good reference for how many books there are available as fuel. Hopefully the records contain information on which are printed on clean-burning paper. cheers stuart On 03/05/12 10:32, Genny Engel wrote: The number of currently available cardigans could then be displayed along with the temperature gauges. Now you also have to interface this whole thing with the item status in the catalog, which will of course have to contain cardigan records. You could use NCIP to grab the status, but I'm not sure what the standard cardigan metadata would include. Genny Engel -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Maryann Kempthorne Sent: Tuesday, May 01, 2012 9:56 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] whimsical homepage idea Why not a cardigan checkout? Maryann On Tue, May 1, 2012 at 6:23 PM, Kyle Banerjeebaner...@uoregon.edu wrote: [stuff on where to get sensors deleted] Depending on how many you need, wireless sensors for weather stations could make more sense (you can run them on different channels to prevent interference). Plus you can use the weather software to generate graphs, upload data, etc. kyle -- -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance baner...@uoregon.edu / 503.999.9787 -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
[CODE4LIB] Alternatives to MARC (was: Re: [CODE4LIB] NON-MARC ILS?)
MARC is a pain to work with; this is a truism with which most of us are familiar. Blindly moving away from MARC is not the solution; indeed, history suggests that path leads us back to an even more complex version of MARC. MARC is complex (and thus a pain) for three reasons: (a) the inherent complexity of the bibliographic content it deals with; (b) the fact that there are many MARC-using groups who have different sets of motivations and ideas as to what MARC is for; and (c) MARC's long and complicated history. Throwing out MARC doesn't solve any of these except the last, and then only if you throw away all your data and make no effort to migrate it. Obtaining new data from a consortium or company almost certainly buys you not only MARC's history, but some tasty local decisions on top. A far more productive discussion is to explore potential replacements for MARC. This, of course, is only productively conducted with a sound understanding of the causes of the complexity in MARC. I'll leave it to the reader to consider whether various proponents' arguments are persuasive on this point.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Preserving hyperlinks in conversion from Excel/googledocs/anything to PDF (was Any ideas for free pdf to excel conversion?)
Sounds like a job for LaTeX and a short bash script to me. cheers stuart On 07/03/12 07:55, Bill Dueber wrote: What exactly are you trying to do? Take a list of links and turn them into...a list of hot links in a PDF file? On Mon, Mar 5, 2012 at 8:46 AM, Matt Amorymatt.am...@gmail.com wrote: Does anyone know of any script library that can convert a set of (~200) hyperlinks into Acrobat's goofy protocol? I do own Acrobat Pro. Thanks On Wed, Dec 14, 2011 at 1:08 PM, Matt Amorymatt.am...@gmail.com wrote: Just looking to preserve column structure. -- Matt Amory (917) 771-4157 matt.am...@gmail.com http://www.linkedin.com/pub/matt-amory/8/515/239 -- Matt Amory (917) 771-4157 matt.am...@gmail.com http://www.linkedin.com/pub/matt-amory/8/515/239 -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Lift the Flap books
On 15/02/12 13:43, Sara Amato wrote: If you were to have a 'lift the flap' type book that you wanted to digitize, for web display and use, what technology would you use for markup and display? Visually I like the Internet Archive BookReader ( http://openlibrary.org/dev/docs/bookreader ), which says it can do 'foldouts', though I haven't found an example of HOW to do that ... nor exactly what the metadata schema is.

Sounds like an ideal use for HTML, JavaScript and image transparency, using an onmouseover handler as the trigger.

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
Re: [CODE4LIB] Detecting DRM in files
On 19/01/12 10:39, Farrell, Larry D wrote: Does anyone know of a Java, Ruby, Python, etc. package to detect digital rights management features in files?

This is really a subset of detecting file types. There are a number of systems for detecting file types; an overview can be found at: http://www.forensicswiki.org/wiki/File_Format_Identification

cheers
stuart

--
Stuart Yeates
Library Technology Services http://www.victoria.ac.nz/library/
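The file-type step usually comes down to magic-number matching. A minimal sketch (real identifiers such as file(1) carry vastly larger signature databases, and DRM detection then needs a per-format second pass, e.g. looking inside an EPUB container for a META-INF/encryption.xml entry):

```python
# These are real, well-known magic numbers; the table is deliberately tiny.
MAGIC = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip",   # also EPUB, DOCX, ODF -- containers need a second look
    b"\x89PNG": "png",
}

def sniff(data: bytes) -> str:
    """Return a rough type name based on the file's leading bytes."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

Note the zip entry: container formats are exactly where type detection alone stops short, because the DRM markers live inside the container rather than in the leading bytes.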
[CODE4LIB] Crafting MARC records to ePub files
I help maintain a website ( http://www.nzetc.org/ ) which publishes texts of reasonably broad interest (Nineteenth Century New Zealand novels, the Official New Zealand War Histories, a couple of literary journals, early New Zealand ethnography, etc). We publish to the web and as ePub (+PDF in cases where the layout or typography might be important). Currently we have MARC records that point to the web version, for internal use, but we're looking at making a MARCXML file available for download by third parties (we anticipate public libraries and perhaps school libraries). We'd like to make the MARC records as useful as possible to as many people as possible. I'm looking for specific recommendations of how to code references to ePubs in MARC for maximum compatibility with multiple cataloguing systems. [For those looking at our current ePubs, we're also working to increment the version of the standard we use to make them compatible with iTunesU.] cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/
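On the ePub links specifically, one common convention (hedged: local cataloguing practice varies, and the URL below is a placeholder invented for illustration) is an 856 field with first indicator 4 (HTTP), $u carrying the URL and $q carrying the media type. A sketch of generating that as MARCXML:

```python
import xml.etree.ElementTree as ET

# Build a MARCXML 856 datafield pointing at an ePub. The URL is a
# placeholder; $q carries the registered ePub media type.
df = ET.Element("datafield", tag="856", ind1="4", ind2="0")
for code, text in [("u", "http://www.nzetc.org/path/to/book.epub"),
                   ("q", "application/epub+zip")]:
    sf = ET.SubElement(df, "subfield", code=code)
    sf.text = text

marcxml = ET.tostring(df, encoding="unicode")
```

Whether receiving systems do anything useful with $q is exactly the multiple-cataloguing-systems compatibility question being asked here.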
Re: [CODE4LIB] Models of MARC in RDF
On 07/12/11 14:52, Montoya, Gabriela wrote: Dream Team for Building a MARC RDF Model: Karen Coyle, Alistair Miles, Diane Hillman, Ed Summers, Bradley Westbrook. As much as I have nothing against anyone on this list, isn't it a little US-centric? Didn't we make that mistake before? cheers stuart -- Stuart Yeates Library Technology Services http://www.victoria.ac.nz/library/