Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question. But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). DC doesn't seem to offer a nice way of doing this (that

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
I agree with this Rosalyn. The issue that Nate brought up was that the content at http://www.bbc.co.uk could change over time, and old content might be moved to another URI - http://archive.bbc.co.uk or something. So if course A references http://www.bbc.co.uk on 24/08/09, if the content that

[CODE4LIB] Results from Institutional Identifiers in Repositories Survey

2009-09-15 Thread Michael J. Giarlo
Greetings, The NISO I2 Working Group surveyed repository managers and developers about current practices and needs of the repository community around institutional identifiers. Results from the survey will inform a set of use cases that are expected to drive the development of a draft standard

[CODE4LIB] indexing pdf files

2009-09-15 Thread Eric Lease Morgan
I have been having fun recently indexing PDF files. For the past six months or so I have been keeping the articles I've read in a pile, and I was rather amazed at the size of the pile. It was about a foot tall. When I read these articles I actively read them -- meaning, I write, scribble,

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Rosalyn Metz
you could force a timestamp if people don't include a date. and I like the idea of going to the Internet Archive of a website, because then you're not having to get into the business of handling www.bbc.co.uk differently than cnn.com and someblog.org. i also like the idea of using a redirect.
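
A minimal sketch of the "force a timestamp, then go to the Internet Archive" idea, assuming the Wayback Machine's `https://web.archive.org/web/<timestamp>/<url>` address pattern. The function name `wayback_url` and the fall-back-to-today policy are illustrative assumptions, not something settled in the thread.

```python
from datetime import date
from typing import Optional


def wayback_url(url: str, cited_on: Optional[date] = None) -> str:
    """Build a Wayback Machine address for `url` as it looked on `cited_on`."""
    # Force a timestamp when the reference carries no date: fall back to today.
    stamp = (cited_on or date.today()).strftime("%Y%m%d")
    # Every site is handled the same way, so no per-site rules are needed.
    return f"https://web.archive.org/web/{stamp}/{url}"


# The example date is the 24/08/09 citation mentioned earlier in the thread.
print(wayback_url("http://www.bbc.co.uk", date(2009, 8, 24)))
```

The archive redirects to the capture nearest the requested date, so the link may land on a slightly different snapshot than the one the author saw.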

Re: [CODE4LIB] indexing pdf files

2009-09-15 Thread Rosalyn Metz
Eric, I have librarians that would kill for this. In fact I was talking to one about it the other day. She felt there must be a way to handle active reading and make it portable. This would be great in conjunction with RefWorks or Zotero or something along those lines. Rosalyn On Tue, Sep

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
Thanks Rosalyn, As you say we could push a custom value into rfr_genre. I'm a bit torn on this, as I guess I'm trying to do something that isn't 'hacky' - or at least not from the OpenURL end of it. It might be that this is just wishful thinking, and that I'm just trying to fool myself into

Re: [CODE4LIB] indexing pdf files

2009-09-15 Thread Mark A. Matienzo
Eric, 5. Use pdftotext to extract the OCRed text from the PDF and index it along with the MyLibrary metadata using Solr. [3, 4] Have you considered using Solr's ExtractingRequestHandler [1] for the PDFs? We're using it at NYPL with pretty great success. [1]
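
A rough sketch of sending a PDF to Solr's ExtractingRequestHandler from Python, assuming the handler is mounted at its usual `/update/extract` path. The base URL, document id, and helper names here are hypothetical, and this is not the NYPL setup, which the message does not describe.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def extract_url(solr_base: str, doc_id: str, commit: bool = True) -> str:
    """Build the request URL for the extracting handler."""
    # literal.id supplies the unique key for the extracted document.
    params = {"literal.id": doc_id, "commit": "true" if commit else "false"}
    return f"{solr_base}/update/extract?{urlencode(params)}"


def post_pdf(solr_base: str, doc_id: str, pdf_path: str) -> int:
    """POST the raw PDF bytes to Solr and return the HTTP status code."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    req = Request(
        extract_url(solr_base, doc_id),
        data=body,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status


# Needs a running Solr instance, so only the URL is built here.
print(extract_url("http://localhost:8983/solr", "article-42"))
```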

Re: [CODE4LIB] indexing pdf files

2009-09-15 Thread Peter Kiraly
Hi all, I would like to suggest an API for extracting text (including highlighted or annotated ones) from PDF: iText (http://www.lowagie.com/iText/). This is a Java API (has C# port), and it helped me a lot, when we worked with extraordinary PDF files. Solr uses Tika

Re: [CODE4LIB] indexing pdf files

2009-09-15 Thread danielle plumer
My (much more primitive) version of the same thing involves reading and annotating articles using my Tablet PC. Although I do get a variety of print publications, I find I don't tend to annotate them as much anymore. I used to use EndNote to do the metadata, then I switched to Zotero. I hadn't

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Ross Singer
Owen, I might have missed it in this message -- my eyes are starting to glaze over at this point in the thread, but can you describe how the input of these resources would work? What I'm basically asking is -- what would the professor need to do to add a new: citation for a 70 year old book;

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
Ross - no, you didn't miss it. There are 3 ways that references might be added to the learning environment: An author (or realistically a proxy on behalf of the author) can insert a reference into a structured Word document from an RIS file. This structured document (XML) then goes through a

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Eric Hellman
A suggestion on how to get a prof to enter a url. I use this bookmarklet to add a URL to Hacker News: javascript:window.location=%22http://news.ycombinator.com/submitlink?u=%22+encodeURIComponent(document.location)+%22&t=%22+encodeURIComponent(document.title) I'm tempted to suggest an api

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Jonathan Rochkind
O.Stephens wrote: True. How, from the OpenURL, are you going to know that the rft is meant to represent a website? I guess that was part of my question. But no one has suggested defining a new metadata profile for websites (which I probably would avoid tbh). DC doesn't seem to offer a

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Jonathan Rochkind
Wait, are you really going to try to do this with _SFX_ too? I missed that part. Oh boy. Seriously, I think you are in for a world of painful hacky kludge. Rosalyn Metz wrote: Owen, The reason I suggest a source parser rather than a target parser is that handling the openurl based on the

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Jonathan Rochkind
O.Stephens wrote: Thanks Rosalyn, As you say we could push a custom value into rfr_genre. I'm a bit torn on this, as I guess I'm trying to do something that isn't 'hacky' - or at least not from the OpenURL end of it. It might be that this is just wishful thinking, and that I'm just trying to

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
Do you think? I reckon it is just a few lines of code in a custom source parser... Only need to: check rft.id contains an http URI (regexp); define a fetchID based on this URI (possibly + date/other metadata); get a URL or null from a lookup service; insert URL or rft_id value into rft.856. Simple!
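
The four steps listed in this message can be sketched as follows. This is an illustration in Python, not an SFX source parser: the `lookup` dict stands in for the lookup service, and the `rft.date` key and the `uri|date` fetch-ID format are assumptions made only for the example.

```python
import re
from typing import Dict, Optional
from urllib.parse import parse_qs

# Step 1: a regexp that recognises an http(s) URI in rft_id.
HTTP_URI = re.compile(r"^https?://", re.IGNORECASE)


def make_fetch_id(uri: str, cited_on: Optional[str]) -> str:
    """Step 2: derive a fetch ID from the URI, plus the date when present."""
    return f"{uri}|{cited_on}" if cited_on else uri


def resolve(openurl_query: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the URL to link to, or None if rft_id is not an http URI."""
    params = parse_qs(openurl_query)
    rft_id = params.get("rft_id", [None])[0]
    if rft_id is None or not HTTP_URI.match(rft_id):
        return None
    cited_on = params.get("rft.date", [None])[0]
    fetch_id = make_fetch_id(rft_id, cited_on)
    # Step 3: the lookup returns a URL or nothing.
    # Step 4: use that URL if there is one, otherwise the original rft_id.
    return lookup.get(fetch_id) or rft_id


query = "rft_id=http%3A%2F%2Fwww.bbc.co.uk&rft.date=2009-08-24"
moved = {"http://www.bbc.co.uk|2009-08-24": "http://archive.bbc.co.uk"}
print(resolve(query, moved))
print(resolve(query, {}))
```

With an entry in the lookup table the reader is sent to the moved content; without one the original URL is used unchanged.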

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Ross Singer
Given that the burden of creating these links is entirely on RefWorks Telstar, OpenURL seems as good a choice as anything (since anything would require some other service, anyway). As long as the profs aren't expected to mess with it, I'm not sure that *how* you do the indirection matters all

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Ross Singer
Oh yeah, one thing I left off -- In Moodle, it would probably make sense to link to the URL in the <a> tag: <a href="http://bbc.co.uk/">The Beeb!</a> but use a javascript onMouseDown action to rewrite the link to route through your funky link resolver path, a la Google. That way, the page works like

[CODE4LIB] Fall Internships at WGBH Media Library Archives

2009-09-15 Thread Courtney Michael
Greetings colleagues! We have two opportunities for 2-3 interns at the WGBH Media Library Archives! Please forgive the cross postings and do not respond to me, but send a resume and a statement of interest by email to: human_resour...@wgbh.org or by mail to: WGBH Educational Foundation Human

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread O.Stephens
I'm thinking about it :) Logically I think we can avoid this as we have the context based on the rfr_id (for which we are proposing) rfr_id=info:sid/learn.open.ac.uk:[course code] (at the risk of more comment!) Which seems to me equivalent. I guess it is just a matter of where you do the
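
A small sketch of building such a link, with the course context carried in `rfr_id` as proposed here and the web resource's URL in `rft_id`. The resolver address and the course code `A123` are placeholders, not values from the thread.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_openurl(resolver_base: str, course_code: str, target_uri: str) -> str:
    """Build a key/value OpenURL that carries the course context in rfr_id."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rfr_id", f"info:sid/learn.open.ac.uk:{course_code}"),
        ("rft_id", target_uri),
    ]
    # urlencode percent-encodes the values so they survive as query parameters.
    return f"{resolver_base}?{urlencode(params)}"


link = build_openurl(
    "http://resolver.example.org/openurl", "A123", "http://www.bbc.co.uk"
)
# Round trip: decoding the query gives back the values that were put in.
received = parse_qs(urlsplit(link).query)
print(received["rfr_id"][0])
print(received["rft_id"][0])
```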

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Eric Hellman
I think using locally meaningful ids in rft_id is a misuse and a mistake. locally meaningful data should go in rft_dat, accompanied by rfr_id just sayin' On Sep 15, 2009, at 11:52 AM, Jonathan Rochkind wrote: I do like Ross's solution, if you really wanna use OpenURL. I'm much more

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Eric Hellman
Yes, you can. On Sep 15, 2009, at 11:41 AM, Ross Singer wrote: I can't remember if you can include both metadata-by-reference keys and metadata-by-value, but you could have by-reference (rft_ref=http://telstar.open.ac.uk/1234&rft_ref_fmt=RIS or something) point at your citation db to return a

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Ross Singer
On Tue, Sep 15, 2009 at 12:06 PM, Eric Hellman e...@hellman.net wrote: Yes, you can. In this case, I say punt on dc.identifier, throw the URL in rft_id (since, Eric, you had some concern regarding using the local id for this?) and let the real URL persistence/resolution work happen with the

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Erik Hetzner
Hi Owen, all: This is a very interesting problem. At Tue, 15 Sep 2009 10:04:09 +0100, O.Stephens wrote: […] If we look at a website it is pretty difficult to reference it without including the URL - it seems to be the only good way of describing what you are actually talking about (how many

Re: [CODE4LIB] indexing pdf files

2009-09-15 Thread Erik Hatcher
Here's a post on how easy it is to send PDF documents to Solr from Java: http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/ Not only can you post PDF (and other rich content) files to Solr for indexing, you can

Re: [CODE4LIB] Implementing OpenURL for simple web resources

2009-09-15 Thread Eric Hellman
The process by which a URI comes to identify something other than the stuff you get by resolving it can be mysterious - I've blogged about it a bit: http://go-to-hellman.blogspot.com/2009/07/illusion-of-internet-identity.html In the case of worldcat or google, it's fame. If you think a URI