Perhaps if you exported data from lists that are likely to have items/bibs
deleted after you have collected them, you could keep an archive of the data.
-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of don
warner saklad
Sent: Friday, October 11, 2013
I have a background job that wakes up every night and screen scrapes my
reading history and lists into a local database, and updates cached
availability information -- so I don't have to worry about the problem that
Don mentions.
However, this is not a solution that scales to Minuteman's 600K+
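(For anyone curious about the caching half of such a nightly job: it can be sketched with stdlib SQLite. This is a minimal illustration only -- the scraping itself is left out, and the table name, schema, and sample values are all invented here.)

```python
import sqlite3

def upsert_history(db_path, rows):
    """Cache scraped reading-history rows locally, so list items survive
    later deletion from the catalog. rows: iterable of (bib_id, title, seen)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reading_history "
        "(bib_id TEXT PRIMARY KEY, title TEXT, last_seen TEXT)"
    )
    # INSERT OR REPLACE keeps the newest scrape for each bib record.
    conn.executemany(
        "INSERT OR REPLACE INTO reading_history VALUES (?, ?, ?)", rows
    )
    conn.commit()
    conn.close()
```

Run nightly (e.g. from cron), this keeps a local record even after the bib is purged upstream.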
On Oct 13, 2013, at 6:21 PM, David Friggens frigg...@waikato.ac.nz wrote:
For a limited period of time I am making publicly available a Web-based
program called PDF2TXT -- http://bit.ly/1bJRyh8
PDF2TXT extracts the text from an OCRed PDF document
The file I tried was digital native
On Oct 14, 2013, at 1:48 AM, Penelope Campbell
penelope.campb...@facs.nsw.gov.au wrote:
For a limited period of time I am making publicly available a Web-based
program called PDF2TXT -- http://bit.ly/1bJRyh8
As a small special library (solo librarian) in an Australian State
Government
On Oct 14, 2013, at 7:56 AM, Nicolas Franck nicolas.fra...@ugent.be wrote:
Could this also be done by Apache Tika? Or am I missing a crucial point?
http://tika.apache.org/1.4/gettingstarted.html
Nicolas, this looks VERY promising! It seemingly can extract the OCR from a PDF
document as well
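(If it pans out, Tika is as easy to drive from a script as from the command line. A rough sketch that shells out to the tika-app jar -- the jar filename/path here is an assumption, adjust for your download; the --text option writes extracted plain text to stdout:)

```python
import subprocess

def tika_command(pdf_path, jar="tika-app-1.4.jar"):
    """Build the Tika CLI invocation; --text asks for plain-text output."""
    return ["java", "-jar", jar, "--text", pdf_path]

def tika_extract_text(pdf_path, jar="tika-app-1.4.jar"):
    """Run Apache Tika on a PDF and return the extracted text.
    Requires Java and the tika-app jar on disk."""
    result = subprocess.run(
        tika_command(pdf_path, jar), capture_output=True, text=True, check=True
    )
    return result.stdout
```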
On Oct 14, 2013, at 4:49 PM, Robert Haschart rh...@virginia.edu wrote:
For a limited period of time I am making publicly available a Web-based
program called PDF2TXT -- http://bit.ly/1bJRyh8
Although, based on some subsequent messages where you mention tesseract,
maybe I misunderstood and
++ Jonathan and Bill.
1.) Do you have any thoughts on extending traject to index other types of
data--say MODS--into solr, in the future?
2.) What's the etymology of 'traject'?
- Tom
On Oct 14, 2013, at 8:53 AM, Jonathan Rochkind wrote:
Jonathan Rochkind (Johns Hopkins) and Bill Dueber
Don Warner Saklad said:
a) Forensics studies deal with how to retrieve deleted unarchived
data. So called deleted data is actually available.
Computer forensics cannot always get the data back. Television crime shows
greatly exaggerate the capabilities of computer forensics. It depends on
Thank you, Steve McDonald!
On Tue, Oct 15, 2013 at 12:32 PM, McDonald, Stephen
steve.mcdon...@tufts.edu wrote:
Don Warner Saklad said:
a) Forensics studies deal with how to retrieve deleted unarchived
data. So called deleted data is actually available.
Computer forensics cannot always get
'traject' means to transmit (e.g., trajectory) -- or at least it did,
when people still used it, which they don't.
The traject workflow is incredibly general: *a reader* sends *a record* to *an
indexing routine* which stuffs...stuff...into a context object which is
then sent to *a writer*. We
Yep, what Bill said, I have had thoughts of extending it to other types
of input too, it was part of my original design goals.
In particular, I was thinking of extending it to arbitrary XML.
Unlike MARC, there are many other options for indexing XML into Solr
(assuming that's your end goal),
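(The reader / indexing routine / context / writer flow described above is general enough to sketch in a few lines. This is illustrative Python pseudocode of the *shape* of the workflow only, not traject's actual Ruby API:)

```python
def run_pipeline(reader, indexing_steps, writer):
    """Traject-style flow: each record from the reader passes through
    indexing steps that accumulate fields in a context's output dict,
    which is then handed to the writer."""
    for record in reader:
        context = {"record": record, "output": {}}
        for step in indexing_steps:
            step(record, context["output"])
        writer(context["output"])

# Toy example: "index" dicts by copying a couple of fields.
results = []
run_pipeline(
    reader=[{"001": "b1", "245": "Moby-Dick"}],
    indexing_steps=[
        lambda rec, out: out.setdefault("id", rec["001"]),
        lambda rec, out: out.setdefault("title", rec["245"]),
    ],
    writer=results.append,
)
# results == [{"id": "b1", "title": "Moby-Dick"}]
```

Swapping the reader for an XML parser and the writer for a Solr client is exactly the kind of extension being discussed.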
For the My Lists feature, what steps are actually involved in retrieving an
altered/deleted listing such as "[_] Record b2491348 is not available
03-12-2013" by that bibliographic reference code from the 7-month system
backup? Perhaps the backup is compressed, and searching a compressed file is
a barrier
The Code4Lib Journal (http://journal.code4lib.org/) is looking for
volunteers to join its editorial committee. Editorial committee members
work collaboratively to produce the quarterly Code4Lib Journal.
Editors are expected to:
* Read, discuss, and vote on incoming proposals.
* Volunteer to be
Searching compressed files is no big deal. First of all, you can always
decompress. But if they've just been compressed and not put in a tarball or
some other archive format, you can just use zgrep.
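(To make the zgrep point concrete: the same on-the-fly search is a few lines in Python with the stdlib gzip module -- a sketch, with the filename invented:)

```python
import gzip

def grep_gz(path, needle):
    """Yield lines from a gzip-compressed text file that contain needle,
    decompressing on the fly (no temporary uncompressed copy on disk)."""
    with gzip.open(path, "rt", errors="replace") as fh:
        for line in fh:
            if needle in line:
                yield line.rstrip("\n")
```

e.g. `grep_gz("backup-2013-03.gz", "b2491348")` would stream out only the matching lines.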
However, many if not most files are in structures that don't lend
themselves to just scanning for
Hello Code4lib,
I'm wondering if any III Sierra users out there have worked on building an API
for accessing their ILS data on top of Sierra's Postgres database. Right now
I'm looking into possibly building something to serve local needs and use
cases, as we're not terribly confident that
Hi Jason,
We haven't planned to write our own APIs for Sierra at this point (we're
still working on getting Sierra to work in the first place), but Grinnell
would be interested in seeing how the process goes for you in terms of
local API building.
As for the Sierra APIs - III just hired a new
On Tue, Oct 15, 2013 at 07:29:01PM +, Thomale, Jason wrote:
Hello Code4lib,
I'm wondering if any III Sierra users out there have worked on building an
API for accessing their ILS data on top of Sierra's Postgres database. Right
now I'm looking into possibly building something to serve
(Please pardon repeated posts).
My thanks to everyone who has responded to the survey so far. I am still
eager to hear about libraries of all types that have implemented or are
planning to implement makerspaces or making activities. Please respond
to the survey linked below before October 22.
Hi Jason,
I've started looking into using ActiveRecord in Rails to plug into the Sierra
Postgres tables.
I'm still learning how to work with Ruby and Rails, but initial experiments are
working:
https://github.com/jamesvanmil/ActiveSierra
(really have just written a few simple models with
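(For comparison, outside Rails the same kind of read can be sketched directly against the Postgres backend. A heavily hedged example: the sierra_view.bib_view table and its columns are my recollection of Sierra's read-only reporting schema -- verify them against your own Sierra DNA documentation before relying on this; psycopg2 is a third-party driver.)

```python
# Table and column names below are assumptions based on Sierra's read-only
# sierra_view reporting schema; check them against your Sierra DNA docs.
BIB_QUERY = "SELECT record_num, title FROM sierra_view.bib_view LIMIT %s"

def fetch_bibs(dsn, limit=10):
    """Return (record_num, title) rows from Sierra's Postgres backend."""
    import psycopg2  # third-party: pip install psycopg2-binary
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(BIB_QUERY, (limit,))
            return cur.fetchall()
```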
Eric,
You might want to consider using http://www.documentcloud.org to host
your users' documents. That would also take care of
privacy/authentication concerns. I know of a project in the journalism
domain (http://overview.ap.org/) which does that.
As far as I remember they do provide an API interface
+1
https://www.documentcloud.org/opensource
--
Al Matthews
Software Developer, Digital Services Unit
Atlanta University Center, Robert W. Woodruff Library
email: amatth...@auctr.edu; office: 1 404 978 2057
On 10/15/13 4:23 PM, Arash.Joorabchi arash.joorab...@ul.ie wrote:
Eric,
You might
Jason,
To expand on Becky's answer a bit: we haven't written our own APIs yet, but
I did write a Sierra driver for VuFind, so I do have some notes that might
be useful to you that I'm happy to share. At least, I've learned the hard
way some things that you don't want to do when you're querying
Jonathan, Bill,
Very interesting--thanks for the replies. While I'm not sure I understand what
indexing arbitrary XML into solr might look like, this does prompt me to think
it would be interesting to look at Trajecting up some EAD (may I use it as a
verb?) into solr, for finding aid
Kent State University Libraries seeks an experienced and creative Digital
Projects Librarian (DPL) who will be responsible for the research, planning,
execution and management of digital projects throughout the University
Libraries' environment. The DPL will work with programmers
and applications
Virginia Tech's Newman Library and the Center for Digital Research and
Scholarship (CDRS) are seeking qualified candidates for two Systems Engineers
for data initiatives. Incumbents will develop systems that: 1) enable data
integration across distributed and heterogeneous local and external data