[mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Ethan Gruber
Sent: Friday, September 26, 2014 3:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Reconciling corporate names?
I would check with the developers of SNAC (
http://socialarchive.iath.virginia.edu/), as they've spent a lot
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Reconciling corporate names?
You could always web scrape, or download and then search the LCNAF with some
script that looks like:
# Build query for web scraping
query = paste("http://id.loc.gov/search/?q=", URLencode("corporate name here"), "&q=cs
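For anyone who would rather do this in Python than R, here is a minimal sketch of the same query-building step. It only assumes the `http://id.loc.gov/search/?q=` endpoint quoted above. The second `q=cs...` filter is cut off in the message, so it is left out here rather than guessed at, and the function name `build_search_url` is just an illustrative choice.

```python
from urllib.parse import quote


def build_search_url(name: str) -> str:
    """Return an id.loc.gov search URL for one corporate name.

    Only the base endpoint quoted in the thread is used; the extra
    'q=cs...' filter from the original script is truncated there and
    is deliberately not reconstructed.
    """
    base = "http://id.loc.gov/search/"
    # quote() percent-encodes spaces and punctuation so the name is URL-safe
    return f"{base}?q={quote(name)}"


if __name__ == "__main__":
    print(build_search_url("Ford Foundation"))
```

For 40,000 names, scraping one URL per name is a lot of requests, so searching a downloaded copy is probably the kinder option.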
...@loc.gov
-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Simon
Brown
Sent: Monday, September 29, 2014 9:38 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Reconciling corporate names?
After a quick search, http://id.loc.gov/download/ looks like the place to
go. I haven't downloaded it myself, but the file sizes make it look like
the right stuff.
kyle
On Mon, Sep 29, 2014 at 10:55 AM, Jean Roth jr...@nber.org wrote:
What is the link to the downloadable LCNAF data? -- Jean
Thank you! It looks like the files are available as RDF/XML, Turtle, or
N-triples files.
Any examples or suggestions for reading any of these formats?
The MARC Countries file is small, 31-79 KB. I assume a script that
would read a small file like that would at least be a start for the LCNAF
The best way to handle them depends on what you want to do. You need to
download the actual NAF files rather than the countries file or other small
files, since different kinds of data are organized differently. Just don't
try to read multigigabyte files in a text editor :)
If you start with one of the
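As a rough illustration of reading the N-Triples flavor without loading a multigigabyte file into memory, here is a Python sketch that streams the file one line at a time and pulls out label literals. It assumes the labels of interest are stored under `skos:prefLabel`, which I have not verified against the actual LCNAF dump, so the predicate is a parameter. The names `iter_labels` and `names.nt` are hypothetical.

```python
import re
from typing import Iterable, Iterator, Tuple

# One N-Triples statement with a URI subject and a literal object:
#   <subject> <predicate> "text"@lang .
# Blank-node subjects and URI objects do not match and are skipped.
TRIPLE_RE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+"((?:[^"\\]|\\.)*)"'
    r'(?:@[\w-]+|\^\^<[^>]*>)?\s*\.\s*$'
)

# Assumption: labels live under skos:prefLabel; pass another predicate if not.
PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def iter_labels(
    lines: Iterable[str], predicate: str = PREF_LABEL
) -> Iterator[Tuple[str, str]]:
    """Yield (subject URI, label) pairs, reading one line at a time."""
    for line in lines:
        match = TRIPLE_RE.match(line)
        if match and match.group(2) == predicate:
            # Backslash escapes inside the literal are left undecoded here
            yield match.group(1), match.group(3)


if __name__ == "__main__":
    # "names.nt" is a placeholder for whatever file you downloaded
    with open("names.nt", encoding="utf-8") as handle:
        for uri, label in iter_labels(handle):
            print(uri, label)
```

Because a file object is iterated lazily, memory use stays flat regardless of file size. For RDF/XML or Turtle a real RDF parser would be a better fit than a regular expression.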
I'm looking to reconcile about 40,000 corporate names against LCNAF to see
whether they are authorized strings or not, but I'm drawing a blank about how
to get it done.
I've used http://freeyourmetadata.org/ for reconciling subject headings before,
but I can't get it to work for LCNAF. Has
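If the goal is only to see whether each string is an authorized heading, one possible approach is a plain set lookup once the authorized labels have been extracted from a download. The Python sketch below assumes exact matching after light normalization (Unicode NFC plus whitespace collapsing). That normalization is my assumption, not an LC rule, and the function names are illustrative.

```python
import unicodedata
from typing import Dict, Iterable


def normalize(name: str) -> str:
    """Apply Unicode NFC and collapse runs of whitespace to single spaces."""
    return " ".join(unicodedata.normalize("NFC", name).split())


def check_names(
    candidates: Iterable[str], authorized: Iterable[str]
) -> Dict[str, bool]:
    """Map each candidate name to True if it matches an authorized label."""
    # A set gives constant-time lookups, so 40,000 candidates is cheap
    authorized_set = {normalize(label) for label in authorized}
    return {name: normalize(name) in authorized_set for name in candidates}


if __name__ == "__main__":
    result = check_names(
        ["Ford  Foundation", "Acme Corp"],
        ["Ford Foundation"],
    )
    print(result)
```

Exact matching will miss variant forms and punctuation differences, so the names that come back as unmatched still need a second look or a fuzzier comparison.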
I would check with the developers of SNAC (
http://socialarchive.iath.virginia.edu/), as they've spent a lot of time
developing named entity recognition scripts for personal and corporate
names. They might have something you can reuse.
Ethan
On Fri, Sep 26, 2014 at 3:47 PM, Galligan, Patrick