Hello, Graeme:

It's not a foolish question.  The download file is quite large, and that 
can make it seem a little unwieldy.

The "New Pilot" JSON downloads are newline-delimited, meaning each resource is 
on its own line.  As chance would have it, this SO post pretty much describes 
the source JSON you are dealing with and a solution for reading it line by 
line.  (See the next answer too.)

https://stackoverflow.com/a/12451465
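
For what it's worth, here is a minimal Python sketch of that line-by-line 
approach.  It assumes the download has been unpacked to a UTF-8 text file and 
that every non-empty line is one complete JSON document.  The function name 
and the file name in the comment are made up for illustration; they are not 
taken from the download itself.

```python
import json


def iter_resources(path):
    """Yield one parsed JSON resource per non-empty line of the file.

    Assumes newline-delimited JSON: each line is a complete JSON document.
    The file is read lazily, so it is never loaded into memory all at once.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between resources
            yield json.loads(line)


# Hypothetical usage (the file name is a placeholder):
#
#     for resource in iter_resources("names.madsrdf.jsonld"):
#         ...  # pull out whatever fields you need from each resource
```

Because it is a generator, only one resource is held in memory at a time, 
which should help with the size of the file.  What you pull out of each 
resource depends on the structure of the records, so it is worth printing 
the first one or two before writing the extraction code.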

Best of luck,
Kevin

--
Kevin Ford
Library of Congress
Washington, DC


-----Original Message-----
From: Code for Libraries <[email protected]> On Behalf Of Graeme Williams
Sent: Thursday, April 1, 2021 5:11 PM
To: [email protected]
Subject: [CODE4LIB] A question ...

You can download Library of Congress authority files in various formats from 
https://id.loc.gov/download/

Since it's April Fool's Day, let me ask a foolish question:  does anyone have 
code for extracting data from these files?  Preferably the MADS/RDF/JSON format 
files.  Preferably Python code.

I'm interested in extracting (e.g.) the names from the name authority file, so 
I can check the various name fields in a MARC record (e.g., 100 $a) -- and do 
it locally, without calling an API for each entry.

Graeme Williams
Las Vegas, NV
github.com/lagbolt
