Re: [ol-tech] Greek books

Tom Morris Fri, 23 Nov 2012 11:37:12 -0800

The whole thing is moot if the ship is unmanned and there's no one at the
controls to provide access.


From a brief glance at the code, it appears that the MARC importer was the
most recently updated (2010), so it is the most likely to be current.
openlibrary/catalog/marc/parse.read_edition() creates a Python dictionary,
which is basically the equivalent of a JSON data structure, so that would
probably be the target.
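
To make that concrete, here is a rough sketch of mapping spreadsheet rows onto an edition-like dictionary. Everything specific in it is an assumption: the CSV column names (title, author, publisher, year, isbn) are hypothetical stand-ins for whatever the EKEBI export actually contains, and the output field names are guesses at what an edition record looks like, not the confirmed output of read_edition().

```python
import csv
import io
import json

# Hypothetical column names; the real EKEBI export headers are not known here.
SAMPLE_CSV = """title,author,publisher,year,isbn
Ένα βιβλίο,Α. Συγγραφέας,Εκδότης Χ,2005,978-960-0000-00-0
"""


def row_to_record(row):
    """Map one spreadsheet row to an edition-like dict (field names assumed)."""
    record = {
        "title": row["title"].strip(),
        "authors": [{"name": row["author"].strip()}],
        "publishers": [row["publisher"].strip()],
        "publish_date": row["year"].strip(),
    }
    # File the ISBN under a length-specific key; skip anything malformed.
    isbn = row["isbn"].replace("-", "").strip()
    if len(isbn) == 13:
        record["isbn_13"] = [isbn]
    elif len(isbn) == 10:
        record["isbn_10"] = [isbn]
    return record


def convert(csv_text):
    """Parse CSV text and return a list of edition-like dicts."""
    return [row_to_record(row) for row in csv.DictReader(io.StringIO(csv_text))]


records = convert(SAMPLE_CSV)
# Default json.dumps escapes non-ASCII, so this is safe on any terminal.
print(json.dumps(records, indent=2))
```

The same mapping could be done with OpenRefine's templating exporter; the script form is just easier to show in an email.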

The biggest issue would be managing the reconciliation process so that a
ton of duplicate entries aren't created (and, on the other hand, things
which should be different aren't merged together).  The previous/current
import process clearly didn't do a very good job in that regard.
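
As a strawman for what that reconciliation might look like (this is not how Open Library does it, just a minimal sketch under my own assumptions), one could match on ISBN when an incoming record has one and fall back to a normalized title + first author key only when it doesn't. The record shape and all function names below are hypothetical.

```python
import unicodedata


def normalize(text):
    """Strip accents and punctuation, casefold, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_marks.casefold())
    return " ".join(cleaned.split())


def all_keys(record):
    """Return every key a record can be found under: ISBNs, then title/author."""
    keys = []
    for field in ("isbn_13", "isbn_10"):
        for isbn in record.get(field, []):
            keys.append(("isbn", isbn.replace("-", "").strip()))
    authors = record.get("authors") or [{}]
    keys.append((
        "title_author",
        normalize(record.get("title", "")),
        normalize(authors[0].get("name", "")),
    ))
    return keys


def reconcile(existing, incoming):
    """Split incoming records into (record, match) pairs and unmatched records."""
    index = {}
    for rec in existing:
        for key in all_keys(rec):
            index.setdefault(key, rec)

    matched, new = [], []
    for rec in incoming:
        keys = all_keys(rec)
        # If the record has ISBNs, trust only those; a title match alone could
        # wrongly merge two different editions of the same work.
        isbn_keys = [k for k in keys if k[0] == "isbn"]
        probes = isbn_keys or keys
        hit = next((index[k] for k in probes if k in index), None)
        if hit is not None:
            matched.append((rec, hit))
        else:
            new.append(rec)
            # Index the new record so later duplicates in the same batch match.
            for key in keys:
                index.setdefault(key, rec)
    return matched, new
```

Anything landing in the matched list would still want human or rule-based review before merging; the point is only that duplicates and false merges are two different failure modes and need different keys.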

Tom

On Fri, Nov 23, 2012 at 1:08 PM, Karen Coyle <[email protected]> wrote:

> Tom, the input routines do not take in JSON, so I don't know if that will
> work. The only input formats that I'm aware of are MARC and (I just looked
> at the code) a crawl of Amazon. The OL API returns JSON, but the internal
> storage is a triple-store. It looks to me like the MARC data is transformed
> into key/value pairs, but there isn't documentation and reading code "cold"
> is not fun. If you can figure anything out from the code -- please report
> back!
>
> kc
>
>
> On 11/23/12 9:44 AM, Tom Morris wrote:
>
>> On Fri, Nov 23, 2012 at 8:46 AM, Karen Coyle <[email protected]> wrote:
>>
>>     Angelo, I'm not surprised that OL does not have many modern Greek
>>     books - the data comes mainly from US and UK libraries. There are
>>     some of the usual ancient Greek "classics" but not current
>>     publications.
>>
>>     I looked at your spreadsheet and, with the help of Google Translate,
>>     was able to make out some of it. I would be willing to spec out how
>>     to convert this to a format that OL can load (either MARC or the
>>     Amazon API format), but can't do the programming, so someone else
>>     will need to volunteer for that. I'm also not sure how we get a file
>>     of records into the OL pipeline, so before we go to the effort we
>>     should make sure that is possible.
>>
>>
>> Since OpenLibrary's native format is JSON, there's a good chance that
>> OpenRefine could massage the CSV into the necessary format using the
>> templating exporter (ie no programming required).
>>
>> I'd be willing to help with that or a simple Python script to do the
>> conversion if I didn't have the strong (nay, overwhelming) sense that
>> Open Library has been abandoned by the Internet Archive as they move on
>> to newer, shinier projects.
>>
>> Tom
>>
>>     On 11/18/12 3:56 PM, Angelo wrote:
>>      > sorry, forgot to attach the file
>>      >
>>      > here it is
>>      >
>>      > On 19/11/2012 01:54 AM, Angelo wrote:
>>      >> Hi,
>>      >>
>>      >> I think that non-English publishers are also included in the Open
>>      >> Library project, but not many books in Greek are included yet.
>>      >>
>>      >> So I would like to inform you (in case you don't already know)
>>      >> that the National Book Center of Greece (EKEBI) collects all Greek
>>      >> publications. They have records for all books published in Greece
>>      >> since 1990 (around 170,000).
>>      >>
>>      >> You can find more info
>>      >>
>>      >> here : http://www.gbip.gr/main.asp?page=aboutus
>>      >> and here : http://www.gbip.gr/main.asp?page=aboutus
>>      >>
>>      >> They state that "On July 1st 2003 access was made openly available
>>      >> to the public." which is true, but their public service lacks an
>>      >> API or a way to get a dump of the database.
>>      >>
>>      >> On the other hand, they do provide a way to download an .xls file
>>      >> of the user's current search (Greek version of the site only). A
>>      >> current search could be: gimme all the books of publisher X.
>>      >>
>>      >> I have contacted them several times for details on how I can
>>      >> access the database for my thesis but got no answer.
>>      >>
>>      >> I guess that if you send them a request, they will have to answer
>>      >> back.
>>      >>
>>      >> If not, I could finish my scraping scripts based on any specs that
>>      >> you can provide. The problem is that their web view doesn't
>>      >> provide all info (the xls download method mentioned above provides
>>      >> more).
>>      >>
>>      >> Their database provides titles, authors, publishers, etc. in Greek,
>>      >> English and Greeklish.
>>      >>
>>      >> If you do take action and have any feedback, I would really be
>>      >> interested to know the response.
>>      >>
>>      >> cheers
>>      >>
>>      >> Angelo
>>      >>
>>      >> I attach a small xls sample file of a single publisher
>>      >
>>
>>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]
