Hi,

Thank you very much for your answers!
Actually Lee was right. I hadn't noticed it, but my search engine was not
capable of dealing with the size of the dump (31 GB for the latest one).

For those of you who use this bulk dump: are you loading it into a proper
database, or are you able to query the raw file efficiently?
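
To make the "raw file" option concrete, here is a minimal sketch of the kind
of streaming scan I have in mind. It assumes each dump line has five
tab-separated columns (type, key, revision, last_modified, JSON), as the
lines quoted below suggest. The function names are only illustrative; they
are not part of any Open Library tool.

```python
import gzip
import json


def parse_dump_line(line):
    """Split one dump line into (type, key, revision, last_modified, record).

    Assumes five tab-separated columns, the last one being a JSON object.
    """
    type_, key, revision, last_modified, raw_json = line.rstrip("\n").split("\t", 4)
    return type_, key, int(revision), last_modified, json.loads(raw_json)


def find_key(lines, needle):
    """Yield parsed records for every line that mentions `needle`."""
    for line in lines:
        # Cheap substring test first, so JSON is parsed only for matching lines.
        if needle in line:
            yield parse_dump_line(line)


def scan_dump(path, needle):
    """Stream a gzipped dump from disk, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from find_key(handle, needle)
```

For example, `for type_, key, *_ in scan_dump("ol_dump_2010-11-30.txt.gz",
"OL414281A"): print(type_, key)` would list every matching record. Each query
is still a full pass over the file, so for repeated lookups a database or a
key index is probably the better choice.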

Thanks again.
Best Regards.

On Wed, Dec 8, 2010 at 4:40 AM, Anand Chitipothu <[email protected]> wrote:

>
> On 08-Dec-2010, at 6:53 AM, Jeulin-L Michael wrote:
>
> Hi,
>
> Actually, it doesn't work like that in my data.
> I have :
> - "ol_dump_2010-11-30.txt" (http://openlibrary.org/data)
>
> My book is related to a work and they both have the same author key.
> But the author key doesn't correspond to any author.
>
> Searching the full bulk file gives the following results:
>
> "*Search "OL414281A" (2 hits in 1 files)
> ol_dump_2010-11-30.txt (2 hits)
>
> Line 107198: /type/edition    /books/OL6837588M    2
> 2009-12-14T23:32:31.751114    {"publishers": ["RADA Ediciones"],
> "pagination": "50 p. :", "languages": [{"key": "/languages/spa"}],
> "lc_classifications": ["MLCS 2002/05180 (P)"], "title": "Arena", "lccn":
> ["00334746"], "series": ["Serie Creacio\u0301n ;", "4", "Serie
> Creacio\u0301n (Trujillo, La Libertad, Peru) ;", "4."], "number_of_pages":
> 50, "edition_name": "1. ed.", "last_modified": {"type": "/type/datetime",
> "value": "2009-12-14T23:32:31.751114"}, "latest_revision": 2,
> "publish_country": "pe ", "key": "/books/OL6837588M", "authors": [{"key":
> "/authors/OL414281A"}], "publish_date": "1999", "publish_places":
> ["Trujillo, Peru\u0301"], "works": [{"key": "/works/OL2796358W"}], "type":
> {"key": "/type/edition"}, "by_statement": "Gladys Benko Angulo ;
> [ilustraciones interiores, Manlio Holgui\u0301n Go\u0301mez].", "revision":
> 2}
>
> Line 327392: /type/work    /works/OL2796358W    2
> 2010-02-06T16:32:17.806241    {"lc_classifications": ["MLCS 2002/05180
> (P)"], "key": "/works/OL2796358W", "created": {"type": "/type/datetime",
> "value": "2009-12-10T00:26:56.990080"}, "title": "Arena",
> "first_publish_date": "1999", "latest_revision": 2, "last_modified":
> {"type": "/type/datetime", "value": "2010-02-06T16:32:17.806241"},
> "authors": [{"type": "/type/author_role", "author": {"key":
> "/authors/OL414281A"}}], "type": {"key": "/type/work"}, "revision": 2}*"
>
> As you can see there is the relation between book and work but absolutely
> none between book and author and/or between work and author.
>
> Am I missing something ?
>
>
> Looks like you don't have complete data. Partial download?
>
> I'm seeing 5 matches.
>
> $ zcat ol_dump_2010-11-30.txt.gz | grep OL414281A
> /type/edition   /books/OL6837588M       2       2009-12-14T23:32:31.751114
>      {"publishers": ["RADA Ediciones"], "pagination": "50 p. :",
> "languages": [{"key": "/languages/spa"}], "lc_classifications": ["MLCS
> 2002/05180 (P)"], "title": "Arena", "lccn": ["00334746"], "series": ["Serie
> Creacio\u0301n ;", "4", "Serie Creacio\u0301n (Trujillo, La Libertad, Peru)
> ;", "4."], "number_of_pages": 50, "edition_name": "1. ed.", "last_modified":
> {"type": "/type/datetime", "value": "2009-12-14T23:32:31.751114"},
> "latest_revision": 2, "publish_country": "pe ", "key": "/books/OL6837588M",
> "authors": [{"key": "/authors/OL414281A"}], "publish_date": "1999",
> "publish_places": ["Trujillo, Peru\u0301"], "works": [{"key":
> "/works/OL2796358W"}], "type": {"key": "/type/edition"}, "by_statement":
> "Gladys Benko Angulo ; [ilustraciones interiores, Manlio Holgui\u0301n
> Go\u0301mez].", "revision": 2}
> /type/work      /works/OL2796358W       2       2010-02-06T16:32:17.806241
>      {"lc_classifications": ["MLCS 2002/05180 (P)"], "key":
> "/works/OL2796358W", "created": {"type": "/type/datetime", "value":
> "2009-12-10T00:26:56.990080"}, "title": "Arena", "first_publish_date":
> "1999", "latest_revision": 2, "last_modified": {"type": "/type/datetime",
> "value": "2010-02-06T16:32:17.806241"}, "authors": [{"type":
> "/type/author_role", "author": {"key": "/authors/OL414281A"}}], "type":
> {"key": "/type/work"}, "revision": 2}
> /type/author    /authors/OL414281A      2       2008-09-02T21:34:21.624464
>      {"name": "Gladys Benko", "personal_name": "Gladys Benko",
> "last_modified": {"type": "/type/datetime", "value":
> "2008-09-02T21:34:21.624464"}, "key": "/authors/OL414281A", "birth_date":
> "1949", "type": {"key": "/type/author"}, "revision": 2}
> /type/work      /works/OL2796359W       2       2010-02-06T16:32:17.806241
>      {"lc_classifications": ["MLCM 97/08524 (P)"], "key":
> "/works/OL2796359W", "created": {"type": "/type/datetime", "value":
> "2009-12-10T00:26:56.990080"}, "title": "Cada pa\u0301gina un recuerdo y un
> sentir", "first_publish_date": "1967", "latest_revision": 2,
> "last_modified": {"type": "/type/datetime", "value":
> "2010-02-06T16:32:17.806241"}, "authors": [{"type": "/type/author_role",
> "author": {"key": "/authors/OL414281A"}}], "type": {"key": "/type/work"},
> "revision": 2}
> /type/edition   /books/OL722600M        2       2009-12-11T23:22:30.156119
>      {"publishers": ["[s.n."], "pagination": "46 leaves ;",
> "lc_classifications": ["MLCM 97/08524 (P)"], "title": "Cada pa\u0301gina un
> recuerdo y un sentir", "lccn": ["97109885"], "number_of_pages": 46,
> "languages": [{"key": "/languages/spa"}], "last_modified": {"type":
> "/type/datetime", "value": "2009-12-11T23:22:30.156119"}, "latest_revision":
> 2, "publish_country": "pe ", "key": "/books/OL722600M", "authors": [{"key":
> "/authors/OL414281A"}], "publish_date": "1967", "publish_places":
> ["Trujillo, Peru\u0301"], "works": [{"key": "/works/OL2796359W"}], "type":
> {"key": "/type/edition"}, "by_statement": "Gladys Benko.", "revision": 2}
>
> Please verify the md5sum of the file that you have. Here is the correct
> value:
>
> $ md5sum ol_dump_2010-11-30.txt.gz
> 451537405c7494f7d85f5eddca8bccfd  ol_dump_2010-11-30.txt.gz
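
For anyone without the md5sum tool at hand, the same check can be sketched in
Python. This is only an illustration: the helper name `md5_of_file` and the
chunk size are my own choices. It reads the file in chunks so that a dump of
this size never has to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The download is complete if `md5_of_file("ol_dump_2010-11-30.txt.gz")`
equals the value quoted above.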
>
> Anand
>
>
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to
> [email protected]
>
>
