Hi,
We have downloaded the Open Library edition data dump from the following link:
http://openlibrary.org/data/ol_dump_authors_latest.txt.gz
While parsing the data, we found that the subjects field is corrupted for many
editions. For example,
for /books/OL7974826M (isbn=0809119935), the subjects field has the
following value:
"subjects": ["1962-1965)", "Congresses", "Declaratio de libertate religi",
"(2nd :", "Vatican Council"]
Using the ISBNdb API, I got the following subject data for this ISBN:
<Subjects>
<Subject
subject_id="vatican_council_2nd_1962_1965_declaratio_de_libertate_religi">
Vatican Council -- (2nd :1962-1965). -- Declaratio de libertate
religiosa -- Congresses
</Subject>
<Subject subject_id="freedom_of_religion_congresses">Freedom of
religion -- Congresses</Subject>
</Subjects>
The Open Library data in the subjects field is clearly corrupted: the
subject heading appears to have been split into fragments and reordered.
We see the same problem in many other editions in the dump.
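To get a rough idea of how many editions are affected, something like the sketch below might help. It assumes each dump line is tab-separated with the JSON record in the last column, and it treats unbalanced parentheses in a subject string as a sign of a split heading. That is only a heuristic: it will miss fragments that contain no parentheses. The function names and the revision and timestamp values in the sample line are made up for illustration.

```python
import json


def has_unbalanced_parens(text):
    """Return True if the parentheses in text do not pair up."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis appeared before any opening one.
                return True
    return depth != 0


def suspicious_subjects(dump_line):
    """Return (key, fragments) for one dump line.

    fragments lists the subject strings with unbalanced parentheses.
    Assumption: the line is tab-separated and the JSON record is the
    last column.
    """
    fields = dump_line.rstrip("\n").split("\t")
    record = json.loads(fields[-1])
    subjects = record.get("subjects", [])
    bad = [s for s in subjects
           if isinstance(s, str) and has_unbalanced_parens(s)]
    return record.get("key"), bad


# Sample line built from the edition quoted above; the revision and
# timestamp columns are placeholders, not real values.
sample = "\t".join([
    "/type/edition",
    "/books/OL7974826M",
    "1",
    "2008-04-29T00:00:00",
    json.dumps({
        "key": "/books/OL7974826M",
        "subjects": ["1962-1965)", "Congresses",
                     "Declaratio de libertate religi",
                     "(2nd :", "Vatican Council"],
    }),
])

print(suspicious_subjects(sample))
```

Running the same function over every line of the decompressed dump and counting the records with a non-empty fragment list would give a lower bound on the number of affected editions.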
Can you please help us to find out the correct data? Can you suggest any
solution?
Rgds,
Sujoy
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech