On Mon, 2012-06-04 at 01:11 +0200, Ben Companjen wrote:
> The bad thing is, well more a glitch, if I'm correct: one has to
> scrape author IDs from these search pages, because there is no
> wildcard search in the API. I noticed AMillarBot was
> replacing/correcting missing Umlauts, so perhaps some of the code is
> already there.

Unfortunately, the AMillarBot work is not a good reference.  It does
scrape the search pages, but only for a handful of patterns I came
across, like "Beitra>ge"="Beiträge" (found by searching for
"title:beitra ge").

It is really just a cheap hack, and doesn't scale at all.  
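
For illustration only, a pattern-based fix of this kind might look
roughly like the sketch below.  The table entries and the function name
are hypothetical and are not taken from the actual AMillarBot code; the
point is only to show why such an approach depends on a hand-maintained
list.

```python
import re

# Hypothetical table of mangled -> corrected words. These entries are
# illustrative assumptions, not the bot's real pattern list.
KNOWN_FIXES = {
    "beitra ge": "Beiträge",
    "schro der": "Schröder",
}


def fix_known_patterns(title: str) -> str:
    """Replace each known mangled pattern, matching case-insensitively."""
    for mangled, fixed in KNOWN_FIXES.items():
        title = re.sub(re.escape(mangled), fixed, title, flags=re.IGNORECASE)
    return title


if __name__ == "__main__":
    print(fix_known_patterns("Beitra ge zur Geschichte"))
```

Any mangled word that is not already in the table passes through
unchanged, so each newly found case needs its own entry.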

I also just discovered the REAL fix for this anyways:  OL already has
the correct data, it just didn't get imported right.  Ack!!

Take a look at these:

http://openlibrary.org/authors/OL4459814A/Heinrich_Schro_der

http://openlibrary.org/works/OL10684450W/Tonbandgera_te-Messpraxis

http://openlibrary.org/show-records/talis_openlibrary_contribution/talis-openlibrary-contribution.mrc:299045317:529

The MARC record shows the proper original data, at least in my browser,
while the imported items are mangled.  These just need to be
re-processed, and I'm not going to re-invent the importer  :-(

I guess I'd better file a bug report.

- Alan


_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]
