On 06-Jun-2012, at 6:10 PM, Ben Companjen wrote:

> Hi Anand,
> 
> Thanks for the quick response, but I'm afraid you mixed up my two questions :)
> 
> My first question was in general: how should I use save_many?

I'm not sure what you mean. If you are asking how many docs you should
save per chunk, I would say 100.
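
A rough sketch of what chunking could look like. The helper names
(chunked, save_in_chunks) are made up for illustration, and save_many is
passed in as a plain callable taking (docs, comment), which is an
assumption about its signature rather than a statement of the real API:

```python
from typing import Callable, Iterable, Iterator, List


def chunked(docs: Iterable[dict], size: int = 100) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` docs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    chunk: List[dict] = []
    for doc in docs:
        chunk.append(doc)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter, chunk
        yield chunk


def save_in_chunks(
    docs: Iterable[dict],
    save_many: Callable[[List[dict], str], object],  # assumed signature
    comment: str,
    size: int = 100,
) -> int:
    """Call save_many(chunk, comment) once per chunk; return the call count."""
    calls = 0
    for chunk in chunked(docs, size):
        save_many(chunk, comment)
        calls += 1
    return calls
```

With 250 docs and the default size this makes three calls, so a failure
only costs you one chunk of 100 rather than the whole batch.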

> My second question was: how should we handle the "bad data" errors?
> When you say: it is fine to remove authors from editions, I think you
> mean "when there are authors in the related work"?

Good point. I totally missed that.

> If there is no
> work, I think finding the new author is second best to relating the
> edition to a (possibly new, hopefully matching existing) work and
> putting the author in the work. But that is something I hope another
> bot (developer) can do :)

It looks like the best thing to do is follow redirects.
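
Something along these lines might work for that. It is only a sketch: it
assumes a redirect doc has type {"key": "/type/redirect"} and a
"location" field holding the target key, and `get` is a hypothetical
callable that returns the doc for a key (or None if there is none):

```python
from typing import Callable, Optional

REDIRECT_TYPE = "/type/redirect"  # assumed type key of redirect docs


def resolve_redirects(
    key: str,
    get: Callable[[str], Optional[dict]],  # hypothetical doc fetcher
    max_hops: int = 10,
) -> Optional[str]:
    """Follow redirect docs from `key`; return the final key, or None."""
    seen = set()
    while key not in seen and len(seen) < max_hops:
        seen.add(key)
        doc = get(key)
        if doc is None:
            return None  # dangling reference
        if doc.get("type", {}).get("key") != REDIRECT_TYPE:
            return key  # reached a real (non-redirect) doc
        key = doc.get("location")
        if not key:
            return None  # redirect without a target
    return None  # redirect loop or too many hops
```

An edition's author key would be replaced by the resolved key when one is
found, and left for manual review when the result is None.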

> How about having WorkBot create a work for every edition that has no
> work right now? And later see about merging works? There are many
> duplicate works already, why not create some (i.e. ~5 million) more
> and try fixing them all when it can be done automatically?

Yes, that will make the data more uniform. I'll look at it when I find some time.

Anand
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]