On 06-Jun-2012, at 6:10 PM, Ben Companjen wrote:

> Hi Anand,
>
> Thanks for the quick response, but I'm afraid you mixed up my two questions :)
>
> My first question was in general: how should I use save_many?

I'm not sure what you mean. If you are asking how many docs you should save per chunk, I would say 100.

> My second question was: how should we handle the "bad data" errors?
> When you say: it is fine to remove authors from editions, I think you
> mean "when there are authors in the related work"?

Good point. I totally missed that.

> If there is no
> work, I think finding the new author is second best to relating the
> edition to a (possibly new, hopefully matching existing) work and
> putting the author in the work. But that is something I hope another
> bot (developer) can do :)

It looks like the best thing to do is follow redirects.

> How about having WorkBot create a work for every edition that has no
> work right now? And later see about merging works? There are many
> duplicate works already, why not create some (i.e. ~5 million) more
> and try fixing them all when it can be done automatically?

Yes, that will make the data more uniform. I'll look at it when I find some time.

Anand

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
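
For reference, saving in chunks of 100 as suggested in the reply could look roughly like the sketch below. This is only an illustration: `chunks` and `save_in_chunks` are hypothetical helper names, and the `save_many` argument is assumed to be a callable that takes a list of documents and a comment. Check the actual signature in the client you use before relying on it.

```python
from typing import Callable, Dict, Iterable, Iterator, List

CHUNK_SIZE = 100  # chunk size suggested in the reply


def chunks(docs: Iterable[Dict], size: int = CHUNK_SIZE) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` documents."""
    batch: List[Dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Final, possibly shorter, chunk.
        yield batch


def save_in_chunks(
    docs: Iterable[Dict],
    save_many: Callable[[List[Dict], str], object],  # assumed signature
    comment: str,
    size: int = CHUNK_SIZE,
) -> int:
    """Call `save_many` once per chunk and return the number of calls made."""
    calls = 0
    for batch in chunks(docs, size):
        save_many(batch, comment)
        calls += 1
    return calls
```

With 250 documents this makes three calls, with 100, 100 and 50 documents. Passing `save_many` in as an argument keeps the sketch independent of any particular client object.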
