On Sat, January 16, 2010 7:06 am, Ulrich Spörlein wrote:
> So what's to be done? If fortunes2 is merged into fortunes, a lot of
> possible offensive quotes (previously in fortunes-o) would suddenly show
> up.

Doing an exact match (and assuming my math/sorting isn't bad):

Shared in fortunes and fortunes2:      662 (15205 total)
Shared in fortunes and fortunes2-o:     17 (5916 total)
Shared in fortunes and zippy:           10 (4097 total)
Shared in fortunes and limerick:        10 (4403 total)
Shared in fortunes2 and fortunes2-o:    99 (14025 total)
Shared in fortunes2-o and limerick:     57 (3223 total)

You could merge the *2 files, purge fortunes duplicates in fortunes2-o,
zippy, and limerick, etc., without actually losing any data.

It certainly seems like a reasonable target for cleaning. My initial hunch
would be that these collections grew this way organically, and they haven't
been cleaned up because "they've always been that way and it's just 3k of
data or so". Someone with a better historical sense can correct me.
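
For anyone who wants to re-check the exact-match counts, here is a rough Python sketch of one way to do it. It is not the script behind the numbers above. It assumes the usual fortune file layout (entries separated by a line containing only "%"), and it assumes "total" means the number of unique entries across both files, which may not be how the totals above were computed. The names read_fortunes and shared_counts are made up for this example.

```python
import sys
from itertools import combinations
from pathlib import Path


def read_fortunes(text):
    """Split fortune-file text into a set of entries.

    Assumes entries are separated by a line containing only '%'.
    Entries are compared byte-for-byte, so whitespace differences count.
    """
    entries = set()
    current = []
    for line in text.splitlines():
        if line == "%":
            if current:
                entries.add("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:  # last entry may lack a trailing '%'
        entries.add("\n".join(current))
    return entries


def shared_counts(collections):
    """Map each pair of names to (shared entries, unique entries in both).

    'total' as the size of the union is an assumption of this sketch.
    """
    result = {}
    for a, b in combinations(sorted(collections), 2):
        shared = len(collections[a] & collections[b])
        total = len(collections[a] | collections[b])
        result[(a, b)] = (shared, total)
    return result


if __name__ == "__main__":
    # Usage: python shared.py fortunes fortunes2 fortunes2-o zippy limerick
    files = {
        Path(p).name: read_fortunes(Path(p).read_text(errors="replace"))
        for p in sys.argv[1:]
    }
    for (a, b), (shared, total) in shared_counts(files).items():
        print(f"Shared in {a} and {b}: {shared} ({total} total)")
```

Because each file is held as a set, purging the duplicates from one collection is just a set difference (for example, the entries of zippy minus the entries of fortunes), so nothing is lost as long as the other copy stays.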
