On Fri, 23 Dec 2005 19:26:11 +0100, Peter Otten wrote:

> Dan Stromberg wrote:
>
>> I'm wanting to sort a large number of files, like a bunch of output files
>> from a large series of rsh or ssh outputs on a large series of distinct
>> machines, a music collection in .ogg format (strictly redistributable and
>> legally purchased music), a collection of .iso cdrom images (strictly
>> redistributable and legally purchased software), and so forth.
>
> Are you really trying to establish an order or do want to eliminate the
> duplicates?
>
> >>> File("perfectly_legal.ogg") < File("free_of_charge.mp3")
> True
>
> doesn't make that much sense to me, regardless of what the comparison may
> actually do.
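For concreteness, here is a rough sketch of what a comparison like the quoted File(...) < File(...) might look like if it orders files by their raw bytes. This is only a guess at the idea, not Dan's actual code: the File class, the CHUNK size, the sort_names_by_content helper and the file names and contents are all made up for illustration.

```python
import functools
import os
import tempfile

CHUNK = 64 * 1024  # bytes read per step; an arbitrary choice


@functools.total_ordering
class File:
    """Hypothetical wrapper that orders files by raw bytes, not by name."""

    def __init__(self, path):
        self.path = path

    def _compare(self, other):
        # Walk both files in lockstep and stop at the first differing chunk.
        with open(self.path, "rb") as a, open(other.path, "rb") as b:
            while True:
                x, y = a.read(CHUNK), b.read(CHUNK)
                if x != y:
                    return -1 if x < y else 1
                if not x:  # both files ended without a difference
                    return 0

    def __eq__(self, other):
        return self._compare(other) == 0

    def __lt__(self, other):
        return self._compare(other) < 0


def sort_names_by_content(contents):
    """Write {name: bytes} to a temp dir and return names sorted by content."""
    with tempfile.TemporaryDirectory() as tmp:
        files = []
        for name, data in contents.items():
            path = os.path.join(tmp, name)
            with open(path, "wb") as f:
                f.write(data)
            files.append(File(path))
        return [os.path.basename(f.path) for f in sorted(files)]


if __name__ == "__main__":
    # Invented names and contents, purely for illustration.
    made_up = {"cat.ogg": b"\x03", "fox.ogg": b"\x02zz", "parrot.ogg": b"\x01aaa"}
    print(sort_names_by_content(made_up))
```

With these made-up contents the names come out in an order that has nothing to do with the alphabet, since only the bytes inside each file decide the result.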
If I have understood the poster's algorithm correctly, it gets even
weirder:

Sorted list of files ->

[parrot.ogg, redhat.iso, george.log, fred.log, rhino.ogg, cat.ogg,
debian.iso, sys_restore.iso, adrian.log, fox.ogg, ...]

It seems to this little black duck that by sorting by file contents in
this way, the effect to the human reader is virtually to randomise the
list of file names.

Even if you limit yourself to (say) a set of ogg files, and sort by the
binary contents ->

# album-track
[parrot-6.ogg, rhino-1.ogg, cat-12.ogg, fox-2.ogg, parrot-3.ogg, ...]

most people looking at the list would guess it had been shuffled, not
sorted. So I too don't know what the original poster hopes to accomplish
by sorting on the content of large binary files.

-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list