Re: sorting with expensive compares?

Steven D'Aprano Fri, 23 Dec 2005 13:25:44 -0800

On Fri, 23 Dec 2005 19:26:11 +0100, Peter Otten wrote:

> Dan Stromberg wrote:
> 
>> I'm wanting to sort a large number of files, like a bunch of output files
>> from a large series of rsh or ssh outputs on a large series of distinct
>> machines, a music collection in .ogg format (strictly redistributable and
>> legally purchased music), a collection of .iso cdrom images (strictly
>> redistributable and legally purchased software), and so forth.
> 
> Are you really trying to establish an order or do you want to eliminate the
> duplicates?
> 
>>>> File("perfectly_legal.ogg") < File("free_of_charge.mp3")
> True
> 
> doesn't make that much sense to me, regardless of what the comparison may
> actually do.


If I have understood the poster's algorithm correctly, it gets even
weirder:


Sorted list of files ->

[parrot.ogg, redhat.iso, george.log, fred.log, rhino.ogg, cat.ogg,
debian.iso, sys_restore.iso, adrian.log, fox.ogg, ...]

It seems to this little black duck that sorting by file contents in this
way has the effect, as far as the human reader is concerned, of virtually
randomising the list of file names.
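
A small sketch makes the point. The byte strings below are made-up
stand-ins for real file contents (an assumption purely for illustration,
not anything from the original poster's data); only the comparison of
contents matters:

```python
# Hypothetical file names mapped to invented one-byte "contents".
contents = {
    "adrian.log": b"\x9f",
    "cat.ogg": b"\x41",
    "debian.iso": b"\x7a",
    "fred.log": b"\x30",
    "parrot.ogg": b"\x05",
    "redhat.iso": b"\x11",
}

# Sort the names by the bytes they contain (lexicographic byte order).
by_content = sorted(contents, key=contents.get)

# Sort the names the way a human would expect to see them.
by_name = sorted(contents)

print(by_content)
print(by_name)
```

The content-sorted list comes out as parrot.ogg, redhat.iso, fred.log,
cat.ogg, debian.iso, adrian.log: perfectly ordered as far as the bytes
are concerned, and apparently shuffled as far as the names are.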

Even if you limit yourself to (say) a set of ogg files, and sort by the
binary contents ->

# album-track
[parrot-6.ogg, rhino-1.ogg, cat-12.ogg, fox-2.ogg, parrot-3.ogg, ...]

most people looking at the list would guess it had been shuffled, not
sorted. So I too don't know what the original poster hopes to accomplish
by sorting on the content of large binary files.
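
If Peter's guess is right and the real goal is to find duplicates, no
ordering is needed at all. A minimal sketch, assuming it is acceptable to
read each file once and to treat equal SHA-256 digests as "probably the
same content" (the function names and the chunk size are my own
inventions, not the original poster's code):

```python
import hashlib
from collections import defaultdict


def content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large files never have to fit in memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_duplicates(paths):
    """Group paths by content digest; return only groups with 2+ files."""
    groups = defaultdict(list)
    for path in paths:
        groups[content_digest(path)].append(path)
    return [names for names in groups.values() if len(names) > 1]
```

Each file is read exactly once, however many files there are, and files
within a reported group can still be confirmed byte for byte (for
example with filecmp.cmp(a, b, shallow=False)) if a digest match alone
is not trusted.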



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list
