John Machin wrote:

(1) It's actually .bz2, not .bz (2) Why annoy people with the
not-widely-known bzip2 format just to save a few % of a 12KB file?? (3)
Typing that on Windows command line doesn't produce a useful result (4)
Haven't you heard of distutils?

(1) Typo, thanks for pointing it out
(2)(3) In the Linux world, it is really popular. I suppose you are a Windows user, and I haven't given that much thought. The point was not to save space, just to use the "standard" format. What would it be for Windows - zip?
(4) Never used it, but that is a very valid point. I will look into it.


(6) You are keeping open handles for all files of a given size -- have
you actually considered the possibility of an exception like this:
IOError: [Errno 24] Too many open files: 'foo509'

(6) Not much I can do about this. In the beginning, all files of equal size are potentially identical. I first need to read a chunk of each, and if I want to avoid opening & closing files all the time, I need them open together.
What would you suggest?
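
One option I can think of is to reopen each file for every chunk, so that only one handle is open at any time, at the cost of many more open/close calls. A rough sketch of that idea, not what the script currently does; `read_chunk`, `group_identical` and the BLOCKSIZE value are names and numbers made up for illustration:

```python
import os
import tempfile

BLOCKSIZE = 8192  # chunk size; this value is an assumption


def read_chunk(path, offset, size=BLOCKSIZE):
    """Open, seek, read one chunk and close again (one handle at a time)."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def group_identical(paths, size=BLOCKSIZE):
    """Split files of equal size into groups with identical content."""
    groups = [list(paths)]
    offset = 0
    while True:
        next_groups = []
        more_data = False
        for group in groups:
            buckets = {}  # chunk content -> files sharing that chunk
            for path in group:
                chunk = read_chunk(path, offset, size)
                if chunk:
                    more_data = True
                buckets.setdefault(chunk, []).append(path)
            # A file alone in its bucket cannot have a duplicate: drop it.
            next_groups.extend(g for g in buckets.values() if len(g) > 1)
        groups = next_groups
        if not groups or not more_data:
            return groups
        offset += size


# Small demo with throwaway files: "a" and "b" are identical, "c" differs
# only in its last byte.
with tempfile.TemporaryDirectory() as tmp:
    contents = {"a": b"x" * 20000, "b": b"x" * 20000, "c": b"x" * 19999 + b"y"}
    paths = []
    for name, data in contents.items():
        p = os.path.join(tmp, name)
        with open(p, "wb") as f:
            f.write(data)
        paths.append(p)
    result = [[os.path.basename(p) for p in g] for g in group_identical(paths)]
```

Whether the extra open/close overhead is acceptable would have to be measured; a middle way would be to keep handles open only while a group is below some limit.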


Once upon a time, max 20 open files was considered as generous as 640KB
of memory. Looks like Bill thinks 512 (open files, that is) is about
right these days.

Bill also thinks it is normal that half of Service Pack 2 lingers twice on a hard disk. Not sure whether he's my hero ;-)


(7)
Why sort? What's wrong with just two lines:

! for size, file_list in self.compfiles.iteritems():
!     self.comparefiles(size, file_list)

(7) I wanted the output to be sorted by file size instead of coming out in random order. It's psychological, but if you're chasing dups, you'd want to start with the largest ones. If you have more than a screenful of info, it's the last lines which are the most interesting. And it will produce the same info in the same order if you run it twice on the same folders.
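
To illustrate the ordering point, here is a minimal sketch. It assumes compfiles is a plain dict mapping a file size to a list of file names; `iter_by_size` and the sample data are made up for the example:

```python
def iter_by_size(compfiles):
    """Yield (size, file_list) pairs in ascending size order."""
    for size in sorted(compfiles):
        yield size, compfiles[size]


# Hypothetical sample data: size in bytes -> files of that size.
compfiles = {4096: ["a", "b"], 12: ["c", "d"], 700: ["e", "f"]}
ordered = list(iter_by_size(compfiles))
```

The largest files come out last, and two runs over the same folders give the same order, which plain dict iteration did not promise.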


(8)     global MIN_FILESIZE,MAX_ONEBUFFER,MAX_ALLBUFFERS,BLOCKSIZE,INODES

That doesn't sit very well with the 'everything must be in a class'
religion seemingly espoused by the following:

(8) Agreed. I'll think about that.

(9) Any good reason why the "executables" don't have ".py" extensions
on their names?

(9) Because I am lazy and Linux doesn't care. I suppose Windows does?

All in all, a very poor "out-of-the-box" experience. Bear in mind that
very few Windows users would have even heard of bzip2, let alone have a
bzip2.exe on their machine. They wouldn't even be able to *open* the
box.

As I said, I did not give Windows users much thought. I will improve this.

And what is "chown" -- any relation of Perl's "chomp"?

chown is a Unix command to change the owner or the group of a file. It has to do with controlling access to the file. It is not relevant on Windows. No relation to Perl's chomp.


Thank you very much for your feedback. Did you actually run it on your Windows box?

-pu
--
http://mail.python.org/mailman/listinfo/python-list
