John Machin wrote:
> Oh yeah, "the computer said so, it must be correct". Even with your algorithm, I would be investigating cases where files were duplicates but there was nothing in the names or paths that suggested how that might have come about.
Of course, but it's good to know that the computer is right, isn't it? That leaves the human to make decisions instead of double-checking.
> I beg your pardon, I was wrong. Bad memory. It's when it runs out of the minuscule buffer pool you allocate by default that it panics and pulls the sys.exit(1) rip-cord.
The buffer pool size is a parameter, and the default values allow for 4096 files of the same size. It's more likely to run out of file handles than out of buffer space, don't you think?
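On Unix the per-process handle limit can at least be queried directly; the resource module is Unix-only, though, so it says nothing about the XP/2000 question:

    import resource  # Unix-only; no direct stdlib equivalent on Windows

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print('per-process file handle limit: soft=%d, hard=%d' % (soft, hard))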
> The pythonic way is to press ahead optimistically and recover if you get bad news.
You're right, and that's what I thought about afterwards. The current idea is to design a second class that opens/closes/reads the files and handles that situation independently of the main class.
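Something along these lines, perhaps; a rough sketch of the idea only, not actual fdups code, and all the names are made up:

    import errno

    class TooManyOpenFiles(Exception):
        """Raised when the OS refuses to hand out another handle."""

    class FileReader:
        """Owns open/close/read for one file, so handle exhaustion
        can be dealt with in one place instead of killing the run."""

        def __init__(self, path, chunk_size=8192):
            self.path = path
            self.chunk_size = chunk_size
            self.offset = 0      # remembered so we can reopen and resume
            self._fh = None

        def read_chunk(self):
            # EAFP: just try to open, and recover only on EMFILE.
            if self._fh is None:
                try:
                    self._fh = open(self.path, 'rb')
                except (IOError, OSError) as e:
                    if e.errno == errno.EMFILE:
                        raise TooManyOpenFiles(self.path)
                    raise
                self._fh.seek(self.offset)
            data = self._fh.read(self.chunk_size)
            self.offset += len(data)
            return data

        def release(self):
            # Close the handle but keep the offset; the next read_chunk()
            # reopens and seeks back, so the caller can recycle handles.
            if self._fh is not None:
                self._fh.close()
                self._fh = None

The main class would catch TooManyOpenFiles, release() a few idle readers, and retry.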
> I didn't "ask"; I suggested. I would never suggest a class for classes' sake. You already had a singleton class; why another? What I did suggest was that you provide a callable interface that returned clusters of duplicates [so that people could do their own thing instead of having to parse your file output, which contains a mixture of warning & info messages and data].
That is what I have submitted to you. Are you sure that *I* am the lawyer here?
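For reference, a callable interface of the kind suggested might look roughly like this; a sketch only, not the code actually submitted, and the whole-file read is only sensible for smallish files:

    import os
    from collections import defaultdict

    def find_duplicates(paths):
        """Return clusters of duplicates: a list of lists of paths
        whose contents are byte-for-byte identical."""
        by_size = defaultdict(list)
        for p in paths:
            by_size[os.path.getsize(p)].append(p)

        clusters = []
        for group in by_size.values():
            if len(group) < 2:
                continue    # a unique size can't have duplicates
            by_content = defaultdict(list)
            for p in group:
                with open(p, 'rb') as f:
                    by_content[f.read()].append(p)  # naive: whole file in memory
            clusters.extend(c for c in by_content.values() if len(c) > 1)
        return clusters

Callers would then iterate over the returned clusters directly instead of parsing file output with warnings and info messages mixed in.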
> Re (a): what evidence do you have?
See ;-)
> Interesting. Less on XP than on 2000? Maybe there's a machine-wide limit, not a per-process limit, like the old DOS max=20. What else was running at the time?
Nothing I started manually, just the usual bunch: local firewall, virus scanner (not doing a complete machine scan at that time).
> Test:
>
>     for k in range(1000):
>         open('foo' + str(k), 'w')
I'll try that.
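A slightly fuller sketch of the same test, reporting where it stops and cleaning up the files afterwards:

    import os

    handles = []
    try:
        for k in range(100000):
            handles.append(open('foo' + str(k), 'w'))
    except (IOError, OSError) as e:
        print('gave up after %d open files: %s' % (len(handles), e))
    finally:
        for f in handles:
            f.close()
        # remove everything we created (the last open may have
        # created a file before failing, hence the +1 and the guard)
        for k in range(len(handles) + 1):
            try:
                os.remove('foo' + str(k))
            except OSError:
                pass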
> Announce: "I can open A files at once on box B running OS C. The most files of the same length that I have seen is D. The ratio A/D is small enough not to worry."
I wouldn't count on that in a multi-tasking environment, as I said. The class I described earlier seems a cleaner approach.
Regards,
-pu