Jim, I've integrated your tricky file into the unit test. Alexis' version passes all tests, whereas the current version fails on your file and ugh.pdf. So I guess we can call it a day and integrate Alexis' fix.
One difference is the use of 1<<16 = 65536 as the read block size instead of 65368. Barry Pearce contributed the latest version of read_to_boundary, so he must have known why that block size was good (or was it his lucky number?). Anyway, I've changed Alexis' code to use the readBlockSize variable and changed its value to 1<<16. I've run tests with both 1<<16 and 65368 on the server side and on the client side (changing the tricky file generation accordingly), and the new code works.
What we should have done, while we were spending so much time on a trivial string search problem, is at least implement a fast Boyer-Moore search algorithm and drop the use of readline. But I guess this will be for 3.4 ;).
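To illustrate the idea, here is a minimal sketch of a boundary search over an in-memory buffer. It uses the Horspool simplification of Boyer-Moore (bad-character table only), not the full algorithm, and it is not code from util.py: the name horspool_find and its signature are hypothetical, and it is written in present-day Python with bytes objects.

```python
def horspool_find(haystack: bytes, needle: bytes, start: int = 0) -> int:
    """Return the index of the first occurrence of needle in haystack
    at or after start, or -1 if there is none.

    Hypothetical helper for illustration only (Boyer-Moore-Horspool).
    """
    m = len(needle)
    n = len(haystack)
    if m == 0:
        return start if start <= n else -1

    # Bad-character table: for every byte in needle except the last one,
    # how far the window may slide when that byte is aligned with the
    # window's last position. Later occurrences overwrite earlier ones.
    shift = {byte: m - 1 - i for i, byte in enumerate(needle[:-1])}

    pos = start
    while pos + m <= n:
        if haystack[pos:pos + m] == needle:
            return pos
        # Slide by the table entry for the byte under the window's last
        # position; bytes that do not occur in needle[:-1] allow a full
        # jump of m positions.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1
```

A call such as horspool_find(block, b"--boundary") would then take the place of a line-by-line comparison. A real replacement for read_to_boundary would still have to handle a boundary that straddles two read blocks, which this sketch does not attempt.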
And yes, this 3-space indent is dreadful.
Regards,
Nicolas
2005/11/7, Jim Gallacher <[EMAIL PROTECTED]>:
Gregory (Grisha) Trubetskoy wrote:
>
> So I guess this means we roll and vote on a 3.2.5b?
>
As much as it pains me to say it, yes, this is a must-fix, so it's
on to 3.2.5b.
I think we need to do some more extensive testing on Alexis's fix before
we roll 3.2.5b. His read_to_boundary is much simpler than the current
one. This makes me wonder if there is some magic happening in the
current version which is solving some weird problems, or is his code
just that much better?
I'm feeling a little dull right now so all the code just blurs together.
It also doesn't help that util.py uses *3 spaces* for the indent. Yikes.
How the heck did that happen? :(
I'll take another look tomorrow.
Jim