On Tue, Oct 9, 2012 at 12:59 AM, Tjerk Anne Meesters <datib...@php.net> wrote:
> On Tue, Oct 9, 2012 at 12:14 AM, Nicolai Scheer <sc...@planetavent.de> wrote:
>
>> Hi!
>>
>> We switched from php 5.3.10 to 5.3.17 this weekend and stumbled upon a
>> behaviour of stream_get_line that is most likely a bug and breaks a
>> lot of our file processing code.
>>
>> The issue seems to have been introduced from 5.3.10 to 5.3.11.
>>
>> I opened a bug report: #63240.
>>
>
> I've managed to reduce the code to this; it's very specific:
>
> $file = __DIR__ . '/input_dummy.txt';
> $delimiter = 'MM';
> file_put_contents($file, str_repeat('.', 8189) . $delimiter . $delimiter);
>
> $fh = fopen($file, "rb");
>
> stream_get_line($fh, 8192, $delimiter);
> var_dump($delimiter === stream_get_line($fh, 8192, $delimiter));
>
> fclose($fh);
> unlink($file);
>
> If the internal buffer length is 8192, after the first call to
> stream_get_line() the read position (x) and physical file pointer (y)
> should be positioned like so:
>
> .......MM(x)M(y)M
>
> The fact that (y) is in between the delimiter seems to cause an issue.
>
>


I'm not sure why this bug exists, and I haven't been able to pinpoint
exactly where it manifests itself, but what I find incredibly unusual
is that the bug only shows up when the stream is exactly 8193 bytes
long.

It has nothing to do with the file pointer's position: all we have to
do is increase or decrease the size of the file by exactly 1 byte and
the bug never shows its face.

Test case 1: (we decrease the file size from 8193 bytes to 8192 bytes)

$file = __DIR__ . '/input_dummy.txt';
$delimiter = 'MM';
file_put_contents($file, str_repeat('.', 8188) . $delimiter . $delimiter);

$fh = fopen($file, "rb");

stream_get_line($fh, 8192, $delimiter);
var_dump($delimiter === stream_get_line($fh, 8192, $delimiter));

fclose($fh);
unlink($file);

/* bool(false) */

---------------------------------------

Test case 2: (we increase the file size from 8193 bytes to 8194 bytes)

$file = __DIR__ . '/input_dummy.txt';
$delimiter = 'MM';
file_put_contents($file, str_repeat('.', 8190) . $delimiter . $delimiter);

$fh = fopen($file, "rb");

stream_get_line($fh, 8192, $delimiter);
var_dump($delimiter === stream_get_line($fh, 8192, $delimiter));

fclose($fh);
unlink($file);

/* bool(false) */


----------------------


As long as the file size is not exactly 8193 bytes you don't get this
issue. More generally, you can test it with any file size that is a
multiple of 8192 plus 1 and the same issue appears. However, the
bigger anomaly is that the delimiter also has to be longer than 1 byte
before the bug manifests itself.
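
To compare the cases side by side, here is a rough sketch that runs the
same two reads for file sizes 8192, 8193 and 8194 and for delimiters of
length 1 and 2. The helper name read_two_records() and the use of
tempnam() are my own choices for illustration, not part of the bug
report, and what the second read returns will depend on the PHP version
you run it on.

```php
<?php
// Hypothetical helper (not from the bug report): writes $dots filler bytes
// followed by two delimiters, then returns both stream_get_line() results.
function read_two_records($dots, $delimiter)
{
    $file = tempnam(sys_get_temp_dir(), 'sgl');
    file_put_contents($file, str_repeat('.', $dots) . $delimiter . $delimiter);

    $fh = fopen($file, 'rb');
    $first  = stream_get_line($fh, 8192, $delimiter); // the filler bytes
    $second = stream_get_line($fh, 8192, $delimiter); // the read under test
    fclose($fh);
    unlink($file);

    return array($first, $second);
}

foreach (array('M', 'MM') as $delimiter) {
    foreach (array(8192, 8193, 8194) as $size) {
        // Total size = filler + two delimiters.
        $dots = $size - 2 * strlen($delimiter);
        list($first, $second) = read_two_records($dots, $delimiter);
        printf(
            "delimiter=%s size=%d first=%d bytes second=%s\n",
            $delimiter,
            $size,
            strlen($first),
            var_export($second, true)
        );
    }
}
```

Any row where the second read differs from its neighbours is the one
worth looking at more closely.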

I suspect this has something to do with the way PHP streams are
buffered internally. Data is read from the underlying stream up to a
certain length and buffered in memory by the internal API functions,
while your calls to PHP-facing functions like stream_get_line() read
from that buffer instead. So it's possible the bug lies somewhere in
this function (line 1026 of main/streams/streams.c
http://lxr.php.net/xref/PHP_5_4/main/streams/streams.c#1026).
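
To make the buffering angle a bit more concrete, here is a small
arithmetic sketch. It assumes the stream layer fills its buffer in
8192-byte chunks (the figure mentioned earlier in the thread) and only
computes where the second delimiter falls relative to a chunk boundary.
It does not touch the stream code itself, and the function name
delimiter_straddles_chunk() is made up for illustration.

```php
<?php
// Assumption: the internal buffer is filled in chunks of $chunk bytes.
// Returns true when the second delimiter in "<dots><delim><delim>"
// starts in one chunk and ends in the next.
function delimiter_straddles_chunk($dots, $delimiter, $chunk = 8192)
{
    $start = $dots + strlen($delimiter);      // first byte of 2nd delimiter
    $end   = $start + strlen($delimiter) - 1; // last byte of 2nd delimiter

    return (int) floor($start / $chunk) !== (int) floor($end / $chunk);
}

foreach (array(8188, 8189, 8190) as $dots) {
    $size = $dots + 2 * strlen('MM');
    printf(
        "file size %d: second delimiter straddles a chunk boundary: %s\n",
        $size,
        delimiter_straddles_chunk($dots, 'MM') ? 'yes' : 'no'
    );
}
```

Under that assumption only the 8193-byte file reports "yes", and a
1-byte delimiter can never straddle a boundary, which at least lines up
with the two conditions described above. It is not proof of where the
bug actually is.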



>> The issue seems to be related to #44607, but that one got fixed years ago.
>>
>> Is anybody able to confirm this behaviour or has stumbled upon this?
>>
>> Furthermore the behaviour of stream_get_line on an empty file seems to
>> have changed between php 5.3.10 and php 5.3.11:
>>
>> <?php
>>
>> $file = __DIR__ . '/empty.txt';
>> file_put_contents( $file, '' );
>> $fh = fopen( $file, 'rb' );
>> $data = stream_get_line( $fh, 4096 );
>> var_dump( $data );
>>
>> results in
>>
>> string(0) ""
>>
>> for php 5.3.10
>>
>> and in
>>
>> bool(false)
>>
>> for php > 5.3.10.
>>
>> I don't know if this should be considered a bug, but as far as I know
>> such a behaviour should not change during minor releases...
>>
>> Any insight is appreciated!
>>
>> Greetings
>>
>> Nico
>>
>> --
>> PHP Internals - PHP Runtime Development Mailing List
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
>
>
> --
> --
> Tjerk
