Dear Anders,

> So the implication of this is that PHP can't (be trusted to) handle binary
> files properly under Windows?

=In a way, yes - but it is neither limited to PHP nor PHP's fault! Because Windows has these two methods to 'delimit' a file, programs need to know which one to use.

=However, it is not my experience that either format presents any problems to PHP under Windows (today - it has in the past!). That is why the PHP filesystem functions are clearly labeled when they are "binary safe".

=A short experiment: I asked Windows Explorer for the Properties of my php.ini file. It returned the following information: "24.5KB (25,170 bytes), 25,600 bytes used". That's a file in ASCII format. A bit further down the dir list was a file in binary format - REGEDIT.EXE: "70.2KB (71,952 bytes), 72,192 bytes used".

=I then ran them through PHP's fopen() and fread(), and compared filesize() to strlen(). It made no difference whether the fopen() used "rb" or just "r" (r=read, b=binary). The lengths reported and the strings read were consistent/identical (25,170 and 71,952 bytes respectively).

> While once step debugging with VC++ to see what fgets() returns as EOF. I
> found it to be equal to -1. I don't remember if this was in binary or text
> mode, but I think it was binary mode.

=I'm sorry, I can't comment on what VC++ does. A lot of 'input' operations return TRUE (-1) at end-of-data/when they have 'nothing' to say, ie when EOF=TRUE. Similarly some return ^Z (CTRL+Z). At least one DEC minicomputer system could even return a string "EOF".

=Some binary-read commands allow a whole sector/block/whatever to be read. This enables reading 'beyond' the last (meaningful) byte in a file. PHP's read commands (I quickly checked) all seem to be 'binary safe' and will terminate at EOF - even if you ask for a 1MB string of data from a 1-byte file.

=I don't seem to have anything on my laptop at present which would let me binary-read a specific number of bytes - ie to get into bytes 71,953-72,192 of the disk space allocated to REGEDIT.EXE.

=I have no difficulty using PHP (not quite sure why I'd want to) as a file copier.
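
A copier of this kind could be sketched along the following lines. This is only an illustration, not the script that was actually run: copy_binary() and the temporary file names are made up for the example, and it assumes PHP 7 or later. It opens the source with "rb", reads to EOF, and compares filesize() with strlen() on the way through, using a small test file that deliberately contains CR/LF, a CTRL+Z (0x1A) and a NUL byte:

```php
<?php
// Hypothetical helper (not from the original message): copy a file in
// binary mode and report the lengths seen along the way.
function copy_binary(string $src, string $dst): array
{
    clearstatcache(true, $src);      // avoid a stale cached size
    $size = filesize($src);          // length recorded by the filesystem

    $in = fopen($src, 'rb');         // 'b' = binary: no translation of the data
    if ($in === false) {
        throw new RuntimeException("cannot open $src");
    }

    // Read until EOF. fread() may return fewer bytes than requested,
    // so keep reading until feof() reports the end of the file.
    $data = '';
    while (!feof($in)) {
        $chunk = fread($in, 8192);
        if ($chunk === false) {
            fclose($in);
            throw new RuntimeException("read error on $src");
        }
        $data .= $chunk;
    }
    fclose($in);

    // file_put_contents() writes the string byte for byte.
    $written = file_put_contents($dst, $data);
    if ($written === false) {
        throw new RuntimeException("cannot write $dst");
    }

    return [
        'size_on_disk'  => $size,
        'bytes_read'    => strlen($data),
        'bytes_written' => $written,
        'identical'     => hash_file('md5', $src) === hash_file('md5', $dst),
    ];
}

// Demo with throwaway files (names are generated, purely illustrative).
$src = tempnam(sys_get_temp_dir(), 'src');
$dst = tempnam(sys_get_temp_dir(), 'dst');

// Test data: text, CR/LF, CTRL+Z (0x1A), more text, NUL and 0xFF.
file_put_contents($src, "line one\r\n\x1Aafter ctrl-Z\x00\xFF");

$result = copy_binary($src, $dst);
print_r($result);

unlink($src);
unlink($dst);
```

If the read and write are binary safe, the three lengths agree and 'identical' is true; a mismatch between 'size_on_disk' and 'bytes_read' would point to the read stopping early or the data being translated.
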
PHP faithfully copied a local .HTML file/web page (ASCII) and a .JPG graphic (binary). The .HTML file comes up perfectly in a browser and displays as expected in NotePad. The .JPG copy displays faithfully in a graphics editor, as well as showing in directory listings as byte-perfect.

> Regarding the EOF with Unix files, I am not quite sure how it works, but the
> file size information is stored in a structure called 'icommon', so it would
> make sense to read the info from there, and then just send an EOF to the
> "application" when the file position pointer has reached that # of read
> bytes. But to know for sure I would need to check the UNIX file system's
> kernel source code, but I really don't have the time left for doing those
> funny things today - actually I guess I would need to do at least some work
> today too. :)

="work"? What's that?

=I had a feeling that was the case for *nix, but in which case why the need for CTRL-Z? - or is CTRL-Z not used under *nix? Any *nix bit-busters amongst us?

> /Anders - who was tempted to sporadically insert some ctrl-Z in this
> message. :)

=perversely enough, the RFC determines a different convention to denote the end of an email msg!!!??? (probably because of what CTRL-Z can do on non-safe systems!)

=dn (who file-opened in binary, just in case!)

> >-----Original Message-----
> >From: DL Neil [mailto:[EMAIL PROTECTED]]
> >Sent: Thursday, February 14, 2002 2:04 PM
> >To: Svensson, B.A.T. (HKG); [EMAIL PROTECTED]
> >Cc: 'alain'
> >Subject: Re: [PHP-WIN] Re: Searching MS Word Docs
> >
> >
> >Sorry B.A.T,
> >
> >Alain's quote is taken directly from
> >http://www.php.net/manual/en/function.fopen.php
> >
> >Using/not "b" does make a difference under Windows - haven't
> >tested it under *nix!
> >
> >The 'end' of a Windows file can be defined in two ways: an
> >ASCII file ends with CTRL+Z, a 'binary' file ends
> >where the header/catalog entry says so (not sure if that
> >terminology has 'traveled' properly).
> >Somewhat
> >obviously, the reason for the latter is that the ASCII code
> >for CTRL+Z may coincidentally appear within the
> >middle of some binary digit (where it doesn't mean either
> >CTRL-Z or eof). Does *nix take filesizes from the
> >catalog/file header information? and only there?
> >
> >Regards,
> >=dn
> >
> >
> >> >From the manual - Last thing to read ;)
> >> >"Note: The mode may contain the letter 'b'. This is useful only on systems
> >> >which differentiate between binary and text files (i.e. Windows. It's
> >> >useless on Unix). If not needed, this will be ignored. "
> >>
> >> If a UNIX system doesn't differentiate between binary and text files, how
> >> are for ex
> >>
> >> 'char *fgets()' respectively 'char *gets()'
> >>
> >> supposed to be working then?
> >>
> >> When opening a file in binary mode (b) Unix and MS Windows behave more or
> >> less the same; on the other hand, when opening a file in text write mode (w
> >> or a), UNIX might differ from MS Windows. I've noticed that MS VC++
> >> appends a ctrl-Z on the end of text files while writing to them. If Borland
> >> C++ does the same, I don't know. But as far as my limited knowledge extends,
> >> neither CC, cc, gcc nor g++ behaves this way.
> >>
> >> Conclusion:
> >>
> >> Opening a file in binary mode works more or less the same on UNIX and
> >> Windows, but text mode works differently on the two systems.
> >
> >=dn
>
> --
> PHP Windows Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php