Mark Tarver wrote:
> On 23 July, 18:01, Dennis Lee Bieber <[email protected]> wrote:
>> On Thu, 23 Jul 2009 08:48:46 -0700 (PDT), Mark Tarver
>> <[email protected]> declaimed the following in
>> gmane.comp.python.general:
>>
>> > The only hint at a difference I can see is that my ftp program says
>> > the files are of unequal lengths. test.py is 129 bytes long.
>> > python.py 134 bytes long.
>>
>> Just a guess...
>>
>> Line endings... <lf> vs <cr><lf>
>
> Is that linefeed + ctrl or what? I can't pick up any difference
> reading the files char by char in Lisp. How do you find the
> difference?
That's carriage return + linefeed, the Windows convention for end-of-line
markers. Unix uses a linefeed only.
If you are on Windows you have to open the file in binary mode to see the
difference:
>>> open("python.py", "rb").read()
'#!/usr/bin/python\r\nprint "Content-type: text/html"\r\nprint\r\nprint"<html>"\r\nprint "<center>Hello, Linux.com!</center>"\r\nprint "</html>"'
>>> open("test.py", "rb").read()
'#!/usr/bin/python\nprint "Content-type: text/html"\nprint\nprint "<html>"\nprint "<center>Hello, Linux.com!</center>"\nprint "</html>"'
>>>
\n denotes a linefeed, i.e. a newline (chr(10))
\r denotes a carriage return (chr(13))
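If you'd rather not compare the two reprs by eye, counting the markers in the raw bytes is a quick check. This is only a minimal sketch: count_line_endings is a name I made up for illustration, and the sample bytes are a shortened stand-in for the scripts above, not the real files.

```python
def count_line_endings(data):
    """Count each end-of-line convention in a bytes object."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the
    # pairs to leave only the bare carriage returns and bare linefeeds.
    lone_cr = data.count(b"\r") - crlf
    lone_lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "cr": lone_cr, "lf": lone_lf}


# Shortened stand-in for the two scripts, not their full contents.
unix_text = b'#!/usr/bin/python\nprint "Content-type: text/html"\nprint\n'
dos_text = unix_text.replace(b"\n", b"\r\n")

print(count_line_endings(unix_text))
print(count_line_endings(dos_text))
# Each CRLF costs one byte more than a bare LF.
print(len(dos_text) - len(unix_text))
```

For real files you would pass in open(name, "rb").read(). The same arithmetic explains the sizes the ftp program reported: five line endings, one extra byte each, 134 - 129 = 5.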
You can fix the line endings with:
open(outfile, "wb").writelines(open(infile, "rU"))
Here "U" denotes "universal newline" mode, which recognizes "\r", "\n", and
"\r\n" as line endings and translates all of these into "\n". "wb" means
"write binary", which does no conversion. Therefore you'll end up with
outfile following the Unix convention. Make sure outfile and infile are
different files, because opening a file with "wb" truncates it.
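One caveat: the "rU" mode is a Python 2 idiom. As far as I know it was deprecated in Python 3 and removed in 3.11. A rough equivalent that works on the raw bytes is sketched below. The function name to_unix_newlines and the file names in the demo are placeholders of mine, and the whole file is read into memory, which is fine for small scripts like these.

```python
import os
import tempfile


def to_unix_newlines(infile, outfile):
    """Copy infile to outfile with every line ending written as one LF."""
    with open(infile, "rb") as src:
        data = src.read()
    # Replace CRLF first so its CR is not turned into a second newline,
    # then convert any remaining bare CR.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    with open(outfile, "wb") as dst:
        dst.write(data)


# Demo on throwaway files in a temporary directory.
tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "python.py")
dst_path = os.path.join(tmpdir, "python_unix.py")
with open(src_path, "wb") as f:
    f.write(b'#!/usr/bin/python\r\nprint "</html>"')

to_unix_newlines(src_path, dst_path)

with open(dst_path, "rb") as f:
    result = f.read()
print(repr(result))
```

Because the input is read completely before the output is opened, this version would also survive being pointed at the same path for both arguments, though writing to a separate file is the safer habit.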
Peter