Re: [Python-Dev] Python code.interact() and UTF-8 locale

2005-09-13 Thread Hye-Shik Chang
On 9/11/05, Victor STINNER <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I found a bug in Python interactive command line (program python alone:
> looks to be code.interact() function in code.py). With UTF-8 locale, the
> command << u"é" >> returns << u'\xc3\xa9' >> and not << u'\xE9' >>.
> Remember: the french e with acute is Unicode 233 (0xE9), encoded \xC3
> \xA9 in UTF-8.

Which version of Python are you using?  Since 2.4, interactive mode
uses the locale's encoding as the source code encoding and falls back
to latin-1 when decoding fails.

Python 2.4.1 (#2, Jul 31 2005, 04:45:53)
[GCC 3.4.2 [FreeBSD] 20040728] on freebsd5
Type "help", "copyright", "credits" or "license" for more information.
>>> u"é"
u'\xe9'
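
To illustrate the behaviour described above, here is a minimal sketch in
modern Python 3 (not the interpreter's actual code). The helper name
decode_input is hypothetical: it tries the locale encoding first and falls
back to latin-1 when decoding fails. The last assignment shows how the two
UTF-8 bytes of "é" become two characters when latin-1 is applied directly,
which matches the u'\xc3\xa9' result from the report.

```python
def decode_input(raw, encoding):
    """Decode raw terminal bytes with the given encoding.

    Hypothetical helper: falls back to latin-1 if decoding fails.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")


utf8_bytes = b"\xc3\xa9"  # "é" as sent by a UTF-8 terminal

decoded_ok = decode_input(utf8_bytes, "utf-8")      # one character, U+00E9
decoded_fallback = decode_input(b"\xe9", "utf-8")   # invalid UTF-8, so latin-1 is used
decoded_wrong = utf8_bytes.decode("latin-1")        # two characters, the reported bug

print(ascii(decoded_ok), ascii(decoded_fallback), ascii(decoded_wrong))
```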


Hye-Shik
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python code.interact() and UTF-8 locale

2005-09-13 Thread Hye-Shik Chang
On 9/13/05, Hye-Shik Chang <[EMAIL PROTECTED]> wrote:
> On 9/11/05, Victor STINNER <[EMAIL PROTECTED]> wrote:
> >
> > I found a bug in Python interactive command line (program python alone:
> > looks to be code.interact() function in code.py). With UTF-8 locale, the
> > command << u"é" >> returns << u'\xc3\xa9' >> and not << u'\xE9' >>.
> > Remember: the french e with acute is Unicode 233 (0xE9), encoded \xC3
> > \xA9 in UTF-8.
> 
> Which version of python do you use?  From 2.4, the interactive mode
> respects locale as a source code encoding and it falls back to latin-1
> when decoding fails.
> 

Aah, code.interact() and IDLE currently behave differently from the
real interactive mode.  I think this needs to be fixed before the
next release.  For IDLE, I filed a patch on SF, #1061803. But it
may need some discussion because of its trickiness. :)

Hye-Shik


Re: [Python-Dev] Replacement for print in Python 3.0

2005-09-13 Thread Calvin Spealman
On 9/9/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> While I laugh at the naive view of people who write things like
> "Interface equality and neutrality would be a good thing in the
> language" and seriously (? I didn't see a smiley) use this argument to
> plead for not making print() a built-in, I do think that avoiding the
> 'print' name would be a good thing if it could be done without ticking
> off the old-timers.

Oh, no! I've been misrepresented!

I can be a little unclear sometimes, and for that I apologize. What I
was saying is that there are essentially two ends to the spectrum: you
either elevate text console IO to a status above other forms of
interface with the applications written in the language, or you don't
build any interface mechanisms into the language at all. Python
currently is at the former end of that spectrum, and the current
discussions seem to be pushing towards the latter. My disagreement is
more with consistency than where it actually stands in that spectrum.
So, I'm saying if it has to be in the language directly, keep a
statement for it. If it really shouldn't be a statement, then make me
import it first.

Yes, I know that no one wants to import a module just to output text,
but I don't see how or why it is any different than importing GUI
modules.


Re: [Python-Dev] Simplify the file-like-object interface

2005-09-13 Thread Calvin Spealman
On 9/13/05, Andrew Durdin <[EMAIL PROTECTED]> wrote:
> On 9/6/05, Antoine Pitrou <[EMAIL PROTECTED]> wrote:
> >
> > One could use "class decorators". For example if you want to define the
> > method foo() in a file-like class, you could use code like:
> 
> I like the sound of this. Suppose there were a function textstream()
> that decorated a file-like object (supporting read() and write()), so
> as to add all of __iter__(), next(), readline(), readlines(), and
> writeline() that it did not already implement. Then you could wrap any
> file-like object easily to give it convenient text-handling:

Yes, this isn't Perl, and text, although still important, is not worth
its weight in gold these days. And have you ever tried to weigh digital
content to begin with? Not much there anyway.


Re: [Python-Dev] Simplify the file-like-object interface

2005-09-13 Thread Michael Chermside
Andrew Durdin writes:
> Another area where I think this approach can help is with the
> text/binary file distinction. file() could always open files as
> binary, and there could be a convenience function textfile(name, mode)
> which would simply return textstream(file(name, mode)). This would
> remove the need for "a" or "b" in the mode parameter, and make it
> easier to keep text- and binary-file access separate in one's mind:

I think you are suffering from the (rather common) misconception that
all files are binary, and the definition of "text file" is a binary
file which should be interpreted as containing characters in some
encoding.

In Unix, the above is true. One of the fundamental decisions in Unix
was to treat all files (and lots of other vaguely file-like things)
as undifferentiated streams of bytes. But this is NOT true on many
other operating systems. It is not, for example, true on Windows.

Many operating systems make a distinction between two basic types of
files... "text files" which are line-oriented and contain "text", and
"binary files" which contain streams of bytes. This distinction is
supported by the basic file operations in the C library. To open a
text file in binary mode is technically an error (although in many OSs
you'll get away with it).

-- Michael Chermside


Re: [Python-Dev] Python code.interact() and UTF-8 locale

2005-09-13 Thread Victor STINNER
Le mardi 13 septembre 2005 à 17:56 +0900, Hye-Shik Chang a écrit :
> On 9/11/05, Victor STINNER <[EMAIL PROTECTED]> wrote:
> > Hi,
> > 
> > I found a bug in Python interactive command line (program python alone:
> > looks to be code.interact() function in code.py). With UTF-8 locale, the
> > command << u"é" >> returns << u'\xc3\xa9' >> and not << u'\xE9' >>.
> > Remember: the french e with acute is Unicode 233 (0xE9), encoded \xC3
> > \xA9 in UTF-8.
> 
> Which version of python do you use?  From 2.4, the interactive mode
> respects locale as a source code encoding and it falls back to latin-1
> when decoding fails.
> 
> Python 2.4.1 (#2, Jul 31 2005, 04:45:53)
> [GCC 3.4.2 [FreeBSD] 20040728] on freebsd5
> Type "help", "copyright", "credits" or "license" for more information.
> >>> u"é"
> u'\xe9'

I installed my own Python 2.4 in /opt/python/. I don't know if the right
code.py is loaded, but here is the output :
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
$ ./python2.4 
Python 2.4.1 (#1, Sep 11 2005, 01:37:26) 
[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u"é"
u'\xe9'
>>> import code
>>> code.interact()
Python 2.4.1 (#1, Sep 11 2005, 01:37:26) 
[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> u"é"
u'\xc3\xa9'
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Well, that works better :-) For code.interact(), you can read my
attached patch. I don't know if it is the best way to fix the bug.

But the following code is still buggy in Python 2.4:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
$ cat python_unicode_eval_bug.py 
#*- coding: UTF-8 -*-
print "One Unicode character: %u" % len(u"é")
print "One Unicode character (using eval) : %u" % eval('len(u"é")')
$ python2.4 python_unicode_eval_bug.py 
One Unicode character: 1
One Unicode character (using eval) : 2
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

RexFi explained to me that Python can't guess the charset of
eval('len(u"é")'). Yep, that's difficult: locale? charset encoding?
This test doesn't matter.
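
The two lengths reported above can be reproduced with a small sketch in
modern Python 3 (an illustration only; the variable names are made up).
The string handed to eval() is just bytes until some charset is chosen,
and the choice decides whether "é" counts as one character or two.

```python
# The expression as it sits in a UTF-8 encoded source file: raw bytes.
expression_bytes = 'len(u"\u00e9")'.encode("utf-8")

# Interpreting the bytes as UTF-8 gives the intended single character.
length_if_utf8 = eval(expression_bytes.decode("utf-8"))

# Interpreting the same bytes as latin-1 yields two characters instead.
length_if_latin1 = eval(expression_bytes.decode("latin-1"))

print(length_if_utf8, length_if_latin1)
```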

@+, Haypo
--- /usr/lib/python2.3/code.py	2005-08-30 18:02:31.0 +0200
+++ code.py	2005-09-12 14:37:14.0 +0200
@@ -232,6 +232,7 @@
                     prompt = sys.ps1
                 try:
                     line = self.raw_input(prompt)
+                    line = unicode(line, sys.stdin.encoding)
                 except EOFError:
                     self.write("\n")
                     break


[Python-Dev] os.path.diff(path1, path2)

2005-09-13 Thread Nathan Bullock
Just wondering if a function such as this has ever
been considered? I find that I quite often want a
function that will give me a relative path from path A
to path B. I have created such a function, but it
would be nice if it was in the standard library.

This function would take two paths: A and B and give
the relation between them. Here are a few examples.

os.path.diff("/A/C/D/", "/A/D/F/")
 ==> "../../D/F"

os.path.diff("/A/", "/A/B/C/")
 ==> "B/C"

os.path.diff("/A/B/C/", "/A/")
 ==> "../.."

I suppose it would also be nice if you could use path
+ file. For example:

os.path.diff("/A/C/D/xyz.html", "/A/D/F/zlf.html")
 ==> "../../D/F/zlf.html"

I am not subscribed to the list so if anyone thinks
this is useful please CC my email address.

I also want to say thank you to everyone who has made
Python what it is today.

Nathan Bullock


Visit my website at http://www.nathanbullock.org








[Python-Dev] IDLE development

2005-09-13 Thread Arthur
Moam writes - 

>Hello,

>More than a year and a half ago, I posted a big patch to IDLE which
>adds support for completion and much better calltips, along with some
>other improvements.

I had also tried to have a little input into the IDLE development process.
I suggested on the idle-dev list that what seemed to me a trivial patch to
the existing code would allow customization via user-defined,
domain-specific syntax highlighting files in a user's home directory, so
that one could, if one wished, add appropriate syntax highlighting for,
say, Numeric.

I left open the possibility that I was mistaken in my opinion that it was
trivial or necessary or desirable.

I was unable to assess whether my inability to get a yes, no or sideways
answer was because my suggestion and analysis were absurd, or because of
something else.  One of the "something else" possibilities was a broken
process.

Still can't assess it.

Art




Re: [Python-Dev] Simplify the file-like-object interface

2005-09-13 Thread Paul Moore
On 9/13/05, Michael Chermside <[EMAIL PROTECTED]> wrote:
> In unix, the above is true. One of the fundamental decisions in Unix
> was to treat all files (and lots of other vaguely file-like things)
> as undifferentiated streams of bytes. But this is NOT true on many
> other operating systems. It is not, for example, true on Windows.


Actually, on Windows, it *is* true. At the OS API level, all files are
streams of bytes (it's not as uniform as Unix - many things that Unix
forces into a file-like mould don't look exactly like OS-level files
on Windows, consoles and sockets being particular examples). However,
at the C library level, text files are "special" insofar as the C
stdio routines handle CRLF issues and the like differently.

The problem is twofold: (1) that Python works with the C runtime, so
stdio behaviour gets involved, and (2) on Windows, the familiar Unix/C
conventions of "\n" for "newline" don't work without translation - so
if you write text to a binary file, your output doesn't conform to the
*conventions* used by other applications (notably notepad - it's
surprising how many Windows programs actually work fine with
LF-delimited lines in text files...)
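
The translation step in point (2) can be imitated on any platform with a
short Python 3 sketch. It is only a simulation, assuming that
io.TextIOWrapper with newline="\r\n" stands in for Windows text mode; the
helper name write_text is made up for this example.

```python
import io


def write_text(text, newline):
    """Write text through a text layer and return the bytes that result."""
    buf = io.BytesIO()
    wrapper = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = buf.getvalue()
    wrapper.detach()  # leave buf open; the wrapper no longer owns it
    return data


# newline="\n" writes "\n" untouched, like a binary-mode write.
untranslated = write_text("one\ntwo\n", newline="\n")

# newline="\r\n" turns each "\n" into CRLF, the Windows text convention.
translated = write_text("one\ntwo\n", newline="\r\n")

print(untranslated, translated)
```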


Paul.


Re: [Python-Dev] os.path.diff(path1, path2)

2005-09-13 Thread Trent Mick
[Nathan Bullock wrote]
> Just wondering if a function such as this has ever
> been considered? I find that I quite often want a
> function that will give me a relative path from path A
> to path B. I have created such a function, but it
> would be nice if it was in the standard library.
> 
> This function would take two paths: A and B and give
> the relation between them. Here are a few examples.
> 
> os.path.diff("/A/C/D/", "/A/D/F/")
>  ==> "../../D/F"
> 
> os.path.diff("/A/", "/A/B/C/")
>  ==> "B/C"
> 
> os.path.diff("/A/B/C/", "/A/")
>  ==> "../.."

Look around for functions/recipes called "relpath". E.g.:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302594
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/208993
http://www.jorendorff.com/articles/python/path/
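
For reference, later Python versions (2.6 and up) include
os.path.relpath(path, start) in the standard library. The sketch below
uses posixpath.relpath so the results do not depend on the platform; the
wrapper name path_diff and its start_is_file flag are hypothetical and
only mirror the argument order of the proposed os.path.diff.

```python
import posixpath


def path_diff(start, target, start_is_file=False):
    """Relative path leading from start to target (hypothetical wrapper).

    relpath() takes the target first and the starting directory second,
    the reverse of the proposed diff(A, B).
    """
    if start_is_file:
        # A file is not a directory: measure from its containing directory.
        start = posixpath.dirname(start)
    return posixpath.relpath(target, start)


print(path_diff("/A/C/D/", "/A/D/F/"))
print(path_diff("/A/", "/A/B/C/"))
print(path_diff("/A/B/C/", "/A/"))
print(path_diff("/A/C/D/xyz.html", "/A/D/F/zlf.html", start_is_file=True))
```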

Trent

-- 
Trent Mick
[EMAIL PROTECTED]


Re: [Python-Dev] Simplify the file-like-object interface

2005-09-13 Thread Bill Janssen
> This [text/binary] distinction is
> supported by the basic file operations in the C library. To open a
> text file in binary mode is technically an error (although in many OSs
> you'll get away with it).

It's one of those "technical" errors that really isn't an error (from
Python).  On the other hand, opening a file in text mode will cause
real data damage to binary files on Windows.  Let some
platform-specific library for that platform preserve it, if it cares
to.

I see no point in keeping this distinction in Python, even for those
platforms bone-headed enough to preserve it.  And the default
certainly shouldn't be to the mode which loses data.
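
The kind of damage meant here can be simulated on any platform with a
short Python 3 sketch. It assumes that io.TextIOWrapper with universal
newlines is a fair stand-in for a text-mode read; the helper name
read_as_text_then_reencode is made up. latin-1 is used because it maps
every byte to one character, so newline translation is the only change.

```python
import io


def read_as_text_then_reencode(data):
    """Read bytes through a text layer, then turn the text back into bytes."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="latin-1", newline=None)
    text = wrapper.read()  # newline=None folds "\r\n" and "\r" into "\n"
    return text.encode("latin-1")


# The first eight bytes of a PNG file contain a CRLF pair on purpose.
png_signature = b"\x89PNG\r\n\x1a\n"
round_tripped = read_as_text_then_reencode(png_signature)

print(png_signature, round_tripped)
```

The carriage return is gone after the round trip, so the binary data no
longer matches what was stored.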

Bill


[Python-Dev] speeding up list append calls

2005-09-13 Thread Neal Norwitz
Tim made me do it! 

 http://groups.google.com/group/comp.lang.python/msg/9075a3bc59c334c9

For whatever reason, I was just curious how his code could be sped up.
I kept seeing this append method being called and I thought, "there's
an opcode for that."  What happens if you replace var.append() with
the LIST_APPEND opcode?

# standard opcodes
$ ./python ./Lib/timeit.py -n 1000 -s 'import foo' 'foo.foo1(1)'
1000 loops, best of 3: 3.66 msec per loop
# hacked version
$ ./python ./Lib/timeit.py -n 1000 -s 'import foo' 'foo.foo2(1)'
1000 loops, best of 3: 1.74 msec per loop

The patch and foo.py are attached.
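
For readers who want to see where LIST_APPEND shows up without patching
the compiler, here is a small sketch using the dis module in modern
Python 3. Bytecode differs between CPython versions, so treat the result
as an assumption about current releases; the helper name opnames is made
up. It walks nested code objects because older versions compile a list
comprehension into a separate code object.

```python
import dis
import types


def opnames(code):
    """Collect opcode names used by a code object and its nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


def with_method(n):
    d = []
    for x in range(n):
        d.append(5)  # attribute lookup plus a call on every iteration
    return d


def with_comprehension(n):
    return [5 for x in range(n)]  # the compiler emits LIST_APPEND here


method_ops = opnames(with_method.__code__)
comprehension_ops = opnames(with_comprehension.__code__)

print("LIST_APPEND" in method_ops, "LIST_APPEND" in comprehension_ops)
```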

This code doesn't really work in general.  It assumes that any append
function call is a list method, which is obviously invalid.  But if a
variable is known to be a list (i.e., local and assigned as list
(BUILD_LIST) or a list comprehension), could we do something like this
as a peephole optimization?  I'm not familiar enough with some dynamic
tricks to know if there are conditions that could break this.
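
A short sketch of why the attribute name alone is not enough (Python 3;
the Journal class and the fill function are made up for illustration).
Any object may define append(), and a rewrite that blindly substituted
LIST_APPEND would skip that method entirely.

```python
from collections import deque


class Journal:
    """Not a list: append() tags each item before storing it."""

    def __init__(self):
        self.entries = []

    def append(self, item):
        self.entries.append(("logged", item))


def fill(container, n):
    # The same source line works for any object with an append() method.
    for x in range(n):
        container.append(x)
    return container


as_list = fill([], 2)
as_deque = fill(deque(maxlen=1), 2)  # keeps only the most recent item
as_journal = fill(Journal(), 2)

print(as_list, as_deque, as_journal.entries)
```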

Probably useless, but it was interesting to me at the time.

n
Index: Python/compile.c
===
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.352
diff -w -u -r2.352 compile.c
--- Python/compile.c	3 Aug 2005 18:33:05 -	2.352
+++ Python/compile.c	14 Sep 2005 05:33:29 -
@@ -754,6 +754,37 @@
 			}
 			break;
 
+		/* Replace LOAD_ATTR "append" ... CALL_FUNCTION 1 with
+		   LIST_APPEND */
+		case LOAD_ATTR:
+			/* for testing: only do optimization if name is speed */
+			name = PyString_AsString(PyTuple_GET_ITEM(names, 0));
+			if (name == NULL  ||  strcmp(name, "speed") != 0)
+				continue;
+			/* end testing bit */
+
+			j = GETARG(codestr, i);
+			name = PyString_AsString(PyTuple_GET_ITEM(names, j));
+			if (name == NULL  ||  strcmp(name, "append") != 0)
+				continue;
+			if (i + 9 > codelen)
+				continue;
+			if (codestr[i+6] != CALL_FUNCTION &&
+			    codestr[i+9] != POP_TOP)
+				continue;
+
+			codestr[i+0] = NOP;
+			codestr[i+1] = NOP;
+			codestr[i+2] = NOP;
+			/* 0-2= LOAD_XXX */
+			/* replace CALL_FUNCTION 1 with LIST_APPEND */
+			codestr[i+6] = LIST_APPEND;
+			codestr[i+7] = NOP;
+			codestr[i+8] = NOP;
+			/* get rid of POP_TOP */
+			codestr[i+9] = NOP;
+			break;
+
 		/* Skip over LOAD_CONST trueconst  JUMP_IF_FALSE xx  POP_TOP */
 		case LOAD_CONST:
 			cumlc = lastlc + 1;


def foo1(n):
  d = []
  for x in range(n):
    d.append(5)

def foo2(n):
  speed = []
  for x in range(n):
    speed.append(5)

if __name__ == '__main__':
  import dis
  dis.dis(foo1)
  print
  dis.dis(foo2)