Re: [Python-3000] not switching core VCS (was How to override io.BytesIO and io.StringIO with their optimized C version?)
Barry Warsaw wrote: On Dec 28, 2007, at 8:24 PM, Martin v. Löwis wrote: We'll leave the timing up to Brett and the infrastructure committee, but IMO, there's no overriding reason to wait. *That* is the major overriding reason: lack of volunteers. If Bazaar is chosen, I would volunteer to help with the conversion. Same here... Later, Blake. ___ Python-3000 mailing list Python-3000@python.org http://mail.python.org/mailman/listinfo/python-3000 Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
Re: [Python-3000] Will standard library modules comply with PEP 8?
Raymond Hettinger wrote: On Aug 27, 2007, at 6:16 PM, [EMAIL PROTECTED] wrote: I would like to see PEP 8 remove the "as necessary to improve readability" in the function and method naming conventions. That way methods like StringIO.getvalue() can be renamed to StringIO.get_value(). Gratuitous breakage -- for nothing. This is idiotic, pedantic, and counterproductive. (No offense intended, I'm talking about the suggestion, not the suggestor). Ask ten of your programmer friends to write down "result equals object dot get value" and see if more than one in ten uses an underscore (no stacking the deck with Cobol programmers). Sure, but given the rise of Java, how many of them will spell it with a capital 'V'? ;) On the one hand, I really like consistency in my programming languages. On the other hand, a foolish consistency is the hobgoblin of little minds. Later, Blake.
Re: [Python-3000] 100% backwards compatible parenless function call statements
Chris Monsanto wrote: so those uncomfortable with this (basic) idea can continue to use parens in their function calls.

But we would have to read people's code who didn't use them.

    my_func2 # call other function
    my_func2() # call it again

So, those two are the same, but these two are different?

    print my_func2
    print my_func2()

What about these two?

    x.y().z
    x.y().z()

Would this apply to anything which implements callable?

    # Method call?
    f = open(myfile)
    f.close

What happens in

    for x in dir(f):
        x

? If some things are functions, do they get called and the other things don't?

--Pros:-- 1) Removes unnecessary verbosity for the majority of situations.

"Unnecessary verbosity" is kind of stretching it. Two whole characters in some situations is hardly a huge burden.

I'm willing to write up a proper PEP if anyone is interested in the idea. I figured I'd poll around first.

I vote "AA! Dear god, no!". ;) Seriously, knowing at a glance the difference between function references and function invocations is one of the reasons I like Python (and dislike Ruby). Your proposal would severely compromise that functionality. Later, Blake.
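The distinction this reply defends, a bare name being a reference and parentheses being a call, can be shown in a few lines. This is only a sketch in current Python 3; the names greet and buf are made up for illustration and are not from the thread.

```python
import io


def greet():
    """Return a fixed greeting so the call is easy to recognise."""
    return "hello"


reference = greet   # no parentheses: another name for the same function object
result = greet()    # parentheses: the function body actually runs

buf = io.StringIO("some text")
close_method = buf.close          # a bound method object; the buffer stays open
still_open = not buf.closed       # True at this point
close_method()                    # only this line closes the buffer
```

Under the parenless-call proposal, the line that assigns close_method would be ambiguous, which is the objection raised above.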
Re: [Python-3000] [Python-Dev] Universal newlines support in Python 3.0
Guido van Rossum wrote: On 8/13/07, Russell E Owen [EMAIL PROTECTED] wrote: In article [EMAIL PROTECTED], Stephen J. Turnbull [EMAIL PROTECTED] wrote: I have run into files that intentionally have more than one newline convention used (mbox and Babyl mail folders, with messages received from various platforms). However, most of the time multiple newline conventions is a sign that the file is either corrupt or isn't text. There is at least one Mac source code editor (SubEthaEdit) that is all too happy to add one kind of newline to a file that started out with a different line ending character. I've seen similar behavior in MS VC++ (long ago, dunno what it does these days). It would read files with \r\n and \n line endings, and whenever you edited a line, that line also got a \r\n ending. But unchanged lines that started out with \n-only endings would keep the \n only. And there was no way for the end user to see or control this. I've seen it in Scite (an editor based around Scintilla) just yesterday. It was rather annoying, since it messed up my diffs something awful, and was invisible to the naked eye. (But it lets you Show Line Endings, which quickly made the problem apparent.) Later, Blake.
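Mixed line endings of the kind described in this message are invisible in most editors but easy to count from the raw bytes. The following is a minimal sketch, assuming the file contents have already been read in binary mode; count_line_endings is an illustrative name, not an existing library function.

```python
import re
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count each newline convention found in raw file bytes.

    The alternation lists \\r\\n first so that a Windows line ending is
    counted once, not as a separate \\r followed by a separate \\n.
    """
    return Counter(m.group() for m in re.finditer(rb"\r\n|\r|\n", data))


sample = b"first\r\nsecond\nthird\rfourth\n"
counts = count_line_endings(sample)
is_mixed = len(counts) > 1   # more than one convention in the same file
```

A file for which is_mixed is true is a candidate for the "corrupt, or edited by a careless editor" cases discussed above.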
Re: [Python-3000] base64 - bytes and strings
Martin v. Löwis wrote: The debate is whether base64.encodestring (which accepts bytes) should *produce* (unicode) strings, which would then have to be encoded as us-ascii. That would make a process of going from unicode to base64 bytes a three-step process:

    tosend = base64.encodestring(data.encode("utf-8")).encode("ascii")

Currently, you can spare the last step if you do want bytes, and need to specify .decode("ascii") if you want strings. As a vote for keeping it, does anyone really want to encode the base64-ed data as something other than ascii? I mean, does it make any sense to write:

    tosend = base64.encodestring(data.encode("utf-8")).encode("UTF-16")

? Even if you could, I believe the resulting string would be un-processable by any other base-64 decoding tool. Later, Blake.
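For readers on a current Python 3, the bytes-in, bytes-out behaviour argued for here can be sketched with base64.b64encode (base64.encodestring itself was removed in Python 3.9). The helper name to_base64_text is hypothetical and only serves the illustration.

```python
import base64


def to_base64_text(text: str) -> str:
    """Encode text as UTF-8, base64 it, and return the result as str."""
    raw = text.encode("utf-8")          # step 1: unicode -> bytes
    b64_bytes = base64.b64encode(raw)   # step 2: bytes -> base64 bytes
    return b64_bytes.decode("ascii")    # step 3: only if a str is wanted


b64_text = to_base64_text("hi")
round_trip = base64.b64decode(b64_text).decode("utf-8")

# Re-encoding the base64 text as UTF-16 yields different bytes (a BOM plus
# two bytes per character), which is why it is not useful on the wire.
utf16_bytes = b64_text.encode("utf-16")
```

The last assignment illustrates the point of the message: the UTF-16 form is twice as long plus a BOM, and is no longer plain base64 data.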
Re: [Python-3000] _heapq.c, etc. (was Re: Heaptypes)
Guido van Rossum wrote: While the pickle/cPickle, StringIO/cStringIO, etc., naming can be a bit annoying, it does give me the choice whether I want it to be fast or flexible. I definitely *don't* want to continue the old habit of having a slow and a fast module with different names; the experience with especially cPickle and cStringIO is that everyone believes their code is performance critical and hence uses the C version if it exists, thereby repeating the same idiom over and over. Until they need to turn Unicode strings into file-like objects, at which point they go back to StringIO. (Why yes, I was recently bitten by that particular restriction. :) Later, Blake.
Re: [Python-3000] Support for PEP 3131
Ka-Ping Yee wrote: On Fri, 25 May 2007, Blake Winton wrote: Ka-Ping Yee wrote: Let's see what we can find. I made several attempts to search for non-ASCII identifiers using google.com/codesearch and here's what I got. I think you've got a selection bias here, since google isn't likely to index code not intended for the whole world, and thus the code you'll be searching through is more likely to be in English than code in general. Indeed. I couldn't think of a better way to do a search, but if you come up with any better methods, go for it and let us know what you find. That was what my second [snipped] paragraph was about. If you could find tutorials or sample code in other languages, that might be less biased. Or maybe more biased in the other direction. On the other hand, I suspect you might have to work at Google to be able to run those sorts of queries. It's a hard problem, and while I applaud your effort, I just wanted to make sure that people knew that it wasn't necessarily representative of the real world. Later, Blake.
Re: [Python-3000] Support for PEP 3131
Ka-Ping Yee wrote: On Fri, 25 May 2007, Josiah Carlson wrote: Apples and oranges to be sure, but there are no other statistics that anyone else is able to offer about use of non-ascii identifiers in Java, Javascript, C#, etc. Let's see what we can find. I made several attempts to search for non-ASCII identifiers using google.com/codesearch and here's what I got. I think you've got a selection bias here, since google isn't likely to index code not intended for the whole world, and thus the code you'll be searching through is more likely to be in English than code in general. Perhaps searching the entire web for "class non-ascii string", or "non-ascii string (" or "non-ascii string =" would give more accurate results, if such a thing is even possible. Later, Blake.
Re: [Python-3000] Support for PEP 3131
Jim Jewett wrote: If you didn't realize it was using non-ASCII (or even that it could), and the author didn't warn you -- then that is an appropriate time for the interpreter to warn you that things aren't as you expect. I fail to see your point. Why should the interpreter warn you? Arbitrary Unicode identifier opens up the possibility of code that *looks* like ASCII, but isn't -- so I don't even realize that I missed something. You already have that problem. Right now. And you've had it for at least a year (assuming you installed 2.4.3 when it came out). All screenshots taken on Python 2.4.3, Mac OSX 10.4 Intel. http://bwinton.latte.ca/temp/Python/File.png http://bwinton.latte.ca/temp/Python/Run.png http://bwinton.latte.ca/temp/Python/foo.py So, what are you doing to mitigate this risk now, and why not do the same thing when identifiers are allowed to be arbitrary Unicode? Later, Blake.
Re: [Python-3000] PEP 3131 accepted
Ka-Ping Yee wrote: But with Unicode identifiers you have no way to know even whether you should be suspicious. You would feel confident that you know what a simple piece of code does, and yet be wrong. Also, Jim Jewett wrote: Strings aren't a problem unless I evaluate them.

    a = """This string has a triple quote and a command in it. \"""
    os.remove("*")

If that \ is merely a unicode character that looks like \, you've just deleted your harddrive. (To close it off, you could use """, where the middle quote is a unicode character that looks like ".) Two strings, with some executable code in the middle, that looks like one harmless string. Actually, I think that could shorten down to:

    a = """ os.remove("*") """

with the middle character of each """ not being a ". My point here is that if you're confident that you know what a simple piece of code does, you're already wrong. Unicode identifiers don't change that. But there is no way to tell by looking at it whether it works or not. If all three occurrences of 'allow' are spelled with ASCII characters, it will work. If the second occurrence of 'allow' is spelled with a Cyrillic 'a' (U+0430), you have a silent security hole. If you search for "allow", it'll only match the ones that actually match. Yes, it makes patch reviewers' jobs harder, or makes the tools they need to do their jobs need to be smarter. No, I don't think it's as bad as you think it is. And heck, if you're a patch reviewer, set the ASCII-only flag on your version of Python, or run a program before checking it in to flag non-ASCII characters, and reject all patches from that person in the future, since clearly they're a black hat. Also, I find it strangely amusing that complaints about characters that look the same as other characters come from someone named ?!ng. :) Later, 314|3.
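The "run a program before checking it in to flag non-ASCII characters" idea is only a few lines of Python. What follows is a rough sketch rather than a real review tool: find_non_ascii is a made-up name, and the function assumes the source has already been decoded to str.

```python
import unicodedata


def find_non_ascii(source: str):
    """Return (line, column, character, Unicode name) for each non-ASCII char.

    Line and column numbers are 1-based, to match what editors display.
    """
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# The second 'allow' starts with a Cyrillic 'a' (U+0430), as in the
# example discussed in this message.
sample = "allow = 1\n\u0430llow = 2\n"
suspicious = find_non_ascii(sample)
```

Each reported hit carries the Unicode character name, so a lookalike letter is identified by what it is, not by how it renders.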
Re: [Python-3000] Support for PEP 3131 - discussion on python zope users group
Jason Orendorff wrote: On 5/16/07, Collin Winter [EMAIL PROTECTED] wrote: [Test][ExpectedException(typeof(ArgumentException))] public void 一年未満はエラーになる() { Date date = new Date(0, 1, 1); } The mix of Japanese and English is not as visually jarring as I expected. It actually looks kinda cool. :) I agree, but that particular example kind of worried me, since in my browser's font, は looks a lot like ( followed by some other Japanese character. I spent a couple of minutes looking for the closing paren before realizing that it wasn't what I thought it was... Of course, I have the same problem in English, with rn looking a lot like m sometirnes. (In a related story, a friend of mine mentioned she was on the Pom-pom squad in high-school.) Later, Blake.
Re: [Python-3000] Reminder: Py3k PEPs due by April
Phillip J. Eby wrote: * Eliminate implicit string concatenation: "abc" "def" Sure, I use it, but if it went away, I would type the plus sign. Not a problem. And it would be one less thing for newcomers to learn, and explicit is better, right? But there's another Python principle here, I think... complexity of computation should be represented by complexity of syntax. We don't generally like to use properties for expensive computation, or methods for simple field access, for example. Putting in a '+' sign makes the code *feel* like there's more computation going on, even if the computation gets optimized away. Given that the set of things the compiler can optimize grows at a faster rate than the syntax changes to the language, I'm not sure that that's really a principle. Yeah, sure, you don't want to hide, say, a reverse-dns-lookup-with-associated-timeout behind the creation of a socket (to take a horrible example from Java that's just finished biting me), but + isn't really a heavyweight operator, especially if you're thinking of (small-ish) integers, and if the compiler optimizes it into less than one instruction, then great. I suppose I see the + case for strings, in particular, as being more similar to the + case for numbers than calling methods or properties... Later, Blake.
Re: [Python-3000] Modules with Dual Python/C Implementations
Brett Cannon wrote: This has been argued about before. It has been suggested we actually ditch the C version since we only want to maintain one version and the Python version can be used by alternative Python implementations. This probably needs to be covered in a PEP that covers a stdlib reorg/renaming. If someone does that, I'd like to suggest removing re.match, because the match/search distinction seems to be one of the things that trips newbies up on a fairly regular basis (and tripped me up just a few weeks ago after years of using Python), and re.match( pat ) is equivalent to: re.search( "^" + pat ) (or could hopefully be made equivalent). Later, Blake.
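The "could hopefully be made equivalent" caveat matters in practice: prefixing "^" is not quite the same as re.match when the pattern contains a top-level alternation or when re.MULTILINE is set. A sketch of a closer emulation follows, using the \A anchor and a non-capturing group; match_via_search is a hypothetical helper, and patterns that begin with inline flags such as (?i) would need extra handling.

```python
import re


def match_via_search(pattern: str, text: str, flags: int = 0):
    """Emulate re.match() using re.search().

    \\A anchors at the very start of the string even under re.MULTILINE,
    and the (?:...) group keeps an alternation such as "b|c" anchored as
    a whole rather than anchoring only its first branch.
    """
    return re.search(r"\A(?:" + pattern + ")", text, flags)


# Naive "^" prefix: "^b|c" means "^b" OR "c", so it finds the "c" in "abc".
naive = re.search("^" + "b|c", "abc")
emulated = match_via_search("b|c", "abc")
real = re.match("b|c", "abc")
```

Here naive is a match object while emulated and real are both None, which is the kind of surprise the message above is describing.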
Re: [Python-3000] BOM handling
Talin wrote: My point was different : most programmers are not at your level (or Paul's level, etc.) when it comes to Unicode knowledge. Py3k's str type is supposed to be an abstracted textual type to make it easy to write unicode-friendly applications (isn't it?). The basic controversy centers around the various ways in which Python should attempt to deal with character encodings on various platforms, but my question is for what use cases? To my mind, trying to ask how should we handle character encoding without indicating what we want to use the characters *for* is a meaningless question. Contrary to all expectations, this thread has helped me in my day job already. I'm about to start writing a program (in Python, natch) which will take a set of files, and perform simple token substitution on them, replacing tokens of the form %STUFF.format% with the value of the STUFF token looked up in another (XML, thus Unicode by the time it gets to me) file. The files I'll be substituting in will be in various encodings, and I'll be creating new files which must have the same encoding. Sadly, I don't know what all the encodings are. (The Windows Resource Compiler takes in .rc files, but I can't find any suggestion of what encoding those use. Anyone here know?) The first version of the spec naively mentioned nothing about encodings, and so I raised a red flag about that, seeing that we would have problems, and that the right thing to do in this case isn't clear. Um, what more data do we need for this use-case? I'm not going to suggest an API, other than it would be nice if I didn't have to manually figure out/hard code all the encodings. (It's my belief that I will currently have to do that, or at least special-case XML, to read the encoding attribute.) Oh, and it would be particularly horrible if I output a shell script in UTF-8, and it included the BOM, since I believe that would break the magic number of #!. 
(To test it in vim, set the following options:

    :set encoding=utf-8
    :set bomb
)

    Jennifer:~ bwinton$ xxd test
    000: efbb bf23 2120 2f62 696e 2f62 6173 680a ...#! /bin/bash.
    010: 6563 686f 204a 7573 7420 7465 7374 696e echo Just testin
    020: 672e 2e2e 0a g
    Jennifer:~ bwinton$ ./test
    -bash: ./test: cannot execute binary file
    Jennifer:~ bwinton$ xxd test
    000: 2321 202f 6269 6e2f 6261 7368 0a65 6368 #! /bin/bash.ech
    010: 6f20 4a75 7374 2074 6573 7469 6e67 2e2e o Just testing..
    020: 2e0a ..
    Jennifer:~ bwinton$ ./test
    Just testing...

From the standpoint of a programmer writing code to process file contents, there's really no such thing as a text file - there are only various text-based file formats. There are XML files, .ini files, email messages and Python source code, all of which need to be processed differently. Yeah, see, at a business level, I really need to process those all in the same way, and it would be annoying to have to write code to handle them all differently. For files with any kind of structure in them, common practice is that we don't treat them as streams of characters, rather we generally have some abstraction layer that sits on top of the character stream and allows us to work with the structure directly. Your common practice, perhaps. I find myself treating them as streams of characters as often as not, because I neither need nor care to process the structure. Heck, even in my source code, I grep more often than I use the fancy Find Usages button (if only because PyDev in Eclipse doesn't let me search for all the usages of a function). So my whole approach to the problem of reading and writing is to come up with a collection of APIs that reflect the common use patterns for the various popular file types. That sounds great. Can you also come up with an API for the files that you don't consider to be in common use? And if so, that's the one that everyone is going to use.
(I'm not saying that to be contrary, but because I honestly believe that that's what's going to happen. If there's a choice between using one API for all your files, and using n APIs for all your files, my money is always going to be on the one. Maybe XML will have enough traction to make it two, but certainly no more than that.) Later, Blake.
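The shebang breakage demonstrated with xxd in this message can be reproduced without touching the filesystem. The sketch below uses the "utf-8-sig" codec that current Python 3 ships with (it writes a BOM on encode and strips one on decode); strip_utf8_bom is an illustrative helper name, not a standard function.

```python
import codecs

script = "#! /bin/bash\necho Just testing...\n"

with_bom = script.encode("utf-8-sig")   # starts with EF BB BF, then "#!"
without_bom = script.encode("utf-8")    # starts with "#!" directly


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Decoding with "utf-8-sig" accepts input with or without the BOM.
decoded_a = with_bom.decode("utf-8-sig")
decoded_b = without_bom.decode("utf-8-sig")
```

Writing executable scripts with plain "utf-8" and reading unknown text with "utf-8-sig" is one way to avoid the failure shown above, assuming the files are UTF-8 in the first place.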
Re: [Python-3000] BOM handling
Josiah Carlson wrote: Blake Winton [EMAIL PROTECTED] wrote: I'm not going to suggest an API, other than it would be nice if I didn't have to manually figure out/hard code all the encodings. (It's my belief that I will currently have to do that, or at least special-case XML, to read the encoding attribute.) Use the XML tag/attribute <?xml ... encoding="..." ?> to discover the encoding and assume utf-8 otherwise as per spec: http://www.w3.org/TR/2000/REC-xml-20001006#NT-EncodingDecl Yeah, but now you're requiring me to read and understand the file's contents, which is something I (as someone who doesn't particularly care about all this encoding stuff) am trying very hard not to do. Does no-one write generic text processing programs anymore? If I were to write a program which rotated an image using PIL, I wouldn't have to care whether it was a png or a jpeg. (At least, I'm pretty sure I wouldn't. I haven't tried recently.) Oh, and it would be particularly horrible if I output a shell script in UTF-8, and it included the BOM, since I believe that would break the magic number of #!. Does bash natively support utf-8? A quick Google gives me: - About bash utf-8: Bash is the shell, or command language interpreter, that will appear in the GNU operating system. It is default shell for BeOS. By default, GNU bash assumes that every character is one byte long and one column wide. It may cause several problems for all non-english BeOS users, especially with file names using national characters. A patch for bash 2.04, by Marcin 'Qrczak' Kowalczyk and Ricardas Cepas, teaches bash about multibyte characters in UTF-8 encoding, and fixes those problems. Double-width characters, combining characters and bidi are not supported by this patch. - which I'm mainly posting here because of the reference to Marcin 'Qrczak' Kowalczyk. Small world, but I wouldn't want to paint it. Is there a bash equivalent to Python coding: directives? You may be attempting to fix a problem that doesn't exist.
I don't know if the magic number stuff to determine whether a file is executable or not is bash-specific. Either way, when I save the file in UTF-8, it's fine, but when I save it in UTF-8 with a BOM, it fails. Yeah, see, at a business level, I really need to process those all in the same way, and it would be annoying to have to write code to handle them all differently. So you, or anyone else, can write a module for discovering the encoding used for a particular file based on XML tags, Python coding: directives, etc. It could include an extensible registry, and if it is used enough, could be included in the Python standard library. Okay, so what will happen for file types which aren't in the registry, like those Windows .rc files? I was lying up above when I said that I don't care about this sort of thing. I do care, but I also believe that I am, and should be, in the minority, and that if we can't ship something that will work for people who don't care about this stuff, then we've failed both them and Python. Later, Blake.
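The "extensible registry" suggested in this exchange might look roughly like the following. It is a sketch under several assumptions: the sniffers only look at the first bytes of a file, the regular expressions are simplified versions of the XML encoding declaration and the PEP 263 coding cookie, and every name here (SNIFFERS, guess_encoding, and so on) is invented for the illustration. File types with no sniffer, such as the .rc files mentioned above, simply fall through to the default.

```python
import codecs
import re


def sniff_bom(head: bytes):
    """A UTF-8 BOM is the strongest hint available."""
    return "utf-8-sig" if head.startswith(codecs.BOM_UTF8) else None


def sniff_xml(head: bytes):
    """Look for <?xml ... encoding="..." ?> at the very start."""
    m = re.match(rb'<\?xml[^>]*\bencoding=["\']([\w.-]+)["\']', head)
    return m.group(1).decode("ascii") if m else None


def sniff_python(head: bytes):
    """Look for a coding cookie on the first two lines (PEP 263)."""
    for line in head.splitlines()[:2]:
        m = re.match(rb"[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)", line)
        if m:
            return m.group(1).decode("ascii")
    return None


# The registry: callers can append their own sniffers for other formats.
SNIFFERS = [sniff_bom, sniff_xml, sniff_python]


def guess_encoding(head: bytes, default: str = "utf-8") -> str:
    """Return the first encoding any sniffer reports, else the default."""
    for sniff in SNIFFERS:
        found = sniff(head)
        if found:
            return found
    return default
```

A caller would read the first few hundred bytes of a file in binary mode, pass them to guess_encoding, and reopen the file with the result; whether "utf-8" is the right fallback is exactly the open question in this thread.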
Re: [Python-3000] Fwd: proposal: disambiguating type
On Tue, May 23, 2006 at 09:54:57AM -0700, Guido van Rossum wrote: (In fact, the first time I tried to use type( x ), I accidentally typed 'typeof( x )'. So this is one data point as to how intuitive the name is.) The only intuitive interface is the nipple. Everything else is learned. (Jef Raskin, I believe.) As the father of two young girls, let me add that the nipple isn't particularly intuitive either. It was a two-to-three day intensive learning process for my wife and both babies. (Although it was easier the second time around, since then at least one of them knew what was supposed to be happening.) Later, Blake.
Re: [Python-3000] Changing function-related TypeErrors
Bill Janssen wrote: GvR writes: On 5/5/06, Bill Janssen [EMAIL PROTECTED] wrote: Is there anywhere else in Python where the type of an object isn't checkable with isinstance()? Yes, it's called duck typing. And, in my opinion, it's probably worth stomping out in Py3K. You want to get rid of all duck typing? That doesn't sound right to me. Anyway it isn't enforceable. I must be misunderstanding you. Yes, I meant get rid of all duck typing. Duck typing is for languages that can't do any better. It's a weakness, not a strength. You missed April Fool's day by more than a month, Bill. Seriously, if I wanted a language that restricted me to classes and subtyping and mixins, I'd use Java. I like being able to take classes that only implement the methods I need. I like being able to do stuff like:

    >>> def f(x):
    ...     return x + 1
    ...
    >>> def y(x):
    ...     return x.inc(1)
    ...
    >>> y.inc = f
    >>> y(y)
    2

Even though I probably wouldn't do it in production code. Your suggestion also makes it much harder to write Proxy objects, especially if you don't know what it is you're going to be proxying (think SOAP or ctypes for examples). Later, Blake.
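The Proxy argument is easier to see with a concrete case. Below is a small sketch of a proxy that knows nothing about the type it wraps and relies purely on duck typing; LoggingProxy is a made-up class, and it forwards only ordinary attribute access (special methods such as __len__ are looked up on the type and would need explicit forwarding).

```python
class LoggingProxy:
    """Forward attribute access to a wrapped object, recording method calls."""

    def __init__(self, target):
        self._target = target
        self.calls = []

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails, so _target and
        # calls are found directly and never reach this method.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def wrapper(*args, **kwargs):
            self.calls.append(name)
            return attr(*args, **kwargs)

        return wrapper


proxy = LoggingProxy([3, 1, 2])
proxy.sort()
proxy.append(4)
```

The proxy is not a list as far as isinstance() is concerned, yet any code that only calls sort() or append() on it works unchanged, which is the property this message wants to keep.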
Re: [Python-3000] suggestion for a new socket io stack
tomer filiba wrote: Each protocol would subclass {Stream/Dgram/Raw}Socket and add its protocol-specific options. here's an example for a hierarchy: Socket RawSocket DgramSocket UDPSocket StreamSocket TCPSocket SSLSocket My one small complaint with that is that SSLSocket should really be an SSL wrapper around a TCPSocket, because I wouldn't want to have two classes for HTTP/HTTPS, or FTP/FTPS/SFTP. Java does something reasonable in this regard, I think, where you can have subclasses of Filter(Input,Output)Stream which can build on one-another, so I can have a base-64 encoded, TEA-encrypted, SSL, TCP socket by writing a line similar to:

    sock = TcpSocket( "129.97.134.11", 8080 )
    myIn = Base64Stream( TeaStream( SslStream( sock.getInputStream() ) ) )
    myOut = Base64Stream( TeaStream( SslStream( sock.getOutputStream() ) ) )

The real beauty of that is that I can hook up my Base64Stream to a StringIO-like object to base64-encode stuff in memory, for testing, or for actual use. Or I could Tea-encrypt a file while writing it to disk. On the downside, it's a larger change than the one you're proposing. On the upside, it unifies stream sockets, strings, files, and any other source of stream data you can come up with (random generators? keyboards?). Further reading of your examples suggests to me that what I would really like is a stream-view of the Socket class, and let SSL (, etc...) be a wrapper around the Socket's input/output streams. Other than that small suggestion, I quite like your proposal. Later, Blake.
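The wrapper style argued for here can be sketched in Python with nothing more than objects that expose write(). Everything below is hypothetical: XorWriter is a toy stand-in for an encrypting filter such as the TeaStream in the message (it is not real encryption), HexWriter stands in for an encoding filter, and io.BytesIO plays the role of the in-memory sink used for testing.

```python
import binascii
import io


class XorWriter:
    """Toy 'encrypting' filter: XOR every byte with a one-byte key."""

    def __init__(self, inner, key: int):
        self.inner = inner
        self.key = key

    def write(self, data: bytes) -> int:
        return self.inner.write(bytes(b ^ self.key for b in data))


class HexWriter:
    """Encoding filter: write the hexadecimal form of each chunk."""

    def __init__(self, inner):
        self.inner = inner

    def write(self, data: bytes) -> int:
        return self.inner.write(binascii.hexlify(data))


# Compose the filters around an in-memory sink. The same chain could wrap
# a file opened in binary mode or the object returned by a socket's
# makefile("wb"), since only write() is required.
sink = io.BytesIO()
out = XorWriter(HexWriter(sink), key=0x20)
out.write(b"Hi")
```

Because each layer depends only on the write() method of the layer beneath it, swapping the sink for a file or a socket stream needs no new classes, which is the unification described above.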
[Python-3000] Make it easier to port to small devices.
Specifically, I'm thinking of the 300MHz Palm TX sitting on the table beside me, which should be more than powerful enough to run Python, but in general I think it should be easier to cross-compile Python for new architectures. I've taken a stab at it, and perhaps it's because I don't have enough experience porting large projects, but it turned out to be a non-starter for me. It might be worth talking to the Nokia and Pippy people to see what the major challenges were, and how we could make it easier for them. I see smaller devices as a new field, without much competition from other languages, in which Python could come to dominate. (Python on the Palm at least exists, even if it is based on 1.5. I haven't heard anything about anyone attempting to put Ruby on a Palm.) Thanks, Blake.