[ANNOUNCE] greenlet 0.3.4
Hi,

I have uploaded greenlet 0.3.4 to PyPI: http://pypi.python.org/pypi/greenlet

What is it?
-----------

The greenlet module provides coroutines for Python. Coroutines allow suspending and resuming execution at certain locations. concurrence [1], eventlet [2] and gevent [3] use the greenlet module in order to implement concurrent network applications.

Documentation can be found here: http://greenlet.readthedocs.org

The code is hosted on github: https://github.com/python-greenlet/greenlet

Changes in version 0.3.4
------------------------

The NEWS file lists these changes for release 0.3.4:

* Use plain distutils for the install command; this fixes installation of the greenlet.h header.
* Enhanced arm32 support
* Fix support for Linux/S390 zSeries
* Workaround for a compiler bug on RHEL 3 / CentOS 3

[1] http://opensource.hyves.org/concurrence/
[2] http://eventlet.net/
[3] http://www.gevent.org/

--
Cheers,
Ralf Schmitt

--
http://mail.python.org/mailman/listinfo/python-announce-list
Support the Python Software Foundation: http://www.python.org/psf/donations/
ANN: TLS Lite 0.4.0
Hi,

After years of neglect, I'm maintaining TLS Lite again, the pure-Python SSL/TLS library. Lots of bugfixes and cleanup went into this release, but I'm sure there are more issues out there, so let me know!

https://github.com/trevp/tlslite/
https://github.com/downloads/trevp/tlslite/tlslite-0.4.0.tar.gz

mailing list: http://sourceforge.net/mailarchive/forum.php?forum_name=tlslite-users

Trevor

--
http://mail.python.org/mailman/listinfo/python-announce-list
Support the Python Software Foundation: http://www.python.org/psf/donations/
Re: Python usage numbers
On 12.2.2012 03:23, Steven D'Aprano wrote:

The use-case given is: I have a file containing text. I can open it in an editor and see it's nearly all ASCII text, except for a few weird and bizarre characters like £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an error. What should I do that requires no thought?

Obvious answers:
- Try decoding with UTF8 or Latin1. Even if you don't get the right characters, you'll get *something*.
- Use open(filename, encoding='ascii', errors='surrogateescape') (Or possibly errors='ignore'.)

These are not good answers, IMHO. The only answer I can think of, really, is:

- pack your luggage, your submarine is waiting for you to peel onions in it (with reference to Joel's article).

Meaning, really, you should learn your craft and pull your head out of the sand. There is a wider world around you. (And yes, I am a Czech, so I need at least latin-2 for my language.)

Best,
Matěj

--
http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 12.2.2012 09:14, Matej Cepl wrote:

Obvious answers:
- Try decoding with UTF8 or Latin1. Even if you don't get the right characters, you'll get *something*.
- Use open(filename, encoding='ascii', errors='surrogateescape') (Or possibly errors='ignore'.)

These are not good answers, IMHO. The only answer I can think of, really, is:

Slightly less flameish answer to the question “What should I do, really?” is a tough one: all these suggested answers are bad because they don't deal with the fact that your input data are obviously broken. The rest is just pure GIGO … without fixing (and I mean really fixing, not ignoring the problem, which is what the previous answers suggest) your input, you'll get garbage on output. And you should be thankful to py3k that it showed the issue to you.

BTW, can you display the following line?

Příliš žluťoučký kůň úpěl ďábelské ódy.

Best,
Matěj

--
http://mail.python.org/mailman/listinfo/python-list
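[The errors='surrogateescape' option debated above is easy to demonstrate concretely. A minimal sketch in Python 3 - the file name and byte values here are made up for illustration:]

```python
import os
import tempfile

# Write a file that is mostly ASCII but contains two non-ASCII bytes
# (0xE9 and 0xA3, which are not valid ASCII and not valid UTF-8 either).
path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with open(path, "wb") as f:
    f.write(b"caf\xe9 costs \xa310\n")

# errors='surrogateescape' maps each undecodable byte to a lone
# surrogate code point (U+DC00 + byte value) instead of raising.
with open(path, encoding="ascii", errors="surrogateescape") as f:
    text = f.read()

print(repr(text))  # 'caf\udce9 costs \udca310\n'

# The escape is reversible: encoding with the same handler restores
# the original bytes exactly - nothing is silently lost.
assert text.encode("ascii", errors="surrogateescape") == b"caf\xe9 costs \xa310\n"
```

This is why surrogateescape was suggested: unlike errors='ignore', it round-trips the broken bytes untouched, deferring the real fix rather than destroying evidence.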
Re: Guide to: Learning Python Decorators
just google "jack diederich decorators" - it costs nothing and you get a free pycon talk out of it.

-Jack

--
http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 01:05:35 -0600, Andrew Berg wrote:

On 2/12/2012 12:10 AM, Steven D'Aprano wrote:

It's not just UTF8 either, but nearly all encodings. You can't even expect to avoid problems if you stick to nothing but Windows, because Windows' default encoding is localised: a file generated in (say) Israel or Japan or Germany will use a different code page (encoding) by default than one generated in (say) the US, Canada or UK.

Generated by what? Windows will store a locale value for programs to use, but programs use Unicode internally by default

Which programs? And we're not talking about what they use internally, but what they write to files.

(i.e., API calls are Unicode unless they were built for old versions of Windows), and the default filesystem (NTFS) uses Unicode for file names.

No. File systems do not use Unicode for file names. Unicode is an abstract mapping between code points and characters. File systems are written using bytes. Suppose you're a fan of the Russian punk band Наӥв and you have a directory of their music. The file system doesn't store the Unicode code points 1053 1072 1253 1074; they have to be encoded to a sequence of bytes first. NTFS by default uses the UTF-16 encoding, which means the actual bytes written to disk are \x1d\x040\x04\xe5\x042\x04 (possibly with a leading byte-order mark \xff\xfe). Windows has two separate APIs, one for wide characters, the other for single bytes. Depending on which one you use, the directory will appear to be called Наӥв or 0å2. But in any case, we're not talking about the file name encoding. We're talking about the contents of files.

AFAIK, only the terminal has a localized code page by default. Perhaps Notepad will write text files with the localized code page by default, but that's an application choice...

Exactly. And unless you know what encoding the application chooses, you will likely get an exception trying to read the file.

--
Steven

--
http://mail.python.org/mailman/listinfo/python-list
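[Steven's byte sequence can be verified directly in Python 3 - a small sketch of the same example:]

```python
name = "Наӥв"

# The code points quoted in the post: 1053 1072 1253 1074.
print([ord(c) for c in name])       # [1053, 1072, 1253, 1074]

# UTF-16 little-endian, no BOM: two bytes per character here.
# 0x30 and 0x32 happen to be the printable ASCII digits '0' and '2',
# which is why they appear literally inside the repr.
print(name.encode("utf-16-le"))     # b'\x1d\x040\x04\xe5\x042\x04'

# Python's plain 'utf-16' codec prepends the byte-order mark
# (\xff\xfe on a little-endian machine).
print(name.encode("utf-16"))

# The "0å2" effect: read the very same bytes one byte at a time
# (Latin-1); the \x04 and \x1d bytes are invisible control characters,
# so only 0, å and 2 show up on screen.
print(name.encode("utf-16-le").decode("latin-1"))
```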
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
Having written something with a similar purpose (https://github.com/aht/extproc), here are my comments:

* Having commands parsed from a string is complicated. Why not just have an OOP API to construct commands? extproc does this, but you opted to write a recursive descent parser. I'm sure it's fun, but I think simple is better than complex. Most users would prefer to deal with Python, not another language.

* Using threads and fork()ing processes does not play nice together unless extreme care is taken. Disasters await. For a shell-like library, I would recommend its users never use threads (so that those who do otherwise know what they are in for).

--
http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On Feb 12, 7:41 am, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:

This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it.

I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle. Here's my function:

import math

def improve(x, w, exp=math.exp):
    """Use Halley's method to improve an estimate of W(x)
    given an initial estimate w."""
    try:
        for i in range(36):  # Max number of iterations.
            ew = exp(w)
            a = w*ew - x
            b = ew*(w + 1)
            err = -a/b  # Estimate of the error in the current w.
            if abs(err) <= 1e-16:
                break
            print '%d: w= %r err= %r' % (i, w, err)
            # Make a better estimate.
            c = (w + 2)*a/(2*w + 2)
            delta = a/(b - c)
            w -= delta
        else:
            raise RuntimeError('calculation failed to converge', err)
    except ZeroDivisionError:
        assert w == -1
    return w

Here's an example where improve() converges very quickly:

>>> improve(-0.36, -1.222769842388856)
0: w= -1.222769842388856 err= -2.9158979924038895e-07
1: w= -1.2227701339785069 err= 8.4638038491998997e-16
-1.222770133978506

That's what I expect: convergence in only a few iterations. Here's an example where it gets stuck in a cycle, bouncing back and forth between two values:

>>> improve(-0.36787344117144249, -1.0057222396915309)
0: w= -1.0057222396915309 err= 2.6521238905750239e-14
1: w= -1.0057222396915044 err= -2.6521238905872001e-14
2: w= -1.0057222396915309 err= 2.6521238905750239e-14
3: w= -1.0057222396915044 err= -2.6521238905872001e-14
4: w= -1.0057222396915309 err= 2.6521238905750239e-14
5: w= -1.0057222396915044 err= -2.6521238905872001e-14
[...]
32: w= -1.0057222396915309 err= 2.6521238905750239e-14
33: w= -1.0057222396915044 err= -2.6521238905872001e-14
34: w= -1.0057222396915309 err= 2.6521238905750239e-14
35: w= -1.0057222396915044 err= -2.6521238905872001e-14
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 19, in improve
RuntimeError: ('calculation failed to converge', -2.6521238905872001e-14)

(The correct value for w is approximately -1.00572223991.)

I know that Newton's method is subject to cycles, but I haven't found any discussion about Halley's method and cycles, nor do I know what the best approach for breaking them would be. None of the papers on calculating the Lambert W function that I have found mentions this. Does anyone have any advice for solving this?

--
Steven

Looks like floating point issues to me, rather than something intrinsic to the iterative algorithm. Surely there is no complex chaotic behavior to be found in this fairly smooth function in a +/- 1e-14 window. Otoh, there are a lot of floating point significant-bit-loss issues to be suspected in the kind of operations you are performing (exp(x) + something, always a tricky one).

I would start by asking: how accurate is good enough? If it's not good enough, play around with the ordering of your operations, try solving a transformed problem less sensitive to loss of significance, and begin by trying different numeric types to see if the problem is sensitive thereto to begin with.

--
http://mail.python.org/mailman/listinfo/python-list
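[One low-tech way to handle the two-cycle Steven describes is to remember the previous iterate and, if the new estimate merely repeats it, settle on the midpoint of the oscillating pair. The sketch below is my illustration, not code from the thread; the tolerance and iteration count are taken from the original post:]

```python
import math

def improve(x, w, exp=math.exp):
    """Halley's method for W(x), with a naive two-cycle breaker:
    if the iteration bounces back to the previous iterate, take the
    midpoint of the two oscillating values and stop.  The oscillating
    values differ only in the last few ulps, so their midpoint is as
    good an answer as the iteration can deliver."""
    prev = None
    for i in range(36):  # Max number of iterations.
        ew = exp(w)
        a = w*ew - x
        b = ew*(w + 1)
        if b == 0:
            break
        err = -a/b  # Estimate of the error in the current w.
        if abs(err) <= 1e-16:
            break
        # Make a better estimate.
        c = (w + 2)*a/(2*w + 2)
        delta = a/(b - c)
        new_w = w - delta
        if new_w == prev:        # two-cycle detected
            w = (w + new_w)/2    # the root lies between the pair
            break
        prev, w = w, new_w
    return w

print(improve(-0.36, -1.222769842388856))                  # ≈ -1.222770133978506
print(improve(-0.36787344117144249, -1.0057222396915309))  # terminates, ≈ -1.00572223969
```

This doesn't answer the deeper numerical question (the cycle is a symptom of err hovering at the limit of double precision), but it turns the RuntimeError into a graceful stop.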
Re: Python usage numbers
On 2/12/2012 3:12 AM, Steven D'Aprano wrote: NTFS by default uses the UTF-16 encoding, which means the actual bytes written to disk are \x1d\x040\x04\xe5\x042\x04 (possibly with a leading byte-order mark \xff\xfe). That's what I meant. Those bytes will be interpreted consistently across all locales. Windows has two separate APIs, one for wide characters, the other for single bytes. Depending on which one you use, the directory will appear to be called Наӥв or 0å2. Yes, and AFAIK, the wide API is the default. The other one only exists to support programs that don't support the wide API (generally, such programs were intended to be used on older platforms that lack that API). But in any case, we're not talking about the file name encoding. We're talking about the contents of files. Okay then. As I stated, this has nothing to do with the OS since programs are free to interpret bytes any way they like. -- CPython 3.2.2 | Windows NT 6.1.7601.17640 -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Feb 12, 9:41 am, Anh Hai Trinh <anh.hai.tr...@gmail.com> wrote:

Having written something with similar purpose (https://github.com/aht/extproc), here are my comments:

* Having command parsed from a string is complicated. Why not just have an OOP API to construct commands?

It's not hard for the user, and less work e.g. when migrating from an existing Bash script. I may have put in the effort to use a recursive descent parser under the hood, but why should the user of the library care? It doesn't make their life harder. And it's not complicated, not even particularly complex - such parsers are commonplace.

* Using threads and fork()ing process does not play nice together unless extreme care is taken. Disasters await.

By that token, disasters await if you ever use threads, unless you know what you're doing (and sometimes even then). Sarge doesn't force the use of threads with forking - you can do everything synchronously if you want. The test suite does cover the particular case of thread+fork. Do you have specific caveats, or is it just a "there be dragons" sentiment? Sarge is still in alpha status; no doubt bugs will surface, but unless a real show-stopper occurs, there's not much to be gained by throwing up our hands.

BTW extproc is nice, but I wanted to push the envelope a little :-)

Regards,
Vinay Sajip

--
http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 12/02/2012 08:26, Matej Cepl wrote: On 12.2.2012 09:14, Matej Cepl wrote: Obvious answers: - Try decoding with UTF8 or Latin1. Even if you don't get the right characters, you'll get *something*. - Use open(filename, encoding='ascii', errors='surrogateescape') (Or possibly errors='ignore'.) These are not good answer, IMHO. The only answer I can think of, really, is: Slightly less flameish answer to the question “What should I do, really?” is a tough one: all these suggested answers are bad because they don’t deal with the fact, that your input data are obviously broken. The rest is just pure GIGO … without fixing (and I mean, really, fixing, not ignoring the problem, which is what the previous answers suggest) your input, you’ll get garbage on output. And you should be thankful to py3k that it shown the issue to you. BTW, can you display the following line? Příliš žluťoučký kůň úpěl ďábelské ódy. Best, Matěj Yes in Thunderbird, Notepad, Wordpad and Notepad++ on Windows Vista, can't be bothered to try any other apps. -- Cheers. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: ldap proxy user bind
sajuptpm wrote:

Yea i am not totally clear about that Client's Requirement is option to have a ldap proxy user bind to the ldap server if it needs more directory rights than an anonymous bind. option to use a ldap proxy user when searching.

As said: there's the proxy authorization control (see RFC 4370) for which a Python class exists in python-ldap. This is used e.g. in web applications if the user has successfully authenticated to the application and his identity should be used when processing ACLs in the LDAP server. In this case the proxy user is a trusted entity assumed to have done authentication right. The proxy authz control is sent by the application with each LDAP request. The server has to be correctly configured to accept that.

Another option is an LDAP proxy server which accepts anon requests and binds as a certain user. You could use OpenLDAP with back-ldap or back-meta for that.

So you should ask your customer what's really needed.

Ciao, Michael.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On Sunday, 12 February 2012 at 14:41:20 UTC+8, Steven D'Aprano wrote:

This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it. I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle. Here's my function:

import math

def improve(x, w, exp=math.exp):
    """Use Halley's method to improve an estimate of W(x)
    given an initial estimate w."""
    try:
        for i in range(36):  # Max number of iterations.
            ew = exp(w)
            a = w*ew - x

W*EXP(W) can converge for negative values of W.

            b = ew*(w + 1)

b = exp(W)*W + W

            err = -a/b  # Estimate of the error in the current w.

What's X? It's not explained.

            if abs(err) <= 1e-16:
                break
            print '%d: w= %r err= %r' % (i, w, err)
            # Make a better estimate.
            c = (w + 2)*a/(2*w + 2)
            delta = a/(b - c)
            w -= delta
        else:
            raise RuntimeError('calculation failed to converge', err)
    except ZeroDivisionError:
        assert w == -1
    return w

[...]

(The correct value for w is approximately -1.00572223991.) I know that Newton's method is subject to cycles, but I haven't found any discussion about Halley's method and cycles, nor do I know what the best approach for breaking them would be. None of the papers on calculating the Lambert W function that I have found mentions this. Does anyone have any advice for solving this?

--
Steven

I suggest you use a Taylor series expansion to speed up w*exp(w) for negative values of w.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On 12/02/2012 10:10, Eelco wrote:

On Feb 12, 7:41 am, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:

This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it. I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle.

[...]

Looks like floating point issues to me, rather than something intrinsic to the iterative algorithm. [...] I would start by asking: How accurate is good enough? If it's not good enough, play around with the ordering of your operations, try solving a transformed problem less sensitive to loss of significance; and begin by trying different numeric types to see if the problem is sensitive thereto to begin with.

HTH.

c:\Users\Mark\Python>type sda.py
import decimal

def improve(x, w, exp=decimal.Decimal.exp):
    """Use Halley's method to improve an estimate of W(x)
    given an initial estimate w."""
    try:
        for i in range(36):  # Max number of iterations.
            ew = exp(w)
            a = w*ew - x
            b = ew*(w + 1)
            err = -a/b  # Estimate of the error in the current w.
            if abs(err) <= 1e-16:
                break
            print '%d: w= %r err= %r' % (i, w, err)
            # Make a better estimate.
            c = (w + 2)*a/(2*w + 2)
            delta = a/(b - c)
            w -= delta
        else:
            raise RuntimeError('calculation failed to converge', err)
        print '%d: w= %r err= %r' % (i, w, err)
    except ZeroDivisionError:
        assert w == -1
    return w

improve(decimal.Decimal('-0.36'), decimal.Decimal('-1.222769842388856'))
improve(decimal.Decimal('-0.36787344117144249'), decimal.Decimal('-1.0057222396915309'))

c:\Users\Mark\Python>sda.py
0: w= Decimal('-1.222769842388856') err= Decimal('-2.915897982757542086414504607E-7')
1: w= Decimal('-1.222770133978505953034526059') err= Decimal('-1.084120148360381932277303211E-19')
0: w= Decimal('-1.0057222396915309') err= Decimal('5.744538819905061986438230561E-15')
1: w= Decimal('-1.005722239691525155461180092') err= Decimal('-0E+2')

--
Cheers.
Mark Lawrence.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On 2/12/12 6:41 AM, Steven D'Aprano wrote: This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it. I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle. I don't have any advice for fixing your code, per se, but I would just grab mpmath and use their lambertw function: http://mpmath.googlecode.com/svn/trunk/doc/build/functions/powers.html#lambert-w-function -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
In article <mailman.5715.1329021524.27778.python-l...@python.org>, Chris Angelico <ros...@gmail.com> wrote:

On Sun, Feb 12, 2012 at 1:36 PM, Rick Johnson <rantingrickjohn...@gmail.com> wrote:

On Feb 11, 8:23 pm, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:

I have a file containing text. I can open it in an editor and see it's nearly all ASCII text, except for a few weird and bizarre characters like £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an error. What should I do that requires no thought? Obvious answers:

the most obvious answer would be to read the file WITHOUT worrying about asinine encoding.

What this statement misunderstands, though, is that ASCII is itself an encoding. Files contain bytes, and it's only what's external to those bytes that gives them meaning.

Exactly.

<soapbox class=wise-old-geezer>
ASCII was so successful at becoming a universal standard which lasted for decades, people who grew up with it don't realize there was once any other way. Not just EBCDIC, but also SIXBIT, RAD-50, tilt/rotate, packed card records, and so on. Transcoding was a way of life, and if you didn't know what you were starting with and aiming for, it was hopeless. Kind of like now, where we are again with Unicode.
</soapbox>

--
http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
In article <4f375347$0$29986$c3e8da3$54964...@news.astraweb.com>, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:

ASCII truly is a blight on the world, and the sooner it fades into obscurity, like EBCDIC, the better.

That's a fair statement, but it's also fair to say that at the time it came out (49 years ago!) it was a revolutionary improvement on the extant state of affairs (every manufacturer inventing their own code, and often different codes for different machines). Given the cost of both computer memory and CPU cycles at the time, sticking to a 7-bit code (the 8th bit was for parity) was a necessary evil.

As Steven D'Aprano pointed out, it was missing some commonly used US symbols such as ¢ or ©. This was a small price to pay for the simplicity ASCII afforded. It wasn't a bad encoding. It was a very good encoding. But the world has moved on and computing hardware has become cheap enough that supporting richer encodings and character sets is realistic.

And, before people complain about the character set being US-centric, keep in mind that the A in ASCII stands for American, and it was published by ANSI (whose A also stands for American). I'm not trying to wave the flag here, just pointing out that it was never intended to be anything other than a national character set.

Part of the complexity of Unicode is that when people switch from working with ASCII to working with Unicode, they're really having to master two distinct things at the same time (and often conflate them into a single confusing mess). One is the Unicode character set. The other is a specific encoding (UTF-8, UTF-16, etc). Not to mention silly things like the BOM (Byte Order Mark).

I expect that some day, storage costs will become so cheap that we'll all just be using UTF-32, and programmers of the day will wonder how their poor parents and grandparents ever managed in a world where nobody quite knew what you meant when you asked, "how long is that string?".
-- http://mail.python.org/mailman/listinfo/python-list
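[The "how long is that string?" ambiguity is easy to see in Python 3 - a small illustration of the point, not from the original post:]

```python
s = "Naïve café"  # 10 characters, two of them non-ASCII

# "Length" depends on what you count: code points or encoded bytes.
print(len(s))                      # 10 code points
print(len(s.encode("utf-8")))      # 12 bytes: ï and é take 2 bytes each
print(len(s.encode("utf-16-le")))  # 20 bytes: 2 per code point here
print(len(s.encode("utf-32-le")))  # 40 bytes: always 4 per code point
```

With UTF-32 the byte count is simply four times the code-point count, which is exactly the simplicity being predicted above (at the cost of storage).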
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
It's not hard for the user

I think most users like to use Python, or they'd use Bash. I think people prefer not to have another language that is different from both while offering little benefit. My own opinion of course.

Re. threads and fork(): http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them

For a careful impl of fork-exec with threads, see http://golang.org/src/pkg/syscall/exec_unix.go

By that token, disasters await if you ever use threads, unless you know what you're doing

So don't. This package is mainly a fork-exec-wait library providing shell-like functionality. Just use fork().

BTW extproc is nice, but I wanted to push the envelope a little :-)

Hmm, if the extra envelope is the async code with threads that may deadlock, I would say thanks but no thanks :p

I do think that IO redirection is much nicer with extproc.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 17:08:24 +1100, Chris Angelico wrote: On Sun, Feb 12, 2012 at 4:51 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. True, but if it cost you $10 (or 10 GBP) to courier your curriculum vitae to the head office of Encyclopaedia Britannica to become Staff Coordinator, then you'd be fine. And if it cost you $10 to post your work summary to Britannica's administration to apply for this Staff Coordinator position, you could say it without 'e' too. Doesn't mean you don't need Unicode! Back in the late 1970's, the economy and the outlook in the USA sucked, and the following joke made the rounds: Mr. Smith: Good morning, Mr. Jones. How are you? Mr. Jones: I'm fine. (The humor is that Mr. Jones had his head so far [in the sand] that he thought that things were fine.) American English is my first spoken language, but I know enough French, Greek, math, and other languages that I am very happy to have more than ASCII these days. I imagine that even Steven's surname should be spelled D’Aprano rather than D'Aprano. Dan -- http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
I don't know the first thing about this math problem however, if I were to code this I might try;

except ZeroDivisionError: assert w = -1

rather than;

except ZeroDivisionError: assert w == -1

jimonlinux

On Sunday, February 12, 2012 06:41:20 AM Steven D'Aprano wrote:

This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it. I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle.

[...]

(The correct value for w is approximately -1.00572223991.) I know that Newton's method is subject to cycles, but I haven't found any discussion about Halley's method and cycles, nor do I know what the best approach for breaking them would be. None of the papers on calculating the Lambert W function that I have found mentions this. Does anyone have any advice for solving this?

--
http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
For a careful impl of fork-exec with threads, see http://golang.org/src/pkg/syscall/exec_unix.go

I forgot to mention that this impl is indeed correct only because you cannot start a thread or call fork() directly in the Go language, other than by using goroutines and the ForkExec() function implemented there. So all that locking is internal.

If you use threads and call fork(), you're almost guaranteed to face deadlocks. Perhaps not in a particular piece of code, but in some other. Perhaps not on your laptop, but on the production machine with a different kernel. Like most race conditions, they will eventually show up.

--
http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On 02/12/2012 10:20 AM, inq1ltd wrote: I don't know the first thing about this math problem however, if I were to code this I might try ; except ZeroDivisionError: assert w = -1 You top-posted. Please type your response after whatever you're quoting. In my case, I only need a portion of what you said, and my remarks are following it. assert takes an expression, so the one above is just wrong. Fortunately, Python would tell you with SyntaxError: invalid syntax -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
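[Dave's point - that assert takes an expression, so the suggested line never even compiles - can be checked without risking a broken module, by handing both spellings to compile() (my illustration):]

```python
# 'assert' takes an expression; 'w = -1' is an assignment statement,
# so the suggested line fails at compile time, before it can ever run.
try:
    compile("assert w = -1", "<example>", "exec")
    print("compiled (unexpected)")
except SyntaxError as e:
    print("SyntaxError:", e.msg)

# The original spelling is an ordinary comparison expression and
# compiles (and passes) fine.
w = -1
assert w == -1
compile("assert w == -1", "<example>", "exec")
```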
Re: Python usage numbers
On Feb 12, 10:51 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sun, 12 Feb 2012 15:38:37 +1100, Chris Angelico wrote: Everything that displays text to a human needs to translate bytes into glyphs, and the usual way to do this conceptually is to go via characters. Pretending that it's all the same thing really means pretending that one byte represents one character and that each character is depicted by one glyph. And that's doomed to failure, unless everyone speaks English with no foreign symbols - so, no mathematical notations. Pardon me, but you can't even write *English* in ASCII. You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger™ is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? [Note: I grew up on APL so unlike Rick I am genuinely asking :-) ] -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
In article mailman.5730.1329065268.27778.python-l...@python.org, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: On Sun, 12 Feb 2012 10:48:36 -0500, Roy Smith r...@panix.com wrote: As Steven D'Aprano pointed out, it was missing some commonly used US symbols such as ¢ or ©. That's interesting. When I wrote that, it showed on my screen as a cent symbol and a copyright symbol. What I see in your response is an upper case A with a hat accent (circumflex?) over it followed by a cent symbol, and likewise an upper case A with a hat accent over it followed by a copyright symbol. Oh, for the days of ASCII again :-) Not to mention, of course, that I wrote colon-dash-close-paren, but I fully expect some of you will be reading this with absurd clients which turn that into some kind of smiley-face image. Any volunteers to create an Extended Baudot... Instead of letter shift and number shift we could have a generic encoding shift which uses the following characters to identify which 7-bit subset of Unicode is to be represented. I think that's called UTF-8. -- http://mail.python.org/mailman/listinfo/python-list
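Roy's UTF-8 quip is apt: the lead byte of each sequence plays exactly the role of a per-character "encoding shift", announcing how many continuation bytes follow. A quick illustrative sketch (not from the thread):

```python
# ASCII characters stay one byte; others grow a multi-byte sequence
# whose lead byte encodes the sequence length.
for ch in "A¢€":
    encoded = ch.encode("utf-8")
    print(ch, [hex(b) for b in encoded])

# 'A' -> ['0x41']                  (1 byte)
# '¢' -> ['0xc2', '0xa2']          (2 bytes)
# '€' -> ['0xe2', '0x82', '0xac']  (3 bytes)
```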
Re: Python usage numbers
In article e7f457b3-7d49-4c95-bd95-e0f27fa66...@s8g2000pbj.googlegroups.com, rusi rustompm...@gmail.com wrote: On Feb 12, 10:51 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sun, 12 Feb 2012 15:38:37 +1100, Chris Angelico wrote: Everything that displays text to a human needs to translate bytes into glyphs, and the usual way to do this conceptually is to go via characters. Pretending that it's all the same thing really means pretending that one byte represents one character and that each character is depicted by one glyph. And that's doomed to failure, unless everyone speaks English with no foreign symbols - so, no mathematical notations. Pardon me, but you can't even write *English* in ASCII. You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? [Note: I grew up on APL so unlike Rick I am genuinely asking :-) ] What I do (on a Mac) is open the Keyboard Viewer thingie and try various combinations of shift-control-option-command-function until the thing I'm looking for shows up on a keycap. A few of them I've got memorized (for example, option-8 gets you a bullet •). I would imagine if you commonly type in a language other than English, you would quickly memorize the ones you use a lot. Or, open the Character Viewer thingie and either hunt around the various drill-down menus (North American Scripts / Canadian Aboriginal Syllabics, for example) or type in some guess at the official unicode name into the search box. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
rusi rustompm...@gmail.com wrote: On Feb 12, 10:51 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sun, 12 Feb 2012 15:38:37 +1100, Chris Angelico wrote: Everything that displays text to a human needs to translate bytes into glyphs, and the usual way to do this conceptually is to go via characters. Pretending that it's all the same thing really means pretending that one byte represents one character and that each character is depicted by one glyph. And that's doomed to failure, unless everyone speaks English with no foreign symbols - so, no mathematical notations. Pardon me, but you can't even write *English* in ASCII. You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger™ is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? [Note: I grew up on APL so unlike Rick I am genuinely asking :-) ] [Emacs specific] Many different ways of course, but in emacs, you can select e.g. the TeX input method with C-x RET C-\ TeX RET, which does all of the above symbols with the exception of the cent symbol (or maybe I missed it) - you type the thing in the first column and you get the thing in the second column:

\pounds     £
\'e         é
\ae         æ
\"o         ö
^{TM}       ™
\copyright  ©

I gave up on the cent symbol and used ucs-insert (C-x 8 RET) which allows you to type a name, in this case CENT SIGN, to get ¢. Nick -- http://mail.python.org/mailman/listinfo/python-list
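The name-based lookup Nick reaches for with ucs-insert (typing CENT SIGN) has a direct stdlib counterpart in Python's unicodedata module; a small sketch:

```python
import unicodedata

# Look a character up by its official Unicode name...
cent = unicodedata.lookup("CENT SIGN")
print(cent)  # ¢

# ...and map characters back to their names.
print(unicodedata.name("£"))  # POUND SIGN
print(unicodedata.name("™"))  # TRADE MARK SIGN
```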
Re: Python usage numbers
On 12 Feb 2012 09:12:57 GMT, Steven D'Aprano wrote: Suppose you're a fan of Russian punk bank Наӥв and you have a directory of their music. Sigh. Banking ain't what it used to be. I'm sticking with classical Muzak. -- To email me, substitute nowhere-spamcop, invalid-net. -- http://mail.python.org/mailman/listinfo/python-list
package extension problem
Hello, I wish to extend the functionality of an existing python package by creating a new package that redefines the relevant classes of the old package. Each new class inherits from the equivalent old class and adds new methods. In the new package there is something like the following:

import old_package as op

class A(op.A):
    ...  # add new methods

class B(op.B):
    ...  # add new methods

Some classes of the old package work as a dictionary of other classes of the same old package. Example: if class A and class B are classes of the old package, B[some_hash] returns an instance of A. When a program imports the new package and creates instances of the new class B, B[some_hash] still returns an instance of the old class A, while I want an instance of the new class A. Is there a way to solve this problem without redefining in the new package all the methods of the old package that return old classes? Thanks in advance for any suggestion, Fabrizio -- http://mail.python.org/mailman/listinfo/python-list
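One way to get the behaviour Fabrizio asks for without rewriting every method (a sketch against made-up stand-in classes, since the real old_package isn't shown): intercept `__getitem__` in the new B and upgrade the returned instance's `__class__`. This works when the new A only adds methods, not new instance state.

```python
# Stand-ins for the old package (hypothetical; the real package
# isn't shown in the post).
class OldA:
    def whoami(self):
        return "old A"

class OldB:
    def __getitem__(self, key):
        return OldA()  # the old B hands back old-style A instances

# New package: subclass, and intercept the dictionary-style access.
class A(OldA):
    def whoami(self):
        return "new A"

class B(OldB):
    def __getitem__(self, key):
        obj = super().__getitem__(key)
        # Upgrade the old instance in place; safe only because A adds
        # methods, not new per-instance state.
        obj.__class__ = A
        return obj

b = B()
print(b["some_hash"].whoami())  # new A
```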
Re: Disable use of pyc file with no matching py file
On 2/2/2012 1:21 AM, Terry Reedy wrote: On 2/2/2012 1:42 AM, Devin Jeanpierre wrote: On Wed, Feb 1, 2012 at 2:53 PM, Terry Reedy tjre...@udel.edu wrote: And it bothers me that you impute such ignorance to me. You made what I think was a bad analogy and I made a better one of the same type, though still imperfect. I acknowledged that the transition will take years. Ah. It is a common attitude among those that make these sorts of comments about Python 3, and I hadn't read anything in what you said that made me think that you were considering more than the superficial costs of moving. I thought '95% in 10 years' would be a hint that I know upgrading is not trivial for everyone ;-). I am sorry that I did not give you the benefit of the doubt. Apology accepted. This is OT, but relevant to the side discussion. Also note that the community respects and appreciates the T.J. Reedy contributions and hard work. Reality. It was a monumental task to convert ATE driver dev and related factory automation stuff from C to Python starting in 2000. Most was done guerrilla-style. My employer now appreciates the resultant infrastructure that even the Mexico factory engineering team can use, and is good for their pride because they are no longer totally dependent on gringo engineers. Am currently being disruptive to their little world with my recent (1Q 2011) switch to 3.x. The corporate natives are restless and there is talk of an armed insurrection. Humans, at the tribal level, do not adapt to change. Expect a long series of battles that are ruthless, bloody, and have a high body count. Vive la Revolution. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 12:11:01 +0000, Mark Lawrence wrote: On 12/02/2012 08:26, Matej Cepl wrote: On 12.2.2012 09:14, Matej Cepl wrote: Obvious answers: - Try decoding with UTF8 or Latin1. Even if you don't get the right characters, you'll get *something*. - Use open(filename, encoding='ascii', errors='surrogateescape') (Or possibly errors='ignore'.) These are not good answers, IMHO. The only answer I can think of, really, is: Slightly less flameish answer to the question “What should I do, really?” is a tough one: all these suggested answers are bad because they don’t deal with the fact that your input data are obviously broken. The rest is just pure GIGO … without fixing (and I mean, really, fixing, not ignoring the problem, which is what the previous answers suggest) your input, you’ll get garbage on output. And you should be thankful to py3k that it showed the issue to you. BTW, can you display the following line? Příliš žluťoučký kůň úpěl ďábelské ódy. Best, Matěj Yes in Thunderbird, Notepad, Wordpad and Notepad++ on Windows Vista, can't be bothered to try any other apps. Pan seems to be fine; they at least look like letters, not just blocks -- Appearances often are deceiving. -- Aesop -- http://mail.python.org/mailman/listinfo/python-list
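For reference, the errors='surrogateescape' option quoted above round-trips undecodable bytes losslessly, which is why it gets suggested despite Matej's objection; a quick sketch (not code from the thread):

```python
# b"caf\xe9" is latin-1 for 'café'; the \xe9 byte is not valid ASCII.
data = b"caf\xe9"
text = data.decode("ascii", errors="surrogateescape")

# The offending byte is smuggled through as a lone surrogate (U+DCE9),
# so re-encoding with the same handler restores the bytes exactly.
assert text.encode("ascii", errors="surrogateescape") == data
```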
Re: Python usage numbers
There is so much to say on the subject, I do not know where to start. Some points. Today, Sunday, 12 February 2012, 90%, if not more, of the Python applications that are supposed to work with text and that I'm toying with are simply not working. Two reasons: 1) Most of the devs understand nothing, or not enough, about the field of character coding. 2) In gui applications, most of the devs understand nothing, or not enough, about keyboard keys/chars handling. --- I have known Python since version 1.5.2 or 1.5.6 (?). Among the applications I wrote, my fun is in writing GUI interactive interpreters with Python 2 or 3, tkinter, Tkinter, wxPython, PySide, PyQt4 on Windows. Believe it or not, my interactive interpreters are the only ones where I can enter text and where text is displayed correctly. IDLE, wxPython/PyShell, DrPython, ... all are failing. (I do not count console applications.) Python popularity? I have no popularity-meter. What I know: I cannot type French text in IDLE on Windows. It has been like this for ~ten years and I never saw any complaint about this. (The problem is bad programming.) Ditto for PyShell in wxPython. I do not count the number of corrections I proposed. In one version, it took me 18 months until I finally decided to propose a correction. During this time, I never heard of the problem. (Now, it is broken again.) --- Is there a way to fix this actual status? - Yes, and *very easily*. Will it be fixed? - No, because there is no willingness to solve it. --- Roy Smith's quote: ... that we'll all just be using UTF-32, ... Considering PEP 393, Python is not taking this road. --- How many devs know that one cannot write text in French with the iso-8859-1 coding? (see pep 393) How can one explain that corporations like MS or Apple, with their cp1252 or mac-roman codings, managed to know this? Ditto for foundries (Adobe, LinoType, ...) --- Python is 20 years old. It was developed with ascii in mind. 
Before Python was born, all this stuff was already a non-problem with Windows and VB. Even a step further back, before Windows was born, this was a non-problem at DOS level (eg TurboPascal), 30 years ago! Design mistake. --- Python 2 introduced the unicode type. Very nice. Problem: the introduction of the automatic ascii-unicode coercion, which somehow breaks everything. Very bad design mistake. (In my mind, the biggest one.) --- One day, I fell on the web on a very old discussion about Python related to the introduction of unicode in Python 2. Something like: Python core dev (it was VS or AP): ... let's go with ucs-4 and we have no problem in the future. Look at the situation today. --- And so on. --- Conclusion. A Windows programmer is better served by downloading VB.NET Express. An end Windows user is better served by an application developed with VB.NET Express. I find it somehow funny that Python is able to produce this: (1.1).hex() '0x1.1999ap+0' and on the other side, Python, and Python applications, are not able to deal correctly with entering and displaying text. Probably the two most important tasks a computer has to do! jmf PS I'm not a computer scientist, only a computer user. -- http://mail.python.org/mailman/listinfo/python-list
Need help with shutils.copytree
I have a 'master' directory and a collection of 'slave' dirs. I want the master to collect all of the stuff in the slave dirs. The slaves all look like this:

.
|-- slaveX
|   `-- archI
|       `-- distJ
|           `-- FILE

Where the different slaveX dirs may contain multiple occurrences of archI and distJ, but across all slaveX dirs, there will only be one *unique* instance of FILE in archI and distJ. Here's an example: Given slave[1234], arch1 and arch2, and dist1 and dist2, I want master to end up looking like this:

.
|-- master
|   `-- arch1
|   |   `-- dist1
|   |   |   `-- FILE
|   `-- arch1
|   |   `-- dist2
|   |   |   `-- FILE
|   `-- arch2
|   |   `-- dist1
|   |   |   `-- FILE
|   `-- arch2
|   |   `-- dist2
|   |   |   `-- FILE

etc... In bash, I might use cpio passthrough mode and say something like:

master=$path_to_master
for slave in ${slaves}
do
    pushd $slave
    find . -print | cpio -pdum $master
    popd
done

but I'm having a hard time trying to get this functionality in python. (I'm trying to avoid writing a subprocess.) I tried using shutil.copytree with a try / except that does a pass on OSError (which is what gets raised when trying to create a dir that already exists). No joy there. I also tried an ignore function that always returns (). Someone must have done this before. Any suggestions / pointers are much appreciated. (I hope this was clear to read.) TIA -- Time flies like the wind. Fruit flies like a banana. Stranger things have .0. happened but none stranger than this. Does your driver's license say Organ ..0 Donor?Black holes are where God divided by zero. Listen to me! We are all- 000 individuals! What if this weren't a hypothetical question? steveo at syslang.net -- http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On Feb 12, 6:41 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote:

err = -a/b  # Estimate of the error in the current w.
if abs(err) <= 1e-16:
    break

If the result you're expecting is around -1.005, this exit condition is rather optimistic: the difference between the two Python floats either side of this value is already 2.22e-16, so you're asking for less than half a ulp of error! As to the rest; your error estimate simply doesn't have enough precision. The main problem is in the computation of a, where you're subtracting two almost identical values. The absolute error incurred in computing w*exp(w) is of the same order of magnitude as the difference 'w*exp(w) - x' itself, so err has lost essentially all of its significant bits, and is at best only a crude indicator of the size of the error. The solution would be to compute the quantities 'exp(w), w*exp(w), and w*exp(w) - x' all with extended precision. For the other quantities, there shouldn't be any major issues---after all, you only need a few significant bits of 'delta' for it to be useful, but with the subtraction that generates a, you don't even get those few significant bits. (The correct value for w is approximately -1.00572223991.) Are you sure? Wolfram Alpha gives me the following value for W(-1, -0.36787344117144249455719773322925902903079986572265625): -1.005722239691522978... so it looks as though the values you're getting are at least alternating around the exact value. -- Mark -- http://mail.python.org/mailman/listinfo/python-list
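Mark's extended-precision suggestion can be sketched with the stdlib decimal module (an illustrative sketch, not code from the thread): evaluate the residual w*exp(w) - x with enough digits that the cancellation no longer swallows every significant bit.

```python
import math
from decimal import Decimal, getcontext

def residual(w, x, prec=50):
    """w*exp(w) - x evaluated in extended precision, so the
    subtraction of nearly equal values keeps significant bits."""
    getcontext().prec = prec
    dw = Decimal(repr(w))
    return dw * dw.exp() - Decimal(repr(x))

w, x = -1.0057222396915309, -0.36787344117144249
float_res = w * math.exp(w) - x   # almost pure rounding noise
decimal_res = residual(w, x)      # tiny, but with a trustworthy sign and size
```

The float residual here is of the same order as the rounding error of w*exp(w) itself, which is exactly why err in improve() oscillates instead of shrinking.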
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Feb 12, 3:35 pm, Anh Hai Trinh anh.hai.tr...@gmail.com wrote: I think most users like to use Python, or they'd use Bash. I think people prefer not to have yet another language that is different from both and has little benefit. My own opinion of course. I have looked at pbs and clom: they Pythonify calls to external programs by making spawning them look like function calls. There's nothing wrong with that, it's just a matter of taste. I find that e.g. wc(ls("/etc", "-1"), "-l") is not as readable as call("ls /etc -1 | wc -l") and the attempt to Pythonify doesn't buy you much, IMO. Of course, it is a matter of taste - I understand that there are people who will prefer the pbs/clom way of doing things. Re. threads fork(): http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-befo... For a careful impl of fork-exec with threads, see http://golang.org/src/pkg/syscall/exec_unix.go Thanks for the links. The first seems to me to be talking about the dangers of locking and forking; if you don't use threads, you don't need locks, so the discussion about locking only really applies in a threading+forking scenario. I agree that locking+forking can be problematic because of the semantics of what happens to the state of the locks and threads in the child (for example, as mentioned in http://bugs.python.org/issue6721). However, it's not clear that any problem occurs if the child just execs a new program, overwriting the old - which is the case here. The link you pointed to says that "It seems that calling execve(2) to start another program is the only sane reason you would like to call fork(2) in a multi-threaded program", which is what we're doing in this case. Even though it goes on to mention the dangers inherent in inherited file handles, it also mentions that these problems have been overcome in recent Linux kernels, and the subprocess module does contain code to handle at least some of these conditions (e.g. preexec_fn, close_fds keyword arguments to subprocess.Popen). 
Hopefully, if there are race conditions which emerge in the subprocess code (as has happened in the past), they will be fixed (as has happened in the past). Hmm, if the extra envelop is the async code with threads that may deadlock, I would say thanks but no thanks :p That is of course your privilege. I would hardly expect you to drop extproc in favour of sarge. But there might be people who need to tread in these dangerous waters, and hopefully sarge will make things easier for them. As I said earlier, one doesn't *need* to use asynchronous calls. I agree that I may have to review the design decisions I've made, based on feedback based on people actually trying the async functionality out. I don't feel that shying away from difficult problems without even trying to solve them is the best way of moving things forward. What are the outcomes? * Maybe people won't even try the async functionality (in which case, they won't hit problems) * They'll hit problems and just give up on the library (I hope not - if I ever have a problem with a library I want to use, I always try and engage with the developers to find a workaround or fix) * They'll report problems which, on investigation, will turn out to be fixable bugs - well and good * The reported bugs will be unfixable for some reason, in which case I'll just have to deprecate that functionality. Remember, this is version 0.1 of the library, not version 1.0. I expect to do some API and functionality tweaks based on feedback and bugs which show up. I do think that IO redirection is much nicer with extproc. Again, a matter of taste. You feel that it's better to pass dicts around in the public API where integer file handles map to other handles or streams; I feel that using a Capture instance is less fiddly for the user. Let a thousand flowers bloom, and all that. I do thank you for the time you've taken to make these comments, and I found the reading you pointed me to interesting. 
I will update the sarge docs to point to the link on the Linux Programming blog, to make sure people are informed of potential pitfalls. Regards, Vinay Sajip -- http://mail.python.org/mailman/listinfo/python-list
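The subprocess hooks discussed above look like this in use (a minimal, POSIX-only sketch; on older Pythons close_fds was not the default, so it is spelled out here):

```python
import os
import subprocess
import sys

# Spawn a child with inherited descriptors closed, plus a hook that
# runs in the child between fork() and exec().
proc = subprocess.Popen(
    [sys.executable, "-c", "print('hello from child')"],
    stdout=subprocess.PIPE,
    close_fds=True,          # don't leak the parent's descriptors
    preexec_fn=os.setsid,    # POSIX only: detach into a new session
)
out, _ = proc.communicate()
print(out.decode().strip())
```

Note that preexec_fn itself runs in the forked child, so it is subject to exactly the fork-plus-threads caveats being debated in this thread; keeping it to async-signal-safe work like os.setsid is the conservative choice.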
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Feb 12, 4:19 pm, Anh Hai Trinh anh.hai.tr...@gmail.com wrote: If you use threads and call fork(), you're almost guaranteed to face deadlocks. Perhaps not in a particular piece of code, but in some other. Perhaps not on your laptop, but on the production machine with a different kernel. Like most race conditions, they will eventually show up. You can hit deadlocks in multi-threaded programs even without the fork(), can't you? In that situation, you either pin it down to a bug in your code (and even developers experienced in writing multi-threaded programs hit these), or a bug in the underlying library (which can hopefully be fixed, but that applies to any bug you might hit in any library you use, and is something you have to consider whenever you use a library written by someone else), or an unfixable problem (e.g. due to problems in the Python or C runtime) which requires a different approach. I understand your concerns, but you are just a little further along the line from people who say "If you use threads, you will have deadlock problems. Don't use threads." I'm not knocking that POV - people need to use what they're comfortable with, and to avoid things that make them uncomfortable. I'm not pushing the async feature as a major advantage of the library - it's still useful without that, IMO. Regards, Vinay Sajip -- http://mail.python.org/mailman/listinfo/python-list
Re: Need help with shutils.copytree
On Sun, Feb 12, 2012 at 12:14 PM, Steven W. Orr ste...@syslang.net wrote: I have a 'master' directory and a collection of 'slave' dirs. I want the master to collect all of the stuff in the slave dirs. The slaves all look like this:

.
|-- slaveX
|   `-- archI
|       `-- distJ
|           `-- FILE

Where the different slaveX dirs may contain multiple occurrences of archI and distJ, but across all slaveX dirs, there will only be one *unique* instance of FILE in archI and distJ. Here's an example: Given slave[1234], arch1 and arch2, and dist1 and dist2, I want master to end up looking like this:

.
|-- master
|   `-- arch1
|   |   `-- dist1
|   |   |   `-- FILE
|   `-- arch1
|   |   `-- dist2
|   |   |   `-- FILE
|   `-- arch2
|   |   `-- dist1
|   |   |   `-- FILE
|   `-- arch2
|   |   `-- dist2
|   |   |   `-- FILE

etc... You have multiple directories at the same level in the hierarchy with identical names (e.g. two arch1s), which is invalid. I assume you meant for them to be combined? In bash, I might use cpio passthrough mode and say something like:

master=$path_to_master
for slave in ${slaves}
do
    pushd $slave
    find . -print | cpio -pdum $master
    popd
done

but I'm having a hard time trying to get this functionality in python. (I'm trying to avoid writing a subprocess.) I tried using shutil.copytree with a try / except that does a pass on OSError (which is what gets raised when trying to create a dir that already exists). No joy there. Right; the stack has already been unwound by the time your `except` clause is reached. You just need to recover by instead copying the children of the subtree individually yourself when their parent already exists. For example: master/arch1 already exists? Then copy the slave/arch1/distN-s individually. Or alternately, abandon copytree() entirely: you just LBYL (look before you leap) and check if the parent directories already exist; if not, you try to create the directories yourself; and finally, you copy the individual files. 
[Useful funcs: os.listdir(), os.path.exists(), os.mkdir() / os.makedirs()] I also tried an ignore function that always returns (). That's an effective no-op which doesn't alter copytree()'s behavior whatsoever. Cheers, Chris -- http://rebertia.com -- http://mail.python.org/mailman/listinfo/python-list
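Chris's walk-and-copy suggestion can be sketched like so (a hypothetical helper, not code from the thread): it merges into trees that already exist, which is exactly the case where shutil.copytree refuses.

```python
import os
import shutil

def merge_into(master, slaves):
    """Copy every file under each slave into master, creating
    intermediate directories as needed and merging trees that
    already exist (unlike shutil.copytree, which raises)."""
    for slave in slaves:
        for dirpath, dirnames, filenames in os.walk(slave):
            # Rebuild the slave-relative path under master.
            rel = os.path.relpath(dirpath, slave)
            dest = os.path.join(master, rel)
            if not os.path.isdir(dest):
                os.makedirs(dest)
            for name in filenames:
                shutil.copy2(os.path.join(dirpath, name),
                             os.path.join(dest, name))
```

This mirrors the `find . -print | cpio -pdum $master` loop from the original post: walk each slave, create missing directories, copy files with metadata.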
Re: Numeric root-finding in Python
On 2/12/2012 5:10 AM, Eelco wrote: On Feb 12, 7:41 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: This is only peripherally a Python problem, but in case anyone has any good ideas I'm going to ask it. I have a routine to calculate an approximation of Lambert's W function, and then apply a root-finding technique to improve the approximation. This mostly works well, but sometimes the root-finder gets stuck in a cycle. Here's my function:

import math

def improve(x, w, exp=math.exp):
    """Use Halley's method to improve an estimate of W(x)
    given an initial estimate w."""
    try:
        for i in range(36):  # Max number of iterations.
            ew = exp(w)
            a = w*ew - x
            b = ew*(w + 1)
            err = -a/b  # Estimate of the error in the current w.
            if abs(err) <= 1e-16:
                break
            print '%d: w= %r err= %r' % (i, w, err)
            # Make a better estimate.
            c = (w + 2)*a/(2*w + 2)
            delta = a/(b - c)
            w -= delta
        else:
            raise RuntimeError('calculation failed to converge', err)
    except ZeroDivisionError:
        assert w == -1
    return w

Here's an example where improve() converges very quickly:

py> improve(-0.36, -1.222769842388856)
0: w= -1.222769842388856 err= -2.9158979924038895e-07
1: w= -1.2227701339785069 err= 8.4638038491998997e-16
-1.222770133978506

That's what I expect: convergence in only a few iterations. Here's an example where it gets stuck in a cycle, bouncing back and forth between two values:

py> improve(-0.36787344117144249, -1.0057222396915309)
0: w= -1.0057222396915309 err= 2.6521238905750239e-14
1: w= -1.0057222396915044 err= -2.6521238905872001e-14
2: w= -1.0057222396915309 err= 2.6521238905750239e-14
3: w= -1.0057222396915044 err= -2.6521238905872001e-14
4: w= -1.0057222396915309 err= 2.6521238905750239e-14
...
35: w= -1.0057222396915044 err= -2.6521238905872001e-14
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 19, in improve
RuntimeError: ('calculation failed to converge', -2.6521238905872001e-14)

(The correct value for w is approximately -1.00572223991.) I know that Newton's method is subject to cycles, but I haven't found any discussion about Halley's method and cycles, nor do I know what the best approach for breaking them would be. None of the papers on calculating the Lambert W function that I have found mentions this. Does anyone have any advice for solving this? Looks like floating point issues to me, rather than something intrinsic to the iterative algorithm. Surely there is no complex chaotic behavior to be found in this fairly smooth function in a +/- 1e-14 window. Otoh, there are a lot of floating point significant-bit-loss issues to be suspected in the kind of operations you are performing (exp(x) + something, always a tricky one). To investigate this, I would limit the iterations to 2 or 3 and print ew, a, b, c, and delta, maybe in binary (hex) form. I would start by asking: How accurate is good enough? If it's not good enough, play around with the ordering of your operations, try solving a transformed problem less sensitive to loss of significance; and begin by trying different numeric types to see if the problem is sensitive thereto to begin with. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
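One pragmatic cycle-breaker for Steven's improve() (my own suggestion, not something proposed in the thread): remember the iterates already visited and stop as soon as the iteration revisits one, since a two-cycle at the 1e-14 level means the estimate is already as good as plain floats will allow.

```python
import math

def halley_w(x, w, max_iter=36):
    """Halley iteration for W(x) that stops when it revisits a
    previous iterate (a two-cycle) instead of raising."""
    seen = set()
    for _ in range(max_iter):
        ew = math.exp(w)
        a = w*ew - x
        b = ew*(w + 1)
        if b == 0:
            break
        err = -a/b
        # A realistic tolerance (about 1 ulp near -1), plus the
        # cycle check: floats hash exactly, so revisiting means loop.
        if abs(err) <= 1e-15 or w in seen:
            break
        seen.add(w)
        c = (w + 2)*a/(2*w + 2)
        w -= a/(b - c)
    return w

# The input that made improve() raise now terminates cleanly.
w = halley_w(-0.36787344117144249, -1.0057222396915309)
```

The returned w differs from the true root only in the last couple of bits, which is the best the float-precision residual can certify anyway.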
Re: Python usage numbers
On 2/12/2012 10:13 AM, Roy Smith wrote: Exactly. <soapbox class=wise-old-geezer> ASCII was so successful at becoming a universal standard which lasted for decades, I think you are overstating the universality and length. I used a machine in the 1970s with 60-bit words that could be interpreted as 10 6-bit characters. IBM used EBCDIC at least into the 1980s. The UCLA machine I used had a translator for ascii terminals that connected by modems. I remember discussing the translation table with the man in charge of it. Dedicated wordprocessing machines of the 70s and 80s *had* to use something other than plain ascii, as it is inadequate for business text, as opposed to pure computation and labeled number tables. Whether they used extended ascii or something else, I have no idea. Ascii was, however, as far as I know, the universal basis for the new personal computers starting about 1975, and most importantly, for the IBM PC. But even that actually used its version of extended ascii, as did each wordprocessing program. people who grew up with it don't realize there was once any other way. Not just EBCDIC, but also SIXBIT, RAD-50, tilt/rotate, packed card records, and so on. Transcoding was a way of life, and if you didn't know what you were starting with and aiming for, it was hopeless. But because of the limitation of ascii on a worldwide, as opposed to American, basis, we ended up with 100-200 codings for almost as many character sets. This is because the idea of ascii was applied by each nation or language group individually to their local situation. Kind of like now where we are again with Unicode. </soapbox> The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Let me repeat. Unicode and utf-8 is a solution to the mess, not the cause. 
Perhaps we should have a synonym for utf-8: escii, for Earthian Standard Code for Information Interchange. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy tjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so are UTF-16, UTF-32, and as many more as you could hope for. But broadly yes, Unicode IS the solution. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
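Chris's one-character-set, many-encodings distinction is easy to see by encoding the same string several ways (a plain sketch; the -le variants avoid BOMs so the lengths are exact multiples):

```python
s = "Наӥв"  # four characters, whatever bytes they later become

for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    data = s.encode(enc)
    print(enc, len(data), "bytes")

# utf-8      8 bytes  (2 per Cyrillic character, variable width)
# utf-16-le  8 bytes  (2 per BMP character)
# utf-32-le 16 bytes  (fixed 4 per character)
```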
Re: Python usage numbers
In article mailman.5738.1329084478.27778.python-l...@python.org, Terry Reedy tjre...@udel.edu wrote: Let me repeat. Unicode and utf-8 is a solution to the mess, not the cause. Perhaps we should have a synonym for utf-8: escii, for Earthian Standard Code for Information Interchange. I'm not arguing that Unicode is where we need to get to. Just trying to give a little history. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
In article mailman.5739.1329084873.27778.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy tjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so are UTF-16, UTF-32, and as many more as you could hope for. But broadly yes, Unicode IS the solution. I could hope for one and only one, but I know I'm just going to be disappointed. The last project I worked on used UTF-8 in most places, but also used some C and Java libraries which were only available for UTF-16. So it was transcoding hell all over the place. Hopefully, we will eventually reach the point where storage is so cheap that nobody minds how inefficient UTF-32 is and we all just start using that. Life will be a lot simpler then. No more transcoding, a string will be just as many bytes as it is characters, and everybody will be happy again. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 05:11:30 -0600, Andrew Berg wrote: On 2/12/2012 3:12 AM, Steven D'Aprano wrote: NTFS by default uses the UTF-16 encoding, which means the actual bytes written to disk are \x1d\x040\x04\xe5\x042\x04 (possibly with a leading byte-order mark \xff\xfe). That's what I meant. Those bytes will be interpreted consistently across all locales. Right. But, that's not Unicode, it is an encoding of Unicode. Terminology is important -- if we don't call things by the right names (or at least agreed upon names) how can we communicate? Windows has two separate APIs, one for wide characters, the other for single bytes. Depending on which one you use, the directory will appear to be called Наӥв or 0å2. Yes, and AFAIK, the wide API is the default. The other one only exists to support programs that don't support the wide API (generally, such programs were intended to be used on older platforms that lack that API). I'm not sure that default is the right word, since (as far as I know) both APIs have different spelling and the coder has to make the choice whether to call function X or function Y. Perhaps you mean that Microsoft encourages the wide API and makes the single-byte API available for legacy reasons? But in any case, we're not talking about the file name encoding. We're talking about the contents of files. Okay then. As I stated, this has nothing to do with the OS since programs are free to interpret bytes any way they like. Yes, but my point was that even if the developer thinks he can avoid the problem by staying away from Unicode files coming from Linux and OS-X, he can't avoid dealing with multiple code pages on Windows. You are absolutely correct that this is *not* a cross-platform issue to do with the OS, but some people may think it is. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 02/12/2012 05:27 PM, Roy Smith wrote: In article mailman.5739.1329084873.27778.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy tjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so are UTF-16, UTF-32, and as many more as you could hope for. But broadly yes, Unicode IS the solution. I could hope for one and only one, but I know I'm just going to be disappointed. The last project I worked on used UTF-8 in most places, but also used some C and Java libraries which were only available for UTF-16. So it was transcoding hell all over the place. Hopefully, we will eventually reach the point where storage is so cheap that nobody minds how inefficient UTF-32 is and we all just start using that. Life will be a lot simpler then. No more transcoding, a string will be just as many bytes as it is characters, and everybody will be happy again. Keep your in-memory character strings as Unicode, and only serialize (encode) them when they go to/from a device, or to/from anachronistic code. Then the cost is realized at the point of the problem. No different than when deciding how to serialize any other data type. Do it only at the point of entry/exit of your program. But as long as devices are addressed as bytes, or as anything smaller than 32-bit thingies, you will have encoding issues when writing to the device, and decoding issues when reading. At the very least, you have big-endian/little-endian ways to encode that UCS-4 code point. -- http://mail.python.org/mailman/listinfo/python-list
How do you Unicode proponents type your non-ASCII characters? (was: Python usage numbers)
rusi rustompm...@gmail.com writes: On Feb 12, 10:51 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: Pardon me, but you can't even write *English* in ASCII. You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger™ is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? In GNU+Linux, I run the IBus daemon to manage different keyboard input methods across all my applications consistently. That makes hundreds of language-specific input methods available, and also many that are not language-specific. It's useful if I want to 英語の書面を書き中 type a passage of Japanese with the ‘anthy’ input method, or likewise for any of the other available language-specific input methods. I normally have IBus presenting the ‘rfc1345’ input method. That makes just about all keys input the corresponding character just as if no input method were active. But when I type ‘&’ followed by a two- or three-key sequence, it inputs the corresponding character from the RFC 1345 mnemonics table: &Pd → £, &e' → é, &ae → æ, &o: → ö, &Ct → ¢, &TM → ™, &Co → ©, &"6 → “, &"9 → ” … Those same characters are also available with the ‘latex’ input method, if I'm familiar with LaTeX character entity names. (I'm not.) -- \ “If [a technology company] has confidence in their future | `\ ability to innovate, the importance they place on protecting | _o__) their past innovations really should decline.” —Gary Barnett | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
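Ben's ucs-insert trick (typing a character by its Unicode name) has a direct Python counterpart in the standard unicodedata module:

```python
import unicodedata

# Look up a character by its Unicode name, like Emacs' ucs-insert
# (C-x 8 RET) described above; name() goes the other way.
cent = unicodedata.lookup("CENT SIGN")
print(cent)                        # ¢
print(unicodedata.name("\u00a3"))  # POUND SIGN
```

Handy when you can remember "CENT SIGN" but not which key sequence produces it.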
M2crypto
Hello, M2crypto __init__(self, alg, key, iv, op, key_as_bytes=0, d='md5', salt='12345678', i=1, padding=1) I want to write an app using M2crypto and I cannot understand what the arguments key, iv, op, salt are. What do they do? -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 02/12/2012 05:30 PM, Steven D'Aprano wrote: On Sun, 12 Feb 2012 05:11:30 -0600, Andrew Berg wrote: On 2/12/2012 3:12 AM, Steven D'Aprano wrote: snip Windows has two separate APIs, one for wide characters, the other for single bytes. Depending on which one you use, the directory will appear to be called Наӥв or 0å2. Yes, and AFAIK, the wide API is the default. The other one only exists to support programs that don't support the wide API (generally, such programs were intended to be used on older platforms that lack that API). I'm not sure that default is the right word, since (as far as I know) both APIs have different spelling and the coder has to make the choice whether to call function X or function Y. Perhaps you mean that Microsoft encourages the wide API and makes the single-byte API available for legacy reasons? When I last looked, the pair of functions were equivalently available, and neither one was named the way you'd expect. One had a suffix of A and the other had a suffix of W (guess which was which). C header definitions used #define to define the actual functions, and the preprocessor effectively stuck A's on all of them or W's on all of them. Very bulky, but buried in some MS header files. Other languages were free to use either or both. VB used just the W versions, as I presume java does. But the interesting point was that for most of these functions, the A versions were native on Win95-derived OS'es, while the W versions were native on NT-derived OS's. There were translation DLL's which supplied the secondary versions. So in the old days it was more efficient to use the A versions. No longer true, since as far as I know, nobody that still uses Win ME, Win98, or Win95 is targeted for much new programming. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 12:11:46 -0500, Roy Smith wrote: In article mailman.5730.1329065268.27778.python-l...@python.org, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: On Sun, 12 Feb 2012 10:48:36 -0500, Roy Smith r...@panix.com wrote: As Steven D'Aprano pointed out, it was missing some commonly used US symbols such as ¢ or ©. That's interesting. When I wrote that, it showed on my screen as a cent symbol and a copyright symbol. What I see in your response is an upper case A with a hat accent (circumflex?) over it followed by a cent symbol, and likewise an upper case A with a hat accent over it followed by copyright symbol. Somebody's mail or news reader is either ignoring the message's encoding line, or not inserting an encoding line. Either way, that's a bug. Oh, for the days of ASCII again :-) I look forward to the day, probably around 2525, when everybody uses UTF-32 always. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On Sun, 12 Feb 2012 13:52:48 +, Robert Kern wrote: I don't have any advice for fixing your code, per se, but I would just grab mpmath and use their lambertw function: That's no fun! I'd never seen mpmath before; it looks like it is worth investigating. Nevertheless, I still intend working on my lambert function, as it's a good learning exercise. I did look into SciPy's lambert too, and was put off by this warning in the docs: In some corner cases, lambertw might currently fail to converge http://docs.scipy.org/doc/scipy/reference/generated/ scipy.special.lambertw.html Naturally I thought I can do better than that. Looks like I can't :) -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 08:50:28 -0800, rusi wrote: You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger™ is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? [Note: I grew up on APL so unlike Rick I am genuinely asking :-) ] In my case, I used the KDE application KCharSelect. I manually hunt through the tables for the character I want (which sucks), click on the characters I want, and copy and paste them into my editor. Back in Ancient Days when I ran Mac OS 6, I had memorised many keyboard shortcuts for these things. Option-4 was the pound sign, I believe, and Option-Shift-4 the cent sign. Or perhaps the other way around? -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Generating a .pc file using distutils
Just re-bumping this - I am fiddling with this code again and it's gross, so any input would be greatly appreciated :-) \t On Mon, Jan 23, 2012 at 05:31:20PM -0600, Tycho Andersen wrote: Is there some standard way to generate a .pc file (given a .pc.in or similar) using distutils? If there's not, is there a good way to access whatever the user passes in as --prefix (besides parsing sys.argv yourself)? Thanks, \t -- http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On Sun, 12 Feb 2012 12:18:15 -0800, Mark Dickinson wrote: On Feb 12, 6:41 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: err = -a/b # Estimate of the error in the current w. if abs(err) <= 1e-16: break If the result you're expecting is around -1.005, this exit condition is rather optimistic: the difference between the two Python floats either side of this value is already 2.22e-16, so you're asking for less than half a ulp of error! I was gradually coming to the conclusion on my own that I was being overly optimistic with my error condition, although I couldn't put it into words *why*. Thanks for this Mark, this is exactly the sort of thing I need to learn -- as is obvious, I'm no expert on numeric programming. As to the rest; your error estimate simply doesn't have enough precision. The main problem is in the computation of a, where you're subtracting two almost identical values. The absolute error incurred in computing w*exp(w) is of the same order of magnitude as the difference 'w*exp(w) - x' itself, so err has lost essentially all of its significant bits, and is at best only a crude indicator of the size of the error. The solution would be to compute the quantities 'exp(w), w*exp(w), and w*exp(w) - x' all with extended precision. Other than using Decimal, there's no way to do that in pure Python, is there? We have floats (double) and that's it. For the other quantities, there shouldn't be any major issues---after all, you only need a few significant bits of 'delta' for it to be useful, but with the subtraction that generates a, you don't even get those few significant bits. (The correct value for w is approximately -1.00572223991.) Are you sure? Wolfram Alpha gives me the following value for W(-1, -0.36787344117144249455719773322925902903079986572265625): -1.005722239691522978... I did say *approximately*. The figure I quote comes from my HP-48GX, and seems to be accurate to the precision offered by the HP.
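The cancellation Mark describes is easy to demonstrate. A minimal sketch, using the thread's value of x and a w close to the root; the Decimal recomputation (one of the extended-precision options mentioned above) shows how small the true residual is:

```python
import math
from decimal import Decimal, getcontext

# The thread's x, and a w very close to the root W(-1, x).
x = -0.36787344117144249455719773322925902903079986572265625
w = -1.005722239691523

# In doubles, w*exp(w) and x agree in nearly every bit, so the
# subtraction cancels almost all significant digits of the residual.
a_float = w * math.exp(w) - x

# Recompute to 50 digits; Decimal(w) and Decimal(x) are exact
# conversions of the doubles, so this shows the true residual.
getcontext().prec = 50
wd = Decimal(w)
a_exact = wd * wd.exp() - Decimal(x)

print(a_float, a_exact)
```

Both residuals are tiny, but the double-precision one carries essentially no correct significant digits; the Decimal one does, which is exactly why a working error estimate needs the extended-precision route.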
-- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Sun, 12 Feb 2012 17:27:34 -0500, Roy Smith wrote: In article mailman.5739.1329084873.27778.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy tjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so are UTF-16, UTF-32, and as many more as you could hope for. But broadly yes, Unicode IS the solution. I could hope for one and only one, but I know I'm just going to be disappointed. The last project I worked on used UTF-8 in most places, but also used some C and Java libraries which were only available for UTF-16. So it was transcoding hell all over the place. Um, surely the solution to that is to always call a simple wrapper function to the UTF-16 code to handle the transcoding? What do the Design Patterns people call it, a facade? No, an adapter. (I never remember the names...) Instead of calling library.foo() which only outputs UTF-16, write a wrapper myfoo() which calls foo, captures its output and transcodes to UTF-8. You have to do that once (per function), but now it works from everywhere, so long as you remember to always call myfoo instead of foo. Hopefully, we will eventually reach the point where storage is so cheap that nobody minds how inefficient UTF-32 is and we all just start using that. Life will be a lot simpler then. No more transcoding, a string will be just as many bytes as it is characters, and everybody will be happy again. I think you mean 4 times as many bytes as characters. Unless you have 32 bit bytes :) -- Steven -- http://mail.python.org/mailman/listinfo/python-list
French and IDLE on Windows (was Re: Python usage numbers)
On 2/12/2012 2:52 PM, jmfauth wrote: Python popularity? I have no popularity-meter. What I know: I can not type French text in IDLE on Windows. It is like this since ~ten years and I never saw any complain about this. I am pretty sure others have managed to. tk and hence idle handle the entire BMP subset of unicode just fine once they get them. Except for the apple version, which has just been fixed so French entry should work. Showing characters on the screen requires an appropriate font. http://bugs.python.org/issue4281 was the result of a font problem. Neither have I, except for the issue above I just found. So there is nothing obvious to fix. If you have a problem, give the specifics here and let's see if someone has a solution. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Generating a .pc file using distutils
On 02/12/2012 06:04 PM, Tycho Andersen wrote: Just re-bumping this - I am fiddling with this code again and it's gross, so any input would be greatly appreciated :-) \t On Mon, Jan 23, 2012 at 05:31:20PM -0600, Tycho Andersen wrote: Is there some standard way to generate a .pc file (given a .pc.in or similar) using distutils? If there's not, is there a good way to access whatever the user passes in as --prefix (besides parsing sys.argv yourself)? Thanks, \t Bumping a message (especially using top-posting) seldom does much good unless you also supply some more information to either catch people's attention, or even better, remind them of something they know that might apply. So you could have said: A .pc file is lijfds;lkjds;fdsjfds;ljfds;ljfds;ljfd and I need to produce it in the slf;lfdsjfds;l;lkjfds;lj circumstances. Or, a .pc file is described on the wiki page at link http://www.sljfds.slijfdsj.unknown Or even I tried to get more information on the comp.lang.pc newsgroup, but nobody there will give me the time of day. As it is, the only thing I could do is point you to the only other keyword in your message: Try on a support forum for distutils. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
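For what it's worth, whatever distutils hook ends up driving it, the .pc.in-to-.pc substitution step itself is plain templating. A sketch using string.Template, with hypothetical placeholder and package names (nothing here is a distutils API):

```python
from string import Template

# Hypothetical mylib.pc.in contents; ${}-style placeholders are a
# project convention here, not anything distutils mandates.
PC_IN = """\
prefix=${prefix}
libdir=${prefix}/lib
includedir=${prefix}/include

Name: mylib
Description: example library
Version: ${version}
Libs: -L${libdir} -lmylib
Cflags: -I${includedir}
"""

def render_pc(template_text, prefix, version):
    # safe_substitute leaves unknown ${...} fields untouched, so
    # ${libdir} in the Libs: line survives as a pkg-config variable
    # for pkg-config itself to expand later.
    return Template(template_text).safe_substitute(prefix=prefix,
                                                   version=version)

print(render_pc(PC_IN, "/usr/local", "1.0"))
```

The remaining question (getting the user's --prefix out of distutils) is separate; however it is obtained, the generation step stays this simple.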
Re: Python usage numbers
In article 4f384b6e$0$29986$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: I could hope for one and only one, but I know I'm just going to be disappointed. The last project I worked on used UTF-8 in most places, but also used some C and Java libraries which were only available for UTF-16. So it was transcoding hell all over the place. Um, surely the solution to that is to always call a simple wrapper function to the UTF-16 code to handle the transcoding? What do the Design Patterns people call it, a facade? No, an adapter. (I never remember the names...) I am familiar with the concept. It was ICU. A very big library. Lots of calls. I don't remember the details, I'm sure we wrote wrappers. It was still a mess. Hopefully, we will eventually reach the point where storage is so cheap that nobody minds how inefficient UTF-32 is and we all just start using that. Life will be a lot simpler then. No more transcoding, a string will be just as many bytes as it is characters, and everybody will be happy again. I think you mean 4 times as many bytes as characters. Unless you have 32 bit bytes :) Yes, exactly. -- http://mail.python.org/mailman/listinfo/python-list
Re: Numeric root-finding in Python
On 02/12/2012 06:05 PM, Steven D'Aprano wrote: On Sun, 12 Feb 2012 12:18:15 -0800, Mark Dickinson wrote: On Feb 12, 6:41 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: err = -a/b # Estimate of the error in the current w. if abs(err) <= 1e-16: break If the result you're expecting is around -1.005, this exit condition is rather optimistic: the difference between the two Python floats either side of this value is already 2.22e-16, so you're asking for less than half a ulp of error! I was gradually coming to the conclusion on my own that I was being overly optimistic with my error condition, although I couldn't put it into words *why*. Thanks for this Mark, this is exactly the sort of thing I need to learn -- as is obvious, I'm no expert on numeric programming. Me either. But comments below. As to the rest; your error estimate simply doesn't have enough precision. The main problem is in the computation of a, where you're subtracting two almost identical values. SNIP SNIP Two pieces of my history that come to mind. 40+ years ago I got a letter from a user of our computer stating that our math seemed to be imprecise in certain places. He was very polite about it, and admitted to maybe needing a different algorithm. The letter was so polite that I (as author of the math microcode) worked on his problem, and found the difficulty, as well as a solution. The problem was figuring out the difference in a machining table between being level every place on its surface (in which case it would be slightly curved to match the earth), or being perfectly flat (in which case some parts of the table would be further from the earth's center than others). The table was 200 feet long, and we were talking millionths of an inch. He solved it three ways, and got three different answers. The first two differed in the 3rd place, which he thought far too big an error, and the third answer was just about exactly half the others.
Well the 2:1 discrepancy just happens when you change your assumption of what part of the flat table is level. If the center is level, then the edges are only 100 feet out, while if the edge is level, the other edge is 200 feet out. But the other solution was very interesting. Turns out he sketched a right triangle, with narrow angle at the center of the earth, side opposite being 200 feet. He then calculated the difference between the other two sides, one 8000 miles, and the other 8000 miles plus a few microinches. He got that distance by subtracting the sine from the tangent, or something similar to that. I had microcoded both those functions, and was proud of their accuracy. But if you subtract two 13-digit numbers that only differ in the last 3, you only get 3 digits worth of accuracy, best case. Solution was to apply some similar triangles, and some trivial approximations, and the problem turned out not to need trig at all, and was accurate to at least 12 places. If I recall, it was something like: 8000 mi is to 200 feet, as 200 feet is to X. Cross multiply and it's just arithmetic. The other problem was even earlier. It was high school physics, and the challenge was to experimentally determine the index of refraction of air to 5 places. Problem is our measurements can't be that accurate. So this is the same thing in reverse. Find a way to measure the difference of the index of refraction of air and vacuum, to one or two places, and add that to 1. Taken together with lots of other experience, I try to avoid committing an algorithm to code before thinking about errors, convergence, and exceptional conditions. I've no experience with Lambert, but I suspect it can be attacked similarly. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
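Dave's similar-triangles shortcut is plain arithmetic, and it can be checked in a few lines. The figures below follow the story's round numbers (a 200-foot table, an 8000-mile diameter), so they are illustrative only:

```python
import math

# Similar triangles: D : L = L : X, so X = L*L / D -- no trig,
# no cancellation.
FEET_PER_MILE = 5280.0
L = 200.0                     # table length, feet
D = 8000.0 * FEET_PER_MILE    # roughly the earth's diameter, feet

x_simple = L * L / D          # drop at the far end, feet
microinches = x_simple * 12.0 * 1e6

# The trig route subtracts two nearly equal quantities:
# cos(theta) is within ~4.5e-11 of 1.0, so most of the significant
# bits vanish in the subtraction below.
R = D / 2.0
theta = L / R
x_trig = R * (1.0 / math.cos(theta) - 1.0)

print(microinches)  # roughly eleven thousand millionths of an inch
print(x_trig)
```

Both routes agree here, but the arithmetic one keeps full precision while the trig one lives off whatever bits survive the cancellation.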
Re: Python usage numbers
On 12.02.2012 23:07, Terry Reedy wrote: But because of the limitation of ascii on a worldwide, as opposed to American basis, we ended up with 100-200 codings for almost as many character sets. This is because the idea of ascii was applied by each nation or language group individually to their local situation. You really learn to appreciate unicode when you have to deal with mixed languages in texts and old databases from the '70s and '80s. I'm working with books that contain medieval German, old German, modern German, English, French, Latin, Hebrew, Arabic, ancient and modern Greek, Rhaeto-Romanic, East European and more languages. Sometimes three or four languages are used in a single book. Some books are more than 700 years old and contain glyphs that aren't covered by unicode yet. Without unicode it would be virtually impossible to deal with it. Metadata for these books come from old and proprietary databases and are stored in a format that is optimized for magnetic tape. Most people will never have heard about ISO-5426 or ANSEL encoding or about file formats like MAB2, MARC or PICA. It took me quite some time to develop codecs to encode and decode old and partly undocumented variable multibyte encodings that predate UTF-8 by about a decade. Of course every system interprets the undocumented parts slightly differently ... Unicode and XML are bliss for metadata exchange and long term storage! -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 02/12/2012 06:29 PM, Steven D'Aprano wrote: On Sun, 12 Feb 2012 17:27:34 -0500, Roy Smith wrote: SNIP Hopefully, we will eventually reach the point where storage is so cheap that nobody minds how inefficient UTF-32 is and we all just start using that. Life will be a lot simpler then. No more transcoding, a string will be just as many bytes as it is characters, and everybody will be happy again. I think you mean 4 times as many bytes as characters. Unless you have 32 bit bytes :) Until you have 32 bit bytes, you'll continue to have encodings, even if only a couple of them. -- DaveA -- http://mail.python.org/mailman/listinfo/python-list
Re: M2crypto
zigi wrote: Hello, M2crypto __init__(self, alg, key, iv, op, key_as_bytes=0, d='md5', salt='12345678', i=1, padding=1) I want to write an app using M2crypto and I cannot understand what the arguments key, iv, op, salt are. What do they do? I assume you're reading http://www.heikkitoivonen.net/m2crypto/api/ about M2Crypto.EVP.Cipher. Epydoc claims another victim. I'm having a lot of trouble finding documentation. The obvious OpenSSL pages are kind of thin, too. You might see some useful code in the EVP unit tests m2crypto/tests/test_evp.py in the m2crypto installation. Good hunting, Mel. -- http://mail.python.org/mailman/listinfo/python-list
Re: M2crypto
On Sun, Feb 12, 2012 at 4:00 PM, Mel Wilson mwil...@the-wire.com wrote: zigi wrote: Hello, M2crypto __init__(self, alg, key, iv, op, key_as_bytes=0, d='md5', salt='12345678', i=1, padding=1) I want to write an app using M2crypto and I cannot understand what the arguments key, iv, op, salt are. What do they do? I assume you're reading http://www.heikkitoivonen.net/m2crypto/api/ about M2Crypto.EVP.Cipher. Epydoc claims another victim. I'm having a lot of trouble finding documentation. The obvious OpenSSL pages are kind of thin, too. You might see some useful code in the EVP unit tests m2crypto/tests/test_evp.py in the m2crypto installation. Not intending to be rude, but being perfectly serious: as a general rule, if you don't know what an IV is you're probably getting yourself into a lot of trouble working with low-level crypto libraries. Two suggestions: 1. Describe what you're trying to do - I'll be able to help more if I know what you're actually going for. 2. Try keyczar. It's not perfect, but it's a lot easier to get right. Geremy Condra -- http://mail.python.org/mailman/listinfo/python-list
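As far as those arguments go: key is the secret key, iv the initialization vector, and op selects encrypt (1) or decrypt (0); as I read the API, key_as_bytes, d, salt and i control an optional OpenSSL EVP_BytesToKey-style derivation of the key from a passphrase (check the m2crypto source to confirm which value of key_as_bytes enables it). A rough stdlib-only sketch of that derivation idea, using only hashlib; an illustration, not M2Crypto's actual code path:

```python
import hashlib

def bytes_to_key(password, salt, key_len=16, iv_len=16,
                 hash_name="md5", count=1):
    """Rough sketch of OpenSSL's EVP_BytesToKey: hash chains of
    previous-digest || password || salt until enough bytes exist."""
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = block + password + salt
        for _ in range(count):   # 'i' in the Cipher signature
            block = hashlib.new(hash_name, block).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]

key, iv = bytes_to_key(b"my passphrase", b"12345678")
print(key.hex(), iv.hex())
```

The salt and the iteration count exist so that the same passphrase yields different keys per message and so that brute-forcing is slower; the d argument names the digest used in the chain.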
Re: Python usage numbers
On Mon, Feb 13, 2012 at 11:03 AM, Dave Angel d...@davea.name wrote: On 02/12/2012 06:29 PM, Steven D'Aprano wrote: I think you mean 4 times as many bytes as characters. Unless you have 32 bit bytes :) Until you have 32 bit bytes, you'll continue to have encodings, even if only a couple of them. The advantage, though, is that you can always know how many bytes to read for X characters. In ASCII, you allocate 80 bytes of storage and you can store 80 characters. In UTF-8, if you want an 80-character buffer, you can probably get away with allocating 240 bytes... but maybe not. In UTF-32, it's easy - just allocate 320 bytes and you know you can store them. Also, you know exactly where the 17th character is; in UTF-8, you have to count. That's a huge advantage for in-memory strings; but is it useful on disk, where (as likely as not) you're actually looking for lines, which you still have to scan for? I'm thinking not, so it makes sense to use a smaller disk image than UTF-32 - fewer total bytes means fewer sectors to read/write, which translates fairly directly into performance. ChrisA -- http://mail.python.org/mailman/listinfo/python-list
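Chris's buffer arithmetic is easy to check, and the "...but maybe not" is exactly the non-BMP case:

```python
# Worst-case byte counts for an "80-character buffer".
ascii_text = "x" * 80
bmp_text = "\u4e2d" * 80          # a CJK character: 3 bytes each in UTF-8
astral_text = "\U0001F600" * 80   # beyond the BMP: 4 bytes each in UTF-8

for s in (ascii_text, bmp_text, astral_text):
    # chars, UTF-8 bytes, UTF-32 bytes
    print(len(s), len(s.encode("utf-8")), len(s.encode("utf-32-be")))
```

UTF-8 needs 80, 240 or 320 bytes depending on the text; UTF-32 is always exactly 320, which is the predictability being argued for.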
Re: Python usage numbers
In article mailman.5750.1329094801.27778.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: The advantage, though, is that you can always know how many bytes to read for X characters. In ASCII, you allocate 80 bytes of storage and you can store 80 characters. In UTF-8, if you want an 80-character buffer, you can probably get away with allocating 240 characters... but maybe not. In UTF-32, it's easy - just allocate 320 bytes and you know you can store them. Also, you know exactly where the 17th character is; in UTF-8, you have to count. That's a huge advantage for in-memory strings; but is it useful on disk, where (as likely as not) you're actually looking for lines, which you still have to scan for? I'm thinking not, so it makes sense to use a smaller disk image than UTF-32 - less total bytes means less sectors to read/write, which translates fairly directly into performance. You might just write files compressed. My guess is that a typical gzipped UTF-32 text file will be smaller than the same data stored as uncompressed UTF-8. -- http://mail.python.org/mailman/listinfo/python-list
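Roy's guess is easy to try with zlib (gzip's underlying compressor). The sample text below is deliberately repetitive, so it compresses far better than real prose would; this only illustrates the direction of the effect, not a general law:

```python
import zlib

# Repetitive ASCII-range sample text.
text = ("It was a bright cold day in April, and the clocks "
        "were striking thirteen. ") * 200

utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")   # 4 bytes per character, no BOM

# raw UTF-8, raw UTF-32, compressed UTF-8, compressed UTF-32
print(len(utf8), len(utf32),
      len(zlib.compress(utf8)), len(zlib.compress(utf32)))
```

For text like this, the compressed UTF-32 comes out far smaller than the raw UTF-8, since three of every four UTF-32 bytes are zeros and the compressor eats them for breakfast.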
entering unicode (was Python usage numbers)
On Feb 12, 10:36 pm, Nick Dokos nicholas.do...@hp.com wrote: rusi rustompm...@gmail.com wrote: On Feb 12, 10:51 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sun, 12 Feb 2012 15:38:37 +1100, Chris Angelico wrote: Everything that displays text to a human needs to translate bytes into glyphs, and the usual way to do this conceptually is to go via characters. Pretending that it's all the same thing really means pretending that one byte represents one character and that each character is depicted by one glyph. And that's doomed to failure, unless everyone speaks English with no foreign symbols - so, no mathematical notations. Pardon me, but you can't even write *English* in ASCII. You can't say that it cost you £10 to courier your résumé to the head office of Encyclopædia Britanica to apply for the position of Staff Coördinator. (Admittedly, the umlaut on the second o looks a bit stuffy and old-fashioned, but it is traditional English.) Hell, you can't even write in *American*: you can't say that the recipe for the 20¢ WobblyBurger™ is © 2012 WobblyBurgerWorld Inc. [Quite OT but...] How do you type all this? [Note: I grew up on APL so unlike Rick I am genuinely asking :-) ] [Emacs specific] Many different ways of course, but in emacs, you can select e.g. the TeX input method with C-x RET C-\ TeX RET, which does all of the above symbols with the exception of the cent symbol (or maybe I missed it) - you type the thing in the first column and you get the thing in the second column: \pounds → £, \'e → é, \ae → æ, \"o → ö, ^{TM} → ™, \copyright → © I gave up on the cent symbol and used ucs-insert (C-x 8 RET) which allows you to type a name, in this case CENT SIGN, to get ¢. Nick [OT warning] I asked this on the emacs list: No response there and the responses here are more helpful so asking here. My question there was emacs-specific. If there is some other app, that's fine. I have some bunch of sanskrit (devanagari) to type.
It would be easiest for me if I could have the English (roman) as well as the sanskrit (devanagari). For example using the devanagari-itrans input method I can write the gayatri mantra using OM bhUrbhuvaH suvaH tatsaviturvarenyam bhargo devasya dhImahi dhiyo yonaH prachodayAt and emacs produces *on the fly* (i.e. I can't see/edit the above) ॐ भूर्भुवः सुवः तत्सवितुर्वरेण्यम् भर्गो देवस्य धीमहि धियो योनः प्रचोदयात् Can I do it in batch mode? i.e. write the first in a file and run some command on it to produce the second? -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On 2/12/2012 5:14 PM, Chris Angelico wrote: On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy tjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encodings and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather than local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, The Unicode Standard specifies 3 UTF storage formats* and 8 UTF byte-oriented transmission formats. UTF-8 is the most common of all encodings for web pages. (And ascii pages are utf-8 also.) It is the only one of the 8 that most of us need to bother with much. Look here for the list http://www.unicode.org/glossary/#U and for details look in various places in http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf but so are UTF-16, UTF-32, and as many more as you could hope for. All the non-UTF 'as many more as you could hope for' encodings are not part of Unicode. * The new internal unicode scheme for 3.3 is pretty much a mixture of the 3 storage formats (I am of course, skipping some details) by using the widest one needed for each string. The advantage is avoiding problems with each of the three. The disadvantage is greater internal complexity, but that should be hidden from users. They will not need to care about the internals. They will be able to forget about 'narrow' versus 'wide' builds and the possible requirement to code differently for each. There will only be one scheme that works the same on all platforms. Most apps should require less space and about the same time. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
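On a CPython with the new scheme Terry describes (3.3 and later, PEP 393), the per-string width selection is visible from sys.getsizeof:

```python
import sys

# Same character count, three internal widths (PEP 393, CPython 3.3+):
# 1 byte/char for latin-1-range text, 2 for BMP text, 4 beyond the BMP.
n = 1000
narrow = "a" * n
bmp = "\u0394" * n          # GREEK CAPITAL LETTER DELTA
astral = "\U0001F600" * n   # outside the BMP

print(sys.getsizeof(narrow), sys.getsizeof(bmp), sys.getsizeof(astral))
```

Each string pays only for the widest character it actually contains, which is precisely the space saving over a fixed wide build.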
Python vs. C++11
There are big similarities between Python and the new C++ standard. Now we can actually use our experience as Python programmers to write fantastic C++ :-) Here is a small list of similarities to consider:

Iterate over any container, like Python's for loop: for (type item : container)
Pointer type with reference counting: std::shared_ptr

Python-like datatypes:
  tuple    std::tuple
  list     std::vector, std::list, std::stack
  dict     std::unordered_map
  set      std::unordered_set
  complex  std::complex
  deque    std::deque
  lambda   [name](params){body}
  heapq    std::heap
  weakref  weak_ptr
  str      std::string -- unicode, raw strings, etc. work as in Python

Other things of interest:
  std::regex, std::cmatch
  std::thread -- thread API very similar to Python's
  std::atomic -- datatype for atomic operations
  std::mt19937 -- same PRNG as Python

-- http://mail.python.org/mailman/listinfo/python-list
Re: Python vs. C++11
On Feb 13, 4:21 am, sturlamolden sturlamol...@yahoo.no wrote: There are big similarities between Python and the new C++ standard. Now we can actually use our experience as Python programmers to write fantastic C++ :-) And of course the keyword 'auto', which means automatic type inference. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
In article mailman.5752.1329102603.27778.python-l...@python.org, Terry Reedy tjre...@udel.edu wrote: On 2/12/2012 5:14 PM, Chris Angelico wrote: On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedytjre...@udel.edu wrote: The situation before ascii is like where we ended up *before* unicode. Unicode aims to replace all those byte encoding and character sets with *one* byte encoding for *one* character set, which will be a great simplification. It is the idea of ascii applied on a global rather that local basis. Unicode doesn't deal with byte encodings; UTF-8 is an encoding, The Unicode Standard specifies 3 UTF storage formats* and 8 UTF byte-oriented transmission formats. UTF-8 is the most common of all encodings for web pages. (And ascii pages are utf-8 also.) It is the only one of the 8 most of us need to much bother with. Look here for the list http://www.unicode.org/glossary/#U and for details look in various places in http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf but so are UTF-16, UTF-32. and as many more as you could hope for. All the non-UTF 'as many more as you could hope for' encodings are not part of Unicode. * The new internal unicode scheme for 3.3 is pretty much a mixture of the 3 storage formats (I am of course, skipping some details) by using the widest one needed for each string. The advantage is avoiding problems with each of the three. The disadvantage is greater internal complexity, but that should be hidden from users. They will not need to care about the internals. They will be able to forget about 'narrow' versus 'wide' builds and the possible requirement to code differently for each. There will only be one scheme that works the same on all platforms. Most apps should require less space and about the same time. All that is just fine, but what the heck are we going to do about ascii art, that's what I want to know. Python just won't be the same in UTF-8. 
[ASCII art of a sea serpent, mangled in transit - jurcy -] -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Monday, February 13, 2012 3:13:17 AM UTC+7, Vinay Sajip wrote: On Feb 12, 3:35 pm, Anh Hai Trinh anh.hai.tr...@gmail.com wrote: I think most users like to use Python, or they'd use Bash. I think people prefer not another language that is different from both, and having little benefits. My own opinion of course. I have looked at pbs and clom: they Pythonify calls to external programs by making spawning those look like function calls. There's nothing wrong with that, it's just a matter of taste. I find that e.g. wc(ls("/etc", "-1"), "-l") is not as readable as call("ls /etc -1 | wc -l") I don't disagree with it. But the solution is really easy, just call 'sh' and pass it a string!

from extproc import sh
n = int(sh("ls /etc -1 | wc -l"))

No parser needs to be written! Yes there is a danger of argument parsing and globs and all that. But people are aware of it. With string parsing, ambiguity is always there. Even when you have a BNF grammar, people easily make mistakes. -- http://mail.python.org/mailman/listinfo/python-list
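For reference, the same pipeline can be run with nothing but the standard library. This is a sketch (not extproc's or sarge's API) that chains subprocess.Popen objects so no shell is involved and no string parsing is needed:

```python
import subprocess

def run_pipeline(*commands):
    """Chain argument-vector commands with pipes, like "cmd1 | cmd2 | ..."."""
    procs = []
    prev_stdout = None
    for cmd in commands:
        p = subprocess.Popen(cmd, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            prev_stdout.close()  # let upstream get SIGPIPE if downstream exits
        prev_stdout = p.stdout
        procs.append(p)
    out, _ = procs[-1].communicate()
    for p in procs[:-1]:
        p.wait()  # reap the earlier processes
    return out

# Equivalent of: ls /etc -1 | wc -l
count = int(run_pipeline(["ls", "-1", "/etc"], ["wc", "-l"]))
```

Because each command is an argument list, globbing and quoting ambiguities never arise; the trade-off is exactly the verbosity the thread is arguing about.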
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Feb 12, 9:35 am, Anh Hai Trinh anh.hai.tr...@gmail.com wrote: It's not hard for the user I think most users like to use Python, or they'd use Bash. I think people prefer not another language that is different from both, and having little benefits. My own opinion of course. Objection! Does the defense REALLY expect this court to believe that he can testify as to how MOST members of the Python community would or would not favor bash over Python? And IF they do in fact prefer bash, is this display of haughty arrogance nothing more than a hastily stuffed straw-man presented to protect his own ego? BTW extproc is nice, but I wanted to push the envelope a little :-) Hmm, if the extra envelope is the async code with threads that may deadlock, I would say thanks but no thanks :p And why do you need to voice such strong opinions of disdain in an announcement thread? Testing the integrity of a module (or module author) is fine so long as we are respectful whilst doing so. However, i must take exception with your crass attitude. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
Roy Smith r...@panix.com writes: All that is just fine, but what the heck are we going to do about ascii art, that's what I want to know. Python just won't be the same in UTF-8. If it helps, ASCII art *is* UTF-8 art. So it will be the same in UTF-8. Or maybe you already knew that, and your sarcasm was lost with the high bit. -- \ “We are all agreed that your theory is crazy. The question that | `\ divides us is whether it is crazy enough to have a chance of | _o__)being correct.” —Niels Bohr (to Wolfgang Pauli), 1958 | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
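Ben's point is easy to verify: every ASCII byte sequence decodes identically under ASCII, Latin-1, and UTF-8, because UTF-8 encodes U+0000 through U+007F as the identical single bytes:

```python
# All 128 ASCII code points survive a round-trip through UTF-8 unchanged,
# so any ASCII-art file is already a valid UTF-8 file.
data = bytes(range(128))
assert data.decode("ascii") == data.decode("utf-8") == data.decode("latin-1")
assert data.decode("utf-8").encode("utf-8") == data
print("ASCII art is UTF-8 art")
```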
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
On Feb 12, 2:13 pm, Vinay Sajip vinay_sa...@yahoo.co.uk wrote: wc(ls("/etc", "-1"), "-l") is not as readable as call("ls /etc -1 | wc -l") And i agree! I remember a case where i was forced to use an idiotic API for creating inputbox dialogs. Something like this:

prompts = ['Height', 'Width', 'Color']
values = [10, 20, Null]
options = [Null, Null, Red|White|Blue]
dlg(prompts, values, options)

...and as you can see this is truly asinine! Later, someone slightly more intelligent wrapped this interface up like this:

dlg = Ipb("Title")
dlg.add("Height")
dlg.add("Width", 39)
dlg.add("Color", ["Red", "White", "Blue"])
dlg.show()

...and whilst i prefer this interface over the original, i knew we could make it better; because we had the technology!

dlg = Ipb(
    "Title",
    Height=10,
    Width=20,
    Color=Red|Green|Blue,
)

Ahh... refreshing as a cold brew! -- http://mail.python.org/mailman/listinfo/python-list
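The keyword-argument style above can be sketched without any GUI toolkit. Ipb and its field semantics here are hypothetical (an int means a default value, a list means an option menu), purely to show why kwargs read well for this kind of builder:

```python
# GUI-free toy of the keyword-argument dialog builder argued for above.
# "Ipb" is a hypothetical name, not a real widget library.
class Ipb:
    def __init__(self, title, **fields):
        self.title = title
        self.fields = dict(fields)  # kwargs preserve declaration order

    def show(self):
        # A real implementation would build widgets; we just report the spec.
        print(self.title)
        for name, spec in self.fields.items():
            print("  %s: %r" % (name, spec))

dlg = Ipb("Title", Height=10, Width=20, Color=["Red", "Green", "Blue"])
dlg.show()
```

The win over the parallel-lists API is that each field's prompt, default, and options travel together, and the call site is self-describing.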
Re: Python usage numbers
On Feb 12, 12:10 am, Steven D'Aprano steve +comp.lang.pyt...@pearwood.info wrote: On Sat, 11 Feb 2012 18:36:52 -0800, Rick Johnson wrote: I have a file containing text. I can open it in an editor and see it's nearly all ASCII text, except for a few weird and bizarre characters like £ © ± or ö. In Python 2, I can read that file fine. In Python 3 I get an error. What should I do that requires no thought? Obvious answers: the most obvious answer would be to read the file WITHOUT worrying about asinine encoding. Your mad leet reading comprehension skillz leave me in awe Rick. Same goes for your abstract reasoning skillz! You're attempting to treat the problem, whilst ignoring the elephant in the room -- THE DISEASE!. Do you think that cost of healthcare is the problem? Do you think the cost of healthcare insurance is the problem? NO! The problem is people expect entitlements. If you can't afford healthcare, then you die. If you can't afford food, then you starve. If you can't afford prophylactics, then you will be sentenced to eighteen years of hell! Maybe a charity will help you, or maybe a friend, or maybe a neighbor. If not, then you suffer the fatal exception. Life sucks, deal with it! You want to solve the healthcare problem then STOP TREATING PEOPLE WHO DON'T HAVE INSURANCE! Problem solved! You are only born with one guarantee; you will die, guaranteed! Any questions? The problem with bytes is not encodings or OS's. Can you guess what the REAL problem is? ..take all the time you need. -- http://mail.python.org/mailman/listinfo/python-list
Re: Python usage numbers
On Mon, Feb 13, 2012 at 3:48 PM, Rick Johnson rantingrickjohn...@gmail.com wrote: The problem with bytes is not encodings or OS's. Can you guess what the REAL problem is? ..take all the time you need. The REAL problem is trolls. But they're such fun, and so cute when they get ranting... ChrisA -- http://mail.python.org/mailman/listinfo/python-list
how to tell a method is classmethod or static method or instance method
How can I tell whether a method is a class method, a static method, or an instance method? -- http://mail.python.org/mailman/listinfo/python-list
Re: ANN: Sarge, a library wrapping the subprocess module, has been released.
Objection! Does the defense REALLY expect this court to believe that he can testify as to how MOST members of the Python community would or would not favor bash over Python? And IF they do in fact prefer bash, is this display of haughty arrogance nothing more than a hastily stuffed straw-man presented to protect his own ego? Double objection! Relevance. The point is that the OP created another language that is neither Python nor Bash. And why do you need to voice such strong opinions of disdain in an announcement thread? Testing the integrity of a module (or module) author is fine so long as we are respectful whilst doing so. However, i must take exception with your crass attitude. My respectful opinion is that the OP's approach is fundamentally flawed. There are many platform-specific issues when forking and threading are fused. My benign intent was to warn others about unsolved problems and scratching-your-head situations. Obviously, the OP can always choose to continue his direction at his own discretion. -- http://mail.python.org/mailman/listinfo/python-list
Re: how to tell a method is classmethod or static method or instance method
On 13Feb2012 15:59, Zheng Li dllizh...@gmail.com wrote: | how to tell a method is class method or static method or instance method? Maybe a better question is: under what circumstances do you need to figure this out? I'm actually quite serious here. Please outline what circumstances cause you to want to ask and answer this question. Cheers, -- Cameron Simpson c...@zip.com.au DoD#743 http://www.cskk.ezoshosting.com/cs/ Reason #173 to fear technology: [ASCII art of dancing figures] Mr. email does the Macarena. -- http://mail.python.org/mailman/listinfo/python-list
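For what it's worth, the mechanical answer to the quoted question is to look the attribute up on the class *without* triggering the descriptor protocol, then check the wrapper's type. inspect.getattr_static (Python 3.2+) does exactly that:

```python
import inspect

class C:
    def meth(self):
        pass

    @classmethod
    def cmeth(cls):
        pass

    @staticmethod
    def smeth():
        pass

def kind_of(cls, name):
    # getattr_static bypasses the descriptor protocol, so we see the raw
    # classmethod/staticmethod wrapper instead of an already-bound method.
    raw = inspect.getattr_static(cls, name)
    if isinstance(raw, classmethod):
        return "class method"
    if isinstance(raw, staticmethod):
        return "static method"
    return "instance method"

print(kind_of(C, "meth"))    # instance method
print(kind_of(C, "cmeth"))   # class method
print(kind_of(C, "smeth"))   # static method
```

A plain getattr would not work here: by the time you hold C.cmeth it is already a bound method, and the classmethod wrapper is gone.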
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Changes by Ezio Melotti ezio.melo...@gmail.com: -- components: +Unicode nosy: +ezio.melotti stage: - needs patch type: - enhancement ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4966] Improving Lib Doc Sequence Types Section
Nick Coghlan ncogh...@gmail.com added the comment: Just noting that this has slipped a bit down my Python to-do list (there are other things I want to focus on before the first 3.3 alpha). I'll get back to it at some point, but if someone want to take my branch and run with it in the meantime, please feel free. -- assignee: ncoghlan - ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4966 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12627] Implement PEP 394: The python Command on Unix-Like Systems
Changes by Nick Coghlan ncogh...@gmail.com: -- nosy: +benjamin.peterson ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12627 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13998] Lookbehind assertions go behind the start position for the match
New submission from Devin Jeanpierre jeanpierr...@gmail.com: compiled regex objects' match method offers an optional pos parameter described to be roughly equivalent to slicing except for how it treats the ^ operation. See http://docs.python.org/library/re.html#re.RegexObject.search However, the behavior of lookbehind assertions also differs:

>>> re.compile("(?<=a)b").match("ab", 1)
<_sre.SRE_Match object at 0x...>
>>> re.compile("(?<=a)b").match("ab"[1:])

This alone might be a documentation bug, but the behavior is also inconsistent with the behavior of lookahead assertions, which do *not* look past the endpos:

>>> re.compile("a(?=b)").match("ab", 0, 1)
>>> re.compile("a(?=b)").match("ab")
<_sre.SRE_Match object at 0x...>

-- components: Regular Expressions messages: 153188 nosy: Devin Jeanpierre, ezio.melotti priority: normal severity: normal status: open title: Lookbehind assertions go behind the start position for the match type: behavior versions: Python 2.7, Python 3.1, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13998 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
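The asymmetry the report describes is easy to reproduce as a self-checking script (behavior as observed on the Python versions the issue lists, and still in later CPython):

```python
import re

# Lookbehind may inspect text *before* pos ...
assert re.compile("(?<=a)b").match("ab", 1) is not None
# ... which slicing obviously cannot reproduce:
assert re.compile("(?<=a)b").match("ab"[1:]) is None

# Lookahead, by contrast, may NOT inspect text past endpos;
# with endpos=1 the string is treated as if it were just "a":
assert re.compile("a(?=b)").match("ab", 0, 1) is None
assert re.compile("a(?=b)").match("ab") is not None

print("pos/endpos asymmetry reproduced")
```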
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
STINNER Victor victor.stin...@haypocalc.com added the comment: A common programming task is I want to process this text file, I know it's in an ASCII compatible encoding, I don't know which one specifically, but I'm only manipulating the ASCII parts so it doesn't matter. Can you give more detail about this use case? Why would you ignore non-ASCII characters? -- nosy: +haypo ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Nick Coghlan ncogh...@gmail.com added the comment: Usually because the file may contain certain ASCII markers (or you're inserting such markers), but beyond that, you only care that it's in a consistent ASCII compatible encoding. Parsing log files from sources that aren't set up correctly often falls into this category - you know the markers are ASCII, but the actual message contents may not be properly encoded. (e.g. they use a locale dependent encoding, but not all the log files are from the same machine and not all machines have their locale set up properly). (although errors=replace can be a better option for such read-only use cases). A use case where you really do need errors='surrogateescape' is when you're reformatting a log file and you want to preserve the encoding for the messages while manipulating the pure ASCII timestamps and message headers. In that case, surrogateescape is the right answer, because you can manipulate the ASCII bits freely while preserving the log message contents when you write the reformatted files back out. The reformatting script offers an API that says put any ASCII compatible encoding in, and you'll get that same encoding back out. You'll get weird behaviour (i.e. as you do in Python 2) if the assumption of an ASCII compatible encoding is ever violated, but that would be equally true if the script tried to process things at the raw bytes level. The assumption of an ASCII compatible text encoding is a useful one a lot of the time. The problem with Python 2 is it makes that assumption implicitly, and makes it almost impossible to disable it. Python 3, on the other hand, assumes very little by default (basically what it returns from sys.getfilesystemencoding() and locale.getpreferredencoding()), thus requiring that the programmer know how to state their assumptions explicitly.
-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
STINNER Victor victor.stin...@haypocalc.com added the comment: Why do you use Unicode with the ugly surrogateescape error handler in this case? Bytes are just fine for such a use case. The surrogateescape error handler produces unusual characters in the range U+DC80-U+DCFF which cannot be printed to a console because sys.stdout uses the strict error handler, and sys.stderr uses the backslashreplace error handler. If I remember correctly, only the UTF-7 encoder allows lone surrogate characters. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13968] Support recursive globs
Yuval Greenfield ubershme...@gmail.com added the comment: Raymond Hettinger, by simple do you mean a single argument rglob function? Or do you mean you prefer glob doesn't get a new function? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13968 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
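While the API question is being settled, a recursive glob can be sketched today with os.walk plus fnmatch; this is an illustration of the feature being requested, not the interface the issue will eventually adopt:

```python
import fnmatch
import os

def rglob(root, pattern):
    """Yield paths under root whose basename matches pattern, recursively."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)

# Example: every Python file below the current directory.
for path in rglob(".", "*.py"):
    print(path)
```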
[issue13953] Get rid of doctests in packaging.tests.test_version
Francisco Martín Brugué franci...@email.de added the comment: Does a doc test test the output literally? (I've just always used unittest) Ok, thanks -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13953 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
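To answer the question in place: yes, doctest compares the captured output against the docstring text literally (modulo option flags such as ELLIPSIS). A minimal self-contained check, using the finder/runner API so it works outside a normal module:

```python
import doctest

def double(x):
    """
    >>> double(21)
    42
    >>> double('ab')
    'abab'
    """
    return x * 2

# doctest captures what each example prints and compares it character-for-
# character against the text that follows the >>> line in the docstring.
test = doctest.DocTestFinder().find(double, globs={"double": double})[0]
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

If the docstring said 43 instead of 42, the runner would report one failure, which is the "literal" behavior being asked about.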
[issue13963] dev guide has no mention of mechanics of patch review
Nadeem Vawda nadeem.va...@gmail.com added the comment: AFAIK, all interested parties are supposed to automatically be sent email notifications whenever anyone posts an update on the review. However, I've run into a bug http://psf.upfronthosting.co.za/roundup/meta/issue402 that seems to be preventing this from working for me. * Do all patches go into this review site, or do I have to do something extra to get them to land there? It should happen automatically, but only if the patch applies cleanly to the head of the default branch. * I have patches for both 2.6 and 3.1 - are they kept separate, or do they affect each other's delta from patch set? If the patch applies cleanly to default (unlikely), I imagine things could get messy. Most of the time, though, they'll be ignored altogether. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13963 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Changes by Nadeem Vawda nadeem.va...@gmail.com: -- nosy: +nadeem.vawda ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10287] NNTP authentication should check capabilities
Hynek Schlawack h...@ox.cx added the comment: Sure, I wanted to add tests as soon as I know that the flow is correct (which it isn't :)). So essentially we want always CAPABILITIES LOGIN CAPABILITIES ? That's rather simple to achieve. The tests are going to be the harder part. ;) Re the patch: I tried to generate it using SourceTree but probably did something wrong – will use hg next time again. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10287 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13992] Segfault in PyTrash_destroy_chain
Aaron Staley usaa...@gmail.com added the comment: Active extension modules are MySQL-python, numpy, and crypto. Here is the output from the non-optimized debug build. Slightly different trace, but still some sort of deallocator crashing AFAIK: #0 0x0046247c in _Py_ForgetReference (op= Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=thread.lock at remote 0x571bf90, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x571bf90, acquire=built-in method acquire of thread.lock object at remote 0x571bf90, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x571bf90) at remote 0x593d3e0, event=_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x59138b0, acquire=built-in method acquire of thread.lock object at remote 0x59138b0, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x59138b0) at remote 0x5a333e0) at remote 0x593ded0, transport=Transport(_Thread__ident=140009885591296, host_key_type='ssh-rsa', _channels=ChannelMap(_lock=thread.lock at remote 0x5928f90, _map=WeakValueDictionary(_re...(truncated)) at Objects/object.c:2220 #1 0x004624ed in _Py_Dealloc (op= Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=thread.lock at remote 0x571bf90, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x571bf90, acquire=built-in method acquire of thread.lock object at remote 0x571bf90, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x571bf90) at remote 0x593d3e0, event=_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x59138b0, acquire=built-in method 
acquire of thread.lock object at remote 0x59138b0, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x59138b0) at remote 0x5a333e0) at remote 0x593ded0, transport=Transport(_Thread__ident=140009885591296, host_key_type='ssh-rsa', _channels=ChannelMap(_lock=thread.lock at remote 0x5928f90, _map=WeakValueDictionary(_re...(truncated)) at Objects/object.c:2240 #2 0x00442244 in list_dealloc (op=0x66d7ab0) at Objects/listobject.c:309 #3 0x004624fa in _Py_Dealloc (op= [Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=thread.lock at remote 0x571bf90, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x571bf90, acquire=built-in method acquire of thread.lock object at remote 0x571bf90, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x571bf90) at remote 0x593d3e0, event=_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x59138b0, acquire=built-in method acquire of thread.lock object at remote 0x59138b0, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x59138b0) at remote 0x5a333e0) at remote 0x593ded0, transport=Transport(_Thread__ident=140009885591296, host_key_type='ssh-rsa', _channels=ChannelMap(_lock=thread.lock at remote 0x5928f90 , _map=WeakValueDictionary(_r...(truncated)) at Objects/object.c:2241 #4 0x00448bc4 in listiter_next (it=0x5d1c530) at Objects/listobject.c:2917 #5 0x004ce425 in PyEval_EvalFrameEx (f= Frame 0x7f56a050aea0, for file /usr/lib/python2.7/dist-packages/paramiko/transport.py, line 1586, in run (self=Transport(_Thread__ident=140009885591296, host_key_type='ssh-rsa', _channels=ChannelMap(_lock=thread.lock at remote 0x5928f90, _map=WeakValueDictionary(_remove=function at remote 0x56355a0, data={}) at remote 0x5939588) at 
remote 0x5912bc0, lock=thread.lock at remote 0x5928d60, _Thread__started=_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=_Condition(_Verbose__verbose=False, _Condition__lock=thread.lock at remote 0x521ff40, acquire=built-in method acquire of thread.lock object at remote 0x521ff40, _Condition__waiters=[], release=built-in method release of thread.lock object at remote 0x521ff40) at remote 0x5223300) at remote 0x47f56f0, _channel_counter=22, active=False, _preferred_compression=('none',), server_object=None, kex_engine=None, log_name='paramiko.transport', _x11_handler=None, remote_compression='none', _Thread__initiali zed=True, server_accepts=[], s...(truncated), throwflag=0) at Python/ceval.c:2497 #6 0x004d41c3 in fast_function (func=function at remote 0x4716300, pp_stack=0x7f56977ed400, n=1, na=1, nk=0) at Python/ceval.c:4099 #7
[issue13882] PEP 410: Use decimal.Decimal type for timestamps
STINNER Victor victor.stin...@haypocalc.com added the comment: Patch version 15: - round correctly - datetime.date.fromtimestamp() and datetime.datetime.fromtimestamp() reuses _PyTime_t API to support decimal.Decimal without loss of precision - add more tests -- Added file: http://bugs.python.org/file24495/time_decimal-15.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13882 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Nick Coghlan ncogh...@gmail.com added the comment: If such use cases are indeed better handled as bytes, then that's what should be documented. However, there are some text processing assumptions that no longer hold when using bytes instead of strings (such as x[0:1] == x[0]). You also can't safely pass such byte sequences to various other APIs (e.g. urllib.parse will happily process surrogate escaped text without corrupting it, but will throw UnicodeDecodeError for bytes sequences that aren't pure 7-bit ASCII). Using surrogateescape instead means that you're only going to have problems if you go to encode the data to an encoding other than the source one. That's basically how things work in Python 2 with 8-bit strings. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13882] PEP 410: Use decimal.Decimal type for timestamps
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file24495/time_decimal-15.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13882 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13882] PEP 410: Use decimal.Decimal type for timestamps
STINNER Victor victor.stin...@haypocalc.com added the comment: (Oops, I attached the wrong patch.) -- Added file: http://bugs.python.org/file24496/time_decimal-15.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13882 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13882] PEP 410: Use decimal.Decimal type for timestamps
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file24496/time_decimal-15.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13882 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13991] namespace packages depending on order
andrea crotti andrea.crott...@gmail.com added the comment: About the binary file, in theory I agree with you, but that archive contains 5 or 6 subdirectories with a few almost empty files. Of course I can write a script that recreates that situation, but does it make sense when I can just tar and untar it? And what should be the security threat in a tar.gz file? Anyway it doesn't matter and sure I will try to use plain text in the future.. About the bug, for what I can understand the bug comes from pkg_resources: /usr/lib/python2.7/site-packages/pkg_resources.py is owned by python2-distribute 0.6.24-1 and/or from how the import mechanism works on namespace packages, not from setuptools.. Should I still move the bug report to somewhere else? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13991 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13882] PEP 410: Use decimal.Decimal type for timestamps
STINNER Victor victor.stin...@haypocalc.com added the comment: New try, set the version to 16 to avoid the confusion. test_time is failing on Windows. -- Added file: http://bugs.python.org/file24497/time_decimal-16.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13882 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
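The core idea of PEP 410, carrying timestamps as exact decimal numbers rather than lossy binary floats, can be illustrated with a sketch (the nanosecond value below is made up for the example; the PEP itself proposes new timestamp-returning APIs, not this snippet):

```python
from decimal import Decimal

# A float cannot represent most nanosecond-resolution epoch times exactly,
# which is the precision-loss problem the PEP addresses.
ns = 1_329_102_603_123_456_789            # hypothetical epoch time in ns
seconds = Decimal(ns) / Decimal(10) ** 9  # exact decimal seconds
print(seconds)                            # 1329102603.123456789

# The same value squeezed through a float picks up binary rounding error:
print(Decimal(ns / 1e9))
```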
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Paul Moore p.f.mo...@gmail.com added the comment: A better example in terms of intended to be text might be ChangeLog files. These are clearly text files, but of sufficiently standard format that they can be manipulated programmatically. Consider a program to get a list of all authors who changed a particular file. Scan the file for date lines, then scan the block of text below for the filename you care about. Extract the author from the date line, put into a set, sort and print. All of this can be done assuming the file is ASCII-compatible, but requires non-trivial text processing that would be a pain to do on bytes. But author names are quite likely to be non-ASCII, especially if it's an international project. And the changelog file is manually edited by people on different machines, so the possibility of inconsistent encodings is definitely there. (I have seen this happen - it's not theoretical!) For my code, all I care about is that the names round-trip, so that I'm not damaging people's names any more than has already happened. encoding=ascii,errors=surrogateescape sounds like precisely the right answer here. (If it's hard to find a good answer in Python 3, it's very easy to decide to use Python 2 which just works, or even other tools like awk which also take Python 2's naive approach - and dismiss Python 3's Unicode model as too hard). My mental model here is text editors, which let you open any file, do their best to display as much as they can and allow you to manipulate it without damaging the bits you don't change. I don't see any reason why people shouldn't be able to write Python 3 code that way if they need to. -- nosy: +pmoore ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
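The round-trip Paul describes can be sketched end to end. The file contents below are a made-up ChangeLog-style line containing a Latin-1 byte (0xE9, "é") that is invalid UTF-8, exactly the mixed-encoding situation he mentions:

```python
import os
import tempfile

# A line with a non-UTF-8 byte, as a badly-encoded ChangeLog might contain.
raw = b"2012-02-13  Andr\xe9  <andre@example.org>\n"

path = os.path.join(tempfile.mkdtemp(), "ChangeLog")
with open(path, "wb") as f:
    f.write(raw)

# Read as ASCII text; each undecodable byte becomes a lone surrogate
# in U+DC80..U+DCFF instead of raising UnicodeDecodeError.
with open(path, encoding="ascii", errors="surrogateescape") as f:
    text = f.read()

# Manipulate only the ASCII parts...
text = text.replace("2012-02-13", "2012-02-14")

# ...then write back: the surrogates turn back into the original bytes.
with open(path, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
    f.write(text)

with open(path, "rb") as f:
    assert f.read() == raw.replace(b"2012-02-13", b"2012-02-14")
```

The non-ASCII name survives byte-for-byte, which is the "don't damage people's names any more than has already happened" guarantee.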
[issue13988] Expose the C implementation of ElementTree by default when importing ElementTree
Florent Xicluna florent.xicl...@gmail.com added the comment: There was a discussion in December which ruled out some too-CPython-specific implementation details from the documentation. http://mail.python.org/pipermail/python-dev/2011-December/115063.html Maybe it's better to remove these 2 lines about the transparent optimizer. Then the versionchanged tag can be changed a little:

.. versionchanged:: 3.3
   This module will use a fast implementation whenever available.
   The module :mod:`xml.etree.cElementTree` is deprecated.

Probably we'll add a few words in the Doc/whatsnew/3.3.rst too. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13988 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13988] Expose the C implementation of ElementTree by default when importing ElementTree
Florent Xicluna florent.xicl...@gmail.com added the comment: Updated patch: - add 'XMLID' and 'register_namespace' to the ElementTree.__all__ - the comment says explicitly that cElementTree is deprecated - exercise the deprecated module with the tests -- Added file: http://bugs.python.org/file24498/issue13988_fold_cET_behind_ET_v2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13988 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
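For context, the change being discussed retires the import idiom users previously wrote by hand; before 3.3 the C accelerator had to be selected explicitly (and cElementTree was removed entirely in Python 3.9, so the except branch now always runs):

```python
# Pre-3.3 idiom this patch makes unnecessary: pick the C accelerator by
# hand, falling back to the pure-Python module if it is unavailable.
try:
    import xml.etree.cElementTree as ET  # deprecated in 3.3, removed in 3.9
except ImportError:
    import xml.etree.ElementTree as ET

root = ET.fromstring("<doc><item>a</item><item>b</item></doc>")
print(root.tag, [e.text for e in root.iter("item")])
```

With the patch applied, a plain `import xml.etree.ElementTree as ET` gets the fast implementation automatically, which is the whole point of the issue.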
[issue13997] Clearly explain the bare minimum Python 3 users should know about Unicode
Changes by Florent Xicluna florent.xicl...@gmail.com: -- nosy: +flox ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13997 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com