[issue39468] Improved the site module's permission handling while writing .python_history
Change by Aurora : -- pull_requests: +24543 status: pending -> open pull_request: https://github.com/python/cpython/pull/18210 ___ Python tracker <https://bugs.python.org/issue39468> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39480] referendum reference is needlessly annoying
Change by Aurora : -- type: -> enhancement ___ Python tracker <https://bugs.python.org/issue39480> ___
[issue39468] Improved the site module's permission handling while writing .python_history
Change by Aurora : -- title: .python_history write permission improvements -> Improved the site module's permission handling while writing .python_history ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] Improved the site module's permission handling while writing .python_history
Change by Aurora : -- status: open -> pending ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: -17675 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: +17677 pull_request: https://github.com/python/cpython/pull/18299 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: -17589 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: -17674 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: +17675 pull_request: https://github.com/python/cpython/pull/39468 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: +17674 pull_request: https://github.com/python/cpython/pull/18299 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39455] Update the documentation for the linecache module
Change by Aurora : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue39455> ___
[issue39314] (readline) Autofill the closing parenthesis during auto-completion for functions which accept no arguments at all
Change by Aurora : -- title: Autofill the closing paraenthesis during auto-completion for functions which accept no arguments at all -> (readline) Autofill the closing parenthesis during auto-completion for functions which accept no arguments at all ___ Python tracker <https://bugs.python.org/issue39314> ___
[issue39455] Update the documentation for the linecache module
Change by Aurora : -- title: Update the documentation for linecache module -> Update the documentation for the linecache module ___ Python tracker <https://bugs.python.org/issue39455> ___
[issue39480] referendum reference is needlessly annoying
Aurora added the comment: This example is practically against Python's diversity statement. -- nosy: +opensource-assist ___ Python tracker <https://bugs.python.org/issue39480> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: +17589 pull_request: https://github.com/python/cpython/pull/18210 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- pull_requests: -17586 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Change by Aurora : -- keywords: +patch pull_requests: +17586 stage: -> patch review pull_request: https://github.com/python/cpython/pull/18210 ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
Aurora added the comment: https://github.com/opensource-assist/cpython/blob/opensource-assist-patch-sitepy-1/Lib/site.py -- ___ Python tracker <https://bugs.python.org/issue39468> ___
[issue39468] .python_history write permission improvements
New submission from Aurora :

On a typical Linux system, if you run 'chattr +i /home/user/.python_history', then start python and exit, the following error message is printed:

    Error in atexit._run_exitfuncs:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.9/site.py", line 446, in write_history
        readline.write_history_file(history)
    OSError: [Errno -1] Unknown error -1

With a simple improvement, the site module could detect this case and suggest that the user run 'chattr -i' on the .python_history file. Additionally, I don't know whether it would be a good idea to run 'chattr -i' automatically in such a situation.

-- components: Library (Lib) messages: 360790 nosy: opensource-assist priority: normal severity: normal status: open title: .python_history write permission improvements type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39468> ___
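The idea in the report can be sketched roughly as follows. This is only an illustration, not the code from the linked pull request: `write_history_safely` is a hypothetical helper, and the writer is passed in as a callable so the sketch does not depend on readline being available.

```python
import sys


def write_history_safely(write_func, history_path, stream=None):
    """Call write_func(history_path); on OSError print a hint, not a traceback.

    write_func stands in for readline.write_history_file (an assumption made
    so that this sketch can run without the readline module).
    Returns True when the history was written, False otherwise.
    """
    if stream is None:
        stream = sys.stderr
    try:
        write_func(history_path)
    except OSError as exc:
        # The immutable attribute set by 'chattr +i' is one possible cause;
        # the message only suggests it, it does not try to detect it.
        print("warning: could not write %s: %s" % (history_path, exc), file=stream)
        print("hint: check the file's permissions and attributes, "
              "e.g. 'lsattr %s' and 'chattr -i %s'" % (history_path, history_path),
              file=stream)
        return False
    return True
```

Whether the interpreter should go further and run 'chattr -i' itself is the open question from the report; the sketch deliberately only prints a suggestion.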
[issue39455] Update the documentation for linecache module
New submission from Aurora : Added the definitions for two undocumented functions. -- assignee: docs@python components: Documentation messages: 360709 nosy: docs@python, opensource-assist priority: normal pull_requests: 17572 severity: normal status: open title: Update the documentation for linecache module type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39455> ___
[issue39449] New Assignment operator
Aurora added the comment: That's a nice simple idea. -- nosy: +opensource-assist ___ Python tracker <https://bugs.python.org/issue39449> ___
[issue39314] Autofill the closing paraenthesis during auto-completion for functions which accept no arguments at all
Change by Aurora : -- versions: -Python 3.9 ___ Python tracker <https://bugs.python.org/issue39314> ___
[issue39319] ntpath module must not be available on POSIX platforms
Aurora added the comment: @eryksun So the documentation should be modified to note that both modules are usable on both platforms. I have seen that ntpath works on my Linux system, but the documentation was misleading. -- ___ Python tracker <https://bugs.python.org/issue39319> ___
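A small sketch of what "usable on both platforms" means in practice: ntpath and posixpath are plain string-manipulation modules, so either one can be imported on any system to handle the other platform's path syntax. The paths below are made-up examples.

```python
import ntpath
import posixpath

# Windows-style path handling, regardless of the host platform.
win_path = ntpath.join(r"C:\Users", "aurora", "notes.txt")
win_drive, win_rest = ntpath.splitdrive(win_path)
win_name = ntpath.basename(win_path)

# POSIX-style path handling, regardless of the host platform.
posix_path = posixpath.join("/home", "aurora", "notes.txt")
posix_name = posixpath.basename(posix_path)

print(win_path)    # C:\Users\aurora\notes.txt
print(win_drive)   # C:
print(win_name)    # notes.txt
print(posix_path)  # /home/aurora/notes.txt
```

os.path is simply whichever of the two modules matches the running platform.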
[issue39319] ntpath module must not be available on POSIX platforms
New submission from Aurora : According to https://docs.python.org/dev/library/undoc.html the 'ntpath' module is an "Implementation of os.path on Win32 and Win64 platforms". Just like all other Windows-specific modules (like winreg), 'ntpath' must not be available for use on a POSIX system like Linux. I guess that 'posixpath' is also available on Windows; if it is, it must not be available there either. -- components: Interpreter Core messages: 359897 nosy: opensource-assist priority: normal severity: normal status: open title: ntpath module must not be available on POSIX platforms type: behavior versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39319> ___
[issue39319] ntpath module must not be available on POSIX platforms
Change by Aurora : -- components: +Library (Lib) -Interpreter Core type: behavior -> ___ Python tracker <https://bugs.python.org/issue39319> ___
[issue39314] Autofill the closing paraenthesis during auto-completion for functions which accept no arguments at all
Change by Aurora : -- title: Autofill the closing paraenthesis during auto-completion for functions which accept no arguments -> Autofill the closing paraenthesis during auto-completion for functions which accept no arguments at all ___ Python tracker <https://bugs.python.org/issue39314> ___
[issue39304] Don't accept a negative number for the count argument in str.replace(old, new[, count])
Aurora added the comment: @xtreak Understood. Just as an afterthought: I still disagree a little with such an implementation, because it leans so far toward terse coding that it goes against the principles of mathematics, which are the basis of computer science and programming. Python could use a special keyword or something similar (e.g. the Ellipsis notation) for this and all similar cases. You'll get into trouble if you want to explain such a thing to a mathematician, or if you want to write pseudo-code based on it; in both cases the reader is not going to look at the underlying implementation. It is a bad practice in C that CPython followed and that then spread to others. -- ___ Python tracker <https://bugs.python.org/issue39304> ___
[issue39314] Autofill the closing paraenthesis during auto-completion for functions which accept no arguments
New submission from Aurora : If Python is compiled with the GNU readline headers, it provides autocompletion for Python functions and other names. In the interactive interpreter, if a function name is typed partially, Python fills in the rest when the tab key is pressed. If a function accepts no arguments, Python still does not fill in the closing parenthesis during autocompletion, in the hope that the user will provide arguments, but in such a case that is pointless. -- components: Interpreter Core messages: 359855 nosy: opensource-assist priority: normal severity: normal status: open title: Autofill the closing paraenthesis during auto-completion for functions which accept no arguments type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39314> ___
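One way the proposed behaviour could be decided is by inspecting the callable's signature. The sketch below is an illustration of that idea only; `completion_suffix` is a hypothetical helper and is not taken from the completer's actual source.

```python
import inspect


def completion_suffix(obj):
    """Return the text a completer might append after a completed name.

    ""   for non-callables,
    "()" for callables that declare no parameters at all,
    "("  for everything else, including callables whose signature
         cannot be introspected.
    """
    if not callable(obj):
        return ""
    try:
        params = inspect.signature(obj).parameters
    except (TypeError, ValueError):
        # Some builtins expose no signature; fall back to the open paren.
        return "("
    return "()" if not params else "("


def no_args():
    return 42


def one_arg(x):
    return x


print(completion_suffix(no_args))  # ()
print(completion_suffix(one_arg))  # (
```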
[issue39304] Don't accept a negative number for the count argument in str.replace(old, new[, count])
New submission from Aurora : It's meaningless for the count argument to have a negative value, since there's no such thing as a negative count of something. -- components: Library (Lib) messages: 359795 nosy: opensource-assist priority: normal severity: normal status: open title: Don't accept a negative number for the count argument in str.replace(old, new[,count]) type: behavior versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39304> ___
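For reference, this is the behaviour under discussion: a negative count means "replace every occurrence". The wrapper at the end, `replace_strict`, is a hypothetical helper showing what the stricter behaviour asked for in the report might look like; it is not part of the standard library.

```python
text = "a-b-c-d"

replaced_all = text.replace("-", "+", -1)   # negative count: replace everything
replaced_two = text.replace("-", "+", 2)    # replace the first two only
replaced_none = text.replace("-", "+", 0)   # zero: replace nothing


def replace_strict(s, old, new, count=None):
    """Like str.replace, but reject negative counts (illustrative only)."""
    if count is None:
        return s.replace(old, new)
    if count < 0:
        raise ValueError("count must be non-negative")
    return s.replace(old, new, count)


print(replaced_all)   # a+b+c+d
print(replaced_two)   # a+b+c-d
print(replaced_none)  # a-b-c-d
```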
[issue38441] failing to build the Documentation
New submission from Aurora :

I'm failing to build the cpython/Doc dir. The full build log is as follows:

    mkdir -p build
    Building NEWS from Misc/NEWS.d with blurb
    PATH=./venv/bin:$PATH sphinx-build -b epub -d build/doctrees -D latex_elements.papersize= -W . build/epub
    Running Sphinx v2.2.0
    making output directory... done
    building [mo]: targets for 0 po files that are out of date
    building [epub]: targets for 476 source files that are out of date
    updating environment: [new config] 476 added, 0 changed, 0 removed
    reading sources... [100%] whatsnew/index

    Warning, treated as error:
    /home/aurora/A.Code/Python/Reference/python/cpython/Doc/library/email.message.rst:4:duplicate object description of email.message, other instance in library/email.compat32-message, use :noindex: for one of them
    make: *** [Makefile:46: build] Error 2

Running on Debian Experimental kernel v5.3

-- assignee: docs@python components: Documentation messages: 354425 nosy: aurora, docs@python priority: normal severity: normal status: open title: failing to build the Documentation type: compile error versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue38441> ___
Re: Unicode question : turn José into uJosé
First of all, if you run this on the console, find out your console's encoding. In my case it is English Windows XP, which uses 'cp437':

    C:\>chcp
    Active code page: 437

Then:

    >>> s = 'José'
    >>> u = u'Jos\u00e9'          # same thing as a unicode escape
    >>> s.decode('cp437') == u    # use the encoding that matches your console
    True

wy

> This is probably stupid and/or misguided but supposing I'm passed a byte-string value that I want to be unicode, this is what I do. I'm sure I'm missing something very important.
>
> Short version:
>
>     >>> s = "José"    # Start with non-unicode string
>     >>> unicoded = eval("u'%s'" % "José")
>
> Long version:
>
>     >>> s = "José"    # Start with non-unicode string
>     >>> s             # Let's look at it
>     'Jos\xe9'
>     >>> escaped = s.encode('string_escape')
>     >>> escaped
>     'Jos\\xe9'
>     >>> unicoded = eval("u'%s'" % escaped)
>     >>> unicoded
>     u'Jos\xe9'
>     >>> test = u"José"      # What they should have passed me
>     >>> test == unicoded    # Am I really getting the same thing?
>     True                    # Yay!

-- http://mail.python.org/mailman/listinfo/python-list
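The same point, sketched in Python 3 terms where byte strings and text are separate types: the bytes only become the right text when decoded with the encoding they were written in. The byte values below are examples chosen for illustration.

```python
# 'José' as raw bytes in two different single-byte encodings.
raw_cp437 = b"Jos\x82"    # é is byte 0x82 in code page 437
raw_latin1 = b"Jos\xe9"   # é is byte 0xE9 in Latin-1

text_a = raw_cp437.decode("cp437")
text_b = raw_latin1.decode("latin-1")

# Decoding with the wrong codec still succeeds, but gives different text.
mismatch = raw_latin1.decode("cp437")

print(text_a == "Jos\u00e9")    # True
print(text_b == "Jos\u00e9")    # True
print(mismatch == "Jos\u00e9")  # False
```

No eval() is needed in either Python version; decode() with the correct codec does the whole job.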
Re: Design mini-language for data input
Yes. But they have different motivations. The mini-language concept is to design an input format that is convenient for a human editor and that is close to the semi-structured data source. I think the benefit from ease of editing and flexibility justifies writing a little parsing code. JSON is mainly designed for data exchange between programs. You can hand-edit JSON data (as well as XML or Python statements), but it is not the most convenient. Just consider that not having to enter two quotes for every string object is almost liberating. These quotes are only artifacts of a structured data format. The idea is to design a format convenient for humans and let code parse it and build the data structure.

wy

> Hmm,
> Do you know about JSON and YAML?
> http://en.wikipedia.org/wiki/JSON
> http://en.wikipedia.org/wiki/YAML
>
> They have the advantage of being maintained by a group of people and being available for a number of languages. (as well as NOT being XML :-)
>
> - Cheers, Paddy.
> -- http://paddy3118.blogspot.com/
Re: Design mini-language for data input
P.S. Also it is a 'mini-language' because it is an ad-hoc design that is good enough and can be easily implemented for a given problem. This is as opposed to a general-purpose solution like XML, which is one translation away from the original data format and carries too much baggage.

> Just consider that not having to enter two quotes for every string object is almost liberating. These quotes are only artifacts of a structured data format. The idea is to design a format convenient for humans and let code parse it and build the data structure.

wy
Design mini-language for data input
This is an entry I just added to ASPN. It is a somewhat novel technique I have employed quite successfully in my code. I repost it here for more exposure and discussion.

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/475158

wy

Title: Design mini-language for data input

Description:

Many programs need a set of initial data. For ease of use and flexibility, design a mini-language for your input data. Use Python's superb text handling capability to parse and build the data structure from the input text.

Source: Text Source

    # this is an example to demonstrate the programming technique

    DATA = """
    # data source: http://www.mongabay.com/igapo/world_statistics_by_pop.htm
    # Country / Capital / Area [sq. km] / 2002 Population Estimate
    China / Beijing / 9,596,960 / 1,284,303,705
    India / New Delhi / 3,287,590 / 1,045,845,226
    United States / Washington DC / 9,629,091 / 280,562,489
    Indonesia / Jakarta / 1,919,440 / 231,328,092
    Russia / Moscow / 17,075,200 / 144,978,573
    """

    def initData():
        """ parse and return a country list of
            (name, capital, area, population) """
        countries = []
        for line in DATA.splitlines():
            # filter out blank lines/comment lines
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            # 4 fields separated by '/'; strip white space around each
            parts = [part.strip() for part in line.split('/')]
            country, capital, area, population = parts
            # remove commas in numbers
            area = int(area.replace(',',''))
            population = int(population.replace(',',''))
            countries.append((country, capital, area, population))
        return countries

    def findLargestCountry(countries):
        # your algorithm here
        pass

    def main():
        countries = initData()
        print(findLargestCountry(countries))

Discussion:

Problem
-------

Many programs need a set of initial data. The simplest way is to construct a Python data structure directly, as shown below. This is often not ideal. Algorithms and data structures tend to change. Python program statements are likely to differ literally from their data source, which might be text pulled from web pages or some other place. This means a great deal of effort is often needed to format and maintain the input as Python statements.

This is a sample program that initializes some geographical data.

    # map of country -> (capital, area, population)
    COUNTRIES = {}
    COUNTRIES['China'] = ('Beijing', 9596960, 1284303705)
    COUNTRIES['India'] = ('New Delhi', 3287590, 1045845226)
    COUNTRIES['United States'] = ('Washington DC', 9629091, 280562489)
    COUNTRIES['Indonesia'] = ('Jakarta', 1919440, 231328092)
    COUNTRIES['Russia'] = ('Moscow', 17075200, 144978573)

Mini-language
-------------

A more flexible approach is to define a mini-language to describe the data. This can be as simple as formatting the data into a multiple-line string.

1. Define the data format in text. It should mirror the data source and be designed for ease of human editing.
2. Define the data structure.
3. Write glue code to parse the input data and initialize the data structure.

In the example above we use one line for each record. Each record has four fields, country, capital, area and population, separated by slashes. One immediate benefit is that we no longer need to type so many quotes for every string literal. This concise data format is much easier to read and edit than Python statements.

The parser simply breaks down the input text using splitlines() and then loops through the lines one by one. It is useful to allow for some extra white space so that the format is more flexible for a human editor. In this case the numbers (area, population) from the data source contain commas. Rather than manually editing them out, they are copied as is into the text. Then they are parsed into integers using

    area = int(area.replace(',',''))

Slash is chosen as the separator (rather than the more common comma) because it does not otherwise appear in the data. A record is parsed into fields using

    line.split('/')

Don't forget to remove extra white space using strip().

Finally it builds a data structure: a list of country records, each a tuple of (country, capital, area, population). It is just as easy to turn them into objects or any other data structure as desired.

The mini-language technique can be refined to represent more complex, more structured input. It makes transformation and maintenance of input data much easier.
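The three steps of the recipe (text format, data structure, glue code) can be sketched in present-day Python as below. This is one possible rendering rather than the recipe's own code: the names `Country`, `parse_countries` and `find_largest_country` are introduced here for illustration, and "largest" is assumed to mean largest by area.

```python
from collections import namedtuple

# Step 1: the data format, mirroring the data source.
DATA = """
# Country / Capital / Area [sq. km] / 2002 Population Estimate
China / Beijing / 9,596,960 / 1,284,303,705
India / New Delhi / 3,287,590 / 1,045,845,226
United States / Washington DC / 9,629,091 / 280,562,489
Indonesia / Jakarta / 1,919,440 / 231,328,092
Russia / Moscow / 17,075,200 / 144,978,573
"""

# Step 2: the data structure.
Country = namedtuple("Country", "name capital area population")


# Step 3: the glue code.
def parse_countries(text):
    """Parse the slash-separated mini-language into a list of Country records."""
    countries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, capital, area, population = [p.strip() for p in line.split("/")]
        countries.append(Country(name, capital,
                                 int(area.replace(",", "")),
                                 int(population.replace(",", ""))))
    return countries


def find_largest_country(countries):
    """Return the record with the largest area (an assumed definition)."""
    return max(countries, key=lambda c: c.area)


countries = parse_countries(DATA)
print(find_largest_country(countries).name)  # Russia
```

Adding a sixth country is a one-line edit to DATA with no quoting or tuple syntax to get wrong, which is the benefit the recipe argues for.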
Re: datetime iso8601 string input
I agree. I just keep rewriting the parse method again and again.

wy

    from datetime import datetime

    def parse_iso8601_date(s):
        """ Parse date in iso8601 format e.g. 2003-09-15T10:34:54
            and returns a datetime object.
        """
        y=m=d=hh=mm=ss=0
        if len(s) not in [10,19,20]:
            raise ValueError('Invalid timestamp length - "%s"' % s)
        if s[4] != '-' or s[7] != '-':
            raise ValueError('Invalid separators - "%s"' % s)
        if len(s) > 10 and (s[13] != ':' or s[16] != ':'):
            raise ValueError('Invalid separators - "%s"' % s)
        try:
            y = int(s[0:4])
            m = int(s[5:7])
            d = int(s[8:10])
            if len(s) >= 19:
                hh = int(s[11:13])
                mm = int(s[14:16])
                ss = int(s[17:19])
        except Exception as e:
            raise ValueError('Invalid timestamp - "%s": %s' % (s, str(e)))
        return datetime(y,m,d,hh,mm,ss)

> I was a little surprised to recently discover that datetime has no method to input a string value. PEP 321 does not appear to convey much information, but a timbot post from a couple years ago clarifies things:
>
> http://tinyurl.com/epjqc
>
> > You can stop looking: datetime doesn't support any kind of conversion from string. The number of bottomless pits in any datetime module is unbounded, and Guido declared this particular pit out-of-bounds at the start so that there was a fighting chance to get *anything* done for 2.3.
>
> I can understand why datetime can't handle arbitrary string inputs, but why not just simple iso8601 format -- i.e. the default output format for datetime?
>
> Given a datetime-generated string:
>
>     >>> now = str(datetime.datetime.now())
>     >>> print now
>     '2006-02-23 11:03:36.762172'
>
> Why can't we have a function to accept it as string input and return a datetime object?
>
>     datetime.parse_iso8601(now)
>
> Jeff Bauer
> Rubicon, Inc.
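For readers on a current interpreter: Python 3.7 and later ship `datetime.fromisoformat`, which covers the round trip asked for in this thread. A brief sketch, assuming Python 3.7 or newer:

```python
from datetime import datetime

# The format from the hand-written parser's docstring.
stamp = datetime.fromisoformat("2003-09-15T10:34:54")

# Round trip of datetime's own default string output (space separator,
# microseconds included).
original = datetime(2006, 2, 23, 11, 3, 36, 762172)
roundtrip = datetime.fromisoformat(str(original))

# A date-only string yields midnight on that date.
date_only = datetime.fromisoformat("2003-09-15")

print(stamp)                  # 2003-09-15 10:34:54
print(roundtrip == original)  # True
```

It is meant as the inverse of isoformat() rather than as a general ISO 8601 parser, so inputs in other ISO 8601 variants may still need strptime or a hand-written function like the one above.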
ANN: pyregex 0.5
pyregex is a command line tool for constructing and testing Python regular expressions. Features include text highlighting, detailed breakdown of match groups, substitution and a syntax quick reference. It is released in the public domain.

Screenshot and download from http://tungwaiyip.info/software/pyregex.html.

Wai Yip Tung

Usage: pyregex.py [options] -|filename regex [replacement [count]]

Test Python regular expressions. Specify test data's filename or use - to enter test text from console. Optionally specify a replacement text.

Options:
  -f        filter mode
  -n nnn    limit to examine the first nnn lines. default no limit.
  -m        show only matched line. default False

Regular Expression Syntax

Special Characters
  .         matches any character except a newline
  ^         matches the start of the string
  $         matches the end of the string or just before the newline at the end of the string
  *         matches 0 or more repetitions of the preceding RE
  +         matches 1 or more repetitions of the preceding RE
  ?         matches 0 or 1 repetitions of the preceding RE
  {m}       exactly m copies of the previous RE should be matched
  {m,n}     matches from m to n repetitions of the preceding RE
  \         either escapes special characters or signals a special sequence
  []        indicates a set of characters. Characters can be listed individually, or a range of characters can be indicated by giving two characters and separating them by a -. Special characters are not active inside sets. Including a ^ as the first character matches the complement of the set
  |         A|B matches either A or B
  (...)     indicates the start and end of a group
  (?...)    this is an extension notation. See documentation for detail
  (?iLmsux) I ignorecase; L locale; M multiline; S dotall; U unicode; X verbose

*, +, ? and {m,n} are greedy. Append the ? qualifier to match non-greedily.

Special Sequences
  \number   matches the contents of the group of the same number. Groups are numbered starting from 1
  \A        matches only at the start of the string
  \b        matches the empty string at the beginning or end of a word
  \B        matches the empty string not at the beginning or end of a word
  \d        matches any decimal digit
  \D        matches any non-digit character
  \g<name>  use the substring matched by the group named 'name' for sub()
  \s        matches any whitespace character
  \S        matches any non-whitespace character
  \w        matches any alphanumeric character and the underscore
  \W        matches any non-alphanumeric character
  \Z        matches only at the end of the string

See the Python documentation on Regular Expression Syntax for more detail
http://docs.python.org/lib/re-syntax.html

-- http://mail.python.org/mailman/listinfo/python-announce-list Support the Python Software Foundation: http://www.python.org/psf/donations.html
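A few entries from the quick reference, shown with the standard re module directly (not with pyregex itself); the sample strings are made up for illustration.

```python
import re

# Greedy versus non-greedy: '*' takes as much as it can, '*?' as little.
greedy = re.match(r"<.*>", "<a><b>").group()
lazy = re.match(r"<.*?>", "<a><b>").group()

# Named groups and \g<name> in a substitution.
swapped = re.sub(r"(?P<first>\w+) (?P<last>\w+)",
                 r"\g<last> \g<first>",
                 "Wai Yip")

# \number refers back to an earlier group: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# The optional count argument of sub() limits the number of replacements.
limited = re.sub(r"\d", "#", "a1b2c3", count=2)

print(greedy)   # <a><b>
print(lazy)     # <a>
print(swapped)  # Yip Wai
print(doubled)  # is
print(limited)  # a#b#c3
```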
Re: HTMLTestRunner - generates HTML test report for unittest
On Fri, 27 Jan 2006 06:35:46 -0800, Paul McGuire wrote:

> Nice! I just adapted my pyparsing unit tests to use this tool - took me about 3 minutes, and now it's much easier to run and review my unit test results. I especially like the pass/fail color coding, and the drill-down to the test output.
>
> -- Paul

Thank you! I'm glad that it is helpful to you :)
ANN: HTMLTestRunner - generates HTML test report for unittest
Greetings,

HTMLTestRunner is an extension to the Python standard library's unittest module. It generates easy-to-use HTML test reports. See a sample report at http://tungwaiyip.info/software/sample_test_report.html.

More information and the download are at http://tungwaiyip.info/software/#htmltestrunner

Wai Yip Tung
Re: decode unicode string using 'unicode_escape' codecs
Cool, it works! I have also done some due diligence to check that the utf-8 encoding would not introduce any Python escape accidentally. I have written a recipe in the Python cookbook:

Efficient character escapes decoding
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466293

wy

> Does this do what you want?
>
>     >>> u'€\\n€'
>     u'\x80\\n\x80'
>     >>> len(u'€\\n€')
>     4
>     >>> u'€\\n€'.encode('utf-8').decode('string_escape').decode('utf-8')
>     u'\x80\n\x80'
>     >>> len(u'€\\n€'.encode('utf-8').decode('string_escape').decode('utf-8'))
>     3
>
> Basically, I convert the unicode string to bytes, escape the bytes using the 'string_escape' codec, and then convert the bytes back into a unicode string.
>
> HTH,
> STeVe
decode unicode string using 'unicode_escape' codecs
I have a unicode string in which some characters are encoded using Python notation, like '\n' for LF. I need to convert those to the actual LF character. There is a 'unicode_escape' codec that seems to suit my purpose.

    >>> encoded = u'A\\nA'
    >>> decoded = encoded.decode('unicode_escape')
    >>> print len(decoded)
    3

Note that both encoded and decoded are unicode strings. I'm trying to use the builtin codec because I assume it has better performance than a pure Python decoder I would write myself. But I'm not converting between byte string and unicode string.

However it runs into a problem in some cases.

    >>> encoded = u'€\\n€'
    >>> decoded = encoded.decode('unicode_escape')
    Traceback (most recent call last):
      File "g:\bin\py_repos\mindretrieve\trunk\minds\x.py", line 9, in ?
        decoded = encoded.decode('unicode_escape')
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 0: ordinal not in range(128)

Reading the documentation more carefully, I found out what has happened. decode('unicode_escape') takes a byte string as operand and converts it into a unicode string. Since encoded is already unicode, it is first implicitly converted to a byte string using the 'ascii' encoding. In this case that fails because of the '€' character.

So I resigned myself to the fact that 'unicode_escape' doesn't do what I want. But thinking about it more deeply, I came up with this Python source code. It runs OK and outputs 3.

    # -*- coding: utf-8 -*-
    print len(u'€\n€')  # 3

Think about what happens in the second line. First the parser decodes the bytes into a unicode string using the UTF-8 encoding. Then it applies the syntax rule to decode the unicode characters '\n' to LF. The second step is what I want. There must be something available to the Python interpreter that is not available to the user. So is there something I have overlooked?

Anyway I just want to leverage the builtin codecs for performance. I figure this would be faster than

    encoded.replace('\\n', '\n')
    ...and so on...

If there are other suggestions they would be greatly appreciated :)

wy
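In Python 3, where str is always unicode, one commonly used way to apply the escape rules to text is to go through ASCII with the 'backslashreplace' error handler, so that non-ASCII characters become escapes themselves and survive the trip. This is a sketch of that approach, not the recipe referenced elsewhere in the thread; `decode_escapes` is a name introduced here.

```python
def decode_escapes(text):
    """Interpret Python-style backslash escapes inside a str (Python 3).

    Non-ASCII characters are first turned into \\xNN / \\uNNNN escapes by
    'backslashreplace', then everything is decoded by 'unicode_escape'.
    """
    return text.encode("ascii", "backslashreplace").decode("unicode_escape")


encoded = "\u20ac\\n\u20ac"        # euro sign, backslash, 'n', euro sign
decoded = decode_escapes(encoded)

print(len(encoded))  # 4
print(len(decoded))  # 3
```

Both passes run inside the builtin codecs, which was the performance goal stated in the question.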
Re: performance of recursive generator
> You seem to be assuming that a yield statement and a function call are equivalent. I'm not sure that's a valid assumption.

I don't know. I was hoping the compiler can optimize away the chain of yields.

> Anyway, here's some data to consider:
>
> test.py
>
>     def gen(n):
>         if n:
>             for i in gen(n/2):
>                 yield i
>             yield n
>             for i in gen(n/2):
>                 yield i
>
>     def gen_wrapper(n):
>         return list(gen(n))
>
>     def nongen(n, func):
>         if n:
>             nongen(n/2, func)
>             func(n)
>             nongen(n/2, func)
>
>     def nongen_wrapper(n):
>         result = []
>         nongen(n, result.append)
>         return result

This test somewhat waters down the n^2 issue. The problem is in the depth of recursion; in this case it is only log(n). It is probably more interesting to test:

    def gen(n):
        if n:
            yield n
            for i in gen(n-1):
                yield i
performance of recursive generator
I love generators and I use them a lot. Lately I've been writing some recursive generators to traverse tree structures. After taking a closer look I have some concerns about their performance.

Let's take the inorder traversal from http://www.python.org/peps/pep-0255.html as an example.

    def inorder(t):
        if t:
            for x in inorder(t.left):
                yield x
            yield t.label
            for x in inorder(t.right):
                yield x

Consider a 4 level deep tree in which every node has only a right child:

    1
     \
      2
       \
        3
         \
          4

Using the recursive generator, the flow would go like this:

    main           gen1           gen2           gen3           gen4
    inorder(1..4)
                   yield 1
                   inorder(2..4)
                                  yield 2
                   yield 2
                                  inorder(3..4)
                                                 yield 3
                                  yield 3
                   yield 3
                                                 inorder(4)
                                                                yield 4
                                                 yield 4
                                  yield 4
                   yield 4

Note that there are 4 calls to inorder() and 10 yields. Indeed the complexity of traversing this kind of tree would be O(n^2)!

Compare that with a similar recursive function using a callback instead of a generator.

    def inorder(t, foo):
        if t:
            inorder(t.left, foo)
            foo(t.label)
            inorder(t.right, foo)

The flow would go like this:

    main           stack1         stack2         stack3         stack4
    inorder(1..4)
                   foo(1)
                   inorder(2..4)
                                  foo(2)
                                  inorder(3..4)
                                                 foo(3)
                                                 inorder(4)
                                                                foo(4)

There will be 4 calls to inorder() and 4 calls to foo(), giving a reasonable O(n) performance.

Is this an inherent issue in the use of recursive generators? Is there any compiler optimization possible?
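One way to keep the generator interface while yielding each label exactly once is to replace the recursion with an explicit stack. The sketch below is offered as an illustration of that alternative, not as an answer given in the thread; `Node` and `inorder_iter` are names introduced here.

```python
class Node:
    """Minimal binary tree node for the example."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder_iter(t):
    """Yield labels in order; every label passes through a single yield."""
    stack = []
    node = t
    while stack or node is not None:
        # Walk down the left spine, remembering the path.
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.label
        node = node.right


# The right-leaning tree from the post: 1 -> 2 -> 3 -> 4.
skewed = Node(1, right=Node(2, right=Node(3, right=Node(4))))
print(list(inorder_iter(skewed)))  # [1, 2, 3, 4]
```

For the 4-node tree this performs 4 yields instead of 10, at the cost of managing the stack by hand.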
Problem redirecting stdin on Windows
On Windows (XP) with the win32 extension installed, a Python script can be launched directly from the command line, since the .py extension is associated with Python. However, it fails if stdin is piped or redirected.

Assume there is an echo.py that reads from stdin and echoes the input. Launched directly from the command line, this echoes input from the keyboard:

    echo.py

But it causes an error if stdin is redirected:

    echo.py < textfile
    ...
    for line in fp:
    IOError: [Errno 9] Bad file descriptor

However, it works as expected if launched via python.exe:

    c:\Python24\python.exe echo.py < textfile

Why does the redirected form fail when the script is launched directly? It makes many scripts a lot less functional.
win32clipboard.GetClipboardData() return string with null characters
I was using win32clipboard.GetClipboardData() to retrieve the Windows clipboard, using code similar to the message below:

http://groups-beta.google.com/group/comp.lang.python/msg/3722ba3afb209314?hl=en

Somehow I noticed that the data returned includes \0 and, after the null character, some characters that shouldn't be there. It is easy enough to truncate them. But why are they there in the first place? Is the data length somehow calculated wrong?

I'm using Windows XP SP2 with Python 2.4 and pywin32-203.

aurora
Re: Unit testing - one test class/method, or test class/class
I do something more or less like your option b. I don't think there is any orthodox structure to follow. You should use a style that fits your taste.

What I really want to bring up is that you might want to look at refactoring your module in the first place. 348 test cases for one module sounds like a large number. It suggests you have a fairly complex module to test to start with. Often the biggest benefit of automated unit testing is that it forces developers to modularize and decouple their code in order to make it testable. This alone improves code quality a lot. If breaking up the module makes sense in your case, the test structure will follow.

> Hi,
>
> I just found py.test[1] and converted a large unit test module to py.test format (which is actually almost-no-format-at-all, but I won't get there now). Having 348 test cases in the module and huge test classes, I started to think about splitting classes. Basically you have at least three obvious choices, if you are going for consistency in your test modules:
>
> Choice a: Create a single test class for the whole module to be tested, whether it contains multiple classes or not.
>
> ...I don't think this method deserves closer inspection. It's probably a rather poor method to begin with. With py.test, where no subclassing is required (unlike Python's unittest, where you have to subclass unittest.TestCase), you'd probably be better off just writing a test method for each class and each class method in the module.
>
> Choice b: Create a test class for each class in the module, plus one class for any non-class methods defined in the module.
>
> + Feels clean, because each test class is mapped to one class in the module
> + It is rather easy to find all tests for a given class
> + Relatively easy to create a class skeleton automatically from the test module and the other way round
> - Test classes get huge easily
> - Missing test methods are not very easy to find[2]
> - A test method may depend on other tests in the same class
>
> Choice c: Create a test class for each non-class method and class method in the tested module.
>
> + Test classes are small, easy to find all tests for a given method
> + Helps in test isolation - having a separate test class for a single method makes the tested class less dependent on any other methods/classes
> + Relatively easy to create a test module from an existing class (but then you are not doing TDD!) but not vice versa
> - Large number of classes results in more overhead; more typing, probably requires subclassing because of common test class setup methods etc.
>
> What do you think, any important points I'm missing?
>
> Footnotes:
> [1] In reality, this is a secret plot to advertise py.test, see http://codespeak.net/py/current/doc/test.html
> [2] However, this problem disappears if you start with writing your tests first: with TDD, you don't have untested methods, because you start by writing the tests first, and end up with a module that passes the tests
>
> --
> # Edvard Majakari Software Engineer
> # PGP PUBLIC KEY available Soli Deo Gloria!
> One day, when he was naughty, Mr Bunnsy looked over the hedge into Farmer Fred's field and it was full of fresh green lettuces. Mr Bunnsy, however, was not full of lettuces. This did not seem fair. --Mr Bunnsy has an adventure
Re: running a shell command from a python program
In Python 2.4, use the new subprocess module for this. It subsumes the popen* functions.

> Hi, I'm a newbie, so please be gentle :-)
>
> How would I run a shell command in Python? Here is what I want to do: I want to run a shell command that outputs some stuff, save it into a list and do stuff with the contents of that list. I started with a BASH script actually, until I realized I really needed better data structures :-)
>
> Is popen the answer? Also, where online would I get access to good sample code that I could peruse? I'm running 2.2.3 on Linux, and going strictly by online doc so far.
>
> Thanks!
> S C
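As a rough illustration of the subprocess suggestion, here is a minimal sketch for current Python 3 that runs a command and collects its output lines into a list. `run_lines` is a made-up helper name, and `subprocess.run` is newer than the Python 2.4 API discussed in the thread (there, `subprocess.Popen(...).communicate()` would be the closest equivalent). The command invoked is just the Python interpreter itself, so the example does not depend on any particular shell tool.

```python
import subprocess
import sys

def run_lines(argv):
    """Run a command (given as an argument list) and return stdout as a list of lines."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output to text instead of bytes
        check=True,               # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout.splitlines()

# Stand-in for "a shell command that outputs some stuff".
lines = run_lines([sys.executable, "-c", "print('alpha'); print('beta')"])
print(lines)   # ['alpha', 'beta']
```

Passing the command as a list avoids going through the shell at all, which sidesteps quoting problems when arguments contain spaces.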
Re: Python and Ajax technology collaboration
It was discussed in the last Bay Area Python Interest Group meeting. Thursday, February 10, 2005 Agenda: Developing Responsive GUI Applications Using HTML and HTTP Speakers: Donovan Preston http://www.baypiggies.net/ The author has a component LivePage for this. You may find it from http://nevow.com/. Similar idea from the Javascript stuff but very Python centric. Interesting GUI developments, it seems. Anyone developed a Ajax application using Python? Very curious thx (Ajax stands for: XHTML and CSS; dynamic display and interaction using the Document Object Model; data interchange and manipulation using XML and XSLT; asynchronous data retrieval using XMLHttpRequest; and JavaScript binding everything together ie Google has used these technologies to build Gmail, Google Maps etc. more info: http://www.adaptivepath.com/publications/essays/archives/000385.php) -- http://mail.python.org/mailman/listinfo/python-list
Re: unicode encoding usablilty problem
On Sat, 19 Feb 2005 18:44:27 +0100, Fredrik Lundh [EMAIL PROTECTED] wrote: aurora [EMAIL PROTECTED] wrote: I don't want to mix them. But how could I find them? How do I know this statement can be potential problem if a==b: where a and b can be instantiated individually far away from this line of code that put them together? if you don't know what a and b comes from, how can you be sure that your program works at all? how can you be sure they're both strings? (a op b can fail in many ways, depending on what a, b, and op are) a and b are both string. The issue is 8-bit string or unicode string. Things works fine, unit tests pass, all until the first non-ASCII characters come in and then the program breaks. if you have unit tests, why don't they include Unicode tests? /F How do I structure the test cases to guarantee coverage? It is not practical to test every combinations of unicode/8-bit strings. Adding non-ascii characters to test data probably make problem pop up earlier. But it is arduous and it is hard to spot if you left out any. -- http://mail.python.org/mailman/listinfo/python-list
Re: unicode encoding usablilty problem
On Sun, 20 Feb 2005 15:01:09 +0100, Martin v. Löwis [EMAIL PROTECTED] wrote: Nick Coghlan wrote: Having , u, and r be immutable, while b was mutable would seem rather inconsistent. Yes. However, this inconsistency might be desirable. It would, of course, mean that the literal cannot be a singleton. Instead, it has to be a display (?), similar to list or dict displays: each execution of the byte string literal creates a new object. An alternative would be to have bytestr be the immutable type corresponding to the current str (with b literals producing bytestr's), while reserving the bytes name for a mutable byte sequence. Indeed. This maze of options has caused the process to get stuck. People also argue that with such an approach, we could as well tell users to use array.array for the mutable type. But then, people complain that it doesn't have all the library support that strings have. The main point being, the replacement for 'str' needs to be immutable or the upgrade process is going to be a serious PITA. Somebody really needs to take this in his hands, completing the PEP, writing a patch, checking applications to find out what breaks. Regards, Martin What is the processing of getting a PEP work out? Does the work and discussion carry out in the python-dev mailing list? I would be glad to help out especially on this particular issue. -- http://mail.python.org/mailman/listinfo/python-list
Re: unicode and socket
On 18 Feb 2005 19:10:36 -0800, [EMAIL PROTECTED] wrote:

> It's really funny, I cannot send a unicode stream through a socket with python while all the other languages such as perl, c and java can do it. Then, how about converting the unicode string to a binary stream? Is it possible to send binary data through a socket with python?

I was answering your specific question:

> How can I send the unicode string to the remote end of the socket as it is without any conversion of encode

The answer is that you cannot. It is not that you cannot send unicode, but that you have to encode it first. The same applies to perl, c or Java. The only difference is the detail of how strings get encoded. A few posts have suggested various means. Or you can check out codecs.getwriter(), which more closely resembles Java's way.
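To make the encode-then-send point concrete, here is a small sketch in current Python 3, where text strings are always Unicode and sockets accept only bytes. The helper names `send_text` and `recv_text`, and the choice of UTF-8, are assumptions for illustration; both ends simply have to agree on the encoding.

```python
import socket

def send_text(sock, text, encoding="utf-8"):
    """Encode the text to bytes and send all of them."""
    sock.sendall(text.encode(encoding))

def recv_text(sock, encoding="utf-8", bufsize=4096):
    """Receive one chunk and decode it.

    A single recv() is enough for this tiny demo; a real protocol needs
    framing, since a message (or even one multi-byte character) can be
    split across several recv() calls.
    """
    return sock.recv(bufsize).decode(encoding)

# A connected pair of sockets stands in for the two ends of a connection.
left, right = socket.socketpair()
try:
    send_text(left, "h\u00e5llo")
    received = recv_text(right)
finally:
    left.close()
    right.close()

print(received == "h\u00e5llo")   # True
```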
unicode encoding usablilty problem
I have long found Python's default encoding of strict ASCII frustrating. For one thing, I would rather get a garbage character than an exception. But the biggest issue is that Unicode exceptions often pop up in unexpected places, and only when a non-ASCII or unicode character first finds its way into the system.

Below is an example. The program may run fine at the beginning. But as soon as a unicode string such as u'b' is introduced, the program blows up unexpectedly.

    >>> sys.getdefaultencoding()
    'ascii'
    >>> a='\xe5'
    >>> # can print, you think you're ok
    ... print a
    å
    >>> b=u'b'
    >>> a==b
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)

One may suggest that the correct way to do it is to use decode, such as

    a.decode('latin-1') == b

This brings up another issue. Most references and books focus exclusively on entering unicode literals and using the encode/decode methods. The fallacy is that string is such a basic data type, used throughout the program, that you really don't want to make an individual decision every time you use a string (and take a penalty for any negligence). Java has a much more usable model, with unicode used internally and encoding/decoding decisions needed only twice, when dealing with input and output.

I am sure these errors are a nuisance to those who are only half aware of unicode. Even for those who choose to use unicode, it is almost impossible to ensure their programs work correctly.
Re: Newbie CGI problem
Not sure about the repeated hi. But according to the HTTP specification you are supposed to use \r\n\r\n, not just \n\n.

> #!/usr/bin/python
> import cgi
> print "Content-type: text/html\n\n"
> print "hi"
>
> Gives me the following in my browser:
>
> '''
> hi
> Content-type: text/html
>
> hi
> '''
>
> Why are there two 'hi's?
>
> Thanks,
> Rory
Re: Newbie CGI problem
On Fri, 18 Feb 2005 18:36:10 +0100, Peter Otten [EMAIL PROTECTED] wrote:

> Rory Campbell-Lange wrote:
>
>> #!/usr/bin/python
>> import cgi
>> print "Content-type: text/html\n\n"
>> print "hi"
>>
>> Gives me the following in my browser:
>>
>> '''
>> hi
>> Content-type: text/html
>>
>> hi
>> '''
>>
>> Why are there two 'hi's?
>
> You have chosen a bad name for your script: cgi.py. It is now self-importing. Rename it to something that doesn't clash with the standard library, and all should be OK.
>
> Peter

You are a genius.
Re: unicode and socket
You cannot. Unicode is an abstract data type. It must be encoded into octets in order to be sent via a socket, and the other end must decode the octets to retrieve the unicode string. Needless to say, the encoding scheme must be consistent and understood by both ends.

On 18 Feb 2005 11:03:46 -0800, [EMAIL PROTECTED] wrote:

> hello all,
>
> I am new to Python. And I have got a problem with unicode. I have a unicode string; when I tried to send it out through a socket with send(), I got an exception. How can I send the unicode string to the remote end of the socket as it is without any conversion of encode, so the remote end of the socket will receive a unicode string?
>
> Thanks
Re: unicode encoding usablilty problem
On Fri, 18 Feb 2005 20:18:28 +0100, Walter Dörwald [EMAIL PROTECTED] wrote:

> aurora wrote:
>
>> [...]
>> In Java they are distinct data types and the compiler would catch all incorrect usage. In Python, the interpreter seems to 'help' us by promoting binary strings to unicode. Things work fine and unit tests pass, until the first non-ASCII characters come in and then the program breaks.
>>
>> Is there a scheme for Python developers to use so that they are safe from incorrect mixing?
>
> Put the following:
>
>     import sys
>     sys.setdefaultencoding("undefined")
>
> in a file named sitecustomize.py somewhere in your Python path and Python will complain whenever there's an implicit conversion between str and unicode.
>
> HTH,
> Walter Dörwald

That helps! Running the unit tests caught quite a few potential problems (as well as a lot of safe promotions of ASCII strings).
Re: unicode encoding usablilty problem
On Fri, 18 Feb 2005 21:16:01 +0100, Martin v. Löwis [EMAIL PROTECTED] wrote: I'd like to point out the historical reason: Python predates Unicode, so the byte string type has many convenience operations that you would only expect of a character string. We have come up with a transition strategy, allowing existing libraries to widen their support from byte strings to character strings. This isn't a simple task, so many libraries still expect and return byte strings, when they should process character strings. Instead of breaking the libraries right away, we have defined a transitional mechanism, which allows to add Unicode support to libraries as the need arises. This transition is still in progress. I understand. So I wasn't yelling why can't Python be more like Java. On the other hand I also want to point out making individual decision for each string wasn't practical and is very error prone. The fact that unicode and 8 bit string look alike and work alike in common situation but only run into problem with non-ASCII is very confusing for most people. Eventually, the primary string type should be the Unicode string. If you are curious how far we are still off that goal, just try running your program with the -U option. Lots of errors. Amount them are gzip (binary?!) and strftime?? I actually quite appriciate Python's power in processing binary data as 8-bit strings. But perhaps we should transition to use unicode as text string as treat binary string as exception. Right now we have '' - 8bit string; u'' unicode string How about b'' - 8bit string; '' unicode string and no automatic conversion. Perhaps this can be activated by something like the encoding declarations, so that transition can happen module by module. Regards, Martin -- http://mail.python.org/mailman/listinfo/python-list
Re: DHTML control from Python?
IE should be able to do that. Install the win32 modules. Then you should simply embed Python using script language='python'. Not sure about Mac. Even on Windows your audiences are limited to those who have IE+python+win32 modules. Are there any ways to use Python (rather than JavaScript) for controlling DHTML? I don't mind writing JavaScript stubs which can be called by Python, so long as I need to do so only once for a particular feature. I'm running Mac OS X 10.3, so comments as to the best browser for testing this would also be appreciated. If you could also email as well as posting a reply, I'd be grateful. email to : kenneth.m.mcdonald _at_ sbcglobal.net Thanks, Ken McDonald -- http://mail.python.org/mailman/listinfo/python-list
Re: executing VBScript from Python and vice versa
Go to the bookstore and get a copy of Python Programming on Win32 by Mark Hammond, Andy Robinson today. http://www.oreilly.com/catalog/pythonwin32/ It has everything you need. Is there a way to make programs written in these two languages communicate with each other? I am pretty sure that VBScript can access a Python script because Python is COM compliant. On the other hand, Python might be able to call a VBScript through WSH. Can somebody provide a simple example? I have exactly 4 days of experience in Python (and fortunately, much more in VB6) Thanks. -- http://mail.python.org/mailman/listinfo/python-list
Re: OT: why are LAMP sites slow?
aurora [EMAIL PROTECTED] writes: Slow compares to what? For a large commerical site with bigger budget, better infrastructure, better implementation, it is not surprising that they come out ahead compares to hobbyist sites. Hmm, as mentioned, I'm not sure what the commercial sites do that's different. I take the view that the free software world is capable of anything that the commercial world is capable of, so I'm not awed just because a site is commercial. And sites like Slashdot have pretty big budgets by hobbyist standards. Putting implementation aside, is LAMP inherently performing worst than commerical alternatives like IIS, ColdFusion, Sun ONE or DB2? Sounds like that's your perposition. I wouldn't say that. I don't think Apache is a bottleneck compared with other web servers. Similarly I don't see an inherent reason for Python (or whatever) to be seriously slower than Java servlets. I have heard that MySQL doesn't handle concurrent updates nearly as well as DB2 or Oracle, or for that matter PostgreSQL, so I wonder if busier LAMP sites might benefit from switching to PostgreSQL (LAMP = LAPP?). I'm lost. So what do you compares against when you said LAMP is slow? What is the reference point? Is it just a general observation that slashdot is slower than we like it to be? If you are talking about slashdot, there are many ideas to make it faster. For example they can send all 600 comments to the client and let the user do querying using DHTML on the client side. This leave the server serving mostly static files and will certainly boost the performance tremendously. If you mean MySQL or SQL database in general is slow, there are truth in it. The best thing about SQL database is concurrent access, transactional semantics and versatile querying. Turns out a lot of application can really live without that. If you can rearchitect the application using flat files instead of database it can often be a big bloom. A lot of these is just implementation. 
Find the right tool and the right design for the job. I still don't see a case that LAMP based solution is inherently slow. -- http://mail.python.org/mailman/listinfo/python-list
Re: hotspot profiler experience and accuracy?
Thanks for pointing me to your analysis. Now I know it wasn't me doing something wrong. hotspot did lead me to knock down a major performance bottleneck one time. I found that zipfile.ZipFile() basically read the entire zip file in instantiation time, even though you may only need one file from it subsequencely. In anycase the number of function call seems to make sense and it should give some insight to the runtime behaviour. The CPU time is just so misleading. aurora wrote: But the numbers look skeptical. Hotspot claim 71.166 CPU seconds but the actual elapsed time is only 54s. When measuring elapsed time instead of CPU time the performance gain is only 13% with the profiler running and down to 10% when not using the profiler. Is there something I misunderstood in reading the numbers? Well, I'm confused too. Look at my post from a few months ago: http://tinyurl.com/6awzj (note that my code contained a few errors and that you need to use the fixed code that I posted a few replies later). Perhaps somebody can explain a bit more about this this time? :-) At the moment, frankly, hotspot seems rather useless. --Irmen -- http://mail.python.org/mailman/listinfo/python-list
Re: Printing Filenames with non-Ascii-Characters
print d.encode('cp437') So I would have to specify the encoding on every call to print? I am sure to forget and I don't like the program dying, in my case garbled output would be much more acceptable. Marian I'm with you. You never known you have put enough encode in all the right places and there is no static type checking to help you. So that short answer is to set a different default in sitecustomize.py. I'm trying to writeup something about unicode in Python, once I understand what's going on inside... -- http://mail.python.org/mailman/listinfo/python-list
Re: Next step after pychecker
A frequent error I encounter:

    try:
        ...do something...
    except IOError:
        log('encounter an error %s line %d' % filename)

Here in the string interpolation I should supply (filename, lineno).

Usually I have a lot of unit tests to catch errors in the main code. But it is very difficult to reach the exception handlers, some of which are added defensively. Unfortunately those untested exception handlers sometimes fail precisely when we need them for diagnostic information.

pychecker sometimes gives false alarms here, because the argument of a string interpolation may be a valid tuple. It would be great if we could somehow unit test the exception handlers (without building an extensive library of mock objects).
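One lightweight way to exercise such a handler, sketched here for current Python 3, is to let the function accept the failing operation as a parameter so that a test can pass in a stand-in that raises. The names `load_line`, `opener`, and `log` are hypothetical and only illustrate the idea; this is not the poster's code.

```python
import unittest

def load_line(filename, lineno, opener=open, log=print):
    """Return line number `lineno` (1-based) of a file, or None on I/O failure."""
    try:
        with opener(filename) as fp:
            return fp.read().splitlines()[lineno - 1]
    except IOError:
        # Both values go in a tuple; passing only `filename` would itself
        # raise TypeError inside the handler.
        log('encountered an error in %s line %d' % (filename, lineno))
        return None

class ExceptionHandlerTest(unittest.TestCase):
    def test_handler_logs_without_crashing(self):
        def failing_opener(name):
            # Stand-in that forces the except branch to run.
            raise IOError("simulated failure")

        messages = []
        result = load_line("missing.cfg", 3, opener=failing_opener,
                           log=messages.append)
        self.assertIsNone(result)
        self.assertEqual(messages,
                         ['encountered an error in missing.cfg line 3'])

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExceptionHandlerTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The design choice is to inject the one call that can fail, so no mock library is needed; the cost is an extra parameter on the function under test.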
Re: Printing Filenames with non-Ascii-Characters
On Tue, 01 Feb 2005 20:28:11 +0100, Marian Aldenhövel [EMAIL PROTECTED] wrote:

> Hi,
>
> I am very new to Python and have run into the following problem. If I do something like
>
>     dir = os.listdir(somepath)
>     for d in dir:
>         print d
>
> The program fails for filenames that contain non-ascii characters.
>
>     'ascii' codec can't encode characters in position 33-34:
>
> I have noticed that this seems to be a very common problem. I have read a lot of postings regarding it but not really found a solution. Is there a simple one?

The English Windows command prompt uses the cp437 charset. To print the name, use

    print d.encode('cp437')

The issue is that a terminal only understands a certain character set. If you have a unicode string, like d in your case, you have to encode it before it can be printed. (We really need a native unicode terminal!!!)

If you don't encode, Python will do it for you. The default encoding is ASCII, so any string that contains a non-ASCII character will give you trouble. In my opinion Python is too conservative in using the 'strict' encoding, which causes a lot of woe for users who are unaware of unicode.

So how did you get a unicode d to start with? If 'somepath' is unicode, os.listdir returns a list of unicode strings. So why is somepath unicode? Either you have entered a unicode literal or it comes from some other source. One possible source is an XML parser, which returns strings in unicode.

Windows NT supports unicode filenames. I'm not sure about Linux; the result may differ slightly.

> What I specifically do not understand is why Python wants to interpret the string as ASCII at all. Where is this setting hidden?
>
> I am running Python 2.3.4 on Windows XP and I want to run the program on Debian sarge later.
>
> Ciao, MM
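For readers who prefer garbled output to an exception, the codec error handler can be changed from 'strict' to 'replace'. The sketch below uses current Python 3 syntax; `printable` is a hypothetical helper name, and cp437 is used only because it is the code page mentioned above.

```python
def printable(name, encoding="cp437"):
    """Return `name` with every character the target encoding cannot
    represent replaced by '?', so that printing never raises."""
    return name.encode(encoding, errors="replace").decode(encoding)

# A name that cp437 can represent passes through unchanged.
print(printable("Aldenh\u00f6vel.txt") == "Aldenh\u00f6vel.txt")   # True

# A character outside cp437 becomes '?' instead of raising an error.
print(printable("\u4e2d.txt"))   # ?.txt
```

With the default 'strict' handler, the second name would raise UnicodeEncodeError, which is the failure mode described in the thread.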
hotspot profiler experience and accuracy?
I have a parser I need to optimize. It does some disk IO and a lot of looping over characters.

I used the hotspot profiler to gain insight into optimization options. The methods that show up at the top of the list seem fairly trivial and do not look like CPU hogs. Nevertheless I optimized them and got a 25% performance gain according to hotspot's numbers.

But the numbers look suspicious. Hotspot claims 71.166 CPU seconds, but the actual elapsed time is only 54s. When measuring elapsed time instead of CPU time, the performance gain is only 13% with the profiler running, and down to 10% when not using the profiler.

Is there something I misunderstood in reading the numbers?
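As a cross-check that does not depend on any profiler, elapsed time and CPU time can be measured separately around the same call. This is only a sketch for current Python 3 using the standard time module rather than hotspot; `measure` and `busy` are illustrative names.

```python
import time

def measure(func, *args):
    """Call func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu

def busy(n):
    """CPU-bound stand-in for the parser's character loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

result, wall, cpu = measure(busy, 100000)
print("wall %.4fs, cpu %.4fs" % (wall, cpu))
```

For a single-threaded, CPU-bound function the two numbers should be close; disk IO or waiting on other processes tends to push elapsed time above CPU time.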
Go visit Xah Lee's home page
Let's stop discussing about the perl-python non-sense. It is so boring. For a break, just visit Mr Xah Lee's personal page (http://xahlee.org/PageTwo_dir/Personal_dir/xah.html). You'll find lot of funny information and quotes from this queer personality. Thankfully no perl-python stuff there. Don't miss Mr. Xah Lee's recent pictures at http://xahlee.org/PageTwo_dir/Personal_dir/mi_pixra.html My favor is the last picture. Long haired Xah Lee sitting contemplatively in the living room. The caption says my beautiful hair, fails to resolve the problems of humanity. And, it is falling apart by age. -- http://mail.python.org/mailman/listinfo/python-list
Re: Transparent (redirecting) proxy with BaseHTTPServer
It should be very safe to count on the Host header. Maybe some really, really old browsers do not support it, but they probably won't work on today's WWW anyway. The majority of today's web sites are likely to be virtually hosted. One Apache may be hosting 50 web addresses. If a client strips the host name and does not send the Host header either, the web server wouldn't know which address it is really looking for.

If you catch a request that doesn't have a Host header, it is a good idea to redirect it to a browser upgrade page.

> Thanks, aurora ;),
>
> aurora wrote:
>
>> If you actually want the IP, resolve the host header would give you that.
>
> I'm only interested in the hostname.
>
>> The second form of HTTP request without the host part is for compatability of pre-HTTP/1.1 standard. All modern web browser should send the Host header.
>
> How safe is the assumption that the Host header will be there? Is it part of the HTTP/1.1 spec? And does it mean all pre-1.1 clients will fail? Hmm, maybe I should look on the wire at what's really happening...
>
> thanks again
> Paul
Re: Transparent (redirecting) proxy with BaseHTTPServer
If you actually want the IP, resolving the Host header would give you that.

In the redirect case you should get a Host header like

    Host: www.python.org

From that you can reconstruct the original URL as http://www.python.org/ftp/python/contrib/. With that you can open it using urllib and proxy the data to the client.

The second form of HTTP request, without the host part, is for compatibility with the pre-HTTP/1.1 standard. All modern web browsers should send the Host header.

> Hi list,
>
> My ultimate goal is to have a small HTTP proxy which is able to show a message specific to the client's name/ip/status, then handle the original request normally, either by redirecting the client or acting as a proxy.
>
> I started with a modified[1] version of TinyHTTPProxy posted by Suzuki Hisao somewhere in 2003 to this list and tried to extend it to my needs. It works quite well if I configure my client to use it, but using the iptables REDIRECT feature to point the clients transparently to the proxy caused some issues.
>
> Precisely, the self.path member variable of BaseHTTPRequestHandler is missing the host (i.e. www.python.org) part of the request line for REDIRECTed connections:
>
> without iptables REDIRECT:
>     self.path -> GET http://www.python.org/ftp/python/contrib/ HTTP/1.1
> with REDIRECT:
>     self.path -> GET /ftp/python/contrib/ HTTP/1.1
>
> I asked about this on the squid mailing list and was told this is normal and I have to reconstruct the request line from the real destination IP, the URL path and the Host header (if any). If the Host header is sent it's an (unsafe) no-brainer, but I cannot for the life of me figure out where to get the real destination IP.
>
> Any ideas?
>
> thanks
> Paul
>
> [1] HTTP Debugging Proxy Modified by Xavier Defrang (http://defrang.com/)
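The reconstruction described in the reply can be sketched as a small pure function, written here for current Python 3. `reconstruct_url` is a hypothetical helper, not part of BaseHTTPServer, and it assumes plain HTTP unless told otherwise.

```python
def reconstruct_url(host_header, path, scheme="http"):
    """Rebuild the absolute URL of a transparently redirected request.

    host_header -- value of the request's Host header, or None if absent
    path        -- request target as seen by the proxy (handler.path)
    """
    # A client configured to use the proxy already sends an absolute URL.
    if path.startswith(("http://", "https://")):
        return path
    if not host_header:
        # Pre-HTTP/1.1 client without a Host header: nothing to rebuild from.
        raise ValueError("no Host header; original destination unknown")
    return "%s://%s%s" % (scheme, host_header.strip(), path)

print(reconstruct_url("www.python.org", "/ftp/python/contrib/"))
# http://www.python.org/ftp/python/contrib/
```

Inside a request handler, the two arguments would typically come from `self.headers.get("Host")` and `self.path`. Requests that raise ValueError are the ones the follow-up message suggests sending to a browser upgrade page.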
Re: limited python virtual machine (WAS: Another scripting language implemented into Python itself?)
It is really necessary to build a VM from the ground up that includes OS ability? What about JavaScript? On Wed, Jan 26, 2005 at 05:18:59PM +0100, Alexander Schremmer wrote: On Tue, 25 Jan 2005 22:08:01 +0100, I wrote: sys.safecall(func, maxcycles=1000) could enter the safe mode and call the func. This might be even enhanced like this: import sys sys.safecall(func, maxcycles=1000, allowed_domains=['file-IO', 'net-IO', 'devices', 'gui'], allowed_modules=['_sre']) Any comments about this from someone who already hacked CPython? Yes, this comes up every couple months and there is only one answer: This is the job of the OS. Java largely succeeds at doing sandboxy things because it was written that way from the ground up (to behave both like a program interpreter and an OS). Python the language was not, and the CPython interpreter definitely was not. Search groups.google.com for previous discussions of this on c.l.py -Jack -- http://mail.python.org/mailman/listinfo/python-list
Re: list unpack trick?
On Sat, 22 Jan 2005 10:03:27 -0800, aurora [EMAIL PROTECTED] wrote: I am think more in the line of string.ljust(). So if we have a list.ljust(length, filler), we can do something like name, value = s.split('=',1).ljust(2,'') I can always break it down into multiple lines. The good thing about list unpacking is its a really compact and obvious syntax. Just to clarify the ljust() is a feature wish, probably should be named something like pad(). Also there is another thread a few hours before this asking about essentially the same thing. default value in a list http://groups-beta.google.com/group/comp.lang.python/browse_frm/thread/f3affefdb4272270 -- http://mail.python.org/mailman/listinfo/python-list
Re: list unpack trick?
Thanks. I'm just trying to see if there is some concise syntax available without getting into obscurity. As for my purpose, Siegmund's suggestion works quite well.

The few forms you have suggested work. But as they refer to the list multiple times, they need a separate assignment statement like

    list = s.split('=',1)

I am thinking more along the lines of string.ljust(). So if we had a list.ljust(length, filler), we could do something like

    name, value = s.split('=',1).ljust(2,'')

I can always break it down into multiple lines. The good thing about list unpacking is that it's a really compact and obvious syntax.

On Sat, 22 Jan 2005 08:34:27 +0100, Fredrik Lundh [EMAIL PROTECTED] wrote:

>> ...
>> So more generally, is there an easy way to pad a list into length of n with filler items appended at the end?
>
> some variants (with varying semantics):
>
>     list = (list + n*[item])[:n]
>
> or
>
>     list += (n - len(list)) * [item]
>
> or (readable):
>
>     if len(list) < n:
>         list.extend((n - len(list)) * [item])
>
> etc.
>
> /F
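Since lists have no such method, the wished-for behaviour can be approximated with a small helper function. The sketch below is for current Python 3; `pad` is a hypothetical name echoing the follow-up message, not an existing list method.

```python
def pad(items, length, filler=None):
    """Return a new list with `filler` appended until it has at least
    `length` items; longer inputs are returned unchanged (as a copy)."""
    items = list(items)
    # Multiplying a list by a negative number yields [], so no length check is needed.
    return items + [filler] * (length - len(items))

name, value = pad("key=value".split("=", 1), 2, "")
print(name, value)          # key value

name, value = pad("key".split("=", 1), 2, "")
print(repr(name), repr(value))   # 'key' ''
```

For the specific name/value case, `str.partition("=")` in later Python versions always returns three parts, which avoids padding altogether.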
Re: A Fundamental Turn Toward Concurrency in Software
Of course there are many performance bottleneck, CPU, memory, I/O, network all the way up to the software design and implementation. As a software guy myself I would say by far better software design would lead to the greatest performance gain. But that doesn't mean hardware engineer can sit back and declare this as software's problem. Even if we are not writing CPU intensive application we will certain welcome free performace gain coming from a faster CPU or a more optimized compiler. I think this is significant because it might signify a paradigm shift. This might well be a hype, but let's just assume this is future direction of CPU design. Then we might as well start experimenting now. I would just throw some random ideas: parallel execution at statement level, look up symbol and attributes predicitively, parallelize hash function, dictionary lookup, sorting, list comprehension, etc, background just-in-time compilation, etc, etc. One of the author's idea is many of today's main stream technology (like OO) did not come about suddenly but has cumulated years of research before becoming widely used. A lot of these ideas may not work or does not seems to matter much today. But in 10 years we might be really glad that we have tried. aurora [EMAIL PROTECTED] writes: Just gone though an article via Slashdot titled The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software [http://www.gotw.ca/publications/concurrency-ddj.htm]. It argues that the continous CPU performance gain we've seen is finally over. And that future gain would primary be in the area of software concurrency taking advantage hyperthreading and multicore architectures. Well, another gain could be had in making the software less wasteful of cpu cycles. I'm a pretty experienced programmer by most people's standards but I see a lot of systems where I can't for the life of me figure out how they manage to be so slow. It might be caused by environmental pollutants emanating from Redmond. 
-- http://mail.python.org/mailman/listinfo/python-list
A Fundamental Turn Toward Concurrency in Software
Hello!

I just went through an article via Slashdot titled The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software [http://www.gotw.ca/publications/concurrency-ddj.htm]. It argues that the continuous CPU performance gain we've seen is finally over, and that future gains will come primarily from software concurrency that takes advantage of hyperthreading and multicore architectures.

Perhaps something the Python interpreter team can ponder.