[issue14814] Implement PEP 3144 (the ipaddress module)
Changes by Alan Kennedy python-...@xhaus.com: -- nosy: +amak ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14814 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue3566] httplib persistent connections violate MUST in RFC2616 sec 8.1.4.
Changes by Alan Kennedy python-...@xhaus.com: -- nosy: +amak ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3566
How to run jython WSGI applications on Google AppEngine.
Hi all, You can find instructions about how to run jython Web applications on Google AppEngine, using WSGI and modjy, here. http://jython.xhaus.com You can see the jython 2.5 Demo WSGI application running, here. http://jywsgi.appspot.com Regards, Alan Kennedy. -- http://mail.python.org/mailman/listinfo/python-list
Jython on Google AppEngine.
Hi all,

You may be interested to know that you can now run jython 2.2 out of the box on Google AppEngine, thanks to their new java support.

A patch is required for jython 2.5, but we will be folding this in before the jython 2.5 RC release over the next few weeks.

More details here
http://jython.xhaus.com

Regards,

Alan.
[issue2550] SO_REUSEADDR doesn't have the same semantics on Windows as on Unix
Changes by Alan Kennedy [EMAIL PROTECTED]: -- nosy: +amak __ Tracker [EMAIL PROTECTED] http://bugs.python.org/issue2550
[issue2452] inaccuracy in httplib timeout documentation
Changes by Alan Kennedy [EMAIL PROTECTED]: -- nosy: +amak __ Tracker [EMAIL PROTECTED] http://bugs.python.org/issue2452
Re: JPype - passing to Java main
> -ERROR_-
>   File "tester.py", line 10, in <module>
>     com.JPypeTest.main(arg)
> RuntimeError: No matching overloads found. at src/native/common/jp_method.cpp:121
> --END ERROR-

I haven't used jpype, but the signature for java main functions is

    public static void main (String [] args)

So try

    com.JPypeTest.main([arg])

Note the addition of square brackets to create a *list* of arguments, which presumably jpype will transform into a java String[].

Alan.
Re: python and JMS
[tksri2000]
> I am looking to use python to talk to JMS. Can someone please point me to such resources if this is possible.

PyHJB is the python-to-JMS gateway. ... via HJB, the HTTP JMS bridge.
http://hjb.python-hosting.com/

HJB (HTTP JMS Bridge)
http://hjb.berlios.de/

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: smtplib timeout
[Stuart D. Gathman]
> I need to set a timelimit for the operation of smtplib.sendmail. It has to be thread based, because pymilter uses libmilter which is thread based.
>
> There are some cookbook recipes which run a function in a new thread and call Thread.join(timeout). This doesn't help, because although the calling thread gets a nice timeout exception, the thread running the function continues to run. In fact, the problem is worse, because even more threads are created.

Have you tried setting a default socket timeout, which applies to all socket operations?

Here is a code snippet which times out for server connections. Timeouts should also work for sending and receiving on sockets that are already connected, i.e. should work for the smtplib.sendmail call.

==
import socket
import smtplib

dud_server = '192.168.1.1'
timeout_value = 1.0 # seconds

socket.setdefaulttimeout(timeout_value)

print "connecting to server: %s" % dud_server
try:
    connection = smtplib.SMTP(dud_server)
except socket.timeout:
    print "server timed out"
==

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
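A Python 3 sketch of the same idea, for readers who want to try it without an unreachable SMTP server. It is an illustration under assumptions, not the poster's code: a local listening socket that never replies stands in for the dud server, and the helper name `recv_with_default_timeout` is made up for this example. It shows the default timeout applying to a read on a connected socket.

```python
import socket

# A local listening socket that never accepts or replies stands in for
# the unresponsive server (assumption for the demo; the OS picks the port).
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)
address = silent_server.getsockname()


def recv_with_default_timeout(address, timeout_value=0.5):
    """Return True if reading from the silent server timed out."""
    previous = socket.getdefaulttimeout()
    # The default applies to socket objects created after this call.
    socket.setdefaulttimeout(timeout_value)
    try:
        client = socket.create_connection(address)
        try:
            client.recv(1)  # nothing will ever arrive
            return False
        except socket.timeout:
            return True
        finally:
            client.close()
    finally:
        # Restore the process-wide default so other code is unaffected.
        socket.setdefaulttimeout(previous)


timed_out = recv_with_default_timeout(address)
silent_server.close()
print("timed out:", timed_out)
```

Restoring the previous default at the end is a design choice for the demo; a program that wants the timeout everywhere would simply set it once at startup, as suggested above.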
Re: smtplib timeout
[Stuart D. Gathman]
> I need to set a timelimit for the operation of smtplib.sendmail. It has to be thread based, because pymilter uses libmilter which is thread based.

[Alan Kennedy]
> Have you tried setting a default socket timeout, which applies to all socket operations?

[Stuart D. Gathman]
> Does this apply to all threads, is it inherited when creating threads, or does each thread need to specify it separately?

It is a module level default, which affects all sockets created after the socket.setdefaulttimeout() call, regardless of whether they are used in threads or not. So you only need to call it once, probably before any other processing takes place.

=
import socket
import smtplib
import threading

dud_server = '192.168.1.1'
timeout_value = 1.0 # seconds

socket.setdefaulttimeout(timeout_value)

def do_connect(tno):
    print "Thread%d: connecting to server: %s" % (tno, dud_server)
    try:
        connection = smtplib.SMTP(dud_server)
    except socket.timeout:
        print "Thread%d: server timed out" % tno

for x in range(5):
    t = threading.Thread(target=do_connect, args=(x,))
    t.start()
=

C:\>python smtp_timeout.py
Thread0: connecting to server: 192.168.1.1
Thread1: connecting to server: 192.168.1.1
Thread2: connecting to server: 192.168.1.1
Thread3: connecting to server: 192.168.1.1
Thread4: connecting to server: 192.168.1.1
Thread0: server timed out
Thread1: server timed out
Thread2: server timed out
Thread4: server timed out
Thread3: server timed out

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
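The point that the default is process-wide rather than per-thread can be checked without touching the network. The following is a minimal Python 3 sketch (the function name `record_timeout` is hypothetical): each thread creates its own socket and records the timeout that socket was given.

```python
import socket
import threading

previous = socket.getdefaulttimeout()
socket.setdefaulttimeout(1.0)  # set once, in the main thread

seen = []
seen_lock = threading.Lock()


def record_timeout():
    # A socket created inside a worker thread still picks up the
    # module-level default set by the main thread.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        with seen_lock:
            seen.append(sock.gettimeout())
    finally:
        sock.close()


threads = [threading.Thread(target=record_timeout) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

socket.setdefaulttimeout(previous)  # put the old default back
print(seen)
```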
Re: language-x-isms
[Bryan]
> for example, i've noticed several java developers i know write python code like this:
>
> foo_list = [...]
> for i in range(len(foo_list)):
>     print '%d %s' % (i, foo_list[i])

[Fredrik Lundh]
> which is a perfectly valid way of doing things if you're targeting older Python platforms as well (including Jython).

[astyonax]
> But it's not the pythonic way.

[Terry Reedy]
> I don't think you understood what Fredrik said. It was the Python way before enumerate() builtin was added and remains the Python way if you wish to write for older versions of Python and Jython.

On jython 2.1, I use something like this

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
try:
    enumerate
except NameError:
    def enumerate(iterable):
        results = [] ; ix = 0
        for item in iterable:
            results.append( (ix, item) )
            ix = ix+1
        return results

if __name__ == "__main__":
    my_list = [0, 1, 1, 2, 3, 5, 8, ]
    for ix, fibo in enumerate(my_list):
        print "Position %d: %d" % (ix, fibo)
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which runs like:-

C:\alank>python -V
Python 2.4.3

C:\alank>python fibo.py
Position 0: 0
Position 1: 1
Position 2: 1
Position 3: 2
Position 4: 3
Position 5: 5
Position 6: 8

C:\alank>jython --version
Jython 2.1 on java (JIT: null)

C:\alank>jython fibo.py
Position 0: 0
Position 1: 1
Position 2: 1
Position 3: 2
Position 4: 3
Position 5: 5
Position 6: 8

Of course, the efficiency is different across cpython vs. jython, but it's nice to have the same pythonic code running across both. And when jython progresses beyond 2.1, (any day now!), it will still work seamlessly.

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
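For comparison, here is a sketch of the same "use the built-in if it exists, otherwise fall back" pattern written for a current Python 3 interpreter. The names `fallback_enumerate` and `enumerate_` are made up for this example. It uses a generator, which would not run on Jython 2.1, so it is not a drop-in replacement for the list-building version above.

```python
def fallback_enumerate(iterable, start=0):
    """Yield (index, item) pairs, mimicking the built-in enumerate."""
    index = start
    for item in iterable:
        yield index, item
        index += 1


# Feature detection: referencing a missing built-in raises NameError,
# so the fallback is only chosen when there is nothing to replace.
try:
    enumerate_ = enumerate
except NameError:
    enumerate_ = fallback_enumerate

for ix, fibo in enumerate_([0, 1, 1, 2, 3, 5, 8]):
    print("Position %d: %d" % (ix, fibo))
```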
Re: XML, JSON, or what?
[Frank Millman]
> I am writing a multi-user accounting/business application, which uses sockets to communicate between server and client. The server contains all the business logic. It has no direct knowledge of the client. I have devised a simple message format to exchange information between the two.
>
> At first, I used XML as a message format. Then I read the article that recommended not using XML for Python-to-Python, so I changed it to a pickled collection of Python objects. It works just as well.

If you were just communicating python to python, I'd recommend Pyro, since it has all the socket management, etc, already taken care of.

http://pyro.sourceforge.net

> At present the client uses wxPython. One of my medium-term goals is to write a web-based client. I don't think it will be easy to reproduce all the functionality that I have at present, but I hope to get close. I have not done any serious research yet, but I am pretty sure I will use javascript on the client, to make it as universal as possible.

If you're going to mix javascript client and python server, you definitely need something cross platform, like XML or JSON.

> Ideally, the server should be able to handle a wxPython client or a web client, without even knowing which one it is talking to. Obviously I cannot use Pickle for this. So my question is, what is the ideal format to use? I could go back to XML, or I could switch to JSON - I have read a bit about it and it does not seem complicated.

JSON is indeed (mostly) as simple as it looks: it fits your need very well. And you should be up and running with it very quickly.

> My messages are not very large (maximum about 2k so far for a complex screen layout) so I don't think performance will be an issue.

And parsing JSON will almost certainly be faster than parsing XML. You should easily be able to parse hundreds or maybe thousands of 2K JSON messages a second. And JSON generation and parsing is at least as well supported and robust as XML, in most languages.

> I would rather make a decision now, otherwise I will have a lot of changes to make later on. Does anyone have any recommendations?

I'd go with JSON, for simplicity and portability. If you have any specific questions about it, ask.

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
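As a rough illustration of how little code a JSON message format needs, here is a Python 3 sketch using the standard library `json` module (which postdates this thread). The message fields are invented for the example and are not taken from the poster's application.

```python
import json

# Hypothetical screen-layout message; every field name is an assumption.
message = {
    "type": "screen_layout",
    "screen": "customer_edit",
    "fields": [
        {"name": "code", "label": "Customer code", "width": 10},
        {"name": "balance", "label": "Balance", "width": 12},
    ],
}

# Server side: serialise to bytes suitable for sending over a socket.
wire_bytes = json.dumps(message).encode("utf-8")

# Client side (python or javascript): parse the bytes back into data.
decoded = json.loads(wire_bytes.decode("utf-8"))

print(decoded["fields"][0]["label"])
```

Only plain data types (dicts, lists, strings, numbers, booleans, None) survive the round trip, which is what makes the same message readable from a javascript client.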
Re: language-x-isms
[Alan Kennedy]
> On jython 2.1, I use something like this
>
> #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> try:
>     enumerate
> except NameError:
>     def enumerate(iterable):
>         results = [] ; ix = 0
>         for item in iterable:
>             results.append( (ix, item) )
>             ix = ix+1
>         return results

[Fredrik Lundh]
> at least in CPython, using a user-defined enumerate function is a bit slower than using the built-in version.

Who's using a user-defined enumerate on cpython?

The above code only defines a user-defined enumerate when the built-in one doesn't exist, i.e. on jython, which raises NameError on reference to the non-existent enumerate.

On cpython, the reference to enumerate doesn't generate a NameError, therefore the user-supplied version never gets defined, and the builtin is used.

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: language-x-isms
[Alan Kennedy]
> On jython 2.1, I use something like this
>
> #-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> try:
>     enumerate
> except NameError:
>     def enumerate(iterable):
>         results = [] ; ix = 0
>         for item in iterable:
>             results.append( (ix, item) )
>             ix = ix+1
>         return results

[Fredrik Lundh]
> at least in CPython, using a user-defined enumerate function is a bit slower than using the built-in version.

[Alan Kennedy]
> Who's using a user-defined enumerate on cpython?

[Fredrik Lundh]
> anyone targeting older Python platforms.

You mean python platforms where there is no built-in enumerate?

Your comment that "using a user-defined enumerate [on cpython] is slower than using the built-in version" makes no sense in relation to the code I posted, which only defines a user-defined enumerate *if there is no builtin enumerate*.

[Alan Kennedy]
> On cpython, the reference to enumerate doesn't generate a NameError,

[Fredrik Lundh]
> python
> Python 2.2.3 (#42, May 30 2003, 18:12:08)
> >>> enumerate
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'enumerate' is not defined

Right: there is no built-in enumerate on cpython 2.2. So how could a user-defined one be slower than it?

Of course, my cpython comment referred to recent cpythons, i.e. 2.3, 2.4, 2.5.

The code I supplied works on all versions of python, and *never* replaces the built-in enumerate, if there is one.

So what's the problem, exactly?

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: language-x-isms
[Steve Holden]
> You are assuming a relatively recent release of CPython. If you look at the stuff that the effbot distributes you will see that most of it supports CPython all the way back to 1.5.2.

Oh for cripes sake.

The code I posted

1. works on all versions of cpython
2. works on all versions of jython
3. works on all versions of ironpython
4. never replaces the builtin enumerate

> I don't think many of us have the right to be telling Fredrik what's pythonic and what's not ...

Who ever said anything like that? I never said anything about pythonicity.

Read the thread again: All I said was "Here's what I use on jython 2.1".

Fred picked a non-existent hole in my code by saying "using a user-defined enumerate is slower than using the builtin", implying that the code I posted replaced the builtin, which it never does.

While the comment "using a user-defined enumerate is slower than using the builtin" may be true, it has no bearing on the code I posted, which is all I'm trying to say ...

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: language-x-isms
[Alan Kennedy]
> Your comment that "using a user-defined enumerate [on cpython] is slower than using the built-in version" makes no sense in relation to the code I posted

Fredrik Lundh wrote:
> try combining with the second sentence in my post.

OK, so putting

"at least in CPython, using a user-defined enumerate function is a bit slower than using the built-in version"

together with

"in fact, if performance is important, the following can sometimes be the most efficient way to loop over things

    ix = 0
    for fibo in my_list:
        # do something with ix and my_list[ix]
        ix += 1
"

We still don't get anything that sheds light on how the code I posted is deficient.

Why can't you just say "I made a mistake, I thought your code replaced the builtin enumerate, but it doesn't"?

I'm sorry I posted it now, what a rigmarole.

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: XML, JSON, or what?
[Ant]
> I'd favour JSON if the data structures are simple personally. XML is comparatively speaking a pain to deal with, where with JSON you can simply eval() the data and you have a Python dictionary at your disposal.

[Steve]
> Modulo any security problems that alert and malicious users are able to inject into your application. Simply using eval() uncritically on whatever comes down the pipe is a train wreck waiting to happen.

Yes, evaling JSON, or any other text coming from the web, is definitely a bad idea.

But there's no need for eval: there are safe JSON codecs for python,

http://cheeseshop.python.org/pypi?%3Aaction=searchdescription=json

And one for javascript,

http://www.json.org/js.html
http://www.json.org/json.js

And most other languages you're likely to come across.

http://www.json.org/

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
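To make the difference concrete: a JSON codec only ever builds data, so text that is really program code is rejected instead of executed. A small Python 3 sketch using the standard library `json` module (one such safe codec, added to python after this thread); the helper name `parse_untrusted` is made up for the example.

```python
import json


def parse_untrusted(text):
    """Parse JSON text from an untrusted source; return None if invalid."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# Well-formed JSON becomes plain python data.
safe = parse_untrusted('{"amount": 10, "items": ["a", "b"]}')

# Something eval() would happily execute is simply refused.
hostile = parse_untrusted('__import__("os").getcwd()')

print(safe, hostile)
```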
Re: Proposal for new operators to python that add syntactic sugar for hierarcical data.
[glomde]
> I would like to extend python so that you could create hierarchical tree structures (XML, HTML etc) easier and that resulting code would be more readable than how you write today with packages like elementtree and xist. Any comments?

Yes: it's ugly and unnecessary.

Why would you want to change the language syntax just to make it easier to express markup directly in program code? Javascript just made that mistake with E4X: IMHO, it makes for some of the ugliest code. E4X reminds me of that Microsoft b*stardisation, XML Data Islands.

http://www.w3schools.com/e4x/e4x_howto.asp
http://en.wikipedia.org/wiki/E4X

For a nice pythonic solution to representing markup directly in python code, you should check out stan.

http://divmod.org/users/exarkun/nevow-api/public/nevow.stan-module.html

Here's a nice stan example, taken from Kieran Holland's tutorial

http://www.kieranholland.com/code/documentation/nevow-stan/

aDocument = tags.html[
    tags.head[
        tags.title["Hello, world!"]
    ],
    tags.body[
        tags.h1[ "This is a complete XHTML document modeled in Stan." ],
        tags.p[ "This text is inside a paragraph tag." ],
        tags.div(style="color: blue; width: 200px; background-color: yellow;")[
            "And this is a coloured div."
        ]
    ]
]

That looks nice and simple, and no need to destroy the elegance of python to do it.

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: generators shared among threads
[EMAIL PROTECTED]
> def f():
>     i = 0
>     while True:
>         yield i
>         i += 1
> g=f()
>
> If I pass g around to various threads and I want them to always be yielded a unique value, will I have a race condition?

Yes.

Generators can be shared between threads, but they cannot be resumed from two threads at the same time.

You should wrap it in a lock to ensure that only one thread at a time can resume the generator.

Read this thread from a couple of years back about the same topic.

Suggested generator to add to threading module.
http://groups.google.com/group/comp.lang.python/browse_frm/thread/76aa2afa913fe4df/a2ede21f7dd78f34#a2ede21f7dd78f34

Also contained in that thread is an implementation of Queue.Queue which supplies values from a generator, and which does not require a separate thread to generate values.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
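One way the "wrap it in a lock" advice might look in Python 3. This is a minimal sketch, not the implementation from the linked thread; the class name `LockedIterator` and the worker function are invented for illustration.

```python
import threading


class LockedIterator:
    """Wrap any iterator so only one thread at a time can advance it."""

    def __init__(self, iterable):
        self._lock = threading.Lock()
        self._iterator = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        # The lock serialises resumption of the underlying generator.
        with self._lock:
            return next(self._iterator)


def counter():
    i = 0
    while True:
        yield i
        i += 1


shared = LockedIterator(counter())
results = []


def worker(how_many):
    for _ in range(how_many):
        results.append(next(shared))


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), len(set(results)))
```

Every value is handed out exactly once, although the order in which threads receive them is not predictable.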
Re: PyXML SAX Q?
[EMAIL PROTECTED]
> is it possible to use SAX to parse XML that is not in a file but in a large string? If I open my XML file and read the content into a string variable. Is there a way I can pass it to the PyXML Sax handler? The reason I want to know is that I need to parse XML that is generated by a process on a Unix system. I can connect to the process via a socket and get the XML but I need to store the XML somewhere before I can parse it. I could dump it to a file first and then hand it to the parser but that seems inefficient.
>
> Maybe I could read the XML from the socket directly into the parser.

You can find exactly what you need in this old thread about incremental XML parsing.

Parsing XML streams
http://groups.google.com/group/comp.lang.python/msg/e97309244914343b?

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
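Both variants of the question (parse a string, or feed the parser as data arrives) can be sketched with the standard library `xml.sax` package in Python 3. This is an illustration, not the code from the linked thread: the handler class, the sample document and the 16-byte chunk size are all made up, and the chunk loop stands in for repeated `recv()` calls on a socket.

```python
import xml.sax


class ElementNames(xml.sax.ContentHandler):
    """Collect the name of every element as it is opened."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


xml_bytes = b"<report><item>one</item><item>two</item></report>"

# 1. The whole document is already in memory: parse it directly.
whole = ElementNames()
xml.sax.parseString(xml_bytes, whole)

# 2. The document arrives in pieces: feed each chunk to the parser as it
#    arrives, then call close() once the stream has ended.
incremental = ElementNames()
parser = xml.sax.make_parser()
parser.setContentHandler(incremental)
for start in range(0, len(xml_bytes), 16):
    parser.feed(xml_bytes[start:start + 16])
parser.close()

print(whole.names, incremental.names)
```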
Re: Clarity: GIL, processes and CPUs etc
[EMAIL PROTECTED]
> I understand that embedding the interpreter in a C/C++ application limits it to one CPU. If the application is multi-threaded (system threads) it will not use additional CPUs as the interpreter is tied to one CPU courtesy of the GIL. True or False?

True, but only while a thread is executing pure python code.

C-level extensions and library modules, e.g. the I/O modules, can release the GIL around blocking or long-running calls, thus permitting other threads in the same process to run simultaneously. But only one thread can be running *inside the python interpreter* at a time.

> I understand that forking or running multiple process instances of the above application would make use of multiple CPUs. This is because each process would have its own interpreter and GIL that is independent of any other process. True or False?

True. Every separate process will have its own python interpreter, meaning it has its own GIL. Python code running in multiple processes can execute truly simultaneously.

So you can run pure python code simultaneously on multiple cpus on a multi-cpu box by using multiple independent processes. But if your processes need to communicate, then you need to de/serialise objects/parameters for transmission between those processes.

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
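A hedged sketch of the multi-process approach in present-day Python 3, using `concurrent.futures.ProcessPoolExecutor` from the standard library (which did not exist when this was written). The function names are hypothetical. The block only defines the pieces; nothing is started at import time.

```python
import pickle
from concurrent.futures import ProcessPoolExecutor


def cpu_bound(n):
    """Pure-python work: holds the GIL of whichever process runs it."""
    return sum(i * i for i in range(n))


def run_in_processes(values, workers=2):
    """Run cpu_bound over values in separate interpreter processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Arguments and results are pickled to cross the process boundary.
        return list(pool.map(cpu_bound, values))


# The de/serialisation step mentioned above, shown explicitly.
payload = pickle.dumps({"n": 1000})
restored = pickle.loads(payload)
```

To try it, call `run_in_processes([10000, 20000, 30000])` from under an `if __name__ == "__main__":` guard in a script, which the process-spawning start methods require.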
Re: how do you pronounce 'tuple'?
[Terry Hancock]
> So what's a 1-element tuple, anyway? A mople? monople? It does seem like this lopsided pythonic creature (1,) ought to have a name to reflect its ugly, newbie-unfriendly nature.

It's a trip-you-uple, which you can pronounce any way you like ;-)

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: Jython inherit from Java class
[Mark Fink]
> I wrote a Jython class that inherits from a Java class and (that's the plan) overrides one method. Everything should stay the same.
>
> If I run this nothing happens whereas if I run the Java class it says:
> usage: java fit.FitServer [-v] host port socketTicket
>     -v verbose
>
> I think this is because I do not understand the jython mechanism for inheritance (yet).

1. Are you running jythonc? If yes, I think your class and file should have the same name, i.e. Class FitServer should be in a file called FitServer.py. I recommend calling your class something different from the base class, e.g. MyJythonFitServer, to prevent namespace clashes.

2. Is your main function in jython? If yes, please post the code so we can see how you're instantiating your objects.

3. How are you running this? I.e. show us a command line session which uses your class.

> JyFitServer.py:
> ===
> class FitServer(fit.FitServer):
>     # call constructor of superclass
>     def __init__(self, host, port, verbose):
>         FitServer.__init__(self, host, port, verbose)
          ^
Shouldn't this be:

    fit.FitServer.__init__(self, host, port, verbose)

I'm not sure the latter is the cause of your problems, but it might be.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
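Why the unqualified `FitServer.__init__` call is suspect can be shown in plain Python 3, with no Java involved. In this sketch a stand-in namespace called `fit` plays the role of the Java package, and the base class is invented for the demo. It shows the name-shadowing problem only, and makes no claim that this was the cause of the symptom reported above.

```python
import types


class _BaseFitServer:
    """Stand-in for the Java class fit.FitServer (assumption for the demo)."""

    def __init__(self, host, port, verbose):
        self.host, self.port, self.verbose = host, port, verbose


fit = types.SimpleNamespace(FitServer=_BaseFitServer)


class FitServer(fit.FitServer):
    def __init__(self, host, port, verbose):
        # The bare name FitServer now refers to *this* subclass, so the
        # call below re-enters this same method instead of the base class.
        FitServer.__init__(self, host, port, verbose)


class MyJythonFitServer(fit.FitServer):
    def __init__(self, host, port, verbose):
        # Qualifying the name reaches the real superclass constructor.
        fit.FitServer.__init__(self, host, port, verbose)


try:
    FitServer("localhost", 8085, False)
    recursed = False
except RecursionError:
    recursed = True

server = MyJythonFitServer("localhost", 8085, False)
print(recursed, server.port)
```

Giving the subclass a different name, as recommended in point 1, avoids the shadowing altogether.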
Re: HTMLDocument and Xpath
[EMAIL PROTECTED]
> Hi, I want to use xpath to scrape info from a website using pyXML but I keep getting no results. For example, in the following, I want to return the text "Element1". I can't get xpath to return anything at all. What's wrong with this code?

Your xpath expression is wrong.

> test = Evaluate('td', doc_node.documentElement)

Try one of the following alternatives, all of which should work.

test = Evaluate('//td', doc_node.documentElement)
test = Evaluate('/html/body/table/tr/td', doc_node.documentElement)
test = Evaluate('/html/body/table/tr/td[1]', doc_node.documentElement)

HTH,

Alan.
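The reason the bare 'td' expression finds nothing is that a relative path only looks at direct children of the context node, which here is the html element. The same effect can be reproduced with the standard library ElementTree in Python 3, used in this sketch instead of PyXML. The sample document is an assumption, since the poster's HTML is not shown, and ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page; the poster's actual HTML is not in the message.
page = (
    "<html><body><table><tr>"
    "<td>Element1</td><td>Element2</td>"
    "</tr></table></body></html>"
)
root = ET.fromstring(page)  # root is the <html> element

# Relative step: only direct children of <html> are considered.
direct_children = root.findall("td")

# Descendant search, full path, and full path with a position predicate.
anywhere = root.findall(".//td")
by_path = root.findall("./body/table/tr/td")
first_only = root.findall("./body/table/tr/td[1]")

print(len(direct_children), [td.text for td in anywhere], first_only[0].text)
```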
Re: Python module for LX200 telescope command set
[RayS]
> I've begun a Python module to provide a complete interface to the Meade LX200 command set:

and

> History: after searching for such a thing, I had found only:
> http://72.14.203.104/search?q=cache:hOgO7H_MUeYJ:phl3pc02.phytch.dur.ac.uk/~jrl/source/source_code.html+lx200.pyhl=engl=usct=clnkcd=1
> from University of Durham, but the code is not available...
> and
> http://cvs.sourceforge.net/viewcvs.py/projgalileo/lx200/lx200.py?rev=1.4
> from the Galileo project, but which only has a subset and is aimed at their needs, mostly. Some of the Galileo code looks useful, and so I might want to make use of at least some of the methodology, even if only to get their interest as well.

Do you know about ASCOM?

"The primary goal of the ASCOM Initiative is to provide software driver technology that will help bring about a rebirth of science in amateur astronomy by making instruments scriptable via standard low-level scripting interfaces."

http://ascom-standards.org/faq.html

IIUC, ASCOM is a set of Windows COM objects which provides a standardised API for controlling telescopes. Since it uses Windows COM, you should be able to control it easily from python using the excellent win32 extensions.

Python for Windows Extensions
http://starship.python.net/crew/mhammond/

From this page

http://ascom-standards.org/nr-11Jun01.html

comes the statement

"Telescopes currently supported are the Meade LX200 and Autostar, Celestron NexStar, Vixen SkySensor 2000, Software Bisque Paramount (via TheSky), a number of mounts that support subsets of the LX200 serial protocol (Astro-Physics, Bartels, etc.), and many research-grade telescopes that use the Astronomy Control Language (ACL). Products that support telescope control via ASCOM are Starry Night by SPACE.com, MaxIm DL V3 by Cyanogen Productions, and ACP2 by DC3 Dreams."

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: jython base64.urlsafe_b64xxx
py wrote:
> anyone know how to perform the equivalent of the base64.urlsafe_b64encode and base64.urlsafe_b64decode functions that Python has but in jython? Jython comes with a base64 module but it does not have the urlsafe functions. Tried copying the python base64.py to replace the Jython one, and although it did perform the encode/decode it didn't seem to be correctly decoded.

You're probably better off using a java library for the task. There are plenty to choose from, but most are embedded as utility classes in bigger packages. Here's a public domain one for example

http://dev.i2p.net/cgi-bin/cvsweb.cgi/i2p/core/java/src/net/i2p/data/Base64.java?f=H

With javadoc at

http://dev.i2p.net/javadoc/net/i2p/data/Base64.html

Seems to do what you want.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
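For what it's worth, the url-safe variant differs from ordinary base64 only in two characters of the alphabet ('-' for '+' and '_' for '/'), so it can be layered on top of the plain encode/decode functions. The following Python 3 sketch shows the idea and checks it against the standard library. It is not Jython 2.1 code (a Jython port would need the older `string.maketrans`), and the function names are chosen here for illustration.

```python
import base64

# Translation tables between the standard and url-safe alphabets.
_TO_URLSAFE = bytes.maketrans(b"+/", b"-_")
_FROM_URLSAFE = bytes.maketrans(b"-_", b"+/")


def my_urlsafe_b64encode(data):
    """Encode bytes with base64, then swap in the url-safe characters."""
    return base64.b64encode(data).translate(_TO_URLSAFE)


def my_urlsafe_b64decode(text):
    """Swap the url-safe characters back, then decode as plain base64."""
    return base64.b64decode(text.translate(_FROM_URLSAFE))


sample = bytes(range(256))  # exercises every byte value
encoded = my_urlsafe_b64encode(sample)
print(encoded[:24])
```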
Re: XML vs. cPickle
[Mike]
> I know XML is more (processor) costly than cPickle, but how bad is it?

Are you sure you know that?

I'd guess that XML serialisation with cElementTree is both cpu and memory competitive with cPickle, if not superior. Although I'm too lazy to fire up the timeit module right now :-)

Also, how quickly the relevant parsers work depends on the input, i.e. your data structures. Only you can take measurements with your data structures.

> The idea is I want to store data that can be described as XML

can != should

> into my database as cPickle objects. Except my web framework has no support for BLOB datatype yet, and I might have to go with XML.

Or you could encode the binary pickle in a text-safe encoding such as base64, and store the result in a text column. Although that will obviously increase your processing time, both going in and out of the database.

> Ideas are appreciated,

I'd write a few simple prototypes and take some empirical measurements.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
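The base64 suggestion amounts to two small helper functions. A Python 3 sketch follows; the helper names and the sample record are invented, and the database itself is left out. Note that unpickling is only safe for data your own application wrote.

```python
import base64
import pickle


def to_text_column(obj):
    """Pickle an object and wrap it in base64 so it fits a text column."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return base64.b64encode(raw).decode("ascii")


def from_text_column(text):
    """Reverse of to_text_column; use only on trusted, self-written data."""
    return pickle.loads(base64.b64decode(text))


record = {"name": "widget", "sizes": [1, 2, 3], "price": 9.5}
stored = to_text_column(record)
print(len(stored), from_text_column(stored) == record)
```

Base64 output is roughly a third larger than the binary pickle, which is part of the extra cost mentioned above.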
Re: Spelling mistakes!
[EMAIL PROTECTED]
> Aside from the other responses (unittests, pychecker/pylint), you might also consider using __slots__ for new-style classes:

I've been shouted at for suggesting exactly that! :-)

http://groups.google.com/group/comp.lang.python/msg/fa453d925b912917

how-come-aahz-didn't-shout-at-you-ly'yrs,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: how-to POST form data to ASP pages?
[livin]
> I'm not a coder really at all (I dabble with vbscript & jscript) but am asking for help to get this working.
>
> I have tried this...
>
> params = urllib.urlencode({'action': 'hs.ExecX10ByName Kitchen Espresso Machine, On, 100'})
> urllib.urlopen("http://192.168.1.11:80/hact/kitchen.asp", params)

You should try to phrase your question so that it is easier for us to understand what is going wrong, and thus help you to correct it.

As Mike already suggested, you have a string that may be spread over two lines, which would be illegal python syntax, and which would give a SyntaxError if run. You should be sure that this is not the cause of your problem before going further.

The following code should do the same as the above, but not suffer from the line breaks problem.

import urllib

name_value_pairs = {
    'action': 'hs.ExecX10ByName Kitchen Espresso Machine, On, 100'
}
params = urllib.urlencode(name_value_pairs)
urllib.urlopen("http://192.168.1.11:80/hact/kitchen.asp", params)

BTW, it looks to me like you may be opening up a security hole in your application. The following string looks very like a VB function invocation:

'hs.ExecX10ByName Kitchen Espresso Machine, On, 100'

Are you executing the contents of form input fields as program code? That's highly inadvisable from a security point of view.

Happy New Year.

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
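In Python 3 the same pieces live in `urllib.parse` and `urllib.request`. The sketch below builds the POST request without sending it, so it can be inspected safely; the URL and form value are the ones from the message, everything else is standard library.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

name_value_pairs = {
    "action": "hs.ExecX10ByName Kitchen Espresso Machine, On, 100",
}

# urlencode escapes spaces and commas so the value survives the trip.
params = urlencode(name_value_pairs)

# Supplying a data argument is what turns the request into a POST.
request = Request(
    "http://192.168.1.11:80/hact/kitchen.asp",
    data=params.encode("ascii"),
)

print(request.get_method(), params)
# urllib.request.urlopen(request) would actually send it.
```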
Re: how-to POST form data to ASP pages?
[livin]
> I have tried the code you suggested and ..
> .. Either way I get this error log...
>
>   File "Q:\python\python23.zlib\urllib.py", line 78, in urlopen
>   File "Q:\python\python23.zlib\urllib.py", line 183, in open
>   File "Q:\python\python23.zlib\urllib.py", line 297, in open_http
>   File "Q:\python\python23.zlib\httplib.py", line 712, in endheaders
>   File "Q:\python\python23.zlib\httplib.py", line 597, in _send_output
>   File "Q:\python\python23.zlib\httplib.py", line 576, in send
>   File "<string>", line 1, in sendall
> IOError : [Errno socket error] (10057, 'Socket is not connected')

OK, now we're getting somewhere.

As you can probably guess from the error message, the socket through which urllib is making the request is not connected to the server. We have to figure out why.

That library path is unusual: Q:\python\python23.zlib\httplib.py

Python supports reading library modules from a zip file, but the standard installations generally don't use it, except for Python CE, i.e. Python for Microsoft Windows PocketPC/CE/WTF. Is this the platform that you're using? If I remember rightly, Python for Pocket Windows doesn't support sockets, meaning that urllib wouldn't work on that platform.

Another thing to establish is whether the URL is working correctly, from a client you know works independently from your script above, e.g. an ordinary browser. When you submit to your form handling script from an ordinary browser, does it work?

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: Newbie needs help extracting data from XML
[Rodney]
> I'm a Python newbie and am trying to get the data out of a series of XML files.

As Paul McGuire already noted, it's unusual to extract information from a SOAP message this way: it is more usual to use a SOAP toolkit to do the job for you.

But, assuming that you know what you're doing, and that you're doing it for good reasons, here's a snippet that uses xpath to do what you want.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = """\
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:tns="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
    xmlns:types="http://www.ExchangeNetwork.net/schema/v1.0/node.wsdl"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Header>
    <wsu:Timestamp xmlns:wsu="http://schemas.xmlsoap.org/ws/2002/07/utility">
      <wsu:Created>2005-12-28T05:59:38Z</wsu:Created>
      <wsu:Expires>2005-12-28T06:04:38Z</wsu:Expires>
    </wsu:Timestamp>
  </soap:Header>
  <soap:Body soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <q1:NodePingResponse xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd">
      <return xsi:type="xsd:string">Ready</return>
    </q1:NodePingResponse>
  </soap:Body>
</soap:Envelope>
"""

import xml.dom.minidom
import xml.xpath

#dom_tree = xml.dom.minidom.parse('my_xml_file.xml')
dom_tree = xml.dom.minidom.parseString(document)
return_node = xml.xpath.Evaluate('//return', dom_tree)[0]
print "Return status is: '%s'" % return_node.childNodes[0].nodeValue
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

You have to install PyXML to get xpath support: http://pyxml.sf.net

There are other ways to do it, e.g. using ElementTree, but I'll leave it to others to suggest the best way to do that.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
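One possible ElementTree version, as a Python 3 sketch using the standard library `xml.etree.ElementTree`. It is a swapped-in technique, not the PyXML approach above, and it uses a trimmed copy of the same SOAP message. The `return` element carries no namespace, so it can be found by its plain name, while namespaced elements need the `{uri}name` form.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SOAP response, keeping only what the lookup needs.
document = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <q1:NodePingResponse
        xmlns:q1="http://www.ExchangeNetwork.net/schema/v1.0/node.xsd">
      <return xsi:type="xsd:string">Ready</return>
    </q1:NodePingResponse>
  </soap:Body>
</soap:Envelope>
"""

root = ET.fromstring(document)

# Un-namespaced element: search all descendants by plain tag name.
return_node = root.find(".//return")
status = return_node.text

# Namespaced element: the tag is written as {namespace-uri}localname.
ping = root.find(
    ".//{http://www.ExchangeNetwork.net/schema/v1.0/node.xsd}NodePingResponse"
)

print("Return status is: '%s'" % status)
```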
Re: Help designing reading/writing a xml-fileformat
[Jacob Kroon]
> I'm writing a block-diagram editor, and could use some tips about writing/reading diagrams to/from an xml file format.

I highly recommend reading David Mertz's excellent articles on the conversion of objects to xml and vice-versa.

On the 'Pythonic' treatment of XML documents as objects, I + II
http://www-128.ibm.com/developerworks/library/xml-matters1/index.html
http://www-128.ibm.com/developerworks/library/xml-matters2/index.html

Revisiting xml_pickle and xml_objectify
http://www-128.ibm.com/developerworks/xml/library/x-matters11.html

> Anyone have a good idea on how to approach this problem ?
> (I do not want to use the pickle module)

Why not the pickle module? XML-format pickles are a good solution to your problem, IMHO.

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Paul Boddie]
> It's interesting that minidom plus PrettyPrint seems to generate the xmlns attributes in the serialisation, though; should that be reported as a bug?

I believe that it is a bug.

[Paul Boddie]
> Well, with the automagic, all DOM users get the once in a lifetime chance to exchange those lead boots for concrete ones. I'm sure there are all sorts of interesting reasons for assigning namespaces to nodes, serialising the document, and then not getting all the document information back when parsing it, but I'd rather be spared all the amusement behind all those reasons and just have life made easier for just about everyone concerned.

Well, if you have a fair amount of spare time and really want to improve things, I recommend that you consider implementing the DOM L3 namespace normalisation algorithm.

http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html

That way, everyone can have namespace well-formed documents by simply calling a single method, and not a line of automagic in sight: just standards-compliant XML processing.

> Anyway, thank you for your helpful commentary on this matter!

And thanks to you for actually informing yourself on the issue, and for taking the time to research and understand it. I wish that your refreshing attitude was more widespread!

now-i-really-must-get-back-to-work-ly'yrs,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: Pythonic XML library with XPath support for Jython?
[James]
> A couple of years ago there wasn't one and the recommendation was to simply use Java libs. Have things changed since?

AFAIK, things haven't changed.

Things you might be interested to know

1. There is a module in PyXML, called javadom, that layers python semantics on top of various Java DOM implementations.

2. I submitted a patch that extends that support to JAXP, although the patch has not yet been folded into the main jython repo. I think I really need to submit the patch to PyXML.

   http://sourceforge.net/tracker/index.php?func=detail&aid=876821&group_id=12867&atid=312867

3. It should not be too complex to develop a binding for the excellent Jaxen universal xpath engine, which could provide PyXML compatible xpath support under jython.

   http://www.jaxen.org

> I see ElementTree promises one in the future but are there any out now?

Not yet, although I could be wrong.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Paul Boddie]
> However, wouldn't the correct serialisation of the document be as follows?
>
> <?xml version="1.0"?>
> <href xmlns="DAV:"><no_ns xmlns=""/></href>

Yes, the correct way to override a default namespace is an xmlns="" attribute.

[Paul Boddie]
> As for the first issue - the presence of the xmlns attribute in the serialised document - I'd be interested to hear whether it is considered acceptable to parse the serialised document and to find that no non-null namespaceURI is set on the href element, given that such a namespaceURI was set when the document was created.

The key issue: should the serialised-then-reparsed document have the same DOM content (XML InfoSet) if the user did not explicitly create the requisite namespace declaration attributes?

My answer: No, it should not be the same.

My reasoning:

The user did not explicitly create the attributes
 => The DOM should not automagically create them (according to the L2 spec)
 => such attributes should not be serialised
    - The user didn't create them
    - The DOM implementation didn't create them
    - If the serialisation processor creates them, that gives the same end result as if the DOM impl had (wrongly) created them.
 => the serialisation is a faithful/naive representation of the (not-namespace-well-formed) DOM constructed by the user (who omitted required attributes).
 => The reloaded document is a different DOM to the original, i.e. it has a different infoset.

The xerces and jython snippet I posted the other day demonstrates this. If you look closely at that code, the actual DOM implementation and the serialisation processor used are from different libraries. The DOM is the inbuilt JAXP DOM implementation, Apache Crimson (the example only works on JDK 1.4). The serialisation processor is the Apache Xerces serialiser. The fact that the xmlns="DAV:" attribute didn't appear in the output document shows that BOTH the (Crimson) DOM implementation AND the (Xerces) serialiser chose NOT to automagically create the attribute.

If you run that snippet with other DOM implementations, by setting the "javax.xml.parsers.DocumentBuilderFactory" property, you'll find the same result.

Serialisation and namespace normalisation are both in the realm of DOM Level 3, whereas minidom is only L2 compliant. Automagically introducing L3 semantics into the L2 implementation is the wrong thing to do.

http://www.w3.org/TR/DOM-Level-3-LS/load-save.html
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/namespaces-algorithms.html

[Paul Boddie]
> In other words, ...
>
> What should the "Namespace is" message produce?

Namespace is None

If you want it to produce,

Namespace is 'DAV:'

and for your code to be portable to other DOM implementations besides libxml2dom, then your code should look like:-

document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "href")
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, "no_ns")
elem2.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "")
document.xpath("*")[0].appendChild(elem2)
document.toFile(open("test_ns.xml", "wb"))

its-not-about-namespaces-its-about-automagic-ly'yrs,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
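The round-trip argument in the post above can be checked with the standard library's minidom. The sketch below is an illustration written for Python 3, not code from the thread, and the helper name roundtrip_namespace is made up for the example. It builds an element in the DAV: namespace, serialises it, reparses it, and reports the namespaceURI the reloaded element ends up with, with and without an explicitly created xmlns attribute.

```python
import xml.dom
import xml.dom.minidom


def roundtrip_namespace(declare):
    """Serialise and reparse a one-element document; return the
    namespaceURI seen on the reloaded document element."""
    doc = xml.dom.minidom.Document()
    elem = doc.createElementNS("DAV:", "href")
    if declare:
        # The application creates the namespace declaration itself.
        elem.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns", "DAV:")
    doc.appendChild(elem)
    reparsed = xml.dom.minidom.parseString(doc.toxml())
    return reparsed.documentElement.namespaceURI


print("Namespace is %r" % roundtrip_namespace(declare=False))
print("Namespace is %r" % roundtrip_namespace(declare=True))
```

Without the explicit declaration the reloaded element has no namespace; with it, the namespace survives the round trip, which matches the position taken in the post.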
Re: Creating referenceable objects from XML
[Michael Williams]
> I need it to somehow convert my XML to intuitively referenceable object. Any ideas? I could even do it myself if I knew the mechanism by which python classes do this (create variables on the fly).

You seem to already have a fair idea what kind of model you need, and to know that there is a simple way for you to create one. I encourage you to progress on this path: it will increase the depth of your understanding.

One mistake I think that some people make about XML is relying on other people's interpretations of the subject, rather than forming their own opinions.

The multitude of document models provided by everyone and his mother all make assumptions about how the components of the model will be accessed, in what order those components will be accessed, how often and when, how memory efficient the model is, etc, etc.

To really understand the trade-offs and strengths of all the different models, it is a good exercise to build your own object model. It's a simple exercise, due to python's highly dynamic nature. Understanding your own model will help you understand what the other models do and do not provide. You can then evaluate other off-the-shelf models for your specific applications: I always find different XML tools suit different situations.

See this post of mine from a couple years back about different ways of building your own document/data models.

http://groups.google.com/group/comp.lang.python/msg/e2a4a1c35395ffec

I think the reference to the ActiveState recipe will be of particular interest, since you could have a running example very quickly indeed.

See also my tutorial post on extracting document content from a SAX stream. I gave the example of a simple stack-based xpath-style expression matcher.

http://groups.google.com/group/comp.lang.python/msg/6853bddbb9326948

Also contained in that thread is an illuminating and productive discussion between the effbot and myself about how wonderfully simple ElementTree makes this, not to mention unbeatably efficient.

this-week-i-ave-been-mostly-using-kid-for-templating-ly'yrs,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
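As a rough idea of how small such a home-grown model can be, here is one possible sketch. It is not the ActiveState recipe referred to above, and the class name XMLObject and its _text and _attrs fields are invented for the example. It uses setattr to create attributes on the fly, which is the mechanism the question asks about, and it assumes Python 3 with xml.etree.ElementTree doing the parsing.

```python
import xml.etree.ElementTree as ET


class XMLObject:
    """Expose the child elements of an XML element as Python attributes.

    Repeated child tags are collected into a list. Tags that clash with
    the helper fields (_text, _attrs) are not handled in this sketch.
    """

    def __init__(self, element):
        self._text = (element.text or "").strip()
        self._attrs = dict(element.attrib)
        for child in element:
            wrapped = XMLObject(child)
            existing = self.__dict__.get(child.tag)
            if existing is None:
                setattr(self, child.tag, wrapped)      # first occurrence
            elif isinstance(existing, list):
                existing.append(wrapped)               # third and later
            else:
                setattr(self, child.tag, [existing, wrapped])  # second

    def __str__(self):
        return self._text


source = "<person id='7'><name>Alan</name><phone>1</phone><phone>2</phone></person>"
person = XMLObject(ET.fromstring(source))
print(person.name)                       # element text via __str__
print([str(p) for p in person.phone])    # repeated tags become a list
print(person._attrs["id"])               # XML attributes kept in a dict
```

The trade-offs mentioned in the post show up straight away: this model loads the whole document into memory, loses document order between different tags, and ignores namespaces, which may or may not matter for a given application.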
Re: XML and namespaces
[Fredrik Lundh]
> It's libxml2 that does all the work, and the libxml2 authors claim that libxml2 implements the DOM level 2 document model, but with a different API.

That statement is meaningless.

The DOM is *only* an API, i.e. an interface. The opening statement on the W3C DOM page is

"""
What is the Document Object Model?

The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents.
"""

http://www.w3.org/DOM/

The interfaces that make up the different levels of the DOM are described in CORBA IDL - Interface Definition Language.

DOM Implementations are free to implement the methods and properties of the IDL interfaces as they see fit. Some implementations might maintain an object model, with separate objects for each node in the tree, several string variables associated with each node, i.e. node name, namespace, etc. But they could just as easily store those data in tables, indexed by some node id. (As an aside, the non-DOM-compatible Xalan Table Model does exactly that: http://xml.apache.org/xalan-j/dtm.html).

So when the libxml2 developers say (copied from http://www.xmlsoft.org/)

"""
To some extent libxml2 provides support for the following additional specifications but doesn't claim to implement them completely:

 * Document Object Model (DOM) http://www.w3.org/TR/DOM-Level-2-Core/ the document model, but it doesn't implement the API itself, gdome2 does this on top of libxml2
"""

They've completely missed the point: DOM is *only* the API.

> Maybe they're wrong, but wasn't the whole point of this subthread that different developers have interpreted the specification in different ways ?

What specification? Libxml2 implements none of the DOM specifications.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Alan Kennedy]
>> Don't confuse libxml2dom with libxml2.

[Paul Boddie]
> Well, quite, but perhaps you can explain what I'm doing wrong with this low-level version of the previously specified code:

Well, if your purpose is to make a point about minidom and DOM standards compliance in relation to serialisation of namespaces, then what you're doing wrong is to use a library that bears no relationship to the DOM to make your point.

Think about it this way: Say you decide to create a new XML document using a non-DOM library, such as the excellent ElementTree.

So you make a series of ElementTree-API-specific calls to create the document, the elements, attributes, namespaces, etc, and then serialise the whole thing. And the end result is that you end up with a document that looks like this

<?xml version="1.0" encoding="utf-8"?>
<href xmlns="DAV:"/>

It is not possible to use that ElementTree code to make inferences on how minidom should behave, because the syntax and semantics of the minidom API calls and the ElementTree API calls are different.

Minidom is constrained to implement the precise semantics of the DOM APIs, because it claims standards compliance.

ElementTree is free to do whatever it likes, e.g. be pythonic, because it has no standard to conform to: it is designed solely according to the experience and intuition of its author, who is free to change it at any stage if he feels like it.

s/ElementTree/libxml2/g

If I've completely missed your point and you were talking about something else entirely, please forgive me. I'd be happy to help with any questions if I can.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Fredrik Lundh]
> my point was that (unless I'm missing something here), there are at least two widely used implementations (libxml2 and the 4DOM domlette stuff) that don't interpret the spec in this way.

Libxml2dom is of alpha quality, according to its CheeseShop page anyway.

http://cheeseshop.python.org/pypi/libxml2dom/0.2.4

This can be seen in its incorrect serialisation of the following valid DOM.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "myns:href")
elem1.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:myns", "DAV:")
document.replaceChild(elem1, top)
print document.toString()
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

<?xml version="1.0"?>
<myns:href xmlns:myns="DAV:" xmlns:xmlns="http://www.w3.org/2000/xmlns/" xmlns:myns="DAV:"/>

Which is not even well-formed XML (duplicate attributes), let alone namespace well-formed. Note also the invalid xml namespace "xmlns:xmlns" attribute.

So I don't accept that libxml2dom's behaviour is definitive in this case.

The other DOM you refer to, the 4DOM stuff, was written by a participant in this discussion.

Will you accept Apache Xerces 2 for Java as a widely used DOM Implementation? I guarantee that it is far more widely used than either of the DOMs mentioned.

Download Xerces 2 (I am using Xerces 2.7.1), and run the following code under jython:-

http://www.apache.org/dist/xml/xerces-j/

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
#
# This is a simple adaptation of the DOMGenerate.java
# sample from the Xerces 2.7.1 distribution.
#
from javax.xml.parsers import DocumentBuilder, DocumentBuilderFactory
from org.apache.xml.serialize import OutputFormat, XMLSerializer
from java.io import StringWriter

def create_document():
    dbf = DocumentBuilderFactory.newInstance()
    db = dbf.newDocumentBuilder()
    return db.newDocument()

def serialise(doc):
    format = OutputFormat( doc )
    outbuf = StringWriter()
    serial = XMLSerializer( outbuf, format )
    serial.asDOMSerializer()
    serial.serialize(doc.getDocumentElement())
    return outbuf.toString()

doc = create_document()
root = doc.createElementNS("DAV:", "href")
doc.appendChild( root )
print serialise(doc)
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Which produces

<?xml version="1.0" encoding="UTF-8"?>
<href/>

As I expected it would.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[EMAIL PROTECTED]
> You're the one who doesn't seem to clearly understand XML namespaces. It's your position that is bewildering, not XML namespaces (well, they are confusing, but I have a good handle on all the nuances by now).

So you keep claiming, but I have yet to see the evidence.

> Again, no skin off my back here: I write and use tools that are XML namespaces compliant. It doesn't hurt me that Minidom is not. I was hoping to help, but again I don't have time for this argument.

If you make statements such as "you're wrong on this", "you misunderstand", "you're guessing", etc, then you should be prepared to back them up, not state them and then say "but I'm too busy and/or important to discuss it with you".

Perhaps you should think twice before making such statements in the future.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Fredrik Lundh]
> but isn't libxml2dom just a binding for libxml2? as I mention above, I had libxml2 in mind when I wrote "widely used", not the libxml2dom binding itself.

No, libxml2dom is Paul Boddie's DOM API compatibility layer on top of the cpython bindings for libxml2. From the CheeseShop page

"""
The libxml2dom package provides a traditional DOM wrapper around the Python bindings for libxml2. In contrast to the libxml2 bindings, libxml2dom provides an API reminiscent of minidom, pxdom and other Python-based and Python-related XML toolkits.
"""

http://cheeseshop.python.org/pypi/libxml2dom

[Alan Kennedy]
>> Will you accept Apache Xerces 2 for Java as a widely used DOM Implementation?

[Fredrik Lundh]
> sure. but libxml2 is also widely used, so we have at least two ways to interpret the spec.

Don't confuse libxml2dom with libxml2.

As I showed with a code snippet in a previous message, libxml2dom has significant defects in relation to serialisation of namespaced documents, whereby the serialised documents it produces aren't even well-formed xml.

Perhaps you can show a code snippet in libxml2 that illustrates the behaviour you describe?

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
confusing "DAV:" as a namespace uri with "DAV" as a namespace prefix.

Code for creating the correct prefix declaration is

prefix_decl = "xmlns:%s" % element_ns_prefix
element.setAttributeNS(xml.dom.XMLNS_NAMESPACE, prefix_decl, element_ns_uri)

> I'd love to hear how many actual minidom users would agree with you.
>
> It's currently a bug. It needs to be fixed. However, I have no time for this bewildering fight. If the consensus is to leave minidom the way it is, I'll just wash my hands of the matter, but I'll be sure to emphasize heavily to users that minidom is broken with respect to Namespaces and serialization, and that they abandon it in favor of third-party tools.

It's not a bug, it doesn't need fixing, minidom is not broken.

Although I am sympathetic to your bewilderment: xml namespaces can be overly complex when it comes to the nitty, gritty details.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[Fredrik Lundh]
> can anyone perhaps dig up a DOM L2 implementation that's not written by anyone involved in this thread, and see what it does ?

[Paul Boddie]
> document = libxml2dom.createDocument(None, "doc", None)
> top = document.xpath("*")[0]
> element = document.createElementNS("DAV:", "href")
> document.replaceChild(element, top)
> print document.toString()
>
> This outputs the following:
>
> <?xml version="1.0"?>
> <href xmlns="DAV:"/>

But that's incorrect. You have now defaulted the namespace to "DAV:" for every unprefixed element that is a descendant of the href element.

Here is an example

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
document = libxml2dom.createDocument(None, "doc", None)
top = document.xpath("*")[0]
elem1 = document.createElementNS("DAV:", "href")
document.replaceChild(elem1, top)
elem2 = document.createElementNS(None, "no_ns")
document.childNodes[0].appendChild(elem2)
print document.toString()
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

which produces

<?xml version="1.0"?>
<href xmlns="DAV:"><no_ns/></href>

The defaulting rules of XML namespaces state

"""
5.2 Namespace Defaulting

A default namespace is considered to apply to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element.
"""

http://www.w3.org/TR/REC-xml-names/#defaulting

So although I have explicitly specified no namespace for the no_ns subelement, it now defaults to the default "DAV:" namespace which has been declared in the automagically created xmlns attribute. This is wrong behaviour.

If I want for my sub-element to truly have no namespace, I have to write it like this

<?xml version="1.0"?>
<myns:href xmlns:myns="DAV:"><no_ns/></myns:href>

[Paul Boddie]
> Leaving such attributes out by default, whilst claiming some kind of "fine print" standards compliance, is really a recipe for unnecessary user frustration.

On the contrary, once you start second guessing the standards and making guesses about what users are really trying to do, and making decisions for them, then some people are going to get different behaviour from what they rightfully expect according to the standard. People whose expectations match with the guesses made on their behalf will find that their software is not portable between DOM implementations.

With something as finicky as XML namespaces, you can't just make ad-hoc decisions as to what the user really wants. That's why DOM L2 punted on the whole problem, and left it to DOM L3.

--
alan kennedy
email alan: http://xhaus.com/contact/alan
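The defaulting rule quoted in the post above is easy to observe with any namespace-aware parser. The following sketch is an illustration, not code from the thread. It assumes Python 3 and uses xml.etree.ElementTree only as a convenient reader, since it reports each element name with its namespace in {uri}localname form; the helper name child_tag is made up for the example.

```python
import xml.etree.ElementTree as ET


def child_tag(xml_text):
    """Return the expanded name of the first child of the root element."""
    root = ET.fromstring(xml_text)
    return root[0].tag


# Default namespace declared on the parent: the unprefixed child inherits it.
inherited = child_tag('<href xmlns="DAV:"><no_ns/></href>')

# Parent uses a prefix instead: the unprefixed child stays in no namespace.
prefixed = child_tag('<myns:href xmlns:myns="DAV:"><no_ns/></myns:href>')

# Child overrides the default namespace with an empty xmlns declaration.
overridden = child_tag('<href xmlns="DAV:"><no_ns xmlns=""/></href>')

print(inherited)
print(prefixed)
print(overridden)
```

Only the first document puts no_ns into the DAV: namespace, which is the situation described above as wrong behaviour for an element created with no namespace.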
Re: XML and namespaces
[Alan Kennedy]
>> On the contrary, once you start second guessing the standards and making guesses about what users are really trying to do, and making decisions for them, then some people are going to get different behaviour from what they rightfully expect according to the standard. People whose expectations match with the guesses made on their behalf will find that their software is not portable between DOM implementations.

[Fredrik Lundh]
> and this hypothetical situation is different from the current situation in exactly what way?

Hmm, not sure I understand what you're getting at.

If changes are made to minidom that implement non-standard behaviour, there are two groups of people I'm thinking of

1. The people who expect the standard behaviour, not the modified behaviour. From these people's POV, the software can then be considered broken, since it produces different results from what is expected according to the standard.

2. The people who are ignorant of the decisions made on their behalf, and assume that they have written correct code. But their code won't work on other DOM implementations (because the automagic namespace fixup code isn't present, for example). From these people's POV, the software can then be considered broken.

[Alan Kennedy]
>> With something as finicky as XML namespaces, you can't just make ad-hoc decisions as to what the user really wants. That's why DOM L2 punted on the whole problem, and left it to DOM L3.

[Fredrik Lundh]
> so L2 is the "we support namespaces, but we don't really support them" level ?

Well, I read it as "we support namespaces, but only if you know what you're doing".

[Fredrik Lundh]
> maybe we could take everyone involved with the DOM design out to the backyard and beat them with empty PET bottles until they promise never to touch a computer again ?

:-D

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: XML and namespaces
[AMK]
>> (I assume not. Section 1.3.3 of the DOM Level 3 says "Similarly, creating a node with a namespace prefix and namespace URI, or changing the namespace prefix of a node, does not result in any addition, removal, or modification of any special attributes for declaring the appropriate XML namespaces." So the DOM can create XML documents that aren't well-formed w.r.t. namespaces, I think.)

[Uche]
> Oh no. That only means that namespace declaration attributes are not created in the DOM data structure. However, output has to fix up namespaces in .namespaceURI properties as well as directly asserted "xmlns" attributes. It would be silly for DOM to produce malformed XML+XMLNS, and of course it is not meant to. The minidom behavior needs fixing, badly.

My interpretation of namespace nodes is that the application is responsible for creating whatever namespace declaration attribute nodes are required, on the DOM tree.

DOM should not have to imply any attributes on output.

#-=-=-=-=-=-=-=-=-=
import xml.dom
import xml.dom.minidom

DAV_NS_U = "http://webdav.org"

xmldoc = xml.dom.minidom.Document()
xmlroot = xmldoc.createElementNS(DAV_NS_U, "DAV:xpg")
xmlroot.setAttributeNS(xml.dom.XMLNS_NAMESPACE, "xmlns:DAV", DAV_NS_U)
xmldoc.appendChild(xmlroot)
print xmldoc.toprettyxml()
#-=-=-=-=-=-=-=-=-=

produces

<?xml version="1.0" ?>
<DAV:xpg xmlns:DAV="http://webdav.org"/>

Which is well formed wrt namespaces.

regards,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: xpath support in python 2.4
[And80]
> I would like to use xpath modules in python2.4. In my local machine I am running python2.3.5 and on the server I run python2.4. I have seen that while on my computer i am able to import xml.xpath, on the server the module seems to not exist. Is it still part of the standard library?

No, it's not. Not sure if it ever was.

> if not, what should I use?

Install PyXML

http://pyxml.sourceforge.net

HTH,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: Making a persistent HTTP connection
[David Rasmussen]
>> I use urllib2 to do some simple HTTP communication with a web server. In one session, I do maybe 10-15 requests. It seems that urllib2 opens up a connection every time I do a request. Can I somehow make it use _one_ persistent connection where I can do multiple GET-receive data passes before the connection is closed?

[Diez B. Roggisch]
> Are you sure HTTP supports that?

Yes, HTTP 1.1 definitely supports multiple requests on the same connection.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1

Some HTTP 1.0 clients supported persistent connections through the use of the non-standard "keep-alive" header.

> And even if it works - what is the problem with connections being created?

The URL above describes the benefits of persistent connections. The primary problem of the old style of one-request-per-connection is the creation of more sockets than are necessary.

To the OP: neither urllib nor urllib2 implements persistent connections, but httplib does. See the httplib documentation page for an example.

http://www.python.org/doc/2.4.2/lib/httplib-examples.html

However, even httplib is synchronous, in that it cannot pipeline requests: the response to the first request must be completely read before a second request can be issued.

HTH,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
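A minimal sketch of the persistent-connection approach described above, written for Python 3, where httplib is named http.client. It is an illustration rather than the example from the httplib documentation: the throwaway local server, the handler class EchoPathHandler and the helpers fetch_all and demo are all invented here so that the snippet can run without an external web server. Each response is read completely before the next request is issued, as the post requires.

```python
import http.client
import http.server
import threading


class EchoPathHandler(http.server.BaseHTTPRequestHandler):
    """Tiny test server: replies to every GET with the requested path."""
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_all(host, port, paths):
    """Issue several GETs over one HTTPConnection; also record the local
    port used for each request, to show that the socket is reused."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    results, local_ports = [], set()
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be fully read before the next request is sent.
            results.append(response.read().decode("ascii"))
            local_ports.add(conn.sock.getsockname()[1])
    finally:
        conn.close()
    return results, local_ports


def demo():
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch_all(host, port, ["/one", "/two", "/three"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Against a real server only fetch_all is needed; whether the connection actually stays open also depends on the server honouring HTTP 1.1 keep-alive.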
Re: Inserting Records into SQL Server - is there a faster interface than ADO
[EMAIL PROTECTED]
> I have a program that reads records from a binary file and loads them into an MS-SQL Server database. It is using a stored proc, passing the parameters.

[snip]

> So my question is: Is there a faster method I can use to connect to the SQL server ? Or does anyone have any optimization tips they can offer ?

Is there a reason why you need to use a stored procedure?

Do you need to process the data in some way in order to maintain referential integrity of the database?

If the answer to both these questions is "no", then you can use the "bcp" (Bulk CoPy) utility to transfer data into SQL Server *very* quickly.

http://msdn.microsoft.com/library/en-us/coprompt/cp_bcp_61et.asp
http://www.sql-server-performance.com/bcp.asp

thought-it-was-worth-mentioning-ly y'rs,

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: Map of email origins to Python list
[Claire McLister]
> I've made the script available on our downloads page at:
>
> http://www.zeesource.net/downloads/e2i

[Alan Kennedy]
>> I look forward to the map with updated precision :-)

[Claire McLister]
> Me too. Please let me know how we should modify the script.

Having examined your script, I'm not entirely sure what your input source is, so I'm assuming it's an mbox file of the archives from python-list, e.g. as appears on this page

http://mail.python.org/pipermail/python-list/

or at this URL

http://mail.python.org/pipermail/python-list/2005-November.txt

Those messages are the email versions, so all of the NNTP headers, e.g. NNTP-Posting-Host, will have been dropped. You will need these in order to get the geographic location of posts that have been made through NNTP.

In order to be able to get those headers, you need somehow to get the NNTP originals of messages that originated on UseNet. You can see an example of the format, i.e. your message to which I am replying, at this URL

http://groups.google.com/group/comp.lang.python/msg/56e3baabcd4498f2?dmode=source

The NNTP-Posting-Host for that message is '194.109.207.14', which reverses to 'bag.python.org', which is presumably the machine that gatewayed the message from python-list onto comp.lang.python.

So there are a couple of different approaches

1. Get an archive of the UseNet postings to comp.lang.python (anybody know where?)

   A: messages sent through email will have the NNTP-Posting-Host as a machine at python.org, so fall back to your original algorithm for those messages

   B: messages sent through UseNet, or a web gateway to same, will have an NNTP-Posting-Host elsewhere than python.org, so do your geo-lookup on that IP address.

2. Get the python-list archive

   A: Figure out which messages came through the python.org NNTP gateway (not sure offhand if this is possible). Automate a query to Google groups to find the NNTP-Posting-Host (using a URL like the one above). Requires being able to map the python-list message-id to the google groups message-id. Do your geo-lookup on that NNTP-Posting-Host value

   B: Use your original algorithm for messages sent through email.

2A message-id lookup should be achievable through the advanced google groups search, at this URL

http://groups.google.com/advanced_search?q=

See the "Lookup the message with message ID" at the bottom.

Sorry I don't have time to supply code for any of this. Perhaps some one can add more details, or better still some code?

--
alan kennedy
email alan: http://xhaus.com/contact/alan
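The decision at the heart of approach 1 above could look roughly like this. It is a sketch only, assuming Python 3 and the standard email package; the function name classify_origin and the GATEWAY_HOSTS set are invented for the illustration, and the "original algorithm" for email-originated messages is left as a placeholder because its details are not given in the thread.

```python
from email import message_from_string

# Hosts known to gateway python-list mail onto comp.lang.python.
# The single entry is the bag.python.org address quoted in the post.
GATEWAY_HOSTS = {"194.109.207.14"}


def classify_origin(raw_message, gateway_hosts=GATEWAY_HOSTS):
    """Return ("nntp", ip) when the post carries a usable NNTP-Posting-Host,
    otherwise ("email", None), meaning: fall back to the original algorithm."""
    msg = message_from_string(raw_message)
    host = msg.get("NNTP-Posting-Host")
    if host is None:
        return ("email", None)   # no NNTP header at all: an email archive copy
    host = host.strip()
    if host in gateway_hosts:
        return ("email", None)   # gatewayed from the mailing list
    return ("nntp", host)        # posted via UseNet or a web gateway


usenet_post = "From: a@example.invalid\nNNTP-Posting-Host: 68.73.244.37\n\nbody\n"
gated_post = "From: b@example.invalid\nNNTP-Posting-Host: 194.109.207.14\n\nbody\n"
email_post = "From: c@example.invalid\n\nbody\n"

print(classify_origin(usenet_post))
print(classify_origin(gated_post))
print(classify_origin(email_post))
```

The geo-lookup itself would then be run only on the address returned in the "nntp" case.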
Re: Threading- Stopping
[Tuvas]
> Is there a way to stop a thread with some command like t.stop()? Or any other neat way to get around it? Thanks!

Good question.

And one that gets asked so often, I ask myself why it isn't in the FAQ?

http://www.python.org/doc/faq/library.html

It really should be in the FAQ. Isn't that what FAQs are for?

Maybe the FAQ needs to be turned into a wiki?

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: Map of email origins to Python list
[Robert Kern]
>> Most of AOL's offices are in Dulles, VA. Google's headquarters are in Mountain View, CA.

[EMAIL PROTECTED]
> Aha, I post to the usenet through Google. Makes the map application all the more stupid, doesn't it?

Actually, no, because Google Groups sets the NNTP-Posting-Host header to the IP address from which the user connected to Google.

So your post to which I'm replying came from IP address "68.73.244.37", which reverses to "adsl-68-73-244-37.dsl.chcgil.ameritech.net".

http://groups.google.com/group/comp.lang.python/msg/ca06957210fe12ae?dmode=source

So presumably "chcgil" indicates you're in Chicago, Illinois?

Although I do have to point out that the map makes it appear as if I've been busy posting from all over Dublin's Southside, which, as anyone who has seen The Commitments can attest, is a deep insult to a born-and-bred Northsider such as myself ;-)

--
alan kennedy
email alan: http://xhaus.com/contact/alan
Re: Map of email origins to Python list
[Alan Kennedy] So presumably chcgil indicates you're in Chicago, Illinois? [EMAIL PROTECTED] Yes, but why, then, is my name logged into Mountain View, CA? Presumably the creators of the map have chosen to use a mechanism other than NNTP-Posting-Host IP address to geolocate posters. Claire, what mechanism did you use? That justifies my claim of all the more stupid, doesn't it? Well, to me it just says that the map creation software has some bugs that need fixing. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Map of email origins to Python list
[Claire McLister] Thanks, Alan. You are absolutely right, we are not using the NNTP-Posting-Host header for obtaining the IP address. Aha, that would explain the lack of precision in many cases. A lot of posters in this list/group go through NNTP (either with an NNTP client or through NNTP-aware services like Google Groups) which should give very good results, when available. So, we'll have to go back and fix the script that is extracting the IP address (which is written in Python, btw). What better language to write in :-) Let me know if someone is interested in taking a look at it and I can post it somewhere. Sure, please do make it available, or at least the geolocation component anyway. I'm sure you'll get lots of useful comments from the many clever and experienced folk who frequent this group. Don't be aggrieved at the negative comment you've received: I think what you're doing is fascinating. But don't forget that a lot of people are not aware that this kind of geolocation can be done, along with the many other inferences that can be drawn from message and browser headers. So don't be surprised if some of them try to shoot the messenger. I look forward to the map with updated precision :-) -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: modifying source at runtime - jython case
[Jan Gregor] Following try showed me that instances aren't affected by modification in class definition.

Is this more like what you mean?

    c:\>jython
    Jython 2.1 on java1.4.2_09 (JIT: null)
    Type "copyright", "credits" or "license" for more information.
    >>> class a:
    ...   def test(self):
    ...     print "First"
    ...
    >>> x = a()
    >>> x.test()
    First
    >>> def test(self):
    ...   print "Second"
    ...
    >>> a.test = test
    >>> x.test()
    Second

If that's what you're thinking, you should read up on class objects, instance objects and method objects. http://docs.python.org/tut/node11.html#SECTION001132

[Jan Gregor] my typical scenario is that my swing application is running, and i see some error or chance for improvement - modify sources of app, stop and run application again.

You're really talking about an IDE here.

[Jan Gregor] so task is to reload class definitions (from source files) and modify also existing instances (their methods).

You can modify the behaviour of all existing instances of a class, e.g. their methods, by modifying the class itself. Any reference to the method on the instance will resolve to the method definition from its class, provided you haven't overwritten the method on the individual instance. Resolving the method is done at invocation time, because python/jython is a late-binding language.

Any closer? -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: modifying source at runtime - jython case
[Jan Gregor] I want to apply changes in my source code without stopping jython and JVM. Preferable are modifications directly to instances of classes. My application is a desktop app using swing library. Python solutions also interest me. Solution similar to lisp way is ideal.

OK, I'll play 20 questions with you. How close is the following to what you're thinking?

// begin SelfMod.java ---
import org.python.util.PythonInterpreter;
import org.python.core.*;

class SelfMod {

  // Jython source for a class with a single method.
  static String my_class_source =
    "class MyJyClass:\n" +
    "  def hello(self):\n" +
    "    print 'Hello World!'\n";

  static String create_instance = "my_instance = MyJyClass()\n";

  static String invoke_hello = "my_instance.hello()";

  // Replaces the method on the instance only, not on the class.
  static String overwrite_meth =
    "def goodbye():\n" +
    "  print 'Goodbye world!'\n" +
    "\n" +
    "my_instance.hello = goodbye\n";

  public static void main ( String args[] ) {
    PythonInterpreter interp = new PythonInterpreter();
    interp.exec(my_class_source);
    interp.exec(create_instance);
    interp.exec(invoke_hello);
    interp.exec(overwrite_meth);
    interp.exec(invoke_hello);
  }
}
// end SelfMod.java ---

need-to-complete-my-coursework-for-telepathy-101-ly y'rs -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: XML DOM: XML/XHTML inside a text node
[EMAIL PROTECTED] In my program, I get input from the user and insert it into an XHTML document. Sometimes, this input will contain XHTML, but since I'm inserting it as a text node, xml.dom.minidom escapes the angle brackets ('<' becomes '&lt;', '>' becomes '&gt;'). I want to be able to override this behavior cleanly. Why? You need to make a decision on how the contained xhtml is treated after it has been inserted into the document. 1. If it is simply textual payload, then it should be perfectly acceptable to escape those characters. Or you could include it as a CDATA section. 2. If it needs to become a structural part of the xml document, i.e. the elements are structurally incorporated into the document, then you need to transform it into nodes somehow, e.g. by parsing it with sax, etc. Although it would probably be easier to parse it into a separate DOM and import the generated root node into your document. Is this xhtml coming from a trusted source? Or are you accepting it from strangers, over the internet? If the latter, there are security concerns relating to XSS attacks that you need to be aware of. See the following archive post for how to clean up untrusted (x)html. http://groups.google.com/group/comp.lang.python/browse_thread/thread/fbdc7ae20353a36d/91b6510990a25f9a HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
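The two options above might look roughly like this with xml.dom.minidom in Python 3. This is a sketch only: the function names, the div container and the sample input are invented for illustration, and it does nothing about the XSS concern, so untrusted input would still need cleaning first.

```python
from xml.dom import minidom

USER_INPUT = "<p>Hello <b>world</b></p>"   # illustrative input only


def insert_as_text(user_input):
    """Option 1: textual payload; minidom escapes the markup on output."""
    doc = minidom.parseString("<div/>")
    doc.documentElement.appendChild(doc.createTextNode(user_input))
    return doc.documentElement.toxml()


def insert_as_nodes(user_input):
    """Option 2: parse into a separate DOM, then import its root node."""
    doc = minidom.parseString("<div/>")
    fragment = minidom.parseString(user_input)   # raises if not well-formed
    imported = doc.importNode(fragment.documentElement, True)  # deep copy
    doc.documentElement.appendChild(imported)
    return doc.documentElement.toxml()


as_text = insert_as_text(USER_INPUT)
as_nodes = insert_as_nodes(USER_INPUT)
```

Note that option 2 only works when the input is a single well-formed element; a fragment with several top-level elements would need wrapping first.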
Re: WTF?
[James Stroud] Why do my posts get held for suspicious headers? You're probably trying to post through the python-list email address, which has had SPAM problems in the past, because the email address has been used by spammers as a forged from address, meaning the bounces would go to everyone, including being gateway'ed to comp.lang.python, which is the NNTP group in which many people read this list. Have you tried using an NNTP client instead, or using a web interface such as Google Groups? http://groups.google.com/group/comp.lang.python?gvc=2 [James Stroud] and troll Xha Lee gets to post all sorts of profanity and ranting without any problem? Take a look at the source of XL's messages: he posts through Google Groups, thus completely avoiding the SPAM filter on python.org. http://groups.google.com/group/comp.lang.python/msg/762c8dad1928ecc2?dmode=source -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Python and MySQL
[Aquarius] I apologize in advance for this strange (and possibly stupid) question. I want to know if there is a way to interface a MySQL database without Python-MySQL or without installing anything that has C files that need to be compiled. The reason for this, is that I want to develop a certain web application, but my hosting provider ([EMAIL PROTECTED]@#%) isn't very eager to supply Python-MySQL (or any modules to python). Is there an alternative approach I could use to pass around this ridiculous lack of functionality? Possibly not what you want to hear, but I'd strongly recommend that you stop wasting your time with a hosting company that doesn't support the technologies you need. Instead, try a hosting company that supports python: there are lots and lots. http://wiki.python.org/moin/PythonHosting Life's too short to spend your time hacking around artificial barriers to progress. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: help with concurrency control (threads/processes signals)
[Sori Schwimmer] I am working on an application which involves interprocess communication. More to the point, processes should be able to notify other processes about certain situations, so the notifyees would be able to act in certain ways. As processes are distributed on several machines, in different physical locations, my thinking was: a) set a message manager (MM) b) all the participants will register with MM, so MM will have their host address and their pid on host c) when someone needs to send a notification, it is sent to MM, and MM it's doing the job [snip] Life is a struggle. Programming in Python shouldn't be. Ergo, I'm doing something wrong. Any advice? Rather than rolling your own, have you considered using the spread module: robust, tested, efficient and no infrastructure development required. http://www.zope.org/Members/tim_one/spread/ http://www.python.org/other/spread/ The latter page has links to the original C spread module, which has documentation, FAQs, etc. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: 1 Million users.. I can't Scale!!
[yoda] I really need help because my application currently can't scale. Some users end up getting their data 30 seconds after generation (best case) and up to 5 minutes after content generation. This is simply unacceptable. The subscribers deserve much better service if my startup is to survive in the market. My questions therefore are: 1) Should I switch to stackless python or should I carry out experiments with multithreading the application? 2) What architectural suggestions can you give me? 3) Has anyone encountered such a situation before? How did you deal with it? 4) Lastly, and probably most controversial: Is python the right language for this? I really don't want to switch to Lisp, Icon or Erlang as yet. I highly recommend reading the following paper on the architecture of highly concurrent systems. A Design Framework for Highly Concurrent Systems, Welsh et al. http://www.eecs.harvard.edu/~mdw/papers/events.pdf The key principle that I see being applicable to your scenario is to have a fixed number of delivery processes/threads. Welsh terms this the width of your delivery channel. The number should match the number of delivery channels that your infrastructure can support. If you are delivering your SMSs by SMPP, then there is probably a limit to the number of messages/second that your outgoing SMPP server can handle. If you go above that limit, then you might cause thrashing or overload in that server. If you're delivering by an actual GSM mobile serially connected to your server/pc, then you should have a single delivery process/thread for each connected mobile. These delivery processes/threads would be fed by queues of outgoing SMSs. If you want to use a multithreaded design, then simply use a python Queue.Queue for each delivery channel. If you want to use a multi-process design, devise a simple protocol for communicating those messages from your generating database/process to your delivery channel over TCP sockets. 
As explained in Welsh's paper, you will get the highest stability by ensuring that your delivery channels only receive as many messages as the outgoing transmission mechanism can actually handle. If you devise a multi-process solution, using TCP sockets to distribute messages from your generating application to your delivery channels, then it would be very straightforward to scale that up to multiple processes running on either a multiple-core-cpu, a multiple-cpu-server, or a multiple-server-network. All of this should be achievable with python. Some questions: 1. How are you transmitting your SMSs? 2. If you disable the actual transmission, how many SMSs can your application generate per second? HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
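A rough sketch of the multithreaded variant described above, written for Python 3, where Queue.Queue is spelled queue.Queue. The deliver function, the round-robin dispatch and the stand-in send callables are assumptions made for illustration; a real sender would talk to an SMPP server or a GSM modem. The bounded queues are what keep each channel from being handed more than it can handle: put() blocks the producer when a channel falls behind.

```python
import queue
import threading


def deliver(messages, senders, maxsize=10):
    """Feed messages to a fixed number of delivery channels.

    `senders` holds one callable per channel, so the channel width is
    simply len(senders). Each channel gets its own bounded queue.
    """
    queues = [queue.Queue(maxsize) for _ in senders]

    def worker(q, send):
        while True:
            msg = q.get()
            if msg is None:        # sentinel: no more work for this channel
                return
            send(msg)

    threads = [threading.Thread(target=worker, args=(q, send))
               for q, send in zip(queues, senders)]
    for t in threads:
        t.start()
    for i, msg in enumerate(messages):
        queues[i % len(queues)].put(msg)   # blocks while that queue is full
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()


# Stand-in senders that just record which channel handled which message.
sent = []
senders = [lambda msg, channel=n: sent.append((channel, msg)) for n in range(3)]
deliver(range(10), senders)
```

Round-robin dispatch is the simplest choice here; a single shared queue read by all workers would balance uneven channels better, at the cost of departing from the one-queue-per-channel layout suggested in the post.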
Re: Python xml.dom, help reading attribute data
[Thierry Lam] Let's say I have the following xml tag: <para role="success">1</para> I can't figure out what kind of python xml.dom codes I should invoke to read the data 1? Any help please?

This is a job for xpath.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
from xml import xpath
from xml.dom import minidom

doc = """<para role="success">1</para>"""

mydom = minidom.parseString(doc)
#result_nodes = xpath.Evaluate("/para/text()", mydom)
#result_nodes = xpath.Evaluate("/para[1]/text()", mydom)
result_nodes = xpath.Evaluate("/para[@role='success']/text()", mydom)
for ix, r in enumerate(result_nodes):
    print "result %d: %s" % (ix, r)
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Xpath support is a part of the PyXML package, which you can get from here http://pyxml.sourceforge.net

Xpath tutorials from here http://www.zvon.org/xxl/XPathTutorial/General/examples.html http://www.w3schools.com/xpath/

there-are-other-ways-to-do-it-but-i-like-xpath-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
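As for the "other ways to do it": here is a sketch using only today's standard library, since the code above depends on the PyXML package and Python 2 print syntax. The wrapping doc element and the second para are added purely so that the attribute filter has something to exclude. The first variant uses plain xml.dom calls, the second uses the limited XPath subset that xml.etree.ElementTree supports.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

DOC = '<doc><para role="success">1</para><para role="failure">2</para></doc>'

# Plain DOM: filter on the attribute, then read the text node's data.
dom = minidom.parseString(DOC)
dom_values = [
    para.firstChild.data
    for para in dom.getElementsByTagName("para")
    if para.getAttribute("role") == "success"
]

# ElementTree: the [@attrib='value'] predicate is part of its XPath subset.
root = ET.fromstring(DOC)
et_values = [para.text for para in root.findall("para[@role='success']")]
```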
Re: Django Vs Rails
[D H] Go with Rails. Django is only like a month old. [bruno modulix] Please take time to read the project's page. Django has in fact three years of existence and is already used on production websites, so it's far from pre-alpha/planning stage. But the APIs still aren't 100% stable. http://www.djangoproject.com/screencasts/model_syntax_change/ -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: dual processor
[Jeremy Jones] One Python process will only saturate one CPU (at a time) because of the GIL (global interpreter lock). [Nick Craig-Wood] I'm hoping python won't always be like this. Me too. However its crystal clear now the future is SMP. Definitely. So, I believe Python has got to address the GIL, and soon. I agree. I note that PyPy currently also has a GIL, although it should hopefully go away in the future. Armin and Richard started to change genc so that it can handle the new external objects that Armin had to introduce to implement threading in PyPy. For now we have a simple GIL but it is not really deeply implanted in the interpreter so we should be able to change that later. After two days of hacking the were finished. Despite that it is still not possible to translate PyPy with threading because we are missing dictionaries with int keys on the RPython level. http://codespeak.net/pipermail/pypy-dev/2005q3/002287.html The more I read about such global interpreter locks, the more I think that the difficulty in getting rid of them lies in implementing portable and reliable garbage collection. Read this thread to see what Matz has to say about threading in Ruby. http://groups.google.com/group/comp.lang.ruby/msg/dcf5ca374e6c5da8 One of these years I'm going to have to set aside a month or two to go through and understand the cpython interpreter code, so that I have a first-hand understanding of the issues. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: SpamBayes wins PCW Editors Choice Award for anti-spam software.
[Alan Kennedy] ... PCW ran a story this time last year about Michael Sparks, python and python's use in the BBC's future distribution plans for digital TV. [Paul Boddie] Well, I didn't even notice the story! ;-) Here's the message I posted here at the time http://groups.google.com/group/comp.lang.python/msg/4a33f07f11a0ef30 regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: SpamBayes wins PCW Editors Choice Award for anti-spam software.
[Alan Kennedy] IMHO, there is a great opportunity here for the python community: [...] Surely that's worth a simple team name, for mnemonic purposes if nothing else. Something different or unusual, like one of my favourites, Legion of the Bouncy Castle, who are a group of Java cryptography dudes [Tony Meyer] Is there really anything to be gained by referring to the SpamBayes development team via some cryptic name? Yes, there is something to be gained: mindshare. A simple catchy memorable team name could work wonders for showing potential users that this excellent open-source software product was produced by a team of real people, using the world's best development language ;-) And produced by people who have a pythonic sense of humour. Off the top of my head suggestions include - The Python Anti-Spam Cabal - The Flying Circus - The SPAM Vikings (*) - The Knights who go Nih - The Ministry of Funny Software - Masters of the pythonic time machine (probably too pretentious) * http://www.2famouslyrics.com/m/monty-python/spam-song.html [Tony Meyer] You can call us the SDT if you like. Well, I said an interesting name ;-) [Tony Meyer] Should the Python developers likewise get some cryptic name? No, they'll always be the python-dev cabal to me. basic-marketing-is-easy-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: SpamBayes wins PCW Editors Choice Award for anti-spam software.
[Alan Kennedy] (PCW, for those who don't know it, is sort of the UK's equivalent of Byte Magazine, except that it's still publishing after almost 25 years). [Paul Boddie] Hmmm. Even Byte at its lowest point was far better than PCW ever was. Well, I mostly disagree, but you've got your opinion. I personally preferred Byte, particularly because of their orientation towards what the PC market would become, not just how it currently was, e.g. they would run articles on RISC vs. CISC when the battle was just starting. But Byte went out of business: obviously not enough people cared about what it had to say. PCW is, and always has been, focussed on the state of the market as is, which means they're always testing and reviewing the stuff you can get off the shelves right now. Whenever I buy a peripheral, e.g. scanner/fax, optical writer, digital camera, etc, I always check if PCW has anything to say about it, or about the class of peripherals: chances are they've done a reasonably thorough review quite recently. And they are still around, after all these decades, because they provide information that people want. [Alan Kennedy] The only problem was they listed the manufacturer of the software as SourceForge, so the product was known as SourceForge SpamBayes. [Paul Boddie] PCW may still be publishing after 25 years (half the magazine being adverts probably keeps it just about economically viable), but they clearly haven't yet managed to shake off that classic 1980s mindset where everything is a product by a company (and, given the superficial understanding of software licensing still likely to be pervasive in the mainstream UK IT press, everything else is public domain). Don't forget that the comprehension of IT journalists is generally a good indicator of the comprehension of the general computer-using public. 
But I generally find that the journos at PCW are a little more enlightened than average: rather than copying and pasting corporate product announcements, they actually use the stuff they comment on. I personally put great store in the fact that PCW awarded the Editors Choice award to SpamBayes, because it's based on actually *using* the software, rather than doing a simple feature comparison. 95% of the people who read the review and download/install SpamBayes won't give a monkey's what language it's written in. But they'll still have a modern python interpreter installed on their system as a consequence. IMHO, there is a great opportunity here for the python community: SpamBayes is a best-of-breed product in a very important market, anti-SPAM: it even beat commercial competitors. SPAM has become an enormous logistical, financial, commercial and legal problem across the world, purportedly costing billions of dollars (virus distribution, phishing, scams, etc). If the community ever wanted to prove python to be a serious language, here's a fine opportunity. Surely that's worth a simple team name, for mnemonic purposes if nothing else. Something different or unusual, like one of my favourites, Legion of the Bouncy Castle, who are a group of Java cryptography dudes http://www.bouncycastle.org (Also, I've often seen PCW refer to open source apps as manufactured by individuals or teams: it's just that in this case the SpamBayes team have made no name available). [Paul Boddie] As for URLs and other things, last time I looked at the PCW Web site, it was all time-limited (or page-view-limited) viewing for non-subscribers. If British print distribution wasn't such a lock-in, I'd imagine PCW would have taken its place alongside Byte, staring at us from the print media fossil record. I didn't notice any complaints when PCW ran a story this time last year about Michael Sparks, python and python's use in the BBC's future distribution plans for digital TV. 
old-fashioned-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
SpamBayes wins PCW Editors Choice Award for anti-spam software.
Hi All, If there are any contributors of SpamBayes reading, Congratulations! SpamBayes has won the Personal Computer World (pcw.co.uk) Editors Choice award for anti-spam software, in a review of anti-SPAM solutions in the October 2005 edition. (PCW, for those who don't know it, is sort of the UK's equivalent of Byte Magazine, except that it's still publishing after almost 25 years). SpamBayes was one of two open-source apps in the group review, which included commercial products from Symantec, McAfee, and half a dozen other companies. ... SpamBayes 1.0.1 is definitely in a league of its own: during our tests it obtained a 100% real success rate. It would have to be trained for several months in order to check that it isn't too strict on a daily basis and that it lets most of the good messages through. However, the fact that it's free, offers a high level of efficiency and is compatible with Outlook makes it ideal for anyone looking for a zero cost solution. As such, we think it deserves our Editor's Choice award. As with all Bayesian filters, it gets better with use, especially in terms of detecting wanted mail. The only problem was they listed the manufacturer of the software as SourceForge, so the product was known as SourceForge SpamBayes. You guys need to come up with a team/manufacturer name. (But there is a screenshot of the software in the review, and the Python Powered logo is right there for all to see). Congratulations! Unfortunately, PCW don't seem to have made the review available online (yet), so I can't provide a URL. Maybe someone else will have more success finding a URL? thought-ye'd-like-to-know-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: python xml DOM? pulldom? SAX?
[jog] I want to get text out of some nodes of a huge xml file (1,5 GB). The architecture of the xml file is something like this [snip] I want to combine the text out of page:title and page:revision:text for every single page element. One by one I want to index these combined texts (so for each page one index) What is the most efficient API for that?: SAX (I don't think so) SAX is perfect for the job. See code below. [jog] DOM If your XML file is 1.5G, you'll need *lots* of RAM and virtual memory to load it into a DOM. [jog] or pulldom? Not sure how pulldom does its pull optimizations, but I think it still builds an in-memory object structure for your document, which will still take buckets of memory for such a big document. I could be wrong though. [jog] Or should I just use Xpath somehow. Using xpath normally requires building a (D)OM, which will consume *lots* of memory for your document, regardless of how efficient the OM is. Best to use SAX and XPATH-style expressions. You can get a limited subset of xpath using a SAX handler and a stack. Your problem is particularly well suited to that kind of solution. Code that does a basic job of this for your specific problem is given below. Note that there are a number of caveats with this code 1. Character data handlers may get called multiple times for a single xml text() node. This is permitted in the SAX spec, and is basically a consequence of using buffered IO to read the contents of the xml file, e.g. the start of a text node is at the end of the last buffer read, and the rest of the text node is at the beginning of the next buffer. 2. This code assumes that your revision/text nodes do not contain mixed content, i.e. a mixture of elements and text, e.g. <revision><text>This is a piece of <b>revision</b> text</text></revision>. The below code will fail to extract all character data in that case.

#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
import xml.sax
import xml.sax.handler

class Page:

    def append(self, field_name, new_value):
        # characters() may deliver one text node in several chunks,
        # so concatenate rather than overwrite.
        old_value = ""
        if hasattr(self, field_name):
            old_value = getattr(self, field_name)
        setattr(self, field_name, "%s%s" % (old_value, new_value))

class page_matcher(xml.sax.handler.ContentHandler):

    def __init__(self, page_handler=None):
        xml.sax.handler.ContentHandler.__init__(self)
        self.page_handler = page_handler
        self.stack = []

    def check_stack(self):
        # Turn the element stack into an xpath-like expression.
        stack_expr = "/" + "/".join(self.stack)
        if '/parent/page' == stack_expr:
            self.page = Page()
        elif '/parent/page/title/text()' == stack_expr:
            self.page.append('title', self.chardata)
        elif '/parent/page/revision/id/text()' == stack_expr:
            self.page.append('revision_id', self.chardata)
        elif '/parent/page/revision/text/text()' == stack_expr:
            self.page.append('revision_text', self.chardata)
        else:
            pass

    def startElement(self, elemname, attrs):
        self.stack.append(elemname)
        self.check_stack()

    def endElement(self, elemname):
        if elemname == 'page' and self.page_handler:
            self.page_handler(self.page)
            self.page = None
        self.stack.pop()

    def characters(self, data):
        # Treat the text chunk as a temporary 'text()' step on the stack.
        self.chardata = data
        self.stack.append('text()')
        self.check_stack()
        self.stack.pop()

testdoc = """
<parent>
    <page>
        <title>Page number 1</title>
        <id>p1</id>
        <revision>
            <id>r1</id>
            <text>revision one</text>
        </revision>
    </page>
    <page>
        <title>Page number 2</title>
        <id>p2</id>
        <revision>
            <id>r2</id>
            <text>revision two</text>
        </revision>
    </page>
</parent>
"""

def page_handler(new_page):
    print "New page"
    print "title\t\t%s" % new_page.title
    print "revision_id\t%s" % new_page.revision_id
    print "revision_text\t%s" % new_page.revision_text
    print

if __name__ == "__main__":
    parser = xml.sax.make_parser()
    parser.setContentHandler(page_matcher(page_handler))
    parser.setFeature(xml.sax.handler.feature_namespaces, 0)
    parser.feed(testdoc)
    parser.close()  # tell the incremental parser that the input is complete
#-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: python xml DOM? pulldom? SAX?
[Alan Kennedy] SAX is perfect for the job. See code below. [Fredrik Lundh] depends on your definition of perfect... Obviously, perfect is in the eye of the beholder ;-) [Fredrik Lundh] using a 20 MB version of jog's sample, and having replaced the print statements with local variable assignments, I get the following timings: 5 lines of cElementTree code: 7.2 seconds 60+ lines of xml.sax code: 63 seconds (Python 2.4.1, Windows XP, Pentium 3 GHz) Impressive! At first, I thought your code sample was building a tree for the entire document, so I checked the API docs. It appeared to me that an event processing model *couldn't* obtain the text for the node when notified of the node: the text node is still in the future. That's when I understood the nature of iterparse, which must generate an event *after* the node is complete, and its subdocument reified. That's also when I understood the meaning of the elem.clear() call at the end. Only the required section of the tree is modelled in memory at any given time. Nice. There are some minor inefficiencies in my pure python sax code, e.g. building the stack expression for every evaluation, but I left them in for didactic reasons. But even if every possible speed optimisation was added to my python code, I doubt it would be able to match your code. I'm guessing that a lot of the reason why cElementTree performs best is because the model-building is primarily implemented in C: Both of our solutions run python code for every node in the tree, i.e. are O(N). But yours also avoids the overhead of having function-calls/stack-frames for every single node event, by processing all events inside a single function. If the SAX algorithm were implemented in C (or Java, for that matter), I wonder if it might give comparable performance to the cElementTree code, primarily because the data structures it is building are simpler, compared to the tree-subsections being reified and discarded by cElementTree. 
But that's not of relevance, because we're looking for python solutions. (Aside: I can't wait to run my solution on a fully-optimising PyPy :-) That's another nice thing I didn't know (c)ElementTree could do. enlightened-ly'yrs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
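Fredrik's five lines are not quoted in this message, so the following is only a guess at their shape, written against xml.etree.ElementTree in Python 3 (where the C accelerator is used automatically). The iter_pages name and the small test document are illustrative. It shows the two points discussed above: "end" events fire only once an element's subtree is complete, and elem.clear() discards each page after it has been handled.

```python
import io
import xml.etree.ElementTree as ET

TESTDOC = b"""<parent>
  <page><title>Page number 1</title><id>p1</id>
    <revision><id>r1</id><text>revision one</text></revision></page>
  <page><title>Page number 2</title><id>p2</id>
    <revision><id>r2</id><text>revision two</text></revision></page>
</parent>"""


def iter_pages(source):
    """Yield (title, revision_text) for each <page>, one page at a time."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            # The whole <page> subtree exists by the time "end" fires.
            title = elem.findtext("title", default="")
            text = elem.findtext("revision/text", default="")
            yield title, text
            elem.clear()   # drop the children so memory use stays bounded


pages = list(iter_pages(io.BytesIO(TESTDOC)))
```

The emptied page elements stay attached to the root, so for a file of the size jog describes it may also be worth clearing or detaching them from the root periodically.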
Re: Sanitizing untrusted code for eval()
[Jim Washington] I'm still working on yet another parser for JSON (http://json.org). It's called minjson, and it's tolerant on input, strict on output, and pretty fast. The only problem is, it uses eval(). It's important to sanitize the incoming untrusted code before sending it to eval(). I think that you shouldn't need eval to parse JSON. For a discussion of the use of eval in pyjsonrpc, between me and the author, Jan-Klaas Kollhof, see the content of the following links. A discussion of the relative time *in*efficiency of eval is also included: it is much faster to use built-in functions such as str and float to convert from JSON text/tokens to strings and numbers. http://mail.python.org/pipermail/python-list/2005-February/265805.html http://groups.yahoo.com/group/json-rpc/message/55 Pyjsonrpc uses the python tokeniser to split up JSON strings, which means that you cannot be strict about things like double (") vs. single (') quotes, etc. JSON is so simple, I think it best to write a tokeniser and parser for it, either using a parsing library, or just coding your own. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
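To give an idea of what "just coding your own" might involve, here is a compact sketch of a regex tokeniser plus a recursive-descent parser in Python 3. It is neither minjson nor pyjsonrpc; every name in it is made up, and it is looser than the JSON grammar in places (it accepts leading zeros in numbers and raw control characters in strings). These days the standard library's json module is the obvious choice; the sketch only illustrates that no eval() is needed and that tokens are converted with int, float and plain string handling.

```python
import re

_TOKEN = re.compile(r'''
    \s*(?:
        (?P<punct>[{}\[\],:])
      | (?P<string>"(?:[^"\\]|\\.)*")
      | (?P<number>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
      | (?P<word>true|false|null)
    )''', re.VERBOSE)

_ESCAPES = {'"': '"', '\\': '\\', '/': '/', 'b': '\b',
            'f': '\f', 'n': '\n', 'r': '\r', 't': '\t'}
_WORDS = {'true': True, 'false': False, 'null': None}


def tokenize(text):
    """Yield (kind, token_text) pairs; single-quoted strings are rejected."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad JSON at offset %d" % pos)
        pos = match.end()
        yield match.lastgroup, match.group(match.lastgroup)


def _unescape(token):
    def repl(m):
        if m.group(1):                      # \uXXXX escape
            return chr(int(m.group(1), 16))
        if m.group(2) not in _ESCAPES:
            raise ValueError("bad escape \\%s" % m.group(2))
        return _ESCAPES[m.group(2)]
    return re.sub(r'\\(?:u([0-9a-fA-F]{4})|(.))', repl, token[1:-1])


def _value(tokens, i):
    """Parse one value starting at tokens[i]; return (value, next_index)."""
    kind, tok = tokens[i]
    if kind == 'string':
        return _unescape(tok), i + 1
    if kind == 'number':
        is_float = any(c in tok for c in '.eE')
        return (float(tok) if is_float else int(tok)), i + 1
    if kind == 'word':
        return _WORDS[tok], i + 1
    if tok == '[':
        items, i = [], i + 1
        if tokens[i][1] == ']':
            return items, i + 1
        while True:
            item, i = _value(tokens, i)
            items.append(item)
            if tokens[i][1] == ']':
                return items, i + 1
            if tokens[i][1] != ',':
                raise ValueError("expected ',' or ']'")
            i += 1
    if tok == '{':
        obj, i = {}, i + 1
        if tokens[i][1] == '}':
            return obj, i + 1
        while True:
            kind, key = tokens[i]
            if kind != 'string' or tokens[i + 1][1] != ':':
                raise ValueError("expected a string key followed by ':'")
            obj[_unescape(key)], i = _value(tokens, i + 2)
            if tokens[i][1] == '}':
                return obj, i + 1
            if tokens[i][1] != ',':
                raise ValueError("expected ',' or '}'")
            i += 1
    raise ValueError("unexpected token %r" % tok)


def parse(text):
    tokens = list(tokenize(text))
    value, end = _value(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing data after JSON value")
    return value


parsed = parse('{"a": [1, 2.5, "x\\ny"], "b": null, "c": true}')
```

Truncated input surfaces as an IndexError in this sketch; a fuller version would turn that into a ValueError as well.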
Re: global interpreter lock
[km] is true parallelism possible in python ? cpython:no. jython: yes. ironpython: yes. or atleast in the coming versions ? cpython:unknown. pypy: don't have time to research. Anyone know? is global interpreter lock a bane in this context ? beauty/bane-is-in-the-eye-of-the-beholder-ly y'rs -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: global interpreter lock
[Bryan Olson] I don't see much point in trying to convince programmers that they don't really want concurrent threads. They really do. Some don't know how to use them, but that's largely because they haven't had them. I doubt a language for thread-phobes has much of a future. [Mike Meyer] The real problem is that the concurrency models available in currently popular languages are still at the goto stage of language development. Better models exist, have existed for decades, and are available in a variety of languages. I think that having a concurrency mechanism that doesn't use goto will require a fundamental redesign of the underlying execution hardware, i.e. the CPU. All modern CPUs allow flow control through the use of machine-code/assembly instructions which branch, either conditionally or unconditionally, to either a relative or absolute memory address, i.e. a GOTO. Modern languages wrap this goto nicely using constructs such as generators, coroutines or continuations, which allow preservation and restoration of the execution context, e.g. through closures, evaluation stacks, etc. But underneath the hood, they're just gotos. And I have no problem with that. To really have parallel execution with clean modularity requires a hardware redesign at the CPU level, where code units, executing in parallel, are fed a series of data/work-units. When they finish processing an individual unit, it gets passed (physically, at a hardware level) to another code unit, executing in parallel on another execution unit/CPU. To achieve multi-stage processing of data would require breaking up the processing into a pipeline of modular operations, which communicate through dedicated hardware channels. 
I don't think I've described it very clearly above, but you can read a good high-level overview of a likely model from the 1980's, the Transputer, here http://en.wikipedia.org/wiki/Transputer Transputers never took off, for a variety of technical and commercial reasons, even though there was full high-level programming language support in the form of Occam: I think it was just too brain-bending for most programmers at the time. (I personally *almost* took on the task of developing a debugger for transputer arrays for my undergrad thesis in 1988, but when I realised the complexity of the problem, I picked a hypertext project instead ;-) http://en.wikipedia.org/wiki/Occam_programming_language IMHO, python generators (which BTW are implemented with a JVM goto instruction in jython 2.2) are a nice programming model that fits neatly with this hardware model. Although not today. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
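The remark above about generators fitting a pipeline-of-work-units model can be illustrated in software. This is only a sketch of the programming style, not of any Transputer or Occam API; the stage names (source, square, keep_even) are invented for illustration, and everything runs sequentially in one thread.

```python
# Each stage consumes work units from the stage upstream and hands its
# results downstream, one unit at a time. The stage names are hypothetical.

def source(items):
    """First stage: feed raw work units into the pipeline."""
    for item in items:
        yield item

def square(upstream):
    """Middle stage: transform each unit as it arrives."""
    for value in upstream:
        yield value * value

def keep_even(upstream):
    """Last stage: pass on only the units that meet a condition."""
    for value in upstream:
        if value % 2 == 0:
            yield value

# Wire the stages together; nothing executes until the result is consumed.
pipeline = keep_even(square(source(range(1, 7))))
result = list(pipeline)
print(result)
```

Each stage only knows about the iterator feeding it, which is roughly the modularity the post describes; real parallel execution would need each stage on its own execution unit.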
Re: JBUS and Python which way
[mjekl] My aim is to have an idea of the alternatives technologies for accessing information produced by a machine with a JBUS interface (RS232) and how to access this information realtime in Python (connecting a PC locally via serial port). I'm aware of pyserial but I wonder if there is a library/module that takes care of accessing/interpreting JBUS protocol. I've searched for this without results. A possibility you may not have considered is to use a Java library for Modbus/JBus, and then use jython to control that. The following looks like a likely candidate. http://sourceforge.net/projects/jamod/ I imagine that writing your own cpython implementation wouldn't be that difficult. I did some modbus work in C back in the 90s, and it was pretty straightforward, but requiring a lot of finicky bit-twiddling. I'm pretty certain that writing a python implementation would be a snap. I also searched the net looking for some information so that I could have a birds-eye-view on this subject and got the impression that a possibility is to have the communication (JBUS protocol / buffering) managed by some hardware component. Is this so? Can some-one give me some pointers/resources on this subject. Would it still be possible to work with Python. Well, if you do find some hardware component that manages the JBus interface, you've then turned your problem into How to talk between the PC and the JBus instrument-manager rather than How to talk between the PC and JBus instruments. Depending on the protocol used by the instrument-manager, you may be able to use python to control that. HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
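As a rough idea of the "finicky bit-twiddling" mentioned above, here is a minimal sketch of building one request frame in pure Python. It assumes the instrument accepts standard Modbus RTU framing with the usual CRC-16 (JBus is treated in the post as a Modbus relative, so check the device manual before relying on this); the function names are hypothetical, and sending the bytes over the serial port (for example with pyserial) is left out.

```python
import struct

def crc16_modbus(data):
    """CRC-16 as used by Modbus RTU: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in bytearray(data):
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def read_holding_registers_frame(slave, address, count):
    """Build a 'read holding registers' (function 0x03) request frame."""
    # Slave id, function code, start address and register count are big-endian.
    body = struct.pack(">BBHH", slave, 0x03, address, count)
    # The CRC is appended low byte first.
    return body + struct.pack("<H", crc16_modbus(body))

frame = read_holding_registers_frame(slave=1, address=0, count=1)
print(" ".join("%02X" % b for b in bytearray(frame)))
```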
Re: Counting processors
[Pauldoo] Is there a way in python to obtain the total number of processors present in the system? On Windows, see the script "List Processor Information". Description: Returns information about the processors installed on a computer. http://www.microsoft.com/technet/scriptcenter/scripts/Python/hardware/basic/hwbapy01.mspx -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
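For readers on a current Python, the standard library can answer this question portably. A minimal sketch, assuming Python 3.4 or later, where os.cpu_count() is available; the wrapper name processor_count is made up for this example.

```python
import os

def processor_count():
    """Return the number of logical CPUs, or 1 if it cannot be determined."""
    # os.cpu_count() returns None when the count is unknown.
    return os.cpu_count() or 1

print(processor_count())
```

Note that this counts logical CPUs visible to the operating system, which may differ from the number of physical processors.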
Re: Java RMI-like services in Python
[Maurice LING] I am wondering if Python has services or frameworks that does the same as Java RMI? As Harald mentioned, Pyro is firmly in the Remote Method Invocation space. And there's always CORBA, of which there are multiple python and java implementations. Which might be useful, if you wanted to have a mixed language implementation. Another technology that could be very useful for you is Spread, for which both python and java libraries exist. http://www.zope.org/Members/tim_one/spread/ [Maurice LING] What I am seeking is to do pseudo-clustering. [ .. snip .. ] I know something like this had been achieved in Java (http://www-128.ibm.com/developerworks/java/library/j-super.html) but wondering if it is possible in Python. Is so, how? So, do you want to A: Build your own pseudo-clustering implementation? B: Use one that's already been written? If the answer is the latter, I recommend you take a look at PyLinda. PyLinda - Distributed Computing Made Easy http://www-users.cs.york.ac.uk/~aw/pylinda/ -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: When someone from Britain speaks, Americans hear a British accent...
[Mike Holmans] Some of those sonorous slow talkers from the South, and majestic bass African-Americans like James Earl Jones or Morgan Freeman, have far more gravitas than any English accent can: to us, such people sound monumental. On a related note, have you ever seen any of the original undubbed Star Wars scenes with Darth Vader, with the original voice of the English actor who played him, Dave Prowse (The Green Cross Man, for those who remember ;-) Problem was, Mr. Prowse has a pronounced West Country accent. Imagine it: Darth Vader (in the voice of Farmer Giles): You are a Rebel, and a Traitor to the Empire. Hilarious :-D, and impossible to take seriously. Thankfully they overdubbed it with James Earl Jones, Born in Mississippi, raised in Michigan, who produced one of the finest and most memorable voice performances in modern cinema. get-orff-moy-lahnd-ly y'rs -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: How to implement a file lock ??
[ionel] is there a cross-platform library or module for file locking? or at least a win32 implementation. i'm trying to get a lock on some file in a cgi script. ( i got my data erased a few times :P ) portalocker - Cross-platform (posix/nt) API for flock-style file locking. http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/65203 HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
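The idea behind a flock-style wrapper can be sketched with the standard library alone. This is not the portalocker recipe itself, only an illustration under assumptions: fcntl.flock is used where available (POSIX), msvcrt.locking otherwise (Windows), and the helper names (_lock, _unlock, locked_file) are invented here.

```python
import os
import tempfile
from contextlib import contextmanager

try:
    import fcntl

    def _lock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)   # blocks until the lock is free

    def _unlock(f):
        fcntl.flock(f.fileno(), fcntl.LOCK_UN)
except ImportError:
    import msvcrt

    def _lock(f):
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)   # lock the first byte

    def _unlock(f):
        f.seek(0)
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)

@contextmanager
def locked_file(path, mode="a+"):
    """Open path and hold an exclusive lock for the duration of the with-block."""
    f = open(path, mode)
    try:
        _lock(f)
        try:
            yield f
        finally:
            _unlock(f)
    finally:
        f.close()

# Small demonstration on a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "counter.txt")
with locked_file(demo_path) as f:
    f.write("1\n")
with locked_file(demo_path) as f:
    f.seek(0)
    contents = f.read()
print(repr(contents))
```

These are advisory locks: they only protect the file against other processes (such as concurrent CGI invocations) that take the same lock before writing.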
Re: parsing IMAP responses?
[Grant Edwards] Is there a library somewhere that implements the IMAP protocol syntax? [Paul Rubin] It's very messy. [Grant Edwards] It sure is. You'd think something intended to be machine-readable would be easier to parse. IMHO, email is the most disgracefully badly spec'ced application in existence: I'm sure the average modern-day scr1pt k1dd13 could do better. SMTP: Have you ever tried to do bounce processing? PITA. POP: No virtual hosting support. IMAP: You are in a twisty maze of passages, each slightly different. It's no wonder the spammers can ply their trade with such ease. grumpily-y'rs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: For American numbers
[Scott David Daniels] Kind of fun exercise (no good for British English).

def units(value, units='bytes'):
    magnitude = abs(value)
    if magnitude >= 1000:
        for prefix in ('kilo mega giga tera peta '
                       'exa zetta yotta').split():
            magnitude /= 1000.
            if magnitude < 1000.:
                break

[Peter Hansen] Only for hard drive manufacturers, perhaps. For the rest of the computer world, unless I've missed a changing of the guard or something, kilo is 1024 and mega is 1024*1024 and so forth...

Maybe you missed these?

http://en.wikipedia.org/wiki/Kibibyte
http://en.wikipedia.org/wiki/Mebibyte
http://en.wikipedia.org/wiki/Gibibyte

kilo-mega-giga-etc-should-be-powers-of-10-ly y'rs, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
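The distinction argued over above (decimal kilo/mega versus binary kibi/mebi) can be made explicit in code. The following is a sketch in the spirit of the quoted function, not a completion of it; the name format_units and the binary flag are assumptions made for this example.

```python
SI_PREFIXES = "kilo mega giga tera peta exa zetta yotta".split()
IEC_PREFIXES = "kibi mebi gibi tebi pebi exbi zebi yobi".split()

def format_units(value, unit="bytes", binary=False):
    """Format value with decimal (powers of 1000) or binary (powers of 1024) prefixes."""
    base = 1024.0 if binary else 1000.0
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    magnitude = float(abs(value))
    chosen = ""
    for prefix in prefixes:
        if magnitude < base:
            break
        magnitude /= base
        chosen = prefix
    sign = "-" if value < 0 else ""
    return "%s%.1f %s%s" % (sign, magnitude, chosen, unit)

print(format_units(1500))               # decimal prefixes
print(format_units(1536, binary=True))  # binary prefixes
```

Keeping the two prefix tables separate means a "kilobyte" is always 1000 bytes and a "kibibyte" always 1024, which is the convention the linked pages describe.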
Re: connecting to Sybase/MsSQL from python
[John Fabiani] I'm hoping someone on the list has connected to sybase/MsSQL with something that works with DBAPI 2.0 from a linux box (SUSE 9.2) because I can't seem to get it done. I found Object Craft's python code that uses FreeTDS. But I can't get it compiled. The code is looking for sybdb.h (first error) - of course I don't have sybdb.h. It is not in Object Crafts code nor in the FreeTDS source. I'm guessing that I need sybase develop lib's but I don't know where they are to be found. Hmmm, google(sybdb.h source, submit=I'm feeling Lucky) lands on http://www.freetds.org/reference/a00337.html which has a GPL licence declaration on the top, so it seems that the FreeTDS have an OSS version available. If your compile is failing because it cannot find the file, perhaps you neglected to run ./configure before starting the compile? http://www.freetds.org/userguide/config.htm So is there a kind sole out there that can help with instructions on what is needed to get python talking to MsSQL. Sorry, can't help you there, haven't used FreeTDS. But I hate to see questions being asked a *second* time without some form of reasonable answer . As already mentioned by another poster, have you considered using ODBC? There are several python ODBC implementations. HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Second posting - Howto connect to MsSQL
[John Fabiani] Since this is (sort of) my second request it must not be an easy solution. Are there others using Python to connect MsSQL? [jdonnell] http://sourceforge.net/projects/mysql-python Note that MsSQL != MySQL. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: is there a safe marshaler?
[Irmen de Jong] Interestingly enough, I just ran across Flatten: http://sourceforge.net/project/showfiles.php?group_id=82591&package_id=91311 ...which aids in serializing/unserializing networked data securely, without having to fear execution of code or the like. Sounds promising! Well, I'm always dubious of OSS projects that don't even have any bugs reported, let alone fixed: no patches submitted, etc, etc. http://sourceforge.net/tracker/?group_id=82591 Though maybe I'm missing something obvious? -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: goto, cls, wait commands
[BOOGIEMAN] I've just finished reading Python tutorial for non-programmers and I haven't found there anything about some useful commands I used in QBasic. First of all, what's Python command equivalent to QBasic's goto ?

Oh no! You said the G word! That's a dirty word in computer science circles, because of the perception that goto (there, I said it, ugh!) can lead people to structure their code badly, i.e. write bad programs. Instead, most modern programming languages offer a range of control and looping constructs that allow you to code your intention more clearly than with goto. Python examples include while, for .. in .., try .. except .., etc, etc. So in order to answer your question, you're probably going to have to be more specific on what you want goto for.

Interestingly, gotos are undergoing a bit of renaissance in coding circles, but people have felt compelled to call them something different: continuations. But you're probably not interested in them. And python can't do them anyway.

[BOOGIEMAN] Secondly, how do I clear screen (cls) from text and other content ?

That depends on

A: What type of display device you're using
B: What type of interface is being rendered on that display (command line, GUI, IDE, etc)
C: Perhaps what operating system you are using.

[BOOGIEMAN] And last, how do I put program to wait certain amount of seconds ? If I remember correctly I used to type Wait 10 and QBasic waits 10 seconds before proceeding to next command.

Ahh, a simple question! :-)

import time
time.sleep(10.0)

HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
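For the common case of a plain command-line terminal, the screen-clearing and waiting questions above can be sketched as follows. This assumes a console where the operating system's own cls (Windows) or clear (most other systems) command works; it will not clear a GUI or IDE output pane, and the helper names are made up for this example.

```python
import os
import time

def clear_command():
    """Pick the shell command that clears the screen on this operating system."""
    return "cls" if os.name == "nt" else "clear"

def clear_screen():
    """Clear a command-line terminal by running the OS command."""
    os.system(clear_command())

def wait(seconds):
    """Pause like QBasic's wait, returning how long the pause really took."""
    start = time.time()
    time.sleep(seconds)
    return time.time() - start

elapsed = wait(0.05)   # a short pause, just for demonstration
print("waited about %.2f seconds" % elapsed)
```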
Re: is there a safe marshaler?
[Irmen de Jong] Pickle and marshal are not safe. They can do harmful things if fed maliciously constructed data. That is a pity, because marshal is fast. I need a fast and safe (secure) marshaler. Hi Irmen, I'm not necessarily proposing a solution to your problem, but am interested in your requirement. Is this for pyro? In the light of pyro, would something JSON be suitable for your need? I only came across it a week ago (when someone else posted about it here on c.l.py), and am intrigued by it. http://json.org What I find particularly intriguing is the JSON-RPC protocol, which looks like a nice lightweight alternative to XML-RPC. http://oss.metaparadigm.com/jsonrpc/ Also interesting is the browser embeddable JSON-RPC client written in javascript, for which you can see a demo here http://oss.metaparadigm.com/jsonrpc/demos.html I thought you might be interested. regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: is there a safe marshaler?
[Alan Kennedy] What I find particularly intriguing is the JSON-RPC protocol, which looks like a nice lightweight alternative to XML-RPC. http://oss.metaparadigm.com/jsonrpc/ Also interesting is the browser embeddable JSON-RPC client written in javascript, for which you can see a demo here http://oss.metaparadigm.com/jsonrpc/demos.html I should have mentioned as well that there is a python JSON-RPC server implementation, which includes a complete JSON<-->python-objects codec. http://www.json-rpc.org/pyjsonrpc/index.xhtml regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: is there a safe marshaler?
[Irmen de Jong] I need a fast and safe (secure) marshaler.

[Alan Kennedy] , would something JSON be suitable for your need? http://json.org

[Irmen de Jong] Looks very interesting indeed, but in what way would this be more secure than say, pickle or marshal? A quick glance at some docs reveal that they are using eval to process the data... ouch.

Well, the python JSON codec provided appears to use eval, which might make it *seem* unsecure. http://www.json-rpc.org/pyjsonrpc/index.xhtml

But a more detailed examination of the code indicates, to this reader at least, that it can be made completely secure very easily. The designer of the code could very easily have avoided eval, and possibly didn't simply because he wasn't thinking in security terms.

The codec uses tokenize.generate_tokens to split up the JSON string into tokens to be interpreted as python objects. tokenize.generate_tokens generates a series of token tuples, each carrying a token type and the token's text, so nothing insecure there: the content of the token strings is not executed. Each of the tokens is then passed to a parseValue function, which is defined thusly:

#===
def parseValue(self, tkns):
    (ttype, tstr, ps, pe, lne) = tkns.next()
    if ttype in [token.STRING, token.NUMBER]:
        return eval(tstr)
    elif ttype == token.NAME:
        return self.parseName(tstr)
    elif ttype == token.OP:
        if tstr == "-":
            return - self.parseValue(tkns)
        elif tstr == "[":
            return self.parseArray(tkns)
        elif tstr == "{":
            return self.parseObj(tkns)
        elif tstr in ["}", "]"]:
            return EndOfSeq
        elif tstr == ",":
            return SeqSep
        else:
            raise "expected '[' or '{' but found: '%s'" % tstr
    else:
        return EmptyValue
#===

As you can see, eval is *only* called when the next token in the stream is either a string or a number, so it's really just a very simple code shortcut to get a value from a string or number. If one defined the function like this (not tested!), to remove the eval, I think it should be safe.

#===
default_number_type = float
#default_number_type = int

def parseValue(self, tkns):
    (ttype, tstr, ps, pe, lne) = tkns.next()
    if ttype in [token.STRING]:
        return tstr
    if ttype in [token.NUMBER]:
        return default_number_type(tstr)
    elif ttype == token.NAME:
        return self.parseName(tstr)
    elif ttype == token.OP:
        if tstr == "-":
            return - self.parseValue(tkns)
        elif tstr == "[":
            return self.parseArray(tkns)
        elif tstr == "{":
            return self.parseObj(tkns)
        elif tstr in ["}", "]"]:
            return EndOfSeq
        elif tstr == ",":
            return SeqSep
        else:
            raise "expected '[' or '{' but found: '%s'" % tstr
    else:
        return EmptyValue
#===

The only other use of eval is also only for string types, i.e. in the parseObj function:

#===
def parseObj(self, tkns):
    obj = {}
    nme = ""
    try:
        while 1:
            (ttype, tstr, ps, pe, lne) = tkns.next()
            if ttype == token.STRING:
                nme = eval(tstr)
                (ttype, tstr, ps, pe, lne) = tkns.next()
                if tstr == ":":
                    v = self.parseValue(tkns)
    # Remainder of this function elided
#===

Which could similarly be replaced with direct use of the string itself, rather than eval'ing it. (Although one might want to look at encoding issues: I haven't looked at JSON-RPC enough to know how it proposes to handle string encodings.)

So I don't think there are any serious security issues here: the simplicity of the JSON grammar is what attracted me to it in the first place, especially since there are robust and efficient lexers and parsers already available built-in to python and javascript (and javascript interpreters are getting pretty ubiquitous these days). And it's certainly the case that if the only available python impl of JSON/RPC is not secure, it is possible to write one that is both efficient and secure.

Hopefully there isn't some glaring security hole that I've missed: doubtless I'll find out real soon ;-) Gotta love full disclosure.

regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
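One caveat with using the raw token text for strings is that it still carries the surrounding quotes and escape sequences. A possible alternative, sketched here for a current Python 3 and not taken from the codec under discussion, is ast.literal_eval, which decodes literals without executing code. The helper names safe_value and scalars are hypothetical.

```python
import ast
import io
import token
import tokenize

def safe_value(ttype, tstr):
    """Decode a STRING or NUMBER token without calling eval."""
    if ttype not in (token.STRING, token.NUMBER):
        raise ValueError("not a string or number token: %r" % tstr)
    # literal_eval accepts only literals; names and calls are rejected.
    return ast.literal_eval(tstr)

def scalars(text):
    """Return the decoded string and number tokens found in text."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(text).readline):
        if tok.type in (token.STRING, token.NUMBER):
            found.append(safe_value(tok.type, tok.string))
    return found

print(scalars('["spam", 1.5, 42]'))
```

On any Python from 2.6 onwards, the standard library's json module (json.loads) covers the whole job and would normally be the first thing to reach for.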
Re: Python and version control
[Carl] What is the ultimate version control tool for Python if you are working in a Windows environment? [Peter Hansen] I never liked coupling the two together like that. Instead I use tools like TortoiseCVS or (now) TortoiseSVN with a Subversion repository. These things let you access revision control features from context (right-button) menus right in Windows Explorer, as you browse the file system. [Robert Brewer] Seconded. [Johann C. Rocholl] Thirded. [Roger] Fourth-ed! I suppose that leaves me a Fifth Column subversionist. I couldn't work without svn and TortoiseSVN now: superb tools. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Python and version control
[Peter Hansen] BTW, as a general caution: while Visual Source Safe may be easy, it's also dangerous and has been known to corrupt many a code base, mine included. I wouldn't touch the product with a virtual ten-foot pole [Christos TZOTZIOY Georgiou] Are you sure you got the acronym right?-) It seems that VSS provides viRTual source-safety... In my circles, VSS is most often referred to as Visual Source Unsafe. -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: is there a safe marshaler?
[Alan Kennedy] Well, the python JSON codec provided appears to use eval, which might make it *seem* unsecure. http://www.json-rpc.org/pyjsonrpc/index.xhtml But a more detailed examination of the code indicates, to this reader at least, that it can be made completely secure very easily. The designer of the code could very easily have not used eval, and possibly didn't do so simply because he wasn't thinking in security terms.

[Irmen de Jong] I think we (?) should do this then, and send it to the author of the original version so that he can make an improved version available? I think there are more people interested in a secure marshaling implementation than just me :)

I should learn to keep my mouth zipped :-L

OK, I really don't have time for a detailed examination of either the JSON spec or the python impl of same. And I *definitely* don't have time for a detailed security audit, much though I'd love to. But I'll try to help: the code changes are really very simple. So I've edited the single affected file, json.py, and here's a patch: But be warned that I haven't even run this code!

Index: json.py
===
--- json.py (revision 2)
+++ json.py (working copy)
@@ -66,8 +66,10 @@
 
     def parseValue(self, tkns):
         (ttype, tstr, ps, pe, lne) = tkns.next()
-        if ttype in [token.STRING, token.NUMBER]:
-            return eval(tstr)
+        if ttype == token.STRING:
+            return unicode(tstr)
+        if ttype == token.NUMBER:
+            return float(tstr)
         elif ttype == token.NAME:
             return self.parseName(tstr)
         elif ttype == token.OP:
@@ -110,7 +112,12 @@
         while 1:
             (ttype, tstr, ps, pe, lne) = tkns.next()
             if ttype == token.STRING:
-                nme = eval(tstr)
+                possible_ident = unicode(tstr)
+                try:
+                    # Python identifiers have to be ascii
+                    nme = possible_ident.encode('ascii')
+                except UnicodeEncodeError:
+                    raise "Non-ascii identifier"
                 (ttype, tstr, ps, pe, lne) = tkns.next()
                 if tstr == ":":
                     v = self.parseValue(tkns)

I'll leave contacting the author to you, if you wish.

I'll still have to look at Twisted's Jelly. Hmmm, s-expressions, interesting. 
But you'd have to write your own s-expression parser and jelly RPC client to get up and running in other languages. regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: multi threading in multi processor (computer)
[EMAIL PROTECTED] That's a pity, since when we have to run parallel, with single processor is really not efficient. To use more computers I think is cheaper than to buy super computer in developt country. Although cpython has a GIL that prevents multiple python threads *in the same python process* from running *inside the python interpreter* at the same time (I/O is not affected, for example), this can be gotten around by using multiple processes, each bound to a different CPU, and using some form of IPC (pyro, CORBA, bespoke, etc) to communicate between those processes. This solution is not ideal, because it will probably involve restructuring your app. Also, all of the de/serialization involved in the IPC will slow things down, unless you're using POSH, a shared memory based system that requires System V IPC. http://poshmodule.sf.net Alternatively, you could simply use either jython or ironpython, both of which have no central interpreter lock (because they rely on JVM/CLR garbage collection), and thus will support transparent migration of threads to multiple processors in a multi-cpu system, if the underlying VM supports that. http://www.jython.org http://www.ironpython.com And you shouldn't have to restructure your code, assuming that it is already thread-safe? For interest, I thought I'd mention PyLinda, a distributed object system that takes a completely different, higher level, approach to object distribution: it creates tuple space, where objects live. The objects can be located and sent messages. But (Py)Linda hides most of gory details of how objects actually get distributed, and the mechanics of actually connecting with those remote objects. http://www-users.cs.york.ac.uk/~aw/pylinda/ HTH, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
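The multiple-process workaround described above is easier today than it was when this was written, because the standard library's multiprocessing module handles the process creation and IPC. The sketch below is one possible shape of it, with assumptions: the workload (counting primes) and the function names are invented for illustration, and the "fork" start method is requested explicitly, which limits this version to POSIX systems.

```python
import multiprocessing

def count_primes(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def parallel_prime_counts(limits, workers=2):
    """Run count_primes for each limit in separate worker processes."""
    # Assumption: POSIX "fork" start method. On Windows, drop get_context and
    # call this function from under an `if __name__ == "__main__":` guard.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)

counts = parallel_prime_counts([10, 100, 1000])
print(counts)
```

Each worker is a separate interpreter with its own GIL, so the operating system is free to schedule them on different CPUs; the arguments and results are pickled to cross the process boundary, which is the serialization cost the post warns about.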
Re: Suggesion for an undergrad final year project in Python
[Sridhar] I am doing my undergrade CS course. I am in the final year, and would like to do my project involving Python. Our instructors require the project to have novel ideas. Can the c.l.p people shed light on this topic? PyPy is chock full of novel ideas, given that two of the main developers are Armin Rigo (of psyco fame) and Christian Tismer (stackless python). PyPy is a project, which has obtained European funding, to reimplement python *in python*, a very laudable goal. http://codespeak.net/pypy/ Psyco is a specialising compiler for cpython, which essentially does something like just-in-time compilation, but with a different slant. http://psyco.sourceforge.net/introduction.html Armin's paper on the techniques used should make an interesting read for your CS professors: http://psyco.sourceforge.net/theory_psyco.pdf Stackless python has support for full coroutines, as opposed to cpython's current support for semi-coroutines. In the past, Stackless used to support continuations, but no longer does because of the complexity of adapting the cpython interpreter to support them. But Christian's implementation experience will hopefully guide PyPy in the direction of supporting both coroutines and continuations. http://www.stackless.com/ As for what you could do in the PyPy project, I have no suggestions since I am not involved in the project. But I am sure that a message to pypy-dev will elicit plenty of ideas. http://codespeak.net/mailman/listinfo/pypy-dev Lastly, the jython project is undergoing a bit of renaissance at the moment, and there's plenty of work to be done. A message to jython-dev volunteering time is unlikely to go unnoticed. Particularly, the parser, code generation and AST are areas which require a fair amount of rework. But there is less opportunity to use novel ideas in jython, so it may not interest your professors, unless you have some novel ideas of your own to bring to the project. 
http://lists.sourceforge.net/lists/listinfo/jython-dev How much time, over how long a period, do you have available for your project? Best of luck, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Textual markup languages (was Re: What YAML engine do you use?)
[Alan Kennedy] However, I'm torn on whether to use ReST for textual content. On the one hand, it's looks pretty comprehensive and solidly implemented. But OTOH, I'm concerned about complexity: I don't want to commit to ReST if it's going to become a lot of hard work or highly-inefficient when I really need to use it in anger. From what I've seen, pretty much every textual markup targetted for web content, e.g. wiki markup, seems to have grown/evolved organically, meaning that it is either underpowered or overpowered, full of special cases, doesn't have a meaningful object model, etc. [Aahz] My perception is that reST is a lot like Python itself: it's easy to hit the ground running, particularly if you restrict yourself to a specific subset of featuers. It does give you a fair amount of power, and some things are difficult or impossible. Note that reST was/is *not* specifically aimed at web content. Several people have used it for writing books; some people are using it instead of PowerPoint. Thanks, Aahz, that's a key point that I'll continue on below. [Alan Kennedy] So, I'm hoping that the learned folks here might be able to give me some pointers to a markup language that has the following characteristics 1. Is straightforward for non-technical users to use, i.e. can be (mostly) explained in a two to three page document which is comprehensible to anyone who has ever used a simple word-processor or text-editor. 2. Allows a wide variety of content semantics to be represented, e.g. headings, footnotes, sub/superscript, links, etc, etc. [Aahz] These two criteria seem to be in opposition. I certainly wouldn't expect a three-page document to explain all these features, not for non-technical users. reST fits both these criteria, but only for a selected subset of featuers. The point is well made. When I wrote my requirements, I did have a specific limited feature set in mind: basically a print-oriented set of features with which anyone who reads books would be familiar. 
I'm trying to capture scientific abstracts, of the sort that you can see linked off this page. http://www.paratuberculosis.org/proc7/ But I'm basically only interested in representation of the original input text. I'll be capturing a lot of metadata as well, but most of that will be captured outside the markup language, through a series of form inputs which ask specific metadata questions. So, for example, the relationships between authors and institutions, seen on the next page, will not be recorded in the markup. http://www.paratuberculosis.org/proc7/abst5_p2.htm I think that is where a lot of markup languages fall down, in that they end trying to develop a sophisticated metadata model that can capture that kind of information, and re-engineering the markup to support it. This co-evolution of the markup and model can go horribly awry, if the designers are inexperienced or don't know where they're headed. Since ReST seems to do this stuff fairly well, I think I'll take a closer look at it. From what I've seen of it, e.g. PEPs, python module documentation (SQLObject, etc), it seems to be reasonably unobtrusive to the author. regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Textual markup languages (was Re: What YAML engine do you use?)
[Effbot] ReST and YAML share the same deep flaw: both formats are marketed as simple, readable formats, and at a first glance, they look simple and read- able -- but in reality, they're messy as hell, and chances are that the thing you're looking at doesn't really mean what you think it means (unless you're the official ReST/YAML parser implementation). experienced designers know how to avoid that; the ReST/YAML designers don't even understand why they should. I'm looking for a good textual markup language at the moment, for capturing web and similar textual content. I don't want to use XML for this particular usage, because this content will be entered through a web interface, and I don't want to force users through multiple rounds of submit/check-syntax/generate-error-report/re-submit in order to enter their content. I have no strong feelings about YAML: If I want to structured data, e.g. lists, dictionaries, etc, I just use python. However, I'm torn on whether to use ReST for textual content. On the one hand, it's looks pretty comprehensive and solidly implemented. But OTOH, I'm concerned about complexity: I don't want to commit to ReST if it's going to become a lot of hard work or highly-inefficient when I really need to use it in anger. From what I've seen, pretty much every textual markup targetted for web content, e.g. wiki markup, seems to have grown/evolved organically, meaning that it is either underpowered or overpowered, full of special cases, doesn't have a meaningful object model, etc. So, I'm hoping that the learned folks here might be able to give me some pointers to a markup language that has the following characteristics 1. Is straightforward for non-technical users to use, i.e. can be (mostly) explained in a two to three page document which is comprehensible to anyone who has ever used a simple word-processor or text-editor. 2. Allows a wide variety of content semantics to be represented, e.g. 
headings, footnotes, sub/superscript, links, etc, etc. 3. Has a complete (but preferably lightweight) object model into which documents can be loaded, for transformation to other languages. 4. Is speed and memory efficient. 5. Obviously, has a solid python implementation. Most useful would be a pointer to a good comparison/review page which compares multiple markup languages, in terms of the above requirements. If I can't find such a markup language, then I might instead end up using a WYSIWYG editing component that gives the user a GUI and generates (x)html. htmlArea: http://www.htmlarea.com/ Editlet: http://www.editlet.com/ But I'd prefer a markup solution. TIA for any pointers. regards, -- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Textual markup languages (was Re: What YAML engine do you use?)
[Alan Kennedy] From what I've seen, pretty much every textual markup targeted for web content, e.g. wiki markup, seems to have grown/evolved organically, meaning that it is either underpowered or overpowered, full of special cases, doesn't have a meaningful object model, etc.

[Fredrik Lundh] I spent the eighties designing one textual markup language after another, for a wide variety of projects (mainly for technical writing). I've since come to the conclusion that they all suck (for exactly the reasons you mention above, plus the usual "the implementation is the only complete spec we have" issue).

Thanks Fredrik, I thought you might have a fair amount of experience in this area :-)

[Fredrik Lundh] the only markup language I've seen lately that isn't a complete mess is John Gruber's markdown: http://daringfireball.net/projects/markdown/ which has an underlying object model (HTML/XHTML) and doesn't have too many warts. not sure if anyone has done a Python implementation yet, though (for html->markdown, see http://www.aaronsw.com/2002/html2text/ ), and I don't think it supports footnotes (HTML doesn't).

Thanks for the pointer. I took a look at Markdown, and it does look nice. But I don't like the dual syntax, e.g. switching into HTML for tables, etc: I'm concerned that the syntax switch might be too much for non-techies.

[Alan Kennedy] If I can't find such a markup language, then I might instead end up using a WYSIWYG editing component that gives the user a GUI and generates (x)html.
htmlArea: http://www.htmlarea.com/
Editlet: http://www.editlet.com/
But I'd prefer a markup solution.

[Fredrik Lundh] some of these are amazingly usable. have you asked your users what they prefer? (or maybe you are your user? ;-)

Actually, I'm looking for a solution for both myself and for end-users (who will take what they're given ;-). For myself, I think I'll end up picking Markdown, ReST, or something comparable from the wiki-wiki-world.

For the end-users, I'm starting to think that a GUI is the only way to go. The last time I looked at this area, a few years ago, the components were fairly immature and pretty buggy. But the number of such components and their quality seem to have greatly increased in recent times. In particular, many of them seem to address an important requirement that I neglected to mention in my original list: Unicode support. I'll be processing all kinds of funny characters, e.g. math/scientific symbols, European, Asian and Middle-Eastern names, etc.

thanks-and-regards-ly-y'rs,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan -- http://mail.python.org/mailman/listinfo/python-list
Re: Textual markup languages (was Re: What YAML engine do you use?)
[Alan Kennedy] So, I'm hoping that the learned folks here might be able to give me some pointers to a markup language that has the following characteristics

[Paul Rubin] I'm a bit biased but I've been using Texinfo for a long time and have been happy with it. It's reasonably lightweight to implement, fairly intuitive to use, and doesn't get in the way too much when you're writing. There are several implementations, none in Python at the moment but that would be simple enough. It does all the content semantics you're asking (footnotes etc). It doesn't have an explicit object model, but is straightforward to convert into a number of formats including high-quality printed docs (TeX); the original Info hypertext browser that predates the web; and these days HTML.

Thanks Paul, I took a look at texinfo, and it looks powerful and good ... for programmers. Looks like a very steep learning curve for non-programmers though. It seems to require just a few hundred kilobytes too much documentation ..

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: Threading Problem
[Norbert] i am experimenting with threads and get puzzling results. Consider the following example:

#---
import threading, time

def threadfunction():
    print "threadfunction: entered"
    x = 10
    while x < 40:
        time.sleep(1) # time unit is seconds
        print "threadfunction x=%d" % x
        x += 10

print "start"
th = threading.Thread(target = threadfunction())

The problem is here: target = threadfunction(). You are *invoking* threadfunction, and passing its return value as the target, rather than passing the function itself as the target. That's why threadfunction's output appears in the output stream before the thread has even started.

Try this instead

#---
import threading, time

def threadfunction():
    print "threadfunction: entered"
    x = 10
    while x < 40:
        time.sleep(1) # time unit is seconds
        print "threadfunction x=%d" % x
        x += 10

print "start"
th = threading.Thread(target = threadfunction)
th.start()
print "start completed"
#---

Which should output the expected

start
threadfunction: entered
start completed
threadfunction x=10
threadfunction x=20
threadfunction x=30

regards,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
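
The same distinction can be shown in current Python 3 syntax. This is only a minimal sketch, not code from the thread: the names `worker`, `events`, `caller`, `wrong-thread` and `right-thread` are made up for illustration, and the sleep loop is dropped so the effect is visible immediately. Each call to `worker` records the name of the thread it actually ran in.

```python
import threading

events = []  # names of the threads in which worker() actually ran
caller = threading.current_thread().name  # the thread running this script

def worker():
    # Record which thread is executing this function.
    events.append(threading.current_thread().name)

# Wrong: worker() is called right here, in the calling thread, and its
# return value (None) is passed as the target, so the new thread does nothing.
wrong = threading.Thread(target=worker(), name="wrong-thread")
wrong.start()
wrong.join()

# Right: pass the function object itself; it runs inside the new thread.
right = threading.Thread(target=worker, name="right-thread")
right.start()
right.join()

print(events)
```

The first entry in `events` is the name of the calling thread, because the "wrong" version ran `worker` before any thread existed; only the second entry comes from a thread that was really started.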
Re: Boo who? (was Re: newbie question)
[Roger Binns] That work died due to a crisis of faith: http://mylist.net/archives/spry-dev/2004-November/72.html

[A.M. Kuchling] <rolls eyes> Soon it will be possible to become a well-known programmer without writing any code at all; just issue grandiose manifestos and plans until everyone is suitably impressed.

Well, things are getting better then. It used to be that grandiose manifestos and suitably impressive plans were all you needed to make billions through a stock flotation ;-)

0.5 wink-ly y'rs,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
Re: Webapp servers security
[Anakim Border] App servers such as quixote, webware and skunkweb (just to name a few) offer a clean environment to develop Python webapps. I have some problems, however, understanding their security model.

Since they each have different security models, that's not surprising. This is a difficult problem for people approaching Python. Hopefully it is the sort of problem that will be brought more under control when WSGI* is in widespread use and authentication is controlled using WSGI middleware.

*: http://www.python.org/peps/pep-0333.html

[Anakim Border] My objective is to host webapps from different people on a single Linux server; because of that, I want to be sure that one webapp cannot interfere with another. My first attempt at privilege separation went through users and groups.

Using unix users and groups is the best way to attain total separation between environments. Either that or put them on different user-mode-linux* hosts.

*: http://usermodelinux.org/

[Anakim Border] Unfortunately application servers execute all python code under the same uid; that way webapp 'a' from Alice can easily overwrite files from webapp 'b' owned by Bob.

Perhaps you could run multiple application servers? One per isolated environment? Each of the above packages (quixote, etc) contains its own standalone server, as well as the capability to integrate into other server environments.

Use some form of proxy webserver in the front, which simply routes requests to the relevant application server, based on URL, HTTP_HOST, etc. Apache has a mod_proxy[1] designed specifically for this purpose. In combination with mod_rewrite[2], that should give you fairly powerful control over who gets to see which requests.

You could probably roll your own solution fairly easily using one or more of the mod_python Python*Handlers[3] and something like mod_scgi[4] or FastCGI[5].

1: http://httpd.apache.org/docs-2.0/mod/mod_proxy.html
2: http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html
3: http://www.modpython.org/live/current/doc-html/dir-handlers.html
4: http://www.mems-exchange.org/software/scgi/
5: http://www.fastcgi.com/mod_fastcgi/docs/mod_fastcgi.html

[Anakim Border] Did I miss anything?

I am sure there are other approaches as well.

HTH,

-- alan kennedy -- email alan: http://xhaus.com/contact/alan
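
The routing decision made by the front-end proxy in the message above can be sketched in a few lines of Python. This is an illustration of the idea only, not configuration for Apache or any of the servers mentioned: the host names, path prefixes, ports and the function `pick_backend` are all hypothetical, and it assumes each webapp runs in its own application server process (ideally under its own unix uid) listening on a private local port.

```python
# Hypothetical routing tables: one back-end application server per webapp.
BACKENDS_BY_HOST = {
    "alice.example.com": ("127.0.0.1", 8001),
    "bob.example.com": ("127.0.0.1", 8002),
}
BACKENDS_BY_PREFIX = [
    ("/alice/", ("127.0.0.1", 8001)),
    ("/bob/", ("127.0.0.1", 8002)),
]

def pick_backend(host, path):
    """Return the (address, port) of the app server for a request, or None."""
    # Drop an optional ":port" suffix from the Host header and ignore case.
    hostname = host.split(":", 1)[0].lower()
    if hostname in BACKENDS_BY_HOST:
        return BACKENDS_BY_HOST[hostname]
    # Fall back to routing on the URL path prefix.
    for prefix, backend in BACKENDS_BY_PREFIX:
        if path.startswith(prefix):
            return backend
    # Unknown host and unknown prefix: no back end should see this request.
    return None

print(pick_backend("bob.example.com:80", "/index"))
print(pick_backend("shared.example.com", "/alice/app"))
```

Because the proxy is the only component that sees every request, an unmatched request is simply refused (`None` here) rather than handed to some default webapp.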