Re: Idea for removing the GIL...

2011-02-28 Thread Stefan Behnel

Aahz, 01.03.2011 03:02:

Carl Banks wrote:


The real reason they never replaced the GIL is that fine-grained
locking is expensive with reference counting.  The only way the cost
of finer-grained locking would be acceptable, then, is if they got rid
of the reference counting altogether, and that was considered too
drastic a change.


...especially given CPython's goal of easy integration with C libraries.


+1, the GIL is much more rarely a problem than some people want to make it 
appear. Especially those who don't understand why it's there, or who fail 
to notice that threading is not the only way to do parallel processing (and 
certainly not the easiest either).


Stefan

--
http://mail.python.org/mailman/listinfo/python-list


Re: Problems of Symbol Congestion in Computer Languages

2011-02-28 Thread Xah Lee
On Feb 28, 7:30 pm, rusi  wrote:
> On Feb 28, 11:39 pm, Dotan Cohen  wrote:
>
> > You miss the canonical bad character reuse case: = vs ==.
>
> > Had there been more meta keys, it might be nice to have a symbol for
> > each key on the keyboard. I personally have experimented with putting
> > the symbols as regular keys and the numbers as the Shifted versions.
> > It's great for programming.
>
> Hmmm... Clever!
> Is it X or Windows?
> Can I have your setup?

hi Russ,

there's a programmer's dvorak layout i think is bundled with linux.

or you can do it with xmodmap on X-11 or AutoHotKey on Windows, or
within emacs... On the mac, you can use keyboardMaestro, Quickeys, or
just write an OS-wide config file yourself. You can see tutorials and
sample files for all these here
http://xahlee.org/Periodic_dosage_dir/keyboarding.html

i'd be interested to know what Dotan Cohen uses too.

i tried swapping the number row with symbols a few years back. didn't
like it so much because numbers are frequently used as well,
especially when you need to enter a series of numbers, e.g. heavy
math, or dates like 2010-02-28. One can use the number pad but i use that
for extra programmable buttons.

 Xah

> One problem we programmers face is that keyboards were made for
> typists not programmers.
> Another is that when we move from 'hi-level' questions eg code reuse
> -- to lower and lower -- eg ergonomics of reading and writing code --
> the focus goes from the center of consciousness to the periphery and
> we miss how many inefficiencies there are in our semi-automatic
> actions.



Re: Problems of Symbol Congestion in Computer Languages

2011-02-28 Thread rusi
On Feb 28, 11:39 pm, Dotan Cohen  wrote:
> You miss the canonical bad character reuse case: = vs ==.
>
> Had there been more meta keys, it might be nice to have a symbol for
> each key on the keyboard. I personally have experimented with putting
> the symbols as regular keys and the numbers as the Shifted versions.
> It's great for programming.

Hmmm... Clever!
Is it X or Windows?
Can I have your setup?

One problem we programmers face is that keyboards were made for
typists not programmers.
Another is that when we move from 'hi-level' questions eg code reuse
-- to lower and lower -- eg ergonomics of reading and writing code --
the focus goes from the center of consciousness to the periphery and
we miss how many inefficiencies there are in our semi-automatic
actions.


Re: Executing js/ajax in a sandboxed environment

2011-02-28 Thread Miki Tebeka
> I was wondering if there is a way to execute js associated in
> page in sandbox environment before I start parsing it.
You can use http://code.google.com/p/python-spidermonkey/ or 
http://code.google.com/p/pyv8/ to evaluate JavaScript.

You can use any browser (including embedded htmlunit) using Selenium.

HTH
--
Miki Tebeka 
http://pythonwise.blogspot.com


Re: Idea for removing the GIL...

2011-02-28 Thread Aahz
In article ,
Carl Banks   wrote:
>
>The real reason they never replaced the GIL is that fine-grained
>locking is expensive with reference counting.  The only way the cost
>of finer-grained locking would be acceptable, then, is if they got rid
>of the reference counting altogether, and that was considered too
>drastic a change.

...especially given CPython's goal of easy integration with C libraries.
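
To make the reference-counting point concrete, here is a minimal sketch, assuming CPython (the absolute numbers reported by sys.getrefcount() are an implementation detail, so only the difference is compared). Every new binding to an object updates a counter stored on that object, and without the GIL each such update would need its own synchronization.

```python
import sys

def refcount_delta():
    """Return how much one extra reference changes an object's refcount."""
    obj = object()                    # fresh object, not shared with anything else
    before = sys.getrefcount(obj)
    alias = obj                       # one more reference to the same object
    after = sys.getrefcount(obj)
    del alias                         # drop the extra reference again
    return after - before

print(refcount_delta())
```

Ordinary code creates and drops references like this constantly, which is why per-object locking tends to cost more than a single global lock.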
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"Programming language design is not a rational science. Most reasoning
about it is at best rationalization of gut feelings, and at worst plain
wrong."  --GvR, python-ideas, 2009-03-01


Re: Plumber, an alternative to mixin-based subclassing

2011-02-28 Thread Aahz
[posted & e-mailed]

In article ,
Florian Friesdorf   wrote:
>
>An alternative to mixin-based subclassing:
>
>http://pypi.python.org/pypi/plumber

You'll probably get more interest if you provide a summary.
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

"Programming language design is not a rational science. Most reasoning
about it is at best rationalization of gut feelings, and at worst plain
wrong."  --GvR, python-ideas, 2009-03-01


Re: OT: Code Examples

2011-02-28 Thread Fred Marshall

On 2/28/2011 8:14 AM, n00m wrote:

On Feb 28, 6:03 pm, Fred Marshall
wrote:



The best place for you to start: http://numpy.scipy.org/

Numpy manual: http://www.tramy.us/numpybook.pdf


OK Thanks!

Fred



Re: move to end, in Python 3.2 Really?

2011-02-28 Thread Gregory Ewing

Raymond Hettinger wrote:


The existing list.pop() API is similar (though it takes an index
value instead of a boolean):


mylist.pop()   # default case: pop from last
mylist.pop(0)  # other case:   pop from first


pop() is somewhat different, because there is an infinite
range of possible values for the index parameter.

Here there are only two possibilities, though, and it's
unlikely that code will want to dynamically select between
them -- so the "no constant parameters" guideline would seem
to apply.
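
For reference, the API under discussion can be sketched as follows. This assumes the OrderedDict.move_to_end(key, last=True) signature from Python 3.2; the keys are made up for illustration.

```python
from collections import OrderedDict

def reorder_demo():
    """Move keys to either end using the boolean `last` flag."""
    d = OrderedDict.fromkeys("abc")   # order: a, b, c
    d.move_to_end("a")                # default last=True  -> b, c, a
    d.move_to_end("c", last=False)    # last=False         -> c, b, a
    return list(d)

print(reorder_demo())
```

The flag is a constant at both call sites here, which is exactly the situation the "no constant parameters" guideline is about.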

--
Greg


Re: how to properly pass literal strings python code to be executed using python -c

2011-02-28 Thread Steven D'Aprano
On Mon, 28 Feb 2011 10:59:01 -0800, jmoons wrote:

> I need some help figuring out how to execute this python code from
> python -c
> I am having trouble formatting python so that it will execute for another
> app in cmd. I understand there may be other ways to do what I am doing, but
> I am limited by the final execution using cmd python -c so please keep
> this in mind.
> I'm limited by the final delivery of code. The python is being called by
> a server that does not have access to any python script file

Let me translate that... 

"I'm having trouble hammering this nail with a screwdriver. Keep in mind 
that I am limited by the requirement that I use a screwdriver, not a 
hammer, to hammer the nail. The nail is being hammered by somebody who 
doesn't have a hammer."

So give them a hammer. Put the code in a text file, call it "main.py" or 
something, and execute "python -m main", or "python -c 'import main'" if 
you prefer.

I don't understand the requirement to avoid storing your code in a file 
-- surely you won't be typing the script into cmd every single time you 
want to run it, so surely it will be stored in a batch file or similar? 
As far as I can tell, the *only* legitimate reason for the requirement is 
to win a bet :) Otherwise, you're just making your life much much harder 
than it needs to be.


[...]
> So this what i have but no worky
> 
> cmdline = "\" import os, shutil \n for root, dirs, files in
> os.walk("+myPath+"):\n \t for file in files: \n \t \t
> os.remove(os.path.join(root, file)) \n \t for dir in dirs: \n \t\t
> shutil.rmtree(os.path.join(root, dir))"


I have no idea what the string handling rules for cmd are, and I'm not 
going to try to guess. This doesn't appear to be a Python problem, it's a 
cmd problem. You need to work out how to correctly quote your string. 
Perhaps try on some Windows forums.


> I have also tried the following
> python -c "import os; import shutil; for root, dirs, files in
> os.walk('+myPath+'): for file in files: os.remove(os.path.join(root,
> file)); for dir in dirs: shutil.rmtree(os.path.join(root, dir))"
> 
> I am still getting error tree(os.path.join(root, dir)) ^ SyntaxError:
> invalid syntax

No you don't. You don't call a function "tree", so you can't be getting 
that error. The actual function you call is shutil.rmtree. Please don't 
retype, summarize, simplify or paraphrase error messages. Copy and paste 
them *exactly* as they are shown, complete with any traceback which is 
printed.
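
On the one-liner itself: `for` is a compound statement and cannot follow `;` on the same line, so the parser rejects it no matter how the shell quotes it. A minimal sketch of the difference follows; CODE is a made-up stand-in for the real script, and subprocess is used only so that the code reaches `python -c` as a single argument, sidestepping cmd quoting.

```python
import subprocess
import sys

# Hypothetical stand-in for the poster's script: real newlines, real indentation.
CODE = """\
for name in ["a", "b"]:
    print("removing", name)
"""

def run_inline(code):
    """Run `code` via `python -c`, passed as one argument (no shell quoting)."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True)
    return result.returncode, result.stdout.splitlines()

# The multi-line form runs; a `for` placed after `;` is a SyntaxError.
print(run_inline(CODE))
print(run_inline("import os; for name in []: pass")[0])
```

Whatever launches the interpreter therefore has to deliver real newlines inside the -c argument, or the code has to live in a file as suggested above.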


-- 
Steven


Re: subclass urllib2

2011-02-28 Thread Alex Willmer
On Feb 28, 6:53 pm, monkeys paw  wrote:
> I'm trying to subclass urllib2 in order to mask the
> version attribute. Here's what i'm using:
>
> import urllib2
>
> class myURL(urllib2):
>      def __init__(self):
>          urllib2.__init__(self)
>          self.version = 'firefox'
>
> I get this>
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: Error when calling the metaclass bases
> module.__init__() takes at most 2 arguments (3 given)
>
> I don't see where i am supplying 3 arguments. What am i
> missing?

urllib2 is a module, not a class, so you can't subclass it. You could
subclass one of the classes inside urllib2, such as
urllib2.BaseHandler. Whether you want to depends on what you want to
achieve.
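
A minimal sketch of both points, written against Python 3's urllib.request (the successor of urllib2) rather than the poster's Python 2 setup. The helper names are made up for illustration, and setting a User-Agent header is only one assumed reading of "masking the version".

```python
import urllib.request

def make_request(url, agent="firefox"):
    """Build a Request announcing a custom User-Agent (nothing is sent)."""
    return urllib.request.Request(url, headers={"User-Agent": agent})

def subclassing_a_module_fails():
    """Show that a module object cannot be used as a base class."""
    try:
        class MyURL(urllib.request):   # a module, not a class
            pass
    except TypeError:
        return True
    return False

req = make_request("http://example.com/")
print(req.get_header("User-agent"))    # Request stores header names capitalized
print(subclassing_a_module_fails())
```

The "3 arguments" in the traceback are the name, bases and namespace that the class statement hands to the type of the base, which here is the module type.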


Re: 3.1 -> 3.2: base64 lost deprecation warning

2011-02-28 Thread Ethan Furman

Terry Reedy wrote:

On 2/28/2011 3:51 PM, Ethan Furman wrote:

The deprecation warning has gone away in 3.2,


No, still there:
def encodestring(s):
    """Legacy alias of encodebytes()."""
    import warnings
    warnings.warn("encodestring() is a deprecated alias, use encodebytes()",
                  DeprecationWarning, 2)
    return encodebytes(s)

In 3.2, DeprecationWarnings are turned off by default so as to not annoy 
users who can do nothing about them and developers who do not want to do 
anything at the moment. I presume the doc for warnings says how to turn 
them back on.




Ah, thank you!

~Ethan~


Re: 3.1 -> 3.2: base64 lost deprecation warning

2011-02-28 Thread Terry Reedy

On 2/28/2011 3:51 PM, Ethan Furman wrote:

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'this is a test')
__main__:1: DeprecationWarning: encodestring() is a deprecated alias,
use encodebytes()
b'dGhpcyBpcyBhIHRlc3Q=\n'


Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'another test')
b'YW5vdGhlciB0ZXN0\n'


The deprecation warning has gone away in 3.2,


No, still there:
def encodestring(s):
    """Legacy alias of encodebytes()."""
    import warnings
    warnings.warn("encodestring() is a deprecated alias, use encodebytes()",
                  DeprecationWarning, 2)
    return encodebytes(s)



but the function
remains... does anyone know if this was intentional?


In 3.2, DeprecationWarnings are turned off by default so as to not annoy 
users who can do nothing about them and developers who do not want to do 
anything at the moment. I presume the doc for warnings says how to turn 
them back on.
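
A minimal sketch of turning them back on, using the standard warnings module. old_name() is a hypothetical deprecated alias standing in for base64.encodestring(), so the example does not depend on that function still existing.

```python
import warnings

def old_name():
    """Hypothetical deprecated alias, standing in for base64.encodestring()."""
    warnings.warn("old_name() is a deprecated alias, use new_name()",
                  DeprecationWarning, 2)
    return "result"

def call_with_warnings_enabled():
    """Call old_name() with DeprecationWarning enabled; return the messages seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", DeprecationWarning)
        old_name()
    return [str(w.message) for w in caught
            if issubclass(w.category, DeprecationWarning)]

print(call_with_warnings_enabled())
```

From the command line, running the interpreter with `-W default` should have a similar effect for a whole program.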


--
Terry Jan Reedy



Re: 3.1 -> 3.2: base64 lost deprecation warning

2011-02-28 Thread Robert

On 2011-02-28 15:51:32 -0500, Ethan Furman said:


Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'this is a test')
__main__:1: DeprecationWarning: encodestring() is a deprecated alias,
use encodebytes()
b'dGhpcyBpcyBhIHRlc3Q=\n'


Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'another test')
b'YW5vdGhlciB0ZXN0\n'


The deprecation warning has gone away in 3.2, but the function
remains... does anyone know if this was intentional?

~Ethan~


I only found this:

Issue #3613: base64.{encode,decode}string are now called
 base64.{encode,decode}bytes which reflects what type they accept and return.
 The old names are still there as deprecated aliases.

Doesn't explain the "no warning" though.

--
Robert




Re: Problems with read_eager and Telnet

2011-02-28 Thread Terry Reedy

On 2/28/2011 3:46 PM, Robi wrote:


unless using it just to get/set configuration,
in which case, speed should hardly seem an issue.


Right, I'm using it that way, I get/set properties changing them in
real time (I wish!).

...

My conclusion being, fgfs cannot answer back quicker than this: 20Hz.


I suspect that is by design, so as to not interfere with the simulation 
itself.



This is quite poor for a quasi-realtime hardware interface (which is
in my intention).


I suggest you discuss your use case on a FlightGear mailing list
(mirrored on gmane as newsgroups) or forum.

--
Terry Jan Reedy



Re: SoC project: Python-Haskell bridge - request for feedback

2011-02-28 Thread Pauli Rikula
There is your bridge: http://kks.cabal.fi/HaskellAndPython

It's not polished and one might shoot his/her legs off while using
that -so be careful.


Re: Problems with read_eager and Telnet

2011-02-28 Thread Robi
> Given that FlightGear is a graphical flight simulator
> http://www.flightgear.org/
> https://secure.wikimedia.org/wikipedia/en/wiki/FlightGear
> using a text terminal connection seems a bit odd,
> unless using it just to get/set configuration,
> in which case, speed should hardly seem an issue.

Right, I'm using it that way, I get/set properties changing them in
real time (I wish!).


> I presume you are using read_until(b'\n').
> read_until cannot return any faster than the server sends complete lines

Actually it's read_until('\n') (because it's Python 2.6; anyway, I know
what you meant by b'\n').
And yes, after today I've come to the conclusion that even read_eager
is of no use to me. I have to settle for that poor telnet speed;
I can't get more than that :-( In fact, the telnet connection speed is
not the issue; it's the time FGFS spends reading its internals and
answering back to the telnet client.

I've made some tests. It looks like it doesn't matter how many fgfs
properties I try to read with telnet at the same time with Python: I
always have the same lag. I can even run multiple instances of
the same script, in order to get/set the same number of properties
in parallel, and the result is always the same lag; it doesn't
even become slower.

In my test I run a simple loop; in each iteration I do a telnet.write() and
a telnet.read_until('\n'). I get back results at 20 Hz (that means I
get 120 write/read cycles in 6 seconds).
I repeated the same test doing three telnet.write() calls followed by three
telnet.read_until('\n') calls in each iteration. The result is still 20 Hz
(but now that means I get 120 loop cycles in 6 seconds, which is 120x3
write/read cycles in 6 seconds).
I even ran three instances of the same script at the same time (trying
to saturate fgfs's capabilities and find some limits). No luck (or
yes?!?), I always get a 20 Hz frequency.

My conclusion being, fgfs cannot answer back quicker than this: 20Hz.

This is quite poor for a quasi-realtime hardware interface (which is
what I intend).
Maybe it's enough for something ... I'll give it a try anyway and find
out what can be a usable scenario for such values.

I guess even telnet is not the best approach to use for my project;
I'll keep experimenting.

Thank you all anyway, it's always a pleasure to share some
considerations and get back useful hints.



Re: urlopen returns forbidden

2011-02-28 Thread Chris Rebert
On Mon, Feb 28, 2011 at 9:44 AM, Terry Reedy  wrote:
> On 2/28/2011 10:21 AM, Grant Edwards wrote:
>> As somebody else has already said, if the site provides an API that
>> they want you to use you should do so rather than hammering their web
>> server with a screen-scraper.
>
> Is there any generic method for finding out 'if the site provides an API'
> and specifically, how to find Wikipedia's?
>
> I looked at the Wikipedia articles on API and web services and did not find
> any mention of theirs (though there is one for Amazon).

Technically, it's the API of MediaWiki, the wiki software underlying Wikipedia:
http://www.mediawiki.org/wiki/API
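
As a rough sketch of what a query against that API might look like: the endpoint URL and the parameter set below are assumptions for English Wikipedia and should be checked against the MediaWiki documentation. The snippet only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed endpoint for English Wikipedia; other wikis use their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

def build_query_url(title):
    """Build (but do not send) a MediaWiki API query URL for one page title."""
    params = {
        "action": "query",   # basic page query
        "titles": title,     # page to look up
        "prop": "info",      # ask for basic page information
        "format": "json",    # machine-readable response
    }
    return API_ENDPOINT + "?" + urlencode(params)

print(build_query_url("FlightGear"))
```

Fetching such a URL returns structured data, which is far gentler on the server than scraping rendered pages.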

Cheers,
Chris


3.1 -> 3.2: base64 lost deprecation warning

2011-02-28 Thread Ethan Furman
Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'this is a test')
__main__:1: DeprecationWarning: encodestring() is a deprecated alias,
use encodebytes()
b'dGhpcyBpcyBhIHRlc3Q=\n'


Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
--> import base64
--> base64.encodestring(b'another test')
b'YW5vdGhlciB0ZXN0\n'


The deprecation warning has gone away in 3.2, but the function 
remains... does anyone know if this was intentional?


~Ethan~


Re: Problem with python 3.2 and circular imports

2011-02-28 Thread Rafael Durán Castañeda
I'm still totally stuck with relative imports; I've tried the example tree
from PEP 328 without any result:

package/
    __init__.py
    subpackage1/
        __init__.py
        moduleX.py
        moduleY.py
    subpackage2/
        __init__.py
        moduleZ.py
    moduleA.py

Assuming that the current file is either moduleX.py or
subpackage1/__init__.py, following are correct usages of the new
syntax:

from .moduleY import spam
from .moduleY import spam as ham
from . import moduleY
from ..subpackage1 import moduleY
from ..subpackage2.moduleZ import eggs
from ..moduleA import foo
from ...package import bar
from ...sys import path

I always get:

Traceback (most recent call last):
  File "moduleY.py", line 1, in <module>
    from ..moduleA import a
ValueError: Attempted relative import in non-package
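
That error usually means the file was run directly as a script, so Python does not know which package it belongs to. A minimal sketch of the difference, assuming a throwaway copy of the tree above with a made-up variable `a` in moduleA: running the module with -m from the directory that contains package/ lets the relative import resolve, while running the file by path does not.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def run_relative_import_demo():
    """Build a tiny package, then run moduleY both with -m and as a script."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        sub = root / "package" / "subpackage1"
        sub.mkdir(parents=True)
        (root / "package" / "__init__.py").write_text("")
        (sub / "__init__.py").write_text("")
        (root / "package" / "moduleA.py").write_text("a = 'from moduleA'\n")
        (sub / "moduleY.py").write_text("from ..moduleA import a\nprint(a)\n")

        # Run as part of the package: the relative import has a parent to resolve against.
        as_module = subprocess.run(
            [sys.executable, "-m", "package.subpackage1.moduleY"],
            cwd=root, capture_output=True, text=True)
        # Run as a plain script: no parent package, so the import fails.
        as_script = subprocess.run(
            [sys.executable, str(sub / "moduleY.py")],
            cwd=root, capture_output=True, text=True)
        return as_module.stdout.strip(), as_script.returncode

print(run_relative_import_demo())
```

The exact exception type for the script case has changed between Python versions, so the sketch only looks at the exit status.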


2011/2/27 Frank Millman 

>
> "Steven D'Aprano"  wrote in message
> news:4d6a56aa$0$29972$c3e8da3$54964...@news.astraweb.com...
>
>> On Sun, 27 Feb 2011 12:08:12 +0200, Frank Millman wrote:
>>
>>> Assume the following structure -
>>>
>>> main.py
>>> /pkg
>>>    __init__.py
>>>    mod1.py
>>>    mod2.py
>>>
>>> main.py
>>>    from pkg import mod1
>>>
>>> mod1.py
>>>    import mod2
>>>
>>> mod2.py
>>>    import mod1
>>>
>>
>>
>> If you change the "import mod*" lines to "import pkg.mod*" it works for
>> me in Python 3.1 and 3.2.
>>
>> According to my understanding of PEP 328, "from . import mod*" should work,
>> but I agree with you that it doesn't.
>>
>> If you get rid of the circular import, it does work. So I suspect a bug.
>>
>>
>>
> Thanks, Steven.
>
> I confirm that 'import pkg.mod*' works. Unfortunately I am using
> sub-packages as well, which means that to refer to an object in the
> sub-package I need to use w.x.y.z every time, which gets to be a lot of
> typing! I will stick to my hack of putting the package name in sys.path for
> now, unless someone comes up with a better idea.
>
> Frank
>
>
>
>


Re: subclass urllib2

2011-02-28 Thread Santoso Wijaya
1. Why are you subclassing a module?
2. If you want to "mask" a module's version attribute, just do this:
>>> import urllib2
>>> urllib2.__version__ = 'foo'
>>> print urllib2.__version__
foo

~/santa


On Mon, Feb 28, 2011 at 10:53 AM, monkeys paw  wrote:

> I'm trying to subclass urllib2 in order to mask the
> version attribute. Here's what i'm using:
>
> import urllib2
>
> class myURL(urllib2):
>     def __init__(self):
>         urllib2.__init__(self)
>         self.version = 'firefox'
>
> I get this>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: Error when calling the metaclass bases
> module.__init__() takes at most 2 arguments (3 given)
>
> I don't see where i am supplying 3 arguments. What am i
> missing?
>
>


how to properly pass literal strings python code to be executed using python -c

2011-02-28 Thread jmoons
I need some help figuring out how to execute this python code from
python -c
I am having trouble formatting python so that it will execute for
another app in cmd. I understand there may be other ways to do what I am
doing, but I am limited by the final execution using cmd python -c so
please keep this in mind.
I'm limited by the final delivery of code. The python is being called
by a server that does not have access to any python script file

So I have some python code ie,

import os
import shutil

myPath = r"C:\dingdongz"
for root, dirs, files in os.walk(myPath):
    for file in files:
        os.remove(os.path.join(root, file))
    for dir in dirs:
        shutil.rmtree(os.path.join(root, dir))

But I am trying to execute it using the following method, python -c
"print 'hotdogs'"

So this what i have but no worky

cmdline = "\" import os, shutil \n for root, dirs, files in
os.walk("+myPath+"):\n \t for file in files: \n \t \t
os.remove(os.path.join(root, file)) \n \t for dir in dirs: \n \t\t
shutil.rmtree(os.path.join(root, dir))"


I have also tried the following
python -c "import os; import shutil; for root, dirs, files in
os.walk('+myPath+'): for file in files: os.remove(os.path.join(root,
file)); for dir in dirs: shutil.rmtree(os.path.join(root, dir))"

I am still getting error tree(os.path.join(root, dir)) ^ SyntaxError:
invalid syntax


subclass urllib2

2011-02-28 Thread monkeys paw

I'm trying to subclass urllib2 in order to mask the
version attribute. Here's what i'm using:

import urllib2

class myURL(urllib2):
    def __init__(self):
        urllib2.__init__(self)
        self.version = 'firefox'

I get this>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
module.__init__() takes at most 2 arguments (3 given)

I don't see where i am supplying 3 arguments. What am i
missing?



Re: Problems of Symbol Congestion in Computer Languages

2011-02-28 Thread Dotan Cohen
You miss the canonical bad character reuse case: = vs ==.

Had there been more meta keys, it might be nice to have a symbol for
each key on the keyboard. I personally have experimented with putting
the symbols as regular keys and the numbers as the Shifted versions.
It's great for programming.


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Problems with read_eager and Telnet

2011-02-28 Thread Terry Reedy

On 2/28/2011 10:54 AM, Robi wrote:

Hi everybody,
  I'm totally new to Python but well motivated :-)

I'm fooling around with Python in order to interface with FlightGear
using a telnet connection.


Given that FlightGear is a graphical flight simulator
http://www.flightgear.org/
https://secure.wikimedia.org/wikipedia/en/wiki/FlightGear
using a text terminal connection seems a bit odd,
unless using it just to get/set configuration,
in which case, speed should hardly seem an issue.


I can do what I had in mind (send some commands and read output from
Flightgear using the telnetlib) with a read_until() object to catch
every output line I need, but it proved to be very slow (it takes 1/10
of a sec for every read_until()).


I presume you are using read_until(b'\n').
read_until cannot return any faster than the server sends complete lines.


I tried using the read_eager() object and it's way faster (it
does the job in 1/100 of a sec, maybe more, I didn't test) but it
gives me problems: it gets back strange strings, repeated ones,
partially broken ones, well ... I don't know what's going on with it.


read_eager is for when you want to parse the bytes yourself, such as 
when they come in a continuous stream without newline or other obvious 
separators. It will not return b'\n' any faster than read_until. It will 
simply give you your output lines in little bits that you would have to 
reassemble yourself.
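
A minimal sketch of that reassembly step, assuming the server terminates each reply with b'\n'. LineAssembler is a made-up helper, not part of telnetlib; the chunks are fed in by hand here so the example runs without a server.

```python
class LineAssembler:
    """Collect arbitrary byte chunks and hand back only complete lines."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add a chunk (e.g. from read_eager()) and return the lines completed so far."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *lines, self._buffer = self._buffer.split(b"\n")
        return lines

assembler = LineAssembler()
for chunk in (b"alt", b"itude=100\nspe", b"ed=5\n"):
    print(assembler.feed(chunk))
```

With a buffer like this, partial strings are expected input rather than a sign that something is broken. An empty chunk simply means no new data has arrived yet.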



You see, I don't know telnet (the protocol) very well, I'm very new to
Python and Python's docs are not very specific about that read_eager()
stuff.

Could someone point me to some more documentation about that? or at
least help me in getting a correct idea of what's going on with
read_eager()?


Read the source Lib/telnetlib.py (as I just did). It is pretty
straightforward code.


--
Terry Jan Reedy



Re: Lumberjack Song

2011-02-28 Thread MRAB

On 28/02/2011 10:26, Tom Zych wrote:

We all like computers here. No doubt many of us like computer games.
And most of us will be at least somewhat familiar with Monty Python.
Therefore, I present (drum roll)...

http://www.youtube.com/watch?v=Zh-zL-rhUuU

(For the Runescape fans out there, this should be quite hilarious.
Possibly not as much for those unfamiliar with Runescape...)


BTW, the lumberjack is called Kevin.


Re: Problems of Symbol Congestion in Computer Languages

2011-02-28 Thread rusi
On Feb 17, 3:07 am, Xah Lee  wrote:
> might be interesting.
>
> 〈Problems of Symbol Congestion in Computer Languages (ASCII Jam;
> Unicode; Fortress)〉http://xahlee.org/comp/comp_lang_unicode.html

Haskell is slowly moving this way see for example
http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#unicode-syntax

But it's not so easy: the lambda does not work straight off -- see
http://hackage.haskell.org/trac/ghc/ticket/1102


Re: Problems with read_eager and Telnet

2011-02-28 Thread Robi
On 28 Feb, 18:35, Jack Diederich  wrote:
> On Mon, Feb 28, 2011 at 12:13 PM, Roberto Inzerillo
>
>  wrote:
> > Yes. read_eager() will never actually read from the socket, if it has
>
> >> any data it has already read & processed it will return those.  If you
> >> call it enough times it will just start returning empty strings
> >> because it never asks the socket to read & wait for new data.
>
> > Can you point me to a practical usage example of read_eager()? Maybe that
> > will help me in making all this clear. I'm still very fuzzy about the socket
> > and the processing stuff.
>
> > I'm still convinced I cannot use read_until() in my project and I'm
> > determined in looking into the read_eager(), maybe that will be of any help
> > if carefully used.
>
> If you always want to see the response you have to use read_until().
> All the other read_* methods don't guarantee you will get the text you
> want because they don't wait for the other end of the connection to
> have actually sent the data.  It's a bummer, but if the server hasn't
> returned the data there is nothing magic that telnetlib can do to make
> it true.

Well, you see, I could prefetch some useful data in a spare cycle,
maybe this way I could "fake" almost-realtime ... I don't know, I'm
just guessing here.

Fact is, in my case, when I do 100 write/read cycles in a second
with telnet.write() followed by telnet.read_eager(), I see I get all
those 100 things done properly; that means both client and server are
doing it right, and there's no lag on the processing and network sides.
But when I use telnet.read_until() I get those feedbacks in the right
order (good) but it's way too slow (it's at least 10 times slower,
too bad!!! and, in my case, unusable).

I'm trying to find a way out, and I know in these cases knowledge is
the only way out.
I've seen a huge speed improvement using telnet.read_eager(), I want
to investigate more. Who knows, maybe I can stitch my code around its
apparent unusability.


Re: nntplib encoding problem

2011-02-28 Thread Laurent Duchesne

Hi,

Thanks it's working!
But is it "normal" for a string coming out of a module (nntplib) to 
crash when passed to print or write?


I'm just asking to know if I should open a bug report or not :)

I'm also wondering which strings should be re-encoded using the 
surrogateescape parameter and which should not.. I guess I could 
reencode them all and it wouldn't cause any problems?


Laurent

On Mon, 28 Feb 2011 02:12:20 +, MRAB wrote:

On 28/02/2011 01:31, Laurent Duchesne wrote:

Hi,

I'm using python 3.2 and got the following error:


nntpClient = nntplib.NNTP_SSL(...)
nntpClient.group("alt.binaries.cd.lossless")
nntpClient.over((534157,534157))
... 'subject': 'Myl\udce8ne Farmer - Anamorphosee (Japan Edition) 1995
[02/41] "Back.jpg" yEnc (1/3)' ...

overview = nntpClient.over((534157,534157))
print(overview[1][0][1]['subject'])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce8' in
position 3: surrogates not allowed

I'm not sure if I should report this as a bug in nntplib or if I'm doing
something wrong.

Note that I get the same error if I try to write this data to a file:



h = open("output.txt", "a")
h.write(overview[1][0][1]['subject'])

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce8' in
position 3: surrogates not allowed


It looks like the subject was originally encoded as Latin-1 (or
similar) (b'Myl\xe8ne Farmer - Anamorphosee (Japan Edition) 1995
[02/41] "Back.jpg" yEnc (1/3)') but has been decoded as UTF-8 with
"surrogateescape" passed as the "errors" parameter.

You can get the "correct" Unicode by encoding as UTF-8 with
"surrogateescape" and then decoding as Latin-1:

overview[1][0][1]['subject'].encode("utf-8",
"surrogateescape").decode("latin-1")

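The round trip described above can be checked without a news server. This is a minimal sketch using a shortened copy of the subject; the byte string is an assumption about what the server originally sent.

```python
def recover_subject(raw):
    """Replay the decode/encode steps and return each intermediate value."""
    # What nntplib hands back: UTF-8 decoding with undecodable bytes escaped.
    decoded = raw.decode("utf-8", "surrogateescape")
    # Undo that step to get the original bytes back unchanged.
    recovered = decoded.encode("utf-8", "surrogateescape")
    # Decode with the encoding the poster most likely used.
    text = recovered.decode("latin-1")
    return decoded, recovered, text

raw = b"Myl\xe8ne Farmer"   # assumed original Latin-1 bytes
decoded, recovered, text = recover_subject(raw)
print(ascii(decoded))       # shows the lone surrogate as an escape
print(recovered == raw)
print(ascii(text))
```

The lone surrogate is what print() and write() refuse to encode, and only strings that actually contain such surrogates need this treatment.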



Re: urlopen returns forbidden

2011-02-28 Thread Terry Reedy

On 2/28/2011 10:21 AM, Grant Edwards wrote:


As somebody else has already said, if the site provides an API that
they want you to use you should do so rather than hammering their web
server with a screen-scraper.


Is there any generic method for finding out 'if the site provides an
API' and specifically, how to find Wikipedia's?


I looked at the Wikipedia articles on API and web services and did not
find any mention of theirs (though there is one for Amazon).


--
Terry Jan Reedy



Re: Problems with read_eager and Telnet

2011-02-28 Thread Jack Diederich
On Mon, Feb 28, 2011 at 12:13 PM, Roberto Inzerillo
 wrote:
> Yes. read_eager() will never actually read from the socket, if it has
>>
>> any data it has already read & processed it will return those.  If you
>> call it enough times it will just start returning empty strings
>> because it never asks the socket to read & wait for new data.
>
> Can you point me to a practical usage example of read_eager()? Maybe that
> will help me in making all this clear. I'm still very fuzzy about the socket
> and the processing stuff.
>
> I'm still convinced I cannot use read_until() in my project and I'm
> determined in looking into the read_eager(), maybe that will be of any help
> if carefully used.

If you always want to see the response you have to use read_until().
All the other read_* methods don't guarantee you will get the text you
want because they don't wait for the other end of the connection to
have actually sent the data.  It's a bummer, but if the server hasn't
returned the data there is nothing magic that telnetlib can do to make
it true.
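
To make the difference concrete, here is a rough Python 3 sketch. It is
not telnetlib and not how telnetlib is implemented; LineReader,
read_available and the socketpair setup are all made up for illustration.
It only contrasts a read that returns whatever has already arrived with a
read that waits for a delimiter:

```python
import socket

class LineReader:
    """Hypothetical helper (not telnetlib): buffers bytes from a socket."""

    def __init__(self, sock):
        self.sock = sock
        self.buf = b""

    def read_available(self):
        """Return only what has already arrived; never wait for more."""
        self.sock.setblocking(False)
        try:
            self.buf += self.sock.recv(4096)
        except BlockingIOError:
            pass  # nothing has arrived yet
        finally:
            self.sock.setblocking(True)
        data, self.buf = self.buf, b""
        return data

    def read_until(self, delim):
        """Block until delim has been received, then return up to it."""
        while delim not in self.buf:
            chunk = self.sock.recv(4096)
            if not chunk:
                raise EOFError("connection closed before delimiter")
            self.buf += chunk
        head, _, self.buf = self.buf.partition(delim)
        return head + delim

# A connected pair of sockets stands in for the server and the client.
server, client = socket.socketpair()
reader = LineReader(client)

nothing_yet = reader.read_available()  # the "server" has sent nothing
server.sendall(b"par")
partial = reader.read_available()      # may be only part of a line
server.sendall(b"tial line\n")
rest = reader.read_until(b"\n")        # waits for the end of the line

print(nothing_yet, partial, rest)
server.close()
client.close()
```

The non-waiting read is fast precisely because it gives up when nothing is
there, which is why the caller sees empty and partial strings.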


Re: Problems with read_eager and Telnet

2011-02-28 Thread Robi
Can you point me to a practical usage example of read_eager()? Maybe
that will help me in making all this clear. I'm still very fuzzy about
the socket and the processing stuff.

I'm still convinced I cannot use read_until() in my project and I'm
determined to look into read_eager(); maybe it will be of some
help if used carefully.


Re: Problems with read_eager and Telnet

2011-02-28 Thread Jack Diederich
On Mon, Feb 28, 2011 at 11:50 AM, Robi  wrote:
>> Telnet sends two kinds of data over the same channel (a simple TCP
>> stream).  It sends the bytes you actually see in your terminal and it
>> sends control commands that do things like turn echo on/off and
>> negotiate what terminal type to use.  Each time telnetlib reads from
>> the socket it puts the control stuff in one bucket and stores the
>> plain text in a buffer to return from all the read_* commands.
>>
>> read_eager() returns the plain text that has already been read from
>> the socket.  That might be a partial line.  It won't try to read from
>> the socket to get a full line.  That's why it is fast, because it
>> never does I/O.
>
> Ok, that's a start (I'm reading RFC 854 in the meanwhile). Still that
> doesn't help me much (sorry, I know it's me, not you).
>
> You mean read_eager() doesn't wait until it gets a complete reading of
> a line, instead it reads what's on the socket (even if it's too quick
> and there's still nothing) and lets the Python script keep running anyway,
> right?
> Then with the subsequent read_eager() it will read (if there's
> something more on the socket in the meanwhile) the previous data bits
> and maybe the new ones too (a new line of data) into a single data
> chunk. Is that why I sometimes get repeated empty lines followed by
> many consecutive lines all together out of a single read_eager() call?

Yes. read_eager() will never actually read from the socket, if it has
any data it has already read & processed it will return those.  If you
call it enough times it will just start returning empty strings
because it never asks the socket to read & wait for new data.

-Jack


Re: Problems with read_eager and Telnet

2011-02-28 Thread Robi
> Telnet sends two kinds of data over the same channel (a simple TCP
> stream).  It sends the bytes you actually see in your terminal and it
> sends control commands that do things like turn echo on/off and
> negotiate what terminal type to use.  Each time telnetlib reads from
> the socket it puts the control stuff in one bucket and stores the
> plain text in a buffer to return from all the read_* commands.
>
> read_eager() returns the plain text that has already been read from
> the socket.  That might be a partial line.  It won't try to read from
> the socket to get a full line.  That's why it is fast, because it
> never does I/O.
>
> -Jack

Ok, that's a start (I'm reading RFC 854 in the meanwhile). Still that
doesn't help me much (sorry, I know it's me, not you).

You mean read_eager() doesn't wait until it gets a complete reading of
a line, instead it reads what's on the socket (even if it's too quick
and there's still nothing) and lets the Python script keep running anyway,
right?
Then with the subsequent read_eager() it will read (if there's
something more on the socket in the meanwhile) the previous data bits
and maybe the new ones too (a new line of data) into a single data
chunk. Is that why I sometimes get repeated empty lines followed by
many consecutive lines all together out of a single read_eager() call?


Re: question about numpy.polyval

2011-02-28 Thread Robert Kern

On 2/28/11 9:34 AM, sirvival wrote:

Hi,
I have some simulated data of stellar absorption lines.


You will want to ask numpy questions on the numpy mailing list:

  http://www.scipy.org/Mailing_Lists

It would be best if you could make a minimal, self-contained, runnable script 
that demonstrates your problem. I.e. provide x and y arrays that show the 
problem; all the details about chunking and such are just getting in the way. 
Please state what results you expect in addition to the results you got; you 
show plots but don't show exactly what you plotted. I have no idea what you were 
expecting to get. Make sure you use variable names consistently. For example, 
you refer to a "plot of fita", but nothing in your code assigns to "fita".


Thanks.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



Re: OT: Code Examples

2011-02-28 Thread n00m
On Feb 28, 6:03 pm, Fred Marshall 
wrote:
> I'm interested in developing Python-based programs, including an
> engineering app. ... re-writing from Fortran and C versions.  One of the
> objectives would be to make reasonable use of the available structure
> (objects, etc.).  So, I'd like to read a couple of good, simple
> scientific-oriented programs that do that kind of thing.
>
> Looking for links, etc.
>
> Fred

The best place for you to start: http://numpy.scipy.org/

Numpy manual: http://www.tramy.us/numpybook.pdf


Re: Problems with read_eager and Telnet

2011-02-28 Thread Jack Diederich
On Mon, Feb 28, 2011 at 10:54 AM, Robi  wrote:
> Hi everybody,
>  I'm totally new to Python but well motivated :-)
>
> I'm fooling around with Python in order to interface with FlightGear
> using a telnet connection.
>
> I can do what I had in mind (send some commands and read output from
> FlightGear using telnetlib) with a read_until() call to catch
> every output line I need, but it proved to be very slow (it takes 1/10
> of a sec for every read_until()).
>
> I tried using the read_eager() method and it's way faster (it
> does the job in 1/100 of a sec, maybe more, I didn't test) but it
> gives me problems, it gets back strange strings, repeated ones,
> partially broken ones, well ... I don't know what's going on with it.
>
> You see, I don't know telnet (the protocol) very well, I'm very new to
> Python and Python's docs are not very specific about that read_eager()
> stuff.
>
> Could someone point me to some more documentation about that? Or at
> least help me in getting a correct idea of what's going on with
> read_eager()?

Telnet sends two kinds of data over the same channel (a simple TCP
stream).  It sends the bytes you actually see in your terminal and it
sends control commands that do things like turn echo on/off and
negotiate what terminal type to use.  Each time telnetlib reads from
the socket it puts the control stuff in one bucket and stores the
plain text in a buffer to return from all the read_* commands.

read_eager() returns the plain text that has already been read from
the socket.  That might be a partial line.  It won't try to read from
the socket to get a full line.  That's why it is fast, because it
never does I/O.
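
A simplified Python 3 sketch of the "two buckets" idea follows. It is an
illustration only, not telnetlib's code: split_telnet is a made-up name,
and it handles just the three-byte IAC WILL/WONT/DO/DONT negotiations and
the escaped IAC IAC byte from RFC 854, skipping everything else:

```python
# Command bytes defined by the telnet protocol (RFC 854).
IAC, DONT, DO, WONT, WILL = 255, 254, 253, 252, 251

def split_telnet(data):
    """Separate plain text from simple option negotiations (sketch only)."""
    text = bytearray()
    commands = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte != IAC:
            text.append(byte)            # ordinary terminal data
            i += 1
        elif i + 1 < len(data) and data[i + 1] == IAC:
            text.append(IAC)             # IAC IAC is a literal 255 byte
            i += 2
        elif i + 2 < len(data) and data[i + 1] in (DO, DONT, WILL, WONT):
            commands.append((data[i + 1], data[i + 2]))
            i += 3
        else:
            i += 2                       # other commands: skipped here
    return bytes(text), commands

# Option 1 is ECHO and option 24 is TERMINAL-TYPE.
stream = b"login: " + bytes([IAC, WILL, 1]) + b"ok" + bytes([IAC, DO, 24])
text, commands = split_telnet(stream)
print(text, commands)
```

Only the text bucket is what the read_* calls hand back; the negotiation
bucket is dealt with behind the scenes.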

-Jack


OT: Code Examples

2011-02-28 Thread Fred Marshall
I'm interested in developing Python-based programs, including an 
engineering app. ... re-writing from Fortran and C versions.  One of the 
objectives would be to make reasonable use of the available structure 
(objects, etc.).  So, I'd like to read a couple of good, simple 
scientific-oriented programs that do that kind of thing.


Looking for links, etc.

Fred


Problems with read_eager and Telnet

2011-02-28 Thread Robi
Hi everybody,
 I'm totally new to Python but well motivated :-)

I'm fooling around with Python in order to interface with FlightGear
using a telnet connection.

I can do what I had in mind (send some commands and read output from
FlightGear using telnetlib) with a read_until() call to catch
every output line I need, but it proved to be very slow (it takes 1/10
of a sec for every read_until()).

I tried using the read_eager() method and it's way faster (it
does the job in 1/100 of a sec, maybe more, I didn't test) but it
gives me problems, it gets back strange strings, repeated ones,
partially broken ones, well ... I don't know what's going on with it.

You see, I don't know telnet (the protocol) very well, I'm very new to
Python and Python's docs are not very specific about that read_eager()
stuff.

Could someone point me to some more documentation about that? Or at
least help me in getting a correct idea of what's going on with
read_eager()?

I'm going on investigating but some help from here would be much
appreciated :-)

Thanks in advance,
   Roberto


question about numpy.polyval

2011-02-28 Thread sirvival
Hi,
I have some simulated data of stellar absorption lines.

What I am trying to do is the following:

I divide my data into chunks (each of the same size).
Then I let the code find the max y value in one of those chunks.
I got this working.

Then I put those values in a two-column array (first column has the
position of the max value in the original data; second column the y
value at this position).

Then I use polyfit to fit the data.

At last I use polyval to get the fit.

The problem is now since I got about 70 chunks I am not sure how to
use polyfit to get the fit for the original data.


My simulated data is a one column array of 30 data points. I am
only interested in a fit of values above 16.
The data is data_mean.

My code:

import numpy as np
import matplotlib.pyplot as mpl
import scipy, pyfits

chunk = 2000
data_len = len(data_mean)
num_chunk = data_len/chunk
start_chunk = 16
# I defined num_chunk this way so I can change the start value
num_chunk = num_chunk - start_chunk/chunk
data_chunk = np.zeros( (num_chunk, chunk))
# for fitting purposes later in the code
data_mean_b = data_mean[start_chunk:len(data_mean)]

for i in range(num_chunk):
  data_chunk[i] = data_mean[start_chunk+i*2000:start_chunk+2000+i*2000]

data_max = np.zeros(num_chunk)

for i in range(num_chunk):
  data_max[i] = max(data_chunk[i])  # finding the max value inside a chunk


data_max_pos = np.zeros(num_chunk) # the position of the max values

for i in range(num_chunk):
  for position, item in enumerate(data_mean):
    if item == data_max[i]:
      data_max_pos[i] = position


data_fin = np.zeros((num_chunk,2))

for i in range(num_chunk):   # final data two columns
   data_fin[i,0] = data_max_pos[i]
   data_fin[i,1] = data_max[i]

order = 2
x = np.arange(num_chunk)
y = data_fin[::,1]
coeff = np.polyfit(x, y, order)
fit = np.polyval(coeff,x)

xa = np.arange(len(data_mean_b))
fitb = np.zeros((num_chunk,2))


#end of code

Now fit does work fine but as len(num_chunk) = 70 it is no use for the
simulated data.
So I tried with xa and fitb.

But this just gives me something like this (plot of fita):
http://img40.imageshack.us/i/web01.png/

Plot of fit:
http://img196.imageshack.us/i/web02j.png/
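
For reference, the chunking and peak-finding steps described above can be
sketched in plain Python 3 without numpy. The function name chunk_maxima
and the tiny data list are made up for illustration; each peak's position
is recorded in the coordinates of the original data, taken from inside its
own chunk rather than by searching the whole array afterwards:

```python
def chunk_maxima(values, chunk_size, start=0):
    """Return (position, max value) pairs, one per complete chunk."""
    pairs = []
    # Step through complete chunks only; a short leftover tail is ignored.
    for begin in range(start, len(values) - chunk_size + 1, chunk_size):
        chunk = values[begin:begin + chunk_size]
        peak = max(chunk)
        # Position of the peak in the original data, not within the chunk.
        pairs.append((begin + chunk.index(peak), peak))
    return pairs

data = [1, 5, 2, 0, 7, 3, 4, 4, 9, 8]
pairs = chunk_maxima(data, chunk_size=3)
print(pairs)
```

One possible reading of the problem, offered as an assumption rather than
a diagnosis: if these positions (instead of the chunk index 0..69) are
passed as x to np.polyfit, the resulting coefficients can be evaluated with
np.polyval over the full range of data positions.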


Thanks


Re: urlopen returns forbidden

2011-02-28 Thread Grant Edwards
On 2011-02-28, Chris Rebert  wrote:
> On Sun, Feb 27, 2011 at 9:38 PM, monkeys paw  wrote:
>> I have a working urlopen routine which opens
>> a url, parses it for <a> tags and prints out
>> the links in the page. On some sites, wikipedia for
>> instance, i get a
>>
>> HTTP error 403, forbidden.
>>
>> What is the difference in accessing the site through a web browser
>> and opening/reading the URL with python urllib2.urlopen?
>
> The User-Agent header (http://en.wikipedia.org/wiki/User_agent ).

Sometimes you also need to set the Referer header (HTTP's spelling) for pages that
don't allow direct-linking from "outside".
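
A small Python 3 sketch of setting those headers with urllib.request (the
thread itself uses Python 2's urllib2). The build_request helper, the URL
and the header values are placeholders, and nothing is sent over the
network here:

```python
from urllib.request import Request

def build_request(url, user_agent, referer=None):
    """Build a request that carries explicit User-Agent/Referer headers."""
    headers = {"User-Agent": user_agent}
    if referer is not None:
        headers["Referer"] = referer
    return Request(url, headers=headers)

request = build_request(
    "http://example.com/page",                     # placeholder URL
    "link-lister/0.1 (contact: you@example.com)",  # placeholder identity
    referer="http://example.com/",
)

# urllib.request.urlopen(request) would send it; here we only inspect it.
print(request.header_items())
```

An honest, descriptive User-Agent string is generally a better choice than
imitating a browser.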

As somebody else has already said, if the site provides an API that
they want you to use you should do so rather than hammering their web
server with a screen-scraper.

Not only is it a lot less load on the site, it's usually a lot easier.

-- 
Grant Edwards               grant.b.edwards        Yow! Look DEEP into the
                                  at               OPENINGS!!  Do you see any
                              gmail.com            ELVES or EDSELS ... or a
                                                   HIGHBALL?? ...


a chance from you.

2011-02-28 Thread marian
You'll like the way I was convinced a girl for winning the new Volvo
S60:
http://www.unlimitednaughty.ro/camera/video/concurent/Marian-Briceag
Give me a chance by sharing this video to everyone on the group.
Please, please SHARE it :) . p.s.: for English turn on the captions
(the cc, under arrow from the right-bottom side of video). This is not
a spam, it's my dream. Thanks!


Re: SocketServer problem: client hangs trying to reconnect after server restart

2011-02-28 Thread Massi
On 28 Feb, 13:34, cmcp  wrote:
> In method StopServer() of class MyServer try calling  self.server_close() 
> after the self.shutdown() call.  I believe this will actually close the 
> server's socket and allow its reuse.

It works! Thank you!!


Re: SocketServer problem: client hangs trying to reconnect after server restart

2011-02-28 Thread cmcp
In method StopServer() of class MyServer try calling  self.server_close() after 
the self.shutdown() call.  I believe this will actually close the server's 
socket and allow its reuse.
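
A Python 3 sketch of that restart pattern (the module is spelled
socketserver there). The class names are illustrative, and the port is
chosen by the OS on the first round and reused afterwards, which is an
assumption made to keep the example self-contained:

```python
import socket
import socketserver
import threading

class EchoHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Echo a single message back to the client.
        self.request.sendall(self.request.recv(1024))

class RestartableServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True   # allow rebinding the port after a restart
    daemon_threads = True

replies = []
port = 0  # 0 lets the OS pick a free port for the first server
for round_number in range(3):
    server = RestartableServer(("127.0.0.1", port), EchoHandler)
    port = server.server_address[1]      # reuse the same port next round
    thread = threading.Thread(target=server.serve_forever)
    thread.start()

    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(b"hi")
        replies.append(sock.recv(1024))

    server.shutdown()       # stops the serve_forever() loop
    server.server_close()   # actually closes the listening socket
    thread.join()

print(replies)
```

Without server_close() the old listening socket stays open, so a later
client can connect to it and then wait for a server that is no longer
serving.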


Re: Embedding python : can't find encoding error

2011-02-28 Thread Mathieu CLERICI
Clarification: I'm trying to embed the Python 3.2 release.


Embedding python : can't find encoding error

2011-02-28 Thread Mathieu CLERICI
Hi,

I'm trying to embed Python in a C++ program.
I have compiled python32.lib with MSVC 2010 targeting 32 bits, and I link
it with my program, which is also 32-bit.
I get an error when calling Py_Initialize(): "no codec search
functions registered: can't find encoding"

Py_FileSystemDefaultEncoding value is "mbcs".

_PyCodec_Lookup raises an error because
len = PyList_Size(interp->codec_search_path); returns 0 in codecs.c

Has anyone already had this problem? I have no idea how to solve
it.

Sorry for my bad English.


Re: Various behaviors of doctest

2011-02-28 Thread Peter Otten
Gnarlodious wrote:

> Yeah, I just spent about 2 hours trying everything I could think of...
> without success. Including your suggestions. Guess I'll have to skip
> it. But thanks for the ideas.
> 
> -- Gnarlie

Are you using Python 2.x? Then you cannot redefine print. Instead you have 
to redirect stdout. The following example should run as a cgi script:

#!/usr/bin/env python
import cgi
import sys
from cStringIO import StringIO

def f():
    """
    >>> 1 + 1
    2
    >>> 2 + 2
    22
    >>> "<"
    '<'
    """

if __name__ == "__main__":
    print ("Content-type: text/html\r\n\r\n"
           ""
           "There goes:")

    outstream = StringIO()
    import doctest, sys

    from doctest import DocTestRunner
    class DTR(DocTestRunner):
        def run(self, test, compileflags=None, out=None, clear_globs=True):
            DocTestRunner.run(self, test, compileflags, outstream.write,
                              clear_globs)
        def summarize(self):
            saved = sys.stdout
            sys.stdout = outstream
            DocTestRunner.summarize(self)
            sys.stdout = saved

    doctest.DocTestRunner = DTR
    doctest.testmod(verbose=True)
    text = outstream.getvalue()

    for line in text.splitlines():
        spaces = len(line) - len(line.lstrip())
        line = line.strip()
        if "*" in line:
            print ""
        else:
            print "%s" % (
                "blue" if spaces else "black", spaces, cgi.escape(line))
    print ""
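
For comparison, the same redirect-stdout idea can be sketched in Python 3
with contextlib.redirect_stdout. This is a stripped-down illustration, not
a port of the CGI script above; the function f and the variable names are
only examples:

```python
import contextlib
import doctest
import io

def f():
    """
    >>> 1 + 1
    2
    >>> 2 + 2
    22
    """

# Collect everything doctest prints instead of letting it reach stdout.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    doctest.run_docstring_examples(f, {}, verbose=False, name="f")

report = buffer.getvalue()
print(report)  # the captured failure report, ready for HTML escaping
```

The second example fails on purpose so that there is some output to
capture.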



SocketServer problem: client hangs trying to reconnect after server restart

2011-02-28 Thread Massi
Hi everyone!

In my script (Python 2.6 on Windows 7) I have to set up a SocketServer
server and use it to handle external connections. During execution
the server may need to be closed and restarted (for
example with a different port or host). The following piece of code
simulates the situation I have to deal with:

import SocketServer, socket, threading
from time import sleep
BUF_LENGTH = 1024

class MyHandler(SocketServer.BaseRequestHandler) :
    def handle(self):
        while 1:
            data = self.request.recv(1024)
            self.request.send(data)
            if data.strip() == 'bye':
                return

class MyServer(SocketServer.ThreadingTCPServer) :
    def __init__(self, host, port, handler) :
        self.allow_reuse_address = True
        self.__handler = handler
        self.__serving = True
        SocketServer.ThreadingTCPServer.__init__ (self, (host, port),
                                                  handler)

    def StartServer(self) :
        self.serve_forever()

    def StopServer(self) :
        self.shutdown()

def Init() :
    server = MyServer("localhost", 5000, MyHandler)
    threading.Thread(target=server.StartServer).start()
    sleep(0.5)

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(("localhost", 5000))
    sock.send("hi")
    sock.recv(BUF_LENGTH)
    sock.send("bye")
    return server

def CleanUp(server) :
    server.StopServer()

for i in range(3) :
    print "-- Connection: "+str(i)+" --"
    server = Init()
    CleanUp(server)

print "Finished"

If you run the code you'll see that the client hangs after the first
connection. Can anyone point out what I'm doing wrong? Thanks in
advance!


Re: urlopen returns forbidden

2011-02-28 Thread Steven D'Aprano
On Sun, 27 Feb 2011 22:19:18 -0800, Chris Rebert wrote:

> On Sun, Feb 27, 2011 at 9:38 PM, monkeys paw 
> wrote:
>> I have a working urlopen routine which opens a url, parses it for <a>
>> tags and prints out the links in the page. On some sites, wikipedia for
>> instance, i get a
>>
>> HTTP error 403, forbidden.
>>
>> What is the difference in accessing the site through a web browser and
>> opening/reading the URL with python urllib2.urlopen?
[...]
> Sidenote: Wikipedia has a proper API for programmatic browsing, likely
> hence why it's blocking your program.

What he said. Please don't abuse Wikipedia by screen-scraping it.


-- 
Steven


Lumberjack Song

2011-02-28 Thread Tom Zych
We all like computers here. No doubt many of us like computer games.
And most of us will be at least somewhat familiar with Monty Python.
Therefore, I present (drum roll)...

http://www.youtube.com/watch?v=Zh-zL-rhUuU

(For the Runescape fans out there, this should be quite hilarious.
Possibly not as much for those unfamiliar with Runescape...)

-- 
Tom Zych / freethin...@pobox.com
Quidquid latine dictum sit, altum viditur.


Re: [ANN]VTD-XML 2.10

2011-02-28 Thread Stefan Behnel

pyt...@bdurham.com, 27.02.2011 13:52:

How does VTD-XML compare to XML tools in the stdlib or to 3rd party
alternatives like lxml?


For one, I'm not aware of any Python wrappers for vtd-xml, despite having 
seen lots of announcements by Jimmy on this list already (not the 
python-announce list, *this* list). So, IMHO, these announcements are to be 
considered spam, or at least vapourware. Also, Jimmy commonly does not 
respond to questions regarding his announcements, so I assume he doesn't 
actually read c.l.py but uses it solely to post his announcements.


That being said, vtd-xml is supposed to be fast, and there are benchmarks 
on the web that seem to suggest that this may be true. For example,


http://xmlbench.sourceforge.net/results/benchmark200910/index.html

lists its parser as being about twice as fast as libxml2, which is used by 
lxml.


I say "may be true", because benchmarks rarely indicate the performance 
behaviour in real world code. It does not seem completely unreasonable to 
me that vtd-xml is interesting for Java developers, where XML performance 
isn't really whopping cool and simple-to-use XML tools are rare anyway (I 
never tried it, but vtd-xml also has an aura of a not-so-easy-to-use tool).


It does not seem unreasonable that there are use cases for a very fast 
parser that justify the time it may take to get used to the tool. But its 
'different' nature also makes it clearly lack a lot of tooling around the 
actual parser, which prevents it from positioning itself as a general 
purpose XML tool. I'm yet to be convinced that vtd-xml is an interesting 
tool for a Python developer.


Stefan
