[issue2892] improve cElementTree iterparse error handling

2010-06-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Note that this was fixed in upstream 1.3 (and verified by the selftests), but 
the fix and test were apparently lost when that code was merged into 2.7.  Since 
2.7 is supposed to ship with 1.3, this is a regression, not a feature request.

(But 2.7 is in rc, and I'm on vacation, so I guess it's a bit too late to do 
anything about that.  I'll leave the final decision to flox and the python-dev 
crowd.)

--
assignee: effbot -> flox
versions: +Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2892
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-05-01 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Namespaces are a fundamental part of the XML information model (both xpath and 
infoset) and all modern XML document formats, so I'm not sure what problem 
you're trying to solve by pretending that they don't exist.

It's a bit like modifying "import foo" to work like "from foo import *"...
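
For context, a minimal sketch of how xml.etree.ElementTree in Python 3 exposes namespaces: the parser expands prefixes into "{uri}local" tag names, and a prefix map passed to find() keeps lookups short without hiding the namespace. The document and the namespace URI below are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the namespace URI is made up for this example.
doc = '<root xmlns:x="http://example.com/ns"><x:item>spam</x:item></root>'
root = ET.fromstring(doc)

# The parser expands the prefix, so the tag carries the full namespace URI.
item = root[0]
qualified_tag = item.tag

# A prefix map lets find() use short names without discarding the namespace.
found = root.find("x:item", {"x": "http://example.com/ns"})
```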

--
nosy: +effbot

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8583
___



[issue6488] ElementTree documentation refers to path with no explanation, and inconsistently

2010-04-01 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

> As per PEP 257, “Returns” should become “Return” (it’s a command, not a 
> description).

Upstream ET uses JavaDoc conventions, where the conventions are
designed by technical writers, not hackers.  In JavaDoc, descriptions
are 3rd person declarative (after all, the documentation describes
what the function does, not what you want it to do).

http://java.sun.com/j2se/javadoc/writingdoccomments/

The incompatibilities with Python's NIH-standards are unfortunate, but
that's the way it is.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6488
___



[issue6488] ElementTree documentation refers to path with no explanation, and inconsistently

2010-04-01 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

The missing/extra words in the findtext description are just a case of sloppy 
copy-editing, most likely after a quick reformatting.  Not sure why you're 
spending all this energy arguing about commas, though.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6488
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-21 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Hmm.  I'm not entirely sure about giving False a meaning when None has 
traditionally had a different (and documented) meaning.  And sleeping on it 
hasn't convinced me in either direction :-(

(well, I'd say no, but the compatibility argument is somewhat tempting)

I'm not that concerned by changing the default for write -- 3.x users with 
utf-8 as the default output encoding will get different output, but still 
perfectly valid XML.  3.x users with non-utf-8 default encodings  will get 
valid XML also in cases where it didn't work before.

tostring() is more problematic, but I'm leaning towards Guido's torpedoes 
approach there -- changing the default output to bytestrings is more likely to 
cause code to blow up than cause bad output, and you can trivially make your 
program backwards compatible by adding an extra check/decode after the call.  
Supporting unicode for lxml.etree compatibility is fine with me, but I think it 
might make sense to support the string "unicode" as well (as a pseudo-encoding 
-- it's pretty clear to me that nobody will ever define a real character 
encoding with that name :-).
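
As an illustration of the "unicode" pseudo-encoding idea, here is a minimal sketch using xml.etree.ElementTree as it behaves in later Python 3 releases, where encoding="unicode" is accepted by tostring(). Nothing here reflects the patch under review; it only shows the two kinds of result being discussed.

```python
import xml.etree.ElementTree as ET

e = ET.Element("tag")
e.text = "hell\u00f6"

# The "unicode" pseudo-encoding yields a text string.
as_text = ET.tostring(e, encoding="unicode")

# A real encoding name yields a byte string.
as_bytes = ET.tostring(e, encoding="utf-8")
```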

Have you posted/can you post the patch to Rietveld, btw?  I have some questions 
about the code that are independent of the encoding decision.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

> 'None' has always been the documented default for the encoding parameter

That's probably mostly by accident at least in original ET, but the 1.3 draft 
docs at effbot.org/elementtree do spell it out explicitly for the 'write' 
method:

   Output encoding. If omitted or set to None, defaults to US-ASCII.

Not sure I'd consider this text binding in itself, though (even if I'd argue 
that it's preferred to have the same interpretation of encoding everywhere).

> writing out the Unicode serialisation will result in an incorrect XML 
> serialisation

I think Guido meant the ElementTree.write method; is that broken too?

The file.write(et.tostring()) issue is probably my most pressing concern here; 
that's a common use case (e.g. when using iterparse to cut pieces from a big 
document), and the defaults were chosen to increase the chance that this 
automatically does the right thing for non-ASCII even if the programmer never 
tests it.  In 3.X, that construct is suddenly dependent on the interpreter's 
default encoding.
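
The use case can be sketched as follows, assuming xml.etree.ElementTree as it behaves in current Python 3 releases (not the 3.1 code discussed here). The input document and the in-memory output buffer are stand-ins for real files.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input; a real document would come from a file on disk.
source = io.BytesIO(
    "<doc><rec>caf\u00e9</rec><rec>plain</rec></doc>".encode("utf-8")
)
out = io.BytesIO()  # stands in for a file opened in binary mode

for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "rec":
        # With the default encoding, tostring() returns ASCII-only bytes
        # (non-ASCII text becomes character references), so the write does
        # not depend on the interpreter's default encoding.
        out.write(ET.tostring(elem))
        elem.clear()

written = out.getvalue()
```

Because the pieces are plain ASCII bytes, they can be written to any binary file regardless of locale settings.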

I think I'd prefer old tostring behaviour and a separate tounicode 
function, and I'm still not convinced that the latter is required for the XML 
use case (which implies that maybe it should live in lxml.html for the HTML 
case, even if it ends up calling the same internal implementation).

Or should that be tobytes and tounicode to eliminate all ambiguity?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

(what's the Python 3 replacement for the array module, btw?)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

> Yes, the feature has been implemented deep down in the _encode() helper 
> function, so it impacts the entire serialiser, not only its API

Ouch.

>>> import locale
>>> locale.getpreferredencoding() == "utf-8"
False
>>> from xml.etree.ElementTree import *
>>> e = Element("tag")
>>> e.text = "hellö"
>>> tostring(e)
'<tag>hellö</tag>'
>>> ElementTree(e).write("out.xml")
>>> tree = parse("out.xml")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python31\lib\xml\etree\ElementTree.py", line 843, in parse
    tree.parse(source, parser)
  File "C:\Python31\lib\xml\etree\ElementTree.py", line 581, in parse
    parser.feed(data)
  File "C:\Python31\lib\xml\etree\ElementTree.py", line 1221, in feed
    self._parser.Parse(data, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 9

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

I wouldn't raise much opposition against tobytes() as an alias for tostring(), 
although that sounds more like duplicating an otherwise simple API.

Adding an alias would be a way to address the 2.X/3.X terminology overlap; 
"string" traditionally implies 8-bit in 2.X, and apparently now Unicode in 3.X.  
That's likely to cause a lot of confusion for people switching over (and to 
people writing 3.X documentation, as well; the array module's documentation is 
an example).

ET isn't the only thing with tostring functionality, of course -- it's pretty 
much the standard name for "serialize data structure to byte string for later 
transmission" -- so it probably wouldn't hurt with a python-dev pronouncement 
here.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Changes by Fredrik Lundh fred...@effbot.org:


--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

I wouldn't raise much opposition against tobytes() as an alias for tostring(), 
although that sounds more like duplicating an otherwise simple API.

Adding an alias would be a way to address the 2.X/3.X terminology overlap; 
"string" traditionally implies 8-bit in 2.X, and apparently now Unicode in 3.X.  
That's likely to cause a lot of confusion for people switching from 2 to 3 (and 
to people writing 3.X documentation, apparently; the array module's 
documentation is an example of that).

(And once everyone has switched over, we can deprecate the tostring spelling... 
:)

ET isn't the only thing with tostring functionality, of course -- it's pretty 
much the standard name for "serialize data structure to byte string for later 
transmission" -- so it probably wouldn't hurt with a python-dev pronouncement 
here.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Interesting.  But isn't the problem with 3.1 that it relies on the standard 
encoding, which results in code that may or may not work depending on a global 
platform setting?  Who's doing the encoding in the new version?  And what ends 
up in the file?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-12 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Oops :)  Yeah, that was a pretty lousy way to show what encoding I was using 
for that test:

>>> import locale
>>> locale.getpreferredencoding()
'cp1252'


(Somewhat related, it would be nice if Python actually normalized 
defaultencoding/preferredencoding to some canonical name for the codec in use, 
i.e. preferred MIME name or at least IANA; we had a rather nice little bug 
recently that wouldn't have happened if that had been the case...)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue7114] HTMLParser doesn't handle <![CDATA[ ... ]]>

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

And to clarify, XHTML is a reformulation of HTML4 using XML syntax, so you 
should use an XML parser to parse it, not an HTML parser.  The formats are 
related, but not identical.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7114
___



[issue5100] ElementTree.iterparse and Element.tail confusion

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Footnote: iterparse does things this way mostly to keep the implementation 
simple and fast; due to buffering, the tree builder is usually ahead of the 
event generation by up to 16k.  See the note on this page:

http://effbot.org/zone/element-iterparse.htm

and the message it links to for more on this topic.

Your case is a very common use case for tostring, so it would probably have 
made sense to make tostring skip the tail on the element itself, at least if 
it's whitespace only.  Guess we could add an option...

But in your case, you can probably just nuke or normalize the element's tail 
before writing it out (i.e. set it to None or "\n").
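
A minimal sketch of that workaround, assuming Python 3's xml.etree.ElementTree; the sample document is invented. tostring() includes the element's own tail, so clearing it first gives a clean fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with whitespace between the items.
root = ET.fromstring("<doc><item>one</item>\n  <item>two</item>\n</doc>")
first = root[0]

# The whitespace after </item> is stored as the element's tail
# and is serialized along with it.
with_tail = ET.tostring(first, encoding="unicode")

# Drop the tail (or set it to "\n" to normalize it) before writing.
first.tail = None
without_tail = ET.tostring(first, encoding="unicode")
```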

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5100
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

> if I don't specify an encoding, I get unicode.  If I do specify an encoding, 
> I get encoded bytes.

You're confusing the XML document encoding with character set encoding.

A serialized (unparsed) XML document is a byte stream, not a string of Unicode 
characters.  And the character set encoding is both embedded in that byte 
stream and affects how it's generated in more than one way; you cannot just 
recode XML documents willy-nilly and expect things to work.

A parsed XML document (an infoset) -- for ET, that's the tree of Element 
objects -- does indeed contain Unicode strings, but the transformation from the 
byte stream to the Unicode string doesn't just involve character set decoding; 
there are several other constructs that are handled by the XML parser.

> Ha. There has been a very long temporal window

You should have had plenty of time to fix it, then, right?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue6472] Update ElementTree with upstream changes

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

W00t!

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6472
___



[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

>>> import array
>>> array.array("i", [1, 2, 3]).tostring()
b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00'

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___




[issue8047] Serialiser in ElementTree returns unicode strings in Py3k

2010-03-11 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

So now it's the domain experts against some hypothetical people that might 
exist?  Tricky.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8047
___



[issue7462] Implement fastsearch algorithm for rfind/rindex

2010-01-04 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Thanks Florent!

> Are there any simple, common cases that are made slower by this patch?

The original fastsearch implementation has a couple of special cases to make 
sure it's faster than the original code in all cases.  The reason it wasn't 
implemented for reverse search was more a question of developer time 
constraints; reverse search isn't nearly as common as forward search, and we 
had other low-hanging fruit to deal with.

(btw, while it's great that someone finally got around to fixing this, it 
wouldn't surprise me if replacing the KMP implementation in SRE with a 
fastsearch would save as many CPU cycles worldwide as this patch :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7462
___



[issue3475] _elementtree.c import can fail silently

2009-11-08 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Note that "fail silently" is a bit of a misnomer - if the embedded import 
doesn't work, portions of the library will fail pretty loudly.  Feel free 
to use some variation of the suggested patch, or just wait until the next 
upstream release gets imported (if ever).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3475
___



[issue7139] ElementTree: Incorrect serialization of end-of-line characters in attribute values

2009-11-02 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

The real problem here is that XML attributes weren't really designed
to hold data that doesn't survive normalization.  One would have
thought that making it difficult to do that, and easy to store such
things as character data, would have made people think a bit before
designing XML formats that do things the other way around, but
apparently some people find it hard having to use their brain when
designing things...

FWIW, the current ET 1.3 beta escapes newline but not tabs and
carriage returns; I don't really mind adding tabs, but I'm less sure
about carriage return -- XML pretty much treats CR as a junk character
also outside attributes, and escaping it in all contexts would just be
silly.
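
To make the normalization point concrete, here is a small sketch using Python 3's xml.etree.ElementTree (an assumption; the message is about the ET 1.3 beta). The element and attribute names are invented.

```python
import xml.etree.ElementTree as ET

# A literal newline inside an attribute value is normalized to a space
# by the parser; a character reference survives as a real newline.
literal = ET.fromstring('<a x="one\ntwo"/>').get("x")
charref = ET.fromstring('<a x="one&#10;two"/>').get("x")

# Serializing an attribute that contains a newline and parsing it back:
# the serializer escapes the newline, so it survives the round trip.
e = ET.Element("a", x="one\ntwo")
roundtrip = ET.fromstring(ET.tostring(e)).get("x")
```

Text content (character data) does not go through attribute-value normalization, which is why it is the safer place for data containing newlines.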

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7139
___



Re: [Image-SIG] Some issue with easy_install and PIL/Imaging

2009-10-05 Thread Fredrik Lundh
The problem is that too many people arguing for eggs do this by
sending nastygrams, which doesn't really provide much motivation for
doing anything about it (I don't do asshole-driven development).  The
public review PIL got a couple of minutes ago matches some of the
private mail I've gotten:

   no egg - worst seen ever, remove it from pypi or provide an egg
(jensens, 2009-10-05, 0 points)

/F

On Wed, Sep 30, 2009 at 6:24 PM, Chris Withers ch...@simplistix.co.uk wrote:
> Fredrik Lundh wrote:
>>
>> On Fri, Sep 11, 2009 at 3:49 PM, Chris Withers ch...@simplistix.co.uk
>> wrote:
>>>
>>> Klein Stéphane wrote:
>>>>
>>>> Resume :
>>>> 1. first question : why PIL package in pypi don't work ?
>>>
>>> Because Fred Lundh have his package distributions unfortunate names that
>>> setuptools doesn't like...
>>
>> It used to support this, but no longer does.  To me, that says more
>> about the state of setuptools than it does about the state of PIL,
>> which has been using the same naming convention for 15 years.
>
> Yep, but it is now in the minority, and consistency in package naming is
> always good.
>
> Would there be any problems for you in naming the distribution in a
> setuptools-friendly way from the next point release?
>
> cheers,
>
> Chris
>
> --
> Simplistix - Content Management, Batch Processing & Python Consulting
>            - http://www.simplistix.co.uk

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [Image-SIG] Some issue with easy_install and PIL/Imaging

2009-09-28 Thread Fredrik Lundh
On Fri, Sep 11, 2009 at 3:49 PM, Chris Withers ch...@simplistix.co.uk wrote:
> Klein Stéphane wrote:
>>
>> Resume :
>> 1. first question : why PIL package in pypi don't work ?
>
> Because Fred Lundh have his package distributions unfortunate names that
> setuptools doesn't like...

It used to support this, but no longer does.  To me, that says more
about the state of setuptools than it does about the state of PIL,
which has been using the same naming convention for 15 years.

/F


[issue6562] OverflowError in RLock.acquire()

2009-08-04 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

PIL is completely thread-agnostic, so I'm not sure there's anything PIL can 
do to fix this.

(and ImageQt is of course an interface to PyQt, which is an interface to 
Qt, which consists of a *lot* more than 50 lines...)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6562
___



[issue6233] ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding

2009-06-24 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

That's backwards, unless I'm missing something here: charrefs represent 
Unicode characters, not UTF-8 byte values.  The character LATIN SMALL 
LETTER A WITH TILDE with the character value 227 should be represented as 
"&#227;" if serialized to an encoding that doesn't support non-ASCII 
characters.
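
The distinction can be checked directly in Python 3 with the standard "xmlcharrefreplace" error handler; this is only an illustration of the point above, not code from the patch.

```python
ch = "\u00e3"  # LATIN SMALL LETTER A WITH TILDE

# The character reference is built from the Unicode character value.
code_point = ord(ch)
as_charref = ch.encode("ascii", "xmlcharrefreplace")

# The UTF-8 byte values are different numbers and must not be used
# in a character reference.
utf8_bytes = list(ch.encode("utf-8"))
```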

And there's no need to use RE:s to filter things under 3.X; those parts of 
ET 1.2 are there for pre-2.0 compatibility.

Did you try running the tests with the escape function I posted?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6233
___



[issue5166] ElementTree and minidom don't prevent creation of not well-formed XML

2009-06-24 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

For ET, that's very much on purpose.  Validating data provided by every 
single application would kill performance for all of them, even if only a 
small minority would ever try to serialize data that cannot be represented 
in XML.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5166
___



[issue6266] cElementTree.iterparse ElementTree.iterparse return differently encoded strings

2009-06-21 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

It should definitely give what's intended (either a Unicode string, or, if 
the content is plain ASCII, an 8-bit string).  What did you get instead?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6266
___



[issue6233] ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding

2009-06-21 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Umm.  Isn't _encode used to encode tags and attribute names?  The charref 
syntax is only valid in CDATA sections and attribute values, which are 
encoded by the corresponding _escape functions.  I suspect this patch will 
make things blow up on a non-ASCII tag/attribute name.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6233
___



[issue6233] ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding

2009-06-21 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Did you look at the 1.3 alpha code base when you came up with this idea?  
Unfortunately, 1.3's _encode is used for a different purpose...

I don't have time to test it tonight, but I suspect that 1.3's 
escape_data/escape_attrib functions might work better under 3.X; they do 
the text.replace dance first, and then an explicit text.encode(encoding, 
"xmlcharrefreplace") at the end.  E.g.

def _escape_cdata(text, encoding):
    # escape character data
    try:
        # it's worth avoiding do-nothing calls for strings that are
        # shorter than 500 character, or so.  assume that's, by far,
        # the most common case in most applications.
        if "&" in text:
            text = text.replace("&", "&amp;")
        if "<" in text:
            text = text.replace("<", "&lt;")
        if ">" in text:
            text = text.replace(">", "&gt;")
        return text.encode(encoding, "xmlcharrefreplace")
    except (TypeError, AttributeError):
        _raise_serialization_error(text)
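
For trying this out under Python 3, here is a self-contained sketch of the same approach. The name escape_cdata and the TypeError raised in the except branch are stand-ins chosen for this example; ET's own _raise_serialization_error helper is private and is not reproduced here.

```python
def escape_cdata(text, encoding):
    """Escape character data, then encode it with charrefs as fallback."""
    try:
        # Replace markup-significant characters first ...
        if "&" in text:
            text = text.replace("&", "&amp;")
        if "<" in text:
            text = text.replace("<", "&lt;")
        if ">" in text:
            text = text.replace(">", "&gt;")
        # ... then let the codec turn unencodable characters into charrefs.
        return text.encode(encoding, "xmlcharrefreplace")
    except (TypeError, AttributeError):
        # Hypothetical stand-in for ET's private error helper.
        raise TypeError(
            "cannot serialize %r (type %s)" % (text, type(text).__name__)
        )


escaped = escape_cdata("a < b & \u00e3", "ascii")
```

Doing the replacements before the encode step matters: the "&" introduced by a character reference must not be escaped again.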

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6233
___



[issue6266] cElementTree.iterparse ElementTree.iterparse return differently encoded strings

2009-06-20 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

Converting from UTF-8 to Unicode is the right thing to do, but 
converting back to Latin-1 is not correct -- note that ET returns a 
Unicode string, not an 8-bit string.  There's a makestring helper that 
does the right thing in the library; just changing:

parcel = Py_BuildValue("ss", (prefix) ? prefix : "", uri);

to 

parcel = Py_BuildValue("sN", (prefix) ? prefix : "", makestring(uri));

should work (even if you should probably do that in two steps, and look 
for errors from makestring before proceeding).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6266
___



[issue5767] xmlrpclib loads invalid documents

2009-04-16 Thread Fredrik Lundh

Fredrik Lundh fred...@effbot.org added the comment:

sgmlop doesn't do much validation; to quote the homepage: "[sgmlop] is 
tolerant, and happily accepts XML-like data that are not well-formed. If 
you need strictness, use another parser."

But given that Python ships with cElementTree these days, and 
cElementTree's XMLParser (based on expat) is faster than both sgmlop and 
pyexpat, maybe it's time to remove sgmlop support from xmlrpclib...

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5767
___



[issue1143] Update to latest ElementTree in Python 2.7

2009-04-02 Thread Fredrik Lundh

Fredrik Lundh eff...@users.sourceforge.net added the comment:

ET 1.3 is still in alpha, though.  Hopefully, that'll sort itself out
over the next few weeks.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1143
___



[issue1538691] Patch cElementTree to export CurrentLineNumber

2009-04-02 Thread Fredrik Lundh

Fredrik Lundh eff...@users.sourceforge.net added the comment:

In the upstream 1.0.6, the ParseError exception has a position attribute
that contains a (line, column) tuple.
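
A minimal sketch of reading that attribute, assuming the ParseError class in Python 3's xml.etree.ElementTree (which also carries a position attribute) rather than the standalone cElementTree 1.0.6 release. The broken document is invented.

```python
import xml.etree.ElementTree as ET

try:
    # Mismatched end tag on the second line of the input.
    ET.fromstring("<a>\n<b></a>")
except ET.ParseError as err:
    # position is a (line, column) tuple describing where parsing failed.
    line, column = err.position
    message = str(err)
```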

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1538691
___



[issue1777] ElementTree/cElementTree findtext inconsistency

2009-01-10 Thread Fredrik Lundh

Fredrik Lundh eff...@users.sourceforge.net added the comment:

Forgot to mention that this is fixed in the cElementTree trunk (public
as of today's 1.0.6 preview release).  Will merge with Python trunk when
I find the time...

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1777
___



Re: Official definition of call-by-value (Re: Finding the instance reference...)

2008-11-13 Thread Fredrik Lundh

greg wrote:


> If you're going to indulge in argument by authority,
> you need to pick authorities that can be considered,
> er, authoritative in the field concerned...


Like Barbara Liskov, who's won tons of awards for her work on computer 
science and programming languages, and who was among the first to 
design, implement, and formally describe a language with *exactly* the 
same evaluation semantics as Python?  What did she and her co-authors 
have to say about the calling semantics in their new language?  Let's see:


In particular it is not call by value because mutations
of arguments performed by the called routine will be
visible to the caller.  And it is not call by reference
because access is not given to the variables of the
caller, but merely to certain objects.

Let's take that again, with emphasis:

IN PARTICULAR IT IS NOT CALL BY VALUE because mutations
of arguments performed by the called routine will be visible to
the caller. And IT IS NOT CALL BY REFERENCE because access
is not given to the variables of the caller, but merely to
certain objects.

It is not.  And it is not.
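
Both halves of that statement can be observed in a few lines of Python; the function and variable names below are invented for the illustration.

```python
def mutate(items):
    # Mutates the object the caller passed in; the caller sees this.
    items.append("changed")


def rebind(items):
    # Rebinds the local name only; the caller's variable is untouched.
    items = ["replaced"]


data = ["original"]

mutate(data)
after_mutate = list(data)

rebind(data)
after_rebind = list(data)
```

The mutation is visible to the caller, so it is not call by value; the rebinding is not, so it is not call by reference either.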

But maybe they were just ignorant, and didn't really get how earlier 
languages worked?  Let's see what Liskov has to say about that:


The group as a whole was quite knowledgeable about languages that
existed at the time. I had used Lisp extensively and had also
programmed in Fortran and Algol 60, Steve Zilles and Craig
Schaffert had worked on PL/I compilers, and Alan Snyder had done
extensive programming in C. In addition, we were familiar with
Algol 68, EL/1, Simula 67, Pascal, SETL, and various machine
languages. Early in the design process we did a study of other
languages to see whether we should use one of them as a basis for
our work [Aiello, 1974]. We ultimately decided that none would be
suitable as a basis. None of them supported data abstraction, and
we wanted to see where that idea would lead us without having to
worry about how it might interact with pre-existing
features. However, we did borrow from existing languages.  Our
semantic model is largely borrowed from Lisp; our syntax is
Algol-like.

Still think they didn't understand Algol's semantic model?

:::

But nevermind - the real WTF with threads like this one is the whole 
idea that there are two and only two evaluation strategies to choose 
from.  That's a remarkable narrow-mindedness.


/F



Re: Official definition of call-by-value (Re: Finding the instance reference...)

2008-11-12 Thread Fredrik Lundh

Aahz wrote:


>> There you have it -- call by value is officially defined in
>> terms of assignment. There is no mention in there of copying.
>>
>> So it's perfectly correct to use it in relation to Python.
>
> Except, of course, for the fact that it is generally misleading.


It's not only misleading, it's also a seriously flawed reading of the 
original text - the Algol 60 report explicitly talks about assignment of 
*values*.


I'm not aware of any language where a reference to an object, rather 
than the *contents* of the object, is seen as the object's actual value. 
It's definitely not true for Python, at least.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Official definition of call-by-value (Re: Finding the instance reference...)

2008-11-12 Thread Fredrik Lundh

greg wrote:

It's not only misleading, it's also a seriously flawed reading of the 
original text - the Algol 60 report explicitly talks about assignment 
of *values*.


Do you agree that an expression in Python has a value?



Do you agree that it makes sense to talk about assigning
that value to something?


Python's definition of the word value can be found in the language 
reference:


http://docs.python.org/reference/datamodel.html#objects-values-and-types

Using that definition, a Python expression yields an object, not an 
object value.


For comparison, here's Algol's definition of the word value:

A value is an ordered set of numbers (special case: a single number), 
an ordered set of logical values (special case: a single logical value), 
or a label.


It should be obvious to anyone that Python is not Algol.

 If so, what is there to stop us from applying the Algol
 definition to Python?

The fact that we're talking about Python.  Python is not Algol.

/F

--
http://mail.python.org/mailman/listinfo/python-list


[issue4100] xml.etree.ElementTree does not read xml-text over page bonderies

2008-11-01 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:

Roland's right - iterparse only guarantees that it has seen the ">" 
character of a starting tag when it emits a "start" event, so the 
attributes are defined, but the contents of the text and tail attributes 
are undefined at that point.  The same applies to the element children; 
they may or may not be present.

If you need a fully populated element, look for end events instead.
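
A minimal sketch of that advice, written for Python 3 (the thread itself predates it). The sample document and the variable names are made up for illustration:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, just large enough to show the pattern.
xml_data = b"<root><item id='1'>first</item><item id='2'>second</item></root>"

items = []
for event, elem in ET.iterparse(io.BytesIO(xml_data), events=("start", "end")):
    if event == "start":
        # Only the tag and the attributes are guaranteed here; elem.text,
        # elem.tail and the children may or may not be filled in yet.
        continue
    if elem.tag == "item":
        # On "end" the element is fully populated.
        items.append((elem.get("id"), elem.text))

print(items)
```

With a small in-memory document the text will often be present on "start" as well, which is exactly why the problem only shows up on larger files.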

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue4100
___



Re: Using the 'with' statement with cStringIO objects

2008-09-27 Thread Fredrik Lundh

peppergrower wrote:


teststring='this is a test'

with cStringIO.StringIO(teststring) as testfile:
    pass


umm.  what exactly do you expect that code to do?

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to read a jpg bytearray from a Flash AS3 file

2008-09-27 Thread Fredrik Lundh

[EMAIL PROTECTED] wrote:


I'm trying to save an image from a Flash AS3 to my server as a jpg
file. I found some PHP code to do this, but I want to do this in
Python. I'm not quite sure how to convert the following code to
Python. It's mainly the $GLOBALS[HTTP_RAW_POST_DATA] part I don't
know how to convert.


depends on what framework you're using.  if you're using plain CGI, you 
should be able to read the posted data from sys.stdin:


  import sys

  im = sys.stdin.read()

  f = open(name, 'wb')
  f.write(im)
  f.close()

to make your code a bit more robust, you may want to check the 
content-length before doing the read, e.g.


  import os

  if os.environ.get("REQUEST_METHOD") != "POST":
      ... report invalid request ...

  bytes = int(os.environ.get("CONTENT_LENGTH", 0))
  if bytes > MAX_REQUEST_SIZE:
      ... report request too large ...

  im = sys.stdin.read(bytes)

to deal with query parameters etc, see

   http://docs.python.org/lib/module-cgi.html
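
The checks above can be pulled into one helper. This is only a sketch for Python 3: the function name, the size limit and the use of ValueError are assumptions, not part of any framework. The environment and the input stream are passed in so the helper can be tried without a web server.

```python
import io

# Hypothetical upper limit; pick whatever your application can tolerate.
MAX_REQUEST_SIZE = 5 * 1024 * 1024


def read_posted_data(environ, stream, max_size=MAX_REQUEST_SIZE):
    """Return the raw POST body, or raise ValueError for a bad request."""
    if environ.get("REQUEST_METHOD") != "POST":
        raise ValueError("invalid request: expected POST")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    if length > max_size:
        raise ValueError("request too large")
    # Read exactly the announced number of bytes, never "until EOF".
    return stream.read(length)


# Stand-in for a real request: four bytes announced, extra bytes ignored.
fake_environ = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "4"}
body = read_posted_data(fake_environ, io.BytesIO(b"\xff\xd8\xff\xe0ignored"))
print(len(body))
```

In a real Python 3 CGI script you would pass os.environ and sys.stdin.buffer (the binary stream) rather than the stand-ins used here.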

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to get the filename in the right case ?

2008-09-27 Thread Fredrik Lundh

Stef Mientki wrote:


I don't think your suggestion is a good one.
If a filename has uppercase characters in it,
the END-USER has done that for some kind of reason.


I explain how pdb works and show you how to solve the specific 
comparison problem you mentioned in your post, and you start ranting 
because it doesn't solve all your problems?  what's wrong with you?


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Regular expression help: unable to search ' # ' character in the file

2008-09-27 Thread Fredrik Lundh

[EMAIL PROTECTED] wrote:


import re

fd = open(file, 'r')
line = fd.readline
pat1 = re.compile("\#*")
while(line):
    mat1 = pat1.search(line)
    if mat1:
        print line
    line = fd.readline()


I strongly doubt that this is the code you used.


But the above prints the whole file instead of the hash lines only.


"*" means zero or more matches.  all lines in a file contain zero or 
more "#" characters.


but using a RE is overkill in this case, of course.  to check for a 
character or substring, use the "in" operator:


for line in open(file):
    if "#" in line:
        print line
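
To see the difference side by side, here is a small Python 3 sketch; the sample lines are invented for the purpose:

```python
import re

# Invented sample input: one comment line, one plain line, one trailing comment.
lines = ["# a comment\n", "x = 1\n", "y = 2  # trailing\n"]

# "#*" also matches the empty string, so search() succeeds on every line.
star_pattern = re.compile(r"#*")
matched_by_star = [line for line in lines if star_pattern.search(line)]

# The "in" operator only picks the lines that really contain a "#".
hash_lines = [line for line in lines if "#" in line]

print(len(matched_by_star), len(hash_lines))
```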

/F

--
http://mail.python.org/mailman/listinfo/python-list


[issue433029] SRE: posix classes aren't supported

2008-09-27 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:

Yes, this refers to the POSIX character classes as described here:

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html

(Ideally, there should be an (internal) API that lets you register class 
definitions from the Python level.)

Support for Unicode properties could perhaps be addressed at the same 
time:

http://unicode.org/unicode/reports/tr18/#Basic_Unicode_Support

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue433029
___



Re: Eggs, VirtualEnv, and Apt - best practices?

2008-09-25 Thread Fredrik Lundh

Dmitry S. Makovey wrote:


you have just described OS package building ;)

I can't speak for everybody, but supporting multiple platforms (PHP, Perl,
Python, Java) we found that the only way to stay consistent is to use OS
native packaging tools (in your case apt and .deb ) and if you're missing
something - roll your own package. After a while you accumulate plenty of
templates to chose from when you need yet-another-library not available
upstream in your preferred package format. Remember that some python tools
might depend on non-python packages, so the only way to make sure all that
is consistent across environment - use unified package management.


you're speaking for lots of organizations, at least.

    rpm/debs from supplier's repository
    subversion (or equivalent) -> locally built rpm/debs
  + organization's favourite deployment tools
  ----------------------------------------------------
    deployed application

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to get the filename in the right case ?

2008-09-25 Thread Fredrik Lundh

Stef Mientki wrote:

 1. I've a multitab editor.
 2. When a breakpoint is reached,
 3. I check if the file specified in pdb output, is already open in one
 of the editor tabs,
 4. if not, I open a new tab with the correct file,
 5. I focus the correct editor tab and jump to the line specified by
 pdb.
 6. After that I should be able to inspect the surrounding of the
 breakpoint, so I need the modules name.

 For 3 I need to compare filenames, the editor contains the case
 sensitive name, pdb not.

pdb uses os.path.abspath and os.path.normcase to normalize filenames so 
they can be safely compared (see the canonic method in bdb.py).


I suggest you do the same in your editor; e.g:

pdb_filename = ...

for buffer in editor_buffers:
    filename = os.path.normcase(os.path.abspath(buffer.filename))
    if pdb_filename == filename:
        ... found it ...
        break
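
The same normalization can be wrapped in a small comparison helper. This is a sketch only; the function names are hypothetical and not taken from pdb or any editor:

```python
import os


def canonical(path):
    # Same two steps the post attributes to bdb's canonic method.
    return os.path.normcase(os.path.abspath(path))


def same_file_name(a, b):
    """True if both names refer to the same path after normalization."""
    return canonical(a) == canonical(b)


print(same_file_name(os.path.join("pkg", "..", "mod.py"), "mod.py"))
```

Note that os.path.normcase only folds case on platforms with case-insensitive file names (Windows); elsewhere it returns the name unchanged, so "Mod.py" and "mod.py" stay different there.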

/F

--
http://mail.python.org/mailman/listinfo/python-list


[issue3547] Ctypes is confused by bitfields of varying integer types

2008-09-24 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:

Looks fine to me, except for the comment in the test suite.  Should

+# MS compilers do NOT combine c_short and c_int into
+# one field, gcc doesn't.

perhaps be

+# MS compilers do NOT combine c_short and c_int into
+# one field, gcc do.

?

Is using explicit tests for MSVC vs. GCC a good idea, btw?  What about 
other compilers?  Can the test be changed to accept either value?

--
nosy: +effbot

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3547
___



[issue3547] Ctypes is confused by bitfields of varying integer types

2008-09-24 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:



"Do" should be "does", right.  Not enough coffee today :)

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3547
___



Re: Why are broken iterators broken?

2008-09-22 Thread Fredrik Lundh

Cameron Simpson wrote:

you probably want the consumer thread to block when it catches up with  
the producer, rather than exit.


It sounds like he wants non-blocking behaviour in his consumer.


Roy gave an example, he didn't post a requirements specification.


A common example is try to gather a lot of stuff into a single packet,
but send a smaller packet promptly if there isn't much stuff.


that use case is better solved with a plain list object.  no need to 
make things harder than they are.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Regex Help

2008-09-22 Thread Fredrik Lundh

Support Desk wrote:

the code I am using is 


regex = r'<a href=["|\']([^"|\']+)["|\']'


that's way too fragile to work with real-life HTML (what if the link has 
a TITLE attribute, for example?  or contains whitespace after the HREF?)


you might want to consider using a real HTML parser for this task.


page_text = urllib.urlopen('http://somesite.com')
page_text = page_text.read()

links = re.findall(regex, text, re.IGNORECASE)


the RE looks fine for the subset of all valid A elements that it can 
handle, though.


got any examples of pages where you see that behaviour?
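
For reference, a sketch of the "real HTML parser" route using the standard library's html.parser module in Python 3 (in the Python 2 of this thread the module was called HTMLParser). The class name and the sample markup are made up:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every A element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased, in document order.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Invented sample: extra attribute before HREF, mixed case, single quotes.
page = '<p><A TITLE="home" HREF="http://example.com/">x</A> <a href=\'/rel\'>y</a></p>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)
```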

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Here's something interesting: sympy crashes in Python 2.6 (Windows)

2008-09-22 Thread Fredrik Lundh

Robert Kern wrote:


No warnings show up when importing the offending module:

Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sympy.mpmath import specfun
>>>

So what could be suppressing the warning?


a bug in Python 2.5, it seems:

> more f1.py
as = 1
as = 2
as = 3
> python f1.py
f1.py:1: Warning: 'as' will become a reserved keyword in Python 2.6
f1.py:2: Warning: 'as' will become a reserved keyword in Python 2.6
f1.py:3: Warning: 'as' will become a reserved keyword in Python 2.6

> more f2.py
as = 1
import os
as = 3
> python f2.py
f2.py:1: Warning: 'as' will become a reserved keyword in Python 2.6

A quick look in parsetok.c reveals that it sets a handling_import flag 
when it stumbles upon an import statement, a flag that's later used to 
suppress the warning message.  The bug is that the flag isn't reset 
until the parser sees an ENDMARKER token (end of file), instead of when 
it sees the next NEWLINE token.


(if someone wants to submit this to bugs.python.org, be my guest)

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Tkinter 3000 WCK Install Problem

2008-09-22 Thread Fredrik Lundh

garyr wrote:


I'm trying to install WCK. I downloaded and installed the Windows
executable for my Python version. It appeared to run OK. I then
downloaded the demo files but find that none run due to error:
ImportError: No module named _tk3draw.
I'm using ActivePython 2.3.5 on Windows XP Home.
What can I do to fix this problem?


the error means that the interpreter cannot find the _tk3draw.pyd file.

if you use the standard install location, it should be installed under

C:\Python23\lib\site-packages

you could try this:

>>> import FixTk
>>> import _tk3draw
>>> _tk3draw.__file__
'C:\\Python23\\lib\\site-packages\\_tk3draw.pyd'

if this also gives you an error, search for _tk3draw.pyd on the disk.

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Not fully OO ?

2008-09-21 Thread Fredrik Lundh

Martin v. Löwis wrote:


I don't think he meant that Python is wrong somehow, but that the OO
babble of what happens for 2+2 is wrong. The babble said that, when the
code is executed, an __add__ message is sent to the 2 object, with
another 2 object as the parameter. That statement is incorrect: no
message is sent at all, but the result is available even before the
program starts.


On the other hand, the inability to distinguish between "as if" and 
"hah, I've looked under the covers" isn't necessarily a good trait for a 
programmer.  If he bases his mental model on concrete implementation 
details of a production quality software product, he's bound to end up 
with a cargo-cultish understanding of fundamental issues.  If he uses it 
to win arguments, people will flip his bozo bit pretty quickly.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to kill threading.Thread instance?

2008-09-21 Thread Fredrik Lundh

dmitrey wrote:


BTW, it should be noticed that lots of threading module methods have
no docstrings (in my Python 2.5), for example _Thread__bootstrap,
_Thread__stop.


things named _Class__name are explicitly marked private by the 
implementation (using the __ prefix).


using them just because you can find them via dir is a really stupid 
idea.  (and, as noted in the comment section to the recipe, the stop 
method flags a thread as stopped, it doesn't stop it.)
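
A short sketch of where names of the form _Class__name come from. The class and attribute here are invented; the point is only the mangling rule:

```python
class Worker:
    def __init__(self):
        # Two leading underscores: the compiler rewrites this name to
        # _Worker__token inside the class body.
        self.__token = 42

    def token(self):
        return self.__token


w = Worker()
print(sorted(vars(w)))      # the mangled name is what dir() shows you
print(hasattr(w, "__token"))
```

The mangled name is reachable from outside, which is why dir() lists it, but the prefix is the author's way of saying "implementation detail, may change without notice".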


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to kill threading.Thread instance?

2008-09-21 Thread Fredrik Lundh

Diez B. Roggisch wrote:


I wonder why something like myThread.exit() or myThread.quit() or
threading.kill(myThread) can't be implemented?
Is something like that present in Python 3000?


Not that I'm aware of it (which doesn't mean too much though).

However I *am* aware of the bazillion discussions that have been held 
over this here - and the short answer is: it is a generally very bad 
idea to terminate threads hard, as it can cause all kinds of corruption.


the problem is that you have no idea what the thread is doing, so just 
killing it dead may make one big mess out of the application's 
internal state; see e.g. this post


  http://mail.python.org/pipermail/python-list/2006-August/400256.html

  That's wise ;-)  Stopping a thread asynchronously is in /general/ a
  dangerous thing to do, and for obvious reasons.  For example, perhaps
  the victim thread is running in a library routine at the time the
  asynch exception is raised, and getting forcibly ejected from the
  normal control flow leaves a library-internal mutex locked forever.
  Or perhaps a catch-all finally: clause in the library manages to
  release the mutex, but leaves the internals in an inconsistent state.

which links to a FAQ from Sun on this very topic:

http://java.sun.com/j2se/1.3/docs/guide/misc/threadPrimitiveDeprecation.html

(note that Java releases all mutexes when a thread is killed, but that's 
not much better, as the FAQ explains)


so as usual, the right thing to do is to do things in the right way.
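
One common reading of "the right way" is cooperative shutdown: the thread checks a flag at points where it is safe to stop. This is a Python 3 sketch under that assumption; the class and method names are hypothetical:

```python
import threading


class StoppableWorker(threading.Thread):
    """Worker that exits its loop when asked to, at a safe point."""

    def __init__(self):
        super().__init__()
        # Not named _stop: threading.Thread uses that name internally.
        self._stop_requested = threading.Event()
        self.iterations = 0

    def request_stop(self):
        self._stop_requested.set()

    def run(self):
        while not self._stop_requested.is_set():
            self.iterations += 1           # one unit of work
            self._stop_requested.wait(0.01)  # sleep, but wake up early on stop


worker = StoppableWorker()
worker.start()
worker.request_stop()
worker.join(timeout=5)
print(worker.is_alive())
```

Because the worker only stops between units of work, it never leaves a lock held or a data structure half-updated.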

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Override the '+' symbol

2008-09-21 Thread Fredrik Lundh

Mr.SpOOn wrote:


how can I override the '+' symbol (and other math symbols) so that it
can have a new behavior when applied to some objects?


see "Emulating Numeric Types" in the language reference:

http://www.python.org/doc/ref/numeric-types.html
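
A minimal sketch of what that section describes, using an invented Vector class (Python 3 syntax):

```python
class Vector:
    """Tiny 2-D vector, used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Called for "self + other"; NotImplemented lets Python try the
        # other operand or raise TypeError.
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return "Vector(%r, %r)" % (self.x, self.y)


total = Vector(1, 2) + Vector(3, 4)
print(total)
```

The other operators work the same way (__sub__, __mul__, and so on), with __radd__ and friends covering the case where your object is the right-hand operand.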

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Why are broken iterators broken?

2008-09-21 Thread Fredrik Lundh

Steven D'Aprano wrote:

According to the Python docs, once an iterator raises StopIteration, it 
should continue to raise StopIteration forever. Iterators that fail to 
behave in this fashion are deemed to be broken:


http://docs.python.org/lib/typeiter.html

I don't understand the reasoning behind this. As I understand it, an 
iterator is something like a stream. There's no constraint that once a 
stream is empty it must remain empty forever.


it's a design guideline, not an absolute rule.

but I disagree that an iterator is something like a stream.  it's 
rather something like a pointer or an index, that is, an object that 
helps you iterate over all members in a collection.
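
As an illustration of the guideline, this sketch shows how CPython's built-in list iterator behaves once it has been exhausted (other iterator types may differ, which is what the docs call "broken"):

```python
items = [1, 2]
it = iter(items)

seen = list(it)      # runs the iterator to exhaustion
items.append(3)      # the underlying collection grows afterwards

# The exhausted iterator keeps signalling StopIteration; next() with a
# default turns that into a value we can inspect.
after = next(it, "still exhausted")

print(seen, after)
```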


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Why are broken iterators broken?

2008-09-21 Thread Fredrik Lundh

Roy Smith wrote:

There are plausible examples of collections which grow while you're 
iterating over them.  I'm thinking specifically of a queue in a 
multi-threaded application.  One thread pushes work onto the back of the 
queue while another pops from the front.  The queue could certainly go 
empty at times.  But, maybe a Python iterator is just the wrong way to 
model such behavior.


you probably want the consumer thread to block when it catches up with 
the producer, rather than exit.


(that's the default behaviour of Python's Queue object, btw)
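
A sketch of that blocking behaviour in Python 3, where the module is spelled queue (it was Queue in the Python 2 of this thread). The sentinel object and the doubling "work" are made up for the example:

```python
import queue
import threading

work = queue.Queue()
results = []
DONE = object()  # hypothetical sentinel that tells the consumer to finish


def consumer():
    while True:
        item = work.get()      # blocks while the queue is empty
        if item is DONE:
            break
        results.append(item * 2)


thread = threading.Thread(target=consumer)
thread.start()

for n in range(5):
    work.put(n)
work.put(DONE)
thread.join()

print(results)
```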

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Newick parser

2008-09-21 Thread Fredrik Lundh

aditya shukla wrote:

Hello folks , i have a .nwk file.I want to parser the tree from that 
file.I found this python parser for newick trees.

http://www.daimi.au.dk/~mailund/newick.html

But i don't understand the usage properly.What i wanna do is if i have a 
file in the location c:\\files\\file1.nwk , then i wanna parse the trees 
in that file.


judging from the docs, you should be able to do e.g.

  from newick import parse_tree

  file = open("c:\\files\\file1.nwk")
  text = file.read()

  print parse_tree(text)

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: report a BUG of package setuptools-0.6c8.

2008-09-20 Thread Fredrik Lundh

为爱而生 wrote:

  File "/usr/lib/python2.5/site-packages/setuptools/command/sdist.py", line 98, in entries_finder
    log.warn("unrecognized .svn/entries format in %s", dirname)
NameError: global name 'log' is not defined

global name 'log' is not defined to the line 98!!!


please report bugs here:

http://bugs.python.org/

/F

--
http://mail.python.org/mailman/listinfo/python-list

Re: The Python computer language

2008-09-20 Thread Fredrik Lundh

ROSEEE wrote:


http://pthoncomputerlanguage.blogspot.com


report here:

http://tinyurl.com/blogspot-spam

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Not fully OO ?

2008-09-20 Thread Fredrik Lundh

Kay Schluehr wrote:


Answer: if you want to define an entity it has to be defined inside a
class. If you want to access an entity you have to use the dot
operator. Therefore Java is OO but Python is not.


you're satirising the quoted author's cargo-cultish view of object 
orientation, right?


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Not fully OO ?

2008-09-20 Thread Fredrik Lundh

Colin J. Williams wrote:


foreach: for x in array: statements


Loops over the array given by array. On each iteration, the value of the 
current element is assigned to x and the internal array pointer is 
advanced by one. 


This could be a useful addition to Python.


for-in could be a useful addition to Python?  looks like Guido's used 
his time machine again, then, since it's been around since the pre-1.0 days:


http://www.python.org/doc/ref/for.html

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to make a reverse for loop in python?

2008-09-20 Thread Fredrik Lundh

Alex Snast wrote:


I'm new to python and i can't figure out how to write a reverse for
loop in python

e.g. the python equivalent to the c++ loop

for (i = 10; i >= 0; --i)


use range with a negative step:

for i in range(10-1, -1, -1):
    ...

or just reverse the range:

for i in reversed(range(10)):
    ...

(the latter is mentioned in the tutorial, and is the second hit if you 
google for python reverse for loop)
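
Both spellings side by side, adjusted so that they count from 10 down to 0 inclusive like the C++ loop in the question (a quick Python 3 check; the variable names are arbitrary):

```python
# Negative step: start at 10, stop before -1.
with_step = list(range(10, -1, -1))

# Reversed range: range(11) covers 0..10, reversed() walks it backwards.
with_reversed = list(reversed(range(11)))

print(with_step)
```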


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to make a reverse for loop in python?

2008-09-20 Thread Fredrik Lundh

Fredrik Lundh wrote:


e.g. the python equivalent to the c++ loop

for (i = 10; i >= 0; --i)


use range with a negative step:

for i in range(10-1, -1, -1):
    ...

or just reverse the range:

for i in reversed(range(10)):
    ...


(and to include the 10 in the range, add one to the 10 above)

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: NEW GENERATED DLL ERROR FOUND WITHIN f2PY.py

2008-09-20 Thread Fredrik Lundh

Blubaugh, David A. wrote:

(no need to shout when filling in the subject line, thanks)


I have now been able to generate a .pyd file from a FORTRAN

 file that I am trying to interface with python.  I was able
 to execute this with an additional insight into how f2py
 operates.
 
ImportError: DLL load with error code 193


Error code 193 is ERROR_BAD_EXE_FORMAT, which means that the thing 
you're trying to import is not a proper DLL.


> copy LICENSE.txt LICENSE.pyd
        1 file(s) copied.

> python
>>> import LICENSE
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: DLL load failed with error code 193

In general, the tools for building binary extensions for Python assume 
that you have at least some basic knowledge about how to build binaries 
using a compiled language.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to Determine Name of the Day in the Week

2008-09-18 Thread Fredrik Lundh

Keo Sophon wrote:

I've tried calendar.month_name[0], it displays empty string, while 
calendar.month_name[1] is January? Why does calendar.month_name's 
index not start with index 0 as calendar.day_name?


the lists are set up to match the values used by the time and datetime 
modules; see e.g.


http://docs.python.org/lib/module-time.html
http://docs.python.org/lib/datetime-date.html
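
A short sketch of how the two lists line up with datetime values (Python 3; the date is just the date of this post, and the printed names depend on the current locale):

```python
import calendar
import datetime

d = datetime.date(2008, 9, 18)

# date.month runs from 1 to 12, so month_name keeps an empty slot 0.
month = calendar.month_name[d.month]

# date.weekday() runs from 0 (Monday) to 6 (Sunday), so day_name starts at 0.
day = calendar.day_name[d.weekday()]

print(repr(calendar.month_name[0]), month, day)
```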

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: PEP proposal optparse

2008-09-18 Thread Fredrik Lundh

James Mills wrote:


As you can see (as long as you're
reading this in fixed-width fonts)
it _is_ very readable.


given that it only relies on indentation from the left margin, it's no 
less readable in a proportional font (unless you're using a font with 
variable-width spaces, that is ;-).


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Twisted vs Python Sockets

2008-09-18 Thread Fredrik Lundh

James Matthews wrote:

I am wondering what are the major points of twisted over regular python 
sockets. I am looking to write a TCP server and want to know the pros 
and cons of using one over the other.


Twisted is a communication framework with lots of ready-made components:

   http://twistedmatrix.com/trac/wiki/TwistedAdvantage

Regular sockets are, well, regular sockets.  No more, no less.

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Extracting hte font name from a TrueType font file

2008-09-18 Thread Fredrik Lundh

Steve Holden wrote:


Does anyone have a Python recipe for this?


>>> from PIL import ImageFont
>>> f = ImageFont.truetype("/windows/fonts/verdanai.ttf", 1)
>>> f.font.family
'Verdana'
>>> f.font.style
'Italic'

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: how many objects are loaded for hello world?

2008-09-17 Thread Fredrik Lundh

belred wrote:


i just read this blog about how many objects (types) are loaded for a
hello world program in C#.

http://blogs.msdn.com/abhinaba/archive/2008/09/15/how-many-types-are-loaded-for-hello-world.aspx

how can you find out how many are loaded for a python program:  print
'hello'


types and objects are different things, though.  to get an idea of how 
much stuff Python loads on startup, do "python -vv script.py" or add


import sys
print len(sys.modules), "modules"
print sys.modules.keys()

to the end of the script.

to get an idea of how many objects and types that are created at that 
point, add


import gc
print len(gc.get_objects()), "objects"
print len(set(map(type, gc.get_objects()))), "types"

this gives me 35 modules (including the gc module), 3219 objects and 26 
distinct types -- but the above will miss things, so the true numbers 
are a bit higher.
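
The same measurements as one Python 3 snippet, for anyone repeating this on a current interpreter. The numbers will differ from the ones quoted above, and gc.get_objects() still only reports objects tracked by the garbage collector:

```python
import gc
import sys

module_count = len(sys.modules)

tracked = gc.get_objects()          # container objects known to the GC
object_count = len(tracked)
type_count = len(set(map(type, tracked)))

print(module_count, "modules")
print(object_count, "objects")
print(type_count, "types")
```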


(you might be able to use a debug build to get more detailed information)

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: File Reading related query

2008-09-17 Thread Fredrik Lundh

Usman Ajmal wrote:

Is there any function for reading a file while ignoring *\n* occuring in 
the file?


can you be a bit more precise?  are we talking about text files or 
binary files?  how do you want to treat any newlines that actually 
appear in the file?


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: recursive using the os.walk(path) from the os module

2008-09-17 Thread Fredrik Lundh

A. Joseph wrote:


I want to search through a directory and re-arrange all the files into e.g

All .doc files go into MS WORD folder, all .pdf files goes into PDF Folder.

I`m thinking of doing something with the os.walk(path) method from os 
module, I need some ideal how the algorithm should look like, maybe 
recursive ..any deal?


os.walk traverses the directory tree, so I'm not sure why you think that 
your program needs to use recursion?  wouldn't a plain loop work?


import os, shutil

for dirpath, dirnames, filenames in os.walk(directory):
    for name in filenames:
        source = os.path.join(dirpath, name)
        ... check extension and determine target directory ...
        destination = os.path.join(targetdir, name)
        shutil.move(source, destination)

tweak as necessary.
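
One possible way to fill in the elided step, as a Python 3 sketch. The extension-to-folder mapping, the function name and the choice to keep the target outside the tree being walked are all assumptions; files with the same name in different directories would overwrite each other here.

```python
import os
import shutil

# Hypothetical mapping from file extension to target folder name.
FOLDERS = {".doc": "MS WORD", ".pdf": "PDF"}


def sort_files(source_root, target_root, folders=FOLDERS):
    """Move known file types from source_root into per-type folders."""
    moved = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            folder = folders.get(extension)
            if folder is None:
                continue  # leave unknown file types where they are
            targetdir = os.path.join(target_root, folder)
            os.makedirs(targetdir, exist_ok=True)
            destination = os.path.join(targetdir, name)
            shutil.move(os.path.join(dirpath, name), destination)
            moved.append(destination)
    return moved


# Example call (paths are placeholders):
# sort_files("c:/documents/inbox", "c:/documents/sorted")
```

Keeping target_root outside source_root avoids walking into the folders the files were just moved to.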

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: python regex character group matches

2008-09-17 Thread Fredrik Lundh

christopher taylor wrote:


my issue, is that the pattern i used was returning:

[ '\\uAD0X', '\\u1BF3', ... ]

when i expected:

[ '\\uAD0X\\u1BF3', ]

the code looks something like this:

pat = re.compile("(\\\u[0-9A-F]{4})+", re.UNICODE|re.LOCALE)
#print pat.findall(txt_line)
results = pat.finditer(txt_line)

i ran the pattern through a couple of my colleagues and they were all
in agreement that my pattern should have matched correctly.


First, [0-9A-F] cannot match an X.  Assuming that's a typo, your next 
problem is a precedence issue: (X)+ means one or more (X), not one or 
more X inside parens.  In other words, that pattern matches one or more 
X's and captures the last one.


Assuming that you want to find runs of \u escapes, simply use 
non-capturing parentheses:


   pat = re.compile(u"(?:\\\u[0-9A-F]{4})")

and use group(0) instead of group(1) to get the match.

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: python regex character group matches

2008-09-17 Thread Fredrik Lundh

Steven D'Aprano wrote:


Assuming that you want to find runs of \u escapes, simply use
non-capturing parentheses:

pat = re.compile(u"(?:\\\u[0-9A-F]{4})")


Doesn't work for me:


pat = re.compile(u"(?:\\\u[0-9A-F]{4})")


it helps if you cut and paste the right line...  here's a better version:

pat = re.compile(r"(?:\\u[0-9A-F]{4})+")
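
A small Python 3 sketch of the precedence issue, comparing a repeated capturing group with the non-capturing version. The sample text is invented (with a valid hex digit in place of the X from the original post):

```python
import re

# Raw string: the backslashes and "u" are literal characters here.
text = r"abc \uAD0C\u1BF3 def \u0041"

capturing = re.compile(r"(\\u[0-9A-F]{4})+")
non_capturing = re.compile(r"(?:\\u[0-9A-F]{4})+")

# With a capturing group, findall() returns the group, and a repeated
# group only remembers its last repetition.
last_only = capturing.findall(text)

# Without a capturing group, findall() returns the whole match, i.e. the run.
whole_runs = non_capturing.findall(text)

print(last_only)
print(whole_runs)
```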

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: translating ascii to binary

2008-09-17 Thread Fredrik Lundh

Lie wrote:


Any advice about this matter would be very appreciated.
Thanks in advance.


It'd be easier to make a one-char version of ascii2bin then make the
string version based on the one-char version.


And it'd be a lot easier to read your posts if you trimmed away at least 
some of the original message before posting.  If you cannot do that for 
some technical reason, I recommend using top-posting instead.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: python-mode problem, doesnt load whole module?

2008-09-17 Thread Fredrik Lundh

cnb wrote:


a = parsing.unserialize(C:/users/saftarn/desktop/twok.txt)

Traceback (most recent call last):



  File "C:\Python25\lib\pickle.py", line 1126, in find_class
    klass = getattr(mod, name)


when reporting a traceback, please include the error message that 
follows after the stack trace.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How do I add permanently to Pythons sys.path?

2008-09-16 Thread Fredrik Lundh

cnb wrote:


no I can't...


Python has supported packages since version 1.4 or so, so I'm pretty 
sure you can.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Why some blog entries at MSN Space support rss feed while others don't?

2008-09-14 Thread Fredrik Lundh

liuyuprc wrote:


Not sure if this is the place this question should even be raised


it isn't.

--
http://mail.python.org/mailman/listinfo/python-list


Re: Stuck connection in Python 3.0b2 http.server

2008-09-14 Thread Fredrik Lundh

rs387 wrote:


I've encountered a weird issue when migrating a web server to Python 3
- the browser would wait forever without showing a page, displaying
Transferring data in the status bar. I tracked it down to a
reference cycle in my BaseHTTPRequestHandler descendant - one of the
attributes stored a dict of methods. Removing the cycle made the
problem go away.

In Python 2.5.2 the code works fine either way.

Here's a minimal example which runs in both 2.5 and 3.0 - to see stuck
connections run as-is in 3.0 and navigate to http://localhost:8123; to
fix this comment out self.dummy = self (alternatively reset
self.dummy = None at the end of the __init__ method).

Am I doing it wrong, or is this a bug?


it's weird enough to deserve an issue over at http://bugs.python.org/, 
at least.


it'd probably be a good idea to test this on 2.6rc as well.

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Is there any nice way to unpack a list of unknown size??

2008-09-14 Thread Fredrik Lundh

srinivasan srinivas wrote:


I want to do something like below:

1. first, second, third, *rest = foo

 2. for (a,b,c,*rest) in list_of_lists:


update to Python 3.0 (as others have pointed out), or just do

first, second, third = foo[:3]
rest = foo[3:]

for item in list_of_lists:
    a, b, c = item[:3]
    rest = item[3:]
    ...

and move on to more interesting parts of your program.
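
For comparison, the slicing approach next to the Python 3 starred form, using an invented sample list. Both raise ValueError if the list has fewer than three items:

```python
foo = [1, 2, 3, 4, 5]

# Works in any Python version: slice off the head, keep the tail.
first, second, third = foo[:3]
rest = foo[3:]

# Python 3 only: starred target collects whatever is left into a list.
a, b, c, *tail = foo

print(first, second, third, rest)
```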

/F

--
http://mail.python.org/mailman/listinfo/python-list


[issue3865] explain that profilers should be used for profiling, not benchmarking

2008-09-14 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:

(the reason this is extra bad for C modules is that the profilers
introduce overhead for Python code, but not for C-level functions.  For
example, using the standard profiler to benchmark parser performance for
xml.etree.ElementTree vs. xml.etree.cElementTree will make ET appear to
be about 10 times slower than it actually is.)

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3865
___



[issue3865] explain that profilers should be used for profiling, not benchmarking

2008-09-14 Thread Fredrik Lundh

New submission from Fredrik Lundh [EMAIL PROTECTED]:

You often see people using the profiler for benchmarking instead of
profiling.  I suggest adding a note that explains that the profiler
modules are designed to provide an execution profile for a given
program, not for benchmarking different libraries or, even worse,
benchmarking Python code against C libraries.  Point people to the
timeit module if they want reasonably accurate results.
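
A sketch of what such a pointer could look like in practice (Python 3). The two functions being compared are made up, and the repeat and number values are arbitrary:

```python
import timeit


def build_with_join(n=200):
    return "".join([str(i) for i in range(n)])


def build_with_concat(n=200):
    s = ""
    for i in range(n):
        s += str(i)
    return s


timings = {}
for func in (build_with_join, build_with_concat):
    # Take the best of several runs; the minimum is the least disturbed
    # by other activity on the machine.
    timings[func.__name__] = min(timeit.repeat(func, number=200, repeat=3))

for name, seconds in sorted(timings.items()):
    print("%-20s %.6f s per 200 calls" % (name, seconds))
```

Unlike the profiler, timeit adds the same small, constant overhead to both candidates, so the comparison is not skewed towards C-level code.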

(and yes, it would be nice if the copyright text on the page

http://docs.python.org/dev/library/profile.html

was moved to the bottom of the page.  If necessary, add something like
"This description of the profile module is Copyright © 1994, by InfoSeek
Corporation, all rights reserved.  Full copyright message below" at the
top.)

--
assignee: georg.brandl
components: Documentation
messages: 73213
nosy: effbot, georg.brandl
severity: normal
status: open
title: explain that profilers should be used for profiling, not benchmarking
type: feature request
versions: Python 2.6

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3865
___



Re: XML RPC Problem....

2008-09-13 Thread Fredrik Lundh

Usman Ajmal wrote:

Please explain the arguments of send_request. What exactly are the 
connection, handler and request_body? It will be really helpful if you 
give an example of how to call send_request.


you don't call send_request.  you should pass the SecureTransport 
instance as an argument to the ServerProxy, which will then use it to 
talk to the server.  see the custom transport example in the library 
reference that I pointed you to.


  http://www.python.org/doc/lib/xmlrpc-client-example.html
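
A rough sketch of the wiring described above, written for Python 3, where xmlrpclib is named xmlrpc.client. The poster's SecureTransport class is not shown in this thread, so HeaderTransport and the header it adds are hypothetical stand-ins; the point is only that the transport is handed to ServerProxy and never called directly.

```python
import xmlrpc.client


class HeaderTransport(xmlrpc.client.Transport):
    """Hypothetical stand-in for the poster's SecureTransport."""

    def __init__(self, custom_headers):
        super().__init__()
        self.custom_headers = list(custom_headers)

    def send_headers(self, connection, headers):
        # Called by the transport for every request; append our own
        # headers to the ones the library already sends.
        super().send_headers(connection, list(headers) + self.custom_headers)


transport = HeaderTransport([("X-Example-Token", "secret")])

# Constructing the proxy does not open a connection; the transport is
# only used once a remote method is called.
server = xmlrpc.client.ServerProxy("http://localhost:8000/", transport=transport)

# print(server.s())   # would call the remote method s() through the transport
```

The proxy calls the transport's methods itself, which is why user code never needs to invoke send_request.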

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: book example confusion

2008-09-13 Thread Fredrik Lundh

byron wrote:


Being that each function is an object, a name assignment to
(tmp1,tmp2) doesn't actually evaluate or run the function itself until
the name is called..


the above would be true if the code had been

   tmp1, tmp2 = f1, f2

but it isn't.  look again.
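
The book's code is not quoted in this message, so the following is only a sketch of the distinction, assuming the original used call syntax on the right-hand side. The functions f1 and f2 here are placeholders that record when they run.

```python
calls = []


def f1():
    calls.append("f1")
    return 1


def f2():
    calls.append("f2")
    return 2


# Binding the function objects to new names runs nothing.
tmp1, tmp2 = f1, f2
calls_after_binding = list(calls)

# With call parentheses, both functions run right away, left to right,
# and the names are bound to the return values instead.
tmp1, tmp2 = f1(), f2()
calls_after_calling = list(calls)
```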

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: XML RPC Problem....

2008-09-13 Thread Fredrik Lundh

Usman Ajmal wrote:

Where exactly should i call ServerProxy? Following is the code from my 
client.py


ServerProxy is the preferred name.  Server is an old alias for the same 
class.



t = SecureTransport()
   
t.set_authorization(ustring, text_ucert)

server = xmlrpclib.Server('http://localhost:8000/',transport=t)
print server.s()


that code looks correct.  so what's the problem?

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: XML RPC Problem....

2008-09-13 Thread Fredrik Lundh

Usman Ajmal wrote:

Problem is that when i start client (while the server is already 
running), i get an error i.e.

Error 500 Internal Server Error


that's a server error, not a client error.  check the server logs (e.g. 
error.log or similar).


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to run PyOS_InputHook from python code (i.e. yield to event loops)

2008-09-13 Thread Fredrik Lundh

ville wrote:


That's tk-specific, right? I'm looking for a snippet that

- Would not be tied to tk


upstream, you said:

   My actual use case is to keep a tkinter application responsive

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Checking the boolean value of a collection

2008-09-13 Thread Fredrik Lundh

Marco Bizzarri wrote:


class FolderInUse:

    def true_for(self, archivefolder):
        return any([instance.forbid_to_close(archivefolder) for instance in
            self.core.active_outgoing_registration_instances()])

Is this any better? The true_for name does not satisfy me a lot...


well, true_for is indeed pretty inscrutable, but I'm not sure that 
would be the first thing I'd complain about in that verbose mess...


(when you pick method names, keep in mind that the reader will see the 
context, the instance, and the arguments at the same time as they see 
the name.  there's no need to use complete sentences; pick short, 
descriptive names instead.)


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: code style and readability [was: Re: Checking the boolean value of a collection]

2008-09-13 Thread Fredrik Lundh

Larry Bates wrote:

I also have a personal dislike for early returns because I've found it 
makes it harder to insert execution trace logging into the code.


in a language that makes it trivial to wrap arbitrary callables in 
tracing wrappers?
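
One way such a wrapper could look, as a sketch in Python 3; the names traced and classify, and the log format, are made up for illustration. The try/finally means the exit message is logged no matter which return statement the wrapped function takes.

```python
import functools


def traced(func, log=print):
    # Wrap any callable so that entry and exit are logged.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log("enter %s%r" % (func.__name__, args))
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on every exit path, including early returns
            # and exceptions.
            log("leave %s" % func.__name__)
    return wrapper


def classify(n):
    # A function with several early returns.
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


messages = []
classify = traced(classify, log=messages.append)
result = classify(-5)
```

The function body itself stays untouched, so early returns cost nothing in terms of logging.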


/F

--
http://mail.python.org/mailman/listinfo/python-list


[issue3825] Major reworking of Python 2.5.2 re module

2008-09-13 Thread Fredrik Lundh

Fredrik Lundh [EMAIL PROTECTED] added the comment:

A bit more information on the changes to the core engine that are
responsible for the 2x speedup (on what?) would be nice to have, I think
(especially since you seem to have removed the KMP prefix scanner).

(Isn't there a RE benchmark suite somewhere under tests?)

--
nosy: +effbot

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3825
___



Re: Checking the boolean value of a collection

2008-09-12 Thread Fredrik Lundh

Marco Bizzarri wrote:


Can you clarify where I can find "any"? It seems to me I'm
unable to find it...

it's a 2.5 addition.  to use this in a future-compatible way in 2.3, 
you can add


    try:
        any
    except NameError:
        def any(iterable):
            for element in iterable:
                if element:
                    return True
            return False

to the top of the file (or to some suitable support library).

2.5 also provides an all function, which can be emulated as:

    try:
        all
    except NameError:
        def all(iterable):
            for element in iterable:
                if not element:
                    return False
            return True
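
A short usage sketch, run against the builtins of a current Python 3; the emulations above follow the same rules. The helper noisy is made up here to show that any stops consuming its input at the first true element.

```python
def noisy(values, seen):
    # Generator that records each value it hands out.
    for value in values:
        seen.append(value)
        yield value


seen = []
found = any(noisy([0, "", 3, 4], seen))   # stops at the first true element

empty_any = any([])   # no true element was found
empty_all = all([])   # no false element was found
```

After this runs, seen holds only the elements up to and including the first true one; the 4 is never looked at.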

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Matching horizontal white space

2008-09-12 Thread Fredrik Lundh

[EMAIL PROTECTED] wrote:


multipleSpaces = re.compile(u'\\h+')

importantTextString = '\n  \n  \n \t\t  '
importantTextString = multipleSpaces.sub("M", importantTextString)


what's \\h supposed to mean?


I would have expected consecutive spaces and tabs to be replaced by "M"
but nothing is being replaced.


if you know what you want to replace, be explicit:

    >>> importantTextString = '\n  \n  \n \t\t  '
    >>> re.compile("[\t ]+").sub("M", importantTextString)
    '\nM\nM\nM'
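
The same idea as a small reusable sketch for Python 3. Python's re module has no \h class, so the character class spells out space and tab explicitly. The names HORIZONTAL_WS and mark_horizontal_space are made up for illustration.

```python
import re

# Spaces and tabs only; newlines are deliberately left out of the class.
# The raw string passes \t through to the regex engine unchanged.
HORIZONTAL_WS = re.compile(r"[ \t]+")


def mark_horizontal_space(text, marker="M"):
    # Replace each run of spaces/tabs with a single marker.
    return HORIZONTAL_WS.sub(marker, text)


sample = "\n  \n  \n \t\t  "
marked = mark_horizontal_space(sample)
```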

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Checking the boolean value of a collection

2008-09-12 Thread Fredrik Lundh

Marco Bizzarri wrote:


I would like to make  this available to the whole project. I suspect I
could put it in the package __init__.py... in that way, the
__builtins__ namespace should have it... am I right?


the __init__ module for package foo defines the contents of the foo 
module; it doesn't add anything to the builtin namespace.


Diez made a typo in his post, btw.  To add your own builtins, you should 
add them to the __builtin__ module (no plural s):


    import __builtin__

    try:
        any
    except NameError:
        def any(iterable):
            for element in iterable:
                if element:
                    return True
            return False
        __builtin__.any = any

    try:
        all
    except NameError:
        def all(iterable):
            for element in iterable:
                if not element:
                    return False
            return True
        __builtin__.all = all

The __builtins__ object is an implementation detail, and shouldn't be 
accessed directly.  And I hope I don't need to point out that adding 
custom builtins willy-nilly is a bad idea...


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: setattr in class

2008-09-12 Thread Fredrik Lundh

Bojan Mihelac wrote:


Hi all - when trying to set some dynamic attributes in class, for
example:

class A:
    for lang in ['1', '2']:
        exec('title_%s = lang' % lang) #this work but is ugly
        # setattr(A, "title_%s" % lang, lang) # this wont work

setattr(A, "title_1", "x") # this work when outside class

print A.title_1
print A.title_2

I guess A class not yet exists in line 4. Is it possible to achieve
adding dynamic attributes without using exec?


Move the for-in loop out of the class definition:

    >>> class A:
    ...     pass
    ...
    >>> for lang in ['1', '2']:
    ...     setattr(A, "title_%s" % lang, lang)
    ...
    >>> a = A()
    >>> a.title_1
    '1'

A truly dynamic solution (using __getattr__ and modification on access) 
would probably give you a more pythonic solution.
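
One possible reading of "using __getattr__ and modification on access", sketched for Python 3; this is an illustration, not necessarily the design the reply had in mind. The class name Titles and the languages tuple are assumptions.

```python
class Titles(object):
    languages = ("1", "2")

    def __getattr__(self, name):
        # Only called when normal attribute lookup has failed.
        prefix = "title_"
        if name.startswith(prefix):
            lang = name[len(prefix):]
            if lang in self.languages:
                # Store the value on the instance, so later lookups
                # find it directly and never reach __getattr__ again.
                setattr(self, name, lang)
                return lang
        raise AttributeError(name)


a = Titles()
first = a.title_1
```

Attributes are created only when something asks for them, and unknown names still raise AttributeError as usual.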


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Checking the boolean value of a collection

2008-09-12 Thread Fredrik Lundh

D'Arcy J.M. Cain wrote:


Is there ever any advantage to having something as a builtin rather
than as a regular user method?  What difference does it make to the
running script?  I can see that adding bar from module foo to
__builtins__ means that you can use bar() instead of foo.bar().
Is that the only benefit?


basically, yes.  in this case, it does make some sense to patch any/all 
into __builtin__, since they are builtins in a later version.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: lacking follow-through

2008-09-12 Thread Fredrik Lundh

Steve Holden wrote:


The defence rests.


can you please stop quoting that guy, so we don't have to killfile you 
as well...


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Which version

2008-09-12 Thread Fredrik Lundh

Don wrote:


I'm reasonably experienced in other languages and have just decided to
get my feet wet with Python. But I'm using FC6, which has v2.4.4 installed;
is this good enough to start out with, or am I likely to encounter bugs that
have been fixed in later versions?


Python 2.4 is definitely good enough to start with.

The bugs you'll find in released versions are usually pretty obscure; 
I've been using Python since release 1.1 or so, and I cannot remember 
ever having to upgrade due to a critical bug in the version I was using.


/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: Which version

2008-09-12 Thread Fredrik Lundh

Eric Wertman wrote:


The subprocess module is one though


footnote: subprocess works on older versions too, and can be trivially 
installed along with your application under Python 2.2 and 2.3.


binary builds for Windows are available here:

  http://effbot.org/downloads/#subprocess

/F

--
http://mail.python.org/mailman/listinfo/python-list


Re: I want to use a C++ library from Python

2008-09-11 Thread Fredrik Lundh

Anders Eriksson wrote:


I have looked (very briefly) at the three frameworks you mention, but they
all need the C++ source code?


No, they need header files and an import library to be able to compile 
the bindings and link them to your DLL.


Do you know enough about C/C++ build issues to be able to compile a C++ 
program against the given library?  If you do, fixing the rest should be 
straightforward, since the binding is just another C++ program designed 
to be imported by Python.


/F

--
http://mail.python.org/mailman/listinfo/python-list

