Re: [Python-Dev] Accepting PEP 3154 for 3.4?

2013-11-20 Thread Alexandre Vassalotti
On Tue, Nov 19, 2013 at 2:09 PM, Antoine Pitrou solip...@pitrou.net wrote:

 Well, I don't think it's a big deal to add a FRAME opcode if it doesn't
 change the current framing logic. I'd like to defer to Alexandre on this
 one, anyway.


Looking at the different options available to us:

1A. Mandatory framing
  (+) Allows the internal buffering layer of the Unpickler to rely
  on the presence of framing to simplify its implementation.
  (-) Forces all implementations of pickle to include support for
  framing if they want to use the new protocol.
  (-) Cannot be removed from future versions of the Unpickler
  without breaking protocols that mandate framing.
1B. Optional framing
  (+) Could allow optimizations to disable framing if beneficial
  (e.g., when pickling to and unpickling from a string).

2A. With explicit FRAME opcode
  (+) Makes optional framing simpler to implement.
  (+) Makes variable-length encoding of the frame size simpler
  to implement.
  (+) Makes framing visible to pickletools.
  (-) Adds an extra byte of overhead to each frame.
2B. No opcode

3A. With fixed 8-byte headers
 (+) Is simple to implement
 (-) Adds overhead to small pickles.
3B. With variable-length headers
 (-) Requires Pickler implementations to do extra data copies when
 pickling to strings.

4A. Framing baked into the pickle protocol
 (+) Enables faster implementations
4B. Framing through a specialized I/O buffering layer
 (+) Could be reused by other modules

I may change my mind as I work on the implementation, but at least for now,
I think the combination of 1B, 2A, 3A, 4A will be a reasonable compromise
here.
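A minimal sketch of what the 1B+2A+3A combination could look like: optional framing, an explicit FRAME opcode, and a fixed 8-byte little-endian length header. The opcode byte value and helper names below are made up for illustration; this is not a definitive protocol definition.

```python
import io
import struct

FRAME = b'\x95'  # hypothetical opcode byte, chosen only for this sketch

def write_frames(stream, chunks, frame_size=64 * 1024):
    # Accumulate opcode chunks and emit them as frames: the FRAME
    # opcode, a fixed 8-byte little-endian length header, then the payload.
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
        if len(buf) >= frame_size:
            stream.write(FRAME + struct.pack('<Q', len(buf)) + bytes(buf))
            buf.clear()
    if buf:
        stream.write(FRAME + struct.pack('<Q', len(buf)) + bytes(buf))

def read_frames(stream):
    # Framing is optional (1B): if the first byte is not FRAME, treat
    # the whole stream as unframed data.
    while True:
        first = stream.read(1)
        if not first:
            return
        if first != FRAME:
            yield first + stream.read()
            return
        (length,) = struct.unpack('<Q', stream.read(8))
        yield stream.read(length)
```

The explicit opcode is what makes the fallback in read_frames possible: a reader can distinguish framed from unframed data by looking at a single byte.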
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Sharing docstrings between the Python and C implementations of a module

2013-04-15 Thread Alexandre Vassalotti
On Mon, Apr 15, 2013 at 12:56 AM, David Lam david.k.l...@gmail.com wrote:

 I tried to find an example in the source which addressed this, but
 found that the docstrings in similar cases to be largely duplicated.


I find this annoying too. It would be nice to have a common way to share
docstrings between C and Python implementations of the same interface. One
roadblock, though, is that functions in C modules often document their
parameters in their docstrings.

>>> import _json
>>> help(_json.scanstring)
scanstring(...)
    scanstring(basestring, end, encoding, strict=True) -> (str, end)

    Scan the string s for a JSON string. End is the index of the
    character in s after the quote that started the JSON string.
    [...]

Argument Clinic will hopefully lift this roadblock soon. Perhaps we could
add to the Clinic DSL a way to fetch the docstring directly from the
Python implementation. As an extra, once we have this in place, it would
be easy to add a verification step that checks that both implementations
provide similar interfaces.


Re: [Python-Dev] Usage of += on strings in loops in stdlib

2013-02-12 Thread Alexandre Vassalotti
On Tue, Feb 12, 2013 at 1:44 PM, Antoine Pitrou solip...@pitrou.net wrote:

 It's idiomatic because strings are immutable (by design, not because of
 an optimization detail) and therefore concatenation *has* to imply
 building a new string from scratch.


Not necessarily. It is entirely possible to implement strings such that
they are immutable and concatenation takes O(1): ropes are the canonical
example of this.
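To make the rope idea concrete, here is a toy sketch (class and attribute names are mine): concatenation just allocates a two-child node and never copies the underlying text, so it is O(1); the cost is paid later, when the rope is flattened.

```python
class Rope:
    """Toy immutable string with O(1) concatenation: a concat node
    records its two children instead of copying their contents."""

    def __init__(self, left, right=None):
        # Leaf node: left is a plain str, right is None.
        self.left, self.right = left, right
        self.length = (len(left) if right is None
                       else left.length + right.length)

    def __add__(self, other):
        # O(1): build a concat node, copy nothing.
        return Rope(self, other)

    def __len__(self):
        return self.length

    def __str__(self):
        # Flattening is deferred until actually needed: O(n).
        if self.right is None:
            return self.left
        return str(self.left) + str(self.right)
```

Repeated `+=` on such a structure is linear overall, which is exactly why the idiom's cost depends on the representation and not on immutability itself.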


Re: [Python-Dev] Usage of += on strings in loops in stdlib

2013-02-12 Thread Alexandre Vassalotti
On Tue, Feb 12, 2013 at 5:25 PM, Christian Tismer tis...@stackless.com wrote:

 Would ropes be an answer (and a simple way to cope with string mutation
 patterns) as an alternative implementation, and therefore still justify
 the usage of that pattern?


I don't think so. Ropes are really useful when you work with gigabytes of
data, but unfortunately they don't make good general-purpose strings.
Monolithic arrays are much simpler and more efficient for the typical
use-cases we have in Python.


Re: [Python-Dev] cpython: Issue #16218: skip test if filesystem doesn't support required encoding

2012-11-08 Thread Alexandre Vassalotti
On Thu, Nov 8, 2012 at 9:45 AM, Serhiy Storchaka storch...@gmail.com wrote:

 My intention was testing with filename which cannot be decoded as UTF-8 in
 strict mode.  I agree that testing with name which is encodable in locale
 encoding can be useful too, but now the test has no effect on UTF-8 locale.


So should we change the test back? Or just change the test name?


Re: [Python-Dev] cpython: Issue #16218: skip test if filesystem doesn't support required encoding

2012-11-07 Thread Alexandre Vassalotti
The Unicode code points in the U+DC00-DFFF range (low surrogate area)
can't be encoded in UTF-8. Quoting from RFC 3629
(http://tools.ietf.org/html/rfc3629):

*The definition of UTF-8 prohibits encoding character numbers between
U+D800 and U+DFFF, which are reserved for use with the UTF-16 encoding form
(as surrogate pairs) and do not directly represent characters.*


It looks like this test was doing something specific with regard to this.
So, I am curious as well about this change.



On Sat, Nov 3, 2012 at 10:13 AM, Antoine Pitrou solip...@pitrou.net wrote:

 On Sat,  3 Nov 2012 13:37:48 +0100 (CET)
 andrew.svetlov python-check...@python.org wrote:
  http://hg.python.org/cpython/rev/95d1adf144ee
  changeset:   80187:95d1adf144ee
  user:Andrew Svetlov andrew.svet...@gmail.com
  date:Sat Nov 03 14:37:37 2012 +0200
  summary:
Issue #16218: skip test if filesystem doesn't support required encoding
 
  files:
Lib/test/test_cmd_line_script.py |  7 ++-
1 files changed, 6 insertions(+), 1 deletions(-)
 
 
  diff --git a/Lib/test/test_cmd_line_script.py
 b/Lib/test/test_cmd_line_script.py
  --- a/Lib/test/test_cmd_line_script.py
  +++ b/Lib/test/test_cmd_line_script.py
  @@ -366,7 +366,12 @@
   def test_non_utf8(self):
   # Issue #16218
   with temp_dir() as script_dir:
  -script_basename = '\udcf1\udcea\udcf0\udce8\udcef\udcf2'
  +script_basename = '\u0441\u043a\u0440\u0438\u043f\u0442'

 Why exactly did you change the tested name here?

 Regards

 Antoine.





Re: [Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)

2012-09-30 Thread Alexandre Vassalotti
On Sun, Sep 30, 2012 at 4:50 PM, Brett Cannon br...@python.org wrote:

 I accidentally left out the telco benchmark, which is bad since cdecimal
 makes it just scream on Python 3.3 (and I verified with Python 3.2 that
 this is an actual speedup and not some silly screw-up like I initially had
 with spectral_norm):


You could also make the pickle benchmark use the C accelerator module by
passing the --use_cpickle flag. The Python 3 version should be a lot faster.


Re: [Python-Dev] On a new version of pickle [PEP 3154]: self-referential frozensets

2012-06-27 Thread Alexandre Vassalotti
On Sat, Jun 23, 2012 at 3:19 AM, M Stefan mstefa...@gmail.com wrote:

 * UNION_FROZENSET: like UPDATE_SET, but create a new frozenset
stack before: ... pyfrozenset mark stackslice
stack after : ... pyfrozenset.union(stackslice)


Since frozensets are immutable, could you explain how adding the
UNION_FROZENSET opcode helps in pickling self-referential frozensets? Or
are you only adding it to follow the current style used for pickling
dicts and lists in protocols 1 and onward?


 While this design allows pickling of self-referential sets,
 self-referential
 frozensets are still problematic. For instance, trying to pickle `fs':
 a=A(); fs=frozenset([a]); a.fs = fs
 (when unpickling, the object a has to be initialized before it is added to
  the frozenset)

 The only way I can think of to make this work is to postpone
 the initialization of all the objects inside the frozenset until after
 UNION_FROZENSET.
 I believe this is doable, but there might be memory penalties if the
 approach
 is to simply store all the initialization opcodes in memory until pickling
 the frozenset is finished.


I don't think that's the only way. You could also emit a POP opcode to
discard the frozenset from the stack and then emit a GET to fetch it back
from the memo. This is how we currently handle self-referential tuples.
Check out the save_tuple method in pickle.py to see how it is done.
Personally, I would prefer that approach because it is already
well-tested and proven to work.
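The POP-and-GET dance for self-referential tuples can be seen directly in a disassembly. A small demonstration (my own construction, not from the thread):

```python
import pickle
import pickletools

# A self-referential tuple: t's only element is a list that contains t.
l = []
t = (l,)
l.append(t)

data = pickle.dumps(t, protocol=2)
pickletools.dis(data)   # the dump shows a POP discarding the redundant
                        # copy and a GET fetching the memoized tuple

u = pickle.loads(data)
assert u[0][0] is u     # the self-reference survives the round trip
```

The pickler only discovers the cycle after saving the tuple's elements, at which point the tuple is already in the memo, so it discards the in-progress copy and reuses the memoized one.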

That said, your approach sounds good too. The memory trade-off could lead
to smaller pickles and more efficient decoding (though these
self-referential objects are rare enough that I don't think that any
improvements there would matter much).

While self-referential frozensets are uncommon, a far more problematic
 situation is with the self-referential objects created with REDUCE. While
 pickle uses the idea of creating empty collections and then filling them,
 reduce typically creates already-filled objects. For instance:
 cnt = collections.Counter(); cnt[a]=3; a.cnt=cnt; cnt.__reduce__()
 (<class 'collections.Counter'>, ({<__main__.A object at 0x0286E8F8>: 3},))
 where the A object contains a reference to the counter. Unpickling an
 object pickled with this reduce function is not possible, because the
 reduce
 function, which explains how to create the object, is asking for the
 object
 to exist before being created.


Your example seems to work on Python 3. I am not sure if I follow what you
are trying to say. Can you provide a working example?

$ python3
Python 3.1.2 (r312:79147, Dec  9 2011, 20:47:34)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pickle, collections
>>> c = collections.Counter()
>>> class A: pass
...
>>> a = A()
>>> c[a] = 3
>>> a.cnt = c
>>> b = pickle.loads(pickle.dumps(a))
>>> b in b.cnt
True


 Pickle could try to fix this by detecting when reduce returns a class type
 as the first tuple arg and move the dict ctor parameter to the state, but
 this may not always be intended. It's also a bit strange that __getstate__
 is never used anywhere in pickle directly.


I would advise against any such change. The reduce protocol is already
fairly complex. Further, I don't think changing it this way would give us
any extra flexibility.

The documentation has a good explanation of how __getstate__ works under
the hood:
http://docs.python.org/py3k/library/pickle.html#pickling-class-instances

And if you need more, PEP 307 (http://www.python.org/dev/peps/pep-0307/)
provides some of the design rationales of the API.


[Python-Dev] What should we do with cProfile?

2012-05-29 Thread Alexandre Vassalotti
Hello,

As per PEP 3108, we were supposed to merge profile/cProfile into one
unified module. I initially championed the change, but other things got in
the way and I never got to the point of a useful patch. I posted some
code and outlined an approach for how the merge could be done. However,
there are still a lot of details to be worked out.

So I am wondering whether we should abandon the change altogether or
attempt it for the next release. Personally, I am slightly leaning toward
the former option, since the two modules are actually fairly different
underneath even though they are used similarly. And also, because it is
getting late to make such backward-incompatible changes.

I am willing to volunteer to push the change through if it is still
desired by the community.

Cheers!

http://bugs.python.org/issue2919


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Alexandre Vassalotti
On Thu, Apr 19, 2012 at 4:55 AM, Stefan Behnel stefan...@behnel.de wrote:

 That sounds like less than two weeks of work, maybe even if we add the
 marshal module to it.
 In less than a month of GSoC time, this could easily reach a point where
 it's close to the speed of what we have and fast enough, but a lot more
 accessible and maintainable, thus also making it easier to add the
 extensions described in the PEP.

 What do you think?


As others have pointed out, many users of pickle depend on its performance.
The main reason why _pickle.c is so big is all the low-level optimizations
we have in there. We have custom stack and dictionary implementations just
for the sake of speed. We also have fast paths for I/O operations and
function calls. These optimizations alone easily account for 2000 lines of
code, and they are not micro-optimizations: each of them was shown to give
speedups from one to several orders of magnitude.

So I disagree that we could easily reach the point where it's close to the
speed of what we have. And if we were to attempt this, it would be a
multiple months undertaking. I would rather see that time spent on
improving pickle than on yet another reimplementation.

-- Alexandre


Re: [Python-Dev] Cython for cPickle?

2012-04-22 Thread Alexandre Vassalotti
On Sun, Apr 22, 2012 at 6:12 PM, mar...@v.loewis.de wrote:

  So I disagree that we could easily reach the point where it's close to
 the speed of what we have. And if we were to attempt this, it would be a
 multiple-month undertaking. I would rather see that time spent on
 improving pickle than on yet another reimplementation.


 Of course, this being free software, anybody can spend time on whatever
 they
 please, and this should not make anybody feel sad. You just don't get
 merits
 if you work on stuff that nobody cares about.


Yes, of course. I don't want to discourage anyone from investigating this
option—in fact, I would very much like to see myself proven wrong. But, if
I understood Stefan correctly, he is proposing to have a GSoC student do
the work, which I would feel uneasy about since we have no idea how
valuable this would be as a contribution.

-- Alexandre


Re: [Python-Dev] PEP 3154 - pickle protocol 4

2011-08-15 Thread Alexandre Vassalotti
On Fri, Aug 12, 2011 at 3:58 AM, Antoine Pitrou solip...@pitrou.net wrote:


 Hello,

 This PEP is an attempt to foster a number of small incremental
 improvements in a future pickle protocol version. The PEP process is
 used in order to gather as many improvements as possible, because the
 introduction of a new protocol version should be a rare occurrence.

 Feel free to suggest any additions.


Your propositions all sound good to me. We will need to agree on the
details, but I believe these improvements to the current protocol will be
appreciated.

Also, one thing that keeps coming back is the need for pickling functions
and methods which are not part of the global namespace (e.g., issue 9276,
http://bugs.python.org/issue9276). Support for this would likely help us
fix another related namespace issue (i.e., issue 3657,
http://bugs.python.org/issue3657). Finally, we are currently missing
support for pickling classes with __new__ taking keyword-only arguments
(i.e., issue 4727, http://bugs.python.org/issue4727).

-- Alexandre


Re: [Python-Dev] Speeding up 2to3: Results from a GSOC Project

2010-07-30 Thread Alexandre Vassalotti
Love it!

BTW, it's not a good idea to have an import statement under 3
levels of loops:

https://code.google.com/p/2to3-speedup2/source/browse/trunk/lib2to3/refactor.py#427

-- Alexandre


Re: [Python-Dev] Readability of hex strings (Was: Use of coding cookie in 3.x stdlib)

2010-07-26 Thread Alexandre Vassalotti
[+Python-ideas -Python-Dev]

import binascii
def h(s):
    return binascii.unhexlify("".join(s.split()))

h("DE AD BE EF CA FE BA BE")
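As a side note for Python 3 readers: bytes.fromhex accepts spaces between byte pairs, so it handles the same readable format directly, without a helper:

```python
# bytes.fromhex ignores spaces between byte pairs.
data = bytes.fromhex("DE AD BE EF CA FE BA BE")
assert data == b"\xde\xad\xbe\xef\xca\xfe\xba\xbe"
```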

-- Alexandre

On Mon, Jul 26, 2010 at 11:29 AM, anatoly techtonik techto...@gmail.com wrote:
 I find \xXX\xXX\xXX\xXX... notation for binary data totally
 unreadable. Everybody who uses and analyses binary data is more
 familiar with plain hex dumps in the form of XX XX XX XX

 I wonder if it is possible to introduce an effective binary string
 type that will be represented as h"XX XX XX" in language syntax? It
 will be much easier to analyze printed binary data and copy/paste such
 data as-is from hex editors/views.

 On Mon, Jul 19, 2010 at 9:45 AM, Guido van Rossum gu...@python.org wrote:
 Sounds like a good idea to try to remove redundant cookies *and* to
 remove most occasional use of non-ASCII characters outside comments
 (except for unittests specifically trying to test Unicode features).
 Personally I would use \xXX escapes instead of spelling out the
 characters in shlex.py, for example.

 Both with or without the coding cookies, many ways of displaying text
 files garble characters outside the ASCII range, so it's better to
 stick to ASCII as much as possible.

 --Guido

 On Mon, Jul 19, 2010 at 1:21 AM, Alexander Belopolsky
 alexander.belopol...@gmail.com wrote:
 I was looking at the inspect module and noticed that it's source
 starts with # -*- coding: iso-8859-1 -*-.   I have checked and there
 are no non-ascii characters in the file.   There are several other
 modules that still use the cookie:

 Lib/ast.py:# -*- coding: utf-8 -*-
 Lib/getopt.py:# -*- coding: utf-8 -*-
 Lib/inspect.py:# -*- coding: iso-8859-1 -*-
 Lib/pydoc.py:# -*- coding: latin-1 -*-
 Lib/shlex.py:# -*- coding: iso-8859-1 -*-
 Lib/encodings/punycode.py:# -*- coding: utf-8 -*-
 Lib/msilib/__init__.py:# -*- coding: utf-8 -*-
 Lib/sqlite3/__init__.py:#-*- coding: ISO-8859-1 -*-
 Lib/sqlite3/dbapi2.py:#-*- coding: ISO-8859-1 -*-
 Lib/test/bad_coding.py:# -*- coding: uft-8 -*-
 Lib/test/badsyntax_3131.py:# -*- coding: utf-8 -*-

 I understand that coding: utf-8 is strictly redundant in 3.x.  There
 are cases such as Lib/shlex.py where using encoding other than utf-8
 is justified.  (See
 http://svn.python.org/view?view=rev&revision=82560).  What are the
 guidelines for other cases?  Should redundant cookies be removed?
 Since not all editors respect the  -*- cookie, I think the answer
 should be yes particularly when the cookie is setting encoding other
 than utf-8.




 --
 --Guido van Rossum (python.org/~guido)




Re: [Python-Dev] Future of 2.x.

2010-06-09 Thread Alexandre Vassalotti
On Wed, Jun 9, 2010 at 1:23 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Closing the backport requests is fine. For the feature requests, I'd only
 close them *after* the 2.7 release (after determining that they won't apply
 to 3.x, of course).

 There aren't that many backport requests, anyway, are there?


There are only a few requests (about five).

-- Alexandre


Re: [Python-Dev] Future of 2.x.

2010-06-09 Thread Alexandre Vassalotti
On Wed, Jun 9, 2010 at 5:55 AM, Facundo Batista
facundobati...@gmail.com wrote:
 Yes, closing the tickets as won't fix and tagging them as
 will-never-happen-in-2.x or something, is the best combination of
 both worlds: it will clean the tracker and ease further developments,
 and will allow anybody to pick up those tickets later.


The issues I care about are already tagged as 26backport. So, I don't
think another keyword is needed.

-- Alexandre


[Python-Dev] Future of 2.x.

2010-06-08 Thread Alexandre Vassalotti
Is there is any plan for a 2.8 release? If not, I will go through the
tracker and close outstanding backport requests of 3.x features to
2.x.

-- Alexandre


Re: [Python-Dev] Did I miss the decision to untabify all of the C code?

2010-05-06 Thread Alexandre Vassalotti
On Wed, May 5, 2010 at 8:52 PM, Joao S. O. Bueno jsbu...@python.org.br wrote:
 Python 2.7 is in beta, but not applying such a fix now would probably
 mean that python 2.x would forever remain with the mixed tabs, since
 it would make much less sense for such a change in a minor revision
 (although I'd favor it even there).


Since 2.7 is likely the last release of the 2.x series, wouldn't it be
more productive to spend time improving it instead of wasting time on
minor details like indentation?

-- Alexandre


Re: [Python-Dev] Running Clang 2.7's static analyzer over the code base

2010-05-03 Thread Alexandre Vassalotti
On Mon, May 3, 2010 at 7:34 PM, Barry Warsaw ba...@python.org wrote:
 Now would be a good time to convert the C files to 4 space indents.  We've
 only been talking about it for a decade at least.

Will changing the indentation of source files to 4 space indents break
patches on the bug tracker?

-- Alexandre


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 2:11 PM, Dan Gindikin dgindi...@gmail.com wrote:
 We were having performance problems unpickling a large pickle file, we were
 getting 170s running time (which was fine), but 1100mb memory usage. Memory
 usage ought to have been about 300mb, this was happening because of memory
 fragmentation, due to many unnecessary puts in the pickle stream.

 We made a pickletools.optimize inspired tool that could run directly on a
 pickle file and used pickletools.genops. This solved the unpickling problem
 (84s, 382mb).

 However the tool itself was using too much memory and time (1100s, 470mb), so
 I recoded it to scan through the pickle stream directly, without going through
 pickletools.genops, giving (240s, 130mb).


Collin Winter wrote a simple optimization pass for cPickle in Unladen
Swallow [1]. The code reads through the stream and removes all the
unnecessary PUTs in place.

[1]: 
http://code.google.com/p/unladen-swallow/source/browse/trunk/Modules/cPickle.c#735

 Other people that deal with large pickle files are probably having similar
 problems, and since this comes up when dealing with large data it is precisely
 in this situation that you probably can't use pickletools.optimize or
 pickletools.genops. It feels like functionality that ought to be added to
 pickletools, is there some way I can contribute this?


Just put your code on bugs.python.org and I will take a look.

-- Alexandre


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 Collin Winter wrote a simple optimization pass for cPickle in Unladen
 Swallow [1]. The code reads through the stream and remove all the
 unnecessary PUTs in-place.


I just noticed the code removes *all* PUT opcodes, regardless of whether
they are needed. So, this code can only be used if there is no GET in
the stream (which is unlikely for a large stream). I believe Collin
made this trade-off for performance reasons. However, it wouldn't be
hard to make the current code work like pickletools.optimize().
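A sketch of the required bookkeeping, using pickletools.genops: one pass collects the memo indices actually fetched by GET opcodes, a second pass copies the stream while skipping PUTs nobody fetches. The helper name is mine, and this version holds the opcode list in memory, so it illustrates the logic rather than the in-place C implementation.

```python
import pickle
import pickletools

def remove_unused_puts(data):
    # Pass 1: record which memo indices are actually fetched by a GET
    # (GET, BINGET, LONG_BINGET).
    ops = list(pickletools.genops(data))
    used = {arg for op, arg, pos in ops if 'GET' in op.name}
    # Pass 2: copy the stream, skipping PUT opcodes (PUT, BINPUT,
    # LONG_BINPUT) whose memo index is never fetched.
    ends = [pos for op, arg, pos in ops][1:] + [len(data)]
    out = bytearray()
    for (op, arg, pos), end in zip(ops, ends):
        if 'PUT' in op.name and arg not in used:
            continue
        out += data[pos:end]
    return bytes(out)
```

Unpicklers tolerate the resulting sparse memo indices, so the remaining GETs still resolve correctly.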

-- Alexandre


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 3:07 PM, Collin Winter collinwin...@google.com wrote:
 I should add that, adding the necessary bookkeeping to remove only
 unused PUTs (instead of the current all-or-nothing scheme) should not
 be hard. I'd watch out for a further performance/memory hit; the
 pickling benchmarks in the benchmark suite should help assess this.

I was thinking about this too. A simple boolean table could be fast,
while keeping the space requirement down. This scheme would be
cache-friendly as well.

 The current optimization penalizes pickling to speed up unpickling,
 which made sense when optimizing pickles that would go into memcache
 and be read out 13-15x more often than they were written.

This is my current impression of how pickle is most often used. Are
you aware of a use case of pickle where you do more writes than reads?
I can't think of any.

-- Alexandre


Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin dgindi...@gmail.com wrote:
 This wouldn't help our use case, your code needs the entire pickle
 stream to be in memory, which in our case would be about 475mb, this
 is on top of the 300mb+ data structures that generated the pickle
 stream.


In that case, the best we could do is a two-pass algorithm to remove
the unused PUTs. That won't be efficient, but it will satisfy the
memory constraint. Another solution is to not generate the PUTs at all
by setting the 'fast' attribute on Pickler. But that won't work if you
have a recursive structure, or have code that requires that the
identity of objects be preserved.

>>> import io, pickle, pickletools
>>> x = [1, 2]
>>> f = io.BytesIO()
>>> p = pickle.Pickler(f, protocol=-1)
>>> p.dump([x, x])
>>> pickletools.dis(f.getvalue())
    0: \x80 PROTO      2
    2: ]    EMPTY_LIST
    3: q    BINPUT     0
    5: (    MARK
    6: ]        EMPTY_LIST
    7: q        BINPUT     1
    9: (        MARK
   10: K            BININT1    1
   12: K            BININT1    2
   14: e            APPENDS    (MARK at 9)
   15: h        BINGET     1
   17: e        APPENDS    (MARK at 5)
   18: .    STOP
highest protocol among opcodes = 2
>>> [id(x) for x in pickle.loads(f.getvalue())]
[20966504, 20966504]

Now with the 'fast' mode enabled:

>>> f = io.BytesIO()
>>> p = pickle.Pickler(f, protocol=-1)
>>> p.fast = True
>>> p.dump([x, x])
>>> pickletools.dis(f.getvalue())
    0: \x80 PROTO      2
    2: ]    EMPTY_LIST
    3: (    MARK
    4: ]        EMPTY_LIST
    5: (        MARK
    6: K            BININT1    1
    8: K            BININT1    2
   10: e            APPENDS    (MARK at 5)
   11: ]        EMPTY_LIST
   12: (        MARK
   13: K            BININT1    1
   15: K            BININT1    2
   17: e            APPENDS    (MARK at 12)
   18: e        APPENDS    (MARK at 3)
   19: .    STOP
highest protocol among opcodes = 2
>>> [id(x) for x in pickle.loads(f.getvalue())]
[20966504, 21917992]

As you can observe, the pickle stream generated with the fast mode
might actually be bigger.
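For completeness, the keep-only-needed-PUTs rewrite discussed earlier in this thread is what the standard library's pickletools.optimize already does. It requires the whole pickle in memory, so it would not satisfy Dan's memory constraint, but it shows the transformation and that it preserves sharing:

```python
import pickle
import pickletools

x = [1, 2]
data = [x, x, [3]]             # one shared sublist -> one needed PUT/GET pair
s = pickle.dumps(data, protocol=2)
opt = pickletools.optimize(s)  # drops PUT opcodes that no GET ever fetches

assert len(opt) < len(s)
u = pickle.loads(opt)
assert u == data
assert u[0] is u[1]            # object sharing is preserved
```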

By the way, it is weird that the total memory usage of the data
structure is smaller than the size of its respective pickle stream.
What pickle protocol are you using?

-- Alexandre


Re: [Python-Dev] contributor to committer

2010-02-24 Thread Alexandre Vassalotti
On Wed, Feb 24, 2010 at 7:13 AM, Florent Xicluna
florent.xicl...@gmail.com wrote:
 Hello,

 I am a semi-regular contributor for Python: I have contributed many patches
 since end of last year, some of them were reviewed by Antoine.
 Lately, he suggested that I should apply for commit rights.


+1

-- Alexandre


Re: [Python-Dev] Help wanted on a code generator project

2010-01-26 Thread Alexandre Vassalotti
On Tue, Jan 26, 2010 at 7:04 AM, Yingjie Lan lany...@yahoo.com wrote:
 note that this is quite off-topic for this list, which is about the
 development of the CPython interpreter and runtime environment.

 Sorry if this is bothering you. I thought there are a lot of people here who
 know how to write extensions and have a lot of experience. These are exactly
 the best people to perfect expy. On the other hand, expy, once perfected,
 would be a nice tool to expedite adding runtime modules to Python. I am not
 aware of other good places to ask for help of this sort. If you know of any,
 please let me know, thanks in advance.


This is now the third time that people have let you know that
announcements about your project are not welcome on this mailing list.

http://mail.python.org/pipermail/python-dev/2009-July/090699.html
http://mail.python.org/pipermail/python-dev/2009-August/091023.html

So please stop playing the ignorance card and behave appropriately.

-- Alexandre


Re: [Python-Dev] PEP 3003 - Python Language Moratorium

2009-11-04 Thread Alexandre Vassalotti
On Tue, Nov 3, 2009 at 12:35 PM, Guido van Rossum gu...@python.org wrote:
 I've checked draft (!) PEP 3003, Python Language Moratorium, into
 SVN. As authors I've listed Jesse, Brett and myself.


+1 from me.

-- Alexandre


Re: [Python-Dev] [Fwd: [issue6397] Implementing Solaris poll in the select module]

2009-07-01 Thread Alexandre Vassalotti
On Wed, Jul 1, 2009 at 10:05 PM, Guido van Rossumgu...@python.org wrote:
 The select module already supports the poll() system call. Or is there
 a special variant that only Solaris has?


I think Jesus refers to /dev/poll—i.e., the interface for
edge-triggered polling on Solaris. This is the Solaris equivalent of
FreeBSD's kqueue and Linux's epoll.

-- Alexandre


Re: [Python-Dev] Draft PEP 385: Migrating from svn to Mercurial

2009-06-08 Thread Alexandre Vassalotti
On Mon, Jun 8, 2009 at 3:57 PM, Martin v. Löwismar...@v.loewis.de wrote:
 FWIW, I really think that PEP 385 should really grow a timeline
 pretty soon. Are we going to switch this year, next year, or 2011?


+1

-- Alexandre


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Alexandre Vassalotti
On Mon, Apr 13, 2009 at 5:25 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
 On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de
 wrote:

  True, I can always convert from bytes to str or vise versa.

 I think you are missing the point. It will not be necessary to convert.

 Sometimes I want bytes and sometimes I want str.  I am going to be
 converting some of the time. ;-)

 Below is a basic CGI application that assumes that json module works with
 str, not bytes.  How would you write it if the json module does not support
 returning a str?

 print("Content-Type: application/json; charset=utf-8")
 input_object = json.loads(sys.stdin.read())
 output_object = do_some_work(input_object)
 print(json.dumps(output_object))
 print()


Like this?

print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.buffer.read())
output_object = do_some_work(input_object)
sys.stdout.buffer.write(json.dumps(output_object).encode("utf-8"))


-- Alexandre


Re: [Python-Dev] Dropping bytes support in json

2009-04-09 Thread Alexandre Vassalotti
On Thu, Apr 9, 2009 at 1:15 AM, Antoine Pitrou solip...@pitrou.net wrote:
 As for reading/writing bytes over the wire, JSON is often used in the same
 context as HTML: you are supposed to know the charset and decode/encode the
 payload using that charset. However, the RFC specifies a default encoding of
 utf-8. (*)


 (*) http://www.ietf.org/rfc/rfc4627.txt


That is one short and sweet RFC. :-)

 The RFC also specifies a discrimination algorithm for non-supersets of ASCII
 (“Since the first two characters of a JSON text will always be ASCII
   characters [RFC0020], it is possible to determine whether an octet
   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
   at the pattern of nulls in the first four octets.”), but it is not
 implemented in the json module:


Given that the RFC specifies the encoding used should be one of the
encodings defined by Unicode, wouldn't it be a better idea to remove the
unicode support instead? To me, it would make sense to use the
detection algorithms for Unicode to sniff the encoding of the JSON
stream and then use the detected encoding to decode the strings embedded
in the JSON stream.
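For illustration, the RFC's null-pattern sniffing fits in a few lines (my own sketch; as noted above, the json module does not implement it):

```python
def detect_json_encoding(data):
    """Guess the Unicode encoding of a JSON text from the pattern of
    null bytes in its first four octets, per RFC 4627 section 3."""
    if len(data) >= 4:
        if data[0] == 0 and data[1] == 0 and data[2] == 0:
            return "utf-32-be"      # 00 00 00 xx
        if data[1] == 0 and data[2] == 0 and data[3] == 0:
            return "utf-32-le"      # xx 00 00 00
        if data[0] == 0 and data[2] == 0:
            return "utf-16-be"      # 00 xx 00 xx
        if data[1] == 0 and data[3] == 0:
            return "utf-16-le"      # xx 00 xx 00
    return "utf-8"                  # no nulls: UTF-8 (the RFC's default)

# The UTF-32 checks must come first, since a UTF-32-LE prefix also
# matches the UTF-16-LE null pattern.
for enc in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    assert detect_json_encoding('{"a": 1}'.encode(enc)) == enc
```

The detected name can then be fed straight to bytes.decode() before handing the text to json.loads().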

Cheers,
-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-07 Thread Alexandre Vassalotti
On Tue, Apr 7, 2009 at 2:03 AM, Stephen J. Turnbull step...@xemacs.org wrote:
 Alexandre Vassalotti writes:

   This makes me remember that we will have to decide how we will
   reorganize our workflow. For this, we can either be conservative and
   keep the current CVS-style development workflow--i.e., a few main
   repositories where all developers can commit to.

 That was the original idea of PEP 374, that was a presumption under
 which I wrote my part of it, I think we should stick with it.  As
 people develop personal workflows, they can suggest them, and/or
 changes in the public workflow needed to support them.  But there
 should be a working sample implementation before thinking about
 changes to the workflow.


Aahz convinced me earlier that changing the current workflow would be
stupid. So, I now think the best thing to do is to provide a CVS-style
environment similar to what we have currently, and let the workflow
evolve naturally as developers gain more confidence with Mercurial.


   Or we could drink the kool-aid and go with a kernel-style
   development workflow--i.e., each developer maintains his own branch
   and pull changes from each others.

 Can you give examples of projects using Mercurial that do that?


Mercurial itself is developed using that style, I believe.

-- Alexandre


Re: [Python-Dev] BufferedReader.peek() ignores its argument

2009-04-05 Thread Alexandre Vassalotti
On Sat, Apr 4, 2009 at 9:03 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Hello,

 Currently, BufferedReader.peek() ignores its argument and can return more or
 less than the number of bytes requested by the user. This is how it was
 implemented in the Python version, and we've reflected this in the C version.

 It seems a bit strange and unhelpful though. Should we change the 
 implementation
 so that the argument to peek() becomes the upper bound to the number of bytes
 returned?


I am not sure if this is a good idea. Currently, the argument of
peek() is documented as a lower bound that cannot exceed the size of
the buffer:

Returns buffered bytes without advancing the position.

The argument indicates a desired minimal number of bytes; we
do at most one raw read to satisfy it.  We never return more
than self.buffer_size.

Changing the meaning of peek() now could introduce at least some
confusion and maybe also bugs. And personally, I like the current
behavior, since it guarantees that peek() won't return an empty string
unless you have reached end-of-file.  Plus, it is fairly easy to cap
the number of bytes returned by doing f.peek()[:upper_bound].
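The current contract is easy to see in action (a small sketch against an in-memory stream):

```python
import io

raw = io.BytesIO(b"abcdefghij" * 100)
reader = io.BufferedReader(raw, buffer_size=16)

head = reader.peek(4)
# peek() may return more than asked for, but never more than
# buffer_size, and it never advances the stream position.
assert head[:4] == b"abcd"
assert len(head) <= 16

# Capping by hand, as suggested above:
capped = reader.peek(4)[:4]
assert capped == b"abcd"

# The position was not advanced by either peek():
assert reader.read(4) == b"abcd"
```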

-- Alexandre


[Python-Dev] Should I/O object wrappers close their underlying buffer when deleted?

2009-04-05 Thread Alexandre Vassalotti
Hello,

I would like to call to your attention the following behavior of TextIOWrapper:

   import io

   def test(buf):
       textio = io.TextIOWrapper(buf)

   buf = io.BytesIO()
   test(buf)
   print(buf.closed)  # This prints True currently

The problem here is that TextIOWrapper closes its buffer when deleted.
BufferedRWPair behaves similarly. The solution is simply to override
the __del__ method that TextIOWrapper inherits from IOBase.
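In today's io module, detach() offers a workaround: it severs the wrapper from its buffer so the buffer survives the wrapper's deletion (a sketch; CPython's reference counting makes the finalization immediate here):

```python
import io

# The problem: deleting the wrapper closes the underlying buffer.
buf = io.BytesIO()
wrapper = io.TextIOWrapper(buf)
del wrapper            # finalizes the wrapper, which closes buf
assert buf.closed

# The workaround: detach the buffer before the wrapper goes away.
buf2 = io.BytesIO()
wrapper2 = io.TextIOWrapper(buf2)
assert wrapper2.detach() is buf2   # wrapper2 is now unusable
del wrapper2
assert not buf2.closed
```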

-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Sun, Apr 5, 2009 at 5:06 AM, Martin v. Löwis mar...@v.loewis.de wrote:
 Off the top of my head, the following is needed for a successful migration:

    - Verify that the repository at http://code.python.org/hg/ is
 properly converted.

 I see that this has four branches. What about all the other branches?
 Will they be converted, or not? What about the stuff outside /python?


I am not sure it would be useful to convert the old branches to
Mercurial. The simplest thing to do would be to keep the current svn
repository as a read-only archive. And if people need to commit to
these branches, they could request the branch to be imported into a
Mercurial branch (or a simple-to-use script could be provided that
developers run directly on the server to create a user
branch).

 In particular, the Stackless people have requested that they move along
 with what core Python does, so their code should also be converted.


Noted.

    - Add Mercurial support to the issue tracker.

 Not sure what this means. There is currently svn support insofar as the
 tracker can format rNNN references into ViewCVS links; this should be
 updated if possible (removed if not). There would also be a possibility
 to auto-close issues from the commit messages. This is not done
 currently, so I would not make it a prerequisite for the switch.


Yes, I was referring to the rNNN references. Actually, I am not sure
how this could be implemented, since with Mercurial we lose atomic
revision IDs. We could use something like h...@branch-name (e.g.,
bf94293b1...@py3k) to refer to a specific revision.

An auto-close would be a nice feature but, as you said, not necessary
for the migration. The main stumbling block to implementing an
auto-close feature is defining when an issue should be closed. Maybe
we could add our own metadata to the commit message. For example:

   Fix some nasty bug.

   Close-Issue: 4532

When such a commit arrives in one of the main branches, a commit
hook would close the issue once all the affected releases have been
fixed.
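The extraction step of such a hook only needs to scan the message for the pseudo-header; a hypothetical sketch (the `Close-Issue:` syntax is just the proposal above):

```python
import re

# One issue number per "Close-Issue: NNNN" line in the commit message.
CLOSE_ISSUE_RE = re.compile(r"^Close-Issue:\s*(\d+)\s*$", re.MULTILINE)

def issues_to_close(commit_message):
    """Return the issue numbers named in Close-Issue: lines."""
    return [int(num) for num in CLOSE_ISSUE_RE.findall(commit_message)]

message = "Fix some nasty bug.\n\nClose-Issue: 4532\n"
assert issues_to_close(message) == [4532]
assert issues_to_close("No metadata here.") == []
```

The remaining work (checking which releases are affected and calling the tracker) would live in the hook itself.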

    - Setup temporary svn mirrors for the main Mercurial repositories.

 What is that?


I think it would be a good idea to host temporary svn mirrors for
developers who access their VCS via an IDE. Although, I am not sure
anymore whether supporting these developers (if there are any) would be
worth the trouble. So, think of this as optional.

    - Augment code.python.org infrastructure to support the creation of
 developer accounts.

 One option would be to carry on with the current setup; migrating it
 to hg might work as well, of course.


You mean the current setup for svn.python.org? Would you be
comfortable letting this machine be accessed by core developers through
SSH? With Mercurial, SSH access will be needed for server-side
clones (or a script similar to what the Mozilla folks have [1] could be
added).

[1]: https://developer.mozilla.org/en/Publishing_Mercurial_Clones

    - Update the release.py script.

 There is probably some other things that I missed

 Here are some:

 - integrate with the buildbot

Good one. It seems buildbot has support for Mercurial. [2] So, this
will be a matter of tweaking the right options. The batch scripts in
Tools/buildbot will also need to be updated.

[2]: 
http://djmitche.github.com/buildbot/docs/0.7.10/#How-Different-VC-Systems-Specify-Sources

 - come up with a strategy for /external (also relevant for
  the buildbot slaves)

Since the directories in /external are considered read-only, we could
simply create a new Mercurial repository and copy the content of
/external into it. When a new release needs to be added, just create a
new directory and commit.

 - decide what to do with the bzr mirrors


I don't see much benefit in keeping them. So, I say, archive the
branches there unless someone steps up to maintain them.

-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Sun, Apr 5, 2009 at 6:27 AM, Antoine Pitrou solip...@pitrou.net wrote:
 Alexandre Vassalotti alexandre at peadrop.com writes:

 Off the top of my head, the following is needed for a successful migration:

 There's also the issue of how we adapt the current workflow of svnmerging
 between branches when we want to back- or forward-port stuff. In particular,
 tracking of already done or blocked backports.

 (the issue being that svnmerge is different from what DVCS'es call merging
 :-))


See the PEP about that. I have written a fair amount of details how
this would work with Mercurial:

http://www.python.org/dev/peps/pep-0374/#backport

-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Sun, Apr 5, 2009 at 1:37 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 I think it should be stated in the PEP what branches get converted,
 in what form, and what the further usage of the svn repository should
 be.


Noted.

 I think there is a long tradition of such annotations; we should
 try to repeat history here. IIUC, the Debian bugtracker understands

   Closes: #4532

 and some other syntaxes. It must be easy to remember, else people
 won't use it.


That sounds reasonable. Personally, I don't really care about the
syntax we would use as long as it is consistent and documented.


 Any decision to have or not have such a feature should be stated in
 the PEP. I personally don't use IDEs, so I don't care (although
 I do notice that the apparent absence of IDE support for Mercurial
 indicates maturity of the technology)


I know Netbeans has Mercurial support built-in (which makes sense
because Sun uses Mercurial for its open-source projects). However, I
am not sure if Eclipse has good Mercurial support yet. There are
3rd-party plugins for Eclipse, but I don't know if they work well.

 Ok, I take that back. I assumed that Mercurial could work *exactly*
 as Subversion. Apparently, that's not the case (although I have no
 idea what a server-side clone is). So I wait for the PEP to explain
 how authentication and access control is to be implemented. Creating
 individual Unix accounts for committers should be avoided.

With Subversion, we can do a server-side clone (or copy) using the copy command:

  svn copy SRC_URL DEST_URL

This avoids wasting time and bandwidth by doing the copy directly on
the server. Without this feature, you would need to check out the
remote repository you want to clone, then push it to a different
location. Since upload bandwidth is often limited, creating a new
branch in such a fashion would be time consuming.

With Mercurial, we will need to add support for server-side clones
ourselves. There are a few ways to provide this feature. We could give
Unix user accounts to all core developers and let them manage their
private branches directly on the server. You made clear that this is
not wanted. So an alternative approach is to add an interface
accessible via SSH. As I previously mentioned, this is the approach
used by Mozilla.

Yet another approach would be to add a web interface for managing the
repositories. This is what the OpenSolaris admins opted for.
Personally, I do not think this is a good idea because it would require
us to roll our own authentication mechanism, which is clearly a bad
thing (both security-wise and usability-wise).

This reminds me that we will have to decide how we will reorganize
our workflow. For this, we can either be conservative and keep the
current CVS-style development workflow—i.e., a few main repositories
to which all developers can commit. Or we could drink the kool-aid and
go with a kernel-style development workflow—i.e., each developer
maintains his own branch and pulls changes from the others.

From what I have heard, the CVS-style workflow has a lower overhead
than the kernel-style workflow. However, the kernel-style workflow is
somewhat advantageous because changes get reviewed several times
before they land in the main branches. Thus, it is less likely that
someone manages to break the build. In addition, Mercurial is much
better suited to supporting the kernel-style workflow.

However, if we go kernel-style, we will need to designate someone
(i.e., an integrator) to maintain the main branches, which will be
tested by buildbot and used for the public releases. These are issues
I would like to address in the PEP.


 I can give you access to the master setup. Ideally, this should
 be tested before the switchover (with a single branch). We also
 need instructions for the slaves (if any - perhaps installing
 a hg binary is sufficient).


I am not too familiar with our buildbot setup, so I will need to do
some reading before actually making any changes. You can give me
access to the buildbot master now; however, I would use this access
only to study how the current setup works and to plan the changes we
need accordingly.

 Since the directories in /external are considered read-only, we could
 simply create a new Mercurial repository and copy the content of
 /external into it.
 - decide what to do with the bzr mirrors


 I don't see much benefits to keep them.

 Both should go into the PEP.

Noted.

Regards,
-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Sun, Apr 5, 2009 at 2:45 PM, Dirkjan Ochtman dirk...@ochtman.nl wrote:
 On 05/04/2009 20:36, Martin v. Löwis wrote:

 We do require full real names (i.e. no nicknames). Can Mercurial
 guarantee such a thing?

 We could pre-record the list of allowed names in a hook, then have the hook
 check that usernames include one of those names and an email address (so
 people can still start using another email address).


But that won't work when people who are not core developers submit us
patch bundles to import. And maintaining such a white-list sounds
more burdensome than necessary to me.

-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Sun, Apr 5, 2009 at 2:40 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Okay, sounds like that will be easy. Would be good to enable compression
 on the SSH, though, if that's not already done.

 Where is that configured?


If I recall correctly, only ssh clients can request compression from
the server—in other words, the server cannot force clients to use
compression, but merely allows them to use it.

See the man page for sshd_config and ssh_config for the specific details.

-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-05 Thread Alexandre Vassalotti
On Mon, Apr 6, 2009 at 12:20 AM, Aahz a...@pythoncraft.com wrote:
 How difficult would it be to change the decision later?  That is, how
 about starting with a CVS-style system and maybe switch to kernel-style
 once people get comfortable with Hg?

I believe it would be fairly easy. It would be a matter of designating
a volunteer to maintain the main repositories and asking core
developers to avoid committing directly to them.

Cheers,
-- Alexandre


Re: [Python-Dev] Mercurial?

2009-04-04 Thread Alexandre Vassalotti
On Sat, Apr 4, 2009 at 11:40 AM, Aahz a...@pythoncraft.com wrote:
 With Brett's (hopefully temporary!) absence, who is spearheading the
 Mercurial conversion?  Whoever it is should probably take over PEP 374
 and start updating it with the conversion plan, particularly WRT
 expectations for dates relative to 3.1 final and 2.7 final.

I am willing to take over this. I was in charge of the Mercurial
scenarios in the PEP, so it would be natural for me to continue with
the transition. In addition, I volunteer to maintain the new Mercurial
installation.

Off the top of my head, the following is needed for a successful migration:

   - Verify that the repository at http://code.python.org/hg/ is
properly converted.
   - Convert the current svn commit hooks to Mercurial.
   - Add Mercurial support to the issue tracker.
   - Update the developer FAQ.
   - Setup temporary svn mirrors for the main Mercurial repositories.
   - Augment code.python.org infrastructure to support the creation of
developer accounts.
   - Update the release.py script.

There are probably some other things that I missed, but I think this
is a good overview of what needs to be done. And of course, I would
welcome anyone willing to help me with the transition.

-- Alexandre


Re: [Python-Dev] issue5578 - explanation

2009-04-03 Thread Alexandre Vassalotti
On Tue, Mar 31, 2009 at 11:25 PM, Guido van Rossum gu...@python.org wrote:
 Well hold on for a minute, I remember we used to have an exec
 statement in a class body in the standard library, to define some file
 methods in socket.py IIRC.

FYI, collections.namedtuple is also implemented using exec.
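The trick is to generate class source as a string and exec it into a namespace; a heavily simplified sketch of the technique (not the actual collections source):

```python
def simple_namedtuple(name, fields):
    """Build a tuple subclass from generated source code, the same
    exec-based technique collections.namedtuple uses (much simplified)."""
    args = ", ".join(fields)
    source = (
        f"class {name}(tuple):\n"
        f"    def __new__(cls, {args}):\n"
        f"        return tuple.__new__(cls, ({args},))\n"
    )
    # One read-only property per field, mapping the name to its index.
    for index, field in enumerate(fields):
        source += f"    {field} = property(lambda self: self[{index}])\n"
    namespace = {}
    exec(source, namespace)
    return namespace[name]

Point = simple_namedtuple("Point", ["x", "y"])
p = Point(3, 4)
assert p == (3, 4) and p.x == 3 and p.y == 4
```

Generating source this way keeps the field names visible in tracebacks and in help(), which is part of why namedtuple does it.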

- Alexandre


Re: [Python-Dev] Should the io-c modules be put in their own directory?

2009-04-03 Thread Alexandre Vassalotti
On Fri, Apr 3, 2009 at 5:12 PM, Benjamin Peterson benja...@python.org wrote:
 I'm +.2. This is the layout I would suggest:

 Modules/
  _io/
     _io.c
     stringio.c
     textio.c
     etc


That seems good to me. I opened an issue on the tracker and included a patch.

http://bugs.python.org/issue5682

-- Alexandre


[Python-Dev] Should the io-c modules be put in their own directory?

2009-04-02 Thread Alexandre Vassalotti
Hello,

I just noticed that the new io-c modules were merged into the py3k
branch (I know, I am kind of late on the news—blame school work).
Anyway, I am just wondering if it would be a good idea to put the io-c
modules in a sub-directory (like sqlite), instead of scattering them
around the Modules/ directory.

Cheers,
-- Alexandre


Re: [Python-Dev] GPython?

2009-03-26 Thread Alexandre Vassalotti
On Thu, Mar 26, 2009 at 11:40 PM, Collin Winter coll...@gmail.com wrote:
 In fact, right now I'm adding a last few tests before putting our cPickle
 patches up on the tracker for further review.


Put me on the nosy list when you do; and when I get some free time, I
will give your patches a complete review. I've already taken a quick
look at the cPickle changes you made in Unladen, and I think some
(i.e., the custom memo table) are definitely worthy of being merged
into the mainline.

Cheers,
-- Alexandre


Re: [Python-Dev] IO implementation: in C and Python?

2009-02-19 Thread Alexandre Vassalotti
On Fri, Feb 20, 2009 at 12:35 AM, Steven D'Aprano st...@pearwood.info wrote:

 Currently, if I want to verify that (say) cFoo and Foo do the same thing, or
 compare their speed, it's easy because I can import the modules separately.
 Given the 3.0 approach, how would one access the Python versions without
 black magic or hacks?


My preferred way to handle this is to keep the original Python
implementations with a leading underscore (e.g., pickle._Pickler). I
found this was the easiest way to test the C and Python
implementations without resorting to import hacks.
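With that convention, both implementations stay importable side by side; in today's pickle module (on a standard CPython build, where the _pickle extension rebinds the public names) that looks like:

```python
import io
import pickle

# pickle._Pickler is the pure-Python class; pickle.Pickler is the
# C implementation whenever the _pickle extension module is available.
data = [1, "two", (3.0,)]

buf_py = io.BytesIO()
pickle._Pickler(buf_py).dump(data)

buf_c = io.BytesIO()
pickle.Pickler(buf_c).dump(data)

# Both implementations round-trip the same object.
assert pickle.loads(buf_py.getvalue()) == data
assert pickle.loads(buf_c.getvalue()) == data
```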

-- Alexandre


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Alexandre Vassalotti
On Tue, Jan 27, 2009 at 5:16 PM, Jake McGuire j...@youtube.com wrote:
 Another vaguely related change would be to store string and unicode objects
 in the pickler memo keyed as themselves rather than their object ids.

That wouldn't be difficult to do--i.e., simply add a type check in
Pickler.memoize and another in Pickler.save().  But I am not sure that
would be a good idea, since you would end up hashing every string
pickled. And that would probably be expensive if you are pickling
long strings.
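The id-based memo is easy to observe: two references to one string share a memo entry, while two equal-but-distinct strings are each written out in full (a small demonstration):

```python
import pickle

base = "narf"
a = base * 250        # built at run time, so...
b = base * 250        # ...a == b but a is not b
assert a == b and a is not b

same = pickle.dumps([a, a])       # second occurrence is a short memo backreference
distinct = pickle.dumps([a, b])   # b is a different object: full copy again

assert len(same) < len(distinct)
```

Keying the memo by value would collapse the second case too, at the hashing cost described above.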

-- Alexandre


Re: [Python-Dev] test_subprocess and sparc buildbots

2008-12-30 Thread Alexandre Vassalotti
Here is what I found just by analyzing the logs. It seems the first
failures appeared after this change:

http://svn.python.org/view/python/branches/release30-maint/Objects/object.c?rev=67888&view=diff&r1=67888&r2=67887&p1=python/branches/release30-maint/Objects/object.c&p2=/python/branches/release30-maint/Objects/object.c

The logs of the failing test runs all show the same error message:

[31481 refs]
* ob
object  : refcnt 0 at 0x3a97728
type: str
refcount: 0
address : 0x3a97728
* op-_ob_prev-_ob_next
object  : refcnt 0 at 0x3a97728
type: str
refcount: 0
address : 0x3a97728
* op-_ob_next-_ob_prev
object  : [31776 refs]

This is the output of _Py_ForgetReference (which calls _PyObject_Dump)
called either from _PyUnicode_New or unicode_subtype_new. In both
cases, this implies PyObject_MALLOC returned NULL when allocating the
internal array of a str object. However, I have no idea why malloc()
is failing there.

By counting the number of [reftotal] printed in the log, I found that
the failing test could be one of the following: test_invalid_args,
test_invalid_bufsize, test_list2cmdline, test_no_leaking. Looking at
the tests, it seems only test_no_leaking could be problematic:

* test_list2cmdline checks that the subprocess.list2cmdline function
  works correctly; only Python code is involved here;
* test_invalid_args checks that using an option unsupported by a
  platform raises an exception; only Python code is involved here;
* test_invalid_bufsize only checks whether Popen rejects a non-integer
  bufsize; only Python code is involved here.
And unsurprisingly, that is the failing test:

test test_subprocess failed -- Traceback (most recent call last):
  File 
/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/test/test_subprocess.py,
line 423, in test_no_leaking
data = p.communicate(blime)[0]
  File 
/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py,
line 671, in communicate
return self._communicate(input)
  File 
/home/pybot/buildarea-sid/3.0.klose-debian-sparc/build/Lib/subprocess.py,
line 1171, in _communicate
bytes_written = os.write(self.stdin.fileno(), chunk)
OSError: [Errno 32] Broken pipe

It seems one of the spawned processes runs out of memory while
allocating a new PyUnicode object. I believe we don't see the usual
MemoryError because the parent process captures the stderr and stdout
of the children.
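The OSError itself is easy to reproduce in isolation: write into a pipe whose reading end has already gone away (a sketch unrelated to the buildbot's memory situation):

```python
import errno
import os
import subprocess
import sys

# Spawn a child that exits immediately without reading its stdin.
child = subprocess.Popen([sys.executable, "-c", "pass"],
                         stdin=subprocess.PIPE)
child.wait()

# The read end of the pipe is now closed.  CPython ignores SIGPIPE at
# startup, so the write surfaces as an OSError instead of a signal.
try:
    os.write(child.stdin.fileno(), b"blime")
    raised = False
except OSError as exc:
    raised = exc.errno in (errno.EPIPE, errno.EINVAL)  # EINVAL on Windows
child.stdin.close()
assert raised
```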

Also, only klose-*-sparc buildbots are failing this way; loewis-sun is
failing too but for a different reason. So, how much memory is
available on this machine (or actually, on this virtual machine)?

Now, I wonder why manipulating the GIL caused the bug to appear in
3.0, but not in 2.x. Maybe it is related to the new I/O library in
Python 3.0.

Regards,
-- Alexandre

On Tue, Dec 30, 2008 at 4:20 PM, Nick Coghlan ncogh...@gmail.com wrote:
 Does anyone have local access to a sparc machine to try to track down
 the ongoing buildbot failures in test_subprocess?

 (I think the problem is specific to 3.x builds on sparc machines, but I
 haven't checked the buildbots all that closely - that assessment is just
 based on what I recall of the buildbot failure emails).

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
 ---



Re: [Python-Dev] test_subprocess and sparc buildbots

2008-12-30 Thread Alexandre Vassalotti
On Tue, Dec 30, 2008 at 10:41 PM, Daniel (ajax) Diniz aja...@gmail.com wrote:
 A reliable way to get that in a --with-pydebug build seems to be:

 ~/py3k$ ./python -c import locale; locale.format_string(1,1)
 * ob
 object  : refcnt 0 at 0x825c76c
 type: tuple
 refcount: 0
 address : 0x825c76c
 * op-_ob_prev-_ob_next
 NULL
 * op-_ob_next-_ob_prev
 object  : refcnt 0 at 0x825c76c
 type: tuple
 refcount: 0
 address : 0x825c76c
 Fatal Python error: UNREF invalid object
 TypeError: expected string or buffer
 Aborted


Nice catch! I reduced your example to: import _sre;  _sre.compile(0,
0, []). And, it doesn't seem to be an input validation problem with
_sre. From what I saw, it's actually a bug in Py_TRACE_REFS's code.
Now, it's getting interesting!

It seems something is breaking the refchain. However, I don't know
what is causing the problem exactly.

 Found using Fusil in a very quick run on top of:
 Python 3.1a0 (py3k:68055M, Dec 31 2008, 01:34:52)
 [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2

 So kudos to Victor again :)


Could you share the details of how you used Fusil to find another crasher?
It sounds like a useful tool.

Thanks!

-- Alexandre


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Alexandre Vassalotti
On Mon, Dec 22, 2008 at 7:34 PM, Antoine Pitrou solip...@pitrou.net wrote:

 Now, we should find a way to benchmark this without having to steal Mike's
 machine and wait 30 minutes every time.

 So, I seem to reproduce it. The following script takes about 15 seconds to
 run and allocates a 2 GB dict which it deletes at the end (gc disabled of
 course).
 With 2.4, deleting the dict takes ~1.2 seconds while with 2.5 and higher
 (including 3.0), deleting the dict takes ~3.5 seconds. Nothing spectacular
 but the difference is clear.


I modified your script to delete the dictionary without actually
deallocating the items in it. You can speed up a dictionary
deallocation significantly if you keep a reference to its items and
delete the dictionary before deleting its items. In Python 2.4, the
same behavior exists, but is not as strongly marked as in Python 2.6
with pymalloc enabled.

I can understand that deallocating the items in the order (or
actually, the reverse order) in which they were allocated is faster
than doing so in a rather haphazard manner (i.e., like dict does).
However, I am not sure why pymalloc accentuates this behavior.

-- Alexandre

Python 2.6 with pymalloc, without pydebug

a...@helios:~$ python2.6 dict_dealloc_test.py
creating 397476 items...
 - 6.613 s.
building dict...
 - 0.230 s.
deleting items...
 - 0.059 s.
deleting dict...
 - 2.299 s.
total deallocation time: 2.358 seconds.

a...@helios:~$ python2.6 dict_dealloc_test.py
creating 397476 items...
 - 6.530 s.
building dict...
 - 0.228 s.
deleting dict...
 - 0.089 s.
deleting items...
 - 0.971 s.
total deallocation time: 1.060 seconds.


Python 2.6 without pymalloc, without pydebug

a...@helios:release26-maint$ ./python /home/alex/dict_dealloc_test.py
creating 397476 items...
 - 5.921 s.
building dict...
 - 0.244 s.
deleting items...
 - 0.073 s.
deleting dict...
 - 1.502 s.
total deallocation time: 1.586 seconds.

a...@helios:release26-maint$ ./python /home/alex/dict_dealloc_test.py
creating 397476 items...
 - 6.122 s.
building dict...
 - 0.237 s.
deleting dict...
 - 0.092 s.
deleting items...
 - 1.238 s.
total deallocation time: 1.330 seconds.


a...@helios:~$ python2.4 dict_dealloc_test.py
creating 397476 items...
 - 6.164 s.
building dict...
 - 0.218 s.
deleting items...
 - 0.057 s.
deleting dict...
 - 1.185 s.
total deallocation time: 1.243 seconds.

a...@helios:~$ python2.4 dict_dealloc_test.py
creating 397476 items...
 - 6.202 s.
building dict...
 - 0.218 s.
deleting dict...
 - 0.090 s.
deleting items...
 - 0.852 s.
total deallocation time: 0.943 seconds.



##

import random
import time
import gc


# Adjust this parameter according to your system RAM!
target_size = int(2.0 * 1024**3)  # 2.0 GB

pool_size = 4 * 1024
# This is a ballpark estimate: 60 bytes overhead for each
# { dict entry struct + float object + tuple object header },
# 1.3 overallocation factor for the dict.
target_length = int(target_size / (1.3 * (pool_size + 60)))

def make_items():
    print "creating %d items..." % target_length

    # 1. Initialize a set of pre-computed random keys.
    keys = [random.random() for i in range(target_length)]

    # 2. Build the values that will constitute the dict. Each value will, as
    #    far as possible, span a contiguous `pool_size` memory area.

    # Over 256 bytes per alloc, PyObject_Malloc defers to the system malloc().
    # We avoid that by allocating tuples of smaller longs.
    int_size = 200
    # 24 roughly accounts for the long object overhead (YMMV)
    int_start = 1 << ((int_size - 24) * 8 - 7)
    int_range = range(1, 1 + pool_size // int_size)

    values = [None] * target_length
    # Maximize allocation locality by pre-allocating the values
    for n in range(target_length):
        values[n] = tuple(int_start + j for j in int_range)
    return list(zip(keys, values))

if __name__ == "__main__":
    gc.disable()
    t1 = time.time()
    items = make_items()
    t2 = time.time()
    print " - %.3f s." % (t2 - t1)

    print "building dict..."
    t1 = time.time()
    testdict = dict(items)
    t2 = time.time()
    print " - %.3f s." % (t2 - t1)

    def delete_testdict():
        global testdict
        print "deleting dict..."
        t1 = time.time()
        del testdict
        t2 = time.time()
        print " - %.3f s." % (t2 - t1)

    def delete_items():
        global items
        print "deleting items..."
        t1 = time.time()
        del items
        t2 = time.time()
        print " - %.3f s." % (t2 - t1)

    t1 = time.time()
    # Swap these, and look at the total time
    delete_items()
    delete_testdict()
    t2 = time.time()
    print "total deallocation time: %.3f seconds." % (t2 - t1)


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Alexandre Vassalotti
On Fri, Dec 19, 2008 at 6:29 PM, Mike Coleman tutu...@gmail.com wrote:
 I have a program that creates a huge (45GB) defaultdict.  (The keys
 are short strings, the values are short lists of pairs (string, int).)
  Nothing but possibly the strings and ints is shared.




 That is, after executing the final statement (a print), it is apparently 
 spending a
 huge amount of time cleaning up before exiting.


 I have done 'gc.disable()' for performance (which is hideous without it)--I 
 have
 no reason to think there are any loops.


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Alexandre Vassalotti
[Sorry, for the previous garbage post.]

 On Fri, Dec 19, 2008 at 6:29 PM, Mike Coleman tutu...@gmail.com wrote:
 I have a program that creates a huge (45GB) defaultdict.  (The keys
 are short strings, the values are short lists of pairs (string, int).)
 Nothing but possibly the strings and ints is shared.

Could you give us more information about the dictionary? For example,
how many objects does it contain? Is 45GB the actual size of the
dictionary or of the Python process?

 That is, after executing the final statement (a print), it is apparently
 spending a huge amount of time cleaning up before exiting.

Most of this time is probably spent on DECREF'ing objects in the
dictionary. As others mentioned, it would be useful to have a
self-contained example to examine the behavior more closely.

 I have done 'gc.disable()' for performance (which is hideous without it)--I
 have no reason to think there are any loops.

Have you seen any significant difference in the exit time when the
cyclic GC is disabled or enabled?
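For anyone who wants to check this on their own machine, here is a rough sketch of such a measurement (the dict size and contents are made up for illustration):

```python
import gc
import time

def time_dict_dealloc(n=100000, use_gc=False):
    """Build a dict of n entries and time its deallocation,
    with the cyclic GC either enabled or disabled."""
    if not use_gc:
        gc.disable()
    try:
        d = {i: (str(i), [i]) for i in range(n)}
        start = time.time()
        del d  # drop the only reference; deallocation happens here
        return time.time() - start
    finally:
        gc.enable()
```

Comparing time_dict_dealloc(use_gc=True) against use_gc=False would show whether the collector itself contributes to the slow exit.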

-- Alexandre


Re: [Python-Dev] Reindenting the C code base?

2008-12-15 Thread Alexandre Vassalotti
On Mon, Dec 15, 2008 at 3:59 PM, Guido van Rossum gu...@python.org wrote:
 Aha! A specific file. I'm supportive of fixing that specific file. Now
 if you can figure out how to do it and still allow merging between 2.6
 and 3.0 that would be cool.


Here's the simplest solution I thought so far to allow smooth merging
subsequently. First, fix the 2.6 version with 4-space indent. Over a
third of the file is already using spaces for indentation, so I don't
think losing consistency is a big deal. Then, block the trunk commit
with svnmerge to prevent it from being merged back to the py3k branch.
Finally, fix the 3.0 version.

-- Alexandre


Re: [Python-Dev] Reindenting the C code base?

2008-12-14 Thread Alexandre Vassalotti
On Sat, Dec 13, 2008 at 5:11 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Guido van Rossum guido at python.org writes:

 I think we should not do this. We should use 4 space indents for new
 files, but existing files should not be reindented.

 Well, right now many files are indented with a mix of spaces and tabs, 
 depending
 on who did the edit and how their editor was configured at the time.


Personally, I think the indentation of, at least,
Objects/unicodeobject.c should be fixed. This file has become so
mixed-up with tab and space indents that I have no-idea what to use
when I edit it. Just to give an idea of how messy it is, there are 5214
lines indented with tabs and 4272 indented with spaces (out of the
file's 9733 lines).
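A count like that is easy to reproduce; for example (a sketch, using the file name mentioned above):

```python
def count_indent_styles(lines):
    """Count lines indented with a tab vs. lines indented with spaces."""
    tabs = spaces = 0
    for line in lines:
        if line.startswith("\t"):
            tabs += 1
        elif line.startswith(" "):
            spaces += 1
    return tabs, spaces

# e.g.:
# with open("Objects/unicodeobject.c") as f:
#     print(count_indent_styles(f))
```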

-- Alexandre


Re: [Python-Dev] Reindenting the C code base?

2008-12-14 Thread Alexandre Vassalotti
On Sun, Dec 14, 2008 at 12:43 PM, Jeffrey Yasskin jyass...@gmail.com wrote:
 I've never figured out how to configure emacs to deduce whether the
 current file uses spaces or tabs and has a 4 or 8 space indent. I
 always try to get it right anyway, but it'd be a lot more convenient
 if my editor did it for me. If there are such instructions, perhaps
 they should be added to PEPs 7 and 8?


I know python-mode is able to detect indent configuration of python
code automatically, but I don't know if c-mode is able to. Personally,


Re: [Python-Dev] Reindenting the C code base?

2008-12-14 Thread Alexandre Vassalotti
On Sun, Dec 14, 2008 at 12:57 PM, Alexandre Vassalotti
alexan...@peadrop.com wrote:
 On Sun, Dec 14, 2008 at 12:43 PM, Jeffrey Yasskin jyass...@gmail.com wrote:
 I've never figured out how to configure emacs to deduce whether the
 current file uses spaces or tabs and has a 4 or 8 space indent. I
 always try to get it right anyway, but it'd be a lot more convenient
 if my editor did it for me. If there are such instructions, perhaps
 they should be added to PEPs 7 and 8?


 I know python-mode is able to detect indent configuration of python
 code automatically, but I don't know if c-mode is able to. Personally,


[sorry, tabspace in gmail made it send my unfinished email]

Personally, I use auto-mode-alist to make Emacs choose the indent
configuration to use automatically.

Here's how it looks for me:

(defmacro def-styled-c-mode (name style &rest body)
  "Define styled C modes."
  `(defun ,name ()
     (interactive)
     (c-mode)
     (c-set-style ,style)
     ,@body))

(def-styled-c-mode python-c-mode "python"
  (setq indent-tabs-mode t
        tab-width 8
        c-basic-offset 8))

(def-styled-c-mode py3k-c-mode "python"
  (setq indent-tabs-mode nil
        tab-width 4
        c-basic-offset 4))

(setq auto-mode-alist
      (append '(("/python.org/python/.*\\.[ch]\\'" . python-c-mode)
                ("/python.org/.*/.*\\.[ch]\\'" . py3k-c-mode)) auto-mode-alist))


Re: [Python-Dev] 2to3 question about fix_imports.

2008-12-14 Thread Alexandre Vassalotti
On Fri, Dec 12, 2008 at 11:39 AM, Lennart Regebro rege...@gmail.com wrote:
 The fix_imports fix seems to fix only the first import per line that you have.
 So if you do for example
   import urllib2, cStringIO
 it will not fix cStringIO.

 Is this a bug or a feature? :-) If it's a feature it should warn at
 least, right?


Which revision of python are you using? I tried the test-case you gave
and 2to3 translated it perfectly.

-- Alexandre

a...@helios:~$ cat test.py
import urllib2, cStringIO

s = cStringIO.StringIO(urllib2.randombytes(100))
a...@helios:~$ 2to3 test.py
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: set_literal
RefactoringTool: Skipping implicit fixer: ws_comma
--- test.py (original)
+++ test.py (refactored)
@@ -1,3 +1,3 @@
-import urllib2, cStringIO
+import urllib.request, urllib.error, io

-s = cStringIO.StringIO(urllib2.randombytes(100))
+s = io.StringIO(urllib2.randombytes(100))
RefactoringTool: Files that need to be modified:
RefactoringTool: test.py


Re: [Python-Dev] 2to3 question about fix_imports.

2008-12-14 Thread Alexandre Vassalotti
On Sun, Dec 14, 2008 at 1:34 PM, Lennart Regebro rege...@gmail.com wrote:
 On Sun, Dec 14, 2008 at 19:19, Alexandre Vassalotti
 alexan...@peadrop.com wrote:
 Which revision of python are you using? I tried the test-case you gave
 and 2to3 translated it perfectly.

 3.0, I haven't tried with trunk yet, and possibly it's a more
 complicated usecase.

Strange, fix_imports in Python 3.0 (final) looks fine. If you can come
up with a reproducible example, please open a bug on bugs.python.org
and set me as the assignee (my user id is alexandre.vassalotti).

Thanks,
-- Alexandre


Re: [Python-Dev] Proper initialization of structs

2008-10-30 Thread Alexandre Vassalotti
On Thu, Oct 30, 2008 at 1:00 PM, Fred Drake [EMAIL PROTECTED] wrote:
 It's good to move work into __init__ where reasonable, so that it can be
 avoided if a subclass wants it done in a completely different way, but new
 can't work that way.


And that is exactly the reason why the _pickle module doesn't use
__new__ for initialization.  Doing any kind of argument parsing in
__new__ prevents subclasses from customizing the arguments for their
__init__.

Although I agree that __new__ should be used, whenever possible, to
initialize struct members.
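A small Python-level sketch of that pitfall (the class names here are invented for illustration):

```python
class StrictReader:
    # Problematic pattern: argument parsing happens in __new__.
    def __new__(cls, filename):
        return super().__new__(cls)

    def __init__(self, filename):
        self.filename = filename

class EncodedReader(StrictReader):
    # The subclass customizes its __init__ with an extra keyword...
    def __init__(self, filename, encoding="utf-8"):
        super().__init__(filename)
        self.encoding = encoding

# ...but __new__ receives the same arguments and rejects the extra
# keyword before __init__ is ever called.
try:
    EncodedReader("data.txt", encoding="latin-1")
    rejected = False
except TypeError:
    rejected = True
```

Parsing arguments only in __init__ (and leaving __new__ to set sane defaults) avoids this.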

-- Alexandre


Re: [Python-Dev] Proper initialization of structs

2008-10-30 Thread Alexandre Vassalotti
[oops, I forgot to cc the list]

On Thu, Oct 30, 2008 at 7:43 PM, Christian Heimes [EMAIL PROTECTED] wrote:
 Alexandre Vassalotti wrote:

 And that is exactly the reason why, the _pickle module doesn't use
 __new__ for initialization.  Doing any kind of argument parsing in
 __new__ prevents subclasses from customizing the arguments for their
 __init__.

 Although, I agree that __new__ should be used, whenever it is
 possible, to initialize struct members.

 You are missunderstanding me. I want everybody to set the struct members to
 *A* sensible default value, not *THE* value. Argument parsing can still
 happen in tp_init. tp_new should (or must?) set all struct members to
 sensible defaults like NULL for pointers, -1 or 0 for numbers etc.

 Python uses malloc to allocate memory. Unless you are using debug builds the
 memory block is not initialized. In both cases the block of memory isn't
 zeroed. You all know the problems caused by uninitialized memory.


But what if PyType_GenericAlloc is used for tp_alloc? As far as I
know, the memory block allocated with PyType_GenericAlloc is zeroed.

-- Alexandre


Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Alexandre Vassalotti
On Wed, Jun 25, 2008 at 4:55 PM, Martin v. Löwis [EMAIL PROTECTED] wrote:
 I think exactly the other way 'round. The timing of things should not
 matter at all, only the exact sequence of allocations and deallocations.

Would it be possible, if not a good idea, to only track object
deallocations as the GC traversal trigger? As far as I know, dangling
cyclic references cannot be formed when allocating objects. So, this
could potentially mitigate the quadratic behavior during allocation
bursts.

-- Alexandre


Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Alexandre Vassalotti
On Thu, Jun 26, 2008 at 12:01 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
 I would it be possible, if not a good idea, to only track object
 deallocations as the GC traversal trigger? As far as I know, dangling
 cyclic references cannot be formed when allocating objects.

 Not sure what you mean by that.

 x = []
 x.append(x)
 del x

 creates a cycle with no deallocation occurring.


Oh... never mind then.

-- Alexandre


Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-19 Thread Alexandre Vassalotti
On Sun, Jun 1, 2008 at 12:28 AM, Adam Olsen [EMAIL PROTECTED] wrote:
 On Sat, May 31, 2008 at 10:11 PM, Alexandre Vassalotti
 [EMAIL PROTECTED] wrote:
 Would anyone mind if I did add a public C API for gc.disable() and
 gc.enable()? I would like to use it as an optimization for the pickle
 module (I found out that I get a good 2x speedup just by disabling the
 GC while loading large pickles). Of course, I could simply import the
 gc module and call the functions there, but that seems overkill to me.
 I included the patch below for review.

 I'd rather see it fixed.  It behaves quadratically if you load enough
 to trigger full collection a few times.


Do you have any idea how this behavior could be fixed? I am not a GC
expert, but I could try to fix this.

-- Alexandre


Re: [Python-Dev] [Python-3000] Betas today - I hope

2008-06-11 Thread Alexandre Vassalotti
On Wed, Jun 11, 2008 at 7:35 AM, Barry Warsaw [EMAIL PROTECTED] wrote:
 My plan is to begin building the betas tonight, at around 9 or 10pm EDT
 (0100 to 0200 UTC Thursday).  If a showstopper comes up before then, I'll
 email the list.  If you think we really aren't ready for beta, then I would
 still like to get a release out today.  In that case, we'll call it alpha
 and delay the betas.

I have two release blockers pending review:

  http://bugs.python.org/issue2918
  http://bugs.python.org/issue2917

I believe both patches are ready to be committed to the py3k branch.
However, I would certainly like that someone would review the patches
(or at least test them).

Right now, I am looking at fixing issue 2919
(http://bugs.python.org/issue2919). The profile and the cProfile
module differ much more than I originally expected.  So, I won't be
able to get these two for the beta.

I have also been looking at http://bugs.python.org/issue2874, in which
Benjamin Peterson proposed a simple solution to fix it. Although I
haven't tried his approach, I think I could get this one done for
today.

Finally, I would like to commit the patch in
http://bugs.python.org/issue2523 which fixes the quadratic behavior in
BufferedReader.read(). It would also be nice to have someone else
experienced with the io module review the patch.

Cheers,
-- Alexandre


Re: [Python-Dev] [Python-3000] How to specify keyword-only arguments from C?

2008-06-05 Thread Alexandre Vassalotti
On Thu, Jun 5, 2008 at 11:18 PM, Alexandre Vassalotti
[EMAIL PROTECTED] wrote:
 On Thu, Jun 5, 2008 at 10:14 PM, Mark Hammond [EMAIL PROTECTED] wrote:
 Set an error if the 'arg' tuple doesn't have a length of zero?


 Oh, that isn't a bad idea at all. I will try this. Thanks!


Worked flawlessly!

Just for the archives, here's how it looks:

static int
Unpickler_init(UnpicklerObject *self, PyObject *args, PyObject *kwds)
{
    static char *kwlist[] = {"file", "encoding", "errors", 0};
    PyObject *file;
    char *encoding = NULL;
    char *errors = NULL;

    if (Py_SIZE(args) != 1) {
        PyErr_Format(PyExc_TypeError,
                     "%s takes exactly one positional argument (%zd given)",
                     Py_TYPE(self)->tp_name, Py_SIZE(args));
        return -1;
    }

    if (!PyArg_ParseTupleAndKeywords(args, kwds, "O|ss:Unpickler", kwlist,
                                     &file, &encoding, &errors))
        return -1;
    ...


Thank you, Mark, for the tip!

-- Alexandre


[Python-Dev] C API for gc.enable() and gc.disable()

2008-05-31 Thread Alexandre Vassalotti
Would anyone mind if I did add a public C API for gc.disable() and
gc.enable()? I would like to use it as an optimization for the pickle
module (I found out that I get a good 2x speedup just by disabling the
GC while loading large pickles). Of course, I could simply import the
gc module and call the functions there, but that seems overkill to me.
I included the patch below for review.

-- Alexandre



Index: Include/objimpl.h
===
--- Include/objimpl.h   (revision 63766)
+++ Include/objimpl.h   (working copy)
@@ -221,8 +221,10 @@
  * ==
  */

-/* C equivalent of gc.collect(). */
+/* C equivalent of gc.collect(), gc.enable() and gc.disable(). */
 PyAPI_FUNC(Py_ssize_t) PyGC_Collect(void);
+PyAPI_FUNC(void) PyGC_Enable(void);
+PyAPI_FUNC(void) PyGC_Disable(void);

 /* Test if a type has a GC head */
 #define PyType_IS_GC(t) PyType_HasFeature((t), Py_TPFLAGS_HAVE_GC)
Index: Modules/gcmodule.c
===
--- Modules/gcmodule.c  (revision 63766)
+++ Modules/gcmodule.c  (working copy)
@@ -1252,6 +1252,18 @@
return n;
 }

+void
+PyGC_Disable(void)
+{
+enabled = 0;
+}
+
+void
+PyGC_Enable(void)
+{
+enabled = 1;
+}
+
 /* for debugging */
 void
 _PyGC_Dump(PyGC_Head *g)
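For reference, the same optimization expressed at the Python level (a sketch; pickle.loads stands in for whatever loading routine is being wrapped):

```python
import gc
import pickle

def loads_without_gc(data):
    """Unpickle with the cyclic GC disabled, restoring its prior state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return pickle.loads(data)
    finally:
        # Re-enable only if the collector was on to begin with.
        if was_enabled:
            gc.enable()
```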


Re: [Python-Dev] Module renaming and pickle mechanisms

2008-05-17 Thread Alexandre Vassalotti
On Sat, May 17, 2008 at 5:05 AM, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 I'd like to bring a potential problem to attention that is caused
 by the recent module renaming approach:

 Object serialization protocols like e.g. pickle usually store the
 complete module path to the object class together with the object.

Thanks for bringing this up. I was aware of the problem myself, but I
hadn't yet worked out a good solution to it.


 It can also happen in storage setups where Python
 objects are stored using e.g. pickle, ZODB being a prominent
 example. As soon as a Python 2.6 application starts writing to
 such storages, Python 2.5 and lower versions will no longer be
 able to read back all the data.


The opposite problem exists for Python 3.0, too. Pickle streams
written by Python 2.x applications will not be readable by Python 3.0.
And, one solution to this is to use Python 2.6 to regenerate pickle
stream.

Another solution would be to write a 2to3 pickle converter using the
pickletools module. It is surely not the most elegant or robust
solution, but I could work.
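As a starting point, pickletools.genops() can at least locate the module references such a converter would have to rewrite (the rename table below is a small illustrative subset, not the full 2-to-3 mapping):

```python
import pickletools

# Illustrative subset of the module renames.
RENAMED = {"copy_reg": "copyreg", "Queue": "queue", "SocketServer": "socketserver"}

def renamed_globals(data):
    """Find GLOBAL opcodes in a pickle stream that reference renamed modules.

    Returns a list of (offset, old_module, new_module) tuples.
    """
    hits = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # The decoded arg is "module name" joined by a blank.
            module = arg.split(" ", 1)[0]
            if module in RENAMED:
                hits.append((pos, module, RENAMED[module]))
    return hits
```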

 Now, I think there's a way to solve this puzzle:

 Instead of renaming the modules (e.g. Queue - queue), we leave
 the code in the existing modules and packages and instead add
 the new module names and package structure with pointers and
 redirects to the existing 2.5 modules.

This would certainly work for simple modules, but what about packages?
For packages, you can't use the ``sys.modules[__name__] = Queue`` to
preserve module identity. Therefore, pickle will use the new package
name when writing its streams. So, we are back to the same problem
again.

A possible solution could be writing a compatibility layer for the
Pickler class, which would map new module names to their old at
runtime. Again, this is neither an elegant, nor robust, solution, but
it should work in most cases.
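On the reading side, Python 3's Unpickler makes such a compatibility layer fairly easy, since find_class() can be overridden (the mapping below is an illustrative fragment, not a complete table):

```python
import pickle

# Illustrative fragment of an old-to-new module name mapping.
OLD_TO_NEW = {"copy_reg": "copyreg", "Queue": "queue"}

class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that maps old 2.x module names to their new names."""
    def find_class(self, module, name):
        return super().find_class(OLD_TO_NEW.get(module, module), name)
```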

-- Alexandre


Re: [Python-Dev] Module renaming and pickle mechanisms

2008-05-17 Thread Alexandre Vassalotti
Errata:

On Sat, May 17, 2008 at 10:59 AM, Alexandre Vassalotti
[EMAIL PROTECTED] wrote:
 And, one solution to this is to use Python 2.6 to regenerate pickle
 stream.

... to regenerate *the* pickle *streams*.


 It is surely not the most elegant or robust solution, but I could work.

... but *it* could work.


 This would certainly work for simple modules, but what about packages?
 For packages, you can't use the ``sys.modules[__name__] = Queue`` to
 preserve module identity.

... you can't use the ``sys.modules[__name__] = Queue`` *trick* to
preserve module identity.


 A possible solution could be writing a compatibility layer for the

... could be *to write* a compatibility layer...


I guess I should start proofreading my emails before sending them, not after...

-- Alexandre


Re: [Python-Dev] Symbolic errno values in error messages

2008-05-16 Thread Alexandre Vassalotti
On Fri, May 16, 2008 at 10:52 AM, Yannick Gingras [EMAIL PROTECTED] wrote:
 Alexander Belopolsky [EMAIL PROTECTED] writes:

 >>> try:
 ...    open('/')
 ... except Exception,e:
 ...    pass
 ...
 >>> print e
 [Errno 21] Is a directory

 So now I am not sure what OP is proposing.  Do you want to replace 21
 with EISDIR in the above?

 Yes, that's what I had in mind.


Then, check out EnvironmentError_str in Objects/exceptions.c. You
should be able to import the errno module and fetch its errorcode
dictionary.
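Something along these lines, in pure Python (a sketch of the intended output, not the actual C change):

```python
import errno
import os

def symbolic_strerror(err):
    """Render '[EISDIR] Is a directory' instead of '[Errno 21] Is a directory'."""
    name = errno.errorcode.get(err, "Errno %d" % err)
    return "[%s] %s" % (name, os.strerror(err))
```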

-- Alexandre


Re: [Python-Dev] Distutils configparser rename

2008-05-15 Thread Alexandre Vassalotti
On Thu, May 15, 2008 at 6:49 PM, Nick Coghlan [EMAIL PROTECTED] wrote:
 Since it would be nice for the standard library to not emit any warnings
 with the -3 flag, perhaps distutils should at least be trying the new name
 first, and only falling back to the old name on an ImportError (assuming we
 do decide we want to be able to run the 2.6 distutils on older versions of
 Python).


Well, that is a good idea. And, that will silence the Windows
buildbots while other developers find out how to add lib-old/ to the
sys.path.

-- Alexandre


Re: [Python-Dev] heads up on svn.python.org ssh keys - debian/ubuntu users may need new ones

2008-05-13 Thread Alexandre Vassalotti
On Tue, May 13, 2008 at 7:12 PM, Martin v. Löwis [EMAIL PROTECTED] wrote:
  If you generated your python subversion ssh key during this time on a
   machine fitting the description above, please consider replacing your
   keys.
  
   apt-get update ; apt-get upgrade on debian will provide you with a
   ssh-vulnkey program that can be used to test if your ssh keys are
   valid or not.

  I'll ping all committers for which ssh-vulnkey reports COMPROMISED.

  I personally don't think the threat is severe - unless people also
  published their public SSH keys somewhere, there is little chance that
  somebody can break in by just guessing them remotely - you still need
  to try a lot of combinations for user names and passwords, plus with
  subversion, we'll easily recognize doubtful checkins (as we do even
  if the committer is legitimate :-).


Well, I had a break-in on my public server (peadrop.com) this week,
which had a copy of my ssh pubkey. I don't know if the attacker took a
look at my pubkeys, but I won't take any chances. So, I definitely have
to change my key, ASAP.

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-12 Thread Alexandre Vassalotti
On Mon, May 12, 2008 at 3:40 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
  When I rename a module I use svn copy, since svn remove doesn't
   pick up changes made to the deleted file. For example, here is what
   I did for PixMapWrapper:

  You want to make changes to the deleted file? Why?


The idea was to replace the original module file with its stub.
However, the svn copy and edit process isn't the cause of the
problems. It is the fact that 2 files existed in the same directory
differing only by a case-change.

Anyway, all the buildbots seem okay now.

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-12 Thread Alexandre Vassalotti
On Mon, May 12, 2008 at 3:49 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
  Well, I guess I really messed up on that one. So, do you have any idea
   on how to revert the changes?

  If the changes were in a single revision N, do

  svn merge -rN:N-1 .
  svn commit -m "revert rN"

  If they span over several subsequent revisions, use N-k
  instead. If they span over several revisions with intermediate
  revisions that you don't want to revert, try multiple merge
  commands before a single commit; if that fails, revert and commit
  each range of changes separately.



Yes. That is exactly what I did to revert the changes.


  P.S. If you want to get the buildbots back in shape (in case they
  aren't), build a non-existing branch through the UI
  (which will cause a recursive removal of the entire checkout), then
  either wait for the next regular commit, or force a build of the
  respective branch (branches/py3k or trunk). On Windows, if there
  is still a python.exe process holding onto its binary, that fails,
  and we need support from the slave admin.


Thanks for the tip. Now, I just hope that I will never have to use it. ;-)

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-12 Thread Alexandre Vassalotti
On Mon, May 12, 2008 at 7:18 AM, Paul Moore [EMAIL PROTECTED] wrote:
  Revision 63129 is not valid on case folding filesystems. In
  particular, this horribly breaks using hg-svn to make a local mirror
  of the Python repository:

  \Apps\HGsvn\hgimportsvn.exe -r 63120
  http://svn.python.org/projects/python/trunk foo
  cd foo
  \apps\hgsvn\hgpullsvn
  hg log Lib\socketserver.py
  changeset:   2:e8856fdf9300
  branch:  trunk
  tag: svn.63129
  user:alexandre.vassalotti
  date:Mon May 12 02:37:10 2008 +0100
  summary: [svn] Renamed SocketServer to 'socketserver'.

  hg up -r2
  abort: case-folding collision between Lib/socketserver.py and
  Lib/SocketServer.py

  hg up -rtip
  abort: case-folding collision between Lib/socketserver.py and
  Lib/SocketServer.py

  The hg repository is now totally broken.


Which version of Mercurial are you using? I know that versions prior
to 1.0 had some bugs with handling case-changes on case-insensitive
filesystems.

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-12 Thread Alexandre Vassalotti
On Mon, May 12, 2008 at 9:24 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
The idea was to replace the original module file with its stub.
   However, the svn copy and edit process isn't the cause of the
   problems. It is the fact that 2 files existed in the same directory
   differing only by a case-change.

  I still don't understand. You wanted to replace the file with a stub,
  and then delete it? Why not just delete it (or use svn mv in the first
  place)?

No. That is exactly what I wanted to avoid by using svn copy
instead of svn move. svn move marks the original file for removal,
which makes it impossible to modify the original file in the same
commit. Anyway, Brett updated the PEP with a renaming procedure that
avoids this problem completely.

-- Alexandre


Re: [Python-Dev] How best to handle the docs for a renamed module?

2008-05-12 Thread Alexandre Vassalotti
On Mon, May 12, 2008 at 6:10 AM, Georg Brandl [EMAIL PROTECTED] wrote:
  I've now updated docs for the Queue, SocketServer and copy_reg modules in
  the trunk.


Thank you, Georg, for updating docs!

-- Alexandre


Re: [Python-Dev] Trickery with moving urllib

2008-05-11 Thread Alexandre Vassalotti
On Sat, May 10, 2008 at 11:43 PM,  [EMAIL PROTECTED] wrote:

Brett There is going to be an issue with the current proposal for
Brett keeping around urllib. Since the package is to be named the same
Brett thing as the module

 Is this the only module morphing into a package of the same name?


No, it is not. The dbm package will have the same issue.

-- Alexandre


Re: [Python-Dev] Trickery with moving urllib

2008-05-11 Thread Alexandre Vassalotti
On Sat, May 10, 2008 at 11:38 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 I see three solutions for dealing with this.

 1. Have stubs for the entire urllib API in urllib.__init__ that raise
 a DeprecationWarning either specifying the new name or saying the
 function/class is deprecated.

 2. Rename urllib to urllib.fetch or urllib.old_request to get people
 to move over to urllib.request (aka urllib2) at some point.


I am probably missing something, because I don't see how this solution
would solve the problem. The warning in urllib.__init__ will still be
issued when people import urllib.fetch (or urllib.old_request).
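For readers today, this is easy to verify: importing a submodule always executes the package's __init__, so a warning placed there fires no matter which submodule was wanted. A runnable sketch, where "fakepkg" is an invented stand-in for urllib:

```python
import os
import sys
import tempfile
import warnings

# Build a throwaway package on disk; "fakepkg" stands in for urllib.
tmp = tempfile.mkdtemp()
pkgdir = os.path.join(tmp, "fakepkg")
os.mkdir(pkgdir)
with open(os.path.join(pkgdir, "__init__.py"), "w") as f:
    f.write("import warnings\n"
            "warnings.warn('fakepkg is deprecated', DeprecationWarning)\n")
with open(os.path.join(pkgdir, "fetch.py"), "w") as f:
    f.write("value = 1\n")

sys.path.insert(0, tmp)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    import fakepkg.fetch  # running fakepkg/__init__.py is unavoidable here

# The package-level warning fired even though only the submodule was wanted.
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```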

-- Alexandre


[Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-11 Thread Alexandre Vassalotti
Hello,

I have been working on the module renaming for PEP 3108, and I just
noticed that some buildbots are throwing errors while updating their
checkout. It seems the method I use for renaming modules hits a
subversion bug on certain platforms. The error thrown looks like this:

...
svn: In directory 'build/Lib/plat-mac'
svn: Can't move source to dest
svn: Can't move
'build/Lib/plat-mac/.svn/tmp/prop-base/pixmapwrapper.py.svn-base' to
'build/Lib/plat-mac/.svn/prop-base/pixmapwrapper.py.svn-base': No such
file or directory
program finished with exit code 1

(http://www.python.org/dev/buildbot/all/x86 osx.5 trunk/builds/201/step-svn/0)


When I rename a module I use svn copy, since svn remove doesn't
pick up changes made to the deleted file. For example, here is what
I did for PixMapWrapper:

   svn copy ./Lib/plat-mac/PixMapWrapper.py ./Lib/plat-mac/pixmapwrapper.py
   edit ./Lib/plat-mac/PixMapWrapper.py
   svn commit

It seems that I could avoid this error by using cp instead of svn
copy (which I did use for renaming copy_reg). However, I am not sure
whether this method preserves the full history of the file.

So, what should I do to fix the failing buildbots?

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-11 Thread Alexandre Vassalotti
On Sun, May 11, 2008 at 5:44 PM, Paul Moore [EMAIL PROTECTED] wrote:
 2008/5/11 Alexandre Vassalotti [EMAIL PROTECTED]:
 When I rename a module I use svn copy, since svn remove doesn't
 pick up changes made to the deleted file. For example, here is what
 I did for PixMapWrapper:

   svn copy ./Lib/plat-mac/PixMapWrapper.py ./Lib/plat-mac/pixmapwrapper.py
   edit ./Lib/plat-mac/PixMapWrapper.py
   svn commit

 That seems a very odd usage. You're renaming, not copying. Why aren't
 you using svn rename (svn move)? I can well imagine this causing
 serious confusion.


I wrote:
 When I rename a module I use svn copy, since svn remove doesn't
 pick up changes made to the deleted file. For example, here is what
 I did for PixMapWrapper:

Oops, I meant svn rename when I said svn remove. As I said, if I
use svn rename I cannot make changes to the file being renamed.


 Please be very careful here - if you introduce revisions which contain
 multiple files with names that differ only in case, you're going to
 really mess up history (and probably the only clean way to fix this
 will be to actually go back and edit the history).

Oh, you are right. I totally forgot about case-insensitive filesystems.
This is really going to make such case-change renamings nasty.

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-11 Thread Alexandre Vassalotti
On Sun, May 11, 2008 at 6:31 PM, Brett Cannon [EMAIL PROTECTED] wrote:
 The PEP specifies the lib-old directory to hold the old case name so
 that the svn rename won't lead to two files in the same directory. I
 was hoping that creating the stub in lib-old would allow a simple
 ``svn rename`` for the original module on a case-sensitive file-system
 and the case-insensitive file-systems would just be able to deal with
 it. Is that just not going to work?

 Oh, and I am really sorry, Alexandre, but the PixMapWrapper rename
 should have been taken out of the PEP as the entire Mac directory is
 going away, so the rename is kind of pointless since the module is
 going to be deleted.


Well, I guess I really messed up on that one. So, do you have any idea
on how to revert the changes?

-- Alexandre


Re: [Python-Dev] Buildbots have trouble checking out the repository due to recent changes.

2008-05-11 Thread Alexandre Vassalotti
On Sun, May 11, 2008 at 5:29 PM, Alexandre Vassalotti
[EMAIL PROTECTED] wrote:
 Hello,

 I have been working on the module renaming for PEP 3108, and I just
 noticed that some buildbots are throwing errors while updating their
 checkout. It seems the method I use for renaming modules hits a
 subversion bug on certain platforms. The error thrown looks like this:
[SNIP]
 So, how should I do to fix the failing buildbots?


I reverted all the problematic changes and the buildbots are green again.

Thank you all for your support!

-- Alexandre


Re: [Python-Dev] r62778 - in python/branches/py3k: Lib/io.py Lib/test/test_StringIO.py Lib/test/test_io.py Lib/test/test_largefile.py Lib/test/test_memoryio.py Lib/test/test_mimetools.py Modules/_byte

2008-05-07 Thread Alexandre Vassalotti
On Tue, May 6, 2008 at 6:52 PM, Christian Heimes [EMAIL PROTECTED] wrote:
 alexandre.vassalotti schrieb:

  Author: alexandre.vassalotti
   Date: Tue May  6 21:48:38 2008
   New Revision: 62778
  
   Log:
   Added fast alternate io.BytesIO implementation and its test suite.
   Removed old test suite for StringIO.
   Modified truncate() to imply a seek to given argument value.

  Thanks for your great work! But what about the trunk? :] Can you port
  your code to the trunk before the alpha gets out?


I have a backported version of my code for the trunk. Should I commit
it, or should I post it to the issue tracker and wait for a proper review?

-- Alexandre


Re: [Python-Dev] PEP Proposal: Revised slice objects lists use slice objects as indexes

2008-03-09 Thread Alexandre Vassalotti
On Sun, Mar 9, 2008 at 7:21 PM, Forrest Voight [EMAIL PROTECTED] wrote:
 This would simplify the handling of list slices.

  Slice objects that are produced in a list index area would be different,
  and optionally the syntax for slices in list indexes would be expanded
  to work everywhere. Instead of being containers for the start, end,
  and step numbers, they would be generators, similar to xranges.

I am not sure what you are trying to propose here. The slice object
isn't special, it's just a regular built-in type.

>>> slice(1,4)
slice(1, 4, None)
>>> [1,2,3,4,5,6][slice(1,4)]
[2, 3, 4]

I don't see how introducing new syntax would simplify indexing.


  Lists would accept these slice objects as indexes, and would also
  accept any other list or generator.


Why should lists accept a list or a generator as an index? What is the
use case you have in mind?


  Optionally, the 1:2 syntax would create a slice object outside of list
  index areas.

Again, I don't see how this could be useful...


>>> list(1:5)
[1, 2, 3, 4]

>>> list(1:5:2)
[1, 3]


list(range(1,5,2))?

>>> range(30)[1:5 + 15:17]
[1, 2, 3, 4, 15, 16]


This is confusing, IMHO, and doesn't provide any advantage over:

>>> s = list(range(30))
>>> s[1:5] + s[15:17]

If you really needed it, you could define a custom class with a fancy
__getitem__

>>> class A:
...     def __getitem__(self, x):
...         return x
...
>>> A()[1:3,2:5]
(slice(1, 3, None), slice(2, 5, None))
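For what it's worth, slice objects are already first-class values, so the reusable named indexes the proposal seems to want work today without any new syntax (HEAD, EVERY_OTHER, and first_three are invented names for illustration):

```python
from operator import itemgetter

# Named slice objects: the 1:2 syntax is only needed inside subscripts;
# elsewhere, slice() spells exactly the same value.
HEAD = slice(0, 3)
EVERY_OTHER = slice(None, None, 2)

data = list(range(10))
assert data[HEAD] == [0, 1, 2]
assert data[EVERY_OTHER] == [0, 2, 4, 6, 8]

# itemgetter turns a slice into a reusable callable "indexer".
first_three = itemgetter(HEAD)
assert first_three("abcdef") == "abc"
assert first_three([10, 20, 30, 40]) == [10, 20, 30]
```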

P.S. You should consider using the python-ideas
(http://mail.python.org/mailman/listinfo/python-ideas) mailing list,
instead of python-dev for posting suggestions.

Cheers,
-- Alexandre


Re: [Python-Dev] Any Emacs tips for core developers?

2008-02-04 Thread Alexandre Vassalotti
On Feb 4, 2008 7:47 PM,  [EMAIL PROTECTED] wrote:

 I should have asked this before, but what's so special about core
 (Python?)  development that the tools should be different than for
 non-core development?

Brett Usually the core has keywords, built-ins, etc. that have not been
Brett pushed to the release versions for various editors.

 Ah, okay.  Barry mentioned something about adjusting the python-mode syntax
 tables to include Python 3.x stuff, though patches are always
 welcome. wink

Brett Plus coding guidelines might be different from PEPs 7 and 8
Brett compared to what an editor is set to do by default.

 That might be a bit more challenging.  I was thinking today that it would be
 kind of nice to have a set of predefined settings for Python's new C style
 (someone mentioned producing that).  Should that go in the C/C++ mode or be
 delivered somehow else?


It's fairly trivial to adjust cc-mode to conform to the PEP 7 C coding conventions:

(defmacro def-styled-c-mode (name style &rest body)
  "Define styled C modes."
  `(defun ,name ()
     (interactive)
     (c-mode)
     (c-set-style ,style)
     ,@body))

(def-styled-c-mode python-c-mode "python"
  (setq indent-tabs-mode t
        tab-width 8
        c-basic-offset 8))

(def-styled-c-mode py3k-c-mode "python"
  (setq indent-tabs-mode nil
        tab-width 4
        c-basic-offset 4))


-- Alexandre


Re: [Python-Dev] [Python-3000] inst_persistent_id

2008-01-14 Thread Alexandre Vassalotti
Oh, you are right. I thought that save_inst() used inst_persistent_id,
but that isn't the case. Now, I have checked more thoroughly and found
the relevant piece of code:

if (!pers_save && self->inst_pers_func) {
    if ((tmp = save_pers(self, args, self->inst_pers_func)) != 0) {
        res = tmp;
        goto finally;
    }
}

which is indeed called only when the object is not supported by pickle.
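In the Python 3 pickle module, the same hook survives as the documented persistent_id/persistent_load pair on Pickler and Unpickler subclasses. A minimal sketch, with the Token class and the ("Token", name) id scheme invented for illustration:

```python
import io
import pickle

class Token:
    """Stand-in for an object pickled by reference (invented for illustration)."""
    def __init__(self, name):
        self.name = name

class TokenPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Called for every object; returning None falls back to normal pickling.
        if isinstance(obj, Token):
            return ("Token", obj.name)
        return None

class TokenUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, name = pid
        assert tag == "Token"
        return Token(name)

buf = io.BytesIO()
TokenPickler(buf).dump([Token("a"), 42])
buf.seek(0)
out = TokenUnpickler(buf).load()
assert out[0].name == "a" and out[1] == 42
```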

I guess my original argument doesn't hold anymore, thus I don't have
anything against supporting this feature officially.

Thanks for correcting me!
-- Alexandre


On Jan 14, 2008 12:59 PM, Armin Rigo [EMAIL PROTECTED] wrote:
 Hi,

 On Sat, Jan 12, 2008 at 07:33:38PM -0500, Alexandre Vassalotti wrote:
  Well, in Python 3K, inst_persistent_id() won't be usable, since
  PyInstance_Type was removed.

 Looking at the code, inst_persistent_id() is just a confusing name.  It
 has got nothing to do with PyInstance_Type; it's called for any object
 type that cPickle.c doesn't know how to handle.  In fact, it seems that
 cPickle.c never calls inst_persistent_id() for objects of type
 PyInstance_Type...


 A bientot,

 Armin.



Re: [Python-Dev] PEP: per user site-packages directory

2008-01-11 Thread Alexandre Vassalotti
I can't comment on the implementation details, but +1 for the idea. I
think this feature will be very useful in a shared hosting
environment.

-- Alexandre

On Jan 11, 2008 6:27 PM, Christian Heimes [EMAIL PROTECTED] wrote:
 PEP: XXX
 Title: Per user site-packages directory
 Version: $Revision$
 Last-Modified: $Date$
 Author: Christian Heimes christian(at)cheimes(dot)de
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
 Created: 11-Jan-2008
 Python-Version: 2.6, 3.0
 Post-History:



[Python-Dev] Subversion forbidden error while committing to trunk

2007-12-07 Thread Alexandre Vassalotti
Hi,

I tried a few times to commit a patch (for issue #1530) to the trunk,
but I always get this error:

  alex:python% svn commit Lib/doctest.py --file svn-commit.tmp
  svn: Commit failed (details follow):
  svn: MKACTIVITY of
'/projects/!svn/act/53683b5b-99d8-497e-bc98-6d07f9401f50': 403
Forbidden (http://svn.python.org)

I first thought that was related to the Py3k freeze. However, I tried
again a few minutes ago and I still got this error. Is it possible that
my commit rights are limited to the py3k branches? Or is this a
genuine error?

Thanks,
--  Alexandre


Re: [Python-Dev] Subversion forbidden error while committing to trunk

2007-12-07 Thread Alexandre Vassalotti
Thanks Guido.

I just found what the problem was. My checkout of the trunk was the
read-only one (i.e., over http).

-- Alexandre

On Dec 7, 2007 11:40 PM, Guido van Rossum [EMAIL PROTECTED] wrote:

 On Dec 7, 2007 8:35 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
  I tried a few times to commit a patch (for issue #1530) to the trunk,
  but I always get this error:
 
alex:python% svn commit Lib/doctest.py --file svn-commit.tmp
svn: Commit failed (details follow):
svn: MKACTIVITY of
  '/projects/!svn/act/53683b5b-99d8-497e-bc98-6d07f9401f50': 403
  Forbidden (http://svn.python.org)
 
  I first thought that was related to the Py3k freeze. However, I tried
  again a few minutes ago and I still got this error. Is possible that
  my commit rights are limited to the py3k branches? Or this is a
  genuine error?

 I just successfully committed something to the trunk, so the server is
 not screwed.

 I'm not aware of an access control mechanism that would prevent anyone
 from checking in to the trunk while allowing them to check in to a
 branch.

 I suspect your workspace may be corrupt.

 --
 --Guido van Rossum (home page: http://www.python.org/~guido/)



Re: [Python-Dev] [poll] New name for __builtins__

2007-12-04 Thread Alexandre Vassalotti
I just want to let you all know that the name issue was settled and
committed to py3k branch a few days ago. It was chosen to simply
rename the module __builtin__ to builtins.
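For readers following along on 3.x, a quick check of the renamed module, which also illustrates the point below that the builtin namespace is a flat collection of names rather than the root of anything:

```python
import builtins

# The Python 3 spelling: no underscores, importable like any other module.
assert builtins.len is len
assert builtins.__name__ == "builtins"

# It knows nothing about other modules -- there is no builtins.sys.
assert not hasattr(builtins, "sys")
```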

-- Alexandre

On Nov 29, 2007 6:15 AM, Nick Coghlan [EMAIL PROTECTED] wrote:
 Given that the *effect* of __builtins__ is to make the contents of the
 __builtin__ module implicitly available in every module's global
 namespace, why not call it __implicit__?

 I really don't like all of these __root__ inspired names, because
 __builtin__ isn't the root of any Python hierarchy that I know of.

>>> import sys
>>> import __builtin__
>>> __builtin__.sys
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'sys'

 The builtin namespace doesn't know anything about other modules, the
 current module's global namespace, the current function's local
 variables, or much of anything really. To me, the concept of root in a
 computing sense implies a node from which you can reach every other node
 - from the root of the filesystem you can get to every other directory,
 as the root user you can access any other account, etc. To those that
 like these names, what do you consider __root__ to be the root of?



Re: [Python-Dev] [poll] New name for __builtins__

2007-12-04 Thread Alexandre Vassalotti
Oh, sorry for the noise. I thought people were still arguing about the
name issue, but it was in fact five-day-late emails that I am still
receiving. (Gmail seems to have delivery issues lately...)

-- Alexandre

On Dec 4, 2007 12:49 PM, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
 I just want to let you all know that the name issue was settled and
 committed to py3k branch a few days ago. It was chosen to simply
 rename the module __builtin__ to builtins.



Re: [Python-Dev] Extending Python 3000

2007-09-18 Thread Alexandre Vassalotti
PyObject_HEAD was changed in Py3k to make it conform to C's strict
aliasing rules (See PEP 3123 [1]).

In your code, you need to change:

static PyTypeObject MPFType = {
PyObject_HEAD_INIT(NULL)
0, /*ob_size*/
...
}

to this:

static PyTypeObject MPFType = {
PyVarObject_HEAD_INIT(NULL, 0)
...
}

Good luck,
-- Alexandre

[1]: http://www.python.org/dev/peps/pep-3123/


Re: [Python-Dev] Avoiding cascading test failures

2007-09-02 Thread Alexandre Vassalotti
On 8/28/07, Collin Winter [EMAIL PROTECTED] wrote:
 On 8/22/07, Alexandre Vassalotti [EMAIL PROTECTED] wrote:
  When I was fixing tests failing in the py3k branch, I found the number
  of duplicate failures annoying. Often, a single bug in an important
  method or function caused a large number of test cases to fail. So, I
  thought of a simple mechanism for avoiding such cascading failures.
 
  My solution is to add a notion of dependency to testcases. A typical
  usage would look like this:
 
  @depends('test_getvalue')
  def test_writelines(self):
  ...
  memio.writelines([buf] * 100)
  self.assertEqual(memio.getvalue(), buf * 100)
  ...

 This definitely seems like a neat idea. Some thoughts:

 * How do you deal with dependencies that cross test modules? Say
 test A depends on test B, how do we know whether it's worthwhile
 to run A if B hasn't been run yet? It looks like you run the test
 anyway (I haven't studied the code closely), but that doesn't
 seem ideal.

I am not sure what you mean by test modules. Do you mean module in
the Python sense, or like a test-case class?

 * This might be implemented in the wrong place. For example, the [x
 for x in dir(self) if x.startswith('test')] you do is most certainly
 better-placed in a custom TestLoader implementation.

That certainly is a good suggestion. I am not sure yet how I will
implement my idea in the unittest module. However, I am pretty sure that
it will be quite different from my prototype.

 But despite that, I think it's a cool idea and worth pursuing. Could
 you set up a branch (probably of py3k) so we can see how this plays
 out in the large?

Sure. I need to finish merging pickle and cPickle for Py3k before
tackling this project, though.

-- Alexandre


Re: [Python-Dev] Order of operations

2007-08-29 Thread Alexandre Vassalotti
On 8/29/07, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Scott Dial schrieb:
  Martin v. Löwis wrote:
  Do you know why? Thanks!
  I'm not sure why precedence was defined that
  way, though.
 
 
  Because it is consistent with C's precedence rules.

 Maybe I'm missing something - how exactly is the exponentiation
 operator spelled in C?


C doesn't have an exponentiation operator. You use the pow() function, instead:

  #include <math.h>
  double pow(double x, double y);
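Relatedly, Python's ** operator has precedence rules of its own that C's pow() function sidesteps entirely; a few quick checks:

```python
# ** binds tighter than unary minus, and associates to the right.
assert -2 ** 2 == -4           # parsed as -(2 ** 2)
assert (-2) ** 2 == 4
assert 2 ** 3 ** 2 == 512      # parsed as 2 ** (3 ** 2)
assert 2 ** 10 == pow(2, 10) == 1024
```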

-- Alexandre


Re: [Python-Dev] Avoiding cascading test failures

2007-08-25 Thread Alexandre Vassalotti
On 8/25/07, Gregory P. Smith [EMAIL PROTECTED] wrote:
 I like this idea.

Yay! Now, I ain't the only one. ;)

 Be sure to have an option to ignore dependencies and run all tests.

Yes, I planned to add a such option.

 Also when skipping tests because a dependency failed, have unittest
 print out an indication that a test was skipped due to a dependency
 rather than silently running fewer tests.  Otherwise it could be
 deceptive and appear that only one test was affected.

However, that was never planned.
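For reference, modern unittest reports skips together with a reason string, which addresses the visibility concern; a rough sketch, where the base_passed flag is a hypothetical stand-in for real dependency tracking:

```python
import io
import unittest

class DependentTests(unittest.TestCase):
    # Class-level flag standing in for real dependency tracking (hypothetical).
    base_passed = False

    def test_a_base(self):
        type(self).base_passed = True

    def test_b_dependent(self):
        if not self.base_passed:
            # The reason shows up in the test report instead of a silent skip.
            self.skipTest("depends on test_a_base, which did not pass")
        self.assertTrue(self.base_passed)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(DependentTests)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=2).run(suite)
assert result.wasSuccessful()
assert len(result.skipped) == 0  # alphabetical order ran test_a_base first
```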

I added the ignore_dependencies option. Also, I fixed the sub-optimal
dependency resolution algorithm that was in my original example
implementation.

-- Alexandre

--- dep.py.old  2007-08-25 19:54:27.0 -0400
+++ dep.py  2007-08-25 20:02:55.0 -0400
@@ -2,8 +2,9 @@
 class CycleError(Exception):
 pass

+class TestGraph:

-class TestCase:
+ignore_dependencies = False

 def __init__(self):
 self.graph = {}
@@ -19,16 +20,16 @@
 graph = self.graph
 toskip = set()
 msgs = []
-while graph:
+if self.ignore_dependencies:
+for test in graph:
+graph[test].clear()
 # find tests without any pending dependencies
-source = [test for test, deps in graph.items() if not deps]
-if not source:
-raise CycleError
-for testname in source:
+queue = [test for test, deps in graph.items() if not deps]
+while queue:
+testname = queue.pop()
 if testname in toskip:
 msgs.append(%s... skipped % testname)
-resolvedeps(graph, testname)
-del graph[testname]
+queue.extend(resolve(graph, testname))
 continue
 test = getattr(self, testname)
 try:
@@ -42,8 +43,9 @@
 else:
 msgs.append(%s... ok % testname)
 finally:
-resolvedeps(graph, testname)
-del graph[testname]
+queue.extend(resolve(graph, testname))
+if graph:
+raise CycleError
 for msg in sorted(msgs):
 print(msg)

@@ -60,10 +62,15 @@
 rdeps.update(getrevdeps(graph, x))
 return rdeps

-def resolvedeps(graph, testname):
+def resolve(graph, testname):
+toqueue = []
 for test in graph:
 if testname in graph[test]:
 graph[test].remove(testname)
+if not graph[test]:
+toqueue.append(test)
+del graph[testname]
+return toqueue

 def depends(*args):
 def decorator(test):
@@ -75,7 +82,9 @@
 return decorator


-class MyTest(TestCase):
+class MyTest(TestGraph):
+
+ignore_dependencies = True

@depends('test_foo')
def test_nah(self):

class CycleError(Exception):
pass

class TestGraph:

ignore_dependencies = False

def __init__(self):
self.graph = {}
tests = [x for x in dir(self) if x.startswith('test')]
for testname in tests:
test = getattr(self, testname)
if hasattr(test, 'deps'):
self.graph[testname] = test.deps
else:
self.graph[testname] = set()

def run(self):
graph = self.graph
toskip = set()
msgs = []
if self.ignore_dependencies:
for test in graph:
graph[test].clear()
# find tests without any pending dependencies
queue = [test for test, deps in graph.items() if not deps]
while queue:
testname = queue.pop()
if testname in toskip:
                msgs.append("%s... skipped" % testname)
queue.extend(resolve(graph, testname))
continue
test = getattr(self, testname)
try:
test()
except AssertionError:
toskip.update(getrevdeps(graph, testname))
                msgs.append("%s... failed" % testname)
except:
toskip.update(getrevdeps(graph, testname))
                msgs.append("%s... error" % testname)
else:
                msgs.append("%s... ok" % testname)
finally:
queue.extend(resolve(graph, testname))
if graph:
raise CycleError
for msg in sorted(msgs):
print(msg)


def getrevdeps(graph, testname):
    """Return the reverse dependencies of a test."""
rdeps = set()
for x in graph:
if testname in graph[x]:
rdeps.add(x)
if rdeps:
        # propagate dependencies recursively
for x in rdeps.copy():
rdeps.update(getrevdeps(graph, x))
return rdeps

def resolve(graph, testname):
toqueue = []
for test in graph:
if testname in graph[test]:
graph[test].remove(testname)
if not graph[test]:
toqueue.append(test)
    del graph[testname]
    return toqueue
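As an aside for readers today: the queue-based resolution above is essentially Kahn's algorithm, which the standard library now provides as graphlib (Python 3.9+), including cycle detection; a sketch using the same graph shape:

```python
from graphlib import TopologicalSorter, CycleError

# Map each test to the tests it depends on (same shape as self.graph above;
# the test names are invented examples).
deps = {
    "test_getvalue": set(),
    "test_writelines": {"test_getvalue"},
    "test_seek": {"test_getvalue"},
}
order = list(TopologicalSorter(deps).static_order())
assert order[0] == "test_getvalue"  # dependencies come out first

# Cycles raise instead of silently dropping tests.
cycle_detected = False
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
except CycleError:
    cycle_detected = True
assert cycle_detected
```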

[Python-Dev] Avoiding cascading test failures

2007-08-22 Thread Alexandre Vassalotti
When I was fixing tests failing in the py3k branch, I found the number
of duplicate failures annoying. Often, a single bug in an important
method or function caused a large number of test cases to fail. So, I
thought of a simple mechanism for avoiding such cascading failures.

My solution is to add a notion of dependency to testcases. A typical
usage would look like this:

@depends('test_getvalue')
def test_writelines(self):
...
memio.writelines([buf] * 100)
self.assertEqual(memio.getvalue(), buf * 100)
...

Here, running the test is pointless if test_getvalue fails. So by
making test_writelines depend on the success of test_getvalue, we can
ensure that the report won't be polluted with unnecessary failures.

Also, I believe this feature will lead to more orthogonal tests, since
it encourages the user to write smaller tests with fewer dependencies.

I wrote an example implementation (included below) as a proof of
concept. If the idea gets enough support, I will implement it and add
it to the unittest module.

-- Alexandre


class CycleError(Exception):
pass


class TestCase:

def __init__(self):
self.graph = {}
tests = [x for x in dir(self) if x.startswith('test')]
for testname in tests:
test = getattr(self, testname)
if hasattr(test, 'deps'):
self.graph[testname] = test.deps
else:
self.graph[testname] = set()

def run(self):
graph = self.graph
toskip = set()
msgs = []
while graph:
# find tests without any pending dependencies
source = [test for test, deps in graph.items() if not deps]
if not source:
raise CycleError
for testname in source:
if testname in toskip:
                    msgs.append("%s... skipped" % testname)
resolvedeps(graph, testname)
del graph[testname]
continue
test = getattr(self, testname)
try:
test()
except AssertionError:
toskip.update(getrevdeps(graph, testname))
                    msgs.append("%s... failed" % testname)
except:
toskip.update(getrevdeps(graph, testname))
                    msgs.append("%s... error" % testname)
else:
                    msgs.append("%s... ok" % testname)
finally:
resolvedeps(graph, testname)
del graph[testname]
for msg in sorted(msgs):
print(msg)


def getrevdeps(graph, testname):
    """Return the reverse dependencies of a test."""
rdeps = set()
for x in graph:
if testname in graph[x]:
rdeps.add(x)
if rdeps:
        # propagate dependencies recursively
for x in rdeps.copy():
rdeps.update(getrevdeps(graph, x))
return rdeps

def resolvedeps(graph, testname):
for test in graph:
if testname in graph[test]:
graph[test].remove(testname)

def depends(*args):
def decorator(test):
if hasattr(test, 'deps'):
test.deps.update(args)
else:
test.deps = set(args)
return test
return decorator


class MyTest(TestCase):

@depends('test_foo')
def test_nah(self):
pass

@depends('test_bar', 'test_baz')
def test_foo(self):
pass

@depends('test_tin')
def test_bar(self):
self.fail()

def test_baz(self):
self.error()

def test_tin(self):
pass

def error(self):
raise ValueError

def fail(self):
raise AssertionError

if __name__ == '__main__':
t = MyTest()
t.run()


Re: [Python-Dev] [Python-3000] Documentation switch imminent

2007-08-17 Thread Alexandre Vassalotti
On 8/17/07, Georg Brandl [EMAIL PROTECTED] wrote:
 Alexandre Vassalotti schrieb:
  On 8/16/07, Neal Norwitz [EMAIL PROTECTED] wrote:
  On 8/15/07, Georg Brandl [EMAIL PROTECTED] wrote:
   Okay, I made the switch.  I tagged the state of both Python branches
   before the switch as tags/py{26,3k}-before-rstdocs/.
 
  http://docs.python.org/dev/
  http://docs.python.org/dev/3.0/
 
 
  Is it just me, or is the markup of the new docs quite heavy?

 Docutils markup tends to be a bit verbose, yes, but the index is not
 even generated by them.

  alex% wget -q -O- http://docs.python.org/api/genindex.html | wc -c
  77868
  alex% wget -q -O- http://docs.python.org/dev/3.0/genindex.html | wc -c
  918359

 The new index includes all documents (api, lib, ref, ...), so the ratio
 is more like 678000 : 95 (using 2.6 here), and the difference can be
 explained quite easily because (a) sphinx uses different anchor names
 (mailbox.Mailbox.__contains__ vs l2h-849) and the hrefs have to
 include subdirs like reference/.

Ah, I didn't notice that the index included all the documents. That
explains the huge size increase. However, would it be possible to keep
the indexes separate? I find what I want more quickly when the
indexes are separate.

 I've now removed leading spaces in the index output, and the character
 count is down to 85.

  Firefox, on my fairly recent machine, takes ~5 seconds rendering the
  index of the new docs from disk, compared to a fraction of a second
  for the old one.

 But you're right that rendering is slow there.  It may be caused by the
 more complicated CSS... perhaps the index should be split up in several
 pages.


I disabled CSS support (with View -> Page Style -> No Style), but it
didn't affect the initial rendering speed. However, scrolling was
*much* faster without CSS.

-- Alexandre


Re: [Python-Dev] [Python-3000] Documentation switch imminent

2007-08-16 Thread Alexandre Vassalotti
On 8/16/07, Neal Norwitz [EMAIL PROTECTED] wrote:
 On 8/15/07, Georg Brandl [EMAIL PROTECTED] wrote:
  Okay, I made the switch.  I tagged the state of both Python branches
  before the switch as tags/py{26,3k}-before-rstdocs/.

 http://docs.python.org/dev/
 http://docs.python.org/dev/3.0/


Is it just me, or is the markup of the new docs quite heavy?

alex% wget -q -O- http://docs.python.org/api/genindex.html | wc -c
77868
alex% wget -q -O- http://docs.python.org/dev/3.0/genindex.html | wc -c
918359

Firefox, on my fairly recent machine, takes ~5 seconds rendering the
index of the new docs from disk, compared to a fraction of a second
for the old one.

-- Alexandre


Re: [Python-Dev] cStringIO.StringIO() buffer behavior

2007-08-06 Thread Alexandre Vassalotti
On 8/6/07, Georg Brandl [EMAIL PROTECTED] wrote:
 Okay, I propose the following patch:
 [...]

I think your patch is more complicated than it needs to be. It would
be much more straightforward to use PyString_AsStringAndSize to encode
the Unicode string with the default encoding. It would also be
necessary to port the fix to O_write and O_writelines.

-- Alexandre

Index: Modules/cStringIO.c
===
--- Modules/cStringIO.c (revision 56754)
+++ Modules/cStringIO.c (working copy)
@@ -665,8 +674,15 @@
   char *buf;
   Py_ssize_t size;

-  if (PyObject_AsCharBuffer(s, (const char **)&buf, &size) != 0)
-      return NULL;
+  /* Special case for unicode objects. */
+  if (PyUnicode_Check(s)) {
+      if (PyString_AsStringAndSize(s, &buf, &size) == -1)
+          return NULL;
+  }
+  else {
+      if (PyObject_AsReadBuffer(s, (const void **)&buf, &size) == -1)
+          return NULL;
+  }

   self = PyObject_New(Iobject, Itype);
   if (!self) return NULL;
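A Python-level sketch of what the patched logic does, written in Python 3 terms (where str is unicode). The helper name `input_as_bytes` is invented for illustration, and ASCII stands in for Python 2's site-configurable default encoding:

```python
import array

def input_as_bytes(s):
    # Mirror the patched logic: unicode goes through the default
    # encoding, while any other buffer-like object (bytes,
    # array.array, ...) is read through its raw buffer.
    if isinstance(s, str):
        return s.encode('ascii')  # stand-in for Py2's default encoding
    return bytes(memoryview(s))

print(input_as_bytes("spam"))                      # b'spam'
print(input_as_bytes(array.array('b', [65, 66])))  # b'AB'
```

This addresses both bug reports: unicode is encoded instead of exposing its raw UCS-2/UCS-4 code units, and array.array() instances are accepted again.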


Re: [Python-Dev] cStringIO.StringIO() buffer behavior

2007-08-05 Thread Alexandre Vassalotti
On 8/5/07, Georg Brandl [EMAIL PROTECTED] wrote:
 See bugs #1548891 and #1730114.

 In the former, it was reported that cStringIO works differently from StringIO
 when handling unicode strings; it used GetReadBuffer which returned the raw
 internal UCS-2 or UCS-4 encoded string.

 I changed it to use GetCharBuffer, which converts to a string using the
 default encoding first. This fix was also in 2.5.1.

 The latter bug now complains that this excludes things like array.array()s
 from being used as an argument to cStringIO.StringIO(), which worked before
 with GetReadBuffer.

 What's the preferred solution here?


The best thing would be to add a special case for ASCII-only unicode
objects and keep the old behavior otherwise. However, I believe this
will be ugly, especially in O_write. So it would perhaps be better to
simply stop supporting unicode objects.
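The ASCII-only check itself is cheap to express; a rough sketch in Python 3 terms (the function name is made up, and this illustrates the alternative being weighed here, not a committed fix):

```python
def as_ascii_bytes(u):
    # Accept a unicode object only when it is pure ASCII: the one
    # case where the old raw-buffer behavior and encoding with the
    # (ASCII) default encoding would have produced the same bytes.
    try:
        return u.encode('ascii')
    except UnicodeEncodeError:
        raise TypeError('non-ASCII unicode is not supported')

print(as_ascii_bytes("spam"))  # b'spam'
```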

-- Alexandre


Re: [Python-Dev] Py3k: 'range' fail

2007-07-24 Thread Alexandre Vassalotti
Yes, range() on the p3yk branch seems broken. However, this bug has
been fixed in py3k-struni, the branch where most of the development
for Python 3000 is taking place.

-- Alexandre

On 7/24/07, Lisandro Dalcin [EMAIL PROTECTED] wrote:
 I did a fresh checkout as below (is p3yk the right branch?)

 $ svn co http://svn.python.org/projects/python/branches/p3yk python-3k

 after building and installing, I get

 $ python3.0
 Python 3.0x (p3yk:56529, Jul 24 2007, 15:58:59)
 [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> range(0,10,2)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 SystemError: NULL result without error in PyObject_Call
 

