Re: [Python-Dev] Add a transformdict to collections

2013-09-10 Thread Lukas Lueg
Shouldn't the keying behaviour be controlled by the type of the key
instead of the type of the container?


2013/9/10 MRAB pyt...@mrabarnett.plus.com

 On 10/09/2013 20:08, Paul Moore wrote:

 On 10 September 2013 19:31, Antoine Pitrou solip...@pitrou.net wrote:

 I think it would be a flaw to have this detail implementation-defined.
 This would be like saying that it is implementation-defined which
 of A,B,C is returned from A and B and C if all are true.


 Ok, it seems everyone (except me :-)) agrees that it should return the
 first key value, so that's how it will be.


 If you retain the first key value, it's easy enough for the
 application to implement "retain the last" semantics:

 try:
     del d[k]
 finally:
     d[k] = v

  That would raise a KeyError if the key was missing. A better way is:

 d.pop(k, None)
 d[k] = v

  If you provide "retain the last", I can't see any obvious way of
 implementing "retain the first" in application code without in effect
 reimplementing the class.

  "Retain the first" does feel more natural to me.
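
The point above can be sketched in a few lines. This is not the proposed transformdict; FirstKeyDict and set_retaining_last are hypothetical names made up for illustration, and the class only implements what the example needs.

```python
class FirstKeyDict:
    """Hypothetical stand-in for a 'retain the first key' mapping."""

    def __init__(self, transform):
        self._transform = transform
        self._data = {}  # transformed key -> (original key, value)

    def __setitem__(self, key, value):
        t = self._transform(key)
        # Keep the original key that was stored first, replace only the value.
        original = self._data[t][0] if t in self._data else key
        self._data[t] = (original, value)

    def __getitem__(self, key):
        return self._data[self._transform(key)][1]

    def pop(self, key, default=None):
        entry = self._data.pop(self._transform(key), None)
        return default if entry is None else entry[1]

    def keys(self):
        return [original for original, _ in self._data.values()]


def set_retaining_last(d, key, value):
    """Emulate 'retain the last' on top of a 'retain the first' mapping."""
    d.pop(key, None)   # no KeyError if the key is missing
    d[key] = value


d = FirstKeyDict(str.lower)
d["Foo"] = 1
d["FOO"] = 2                      # value replaced, first key "Foo" kept
first_keys = d.keys()
set_retaining_last(d, "foo", 3)   # the stored key is replaced as well
last_keys = d.keys()
```

Going the other way (getting "retain the first" out of a container that retains the last key) would need the bookkeeping that FirstKeyDict does internally, which is the asymmetry described above.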

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Disabling string interning for null and single-char causes segfaults

2013-03-02 Thread Lukas Lueg
Debugging a refcount bug? Good. Out of the door, line on the left, one
cross each.


2013/3/2 Stefan Bucur stefan.bu...@gmail.com

 On Sat, Mar 2, 2013 at 4:31 PM, Antoine Pitrou solip...@pitrou.net
 wrote:
  On Fri, 1 Mar 2013 16:24:42 +0100
  Stefan Bucur stefan.bu...@gmail.com wrote:
 
  However, after applying this modification, when running make test I
  get a segfault in the test___all__ test case.
 
  Before digging deeper into the issue, I wanted to ask here if there are
  any implicit assumptions about string identity and interning throughout
  the interpreter implementation. For instance, are two single-char strings
  having the same content supposed to be identical objects?
 
  From a language POV, no, but inside a specific interpreter such as
  CPython it may be a reasonable expectation.
 
  I'm assuming that it's either this, or some refcount bug in the
  interpreter that manifests only when certain strings are no longer
  interned and thus have a higher chance to get low refcount values.
 
  Indeed, if it's a real bug it would be nice to get it fixed :-)

 By the way, in that case, what would be the best way to debug such
 type of ref count errors? I recently ran across this document [1],
 which kind of applies to debugging focused on newly introduced code.
 But when some changes potentially impact a good fraction of the
 interpreter, where should I look first?

 I'm asking since I re-ran the failing test with gdb, and the segfault
 seems to occur when invoking the kill() syscall, so the error seems to
 manifest at some later point than when the faulty code is executed.

 Stefan

 [1] http://www.python.org/doc/essays/refcnt/


Re: [Python-Dev] performance of {} versus dict()

2012-11-14 Thread Lukas Lueg
Notice that {'x':1} and dict(x=1) are different beasts: The first one
compiles directly to BUILD_MAP. The second one looks up the name 'dict'
at runtime (globals first, then builtins) and calls the constructor, so
the result depends on whatever 'dict' is bound to at that moment.
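
The difference can be made visible with the dis module. This is a small sketch for a current Python 3 (dis.get_instructions did not exist in 2012); the helper name opnames is made up here, and the exact opcode names vary between interpreter versions.

```python
import dis

def literal():
    return {'x': 1}

def constructor():
    return dict(x=1)

def opnames(func):
    """Return the names of the bytecode instructions of func."""
    return [ins.opname for ins in dis.get_instructions(func)]

literal_ops = opnames(literal)
constructor_ops = opnames(constructor)

# The literal builds the dict with a dedicated opcode; the call has to
# look up the name 'dict' first and then perform an ordinary call.
print(literal_ops)
print(constructor_ops)
```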



2012/11/15 Steven D'Aprano st...@pearwood.info

 On 15/11/12 05:54, Mark Adam wrote:

  Merging of two dicts is done with dict.update.   How do you do it on
 initialization?  This doesn't make sense.


 Frequently.

 my_prefs = dict(default_prefs, setting=True, another_setting=False)


 Notice that I'm not merging one dict into another, but merging two dicts
 into a third.

 (Well, technically, one of the two comes from keyword arguments rather
 than an actual dict, but the principle is the same.)

 The Python 1.5 alternative was:

 my_prefs = {}
 my_prefs.update(default_prefs)
 my_prefs['setting'] = True
 my_prefs['another_setting'] = False


 Blah, I'm so glad I don't have to write Python 1.5 code any more. Even
 using copy only saves a line:

 my_prefs = default_prefs.copy()
 my_prefs['setting'] = True
 my_prefs['another_setting'] = False




 --
 Steven



Re: [Python-Dev] Use QueryPerformanceCounter() for time.monotonic() and/or time.highres()?

2012-04-02 Thread Lukas Lueg
At least on some versions of Windows (e.g. XP) the
QueryPerformanceCounter()-API is more or less only a stub around a
call to RDTSC which in turn varies in frequency on (at least) Intel
Pentium 4, Pentium M and Xeon processors (bound to the current clock
frequencies).
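
For reference, Python 3.3 and later can report what a given clock is built on, which makes it possible to check a platform before relying on it. A minimal sketch using time.get_clock_info(); the values printed depend entirely on the operating system and hardware.

```python
import time

# Print the underlying implementation and properties of a few clocks.
for name in ("monotonic", "perf_counter", "time"):
    info = time.get_clock_info(name)
    print(name, info.implementation, info.monotonic, info.resolution)

monotonic_info = time.get_clock_info("monotonic")
```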


Re: [Python-Dev] Lazy unpacking for struct module

2011-06-17 Thread Lukas Lueg
The followup: I've implemented a new StructView-object for 3.3a-trunk.
The object takes an existing Struct-object and provides on-access
unpacking. The break-even point where this object is faster than calling
Struct.unpack seems to be somewhere around 12 fields in the
format-string. With fewer fields, the overhead of repeatedly entering
the C-code is too high, and staying there a little longer to unpack
all fields at once is actually faster.

Having fifteen or more fields in a format-string seems unlikely and
I'll therefore abandon the idea of providing this mechanism.

2011/6/14 Lukas Lueg lukas.l...@googlemail.com:
 So I really can't see what harm it could do, except for
 maybe a tiny performance reduction in the case where you
 extract all the fields, or refer to extracted fields
 repeatedly.

 Referring to the view-object multiple times should not be a problem
 since the object can create and hold references to the unpacked values
 it created; remember that they are all immutable.



[Python-Dev] Lazy unpacking for struct module

2011-06-12 Thread Lukas Lueg
Hi.

We extensively use the struct module to crunch large amounts of binary
data. There are basically two operations for us that look like one only
to the naked eye: Filtering (see if certain fields have certain
values, throw everything away if not) and inspection (care about all
the fields' values). The filtering-part is very important as most of
the binary data can actually be thrown away and never have to be
inspected any further. When thinking about how to increase
performance, one thought was that a lot of objects are generated by
the struct module that we never need: Unpacking six fields in order to
look at one and then throwing everything away is inefficient
concerning the five other fields. It also makes filtering and
inspecting basically the same operation regarding the (slow) unpacking
of values so we don't really benefit from filtering. This is a huge
problem when crunching gigabytes of data and creating millions of
fields.

One solution to this is using two format-strings instead of only one
(e.g. '4s4s i 4s2s2s'): One that unpacks just the filtered fields
(e.g. '8x i 8x') and one that unpacks all the fields except the one
already created by the filter (e.g. '4s4s  4x  4s2s2s'). This solution
works very well and increases throughput by far. It however also
creates complexity in the code as we have to keep track and combine
field-values that came from the filtering-part with the ones unpacked
during inspection-part (we don't want to simply unpack twice).
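
A sketch of the two-format approach, using the format strings from above. The '<' prefix (standard sizes, no alignment) and the helper name inspect_if are assumptions of this example, not part of our code.

```python
import struct

# '<' is an assumption made here so that all three layouts are exactly
# 20 bytes on every platform.
FULL = struct.Struct('<4s4s i 4s2s2s')
FILTER = struct.Struct('<8x i 8x')          # only the integer field
REST = struct.Struct('<4s4s 4x 4s2s2s')     # everything but the integer

def inspect_if(record, wanted):
    """Return all six fields if the integer field equals wanted, else None."""
    (number,) = FILTER.unpack(record)
    if number != wanted:
        return None                          # cheap rejection
    a, b, c, d, e = REST.unpack(record)
    return (a, b, number, c, d, e)           # recombine by hand

record = FULL.pack(b'abcd', b'efgh', 42, b'ijkl', b'mn', b'op')
hit = inspect_if(record, 42)
miss = inspect_if(record, 7)
```

The manual recombination in the last line of inspect_if is the bookkeeping that grows unpleasant once there are many formats.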

I'd like to propose an enhancement to the struct module that should
solve this dilemma and ask for your comments.

The function s_unpack_internal() inside _struct.c currently unpacks
all values from the buffer-object passed to it and returns a tuple
holding these values. Instead, the function could create a tuple-like
object that holds a reference to its own Struct-object (which holds
the format) and a copy of the memory it is supposed to unpack. This
object allows access to the unpacked values through the sequence
protocol, basically unpacking the fields if - and only if - accessed
through sq_item (e.g. foo = struct.unpack('2s2s', 'abcd'); foo[0] ==
'ab'). The object can also unpack each field only once (as all
unpacked objects are immutable, we can hold references to them and
return these instead once known). This approach is possible because
there are no further error conditions inside the unpacking-functions
that we would *have* to deal with at the time .unpack() is called; in
other words: Unpacking can't fail if the format-string's syntax was
correct and can therefore be deferred (while packing can't).
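
The behaviour described above can be modelled in pure Python. This is only an illustration of the semantics, not the C implementation and certainly not faster than struct.unpack; LazyRecord, its list-of-formats argument and the unpack_calls counter are all hypothetical.

```python
import struct

class LazyRecord:
    """Tuple-like view that unpacks a field only when it is accessed."""

    def __init__(self, field_formats, data):
        # field_formats is a list such as ['4s', 'i']; splitting the format
        # up front is a simplification of this sketch.
        self._structs = [struct.Struct('<' + f) for f in field_formats]
        self._offsets = []
        offset = 0
        for s in self._structs:
            self._offsets.append(offset)
            offset += s.size
        if len(data) != offset:
            raise struct.error('expected %d bytes' % offset)
        self._data = bytes(data)      # private copy of the memory
        self._cache = {}              # index -> already unpacked value
        self.unpack_calls = 0         # instrumentation for the example only

    def __len__(self):
        return len(self._structs)

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        if index not in self._cache:
            self.unpack_calls += 1
            (value,) = self._structs[index].unpack_from(
                self._data, self._offsets[index])
            self._cache[index] = value
        return self._cache[index]

data = struct.pack('<2s2si', b'ab', b'cd', 7)
view = LazyRecord(['2s', '2s', 'i'], data)
first = view[0]
again = view[0]      # served from the cache, no second unpack
```

Note that the size check happens in the constructor, so the only error that can occur is raised at "unpack time" and nothing is deferred to the first field access.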

I understand that this may seem like a single-case-optimization. We
can however assume that most people will benefit from the new behavior
unknowingly while everyone else takes no harm: The object mimicking
the otherwise returned tuple is immutable (therefore it's not suddenly
part of GC) and the memory overhead caused by holding references to
the original memory a little longer (reclaimed after the result
becomes unreachable) should be comparable to the memory used by
unneeded fields (reclaimed directly after creation).

I'd like to hear your thoughts and am perfectly willing to provide a
patch if it has a chance of inclusion.


Best regards
Lukas


Re: [Python-Dev] Lazy unpacking for struct module

2011-06-12 Thread Lukas Lueg
 This is what people normally do (unpack just the values they need,
 when they need them).
Due to the fact that there are hundreds of format-strings which are
dynamically compiled from a more verbose language at runtime, we would
have significant complexity in the code in order to generate format
strings that parse just the fields that are needed for filtering. It's
not just put-a-string-here-and-there.

 I don't think there is a net win from adding complexity to the struct
 module.  Introducing lazy behaviors creates its own overhead
 that would compete with code optimized using the traditional
 approach (unpack what you need, when you need it).  Also,
 the new behaviors add to the cognitive load when learning
 and remembering how to use this module.

The complexity is very well handled. Remember that the interface to
the module does not change at all and the documentation would be
exactly the same. There is no special case introduced here the user
has to know about. I also think this case has very little black magic
in it since we are dealing only with immutable objects and do not have
delayed error conditions (both usually being the primary source of
headaches when introducing lazy behavior).


Re: [Python-Dev] Pass possibly imcompatible options to distutil's ccompiler

2011-04-12 Thread Lukas Lueg
Distutils2 is not really an option right now as it is not found on
major Linux distributions, FreeBSD or MacOS X.

2011/4/12 Nick Coghlan ncogh...@gmail.com:
 On Tue, Apr 12, 2011 at 7:41 AM, Lukas Lueg lukas.l...@googlemail.com wrote:
 Any other ideas on how to solve this in a better way?

 Have you tried with distutils2? If it can't help you, it should really
 be looked into before the packaging API is locked for 3.3.

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia



[Python-Dev] Pass possibly imcompatible options to distutil's ccompiler

2011-04-11 Thread Lukas Lueg
Hi,

I'm the maintainer of Pyrit (http://pyrit.googlecode.com) and recently
checked in some code that uses the AES-NI intrinsics found in GCC
4.4+. I'm looking for a way to build the python-extension using
distutils in a sane way and could not get an answer from the
distutils-people about that.

To enable the intrinsics, one must pass '-maes' and '-mpclmul' as
commandline-arguments to gcc, e.g. through extra_compile_args. This is
not always safe to do as previous versions of GCC do not support these
options and cause cc to fail with an error. Such platforms are not
uncommon, e.g. XCode 3.2 on MacOS is shipped with gcc 4.2. I fail to
see how to determine in advance what compiler distutils will use and
what version that compiler has. Therefore I see two options:
- Try to build a small pseudo-extension with the flags enabled, watch
for exceptions and only enable the extra_compile_args on the real
extension if the build succeeds
- Override the build_ext-command with another class and override
build_extension. Try to build the extension and, if a CompilerError is
thrown, remove '-maes' and '-mpclmul' from extra_compile_args. Try
again and re-raise possible CompilerErrors now.

The first option seems rather bogus so I'm currently going with the
second option. After all, this leaves me with the best chance of
enabling the AES-NI-code on compatible machines (no false-negatives
with some kind of auto-detection) and not having people being unable
to compile it at all (false-positives, resulting in final compiler
errors). The downside is that visible error messages are printed to
stderr from the first call to build_ext.build_extension if AES-NI is
actually not supported.
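
The retry logic of the second option, reduced to a helper that does not depend on distutils itself. build_with_fallback is a hypothetical name; in a real setup.py the build argument would be the parent class's build_extension method called from an overriding build_ext subclass, and error_type would be the compile error that distutils raises. The fake compiler below only exists to make the sketch runnable.

```python
from types import SimpleNamespace

def build_with_fallback(build, ext, optional_flags, error_type):
    """Call build(ext); on error_type retry once without optional_flags.

    Returns True if the optional flags were kept, False if dropped.
    """
    try:
        build(ext)
        return True
    except error_type:
        ext.extra_compile_args = [
            arg for arg in ext.extra_compile_args
            if arg not in optional_flags]
        build(ext)     # a second failure propagates to the caller
        return False


class FakeCompileError(Exception):
    """Stand-in for the compiler error raised by the real build step."""

def old_gcc_build(ext):
    # Pretend to be a compiler that rejects the AES-NI flags.
    if '-maes' in ext.extra_compile_args:
        raise FakeCompileError('unrecognized option -maes')

ext = SimpleNamespace(extra_compile_args=['-O2', '-maes', '-mpclmul'])
kept = build_with_fallback(old_gcc_build, ext, ('-maes', '-mpclmul'),
                           FakeCompileError)
```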

Any other ideas on how to solve this in a better way?


Best regards
Lukas


[Python-Dev] Builtin open() too slow

2011-03-12 Thread Lukas Lueg
Hi,

I have a storage engine that stores a lot of files (e.g.  10.000) in
one path. Running the code under cProfile, I found that with a total
CPU-time of 1,118 seconds, 121 seconds are spent in 27.013 calls to
open(). The number of calls is not the problem; however I find it
*very* discomforting that Python spends about 2 minutes out of 18
minutes of CPU time just to get file-handles, after which it can spend
some more time reading from them.

Could this be a problem with the way Python 2.7 gets filehandles from
the OS, or is it a problem with large directories themselves?
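
One way to narrow this down is to time open() against a plain os.stat() on the same paths: if both are slow, the directory lookup in the OS is the likely cost rather than Python's file object. A rough sketch for Python 3 (time.perf_counter is not available in 2.7); time_open_vs_stat is a made-up helper, it assumes a flat directory of regular files, and the second pass benefits from caches warmed by the first.

```python
import os
import tempfile
import time

def time_open_vs_stat(directory):
    """Return (seconds in open/close, seconds in os.stat, file count)."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    start = time.perf_counter()
    for path in paths:
        with open(path, 'rb'):
            pass
    open_seconds = time.perf_counter() - start
    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    stat_seconds = time.perf_counter() - start
    return open_seconds, stat_seconds, len(paths)

# Small self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as directory:
    for number in range(100):
        with open(os.path.join(directory, 'blob%03d' % number), 'wb') as f:
            f.write(b'x')
    open_seconds, stat_seconds, count = time_open_vs_stat(directory)
print(count, open_seconds, stat_seconds)
```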

Best regards
Lukas


Re: [Python-Dev] Possible optimization for LOAD_FAST ?

2011-01-04 Thread Lukas Lueg
Doesn't this all boil down to being able to monitor PyDict for changes
to its key-space?

The keys are immutable anyway so the instances of PyDict could manage
an opaque value (in fact, a counter) that changes every time a new
value is written to any key. Once we get a reference out of the dict,
we can do very fast lookups by passing the key, the reference we
know from the last lookup and our last state. The lookup returns a new
reference and the new state.
If the dict has not changed, the state doesn't change and the reference
is simply taken from the value passed to the lookup. That way
the code remains the same no matter if the dict has changed or not.
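
In Python terms the idea looks roughly like this. VersionedDict and cached_lookup are hypothetical names for illustration; a real implementation would live inside PyDict in C, and this subclass deliberately ignores the other mutating methods (update, pop, setdefault, clear).

```python
class VersionedDict(dict):
    """dict with a counter that changes on every write (sketch only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


def cached_lookup(d, key, cached_value, cached_version):
    """Return (value, version), skipping the hash lookup if d is unchanged."""
    if cached_version == d.version:
        return cached_value, cached_version
    return d[key], d.version


namespace = VersionedDict(len=len)
value, version = cached_lookup(namespace, 'len', None, -1)   # first, slow lookup
same, same_version = cached_lookup(namespace, 'len', value, version)
namespace['len'] = lambda obj: 0                              # rebinding
new, new_version = cached_lookup(namespace, 'len', value, version)
```

The caller runs the same code path in both cases; only the comparison of the counter decides whether the dict is actually consulted.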


2011/1/4 Michael Foord fuzzy...@voidspace.org.uk:
 On 04/01/2011 16:54, Barry Warsaw wrote:

 On Jan 04, 2011, at 10:21 AM, Alex Gaynor wrote:

 Ugh, I can't be the only one who finds these special cases to be a little
 nasty?

 Special cases aren't special enough to break the rules.

 Yeah, I agree.  Still it would be interesting to see what kind of
 performance improvement this would result in.  That seems to be the only
 way to decide whether the cost is worth the benefit.

 Outside of testing, I do agree that most of the builtins could be pretty
 safely optimized (even open()).  There needs to be a way to stop all
 optimizations for testing purposes.  Perhaps a sys variable, plus command
 line option and/or environment variable?

 Although testing in an environment deliberately different from production is
 a recipe for hard to diagnose bugs.

 Michael

 -Barry



 --
 http://www.voidspace.org.uk/

 May you do good and not evil
 May you find forgiveness for yourself and forgive others
 May you share freely, never taking more than you give.
 -- the sqlite blessing http://www.sqlite.org/different.html



Re: [Python-Dev] Possible optimization for LOAD_FAST ?

2011-01-04 Thread Lukas Lueg
I very much like the fact that python has *very* little black magic
revealed to the user. Strong -1 on optimizing picked builtins in a
picked way.

2011/1/4 Steven D'Aprano st...@pearwood.info:
 Guido van Rossum wrote:

 On Tue, Jan 4, 2011 at 2:49 AM, Michael Foord fuzzy...@voidspace.org.uk
 wrote:

 I think someone else pointed this out, but replacing builtins externally
 to a module is actually common for testing. In particular replacing the
 open function, but also other builtins, is often done temporarily to
 replace it with a mock. It seems like this optimisation would break those
 tests.

 Hm, I already suggested to make an exception for open, (and one should
 be added for __import__) but if this is done for other builtins that
 is indeed a problem. Can you point to example code doing this?


 I've been known to monkey-patch builtins in the interactive interpreter and
 in test code. One example that comes to mind is that I had some
 over-complicated recursive while loop (!), and I wanted to work out the Big
 Oh behaviour so I knew exactly how horrible it was. Working it out from
 first principles was too hard, so I cheated: I knew each iteration called
 len() exactly once, so I monkey-patched len() to count how many times it was
 called. Problem solved.
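
The len() trick described above might look like the following in Python 3. This is a reconstruction under assumptions, not the original code: count_len_calls and the toy function drain are made up for illustration.

```python
import builtins

def count_len_calls(func, *args):
    """Run func(*args) and return (result, number of len() calls)."""
    real_len = builtins.len
    calls = 0

    def counting_len(obj):
        nonlocal calls
        calls += 1
        return real_len(obj)

    builtins.len = counting_len
    try:
        result = func(*args)
    finally:
        builtins.len = real_len      # always restore the builtin
    return result, calls


def drain(items):
    """Toy loop that calls len() once per iteration."""
    steps = 0
    while len(items) > 0:
        items = items[1:]
        steps += 1
    return steps

steps, calls = count_len_calls(drain, [1, 2, 3])
```

This only works because drain looks up len at call time; an interpreter that bound the builtin early would silently report zero calls.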

 I also have a statistics package that has its own version of sum, and I rely
 on calls to sum() from within the package picking up my version rather than
 the builtin one.


 --
 Steven


[Python-Dev] Possible optimization for LOAD_FAST ?

2010-12-28 Thread Lukas Lueg
Consider the following code:

def foobar(x):
    for i in range(5):
        x[i] = i

The bytecode in python 2.7 is the following:

  2           0 SETUP_LOOP              30 (to 33)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (5)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                16 (to 32)
             16 STORE_FAST               1 (i)

  3          19 LOAD_FAST                1 (i)
             22 LOAD_FAST                0 (x)
             25 LOAD_FAST                1 (i)
             28 STORE_SUBSCR
             29 JUMP_ABSOLUTE           13
        >>   32 POP_BLOCK
        >>   33 LOAD_CONST               0 (None)
             36 RETURN_VALUE

Can't we optimize the LOAD_FAST at offsets 19 and 25 into a single load
and put the reference twice on the stack? There is no way that the
reference of i might change in between the two instructions. Also, the
LOAD_FAST at offset 22 that references x could be taken out of the loop
as x will always point to the same object.
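
To check what a given interpreter emits for this function, the listing can be regenerated with the dis module. A small sketch for Python 3.4 or later (dis.get_instructions does not exist in 2.7); offsets and opcode names will differ from the 2.7 listing above.

```python
import dis

def foobar(x):
    for i in range(5):
        x[i] = i

instructions = list(dis.get_instructions(foobar))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# Collect the local-variable loads in whatever form this interpreter emits.
fast_loads = [ins for ins in instructions
              if ins.opname.startswith('LOAD_FAST')]
```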