Re: argparse: delimiter for argparse list arguments
It could be, but I've seen them used somewhere else. I wouldn't bikeshed on this yet, as I haven't found a way to do this so far. Let's imagine the following parser:

parser.add_argument('things', action='append')
parser.add_argument('stuff', action='append')

At least from my point of view, I don't see any way to separate both lists in this command call:

cool-script.py thing1 thing2 stuff1 stuff2

Am I missing something here?

Best,
Sven

On 03.08.21 01:49, Dan Stromberg wrote:
> Isn't -- usually used to signal the end of options?
>
> On Mon, Aug 2, 2021 at 12:52 PM Sven R. Kunze <srku...@mail.de> wrote:
>> Hi everyone,
>>
>> maybe I am missing something here, but is it possible to specify a
>> delimiter for list arguments in argparse:
>> https://docs.python.org/3/library/argparse.html
>>
>> Usually, '--' is used to separate two lists (cf. git).
>>
>> Cheers,
>> Sven
--
https://mail.python.org/mailman/listinfo/python-list
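As an aside, one way to approximate git-style '--' separation is to split the argument list yourself before argparse ever sees it. A minimal sketch (the argument names follow the example above; this is not a built-in argparse feature):

```python
import argparse

def parse_two_lists(argv):
    """Parse 'thing1 thing2 -- stuff1 stuff2' into two separate lists
    by splitting on the first '--' before calling parse_args."""
    if '--' in argv:
        i = argv.index('--')
        head, tail = argv[:i], argv[i + 1:]
    else:
        head, tail = argv, []
    parser = argparse.ArgumentParser()
    parser.add_argument('things', nargs='*')
    args = parser.parse_args(head)  # argparse only handles the first list
    args.stuff = tail               # the second list is attached manually
    return args

args = parse_two_lists(['thing1', 'thing2', '--', 'stuff1', 'stuff2'])
print(args.things)  # ['thing1', 'thing2']
print(args.stuff)   # ['stuff1', 'stuff2']
```

This sidesteps argparse's greedy positional matching entirely, at the cost of handling the second list by hand.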
argparse: delimiter for argparse list arguments
Hi everyone,

maybe I am missing something here, but is it possible to specify a delimiter for list arguments in argparse: https://docs.python.org/3/library/argparse.html

Usually, '--' is used to separate two lists (cf. git).

Cheers,
Sven
Re: [Python-ideas] Inconsistencies
On 10.09.2016 15:00, Chris Angelico wrote:
> Some things are absolute hard facts. There is no way in which 1 will ever
> be greater than 2, ergo "1 is less than 2" is strictly true, and not a
> matter of opinion. If you hear someone trying to claim otherwise, would
> you let him have his opinion, or would you treat it as incorrect?

I don't know if it's clear that one would need to make a distinction between real/physical-world facts and pure-logic facts. "1 < 2" is "true" by definition (the construction of the natural numbers), not by real-world evidence. IIRC, the quote is about real-world matters.

> There is some merit in this. For instance, Python 2 had a lower-level
> consistency in the division operator than Python 3 has. According to Py2,
> integers and floats are fundamentally different beasts, and when you
> divide an int by an int, you get an int, not a float. Py3 says "well, you
> probably REALLY meant to divide a number by a number", so it gives you a
> float back, unless you explicitly ask for floor division. Py2 is more
> consistent on a lower level of abstraction. Py3 is more consistent on a
> higher level of abstraction (modulo the oddities at extremely large
> numbers). Both have merit, but in a high level language, the Py3 way is
> usually [1] better.
>
> But the consistency of call-by-object-reference is at the same high level
> as the consistency of call-by-value or call-by-name. I can explain
> Python's assignment model to someone fairly easily, using pencil and
> paper, without any reference to "low level" or "high level" concepts. And
> Python is extremely internally consistent; *every* assignment behaves the
> exact same way. How does "import x" compare with "from x import y"? Easy:
> the former is "x = some_module_object", and the latter is
> "y = some_module_object.y", and either way, it's regular assignment. How
> does parameter passing work? You take the value of the argument as
> evaluated in the caller, and assign it to the parameter in the function.
> What about default arguments? They're evaluated when the function's
> defined, and assigned to the parameter when it's called. Function
> definition itself is the same thing - it's assigning a function object to
> a name. Python handles every one of them the same way.
>
> I don't care one iota about how voltages inside a CPU operate. I don't
> generally even care about machine code - let other people worry about
> that, people more expert than I am. Discussions about how the core dev and
> the novice see Python's consistencies are nothing to do with those levels.
>
> To go back to your original point, that a newbie is better at recognizing
> inconsistencies... maybe, in a sense, but they also get a lot of false
> positives. Ultimately, "consistent" means that there's a single pattern
> that explains everything; if you're unaware of that pattern, you won't
> know that it's consistent.
>
> ChrisA
>
> [1] Even in Python, there are places where low-level consistency is
> better, because Python is a glue language. But in general, it's better for
> Python to be consistent with humans than with C APIs.

The last sentence is the part why I love Python. :) I could not agree more with what you said above, so I hope this puts the discussion in better perspective, especially when people here try to be overly absolute in their views (which is what the quote was about).

Cheers,
Sven
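Chris's division point can be checked directly in Python 3, where `/` always performs true division and `//` has to be asked for explicitly:

```python
# Python 3: true division always returns a float;
# floor division must be requested explicitly with //.
print(7 / 2)     # 3.5
print(7 // 2)    # 3   (an int, matching what Python 2's 7 / 2 gave)
print(7.0 // 2)  # 3.0 (floor division still follows the operand types)
```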
Re: Best Practices for Internal Package Structure
On 06.04.2016 01:47, Chris Angelico wrote:
> Generally, I refactor code not because the files are getting "too large"
> (for whatever definition of that term you like), but because they're
> stretching the file's concept. Every file should have a purpose; every
> piece of code in that file should ideally be supporting exactly that
> purpose.

Well said. The definitions of purpose and concept are blurry, though. So, what lies within the boundary of a concept is hard to define.

@Steven You might not understand the purpose of the guideline. That's what makes them so valuable. It's hard to get them right, and it's hard to understand them if you don't have any experience with them.

An attempt at an explanation (which may in itself not be 100% correct): there are two different forces acting on the source code:

1) make it short and concise (the 2-pages guideline)
2) keep conceptually close things together (cf. Chris)

So, there's always a bargaining over what can be put into/removed from a module in the first place:

"just these 14 lines, please; we need that feature"
"but the module already has 310 lines"
"only this one last one, please; it belongs here"
"but it's an if-else and another def; nasty nesting and more complexity"
"hmm, what if we remove those 5 over here? we don't need them anymore"
"really? then we can remove 2 superfluous newlines and 2 import lines as well"
"I could even squeeze those 14 lines to 10 using dict comprehensions"
"that's even more readable; +1 line, that's okay"

Life is full of compromises. This guideline is more about discussing, shaping existing code and extracting the essence (with yourself or with colleagues) to keep things at a usable level.

Best,
Sven
Re: Best Practices for Internal Package Structure
On 06.04.2016 09:28, Michael Selik wrote:
> On Wed, Apr 6, 2016, 2:51 AM Steven D'Aprano wrote:
>> On Wed, 6 Apr 2016 05:56 am, Michael Selik wrote:
>> [Michael] When you made that suggestion earlier, I immediately guessed
>> that you were using PyCharm.
> I agree that the decision to split into multiple files or keep everything
> in just a few files seems to be based on your development tools. I use
> IPython and SublimeText, so my personal setup is more suited to one or a
> few files.

Interesting to know. I remember that when we were looking for our next IDE, we investigated Sublime as well. I don't remember the exact reasons anymore, but I remember this lightweight feeling of control and of doing all sorts of boilerplatey, distracting things extremely easily and, most importantly, extremely fast. It just feels like doing plain ol' file editing but with less typing; you can think more (mental activity) than you need to write (physical activity). That brought us to another level when designing stuff. It keeps us able to handle the workload.

Thus, I use it for my small PyPI projects as well. Why should I use less capable tools for those projects:

https://github.com/srkunze/xheap
https://github.com/srkunze/fork
https://github.com/srkunze/xcache

They are small but deserve the same professionalism, I daresay.

>> How does PyCharm make the use of many files easier? I'll let Sven answer
>> that one.
> I don't know, but I've noticed the correlation of habit to IDE.

Definitely true. I think that's natural, and doing otherwise would impair the productivity improvements provided by the tools.

About the "many files easier" question: I am not sure what I said exactly, but one could also ask: "what makes the alternatives harder?"

1) learning them and using them regularly
2) you need "switching to a file" as a requirement for those other alternatives I mentioned in the other post

So, before you can even start learning the alternatives, you do 1000 times "switch to a file".

Moreover, if you split things up properly, you don't need to jump between 20 files at once. Usually you fix a problem/build a feature in a narrow slot of your code (1-4 files). If you need to do that regularly, it's an indication you've split your stuff up the wrong way. ;-)

Last but not least, you basically don't look for files anymore. You look for names, and PyCharm opens the file you need. You jump from code position to code position by interacting (clicking, shortcuts, etc.) with the code.

In one file, you have only one cursor. So, when PyCharm jumps within a single file, you lose your previous cursor position (you can jump back, but that's not permanent - especially when you do something else in the same file). If you have two files open, you have two cursors, and you almost always stay at the same spot in each. You COULD do that with a single view (split view, as mentioned in the last post), but that's not as easy as just another file.

Again, this is just an attempt at explaining an observation.

Best,
Sven
Re: Best Practices for Internal Package Structure
On 05.04.2016 20:40, Ethan Furman wrote:
> (utils.py does export a couple of functions, but they should be in the
> main module, or possibly made into a method of BidirectionalMapping.)
>
>> Your package is currently under 500 lines. As it stands now, you could
>> easily flatten it to a single module: bidict.py
>
> Yup... well, actually you could just stick it in __init__.py.

Interesting. We did a similar thing (and I started it) for some packages which then grew unbearably in a single year. Now, everybody on the team agrees with: "what idiot put this stuff in __init__.py?" It was me who started it, so I take it with a smile. But it's definitely a wart.

So, we have a new guideline since then: "empty __init__.py" - if possible, of course; you sometimes need to collect/do magic imports, but that's a separate matter.

Best,
Sven
Re: Best Practices for Internal Package Structure
On 05.04.2016 19:59, Chris Angelico wrote:
> On Wed, Apr 6, 2016 at 3:38 AM, Sven R. Kunze wrote:
>>> Your package is currently under 500 lines. As it stands now, you could
>>> easily flatten it to a single module: bidict.py
>> I don't recommend this. The line is blurry but 500 is definitely too
>> much. Those will simply not fit on 1 or 2 generous single screens anymore
>> (which basically is our guideline). The intention here is to always have
>> a bit more than a full screen of code (no wasted pixels) while benefiting
>> from switching to another file (also seeing a full page of other code).
> Clearly this is a matter of opinion. I have absolutely no problem with a
> 500-line file. As soon as you force people to split things across files,
> you add a new level of indirection that causes new problems.

Guidelines. No forcing.

> I'd rather keep logically-related code together rather than splitting
> across arbitrary boundaries;

That's good advice, and from what I can see, bidict adheres to it. ;)

> you can always search within a file for the bit you want.

If you work like in the 80's, maybe. Instead of scrolling, (un)setting jump points, or using split view of the same file, it's just faster/easier to jump between separate files in today's IDEs if you need to jump between 4 places within 3000 lines of code.

> When you split a file into two, you duplicate the headers at the top
> (imports and stuff), so you'll split a 100-line file into two 60-line
> files or so. Do that to several levels in a big project and you end up
> with a lot more billable lines of code, but no actual improvement.

Who cares about the imports? As I said somewhere else in my response, they are hidden from sight if you use a modern IDE. We call that folding. ;)

Who bills lines of code? Interesting business model. ;)

> I guess that's worth doing - lovely billable hours doing the refactoring,

Refactoring is not just splitting files, if that concept is new to you. Refactoring is not an end in itself. It serves a purpose.

> more billable hours later on when you have to read past the migration in
> source control ("where did this line come from" gets two answers all the
> time), and more billable hours dealing with circular imports when two
> fragments start referring to each other.

Sounds like a plan. It appears to me as if you like messy code then. ;)

Best,
Sven
Re: Best Practices for Internal Package Structure
On 05.04.2016 03:43, Steven D'Aprano wrote:
> The purpose of packages isn't to enable Java-style "one class per file"
> coding, especially since *everything* in the package except the top level
> "bidict" module itself is private. bidict.compat and bidict.util aren't
> flagged as private, but they should be, since there's nothing in either of
> them that the user of a bidict class should care about. (utils.py does
> export a couple of functions, but they should be in the main module, or
> possibly made into a method of BidirectionalMapping.) Your package is
> currently under 500 lines. As it stands now, you could easily flatten it
> to a single module: bidict.py

I don't recommend this. The line is blurry, but 500 is definitely too much. Those will simply not fit on 1 or 2 generous single screens anymore (which basically is our guideline). The intention here is to always have a bit more than a full screen of code (no wasted pixels) while benefiting from switching to another file (also seeing a full page of other code).

This said, and after having a look at your package's code, it's quite well structured, and you almost always have more than 1 name defined in each submodule. So, it's fine. _frozen and _loose are a bit empty, but let's not stretch the rules here too far.

I remember us having, some years ago, files that regularly hit 3000 or 4000 lines of code. We systematically split those up, refactored them and took our time to name those modules appropriately. Basically we started with:

base.py              << trashcan for whatever somebody might need
base.py              << really the base
domain_specific1.py  << something you can remember
domain_specific2.py  << ...
domain_specific3.py
domain_specific4.py

> Unless you are getting some concrete benefit from a package structure, you
> shouldn't use a package just for the sake of it.

I agree.

> Even if the code doubles in size, to 1000 lines, that's still *far* below
> the point at which I believe a single module becomes unwieldy just from
> size. At nearly 6500 lines, the decimal.py module is, in my opinion,
> *almost* at the point where just size alone suggests splitting the file
> into submodules. Your module is nowhere near that point.

I disagree completely. After reading his package, the structure really helped me. So, I see a benefit.

I agree with Steven that hiding where a name comes from is a bit problematic. Additionally, as we use PyCharm internally, 1) we don't see imports regularly, 2) we don't create/optimize them manually anymore, 3) we just don't care if the import is too long. So, it's fine for us, and as PyCharm tries not to be overly clever when it comes to detecting names, we like the direct way.

In the case of our PyPI modules, usability is really important for newbies and people not using sophisticated IDEs. So, making it really easy for them is a must. :)

Best,
Sven
Re: Best Practices for Internal Package Structure
Hi Josh,

good question.

On 04.04.2016 18:47, Josh B. wrote:
> My package, available at https://github.com/jab/bidict, is currently laid
> out like this:
>
> bidict/
> ├── __init__.py
> ├── _bidict.py
> ├── _common.py
> ├── _frozen.py
> ├── _loose.py
> ├── _named.py
> ├── _ordered.py
> ├── compat.py
> └── util.py
>
> I'd like to get some more feedback on a question about this layout that I
> originally asked here:
> <https://github.com/jab/bidict/pull/33#issuecomment-193877248>:
>
> What do you think of the code layout, specifically the use of the _foo
> modules? It seems well-factored to me, but I haven't seen things laid out
> this way very often in other projects, and I'd like to do this as nicely
> as possible. It does kind of bug me that you see the _foo modules in the
> output when you do things like this: [code]

We had a similar discussion internally. We have various packages requiring each other but having some internals that should not be used outside of them. The _ signifies that clearly enough, but it looks weird within the package itself.

We haven't found a solution so far. Maybe others have.

Best,
Sven
Re: Learning Python (or Haskell) makes you a worse programmer
On 31.03.2016 18:30, Travis Griggs wrote: British: http://www.oxforddictionaries.com/definition/english/python American: http://www.dictionary.com/browse/python?s=t That does it. If I ever make some sort of open source module for pythun/pythawn I’ll be sure to call it either tuhmayto/tomawto. Or maybe I’ll call it puhtayto/potawto. Isn't it more like "Pythn"? -- https://mail.python.org/mailman/listinfo/python-list
Re: Slice equivalent to dict.get
On 31.03.2016 17:07, Steven D'Aprano wrote:
> Sometimes people look for a method which is equivalent to dict.get, where
> they can set a default value for when the key isn't found:
>
> py> d = {1: 'a', 2: 'b'}
> py> d.get(999, '?')
> '?'
>
> The equivalent for sequences such as lists and tuples is a slice. If the
> slice is out of range, Python returns an empty sequence:

I see what you are trying to achieve here. What do you think about this?

[1, 2, 3].get(999, '?')

Best,
Sven
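Lists don't have such a method, but Steven's slice trick makes a small free-standing helper easy to write (the name `get` is just illustrative, mirroring the hypothetical method above):

```python
def get(sequence, index, default=None):
    """Return sequence[index] like dict.get: fall back to default
    instead of raising IndexError when the index is out of range."""
    # A one-element slice is empty rather than an error when out of range.
    # 'index + 1 or None' keeps index=-1 working (slice end of None).
    item = sequence[index:index + 1 or None]
    return item[0] if item else default

print(get([1, 2, 3], 999, '?'))  # '?'
print(get([1, 2, 3], 1, '?'))    # 2
print(get([1, 2, 3], -1, '?'))   # 3
```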
Re: Learning Python (or Haskell) makes you a worse programmer
On 30.03.2016 12:21, BartC wrote:
> On 30/03/2016 11:07, Sven R. Kunze wrote:
>> On 30.03.2016 01:29, Eric S. Johansson wrote:
>>> On 3/29/2016 6:05 AM, Sven R. Kunze wrote:
>>>> Python = English
>>> As someone who writes English text and code using speech recognition, I
>>> can assure you that Python is not English. :-)
>> :D Interesting. Never thought of how Python sounds when spoken.
> Among other things, it becomes case insensitive...

Now that you mention it... ;) You do coding with speech recognition, too?
Re: Learning Python (or Haskell) makes you a worse programmer
On 30.03.2016 12:14, Tim Golden wrote:
> Not that you quite meant this, but I'm always amused (and still a little
> startled) when I listen to talks recorded from, say, PyCon and hear people
> with American accents pronouncing Python with the stress on the slightly
> longer second syllable. (I don't know how other English-speaking groups
> say the word, but in England the first syllable is stressed and the second
> is the conventional short "uh" sound).
>
> TJG

I recognize this too. I also started with the English variant, but now I am not so sure anymore. :D

Sven
Re: Learning Python (or Haskell) makes you a worse programmer
On 30.03.2016 01:29, Eric S. Johansson wrote:
> On 3/29/2016 6:05 AM, Sven R. Kunze wrote:
>> Python = English
> As someone who writes English text and code using speech recognition, I
> can assure you that Python is not English. :-)

:D Interesting. Never thought of how Python sounds when spoken.

Btw. the equivalence was meant more in the context of this thread. ;)

Best,
Sven
Re: Threading is foobared?
On 30.03.2016 01:43, Steven D'Aprano wrote:
> On Tue, 29 Mar 2016 09:26 pm, Sven R. Kunze wrote:
>> On 27.03.2016 05:01, Steven D'Aprano wrote:
>>> Am I the only one who has noticed that threading of posts here is
>>> severely broken? It's always been the case that there have been a few
>>> posts here and there that break threading, but now it seems to be much
>>> more common.
>> I agree. Didn't we both already have a conversation about this? I thought
>> it was my Thunderbird messing things up.
> I'm not using Thunderbird, so whatever the cause of the problem, it is not
> specific to Thunderbird.

Haha, how nice. My thread view shows your reply as a sibling, not a child, of my mail. I assume you replied to my mail. How strange.

Best,
Sven
Re: [OT] C# -- sharp or carp? was Re: Learning Python (or Haskell) makes you a worse programmer
On 29.03.2016 18:05, Peter Otten wrote:
> Reformatting it a bit
>
> String.Join(
>     "\n",
>     mylist.Where(
>         foo => !String.IsNullOrEmpty(foo.description)
>     ).Select(
>         foo => foo.description))
>
> this looks like a variant of Python's
>
> str.join(
>     "\n",
>     map(lambda foo: foo.description,
>         filter(lambda foo: foo.description,
>                mylist)))
>
> Assuming it's type-safe and can perhaps reshuffle the where and select
> parts into something optimised, there is definitely progress. But still,
> Python's generator expressions are cool.

Haha, sure. But don't get stuck there. Learn something new from time to time; even a new language.

Best,
Sven
Re: [OT] C# -- sharp or carp? was Re: Learning Python (or Haskell) makes you a worse programmer
On 29.03.2016 12:18, Sven R. Kunze wrote:
> On 29.03.2016 11:39, Peter Otten wrote:
>> My question to those who know a bit of C#: what is the state-of-the-art
>> equivalent to
>>
>> "\n".join(foo.description() for foo in mylist if foo.description() != "")
> Using LINQ, I suppose:
> https://en.wikipedia.org/wiki/Language_Integrated_Query

A friend of mine told me something like this:

String.Join("\n", mylist.Where(foo => !String.IsNullOrEmpty(foo.description)).Select(foo => foo.description))

[untested, but from what I know it's quite correct]

Best,
Sven
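For comparison, the Python side of Peter's question can be run as-is with a tiny stand-in class (Foo and the sample data are made up for the example; note that an empty string is falsy, so the `!= ""` test can be shortened):

```python
class Foo:
    """Minimal stand-in for the objects in Peter's example."""
    def __init__(self, description):
        self.description = description

mylist = [Foo('first'), Foo(''), Foo('third')]

# Generator expression: filter out empty descriptions and join the rest.
result = "\n".join(foo.description for foo in mylist if foo.description)
print(result)  # first\nthird
```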
Re: Exclude every nth element from list?
On 26.03.2016 18:06, Peter Otten wrote:
> beliavsky--- via Python-list wrote:
>> I can use x[::n] to select every nth element of a list. Is there a
>> one-liner to get a list that excludes every nth element?
> del x[::n]
>
> ;)

Actually quite nice.
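A quick demonstration of the trick (note that, like x[::n], the deleted elements are those at indices 0, n, 2n, ...):

```python
x = list(range(10))
del x[::3]      # removes indices 0, 3, 6, 9 in place
print(x)        # [1, 2, 4, 5, 7, 8]

# The slice is exactly the complement: y[::3] is what was removed.
y = list(range(10))
print(y[::3])   # [0, 3, 6, 9]
```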
Re: Threading is foobared?
On 27.03.2016 05:01, Steven D'Aprano wrote:
> Am I the only one who has noticed that threading of posts here is severely
> broken? It's always been the case that there have been a few posts here
> and there that break threading, but now it seems to be much more common.

I agree. Didn't we both already have a conversation about this? I thought it was my Thunderbird messing things up.

Best,
Sven
Re: newbie question
On 28.03.2016 17:34, ast wrote:
> "Matt Wheeler" wrote in message
> news:mailman.92.1458825746.2244.python-l...@python.org...
>> On Thu, 24 Mar 2016 11:10 Sven R. Kunze, wrote:
>>> On 24.03.2016 11:57, Matt Wheeler wrote:
>>>> >>> import ast
>>>> >>> s = "(1, 2, 3, 4)"
>>>> >>> t = ast.literal_eval(s)
>>>> >>> t
>>>> (1, 2, 3, 4)
>>> I suppose that's the better solution in terms of safety.
>> It has the added advantage that the enquirer gets to import a module that
>> shares their name ;)
> I had a look at that "ast" module doc, but I must admit that I didn't
> understand a lot of things.

If there were a module "srkunze", I think I would be equally surprised. ;)

Best,
Sven
Re: [OT] C# -- sharp or carp? was Re: Learning Python (or Haskell) makes you a worse programmer
On 29.03.2016 11:39, Peter Otten wrote: My question to those who know a bit of C#: what is the state-of-the-art equivalent to "\n".join(foo.description() for foo in mylist if foo.description() != "") Using LINQ, I suppose: https://en.wikipedia.org/wiki/Language_Integrated_Query -- https://mail.python.org/mailman/listinfo/python-list
Re: Learning Python (or Haskell) makes you a worse programmer
On 29.03.2016 06:13, Michael Torrie wrote: On 03/28/2016 06:44 PM, Steven D'Aprano wrote: http://lukeplant.me.uk/blog/posts/why-learning-haskell-python-makes-you-a-worse-programmer/ I have the same problem as the writer. Working in Python makes me really dislike working in any other language! Python = English :) -- https://mail.python.org/mailman/listinfo/python-list
Re: Which are best, well-tested ways to create REST services, with Json, in Python?
Not heard of any, but I can recommend django-restframework. We've got good experience with that.

On 28.03.2016 23:06, David Shi via Python-list wrote:
> Has anyone done a recent review of creating REST services in Python?
>
> Regards,
> David
Re: newbie question
On 24.03.2016 14:22, Matt Wheeler wrote:
> On Thu, 24 Mar 2016 11:10 Sven R. Kunze, wrote:
>> On 24.03.2016 11:57, Matt Wheeler wrote:
>>> >>> import ast
>>> >>> s = "(1, 2, 3, 4)"
>>> >>> t = ast.literal_eval(s)
>>> >>> t
>>> (1, 2, 3, 4)
>> I suppose that's the better solution in terms of safety.
> It has the added advantage that the enquirer gets to import a module that
> shares their name ;)

One shouldn't underestimate this. ;-)
Re: newbie question
On 24.03.2016 11:57, Matt Wheeler wrote:
> >>> import ast
> >>> s = "(1, 2, 3, 4)"
> >>> t = ast.literal_eval(s)
> >>> t
> (1, 2, 3, 4)

I suppose that's the better solution in terms of safety.
Re: monkey patching __code__
On 23.03.2016 09:24, dieter wrote:
> But you have observed that you cannot do everything with a code
> substitution: a function call does not only depend on the code but also on
> other properties of the function object: e.g. the parameter processing.

Yep, that's because Python is very flexible and provides means for changing even that. So, it's not part of the __code__ object but part of the actual function. That's okay.

> You might be able to change them in a similar way as "__code__" (i.e.
> direct modification). Otherwise, you would need to construct a new
> "function object" -- and lose the possibility to completely change the
> function object in place.

Exactly. Except for __globals__, we are all set, and I think that'll work for us. I will report back once we've implemented it that way.

Best,
Sven
how to cache invalidation
Hi everybody,

I got another module up and running: xcache

Background described here:
http://srkunze.blogspot.com/2016/03/safe-cache-invalidation.html

We needed a way to safely invalidate lru_caches once a Web request has finished. So, we came up with a solution using garbage collection and extended this to context managers for other purposes.

Maybe it can be useful to other Python devs as well. :-) Let me know if you need help with it.

Best,
Sven
Re: monkey patching __code__
On 21.03.2016 21:42, Matt Wheeler wrote:
> On 20 March 2016 at 16:46, Sven R. Kunze wrote:
>> On 19.03.2016 00:58, Matt Wheeler wrote:
>>> I know you have a working solution now with updating the code & defaults
>>> of the function, but what about just injecting your function into the
>>> modules that had already imported it after the monkeypatching? Seems
>>> perhaps cleaner, unless you'd end up having to do it to lots of
>>> modules...
>> Why do you consider it cleaner?
> I think it would be more explicit and understandable for someone reading
> your code. I suppose it's quite subjective :)

As far as I can see, the code replacement approach solves the problem once and for all, and is thus far more stable. Manually finding out every single module that might or might not have imported "reverse" before we could monkeypatch it would result in a maintenance nightmare (just think about a Django upgrade).

It reminds me of list replacement:

mylist = newlist
mylist[:] = newlist

The latter keeps the reference stable whereas the former does not. Same with monkeypatching.

Best,
Sven
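The list analogy can be checked in a few lines; the alias plays the role of a module that imported the object before the patch:

```python
newlist = [1, 2, 3]
mylist = [9, 9]
alias = mylist           # a second reference to the same object

mylist[:] = newlist      # in-place replacement: identity preserved
print(alias)             # [1, 2, 3] - the alias sees the change

mylist = [4, 5]          # rebinding: the alias is left behind
print(alias)             # still [1, 2, 3]
```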
Re: monkey patching __code__
On 19.03.2016 00:58, Matt Wheeler wrote:
> I know you have a working solution now with updating the code & defaults
> of the function, but what about just injecting your function into the
> modules that had already imported it after the monkeypatching? Seems
> perhaps cleaner, unless you'd end up having to do it to lots of modules...

Why do you consider it cleaner?

Best,
Sven
Re: empty clause of for loops
On 16.03.2016 16:02, Tim Chase wrote:
> On 2016-03-16 15:29, Sven R. Kunze wrote:
>> I would re-use the "for-else" for this. Every time I thought I could make
>> use of the "-else" clause, I was disappointed I couldn't.
> Hmm...this must be a mind-set thing. I use the "else" clause with
> for/while loops fairly regularly and would be miffed if their behavior
> changed. Could I work around their absence? Certainly. Does it annoy me
> when I have to work in other languages that lack Python's {for/while}/else
> functionality? You bet.

I can imagine that. Could you describe the general use-case? From what I know, "else" is executed when you don't "break" out of the loop. When is this useful?

Btw., I don't have any issue with "else" or whatever it is called. It's just a word, but it must fit intuition. And this is why I would rather see "else" being used there. But this may be down to my lack of usage. We could also re-use "except" for it. ;-)

Best,
Sven
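The canonical use-case is a search loop: the `else` suite runs only when the loop completed without `break`, i.e. when nothing was found. A small self-contained sketch (the function name is made up for the example):

```python
def find_first_even(numbers, default=None):
    for n in numbers:
        if n % 2 == 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting break,
        # i.e. no even number was found.
        result = default
    return result

print(find_first_even([1, 3, 4]))  # 4
print(find_first_even([1, 3, 5]))  # None
```

Without `for-else`, the same logic needs a flag variable or a sentinel, which is exactly the pattern discussed later in this thread.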
Re: empty clause of for loops
On 16.03.2016 18:08, Random832 wrote:
> Yeah, well, you can *almost* get there with:
>
> try:
>     thing = next(item for item in collection if good(item))
> except StopIteration:
>     thing = default
>
> But the for/else thing seems like a more natural way to do it. Plus, this
> is a toy example; if the body is more than one statement or doesn't
> involve returning a value, comprehensions aren't a good fit.

Sure, YMMV. What I don't understand is why Python features "if break, then no else clause", but not "if empty, then empty clause".

I found this excellent post: https://shahriar.svbtle.com/pythons-else-clause-in-loops

The described break-else replacement greatly resembles the answers in this thread:

condition_is_met = False
for x in data:
    if meets_condition(x):
        condition_is_met = True
if not condition_is_met:
    # raise error or do additional processing

Compared to the proposed empty-clause replacement:

empty = True
for item in items:
    empty = False
    ...
if empty:
    ...

In order to explain why this might be slightly more important to us than to other folks: we work in the field of Web development. As humans are no machines, they usually expect an empty list to be marked as such OR special actions when lists are not filled as expected. Even Django ({% empty %}) and Jinja ({% else %}) feature this type of construct. You might think it's enough when template engines work this way (the output layer). However, I quite regularly find this useful within the logic part (the actions) of our applications as well.

Do you think this would be worth posting on python-ideas?

Best,
Sven
Re: monkey patching __code__
On 18.03.2016 14:47, Ian Kelly wrote:
> Your patched version takes two extra arguments. Did you add the defaults
> for those to the function's __defaults__ attribute?

That's it! :-) Thanks a lot. Just to understand this better: why is that not part of the code object but part of the function?

> This sounds like a pretty hairy thing that you're trying to do. Surely
> there must be some better way to accomplish the same goal.

We are open to suggestions. We featured our own reverse function for a while, but it led to inconsistent behavior across the field. Especially considering that Django provides an {% url %} template tag which would then use yet another reverse implementation.

Best,
Sven
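On the "why" question: the code object holds only compile-time information, while default values are ordinary objects evaluated at definition time, so they live on the function. A toy in-place patch illustrating both attributes (this is just a stand-in, not the actual Django reverse patching):

```python
def original(x):
    return x + 1

def patched(x, offset=10):
    return x + offset

alias = original  # simulates a module that imported the old function

# Replace the behaviour in place: every existing reference is affected.
original.__code__ = patched.__code__
# The new code expects a second parameter, so the matching default
# must be copied too - it lives on the function, not on the code object.
original.__defaults__ = patched.__defaults__

print(alias(1))  # 11 - the pre-existing reference sees the new code
```

Without the `__defaults__` assignment, calling `alias(1)` would fail because the new code object expects two arguments but the old function carries no defaults.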
Re: empty clause of for loops
On 16.03.2016 11:47, Peter Otten wrote: What would you expect? A keyword filling the missing functionality? Some Python magic I haven't seen before. ;-)

    >>> class Empty(Exception): pass
    ...
    >>> def check_empty(items):
    ...     items = iter(items)
    ...     try:
    ...         yield next(items)
    ...     except StopIteration:
    ...         raise Empty
    ...     yield from items
    ...
    >>> try:
    ...     for item in check_empty("abc"): print(item)
    ... except Empty: print("oops")
    ...
    a
    b
    c
    >>> try:
    ...     for item in check_empty(""): print(item)
    ... except Empty: print("oops")
    ...
    oops

He will be highly delighted to see such a simplistic solution. ;-) I'm kidding, of course. Keep it simple and use a flag like you would in any other language:

    empty = True
    for item in items:
        empty = False
        ...
    if empty:
        ...

He likes this approach. Thanks. :-) Although, I for one would like a keyword. I remember having this issue myself, and found that the "empty" variable approach is more like a pattern. As usual, patterns are workarounds for features that a language misses. Best, Sven
Re: empty clause of for loops
On 18.03.2016 20:10, Palpandi wrote: You can do like this:

    if not my_iterable:
        ...
    for x in my_iterable:
        ...

Thanks for your help here; however, as already pointed out, my_iterable is not necessarily a list but more likely an exhaustible iterator/generator. Best, Sven
Re: empty clause of for loops
On 16.03.2016 13:57, Peter Otten wrote: I'd put that the other way round: syntactical support for every pattern would make for a rather unwieldy language. You have to choose carefully, and this requirement could easily be fulfilled by a function, first in your personal toolbox, then in a public library, then in the stdlib. If you don't like exceptions, implement (or find) something like

    items = peek(items)
    if items.has_more():
        # at least one item
        for item in items:
            ...
    else:
        # empty

Only if such a function is used a lot or cannot be conceived without severe shortcomings should adding to the syntax be considered. The (hypothetical) question you should answer: which current feature would you throw out to make room for your cool new addition?

I am glad you asked. ;-) I would re-use the "for-else" for this. Every time I thought I could make use of the "else" clause, I was disappointed I couldn't. I find the addition to the for-loop as useful as the already quite complex try-except-else-finally construct. I don't know why for-loops couldn't benefit from this as well. Best, Sven
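Peter's hypothetical `peek()` helper could be sketched like this (an illustrative implementation, not an existing library):

```python
class peek:
    """Wrap an iterable so emptiness can be tested before iterating."""
    _SENTINEL = object()

    def __init__(self, iterable):
        self._it = iter(iterable)
        # eagerly fetch the first item (or the sentinel if empty)
        self._head = next(self._it, self._SENTINEL)

    def has_more(self):
        return self._head is not self._SENTINEL

    def __iter__(self):
        # hand back the buffered head first, then the rest
        while self._head is not self._SENTINEL:
            item = self._head
            self._head = next(self._it, self._SENTINEL)
            yield item
```

This works on one-shot generators too, since only a single item is buffered.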
Re: monkey patching __code__
On 18.03.2016 15:48, Ian Kelly wrote: Well I didn't design it, so I'm not really sure. But it could be argued that the defaults are intrinsic to the function declaration, not the code object, as not all code objects even have arguments. It also makes it straight-forward to create a new function that uses the same code but with different defaults or globals. It occurred to me after I sent that email. However, changing globals is not possible. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Replace weird error message?
On 16.03.2016 19:53, Ben Finney wrote: Do you think some better error message should be used? Yes, I think that error message needs to be improved. Please file a bug report in Python's issue tracker <https://bugs.python.org/>. For example a hint that "0" does work for the given argument. I suggest: “zero-padding only allowed for numeric types, not 'str'”. That error message would make it very clear. +1
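For context, a short demonstration of the error under discussion; the exact message wording may differ between Python versions:

```python
# zero-padding a string with "{:03}" triggers the confusing message,
# because a leading "0" implies "=" alignment, which str does not support
try:
    '{:03}'.format('4')
    message = None
except ValueError as exc:
    message = str(exc)  # e.g. "'=' alignment not allowed in string format specifier"

# an explicit fill-and-align spec does work for strings
padded = '{:0>3}'.format('4')  # '004'
```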
Re: monkey patching __code__
On 18.03.2016 15:23, Ian Kelly wrote: On Fri, Mar 18, 2016 at 7:47 AM, Ian Kelly wrote: Your patched version takes two extra arguments. Did you add the defaults for those to the function's __defaults__ attribute? And as an afterthought, you'll likely need to replace the function's __globals__ with your own as well. Thanks again. :-) Again, why would it make sense for those dunder attributes to be part of the function but not of the code object? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: empty clause of for loops
On 16.03.2016 17:37, Random832 wrote: On Wed, Mar 16, 2016, at 11:17, Sven R. Kunze wrote: I can imagine that. Could you describe the general use-case? From what I know, "else" is executed when you don't "break" the loop. When is this useful?

    for item in collection:
        if good(item):
            thing = item
            break
    else:
        thing = default  # or raise an exception, etc.

I was thinking about why we don't use it that often. My response to this example:

    thing = item if item in collection else default
Re: empty clause of for loops
On 16.03.2016 13:08, Steven D'Aprano wrote: Doing what? What is the code supposed to do? What's "empty" mean as a keyword? If you explain what your friends wants, then perhaps we can suggest something. Otherwise we're just guessing. I can think of at least two different meanings: * run the "empty" block only if my_iterable is empty at the start of the first loop; Solutions of most commenters already tackled this one, which is (at least from my point of view) the most intuitive. * run the "empty" block if my_iterable becomes empty after the first loop. I think I can imagine where this is coming from but this was not the initial use-case. I think Tim's answer (count approach) would provide a solution for this (from my point of view) rather rare use-case. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: empty clause of for loops
On 16.03.2016 15:29, Sven R. Kunze wrote: On 16.03.2016 13:57, Peter Otten wrote: I'd put that the other way round: syntactical support for every pattern would make for a rather unwieldy language. You have to choose carefully, and this requirement could easily be fulfilled by a function, first in your personal toolbox, then in a public library, then in the stdlib. If you don't like exceptions, implement (or find) something like

    items = peek(items)
    if items.has_more():
        # at least one item
        for item in items:
            ...
    else:
        # empty

Only if such a function is used a lot or cannot be conceived without severe shortcomings should adding to the syntax be considered. The (hypothetical) question you should answer: which current feature would you throw out to make room for your cool new addition?

I am glad you asked. ;-) I would re-use the "for-else" for this. [Every time] I thought I could make use of the "else" clause, I was disappointed I couldn't. I find the addition to the for-loop as useful as the already quite complex try-except-else-finally construct. I don't know why for-loops couldn't benefit from this as well. Best, Sven
Re: empty clause of for loops
On 16.03.2016 14:58, alister wrote: no, I just typed it while trying to hold a conversation with SWMBO :-( apologies to the OP if he could not see where I was intending to go with this. No problem, I perform quite well at guessing folks' intentions. So, yes, I can extrapolate what you meant. Thanks. :) Best, Sven
Re: empty clause of for loops
On 16.03.2016 14:09, Tim Chase wrote: If you can call len() on it, then the obvious way is

    if my_iterable:
        for x in my_iterable:
            do_something(x)
    else:
        something_else()

However, based on your follow-up that it's an exhaustible iterator rather than something you can len(), I'd use enumerate:

    count = 0  # have to set a default since it doesn't get assigned
               # if no iteration happens
    for count, x in enumerate(my_iterable, 1):
        do_something(x)
    if not count:
        something_else()

Interesting variation. Good to keep in mind if I encounter a situation where I need both (empty flag + counter). Thanks. :)

I do a lot of ETL work, and my code often has to report how many things were processed, so having that count is useful to me. Otherwise, I'd use a flag:

    empty = True
    for x in my_iterable:
        empty = False
        do_something(x)
    if empty:
        something_else()

Best, Sven
Re: empty clause of for loops
On 16.03.2016 17:20, Terry Reedy wrote: On 3/16/2016 11:17 AM, Sven R. Kunze wrote: On 16.03.2016 16:02, Tim Chase wrote: Does it annoy me when I have to work in other languages that lack Python's {for/while}/else functionality? You bet. I can imagine that. Could you describe the general use-case? From what I know, "else" is executed when you don't "break" the loop. When is this useful? When one wants to know if the iterable contained an exceptional item or not. I don't think that's all, since I wouldn't need the "else" clause here. The important part is that there is code attached to the case of "item found". That naturally leads to: what if I want to attach code to the case of "no item found"? Or when one wants to know if the iterator is exhausted or not. That's not 100% true, is it? The break could happen while iterating over the last item of the iterable. I vaguely remember this being a problem, which is why I always needed to dismiss the "else" idea in my code because of that very corner case. Best, Sven
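Sven's corner case can be checked directly: `else` reports whether `break` ran, not whether the iterator was exhausted, so breaking on the very last item still skips the `else` block. A small sketch:

```python
def ran_else(data, target):
    """Return True iff the loop finished without break."""
    for x in data:
        if x == target:
            break
    else:
        return True
    return False

# breaking on the last item: the iterable has yielded everything,
# yet the else clause is still skipped
else_when_found_last = ran_else([1, 2, 3], 3)   # False
else_when_not_found = ran_else([1, 2, 3], 99)   # True
```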
Re: empty clause of for loops
On 16.03.2016 17:56, Sven R. Kunze wrote: On 16.03.2016 17:37, Random832 wrote: On Wed, Mar 16, 2016, at 11:17, Sven R. Kunze wrote: I can imagine that. Could you describe the general use-case? From what I know, "else" is executed when you don't "break" the loop. When is this useful?

    for item in collection:
        if good(item):
            thing = item
            break
    else:
        thing = default  # or raise an exception, etc.

I was thinking about why we don't use it that often. My response to this example:

    thing = item if item in collection else default

Time for a break. That is not going to work. I will still think about why we don't use it (often/at all). Best, Sven
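The working one-liner for "first good item or default" is `next()` with a default argument, which is presumably what the conditional expression was reaching for (`good` here is an illustrative predicate):

```python
def good(item):
    # example predicate, assumed for illustration
    return item > 10

collection = [3, 7, 42, 5]
default = None

# generator expression + next() replaces the for/break/else idiom
thing = next((item for item in collection if good(item)), default)    # 42
nothing = next((item for item in [1, 2] if good(item)), default)      # None
```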
Re: monkey patching __code__
On 18.03.2016 15:33, Sven R. Kunze wrote: On 18.03.2016 15:23, Ian Kelly wrote: On Fri, Mar 18, 2016 at 7:47 AM, Ian Kelly wrote: Your patched version takes two extra arguments. Did you add the defaults for those to the function's __defaults__ attribute? And as an afterthought, you'll likely need to replace the function's __globals__ with your own as well.

    def f(a, b=None, c=None):
        print(a, b, c)

    def f_patch(a, b=None, c=None, d=None, e=None):
        print(a, b, c, d, e)

    f.__code__ = f_patch.__code__
    f.__defaults__ = f_patch.__defaults__
    f.__kwdefaults__ = f_patch.__kwdefaults__
    f.__globals__ = f_patch.__globals__  # <<<<< crashes here with "AttributeError: readonly attribute"
    f('a', e='e')

It seems like we need to work with the globals we have, i.e. import things locally. Best, Sven
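Since `__globals__` is read-only, one possible workaround (a sketch, not necessarily suitable for the Django case) is to build a fresh function object around the patched code with `types.FunctionType`, passing a globals dict of your choosing:

```python
import types

def f(a, b=None):
    # GREETING is looked up in the function's globals at call time
    return (a, b, GREETING)

# supply a globals dict of our choosing instead of mutating __globals__
custom_globals = dict(f.__globals__, GREETING='hello')

g = types.FunctionType(
    f.__code__,        # reuse the compiled code
    custom_globals,    # ...but bind it to the new globals
    f.__name__,
    f.__defaults__,
    f.__closure__,
)
```

Calling `g('x')` then resolves `GREETING` from `custom_globals`; the original `f` would raise NameError instead.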
Re: empty clause of for loops
On 17.03.2016 01:27, Steven D'Aprano wrote: That post describes the motivating use-case for the introduction of "if...else", and why break skips the "else" clause:

    for x in data:
        if meets_condition(x):
            break
    else:
        # raise error or do additional processing

It might help to realise that the "else" clause is misnamed. It should be called "then":

    for x in data:
        block
    then:
        block

The "then" (actually "else") block is executed *after* the for-loop, unless you jump out of that chunk of code by raising an exception, calling return, or break. As a beginner, it took me years of misunderstanding before I finally understood for...else and while...else, because I kept coming back to the thought that the else block was executed if the for/while block *didn't* execute.

That's true. I needed to explain this to a few people and I always need several attempts/starts to get it right in a simple statement: 'If you do a "break", then "else" is NOT executed.' I think the "NOT" results in heavy mental lifting.

I couldn't get code with for...else to work right and I didn't understand why until finally the penny dropped and I realised that "else" should be called "then".

That's actually a fine idea. One could even say: "finally". Best, Sven
Re: Bash-like pipes in Python
On 16.03.2016 16:09, Joel Goldstick wrote: ...symbol '|' in Python. Can you elaborate? Bitwise or.
monkey patching __code__
Hi, we got an interesting problem. We need to monkey-patch Django's reverse function.

First approach:

    urlresolvers.reverse = patched_reverse

Problem: some of Django's internal modules import urlresolvers.reverse before we can patch it, for some reason.

Second approach:

    urlresolvers.reverse.__code__ = patched_reverse.__code__

Unfortunately, we got this error:

    >>> reverse('login')
    patched_reverse() takes at least 3 arguments (1 given)

These are the functions' signatures:

    def patched_reverse(viewname, urlconf=None, args=None, kwargs=None,
                        prefix=None, current_app=None, get=None, fragment=None):

    def reverse(viewname, urlconf=None, args=None, kwargs=None,
                prefix=None, current_app=None):

Any ideas? Best, Sven
Re: empty clause of for loops
On 16.03.2016 11:28, Joaquin Alzola wrote:

    if len(my_iterable) != 0:
        for x in my_iterable:
            # do
    else:
        # do something else

I am sorry, I should have been more precise here: my_iterable is an iterator that's exhausted after a complete iteration and cannot be restored. It's furthermore too large to fit into memory completely. Best, Sven
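For such a one-shot iterator, a common pattern is to pull the first item with a sentinel and chain it back in front; a sketch (the helper name is made up):

```python
from itertools import chain

def iterate_or_else(iterable, do_something, something_else):
    _missing = object()
    it = iter(iterable)
    first = next(it, _missing)      # consume at most one item
    if first is _missing:
        something_else()            # the "empty" branch
    else:
        # put the consumed item back in front of the rest
        for x in chain([first], it):
            do_something(x)
```

This buffers only a single item, so it works for iterators too large to fit into memory.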
empty clause of for loops
Hi, a colleague of mine (I write this mail because I am on the list) has the following issue:

    for x in my_iterable:
        # do
    empty:
        # do something else

What's the most Pythonic way of doing this? Best, Sven
Re: argparse
On 12.03.2016 00:18, Fillmore wrote: Playing with ArgumentParser. I can't find a way to override the -h and --help options so that it provides my custom help message. I remember everything being a lot easier using argh instead of argparse: https://pypi.python.org/pypi/argh#examples The docstring of a function basically is the help string, and the same holds for arguments. I hope that helps even though you asked for argparse explicitly. :-) Best, Sven
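Staying with argparse itself: the stock way to customize the help option is to disable the automatic one via `add_help=False` and register your own `action='help'` argument, optionally overriding `description` and `epilog`. A sketch with made-up texts:

```python
import argparse

parser = argparse.ArgumentParser(
    add_help=False,                      # suppress the built-in -h/--help
    description='My custom description',
    epilog='See the manual for details.',
)
# re-add -h/--help with our own wording
parser.add_argument('-h', '--help', action='help',
                    help='show this custom help message and exit')
parser.add_argument('things', nargs='*')

args = parser.parse_args(['a', 'b'])
```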
Re: even faster heaps
On 09.03.2016 19:19, Sven R. Kunze wrote: ps: there are two error's when i ran tests with test_xheap. Damn. I see this is Python 2 and Python 3 related. Thanks for bringing this to my attention. I am going to fix this soon. Fixed. -- https://mail.python.org/mailman/listinfo/python-list
Re: even faster heaps
On 06.03.2016 14:59, Sven R. Kunze wrote: Using the original xheap benchmark <http://srkunze.blogspot.de/2016/02/the-xheap-benchmark.html>, I could see huge speedups: from 50x/25x down to 3x/2x compared to heapq. That's a massive improvement. I will publish an update soon. An here it is: http://srkunze.blogspot.com/2016/03/even-faster-heaps.html Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: even faster heaps
x) 106.80 ( 1.82x) 1108.52 ( 1.84x)')
(u'pop ', u'heapq ', u' 0.10 ( 1.00x) 1.41 ( 1.00x) 17.64 ( 1.00x) 209.82 ( 1.00x)')
(u'', u'Heap ', u' 0.11 ( 1.07x) 1.47 ( 1.04x) 18.27 ( 1.04x) 215.14 ( 1.03x)')
(u'', u'RemovalHeap', u' 0.15 ( 1.52x) 1.91 ( 1.35x) 22.64 ( 1.28x) 258.68 ( 1.23x)')
(u'push', u'heapq ', u' 0.04 ( 1.00x) 0.32 ( 1.00x) 3.49 ( 1.00x) 33.92 ( 1.00x)')
(u'', u'Heap ', u' 0.05 ( 1.18x) 0.39 ( 1.22x) 4.21 ( 1.20x) 42.03 ( 1.24x)')
(u'', u'RemovalHeap', u' 0.06 ( 1.52x) 0.52 ( 1.62x) 5.60 ( 1.60x) 56.54 ( 1.67x)')
(u'init', u'heapq', u' 0.44 ( 1.00x) 7.92 ( 1.00x) 106.52 ( 1.00x) 915.20 ( 1.00x)')
(u'', u'OrderHeap', u' 0.50 ( 1.14x) 8.67 ( 1.10x) 111.99 ( 1.05x) 1129.89 ( 1.23x)')
(u'', u'XHeap', u' 0.61 ( 1.38x) 10.75 ( 1.36x) 140.86 ( 1.32x) 1417.84 ( 1.55x)')
(u'pop ', u'heapq', u' 0.04 ( 1.00x) 0.55 ( 1.00x) 6.59 ( 1.00x) 76.81 ( 1.00x)')
(u'', u'OrderHeap', u' 0.06 ( 1.68x) 0.79 ( 1.43x) 9.04 ( 1.37x) 101.72 ( 1.32x)')
(u'', u'XHeap', u' 0.09 ( 2.43x) 1.03 ( 1.85x) 11.48 ( 1.74x) 125.94 ( 1.64x)')
(u'push', u'heapq', u' 0.01 ( 1.00x) 0.16 ( 1.00x) 1.85 ( 1.00x) 14.65 ( 1.00x)')
(u'', u'OrderHeap', u' 0.04 ( 4.32x) 0.46 ( 2.81x) 4.74 ( 2.56x) 42.25 ( 2.88x)')
(u'', u'XHeap', u' 0.03 ( 3.73x) 0.42 ( 2.55x) 4.34 ( 2.35x) 38.37 ( 2.62x)')
(u'remove', u'RemovalHeap', u' 0.05 ( 1.00x) 0.54 ( 1.00x) 5.62 ( 1.00x) 57.04 ( 1.00x)')
(u' ', u'XHeap ', u' 0.04 ( 0.88x) 0.44 ( 0.80x) 4.41 ( 0.78x) 43.86 ( 0.77x)')

So, as the results are not much affected apart from __init__, I think you should consider this. Looks promising. I will look into it. Note: I'm not using collections.Counter because it is written in Python, and from my previous experience it is slower than using defaultdict for this kind of purpose. That's exactly why the __setitem__ implementation was so slow. It did not use the C implementation. I think I could reduce the overhead even further by having my own global/thread-local integer counter. Stay tuned.
;-) PS: there are two errors when I ran the tests with test_xheap. Damn. I see this is Python 2 and Python 3 related. Thanks for bringing this to my attention. I am going to fix this soon. OMG, why did you keep repititions = 10000? At first I thought my Pentium laptop was too slow, since it was not printing a single line even after 20 minutes; then I saw that the number of computations is in the order of 10**10. How many days did it take to completely run the tests? I don't know. I let it run overnight since I wasn't at home. :) But you are right. I re-executed the benchmark and compared 100, 1000 and 1 with each other. Almost no difference at all. I am going to reduce it to 100. So, it takes ca. 8 minutes on my machine. Thanks for your feedback, Sven

On Sun, Mar 6, 2016 at 7:29 PM, Sven R. Kunze wrote: Hi python-list, hi Srinivas, I managed to implement the mark&sweep approach for fast removal from heaps. This way, I got three pleasant results: 1) a substantial speed up! 2) an improved testsuite 3) discovery and fixing of several bugs. @Srinivas: I would be honored if you could have a look at the implementation: https://github.com/srkunze/xheap . After all, it was your idea. I only perform the sweeping step during pop and remove, with your condition. :) Using the original xheap benchmark, I could see huge speedups: from 50x/25x down to 3x/2x compared to heapq. That's a massive improvement. I will publish an update soon. Best, Sven
Re: reversed(zip(...)) not working as intended
On 06.03.2016 19:51, Tim Chase wrote: So it looks like one needs to either

    results = reversed(list(zip(...)))

or, more efficiently (doing it with one less duplication of the list),

    results = list(zip(...))
    results.reverse()

Nice idea. :) Unfortunately, I hit this while drafting some unittests and I love SHORT one-liners:

    for c in reversed(zip(ascii_lowercase, ascii_uppercase)):
        ...

ooops. :-/ Best, Sven
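For the record, the one-liner only needs an extra `list()` call around `zip()`:

```python
from string import ascii_lowercase, ascii_uppercase

# materialize the zip iterator first, then reverse the list
pairs = [c for c in reversed(list(zip(ascii_lowercase, ascii_uppercase)))]
# the last pair of the alphabet now comes first: ('z', 'Z')
```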
Re: reversed(zip(...)) not working as intended
On 06.03.2016 19:53, Peter Otten wrote: Sven R. Kunze wrote: what's the reason that reversed(zip(...)) raises a TypeError? Would allowing reversed to handle zip and related functions lead to strange errors?

In Python 3, zip() can deal with infinite iterables -- what would you expect from reversed(zip(count()))?

I have no idea. ;-) But to me, "infinite" feels not like the most common use-case to most developers. I just stumbled over it during rapid test case development:

    for c in reversed(zip(ascii_lowercase, ascii_uppercase)):
        ...

Bam! That doesn't work although the code clearly describes what to do. :-(

If all arguments of zip() are finite and of equal length you can write

    zip(reversed(a1), reversed(a2), ...)

or, if you find that really useful, something like

    >>> class myzip(zip):
    ...     def __init__(self, *args):
    ...         self.args = args
    ...     def __reversed__(self):
    ...         return zip(*(reversed(a) for a in self.args))
    ...
    >>> list(reversed(myzip("abc", [1,2,3])))
    [('c', 3), ('b', 2), ('a', 1)]

While this might look right at first sight, it really opens a can of worms. First zip():

    >>> z = zip("abc", "def")
    >>> next(z)
    ('a', 'd')
    >>> list(z)
    [('b', 'e'), ('c', 'f')]

Now myzip():

    >>> m = myzip("abc", "def")
    >>> next(m)
    ('a', 'd')
    >>> list(reversed(m))
    [('c', 'f'), ('b', 'e'), ('a', 'd')]

Frankly, I have no idea what consistent behaviour should look like for a zip() that can be "reverse-iterated".

PS: In Python 2, zip() would produce a list, so

    >>> list(reversed(zip("abc", "def")))
    [('c', 'f'), ('b', 'e'), ('a', 'd')]

worked without requiring any code in zip().
reversed(zip(...)) not working as intended
Hi, what's the reason that reversed(zip(...)) raises a TypeError? Would allowing reversed to handle zip and related functions lead to strange errors? Best, Sven
even faster heaps
Hi python-list, hi Srinivas, I managed to implement the mark&sweep approach for fast removal from heaps. This way, I got three pleasant results:

1) a substantial speed up!
2) an improved testsuite
3) discovery and fixing of several bugs

@Srinivas: I would be honored if you could have a look at the implementation: https://github.com/srkunze/xheap . After all, it was your idea. I only perform the sweeping step during pop and remove, with your condition. :)

Using the original xheap benchmark <http://srkunze.blogspot.de/2016/02/the-xheap-benchmark.html>, I could see huge speedups: from 50x/25x down to 3x/2x compared to heapq. That's a massive improvement. I will publish an update soon. Best, Sven
Re: [Off-topic] Requests author discusses MentalHealthError exception
On 01.03.2016 13:13, Steven D'Aprano wrote: On Tue, 1 Mar 2016 09:38 am, Larry Martell wrote: But what is reality? Reality is that which, when you stop believing in it, doesn't go away. Just like that. -- https://mail.python.org/mailman/listinfo/python-list
Re: Everything good about Python except GUI IDE?
On 28.02.2016 07:34, Steven D'Aprano wrote: I think that's out-and-out wrong, and harmful to the developer community. I think that we're stuck in the equivalent of the pre-WYSIWYG days of word processing: you can format documents as nicely as you like, but you have to use a separate mode to see it. Good point. Drag-and-drop GUI builders have the same advantages over code as Python has over languages with distinct compile/execute steps: rapid development, prototyping, exploration and discovery. Of course, any decent modern builder won't limit you to literally drag-and-drop, but will offer functionality like duplicating elements, aligning them, magnetic guides, etc. Another good point. I will get to this later. GUI elements are by definition graphical in nature, and like other graphical elements, manipulation by hand is superior to command-based manipulation. Graphical interfaces for manipulating graphics have won the UI war so effectively that some people have forgotten there ever was a war. Can you imagine using Photoshop without drag and drop? (you can measure this by counting the number of replies to a thread) That's a whole different topic. What is Photoshop manipulating? Layers of pixels. That's an extremely simplified model. There is no dynamic behavior as there is with GUIs. And yet programming those graphical interfaces is an exception. There, with very few exceptions, we still *require* a command interface. Not just a command interface, but an *off-line* command interface, where you batch up all your commands then run them at once, as if we were Neanderthals living in a cave. Not sure if I agree with you here. Let's ask ourselves: what is so different about, say, a complex mathematical function and a complex GUI? In other words: why can you live with a text representation of that function whereas you cannot live with a text representation of a GUI? One difference is the number of interactions you can do with a function and a GUI.
A function takes some numbers whereas a GUI takes some complex text/mouse/finger/voice interactions. So, I've never heard of any complaints when it comes to mathematical functions represented in some source code. But I've heard a lot of complaints regarding GUI design and interaction tests (even when they are done graphically) -- also in WPF. Both text representations are abstract descriptions of the real thing (function and GUI). You need some imagination to get them right, to debug them, to maintain them, to change them. We could blame Python here, but it's due to the problem realm and to the people working there:

Functions -> mathematicians/computer scientists, who work regularly with highly abstract objects
GUI -> designers, who never really got the same education in programming/abstraction as the former group

So (and I know that from the area I am involved with), GUI research (development, evaluation etc.) is not a topic considered closed. No serious computer scientist really knows the "right" way. But, hey, people are working on it at least. Usually, you start out simple. As time flies, you put in more and more features and things become more and more complex (as all non-toy projects will). And so does a GUI. At a certain point, there is no other way than going into the code and doing something nasty by utilizing the Turing-completeness of the underlying language. Generated code always looks creepy and bloated, with a lot of boilerplate. If you really really need to dig deeper, you will have a hard time finding out which of the boilerplate is really needed and what was added by the code generator. In the end, you might even break the "drag-n-drop"ability. :-( That is the reason why traditional CASE tools never really got started, why we still need programmers, why we still have text. From my point of view (highly subjective), starting with general building blocks (text, functions, classes, ...) is better long-term; not starting with a cage (the GUI) and subsequently adding more and more holes that don't fit the original concept. History so far has agreed with this; professional software development always uses text tools for which somebody LATER builds a GUI. I cannot remember it being the other way round. Furthermore, I agree with Chris about the version control problem. Last but not least, GUIs are a place for bikeshedding because almost everybody is able to see them and can start having an opinion about them: Who loves the new Windows modern UI? Either you like it or you hate it. What about the Riemann zeta function? Anybody? Best, Sven

PS: another thought. I recently introduced LaTeX to my girlfriend. LaTeX is quite ugly and it has this "distinct compile/execute step", so initially I hesitated to show it to her. But her MS Word experience got worse and worse the more complex (and especially
Re: Bug in Python?
On 27.02.2016 12:48, Terry Reedy wrote: On 2/27/2016 4:44 AM, Steven D'Aprano wrote: On Sat, 27 Feb 2016 07:55 pm, Terry Reedy wrote: In other words, when that doc says *list*, it means a *list*. "To create a heap, use a list initialized to [], or you can transform a populated list into a heap via function heapify()." [...] "A heap must be an instance of *list* (and not a subclass thereof). To create a heap, start with [] or transform an existing list into a heap via function heapify()." I think that's a sad decision. heapq ought to be able to handle any list subclass, not just actual lists. Preferably it ought to handle duck-typed lists too, anything with a list-like interface. It is okay if the optimized C version only works with actual lists, and falls back to a slower Python version for anything else. I agree that it would increase comprehensibility. It took me a fair amount of time to see why things were not working as intended (and as you see, I wasn't even close to the real explanation). However, heapq might work well the way it is, as long as you have the chance to get your hands on one of the other heap implementations out there. Propose that on the tracker, after checking previous issues. :) Best, Sven
Re: Bug in Python?
On 27.02.2016 00:07, eryk sun wrote: On Fri, Feb 26, 2016 at 4:08 PM, Sven R. Kunze wrote: Python sometimes seems not to hop back and forth between C and Python code. Can somebody explain this? Normally a C extension would call PySequence_SetItem, which would call the type's sq_ass_item, which for MyList is slot_sq_ass_item. The latter function bridges the CPython and Python sides by binding and calling the overridden __setitem__ method. However, the _heapq extension module uses `PyList_SET_ITEM(heap, 0, lastelt)`. This macro expands to `((PyListObject *)(heap))->ob_item[0] = lastelt`. This directly modifies the internal ob_item array of the list, so the overridden __setitem__ method is never called. I presume it was implemented like this with performance in mind, but I don't know whether or not that justifies the loss of generality. I think this is true and it explains the huge performance penalty of the current RemovalHeap and XHeap implementation as it basically uses Python only (results here: http://bit.ly/1KU7CyW). Shoot! I could have seen this earlier. I thought the performance penalty was due to calling __setitem__ and dict operations. But having all heap operations carried out in Python slows things down considerably of course. Let's see if I can manage to create a more efficient mark-and-sweep approach which uses the C module. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
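The mark-and-sweep (lazy deletion) idea can be sketched on top of plain heapq as follows; this is an illustration of the approach, not the actual xheap code, and it assumes unique items:

```python
import heapq

class RemovalHeapSketch:
    """Illustrative lazy-deletion heap on top of heapq (unique items only)."""

    def __init__(self, items=()):
        self._heap = list(items)
        heapq.heapify(self._heap)     # C-accelerated for plain lists
        self._removed = set()         # marked for removal, not yet swept

    def push(self, item):
        heapq.heappush(self._heap, item)

    def remove(self, item):
        self._removed.add(item)       # mark only; O(1)

    def pop(self):
        # sweep marked items lazily, only when they surface at the top
        while True:
            item = heapq.heappop(self._heap)
            if item in self._removed:
                self._removed.discard(item)
            else:
                return item
```

Because the heap stays a plain list and all sifting happens inside heapq's C functions, removal stays cheap without paying the pure-Python `__setitem__` penalty described above.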
Re: Bug in Python?
On 26.02.2016 23:37, Ian Kelly wrote: On Fri, Feb 26, 2016 at 3:08 PM, Sven R. Kunze wrote: Python sometimes seems not to hop back and forth between C and Python code. C code as a rule tends to ignore dunder methods. Those are used to implement Python operations, not C operations. Ah, good to know.

    _siftup(heap, 0)  # that's C

Your comment here appears to be incorrect. [snip] So I would guess that the difference here is because one implementation is entirely C, and the other implementation is entirely Python. You are damn right. While implementing xheap and looking at the C implementation, I just assumed that all Python functions would be replaced by their C counterparts when the C module is imported. Furthermore, the C module exports some internal functions (leading underscore) such as _heappop_max. Thus, I assumed this is true for _siftup as well, without verifying it by looking closely at the PyMethodDef and PyModuleDef structs. Sorry for that. Best, Sven
Bug in Python?
Hi everybody, I recognized the following oddity (background story: http://srkunze.blogspot.com/2016/02/lets-go-down-rabbit-hole.html). Python sometimes seems not to hop back and forth between C and Python code. Can somebody explain this?

    class MyList(list):
        count = 0
        def __setitem__(self, key, value):
            self.count += 1
            super(MyList, self).__setitem__(key, value)

    # using heapq directly
    from heapq import heappop
    ml = MyList(range(10))
    heappop(ml)  # that's C
    print(ml.count)  # print 0

    # using exact copy from heapq
    from heapq import _siftup
    def my_heappop(heap):
        lastelt = heap.pop()
        if heap:
            returnitem = heap[0]
            heap[0] = lastelt
            _siftup(heap, 0)  # that's C
            return returnitem
        return lastelt

    ml = MyList(range(10))
    my_heappop(ml)
    print(ml.count)  # print 6

Best, Sven
Re: How the heck does async/await work in Python 3.5
On 20.02.2016 07:53, Christian Gollwitzer wrote: If you have difficulties with the overall concept, and if you are open to discussions in another language, take a look at this video: https://channel9.msdn.com/Shows/C9-GoingNative/GoingNative-39-await-co-routines MS has added coroutine support with very similar syntax to VC++ recently, and the developer tries to explain it to the "stackful" programmers. Because of this thread, I finally finished an older post collecting valuable insights from last year's discussions regarding the concurrency modules available in Python: http://srkunze.blogspot.com/2016/02/concurrency-in-python.html It appears to me that it would fit here well. @python-ideas: Back then, the old thread ("Concurrency Modules") was basically meant to result in something useful. I hope the post covers the essence of the discussion. Some even suggested putting the table into the Python docs. I am unaware of the formal procedure here, but I would be glad if somebody could point me in the right direction if the survey table is wanted in the docs. Best, Sven
Re: How the heck does async/await work in Python 3.5
On 23.02.2016 18:37, Ian Kelly wrote: It's not entirely clear to me what the C++ is actually doing. With Python we have an explicit event loop that has to be started to manage resuming the coroutines. Since it's explicit, you could easily drop in a different event loop, such as Tornado or curio, if desired. If your coroutine never awaits anything that isn't already done then technically you don't need an event loop, but at that point you might as well be using ordinary functions. I don't think taking the shortcut to ordinary functions will work at scale. I certainly agree that asynchronous operations can (should?) be the very core of everything (files, sockets, timers, etc.), just as Microsoft is pushing with their Windows API. Chaining async ops together now just works with async/await in Python as well. However, at the end of the chain there will always be synchronous code that, e.g., initializes the event loop. Real-world code works the same. Imagine, in some near/distant future, Python might have all its core components (like reading a file, etc.) converted to async. It would be great for many larger applications if one could introduce async gently: lay out an async foundation (the building blocks) while the synchronous wiring still works as it should. Once the team decides they want to leverage the async potential of their code (as the building blocks COULD be executed concurrently), they will then be able to replace the synchronous wires with an event loop. So, I see quite some potential here. The C++ on the other hand seems to be doing something implicit at the compiler level to make everything happen automatically inside the future.get() call, but I don't know what that is. You could wrap up the boilerplate in Python if you like:

def get(coro, loop=None):
    if loop is None:
        loop = asyncio.get_event_loop()
    return loop.run_until_complete(coro)

print(get(tcp_reader(1000)))

As usual.
:) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: How the heck does async/await work in Python 3.5
On 23.02.2016 01:48, Ian Kelly wrote: On Mon, Feb 22, 2016 at 3:16 PM, Sven R. Kunze wrote: Is something like shown in 12:50 ( cout << tcp_reader(1000).get() ) possible with asyncio? (tcp_reader would be async def)

loop = asyncio.get_event_loop()
print(loop.run_until_complete(tcp_reader(1000)))

I see. Thanks. :) How come Python (compared to C++) needs so much more boilerplate for async programming? Historically, it was the other way around. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
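To make the pattern above concrete and self-contained, here is a minimal sketch; tcp_reader is a stand-in coroutine invented for illustration (the real one from the video talks to a socket):

```python
import asyncio

async def tcp_reader(n):
    # stand-in for real socket I/O
    await asyncio.sleep(0)
    return n * 2

# the explicit-event-loop boilerplate under discussion
loop = asyncio.new_event_loop()
try:
    print(loop.run_until_complete(tcp_reader(1000)))  # 2000
finally:
    loop.close()
```

The two loop lines (plus cleanup) are exactly the synchronous "wiring" that C++'s future.get() hides inside a single call.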
Re: How the heck does async/await work in Python 3.5
On 20.02.2016 07:53, Christian Gollwitzer wrote: If you have difficulties with the overall concept, and if you are open to discussions in another language, take a look at this video: https://channel9.msdn.com/Shows/C9-GoingNative/GoingNative-39-await-co-routines MS has added coroutine support with very similar syntax to VC++ recently, and the developer tries to explain it to the "stackful" programmers. Thanks, Christian. Very informative video. Is something like shown in 12:50 ( cout << tcp_reader(1000).get() ) possible with asyncio? (tcp_reader would be async def) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
benchmarking in general and using xheap
Hi everybody, I've finally had the time to do the benchmarks and here you go: http://srkunze.blogspot.com/2016/02/the-xheap-benchmark.html The benchmark compares heapq, Heap, OrderHeap, RemovalHeap and XHeap regarding the operations heapify, push and pop. As expected, wrapping results in some overhead. Most of the overhead consists of wrapper, super and descriptor calls. As with the current optimization efforts, I expect this to be reduced even further. But even using current CPython 2.7 or 3.5, the overhead for simple heaps, heaps with custom orders or heaps with removal can be considered affordable given the maintenance benefits. @srinivas The current removal implementation uses an index-tracking approach with quite some overhead for other operations. I am not sure if that is remediable with a mark-and-sweep approach, but given the time I will definitely look into it for another benchmark post, now that I have built the infrastructure for it. @all benchmark friends Not sure how you do your benchmarks, but I am somewhat dissatisfied with the current approach. I started out using unittests as they integrated nicely with my current toolchain and I could write actual code. Then I threw everything away and used timeit as suggested here https://mail.python.org/pipermail/python-list/2016-January/702571.html and grew my own set of tools around it to produce readable results; writing code as strings. :-/ And from what I know, there is no official infrastructure (tools, classes, ... such as there is for unittests) around timeit to encapsulate benchmarks, choose a baseline, calculate ratios, etc. (and write code instead of strings). Does somebody have an idea here? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
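On the "code instead of strings" complaint: timeit.Timer also accepts a zero-argument callable, which avoids writing code as strings altogether. A minimal sketch of a harness with a baseline and ratios (the bench helper and its labels are made up for illustration):

```python
import timeit

def bench(label, func, baseline=None, repeat=5, number=1000):
    # timeit.Timer accepts a callable, not just a source string
    best = min(timeit.Timer(func).repeat(repeat, number))
    ratio = '' if baseline is None else ' ({:.2f}x)'.format(best / baseline)
    print('{:<10} {:.6f}s{}'.format(label, best, ratio))
    return best

base = bench('sorted', lambda: sorted(range(100)))
bench('min', lambda: min(range(100)), baseline=base)
```

The closure approach costs one extra function call per measured loop, so it slightly inflates very small timings, but it keeps benchmarks as real, testable code.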
Re: Guido on python3 for beginners
On 18.02.2016 07:59, Paul Rubin wrote: Steven D'Aprano writes: I suppose that it is objectively correct that it is harder to learn than Python 2. But I don't think the learning curve is any steeper. If anything, the learning curve is ever-so-slightly less steep. I think py3 has more learning curve because it uses iterators in places where py2 uses lists. That's a significant new concept and it can be bug-prone even for programmers who are experienced with it. That is indeed very true. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Multiple Assignment a = b = c
On 16.02.2016 14:05, Sven R. Kunze wrote: Hi Srinivas, I think the tuple assignment you showed basically nails it. First, the rhs is evaluated. Second, the lhs is evaluated from left to right. Completely wrong? Best, Sven As for the swapping you mentioned: the following two statements do the same thing (as you suggested at the beginning).

a, b = b, a = 4, 5
(a, b), (b, a) = (4, 5), (4, 5)

Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
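A quick interactive check of that equivalence; the right-hand side is evaluated once, then the target lists are assigned left to right, so the second assignment wins:

```python
# targets: first (a, b) = (4, 5), then (b, a) = (4, 5)
a, b = b, a = 4, 5
print(a, b)  # 5 4
```

Note that b does not need to exist beforehand: target lists are assignment targets, not evaluated expressions.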
Re: Multiple Assignment a = b = c
Hi Srinivas, On 16.02.2016 13:46, srinivas devaki wrote: Hi, as a = b = c as an assignment doesn't return anything, i ruled out a = b = c being a chained assignment like a = (b = c). So i thought a = b = c is resolved as a, b = [c, c]. At least I had fixed in my mind that every assignment-like operation in python is done with references and then the references are bound to the named variables, like globals()['a'] = result(). But today i learned that this is not the case, with great pain (7 hours of debugging):

class Mytest(object):
    def __init__(self, a):
        self.a = a

    def __getitem__(self, k):
        print('__getitem__', k)
        return self.a[k]

    def __setitem__(self, k, v):
        print('__setitem__', k, v)
        self.a[k] = v

roots = Mytest([0, 1, 2, 3, 4, 5, 6, 7, 8])
a = 4
roots[4] = 6
a = roots[a] = roots[roots[a]]

the above program's output is

__setitem__ 4 6
__getitem__ 4
__getitem__ 6
__setitem__ 6 6

But the output that i expected is

__setitem__ 4 6
__getitem__ 4
__getitem__ 6
__setitem__ 4 6

So isn't it counter-intuitive compared to other python operations, like how we teach how python performs a swap? I just want to get a better idea around this. I think the tuple assignment you showed basically nails it. First, the rhs is evaluated. Second, the lhs is evaluated from left to right. Completely wrong? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Heap Implementation
Hi Cem, On 08.02.2016 02:37, Cem Karan wrote: My apologies for not writing sooner, but work has been quite busy lately (and likely will be for some time to come). no problem here. :) I read your approach, and it looks pretty good, but there may be one issue with it; how do you handle the same item being pushed into the heap more than once? In my simple simulator, I'll push the same object into my event queue multiple times in a row. The priority is the moment in the future when the object will be called. As a result, items don't have unique priorities. I know that there are methods of handling this from the client-side (tuples with unique counters come to mind), but if your library can handle it directly, then that could be useful to others as well. I pondered that in the early design phase and considered it a slowdown for my use-case without benefit. Why? Because I always push a fresh object, ALTHOUGH it might compare equal on its attributes (priority, deadline, etc.). That's the reason why I need to ask again: why push the same item onto a heap? Are we talking about function objects? If so, then your concern is valid. Would you accept a solution that involves wrapping the function in another object carrying the priority? Would you prefer a wrapper that's defined by xheap itself so you can just use it? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
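For the record, a hedged sketch of such a wrapper (my own illustration, not xheap's API). The per-process counter breaks ties, so the payloads themselves are never compared and may be pushed any number of times:

```python
import heapq
import itertools

_counter = itertools.count()

class Entry:
    """Wraps an arbitrary payload with a priority; the counter breaks
    ties so equal priorities (or identical payloads) stay orderable."""
    __slots__ = ('priority', 'count', 'item')

    def __init__(self, priority, item):
        self.priority = priority
        self.count = next(_counter)
        self.item = item

    def __lt__(self, other):
        # payloads are never compared, so they need no ordering at all
        return (self.priority, self.count) < (other.priority, other.count)

heap = []
heapq.heappush(heap, Entry(2, 'b'))
heapq.heappush(heap, Entry(1, 'a'))
heapq.heappush(heap, Entry(1, 'a'))  # the same item twice is fine
print([heapq.heappop(heap).item for _ in range(3)])  # ['a', 'a', 'b']
```

Equal priorities pop in insertion order, which is usually the behavior a scheduler wants anyway.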
Re: Asyncio thought experiment
On 08.02.2016 23:13, Marko Rauhamaa wrote: As I stated in an earlier post, a normal subroutine may turn out to be blocking. To make it well-behaved under asyncio, you then dutifully tag the subroutine with "async" and adorn the blocking statement with "await". Consequently, you put "await" in front of all calls to the subroutine and cascade the "async"s and "await"s all the way to the top level. Now what would prevent you from making *every* function an "async" and "await"ing *every* function call? Then, you would never fall victim to the cascading async/await. And if you did that, why bother sprinkling async's and await's everywhere? Why not make every single function call an await implicitly and every single subroutine an async? In fact, that's how everything works in multithreading: blocking statements don't need to be ornamented in any manner. So? :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: _siftup and _siftdown implementation
again for the list:

###
import random
from xheap import RemovalHeap

class X(object):
    c = 0

    def __init__(self, x):
        self.x = x

    def __lt__(self, other):
        X.c += 1
        return self.x < other.x

n = 10
for jj in range(5):
    items = [X(i) for i in range(n)]
    random.shuffle(items)
    heap = RemovalHeap(items)
    random.shuffle(items)
    for i in items:
        heap.remove(i)
    print(X.c)
    X.c = 0

(note to myself: never copy PyCharm formatting strings to this list). On 05.02.2016 17:27, Sven R. Kunze wrote: Hi srinivas, I wrote this simple benchmark to measure comparisons: import random from xheapimport RemovalHeap class X(object): c =0 def __init__(self, x): self.x = x def __lt__(self, other): X.c +=1 return self.x < other.x n =10 for jjin range(5): items = [X(i)for iin range(n)] random.shuffle(items) heap = RemovalHeap(items) random.shuffle(items) for i in items: heap.remove(i) print(X.c) X.c =0 old version: 430457 430810 430657 429971 430583 your pull request version: 426414 426045 425437 425528 425522 Can we do better here? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: _siftup and _siftdown implementation
Hi srinivas, I wrote this simple benchmark to measure comparisons:

import random
from xheap import RemovalHeap

class X(object):
    c = 0

    def __init__(self, x):
        self.x = x

    def __lt__(self, other):
        X.c += 1
        return self.x < other.x

n = 10
for jj in range(5):
    items = [X(i) for i in range(n)]
    random.shuffle(items)
    heap = RemovalHeap(items)
    random.shuffle(items)
    for i in items:
        heap.remove(i)
    print(X.c)
    X.c = 0

old version: 430457 430810 430657 429971 430583 your pull request version: 426414 426045 425437 425528 425522 Can we do better here? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: _siftup and _siftdown implementation
On 05.02.2016 15:48, Bernardo Sulzbach wrote: On 02/05/2016 12:42 PM, Sven R. Kunze wrote: PS: I do competitive programming, I use these modules every couple of days when compared to other modules. so didn't give much thought when posting to the mailing list. sorry for that. Competitive programming? That sounds interesting. :) I wonder why you *can* use this amount of already done stuff in competitive programming. When I was into that you could use what the standard library of the language gave you and nothing else. AFAICT, heapq is part of the standard lib. :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: _siftup and _siftdown implementation
On 05.02.2016 02:26, srinivas devaki wrote: as I come to think of it again, it is not a subheap, it is actually a heap cut at some level. hope you get the idea from the usage of _siftup. so even though the `pos` children are valid, _siftup brings down the new element (i.e. the element which is at first at `pos`) to its leaf level and then again it is brought up by using _siftdown. why do the redundant work when it can simply break out? The heapq module itself has very extensive documentation inside. This is what it says for _siftup. I think this is sort of an optimization that works pretty well (cf. the numbers) for popping off the FIRST item:

"""
# The child indices of heap index pos are already heaps, and we want to make
# a heap at index pos too. We do this by bubbling the smaller child of
# pos up (and so on with that child's children, etc) until hitting a leaf,
# then using _siftdown to move the oddball originally at index pos into place.
#
# We *could* break out of the loop as soon as we find a pos where newitem <=
# both its children, but turns out that's not a good idea, and despite that
# many books write the algorithm that way. During a heap pop, the last array
# element is sifted in, and that tends to be large, so that comparing it
# against values starting from the root usually doesn't pay (= usually doesn't
# get us out of the loop early). See Knuth, Volume 3, where this is
# explained and quantified in an exercise.
#
# Cutting the # of comparisons is important, since these routines have no
# way to extract "the priority" from an array element, so that intelligence
# is likely to be hiding in custom comparison methods, or in array elements
# storing (priority, record) tuples. Comparisons are thus potentially
# expensive.
#
# On random arrays of length 1000, making this change cut the number of
# comparisons made by heapify() a little, and those made by exhaustive
# heappop() a lot, in accord with theory. Here are typical results from 3
# runs (3 just to demonstrate how small the variance is):
#
#    Compares needed by heapify     Compares needed by 1000 heappops
#    --------------------------     --------------------------------
#    1837 cut to 1663               14996 cut to 8680
#    1855 cut to 1659               14966 cut to 8678
#    1847 cut to 1660               15024 cut to 8703
#
# Building the heap by using heappush() 1000 times instead required
# 2198, 2148, and 2219 compares: heapify() is more efficient, when
# you can use it.
#
# The total compares needed by list.sort() on the same lists were 8627,
# 8627, and 8632 (this should be compared to the sum of heapify() and
# heappop() compares): list.sort() is (unsurprisingly!) more efficient
# for sorting.
"""

What do you think about our use-case? _siftup and _siftdown are functions from the python standard heapq module. PS: I do competitive programming; I use these modules every couple of days compared to other modules, so i didn't give much thought when posting to the mailing list. sorry for that. Competitive programming? That sounds interesting. :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: _siftup and _siftdown implementation
On 05.02.2016 01:12, Steven D'Aprano wrote: On Fri, 5 Feb 2016 07:50 am, srinivas devaki wrote: _siftdown function breaks out of the loop when the current pos has a valid parent. but _siftup function is not implemented in that fashion, if a valid subheap is given to the _siftup, it will bring down the root of sub heap and then again bring it up to its original place. I was wondering why it is so, is it just to make the code look simple??? Hi Srinivas, I'm sure that your question is obvious to you, but it's not obvious to us. Where are _siftup and _siftdown defined? Are they in your code? Somebody else's code? A library? Which library? What do they do? Where are they from? The question originated here: https://github.com/srkunze/xheap/pull/1#discussion_r51770210 (btw, Steven, your email client somehow breaks my threading view in thunderbird. This reply appeared unconnected to Srinivas' post.) -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 04.02.2016 19:35, Random832 wrote: On Thu, Feb 4, 2016, at 11:18, Sven R. Kunze wrote: On 04.02.2016 00:47, Random832 wrote: On Wed, Feb 3, 2016, at 16:43, Sven R. Kunze wrote: Actually a nice idea if there were no overhead of creating methods for all heap instances separately. I'll keep that in mind. :) What about changing the class of the object to one which is inherited from its original class and has the method you want? What about reaching into the class and changing the method in the first place? Either may not be appropriate, of course, depending on your use case. There is no base class. I meant something like...

class C:
    replace = heapreplace

cache = {}
...
if not isinstance(x, C):
    T = type(x)
    cls = cache.get(T)
    if cls is None:
        cls = cache[T] = type('C_' + T.__name__, (C, T), {})
    x.__class__ = cls

(Of course, by dynamically reassigning __class__ and using the type constructor, this checks two of the three "crazy type system voodoo" boxes. I have no idea if it will work, or if I've made a mistake, or if you'll be able to understand it in six months.) I think I agree with you that this might be a maintenance nightmare. ;) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 04.02.2016 00:47, Random832 wrote: On Wed, Feb 3, 2016, at 16:43, Sven R. Kunze wrote: Actually a nice idea if there were no overhead of creating methods for all heap instances separately. I'll keep that in mind. :) What about changing the class of the object to one which is inherited from its original class and has the method you want? What about reaching into the class and changing the method in the first place? Either may not be appropriate, of course, depending on your use case. There is no base class. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 22:34, Bernardo Sulzbach wrote: Did Peter's suggestion work? Somewhat for a single Heap class. However, it breaks inheritance. -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 22:15, Peter Otten wrote: The technical reason is that functions written in C don't implement the descriptor protocol. The bound method is created by invoking the __get__ method of the class attribute: Good to know. :-/ It's sad. These functions just look so method-like. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
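A small illustration of that difference (plain is a throwaway function made up for this example). Pure-Python functions implement __get__ and therefore bind as methods when looked up on an instance; C builtins do not:

```python
def plain(heap):
    # an ordinary Python function is a descriptor: as a class
    # attribute it turns into a bound method on attribute lookup
    return heap[0]

print(hasattr(plain, '__get__'))  # True
print(hasattr(len, '__get__'))    # False: C builtins don't bind

class H(list):
    peek = plain  # works: h.peek() passes h as `heap`

print(H([1, 2, 3]).peek())  # 1
```

With a C function like heapq's accelerated heapreplace in place of plain, the lookup would return the raw function unchanged, which is exactly why self never gets passed.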
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 22:19, Peter Otten wrote: You could try putting self.heappush = functools.partial(heapq.heappush, self) into the initializer. Actually a nice idea if there were no overhead of creating methods for all heap instances separately. I'll keep that in mind. :) -- https://mail.python.org/mailman/listinfo/python-list
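A sketch of that per-instance binding (the Heap class here is a hypothetical minimal example, not xheap's code). The cost is one partial object per method per instance, but calls go straight into the C function with no Python wrapper frame:

```python
import functools
import heapq

class Heap(list):
    def __init__(self, iterable=()):
        super().__init__(iterable)
        heapq.heapify(self)
        # bind the heapq functions to this instance once, up front
        self.push = functools.partial(heapq.heappush, self)
        self.pop_min = functools.partial(heapq.heappop, self)

h = Heap([3, 1, 2])
h.push(0)
print(h.pop_min())  # 0
print(h.pop_min())  # 1
```

Instance attributes shadow class attributes, so the partials are found before any class-level method would be.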
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 22:14, Bernardo Sulzbach wrote: Thanks for quoting, for some reason my client always replies to the person and not the list (on this list only). I did what I could. I could show you a lambda function there, but it doesn't solve anything. If there is a way to avoid a wrapper, I don't know. I appreciate every single reply. :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 22:06, Bernardo Sulzbach wrote: I may say something wrong, but this is what I see going on: When you get "replace = heapreplace" you are creating a data attribute called replace (you will access it by self.replace or variable.replace) that is an alias for heapreplace. When you call x.replace(2) you are calling heapreplace(2), NOT heapreplace(self, 2). It is exactly as you've described it. The question now is: how can I circumvent/shortcut that? There are several proposals out there on the Web, but none of them works. :-/ Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Efficient Wrappers for Instance Methods
On 03.02.2016 21:40, Bernardo Sulzbach wrote: I am not entirely sure about what your question is. Are you talking about the "heapreplace expected 2 arguments, got 1" you get if you set replace = heapreplace? Yes, I think so. I might ask differently: why do I need to write a wrapper method when the function I am wrapping already has the same signature? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Efficient Wrappers for Instance Methods
Hi, as you might have noticed, I am working on https://github.com/srkunze/xheap right now. In order to make it even faster and closer to heapq's baseline performance, I wonder if there is a possibility of creating fast wrappers for functions. Please compare https://github.com/srkunze/xheap/blob/ca56ac55269ce0bc7b61d28ba9ceb41e9075476a/xheap.py#L29 https://github.com/srkunze/xheap/blob/ca56ac55269ce0bc7b61d28ba9ceb41e9075476a/xheap.py#L32 with https://github.com/srkunze/xheap/blob/ca56ac55269ce0bc7b61d28ba9ceb41e9075476a/xheap.py#L44 Why is it not possible to create a method from a function like I aliased replace with poppush? If I am not completely mistaken, it saves 1 stack frame, right? Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: Heap Implementation
On 02.02.2016 01:48, srinivas devaki wrote: On Feb 1, 2016 10:54 PM, "Sven R. Kunze" <mailto:srku...@mail.de>> wrote: > > Maybe I didn't express myself well. Would you prefer the sweeping approach in terms of efficiency over how I implemented xheap currently? > complexity-wise your approach is the best one of all that I have seen till now > Without running some benchmarks, I have absolutely no feeling which approach is faster/more memory efficient etc. > this is obviously memory efficient but I don't know whether this approach would be faster than previous approaches, with previous approaches there is no call back into Python code from C code for comparison. but this should be faster than HeapDict as HeapDict is directly using its own methods for heappush, heappop etc Yes. So, it remains to be seen once I've implemented the sweeping approach and compared the two. PS: if you have time, could you please review my pull request. Indeed, I am already thinking about it. :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
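For reference, a minimal sketch of the mark-and-sweep idea under discussion (my naming and thresholds, not xheap's; it assumes items are unique and hashable). Removal only marks an item; dead entries are skipped on pop and compacted away once they outnumber the live ones:

```python
import heapq

class SweepHeap:
    """Min-heap with lazy removal: remove() marks items, and the heap
    is compacted ('swept') only when over half of its entries are dead."""
    def __init__(self, items=()):
        self._heap = list(items)
        heapq.heapify(self._heap)
        self._removed = set()

    def push(self, item):
        heapq.heappush(self._heap, item)

    def remove(self, item):
        self._removed.add(item)          # O(1); real deletion is deferred
        if len(self._removed) * 2 > len(self._heap):
            self._sweep()

    def pop(self):
        while True:
            item = heapq.heappop(self._heap)
            if item in self._removed:
                self._removed.discard(item)   # dead entry, skip it
            else:
                return item

    def _sweep(self):
        # O(n) compaction, amortized over many removals
        self._heap = [x for x in self._heap if x not in self._removed]
        self._removed.clear()
        heapq.heapify(self._heap)

h = SweepHeap([5, 3, 8, 1])
h.remove(3)
print(h.pop(), h.pop())  # 1 5
```

Compared with index tracking, this keeps push and pop at plain heapq speed and pays for removals lazily, at the price of extra memory for dead entries between sweeps.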
Re: Heap Implementation
On 31.01.2016 02:48, Steven D'Aprano wrote: On Sunday 31 January 2016 09:47, Sven R. Kunze wrote: @all What's the best/standardized tool in Python to perform benchmarking? timeit Thanks, Steven. Maybe I am doing it wrong, but I get some weird results:

>>> min(timeit.Timer('for _ in range(1): heappop(h)', 'from heapq import heappop; h=list(range(1000))').repeat(10, 1)), min(timeit.Timer('for _ in range(1): h.pop()', 'from xheap import Heap; h=Heap(range(1000))').repeat(10, 1))
(0.01726761805314, 0.01615345600021101)
>>> min(timeit.Timer('for _ in range(10): heappop(h)', 'from heapq import heappop; h=list(range(1000))').repeat(10, 1)), min(timeit.Timer('for _ in range(10): h.pop()', 'from xheap import Heap; h=Heap(range(1000))').repeat(10, 1))
(0.12321608699949138, 0.1304205129002)
>>> min(timeit.Timer('for _ in range(1): heappop(h)', 'from heapq import heappop; h=list(range(100))').repeat(10, 1)), min(timeit.Timer('for _ in range(1): h.pop()', 'from xheap import Heap; h=Heap(range(100))').repeat(10, 1))
(0.010081621999233903, 0.008791901999757101)
>>> min(timeit.Timer('for _ in range(100): heappop(h)', 'from heapq import heappop; h=list(range(1000))').repeat(10, 1)), min(timeit.Timer('for _ in range(100): h.pop()', 'from xheap import Heap; h=Heap(range(1000))').repeat(10, 1))
(0.6218949679996513, 0.7172151949998806)

How can it be that my wrapper is sometimes faster and sometimes slower than heapq? I wouldn't mind slower, but faster*? Best, Sven * that behavior is reproducible also for other combos and other machines. -- https://mail.python.org/mailman/listinfo/python-list
Re: Heap Implementation
On 31.01.2016 05:59, srinivas devaki wrote: @Sven actually you are not sweeping at all, as i remember from my last post what i meant by sweeping is periodically deleting the elements which were marked as popped items. Exactly. Maybe I didn't express myself well. Would you prefer the sweeping approach in terms of efficiency over how I implemented xheap currently? Without running some benchmarks, I have absolutely no feeling which approach is faster/more memory efficient etc. kudos on that __setitem__ technique, instead of using references to the items like in HeapDict, it is brilliant of you to simply use __setitem__ Thanks. :) On Sun, Jan 31, 2016 at 4:17 AM, Sven R. Kunze wrote: Hi again, as the topic of the old thread actually was fully discussed, I dare to open a new one. I finally managed to finish my heap implementation. You can find it at https://pypi.python.org/pypi/xheap + https://github.com/srkunze/xheap. I described my motivations and design decisions at http://srkunze.blogspot.com/2016/01/fast-object-oriented-heap-implementation.html @Cem You've been worried about a C implementation. I can assure you that I did not intend to rewrite the incredibly fast and well-tested heapq implementation. I just re-used it. I would really be grateful for your feedback as you have first-hand experience with heaps. @srinivas You might want to have a look at the removal implementation. Do you think it would be wiser/faster to switch for the sweeping approach? I plan to publish some benchmarks to compare heapq and xheap. @all What's the best/standardized tool in Python to perform benchmarking? Right now, I use a self-made combo of unittest.TestCase and time.time + proper formatting. Best, Sven PS: fixing some weird typos and added missing part. -- https://mail.python.org/mailman/listinfo/python-list
Heap Implementation
Hi again, as the topic of the old thread actually was fully discussed, I dare to open a new one. I finally managed to finish my heap implementation. You can find it at https://pypi.python.org/pypi/xheap + https://github.com/srkunze/xheap. I described my motivations and design decisions at http://srkunze.blogspot.com/2016/01/fast-object-oriented-heap-implementation.html @Cem You've been worried about a C implementation. I can assure you that I did not intend to rewrite the incredibly fast and well-tested heapq implementation. I just re-used it. I would really be grateful for your feedback as you have first-hand experience with heaps. @srinivas You might want to have a look at the removal implementation. Do you think it would be wiser/faster to switch to the sweeping approach? I plan to publish some benchmarks to compare heapq and xheap. @all What's the best/standardized tool in Python to perform benchmarking? Right now, I use a self-made combo of unittest.TestCase and time.time + proper formatting. Best, Sven PS: fixing some weird typos and added missing part. -- https://mail.python.org/mailman/listinfo/python-list
Re: psss...I want to move from Perl to Python
On 29.01.2016 23:49, Ben Finney wrote: "Sven R. Kunze" writes: On 29.01.2016 01:01, Fillmore wrote: How was the Python 2.7 vs Python 3.X solved? which version should I go for? Python 3 is the new and better one. More importantly: Python 2 will never improve; Python 3 is the only one that is actively developed. Exactly. The following story also confirms that: always use up-to-date software (not only for security reasons). We recently upgraded from Django 1.3 to 1.4 to 1.5 to 1.6 to 1.7 and now to 1.8. It was amazing how much code (and workarounds) we could remove by simply using standard Django tools (small things but used hundreds of times). Thus: Python 3. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: psss...I want to move from Perl to Python
Hi, On 29.01.2016 01:01, Fillmore wrote: I look at Python and it looks so much cleaner; add to that that it is the language of choice of data miners... add to that that iNotebook looks powerful. All true. :) Does Python have Regexps? "import re" https://docs.python.org/3.5/library/re.html How was the Python 2.7 vs Python 3.X question solved? which version should I go for? Python 3 is the new and better one. (For one, we already use all __future__ imports to get as close as possible to Python 3 in our production code.) Do you think that switching to Python from Perl is a good idea at 45? http://srkunze.blogspot.de/2016/01/next-programming-language.html Where do I get started moving from Perl to Python? some keywords:

- basic mathematical calculations
- dicts and lists
- classes
- closures/mixins

and other tutorials. :) which gotchas need I be aware of? Don't worry. Just try it out. :) Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
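Since you asked about regexps: a quick taste of how Perl-ish matching looks with the re module (the pattern and sample string are made up for illustration):

```python
import re

# groups work like Perl's $1 and $2
m = re.search(r'(\w+)@(\w+)\.de', 'contact: srkunze@mail.de')
if m:
    print(m.group(1), m.group(2))  # srkunze mail
```

The main gotcha coming from Perl is that matching is explicit: you call re.search/re.match and inspect the returned match object instead of using =~ and implicit variables.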
Re: issues
Hi Gert, just upgrade to 5.03. Best, Sven On 13.01.2016 18:38, Gert Förster wrote: Ladies, Gentlemen, using the PyCharm Community Edition 4.5.4 with Python-3-5-1-amd64.exe, there is constantly a "Repair" prompt. This is "successful" when executed. Without execution, it results in an "Error code 1602" error. Please help me? Sincerely Yours, Gert Förster -- https://mail.python.org/mailman/listinfo/python-list
Re: How to remove item from heap efficiently?
On 13.01.2016 12:20, Cem Karan wrote: On Jan 12, 2016, at 11:18 AM, "Sven R. Kunze" wrote: Thanks for replying here. I've come across these types of wrappers/re-implementations of heapq as well when researching this issue. :) Unfortunately, they don't solve the underlying issue at hand, which is: "remove item from heap with unknown index" and be efficient at it (by not using the _heapq C implementation). So, I thought I'd write another wrapper. ;) It at least uses _heapq (if available, otherwise heapq) and lets you remove items without violating the invariant in O(log n). I am going to make it open-source on pypi and see what people think of it. Is that so? I'll be honest, I never tested its asymptotic performance, I just assumed that he had a dict coupled with a heap somehow, but I never looked into the code. My concern about that specific package is the missing C implementation. I feel that somewhat defeats the whole purpose of using a heap: performance. Asymptotic performance is still O(log n). That said, IMHO using a dict interface is the way to go for priority queues; it really simplified my code using it! This is my not-so-subtle way of asking you to adopt the MutableMapping interface for your wrapper ;) Could you elaborate on this? What simplified your code so much? I have been using heaps for priority queues as well but haven't missed the dict interface so far. Maybe my use-case is different. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: How to remove item from heap efficiently?
On 12.01.2016 03:48, Cem Karan wrote: Jumping in late, but... If you want something that 'just works', you can use HeapDict: http://stutzbachenterprises.com/ I've used it in the past, and it works quite well. I haven't tested its asymptotic performance though, so you might want to check into that. Thanks for replying here. I've come across these types of wrappers/re-implementations of heapq as well when researching this issue. :) Unfortunately, they don't solve the underlying issue at hand, which is: "remove item from heap with unknown index" and be efficient at it (by not using the _heapq C implementation). So, I thought I'd write another wrapper. ;) It at least uses _heapq (if available, otherwise heapq) and lets you remove items without violating the invariant in O(log n). I am going to make it open-source on pypi and see what people think of it. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
Re: How to remove item from heap efficiently?
On 09.01.2016 19:32, Paul Rubin wrote: "Sven R. Kunze" writes: Basically a task scheduler where tasks can be thrown away once they have been in the queue for too long. I don't think there's a real nice way to do this with heapq. The computer-sciencey way would involve separate balanced tree structures for the two sorting keys (think of a database table with indexes on two different columns). Others suggested using an additional dict in which to store the indexes of the items. Then, when removing an item, I just need to query the dict. Best, Sven -- https://mail.python.org/mailman/listinfo/python-list
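That index-dict idea, sketched out (my own simplified illustration, not the eventual xheap code; it assumes unique, hashable items). The dict maps each item to its heap position, every swap keeps the dict in sync, and removal swaps the victim with the last element and re-sifts:

```python
import heapq

class IndexedHeap:
    """Min-heap with an item->position dict for O(log n) removal."""
    def __init__(self, items=()):
        self._heap = list(items)
        heapq.heapify(self._heap)
        self._pos = {item: i for i, item in enumerate(self._heap)}

    def push(self, item):
        self._heap.append(item)
        self._pos[item] = len(self._heap) - 1
        self._sift_toward_root(len(self._heap) - 1)

    def pop(self):
        return self.remove(self._heap[0])

    def remove(self, item):
        i = self._pos.pop(item)          # O(1) position lookup, no scan
        last = self._heap.pop()
        if i < len(self._heap):          # item wasn't the last element
            self._heap[i] = last
            self._pos[last] = i
            self._sift_toward_root(i)    # restore the invariant both ways
            self._sift_toward_leaves(i)
        return item

    def _sift_toward_root(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self._heap[i] < self._heap[parent]:
                self._swap(i, parent)
                i = parent
            else:
                break

    def _sift_toward_leaves(self, i):
        n = len(self._heap)
        while True:
            child = 2 * i + 1
            if child >= n:
                break
            if child + 1 < n and self._heap[child + 1] < self._heap[child]:
                child += 1               # pick the smaller child
            if self._heap[child] < self._heap[i]:
                self._swap(i, child)
                i = child
            else:
                break

    def _swap(self, i, j):
        self._heap[i], self._heap[j] = self._heap[j], self._heap[i]
        self._pos[self._heap[i]] = i     # keep the index dict in sync
        self._pos[self._heap[j]] = j

h = IndexedHeap([5, 3, 8, 1])
h.remove(8)                              # O(log n), no linear search
print(h.pop(), h.pop(), h.pop())         # 1 3 5
```

For the task-scheduler use-case, expired tasks can be dropped this way without ever scanning the queue; the trade-off is that every push and pop now does dict bookkeeping in Python.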