Re: [matplotlib-devel] purpose of 'stepfilled' in hist()?

Erik Tollerud Fri, 27 Aug 2010 14:05:44 -0700

I also completely agree that, ideally, hist/bar/box would get a full
re-write...  A re-write of the hist side of things would be very
useful for me, and something that I might end up doing myself, if I
can find the time.  I haven't really done anything with the bar or box
plots, though, so I'm not sure what needs to be changed there.


But presumably we're stuck with backwards-compatibility for the
existing function... and as it stands now, it's clearly bugged, so I
still think the patch should be applied in the interm.

To address the True/1.0 issue, though: The reason I did it here is not
because True is really a special case - this was just a performance
tweak... The whole code snippet is as follows:

if log is True:
    minimum = 1.0
elif log:
    minimum = float(log)
else:
    minimum = 0.0

You'll note that if I change the "elif log:" to "if log" and remove
the first case completely, the behavior is unchanged, because
float(True) = 1.0 ... that was the reason for the choice of 1 as the
default.  The only reason I'm special-casing "log is True" is because
the default is False, so all other things being equal, the most likely
thing someone would assume who didn't read the docs closely would
assume is that it should be "True" - so it's only a tiny performance
boost for those cases.  Honestly, it's not a big deal - I did it out
of habit due to other projects where you get significant performance
gains by special-casing the default argument.



>>    I realize API changes are a pain, but this seems error-prone from a
>>    user's point of view. If they accidentally use 1 instead of "True" -
>>    common among C or old Python users - suddenly the function does
>>    something startling. (There's also an ambiguity between zero and
>>    False, but that's probably not so serious here.) If I were designing
>>    an API from scratch I'd probably go with a separate parameter for the
>>    minimum (or not, if ylim can fix it after the fact) and a dedicated
>>    one for "should we use a log scale". Failing that, maybe the string
>>    "auto" to indicate automatic minimum values and None for a default?

On Wed, Aug 25, 2010 at 10:06 AM, Eric Firing <[email protected]> wrote:
> On 08/25/2010 04:50 AM, Benjamin Root wrote:
>>
>> On Wed, Aug 25, 2010 at 12:00 AM, Anne Archibald
>> <[email protected] <mailto:[email protected]>> wrote:
>>
>>    On 24 August 2010 22:22, Benjamin Root <[email protected]
>>    <mailto:[email protected]>> wrote:
>>     > On Tue, Aug 24, 2010 at 9:01 PM, Anne Archibald
>>    <[email protected] <mailto:[email protected]>>
>>     > wrote:
>>     >>
>>     >> On 24 August 2010 19:16, Erik Tollerud <[email protected]
>>    <mailto:[email protected]>> wrote:
>>     >> > Whoops, yes, that should be True... Also realized a slight
>>    error in
>>     >> > the description of how the mimum is set - both of those are
>>    fixed in
>>     >> > the attached diff.
>>     >>
>>     >> Um, this is a kind of important point of style: it is much better
>> to
>>     >> use "if foo:" than "if foo is True:" or even "if foo == True:".
>>     >> Long-standing python convention allows things like 1, 7.0, numpy
>>     >> booleans that are true, and nonempty lists to have a value of True.
>>     >> Using "if foo:", this works. Using "if foo is True:", this cannot
>>     >> possibly work; even though 1==True, it is not true that 1 is True.
>>     >> "is" has a very specific meaning that should be used only when
>>     >> appropriate (generally, for None or for mutable objects).
>>
>>    [snip]
>>
>>     > While it probably could be better done, the logic of the entire
>>    if statement
>>     > is to first check to see if someone explicitly set a True value
>>    (default is
>>     > False), and that sets the minimum to 1.0.  Then, if it isn't
>>    explicitly
>>     > True, then it checks to see if it is a numerical value and uses
>>    that value
>>     > to indicate the minimum.  Only if it is None or False does it
>>    then go to the
>>     > last branch which would set minimum to zero.
>>     >
>>     > Maybe it should use a cbook function that test for a numerical value
>>     > explicitly instead and do that first, then have a check for the
>>    Truthiness
>>     > of log?
>>
>>    I realize API changes are a pain, but this seems error-prone from a
>>    user's point of view. If they accidentally use 1 instead of "True" -
>>    common among C or old Python users - suddenly the function does
>>    something startling. (There's also an ambiguity between zero and
>>    False, but that's probably not so serious here.) If I were designing
>>    an API from scratch I'd probably go with a separate parameter for the
>>    minimum (or not, if ylim can fix it after the fact) and a dedicated
>>    one for "should we use a log scale". Failing that, maybe the string
>>    "auto" to indicate automatic minimum values and None for a default?
>>
>>    If you're going to use True to mean something different from 1,
>>    though, I'd make sure to put a warning in the docstring. Unfortunately
>>    you can't just rely on duck typing to tell numeric values from
>>    booleans, since float(True) is 1.0. On the other hand, "is True" fails
>>    for numpy booleans, and "== True" passes for 1.0. So if this is your
>>    constraint, I'm not sure you can do better than "is True", but it's a
>>    huge UI wart.
>>
>>    Anne
>>
>>    P.S. keep in mind that the values users pass are not always directly
>>    visible to users - they might be passing a value output from someone
>>    else's routine that is described as returning a boolean value but
>>    actually returns an integer. This is particularly common among C or
>>    Fortran routines, which are in turn very common in the numerical
>>    world. From the other direction, if you pull a value out of a boolean
>>    numpy array, you get a numpy boolean which will never "is True". -A
>>
>>
>> I agree.  After thinking about it further, I realized a few other ways
>> that this will silently fail.  Quite personally, I feel that the
>> hist/bar family of functions ought to be nuked from orbit.  It is
>> absolutely amazing just how convoluted those functions are.  We seem to
>> be acquiescing to every single feature request rather than thinking
>> about how one could use the existing matplotlib tools.  For example, if
>> one wants error bars on their histograms/bar plots, then they should be
>> able to call hist() and then errorbars() to achieve their needs.
>
> Ben,
>
> "Me, too!" regarding dismay over the hist/bar family (including box plots).
>  They need to be rethought and re-implemented in their own module.  Dealing
> with the transition may be a pain, but so is every experience with trying to
> maintain them.
>
> Eric
>
>
>>
>> For logarithmic hist(), I am not exactly sure why we should have a
>> keyword argument to indicate a mode of operation.  We have loglog(),
>> semilogx(), semilogy().  A user could also call set_xscale() or
>> set_yscale() and the limits accordingly. I am not saying they are the
>> best examples/approaches, merely pointing out the different ways we have
>> for log plotting.
>>
>> Should we have histlog() and barlog() functions?  Are we willing to make
>> an effort to look at some of these plotting functions and clean them up?
>>
>> Ben Root
>
>



-- 
Erik Tollerud

------------------------------------------------------------------------------
Sell apps to millions through the Intel(R) Atom(Tm) Developer Program
Be part of this innovative community and reach millions of netbook users 
worldwide. Take advantage of special opportunities to increase revenue and 
speed time-to-market. Join now, and jumpstart your future.
http://p.sf.net/sfu/intel-atom-d2d
_______________________________________________
Matplotlib-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel

Re: [matplotlib-devel] purpose of 'stepfilled' in hist()?

Reply via email to