Still, the calculated a priori error bounds for the LRA case look
strange.
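For concreteness, here is a minimal sketch of the negative-lower-bound problem discussed in the thread below. The RSE value here is hypothetical, chosen only so that 3 SD * RSE comes out to the 1.23 figure quoted for K = 4, and rank_lb/rank_ub are stand-ins for the p-code's getRankLB/getRankUB, not the actual implementation:

```python
def rank_lb(rank, rse, num_std_dev):
    # p-code style lower bound: rank * (1 - sd * RSE)
    return rank * (1.0 - num_std_dev * rse)

def rank_ub(rank, rse, num_std_dev):
    # p-code style upper bound: rank * (1 + sd * RSE)
    return rank * (1.0 + num_std_dev * rse)

def rank_lb_clamped(rank, rse, num_std_dev):
    # Same bound, clamped so it can never leave [0, 1]
    return max(0.0, rank_lb(rank, rse, num_std_dev))

def rank_ub_clamped(rank, rse, num_std_dev):
    return min(1.0, rank_ub(rank, rse, num_std_dev))

rse_k4 = 0.41  # hypothetical: chosen so 3 * RSE = 1.23 > 1, as quoted for K = 4

print(rank_lb(0.5, rse_k4, 3))          # negative: nonsensical as a rank
print(rank_lb_clamped(0.5, rse_k4, 3))  # 0.0
print(rank_ub(0.9, rse_k4, 3))          # > 1: also nonsensical
print(rank_ub_clamped(0.9, rse_k4, 3))  # 1.0
```

Clamping only hides the symptom, of course; the real ask, as below, is an RSE function that never exceeds 1/SD for any legal value of K.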


On Sun, Sep 13, 2020 at 1:09 PM leerho <[email protected]> wrote:

> Pavel,
> Yes, I was doing something wrong! Sorry! I was confusing the predicted
> bounds on a rank with the error shown in the graphs, where the absolute
> rank is subtracted out, leaving just the difference.
>
> Nonetheless, we do need functions that do not go negative for small K.
>
> Lee.
>
> On Sun, Sep 13, 2020 at 12:50 PM leerho <[email protected]> wrote:
>
>> Pavel,
>> I have looked very carefully at your p-code for getRankUB and getRankLB,
>> and they both produce nonsensical values.
>>
>>    - getMaximumRSE(k) with K=4 (the smallest value of K), multiplied
>>    by 3 SD, yields 1.23, which is > 1.  It should never produce values
>>    > 1 for any legal value of K; this will cause the LB to go negative.
>>    - Given a K and SD, the factor (1 +/- SD * RSE) is a constant.
>>    Multiplying it by the rank produces linearly increasing bounds from
>>    the lowest rank to the highest.  This does not model the error
>>    properties of the HRA case, which has decreasing error at the high
>>    ranks, or of the LRA case, where the error is fairly flat over most
>>    of the range.
>>    - We need equations that not only predict the actual error behavior,
>>    but that also never produce lower-bound values < 0 (for all legal
>>    values of K), which these equations do.
>>
>> Comments:
>>
>>    - This is not a bug, but a user design principle.  I don't think it
>>    is a good idea to have the user specify a value to obtain the rank
>>    error.  We don't want to give the user the idea that the error bounds
>>    are related to values at all, only to ranks.  I realize we can
>>    translate a given value to an estimated rank, but let's have the user
>>    do that, and let's document the caveats of doing so.
>>    - We have been training our users to become accustomed to specifying
>>    and obtaining ranks normalized to [0,1].  That is how our other
>>    quantile sketches work, and most users think in terms of percentiles
>>    anyway, so normalized ranks are quite natural.
>>
>> Perhaps I am still doing something wrong?
>>
>> Cheers,
>>
>> Lee.
>>
>> On Sat, Sep 12, 2020 at 11:26 AM leerho <[email protected]> wrote:
>>
>>> Hi Pavel,
>>> I think we are in agreement.  I take it as a positive that you haven't
>>> found any flaw in the implementation of the relative error quantiles
>>> algorithm.  What we are discussing now is how to define rank, especially
>>> with respect to HRA and LRA, and, philosophically, what kind of error
>>> distribution as a function of rank users will want and find easiest to
>>> specify and explain.
>>>
>>> My intuition is that all plots should look roughly the same, up to
>>>> mirroring
>>>> along the Y axis.
>>>
>>>
>>> This statement is a little puzzling, as it will only be true if we
>>> choose the definition of rank appropriately depending on whether the
>>> user selects LRA or HRA.  As my plots reveal, if we keep the definition
>>> of rank the same, switching from HRA to LRA produces a dramatically
>>> different error distribution as a f(rank).  I agree with your formulas
>>> for relative error, except that all of our ranks are already normalized
>>> by SL, so I would replace SL in your formulas with 1.0.
>>>
>>> I still want to add to these plots your a priori calculation of error to
>>> see where it lies compared to the measured error.
>>>
>>> I gather from your comments that what you had in mind when writing the
>>> paper was a rank error distribution that looks like plots 3 and 4 above,
>>> not plots 1 and 2.  However, I feel that the rank error distribution
>>> shown in plots 1 and 2 would be easier for users to understand.  We will
>>> probably leave the decision as to what they prefer entirely up to the
>>> user; however, this will require more documentation to explain all the
>>> different error behaviors :(.
>>>
>>> Lee.
>>>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
