tdunning commented on PR #471:
URL: https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3722483623

   If you have a function that adds up an array of values, what is the correct
   behavior if one of the values is +∞ and the others are normal numbers?
   
   I think the answer should be +∞
   
   What if you pass in +∞ and -∞ along with normal values?
   
   I think the answer should be NaN
   
   Any sum-like function should behave this way. That includes mean and,
   in a more nuanced way, t-digest.
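   
   As a minimal sketch of the sum case (the helpers `sum` and `mean` are
   hypothetical names for illustration, not part of datasketches-cpp), plain
   IEEE 754 addition in C++ already produces these answers with no
   special-casing of infinities:
   
   ```cpp
   #include <limits>
   #include <vector>
   
   // Hypothetical helper: naive left-to-right IEEE 754 summation.
   // A single +inf (or -inf) makes the total that infinity;
   // +inf together with -inf makes the total NaN.
   double sum(const std::vector<double>& values) {
     double total = 0.0;
     for (double v : values) {
       total += v;
     }
     return total;
   }
   
   // Hypothetical helper: the mean inherits the behavior of the sum.
   // An empty input yields 0.0 / 0.0, which is NaN.
   double mean(const std::vector<double>& values) {
     return sum(values) / static_cast<double>(values.size());
   }
   ```
   
   With `inf = std::numeric_limits<double>::infinity()`, `sum({1.0, 2.0, inf})`
   is +∞ and `sum({1.0, inf, -inf, 2.0})` is NaN, assuming default
   floating-point settings (no `-ffast-math` or similar).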
   
   What is the median of [0, 0, 1, +∞, +∞] ?
   
   I think it should be 1, not 0.
   
   Julia agrees:
   ```
   julia> median([0, 0, 1, +Inf, +Inf])
   1.0
   ```
   If the infinities are ignored, the median of the remaining [0, 0, 1] is 0,
   which is the wrong answer.
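   
   The same comparison can be sketched in C++. This is an exact sort-based
   median, not the t-digest algorithm, and `median` and `drop_non_finite` are
   hypothetical helpers for illustration, not datasketches-cpp APIs. The
   sketch assumes the input contains no NaN, since sorting NaN values with
   `std::sort` is not well defined:
   
   ```cpp
   #include <algorithm>
   #include <cmath>
   #include <cstddef>
   #include <stdexcept>
   #include <vector>
   
   // Hypothetical helper: exact median via sorting. Infinities are ordinary
   // ordered values here, so they count toward the rank of the middle element.
   // Assumes no NaN in the input.
   double median(std::vector<double> values) {
     if (values.empty()) {
       throw std::invalid_argument("median of empty input");
     }
     std::sort(values.begin(), values.end());
     const std::size_t n = values.size();
     if (n % 2 == 1) {
       return values[n / 2];
     }
     // Even count: average the two middle values.
     return 0.5 * values[n / 2 - 1] + 0.5 * values[n / 2];
   }
   
   // Hypothetical helper: mimics a policy of silently ignoring non-finite input.
   std::vector<double> drop_non_finite(const std::vector<double>& values) {
     std::vector<double> kept;
     for (double v : values) {
       if (std::isfinite(v)) {
         kept.push_back(v);
       }
     }
     return kept;
   }
   ```
   
   For the input [0, 0, 1, +∞, +∞], `median` on the full data returns 1, while
   `median(drop_non_finite(...))` returns 0, matching the discrepancy
   described above.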
   
   On Wed, Jan 7, 2026 at 9:15 PM Hyeonho Kim ***@***.***> wrote:
   
   > *proost* left a comment (apache/datasketches-cpp#471)
   > <https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3721925921>
   >
   > @AlexanderSaydakov <https://github.com/AlexanderSaydakov>
   >
   > There is one thing I wanted to clarify regarding this PR.
   >
   > In the current implementation, ±∞ values can flow into the t-digest and be
   > treated as data points. With this PR, ±∞ values passed to update() are
   > silently ignored, and ±∞ passed to query methods such as get_rank()
   > result in an exception. That means this change alters the behavior for
   > users who might already be passing infinity values (intentionally or not),
   > which effectively makes it a breaking change.
   >
   > If supporting ±∞ as valid data was intentional in the original design,
   > then I agree that changing this behavior is not appropriate. However, if it
   > wasn’t an intentional design choice and infinity handling was simply
   > unspecified, then I think explicitly rejecting/ignoring infinities and
   > documenting the behavior would improve robustness and make the policy
   > clearer for users.
   >
   > I wanted to check your view on this before moving forward, since it
   > affects compatibility and library semantics.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

