[I] vector_of_kll_floats_sketches.get_quantiles() returns wrong values with float32 [datasketches-python]

via GitHub Sun, 30 Nov 2025 23:46:51 -0800


tyler-rt opened a new issue, #63:
URL: https://github.com/apache/datasketches-python/issues/63


   
   
   ```python
   #!/usr/bin/env python3
   """
   Minimal example: vector_of_kll_floats_sketches.get_quantiles() returns WRONG VALUES with float32
   """
   import numpy as np
   from datasketches import vector_of_kll_floats_sketches
   
   # Create test data: 1000 samples between -100 and -10
   np.random.seed(42)
   test_data = np.random.uniform(-100, -10, size=(1000, 1)).astype(np.float32)
   
   print("Test data: 1000 samples between -100 and -10")
   print(f"True min: {test_data.min():.2f}, True max: {test_data.max():.2f}")
   
   # Create sketch and add data
   kll = vector_of_kll_floats_sketches(200, 1)
   kll.update(test_data)
   
   # Request p0.0001 (should be ~-100) and p0.9999 (should be ~-10)
   ranks_list = [0.0001, 0.9999]
   ranks_array32 = np.array(ranks_list, dtype=np.float32)
   ranks_array64 = np.array(ranks_list, dtype=np.float64)
   
   
   print("\n" + "="*60)
   print("BUG: numpy array with dtype=np.float32 returns WRONG quantiles")
   print("="*60)
   
   quants_array = kll.get_quantiles(ranks_array32)
   print(f"\nWith numpy array with dtype=np.float32: {ranks_array32}")
   print(f"  p0.0001 = {quants_array[0][0]:.2f}  (expected: ~-100)")
   print(f"  p0.9999 = {quants_array[0][1]:.2f}  (expected: ~-10)")
   print(f"  ✗ WRONG: Both values near minimum!")
   
   quants_array64 = kll.get_quantiles(ranks_array64)
   print(f"\nWith numpy array with dtype=np.float64: {ranks_array64}")
   print(f"  p0.0001 = {quants_array64[0][0]:.2f}  (expected: ~-100)")
   print(f"  p0.9999 = {quants_array64[0][1]:.2f}  (expected: ~-10)")
   print(f"  ✓ CORRECT")
   
   ```
   
   ```
   Test data: 1000 samples between -100 and -10
   True min: -99.58, True max: -10.03
   
   ============================================================
   BUG: numpy array with dtype=np.float32 returns WRONG quantiles
   ============================================================
   
   With numpy array with dtype=np.float32: [1.000e-04 9.999e-01]
     p0.0001 = -98.69  (expected: ~-100)
     p0.9999 = -99.50  (expected: ~-10)
     ✗ WRONG: Both values near minimum!
   
   With numpy array with dtype=np.float64: [1.000e-04 9.999e-01]
     p0.0001 = -99.50  (expected: ~-100)
     p0.9999 = -10.28  (expected: ~-10)
     ✓ CORRECT
   ```
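
A possible workaround until this is resolved, based only on the observation above that float64 ranks return the expected values: cast the ranks to float64 before calling `get_quantiles()`. This is a minimal sketch, not a fix for the underlying problem. The helper name `as_float64_ranks` and its range check are hypothetical additions for illustration and are not part of datasketches. It also assumes the rank array's dtype is the only trigger, which the issue does not confirm.

```python
import numpy as np


def as_float64_ranks(ranks):
    """Hypothetical helper (not part of datasketches).

    Returns the ranks as a contiguous 1-D float64 array, the dtype that
    produced correct quantiles in the output above.
    """
    arr = np.ascontiguousarray(ranks, dtype=np.float64).ravel()
    # Extra sanity check added for this sketch: ranks are fractions in [0, 1].
    if arr.size and (arr.min() < 0.0 or arr.max() > 1.0):
        raise ValueError("ranks must lie in [0, 1]")
    return arr


ranks32 = np.array([0.0001, 0.9999], dtype=np.float32)
ranks64 = as_float64_ranks(ranks32)
print(ranks64.dtype)

# Assumed usage with the sketch built in the script above:
# quants = kll.get_quantiles(ranks64)
```

The cast only widens the stored values, so a float32 rank such as 0.0001 keeps its float32 rounding after conversion. At the ranks used here that difference is far smaller than the sketch's approximation error.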


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

