ozankabak commented on issue #8078:
URL: 
https://github.com/apache/arrow-datafusion/issues/8078#issuecomment-1810936320

   Thank you for investigating this. There are some parts I don't quite follow. 
In particular,
   
   > I don't think the proposal (at least as I understand it) in 
https://github.com/apache/arrow-datafusion/issues/8078#issuecomment-1804546752, 
can capture the known min/max values
   
   I don't think this is entirely accurate. In addition to the two 
configurations you identified, it can also be:
   ```rust
   Precision::Range(
     PointEstimate::Exact(0), // min
     PointEstimate::Exact(N), // max
   )
   ```
   which captures the known min/max values. However, this loses our inexact 
guess `0.1 * N`. I think what you are trying to convey is that @berkaysynnada's 
proposal cannot represent the triple of (*lower bound*, *guess*, *upper 
bound*). If this is what you mean, I agree with you.
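   
   To make the trade-off concrete, here is a minimal sketch. Every type in it
   is a hypothetical stand-in, not DataFusion's actual API nor the exact shape
   of the proposal; in particular, the `Point` variant and the helper names
   `bounds_only` and `guess_only` are assumptions made for illustration. It
   shows that a two-slot `Range` can keep either the hard bounds or the guess,
   while a three-field `Bounded` keeps all of them:
   
   ```rust
   // Hypothetical types for illustration only; not DataFusion's actual API.
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum PointEstimate {
       Exact(usize),
       Inexact(usize),
   }
   
   // Assumed shape: either a single point estimate or a (min, max) pair.
   #[derive(Debug, Clone, Copy, PartialEq)]
   enum Precision {
       Point(PointEstimate),
       Range(PointEstimate, PointEstimate),
   }
   
   // The (lower bound, guess, upper bound) triple.
   #[derive(Debug, Clone, Copy, PartialEq)]
   struct Bounded {
       lower: usize,
       estimate: usize,
       upper: usize,
   }
   
   /// Keeps the hard bounds; the guess has no slot in `Range` and is dropped.
   fn bounds_only(b: Bounded) -> Precision {
       Precision::Range(PointEstimate::Exact(b.lower), PointEstimate::Exact(b.upper))
   }
   
   /// Keeps the guess; the hard bounds are dropped.
   fn guess_only(b: Bounded) -> Precision {
       Precision::Point(PointEstimate::Inexact(b.estimate))
   }
   ```
   
   With `N = 1000`, a `Bounded { lower: 0, estimate: 100, upper: 1000 }` maps
   to either `Range(Exact(0), Exact(1000))` or `Point(Inexact(100))`; neither
   result carries all three numbers.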
   
   ## Example 2 `(x <= y)`
   ```
   x: Bounded { lower: 50, estimate: 0.75 * N, upper: 100 }, 
   y: Bounded { lower: 50, estimate: 0.75 * N, upper: 200 }
   ```
   I don't understand what you mean by this. In this snippet, the `lower`/`upper` 
bounds refer to column value bounds, while `estimate` (`0.75 * N`) refers to a 
row count, so the fields mix two different quantities. Did you mean to write:
   ```
   x: Bounded { lower: 50, estimate: 75, upper: 100 },
   y: Bounded { lower: 50, estimate: 125, upper: 200 },
   row_count: Bounded { lower: 0, estimate: 62.5, upper: 100 }
   ```
   where the 62.5 figure is derived from the factor:
   
   
![image](https://github.com/apache/arrow-datafusion/assets/2006258/dda3cb9d-538b-415e-9f06-62992f583943)
   
   which is 5/8.
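   
   As a quick arithmetic check, the row-count estimate is just that factor
   applied to the input row count. The sketch below assumes an input of 100
   rows (inferred from `upper: 100` in the snippet above); the function name
   `estimated_rows` is hypothetical:
   
   ```rust
   /// Scales an input row count by a selectivity factor in [0, 1].
   /// Hypothetical helper for illustration; not part of DataFusion.
   fn estimated_rows(selectivity: f64, input_rows: f64) -> f64 {
       selectivity * input_rows
   }
   ```
   
   Here `estimated_rows(5.0 / 8.0, 100.0)` evaluates to `62.5`, matching the
   `estimate` field of `row_count`.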
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.