berkaysynnada commented on PR #7544: URL: https://github.com/apache/arrow-datafusion/pull/7544#issuecomment-1719164499
> > I'm sorry but I didn't understand exactly what you mean while saying derive/propagate.
>
> I didn't explain it in enough detail, and I apologize for that.
>
> `Statistic Derive/Propagate` is the process of computing statistics for a whole plan. In general, we have the initial statistics of the table, and we compute each parent PlanNode recursively from the bottom up, filling in statistics for every PlanNode.
>
> This PR handles one condition: `In the FilterExec::statistics method, if the input statistics are None, the analysis is not performed.` So this PR fills TypeInfo (such as ScalarValue::Null) into `Statistics`.
>
> My point is that we can also correct the "Statistic Derive" so that the input statistics are not None, but already have TypeInfo, rather than injecting TypeInfo directly into FilterExec.

Thanks for the explanation :) There is an [issue](https://github.com/apache/arrow-datafusion/issues/7553); I don't know if you have had the chance to review it, but in summary, the `statistics` method of `FilterExec` needs a refactor. I will move the changes from this PR there, so I am closing this PR.

Actually, what you quoted, "if the input statistics are None, the analysis is not performed," is not what I intended. The analysis is still performed, with the columns having infinite bounds. To do this, filling `Statistics` with TypeInfo is unavoidable.

To put your suggestion into practice, I considered adding a `Schema`-like field to the `Statistics` struct (to hold TypeInfo) and letting each `statistics` method update this schema for the PlanNodes above it. However, since each `statistics` method already has access to its own schema and its children's schemas, this would mean carrying duplicate information, so I gave up on that idea.
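
To illustrate the idea of running the analysis with infinite bounds when the input statistics are None, here is a minimal sketch. It does not use the DataFusion API: `DataType`, `Bound`, `ColumnStatistics`, and all three functions are hypothetical, simplified stand-ins, and `Bound::Int64(None)` plays the role that a typed null `ScalarValue` would play. The point is only that the schema supplies the type of each bound, so the analysis can start from (-inf, +inf) instead of being skipped.

```rust
// Hypothetical, simplified stand-ins for illustration; not DataFusion types.

/// Column data types known from the schema.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Float64,
}

/// A typed bound. `None` means "no value known", i.e. unbounded on that side.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bound {
    Int64(Option<i64>),
    Float64(Option<f64>),
}

/// Per-column min/max statistics.
#[derive(Debug, Clone, PartialEq)]
struct ColumnStatistics {
    min: Bound,
    max: Bound,
}

/// Build statistics from the schema alone: each column gets bounds that
/// carry its type but no value, which the analysis reads as (-inf, +inf).
fn unbounded_statistics(schema: &[(&str, DataType)]) -> Vec<ColumnStatistics> {
    schema
        .iter()
        .map(|(_, data_type)| {
            let bound = match data_type {
                DataType::Int64 => Bound::Int64(None),
                DataType::Float64 => Bound::Float64(None),
            };
            ColumnStatistics { min: bound, max: bound }
        })
        .collect()
}

/// Use the input statistics when present; otherwise fall back to typed,
/// unbounded statistics so that the analysis can still run.
fn statistics_or_unbounded(
    input: Option<Vec<ColumnStatistics>>,
    schema: &[(&str, DataType)],
) -> Vec<ColumnStatistics> {
    input.unwrap_or_else(|| unbounded_statistics(schema))
}

/// Narrow an Int64 column for the predicate `column > literal`.
/// Columns of other types are left unchanged. `saturating_add` keeps the
/// result an over-approximation when `literal` is `i64::MAX`.
fn apply_gt_i64(stats: &mut ColumnStatistics, literal: i64) {
    if let Bound::Int64(min) = &mut stats.min {
        let candidate = literal.saturating_add(1);
        *min = Some(min.map_or(candidate, |current| current.max(candidate)));
    }
}
```

With a schema of `[("a", DataType::Int64), ("b", DataType::Float64)]` and no input statistics, `statistics_or_unbounded` returns unbounded entries for both columns, and applying `a > 10` via `apply_gt_i64` then tightens only the lower bound of `a`, leaving its upper bound open.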
