XiaoHongbo-Hope commented on code in PR #7965:
URL: https://github.com/apache/paimon/pull/7965#discussion_r3301087772
##########
paimon-python/pypaimon/daft/daft_datasource.py:
##########
@@ -213,18 +219,29 @@ async def get_tasks(self, pushdowns: Pushdowns) -> AsyncIterator[DataSourceTask]
read_table = self._table.copy({"blob-as-descriptor": "true"})
read_builder = read_table.new_read_builder()
+ reader_predicate, filters_consumed = self._pushdown_filter_state(pushdowns)
+ planning_predicate = self._planning_predicate(reader_predicate)
+ requested_columns = self._valid_output_columns(pushdowns.columns)
+ task_columns = self._task_columns(read_table, requested_columns, pushdowns)
+ read_columns = self._fallback_read_columns(read_table, task_columns, reader_predicate)
Review Comment:
nit: these locals (plus `source_limit` below) must stay in lockstep between
the planner in `get_tasks` and `_fallback_read_builder`. Consider grouping
them into a small shared dataclass. Non-blocking.
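
A minimal sketch of what such a dataclass could look like. The class name `PushdownState`, the field types, and the sample values are assumptions for illustration only; the field names mirror the locals in the hunk above plus `source_limit`, and nothing here is taken from the actual PR code.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass(frozen=True)
class PushdownState:
    """Hypothetical container for the pushdown-derived locals.

    Field types are guesses; the real predicate and column types
    in pypaimon may differ.
    """

    reader_predicate: Optional[Any]
    filters_consumed: Any
    planning_predicate: Optional[Any]
    requested_columns: Optional[Sequence[str]]
    task_columns: Optional[Sequence[str]]
    read_columns: Optional[Sequence[str]]
    source_limit: Optional[int]


# Build the state once, then hand the same object to both code paths.
state = PushdownState(
    reader_predicate=None,
    filters_consumed=False,
    planning_predicate=None,
    requested_columns=["id", "name"],
    task_columns=["id", "name"],
    read_columns=["id", "name"],
    source_limit=10,
)
```

Making the dataclass frozen means neither path can mutate a field after construction, so the two consumers cannot drift apart silently.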