[ https://issues.apache.org/jira/browse/SPARK-38852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jiaan.geng updated SPARK-38852:
-------------------------------
Description:
Currently, Spark supports pushing down filters and aggregates to data sources.
However, the Data Source V2 operator pushdown framework has the following
shortcomings:
# Only simple filters and aggregates are supported, which makes the framework
inapplicable in most scenarios
# SQL syntax incompatibilities make the framework inapplicable in most
scenarios
# Aggregate push down does not support data sources with multiple partitions
# Spark's additional aggregate causes some overhead
# Limit push down is not supported
# Top N push down is not supported
# Aggregate push down does not support group by expressions
# Aggregate push down does not support queries that use no aggregate functions
# Offset push down is not supported
# Paging push down is not supported
# UDF/UDAF push down is not supported
was:
Currently, Spark supports pushing down filters and aggregates to data sources.
However, the Data Source V2 operator pushdown framework has the following
shortcomings:
# Only simple filters and aggregates are supported, which makes the framework
inapplicable in most scenarios
# SQL syntax incompatibilities make the framework inapplicable in most
scenarios
# Aggregate push down does not support data sources with multiple partitions
# Spark's additional aggregate causes some overhead
# Limit push down is not supported
# Top N push down is not supported
# Aggregate push down does not support group by expressions
# Aggregate push down does not support queries that use no aggregate functions
# Offset push down is not supported
# Paging push down is not supported
> Better Data Source V2 operator pushdown framework
> -------------------------------------------------
>
> Key: SPARK-38852
> URL: https://issues.apache.org/jira/browse/SPARK-38852
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: jiaan.geng
> Priority: Major
>
> Currently, Spark supports pushing down filters and aggregates to data sources.
> However, the Data Source V2 operator pushdown framework has the following
> shortcomings:
> # Only simple filters and aggregates are supported, which makes the framework
> inapplicable in most scenarios
> # SQL syntax incompatibilities make the framework inapplicable in most
> scenarios
> # Aggregate push down does not support data sources with multiple partitions
> # Spark's additional aggregate causes some overhead
> # Limit push down is not supported
> # Top N push down is not supported
> # Aggregate push down does not support group by expressions
> # Aggregate push down does not support queries that use no aggregate functions
> # Offset push down is not supported
> # Paging push down is not supported
> # UDF/UDAF push down is not supported
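
One possible reading of the two items about multiple partitions and Spark's additional aggregate: when a source has several partitions, an aggregate pushed to each partition returns only partial results, so the engine still has to run a second aggregate to merge them. The sketch below illustrates that idea in plain Python. It is not Spark code and uses no Spark API; the data and the function names partial_sum and final_sum are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: each inner list stands for one partition of a data source.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5)],
]


def partial_sum(rows):
    """Stand-in for the aggregate a source runs on a single partition."""
    acc = defaultdict(int)
    for key, value in rows:
        acc[key] += value
    return dict(acc)


def final_sum(partials):
    """Stand-in for the additional aggregate that merges partial results."""
    acc = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            acc[key] += value
    return dict(acc)


# One partial result per partition; no single partition sees every row of a group.
partials = [partial_sum(rows) for rows in partitions]

# The merge step is the extra work that remains on the engine side.
result = final_sum(partials)
print(result)
```

If the source had a single partition, the output of partial_sum would already be the final answer and the merge step could be skipped, which is the saving the two items appear to be after.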