[jira] [Updated] (SPARK-38852) Better Data Source V2 operator pushdown framework

jiaan.geng (Jira) Sun, 30 Jul 2023 21:02:23 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-38852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


jiaan.geng updated SPARK-38852:
-------------------------------
    Description: 
Currently, Spark supports pushing down filters and aggregates to data sources.
However, the Data Source V2 operator pushdown framework has the following
shortcomings:

# Only simple filters and aggregates are supported, which makes the framework
unusable in most scenarios
# SQL syntax incompatibilities make the framework unusable in most scenarios
# Aggregate push down does not support data sources with multiple partitions
# Spark's additional aggregate adds some overhead
# Limit push down is not supported
# Top N push down is not supported
# Aggregate push down does not support GROUP BY expressions
# Aggregate push down does not support aggregates without aggregate functions
# Offset push down is not supported
# Paging push down is not supported
# UDF/UDAF push down is not supported

  was:
Currently, Spark supports pushing down filters and aggregates to data sources.
However, the Data Source V2 operator pushdown framework has the following
shortcomings:

# Only simple filters and aggregates are supported, which makes the framework
unusable in most scenarios
# SQL syntax incompatibilities make the framework unusable in most scenarios
# Aggregate push down does not support data sources with multiple partitions
# Spark's additional aggregate adds some overhead
# Limit push down is not supported
# Top N push down is not supported
# Aggregate push down does not support GROUP BY expressions
# Aggregate push down does not support aggregates without aggregate functions
# Offset push down is not supported
# Paging push down is not supported


> Better Data Source V2 operator pushdown framework
> -------------------------------------------------
>
>                 Key: SPARK-38852
>                 URL: https://issues.apache.org/jira/browse/SPARK-38852
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: jiaan.geng
>            Priority: Major
>
> Currently, Spark supports pushing down filters and aggregates to data sources.
> However, the Data Source V2 operator pushdown framework has the following
> shortcomings:
> # Only simple filters and aggregates are supported, which makes the framework
> unusable in most scenarios
> # SQL syntax incompatibilities make the framework unusable in most scenarios
> # Aggregate push down does not support data sources with multiple partitions
> # Spark's additional aggregate adds some overhead
> # Limit push down is not supported
> # Top N push down is not supported
> # Aggregate push down does not support GROUP BY expressions
> # Aggregate push down does not support aggregates without aggregate functions
> # Offset push down is not supported
> # Paging push down is not supported
> # UDF/UDAF push down is not supported
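
For readers unfamiliar with the multi-partition and Top N items in the list
above, here is a minimal sketch of the idea in plain Python. It is a toy model,
not Spark's Data Source V2 API: the partition data and the function names
(scan_partial_sum, final_sum, scan_top_n, final_top_n) are all hypothetical.
It assumes a source split into two partitions, where each partition can only
answer for its own rows, so the engine still has to merge the results.

```python
# Toy model of operator pushdown over a partitioned source.
# NOT Spark's API: every name and value here is a hypothetical illustration.
from collections import defaultdict
from heapq import nsmallest

# Two "partitions" of an imaginary data source holding (dept, salary) rows.
PARTITIONS = [
    [("eng", 100), ("eng", 300), ("ops", 50)],
    [("eng", 200), ("ops", 150), ("ops", 250)],
]


def scan_partial_sum(partition):
    """Aggregate pushed down: the source returns one partial SUM per group."""
    partial = defaultdict(int)
    for dept, salary in partition:
        partial[dept] += salary
    return dict(partial)


def final_sum(partials):
    """The extra aggregate the engine runs to merge per-partition results."""
    total = defaultdict(int)
    for partial in partials:
        for dept, subtotal in partial.items():
            total[dept] += subtotal
    return dict(total)


def scan_top_n(partition, n):
    """Top N pushed down: each partition returns its n lowest-salary rows."""
    return nsmallest(n, partition, key=lambda row: row[1])


def final_top_n(per_partition_rows, n):
    """The engine merges the candidates and keeps the global n lowest rows."""
    merged = [row for rows in per_partition_rows for row in rows]
    return nsmallest(n, merged, key=lambda row: row[1])


# SELECT dept, SUM(salary) ... GROUP BY dept
sums = final_sum(scan_partial_sum(p) for p in PARTITIONS)
# SELECT ... ORDER BY salary LIMIT 2
lowest_two = final_top_n([scan_top_n(p, 2) for p in PARTITIONS], 2)

print(sums)
print(lowest_two)
```

In this model, a source with a single partition would make final_sum and
final_top_n redundant, which is one way to read the "additional aggregate"
overhead item above.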



