Luis Lozano Coira created SPARK-38288:
-----------------------------------------

             Summary: Aggregate push down doesn't work using Spark SQL jdbc 
datasource with postgresql
                 Key: SPARK-38288
                 URL: https://issues.apache.org/jira/browse/SPARK-38288
             Project: Spark
          Issue Type: Question
          Components: SQL
    Affects Versions: 3.2.1
            Reporter: Luis Lozano Coira


I am connecting to PostgreSQL using the Spark SQL JDBC data source. I started 
the Spark shell with the Postgres driver included, and I can connect and run 
queries without problems. I load the table with this statement:
{code:java}
// Wrapped in parentheses so the chained calls can be pasted into spark-shell.
val df = (spark.read.format("jdbc")
  .option("url", "jdbc:postgresql://host:port/")
  .option("driver", "org.postgresql.Driver")
  .option("dbtable", "test")
  .option("user", "postgres")
  .option("password", "*******")
  .option("pushDownAggregate", true)
  .load())
{code}
I am adding the pushDownAggregate option because I would like aggregations to 
be delegated to the source. But for some reason this is not happening.

Judging by this pull request, the feature appears to have been merged into 
3.2: [https://github.com/apache/spark/pull/29695]

My aggregations stay within the limitations mentioned in that pull request. An 
example where I don't see pushdown happening is this one:
{code:java}
df.groupBy("name").max("age").show()
{code}
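A minimal way to check this, assuming the df defined above, is to print the 
physical plan rather than run the query. The plan contents described in the 
comments are an assumption about the explain output, which may differ between 
Spark versions:
{code:java}
// Sketch: print the physical plan of the same aggregation instead of running it.
val agg = df.groupBy("name").max("age")
agg.explain()
// Assumption: if the aggregate reaches PostgreSQL, the scan node shows a
// non-empty PushedAggregates entry (something like MAX(age)). If that entry is
// empty or missing and a HashAggregate sits above the scan, Spark is computing
// the max itself.
{code}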
What could be the problem? Should pushDownAggregate work in this case?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
