Chhavi Bansal created SPARK-51295:
-------------------------------------

             Summary: \d failure in CLI client for postgres v15 and above for 
corelated subquery
                 Key: SPARK-51295
                 URL: https://issues.apache.org/jira/browse/SPARK-51295
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.4.1
            Reporter: Chhavi Bansal


Postgres v15 and above fire particular metadata queries when a 
{code:java}
\d tablename {code}
query is fired by the user.
If the backend is spark in this case, certain metadata query fails
{code:java}
SELECT a.attname,  pg_catalog.format_type(a.atttypid, a.atttypmod),  (SELECT 
pg_catalog.pg_get_expr(d.adbin, d.adrelid, true)   FROM pg_catalog.pg_attrdef d 
  WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),  
a.attnotnull,  (SELECT c.collname FROM pg_catalog.pg_collation c, 
pg_catalog.pg_type t   WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND 
a.attcollation <> t.typcollation) AS attcollation,  ''::pg_catalog.char AS 
attidentity,  ''::pg_catalog.char AS attgeneratedFROM pg_catalog.pg_attribute 
aWHERE a.attrelid = '29916401' AND a.attnum > 0 AND NOT a.attisdroppedORDER BY 
a.attnum;  {code}
The inner subquery
{code:java}
(SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t WHERE 
c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> 
t.typcollation) AS attcollation  {code}
is the cause of failure citing to the reason
{code:java}
 Reason: 
[[UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.MUST_AGGREGATE_CORRELATED_SCALAR_SUBQUERY]
 Unsupported subquery expression: Correlated scalar subqueries must be 
aggregated to return at most one row.; line 10 pos 2;Project [attname#11, 
format_type(atttypid, atttypmod)#15002, scalarsubquery(attrelid, attnum, 
atthasdef)#15010, attnotnull#22, attcollation#14999, attidentity#15000, 
attgenerated#15001]+- Sort [attnum#15 ASC NULLS FIRST], true   +- Project 
[attname#11, if (isnull(atttypid#12)) null else 
format_type(knownnotnull(atttypid#12), cast(atttypmod#18 as string)) AS 
format_type(atttypid, atttypmod)#15002, scalar-subquery#14997 [attrelid#10 && 
attnum#15 && atthasdef#23] AS scalarsubquery(attrelid, attnum, 
atthasdef)#15010, attnotnull#22, scalar-subquery#14998 [attcollation#28 && 
atttypid#12 && attcollation#28] AS attcollation#14999, cast( as string) AS 
attidentity#15000, cast( as string) AS attgenerated#15001, attnum#15]      :  
:- Aggregate [first(if (isnull(adrelid#3704)) null else pg_get_expr(adbin#3706, 
knownnotnull(adrelid#3704)), false) AS first(pg_get_expr(adbin, 
adrelid))#15003]      :  :  +- Filter (((adrelid#3704 = outer(attrelid#10)) AND 
(adnum#3705 = cast(outer(attnum#15) as int))) AND (outer(atthasdef#23) = true)) 
     :  :     +- SubqueryAlias d      :  :        +- SubqueryAlias pg_attrdef   
   :  :           +- View (`pg_attrdef`, 
[oid#3703,adrelid#3704,adnum#3705,adbin#3706,adsrc#3707])      :  :             
 +- Project [oid#3703, adrelid#3704, adnum#3705, adbin#3706, adsrc#3707]      : 
 :                 +- SubqueryAlias global_temp.pg_attrdef      :  :            
        +- View (`global_temp`.`pg_attrdef`, 
[oid#3703,adrelid#3704,adnum#3705,adbin#3706,adsrc#3707])      :  :             
          +- LogicalRDD [oid#3703, adrelid#3704, adnum#3705, adbin#3706, 
adsrc#3707], false 
 {code}


Since we cant control the metadata queries being fired by the CLI, Can this 
kind of query with correlated subquery in select statement be supported by 
spark?

Because the same query runs perfectly fine on vanilla postgres.

What should be the way to handle this query?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to