[GitHub] [spark] yaooqinn commented on pull request #31940: [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries

GitBox Tue, 23 Mar 2021 19:05:41 -0700


yaooqinn commented on pull request #31940:
URL: https://github.com/apache/spark/pull/31940#issuecomment-805417257



   > The PR LGTM.
   > 
   > But I have a quick question, a bit unrelated to this PR but related to fixed-length string columns. Is the following behavior expected in Spark?
   > 
   > ```
   > sql(s"CREATE TABLE t(c3 CHAR(3), c5 CHAR(5)) USING parquet")
   > sql("INSERT INTO t VALUES ('a', 'a')")
   > sql("SELECT c3, c5, c3 = c5, upper(c3) = upper(c5) FROM t").show()
   > ```
   > 
   > ```
   > +---+-----+---------+-----------------------+
   > | c3|   c5|(c3 = c5)|(upper(c3) = upper(c5))|
   > +---+-----+---------+-----------------------+
   > |a  |a    |     true|                  false|
   > +---+-----+---------+-----------------------+
   > ```
   > 
   > But in PostgreSQL `upper(c3) = upper(c5)` is true:
   > 
   > ```
   > petertoth=# SELECT c3, c5, c3 = c5, upper(c3) = upper(c5) FROM t;
   >  c3  |  c5   | ?column? | ?column?
   > -----+-------+----------+----------
   >  a   | a     | t        | t
   > ```
   
   cc @cloud-fan, we seem to have discussed this when we prohibited 
char/varchar in UDFs
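
For reference, a plain-Scala sketch (no Spark needed) of one possible explanation for the difference quoted above. It rests on two assumptions that nothing in this thread confirms: that Spark pads both sides to the longer length only when CHAR columns are compared directly, while `upper()` returns an ordinary string that still carries each column's own padding, and that PostgreSQL drops trailing spaces when a CHAR value is converted to text for `upper()`. `CharPaddingSketch` and every name inside it are hypothetical and are not Spark or PostgreSQL APIs.

```scala
// Hypothetical model of the quoted results; not Spark or PostgreSQL code.
object CharPaddingSketch {
  // Right-pad with spaces to a fixed length, as a CHAR(n) column stores values.
  def rpad(s: String, n: Int): String = s + (" " * math.max(0, n - s.length))

  // 'a' as read back from CHAR(3) and CHAR(5).
  val c3: String = rpad("a", 3)
  val c5: String = rpad("a", 5)

  // Direct CHAR comparison: pad both sides to the longer length, then compare.
  def charEquals(l: String, r: String): Boolean = {
    val n = math.max(l.length, r.length)
    rpad(l, n) == rpad(r, n)
  }

  // ASSUMPTION (Spark): upper() yields a plain string, so the operands keep
  // their different paddings and nothing re-pads them before comparing.
  val sparkLikeUpperEquals: Boolean = c3.toUpperCase == c5.toUpperCase

  // ASSUMPTION (PostgreSQL): trailing spaces are removed before upper() runs.
  def dropTrailingSpaces(s: String): String = s.replaceAll(" +$", "")
  val postgresLikeUpperEquals: Boolean =
    dropTrailingSpaces(c3).toUpperCase == dropTrailingSpaces(c5).toUpperCase
}
```

Under these assumptions the model matches both outputs quoted above: `charEquals(c3, c5)` and `postgresLikeUpperEquals` hold, while `sparkLikeUpperEquals` does not.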





