[ 
https://issues.apache.org/jira/browse/SPARK-34669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299290#comment-17299290
 ] 

Hyukjin Kwon commented on SPARK-34669:
--------------------------------------

I suggest checking how the length function behaves in other DBMSes to see 
whether Spark's behaviour makes sense, instead of comparing against one 
specific DBMS.

> Spark SQL uses the function[ length()] to return the length of the string 
> rather than the length of the character
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34669
>                 URL: https://issues.apache.org/jira/browse/SPARK-34669
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.1
>         Environment: spark 3.1.1,scala 2.12
>            Reporter: zhishun luan
>            Priority: Major
>
> As the title says: in MySQL and other relational databases, length() returns 
> the byte length of a string, but in Spark SQL it returns the number of 
> characters. Please provide a separate function that returns the byte length; 
> otherwise users are easily misled.
> ----------------------------------------------------------------------------------------
> {code:java}
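> import org.apache.spark.SparkConf
> import org.apache.spark.sql.SparkSession
>
> // Run length() on a two-character string in a local session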
> SparkSession.builder()
>   .config(new SparkConf().setMaster("local"))
>   .getOrCreate()
>   .sql("select length('测a')")
>   .show()
> {code}
>  
> [result in Spark SQL]
> +-----------+
> |length(测a)|
> +-----------+
> |          2|
> +-----------+
> [result in MySQL]
> +-----------+
> |length(测a)|
> +-----------+
> |          4|
> +-----------+
>  
>  
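
For reference, the two numbers in the report can be reproduced without either database. The following is only a minimal sketch in plain Java: the class `LengthDemo` and its two methods are hypothetical names made up for illustration, and it assumes the MySQL column or connection uses a UTF-8 character set, which the report does not state. It shows that the string from the report has 2 characters but 4 bytes once encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Spark or MySQL.
final class LengthDemo {

    // Number of Unicode code points: what a character-based length() reports.
    static int charLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes in the UTF-8 encoding: what a byte-based length()
    // reports, assuming the string is stored as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The string from the report: U+6D4B followed by 'a'.
        String s = "\u6d4ba";
        System.out.println(charLength(s));      // prints 2
        System.out.println(utf8ByteLength(s));  // prints 4
    }
}
```

U+6D4B takes three bytes in UTF-8 and 'a' takes one, which accounts for the difference. Spark's built-in function list also includes `octet_length`, which may already cover the requested byte-length case; that is worth checking against the documentation for the affected version.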


