[ https://issues.apache.org/jira/browse/SPARK-34669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17299290#comment-17299290 ]
Hyukjin Kwon commented on SPARK-34669:
--------------------------------------

I am suggesting that we check how the length function behaves in other DBMSes to see whether Spark's behaviour makes sense, instead of arguing from one specific DBMS.

> Spark SQL uses the function[ length()] to return the length of the string
> rather than the length of the character
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-34669
>                 URL: https://issues.apache.org/jira/browse/SPARK-34669
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.1
>         Environment: spark 3.1.1,scala 2.12
>            Reporter: zhishun luan
>            Priority: Major
>
> As the title says.
> For the function length(), MySQL and other relational databases return the
> byte length, but Spark SQL returns the character length of the string. To
> cover both cases, please provide a new function that returns the byte
> length; otherwise users are easily misled.
> ----------------------------------------------------------------------------------------
> {code:java}
> import org.apache.spark.SparkConf;
> import org.apache.spark.sql.SparkSession;
>
> // Run length() on a string that mixes a multi-byte and a single-byte character.
> SparkSession.builder()
>   .config(new SparkConf().setMaster("local"))
>   .getOrCreate()
>   .sql("select length('测a')")
>   .show();
> {code}
>
> [result]
> +-----------+
> |length(测a)|
> +-----------+
> |2|
> +-----------+
> in mysql
> +-----------+
> |length(测a)|
> +-----------+
> |4|
> +-----------+
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
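
For reference, the two results quoted in the issue (2 versus 4) are consistent with counting characters versus counting UTF-8 bytes. The following is a minimal plain-Java sketch of that difference, not Spark's implementation; the class and method names (LengthDemo, charLength, byteLength) are hypothetical, and it assumes the string is stored as UTF-8. Spark SQL's built-in function list also includes octet_length, which may already cover the requested byte count; that is worth verifying against 3.1.1 before adding a new function.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of Spark).
public class LengthDemo {

    // Number of Unicode code points: a character-oriented length.
    static int charLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when the string is encoded as UTF-8 (assumed encoding).
    static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The string from the issue: U+6D4B followed by 'a'.
        String s = "\u6d4b" + "a";
        System.out.println("characters:  " + charLength(s)); // 2
        System.out.println("UTF-8 bytes: " + byteLength(s)); // 4 (3 bytes + 1 byte)
    }
}
```

The byte count depends on the encoding: under a different character set the same string can occupy a different number of bytes, which is one reason the two notions of length are usually exposed as separate functions.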