[ https://issues.apache.org/jira/browse/SPARK-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated SPARK-2686:
----------------------------
    Target Version/s: 1.4.0  (was: 1.3.0)

> Add Length support to Spark SQL and HQL and Strlen support to SQL
> -----------------------------------------------------------------
>
>                 Key: SPARK-2686
>                 URL: https://issues.apache.org/jira/browse/SPARK-2686
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>         Environment: all
>            Reporter: Stephen Boesch
>            Priority: Minor
>              Labels: hql, length, sql
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Syntactic, parsing, and operational support have been added for the LEN(GTH)
> and STRLEN functions.
>
> Examples:
>
> SQL:
> import org.apache.spark.sql._
> case class TestData(key: Int, value: String)
> val sqlc = new SQLContext(sc)
> import sqlc._
> val testData: SchemaRDD = sqlc.sparkContext.parallelize(
>   (1 to 100).map(i => TestData(i, i.toString)))
> testData.registerAsTable("testData")
> sqlc.sql("select length(key) as key_len from testData order by key_len desc limit 5").collect
> res12: Array[org.apache.spark.sql.Row] = Array([3], [2], [2], [2], [2])
>
> HQL:
> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
> import hc._
> hc.hql
> hql("select length(grp) from simplex").collect
> res14: Array[org.apache.spark.sql.Row] = Array([6], [6], [6], [6])
>
> As for the codebase changes, they were purposefully made similar to the ones
> made for adding SUBSTR(ING) on July 17. SQLParser, Optimizer, Expression,
> stringOperations, and HiveQL were the main classes changed. The test suites
> affected are ConstantFolding and ExpressionEvaluation.
>
> In addition, some ad-hoc testing was done, as shown in the examples.
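
The description above mentions changes to Expression and to the ConstantFolding tests. As a rough illustration only, here is a self-contained toy sketch of how a length expression and a constant-folding rule might fit together. It does not use Spark and is not the patch itself: `Expr`, `Literal`, `Column`, `Length`, and the `ConstantFolding` object are hypothetical names invented for this sketch. Returning null for null input and measuring non-string values by their string form (consistent with `length(key)` yielding 3 for key 100 in the SQL example) are assumptions.

```scala
// Toy expression tree. These are NOT Spark's Catalyst classes;
// every name here is hypothetical and exists only for illustration.
sealed trait Expr {
  def eval(row: Map[String, Any]): Any
  def foldable: Boolean // true when the value does not depend on the row
}

final case class Literal(value: Any) extends Expr {
  def eval(row: Map[String, Any]): Any = value
  val foldable: Boolean = true
}

final case class Column(name: String) extends Expr {
  def eval(row: Map[String, Any]): Any = row.getOrElse(name, null)
  val foldable: Boolean = false
}

final case class Length(child: Expr) extends Expr {
  def eval(row: Map[String, Any]): Any = child.eval(row) match {
    case null      => null                  // assumption: null in, null out
    case s: String => s.length
    case other     => other.toString.length // assumption: use the string form
  }
  def foldable: Boolean = child.foldable
}

// Replaces a Length over a constant child with a precomputed Literal.
object ConstantFolding {
  def apply(e: Expr): Expr = e match {
    case Length(child) =>
      val folded = Length(apply(child))
      if (folded.foldable) Literal(folded.eval(Map.empty)) else folded
    case other => other
  }
}
```

With these definitions, `ConstantFolding(Length(Literal("abc")))` reduces to `Literal(3)` before any row is read, while `Length(Column("key"))` is left in place and evaluated per row.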