GitHub user kevinyu98 opened a pull request:
https://github.com/apache/spark/pull/12646
[SPARK-14878][SQL] Trim characters string function support
#### What changes were proposed in this pull request?
This PR enhances TRIM function support in Spark SQL by allowing trim
characters to be specified, as per the SQL 2003 standard. The SQL syntax is
shown below:
``` SQL
<trim function> ::= TRIM <left paren> <trim operands> <right paren>
<trim operands> ::= [ [ <trim specification> ] [ <trim character> ] FROM ] <trim source>
<trim source> ::= <character value expression>
<trim specification> ::=
LEADING
| TRAILING
| BOTH
<trim character> ::= <character value expression>
```
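To make the three trim specifications concrete, here is a minimal sketch in plain Python that models the grammar above. It is not the code in this PR: the helper name `sql_trim` and its parameters are hypothetical, and it assumes that every character in the trim-character string is removed independently (the description above does not state this), with a single space as the default trim character and BOTH as the default specification.

```python
def sql_trim(source, trim_chars=" ", spec="BOTH"):
    """Illustrative model of TRIM([spec] [trim_chars] FROM source).

    Assumption: trim_chars is treated as a set of characters, which is
    how Python's str.strip family interprets its argument.
    """
    spec = spec.upper()
    if spec == "LEADING":
        return source.lstrip(trim_chars)   # strip from the start only
    if spec == "TRAILING":
        return source.rstrip(trim_chars)   # strip from the end only
    if spec == "BOTH":
        return source.strip(trim_chars)    # strip from both ends
    raise ValueError("unknown trim specification: " + spec)


# Rough equivalents of the SQL forms:
#   TRIM(BOTH 'x' FROM 'xxSparkxx')
#   TRIM(LEADING 'x' FROM 'xxSparkxx')
#   TRIM(TRAILING 'x' FROM 'xxSparkxx')
print(sql_trim("xxSparkxx", "x"))              # Spark
print(sql_trim("xxSparkxx", "x", "LEADING"))   # Sparkxx
print(sql_trim("xxSparkxx", "x", "TRAILING"))  # xxSpark
```

With no trim character given, the sketch falls back to stripping spaces, which corresponds to the special case the existing trim functions already handle.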
Here are links to the documentation for this feature in other mainstream
databases:
- **Oracle:** [TRIM function](http://docs.oracle.com/javadb/10.6.1.0/ref/rreftrimfunc.html)
- **DB2:** [TRIM scalar function](http://www.ibm.com/support/knowledgecenter/SSEPGG_9.8.0/com.ibm.db2.luw.sql.ref.doc/doc/r0023198.html)
- **MySQL:** [Trim function](http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_trim)
This PR implements the above enhancement. The design principle is to keep the
changes to a minimum. The existing trim functions (which handle the special
case of trimming space characters) are kept unchanged for performance reasons.
#### How was this patch tested?
Unit test cases are added to the following files:
- UTF8StringSuite.java
- StringExpressionsSuite.scala
- sql/SQLQuerySuite.scala
- StringFunctionsSuite.scala
- ExpressionToSQLSuite.scala
- execution/SQLQuerySuite.scala
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kevinyu98/spark spark-14878
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12646.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12646
----
commit ac718e268d6090fd788e5ec8addb10230cfae16b
Author: Kevin Yu <[email protected]>
Date: 2016-04-06T21:38:53Z
draft of seq[expression]
commit c78ae966f30ac2437fe8292d9024adbef2f60860
Author: Kevin Yu <[email protected]>
Date: 2016-04-07T02:20:23Z
trim with binaryExpression
commit c749691d532c0f09400c143379f1486c39fbaed8
Author: Kevin Yu <[email protected]>
Date: 2016-04-08T06:11:00Z
utf8 string code change
commit 3c014a57daff15bb86995993de1bcdd0ab136fec
Author: Kevin Yu <[email protected]>
Date: 2016-04-08T06:15:18Z
Merge branch 'trim-fun4' into trim-seqexp
I am using seq[expression] now
commit ae68402631b8325c5037fed8bf4b45599f8d3000
Author: Kevin Yu <[email protected]>
Date: 2016-04-11T21:03:45Z
adding seq(expression)
commit 7bb9770a75ccddd69eaee4c06674aa64220d828b
Author: Kevin Yu <[email protected]>
Date: 2016-04-11T22:19:36Z
fix2
commit 9525770c0e5bbba26b17d544653c8722ba261a37
Author: Kevin Yu <[email protected]>
Date: 2016-04-15T23:12:51Z
trim character
commit 209bd195a9bc96889908b96ec2631dddd8b46a6d
Author: Kevin Yu <[email protected]>
Date: 2016-04-16T00:13:26Z
fix style at utf8stringsuite
commit 4a49fcfa9ae102859ea78f0f4ec6d95a0d7855ed
Author: Kevin Yu <[email protected]>
Date: 2016-04-16T17:10:16Z
simply trim method
commit 18c17b5bcb1e5574d58f46f1bf55defbbc1647ac
Author: Kevin Yu <[email protected]>
Date: 2016-04-18T05:19:11Z
fixing style and simply code
commit 5833d26e8299efa6c47d4281eec7ea23f5dd3ec7
Author: Kevin Yu <[email protected]>
Date: 2016-04-18T16:08:41Z
simply trimleft
commit 4e93a5032b352f3c0985d7e0fb362495077efdf7
Author: Kevin Yu <[email protected]>
Date: 2016-04-18T16:26:02Z
fixing more styles
commit d6a1cb0dca88629d5d1d9ef8d05d08dcdb1089bc
Author: Kevin Yu <[email protected]>
Date: 2016-04-19T16:12:04Z
fixing style3
commit 3b44c5978bd44db986621d3e8511e9165b66926b
Author: Kevin Yu <[email protected]>
Date: 2016-04-20T18:06:30Z
adding testcase
commit 7dc5ecaf52936017ac739ba58fe4b7c9036570e6
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T01:44:26Z
fixing style 4
commit 25dbb2351bea034ffe300d94ea45c3277d399641
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T04:50:00Z
adding trim comments
commit 257303d5099dc405d5845bbcb9a5249d50aff018
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T06:42:04Z
fixing more style5
commit 11438c030a6066daf2caf6252b645ae6c464efee
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T16:27:50Z
fixing comments
commit de7bff8d1a654919a1f509aaf1c7a5799e1815b4
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T21:31:43Z
fixing more styles
commit 18b4a31c687b264b50aa5f5a74455956911f738a
Author: Kevin Yu <[email protected]>
Date: 2016-04-22T21:48:00Z
Merge remote-tracking branch 'upstream/master'
commit 4f4d1c8f2801b1e662304ab2b33351173e71b427
Author: Kevin Yu <[email protected]>
Date: 2016-04-23T16:50:19Z
Merge remote-tracking branch 'upstream/master'
get latest code from upstream
commit c3c68e55a3bceb30116d6b1b4babe2629428cee7
Author: Kevin Yu <[email protected]>
Date: 2016-04-23T22:20:01Z
string trim characters function support
commit f5f0cbed1eb5754c04c36933b374c3b3d2ae4f4e
Author: Kevin Yu <[email protected]>
Date: 2016-04-23T22:20:53Z
Merge remote-tracking branch 'upstream/master'
adding trim characters support
commit 38405747d25a183b6d9ab4589f03be93a5a40d4b
Author: Kevin Yu <[email protected]>
Date: 2016-04-23T22:23:04Z
Merge branch 'test_jira' into spark-14878
adding trim characters support in spark sql
commit 061d7e9c1a77180f6f11c0c26588780934dec12e
Author: Kevin Yu <[email protected]>
Date: 2016-04-23T23:27:50Z
fixing style
----