[GitHub] [spark] beliefer opened a new pull request #29891: [WIP][SPARK-30796][SQL] Add parameter position for REGEXP_REPLACE

GitBox Mon, 28 Sep 2020 02:10:43 -0700


beliefer opened a new pull request #29891:
URL: https://github.com/apache/spark/pull/29891



   ### What changes were proposed in this pull request?
   `REGEXP_REPLACE` replaces all substrings of a string that match a regexp with a replacement string.
   However, it currently lacks some flexibility. For example, consider converting a camel-case string into lower-case words separated by underscores:
   AddressLine1 -> address_line_1
   If we support a `position` parameter, we can do this (e.g. in Oracle):
   ```
   WITH strings as (
     SELECT 'AddressLine1' s FROM dual union all   
     SELECT 'ZipCode' s FROM dual union all   
     SELECT 'Country' s FROM dual   
   )   
     SELECT s "STRING",  
            lower(regexp_replace(s, '([A-Z0-9])', '_\1', 2)) "MODIFIED_STRING"  
     FROM strings;
   
     STRING               MODIFIED_STRING
   -------------------- --------------------
   AddressLine1         address_line_1
   ZipCode              zip_code
   Country              country
   ```
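
The effect of the start position can be sketched outside the database. The Python snippet below is only an illustration under stated assumptions: the helper `regexp_replace` is hypothetical (it is not the Spark implementation in this PR), and it approximates `position` by copying the prefix through unchanged and rewriting only the remainder, which may differ from a real engine for patterns that use anchors or lookbehind.

```python
import re


def regexp_replace(source: str, pattern: str, replacement: str, position: int = 1) -> str:
    """Hypothetical helper: emulate REGEXP_REPLACE with a 1-based start position."""
    if position < 1:
        raise ValueError("position must be a positive integer")
    start = position - 1
    # Characters before `position` are copied through unchanged;
    # only the rest of the string is searched and rewritten.
    return source[:start] + re.sub(pattern, replacement, source[start:])


for s in ["AddressLine1", "ZipCode", "Country"]:
    # Starting at position 2 skips the leading capital letter,
    # so no underscore is inserted at the front of the result.
    print(s, "->", regexp_replace(s, r"([A-Z0-9])", r"_\1", 2).lower())
```

With the default position of 1, the first capital letter also matches, so `AddressLine1` would become `_address_line_1`. Starting the search at position 2 is what avoids the leading underscore.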
   Several mainstream databases support this syntax.
   
   **Oracle**
   
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/REGEXP_REPLACE.html#GUID-EA80A33C-441A-4692-A959-273B5A224490
   
   **Vertica**
   
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/RegularExpressions/REGEXP_REPLACE.htm?zoom_highlight=regexp_replace
   
   **Redshift**
   https://docs.aws.amazon.com/redshift/latest/dg/REGEXP_REPLACE.html
   
   
   ### Why are the changes needed?
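   A `position` parameter for `REGEXP_REPLACE` lets the search start at a given character offset, so the beginning of the string can be left untouched, as in the camel-case example above.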
   
   
   ### Does this PR introduce _any_ user-facing change?
   'Yes'.
   
   
   ### How was this patch tested?
   Jenkins test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



