[ 
https://issues.apache.org/jira/browse/CALCITE-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952631#comment-16952631
 ] 

Pranay Parmar edited comment on CALCITE-3415 at 10/17/19 12:19 PM:
-------------------------------------------------------------------

[~amaliujia]

The *REGEXP_SUBSTR* function is available in Oracle, Teradata and several other 
major dialects, but not in BigQuery. As you mentioned, the closest matches in 
BigQuery are *REGEXP_EXTRACT* and *REGEXP_EXTRACT_ALL*.

There are *4* variations of this function, taking 2, 3, 4 or 5 parameters:

*1. REGEXP_SUBSTR(<CHARACTER>, <CHARACTER>) [2 params] :*
{code:sql}
SELECT REGEXP_SUBSTR('choco chico chipo', 'c+.{2}') FROM foodmart.product
{code}
For BigQuery it will be unparsed into :
{code:sql}
SELECT REGEXP_EXTRACT('choco chico chipo', 'c+.{2}') FROM foodmart.product
{code}
*2. REGEXP_SUBSTR(<CHARACTER>, <CHARACTER>, <INT>) [3 params] :*
{code:sql}
SELECT REGEXP_SUBSTR('choco chico chipo', 'c+.{2}', 7) FROM foodmart.product
{code}
For BigQuery it will be unparsed into :
{code:sql}
SELECT REGEXP_EXTRACT(SUBSTR('choco chico chipo', 7), 'c+.{2}') FROM foodmart.product
{code}
*3. REGEXP_SUBSTR(<CHARACTER>, <CHARACTER>, <INT>, <INT>) [4 params] :*
{code:sql}
SELECT REGEXP_SUBSTR('chocolate chip cookies', 'c+.{2}', 4, 2) FROM foodmart.product
{code}
For BigQuery it will be unparsed into :
{code:sql}
-- position 4 becomes SUBSTR(..., 4); occurrence 2 is the zero-based OFFSET(1)
SELECT REGEXP_EXTRACT_ALL(SUBSTR('chocolate chip cookies', 4), 'c+.{2}')[OFFSET(1)] FROM foodmart.product
{code}
*4. REGEXP_SUBSTR(<CHARACTER>, <CHARACTER>, <INT>, <INT>, <CHARACTER>) [5 params] :*
{code:sql}
SELECT REGEXP_SUBSTR('chocolate Chip cookies', 'c+.{2}', 4, 2, 'i') FROM foodmart.product
{code}
For BigQuery it will be unparsed into :
{code:sql}
-- match parameter 'i' becomes the inline flag (?i); occurrence 2 is the zero-based OFFSET(1)
SELECT REGEXP_EXTRACT_ALL(SUBSTR('chocolate Chip cookies', 4), '(?i)c+.{2}')[OFFSET(1)] FROM foodmart.product
{code}
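
The four cases above follow a single rule, which could be sketched in Java roughly as below. This is only an illustration under stated assumptions, not the actual Calcite change: RegexpSubstrSketch, toBigQuery and withInlineFlags are hypothetical names, the operands are assumed to be already-unparsed SQL text, and the pattern and match parameter are assumed to be single-quoted literals. The array index is emitted as the expression "occurrence - 1" (OFFSET is zero-based), so it also works when the occurrence is a column rather than a literal.

```java
import java.util.List;

/** Hypothetical helper, not part of Calcite: renders REGEXP_SUBSTR operands as BigQuery SQL text. */
final class RegexpSubstrSketch {
  private RegexpSubstrSketch() {}

  /**
   * Operands are already-unparsed SQL fragments in REGEXP_SUBSTR order:
   * source, pattern[, position[, occurrence[, matchParam]]].
   */
  static String toBigQuery(List<String> operands) {
    int n = operands.size();
    if (n < 2 || n > 5) {
      throw new IllegalArgumentException("REGEXP_SUBSTR takes 2 to 5 operands, got " + n);
    }
    String source = operands.get(0);
    String pattern = operands.get(1);

    // 3 or more operands: start matching at <position> by trimming the source first.
    if (n >= 3) {
      source = "SUBSTR(" + source + ", " + operands.get(2) + ")";
    }
    // 5 operands: fold the match parameter into the pattern as an inline flag group.
    if (n == 5) {
      pattern = withInlineFlags(pattern, operands.get(4));
    }
    // 2 or 3 operands: the first match is enough.
    if (n <= 3) {
      return "REGEXP_EXTRACT(" + source + ", " + pattern + ")";
    }
    // 4 or 5 operands: pick the <occurrence>-th match; OFFSET is zero-based.
    return "REGEXP_EXTRACT_ALL(" + source + ", " + pattern + ")[OFFSET("
        + operands.get(3) + " - 1)]";
  }

  /** Assumes both arguments are single-quoted SQL string literals, e.g. 'c+.{2}' and 'i'. */
  private static String withInlineFlags(String patternLiteral, String flagsLiteral) {
    String flags = flagsLiteral.substring(1, flagsLiteral.length() - 1);
    // Re-open the quote, insert (?flags), then append the pattern without its opening quote.
    return "'(?" + flags + ")" + patternLiteral.substring(1);
  }
}
```

For the example query in the issue description, this sketch would produce REGEXP_EXTRACT_ALL(SUBSTR('chocolate Chip cookies', 1), '(?i)c+.{2}')[OFFSET(product_id - 1)]. Two caveats: OFFSET raises an error when the index is out of range whereas SAFE_OFFSET returns NULL, and not every match parameter of the source dialect has a same-letter inline flag in BigQuery's RE2 syntax, so only 'i' is covered here.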


> Cannot parse REGEXP_SUBSTR in BigQuery
> --------------------------------------
>
>                 Key: CALCITE-3415
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3415
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Pranay Parmar
>            Priority: Minor
>
> REGEXP_SUBSTR error :
> {code:java}
> No match found for function signature REGEXP_SUBSTR(<CHARACTER>, <CHARACTER>, 
> [<INT>, <INT>, <CHARACTER>]){code}
>  
> Example query:
> {code:sql}
> SELECT REGEXP_SUBSTR('chocolate Chip cookies', 'c+.{2}', 1, product_id, 'i')
> FROM public.account{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
