[
https://issues.apache.org/jira/browse/CALCITE-5580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699918#comment-17699918
]
Julian Hyde commented on CALCITE-5580:
--------------------------------------
You might find it interesting to note that with the addition of the {{SPLIT}}
function, the
[WordCount|http://blog.hydromatic.net/2020/03/31/word-count-revisited.html]
problem can be solved in pure SQL.
> Add SPLIT() Function (Enabled for BigQuery)
> -------------------------------------------
>
> Key: CALCITE-5580
> URL: https://issues.apache.org/jira/browse/CALCITE-5580
> Project: Calcite
> Issue Type: Improvement
> Reporter: Tanner Clary
> Assignee: Tanner Clary
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> BigQuery offers the {{SPLIT()}} function, which splits a string at an
> optionally specified delimiter into a string array. If no delimiter is
> specified, it defaults to a comma. If the string is empty, an array containing
> a single empty string is returned. If the delimiter is not found in the string,
> an array with a single element (the string) is returned.
> In BigQuery, the function can also accept bytes. To implement this, I think
> some modifications to ByteString.java may be required. I will probably leave
> this out of my initial draft at least. If anyone has suggestions or guidance
> on whether it should be supported, I would appreciate it.
> Documentation and example cases may be found below.
> EXAMPLE: {{SPLIT('h,e,l,l,o')}} would return: {{[h, e, l, l, o]}}.
> EXAMPLE: {{SPLIT('h-e-l-l-o', '-')}} would return: {{[h, e, l, l, o]}}.
> [BigQuery docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#split]
>
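The behaviour described in the issue can be modelled with a short Python sketch. This is only an illustration of the stated semantics (default comma delimiter, empty input, delimiter not found), not Calcite's implementation; the helper names {{split}} and {{word_count}} are hypothetical, and the sketch covers strings only, not bytes. An empty delimiter is not handled and raises ValueError in Python.

```python
from collections import Counter
from typing import Dict, List


def split(value: str, delimiter: str = ",") -> List[str]:
    """Hypothetical stand-in for SPLIT, following the issue description."""
    # str.split with an explicit, non-empty delimiter already behaves as described:
    #   - an empty input yields a list holding a single empty string
    #   - a delimiter that never occurs yields the whole input as the only element
    return value.split(delimiter)


def word_count(text: str) -> Dict[str, int]:
    """Count words by splitting on single spaces (a WordCount-style use of split)."""
    return dict(Counter(split(text, " ")))


if __name__ == "__main__":
    print(split("h,e,l,l,o"))
    print(split("h-e-l-l-o", "-"))
    print(word_count("to be or not to be"))
```

The {{word_count}} helper mirrors the idea in the comment above: once a string can be split into an array, counting words reduces to grouping and counting the elements.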
--
This message was sent by Atlassian Jira
(v8.20.10#820010)