[ 
https://issues.apache.org/jira/browse/BEAM-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117483#comment-17117483
 ] 

Darshan Jani edited comment on BEAM-10074 at 5/27/20, 7:54 AM:
---------------------------------------------------------------

Hi Rui,
It would be nice if the hashing functions could be part of the built-in 
functions, as they are in BQ.
Writing our own UDF does work.
The only limitation I see is that when we want to use SerializableFunction, we 
need a public class derived from it that implements the apply method. We 
cannot use a local variable or a lambda function instead, as we can in other 
transforms such as MapElements ...
I think that is a Calcite limitation.

On a side note, I feel the official BeamSQL pages should document how to write 
UDFs and register them. I also see that the official BeamSQL documentation is 
outdated and that not all functions are documented there. It would be good to 
provide example usage along with the list of functions. For example: 
https://beam.apache.org/documentation/dsls/sql/calcite/aggregate-functions/







> Hash Functions in BeamSQL
> -------------------------
>
>                 Key: BEAM-10074
>                 URL: https://issues.apache.org/jira/browse/BEAM-10074
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Darshan Jani
>            Assignee: Darshan Jani
>            Priority: P2
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I would like to propose the following hash functions (implemented as UDFs). 
> Optionally, we can also add variants of these functions that return a hex 
> string instead of bytes. 
> # MD5
> Calculates an MD5 128-bit checksum of a string or bytes and returns it as 
> bytes.
> {code:java}
> SELECT MD5("Some String") as md5;
> {code}
> # SHA1
> Calculates a SHA-1 hash value of a string or bytes and returns it as bytes.
> {code:java}
> SELECT SHA1("Some String") as sha1;
> {code}
> # SHA256
> Calculates a SHA-256 hash value of a string or bytes and returns it as bytes.
> {code:java}
> SELECT SHA256("Some String") as sha256;
> {code}
> # SHA512
> Calculates a SHA-512 hash value of a string or bytes and returns it as bytes.
> {code:java}
> SELECT SHA512("Some String") as sha512;
> {code}
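
As a rough illustration of the semantics proposed above, here is a minimal sketch of the four hash functions in plain Java, using only java.security.MessageDigest from the JDK. The class name HashFunctions, the method names, and the toHex helper are hypothetical and are not the actual Beam implementation. The sketch also assumes string inputs are encoded as UTF-8 before hashing, which the proposal does not specify. Registering these as BeamSQL UDFs is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; names are illustrative, not Beam's API.
final class HashFunctions {
    private HashFunctions() {}

    // Each function hashes the UTF-8 bytes of the input (an assumption)
    // and returns the raw digest bytes, as in the proposal.
    static byte[] md5(String input) {
        return digest("MD5", input.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha1(String input) {
        return digest("SHA-1", input.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha256(String input) {
        return digest("SHA-256", input.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha512(String input) {
        return digest("SHA-512", input.getBytes(StandardCharsets.UTF_8));
    }

    // Variant for inputs that are already bytes.
    static byte[] digest(String algorithm, byte[] input) {
        try {
            return MessageDigest.getInstance(algorithm).digest(input);
        } catch (NoSuchAlgorithmException e) {
            // Not expected for these four algorithms on a standard JDK.
            throw new IllegalStateException("Missing algorithm: " + algorithm, e);
        }
    }

    // Optional "hex string" variant: lowercase hex encoding of the digest.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

For example, HashFunctions.toHex(HashFunctions.md5("Some String")) would give the hex-string variant of the MD5 call shown in the description, while HashFunctions.md5("Some String") alone returns the 16 raw digest bytes.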



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
