[
https://issues.apache.org/jira/browse/BEAM-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117483#comment-17117483
]
Darshan Jani edited comment on BEAM-10074 at 5/27/20, 7:54 AM:
---------------------------------------------------------------
Hi Rui,
It would be nice if hashing functions could be part of the built-in functions,
as they are in BQ.
Writing our own UDF does work.
The only limitation I see is that when we want to use SerializableFunction, we
need a public class derived from it that implements the apply method. We cannot
use a local variable or a lambda function instead, as we can in other
transforms like MapElements ...
I think that is a Calcite limitation.
On a side note, I feel there should be documentation on the official BeamSQL
pages on how to write UDFs and register them. I also see that the official
BeamSQL documentation is outdated and that not all functions are documented
there. It would be good to provide example usage along with the list of
functions. For example:
https://beam.apache.org/documentation/dsls/sql/calcite/aggregate-functions/
> Hash Functions in BeamSQL
> -------------------------
>
> Key: BEAM-10074
> URL: https://issues.apache.org/jira/browse/BEAM-10074
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Darshan Jani
> Assignee: Darshan Jani
> Priority: P2
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I would like to propose the following hash functions (implemented as UDFs).
> Optionally, we can also add variants of these functions that return a hex
> string instead of bytes.
> # MD5
> Calculates the 128-bit MD5 checksum of a string or bytes value and returns it
> as bytes.
> {code:java}
> SELECT MD5("Some String") as md5;
> {code}
> # SHA1
> Calculates the SHA-1 hash of a string or bytes value and returns it as bytes.
> {code:java}
> SELECT SHA1("Some String") as sha1;
> {code}
> # SHA256
> Calculates the SHA-256 hash of a string or bytes value and returns it as bytes.
> {code:java}
> SELECT SHA256("Some String") as sha256;
> {code}
> # SHA512
> Calculates the SHA-512 hash of a string or bytes value and returns it as bytes.
> {code:java}
> SELECT SHA512("Some String") as sha512;
> {code}
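For illustration only, here is a minimal sketch of what the four proposed functions and the optional hex variant might compute, using only the JDK's java.security.MessageDigest. The class name HashFunctionsSketch, the method names, and the choice of UTF-8 for string inputs are assumptions made for this example; none of this is the implementation proposed in the issue, and registering the methods as BeamSQL UDFs is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class: not part of Beam, names chosen for this sketch.
final class HashFunctionsSketch {

    // Shared helper: hashes raw bytes with the named JDK algorithm.
    static byte[] digest(String algorithm, byte[] input) {
        try {
            return MessageDigest.getInstance(algorithm).digest(input);
        } catch (NoSuchAlgorithmException e) {
            // MD5, SHA-1 and SHA-256 are required on every Java platform;
            // SHA-512 is available in common JDKs.
            throw new IllegalStateException("Missing algorithm: " + algorithm, e);
        }
    }

    // String inputs are encoded as UTF-8 (an assumption of this sketch).
    static byte[] md5(String s) {
        return digest("MD5", s.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha1(String s) {
        return digest("SHA-1", s.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha256(String s) {
        return digest("SHA-256", s.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] sha512(String s) {
        return digest("SHA-512", s.getBytes(StandardCharsets.UTF_8));
    }

    // Optional variant: render the digest as a lowercase hex string.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes are not sign-extended.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "Some String";
        System.out.println("md5    = " + toHex(md5(input)));
        System.out.println("sha1   = " + toHex(sha1(input)));
        System.out.println("sha256 = " + toHex(sha256(input)));
        System.out.println("sha512 = " + toHex(sha512(input)));
    }
}
```

The bytes-returning methods mirror the return type described for each function above, and toHex shows how a hex-string variant could be layered on top without a second hashing path.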