[ https://issues.apache.org/jira/browse/BEAM-12393?focusedWorklogId=665961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665961 ]
ASF GitHub Bot logged work on BEAM-12393:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Oct/21 00:48
Start Date: 17/Oct/21 00:48
Worklog Time Spent: 10m
Work Description: apilloud commented on a change in pull request #15728:
URL: https://github.com/apache/beam/pull/15728#discussion_r730048606
##########
File path:
sdks/java/extensions/zetasketch/src/main/java/org/apache/beam/sdk/extensions/zetasketch/HllCount.java
##########
@@ -279,6 +279,10 @@ private Builder(HllCountInitFn<InputT, ?> initFn) {
public <K> Combine.PerKey<K, InputT, byte[]> perKey() {
return Combine.perKey(initFn);
}
+
+ public HllCountInitFn<InputT, ?> asUdaf() {
Review comment:
nit: You might have this return `Combine.CombineFn` and make it package
private (drop the `public`)?
##########
File path:
sdks/java/extensions/zetasketch/src/main/java/org/apache/beam/sdk/extensions/zetasketch/ApproximateCountDistinct.java
##########
@@ -99,6 +99,10 @@
.build();
}
+ public static <T> HllCountInitFn<T, ?> getUdaf(TypeDescriptor<T> input) {
Review comment:
nit: You might have this return `Combine.CombineFn`
##########
File path:
sdks/java/extensions/zetasketch/src/main/java/org/apache/beam/sdk/extensions/zetasketch/HllCount.java
##########
@@ -279,6 +279,10 @@ private Builder(HllCountInitFn<InputT, ?> initFn) {
public <K> Combine.PerKey<K, InputT, byte[]> perKey() {
return Combine.perKey(initFn);
}
+
+ public HllCountInitFn<InputT, ?> asUdaf() {
Review comment:
If this interface is package private, returning Combine.CombineFn
doesn't really matter. If it is public, returning a more generic interface
allows you to change the internals of the package without affecting the return
type. To be more specific: If you change HllCountInitFn, anything calling into
the package will have to recompile or will experience a NoSuchMethodError. If
the return type is Combine.CombineFn, you can change HllCountInitFn without
requiring calling packages to recompile.
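
A minimal sketch of the point above, not Beam code: `CombineFn` here is a
hypothetical stand-in for `Combine.CombineFn`, and `DistinctInitFn` is a
hypothetical internal class in the role of `HllCountInitFn` (it counts
exactly with a `HashSet` instead of building an HLL sketch). The JVM links
calls by method descriptor, which includes the return type, so a public
method that names only the general type keeps the same descriptor when the
internal class is renamed or replaced.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ReturnTypeSketch {

  // Hypothetical stand-in for Beam's Combine.CombineFn (not the real class).
  public abstract static class CombineFn<InputT, AccumT, OutputT> {
    public abstract AccumT createAccumulator();
    public abstract AccumT addInput(AccumT accumulator, InputT input);
    public abstract OutputT extractOutput(AccumT accumulator);
  }

  // Hypothetical internal class in the role of HllCountInitFn. It counts
  // distinct values exactly with a HashSet; it is not an HLL sketch.
  static final class DistinctInitFn<T> extends CombineFn<T, Set<T>, Long> {
    @Override
    public Set<T> createAccumulator() {
      return new HashSet<>();
    }

    @Override
    public Set<T> addInput(Set<T> accumulator, T input) {
      accumulator.add(input);
      return accumulator;
    }

    @Override
    public Long extractOutput(Set<T> accumulator) {
      return (long) accumulator.size();
    }
  }

  // Public entry point: only the general type appears in the signature, so
  // DistinctInitFn can change without altering this method's descriptor.
  public static <T> CombineFn<T, ?, Long> getUdaf() {
    return new DistinctInitFn<T>();
  }

  // Captures the wildcard accumulator type so a caller can drive the function
  // without knowing which accumulator the implementation uses.
  public static <T, A> Long run(CombineFn<T, A, Long> fn, Iterable<T> inputs) {
    A accumulator = fn.createAccumulator();
    for (T input : inputs) {
      accumulator = fn.addInput(accumulator, input);
    }
    return fn.extractOutput(accumulator);
  }

  public static void main(String[] args) {
    CombineFn<String, ?, Long> fn = ReturnTypeSketch.<String>getUdaf();
    System.out.println(run(fn, Arrays.asList("a", "b", "a"))); // prints 2
  }
}
```

Callers only ever see `CombineFn<T, ?, Long>`, which is the same shape as the
`Combine.CombineFn<T, ?, byte[]>` return type suggested in this review.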
##########
File path:
sdks/java/extensions/zetasketch/src/main/java/org/apache/beam/sdk/extensions/zetasketch/ApproximateCountDistinct.java
##########
@@ -99,6 +99,10 @@
.build();
}
+ public static <T> HllCountInitFn<T, ?> getUdaf(TypeDescriptor<T> input) {
Review comment:
This specifically worked for me:
`public static <T> Combine.CombineFn<T, ?, byte[]> getUdaf(TypeDescriptor<T> input) {`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------
Worklog Id: (was: 665961)
Time Spent: 0.5h (was: 20m)
> Beam SQL support for HLL count
> ------------------------------
>
> Key: BEAM-12393
> URL: https://issues.apache.org/jira/browse/BEAM-12393
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql, extensions-java-sketching
> Reporter: Brachi Packter
> Priority: P3
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> HLL sketches are supported as a PTransform:
>
> {code:java}
> p.apply("Int", Create.of(ints)).apply("IntHLL",
> ApproximateCountDistinct.globally()
> .withPercision(PRECISION));{code}
>
> or
>
> {code:java}
> PCollection<KV<Integer, Long>> result =
> p.apply("Long", Create.of(longs)).apply("LongHLL",
> ApproximateCountDistinct.perKey());
>
> {code}
> However, there is no support for it in Beam SQL.
> We can't instantiate it for use in SqlTransform (even though the combiner
> HllCountMergePartialFn exists).
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)