[ 
https://issues.apache.org/jira/browse/BEAM-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-14094:
-------------------------------
    Fix Version/s: 2.39.0
         Assignee: Brachi Packter
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Fix null pointer exception in HllCountInitFn
> --------------------------------------------
>
>                 Key: BEAM-14094
>                 URL: https://issues.apache.org/jira/browse/BEAM-14094
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, extensions-java-sketching
>            Reporter: Brachi Packter
>            Assignee: Brachi Packter
>            Priority: P3
>             Fix For: 2.39.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When aggregating input that contains a null value, the pipeline fails with a 
> NullPointerException. At high event rates we sometimes receive "broken" 
> events, and I don't want them to break the whole pipeline.
> trace:
>   exception: "java.lang.NullPointerException
>       at com.google.zetasketch.HyperLogLogPlusPlus.add(HyperLogLogPlusPlus.java:212)
>       at org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:147)
>       at org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:137)
>       at org.apache.beam.sdk.extensions.sql.impl.transform.agg.AggregationCombineFnAdapter$WrappedCombinerBase.addInput(AggregationCombineFnAdapter.java:54)
>       at org.apache.beam.sdk.transforms.CombineFns$ComposedCombineFn.addInput(CombineFns.java:382)
>       at org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:324)
>       at org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:63)
>       at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillCombiningState.add(WindmillStateInternals.java:2056)
>       at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SystemReduceFn.processValue(SystemReduceFn.java:119)
>       at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:613)
>       at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:360)
>       at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:96)
>       at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:43)
>       at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:121)
>       at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
>       at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
>       at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:137)
>       at org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
>       at org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
>       at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:212)
>       at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:163)
>       at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:92)
>       at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1437)
>       at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1100(StreamingDataflowWorker.java:165)
>       at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:1113)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
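
The top frames of the trace show the exception being thrown in
HyperLogLogPlusPlus.add, called from HllCountInitFn$ForString.addInput. One
possible way to keep a null element from failing the pipeline is to skip it
before it reaches the sketch. The following is only a minimal sketch of that
idea and is not necessarily the change that shipped in 2.39.0. The class
NullSafeDistinctCount and its methods are hypothetical, and a HashSet stands in
for the HyperLogLog++ sketch so that the example runs without Beam or
ZetaSketch on the classpath.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a combiner; this is NOT Beam's HllCountInitFn. */
class NullSafeDistinctCount {

  /** Assumption: an exact HashSet replaces the HLL++ sketch accumulator. */
  static Set<String> createAccumulator() {
    return new HashSet<>();
  }

  /** Adds one input to the accumulator, ignoring null ("broken") elements. */
  static Set<String> addInput(Set<String> accumulator, String input) {
    if (input == null) {
      // Skip the element instead of passing null on to the sketch.
      return accumulator;
    }
    accumulator.add(input);
    return accumulator;
  }

  /** Folds all inputs into one accumulator and returns its distinct count. */
  static long countDistinct(List<String> inputs) {
    Set<String> accumulator = createAccumulator();
    for (String input : inputs) {
      accumulator = addInput(accumulator, input);
    }
    return accumulator.size();
  }

  public static void main(String[] args) {
    // The null element is skipped; "a" and "b" are the distinct values.
    System.out.println(countDistinct(Arrays.asList("a", null, "b", "a")));
  }
}
```

Skipping nulls silently changes what the aggregate counts, so an alternative
is to filter or reroute such events upstream of the aggregation; which option
fits depends on the pipeline.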



--
This message was sent by Atlassian Jira
(v8.20.1#820001)