Brachi Packter created BEAM-14094:
-------------------------------------

             Summary: Fix null pointer exception in HllCountInitFn
                 Key: BEAM-14094
                 URL: https://issues.apache.org/jira/browse/BEAM-14094
             Project: Beam
          Issue Type: Bug
          Components: dsl-sql, extensions-java-sketching
            Reporter: Brachi Packter


When trying to aggregate input with null value we will fail on null pointer 
exception, when we deal with high events rate we can have sometimes "broken" 
events  and I don't want them the break all the pipline.

trace:
  exception: "java.lang.NullPointerException
        at 
com.google.zetasketch.HyperLogLogPlusPlus.add(HyperLogLogPlusPlus.java:212)
        at 
org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:147)
        at 
org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:137)
        at 
org.apache.beam.sdk.extensions.sql.impl.transform.agg.AggregationCombineFnAdapter$WrappedCombinerBase.addInput(AggregationCombineFnAdapter.java:54)
        at 
org.apache.beam.sdk.transforms.CombineFns$ComposedCombineFn.addInput(CombineFns.java:382)
        at 
org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:324)
        at 
org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:63)
        at 
org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillCombiningState.add(WindmillStateInternals.java:2056)
        at 
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SystemReduceFn.processValue(SystemReduceFn.java:119)
        at 
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:613)
        at 
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:360)
        at 
org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:96)
        at 
org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:43)
        at 
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:121)
        at 
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
        at 
org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
        at 
org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:137)
        at 
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
        at 
org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
        at 
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:212)
        at 
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:163)
        at 
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:92)
        at 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1437)
        at 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1100(StreamingDataflowWorker.java:165)
        at 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:1113)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to