[ https://issues.apache.org/jira/browse/BEAM-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330099#comment-16330099 ]
Bei Zhang commented on BEAM-3487:
---------------------------------

Also at one point it throws an error "Refusing to split <at position ShufflePosition(base64:AAAAAmGGbvYAAQ) of shuffle range [ShufflePosition(base64:AAAAAlgn1NIAAQ), ShufflePosition(base64:AAAAAmGGbvcAAQ))> at ShufflePosition(base64:AAAAAmGGbvcAAQ): proposed split position out of range". This happens when using {{WriteFiles.to}}.

> GroupByKey stalls with GroupingShuffleReader split refusals
> -----------------------------------------------------------
>
>                 Key: BEAM-3487
>                 URL: https://issues.apache.org/jira/browse/BEAM-3487
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.2.0
>            Reporter: Bei Zhang
>            Assignee: Thomas Groh
>            Priority: Major
>
> With info messages with something like:
> {quote}{{Refused to split GroupingShuffleReader <unstarted in shuffle range [ShufflePosition(base64:AAAAA1CWNvgAAQ), ShufflePosition(base64:AAAAA4sOz1AAAQ))> at ShufflePosition(base64:AAAAA1CWNvkAAQ)}}{quote}
>
> The lull messages look like this:
> {quote}{{Processing lull for PT300.006S in state read-shuffle of Write Vectors2/GroupIntoShards/Read
> at com.google.cloud.dataflow.worker.ApplianceShuffleReader.readIncludingPosition(Native Method)
> at com.google.cloud.dataflow.worker.ChunkingShuffleBatchReader.read(ChunkingShuffleBatchReader.java:62)
> at com.google.cloud.dataflow.worker.util.common.worker.CachingShuffleBatchReader$1.load(CachingShuffleBatchReader.java:57)
> at com.google.cloud.dataflow.worker.util.common.worker.CachingShuffleBatchReader$1.load(CachingShuffleBatchReader.java:53)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
> at com.google.cloud.dataflow.worker.repackaged.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
> at com.google.cloud.dataflow.worker.util.common.worker.CachingShuffleBatchReader.read(CachingShuffleBatchReader.java:76)
> at com.google.cloud.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.fillEntries(BatchingShuffleEntryReader.java:133)
> at com.google.cloud.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.fillEntriesIfNeeded(BatchingShuffleEntryReader.java:126)
> at com.google.cloud.dataflow.worker.util.common.worker.BatchingShuffleEntryReader$ShuffleReadIterator.hasNext(BatchingShuffleEntryReader.java:90)
> at com.google.cloud.dataflow.worker.util.common.ForwardingReiterator.hasNext(ForwardingReiterator.java:62)
> at com.google.cloud.dataflow.worker.util.common.worker.GroupingShuffleEntryIterator.advance(GroupingShuffleEntryIterator.java:118)
> at com.google.cloud.dataflow.worker.GroupingShuffleReader$GroupingShuffleReaderIterator.advance(GroupingShuffleReader.java:230)
> at com.google.cloud.dataflow.worker.GroupingShuffleReader$GroupingShuffleReaderIterator.start(GroupingShuffleReader.java:224)
> at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:347)
> at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:183)
> at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:148)
> at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:68)
> at com.google.cloud.dataflow.worker.DataflowWorker.executeWork(DataflowWorker.java:330)
> at com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:302)
> at com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:251)
> at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
> at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
> at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)}}{quote}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
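[Editor's note] For context on the {{WriteFiles.to}} mention above: {{WriteFiles}} is the internal transform that user-facing file sinks such as {{TextIO.write()}} expand into, and its {{GroupIntoShards}} {{GroupByKey}} is the "Write Vectors2/GroupIntoShards/Read" step named in the lull trace. A minimal sketch (not from the report; the output path, shard count, and input data are illustrative assumptions) of a pipeline whose shuffle read follows this code path:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class WriteShardsExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply(Create.of("a", "b", "c"))  // illustrative input
        // TextIO.write() expands into WriteFiles; withNumShards forces the
        // GroupIntoShards GroupByKey whose shuffle read appears in the
        // stack trace (GroupingShuffleReader on the Dataflow runner).
        .apply(TextIO.write()
            .to("gs://my-bucket/vectors")  // hypothetical output prefix
            .withNumShards(3));

    p.run().waitUntilFinish();
  }
}
```

Run on the Dataflow runner, the shuffle behind this {{GroupByKey}} is where the split refusals and lulls above are reported; this sketch only shows how such a stage arises, not a confirmed reproduction.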