acherla opened a new issue, #12702:
URL: https://github.com/apache/druid/issues/12702
### Affected Version
0.22.1
### Description
When an `index_parallel` task is executed against a GCS bucket to ingest a large
file, the task's subtasks fail with the error below while attempting to read the
source file from GCS:
```
2022-06-24T02:21:22,499 WARN [task-runner-0-priority-0] org.apache.druid.java.util.common.RetryUtils - Retrying (1 of 10) in 879ms.
java.io.IOException: Connection closed prematurely: bytesRead = 638213054, Content-Length = 904600641
	at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.throwIfFalseEOF(NetHttpResponse.java:209) ~[google-http-client-1.26.0.jar:?]
	at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.read(NetHttpResponse.java:171) ~[google-http-client-1.26.0.jar:?]
	at com.google.common.io.CountingInputStream.read(CountingInputStream.java:62) ~[guava-16.0.1.jar:?]
	at org.apache.druid.data.input.impl.RetryingInputStream.read(RetryingInputStream.java:144) [druid-core-0.22.1.jar:0.22.1]
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284) [?:1.8.0_275]
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326) [?:1.8.0_275]
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178) [?:1.8.0_275]
	at java.io.InputStreamReader.read(InputStreamReader.java:184) [?:1.8.0_275]
	at java.io.BufferedReader.fill(BufferedReader.java:161) [?:1.8.0_275]
	at java.io.BufferedReader.readLine(BufferedReader.java:324) [?:1.8.0_275]
	at java.io.BufferedReader.readLine(BufferedReader.java:389) [?:1.8.0_275]
	at org.apache.commons.io.LineIterator.hasNext(LineIterator.java:96) [commons-io-2.11.0.jar:2.11.0]
	at org.apache.druid.data.input.TextReader$1.hasNext(TextReader.java:73) [druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.data.input.IntermediateRowParsingReader$1.hasNext(IntermediateRowParsingReader.java:60) [druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIteratorIfNecessary(CloseableIterator.java:74) [druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.java.util.common.parsers.CloseableIterator$2.next(CloseableIterator.java:108) [druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.java.util.common.parsers.CloseableIterator$1.next(CloseableIterator.java:52) [druid-core-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.FilteringCloseableInputRowIterator.hasNext(FilteringCloseableInputRowIterator.java:68) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:375) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:209) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:159) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:471) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:443) [druid-indexing-service-0.22.1.jar:0.22.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
2022-06-24T02:21:23,379 WARN [task-runner-0-priority-0] org.apache.druid.java.util.common.RetryUtils - Retrying (2 of 10) in 1,924ms.
java.io.IOException: Connection closed prematurely: bytesRead = 638213054, Content-Length = 904600641
	[identical stack trace repeated]
```
### Steps to Reproduce
1. Upload a 4GB or larger file to a GCS bucket
2. Run an index_parallel job to index the file
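The repro steps above can be sketched as the following task submission. This is an illustration, not the reporter's actual spec: the bucket, object, datasource, and column names are placeholders, and it assumes the `google` input source from Druid's GCS extension.

```shell
# Hypothetical repro sketch: submit an index_parallel task that ingests a
# single large object from GCS. All names below are placeholders.
cat > /tmp/index_parallel_spec.json <<'EOF'
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "example_datasource",
      "timestampSpec": {"column": "timestamp", "format": "auto"},
      "dimensionsSpec": {"dimensions": []},
      "granularitySpec": {"segmentGranularity": "day", "queryGranularity": "none"}
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "google",
        "uris": ["gs://example-bucket/large-file.json"]
      },
      "inputFormat": {"type": "json"}
    },
    "tuningConfig": {"type": "index_parallel", "maxNumConcurrentSubTasks": 4}
  }
}
EOF

# Submit the spec to the Overlord task endpoint (adjust host/port):
# curl -X POST -H 'Content-Type: application/json' \
#      -d @/tmp/index_parallel_spec.json \
#      http://localhost:8081/druid/indexer/v1/task
```

With a source object over the ~4 GB threshold noted above, the subtasks reading this input hit the premature-close error and cycle through the retries shown in the log.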
### Workarounds
1. Split the source data into multiple smaller files before running the Druid `index_parallel` ingestion job.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]