didip opened a new issue #11468: URL: https://github.com/apache/druid/issues/11468
single_phase_sub_task failed without any logs.

### Affected Version

0.21.1

### Description

single_phase_sub_task seemed to fail randomly, with no explanation of why. See the screenshot of the failure.

<img width="549" alt="Screen Shot 2021-07-19 at 11 56 39 AM" src="https://user-images.githubusercontent.com/72918/126212534-33c79d18-0a47-46f7-bc24-61e37d5f8f81.png">

The task looked fine, with no errors in the log. The log simply got cut off like this:

```
2021-07-19T18:55:03,320 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory in 107 ms. row count = 1679
2021-07-19T18:55:04,369 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - Assembled and processed 143107 records from 1300 columns in 98891 ms: 1.4471185 rec/ms, 1881.2542 cell/ms
2021-07-19T18:55:04,369 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - time spent so far 1% reading (1669 ms) and 98% processing (98891 ms)
2021-07-19T18:55:04,369 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - at row 143107. reading next block
2021-07-19T18:55:04,409 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory in 40 ms. row count = 1209
2021-07-19T18:55:05,155 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - Assembled and processed 144316 records from 1300 columns in 99584 ms: 1.4491886 rec/ms, 1883.9452 cell/ms
2021-07-19T18:55:05,155 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - time spent so far 1% reading (1709 ms) and 98% processing (99584 ms)
2021-07-19T18:55:05,155 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - at row 144316. reading next block
2021-07-19T18:55:05,819 INFO [task-runner-0-priority-0] org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory in 664 ms. row count = 24938
```

When I exec into the pod, the middlemanager looks healthy, and so do the peons, each consuming 20GB of RAM. The task runner mode is `httpRemote`.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
