lingguang opened a new issue, #12434:
URL: https://github.com/apache/druid/issues/12434
GroupBy query with DST timezone hangs in Historical server.
### Affected Version
At least from 0.20.0 to master
### Description
We found that some native groupBy queries hung on a Historical server and
caused query timeouts.
The hung query threads took up all merge buffers, so we had to kill and
restart the Historical server.
We did some debugging and found the cause:
PeriodGranularity may not calculate bucketStart correctly when it is used
with a time zone that observes DST.
Here is the relevant part of our groupBy query:
"granularity": {
  "type": "period",
  "period": "P7D",
  "timeZone": "America/Denver",
  "origin": "2022-03-27T02:35:00.000-06:00"
},
"intervals": [
  "2022-03-06T02:35:00.000-07:00/2022-03-06T03:45:00.000-07:00"
]
Check this code:
org.apache.druid.java.util.common.granularity.PeriodGranularity.truncate(long)
Left: current, Right: possible fix

Check the output from our test program (attached):
[TestPeriod.zip](https://github.com/apache/druid/files/8482207/TestPeriod.zip)
origin           2022-03-27T02:35:00.000-06:00
segment min      2022-03-06T02:40:00.000-07:00
bucketStart      2022-03-06T03:35:00.000-07:00
-21D (expected)  2022-03-06T02:35:00.000-07:00
-14D             2022-03-13T03:35:00.000-06:00
-14D-7D          2022-03-06T03:35:00.000-07:00
DST Transition   2022-03-13T03:00:00.000-06:00
Due to the DST transition, subtracting 21 days from the origin in one step
does not give the same result as subtracting 14 days and then 7 days. The
same issue should occur with Week/Month periods.
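
The difference between the two subtraction paths can be reproduced outside Druid. The following is a minimal sketch, not the attached test program: it uses `java.time` instead of the Joda-Time classes that PeriodGranularity works with, on the assumption that both libraries move a local time that falls into the DST gap forward by one hour. The class and method names (`DstPeriodDemo`, `oneStep`, `twoSteps`) are made up for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class DstPeriodDemo {
    static final ZoneId DENVER = ZoneId.of("America/Denver");

    // Origin from the query's granularity spec, viewed in the Denver time zone.
    static ZonedDateTime origin() {
        return OffsetDateTime.parse("2022-03-27T02:35:00.000-06:00")
                .atZoneSameInstant(DENVER);
    }

    // Subtract three 7-day periods in a single step: lands on 02:35 local time, UTC-7.
    static ZonedDateTime oneStep() {
        return origin().minusDays(21);
    }

    // Subtract 14 days, then 7 days. The intermediate local time 02:35 on
    // 2022-03-13 does not exist (DST gap), so it is moved to 03:35, and the
    // extra hour is carried into the final result.
    static ZonedDateTime twoSteps() {
        return origin().minusDays(14).minusDays(7);
    }

    public static void main(String[] args) {
        ZonedDateTime segmentMin = OffsetDateTime.parse("2022-03-06T02:40:00.000-07:00")
                .atZoneSameInstant(DENVER);

        System.out.println("origin        " + origin());
        System.out.println("-21D          " + oneStep());
        System.out.println("-14D          " + origin().minusDays(14));
        System.out.println("-14D-7D       " + twoSteps());

        // A bucket start must not be later than the timestamp it is supposed to contain.
        System.out.println("-21D    after segment min? " + oneStep().isAfter(segmentMin));
        System.out.println("-14D-7D after segment min? " + twoSteps().isAfter(segmentMin));
    }
}
```

With the two-step result, the computed bucket start is one hour later than the expected one and falls after the segment's minimum timestamp, which matches the bucketStart value in the output above.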
In our case, this causes the granulizer to select nothing for the
bucketInterval in the following code:
VectorGroupByEngine$VectorGroupByEngineIterator.initNewDelegate()

This finally causes an infinite loop in (the selected while loop):
VectorGroupByEngine$VectorGroupByEngineIterator.hasNext()

The query thread takes 100% CPU and never exits. After all merge buffers
have been taken by these threads, no more groupBy queries can be executed on
this Historical server. We had to kill the process and restart it.