
GitBox Wed, 24 Jul 2019 16:39:49 -0700

dene14 opened a new issue #8155: Historical node periodic timeouts for
/druid/historical/v1/readiness
URL: https://github.com/apache/incubator-druid/issues/8155


### Affected Version
0.14.2

### Description

Please include as much detailed information about the problem as possible.
- Cluster size:
  - 7 Historicals x 500 GB HDD x 16 GB RAM
  - 3 Brokers
  - 6 MiddleManagers x 2 workers (24 GB RAM per MiddleManager)

Running in Docker on Kubernetes.
Historical nodes stop responding to the `readiness` endpoint (connection
timeout) approximately an hour after start. As a result, the container gets
killed. I suspect some sort of periodic routine is running in the
Historical, but unfortunately nothing relevant is logged. I found a similar
issue, but it is relatively old (Druid 0.8 era):
https://groups.google.com/forum/#!topic/druid-user/vPcffwqf7d4
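
To see when the endpoint starts timing out, and whether the stalls line up with anything periodic, the endpoint can be polled from outside Kubernetes with timings recorded. This is only a minimal sketch: the helper names `check_readiness` and `poll`, the timeout, and the polling interval are all illustrative assumptions and are not part of Druid.

```python
import time
import urllib.error
import urllib.request


def check_readiness(url, timeout_s=5.0, opener=urllib.request.urlopen):
    """Return (status, elapsed_seconds).

    status is the HTTP status code, or None if the request timed out
    or the connection failed.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status code.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar errors.
        status = None
    return status, time.monotonic() - start


def poll(url, count, interval_s=10.0):
    """Probe the URL `count` times and print one timestamped line per probe."""
    for _ in range(count):
        status, elapsed = check_readiness(url)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} status={status} elapsed={elapsed:.3f}s")
        time.sleep(interval_s)
```

For example, `poll("http://<historical-host>:<port>/druid/historical/v1/readiness", count=720)` would log roughly two hours of probes; the host and port are placeholders to fill in for the actual cluster.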

Recently I had to change the ingestion spec for one of my datasources in
order to add datasketches. `maxRowsPerSegment` was reduced from 500K to 50K;
otherwise it was allocating too many littleBufs (>120K) and dying with "Not
enough direct memory". As a result of that change, the number of segments (or
segment partitions) jumped dramatically. I suspect the two are
connected somehow.
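
As a rough illustration of why the segment count jumps, here is a back-of-the-envelope estimate. It assumes the number of partitions per time interval is approximately the row count divided by `maxRowsPerSegment`, rounded up; the row count used is hypothetical, not a figure from this cluster.

```python
def estimated_partitions(rows_per_interval, max_rows_per_segment):
    """Approximate partition count: ceiling of rows / max rows per segment."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-rows_per_interval // max_rows_per_segment)


# Hypothetical number of rows in one time interval.
rows = 5_000_000

before = estimated_partitions(rows, 500_000)
after = estimated_partitions(rows, 50_000)
print(f"partitions per interval: before={before}, after={after}")
```

Under this assumption, a tenfold reduction in `maxRowsPerSegment` gives roughly ten times as many partitions per interval.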

![image](https://user-images.githubusercontent.com/7289205/61835622-3f749880-ae85-11e9-85e1-d851ff84ea77.png)

Any advice is much appreciated. If any additional information is required,
please let me know.


