kkrugler opened a new issue #6597:
URL: https://github.com/apache/incubator-pinot/issues/6597


   On a five-server cluster with one controller and one broker, running Pinot 
0.6.0, the following query caused the cluster to stop processing queries: 
`select distinctcount(column) from table`, where the column in question has 
very high cardinality (> 1B unique values out of 5B total records).
   
   No errors were logged by the controller, the broker, or any of the five 
server processes. Once the cluster became unresponsive, a new request (e.g. 
`select * from table limit 20`) was logged by the broker and the servers, but 
the broker's log entry shows that it never received a response from any server 
(`servers=0/5`):
   
   ```
   2021/02/19 22:21:53.860 INFO [BaseBrokerRequestHandler] [jersey-server-managed-async-executor-59] requestId=41163,table=crawldata_OFFLINE,timeMs=10000,docs=0/0,entries=0/0,segments(queried/processed/matched/consuming/unavailable):0/0/0/0/0,consumingFreshnessTimeMs=0,servers=0/5,groupLimitReached=false,brokerReduceTimeMs=0,exceptions=0,serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs);116.202.83.208_O=0,-1,0,0;168.119.147.123_O=0,-1,0,0;168.119.147.125_O=1,-1,0,0;168.119.147.124_O=1,-1,0,0;116.202.52.154_O=1,-1,0,0,query=select * from crawldata limit 20
   ```
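
For anyone checking similar broker log lines, the per-server fields can be pulled out with a short script. This is only a sketch: the field layout is taken from the log line above, the function names are hypothetical, and treating `ResponseDelayMs=-1` as "no response recorded" is an assumption that happens to agree with `servers=0/5`.

```python
import re

# Broker log entry from the report above (timestamp and thread name omitted).
LOG_LINE = (
    "requestId=41163,table=crawldata_OFFLINE,timeMs=10000,docs=0/0,entries=0/0,"
    "segments(queried/processed/matched/consuming/unavailable):0/0/0/0/0,"
    "consumingFreshnessTimeMs=0,servers=0/5,groupLimitReached=false,"
    "brokerReduceTimeMs=0,exceptions=0,"
    "serverStats=(Server=SubmitDelayMs,ResponseDelayMs,ResponseSize,DeserializationTimeMs);"
    "116.202.83.208_O=0,-1,0,0;168.119.147.123_O=0,-1,0,0;168.119.147.125_O=1,-1,0,0;"
    "168.119.147.124_O=1,-1,0,0;116.202.52.154_O=1,-1,0,0,"
    "query=select * from crawldata limit 20"
)


def parse_server_stats(line):
    """Return {server: {column: value}} parsed from the serverStats section."""
    match = re.search(r"serverStats=\(Server=([^)]*)\);(.*?),query=", line)
    if match is None:
        return {}
    columns = match.group(1).split(",")
    stats = {}
    for entry in match.group(2).split(";"):
        server, _, values = entry.partition("=")
        stats[server] = dict(zip(columns, (int(v) for v in values.split(","))))
    return stats


def servers_without_response(line):
    """List servers whose ResponseDelayMs is negative (assumed: no response seen)."""
    return [
        server
        for server, columns in parse_server_stats(line).items()
        if columns["ResponseDelayMs"] < 0
    ]


stats = parse_server_stats(LOG_LINE)
silent = servers_without_response(LOG_LINE)
print(f"{len(silent)} of {len(stats)} servers have no recorded response: {silent}")
```

Run against the line above, every one of the five servers is reported as silent, which matches the broker's own `servers=0/5` summary.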
   
   @Jackie-Jiang suspected that the very large response from each server 
(on the order of 200M unique strings, each 64 characters long) caused a 
problem in the transport layer, even though Netty did not log any error:
   
   > Based on the log you posted, server side processed the second query 
without any issue, but broker didn't receive the response, and that's why I 
suspect something is broken in the transport layer
   
   > We rely on netty to transport data, maybe we hit some limitation in netty, 
but netty didn’t trigger the exception callback
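
To give a sense of scale, here is a back-of-the-envelope estimate of the response size. It is a sketch under stated assumptions: one byte per character, no per-value or serialization overhead, and all five servers returning a similar amount. The comparison against a signed 32-bit length is purely illustrative; nothing in this report establishes which limit, if any, was actually hit.

```python
# Figures from the report: ~200M unique strings per server, 64 characters each.
UNIQUE_VALUES_PER_SERVER = 200_000_000
CHARS_PER_VALUE = 64
SERVERS = 5

# Assumption: 1 byte per character, no framing or per-value overhead.
BYTES_PER_CHAR = 1

# Largest value a signed 32-bit length field can hold (illustrative reference only).
INT32_MAX = 2**31 - 1


def payload_bytes(values, chars_per_value, bytes_per_char=BYTES_PER_CHAR):
    """Raw string payload in bytes, ignoring any serialization overhead."""
    return values * chars_per_value * bytes_per_char


per_server = payload_bytes(UNIQUE_VALUES_PER_SERVER, CHARS_PER_VALUE)
total = per_server * SERVERS

print(f"per server:  {per_server / 2**30:.1f} GiB")
print(f"all servers: {total / 2**30:.1f} GiB")
print(f"per-server payload exceeds a signed 32-bit length: {per_server > INT32_MAX}")
```

Under these assumptions each server would return about 12.8 GB of raw string data, several times more than a signed 32-bit length can describe.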
   
   After I restarted the broker process, the cluster worked again, which 
indicates that the issue was indeed caused by some invalid state in the 
broker.

