soumitra-st opened a new issue, #13414:
URL: https://github.com/apache/pinot/issues/13414

   ### Issue
   The broker serializes BrokerResponseNative to a JSON String, and Jersey then 
copies that String into a HeapCharBuffer to send the response to the client. 
This leaves three copies of the same data on the heap at once: 
BrokerResponseNative, the JSON String, and the HeapCharBuffer. The JSON String 
is about three times the size of BrokerResponseNative in the heap analysis 
below, so together with the other objects the memory cost of a query response 
is high.
   
   ### Steps to reproduce the issue:
   
   1. Run `quick-start-batch.sh` with `-Xms2G -Xmx2G 
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath="<directory to dump heap>" 
-XX:+ExitOnOutOfMemoryError` to limit the max heap and dump heap on OOM error
   2. Execute concurrent queries against the `githubComplexTypeEvents` table to 
fetch a large amount of data. Run 10 queries concurrently in a loop:
   `select * from githubComplexTypeEvents limit 100000`
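
A minimal sketch of step 2, assuming Java 11+ and assuming the quick-start broker accepts SQL at `http://localhost:8000/query/sql` as a JSON body of the form `{"sql": "..."}`. The class and method names are hypothetical and the URL should be adjusted to the local setup. Each worker resends the query until a request fails, which is what happens once the broker exits on OOM.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrentQueryLoad {
  // Assumption: quick-start broker address and SQL endpoint; adjust as needed.
  static final String BROKER_URL = "http://localhost:8000/query/sql";
  static final String SQL = "select * from githubComplexTypeEvents limit 100000";

  // Wraps the SQL text in the JSON payload, escaping backslashes and quotes.
  static String requestBody(String sql) {
    return "{\"sql\":\"" + sql.replace("\\", "\\\\").replace("\"", "\\\"") + "\"}";
  }

  static HttpRequest buildRequest(String brokerUrl, String sql) {
    return HttpRequest.newBuilder(URI.create(brokerUrl))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(requestBody(sql)))
        .build();
  }

  // Starts `workers` threads that each resend the query until a request fails.
  static void run(String brokerUrl, String sql, int workers) throws InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    ExecutorService pool = Executors.newFixedThreadPool(workers);
    for (int i = 0; i < workers; i++) {
      pool.execute(() -> {
        while (!Thread.currentThread().isInterrupted()) {
          try {
            // Discard the body on the client side; only broker heap usage matters here.
            client.send(buildRequest(brokerUrl, sql), HttpResponse.BodyHandlers.discarding());
          } catch (Exception e) {
            return; // broker unreachable (for example after it exits on OOM)
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
  }
}
```

Calling `ConcurrentQueryLoad.run(ConcurrentQueryLoad.BROKER_URL, ConcurrentQueryLoad.SQL, 10)` corresponds to the 10 concurrent queries described above.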
   
   ### Issue visible in heap analysis
   BrokerResponseNative is ~26M, JSON String is 77M, and HeapCharBuffer is ~77M
   ![Screenshot 2024-06-17 at 10 57 51 
AM](https://github.com/apache/pinot/assets/127247229/20cb8c3a-03a4-421a-8ace-f78f29251170)
   
   ### Proposed fix
   The broker should stream the BrokerResponse object using 
[StreamingOutput](https://eclipse-ee4j.github.io/jersey.github.io/apidocs/2.22/jersey/index.html?javax/ws/rs/core/StreamingOutput.html)
 so that the intermediate JSON String is never built, which saves heap memory.
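
To illustrate the idea, here is a JDK-only sketch, not the Pinot implementation. `StreamingBody` is a hypothetical stand-in with the same shape as `javax.ws.rs.core.StreamingOutput` (a single `write(OutputStream)` method), and the hand-rolled JSON writer stands in for whatever serializer the broker would actually use. The rows are written to the output stream one value at a time, so no String holding the whole response is ever allocated.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class StreamingResponseSketch {

  // Hypothetical stand-in mirroring the shape of javax.ws.rs.core.StreamingOutput.
  @FunctionalInterface
  interface StreamingBody {
    void write(OutputStream output) throws IOException;
  }

  // Returns a body that serializes rows incrementally instead of building a String.
  static StreamingBody streamRows(List<Object[]> rows) {
    return output -> {
      Writer writer = new BufferedWriter(new OutputStreamWriter(output, StandardCharsets.UTF_8));
      writer.write("{\"rows\":[");
      boolean firstRow = true;
      for (Object[] row : rows) {
        if (!firstRow) {
          writer.write(',');
        }
        firstRow = false;
        writer.write('[');
        for (int i = 0; i < row.length; i++) {
          if (i > 0) {
            writer.write(',');
          }
          writeValue(writer, row[i]);
        }
        writer.write(']');
      }
      writer.write("]}");
      writer.flush();
    };
  }

  // Simplified JSON value writer: null, numbers and booleans unquoted, the rest as strings.
  static void writeValue(Writer writer, Object value) throws IOException {
    if (value == null || value instanceof Number || value instanceof Boolean) {
      writer.write(String.valueOf(value));
      return;
    }
    writer.write('"');
    for (char c : value.toString().toCharArray()) {
      if (c == '"' || c == '\\') {
        writer.write('\\');
        writer.write(c);
      } else if (c < 0x20) {
        writer.write(String.format("\\u%04x", (int) c));
      } else {
        writer.write(c);
      }
    }
    writer.write('"');
  }

  // Test helper only: collects the streamed bytes so the output can be inspected.
  static String render(StreamingBody body) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    body.write(out);
    return out.toString("UTF-8");
  }
}
```

In a Jersey resource the same lambda body could be returned as a `StreamingOutput`, in which case the output stream is the one Jersey writes to the client, and the peak heap cost is the response object plus a small write buffer.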


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]