[ 
https://issues.apache.org/jira/browse/HUDI-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Wason updated HUDI-1553:
---------------------------------
    Status: In Progress  (was: Open)

> Add configs for TimelineServer to configure Jetty
> -------------------------------------------------
>
>                 Key: HUDI-1553
>                 URL: https://issues.apache.org/jira/browse/HUDI-1553
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Major
>
> TimelineServer uses Javalin, which is based on Jetty.
> By default, Jetty:
>  * Has 200 threads
>  * Compresses output with gzip
>  * Handles each request sequentially
>  
> On a large-scale Hudi dataset (2000 partitions), operations slow down when 
> TimelineServer is enabled, for the following reasons:
>  # The driver process usually has only a few cores. 200 Jetty threads lead to 
> heavy contention when hundreds of executors connect to the server in parallel.
>  # To handle a large number of requests in parallel, it is better to handle 
> each HTTP request asynchronously using Futures, which Javalin supports.
>  # The compute overhead of gzipping may not be necessary when the executors 
> and the driver are in the same rack or within the same datacenter.
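
The first two points above can be sketched with plain JDK classes. This is only an illustration, not the proposed patch: the class name, method names, and the "threads per core" parameter are hypothetical, and it uses no Hudi, Javalin, or Jetty API. It sizes the request pool from the number of driver cores instead of a fixed 200, and completes each request through a CompletableFuture so the caller is not blocked.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch: none of these names are Hudi, Javalin, or Jetty APIs.
public class TimelineServerTuningSketch {

    // Derive the pool size from the available cores, capped at maxThreads,
    // rather than always using 200 threads on a driver with few cores.
    public static int threadsFor(int cores, int threadsPerCore, int maxThreads) {
        return Math.max(1, Math.min(cores * threadsPerCore, maxThreads));
    }

    // Stand-in for a request handler: the work runs on the bounded pool and
    // the result is delivered through a future instead of blocking the caller.
    public static CompletableFuture<String> handleAsync(String partition, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "files-for-" + partition, pool);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = threadsFor(cores, 2, 200);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Simulate 100 executors asking for file listings in parallel.
            List<CompletableFuture<String>> pending = IntStream.range(0, 100)
                .mapToObj(i -> handleAsync("partition-" + i, pool))
                .collect(Collectors.toList());
            // Wait for every simulated request to finish.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            System.out.println("threads=" + threads + ", completed=" + pending.size());
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real change, the pool size and the gzip toggle would presumably be exposed as TimelineServer configs; the names of those configs are not given in this issue.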



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
