[ 
https://issues.apache.org/jira/browse/FALCON-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandeep samudrala updated FALCON-1368:
--------------------------------------
    Attachment: FALCON-1368-V2.patch

[~pallavi.rao] Thanks for pointing it out. 

 I moved the listener (onReload) calls out of the threads. During restart, most 
of the time is spent reading the entity files from HDFS, so running only the 
HDFS read calls in parallel is enough to reduce the restart time.
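
For illustration only, here is a minimal sketch of that split. It is not the 
code in the attached patch: the class name, the loadAll method, and the reader 
and onReload parameters are all hypothetical stand-ins for the HDFS read and 
the listener callbacks. Reads run on a thread pool, and the listeners are then 
invoked on the calling thread in the original order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch: parallel reads, single-threaded listener calls.
public class ParallelReadSketch {

    public static <T> List<T> loadAll(List<String> names,
                                      Function<String, T> reader,   // stand-in for the HDFS read
                                      Consumer<T> onReload,         // stand-in for the listeners
                                      int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (String name : names) {
                tasks.add(() -> reader.apply(name));
            }
            // invokeAll blocks until every read is done; futures keep task order.
            List<Future<T>> futures = pool.invokeAll(tasks);
            List<T> loaded = new ArrayList<>();
            for (Future<T> future : futures) {
                loaded.add(future.get()); // a failed read is rethrown here
            }
            // Listeners run only after all reads succeeded, on the caller's thread.
            for (T entity : loaded) {
                onReload.accept(entity);
            }
            return loaded;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Entity read failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while reading entities", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Keeping the listeners on one thread means they need no extra synchronization, 
and only the IO-bound part pays the cost of thread coordination.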

> Improve Falcon server restart time
> ----------------------------------
>
>                 Key: FALCON-1368
>                 URL: https://issues.apache.org/jira/browse/FALCON-1368
>             Project: Falcon
>          Issue Type: Improvement
>            Reporter: Ajay Yadava
>            Assignee: sandeep samudrala
>         Attachments: FALCON-1368-V1.patch, FALCON-1368-V2.patch, 
> FALCON-1368.patch
>
>
> Currently, on restart, the Falcon server loads all entities from HDFS one by 
> one. In a large setup like the one at Inmobi, where we have several 
> thousand feeds and processes, this adds several minutes to the startup 
> time.
> Since this is an IO-intensive task (reading a file from HDFS into memory), 
> loading entities in parallel with multiple threads will improve the startup 
> time of the server. 
> Two points need to be taken care of:
> 1. Only entities of a single type should be loaded in parallel at a time, to 
> preserve the load order across entity types.
> 2. Currently the Falcon server fails to start if there is an error in loading 
> any entity. It will be slightly tricky, but we should preserve the same 
> behaviour when loading in parallel via threads.
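
The two points in the description can be sketched roughly as follows. This is 
an assumption-laden illustration, not Falcon code: the class and method names 
are hypothetical, the reader function stands in for the HDFS read, and the 
caller is assumed to pass an ordered map (for example a LinkedHashMap) whose 
iteration order is the required load order of the types. Each type gets its 
own pool, so types never overlap, and the first failed read aborts the whole 
load before any later type is started.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: one entity type at a time, fail on the first error.
public class OrderedStartupSketch {

    // Returns the types that were fully loaded, in the order they were loaded.
    public static List<String> loadInTypeOrder(Map<String, List<String>> namesByType,
                                               Function<String, String> reader,
                                               int threads) {
        List<String> finishedTypes = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : namesByType.entrySet()) {
            // The next type starts only after this call returns without error.
            loadOneType(entry.getKey(), entry.getValue(), reader, threads);
            finishedTypes.add(entry.getKey());
        }
        return finishedTypes;
    }

    private static void loadOneType(String type, List<String> names,
                                    Function<String, String> reader, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CompletionService<String> done = new ExecutorCompletionService<>(pool);
        try {
            for (String name : names) {
                done.submit(() -> reader.apply(name));
            }
            // Results arrive in completion order, so a failure is seen early.
            for (int i = 0; i < names.size(); i++) {
                done.take().get();
            }
        } catch (ExecutionException e) {
            throw new IllegalStateException("Failed to load " + type, e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading " + type, e);
        } finally {
            pool.shutdownNow(); // drops reads still queued after a failure
        }
    }
}
```

Throwing from loadInTypeOrder mirrors the current behaviour where a single bad 
entity stops the server from starting.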



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
