[ 
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464369#comment-16464369
 ] 

ASF GitHub Bot commented on DRILL-5270:
---------------------------------------

kkhatua opened a new pull request #1250: DRILL-5270: Improve loading of 
profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250
 
 
   When Drill displays profiles stored on the file system (Local or 
Distributed), it loads the entire list of `.sys.drill` files in the profile 
directory, then sorts and deserializes them. This can get expensive, since 
only a single CPU thread does this.
   As an example, for a directory of 120K profiles, the time to just fetch the 
list of files is over 6 seconds. After that, the time varies with the number 
of profiles being rendered. An average of 30ms is needed to deserialize a 
standard profile, which translates to an additional 3sec for rendering the 
default 100 profiles.
   
   A user-reported issue confirms just that:
   DRILL-5028 Opening profiles page from web ui gets very slow when a lot of 
history files have been stored in HDFS or Local FS
   
   Additional JIRAs ask for management of these profiles:
   DRILL-2362 Drill should manage Query Profiling archiving
   DRILL-2861 enhance drill profile file management
   
   This PR brings the following enhancements to achieve that:
   1. Mimic the in-memory persistence of profiles (DRILL-5481) by keeping 
only a predefined `max-capacity` number of profiles in the directory and moving 
the oldest to an 'archived' sub-directory.
   2. Improve loading times by pinning the deserialized list in memory 
(TreeSet; for maintaining a memory-efficient sortedness of the profiles). That 
way, if we do not detect any new profiles in the profileStore (i.e. profile 
directory) since the last time a web-request for rendering the profiles was 
made, we can re-serve the same listing and skip making a trip to the filesystem 
to re-fetch all the profiles.
   
   The profiles in the TreeSet are reloaded and reconstructed when any of the 
following occurs:
     i.   Modification time of the profile dir changes
     ii.  Number of profiles in the profile dir changes
     iii. Number of profiles requested exceeds the currently available list
   
   3. When 2 or more web-requests for rendering arrive, the WebServer code 
already processes the requests sequentially. As a result, the earliest request 
will trigger the reconstruction of the in-memory profile-set, and the 
last-modified timestamp of the profileStore is tracked. This way, the remaining 
blocked requests can re-use the freshly-reconstructed profile-set for rendering 
if the underlying profileStore has not been modified. This assumes that 
profiles are not added to the profileStore so quickly that every queued-up 
request triggers a reconstruction.
   4. To prevent frequent archiving, a threshold (max-capacity) is defined 
for triggering the archive. When archiving is triggered, enough profiles are 
archived that the number remaining in the directory is 90% of the threshold.
   5. To prevent the archiving process from taking too long, an archival rate 
(`drill.exec.profiles.store.archive.rate`) is defined so that at most that 
many profiles are archived in one go before re-rendering resumes.
   6. On a Distributed FileSystem (e.g. HDFS), multiple Drillbits might attempt 
to archive. To mitigate that, if a Drillbit detects that it is unable to 
archive a profile, it will assume that another Drillbit is already archiving 
and stop archiving.
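
   The reload triggers in item 2 and the archive sizing in items 4 and 5 can 
be sketched roughly as below. This is an illustration only, not the code in 
the PR: the class, method, and field names are hypothetical, and it assumes 
that archiving starts once the profile count exceeds max-capacity and that 
the 90% target is computed with integer arithmetic.

```java
/** Illustrative sketch only; these names do not come from the Drill code base. */
final class ProfileStoreSketch {

  /** State of the profile directory when the in-memory listing was last built. */
  static final class Snapshot {
    final long dirModifiedTime; // modification time of the profile dir
    final int profilesInDir;    // number of profiles in the profile dir
    final int profilesLoaded;   // number of profiles held in the in-memory listing

    Snapshot(long dirModifiedTime, int profilesInDir, int profilesLoaded) {
      this.dirModifiedTime = dirModifiedTime;
      this.profilesInDir = profilesInDir;
      this.profilesLoaded = profilesLoaded;
    }
  }

  /** Item 2: rebuild the listing only if one of the three tracked conditions holds. */
  static boolean needsReload(Snapshot cached, long dirModifiedTime,
                             int profilesInDir, int requested) {
    return cached == null                              // nothing cached yet
        || cached.dirModifiedTime != dirModifiedTime   // (i)   dir was modified
        || cached.profilesInDir != profilesInDir       // (ii)  profile count changed
        || requested > cached.profilesLoaded;          // (iii) request exceeds the listing
  }

  /**
   * Items 4 and 5: number of profiles to archive in one pass.
   * Assumption: archiving triggers only when the count exceeds maxCapacity.
   */
  static int archiveCount(int profilesInDir, int maxCapacity, int archiveRate) {
    if (profilesInDir <= maxCapacity) {
      return 0; // below the threshold, nothing to archive
    }
    int retained = maxCapacity * 9 / 10;   // keep 90% of the threshold
    int excess = profilesInDir - retained; // profiles that should move out
    return Math.min(excess, archiveRate);  // cap the work done in one go
  }
}
```

   With a max-capacity of 1000 and 1200 profiles on disk, this sketch would 
archive 300 profiles in one pass if the archival rate allows it, or only 100 
if the rate is set to 100, leaving the rest for later passes.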

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Improve loading of profiles listing in the WebUI
> ------------------------------------------------
>
>                 Key: DRILL-5270
>                 URL: https://issues.apache.org/jira/browse/DRILL-5270
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Web Server
>    Affects Versions: 1.9.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>            Priority: Major
>             Fix For: 1.14.0
>
>
> Currently, as the number of profiles increases, we reload the same list of 
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and 
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a 32 
> core server. With the caching, we can get it down to as little as a few 
> milliseconds.
> To render the cache as invalid, we inspect the last modified time of the 
> directory to confirm whether a reload is needed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
