[
https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464369#comment-16464369
]
ASF GitHub Bot commented on DRILL-5270:
---------------------------------------
kkhatua opened a new pull request #1250: DRILL-5270: Improve loading of
profiles listing in the WebUI
URL: https://github.com/apache/drill/pull/1250
When Drill displays profiles stored on the file system (local or
distributed), it loads the entire list of `.sys.drill` files in the profile
directory, then sorts and deserializes them. This gets expensive because a
single CPU thread does all of the work.
As an example, for a directory of 120K profiles, just fetching the list of
files takes over 6 seconds. After that, the time varies with the number of
profiles being rendered. Deserializing a standard profile takes about 30 ms on
average, which translates to an additional 3 seconds for rendering the
default 100 profiles.
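The arithmetic above can be written out as a small sketch. The class and method names are hypothetical and not part of Drill; the inputs are the figures quoted in this description, and since the listing time is given as "over 6 seconds", the result is a lower bound.

```java
// Rough cost model for one render of the profiles page (illustrative only).
final class ProfileListingCostSketch {
    // One directory listing plus one deserialization per rendered profile, in milliseconds.
    static long estimateMillis(long listingMillis, long perProfileMillis, int profilesRendered) {
        return listingMillis + perProfileMillis * profilesRendered;
    }

    public static void main(String[] args) {
        // 6000 ms to list ~120K files, 30 ms per profile, 100 profiles rendered by default.
        System.out.println(estimateMillis(6_000, 30, 100) + " ms"); // prints "9000 ms"
    }
}
```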
A user-reported issue confirms the slowdown:
DRILL-5028 Opening profiles page from web ui gets very slow when a lot of
history files have been stored in HDFS or Local FS
Additional JIRAs ask for these profiles to be managed:
DRILL-2362 Drill should manage Query Profiling archiving
DRILL-2861 enhance drill profile file management
This PR brings the following enhancements to achieve that:
1. Mimic the in-memory persistence of profiles (DRILL-5481) by keeping
only a predefined `max-capacity` number of profiles in the directory and moving
the oldest to an 'archived' sub-directory.
2. Improve loading times by pinning the deserialized list in memory
(a TreeSet, which keeps the profiles sorted in a memory-efficient way). That
way, if we do not detect any new profiles in the profileStore (i.e. the profile
directory) since the last web request for rendering the profiles, we can
re-serve the same listing and skip the trip to the file system to re-fetch
all the profiles.
The profiles in the TreeSet are reloaded and reconstructed when any of the
following changes:
i. Modification time of the profile dir
ii. Number of profiles in the profile dir
iii. Number of profiles requested exceeds the currently available
list
3. When two or more web requests for rendering arrive, the WebServer code
already processes them sequentially. As a result, the earliest request
triggers the reconstruction of the in-memory profile-set, and the
last-modified timestamp of the profileStore is tracked. The remaining
blocked requests can then re-use the freshly reconstructed profile-set for
rendering if the underlying profileStore has not been modified. This assumes
that profiles are not added to the profileStore so quickly that every queued
request triggers a reconstruction.
4. To prevent frequent archiving, a threshold (max-capacity) is defined
for triggering the archive. Once archiving is triggered, enough profiles are
archived that the number remaining in the directory is 90% of the threshold.
5. To prevent the archiving process from taking too long, an archival rate
(`drill.exec.profiles.store.archive.rate`) is defined so that at most that many
profiles are archived in one go before rendering resumes.
6. On a distributed file system (e.g. HDFS), multiple Drillbits might attempt
to archive. To mitigate that, if a Drillbit detects that it is unable to
archive a profile, it assumes that another Drillbit is also archiving and
stops archiving.
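
As a rough illustration of items 2 through 5, the reload check and the archive batch size could be sketched as below. This is not the code in the PR: the class and method names are hypothetical, profile names are assumed to sort chronologically, the directory scan is passed in as a `Supplier`, and archiving is assumed to start only once the count exceeds `max-capacity`.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical sketch of items 2 and 3: a pinned, sorted listing that is
// rebuilt only when the profile directory appears to have changed.
final class ProfileListingCacheSketch {
    // Newest first, assuming (for illustration only) that names sort chronologically.
    private final TreeSet<String> cached = new TreeSet<>(Comparator.reverseOrder());
    private long lastDirModTime = -1;
    private int lastProfileCount = -1;
    int reloads = 0; // counts rebuilds so callers can observe cache hits

    // The three reload conditions listed under item 2.
    synchronized boolean needsReload(long dirModTime, int profilesInDir, int requested) {
        return dirModTime != lastDirModTime        // i. directory modification time changed
            || profilesInDir != lastProfileCount   // ii. number of profiles changed
            || requested > cached.size();          // iii. more requested than cached
    }

    // synchronized: queued requests run one at a time, so later ones can reuse
    // the set rebuilt by the first (item 3).
    synchronized List<String> listing(long dirModTime, int profilesInDir, int requested,
                                      Supplier<List<String>> scanDirectory) {
        if (needsReload(dirModTime, profilesInDir, requested)) {
            cached.clear();
            cached.addAll(scanDirectory.get()); // the expensive file system trip
            while (cached.size() > requested) {
                cached.pollLast();              // drop the oldest beyond what was requested
            }
            lastDirModTime = dirModTime;
            lastProfileCount = profilesInDir;
            reloads++;
        }
        return cached.stream().limit(requested).collect(Collectors.toList());
    }
}

// Hypothetical sketch of items 4 and 5: how many profiles to archive in one pass.
final class ArchivePlanSketch {
    static int countToArchive(int profilesInDir, int maxCapacity, int archiveRate) {
        if (profilesInDir <= maxCapacity) {
            return 0; // assumption: archiving starts only once the threshold is exceeded
        }
        int retain = (int) (maxCapacity * 9L / 10);           // keep 90% of the threshold
        return Math.min(profilesInDir - retain, archiveRate); // capped by the archival rate
    }
}
```

Under these assumptions, `ArchivePlanSketch.countToArchive(1200, 1000, 100)` returns 100 because the archival rate caps the batch, while a rate of 1000 returns 300, leaving 900 profiles in place. Two consecutive `listing` calls with the same modification time, profile count, and request size invoke `scanDirectory` only once.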
> Improve loading of profiles listing in the WebUI
> ------------------------------------------------
>
> Key: DRILL-5270
> URL: https://issues.apache.org/jira/browse/DRILL-5270
> Project: Apache Drill
> Issue Type: Improvement
> Components: Web Server
> Affects Versions: 1.9.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Priority: Major
> Fix For: 1.14.0
>
>
> Currently, as the number of profiles increases, we reload the same list of
> profiles from the FS.
> An ideal improvement would be to detect if there are any new profiles and
> only reload from the disk then. Otherwise, a cached list is sufficient.
> For a directory of 280K profiles, the load time is close to 6 seconds on a
> 32-core server. With caching, we can get it down to as little as a few
> milliseconds.
> To decide whether the cache is invalid, we inspect the last modified time of
> the directory and reload only if it has changed.