Tim Armstrong created IMPALA-9381:
-------------------------------------
Summary: Lazily convert and/or cache different representations of
the query profile
Key: IMPALA-9381
URL: https://issues.apache.org/jira/browse/IMPALA-9381
Project: IMPALA
Issue Type: Sub-task
Components: Backend
Reporter: Tim Armstrong
There are some obvious inefficiencies with how the query state record works:
* We make an unnecessary copy of the archive string when adding it to the query
log:
https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1812
* We eagerly convert the profile to text and JSON, when in many cases they
won't be needed:
https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1839
I think it is generally rare for more than one profile format to be
downloaded from the web UI. I know of tools that scrape the thrift profile, but
the human-readable version would usually only be consumed by humans. We could
avoid this by only storing the thrift representation of the profile, then
reconstituting the other representations from thrift if requested.
* After ComputeExecSummary(), the profile shouldn't change, but we still
regenerate the thrift representation on every web request in order to produce
the encoded profile. This may waste a lot of CPU when tools scrape the profiles.
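
A minimal sketch of the lazy approach described in the bullets above, not Impala's actual code. The class name LazyProfileRecord, the Converter alias, and all method names are hypothetical; the two converter callbacks stand in for whatever code reconstitutes the text and JSON forms from the thrift archive. The record takes ownership of the archive string by move, keeps only that form eagerly, and renders each other representation at most once, on first request.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a function that turns the serialized thrift
// profile into another representation (text, JSON, ...).
using Converter = std::function<std::string(const std::string&)>;

// Hypothetical record type: stores the thrift archive once and derives the
// other representations lazily, caching each after its first use.
class LazyProfileRecord {
 public:
  // The archive is taken by value and moved into place, so a caller that
  // passes an rvalue does not pay for an extra string copy.
  LazyProfileRecord(std::string thrift_archive, Converter to_text,
                    Converter to_json)
      : thrift_archive_(std::move(thrift_archive)),
        to_text_(std::move(to_text)),
        to_json_(std::move(to_json)) {
    assert(to_text_ && to_json_);  // Both converters must be callable.
  }

  // The archive is fixed at construction, so repeated requests reuse it
  // instead of re-serializing the profile.
  const std::string& ThriftArchive() const { return thrift_archive_; }

  const std::string& Text() { return GetOrRender(to_text_, &text_); }
  const std::string& Json() { return GetOrRender(to_json_, &json_); }

 private:
  struct Slot {
    bool ready = false;
    std::string value;
  };

  // Renders into the slot on the first call only. The returned reference
  // stays valid after the lock is released because a ready slot is never
  // written again.
  const std::string& GetOrRender(const Converter& convert, Slot* slot) {
    std::lock_guard<std::mutex> guard(lock_);
    if (!slot->ready) {
      slot->value = convert(thrift_archive_);
      slot->ready = true;
    }
    return slot->value;
  }

  const std::string thrift_archive_;
  const Converter to_text_;
  const Converter to_json_;
  std::mutex lock_;
  Slot text_;
  Slot json_;
};
```

With this shape, a scraper that only ever fetches the thrift form never triggers a text or JSON conversion. A single mutex keeps the sketch short; a real implementation might prefer per-format synchronization, or might decide not to cache the derived forms at all if memory matters more than CPU.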