[ 
https://issues.apache.org/jira/browse/IMPALA-9381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044855#comment-17044855
 ] 

ASF subversion and git services commented on IMPALA-9381:
---------------------------------------------------------

Commit 1bd45d295ebfc3f526a98eebb9b61525b9332c91 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1bd45d2 ]

IMPALA-9381: on-demand conversion of runtime profile

Converting the runtime profile to JSON and text representations
at the end of the query used significant CPU and time. These
representations will commonly never be accessed, because
they need to be explicitly requested by a client via the
HTTP debug interface or via a thrift profile request.
So it is a waste of resources to eagerly convert them, and
in particular it is a bad idea to do so on the critical path
of a query.

This commit switches to generating alternative profile
representations on-demand. Only the compressed thrift version
of the profile is stored in QueryStateRecord. This is the
most compact representation of the profile and it is
relatively convenient to convert into other formats.
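
The on-demand approach described above can be sketched roughly as follows.
This is a minimal illustration, not the code from the commit: ProfileRecord,
Counter and DecodeProfile are hypothetical names, and the "a=1;b=2" encoding
is a made-up stand-in for the compressed thrift profile.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical decoded form: a flat list of (name, value) counters.
struct Counter {
  std::string name;
  std::string value;
};

// Stand-in for "decompress + deserialize". The "a=1;b=2" format is invented
// for this sketch and has nothing to do with the real thrift encoding.
std::vector<Counter> DecodeProfile(const std::string& encoded) {
  std::vector<Counter> out;
  std::size_t pos = 0;
  while (pos < encoded.size()) {
    std::size_t end = encoded.find(';', pos);
    if (end == std::string::npos) end = encoded.size();
    std::size_t eq = encoded.find('=', pos);
    if (eq != std::string::npos && eq < end) {
      out.push_back({encoded.substr(pos, eq - pos),
                     encoded.substr(eq + 1, end - eq - 1)});
    }
    pos = end + 1;
  }
  return out;
}

// Hypothetical record that stores only the compact encoded form.
class ProfileRecord {
 public:
  // No text or JSON is produced here, so nothing extra is paid at query end.
  explicit ProfileRecord(std::string encoded) : encoded_(std::move(encoded)) {}

  // Text is rebuilt from the encoded form only when a caller asks for it.
  std::string ToText() const {
    std::string out;
    for (const Counter& c : DecodeProfile(encoded_)) {
      out += c.name + ": " + c.value + "\n";
    }
    return out;
  }

  // JSON is likewise built on request (no string escaping in this sketch).
  std::string ToJson() const {
    std::string out = "{";
    bool first = true;
    for (const Counter& c : DecodeProfile(encoded_)) {
      if (!first) out += ",";
      out += "\"" + c.name + "\":\"" + c.value + "\"";
      first = false;
    }
    return out + "}";
  }

 private:
  std::string encoded_;
};
```

The trade-off is that each request pays for a decode, which is acceptable
if the alternative formats are rarely requested.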

Also use a move() when constructing QueryStateRecord to avoid
copying the profile unnecessarily.
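
As an illustration of the move() point, here is a small sketch assuming the
record takes its profile string by value. QueryRecordSketch and
ArchiveProfile are hypothetical names, not the actual Impala types.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical record type used only for this sketch.
struct QueryRecordSketch {
  std::string encoded_profile;

  // By-value parameter plus std::move: an rvalue argument is moved into the
  // member, so the (possibly large) buffer is handed over instead of copied.
  explicit QueryRecordSketch(std::string profile)
      : encoded_profile(std::move(profile)) {}
};

// The caller gives up ownership of the archived string.
QueryRecordSketch ArchiveProfile(std::string&& archived) {
  return QueryRecordSketch(std::move(archived));
}
```

Passing an lvalue without std::move() would still compile, but it would
copy the whole buffer into the parameter first.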

Fix a couple of potential use-after-free issues where Json
objects generated by RuntimeProfile::ToJson() could reference
strings owned by the object pool. These were detected by
running an ASAN build, because after this change, the temporary
object pool used to hold the deserialized profile was freed before
the JSON tree was returned.
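
The lifetime problem can be shown with a simplified sketch. StringPool and
OwningNode are hypothetical stand-ins for the object pool and the JSON
tree. The sketch shows the general remedy of copying strings into the
returned structure, which may or may not match how the commit fixed it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pool owning the strings of a temporarily deserialized profile.
struct StringPool {
  std::vector<std::unique_ptr<std::string>> strings;

  const std::string& Add(std::string s) {
    strings.push_back(std::make_unique<std::string>(std::move(s)));
    return *strings.back();
  }
};

// Hypothetical JSON-like node that owns a copy of its text.
struct OwningNode {
  std::string name;
};

OwningNode BuildNode() {
  StringPool pool;  // Temporary: destroyed when this function returns.
  const std::string& pooled = pool.Add("Fragment F00");
  // The copy is the point. A node holding &pooled or pooled.c_str() would
  // dangle as soon as `pool` is destroyed at the end of this function.
  return OwningNode{pooled};
}
```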

The "kind" field of counters is removed from the JSON profile.
This couldn't be round-tripped correctly through thrift, and
probably isn't necessary. It also helps slim down the profiles.

Also make sure to preserve the "indent" field when round-tripping
to thrift.

Testing:
Ran core tests.

Diffed JSON and text profiles downloaded from the web UI before and
after the change to make sure there were no unexpected changes as a
result of the round-trip via thrift.

Change-Id: Ic2f5133cc146adc3b044cf4b64aae0a9688449fa
Reviewed-on: http://gerrit.cloudera.org:8080/15236
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Lazily convert and/or cache different representations of the query profile
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-9381
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9381
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> There are some obvious inefficiencies with how the query state record works:
> * We do an unnecessary copy of the archive string when adding it to the query 
> log
> https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1812.
> * We eagerly convert the profile to text and JSON, when in many cases they 
> won't be needed - 
> https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1839
>  . I think it is generally rare for more than one profile format to be 
> downloaded from the web UI. I know of tools that scrape the thrift profile, 
> but the human-readable version would usually only be consumed by humans. We 
> could avoid this by only storing the thrift representation of the profile, 
> then reconstituting the other representations from thrift if requested.
> * After ComputeExecSummary(), the profile shouldn't change, but we'll 
> regenerate the thrift representation for every web request to get the 
> encoded profile. This may waste a lot of CPU for tools scraping the profiles.
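
One possible way to avoid regenerating the encoded profile on every request
is to compute it once after the profile is final and reuse the result. The
sketch below is an assumption about how such caching could look, not
something the issue or commit specifies. CachedEncoding is a hypothetical
name.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical cache for an encoded profile that no longer changes.
class CachedEncoding {
 public:
  explicit CachedEncoding(std::function<std::string()> encode)
      : encode_(std::move(encode)) {}

  // Runs the encoder on the first call only; later calls reuse the result.
  // The returned reference stays valid because cached_ is never modified
  // once it has been filled in.
  const std::string& Get() {
    std::lock_guard<std::mutex> lock(mu_);
    if (!valid_) {
      cached_ = encode_();
      valid_ = true;
    }
    return cached_;
  }

 private:
  std::mutex mu_;
  std::function<std::string()> encode_;
  bool valid_ = false;
  std::string cached_;
};
```

This only makes sense once the profile is immutable. While the query is
still running, a cached value would go stale.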


