[ 
https://issues.apache.org/jira/browse/IMPALA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052303#comment-18052303
 ] 

ASF subversion and git services commented on IMPALA-9846:
---------------------------------------------------------

Commit 97d766577df69b5e602c811063f365e464390e21 in impala's branch 
refs/heads/master from Surya Hebbar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97d766577 ]

IMPALA-14680: Improve row regex search syntax in runtime profile tests

Currently, the runtime profile tests contain row regex searches
that look for matches by comparing each regex against the profile
line by line. This form of search is inefficient.

So, while updating the tests for the aggregated profile (IMPALA-9846),
performance is being improved by accumulating the row regexes and
then searching the entire profile at once.

In order to support this improvement, we need to correct the current
`row_regex` syntax being used.

The current tests use greedy regex like ".*" at the beginning and end
of `row_regex` searches. Using greedy regex in this way consumes more
resources and is redundant for the current implementation.

To fix this, these additional greedy patterns (i.e. `.*`, `.+`)
are being removed or replaced across all the runtime profile tests.

Change-Id: I1460c2d22b03c06aa43c85f78fa9e05cec2775ec
Reviewed-on: http://gerrit.cloudera.org:8080/23864
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Csaba Ringhofer <[email protected]>
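
As a rough illustration of the change described in the commit message
above, the following is a minimal Python sketch, not the code from the
Impala test framework. The function name `missing_row_regexes`, the
profile excerpt, and the counter names are hypothetical. It assumes the
accumulated patterns are checked with `re.search` over the whole profile
text, in which case a leading or trailing `.*` does not change whether a
pattern is found.

```python
import re


def missing_row_regexes(profile_text, row_regexes):
    """Return the accumulated patterns that match nowhere in the profile."""
    missing = []
    for pattern in row_regexes:
        # re.search scans the whole text, so a pattern needs no ".*" padding
        # to be found in the middle of a line.
        if re.search(pattern, profile_text, re.MULTILINE) is None:
            missing.append(pattern)
    return missing


# Hypothetical profile excerpt; the counter names are illustrative only.
PROFILE = "\n".join([
    "HDFS_SCAN_NODE (id=0):",
    "   - RowsRead: 7.30K (7300)",
    "   - NumRowGroups: 4 (4)",
])

# The same two expectations, with and without the greedy padding.
padded = [r".*RowsRead: 7.30K \(7300\).*", r".*NumRowGroups: 4 .*"]
trimmed = [r"RowsRead: 7.30K \(7300\)", r"NumRowGroups: 4 "]

print(missing_row_regexes(PROFILE, padded))
print(missing_row_regexes(PROFILE, trimmed))
print(missing_row_regexes(PROFILE, [r"NumRowGroups: 5 "]))
```

Under this assumption the padded and trimmed lists report the same
result, which is why the padding can be dropped without changing what
the tests verify.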


> Shift to Aggregated Runtime Profile representation
> --------------------------------------------------
>
>                 Key: IMPALA-9846
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9846
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Surya Hebbar
>            Priority: Critical
>              Labels: multithreading
>
> The traditional query profile generated bigger forests of runtime profiles 
> and child counters, from fragments and operators up to instance levels.
> This structure of the runtime profile can potentially stress the memory 
> allocator and use up a lot more memory and cache than is really necessary.
> To mitigate these issues, the aggregated profiles were introduced, which are 
> substantially denser and faster to process for higher 'mt_dop' values.
> In the aggregated profile, the depth of the forests has been reduced by 
> transforming instance-level counters into operator-level arrays or maps.
> The aggregation is also done in a single step, merging the aggregated thrift 
> profiles from the executor directly into the final aggregated profile, 
> without converting it to an unaggregated profile first.
> This representation also helps with producing a nice, high-level, readable 
> text profile by default, with the option to produce more detailed profiles 
> and alternate views of the profile when required.
> With this change, the aggregated runtime profile will be enabled by default 
> with the 'aggregated_profile' flag set to 'true'. This will serve as a 
> replacement for the previous 'gen_experimental_profile' flag.
> To enable the aggregated profile, more than 2700 tests, both backend and 
> end-to-end, need to be implemented, modified, or corrected to accommodate 
> both the aggregated runtime profile and the traditional runtime profile.
> We also need to ensure that the aggregated profile is an adequate 
> replacement, then shift over the default.
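
To make the flattening described in the issue more concrete, here is a
small Python sketch. It is only an illustration under assumed data
shapes, not Impala's actual profile structures or Thrift types: the
function `aggregate_counters`, the nested dictionaries, and the operator
and counter names are all hypothetical. It folds one counter tree per
fragment instance into a single array per operator and counter, indexed
by instance.

```python
def aggregate_counters(instance_profiles):
    """Fold per-instance counters into one array per (operator, counter).

    instance_profiles: list with one dict per instance, each mapping
    operator name -> {counter name -> value}. This shape is assumed
    for illustration only.
    """
    num_instances = len(instance_profiles)
    aggregated = {}
    for idx, operators in enumerate(instance_profiles):
        for op_name, counters in operators.items():
            op_entry = aggregated.setdefault(op_name, {})
            for counter_name, value in counters.items():
                # One slot per instance; None marks a missing counter.
                values = op_entry.setdefault(
                    counter_name, [None] * num_instances)
                values[idx] = value
    return aggregated


# Two hypothetical instances of the same fragment.
instances = [
    {"SCAN": {"RowsRead": 100}, "AGG": {"RowsReturned": 10}},
    {"SCAN": {"RowsRead": 120}, "AGG": {"RowsReturned": 12}},
]

result = aggregate_counters(instances)
print(result)
```

The point of the sketch is the shape of the result: the per-instance
level of the tree disappears, and each operator keeps one array per
counter instead of one child node per instance.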



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
