[jira] [Commented] (IMPALA-9846) Shift to Aggregated Runtime Profile representation

2026-03-02 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062234#comment-18062234
 ] 

ASF subversion and git services commented on IMPALA-9846:
-

Commit 90389331a09f3d27dffa1566506b56bb930afa71 in impala's branch 
refs/heads/master from Surya Hebbar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=90389331a ]

IMPALA-14793: Replace row regex with aggregation tests

In the end-to-end python tests, there are many 'row_regex' searches
that can instead be treated as 'aggregation()' tests.

Rewriting these tests as aggregations can make the syntax easier,
while providing compatibility for both the traditional profile
and the new aggregated profile.

In many cases this results in better efficiency as the regex pattern to
search for is less complex than the original.
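
As a rough illustration of why an aggregation-style check is less tied to
profile layout than a per-row regex, here is a minimal sketch. The helper
`aggregate_counter` and the simplified `Name: value` counter layout are
assumptions made for this example; this is not Impala's test verifier.

```python
import re

def aggregate_counter(profile_text, counter_name, func=sum):
    """Combine every integer value of `counter_name` found in a profile.

    Hypothetical helper: assumes counters are printed as 'Name: <integer>'.
    """
    pattern = re.compile(r"\b" + re.escape(counter_name) + r": (\d+)")
    values = [int(m.group(1)) for m in pattern.finditer(profile_text)]
    return func(values)

# Two instances report the same counter on separate rows.
profile = "Instance 1:\n  RowsRead: 40\nInstance 2:\n  RowsRead: 60\n"
total_rows = aggregate_counter(profile, "RowsRead")
```

The check depends only on the combined value, not on how many rows report
the counter or on the text surrounding each row.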

This is a split change associated with IMPALA-9846, in order to enable
the aggregated profile.

This slightly reduces the number of lines in the large IMPALA-9846
change, in order to aid the reviewer.

Change-Id: Ied342a3b89eb922137f0c1a7d3b2978b813de381
Reviewed-on: http://gerrit.cloudera.org:8080/24058
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Shift to Aggregated Runtime Profile representation
> --
>
> Key: IMPALA-9846
> URL: https://issues.apache.org/jira/browse/IMPALA-9846
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Surya Hebbar
>Priority: Critical
>  Labels: multithreading
>
> The traditional query profile generated bigger forests of runtime profiles 
> and child counters, from fragments and operators up to instance levels.
> This structure of the runtime profile can potentially stress the memory 
> allocator and use up a lot more memory and cache than is really necessary.
> To mitigate these issues, the aggregated profiles were introduced, which are 
> substantially denser and faster to process for higher 'mt_dop' values.
> In the aggregated profile, the depth of the forests has been reduced by
> transforming instance-level counters into operator-level arrays or maps.
> The aggregation is also done in a single step, merging the aggregated thrift 
> profiles from the executor directly into the final aggregated profile, 
> without converting it to an unaggregated profile first.
> This representation also helps with producing nice, high-level, readable text 
> profile by default with the option to produce more detailed profiles and 
> alternate views of the profile when required.
> With this change, the aggregated runtime profile will be enabled by default with
> the 'aggregated_profile' flag set to 'true'. This will serve as a replacement
> for the previous 'gen_experimental_profile' flag.
> To enable the aggregated profile, more than 2700 tests, both backend and
> end-to-end, need to be implemented, modified, or corrected to accommodate
> the usage of both the aggregated runtime profile and the traditional runtime 
> profile.
> We also need to ensure that the aggregated profile is an adequate 
> replacement, then shift over the default.
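
The flattening of instance-level counters into operator-level arrays,
described in the issue above, can be pictured with a small sketch. The
nested-dict layout, the function name `aggregate_instances`, and the
operator and counter names are all assumptions for illustration; the real
profile structures in Impala's backend are not shown here.

```python
from collections import defaultdict

def aggregate_instances(instance_profiles):
    """Merge per-instance {operator: {counter: value}} dicts into
    {operator: {counter: [one value per instance]}}.

    Illustrative only: assumes every instance reports the same counters,
    so position i in each list corresponds to instance i.
    """
    aggregated = defaultdict(lambda: defaultdict(list))
    for instance in instance_profiles:
        for operator, counters in instance.items():
            for name, value in counters.items():
                aggregated[operator][name].append(value)
    # Convert to plain dicts so the result compares and prints cleanly.
    return {op: dict(counters) for op, counters in aggregated.items()}

instances = [
    {"SCAN_NODE": {"RowsRead": 40, "BytesRead": 1024}},
    {"SCAN_NODE": {"RowsRead": 60, "BytesRead": 2048}},
]
merged = aggregate_instances(instances)
```

One profile node per operator then holds an array per counter, instead of
one node per operator per instance.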



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



[jira] [Commented] (IMPALA-9846) Shift to Aggregated Runtime Profile representation

2026-01-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18052303#comment-18052303
 ] 

ASF subversion and git services commented on IMPALA-9846:
-

Commit 97d766577df69b5e602c811063f365e464390e21 in impala's branch 
refs/heads/master from Surya Hebbar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=97d766577 ]

IMPALA-14680: Improve row regex search syntax in runtime profile tests

Currently, the runtime profile tests contain row regex searches
which look for matches by applying each regex to the profile line by
line. This form of search is inefficient.

So, while updating the tests for the aggregated profile (IMPALA-9846),
performance is being improved by accumulating the row regexes together,
then searching the entire profile at once.
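
The difference between the two search strategies can be sketched as
follows. Both function names are hypothetical and this is not the code
from the commit; it only contrasts a per-line scan with a single pass of
each pattern over the whole profile text.

```python
import re

def missing_line_by_line(profile_text, patterns):
    """Per-line scan: every pattern is tried against every line."""
    lines = profile_text.splitlines()
    return [p for p in patterns
            if not any(re.search(p, line) for line in lines)]

def missing_whole_profile(profile_text, patterns):
    """Single search per pattern over the full text.

    Without re.DOTALL, '.' does not match a newline, so '.'-based patterns
    stay within one row; re.MULTILINE lets '^' and '$' anchor to each row.
    """
    return [p for p in patterns
            if re.search(p, profile_text, re.MULTILINE) is None]

profile = "NumRowGroups: 4\nRowsRead: 100\n"
patterns = [r"NumRowGroups: \d+", r"^RowsRead: 1\d\d$", r"NumPages: \d+"]
missing_a = missing_line_by_line(profile, patterns)
missing_b = missing_whole_profile(profile, patterns)
```

Patterns that use `\s` or negated character classes can match across a
newline in the second form, so the two are equivalent only for patterns
that cannot span rows.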

In order to support this improvement, we need to correct the current
`row_regex` syntax being used.

The current tests use greedy patterns such as `.*` at the beginning and
end of `row_regex` searches. Used this way, they consume more resources
and are redundant for the current implementation.

To fix this, these additional greedy patterns (i.e. `.*`, `.+`)
are being removed or replaced across all the runtime profile tests.
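
A minimal sketch of this kind of cleanup, assuming the patterns are
evaluated with a search (not a full-line match) and that only a plain
leading or trailing `.*` needs handling. The helper name
`strip_greedy_wrappers` is hypothetical; lazy forms such as `.*?` and the
`.+` case, which changes meaning when dropped, are deliberately left out.

```python
import re

def strip_greedy_wrappers(pattern):
    """Drop a plain leading or trailing '.*' from a search pattern."""
    if pattern.startswith(".*"):
        pattern = pattern[2:]
    # Keep a trailing '\.*': there the dot is escaped and matches literal dots.
    if pattern.endswith(".*") and not pattern.endswith("\\.*"):
        pattern = pattern[:-2]
    return pattern

row = "  - RowsRead: 100 (100)"
original = r".*RowsRead: \d+.*"
simplified = strip_greedy_wrappers(original)

# With re.search, both forms accept and reject the same rows.
same_result = bool(re.search(original, row)) == bool(re.search(simplified, row))
```

Because a search already scans for a match anywhere in the row, the
wrapping `.*` only adds backtracking work without changing the outcome.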

Change-Id: I1460c2d22b03c06aa43c85f78fa9e05cec2775ec
Reviewed-on: http://gerrit.cloudera.org:8080/23864
Tested-by: Impala Public Jenkins 
Reviewed-by: Csaba Ringhofer 




