[ 
https://issues.apache.org/jira/browse/IMPALA-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18056068#comment-18056068
 ] 

Joe McDonnell commented on IMPALA-12191:
----------------------------------------

After thinking about this for a while, I want to rework this Jira. Here are my 
thoughts:
 # We want information for all the backends used by the query, not just one 
node. That means we need to be collecting and transmitting the information from 
each node. The information in the profile should be a summary of all the nodes 
used by this query.
 # The list of desired information could be very long, so I think we should 
break this into pieces. This Jira should focus on the main code to collect and 
convey information into the profile, but it should limit itself to a couple of 
interesting things (e.g. CPU type, CPU count, OS). We can add more pieces of 
information in subsequent patches.
 # The profile information should be succinct. If we are describing CPUs, we 
can summarize multiple nodes by giving a count of each configuration, e.g. 
"12th Gen Intel(R) Core(TM) i9-12900K (15), 12th Gen Intel(R) Core(TM) 
i9-12700K (1)" or something like that.
 # To make it easier to summarize across the cluster, it is better to keep the 
information structured. We don't want to convey a complicated string summary of 
dozens of things. Instead, each piece can be its own field.
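
To make the "count of each configuration" format from point 3 concrete, here is a minimal sketch. SummarizeCounts() is a hypothetical helper, not existing Impala code; it assumes one string value per backend (for example the CPU model) and that the most common value should be listed first.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Impala: collapses one value per backend
// into "value (count)" entries, most common first, ties broken by name.
std::string SummarizeCounts(const std::vector<std::string>& per_backend_values) {
  // Tally how many backends reported each distinct value.
  std::map<std::string, int> counts;
  for (const std::string& v : per_backend_values) ++counts[v];

  // The map iterates in name order; a stable sort by descending count
  // therefore keeps name order among entries with equal counts.
  std::vector<std::pair<std::string, int>> sorted(counts.begin(), counts.end());
  std::stable_sort(sorted.begin(), sorted.end(),
      [](const auto& a, const auto& b) { return a.second > b.second; });

  std::string out;
  for (const auto& [value, count] : sorted) {
    if (!out.empty()) out += ", ";
    out += value + " (" + std::to_string(count) + ")";
  }
  return out;
}
```

Each structured field (CPU type, CPU count, OS) could be passed through a helper like this separately, which keeps the profile line short even on large clusters.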

I think a good way to convey this information is via the BackendDescriptorPB 
protobuf structure that is sent by each backend when registering with the 
statestore. This already conveys various other information about the backend 
(e.g. IP address, admission control memory, etc.). The fields are currently set 
in ImpalaServer::BuildLocalBackendDescriptorInternal(), and the information is 
automatically conveyed via the statestore to the coordinator. The code on the 
coordinator to combine information about multiple backends would be associated 
with the ExecutorGroup.
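
As a rough illustration of the coordinator-side combining step, the sketch below uses a plain C++ struct as a stand-in for the new descriptor fields. The names BackendHwInfo, GroupHwSummary, and CombineBackends are assumptions made for this example; they are not existing BackendDescriptorPB fields or ExecutorGroup methods, and the real change would add fields to the protobuf instead.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for fields that could be added to the backend
// descriptor. Field names are assumptions, not Impala's actual schema.
struct BackendHwInfo {
  std::string cpu_model;
  int num_cores = 0;
  std::string os_version;
};

// Per-field tallies for one group of backends. Keeping each field separate
// avoids parsing a combined summary string on the coordinator.
struct GroupHwSummary {
  std::map<std::string, int> cpu_model_counts;
  std::map<int, int> core_count_counts;
  std::map<std::string, int> os_version_counts;
};

// Combines the descriptors of all backends in a group, one tally per field.
GroupHwSummary CombineBackends(const std::vector<BackendHwInfo>& backends) {
  GroupHwSummary summary;
  for (const BackendHwInfo& be : backends) {
    ++summary.cpu_model_counts[be.cpu_model];
    ++summary.core_count_counts[be.num_cores];
    ++summary.os_version_counts[be.os_version];
  }
  return summary;
}
```

Because every field is tallied independently, adding another piece of information in a later patch would only need one more field and one more map.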

> Add hardware and OS details to runtime profile
> ----------------------------------------------
>
>                 Key: IMPALA-12191
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12191
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: David Rorke
>            Assignee: Arnab Karmakar
>            Priority: Major
>              Labels: ramp-up
>
> The runtime profiles currently lack any details about the hardware the 
> query ran on (CPU model and core count, cache sizes, etc.), OS versions, etc., 
> which may all be relevant when analyzing performance issues, comparing 
> performance metrics across different profiles, etc.
> We should add relevant hardware and OS details to the profile.  The 
> information currently displayed at the root Impalad web UI page (Hardware 
> Info, OS Info) would be a good starting point.
> IMPALA-12118 is also relevant if we want to cover ARM processors.


