This is an automated email from the ASF dual-hosted git repository.

arodoni pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit f07be8dfd802d41ee5a4011d17f4aad4d7208d65
Author: Alex Rodoni <[email protected]>
AuthorDate: Fri Apr 12 18:00:52 2019 -0700

    IMPALA-7892 IMPALA-8416: [DOCS] Described the new network and disk info in query profiles
    
    - HostDiskReadThroughput
    - HostDiskWriteThroughput
    - HostNetworkRx
    - HostNetworkTx
    
    Change-Id: I25b128bc23f418347b400ca9e694d9d591935592
    Reviewed-on: http://gerrit.cloudera.org:8080/13006
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Lars Volker <[email protected]>
---
 docs/topics/impala_explain_plan.xml | 43 ++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/docs/topics/impala_explain_plan.xml b/docs/topics/impala_explain_plan.xml
index d983c21..b8f3168 100644
--- a/docs/topics/impala_explain_plan.xml
+++ b/docs/topics/impala_explain_plan.xml
@@ -189,16 +189,18 @@ under the License.
 
     <conbody>
 
-      <p>
-        The <codeph>PROFILE</codeph> statement, available in the <cmdname>impala-shell</cmdname> interpreter,
-        produces a detailed low-level report showing how the most recent query was executed. Unlike the
-        <codeph>EXPLAIN</codeph> plan described in <xref href="#perf_explain"/>, this information is only available
-        after the query has finished. It shows physical details such as the number of bytes read, maximum memory
-        usage, and so on for each node. You can use this information to determine if the query is I/O-bound or
-        CPU-bound, whether some network condition is imposing a bottleneck, whether a slowdown is affecting some
-        nodes but not others, and to check that recommended configuration settings such as short-circuit local
-        reads are in effect.
-      </p>
+      <p> The <codeph>PROFILE</codeph> command, available in the
+          <cmdname>impala-shell</cmdname> interpreter, produces a detailed
+        low-level report showing how the most recent query was executed. Unlike
+        the <codeph>EXPLAIN</codeph> plan described in <xref
+          href="#perf_explain"/>, this information is only available after the
+        query has finished. It shows physical details such as the number of
+        bytes read, maximum memory usage, and so on for each node. You can use
+        this information to determine if the query is I/O-bound or CPU-bound,
+        whether some network condition is imposing a bottleneck, whether a
+        slowdown is affecting some nodes but not others, and to check that
+        recommended configuration settings such as short-circuit local reads are
+        in effect. </p>
 
       <p rev="">
         By default, time values in the profile output reflect the wall-clock time taken by an operation.
@@ -223,14 +225,29 @@ under the License.
         section includes the following metrics that can be controlled by the
             <codeph><xref
             href="impala_resource_trace_ratio.xml#resource_trace_ratio"
-            >RESOURCE_TRACE_RATIO</xref></codeph> query option. The host CPU
-        usage metrics (user, system, and IO wait time) are already in the
-        section.</p>
+            >RESOURCE_TRACE_RATIO</xref></codeph> query option.</p>
       <ul>
+        <li>For each host that participates in the query execution it adds the
+          read and write bandwidth across all disks. This includes all data read
+          or written by the host as part of the execution of a query (spilling),
+          by the HDFS data node, and by other processes running on the same
+          system.</li>
         <li><codeph>CpuIoWaitPercentage</codeph>
         </li>
         <li><codeph>CpuSysPercentage</codeph></li>
         <li><codeph>CpuUserPercentage</codeph></li>
+        <li><codeph>HostDiskReadThroughput</codeph>: All data read by the host
+          as part of the execution of this query (spilling), by the HDFS data
+          node, and by other processes running on the same system.</li>
+        <li><codeph>HostDiskWriteThroughput</codeph>: All data written by the
+          host as part of the execution of this query (spilling), by the HDFS
+          data node, and by other processes running on the same system.</li>
+        <li><codeph>HostNetworkRx</codeph>: All data received by the host as
+          part of the execution of this query, other queries, and other
+          processes running on the same system. </li>
+        <li><codeph>HostNetworkTx</codeph>: All data transmitted by the host as
+          part of the execution of this query, other queries, and other
+          processes running on the same system. </li>
       </ul>
       <!--AR 3/11/2019 The below example is out dated and does not add much value. Hiding it until this doc gets refactored.-->
 
