Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11967 )

Change subject: IMPALA-1048: show sinks in exec summary
......................................................................


Patch Set 8:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@58
PS8, Line 58: Otherwise
            :   /// 'fragment_idx' should be -1.
> Are there any other cases than join builders that are sinks but not at the
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.h@60
PS8, Line 60: fragment_idx
> I think this could be easier to understand if it was named "data_sink_id" b
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/exec/data-sink.cc@72
PS8, Line 72:   int fragment_idx = fragment_ctx.fragment.idx;
> While you're here, could you add a DCHECK that this is not a JOIN_BUILD_SIN
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@501
PS8, Line 501:   // TODO: why do this every time we get an updated instance 
profile?
> While you're here, do you know why we do this?
It looks like a historical accident. The initial version of the summary was 
only computed at query end, but later logic was added to make it live. I think 
the easiest way to implement that was to just run the logic that was run at the 
end of the query on each update. This is the commit that added it.


    commit 9bd8b5b8cd74ad86830a9f41f3e98d33dc1f0aea
    Author: Nong Li <[email protected]>
    Date:   Mon Oct 6 13:57:58 2014 -0700

    Make GetExecSummary() RPC callable while query is running.

    Change-Id: I51fbade93b735dc007b1e84762194061a466947a
    Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4768
    Tested-by: jenkins
    Reviewed-by: Nong Li <[email protected]>
    Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5091


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator-backend-state.cc@518
PS8, Line 518: node_exec_summary
> rename to exec_summary?
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.h@285
PS8, Line 285: TPlanNodeId
> Maybe create a separate TDataSinkId (see comment elsewhere)?
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/runtime/coordinator.cc@246
PS8, Line 246: the
> nit: be
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/11967/8/be/src/util/runtime-profile.cc@870
PS8, Line 870:   // that rely on the plan node id being set there.
> Should we create a follow up jira to drop this in Impala 4.0 / on the next
I'm kind of inclined to just leave it, since it's not adding significant 
complexity.


http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift
File common/thrift/RuntimeProfile.thrift:

http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@68
PS8, Line 68: struct TRuntimeProfileNodeMetadata {
> Out of curiosity, is this approach preferable to storing the id and the typ
Hm that's a good question. I was thinking that this would allow evolution since 
we can could add richer information here with different types rather than just 
reinterpreting an int in different ways. 

Actually now that I'm looking at it, a union would make more sense.


http://gerrit.cloudera.org:8080/#/c/11967/8/common/thrift/RuntimeProfile.thrift@70
PS8, Line 70: i32
> We also have Types.TPlanNodeId, would that fit here?
Done (and added TDataSinkId).


http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/11967/8/fe/src/main/java/org/apache/impala/planner/DataSink.java@84
PS8, Line 84: toThrift
> I find overloads harder to follow when they have different visibility and s
Done - toThriftImpl()


http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/11967/8/shell/impala_client.py@211
PS8, Line 211:             "" if is_sink else prettyprint_units(cardinality),
> nit: you can write this as
I don't really like either alternative, but I feel like if/else is easier to 
parse even if the order is weird, so I'll leave unless someone feels really 
strongly.


http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@97
PS8, Line 97: result.exec_summary[0]
> You could name this part to root_sink to make the code less repetitive and
Done


http://gerrit.cloudera.org:8080/#/c/11967/8/tests/query_test/test_observability.py@109
PS8, Line 109: *
> nit: + instead of *
Done



--
To view, visit http://gerrit.cloudera.org:8080/11967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3fdf7bacae8ff597b255da65af453e174ba53544
Gerrit-Change-Number: 11967
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 03 Dec 2018 17:37:29 +0000
Gerrit-HasComments: Yes

Reply via email to