Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11425 )

Change subject: IMPALA-2063 Remove newline characters in query status.
......................................................................


Patch Set 8:

(1 comment)

> Patch Set 5:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc@551
PS5, Line 551:   info_strings_.emplace(key, std::move(value));
> I guess a more meaningful test is to verify that all "Query Status" (not ju
The test for double newlines effectively was a test that info strings aren't 
newline terminated, since the construction of the full profile text inserts 
newlines between each info string.  As I pointed out earlier, that doesn't work 
because there are other strings that will break that test without this change.  
More importantly, we only get to test the final profile text and not the 
individual strings that it is built from, so we don't have the context to 
identify if a newline is from an info string or if it was introduced in 
formatting.

There is no existing tests to validate correct formatting of  this text. 

The text overall is a free-for-all -- we don't exactly know how and where all 
possible strings come from, making it hard to
impose and then verify formatting rules.  (How do we know that our tests simply 
never generate the particular string that would break the profile text 
verification?)

Adding more specific tests in turn would result in us effectively writing tests 
to match our implementation (i.e. string matches).

FWIW, here's an example of a problematic profile text -- note the "WARNING" 
towards the end.

Query (id=324df3273ce7648d:8d1602ea00000000):
  DEBUG MODE WARNING: Query profile created while running a DEBUG build of 
Impala. Use RELEASE builds to measure query performance.
  Summary:
    Session ID: ea417f33ee6aee28:58213ba9cfaaf191
    Session Type: BEESWAX
    Start Time: 2018-10-02 11:16:51.674327000
    End Time: 2018-10-02 11:16:52.681041000
    Query Type: QUERY
    Query State: EXCEPTION
    Query Status: Cancelled
    Impala Version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
6247d6d41cfea7149704cf9345d014c17375f61b)
    User: mostrows
    Connected User: mostrows
    Delegated User:
    Network Address: ::ffff:127.0.0.1:51148
    Default Db: tpch_seq_snap
    Sql Statement: select count(l_returnflag) pk from lineitem
    Coordinator: mostrows-dev.vpc.cloudera.com:22000
    Query Options (set by configuration): 
ABORT_ON_ERROR=1,BUFFER_POOL_LIMIT=0,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,CPU_LIMIT_S=100000
    Query Options (set by configuration and planner): 
ABORT_ON_ERROR=1,BUFFER_POOL_LIMIT=0,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,MT_DOP=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,CPU_LIMIT_S=100000
    Plan:
----------------
Max Per-Host Resource Reservation: Memory=8.00MB Threads=3
Per-Host Resource Estimates: Memory=236MB
WARNING: The following tables are missing relevant table and/or column 
statistics.
tpch_seq_snap.lineitem

F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B 
thread-reservation=1
PLAN-ROOT SINK



--
To view, visit http://gerrit.cloudera.org:8080/11425
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1
Gerrit-Change-Number: 11425
Gerrit-PatchSet: 8
Gerrit-Owner: Michal Ostrowski <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Michal Ostrowski <[email protected]>
Gerrit-Comment-Date: Tue, 02 Oct 2018 18:19:22 +0000
Gerrit-HasComments: Yes

Reply via email to