Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11425 )
Change subject: IMPALA-2063 Remove newline characters in query status. ...................................................................... Patch Set 8: (1 comment) > Patch Set 5: > > (1 comment) http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11425/5/be/src/util/runtime-profile.cc@551 PS5, Line 551: info_strings_.emplace(key, std::move(value)); > I guess a more meaningful test is to verify that all "Query Status" (not ju The test for double newlines effectively was a test that info strings aren't newline terminated, since the construction of the full profile text inserts newlines between each info string. As I pointed out earlier, that doesn't work because there are other strings that will break that test without this change. More importantly, we only get to test the final profile text and not the individual strings that it is built from, so we don't have the context to identify if a newline is from an info string or if it was introduced in formatting. There is no existing tests to validate correct formatting of this text. The text overall is a free-for-all -- we don't exactly know how and where all possible strings come from, making it hard to impose and then verify formatting rules. (How do we know that our tests simply never generate the particular string that would break the profile text verification?) Adding more specific tests in turn would result in us effectively writing tests to match our implementation (i.e. string matches). FWIW, here's an example of a problematic profile text -- note the "WARNING" towards the end. Query (id=324df3273ce7648d:8d1602ea00000000): DEBUG MODE WARNING: Query profile created while running a DEBUG build of Impala. Use RELEASE builds to measure query performance. Summary: Session ID: ea417f33ee6aee28:58213ba9cfaaf191 Session Type: BEESWAX Start Time: 2018-10-02 11:16:51.674327000 End Time: 2018-10-02 11:16:52.681041000 Query Type: QUERY Query State: EXCEPTION Query Status: Cancelled Impala Version: impalad version 3.1.0-SNAPSHOT DEBUG (build 6247d6d41cfea7149704cf9345d014c17375f61b) User: mostrows Connected User: mostrows Delegated User: Network Address: ::ffff:127.0.0.1:51148 Default Db: tpch_seq_snap Sql Statement: select count(l_returnflag) pk from lineitem Coordinator: mostrows-dev.vpc.cloudera.com:22000 Query Options (set by configuration): ABORT_ON_ERROR=1,BUFFER_POOL_LIMIT=0,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,CPU_LIMIT_S=100000 Query Options (set by configuration and planner): ABORT_ON_ERROR=1,BUFFER_POOL_LIMIT=0,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,MT_DOP=0,DISABLE_CODEGEN_ROWS_THRESHOLD=0,CPU_LIMIT_S=100000 Plan: ---------------- Max Per-Host Resource Reservation: Memory=8.00MB Threads=3 Per-Host Resource Estimates: Memory=236MB WARNING: The following tables are missing relevant table and/or column statistics. tpch_seq_snap.lineitem F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B thread-reservation=1 PLAN-ROOT SINK -- To view, visit http://gerrit.cloudera.org:8080/11425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2bbd7d7fe2c6d0f3799d0e6b336710bccfef0ab1 Gerrit-Change-Number: 11425 Gerrit-PatchSet: 8 Gerrit-Owner: Michal Ostrowski <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Michal Ostrowski <[email protected]> Gerrit-Comment-Date: Tue, 02 Oct 2018 18:19:22 +0000 Gerrit-HasComments: Yes
