Stamatis Zampetakis created HIVE-26168:
------------------------------------------

             Summary: EXPLAIN DDL command output is not deterministic 
                 Key: HIVE-26168
                 URL: https://issues.apache.org/jira/browse/HIVE-26168
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
            Reporter: Stamatis Zampetakis


The EXPLAIN DDL command (HIVE-24596) can be used to recreate the schema for a 
given query in order to debug planner issues. This is achieved by fetching 
information from the metastore and outputting a series of DDL commands.

However, the output commands may appear in a different order from one run to
the next, since there is no mechanism to enforce an explicit order.

Consider for instance the following scenario.

{code:sql}
CREATE TABLE customer
(
    `c_custkey` bigint,
    `c_name`    string,
    `c_address` string
);

INSERT INTO customer VALUES (1, 'Bob', '12 avenue Mansart'),
                            (2, 'Alice', '24 avenue Mansart');

EXPLAIN DDL SELECT c_custkey FROM customer WHERE c_name = 'Bob'; 
{code}

+Result 1+

{noformat}
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}

+Result 2+

{noformat}
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}

The two results are equivalent, but the statements appear in a different order.
This is not a major issue because the results remain correct, but it may lead
to test flakiness, so it might be worth addressing.
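
As a rough illustration of what an explicit order could look like, the sketch
below groups each statement with the comment lines that follow it and sorts the
groups, so that two equivalent outputs compare equal. It is not Hive code and
not a proposed patch: the function name canonical_order and the variables
run_1 and run_2 are hypothetical, the sample text is an abbreviated copy of the
results above, and it assumes every statement and every comment sits on a
single line.

```python
def canonical_order(explain_output: str) -> list[str]:
    """Return the statements of an EXPLAIN DDL output in sorted order.

    Assumption: each statement and each comment occupies exactly one line,
    and a '--' comment belongs to the statement directly above it.
    """
    units: list[list[str]] = []
    for line in explain_output.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("--") and units:
            units[-1].append(line)  # keep the comment with its statement
        else:
            units.append([line])  # start a new statement group
    return sorted("\n".join(unit) for unit in units)


TABLE_STATS = (
    "ALTER TABLE default.customer UPDATE STATISTICS "
    "SET('numRows'='2','rawDataSize'='48' );"
)
CUSTKEY_STATS = (
    "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey "
    "SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );\n"
    "-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT "
    "THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS "
    "SExMoAICwfO+SIOOofED"
)

# Same statements, emitted in a different order by two hypothetical runs.
run_1 = TABLE_STATS + "\n" + CUSTKEY_STATS
run_2 = CUSTKEY_STATS + "\n" + TABLE_STATS

print(run_1 == run_2)                                    # raw text differs
print(canonical_order(run_1) == canonical_order(run_2))  # sorted groups match
```

Sorting inside EXPLAIN DDL itself would make the raw output stable. Applying a
normalization like this on the test side would only hide the difference from
the comparison, so which of the two is preferable is left open here.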



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
