Joe McDonnell created IMPALA-9759:
-------------------------------------
Summary: Revisit integration of snapshot dataload with s3guard
Key: IMPALA-9759
URL: https://issues.apache.org/jira/browse/IMPALA-9759
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell
Sometimes, the s3 jobs (which use s3guard for consistency) see test failures
due to missing files from the dataload snapshot (see the output at the bottom). This may be
related to the interaction of snapshot loading with s3guard. We should nail
down exactly the right procedure for loading the snapshot. Currently, we do the
following:
1. Remove any data from the s3 bucket via the s3 command line
2. Create the s3guard dynamodb table (or reuse the existing one if a previous job
   failed without deleting the old dynamodb table)
3. Prune any existing entries from that table
4. Load the snapshot to the s3 bucket
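For discussion, the four steps above can be written down as an ordered command plan. This is only a sketch and is not copied from the Impala scripts: the function name, the dynamodb table name, the region, the snapshot directory and the "test-warehouse" prefix handling are all assumptions, and it presumes a Hadoop release that still ships the "hadoop s3guard" tool. It only builds the command lists and does not run anything.

```python
def snapshot_load_plan(bucket, table, region, snapshot_dir, warehouse="test-warehouse"):
    """Return the four steps as argv lists, in the order they would run.

    Hypothetical helper: the commands are assumed equivalents of the steps
    in this issue, not the commands the jobs actually execute.
    """
    s3_url = f"s3://{bucket}/{warehouse}"
    meta = f"dynamodb://{table}"
    return [
        # 1. Remove any data from the s3 bucket.
        ["aws", "s3", "rm", "--recursive", s3_url],
        # 2. Create the s3guard dynamodb table (assumed to tolerate an existing table).
        ["hadoop", "s3guard", "init", "-meta", meta, "-region", region],
        # 3. Prune existing entries; a small positive age is assumed here.
        ["hadoop", "s3guard", "prune", "-seconds", "1", "-meta", meta, "-region", region],
        # 4. Load the snapshot to the s3 bucket (aws s3 sync is an assumed choice).
        ["aws", "s3", "sync", snapshot_dir, s3_url],
    ]


# Example with placeholder table, region and snapshot directory.
plan = snapshot_load_plan(
    "impala-test-uswest2-1", "example-s3guard-table", "us-west-2", "/tmp/snapshot"
)
```

Writing the plan out this way makes it easier to check each step (and its ordering relative to the others) when verifying the procedure.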
In theory, this procedure leaves s3guard with an empty dynamodb table and an s3 bucket with
data. As tests progress and try to access the s3 bucket, s3guard would see that
there is no entry in the dynamodb table and then check the underlying s3 bucket.
We need to revisit these steps and verify that everything is being done
correctly.
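To make the expected behavior concrete, here is a toy model of the lookup described above. It is not s3guard's implementation: the class, its fields and the file name are hypothetical, and the "deleted" marker is only an assumed way a leftover table entry could disagree with the bucket. It shows that an empty table falls through to the bucket, while a stale entry would be answered from the table instead.

```python
class ToyGuardedStore:
    """Toy model only; not how s3guard is actually implemented."""

    def __init__(self, bucket_paths, table=None):
        # Paths that really exist in the (simulated) s3 bucket.
        self.bucket = set(bucket_paths)
        # Simulated metadata table: path -> True (present) or False (marked deleted).
        self.table = dict(table or {})

    def exists(self, path):
        # An entry in the table is trusted without consulting the bucket.
        if path in self.table:
            return self.table[path]
        # No entry: check the underlying bucket and remember a positive result.
        found = path in self.bucket
        if found:
            self.table[path] = True
        return found


# Hypothetical file name under the partition seen in the failure below.
path = "test-warehouse/alltypes/year=2009/month=5/data.txt"

empty_table = ToyGuardedStore([path])
stale_table = ToyGuardedStore([path], {path: False})

print(empty_table.exists(path))  # True: falls through to the bucket
print(stale_table.exists(path))  # False: the leftover entry hides the file
```

If the real table can end up with entries like the second case after step 3, that would be one way for files to appear missing; this needs to be confirmed rather than assumed.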
{noformat}
metadata/test_metadata_query_statements.py:70: in test_show_stats
self.run_test_case('QueryTest/show-stats', vector, "functional")
common/impala_test_suite.py:687: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:523: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
assert expected_results == actual_results
E assert Comparing QueryTestResults (expected vs actual):
E '2009','1',310,1,'19.95KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1'
== '2009','1',310,1,'19.95KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=1'
E '2009','10',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10'
== '2009','10',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=10'
E '2009','11',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11'
== '2009','11',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=11'
E '2009','12',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12'
== '2009','12',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=12'
E '2009','2',280,1,'18.12KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2'
== '2009','2',280,1,'18.12KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=2'
E '2009','3',310,1,'20.06KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3'
== '2009','3',310,1,'20.06KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=3'
E '2009','4',300,1,'19.61KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4'
== '2009','4',300,1,'19.61KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=4'
E '2009','5',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5'
!= '2009','5',0,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=5'
E '2009','6',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6'
== '2009','6',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=6'
E '2009','7',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7'
== '2009','7',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=7'
E '2009','8',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8'
== '2009','8',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=8'
E '2009','9',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9'
== '2009','9',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2009/month=9'
E '2010','1',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1'
== '2010','1',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=1'
E '2010','10',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10'
== '2010','10',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=10'
E '2010','11',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11'
== '2010','11',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=11'
E '2010','12',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12'
== '2010','12',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=12'
E '2010','2',280,1,'18.39KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2'
== '2010','2',280,1,'18.39KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=2'
E '2010','3',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3'
== '2010','3',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=3'
E '2010','4',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4'
== '2010','4',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=4'
E '2010','5',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5'
== '2010','5',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=5'
E '2010','6',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6'
== '2010','6',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=6'
E '2010','7',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7'
== '2010','7',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=7'
E '2010','8',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8'
== '2010','8',310,1,'20.36KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=8'
E '2010','9',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9'
== '2010','9',300,1,'19.71KB','NOT CACHED','NOT
CACHED','TEXT','false','s3a://impala-test-uswest2-1/test-warehouse/alltypes/year=2010/month=9'
E 'Total','',7300,24,'478.45KB','0B','','','','' !=
'Total','',6990,24,'478.45KB','0B','','','',''
{noformat}
This also shows up in cardinality calculations:
{noformat}
metadata/test_explain.py:113: in test_explain_validate_cardinality_estimates
check_cardinality(result.data, '7.30K')
metadata/test_explain.py:98: in check_cardinality
query_result, expected_cardinality=expected_cardinality)
metadata/test_explain.py:86: in check_row_size_and_cardinality
assert m.groups()[1] == expected_cardinality
E assert '6.99K' == '7.30K'
E - 6.99K
E + 7.30K
{noformat}
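The numbers in the two failures above appear consistent with a single partition's rows going missing. A quick arithmetic check, using only values from the output above (the variable names are just for illustration):

```python
# Values taken from the test_show_stats output above.
expected_total_rows = 7300
actual_total_rows = 6990
month5_expected_rows = 310   # year=2009/month=5, expected
month5_actual_rows = 0       # year=2009/month=5, actual

missing_rows = expected_total_rows - actual_total_rows
partition_gap = month5_expected_rows - month5_actual_rows

print(missing_rows)                   # 310
print(missing_rows == partition_gap)  # True
```

So the drop in the total (and the 6.99K vs 7.30K cardinality) is fully accounted for by the year=2009/month=5 partition reporting 0 rows.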