[ 
https://issues.apache.org/jira/browse/IMPALA-12840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821008#comment-17821008
 ] 

ASF subversion and git services commented on IMPALA-12840:
----------------------------------------------------------

Commit 0c0a3fff39839b400370433568f37a317b7d4800 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c0a3fff3 ]

IMPALA-12840: Exclude THdfsFileDesc in getJsonCatalogObject

TestReusePartitions::test_reuse_partitions_transactional calls
/catalog_object path of CatalogD's WebUI and decodes the JSON response.
The "json_string" field from the response text often contains Unicode
control characters that come from serialized binary data from
THdfsFileDesc objects. That causes JSON decoding to fail with an error
like this:

ValueError: Invalid control character at: line 1 column 1850 (char 1849)
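
The failure can be reproduced outside Impala with a minimal sketch. It
assumes the response is a JSON document whose "json_string" field holds
a second JSON document as text, which is what the nested json.loads()
calls in the stack trace suggest. The payload and the "file_desc" field
name are hypothetical stand-ins, not real catalog output.

```python
import json

# Hypothetical inner document: a raw 0x01 byte inside a string value stands
# in for serialized binary data.
inner_text = '{"file_desc": "abc\x01def"}'

# json.dumps escapes the control character as \u0001, so the outer
# response text is itself valid JSON.
response_text = json.dumps({"json_string": inner_text})

outer = json.loads(response_text)  # first decode succeeds

try:
    json.loads(outer["json_string"])  # second decode hits the raw 0x01
    error = None
except ValueError as e:
    error = str(e)

print(error)
```

Python's json.loads() also accepts strict=False, which tolerates control
characters inside strings. The patch does not rely on that; it removes
the offending data from the response instead.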

This patch attempts to deflake the test by having
getJsonCatalogObject() return an object that excludes THdfsFileDesc,
which it does by lowering the detail level from ThriftObjectType.FULL
to ThriftObjectType.DESCRIPTOR_ONLY.

test_reuse_partitions_transactional is tweaked a bit to print the
response / JSON object if an assertion fails.
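
One way to surface the response on failure is sketched below. This is
not the code from the patch: the helper name decode_catalog_object and
its message format are hypothetical, and only the nested json.loads()
call mirrors the line shown in the stack trace.

```python
import json


def decode_catalog_object(response_text):
    """Decode the nested JSON, attaching the raw text to any failure.

    Hypothetical helper for illustration; not part of the Impala test suite.
    """
    try:
        outer = json.loads(response_text)
        return json.loads(outer["json_string"])
    except (ValueError, KeyError) as e:
        # %r keeps control characters visible as escapes in the message.
        raise AssertionError(
            "Failed to decode catalog object: %s\nResponse: %r"
            % (e, response_text))


# Usage with a well-formed (made-up) response.
good = json.dumps({"json_string": json.dumps({"partitions": [1, 2]})})
print(decode_catalog_object(good))
```

Formatting the response with %r means a stray control character appears
as an escape sequence in the test log rather than as an invisible byte.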

Testing:
- Looped the test a hundred times; it passed every time.

Change-Id: I5f6840bf1267d1d99d321c0a6b4a0cab49543182
Reviewed-on: http://gerrit.cloudera.org:8080/21064
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> test_reuse_partitions_transactional is flaky
> --------------------------------------------
>
>                 Key: IMPALA-12840
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12840
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Riza Suminto
>            Priority: Major
>              Labels: broken-build
>         Attachments: test_reuse_partitions.diff, 
> test_reuse_partitions_nontransactional-error.txt
>
>
> TestReusePartitions::test_reuse_partitions_transactional has been 
> increasingly flaky in both upstream and downstream builds. The most recent 
> occurrence is at
> [https://jenkins.impala.io/job/ubuntu-20.04-dockerised-tests/1310/testReport/junit/metadata.test_reuse_partitions/TestReusePartitions/test_reuse_partitions_nontransactional/]
> {code:java}
> Error Message
> ValueError: Invalid control character at: line 1 column 1850 (char 1849)
> Stacktrace
> metadata/test_reuse_partitions.py:55: in 
> test_reuse_partitions_nontransactional
>     self.__test_reuse_partitions_helper(unique_database, transactional=False)
> metadata/test_reuse_partitions.py:83: in __test_reuse_partitions_helper
>     new_part_ids = self.__get_partition_id_set(unique_database, tbl_name)
> metadata/test_reuse_partitions.py:47: in __get_partition_id_set
>     catalog_obj = json.loads(json.loads(response.text)["json_string"])
> ../toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/json/__init__.py:339:
>  in loads
>     return _default_decoder.decode(s)
> ../toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/json/decoder.py:364:
>  in decode
>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
> ../toolchain/toolchain-packages-gcc10.4.0/python-2.7.16/lib/python2.7/json/decoder.py:380:
>  in raw_decode
>     obj, end = self.scan_once(s, idx)
> E   ValueError: Invalid control character at: line 1 column 1850 (char 1849)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
