[
https://issues.apache.org/jira/browse/IMPALA-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892426#comment-17892426
]
Daniel Becker commented on IMPALA-13471:
----------------------------------------
If NDV stats were generally unavailable in Ozone,
{{custom_cluster.test_iceberg_with_puffin.TestIcebergTableWithPuffinStats.test_puffin_stats}}
would also have failed.
In the logs we can see:
{code:java}
W1022 12:51:59.890749 31907 PuffinStatsLoader.java:116] Could not load Iceberg
Puffin column statistics for table
'functional_parquet.iceberg_with_puffin_stats' from Puffin file
'/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/20240906_085606_00006_wsfgs-4d9242d5-bd79-4069-be8b-2cfced8e0647.stats'.
Exception: org.apache.iceberg.exceptions.RuntimeIOException: Failed to open
input stream for file:
/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/20240906_085606_00006_wsfgs-4d9242d5-bd79-4069-be8b-2cfced8e0647.stats
{code}
The path in the warning above has no filesystem scheme. I think it should start
with {{ofs://}} on Ozone, as it does in the passing test
{{custom_cluster.test_iceberg_with_puffin.TestIcebergTableWithPuffinStats.test_puffin_stats}}.
The failing test uses a table loaded at dataload, whereas the passing Puffin
test creates its table on the fly. I'll check how we handle the paths.
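To make the suspicion concrete, here is a minimal Python sketch (not Impala code) of the check I have in mind: detect a path without a scheme and qualify it against a default filesystem URI. The helper names {{has_scheme}} and {{qualify}}, and the {{ofs://localhost:9862}} default, are assumptions made up for illustration; the real default filesystem depends on the cluster configuration.

```python
from urllib.parse import urlparse

# Hypothetical default filesystem URI; the real value depends on the cluster.
DEFAULT_FS = "ofs://localhost:9862"


def has_scheme(path: str) -> bool:
    """Return True if the path carries a URI scheme such as ofs:// or hdfs://."""
    return urlparse(path).scheme != ""


def qualify(path: str, default_fs: str = DEFAULT_FS) -> str:
    """Prefix a scheme-less path with the default filesystem URI.

    Paths that already have a scheme are returned unchanged.
    """
    if has_scheme(path):
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# The path from the warning has no scheme, so it would need qualifying.
logged = "/test-warehouse/iceberg_test/iceberg_with_puffin_stats/metadata/x.stats"
print(has_scheme(logged))
print(qualify(logged))
```

If the Puffin file path stored for the dataload table looks like {{logged}} above, opening it as-is on Ozone would plausibly fail in the way the warning shows, while a path written by a table created on the fly would already be fully qualified.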
> test_enable_reading_puffin() seems to fail in the Ozone build
> -------------------------------------------------------------
>
> Key: IMPALA-13471
> URL: https://issues.apache.org/jira/browse/IMPALA-13471
> Project: IMPALA
> Issue Type: Bug
> Reporter: Fang-Yu Rao
> Assignee: Daniel Becker
> Priority: Major
> Labels: broken-build
>
> We found that the test
> [test_enable_reading_puffin()|https://github.com/apache/impala/blame/master/tests/custom_cluster/test_iceberg_with_puffin.py#L59]
> added in IMPALA-13247 seems to fail in the Ozone build.
> +*Error Message*+
> {code}
> assert [-1, -1] == [2, 2] At index 0 diff: -1 != 2 Full diff: - [-1,
> -1] + [2, 2]
> {code}
> +*Stacktrace*+
> {code}
> custom_cluster/test_iceberg_with_puffin.py:50: in test_enable_reading_puffin
> self._read_ndv_stats_expect_result([2, 2])
> custom_cluster/test_iceberg_with_puffin.py:59: in _read_ndv_stats_expect_result
> assert ndvs == expected_ndv_stats
> E assert [-1, -1] == [2, 2]
> E At index 0 diff: -1 != 2
> E Full diff:
> E - [-1, -1]
> E + [2, 2]
> {code}
> According to the above, in the Ozone build the result of "show column stats"
> was [-1, -1]. It looks like the NDV statistics are not available in the Ozone
> build.