Hari Reddy created KUDU-3327:
--------------------------------

             Summary: Unit test failures on a test build/port of Kudu on 
ppc64le architecture
                 Key: KUDU-3327
                 URL: https://issues.apache.org/jira/browse/KUDU-3327
             Project: Kudu
          Issue Type: Test
          Components: compaction
    Affects Versions: 1.15.0
         Environment: Kudu:  V 1.15.0
Architectures tested: x86-64, IBM ppc64le (kudu test build)
x86-64
Linux 4.18.0-240.15.1.el8_3.x86_64 #1 SMP Wed Feb 3 03:12:15 EST 2021 x86_64 
x86_64 x86_64 GNU/Linux

ppc64le
Linux 4.18.0-240.15.1.el8_3.ppc64le #1 SMP Wed Feb 3 03:10:19 EST 2021 ppc64le 
ppc64le ppc64le GNU/Linux


            Reporter: Hari Reddy


Hello,

 

We are porting Apache Kudu 1.15.0 to the IBM ppc64le (PowerPC Linux, 
little-endian) platform. We have a test build and ran the unit tests with 
"ctest", as described in the build instructions. The results are as follows:

 

On ppc64le:

93% tests passed, 31 tests failed out of 456

On x86-64:

93% tests passed, 34 tests failed out of 456

 

The results are similar, except for a few test cases that fail on only one of 
the two architectures.

We have only recently become familiar with the design and code base of Kudu. 
We would appreciate help and guidance from those in the community who know 
these test cases in getting the failures resolved. A summary of the results 
is attached at the end of this note.

To start the discussion, I will pick one test case that failed only on 
ppc64le: compaction-test. This test consists of 23 sub-tests. On x86-64 all 23 
pass. On ppc64le 22 of them pass and only the following one fails:

TEST_F(TestCompaction, TestFlushMRSWithRolling)

 

TestFlushMRSWithRolling has three EXPECT_EQ assertions. On ppc64le the first 
passes and the next two fail.

 

  rows.clear();
  rowsets[0]->DebugDump(&rows);
  EXPECT_EQ(R"(RowIdxInBlock: 0; Base: (string key="hello 00000000", int32 val=0, )"
            "int32 nullable_val=0); Undo Mutations: [@1(DELETE)]; Redo Mutations: [];",
            rows[0]);

  rows.clear();
  rowsets[1]->DebugDump(&rows);
  EXPECT_EQ(R"(RowIdxInBlock: 0; Base: (string key="hello 00017150", int32 val=1715, )"
            "int32 nullable_val=NULL); Undo Mutations: [@1716(DELETE)]; Redo Mutations: [];",
            rows[0]);
  EXPECT_EQ(R"(RowIdxInBlock: 1; Base: (string key="hello 00017160", int32 val=1716, )"
            "int32 nullable_val=1716); Undo Mutations: [@1717(DELETE)]; Redo Mutations: [];",
            rows[1]);

 

 

Tracing the data shows that rowset[0] holds 5 fewer records on ppc64le than on 
x86-64. On x86-64, rowset[0] has 1715 records (17 blocks of 100 each plus an 
18th block of 15); on ppc64le, rowset[0] has 1710 records (17 blocks of 100 
each plus an 18th block of 10). Consequently, the first block of rowset[1] on 
x86-64 is composed as follows:

 

CompactionInputRowToString(input_row)

RowIdxInBlock: 0; Base: (string key="hello 00017150", int32 val=1715, int32 
nullable_val=NULL); Undo Mutations: [@1716(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 1; Base: (string key="hello 00017160", int32 val=1716, int32 
nullable_val=1716); Undo Mutations: [@1717(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 2; Base: (string key="hello 00017170", int32 val=1717, int32 
nullable_val=NULL); Undo Mutations: [@1718(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 3; Base: (string key="hello 00017180", int32 val=1718, int32 
nullable_val=1718); Undo Mutations: [@1719(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 4; Base: (string key="hello 00017190", int32 val=1719, int32 
nullable_val=NULL); Undo Mutations: [@1720(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 5; Base: (string key="hello 00017200", int32 val=1720, int32 
nullable_val=1720); Undo Mutations: [@1721(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 6; Base: (string key="hello 00017210", int32 val=1721, int32 
nullable_val=NULL); Undo Mutations: [@1722(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

 

On ppc64le it is:

 

Block 0

CompactionInputRowToString(input_row)

RowIdxInBlock: 0; Base: (string key="hello 00017100", int32 val=1710, int32 
nullable_val=1710); Undo Mutations: [@1711(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 1; Base: (string key="hello 00017110", int32 val=1711, int32 
nullable_val=NULL); Undo Mutations: [@1712(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 2; Base: (string key="hello 00017120", int32 val=1712, int32 
nullable_val=1712); Undo Mutations: [@1713(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 3; Base: (string key="hello 00017130", int32 val=1713, int32 
nullable_val=NULL); Undo Mutations: [@1714(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 4; Base: (string key="hello 00017140", int32 val=1714, int32 
nullable_val=1714); Undo Mutations: [@1715(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

RowIdxInBlock: 5; Base: (string key="hello 00017150", int32 val=1715, int32 
nullable_val=NULL); Undo Mutations: [@1716(DELETE)]; Redo Mutations: [];

CompactionInputRowToString(input_row)

 

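This means that on ppc64le the row expected by the second assertion 
(key="hello 00017150") appears at RowIdxInBlock: 5 instead of RowIdxInBlock: 0, 
so the second and third assertions fail. We are trying to locate where in the 
code the record counts diverge between the two implementations.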
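
The shift can be reproduced with a small standalone sketch. It is not Kudu 
code: ExpectedRow and RowInSecondRowset are hypothetical helpers, and the row 
pattern (key = 10 * i zero-padded to 8 digits, val = i, nullable_val = i for 
even i and NULL for odd i, undo DELETE at timestamp i + 1) is only inferred 
from the traces above. The 1715 and 1710 row counts are likewise read off the 
first row of rowset[1] on each platform.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Kudu): formats the dump line for logical
// row i, following the pattern inferred from the traces in this report.
std::string ExpectedRow(int idx_in_block, int i) {
  assert(i >= 0);
  // Even rows carry nullable_val = i, odd rows carry NULL (inferred pattern).
  const std::string nullable = (i % 2 == 0) ? std::to_string(i) : "NULL";
  char buf[256];
  std::snprintf(buf, sizeof(buf),
                "RowIdxInBlock: %d; Base: (string key=\"hello %08d\", int32 val=%d, "
                "int32 nullable_val=%s); Undo Mutations: [@%d(DELETE)]; "
                "Redo Mutations: [];",
                idx_in_block, i * 10, i, nullable.c_str(), i + 1);
  return std::string(buf);
}

// Row found at position idx_in_block of the first block of rowset[1], assuming
// rowset[0] holds the first `first_rowset_rows` logical rows.
std::string RowInSecondRowset(int first_rowset_rows, int idx_in_block) {
  return ExpectedRow(idx_in_block, first_rowset_rows + idx_in_block);
}
```

Under these assumptions, RowInSecondRowset(1715, 0) yields the string the 
second assertion expects, while with 1710 rows in rowset[0] the same string is 
only produced by RowInSecondRowset(1710, 5), which matches the ppc64le trace.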

We would appreciate your help in resolving this and the other unit test 
failures as we complete the Kudu port to ppc64le.

 

 

Summary of Unit test case Failures on ppc64le and x86-64

 
|*Unit test case failed*|*x86-64*|*ppc64le*|
|auth_token_expire-itest|x86-64|ppc64le|
|hms_catalog-test.0|x86-64|ppc64le|
|hms_catalog-test.1|x86-64|ppc64le|
|hms_catalog-test.2|x86-64|ppc64le|
|hms_catalog-test.3|x86-64|ppc64le|
|hms_client-test.0|x86-64|ppc64le|
|hms_client-test.1|x86-64|ppc64le|
|hms_client-test.2|x86-64|ppc64le|
|hms_client-test.3|x86-64|ppc64le|
|kudu-tool-test.0|x86-64|ppc64le|
|kudu-tool-test.1|x86-64|ppc64le|
|kudu-tool-test.2|x86-64|ppc64le|
|kudu-tool-test.3|x86-64|ppc64le|
|master_authz-itest.0|x86-64|ppc64le|
|master_authz-itest.1|x86-64|ppc64le|
|master_authz-itest.2|x86-64|ppc64le|
|master_authz-itest.3|x86-64|ppc64le|
|master_authz-itest.4|x86-64|ppc64le|
|master_authz-itest.5|x86-64|ppc64le|
|master_authz-itest.6|x86-64|ppc64le|
|master_authz-itest.7|x86-64|ppc64le|
|master_hms-itest|x86-64|ppc64le|
|negotiation-test|x86-64|ppc64le|
|security-itest|x86-64|ppc64le|
|alter_table-randomized-test.1|x86-64| |
|client_examples-test|x86-64| |
|master_failover-itest.1|x86-64| |
|master_failover-itest.3|x86-64| |
|master-stress-test.1|x86-64| |
|mini_kdc-test|x86-64| |
|mini_postgres-test|x86-64| |
|mini_ranger-test|x86-64| |
|ranger_client-test|x86-64| |
|webserver-test|x86-64| |
|client_symbol-test| |ppc64le|
|compaction-test| |ppc64le|
|debug-util-test| |ppc64le|
|fs_manager-test| |ppc64le|
|log-rolling-itest| |ppc64le|
|oid_generator-test| |ppc64le|
|yamlreader-test| |ppc64le|
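
As a cross-check of the table against the ctest totals, the rows can be 
tallied by platform. The following is a minimal sketch, assuming each data row 
has the form |name|x86-64 or blank|ppc64le or blank| as above; Tally, Trim and 
TallyRows are hypothetical names introduced for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Counts of failing tests by where they fail.
struct Tally {
  int both = 0;
  int x86_only = 0;
  int ppc64le_only = 0;
};

// Strips leading and trailing spaces/tabs from a table cell.
static std::string Trim(const std::string& s) {
  const auto b = s.find_first_not_of(" \t");
  if (b == std::string::npos) return "";
  const auto e = s.find_last_not_of(" \t");
  return s.substr(b, e - b + 1);
}

// Splits each row on '|' and classifies it by its two platform cells.
// Rows without both markers absent/present as expected (e.g. the header)
// are simply not counted.
Tally TallyRows(const std::vector<std::string>& rows) {
  Tally t;
  for (const auto& row : rows) {
    std::vector<std::string> cells;
    std::istringstream in(row);
    std::string cell;
    while (std::getline(in, cell, '|')) cells.push_back(Trim(cell));
    if (cells.size() < 4) continue;  // cells[0] is the empty text before the first '|'
    const bool x86 = cells[2] == "x86-64";
    const bool ppc = cells[3] == "ppc64le";
    if (x86 && ppc) {
      ++t.both;
    } else if (x86) {
      ++t.x86_only;
    } else if (ppc) {
      ++t.ppc64le_only;
    }
  }
  return t;
}
```

Counting the table rows this way gives 24 failures common to both platforms, 
10 only on x86-64 and 7 only on ppc64le, consistent with the ctest totals of 
34 and 31.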

Thanks and regards,

 

Hari

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
