[
https://issues.apache.org/jira/browse/KUDU-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xixu Wang updated KUDU-3438:
----------------------------
Description:
The TabletCopyClientAbortTest unit test may crash with a core dump. See the
stack trace below.
{code:java}
/root/kudu/src/kudu/tserver/tablet_server-test-base.cc:130: Failure
Failed
Bad status: IO error: Couldn't create tablet metadata: Failed to create
TabletMetadata: All healthy data directories are full (error 28)
W20230123 18:02:20.993072 869956 reactor.cc:684] Failed to create an outbound
connection to 255.255.255.255:1 because connect() failed: Network error:
connect(2) error: Network is unreachable (error 101)
/root/kudu/src/kudu/tserver/tablet_copy-test-base.h:49: Failure
Expected: StartTabletServer(kNumDataDirs) doesn't generate new fatal failures in the
current thread.
  Actual: it does.
/root/kudu/src/kudu/tserver/tablet_copy_client-test.cc:112: Failure
Expected: TabletCopyTest::SetUp() doesn't generate new fatal failures in the current
thread.
  Actual: it does.
W20230123 18:02:20.993108 870018 heartbeater.cc:399] Failed 3 heartbeats in a
row: no longer allowing fast heartbeat attempts.
*** Aborted at 1674468140 (unix time) try "date -d @1674468140" if you are
using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 868247 (TID 0x7f2d76bb8a00) from PID 0;
stack trace: ***
@ 0x7f2d7964e9f6 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f2d7d4c6630 (unknown)
@ 0x4a32d0 kudu::tserver::TabletCopyClientTest::StartCopy()
@ 0x4a51c8 kudu::tserver::TabletCopyClientAbortTest::SetUp()
@ 0x7f2d81704bfe testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x7f2d816f9566 testing::Test::Run()
@ 0x7f2d816f9795 testing::TestInfo::Run()
@ 0x7f2d816f9cdf testing::TestSuite::Run()
@ 0x7f2d816fa29f testing::internal::UnitTestImpl::RunAllTests()
@ 0x7f2d8170513e testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x7f2d816f983d testing::UnitTest::Run()
@ 0x7f2d81cc7f76 RUN_ALL_TESTS()
@ 0x7f2d81cc72e6 main
@ 0x7f2d77f17555 __libc_start_main
@ 0x48e879 (unknown)
Segmentation fault (core dumped)
{code}
The cause is that TabletCopyClientTest::SetUp(), which runs as part of the
TabletCopyClientAbortTest setup, may fail, for example because the disk is
full. In that case the TabletCopyClient is never initialized, so using it in
StartCopy() crashes with a segmentation fault and dumps core.
!image-2023-01-30-10-10-40-439.png!
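The failure mode described above can be illustrated with a minimal sketch. It is not the Kudu test code and not the actual patch. In GoogleTest, a fatal assertion only returns from the function it appears in, so a derived SetUp() keeps running after the base SetUp() has failed unless it checks for the failure (for example with HasFatalFailure() or ASSERT_NO_FATAL_FAILURE). The sketch models that behavior without GoogleTest so it stays self-contained. The names FakeCopyClient, BaseFixture, AbortFixture, and RunAbortSetUp are hypothetical, and the disk_full flag is an assumed stand-in for the "data directories are full" error.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for TabletCopyClient.
struct FakeCopyClient {
  std::string Start() const { return "copy started"; }
};

// Simplified model of the base fixture. `disk_full` simulates the
// "All healthy data directories are full" error from the log.
class BaseFixture {
 public:
  explicit BaseFixture(bool disk_full) : disk_full_(disk_full) {}

  void SetUp() {
    if (disk_full_) {
      // Like a fatal assertion: record the failure and return early,
      // leaving client_ unset.
      has_fatal_failure_ = true;
      return;
    }
    client_ = std::make_unique<FakeCopyClient>();
  }

  bool HasFatalFailure() const { return has_fatal_failure_; }

 protected:
  std::string StartCopy() const {
    // If the caller skips the failure check, client_ is null here.
    assert(client_ != nullptr);
    return client_->Start();
  }

 private:
  bool disk_full_;
  bool has_fatal_failure_ = false;
  std::unique_ptr<FakeCopyClient> client_;
};

// Model of the derived fixture: check the base setup before using the client.
class AbortFixture : public BaseFixture {
 public:
  using BaseFixture::BaseFixture;

  std::string SetUp() {
    BaseFixture::SetUp();
    if (HasFatalFailure()) {
      return "setup failed; StartCopy skipped";
    }
    return StartCopy();
  }
};

// Helper that runs the derived setup once and reports what happened.
std::string RunAbortSetUp(bool disk_full) {
  AbortFixture fixture(disk_full);
  return fixture.SetUp();
}
```

Calling RunAbortSetUp(false) goes through StartCopy() normally, while RunAbortSetUp(true) returns early instead of touching the uninitialized client. Removing the HasFatalFailure() check would reproduce the crash pattern from the stack trace.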
> The unit test of TabletCopyClientAbortTest maybe core
> -----------------------------------------------------
>
> Key: KUDU-3438
> URL: https://issues.apache.org/jira/browse/KUDU-3438
> Project: Kudu
> Issue Type: Bug
> Reporter: Xixu Wang
> Priority: Major
> Attachments: image-2023-01-30-10-10-40-439.png