[
https://issues.apache.org/jira/browse/KUDU-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471996#comment-15471996
]
Jordan Birdsell commented on KUDU-1527:
---------------------------------------
This seems to be a duplicate of KUDU-1304
> Python unit tests should wait for tservers to up and running
> ------------------------------------------------------------
>
> Key: KUDU-1527
> URL: https://issues.apache.org/jira/browse/KUDU-1527
> Project: Kudu
> Issue Type: Bug
> Components: python
> Affects Versions: 0.9.1
> Reporter: Adar Dembo
>
> Here is the output from a failed run of the Python unit tests:
> {noformat}
> 02:53:15 _____________ ERROR at setup of TestClient.test_capture_kudu_error
> _____________
> 02:53:15
> 02:53:15 cls = <class 'kudu.tests.test_client.TestClient'>
> 02:53:15
> 02:53:15 @classmethod
> 02:53:15 def setUpClass(cls):
> 02:53:15 cls.cluster_path, master_port = cls.start_cluster()
> 02:53:15 time.sleep(1)
> 02:53:15
> 02:53:15 cls.master_host = '127.0.0.1'
> 02:53:15 cls.master_port = master_port
> 02:53:15
> 02:53:15 cls.client = kudu.connect(cls.master_host, cls.master_port)
> 02:53:15
> 02:53:15 cls.schema = cls.example_schema()
> 02:53:15 cls.partitioning = cls.example_partitioning()
> 02:53:15
> 02:53:15 cls.ex_table = 'example-table'
> 02:53:15 if cls.client.table_exists(cls.ex_table):
> 02:53:15 cls.client.delete_table(cls.ex_table)
> 02:53:15 > cls.client.create_table(cls.ex_table, cls.schema,
> cls.partitioning)
> 02:53:15
> 02:53:15 kudu/tests/common.py:140:
> 02:53:15 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> _ _ _ _ _ _
> 02:53:15 kudu/client.pyx:249: in kudu.client.Client.create_table
> (kudu/client.cpp:4819)
> 02:53:15 check_status(s)
> 02:53:15 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> _ _ _ _ _ _
> 02:53:15
> 02:53:15 > raise KuduInvalidArgument(c_message)
> 02:53:15 E KuduInvalidArgument: Error creating table example-table on the
> master: Not enough live tablet servers to create a table with the requested
> replication factor 1. 0 tablet servers are alive.
> 02:53:15
> 02:53:15 kudu/errors.pyx:60: KuduInvalidArgument
> 02:53:15 ---------------------------- Captured stderr setup
> -----------------------------
> 02:53:15 I0712 02:53:13.192479 7420 mem_tracker.cc:138] MemTracker: hard
> memory limit is 94.627029 GB
> 02:53:15 I0712 02:53:13.192636 7420 mem_tracker.cc:140] MemTracker: soft
> memory limit is 56.776218 GB
> 02:53:15 I0712 02:53:13.202077 7420 master_main.cc:58] Initializing master
> server...
> 02:53:15 I0712 02:53:13.202827 7420 hybrid_clock.cc:178] HybridClock
> initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 1.0005
> Current error: 969344
> 02:53:15 I0712 02:53:13.305677 7420 server_base.cc:164] Could not load
> existing FS layout: Not found:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/master/data/instance:
> No such file or directory (error 2)
> 02:53:15 I0712 02:53:13.305732 7420 server_base.cc:165] Creating new FS
> layout
> 02:53:15 I0712 02:53:13.359714 7420 fs_manager.cc:345] Generated new
> instance metadata in path
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/master/data/instance:
> 02:53:15 uuid: "3ace416aa5ec45d692d21875e8c23322"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:13 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:13.414095 7420 fs_manager.cc:242] Opened local
> filesystem:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/master/data
> 02:53:15 uuid: "3ace416aa5ec45d692d21875e8c23322"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:13 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:13.499505 7420 master_main.cc:61] Starting Master
> server...
> 02:53:15 I0712 02:53:13.726842 7420 rpc_server.cc:164] RPC server started.
> Bound to: 0.0.0.0:52533
> 02:53:15 I0712 02:53:13.727243 7420 webserver.cc:122] Starting webserver on
> 0.0.0.0:0
> 02:53:15 I0712 02:53:13.727269 7420 webserver.cc:127] Document root:
> /home/jenkins-slave/workspace/kudu-3/www
> 02:53:15 I0712 02:53:13.727638 7420 webserver.cc:217] Webserver started.
> Bound to: http://0.0.0.0:47227/
> 02:53:15 I0712 02:53:13.728736 7420 server_base.cc:234] Dumped server
> information to
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/master/config.json
> 02:53:15 I0712 02:53:13.769598 7518 tablet_bootstrap.cc:393] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322:
> Bootstrap starting.
> 02:53:15 I0712 02:53:13.771863 7518 tablet_bootstrap.cc:553] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322: Time
> spent opening tablet: real 0.001s user 0.000s sys 0.000s
> 02:53:15 I0712 02:53:13.772128 7518 tablet_bootstrap.cc:493] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322: No
> blocks or log segments found. Creating new log.
> 02:53:15 I0712 02:53:13.772845 7518 log.cc:349] Log is configured to *not*
> fsync() on all Append() calls
> 02:53:15 I0712 02:53:13.774338 7518 tablet_bootstrap.cc:393] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322: No
> bootstrap required, opened a new log
> 02:53:15 I0712 02:53:13.776185 7518 raft_consensus.cc:269] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: Replica starting. Triggering 0 pending transactions. Active
> config: opid_index: -1 peers { permanent_uuid:
> "3ace416aa5ec45d692d21875e8c23322" member_type: VOTER }
> 02:53:15 I0712 02:53:13.776408 7518 raft_consensus.cc:301] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: Consensus starting up: Expiring failure detector timer to make a
> prompt election more likely
> 02:53:15 I0712 02:53:13.776487 7518 raft_consensus.cc:499] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: Becoming Follower/Learner. State: Replica:
> 3ace416aa5ec45d692d21875e8c23322, State: 1, Role: FOLLOWER
> 02:53:15 Watermarks: {Received: term: 0 index: 0 Committed: term: 0 index: 0}
> 02:53:15 I0712 02:53:13.776564 7518 consensus_queue.cc:162] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [NON_LEADER]: Queue going to NON_LEADER mode. State: All replicated op: 0.0,
> Majority replicated op: 0.0, Committed index: 0.0, Last appended: 0.0,
> Current term: 0, Majority size: -1, State: 1, Mode: NON_LEADER
> 02:53:15 I0712 02:53:13.776618 7518 raft_consensus.cc:314] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: Only one voter in the Raft config. Triggering election immediately
> 02:53:15 I0712 02:53:13.776641 7518 raft_consensus.cc:370] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: No leader contacted us within the election timeout. Triggering
> leader election
> 02:53:15 I0712 02:53:13.776671 7518 raft_consensus.cc:2019] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 0
> FOLLOWER]: Advancing to term 1
> 02:53:15 I0712 02:53:13.777194 7518 raft_consensus.cc:1970] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> FOLLOWER]: Snoozing failure detection for election timeout plus an additional
> 1.636s
> 02:53:15 I0712 02:53:13.777375 7518 raft_consensus.cc:386] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> FOLLOWER]: Starting election with config: opid_index: -1 peers {
> permanent_uuid: "3ace416aa5ec45d692d21875e8c23322" member_type: VOTER }
> 02:53:15 I0712 02:53:13.778064 7518 leader_election.cc:248] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [CANDIDATE]: Term 1 election: Election decided. Result: candidate won.
> 02:53:15 I0712 02:53:13.778406 7527 raft_consensus.cc:1970] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> FOLLOWER]: Snoozing failure detection for election timeout plus an additional
> 1.630s
> 02:53:15 I0712 02:53:13.778456 7527 raft_consensus.cc:1859] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> FOLLOWER]: Leader election won for term 1
> 02:53:15 I0712 02:53:13.790758 7527 raft_consensus.cc:464] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> LEADER]: Becoming Leader. State: Replica: 3ace416aa5ec45d692d21875e8c23322,
> State: 1, Role: LEADER
> 02:53:15 Watermarks: {Received: term: 0 index: 0 Committed: term: 0 index: 0}
> 02:53:15 I0712 02:53:13.791013 7527 consensus_queue.cc:145] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [LEADER]:
> Queue going to LEADER mode. State: All replicated op: 0.0, Majority
> replicated op: 0.0, Committed index: 0.0, Last appended: 0.0, Current term:
> 1, Majority size: 1, State: 1, Mode: LEADER, active raft config: opid_index:
> -1 peers { permanent_uuid: "3ace416aa5ec45d692d21875e8c23322" member_type:
> VOTER }
> 02:53:15 I0712 02:53:13.794667 7527 sys_catalog.cc:235] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: SysCatalogTable state changed. Reason: New leader
> 3ace416aa5ec45d692d21875e8c23322. Latest consensus state: current_term: 1
> leader_uuid: "3ace416aa5ec45d692d21875e8c23322" config { opid_index: -1 peers
> { permanent_uuid: "3ace416aa5ec45d692d21875e8c23322" member_type: VOTER } }
> 02:53:15 I0712 02:53:13.794703 7527 sys_catalog.cc:238] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: This master's current role is: LEADER
> 02:53:15 I0712 02:53:13.794970 7518 sys_catalog.cc:235] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: SysCatalogTable state changed. Reason: Started TabletPeer.
> Latest consensus state: current_term: 1 leader_uuid:
> "3ace416aa5ec45d692d21875e8c23322" config { opid_index: -1 peers {
> permanent_uuid: "3ace416aa5ec45d692d21875e8c23322" member_type: VOTER } }
> 02:53:15 I0712 02:53:13.795006 7518 sys_catalog.cc:238] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: This master's current role is: LEADER
> 02:53:15 I0712 02:53:13.795017 7529 sys_catalog.cc:235] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: SysCatalogTable state changed. Reason: RaftConsensus started.
> Latest consensus state: current_term: 1 leader_uuid:
> "3ace416aa5ec45d692d21875e8c23322" config { opid_index: -1 peers {
> permanent_uuid: "3ace416aa5ec45d692d21875e8c23322" member_type: VOTER } }
> 02:53:15 I0712 02:53:13.795081 7529 sys_catalog.cc:238] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: This master's current role is: LEADER
> 02:53:15 I0712 02:53:13.803755 7531 raft_consensus_state.cc:534] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322 [term 1
> LEADER]: Advanced the committed_index across terms. Last committed operation
> was: term: 0 index: 0 New committed index is: term: 1 index: 1
> 02:53:15 I0712 02:53:13.804172 7518 sys_catalog.cc:311] T
> 00000000000000000000000000000000 P 3ace416aa5ec45d692d21875e8c23322
> [sys.catalog]: configured and running, proceeding with master startup.
> 02:53:15 W0712 02:53:13.804466 7534 catalog_manager.cc:345] Catalog manager
> background task thread going to sleep: Service unavailable: Catalog manager
> is not initialized. State: 1
> 02:53:15 I0712 02:53:13.804584 7532 catalog_manager.cc:636] Loading table
> and tablet metadata into memory...
> 02:53:15 I0712 02:53:13.804772 7420 master_main.cc:64] Master server
> successfully started.
> 02:53:15 I0712 02:53:14.245645 7564 mem_tracker.cc:138] MemTracker: hard
> memory limit is 94.627029 GB
> 02:53:15 I0712 02:53:14.245791 7564 mem_tracker.cc:140] MemTracker: soft
> memory limit is 56.776218 GB
> 02:53:15 I0712 02:53:14.257587 7562 mem_tracker.cc:138] MemTracker: hard
> memory limit is 94.627029 GB
> 02:53:15 I0712 02:53:14.257714 7562 mem_tracker.cc:140] MemTracker: soft
> memory limit is 56.776218 GB
> 02:53:15 I0712 02:53:14.266713 7566 mem_tracker.cc:138] MemTracker: hard
> memory limit is 94.627029 GB
> 02:53:15 I0712 02:53:14.266863 7566 mem_tracker.cc:140] MemTracker: soft
> memory limit is 56.776218 GB
> 02:53:15 I0712 02:53:14.302768 7566 tablet_server_main.cc:54] Initializing
> tablet server...
> 02:53:15 I0712 02:53:14.303900 7566 hybrid_clock.cc:178] HybridClock
> initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 1.0005
> Current error: 969844
> 02:53:15 I0712 02:53:14.307047 7566 server_base.cc:164] Could not load
> existing FS layout: Not found:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/2/data/instance:
> No such file or directory (error 2)
> 02:53:15 I0712 02:53:14.307086 7566 server_base.cc:165] Creating new FS
> layout
> 02:53:15 I0712 02:53:14.310693 7562 tablet_server_main.cc:54] Initializing
> tablet server...
> 02:53:15 I0712 02:53:14.311682 7562 hybrid_clock.cc:178] HybridClock
> initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 1.0005
> Current error: 969844
> 02:53:15 I0712 02:53:14.314699 7562 server_base.cc:164] Could not load
> existing FS layout: Not found:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/0/data/instance:
> No such file or directory (error 2)
> 02:53:15 I0712 02:53:14.314730 7562 server_base.cc:165] Creating new FS
> layout
> 02:53:15 I0712 02:53:14.318703 7564 tablet_server_main.cc:54] Initializing
> tablet server...
> 02:53:15 I0712 02:53:14.319725 7564 hybrid_clock.cc:178] HybridClock
> initialized. Resolution in nanos?: 1 Wait times tolerance adjustment: 1.0005
> Current error: 969844
> 02:53:15 I0712 02:53:14.322813 7564 server_base.cc:164] Could not load
> existing FS layout: Not found:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/1/data/instance:
> No such file or directory (error 2)
> 02:53:15 I0712 02:53:14.322846 7564 server_base.cc:165] Creating new FS
> layout
> 02:53:15 I0712 02:53:14.357198 7564 fs_manager.cc:345] Generated new
> instance metadata in path
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/1/data/instance:
> 02:53:15 uuid: "76029326f81443679b55fcc12dcb8431"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.357671 7566 fs_manager.cc:345] Generated new
> instance metadata in path
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/2/data/instance:
> 02:53:15 uuid: "82fdf4247a094384bfcbd0793f59941d"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.378212 7562 fs_manager.cc:345] Generated new
> instance metadata in path
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/0/data/instance:
> 02:53:15 uuid: "aae8d534a60648bfb24892221b73f659"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.499289 7566 fs_manager.cc:242] Opened local
> filesystem:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/2/data
> 02:53:15 uuid: "82fdf4247a094384bfcbd0793f59941d"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.531463 7562 fs_manager.cc:242] Opened local
> filesystem:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/0/data
> 02:53:15 uuid: "aae8d534a60648bfb24892221b73f659"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.542300 7564 fs_manager.cc:242] Opened local
> filesystem:
> /home/jenkins-slave/workspace/kudu-3/build/test-tmp/tmp5JN8R3/ts/1/data
> 02:53:15 uuid: "76029326f81443679b55fcc12dcb8431"
> 02:53:15 format_stamp: "Formatted at 2016-07-12 02:53:14 on
> jenkins-slave-c3oe"
> 02:53:15 I0712 02:53:14.576563 7566 tablet_server_main.cc:57] Starting
> tablet server...
> 02:53:15 I0712 02:53:14.606667 7564 tablet_server_main.cc:57] Starting
> tablet server...
> 02:53:15 I0712 02:53:14.624748 7562 tablet_server_main.cc:57] Starting
> tablet server...
> 02:53:15 I0712 02:53:15.243650 7492 catalog_manager.cc:762] CreateTable from
> {real_user=jenkins-slave, eff_user=} at 127.0.0.1:47450:
> 02:53:15 name: "example-table"
> 02:53:15 schema {
> 02:53:15 columns {
> 02:53:15 name: "key"
> 02:53:15 type: INT32
> 02:53:15 is_key: true
> 02:53:15 is_nullable: false
> 02:53:15 encoding: AUTO_ENCODING
> 02:53:15 compression: DEFAULT_COMPRESSION
> 02:53:15 cfile_block_size: 0
> 02:53:15 }
> 02:53:15 columns {
> 02:53:15 name: "int_val"
> 02:53:15 type: INT32
> 02:53:15 is_key: false
> 02:53:15 is_nullable: true
> 02:53:15 encoding: AUTO_ENCODING
> 02:53:15 compression: DEFAULT_COMPRESSION
> 02:53:15 cfile_block_size: 0
> 02:53:15 }
> 02:53:15 columns {
> 02:53:15 name: "string_val"
> 02:53:15 type: STRING
> 02:53:15 is_key: false
> 02:53:15 is_nullable: true
> 02:53:15 encoding: AUTO_ENCODING
> 02:53:15 compression: DEFAULT_COMPRESSION
> 02:53:15 cfile_block_size: 0
> 02:53:15 }
> 02:53:15 }
> 02:53:15 split_rows_range_bounds {
> 02:53:15 }
> 02:53:15 partition_schema {
> 02:53:15 range_schema {
> 02:53:15 columns {
> 02:53:15 name: "key"
> 02:53:15 }
> 02:53:15 }
> 02:53:15 }
> {noformat}
> Here the CreateTable() call has arrived to the master before the tservers
> have finished initializing, and thus before they have heartbeated to the
> master.
> We should modify the start_cluster() call to wait until the tservers are
> initialized. Possibly this means adding --server_dump_info_path=... to the
> command line arguments and waiting for the file to show up. Or maybe that's
> insufficient and we should also query the master for the list of tservers.
> Either way, we need to wait a little longer in case the tservers are slow to
> start.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)