[ https://issues.apache.org/jira/browse/IMPALA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang reassigned IMPALA-9798:
--------------------------------------

    Assignee: Tim Armstrong

> TestScratchDir.test_multiple_dirs fails to start impalad
> --------------------------------------------------------
>
> Key: IMPALA-9798
> URL: https://issues.apache.org/jira/browse/IMPALA-9798
> Project: IMPALA
> Issue Type: Bug
> Reporter: Quanlong Huang
> Assignee: Tim Armstrong
> Priority: Critical
> Labels: broken-build
> Fix For: Impala 4.0
>
> Attachments:
> impalad.impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com.jenkins.log.INFO.20200528-151451.15245
>
>
> Saw in an exhaustive job:
> Stacktrace:
> {code}
> custom_cluster/test_scratch_disk.py:97: in test_multiple_dirs
> '--impalad_args=--disk_spill_punch_holes=true'])
> common/custom_cluster_test_suite.py:277: in _start_impala_cluster
> check_call(cmd + options, close_fds=True)
> /data/jenkins/workspace/impala-asf-master-exhaustive/Impala-Toolchain/python-2.7.16/lib/python2.7/subprocess.py:190:
> in check_call
> raise CalledProcessError(retcode, cmd)
> E CalledProcessError: Command
> '['/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py',
> '--state_store_args=--statestore_update_frequency_ms=50
> --statestore_priority_update_frequency_ms=50
> --statestore_heartbeat_frequency_ms=50', '--cluster_size=3',
> '--num_coordinators=3',
> '--log_dir=/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests',
> '--log_level=1', '--impalad_args=-logbuflevel=-1
> -scratch_dirs=/tmp/tmpR006lp,/tmp/tmpzKVBYt,/tmp/tmpBLcN_O,/tmp/tmp6kqoj5,/tmp/tmpT_R39r',
> '--impalad_args=--allow_multiple_scratch_dirs_per_device=false',
> '--impalad_args=--disk_spill_compression_codec=zstd',
> '--impalad_args=--disk_spill_punch_holes=true',
> '--impalad_args=--default_query_options=']' returned non-zero exit status 1
> {code}
> Standard Output:
> {code}
> Generated dir/tmp/tmpR006lp
> Generated dir/tmp/tmpzKVBYt
> Generated dir/tmp/tmpBLcN_O
> Generated dir/tmp/tmp6kqoj5
> Generated dir/tmp/tmpT_R39r
> {code}
> Standard Error:
> {code}
> 15:14:51 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 15:14:51 MainThread: Starting State Store logging to
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 15:14:51 MainThread: Starting Catalog Service logging to
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 15:14:51 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 15:14:51 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 15:14:51 MainThread: Starting Impala Daemon logging to
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 15:14:54 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 15:14:54 MainThread: Getting num_known_live_backends from
> impala-ec2-centos74-m5-4xlarge-ondemand-0b8f.vpc.cloudera.com:25000
> 15:14:54 MainThread: 'backends'
> 15:14:54 MainThread: Waiting for num_known_live_backends=3. Current value:
> None
> 15:14:55 MainThread: Found 2 impalad/1 statestored/1 catalogd process(es)
> 15:14:55 MainThread: Error starting cluster
> Traceback (most recent call last):
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/start-impala-cluster.py",
> line 770, in <module>
> expected_cluster_size - expected_catalog_delays)
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py",
> line 186, in wait_until_ready
> early_abort_fn=check_processes_still_running)
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_service.py",
> line 284, in wait_for_num_known_live_backends
> early_abort_fn()
> File
> "/data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/tests/common/impala_cluster.py",
> line 178, in check_processes_still_running
> assert len(self.impalads) >= expected_num_impalads
> AssertionError
> DEBUG:impala_cluster:Found 2 impalad/1 statestored/1 catalogd process(es)
> {code}
> Looking into the crashed impalad's log:
> {code}
> I0528 15:14:54.587469 15245 tmp-file-mgr.cc:229] Using scratch directory
> /tmp/tmpR006lp/impala-scratch on disk 0 limit: 8589934592.00 GB
> I0528 15:14:54.648952 15245 status.cc:129] Failed to get post-punch file
> size: Not found:
> /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file
> or directory (error 2)
> @ 0x1d5b072 impala::Status::Status()
> @ 0x264fa09 impala::FileSystemUtil::CheckHolePunch()
> @ 0x22e6947 impala::TmpFileMgr::InitCustom()
> @ 0x22e59a4 impala::TmpFileMgr::InitCustom()
> @ 0x22e58f0 impala::TmpFileMgr::Init()
> @ 0x248835c impala::ImpalaServer::ImpalaServer()
> @ 0x2483a34 ImpaladMain()
> @ 0x1d048af main
> @ 0x7f23f3a64c04 __libc_start_main
> @ 0x1d04726 (unknown)
> E0528 15:14:54.649108 15245 impala-server.cc:394] Failed to get post-punch
> file size: Not found:
> /tmp/tmpR006lp/impala-scratch/88432f73256ff458:c620697eade771bb: No such file
> or directory (error 2)
> E0528 15:14:54.649127 15245 impala-server.cc:397] Aborting Impala Server
> startup due to improperly configured scratch directories.. Impalad exiting.
> {code}
> It looks like the scratch dir was not created successfully.
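
The suspicion that a scratch directory is missing or unusable could be checked with a small probe before the cluster is started. The following is only a sketch under stated assumptions: `check_scratch_dirs` and its `subdir` parameter are hypothetical names, not part of Impala or its test suite, and the `impala-scratch` subdirectory name is taken from the impalad log in the report. It only verifies that a probe file can be created, written, stat'ed and removed, which is a weaker condition than the hole-punch check that failed in the log.

```python
import os
import tempfile


def check_scratch_dirs(scratch_dirs, subdir="impala-scratch"):
    """Return a dict mapping each directory to a list of problems found.

    An empty list means the directory exists and a probe file could be
    created, written, stat'ed and removed in it (and in `subdir`, if that
    subdirectory already exists). Hypothetical helper, not Impala code.
    """
    report = {}
    for base in scratch_dirs:
        problems = []
        if not os.path.isdir(base):
            problems.append("missing or not a directory")
        else:
            # Probe the base directory and, when present, the subdirectory.
            candidates = [base]
            nested = os.path.join(base, subdir)
            if os.path.isdir(nested):
                candidates.append(nested)
            for path in candidates:
                try:
                    fd, probe = tempfile.mkstemp(dir=path)
                    try:
                        os.write(fd, b"\0" * 4096)
                        os.fstat(fd)  # fails if the file vanished meanwhile
                    finally:
                        os.close(fd)
                        os.remove(probe)
                except OSError as e:
                    problems.append(
                        "cannot use probe file in %s: %s" % (path, e))
        report[base] = problems
    return report
```

Calling `check_scratch_dirs(["/tmp/tmpR006lp", "/tmp/tmpzKVBYt"])` with the directories printed under Standard Output would return an empty problem list for each directory that is usable. A failure that appears only intermittently, for example because another process removes files from the same directory, would not necessarily be caught by a single probe.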