How about attaching 'strace' to the catalogd startup to see where it
crashes (if it's reproducible on demand)? Maybe others have better ideas.
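Something along these lines (a rough sketch only; /bin/true stands in for the
real catalogd invocation here so the command runs anywhere, and the actual
catalogd command line would have to be copied out of start-impala-cluster.py,
so every path below is an assumption):

```shell
# -f follows forked children, -tt timestamps each syscall, and -o writes
# the (very verbose) trace to a file instead of the terminal.
# /bin/true is a stand-in target; substitute the real catalogd command line.
strace -f -tt -o /tmp/catalogd.strace /bin/true

# The last few syscalls before the process exits usually point at the
# failure (e.g. a failed bind() would suggest a closed/occupied port).
tail -n 3 /tmp/catalogd.strace
```

If catalogd dies too fast to launch under strace directly, `strace -p <pid>`
can attach to it the moment start-impala-cluster.py spawns it.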

On Sat, Jul 29, 2017 at 3:14 PM, Jim Apple <[email protected]> wrote:

> To be specific about "no error message": the logs written in the logs
> directory near the time of the crash are nearly identical to those of a
> process that got much further on a machine with a configuration that I do
> not know how to reproduce. The one that ended earlier has output like:
>
> Creating /test-warehouse HDFS directory (logging to
> /home/ubuntu/Impala/logs/data_loading/create-test-warehouse-dir.log)...
>     OK (Took: 0 min 2 sec)
> Derived params for create-load-data.sh:
> EXPLORATION_STRATEGY=exhaustive
> SKIP_METADATA_LOAD=0
> SKIP_SNAPSHOT_LOAD=0
> SNAPSHOT_FILE=
> CM_HOST=
> REMOTE_LOAD=
> Starting Impala cluster (logging to
> /home/ubuntu/Impala/logs/data_loading/start-impala-cluster.log)...
>     FAILED (Took: 0 min 11 sec)
>     '/home/ubuntu/Impala/bin/start-impala-cluster.py
> --log_dir=/home/ubuntu/Impala/logs/data_loading -s 3' failed. Tail of log:
> Log for command '/home/ubuntu/Impala/bin/start-impala-cluster.py
> --log_dir=/home/ubuntu/Impala/logs/data_loading -s 3'
> Starting State Store logging to
> /home/ubuntu/Impala/logs/data_loading/statestored.INFO
> Starting Catalog Service logging to
> /home/ubuntu/Impala/logs/data_loading/catalogd.INFO
> Error starting cluster: Unable to start catalogd. Check log or file
> permissions for more details.
> Error in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at line 48:
> LOAD_DATA_ARGS=""
> + cleanup
> + rm -rf /tmp/tmp.HVkbPNl08R
>
>
> The one that got further in the process (and I think may be dying due to a
> spurious out-of-disk failure that I am putting on the back-burner for the
> moment) has the following output:
>
> Creating /test-warehouse HDFS directory (logging to
> /home/ubuntu/Impala/logs/data_loading/create-test-warehouse-dir.log)...
>     OK (Took: 0 min 2 sec)
> Derived params for create-load-data.sh:
> EXPLORATION_STRATEGY=exhaustive
> SKIP_METADATA_LOAD=0
> SKIP_SNAPSHOT_LOAD=0
> SNAPSHOT_FILE=
> CM_HOST=
> REMOTE_LOAD=
> Starting Impala cluster (logging to
> /home/ubuntu/Impala/logs/data_loading/start-impala-cluster.log)...
>     OK (Took: 0 min 11 sec)
> Setting up HDFS environment (logging to
> /home/ubuntu/Impala/logs/data_loading/setup-hdfs-env.log)...
>     OK (Took: 0 min 8 sec)
> Loading custom schemas (logging to
> /home/ubuntu/Impala/logs/data_loading/load-custom-schemas.log)...
>     OK (Took: 0 min 35 sec)
> Loading functional-query data (logging to
> /home/ubuntu/Impala/logs/data_loading/load-functional-query.log)...
>     OK (Took: 37 min 14 sec)
> Loading TPC-H data (logging to
> /home/ubuntu/Impala/logs/data_loading/load-tpch.log)...
>     OK (Took: 14 min 11 sec)
> Loading nested data (logging to
> /home/ubuntu/Impala/logs/data_loading/load-nested.log)...
>     OK (Took: 3 min 41 sec)
> Loading TPC-DS data (logging to
> /home/ubuntu/Impala/logs/data_loading/load-tpcds.log)...
>     FAILED (Took: 5 min 50 sec)
>     'load-data tpcds core' failed. Tail of log:
> ss_net_paid_inc_tax,
> ss_net_profit,
> ss_sold_date_sk
> from store_sales_unpartitioned
> WHERE ss_sold_date_sk < 2451272
> distribute by ss_sold_date_sk
> INFO  : Query ID =
> ubuntu_20170729150909_583df9cf-e54b-44bf-a104-ef5e690cfa0d
> INFO  : Total jobs = 1
> INFO  : Launching Job 1 out of 1
> INFO  : Starting task [Stage-1:MAPRED] in serial mode
> INFO  : Number of reduce tasks not specified. Estimated from input data
> size: 2
> INFO  : In order to change the average load for a reducer (in bytes):
> INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
> INFO  : In order to limit the maximum number of reducers:
> INFO  :   set hive.exec.reducers.max=<number>
> INFO  : In order to set a constant number of reducers:
> INFO  :   set mapreduce.job.reduces=<number>
> INFO  : number of splits:2
> INFO  : Submitting tokens for job: job_local1041198115_0826
> INFO  : The url to track the job: http://localhost:8080/
> INFO  : Job running in-process (local Hadoop)
> INFO  : 2017-07-29 15:09:25,495 Stage-1 map = 0%,  reduce = 0%
> INFO  : 2017-07-29 15:09:32,498 Stage-1 map = 100%,  reduce = 0%
> ERROR : Ended Job = job_local1041198115_0826 with errors
> ERROR : FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> INFO  : MapReduce Jobs Launched:
> INFO  : Stage-Stage-1:  HDFS Read: 17615502357 HDFS Write: 12907849658 FAIL
> INFO  : Total MapReduce CPU Time Spent: 0 msec
> INFO  : Completed executing
> command(queryId=ubuntu_20170729150909_583df9cf-e54b-
> 44bf-a104-ef5e690cfa0d);
> Time taken: 18.314 seconds
> Error: Error while processing statement: FAILED: Execution Error, return
> code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> (state=08S01,code=2)
> java.sql.SQLException: Error while processing statement: FAILED: Execution
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>         at
> org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:292)
>         at
> org.apache.hive.beeline.Commands.executeInternal(Commands.java:989)
>         at org.apache.hive.beeline.Commands.execute(Commands.java:1203)
>         at org.apache.hive.beeline.Commands.sql(Commands.java:1117)
>         at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1176)
>         at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1010)
>         at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:987)
>         at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:914)
>         at
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:518)
>         at org.apache.hive.beeline.BeeLine.main(BeeLine.java:501)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
> 57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> Error executing file from Hive: load-tpcds-core-hive-generated.sql
> Error in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at line 48:
> LOAD_DATA_ARGS=""
> + cleanup
> + rm -rf /tmp/tmp.Yfeh8QGfi1
>
>
>
>
> On Sat, Jul 29, 2017 at 12:47 AM, Jim Apple <[email protected]> wrote:
>
> > I'm seeing https://issues.apache.org/jira/browse/IMPALA-5700 when trying
> > to bootstrap a new development environment on an EC2 machine with Ubuntu
> > 14.04, 250GB of free disk space and over 60GB of free memory. I've seen
> > this with and without the -so flag.
> >
> > I'm running the below script, which I thought was the canonical way to
> > bootstrap a development environment. When catalog doesn't start, I don't
> > see anything amiss in any of the logs. I was thinking that maybe a port
> is
> > closed that should be open? I only have port 22 open in my ec2 config.
> >
> > Has anyone else fixed a problem like this before?
> >
> > #!/bin/bash -eux
> >
> > IMPALA_REPO_URL=https://git-wip-us.apache.org/repos/asf/incubator-impala.git
> > IMPALA_REPO_BRANCH=master
> >
> > sudo apt-get install --yes git
> >
> > sudo apt-get install --yes openjdk-7-jdk
> >
> > # JAVA_HOME needed by chef scripts
> > export JAVA_HOME="/usr/lib/jvm/$(ls -tr /usr/lib/jvm/ | tail -1)"
> > $JAVA_HOME/bin/javac -version
> >
> > # TODO: check that df . is large enough.
> > df -h .
> >
> > IMPALA_LOCATION=Impala
> >
> > cd "/home/$(whoami)"
> >
> > git clone "${IMPALA_REPO_URL}" "${IMPALA_LOCATION}"
> > cd "${IMPALA_LOCATION}"
> > git checkout "${IMPALA_REPO_BRANCH}"
> > GIT_LOG_FILE=$(mktemp)
> > git log --pretty=oneline >"${GIT_LOG_FILE}"
> > head "${GIT_LOG_FILE}"
> >
> > ./bin/bootstrap_development.sh
> >
>
