The most likely cause is the following commit, which turned off Yarn by
default. The commit message tells you how to turn it back on
("testdata/cluster/admin -y start_cluster"). I'm open to flipping the default
if we think this is an important use case.
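
As for why "8032" is not in the minicluster configs: it is the port in
Hadoop's built-in default for yarn.resourcemanager.address, so the client
falls back to 0.0.0.0:8032 when nothing overrides it. If it helps, here is a
rough sketch of a pre-flight check that could run before submitting the MR
job. The function name and the choice of localhost are my own assumptions for
illustration; nothing like this exists in data_generator.py today.

```python
import socket


def resource_manager_reachable(host="localhost", port=8032, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: 8032 is the stock ResourceManager port; the host
    is an assumption and may differ in a customized minicluster.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, timed out, or host not resolvable.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    if not resource_manager_reachable():
        print("Nothing is listening on localhost:8032; Yarn is probably not "
              "running. Try: testdata/cluster/admin -y start_cluster")
```

This only tells you whether a listener is present on the port, not whether it
is a healthy ResourceManager.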

-- Philip

commit 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
Author: Philip Zeyliger <[email protected]>
Date:   Tue Apr 10 09:21:20 2018 -0700

    Remove Yarn from minicluster by default. (2nd try)

On Wed, Apr 18, 2018 at 6:25 PM, Tianyi Wang <[email protected]> wrote:

> I was trying to run tests/comparison/data_generator.py, which used to work
> before the switch to Hadoop 3. Now MR complains that it is wrongly
> configured to connect to 0.0.0.0:8032, but I cannot find the text "8032"
> anywhere in our minicluster configs. Does anybody recognize this error?
>
>
> Traceback (most recent call last):
>   File "./data_generator.py", line 339, in <module>
>     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
>   File "./data_generator.py", line 134, in populate_db
>     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
>   File "./data_generator.py", line 244, in _run_data_generator_mr_job
>     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
>   File "/home/twang/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
>     stderr=subprocess.STDOUT, env=env)
>   File "/home/twang/projects/impala/tests/util/shell_util.py", line 113, in shell
>     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> Exception: Command returned non-zero exit code: 5
> cmd: set -euo pipefail
> hadoop jar /home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar -D mapred.reduce.tasks=34 \
>         -D stream.num.map.output.key.fields=2 \
>         -files ./common.py,./db_types.py,./data_generator_mapred_common.py,./data_generator_mapper.py,./data_generator_reducer.py,./random_val_generator.py \
>         -input /tmp/data_gen_randomness_mr_input_1524095906 \
>         -output /tmp/data_gen_randomness_mr_output_1524095906 \
>         -mapper data_generator_mapper.py \
>         -reducer data_generator_reducer.py
> stdout: packageJobJar: [] [/home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob6950277591392799099.jar tmpDir=null
> 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/18 16:58:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 18/04/18 16:58:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>
> ..........................
>
> 18/04/18 16:58:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 18/04/18 16:58:51 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see:  http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. Trying to failover after sleeping for 16129ms.
>
> --
> Tianyi Wang
>
