My previous message was misleading: after I started YARN (via the
"testdata/cluster/admin -y start_cluster" command Philip suggested below),
the job no longer fails immediately, but it still doesn't work:


2018-04-23 23:15:46,065 INFO:db_connection[752]:Dropping database randomness
2018-04-23 23:15:46,095 INFO:db_connection[234]:Creating database randomness
2018-04-23 23:15:52,390 INFO:data_generator[235]:Starting MR job to generate data for randomness
Traceback (most recent call last):
  File "tests/comparison/data_generator.py", line 339, in <module>
    populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
  File "tests/comparison/data_generator.py", line 134, in populate_db
    self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
  File "tests/comparison/data_generator.py", line 244, in _run_data_generator_mr_job
    % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
  File "/home/impdev/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
    stderr=subprocess.STDOUT, env=env)
  File "/home/impdev/projects/impala/tests/util/shell_util.py", line 113, in shell
    "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
Exception: Command returned non-zero exit code: 1
cmd: set -euo pipefail
hadoop jar /home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar -D mapred.reduce.tasks=36 \
        -D stream.num.map.output.key.fields=2 \
        -files tests/comparison/common.py,tests/comparison/db_types.py,tests/comparison/data_generator_mapred_common.py,tests/comparison/data_generator_mapper.py,tests/comparison/data_generator_reducer.py,tests/comparison/random_val_generator.py \
        -input /tmp/data_gen_randomness_mr_input_1524525348 \
        -output /tmp/data_gen_randomness_mr_output_1524525348 \
        -mapper data_generator_mapper.py \
        -reducer data_generator_reducer.py
stdout: packageJobJar: [] [/home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob2990195923122538287.jar tmpDir=null
18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/04/23 23:15:54 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/impdev/.staging/job_1524519161700_0002
18/04/23 23:15:54 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
18/04/23 23:15:54 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 2b3bd7731ff3ef5d8585a004b90696630e5cea96]
18/04/23 23:15:54 INFO mapred.FileInputFormat: Total input files to process : 1
18/04/23 23:15:54 INFO mapreduce.JobSubmitter: number of splits:2
18/04/23 23:15:54 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
18/04/23 23:15:54 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524519161700_0002
18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Executing with tokens: []
18/04/23 23:15:54 INFO conf.Configuration: resource-types.xml not found
18/04/23 23:15:54 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
18/04/23 23:15:54 INFO impl.YarnClientImpl: Submitted application application_1524519161700_0002
18/04/23 23:15:54 INFO mapreduce.Job: The url to track the job: http://c37e0835e988:8088/proxy/application_1524519161700_0002/
18/04/23 23:15:54 INFO mapreduce.Job: Running job: job_1524519161700_0002
18/04/23 23:16:00 INFO mapreduce.Job: Job job_1524519161700_0002 running in uber mode : false
18/04/23 23:16:00 INFO mapreduce.Job:  map 0% reduce 0%
18/04/23 23:16:06 INFO mapreduce.Job: Job job_1524519161700_0002 failed with state FAILED due to: Application application_1524519161700_0002 failed 2 times due to AM Container for appattempt_1524519161700_0002_000002 exited with  exitCode: 255
Failing this attempt.Diagnostics: [2018-04-23 23:16:06.473]Exception from container-launch.
Container id: container_1524519161700_0002_02_000001
Exit code: 255

[2018-04-23 23:16:06.475]Container exited with a non-zero exit code 255.
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Apr 23, 2018 11:16:03 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


[2018-04-23 23:16:06.476]Container exited with a non-zero exit code 255.
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Apr 23, 2018 11:16:03 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


For more detailed output, check the application tracking page: http://localhost:8088/cluster/app/application_1524519161700_0002 Then click on links to logs of each attempt.
. Failing the application.
18/04/23 23:16:06 INFO mapreduce.Job: Counters: 0
18/04/23 23:16:06 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
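
Since the stderr above only shows log4j warnings, the real cause of exit
code 255 is probably only visible in the full container logs. For anyone
digging into this, a few generic YARN debugging commands that should apply
here (the application id comes from the output above; whether log
aggregation is enabled in the minicluster is an assumption on my part):

# Pull the aggregated AM container logs for the failed application
# (assumes YARN log aggregation is enabled; otherwise the logs are under
# the NodeManager's local log directory):
yarn logs -applicationId application_1524519161700_0002

# Sanity-check that the ResourceManager has live NodeManagers registered:
yarn node -list -all

# 0.0.0.0:8032 is the built-in default used when
# yarn.resourcemanager.hostname / yarn.resourcemanager.address are unset,
# which is why "8032" never appears in our configs:
grep resourcemanager "$HADOOP_CONF_DIR"/yarn-site.xml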
On Thu, Apr 19, 2018 at 1:59 PM Tianyi Wang <[email protected]> wrote:

> Thanks Philip. It works! I'll update the README file.
>
> On Wed, Apr 18, 2018 at 6:49 PM Philip Zeyliger <[email protected]>
> wrote:
>
>> The most likely cause is the following commit, which turned off YARN by
>> default. The commit message tells you how to turn it back on
>> ("testdata/cluster/admin -y start_cluster"). I'm open to changing the
>> default the other way around if we think this is an important use case.
>>
>> -- Philip
>>
>> commit 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
>> Author: Philip Zeyliger <[email protected]>
>> Date:   Tue Apr 10 09:21:20 2018 -0700
>>
>>     Remove Yarn from minicluster by default. (2nd try)
>>
>> On Wed, Apr 18, 2018 at 6:25 PM, Tianyi Wang <[email protected]> wrote:
>>
>> > I was trying to run tests/comparison/data_generator.py, which used to
>> > work before the switch to Hadoop 3. Now MR claims that it is wrongly
>> > configured to connect to 0.0.0.0:8032, but I cannot find the text "8032"
>> > anywhere in our minicluster configs. Does anybody happen to recognize
>> > this error?
>> >
>> >
>> > Traceback (most recent call last):
>> >   File "./data_generator.py", line 339, in <module>
>> >     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
>> >   File "./data_generator.py", line 134, in populate_db
>> >     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
>> >   File "./data_generator.py", line 244, in _run_data_generator_mr_job
>> >     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
>> >   File "/home/twang/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
>> >     stderr=subprocess.STDOUT, env=env)
>> >   File "/home/twang/projects/impala/tests/util/shell_util.py", line 113, in shell
>> >     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
>> > Exception: Command returned non-zero exit code: 5
>> > cmd: set -euo pipefail
>> > hadoop jar /home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar -D mapred.reduce.tasks=34 \
>> >         -D stream.num.map.output.key.fields=2 \
>> >         -files ./common.py,./db_types.py,./data_generator_mapred_common.py,./data_generator_mapper.py,./data_generator_reducer.py,./random_val_generator.py \
>> >         -input /tmp/data_gen_randomness_mr_input_1524095906 \
>> >         -output /tmp/data_gen_randomness_mr_output_1524095906 \
>> >         -mapper data_generator_mapper.py \
>> >         -reducer data_generator_reducer.py
>> > stdout: packageJobJar: [] [/home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob6950277591392799099.jar tmpDir=null
>> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
>> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
>> > 18/04/18 16:58:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>> > 18/04/18 16:58:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>> >
>> > ..........................
>> >
>> > 18/04/18 16:58:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>> > 18/04/18 16:58:51 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. Trying to failover after sleeping for 16129ms.
>> >
>> > --
>> > Tianyi Wang
>> >
>>
> --
> Tianyi Wang
>
-- 
Tianyi Wang
