alapha23 commented on issue #187: [NEMO-324] Distinguish Beam's run and waitUntilFinish methods
URL: https://github.com/apache/incubator-nemo/pull/187#issuecomment-473177054

@wonook I am afraid the timeout is not working: the driver shutdown was never initiated. You can refer to `driver.stderr` and `driver.stdout` in my [driver_log.zip](https://github.com/apache/incubator-nemo/files/2969718/driver_log.zip).

Interestingly, I happened to implement the timeout in a similar way to yours. This is [my branch](https://github.com/alapha23/incubator-nemo/tree/apache-master-gao). I hit an identical error, and @taegeonum also reproduced it on his machine.

### I started running Nexmark with these parameters

```
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

# run this by ./bin/generate_javadocs.sh
TIMEOUT=30
WINDOW=30
INTERVAL=30
EVENTS=0
PARALLELISM=1
PERIOD=50
NORMAL=10
BURSTY=10
CPU_DELAY=0
SAMPLING=0.9
ENABLE_OFFLOADING=false
ENABLE_OFFLOADING_DEBUG=false
POOL_SIZE=0
FLUSH_BYTES=$((10 * 1024 * 1024))
FLUSH_COUNT=10

./bin/run_nexmark.sh \
  -job_id nexmark-Q0 \
  -executor_json `pwd`/examples/resources/executors/beam_test_executor_resources.json \
  -user_main org.apache.beam.sdk.nexmark.Main \
  -optimization_policy org.apache.nemo.compiler.optimizer.policy.StreamingPolicy \
  -scheduler_impl_class_name org.apache.nemo.runtime.master.scheduler.StreamingScheduler \
  -user_args "--runner=org.apache.nemo.client.beam.NemoRunner --streaming=true --query=$1 --manageResources=false --monitorJobs=true --streamTimeout=$TIMEOUT"
```

### My command line shows

```
Powered by
    _   __
   / | / /__  ____ ___  ____
  /  |/ / _ \/ __ `__ \/ __ \
 / /|  /  __/ / / / / / /_/ /
/_/ |_/\___/_/ /_/ /_/\____/

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/share/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/zhiyuangao/Documents/incubator-nemo/examples/nexmark/target/nexmark-0.2-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
INFO  03-15 15:23:52,485 JobLauncher:127 [main] - Launching RPC Server
INFO  03-15 15:23:52,753 DriverRPCServer:93 [main] - DriverRPCServer running at 15621
INFO  03-15 15:23:53,058 JobLauncher:163 [main] - Launching driver
Powered by
     ___________  ______  ______  _______
    /  ______  / /  ___/ /  ___/ /  ____/
   /     _____/ /  /__  /  /__  /  /___
  /  /\  \     /  ___/ /  ___/ /  ____/
 /  /  \  \   /  /__  /  /__  /  /
/__/    \__\ /_____/ /_____/ /__/

Mar 15, 2019 3:23:53 PM org.apache.reef.util.REEFVersion logVersion
INFO: REEF Version: 0.16.0
Mar 15, 2019 3:23:53 PM org.apache.reef.client.DriverLauncher$SubmittedJobHandler onNext
INFO: REEF job submitted: nexmark-Q0.
INFO  03-15 15:23:53,439 JobLauncher:297 [main] - User program started
2019-03-15T06:23:54.247Z Running query:0; streamTimeout:30
2019-03-15T06:23:54.581Z Generating 100000 events in streaming mode
INFO  03-15 15:23:55,012 JobLauncher:242 [ForkJoinPool.commonPool-worker-1] - Waiting for the driver to be ready
Mar 15, 2019 3:23:55 PM org.apache.reef.client.DriverLauncher$RunningJobHandler onNext
INFO: The Job nexmark-Q0 is running.
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
INFO: Allocated: 1, Outstanding requests: Optional:{0}
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
INFO: Allocated: 1, Outstanding requests: Optional:{0}
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
INFO: Allocated: 2, Outstanding requests: Optional:{0}
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.driver.ResourceManager sendRuntimeStatus
INFO: Allocated: 2, Outstanding requests: Optional:{0}
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.common.driver.evaluator.AllocatedEvaluatorImpl makeRootServiceConfiguration
INFO: No service configuration given and no ConfigurationProviders set.
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.common.driver.evaluator.AllocatedEvaluatorImpl makeRootServiceConfiguration
INFO: No service configuration given and no ConfigurationProviders set.
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.process.ReefRunnableProcessObserver onResourceStatus
INFO: Sending resource status: ResourceStatusEventImpl:{id:Node-59-1552631035581, runtime:Node-59-1552631035581, state:RUNNING, diag:Optional.empty, exit:Optional.empty}
Mar 15, 2019 3:23:55 PM org.apache.reef.runtime.local.process.ReefRunnableProcessObserver onResourceStatus
INFO: Sending resource status: ResourceStatusEventImpl:{id:Node-58-1552631035640, runtime:Node-58-1552631035640, state:RUNNING, diag:Optional.empty, exit:Optional.empty}
INFO  03-15 15:23:57,651 JobLauncher:250 [ForkJoinPool.commonPool-worker-1] - Launching DAG...
INFO  03-15 15:23:57,744 JobLauncher:263 [ForkJoinPool.commonPool-worker-1] - Waiting for the DAG to finish execution
INFO  03-15 15:24:24,995 NemoPipelineResult:75 [main] - Job timed out before PT30Sms, while waiting until finish.
INFO  03-15 15:24:24,996 JobLauncher:181 [main] - Wait for the driver to finish
```
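
The symptom in the log above (the timed wait returns, then the launcher blocks on "Wait for the driver to finish") can be illustrated with a minimal, self-contained sketch. This is not Nemo's or Beam's code: `TimedWaitSketch`, `FakeJob`, `waitUntilFinish(long, TimeUnit)` and `requestShutdown()` are hypothetical names chosen only for illustration. It assumes the general pattern that a timed wait merely hands control back to the caller, so unless something explicitly requests a shutdown after the timeout, the job keeps running.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class TimedWaitSketch {

  /** Hypothetical stand-in for a streaming job that never finishes on its own. */
  static final class FakeJob {
    private final CountDownLatch finished = new CountDownLatch(1);
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    /** Returns true if the job finished within the timeout, false if the wait timed out. */
    boolean waitUntilFinish(final long timeout, final TimeUnit unit) throws InterruptedException {
      return finished.await(timeout, unit);
    }

    /** Explicit shutdown: in this sketch it is the only thing that ends the job. */
    void requestShutdown() {
      shutdownRequested.set(true);
      finished.countDown();
    }

    boolean isShutdownRequested() {
      return shutdownRequested.get();
    }
  }

  public static void main(final String[] args) throws InterruptedException {
    final FakeJob job = new FakeJob();

    // The timed wait expires, but nothing has told the job to stop.
    final boolean done = job.waitUntilFinish(100, TimeUnit.MILLISECONDS);
    System.out.println("finished before timeout: " + done);
    System.out.println("shutdown requested: " + job.isShutdownRequested());

    // Only an explicit shutdown request after the timeout lets a later wait return.
    if (!done) {
      job.requestShutdown();
    }
    System.out.println("finished after explicit shutdown: "
        + job.waitUntilFinish(100, TimeUnit.MILLISECONDS));
  }
}
```

In this sketch, removing the `requestShutdown()` call makes the last wait time out as well, which is analogous to (though not necessarily the cause of) the launcher waiting on the driver indefinitely.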
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
