alapha23 commented on a change in pull request #187: [NEMO-324] Distinguish Beam's run and waitUntilFinish methods URL: https://github.com/apache/incubator-nemo/pull/187#discussion_r328224088
##########
File path: examples/beam/src/main/java/org/apache/nemo/examples/beam/WordCountTimeOut1Sec.java
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.nemo.examples.beam;
+
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+import static org.apache.nemo.examples.beam.WordCount.generateWordCountPipeline;
+
+/**
+ * WordCount application, but with a timeout of 1 second.
+ */
+public final class WordCountTimeOut1Sec {
+
+  /**
+   * Private constructor.
+   */
+  private WordCountTimeOut1Sec() {
+  }
+
+  /**
+   * Main function for the MR BEAM program.
+   *
+   * @param args arguments.
+   */
+  public static void main(final String[] args) {
+    final String inputFilePath = args[0];
+    final String outputFilePath = args[1];
+    final PipelineOptions options = NemoPipelineOptionsFactory.create();
+    options.setJobName("WordCountTimeOut1Sec");
+
+    final Pipeline p = generateWordCountPipeline(options, inputFilePath, outputFilePath);
+    p.run().waitUntilFinish(org.joda.time.Duration.standardSeconds(1));

Review comment:
Thank you for the great work!
I ran some tests with this Beam test case, and something seems off with either my understanding of the `waitUntilFinish` API or its behavior.

First, I extended the input file `examples/resources/inputs/test_input_wordcount2` until it was large enough to cause visible latency before job completion (236MB in my case, which raised the elapsed time from 21s to 42s).

I then replaced `p.run().waitUntilFinish(org.joda.time.Duration.standardSeconds(1));` with `p.run();` and with `p.run().waitUntilFinish();`, expecting different job completion times. I expected `p.run().waitUntilFinish(org.joda.time.Duration.standardSeconds(1));` to finish earlier than `p.run().waitUntilFinish();`, since it waits only 1 second and then interrupts the job. I also assumed `p.run()` is a non-blocking call, so it would end the earliest (correct me if I'm wrong).

In practice, `time` shows that all three variants complete at about the same time after starting:

```
p.run();
1. real 0m41.799s  user 1m0.495s  sys 0m2.564s

p.run().waitUntilFinish(org.joda.time.Duration.standardSeconds(1));
1. real 0m42.223s  user 1m1.080s  sys 0m2.432s
2. real 0m43.842s  user 1m3.689s  sys 0m2.418s

p.run().waitUntilFinish()
1. real 0m41.275s  user 0m59.458s  sys 0m2.158s
2. real 0m41.619s  user 1m0.445s  sys 0m2.326s
```

Could you explain how this comes about, or point out where I am misunderstanding `waitUntilFinish`?
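One possible reading of these timings, offered as a sketch and not as a diagnosis of Nemo: as far as I understand Beam's `PipelineResult` contract, a timed `waitUntilFinish` that expires simply returns to the caller and does not cancel the job (cancellation is a separate `cancel()` call). If the job's work also runs on non-daemon threads in the same JVM, the process stays alive until that work ends even after `main` returns, so `time` would report the full job duration in every variant. The toy model below illustrates only that general Java behavior. `WaitSemanticsSketch`, `ToyJob`, and all of its methods are hypothetical stand-ins written for this comment; none of it is Beam or Nemo code, and whether Nemo behaves this way is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Toy model of a job handle. Hypothetical; not Beam's or Nemo's implementation. */
public final class WaitSemanticsSketch {

  private WaitSemanticsSketch() {
  }

  /** A job whose work runs on a non-daemon thread, started on construction. */
  static final class ToyJob {
    private final CountDownLatch done = new CountDownLatch(1);
    private final Thread worker;

    ToyJob(final long workMillis) {
      worker = new Thread(() -> {
        try {
          Thread.sleep(workMillis); // stands in for the actual job
        } catch (final InterruptedException e) {
          Thread.currentThread().interrupt(); // cancelled: stop early
        } finally {
          done.countDown(); // signal completion to any waiter
        }
      });
      worker.start();
    }

    /** Bounded wait. Returns true if the job finished in time; never stops the job. */
    boolean waitUntilFinish(final long timeoutMillis) {
      try {
        return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (final InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }

    /** Explicit cancellation is a separate operation from waiting. */
    void cancel() {
      worker.interrupt();
    }

    boolean isRunning() {
      return worker.isAlive();
    }
  }

  public static void main(final String[] args) {
    final ToyJob job = new ToyJob(2000);
    final boolean finished = job.waitUntilFinish(200);
    System.out.println("finished within timeout: " + finished);
    System.out.println("still running: " + job.isRunning());
    // main returns here, but the JVM stays up until the non-daemon worker ends,
    // so `time` would still report roughly the full job duration.
  }
}
```

Running this under `time` should take about as long as the toy job itself, not the 200 ms timeout, because only an explicit `cancel()` ends the worker early. If the same holds for the Nemo runner, the example would need to cancel the job (or the client would need to exit) after the timed wait for the three variants to differ in wall-clock time.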
