This closes #3705: [BEAM-165] Initial implementation of the MapReduce runner
mr-runner: remove WordCountTest, fix checkstyle and findbugs issues, and address comments.
mr-runner-hack: disable unrelated modules to shorten build time during development.
mr-runner: support SourceMetrics, this fixes MetricsTest.testBoundedSourceMetrics().
mr-runner: introduce duplicateFactor in FlattenOperation, this fixes testFlattenInputMultipleCopies().
mr-runner: translate empty flatten into EmptySource, this fixes a few empty FlattenTests.
mr-runner: ensure each Operation starts/finishes only once for a diamond-shaped DAG, this fixes ParDoLifecycleTest.
mr-runner: Graph.getSteps() returns steps in topological order, this fixes a few CombineTests.
mr-runner: fail early in the runner when a MapReduce job fails.
mr-runner: use InMemoryStateInternals in ParDoOperation, this fixes ParDoTests that use state.
mr-runner: use the correct step name in ParDoTranslator, this fixes MetricsTest.testAttemptedCounterMetrics().
mr-runner: remove the hard-coded GlobalWindow coder, this fixes WindowingTest.
mr-runner: handle the no-files case in FileSideInputReader for empty views.
mr-runner: fix NPE in PipelineTest.testIdentityTransform().
mr-runner: filter out unsupported features in ValidatesRunner tests.
mr-runner: setMetricsSupported to run ValidatesRunner tests with TestPipeline.
mr-runner: fix the bug where steps are attached multiple times in a diamond-shaped DAG.
[BEAM-2783] support metrics in MapReduceRunner.
mr-runner: set up file paths for read and write sides of materialization.
mr-runner: support side inputs by reading in all views' contents.
mr-runner: support multiple SourceOperations by composing and partitioning.
mr-runner: support PCollections materialization with multiple MR jobs.
mr-runner: hack to work around ViewAsXXX.expand() returning the wrong output PValue.
mr-runner: support graph visualization with dotfiles.
mr-runner: refactor and create Graph data structures to handle general Beam pipelines.
mr-runner: add JarClassInstanceFactory to run ValidatesRunner tests.
mr-runner: support reduce side ParDos and WordCount.
core-java: InMemoryTimerInternals expose getTimers() for timer firings in mr-runner.
mr-runner: add BeamReducer and support GroupByKey.
mr-runner: add ParDoOperation and support ParDos chaining.
mr-runner: add JobPrototype and translate it to a MR job.
mr-runner: support BoundedSource with BeamInputFormat.
MapReduceRunner: add unit tests for GraphConverter and GraphPlanner.
MapReduceRunner: add Graph and its visitors.
Initial commit for MapReduceRunner.

Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5fa0b14d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5fa0b14d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5fa0b14d

Branch: refs/heads/mr-runner
Commit: 5fa0b14d20fab007d9e2d954eb4a34155a6f199f
Parents: 2fa4fde 32aeb7a
Author: Kenneth Knowles <k...@google.com>
Authored: Thu Sep 7 11:38:22 2017 -0700
Committer: Kenneth Knowles <k...@google.com>
Committed: Thu Sep 7 11:38:22 2017 -0700

----------------------------------------------------------------------
 pom.xml | 2 +-
 .../runners/core/InMemoryTimerInternals.java | 5 +
 runners/map-reduce/pom.xml | 197 ++++++++++++++
 .../mapreduce/MapReducePipelineOptions.java | 89 +++++++
 .../mapreduce/MapReducePipelineResult.java | 67 +++++
 .../runners/mapreduce/MapReduceRegistrar.java | 55 ++++
 .../beam/runners/mapreduce/MapReduceRunner.java | 113 ++++++++
 .../beam/runners/mapreduce/package-info.java | 21 ++
 .../mapreduce/translation/BeamInputFormat.java | 257 ++++++++++++++++++
 .../mapreduce/translation/BeamMapper.java | 71 +++++
 .../mapreduce/translation/BeamReducer.java | 104 ++++++++
 .../translation/ConfigurationUtils.java | 62 +++++
 .../mapreduce/translation/DotfileWriter.java | 93 +++++++
 .../translation/FileReadOperation.java | 181 +++++++++++++
 .../translation/FileSideInputReader.java | 122 +++++++++
 .../translation/FileWriteOperation.java | 73 +++++
 .../mapreduce/translation/FlattenOperation.java | 42 +++
 .../translation/FlattenTranslator.java | 154 +++++++++++
 .../runners/mapreduce/translation/Graph.java | 218 +++++++++++++++
 .../mapreduce/translation/GraphConverter.java | 178 +++++++++++++
 .../mapreduce/translation/GraphPlanner.java | 144 ++++++++++
 .../runners/mapreduce/translation/Graphs.java | 267 +++++++++++++++++++
 .../GroupAlsoByWindowsParDoOperation.java | 56 ++++
 .../GroupAlsoByWindowsViaOutputBufferDoFn.java | 142 ++++++++++
 .../translation/GroupByKeyOperation.java | 54 ++++
 .../translation/GroupByKeyTranslator.java | 45 ++++
 .../mapreduce/translation/JobPrototype.java | 211 +++++++++++++++
 .../translation/MapReduceMetricResults.java | 106 ++++++++
 .../mapreduce/translation/MetricsReporter.java | 97 +++++++
 .../translation/NormalParDoOperation.java | 51 ++++
 .../mapreduce/translation/Operation.java | 113 ++++++++
 .../mapreduce/translation/OutputReceiver.java | 60 +++++
 .../mapreduce/translation/ParDoOperation.java | 209 +++++++++++++++
 .../mapreduce/translation/ParDoTranslator.java | 46 ++++
 .../translation/PartitionOperation.java | 64 +++++
 .../translation/ReadBoundedTranslator.java | 38 +++
 .../mapreduce/translation/ReadOperation.java | 58 ++++
 .../ReifyTimestampAndWindowsParDoOperation.java | 64 +++++
 .../translation/SerializableConfiguration.java | 52 ++++
 .../translation/SerializedPipelineOptions.java | 76 ++++++
 .../translation/ShuffleWriteOperation.java | 61 +++++
 .../translation/SourceReadOperation.java | 44 +++
 .../translation/TransformTranslator.java | 48 ++++
 .../translation/TranslationContext.java | 199 ++++++++++++++
 .../translation/TranslatorRegistry.java | 49 ++++
 .../mapreduce/translation/ViewTranslator.java | 51 ++++
 .../translation/WindowAssignOperation.java | 88 ++++++
 .../translation/WindowAssignTranslator.java | 37 +++
 .../mapreduce/translation/package-info.java | 22 ++
 .../translation/GraphConverterTest.java | 60 +++++
 .../mapreduce/translation/GraphPlannerTest.java | 64 +++++
 runners/pom.xml | 1 +
 sdks/java/pom.xml | 3 +-
 sdks/pom.xml | 2 +-
 54 files changed, 4783 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/beam/blob/5fa0b14d/pom.xml
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/beam/blob/5fa0b14d/runners/pom.xml
----------------------------------------------------------------------