See 
<https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/1580/display/redirect?page=changes>

Changes:

[echauchot] [BEAM-8470] Add an empty spark-structured-streaming runner project

[echauchot] [BEAM-8470] Fix missing dep

[echauchot] [BEAM-8470] Add SparkPipelineOptions

[echauchot] [BEAM-8470] Start pipeline translation

[echauchot] [BEAM-8470] Add global pipeline translation structure

[echauchot] [BEAM-8470] Add nodes translators structure

[echauchot] [BEAM-8470] Wire node translators with pipeline translator

[echauchot] [BEAM-8470] Renames: better differentiate pipeline translator for

[echauchot] [BEAM-8470] Organise methods in PipelineTranslator

[echauchot] [BEAM-8470] Initialise BatchTranslationContext

[echauchot] [BEAM-8470] Refactoring: move batch/streaming common translation

[echauchot] [BEAM-8470] Make transform translation clearer: renaming, comments

[echauchot] [BEAM-8470] Improve javadocs

[echauchot] [BEAM-8470] Move SparkTransformOverrides to correct package

[echauchot] [BEAM-8470] Move common translation context components to superclass

[echauchot] [BEAM-8470] apply spotless

[echauchot] [BEAM-8470] Make codestyle and findbugs happy

[echauchot] [BEAM-8470] Add TODOs

[echauchot] [BEAM-8470] Postpone batch qualifier in all class names for

[echauchot] [BEAM-8470] Add precise TODO for multiple TransformTranslator per

[echauchot] [BEAM-8470] Added SparkRunnerRegistrar

[echauchot] [BEAM-8470] Add basic pipeline execution. Refactor 
translatePipeline()

[echauchot] [BEAM-8470] Create PCollections manipulation methods

[echauchot] [BEAM-8470] Create Datasets manipulation methods

[echauchot] [BEAM-8470] Add Flatten transformation translator

[echauchot] [BEAM-8470] Add primitive GroupByKeyTranslatorBatch implementation

[echauchot] [BEAM-8470] Use Iterators.transform() to return Iterable

[echauchot] [BEAM-8470] Implement read transform

[echauchot] [BEAM-8470] update TODO

[echauchot] [BEAM-8470] Apply spotless

[echauchot] [BEAM-8470] start source instantiation

[echauchot] [BEAM-8470] Improve exception flow

[echauchot] [BEAM-8470] Improve type enforcement in ReadSourceTranslator

[echauchot] [BEAM-8470] Experiment over using spark Catalog to pass in Beam 
Source

[echauchot] [BEAM-8470] Add source mocks

[echauchot] [BEAM-8470] fix mock, wire mock in translators and create a main 
test.

[echauchot] [BEAM-8470] Use raw WindowedValue so that spark Encoders could work

[echauchot] [BEAM-8470] clean deps

[echauchot] [BEAM-8470] Move DatasetSourceMock to proper batch mode

[echauchot] [BEAM-8470] Run pipeline in batch mode or in streaming mode

[echauchot] [BEAM-8470] Split batch and streaming sources and translators

[echauchot] [BEAM-8470] Use raw Encoder<WindowedValue> also in regular

[echauchot] [BEAM-8470] Clean

[echauchot] [BEAM-8470] Add ReadSourceTranslatorStreaming

[echauchot] [BEAM-8470] Move Source and translator mocks to a mock package.

[echauchot] [BEAM-8470] Pass Beam Source and PipelineOptions to the spark 
DataSource

[echauchot] [BEAM-8470] Refactor DatasetSource fields

[echauchot] [BEAM-8470] Wire the real SourceTransform instead of the mock and 
update the test

[echauchot] [BEAM-8470] Add missing 0-arg public constructor

[echauchot] [BEAM-8470] Use new PipelineOptionsSerializationUtils

[echauchot] [BEAM-8470] Apply spotless and fix checkstyle

[echauchot] [BEAM-8470] Add a dummy schema for reader

[echauchot] [BEAM-8470] Add empty 0-arg constructor for mock source

[echauchot] [BEAM-8470] Clean

[echauchot] [BEAM-8470] Checkstyle and Findbugs

[echauchot] [BEAM-8470] Refactor SourceTest to a UTest instead of a main

[echauchot] [BEAM-8470] Fix pipeline triggering: use a spark action instead of

[echauchot] [BEAM-8470] improve readability of options passing to the source

[echauchot] [BEAM-8470] Clean unneeded fields in DatasetReader

[echauchot] [BEAM-8470] Fix serialization issues

[echauchot] [BEAM-8470] Add SerializationDebugger

[echauchot] [BEAM-8470] Add serialization test

[echauchot] [BEAM-8470] Move SourceTest to same package as tested class

[echauchot] [BEAM-8470] Fix SourceTest

[echauchot] [BEAM-8470] Simplify beam reader creation as it is created once by 
the source

[echauchot] [BEAM-8470] Make all transform translators Serializable

[echauchot] [BEAM-8470] Enable test mode

[echauchot] [BEAM-8470] Enable gradle build scan

[echauchot] [BEAM-8470] Add flatten test

[echauchot] [BEAM-8470] First attempt for ParDo primitive implementation

[echauchot] [BEAM-8470] Serialize windowedValue to byte[] in source to be able 
to

[echauchot] [BEAM-8470] Comment schema choices

[echauchot] [BEAM-8470] Fix errorprone

[echauchot] [BEAM-8470] Fix testMode output to comply with new binary schema

[echauchot] [BEAM-8470] Cleaning

[echauchot] [BEAM-8470] Remove bundleSize parameter and always use spark default

[echauchot] [BEAM-8470] Fix split bug

[echauchot] [BEAM-8470] Clean

[echauchot] [BEAM-8470] Add ParDoTest

[echauchot] [BEAM-8470] Address minor review notes

[echauchot] [BEAM-8470] Clean

[echauchot] [BEAM-8470] Add GroupByKeyTest

[echauchot] [BEAM-8470] Add comments and TODO to GroupByKeyTranslatorBatch

[echauchot] [BEAM-8470] Fix type checking with Encoder of WindowedValue<T>

[echauchot] [BEAM-8470] Port latest changes of ReadSourceTranslatorBatch to

[echauchot] [BEAM-8470] Remove no longer needed putDatasetRaw

[echauchot] [BEAM-8470] Add ComplexSourceTest

[echauchot] [BEAM-8470] Fail in case of having SideInputs or State/Timers

[echauchot] [BEAM-8470] Fix Encoders: create an Encoder for every manipulated 
type

[echauchot] [BEAM-8470] Apply spotless

[echauchot] [BEAM-8470] Fixed Javadoc error

[echauchot] [BEAM-8470] Rename SparkSideInputReader class and rename 
pruneOutput()

[echauchot] [BEAM-8470] Don't use deprecated

[echauchot] [BEAM-8470] Simplify logic of ParDo translator

[echauchot] [BEAM-8470] Fix kryo issue in GBK translator with a workaround

[echauchot] [BEAM-8470] Rename SparkOutputManager for consistency

[echauchot] [BEAM-8470] Fix for test elements container in GroupByKeyTest

[echauchot] [BEAM-8470] Added "testTwoPardoInRow"

[echauchot] [BEAM-8470] Add a test for the most simple possible Combine

[echauchot] [BEAM-8470] Rename SparkDoFnFilterFunction to DoFnFilterFunction for

[echauchot] [BEAM-8470] Generalize the use of SerializablePipelineOptions in 
place

[echauchot] [BEAM-8470] Fix getSideInputs

[echauchot] [BEAM-8470] Extract binary schema creation in a helper class

[echauchot] [BEAM-8470] First version of combinePerKey

[echauchot] [BEAM-8470] Improve type checking of Tuple2 encoder

[echauchot] [BEAM-8470] Introduce WindowingHelpers (and helpers package) and 
use it

[echauchot] [BEAM-8470] Fix combiner using KV as input, use binary encoders in 
place

[echauchot] [BEAM-8470] Add combinePerKey and CombineGlobally tests

[echauchot] [BEAM-8470] Introduce RowHelpers

[echauchot] [BEAM-8470] Add CombineGlobally translation to avoid translating

[echauchot] [BEAM-8470] Cleaning

[echauchot] [BEAM-8470] Get back to classes in translators resolution because 
URNs

[echauchot] [BEAM-8470] Fix various type checking issues in Combine.Globally

[echauchot] [BEAM-8470] Update test with Long

[echauchot] [BEAM-8470] Fix combine. For unknown reason GenericRowWithSchema is 
used

[echauchot] [BEAM-8470] Use more generic Row instead of GenericRowWithSchema

[echauchot] [BEAM-8470] Add explanation about receiving a Row as input in the

[echauchot] [BEAM-8470] Fix encoder bug in combinePerKey

[echauchot] [BEAM-8470] Cleaning

[echauchot] [BEAM-8470] Implement WindowAssignTranslatorBatch

[echauchot] [BEAM-8470] Implement WindowAssignTest

[echauchot] [BEAM-8470] Fix javadoc

[echauchot] [BEAM-8470] Added SideInput support

[echauchot] [BEAM-8470] Fix CheckStyle violations

[echauchot] [BEAM-8470] Don't use Reshuffle translation

[echauchot] [BEAM-8470] Added using CachedSideInputReader

[echauchot] [BEAM-8470] Added TODO comment for ReshuffleTranslatorBatch

[echauchot] [BEAM-8470] Add unchecked warning suppression

[echauchot] [BEAM-8470] Add streaming source initialisation

[echauchot] [BEAM-8470] Implement first streaming source

[echauchot] [BEAM-8470] Add a TODO on spark output modes

[echauchot] [BEAM-8470] Add transform translators registry in 
PipelineTranslatorStreaming

[echauchot] [BEAM-8470] Add source streaming test

[echauchot] [BEAM-8470] Specify checkpointLocation at the pipeline start

[echauchot] [BEAM-8470] Clean unneeded 0 arg constructor in batch source

[echauchot] [BEAM-8470] Clean streaming source

[echauchot] [BEAM-8470] Continue impl of offsets for streaming source

[echauchot] [BEAM-8470] Deal with checkpoint and offset based read

[echauchot] [BEAM-8470] Apply spotless and fix spotbugs warnings

[echauchot] [BEAM-8470] Disable never ending test

[echauchot] [BEAM-8470] Fix access level issues, typos and modernize code to 
Java 8

[echauchot] [BEAM-8470] Merge Spark Structured Streaming runner into main Spark

[echauchot] [BEAM-8470] Fix non-vendored imports from Spark Streaming Runner 
classes

[echauchot] [BEAM-8470] Pass doFnSchemaInformation to ParDo batch translation

[echauchot] [BEAM-8470] Fix spotless issues after rebase

[echauchot] [BEAM-8470] Fix logging levels in Spark Structured Streaming 
translation

[echauchot] [BEAM-8470] Add SparkStructuredStreamingPipelineOptions and

[echauchot] [BEAM-8470] Rename SparkPipelineResult to

[echauchot] [BEAM-8470] Use PAssert in Spark Structured Streaming transform 
tests

[echauchot] [BEAM-8470] Ignore spark offsets (cf javadoc)

[echauchot] [BEAM-8470] implement source.stop

[echauchot] [BEAM-8470] Update javadoc

[echauchot] [BEAM-8470] Apply Spotless

[echauchot] [BEAM-8470] Enable batch Validates Runner tests for Structured 
Streaming

[echauchot] [BEAM-8470] Limit the number of partitions to make tests go 300% 
faster

[echauchot] [BEAM-8470] Fixes ParDo not calling setup and not tearing down if

[echauchot] [BEAM-8470] Pass transform based doFnSchemaInformation in ParDo

[echauchot] [BEAM-8470] Consider null object case on RowHelpers, fixes empty 
side

[echauchot] [BEAM-8470] Put back batch/simpleSourceTest.testBoundedSource

[echauchot] [BEAM-8470] Update windowAssignTest

[echauchot] [BEAM-8470] Add comment about checkpoint mark

[echauchot] [BEAM-8470] Re-code GroupByKeyTranslatorBatch to preserve windowing

[echauchot] [BEAM-8470] re-enable reduceFnRunner timers for output

[echauchot] [BEAM-8470] Improve visibility of debug messages

[echauchot] [BEAM-8470] Add a test that GBK preserves windowing

[echauchot] [BEAM-8470] Add TODO in Combine translations

[echauchot] [BEAM-8470] Update KVHelpers.extractKey() to deal with 
WindowedValue and

[echauchot] [BEAM-8470] Fix comment about schemas

[echauchot] [BEAM-8470] Implement reduce part of CombineGlobally translation 
with

[echauchot] [BEAM-8470] Output data after combine

[echauchot] [BEAM-8470] Implement merge accumulators part of CombineGlobally

[echauchot] [BEAM-8470] Fix encoder in combine call

[echauchot] [BEAM-8470] Revert extractKey while combinePerKey is not done (so 
that

[echauchot] [BEAM-8470] Applying a groupByKey avoids, for some reason, that the spark

[echauchot] [BEAM-8470] Fix case when a window does not merge into any other 
window

[echauchot] [BEAM-8470] Fix wrong encoder in combineGlobally GBK

[echauchot] [BEAM-8470] Fix bug in the window merging logic

[echauchot] [BEAM-8470] Remove the mapPartition that adds a key per partition

[echauchot] [BEAM-8470] Remove CombineGlobally translation because it is less

[echauchot] [BEAM-8470] Now that there is only Combine.PerKey translation, make 
only

[echauchot] [BEAM-8470] Clean no longer needed KVHelpers

[echauchot] [BEAM-8470] Clean no longer needed RowHelpers

[echauchot] [BEAM-8470] Clean no longer needed WindowingHelpers

[echauchot] [BEAM-8470] Fix javadoc of AggregatorCombiner

[echauchot] [BEAM-8470] Fixed immutable list bug

[echauchot] [BEAM-8470] add comment in combine globally test

[echauchot] [BEAM-8470] Clean groupByKeyTest

[echauchot] [BEAM-8470] Add a test that combine per key preserves windowing

[echauchot] [BEAM-8470] Ignore not-yet-working test testCombineGlobally for now

[echauchot] [BEAM-8470] Add metrics support in DoFn

[echauchot] [BEAM-8470] Add missing dependencies to run Spark Structured 
Streaming

[echauchot] [BEAM-8470] Add setEnableSparkMetricSinks() method

[echauchot] [BEAM-8470] Fix javadoc

[echauchot] [BEAM-8470] Fix accumulators initialization in Combine that 
prevented

[echauchot] [BEAM-8470] Add a test to check that CombineGlobally preserves 
windowing

[echauchot] [BEAM-8470] Persist all output Dataset if there are multiple 
outputs in

[echauchot] [BEAM-8470] Added metrics sinks and tests

[echauchot] [BEAM-8470] Make spotless happy

[echauchot] [BEAM-8470] Add PipelineResults to Spark structured streaming.

[echauchot] [BEAM-8470] Update log4j configuration

[echauchot] [BEAM-8470] Add spark execution plans extended debug messages.

[echauchot] [BEAM-8470] Print number of leaf datasets

[echauchot] [BEAM-8470] fixup! Add PipelineResults to Spark structured 
streaming.

[echauchot] [BEAM-8470] Remove no longer needed AggregatorCombinerPerKey (there is

[echauchot] [BEAM-8470] After testing performance and correctness, launch 
pipeline

[echauchot] [BEAM-8470] Improve Pardo translation performance: avoid calling a

[echauchot] [BEAM-8470] Use "sparkMaster" in local mode to obtain number of 
shuffle

[echauchot] [BEAM-8470] Wrap Beam Coders into Spark Encoders using

[echauchot] [BEAM-8470] Wrap Beam Coders into Spark Encoders using

[echauchot] [BEAM-8470] type erasure: spark encoders require a Class<T>, pass 
Object

[echauchot] [BEAM-8470] Fix scala Product in Encoders to avoid StackOverflow

[echauchot] [BEAM-8470] Conform to spark ExpressionEncoders: pass classTags,

[echauchot] [BEAM-8470] Add a simple spark native test to test Beam coders 
wrapping

[echauchot] [BEAM-8470] Fix code generation in Beam coder wrapper

[echauchot] [BEAM-8470] Lazy init coder because coder instance cannot be

[echauchot] [BEAM-8470] Fix warning in coder construction by reflection

[echauchot] [BEAM-8470] Fix ExpressionEncoder generated code: typos, try catch, 
fqcn

[echauchot] [BEAM-8470] Fix getting the output value in code generation

[echauchot] [BEAM-8470] Fix beam coder lazy init using reflection: use .class + 
try

[echauchot] [BEAM-8470] Remove lazy init of beam coder because there is no 
generic

[echauchot] [BEAM-8470] Remove example code

[echauchot] [BEAM-8470] Fix equals and hashCode

[echauchot] [BEAM-8470] Fix generated code: uniform exceptions catching, fix

[echauchot] [BEAM-8470] Add an assert of equality in the encoders test

[echauchot] [BEAM-8470] Apply spotless and checkstyle and add javadocs

[echauchot] [BEAM-8470] Wrap exceptions in UserCoderExceptions

[echauchot] [BEAM-8470] Make Encoders expressions serializable

[echauchot] [BEAM-8470] Catch Exception instead of IOException because some 
coders

[echauchot] [BEAM-8470] Apply new Encoders to CombinePerKey

[echauchot] [BEAM-8470] Apply new Encoders to Read source

[echauchot] [BEAM-8470] Improve performance of source: the mapper already calls

[echauchot] [BEAM-8470] Ignore long time failing test: SparkMetricsSinkTest

[echauchot] [BEAM-8470] Apply new Encoders to Window assign translation

[echauchot] [BEAM-8470] Apply new Encoders to AggregatorCombiner

[echauchot] [BEAM-8470] Create a Tuple2Coder to encode scala tuple2

[echauchot] [BEAM-8470] Apply new Encoders to GroupByKey

[echauchot] [BEAM-8470] Apply new Encoders to Pardo. Replace Tuple2Coder with

[echauchot] [BEAM-8470] Apply spotless, fix typo and javadoc

[echauchot] [BEAM-8470] Use beam encoders also in the output of the source

[echauchot] [BEAM-8470] Remove unneeded cast

[echauchot] [BEAM-8470] Fix: Remove generic hack of using object. Use actual 
Coder

[echauchot] [BEAM-8470] Remove Encoders based on kryo now that we call Beam 
coders

[echauchot] [BEAM-8470] Add a jenkins job for validates runner tests in the new

[echauchot] [BEAM-8470] Apply spotless

[echauchot] [BEAM-8470] Rebase on master: pass sideInputMapping in 
SimpleDoFnRunner

[echauchot] Fix SpotBugs

[echauchot] [BEAM-8470] simplify coders in combinePerKey translation

[echauchot] [BEAM-8470] Fix combiner. Do not reuse instance of accumulator

[echauchot] [BEAM-8470] input windows can arrive exploded (for sliding 
windows). As

[echauchot] [BEAM-8470] Add a combine test with sliding windows

[echauchot] [BEAM-8470] Add a test to test combine translation on 
binaryCombineFn

[echauchot] [BEAM-8470] Fix tests: use correct

[echauchot] [BEAM-8470] Fix wrong expected results in

[echauchot] [BEAM-8470] Add disclaimers about this runner being experimental

[echauchot] [BEAM-8470] Fix: create an empty accumulator in

[echauchot] [BEAM-8470] Apply spotless

[echauchot] [BEAM-8470] Add a countPerElement test with sliding windows

[echauchot] [BEAM-8470] Fix the output timestamps of combine: timestamps must be

[echauchot] [BEAM-8470] set log level to info to avoid resource consumption in

[echauchot] [BEAM-8470] Fix CombineTest.testCountPerElementWithSlidingWindows

[aromanenko.dev] [BEAM-8470] Remove "validatesStructuredStreamingRunnerBatch" 
from

[echauchot] [BEAM-8470] Fix timestamps in combine output: assign the timestamp 
to


------------------------------------------
[...truncated 1.32 MB...]
19/11/20 14:38:35 INFO sdk_worker.create_state_handler: State channel 
established.
19/11/20 14:38:35 INFO data_plane.create_data_channel: Creating client data 
channel for localhost:46283
19/11/20 14:38:35 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
19/11/20 14:38:35 INFO 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory: Closing 
environment urn: "beam:env:process:v1"
payload: "\032\202\001https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"

19/11/20 14:38:35 INFO sdk_worker.run: No more requests from control plane
19/11/20 14:38:35 INFO sdk_worker.run: SDK Harness waiting for in-flight 
requests to complete
19/11/20 14:38:35 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:35 INFO data_plane.close: Closing all cached grpc data channels.
19/11/20 14:38:35 INFO sdk_worker.close: Closing all cached gRPC state handlers.
19/11/20 14:38:35 INFO sdk_worker.run: Done consuming work.
19/11/20 14:38:35 INFO sdk_worker_main.main: Python sdk harness exiting.
19/11/20 14:38:35 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Logging client 
hanged up.
19/11/20 14:38:35 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:35 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST
19/11/20 14:38:35 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST -> 0 
artifacts
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Beam Fn Logging 
client connected.
19/11/20 14:38:36 INFO sdk_worker_main.main: Logging handler created.
19/11/20 14:38:36 INFO sdk_worker_main.start: Status HTTP server running at 
localhost:34981
19/11/20 14:38:36 INFO sdk_worker_main.main: semi_persistent_directory: /tmp
19/11/20 14:38:36 WARN sdk_worker_main._load_main_session: No session file 
found: /tmp/staged/pickled_main_session. Functions defined in __main__ 
(interactive session) may fail. 
19/11/20 14:38:36 WARN pipeline_options.get_all_options: Discarding unparseable 
args: [u'--job_server_timeout=60', 
u'--app_name=test_windowing_1574260714.18_ce6f6c60-4aee-43c2-a2d6-5dd7ae57b385',
 u'--direct_runner_use_stacked_bundle', u'--spark_master=local', 
u'--options_id=30', u'--enable_spark_metric_sinks', u'--pipeline_type_check'] 
19/11/20 14:38:36 INFO sdk_worker_main.main: Python sdk harness started with 
pipeline_options: {'runner': u'None', 'experiments': [u'beam_fn_api'], 
'environment_cache_millis': u'0', 'environment_type': u'PROCESS', 
'sdk_location': u'container', 'job_name': u'test_windowing_1574260714.18', 
'environment_config': u'{"command": 
"<https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"}',>
 'sdk_worker_parallelism': u'1', 'job_endpoint': u'localhost:33243'}
19/11/20 14:38:36 INFO statecache.__init__: Creating state cache with size 0
19/11/20 14:38:36 INFO sdk_worker.__init__: Creating insecure control channel 
for localhost:36079.
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: Beam 
Fn Control client connected with id 262-1
19/11/20 14:38:36 INFO sdk_worker.__init__: Control channel established.
19/11/20 14:38:36 INFO sdk_worker.__init__: Initializing SDKHarness with 
unbounded number of workers.
19/11/20 14:38:36 INFO sdk_worker.create_state_handler: Creating insecure state 
channel for localhost:34975.
19/11/20 14:38:36 INFO sdk_worker.create_state_handler: State channel 
established.
19/11/20 14:38:36 INFO data_plane.create_data_channel: Creating client data 
channel for localhost:40539
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory: Closing 
environment urn: "beam:env:process:v1"
payload: "\032\202\001https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"

19/11/20 14:38:36 INFO sdk_worker.run: No more requests from control plane
19/11/20 14:38:36 INFO sdk_worker.run: SDK Harness waiting for in-flight 
requests to complete
19/11/20 14:38:36 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:36 INFO data_plane.close: Closing all cached grpc data channels.
19/11/20 14:38:36 INFO sdk_worker.close: Closing all cached gRPC state handlers.
19/11/20 14:38:36 INFO sdk_worker.run: Done consuming work.
19/11/20 14:38:36 INFO sdk_worker_main.main: Python sdk harness exiting.
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Logging client 
hanged up.
19/11/20 14:38:36 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST
19/11/20 14:38:36 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST -> 0 
artifacts
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Beam Fn Logging 
client connected.
19/11/20 14:38:37 INFO sdk_worker_main.main: Logging handler created.
19/11/20 14:38:37 INFO sdk_worker_main.start: Status HTTP server running at 
localhost:46867
19/11/20 14:38:37 INFO sdk_worker_main.main: semi_persistent_directory: /tmp
19/11/20 14:38:37 WARN sdk_worker_main._load_main_session: No session file 
found: /tmp/staged/pickled_main_session. Functions defined in __main__ 
(interactive session) may fail. 
19/11/20 14:38:37 WARN pipeline_options.get_all_options: Discarding unparseable 
args: [u'--job_server_timeout=60', 
u'--app_name=test_windowing_1574260714.18_ce6f6c60-4aee-43c2-a2d6-5dd7ae57b385',
 u'--direct_runner_use_stacked_bundle', u'--spark_master=local', 
u'--options_id=30', u'--enable_spark_metric_sinks', u'--pipeline_type_check'] 
19/11/20 14:38:37 INFO sdk_worker_main.main: Python sdk harness started with 
pipeline_options: {'runner': u'None', 'experiments': [u'beam_fn_api'], 
'environment_cache_millis': u'0', 'environment_type': u'PROCESS', 
'sdk_location': u'container', 'job_name': u'test_windowing_1574260714.18', 
'environment_config': u'{"command": 
"<https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"}',>
 'sdk_worker_parallelism': u'1', 'job_endpoint': u'localhost:33243'}
19/11/20 14:38:37 INFO statecache.__init__: Creating state cache with size 0
19/11/20 14:38:37 INFO sdk_worker.__init__: Creating insecure control channel 
for localhost:44281.
19/11/20 14:38:37 INFO sdk_worker.__init__: Control channel established.
19/11/20 14:38:37 INFO sdk_worker.__init__: Initializing SDKHarness with 
unbounded number of workers.
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: Beam 
Fn Control client connected with id 263-1
19/11/20 14:38:37 INFO sdk_worker.create_state_handler: Creating insecure state 
channel for localhost:44181.
19/11/20 14:38:37 INFO sdk_worker.create_state_handler: State channel 
established.
19/11/20 14:38:37 INFO data_plane.create_data_channel: Creating client data 
channel for localhost:40627
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory: Closing 
environment urn: "beam:env:process:v1"
payload: "\032\202\001https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"

19/11/20 14:38:37 INFO sdk_worker.run: No more requests from control plane
19/11/20 14:38:37 INFO sdk_worker.run: SDK Harness waiting for in-flight 
requests to complete
19/11/20 14:38:37 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:37 INFO data_plane.close: Closing all cached grpc data channels.
19/11/20 14:38:37 INFO sdk_worker.close: Closing all cached gRPC state handlers.
19/11/20 14:38:37 INFO sdk_worker.run: Done consuming work.
19/11/20 14:38:37 INFO sdk_worker_main.main: Python sdk harness exiting.
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Logging client 
hanged up.
19/11/20 14:38:37 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST
19/11/20 14:38:37 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST -> 0 
artifacts
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Beam Fn Logging 
client connected.
19/11/20 14:38:38 INFO sdk_worker_main.main: Logging handler created.
19/11/20 14:38:38 INFO sdk_worker_main.start: Status HTTP server running at 
localhost:44003
19/11/20 14:38:38 INFO sdk_worker_main.main: semi_persistent_directory: /tmp
19/11/20 14:38:38 WARN sdk_worker_main._load_main_session: No session file 
found: /tmp/staged/pickled_main_session. Functions defined in __main__ 
(interactive session) may fail. 
19/11/20 14:38:38 WARN pipeline_options.get_all_options: Discarding unparseable 
args: [u'--job_server_timeout=60', 
u'--app_name=test_windowing_1574260714.18_ce6f6c60-4aee-43c2-a2d6-5dd7ae57b385',
 u'--direct_runner_use_stacked_bundle', u'--spark_master=local', 
u'--options_id=30', u'--enable_spark_metric_sinks', u'--pipeline_type_check'] 
19/11/20 14:38:38 INFO sdk_worker_main.main: Python sdk harness started with 
pipeline_options: {'runner': u'None', 'experiments': [u'beam_fn_api'], 
'environment_cache_millis': u'0', 'environment_type': u'PROCESS', 
'sdk_location': u'container', 'job_name': u'test_windowing_1574260714.18', 
'environment_config': u'{"command": 
"<https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"}',>
 'sdk_worker_parallelism': u'1', 'job_endpoint': u'localhost:33243'}
19/11/20 14:38:38 INFO statecache.__init__: Creating state cache with size 0
19/11/20 14:38:38 INFO sdk_worker.__init__: Creating insecure control channel 
for localhost:37061.
19/11/20 14:38:38 INFO sdk_worker.__init__: Control channel established.
19/11/20 14:38:38 INFO sdk_worker.__init__: Initializing SDKHarness with 
unbounded number of workers.
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: Beam 
Fn Control client connected with id 264-1
19/11/20 14:38:38 INFO sdk_worker.create_state_handler: Creating insecure state 
channel for localhost:40607.
19/11/20 14:38:38 INFO sdk_worker.create_state_handler: State channel 
established.
19/11/20 14:38:38 INFO data_plane.create_data_channel: Creating client data 
channel for localhost:36721
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory: Closing 
environment urn: "beam:env:process:v1"
payload: "\032\202\001https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"

19/11/20 14:38:38 INFO sdk_worker.run: No more requests from control plane
19/11/20 14:38:38 INFO sdk_worker.run: SDK Harness waiting for in-flight 
requests to complete
19/11/20 14:38:38 INFO data_plane.close: Closing all cached grpc data channels.
19/11/20 14:38:38 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:38 INFO sdk_worker.close: Closing all cached gRPC state handlers.
19/11/20 14:38:38 INFO sdk_worker.run: Done consuming work.
19/11/20 14:38:38 INFO sdk_worker_main.main: Python sdk harness exiting.
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Logging client 
hanged up.
19/11/20 14:38:38 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST
19/11/20 14:38:38 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
GetManifest for 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST -> 0 
artifacts
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Beam Fn Logging 
client connected.
19/11/20 14:38:39 INFO sdk_worker_main.main: Logging handler created.
19/11/20 14:38:39 INFO sdk_worker_main.start: Status HTTP server running at 
localhost:44923
19/11/20 14:38:39 INFO sdk_worker_main.main: semi_persistent_directory: /tmp
19/11/20 14:38:39 WARN sdk_worker_main._load_main_session: No session file 
found: /tmp/staged/pickled_main_session. Functions defined in __main__ 
(interactive session) may fail. 
19/11/20 14:38:39 WARN pipeline_options.get_all_options: Discarding unparseable 
args: [u'--job_server_timeout=60', 
u'--app_name=test_windowing_1574260714.18_ce6f6c60-4aee-43c2-a2d6-5dd7ae57b385',
 u'--direct_runner_use_stacked_bundle', u'--spark_master=local', 
u'--options_id=30', u'--enable_spark_metric_sinks', u'--pipeline_type_check'] 
19/11/20 14:38:39 INFO sdk_worker_main.main: Python sdk harness started with 
pipeline_options: {'runner': u'None', 'experiments': [u'beam_fn_api'], 
'environment_cache_millis': u'0', 'environment_type': u'PROCESS', 
'sdk_location': u'container', 'job_name': u'test_windowing_1574260714.18', 
'environment_config': u'{"command": 
"<https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"}',>
 'sdk_worker_parallelism': u'1', 'job_endpoint': u'localhost:33243'}
19/11/20 14:38:39 INFO statecache.__init__: Creating state cache with size 0
19/11/20 14:38:39 INFO sdk_worker.__init__: Creating insecure control channel 
for localhost:38011.
19/11/20 14:38:39 INFO sdk_worker.__init__: Control channel established.
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService: Beam 
Fn Control client connected with id 265-1
19/11/20 14:38:39 INFO sdk_worker.__init__: Initializing SDKHarness with 
unbounded number of workers.
19/11/20 14:38:39 INFO sdk_worker.create_state_handler: Creating insecure state 
channel for localhost:41833.
19/11/20 14:38:39 INFO sdk_worker.create_state_handler: State channel 
established.
19/11/20 14:38:39 INFO data_plane.create_data_channel: Creating client data 
channel for localhost:37195
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService: Beam Fn Data client 
connected.
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory: Closing 
environment urn: "beam:env:process:v1"
payload: "\032\202\001https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build/sdk_worker.sh"

19/11/20 14:38:39 INFO sdk_worker.run: No more requests from control plane
19/11/20 14:38:39 INFO sdk_worker.run: SDK Harness waiting for in-flight 
requests to complete
19/11/20 14:38:39 INFO data_plane.close: Closing all cached grpc data channels.
19/11/20 14:38:39 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:39 INFO sdk_worker.close: Closing all cached gRPC state handlers.
19/11/20 14:38:39 INFO sdk_worker.run: Done consuming work.
19/11/20 14:38:39 INFO sdk_worker_main.main: Python sdk harness exiting.
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.logging.GrpcLoggingService: Logging client 
hanged up.
19/11/20 14:38:39 WARN org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer: 
Hanged up for unknown endpoint.
19/11/20 14:38:39 INFO org.apache.beam.runners.spark.SparkPipelineRunner: Job 
test_windowing_1574260714.18_ce6f6c60-4aee-43c2-a2d6-5dd7ae57b385 finished.
19/11/20 14:38:39 WARN 
org.apache.beam.runners.spark.SparkPipelineResult$BatchMode: Collecting 
monitoring infos is not implemented yet in Spark portable runner.
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.artifact.AbstractArtifactRetrievalService: 
Manifest at 
/tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/MANIFEST has 0 
artifact locations
19/11/20 14:38:39 INFO 
org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService:
 Removed dir /tmp/sparktestM8epv3/job_a170b361-0556-4262-8ba2-aeaec0641ddd/
INFO:apache_beam.runners.portability.portable_runner:Job state changed to DONE
.
======================================================================
ERROR: test_pardo_state_with_custom_key_coder (__main__.SparkRunnerTest)
Tests that state requests work correctly when the key coder is an
----------------------------------------------------------------------
Traceback (most recent call last):
  File "apache_beam/runners/portability/portable_runner_test.py", line 231, in test_pardo_state_with_custom_key_coder
    equal_to(expected))
  File "apache_beam/pipeline.py", line 436, in __exit__
    self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 428, in wait_until_finish
    for state_response in self._state_stream:
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 395, in next
    return self._next()
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 552, in _next
    _common.wait(self._state.condition.wait, _response_ready)
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", line 140, in wait
    _wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", line 105, in _wait_once
    wait_fn(timeout=timeout)
  File "/usr/lib/python2.7/threading.py", line 359, in wait
    _sleep(delay)
  File "apache_beam/runners/portability/portable_runner_test.py", line 75, in handler
    raise BaseException(msg)
BaseException: Timed out after 60 seconds.

==================== Timed out after 60 seconds. ====================
# Thread: <Thread(wait_until_finish_read, started daemon 140314589488896)>

======================================================================
ERROR: test_pardo_timers (__main__.SparkRunnerTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 328, in test_pardo_timers
    assert_that(actual, equal_to(expected))
  File "apache_beam/pipeline.py", line 436, in __exit__
    self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 428, in wait_until_finish
    for state_response in self._state_stream:
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 395, in next
    return self._next()
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 552, in _next
    _common.wait(self._state.condition.wait, _response_ready)
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", line 140, in wait
    _wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
  File "https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", line 105, in _wait_once
    wait_fn(timeout=timeout)
  File "/usr/lib/python2.7/threading.py", line 359, in wait
    _sleep(delay)
  File "apache_beam/runners/portability/portable_runner_test.py", line 75, in handler
    raise BaseException(msg)
BaseException: Timed out after 60 seconds.

==================== Timed out after 60 seconds. ====================
# Thread: <Thread(Thread-118, started daemon 140314597881600)>
# Thread: <Thread(wait_until_finish_read, started daemon 140314098530048)>
# Thread: <Thread(Thread-124, started daemon 140314090137344)>
# Thread: <_MainThread(MainThread, started 140315379283712)>

======================================================================
ERROR: test_sdf_with_watermark_tracking (__main__.SparkRunnerTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 499, in 
test_sdf_with_watermark_tracking
    assert_that(actual, equal_to(list(''.join(data))))
  File "apache_beam/pipeline.py", line 436, in __exit__
    self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 438, in 
wait_until_finish
    self._job_id, self._state, self._last_error_message()))
RuntimeError: Pipeline 
test_sdf_with_watermark_tracking_1574260705.52_39edadb2-ae7d-49b9-afe2-79d8bc21cf60
 failed in state FAILED: java.lang.UnsupportedOperationException: The 
ActiveBundle does not have a registered bundle checkpoint handler.

----------------------------------------------------------------------
Ran 38 tests in 326.382s

FAILED (errors=3, skipped=9)

> Task :sdks:python:test-suites:portable:py2:sparkValidatesRunner FAILED

FAILURE: Build failed with an exception.

* Where:
Build file 'https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/ws/src/sdks/python/test-suites/portable/py2/build.gradle' line: 198

* What went wrong:
Execution failed for task 
':sdks:python:test-suites:portable:py2:sparkValidatesRunner'.
> Process 'command 'sh'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output. Run with --scan to get full insights.
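
For reference, a rerun of the failing task with those diagnostics enabled might 
look like the following (a sketch only; it assumes a local checkout of the Beam 
repository and uses its standard Gradle wrapper):

  # Hypothetical local rerun of the failing task with extra diagnostics
  ./gradlew :sdks:python:test-suites:portable:py2:sparkValidatesRunner \
      --stacktrace --info --scan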

* Get more help at https://help.gradle.org

Deprecated Gradle features were used in this build, making it incompatible with 
Gradle 6.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See 
https://docs.gradle.org/5.2.1/userguide/command_line_interface.html#sec:command_line_warnings

BUILD FAILED in 8m 29s
60 actionable tasks: 52 executed, 8 from cache

Publishing build scan...
https://gradle.com/s/admroodybd366

Build step 'Invoke Gradle script' changed build result to FAILURE
Build step 'Invoke Gradle script' marked build as failure

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
