[jira] [Work logged] (BEAM-4297) Flink portable runner executable stage operator for streaming

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4297?focusedWorklogId=106467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106467
 ]

ASF GitHub Bot logged work on BEAM-4297:


Author: ASF GitHub Bot
Created on: 29/May/18 05:36
Start Date: 29/May/18 05:36
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5407: 
[BEAM-4297] Streaming executable stage translation and operator for portable 
Flink runner.
URL: https://github.com/apache/beam/pull/5407#discussion_r191307968
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/ExecutableStageDoFnOperatorTest.java
 ##
 @@ -0,0 +1,321 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.streaming;
+
+import static org.hamcrest.Matchers.is;
+import static org.hamcrest.collection.IsIterableContainingInOrder.contains;
+import static org.junit.Assert.assertThat;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.doThrow;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.verifyNoMoreInteractions;
+import static org.mockito.Mockito.when;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import com.google.protobuf.Struct;
+import java.util.Collections;
+import java.util.List;
+import org.apache.beam.model.pipeline.v1.RunnerApi.Components;
+import org.apache.beam.model.pipeline.v1.RunnerApi.ExecutableStagePayload;
+import org.apache.beam.model.pipeline.v1.RunnerApi.PCollection;
+import org.apache.beam.runners.flink.ArtifactSourcePool;
+import org.apache.beam.runners.flink.FlinkPipelineOptions;
+import org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageContext;
+import org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator;
+import org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator;
+import org.apache.beam.runners.fnexecution.control.OutputReceiverFactory;
+import org.apache.beam.runners.fnexecution.control.RemoteBundle;
+import org.apache.beam.runners.fnexecution.control.StageBundleFactory;
+import org.apache.beam.runners.fnexecution.provisioning.JobInfo;
+import org.apache.beam.runners.fnexecution.state.StateRequestHandler;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.VarIntCoder;
+import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.fn.data.FnDataReceiver;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.apache.flink.api.common.cache.DistributedCache;
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
+import org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness;
+import org.apache.flink.util.OutputTag;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+import org.mockito.Mock;
+import org.mockito.Mockito;
+import org.mockito.MockitoAnnotations;
+
+/** Tests for {@link ExecutableStageDoFnOperator}. */
+@RunWith(JUnit4.class)
+public class ExecutableStageDoFnOperatorTest {
+  @Rule public ExpectedException thrown = ExpectedException.none();
+
+  @Mock private RuntimeContext runtimeContext;
+  @Mock private DistributedCache distributedCache;
+  @Mock private FlinkExecutableStageContext stageContext;
+  @Mock private StageBundleFactory stageBundleFactory;
 
 Review comment:
   I think the mock-based test is good for covering just the operator class, 
without other dependencies. InProcessServerFactory might be a good way to 
write an integration test that also covers the translator, outside of the 
ValidatesRunner suite. I can probably do that as a follow-up.


This is an automated 

[jira] [Work logged] (BEAM-4297) Flink portable runner executable stage operator for streaming

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4297?focusedWorklogId=106463&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106463
 ]

ASF GitHub Bot logged work on BEAM-4297:


Author: ASF GitHub Bot
Created on: 29/May/18 05:14
Start Date: 29/May/18 05:14
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5407: 
[BEAM-4297] Streaming executable stage translation and operator for portable 
Flink runner.
URL: https://github.com/apache/beam/pull/5407#discussion_r191305096
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/FlinkPipelineTranslatorUtils.java
 ##
 @@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.utils;
+
+import com.google.common.collect.BiMap;
+import com.google.common.collect.ImmutableBiMap;
+
+/**
+ * Utilities for pipeline translation.
+ */
+public final class FlinkPipelineTranslatorUtils {
+  private FlinkPipelineTranslatorUtils() {}
+
+  /**  Creates a mapping from PCollection id to output tag integer. */
+  public static BiMap<String, Integer> createOutputMap(Iterable<String> localOutputs) {
+ImmutableBiMap.Builder<String, Integer> builder = ImmutableBiMap.builder();
+int outputIndex = 0;
+for (String tag : localOutputs) {
 
 Review comment:
   done
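
The diff above is cut off inside the loop, so the rest of the method is not visible here. As a rough sketch only, assuming the intent is simply to number the outputs in iteration order, the same idea can be written with plain `java.util` collections. The class name `OutputMapSketch` and the use of `LinkedHashMap` in place of Guava's `BiMap` are assumptions made for illustration, not the code from the pull request.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the utility under review; not the code from the PR. */
final class OutputMapSketch {
  private OutputMapSketch() {}

  /** Assigns each local output name a consecutive integer, in iteration order. */
  static Map<String, Integer> createOutputMap(Iterable<String> localOutputs) {
    Map<String, Integer> result = new LinkedHashMap<>();
    int outputIndex = 0;
    for (String tag : localOutputs) {
      // Reject duplicates so the mapping stays invertible, as a BiMap would require.
      if (result.putIfAbsent(tag, outputIndex) != null) {
        throw new IllegalArgumentException("Duplicate output: " + tag);
      }
      outputIndex++;
    }
    return result;
  }

  public static void main(String[] args) {
    // Prints the mapping built from two illustrative output names.
    System.out.println(createOutputMap(Arrays.asList("main", "side")));
  }
}
```

A `BiMap` additionally gives the inverse view (integer to output name) for free, which is presumably why the translator calls `outputMap.inverse()`.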


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 106463)
Time Spent: 3h 10m  (was: 3h)

> Flink portable runner executable stage operator for streaming
> -
>
> Key: BEAM-4297
> URL: https://issues.apache.org/jira/browse/BEAM-4297
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4297) Flink portable runner executable stage operator for streaming

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4297?focusedWorklogId=106464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106464
 ]

ASF GitHub Bot logged work on BEAM-4297:


Author: ASF GitHub Bot
Created on: 29/May/18 05:14
Start Date: 29/May/18 05:14
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5407: 
[BEAM-4297] Streaming executable stage translation and operator for portable 
Flink runner.
URL: https://github.com/apache/beam/pull/5407#discussion_r191305612
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -423,8 +432,133 @@ private void translateImpulse(
   String id,
   RunnerApi.Pipeline pipeline,
   StreamingTranslationContext context) {
+// TODO: Fail on stateful DoFns for now.
+// TODO: Support stateful DoFns by inserting group-by-keys where necessary.
+// TODO: Fail on splittable DoFns.
+// TODO: Special-case single outputs to avoid multiplexing PCollections.
+RunnerApi.Components components = pipeline.getComponents();
+RunnerApi.PTransform transform = components.getTransformsOrThrow(id);
+Map<String, String> outputs = transform.getOutputsMap();
+RehydratedComponents rehydratedComponents =
+RehydratedComponents.forComponents(components);
+
+BiMap<String, Integer> outputMap =
+FlinkPipelineTranslatorUtils.createOutputMap(outputs.keySet());
+Map<String, Coder<WindowedValue<?>>> outputCoders = Maps.newHashMap();
+for (String localOutputName : new TreeMap<>(outputMap.inverse()).values()) {
+  String collectionId = outputs.get(localOutputName);
+  Coder<WindowedValue<?>> windowCoder = (Coder) instantiateCoder(collectionId, components);
+  outputCoders.put(localOutputName, windowCoder);
+}
+
+final RunnerApi.ExecutableStagePayload stagePayload;
+try {
+  stagePayload = RunnerApi.ExecutableStagePayload.parseFrom(transform.getSpec().getPayload());
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+
+String inputPCollectionId =
+Iterables.getOnlyElement(transform.getInputsMap().values());
 
 Review comment:
   I added a test for serialization. If we agree on the repeated instantiation 
of ExecutableStage, then I can take this up in a separate PR (for both batch 
and streaming translation). I would do that once we have test and end-to-end 
coverage; right now the translators are still not wired.
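
For readers unfamiliar with this kind of check, a serialization test usually round-trips the object through Java serialization and verifies that it still works afterwards. The sketch below is only an illustration under assumptions: `StageHolder` is a hypothetical stand-in for an operator that keeps the serialized payload and rebuilds its stage lazily, which is one reading of "repeated instantiation" above. It is not the test added in the pull request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SerializationRoundTrip {
  private SerializationRoundTrip() {}

  /** Hypothetical operator stand-in: only the payload bytes are serialized. */
  static final class StageHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final byte[] payloadBytes;
    // Rebuilt on first use after deserialization; deliberately not serialized.
    private transient String stage;

    StageHolder(byte[] payloadBytes) {
      this.payloadBytes = payloadBytes.clone();
    }

    String stage() {
      if (stage == null) {
        stage = new String(payloadBytes, StandardCharsets.UTF_8);
      }
      return stage;
    }
  }

  /** Serializes the value to bytes and reads it back as a fresh copy. */
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    StageHolder original = new StageHolder("stage-1".getBytes(StandardCharsets.UTF_8));
    StageHolder copy = roundTrip(original);
    System.out.println(copy.stage().equals(original.stage()));
  }
}
```

Keeping only the serializable payload in the operator and rebuilding the derived object on the worker avoids requiring the derived type itself to be serializable.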




Issue Time Tracking
---

Worklog Id: (was: 106464)
Time Spent: 3h 20m  (was: 3h 10m)

> Flink portable runner executable stage operator for streaming
> -
>
> Key: BEAM-4297
> URL: https://issues.apache.org/jira/browse/BEAM-4297
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4297) Flink portable runner executable stage operator for streaming

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4297?focusedWorklogId=106462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106462
 ]

ASF GitHub Bot logged work on BEAM-4297:


Author: ASF GitHub Bot
Created on: 29/May/18 05:14
Start Date: 29/May/18 05:14
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #5407: 
[BEAM-4297] Streaming executable stage translation and operator for portable 
Flink runner.
URL: https://github.com/apache/beam/pull/5407#discussion_r191305254
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -423,8 +432,133 @@ private void translateImpulse(
   String id,
   RunnerApi.Pipeline pipeline,
   StreamingTranslationContext context) {
+// TODO: Fail on stateful DoFns for now.
+// TODO: Support stateful DoFns by inserting group-by-keys where necessary.
+// TODO: Fail on splittable DoFns.
+// TODO: Special-case single outputs to avoid multiplexing PCollections.
+RunnerApi.Components components = pipeline.getComponents();
+RunnerApi.PTransform transform = components.getTransformsOrThrow(id);
+Map<String, String> outputs = transform.getOutputsMap();
+RehydratedComponents rehydratedComponents =
+RehydratedComponents.forComponents(components);
+
+BiMap<String, Integer> outputMap =
+FlinkPipelineTranslatorUtils.createOutputMap(outputs.keySet());
+Map<String, Coder<WindowedValue<?>>> outputCoders = Maps.newHashMap();
+for (String localOutputName : new TreeMap<>(outputMap.inverse()).values()) {
 
 Review comment:
   I have cleaned up this portion of the code.




Issue Time Tracking
---

Worklog Id: (was: 106462)
Time Spent: 3h 10m  (was: 3h)

> Flink portable runner executable stage operator for streaming
> -
>
> Key: BEAM-4297
> URL: https://issues.apache.org/jira/browse/BEAM-4297
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #368

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 18.07 MB...]
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign as step 
s15
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
May 29, 2018 2:03:00 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
May 29, 2018 2:03:00 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0529020257-87622d4f/output/results/staging/
May 29, 2018 2:03:00 AM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71120 bytes, hash DaNaYyuZ7dtjiV3nJHuKhA> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0529020257-87622d4f/output/results/staging/pipeline-DaNaYyuZ7dtjiV3nJHuKhA.pb

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 29, 2018 2:03:02 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-05-28_19_03_01-13697804118624917740?project=apache-beam-testing

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Submitted job: 2018-05-28_19_03_01-13697804118624917740

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 29, 2018 2:03:02 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-05-28_19_03_01-13697804118624917740
May 29, 2018 2:03:02 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 

Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #232

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 282.22 KB...]
java.lang.RuntimeException: com.mongodb.MongoTimeoutException: Timed out 
after 3 ms while waiting for a server that matches 
ReadPreferenceServerSelector{readPreference=primary}. Client view of cluster 
state is {type=UNKNOWN, servers=[{address=35.192.181.202:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:275)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:197)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:181)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:160)
at 
com.google.cloud.dataflow.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:77)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:383)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:355)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:286)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting 
for a server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.192.181.202:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at 

Jenkins build is back to normal : beam_PerformanceTests_HadoopInputFormat #323

2018-05-28 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT_HDFS #227

2018-05-28 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #226

2018-05-28 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam10 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 33f048b3b86ea2566c8c5bff3a5e93cc874f146d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 33f048b3b86ea2566c8c5bff3a5e93cc874f146d
Commit message: "Merge pull request #5495 from iemejia/try-dep-plugin"
 > git rev-list --no-walk 33f048b3b86ea2566c8c5bff3a5e93cc874f146d # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins1390713603054407098.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins8571547913606413747.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins7003386212323373488.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-226
Error from server (AlreadyExists): namespaces "filebasedioithdfs-226" already 
exists
Build step 'Execute shell' marked build as failure


Build failed in Jenkins: beam_PerformanceTests_AvroIOIT_HDFS #226

2018-05-28 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam12 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 33f048b3b86ea2566c8c5bff3a5e93cc874f146d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 33f048b3b86ea2566c8c5bff3a5e93cc874f146d
Commit message: "Merge pull request #5495 from iemejia/try-dep-plugin"
 > git rev-list --no-walk 33f048b3b86ea2566c8c5bff3a5e93cc874f146d # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_AvroIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5726366994623187031.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_AvroIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins834452074962830183.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_AvroIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins2007030460235419795.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-226
Error from server (AlreadyExists): namespaces "filebasedioithdfs-226" already 
exists
Build step 'Execute shell' marked build as failure


[jira] [Work logged] (BEAM-3214) Add an integration test for HBaseIO Read/Write transforms

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3214?focusedWorklogId=106415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106415
 ]

ASF GitHub Bot logged work on BEAM-3214:


Author: ASF GitHub Bot
Created on: 28/May/18 21:14
Start Date: 28/May/18 21:14
Worklog Time Spent: 10m 
  Work Description: szewi opened a new pull request #5499: [BEAM-3214] Add 
integration test for HBaseIO.
URL: https://github.com/apache/beam/pull/5499
 
 
   This PR contains:  
   - kubernetes infrastructure for running integration tests of HBaseIO
   - support for running integration test of HBaseIO with Gradle
   - java code of integration test 
   - by default the test runs on 100k records; hashes are also calculated for 
1k, 600k and 5M records, and the size can be set with the pipeline option 
--numberOfRecords
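
   To illustrate the last point, here is a minimal sketch of how a record count 
could be mapped to a precomputed hash. Everything in it is hypothetical: the 
class name, the hand-rolled argument parsing (the real test would use Beam 
pipeline options), and above all the hash strings, which are placeholders and 
not values from this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RecordCountOption {
  private RecordCountOption() {}

  // Default size stated in the PR description.
  static final int DEFAULT_NUMBER_OF_RECORDS = 100_000;

  // Placeholder values only; the real hashes live in the integration test.
  private static final Map<Integer, String> EXPECTED_HASHES = new LinkedHashMap<>();

  static {
    EXPECTED_HASHES.put(1_000, "<placeholder-hash-1k>");
    EXPECTED_HASHES.put(100_000, "<placeholder-hash-100k>");
    EXPECTED_HASHES.put(600_000, "<placeholder-hash-600k>");
    EXPECTED_HASHES.put(5_000_000, "<placeholder-hash-5M>");
  }

  /** Reads --numberOfRecords=N from the arguments, falling back to the default. */
  static int parseNumberOfRecords(String[] args) {
    String prefix = "--numberOfRecords=";
    for (String arg : args) {
      if (arg.startsWith(prefix)) {
        return Integer.parseInt(arg.substring(prefix.length()));
      }
    }
    return DEFAULT_NUMBER_OF_RECORDS;
  }

  /** Returns the precomputed hash for a supported size, or fails fast otherwise. */
  static String expectedHash(int numberOfRecords) {
    String hash = EXPECTED_HASHES.get(numberOfRecords);
    if (hash == null) {
      throw new IllegalArgumentException(
          "No precomputed hash for " + numberOfRecords + " records");
    }
    return hash;
  }

  public static void main(String[] args) {
    int records = parseNumberOfRecords(args);
    System.out.println(records + " records, expected hash " + expectedHash(records));
  }
}
```

   Failing fast on an unsupported size keeps a run from silently comparing 
against the wrong hash.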
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   




Issue Time Tracking
---

Worklog Id: (was: 106415)
Time Spent: 10m
Remaining Estimate: 0h

> Add an integration test for HBaseIO Read/Write transforms
> -
>
> Key: BEAM-3214
> URL: https://issues.apache.org/jira/browse/BEAM-3214
> Project: Beam
>  Issue Type: Test
>  Components: io-java-hbase
>Reporter: Chamikara Jayalath
>Assignee: Kamil Szewczyk
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should add a small-scale integration test for HBaseIO that can be run as 
> part of the 'beam_PostCommit_Java_MavenInstall' and 
> 'beam_PostCommit_Java_ValidatesRunner*' Jenkins test suites.





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #367

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 18.09 MB...]
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
May 28, 2018 8:08:52 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
May 28, 2018 8:08:52 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528200849-6c33f46e/output/results/staging/
May 28, 2018 8:08:53 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71120 bytes, hash 313FhuBZZQAMjIw226tldw> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528200849-6c33f46e/output/results/staging/pipeline-313FhuBZZQAMjIw226tldw.pb

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 8:08:54 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-05-28_13_08_53-1729043576820431779?project=apache-beam-testing

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Submitted job: 2018-05-28_13_08_53-1729043576820431779

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 8:08:54 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-05-28_13_08_53-1729043576820431779
May 28, 2018 8:08:54 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 2018-05-28_13_08_53-1729043576820431779 with 1 
expected assertions.
May 28, 2018 8:09:23 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-28T20:08:53.839Z: Autoscaling is enabled for job 

[jira] [Work logged] (BEAM-2810) Consider a faster Avro library in Python

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2810?focusedWorklogId=106399=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106399
 ]

ASF GitHub Bot logged work on BEAM-2810:


Author: ASF GitHub Bot
Created on: 28/May/18 19:20
Start Date: 28/May/18 19:20
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5496: do not merge! 
[BEAM-2810] use fastavro in Avro IO
URL: https://github.com/apache/beam/pull/5496#issuecomment-392586690
 
 
   Run Python PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 106399)
Time Spent: 1h 10m  (was: 1h)

> Consider a faster Avro library in Python
> 
>
> Key: BEAM-2810
> URL: https://issues.apache.org/jira/browse/BEAM-2810
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





[jira] [Work logged] (BEAM-2810) Consider a faster Avro library in Python

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2810?focusedWorklogId=106398=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106398
 ]

ASF GitHub Bot logged work on BEAM-2810:


Author: ASF GitHub Bot
Created on: 28/May/18 19:20
Start Date: 28/May/18 19:20
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5496: do not merge! 
[BEAM-2810] use fastavro in Avro IO
URL: https://github.com/apache/beam/pull/5496#issuecomment-392589998
 
 
   run python precommit




Issue Time Tracking
---

Worklog Id: (was: 106398)
Time Spent: 1h  (was: 50m)

> Consider a faster Avro library in Python
> 
>
> Key: BEAM-2810
> URL: https://issues.apache.org/jira/browse/BEAM-2810
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #322

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[iemejia] Refine dependencies, make new ones explicit and minor maven plugins

--
[...truncated 106.80 KB...]

> Task :beam-sdks-java-extensions-google-cloud-platform-core:compileTestJava 
> UP-TO-DATE
Build cache key for task 
':beam-sdks-java-extensions-google-cloud-platform-core:compileTestJava' is 
5b033746ee39bd6c6bad5a69aa871388
Skipping task 
':beam-sdks-java-extensions-google-cloud-platform-core:compileTestJava' as it 
is up-to-date.
:beam-sdks-java-extensions-google-cloud-platform-core:compileTestJava 
(Thread[Task worker for ':' Thread 11,5,main]) completed. Took 0.029 secs.
:beam-sdks-java-extensions-google-cloud-platform-core:testClasses (Thread[Task 
worker for ':' Thread 11,5,main]) started.

> Task :beam-sdks-java-extensions-google-cloud-platform-core:testClasses 
> UP-TO-DATE
Skipping task 
':beam-sdks-java-extensions-google-cloud-platform-core:testClasses' as it has 
no actions.
:beam-sdks-java-extensions-google-cloud-platform-core:testClasses (Thread[Task 
worker for ':' Thread 11,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
(Thread[Task worker for ':' Thread 11,5,main]) started.

> Task :beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
> UP-TO-DATE
Build cache key for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' is 
d0e4b63467b5777ff352b6f75284eb8a
Caching disabled for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar': Caching 
has not been enabled for the task
Skipping task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
(Thread[Task worker for ':' Thread 11,5,main]) completed. Took 0.015 secs.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:compileTestJava' is 
fb2ee1ec8f8ddd5e444eb8b3c33645ef
Skipping task ':beam-sdks-java-io-google-cloud-platform:compileTestJava' as it 
is up-to-date.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) completed. Took 0.055 secs.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:testClasses UP-TO-DATE
Skipping task ':beam-sdks-java-io-google-cloud-platform:testClasses' as it has 
no actions.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar' is 
ebc671c4a4c91772e8484aeb0fb1fb52
Caching disabled for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-sdks-java-io-google-cloud-platform:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.047 secs.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:compileTestJava' is 
fb188adede9887854f27efb0fe173c4f
Skipping task ':beam-runners-google-cloud-dataflow-java:compileTestJava' as it 
is up-to-date.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) completed. Took 0.06 secs.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:testClasses UP-TO-DATE
Skipping task ':beam-runners-google-cloud-dataflow-java:testClasses' as it has 
no actions.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.0 secs.
:beam-runners-google-cloud-dataflow-java:shadowTestJar (Thread[Task worker for 
':' Thread 10,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar' is 
2ce5f20f0c5c1caf8a47048e77b592e7
Caching disabled for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-runners-google-cloud-dataflow-java:shadowTestJar' as 

Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #225

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[iemejia] Refine dependencies, make new ones explicit and minor maven plugins

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam10 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 33f048b3b86ea2566c8c5bff3a5e93cc874f146d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 33f048b3b86ea2566c8c5bff3a5e93cc874f146d
Commit message: "Merge pull request #5495 from iemejia/try-dep-plugin"
 > git rev-list --no-walk 3a668eaff6266c87c645a44cc84b4d5e1c3b2228 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5351208593317861394.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins2239071674187469199.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins3183451553647688404.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-225
Error from server (AlreadyExists): namespaces "filebasedioithdfs-225" already 
exists
Build step 'Execute shell' marked build as failure
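
The failure above comes from `kubectl create namespace` exiting non-zero because a namespace with that name already exists. A minimal sketch of one way a setup step could tolerate that case, assuming a hypothetical helper `ensure_namespace` and an injectable `run` callable (neither appears in the Jenkins job). Only the `kubectl create namespace` command and the `AlreadyExists` error text are taken from the log, and the sketch assumes that text is written to stderr:

```python
import subprocess
from typing import Callable, Sequence, Tuple

# Hypothetical runner type: takes an argv list, returns (exit_code, stderr_text).
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def kubectl_runner(argv: Sequence[str]) -> Tuple[int, str]:
    """Run a command and return its exit code and stderr (assumes kubectl is on PATH)."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stderr


def ensure_namespace(name: str, run: Runner = kubectl_runner) -> bool:
    """Create a namespace, treating an 'AlreadyExists' error as success.

    Returns True if the namespace was created, False if it already existed.
    Raises RuntimeError for any other failure.
    """
    code, stderr = run(["kubectl", "create", "namespace", name])
    if code == 0:
        return True
    if "AlreadyExists" in stderr:
        return False
    raise RuntimeError(f"kubectl create namespace {name} failed: {stderr.strip()}")
```

Whether reusing a leftover namespace is safe depends on the job; deleting and recreating it may be the better choice if stale resources could affect the test.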


[jira] [Work logged] (BEAM-2810) Consider a faster Avro library in Python

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2810?focusedWorklogId=106397=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106397
 ]

ASF GitHub Bot logged work on BEAM-2810:


Author: ASF GitHub Bot
Created on: 28/May/18 18:57
Start Date: 28/May/18 18:57
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5496: do not merge! 
[BEAM-2810] use fastavro in Avro IO
URL: https://github.com/apache/beam/pull/5496#issuecomment-392586690
 
 
   Run Python PreCommit




Issue Time Tracking
---

Worklog Id: (was: 106397)
Time Spent: 50m  (was: 40m)

> Consider a faster Avro library in Python
> 
>
> Key: BEAM-2810
> URL: https://issues.apache.org/jira/browse/BEAM-2810
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





[jira] [Work logged] (BEAM-2810) Consider a faster Avro library in Python

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2810?focusedWorklogId=106396=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106396
 ]

ASF GitHub Bot logged work on BEAM-2810:


Author: ASF GitHub Bot
Created on: 28/May/18 18:55
Start Date: 28/May/18 18:55
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5496: do not merge! 
[BEAM-2810] use fastavro in Avro IO
URL: https://github.com/apache/beam/pull/5496#issuecomment-392586690
 
 
   Run PythonPreCommit




Issue Time Tracking
---

Worklog Id: (was: 106396)
Time Spent: 40m  (was: 0.5h)

> Consider a faster Avro library in Python
> 
>
> Key: BEAM-2810
> URL: https://issues.apache.org/jira/browse/BEAM-2810
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





[jira] [Work logged] (BEAM-2810) Consider a faster Avro library in Python

2018-05-28 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2810?focusedWorklogId=106395=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106395
 ]

ASF GitHub Bot logged work on BEAM-2810:


Author: ASF GitHub Bot
Created on: 28/May/18 18:54
Start Date: 28/May/18 18:54
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5496: do not merge! 
[BEAM-2810] use fastavro in Avro IO
URL: https://github.com/apache/beam/pull/5496#issuecomment-392586611
 
 
   I've observed locally that it takes two tries to pick up a new version of 
`fastavro-blocks` from test.pypi.org, and I think maybe that's what's happening 
here? Going to try to tell Jenkins to re-run…
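
The two-tries behaviour described above could be handled with a small retry wrapper. A minimal sketch, assuming a hypothetical `action` callable that raises on failure (for example a wrapper around a package install step); the helper name `retry` and its parameters are illustrative and not part of the PR:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(action: Callable[[], T], attempts: int = 2, delay_seconds: float = 0.0) -> T:
    """Call `action` up to `attempts` times and return its first successful result.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            # Wait before the next attempt; 0.0 means retry immediately.
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

The default of two attempts mirrors the observation in the comment; a real build would probably want a short delay and a narrower exception type.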




Issue Time Tracking
---

Worklog Id: (was: 106395)
Time Spent: 0.5h  (was: 20m)

> Consider a faster Avro library in Python
> 
>
> Key: BEAM-2810
> URL: https://issues.apache.org/jira/browse/BEAM-2810
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Eugene Kirpichov
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> https://stackoverflow.com/questions/45870789/bottleneck-on-data-source
> Seems like this job is reading Avro files (exported by BigQuery) at about 2 
> MB/s.
> We use the standard Python "avro" library which is apparently known to be 
> very slow (10x+ slower than Java) 
> http://apache-avro.679487.n3.nabble.com/Avro-decode-very-slow-in-Python-td4034422.html,
>  and there are alternatives e.g. https://pypi.python.org/pypi/fastavro/





Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #231

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[iemejia] Refine dependencies, make new ones explicit and minor maven plugins

--
[...truncated 357.09 KB...]
at org.apache.beam.sdk.testing.PAssert.doChecks(PAssert.java:1250)
at 
org.apache.beam.sdk.testing.PAssert$SideInputCheckerDoFn.processElement(PAssert.java:1192)
at 
org.apache.beam.sdk.testing.PAssert$SideInputCheckerDoFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:118)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:200)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
at 
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:383)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:355)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:286)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
java.lang.AssertionError: Calculate hashcode/Flatten.PCollections.out: 
Expected: "e5e0503902018c83e8c8977ef437feba"
 but: was "09be6c406fef3c30e837134d9f0430af"
at 
org.apache.beam.sdk.testing.PAssert$PAssertionSite.capture(PAssert.java:168)
at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:418)
at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:408)
at 
org.apache.beam.sdk.io.mongodb.MongoDBIOIT.testWriteAndRead(MongoDBIOIT.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at 

Build failed in Jenkins: beam_PerformanceTests_Compressed_TextIOIT_HDFS #226

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[iemejia] Refine dependencies, make new ones explicit and minor maven plugins

--
[...truncated 355.11 KB...]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy65.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1657)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.match(HadoopFileSystem.java:81)
at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:123)
at 
org.apache.beam.sdk.io.common.FileBasedIOITHelper$DeleteFileFn.processElement(FileBasedIOITHelper.java:90)
at 
org.apache.beam.sdk.io.common.FileBasedIOITHelper$DeleteFileFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:200)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
at 
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:383)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:355)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:286)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
java.net.ConnectException: Call From 
textioit0writethenreadall-05281105-8q0e-harness-wnd0.c.apache-beam-testing.internal/10.128.0.5
 to 9.136.198.104.bc.googleusercontent.com:9000 failed on connection exception: 
java.net.ConnectException: Connection refused; For more details see:  
http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy64.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at 

[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106373=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106373
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 16:26
Start Date: 28/May/18 16:26
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #5242: [BEAM-214] 
ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392565675
 
 
   I tried to run the simple pipeline 
`p.apply(ParquetIO.read(SCHEMA).from("s3://bucket-name/*"));` against S3 and 
got this exception:
   `An exception occured while executing the Java class. java.io.IOException: 
can not read class org.apache.parquet.format.FileMetaData: java.io.IOException: 
Attempted read on closed stream.`
   It works fine against local FS.




Issue Time Tracking
---

Worklog Id: (was: 106373)
Time Spent: 16h 50m  (was: 16h 40m)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
> Issue Type: Improvement
> Components: io-ideas
> Reporter: Neville Li
> Assignee: Jean-Baptiste Onofré
> Priority: Minor
> Time Spent: 16h 50m
> Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #366

2018-05-28 Thread Apache Jenkins Server
See 


Changes:

[iemejia] Refine dependencies, make new ones explicit and minor maven plugins

--
[...truncated 18.07 MB...]
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign as step 
s15
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
May 28, 2018 4:04:44 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
May 28, 2018 4:04:44 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528160441-aad7a812/output/results/staging/
May 28, 2018 4:04:44 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71120 bytes, hash sDO54lzHnncWqfW3Uzb7-A> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528160441-aad7a812/output/results/staging/pipeline-sDO54lzHnncWqfW3Uzb7-A.pb

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 4:04:45 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-05-28_09_04_44-10792389488258178972?project=apache-beam-testing

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Submitted job: 2018-05-28_09_04_44-10792389488258178972

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 4:04:45 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-05-28_09_04_44-10792389488258178972
May 28, 2018 4:04:45 PM 

[jira] [Created] (BEAM-4417) BigqueryIO Numeric datatype Support

2018-05-28 Thread Kishan Kumar (JIRA)
Kishan Kumar created BEAM-4417:
--

 Summary: BigqueryIO Numeric datatype Support
 Key: BEAM-4417
 URL: https://issues.apache.org/jira/browse/BEAM-4417
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Affects Versions: 2.4.0
Reporter: Kishan Kumar
Assignee: Chamikara Jayalath
 Fix For: 2.5.0


BigQueryIO.read fails while parsing the Avro file that is generated when 
reading from a table that has columns of the *Numeric* data type.

We went through the source code on GitHub and noticed that the *Numeric data 
type is not yet supported.* 
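
For context, here is a minimal sketch of what decoding such a value involves. It assumes the Numeric column arrives in the Avro file as bytes carrying Avro's `decimal` logical type (a big-endian two's-complement unscaled integer) and assumes a scale of 9. The scale and the helper name `decode_avro_decimal` are assumptions for illustration only; this is not BigQueryIO code.

```python
from decimal import Decimal


def decode_avro_decimal(raw: bytes, scale: int = 9) -> Decimal:
    """Decode an Avro `decimal` logical-type payload into a Decimal.

    Avro stores the unscaled value as a big-endian two's-complement integer;
    the scale (assumed to be 9 here) comes from the schema, not the payload.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Building the Decimal from a string keeps every digit, independent of
    # the precision configured in the current decimal context.
    return Decimal(f"{unscaled}E{-scale}")


# 1.5 with scale 9 has the unscaled value 1_500_000_000.
payload = (1_500_000_000).to_bytes(8, byteorder="big", signed=True)
print(decode_avro_decimal(payload))
```

A reader that does not recognise the logical type would see only the raw bytes, which may be the kind of parsing failure described above.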

 

 





[jira] [Created] (BEAM-4416) Change google-cloud-platform IOITs to write then read style Performance Tests

2018-05-28 Thread JIRA
Łukasz Gajowy created BEAM-4416:
---

 Summary: Change google-cloud-platform IOITs to write then read 
style Performance Tests
 Key: BEAM-4416
 URL: https://issues.apache.org/jira/browse/BEAM-4416
 Project: Beam
  Issue Type: Wish
  Components: testing
Reporter: Łukasz Gajowy
Assignee: Jason Kuster


Google Cloud Platform IOITs differ from other IOITs (such as JdbcIOIT or 
MongodbIOIT) and do not follow the rules described in the documentation: 
[https://beam.apache.org/documentation/io/testing/#i-o-transform-integration-tests].
 

We should make them consistent with the other tests, more specifically: 
 - write them in writeThenReadAll style
 - enable running them with Perfkit
 - provide a Jenkins job to run them periodically





[jira] [Work logged] (BEAM-4283) Export nexmark execution times to bigQuery

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4283?focusedWorklogId=106361=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106361
 ]

ASF GitHub Bot logged work on BEAM-4283:


Author: ASF GitHub Bot
Created on: 28/May/18 15:24
Start Date: 28/May/18 15:24
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5464: [BEAM-4283] Write 
Nexmark execution times to bigquery
URL: https://github.com/apache/beam/pull/5464#issuecomment-392554240
 
 
   @chamikaramj finally, in the production code, I used `BigQueryServices` 
included in `BigQueryIO` with fake windows/pane/timestamp instead of the 
regular big query client. Thus, I could use `FakeBigQueryServices` in the test.
   PTAL


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 106361)
Time Spent: 50m  (was: 40m)

> Export nexmark execution times to bigQuery
> --
>
> Key: BEAM-4283
> URL: https://issues.apache.org/jira/browse/BEAM-4283
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Nexmark only outputs the results collection to BigQuery and prints the 
> execution times to the console. To monitor Nexmark execution times, we need 
> to store them as well, per runner/query/mode.





[jira] [Commented] (BEAM-4415) Enable HDFS based Performance Test for ParquetIO

2018-05-28 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492771#comment-16492771
 ] 

Łukasz Gajowy commented on BEAM-4415:
-

Note that this issue may require some changes in ParquetIO itself as discussed 
here: https://github.com/apache/beam/pull/5242#issuecomment-392492612

> Enable HDFS based Performance Test for ParquetIO
> 
>
> Key: BEAM-4415
> URL: https://issues.apache.org/jira/browse/BEAM-4415
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Łukasz Gajowy
>Assignee: Łukasz Gajowy
>Priority: Major
>
> There is already a running job for ParquetIO on Jenkins: 
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_ParquetIOIT/]
>  
> There are also Jenkins jobs running such tests on an HDFS cluster: 
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_AvroIOIT_HDFS/]
>  
> Therefore, we should provide a Performance Test for ParquetIO running on an 
> HDFS cluster.





[jira] [Created] (BEAM-4415) Enable HDFS based Performance Test for ParquetIO

2018-05-28 Thread JIRA
Łukasz Gajowy created BEAM-4415:
---

 Summary: Enable HDFS based Performance Test for ParquetIO
 Key: BEAM-4415
 URL: https://issues.apache.org/jira/browse/BEAM-4415
 Project: Beam
  Issue Type: New Feature
  Components: testing
Reporter: Łukasz Gajowy
Assignee: Łukasz Gajowy


There is already a running job for ParquetIO on Jenkins: 
[https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_ParquetIOIT/]
 

There are also Jenkins jobs running such tests on an HDFS cluster: 
[https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_AvroIOIT_HDFS/]
 

Therefore, we should provide a Performance Test for ParquetIO running on an 
HDFS cluster.





[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106359
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 14:54
Start Date: 28/May/18 14:54
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392546365
 
 
   @jbonofre I didn't check S3 but I was able to run the ParquetIOIT on HDFS 
for 1 000 000 records (~50Mb). However, for larger loads, like 100 000 000 
records (~5Gb) I get ` java.net.ConnectException: Connection refused` from the 
HDFS cluster that I have set up on Kubernetes. Maybe this is the same issue 
(similar?) you face right now? 
   
   




Issue Time Tracking
---

Worklog Id: (was: 106359)
Time Spent: 16h 40m  (was: 16.5h)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106358=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106358
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 14:53
Start Date: 28/May/18 14:53
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392546365
 
 
   @jbonofre I didn't check S3 but I was able to run the ParquetIOIT on HDFS 
for 1 000 000 records (~50Mb). However, for larger loads, like 100 000 000 
records (~5Gb) I get ` java.net.ConnectException: Connection refused` from the 
HDFS cluster that I have set up on Kubernetes. Maybe this is the same issue 
(similar?) you face right now? 
   
   




Issue Time Tracking
---

Worklog Id: (was: 106358)
Time Spent: 16.5h  (was: 16h 20m)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 16.5h
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





[jira] [Created] (BEAM-4414) Create more specific namespace for each IOIT in FileBasedIOIT

2018-05-28 Thread Kasia Kucharczyk (JIRA)
Kasia Kucharczyk created BEAM-4414:
--

 Summary: Create more specific namespace for each IOIT in 
FileBasedIOIT
 Key: BEAM-4414
 URL: https://issues.apache.org/jira/browse/BEAM-4414
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Kasia Kucharczyk
Assignee: Kasia Kucharczyk


After the namespace change (https://issues.apache.org/jira/browse/BEAM-4371), 
file-based tests started failing because they share a namespace. Each of those 
tests (e.g. TextIOIT or AvroIOIT) should pass its own 'subname' to the 
namespace, e.g. 'filebasedioithdfs-203' > 'filebasedioithdfs-text-203'.
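
The renaming in the example above can be sketched as follows. The helper name `build_namespace` and its parameters are hypothetical, chosen only to mirror the 'filebasedioithdfs-text-203' example; the actual test setup may build the namespace differently.

```python
def build_namespace(prefix: str, build_id: str, subname: str = "") -> str:
    """Join the non-empty parts with '-' so each test can get its own namespace.

    `prefix`, `build_id` and `subname` are illustrative names, not taken from
    the Beam test configuration.
    """
    parts = [prefix, subname, build_id]
    return "-".join(part for part in parts if part)


# Shared namespace (the failing case) versus a per-test namespace.
print(build_namespace("filebasedioithdfs", "203"))
print(build_namespace("filebasedioithdfs", "203", subname="text"))
```

With a distinct subname per test (for example 'text' or 'avro'), two tests in the same build no longer resolve to the same namespace.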





[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106357=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106357
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 14:48
Start Date: 28/May/18 14:48
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392546365
 
 
   @jbonofre I didn't check S3 but I was able to run the ParquetIOIT on HDFS 
for 1 000 000 records (~50Mb). However, for larger loads, like 100 000 000 
records (~5Gb) I get ` java.net.ConnectException: Connection refused`. Maybe 
this is the same issue you face right now? 
   
   




Issue Time Tracking
---

Worklog Id: (was: 106357)
Time Spent: 16h 20m  (was: 16h 10m)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 16h 20m
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #365

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 18.13 MB...]
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
May 28, 2018 1:59:58 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
May 28, 2018 1:59:58 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528135954-c31ceaa/output/results/staging/
May 28, 2018 1:59:58 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71120 bytes, hash uezJoAtHowg_kN_E2rXRdQ> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528135954-c31ceaa/output/results/staging/pipeline-uezJoAtHowg_kN_E2rXRdQ.pb

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 1:59:59 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-05-28_06_59_58-18180271460954501526?project=apache-beam-testing

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Submitted job: 2018-05-28_06_59_58-18180271460954501526

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 1:59:59 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-05-28_06_59_58-18180271460954501526
May 28, 2018 1:59:59 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 2018-05-28_06_59_58-18180271460954501526 with 1 
expected assertions.
May 28, 2018 2:00:13 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-28T13:59:58.646Z: Autoscaling is enabled for job 

[jira] [Resolved] (BEAM-4306) Enforce ErrorProne analysis in apex project

2018-05-28 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-4306.

   Resolution: Fixed
Fix Version/s: 2.5.0

> Enforce ErrorProne analysis in apex project
> ---
>
> Key: BEAM-4306
> URL: https://issues.apache.org/jira/browse/BEAM-4306
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex
>Reporter: Scott Wegner
>Assignee: Ismaël Mejía
>Priority: Minor
>  Labels: errorprone, starter
> Fix For: 2.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Java ErrorProne static analysis was [recently 
> enabled|https://github.com/apache/beam/pull/5161] in the Gradle build 
> process, but only as warnings. ErrorProne errors are generally useful and 
> easy to fix. Some work was done to [make sdks-java-core 
> ErrorProne-clean|https://github.com/apache/beam/pull/5319] and add 
> enforcement. This task is to clean up ErrorProne warnings and add enforcement in 
> {{beam-runners-apex}}. Additional context discussed on the [dev 
> list|https://lists.apache.org/thread.html/95aae2785c3cd728c2d3378cbdff2a7ba19caffcd4faa2049d2e2f46@%3Cdev.beam.apache.org%3E].
> Fixing this issue will involve:
> # Follow instructions in the [Contribution 
> Guide|https://beam.apache.org/contribute/] to set up a {{beam}} development 
> environment.
> # Run the following command to compile and run ErrorProne analysis on the 
> project: {{./gradlew :beam-runners-apex:assemble}}
> # Fix each ErrorProne warning from the {{runners/apex}} project.
> # In {{runners/apex/build.gradle}}, add {{failOnWarning: true}} to the call 
> to {{applyJavaNature()}} 
> ([example|https://github.com/apache/beam/pull/5319/files#diff-9390c20635aed5f42f83b97506a87333R20]).
> This starter issue is sponsored by [~swegner]. Feel free to [reach 
> out|https://beam.apache.org/community/contact-us/] with questions or code 
> review:
> * JIRA: [~swegner]
> * GitHub: [@swegner|https://github.com/swegner]
> * Slack: [@Scott Wegner|https://s.apache.org/beam-slack-channel]
> * Email: swegner at google dot com





[jira] [Work logged] (BEAM-4008) Futurize and fix python 2 compatibility for utils subpackage

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4008?focusedWorklogId=106330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106330
 ]

ASF GitHub Bot logged work on BEAM-4008:


Author: ASF GitHub Bot
Created on: 28/May/18 13:23
Start Date: 28/May/18 13:23
Worklog Time Spent: 10m 
  Work Description: RobbeSneyders commented on a change in pull request 
#5336: [BEAM-4008] Futurize utils subpackage
URL: https://github.com/apache/beam/pull/5336#discussion_r191205722
 
 

 ##
 File path: sdks/python/apache_beam/utils/profiler.py
 ##
 @@ -20,21 +20,19 @@
 For internal use only; no backwards-compatibility guarantees.
 """
 
-import cProfile
+from __future__ import absolute_import
+
+import cProfile  # pylint: disable=bad-python3-import
 
 Review comment:
   This is a bug in pylint.
   See https://github.com/PyCQA/pylint/issues/1612 




Issue Time Tracking
---

Worklog Id: (was: 106330)
Time Spent: 1h 10m  (was: 1h)

> Futurize and fix python 2 compatibility for utils subpackage
> 
>
> Key: BEAM-4008
> URL: https://issues.apache.org/jira/browse/BEAM-4008
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-4413) Create infrastructure configuration (docker/k8s) for HCatalogIOIT

2018-05-28 Thread Kasia Kucharczyk (JIRA)
Kasia Kucharczyk created BEAM-4413:
--

 Summary: Create infrastructure configuration (docker/k8s) for 
HCatalogIOIT
 Key: BEAM-4413
 URL: https://issues.apache.org/jira/browse/BEAM-4413
 Project: Beam
  Issue Type: Test
  Components: testing
Reporter: Kasia Kucharczyk
Assignee: Jason Kuster


Currently the HCatalogIOIT test works +only+ with provided infrastructure 
(e.g. DataProc). This is caused by the lack of a Docker image that meets the 
test requirements.
 * First, some of the Hive images are custom made and do not contain 
HCatalog.
 * Second, some of the images are problematic in terms of network 
configuration. It was impossible to make all components (datanodes, namenodes, 
etc.) visible to each other, which showed up as the following error:

{code:java}
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File could only be 
replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) 
running and 1 node(s) are excluded in this operation.{code}

Here is a related Stack Overflow question: 
[https://stackoverflow.com/questions/26743656/hadoop-2-5-0-failure-to-write-file-remotely]
For the moment there is no solution.

A reasonable next step is to create our own Docker image, publish it, and 
test it.





Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #230

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 212.27 KB...]
at 
org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:66)
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:436)
at 
org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:424)
at 
org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:129)
at 
org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:141)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:200)
at 
com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
at 
com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:383)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:355)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:286)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting 
for a server that matches WritableServerSelector. Client view of cluster state 
is {type=UNKNOWN, servers=[{address=35.232.185.85:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting 
for a server that matches WritableServerSelector. Client view of cluster state 
is {type=UNKNOWN, servers=[{address=35.232.185.85:27017, type=UNKNOWN, 
state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception 
opening socket}, caused by {java.net.SocketTimeoutException: connect timed 
out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 

[beam] branch master updated (3a668ea -> 33f048b)

2018-05-28 Thread jbonofre
This is an automated email from the ASF dual-hosted git repository.

jbonofre pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 3a668ea  Merge pull request #5475: [BEAM-4306] Enforce ErrorProne 
analysis in apex runner
 add bd5cdef  Refine dependencies, make new ones explicit and minor maven 
plugins updates
 add 33f048b  Merge pull request #5495 from iemejia/try-dep-plugin

No new revisions were added by this update.

Summary of changes:
 pom.xml  |  7 ---
 runners/core-construction-java/build.gradle  |  4 ++--
 runners/core-construction-java/pom.xml   | 11 ++-
 runners/core-java/build.gradle   |  2 +-
 runners/core-java/pom.xml| 11 ++-
 runners/direct-java/pom.xml  | 20 
 runners/extensions-java/metrics/build.gradle |  2 +-
 runners/extensions-java/metrics/pom.xml  | 17 -
 sdks/java/io/common/build.gradle |  4 ++--
 sdks/java/io/common/pom.xml  | 12 +++-
 10 files changed, 57 insertions(+), 33 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
jbono...@apache.org.


Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #321

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 108.09 KB...]
> Task :beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
> UP-TO-DATE
Build cache key for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' is 
2ec5061a8ca140c3410d14d32f133958
Caching disabled for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar': Caching 
has not been enabled for the task
Skipping task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
(Thread[Task worker for ':' Thread 6,5,main]) completed. Took 0.014 secs.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 6,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:compileTestJava' is 
0cf8771af1f1b5bd43d001452413e6ae
Skipping task ':beam-sdks-java-io-google-cloud-platform:compileTestJava' as it 
is up-to-date.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 6,5,main]) completed. Took 0.052 secs.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 6,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:testClasses UP-TO-DATE
Skipping task ':beam-sdks-java-io-google-cloud-platform:testClasses' as it has 
no actions.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 6,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 6,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar' is 
a7e6b685f0f365c0a9913ef367e311e0
Caching disabled for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-sdks-java-io-google-cloud-platform:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 6,5,main]) completed. Took 0.029 secs.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 6,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:compileTestJava' is 
eeca6bcd9b05030e0021500fe1c65e19
Skipping task ':beam-runners-google-cloud-dataflow-java:compileTestJava' as it 
is up-to-date.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 6,5,main]) completed. Took 0.048 secs.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 6,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:testClasses UP-TO-DATE
Skipping task ':beam-runners-google-cloud-dataflow-java:testClasses' as it has 
no actions.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 6,5,main]) completed. Took 0.0 secs.
:beam-runners-google-cloud-dataflow-java:shadowTestJar (Thread[Task worker for 
':' Thread 8,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar' is 
0d85ca47af4df2978d3d3131e6bd3af6
Caching disabled for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-runners-google-cloud-dataflow-java:shadowTestJar' as it is 
up-to-date.
:beam-runners-google-cloud-dataflow-java:shadowTestJar (Thread[Task worker for 
':' Thread 8,5,main]) completed. Took 0.03 secs.
:beam-sdks-java-io-hadoop-input-format:compileTestJava (Thread[Task worker for 
':' Thread 8,5,main]) started.

> Task :beam-sdks-java-io-hadoop-input-format:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-hadoop-input-format:compileTestJava' is 
4855d2b393b42a296911735edbe4a203
Skipping task ':beam-sdks-java-io-hadoop-input-format:compileTestJava' as it is 
up-to-date.
:beam-sdks-java-io-hadoop-input-format:compileTestJava (Thread[Task worker for 
':' Thread 8,5,main]) completed. Took 0.439 secs.
:beam-sdks-java-io-hadoop-input-format:testClasses (Thread[Task worker for ':' 
Thread 8,5,main]) started.

> Task :beam-sdks-java-io-hadoop-input-format:testClasses UP-TO-DATE
Skipping task ':beam-sdks-java-io-hadoop-input-format:testClasses' as it has no 
actions.
:beam-sdks-java-io-hadoop-input-format:testClasses (Thread[Task worker for ':' 
Thread 8,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-io-hadoop-input-format:integrationTest (Thread[Task worker for 
':' Thread 8,5,main]) started.

Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #224

2018-05-28 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam10 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3a668eaff6266c87c645a44cc84b4d5e1c3b2228 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3a668eaff6266c87c645a44cc84b4d5e1c3b2228
Commit message: "Merge pull request #5475: [BEAM-4306] Enforce ErrorProne 
analysis in apex runner"
 > git rev-list --no-walk 3a668eaff6266c87c645a44cc84b4d5e1c3b2228 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins854525233264603.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins7687198278845063754.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins1732928241995242020.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-224
Error from server (AlreadyExists): namespaces "filebasedioithdfs-224" already 
exists
Build step 'Execute shell' marked build as failure


[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106307
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 10:56
Start Date: 28/May/18 10:56
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392492612
 
 
   By the way, `Read` seems to work only on the local filesystem (not S3/HDFS). 
I'm investigating the issue.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 106307)
Time Spent: 16h 10m  (was: 16h)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106297
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 10:21
Start Date: 28/May/18 10:21
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392485942
 
 
   @Igosuki this is interesting - could you elaborate a little bit more? Are 
there any parquet issues (or something similar) describing the behaviour you 
mentioned? 




Issue Time Tracking
---

Worklog Id: (was: 106297)
Time Spent: 16h  (was: 15h 50m)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 16h
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





[jira] [Work logged] (BEAM-214) Create Parquet IO

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=106294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106294
 ]

ASF GitHub Bot logged work on BEAM-214:
---

Author: ASF GitHub Bot
Created on: 28/May/18 10:08
Start Date: 28/May/18 10:08
Worklog Time Spent: 10m 
  Work Description: Igosuki commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-392483237
 
 
   There should be accompanying documentation about memory-management caveats: 
since this uses the ParquetReader, it may make memory explode in 
unexpected ways.




Issue Time Tracking
---

Worklog Id: (was: 106294)
Time Spent: 15h 50m  (was: 15h 40m)

> Create Parquet IO
> -
>
> Key: BEAM-214
> URL: https://issues.apache.org/jira/browse/BEAM-214
> Project: Beam
>  Issue Type: Improvement
>  Components: io-ideas
>Reporter: Neville Li
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.





[jira] [Commented] (BEAM-4366) Flipping tests - some tests in Euphoria operator test suit are failing randomly

2018-05-28 Thread Vaclav Plajt (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492383#comment-16492383
 ] 

Vaclav Plajt commented on BEAM-4366:


Merge request to review: [https://github.com/seznam/beam/pull/2]

Fix description: 
[https://github.com/seznam/beam/commit/7fcc8ebfb6eeed227326a3b3c1a1ed12c39f24a3]

 

> Flipping tests - some tests in Euphoria operator test suit are failing 
> randomly
> ---
>
> Key: BEAM-4366
> URL: https://issues.apache.org/jira/browse/BEAM-4366
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-euphoria
>Affects Versions: Not applicable
>Reporter: Vaclav Plajt
>Assignee: Vaclav Plajt
>Priority: Major
>
> When the whole test suite runs, some tests may randomly fail. The assertion 
> error shows empty test output. The current belief is that we may have a 
> problem somewhere around {{ListDataSink}} due to a race condition.
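
The kind of race suspected above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Euphoria code: `CollectingSinkSketch` and `writeConcurrently` are hypothetical names, and the sink is modelled as a plain `List<Integer>` shared by several writer threads. With a synchronized list every `add()` is atomic, so no outputs are lost; with an unsynchronized `ArrayList` the result depends on timing, which would be consistent with tests that fail only occasionally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an in-memory test sink; not Euphoria's ListDataSink.
class CollectingSinkSketch {

  // Writes `perWriter` elements from each of `writers` threads into `sink`
  // and returns the number of elements the sink holds afterwards.
  static int writeConcurrently(List<Integer> sink, int writers, int perWriter) {
    ExecutorService pool = Executors.newFixedThreadPool(writers);
    for (int w = 0; w < writers; w++) {
      pool.execute(() -> {
        for (int i = 0; i < perWriter; i++) {
          sink.add(i);
        }
      });
    }
    pool.shutdown();
    try {
      // Wait for all writers before reading, so the read itself does not race.
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("writers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for writers", e);
    }
    return sink.size();
  }

  public static void main(String[] args) {
    // Synchronized wrapper: every add() is atomic, so no element is lost.
    List<Integer> safeSink = Collections.synchronizedList(new ArrayList<>());
    System.out.println("synchronized sink size: " + writeConcurrently(safeSink, 8, 10_000));
    // Passing a plain ArrayList instead may lose elements or fail inside a
    // writer thread; the outcome is timing dependent.
  }
}
```

Waiting for the writers to terminate before reading the sink matters as much as the synchronized wrapper: reading the collected output while writers are still running would also produce an empty or partial result.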





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #364

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 18.07 MB...]
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
May 28, 2018 7:57:17 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
May 28, 2018 7:57:17 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528075714-8a33e07e/output/results/staging/
May 28, 2018 7:57:18 AM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71120 bytes, hash Jz1m661l_C30RpWGrXUVUQ> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0528075714-8a33e07e/output/results/staging/pipeline-Jz1m661l_C30RpWGrXUVUQ.pb

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 7:57:19 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-05-28_00_57_18-1030143076842556?project=apache-beam-testing

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_OUT
Submitted job: 2018-05-28_00_57_18-1030143076842556

org.apache.beam.sdk.transforms.ViewTest > testSingletonSideInput STANDARD_ERROR
May 28, 2018 7:57:19 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-05-28_00_57_18-1030143076842556
May 28, 2018 7:57:19 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 2018-05-28_00_57_18-1030143076842556 with 1 
expected assertions.
May 28, 2018 7:57:35 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-05-28T07:57:18.554Z: Autoscaling is enabled for 

Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #229

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 188.89 KB...]
at 
com.mongodb.connection.SocketStreamHelper.initialize(SocketStreamHelper.java:50)
at com.mongodb.connection.SocketStream.open(SocketStream.java:58)
... 3 more


Gradle Test Executor 1 finished executing tests.

> Task :beam-sdks-java-io-mongodb:integrationTest

org.apache.beam.sdk.io.mongodb.MongoDBIOIT > testWriteAndRead FAILED
java.lang.RuntimeException: com.mongodb.MongoTimeoutException: Timed out 
after 3 ms while waiting for a server that matches WritableServerSelector. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.226.213.205:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:219)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoSocketReadException: Prematurely reached end of stream
at com.mongodb.connection.SocketStream.read(SocketStream.java:88)
at 
com.mongodb.connection.InternalStreamConnection.receiveResponseBuffers(InternalStreamConnection.java:491)
at 
com.mongodb.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:221)
at 
com.mongodb.connection.UsageTrackingInternalConnection.receiveMessage(UsageTrackingInternalConnection.java:102)
at 
com.mongodb.connection.DefaultConnectionPool$PooledConnection.receiveMessage(DefaultConnectionPool.java:435)
at 
com.mongodb.connection.WriteCommandProtocol.receiveMessage(WriteCommandProtocol.java:234)
at 
com.mongodb.connection.WriteCommandProtocol.execute(WriteCommandProtocol.java:104)
at 
com.mongodb.connection.InsertCommandProtocol.execute(InsertCommandProtocol.java:67)
at 
com.mongodb.connection.InsertCommandProtocol.execute(InsertCommandProtocol.java:37)
at 
com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
at 
com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
at 
com.mongodb.connection.DefaultServerConnection.insertCommand(DefaultServerConnection.java:115)
at 
com.mongodb.operation.MixedBulkWriteOperation$Run$2.executeWriteCommandProtocol(MixedBulkWriteOperation.java:455)
at 
com.mongodb.operation.MixedBulkWriteOperation$Run$RunExecutor.execute(MixedBulkWriteOperation.java:646)
at 
com.mongodb.operation.MixedBulkWriteOperation$Run.execute(MixedBulkWriteOperation.java:401)
at 
com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:179)
at 
com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:230)
at 
com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:221)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at 
com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:781)
at com.mongodb.Mongo$2.execute(Mongo.java:764)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323)
at 
com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:667)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.processElement(MongoDbIO.java:652)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while 

Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #320

2018-05-28 Thread Apache Jenkins Server
See 


--
[...truncated 108.19 KB...]
> Task :beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
> UP-TO-DATE
Build cache key for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' is 
43478bbf27fb9e253636d5b9c41ca26e
Caching disabled for task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar': Caching 
has not been enabled for the task
Skipping task 
':beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-extensions-google-cloud-platform-core:shadowTestJar 
(Thread[Task worker for ':' Thread 11,5,main]) completed. Took 0.016 secs.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:compileTestJava' is 
678cb8ca79cecb7d33af904b1117dccc
Skipping task ':beam-sdks-java-io-google-cloud-platform:compileTestJava' as it 
is up-to-date.
:beam-sdks-java-io-google-cloud-platform:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) completed. Took 0.049 secs.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:testClasses UP-TO-DATE
Skipping task ':beam-sdks-java-io-google-cloud-platform:testClasses' as it has 
no actions.
:beam-sdks-java-io-google-cloud-platform:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-google-cloud-platform:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar' is 
b535bc1bdff32e82d1f14cc34e85268d
Caching disabled for task 
':beam-sdks-java-io-google-cloud-platform:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-sdks-java-io-google-cloud-platform:shadowTestJar' as it is 
up-to-date.
:beam-sdks-java-io-google-cloud-platform:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.028 secs.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:compileTestJava' is 
94c89317364665b7595de64d7d7f61c8
Skipping task ':beam-runners-google-cloud-dataflow-java:compileTestJava' as it 
is up-to-date.
:beam-runners-google-cloud-dataflow-java:compileTestJava (Thread[Task worker 
for ':' Thread 11,5,main]) completed. Took 0.045 secs.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:testClasses UP-TO-DATE
Skipping task ':beam-runners-google-cloud-dataflow-java:testClasses' as it has 
no actions.
:beam-runners-google-cloud-dataflow-java:testClasses (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.0 secs.
:beam-runners-google-cloud-dataflow-java:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-runners-google-cloud-dataflow-java:shadowTestJar UP-TO-DATE
Build cache key for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar' is 
c7d9f36b9079392aedfd635096c6412d
Caching disabled for task 
':beam-runners-google-cloud-dataflow-java:shadowTestJar': Caching has not been 
enabled for the task
Skipping task ':beam-runners-google-cloud-dataflow-java:shadowTestJar' as it is 
up-to-date.
:beam-runners-google-cloud-dataflow-java:shadowTestJar (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.026 secs.
:beam-sdks-java-io-hadoop-input-format:compileTestJava (Thread[Task worker for 
':' Thread 11,5,main]) started.

> Task :beam-sdks-java-io-hadoop-input-format:compileTestJava UP-TO-DATE
Build cache key for task 
':beam-sdks-java-io-hadoop-input-format:compileTestJava' is 
e4ed6133e22e0103cb4d20cad4204c78
Skipping task ':beam-sdks-java-io-hadoop-input-format:compileTestJava' as it is 
up-to-date.
:beam-sdks-java-io-hadoop-input-format:compileTestJava (Thread[Task worker for 
':' Thread 11,5,main]) completed. Took 0.417 secs.
:beam-sdks-java-io-hadoop-input-format:testClasses (Thread[Task worker for ':' 
Thread 11,5,main]) started.

> Task :beam-sdks-java-io-hadoop-input-format:testClasses UP-TO-DATE
Skipping task ':beam-sdks-java-io-hadoop-input-format:testClasses' as it has no 
actions.
:beam-sdks-java-io-hadoop-input-format:testClasses (Thread[Task worker for ':' 
Thread 11,5,main]) completed. Took 0.0 secs.
:beam-sdks-java-io-hadoop-input-format:integrationTest (Thread[Task worker for 
':' Thread 

[jira] [Work logged] (BEAM-4076) Schema followups

2018-05-28 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4076?focusedWorklogId=106248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106248
 ]

ASF GitHub Bot logged work on BEAM-4076:


Author: ASF GitHub Bot
Created on: 28/May/18 06:03
Start Date: 28/May/18 06:03
Worklog Time Spent: 10m 
  Work Description: kennknowles opened a new pull request #5498: 
[BEAM-4076] Remove unsafe methods from Schema.TypeName and Schema.FieldType
URL: https://github.com/apache/beam/pull/5498
 
 
   The methods `TypeName.type()`, `FieldType.withMapType()`, etc. all refer to 
various operations on types and type constructors that are not well-defined for 
many inputs. Since these methods are also not needed, this PR deletes them.
   
   This is stacked on #5497
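
   To illustrate what "not well-defined" means here, the following is a minimal sketch using a hypothetical, simplified model. `FieldTypeSketch`, `Kind`, and the factory methods are illustrative assumptions and are not Beam's actual `Schema.FieldType` API. A `withMapType`-style method only makes sense when the receiver is already a map type, so every other input has to be rejected at run time, whereas a dedicated factory can only build well-formed values.

```java
// Hypothetical, simplified model of a schema field type; these names are
// illustrative and are not Beam's actual Schema.FieldType API.
class FieldTypeSketch {
  enum Kind { INT32, STRING, MAP }

  final Kind kind;
  final FieldTypeSketch keyType;   // set only when kind == MAP
  final FieldTypeSketch valueType; // set only when kind == MAP

  private FieldTypeSketch(Kind kind, FieldTypeSketch keyType, FieldTypeSketch valueType) {
    this.kind = kind;
    this.keyType = keyType;
    this.valueType = valueType;
  }

  // Factories: each one can only build a well-formed value.
  static FieldTypeSketch int32() {
    return new FieldTypeSketch(Kind.INT32, null, null);
  }

  static FieldTypeSketch string() {
    return new FieldTypeSketch(Kind.STRING, null, null);
  }

  static FieldTypeSketch map(FieldTypeSketch key, FieldTypeSketch value) {
    return new FieldTypeSketch(Kind.MAP, key, value);
  }

  // Partial "wither": meaningful only for MAP, so all other receivers must be
  // rejected at run time instead of being ruled out by construction.
  FieldTypeSketch withMapType(FieldTypeSketch key, FieldTypeSketch value) {
    if (kind != Kind.MAP) {
      throw new IllegalStateException("withMapType called on " + kind);
    }
    return new FieldTypeSketch(Kind.MAP, key, value);
  }

  public static void main(String[] args) {
    FieldTypeSketch m = map(string(), int32());
    System.out.println(m.kind + " of " + m.keyType.kind + " -> " + m.valueType.kind);
    try {
      int32().withMapType(string(), int32());
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

   In this sketch, removing `withMapType` and keeping only the factories would leave no way to call a map-specific operation on a non-map type.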
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   




Issue Time Tracking
---

Worklog Id: (was: 106248)
Time Spent: 10m
Remaining Estimate: 0h

> Schema followups
> 
>
> Key: BEAM-4076
> URL: https://issues.apache.org/jira/browse/BEAM-4076
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, dsl-sql, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This umbrella bug contains subtasks with followups for Beam schemas, which 
> were moved from SQL to the core Java SDK and made to be type-name-based 
> rather than coder based.





Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #223

2018-05-28 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam10 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3a668eaff6266c87c645a44cc84b4d5e1c3b2228 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3a668eaff6266c87c645a44cc84b4d5e1c3b2228
Commit message: "Merge pull request #5475: [BEAM-4306] Enforce ErrorProne 
analysis in apex runner"
 > git rev-list --no-walk 3a668eaff6266c87c645a44cc84b4d5e1c3b2228 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4300687650643679995.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins8075963430466719714.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins38075572331046334.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-223
Error from server (AlreadyExists): namespaces "filebasedioithdfs-223" already 
exists
Build step 'Execute shell' marked build as failure