[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=97=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-97
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194573872
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.common.collect.ImmutableMap;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  private static final Joiner JOINER = Joiner.on("");
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  private static final int DATA_1KB = 1 << 10;
+  private GrpcFnServer<BeamFileSystemArtifactStagingService> server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=94=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194573699
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,406 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=98=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-98
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff closed pull request #5591: [BEAM-4290] Beam File 
System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591
 
 
   

This PR was merged from a forked repository. Because GitHub hides the
original diff of such pull requests on merge, it is reproduced below for
the sake of provenance:

diff --git a/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java b/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
new file mode 100644
index 000..48d8ad6d610
--- /dev/null
+++ b/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
@@ -0,0 +1,302 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This implementation is experimental.
+ *
+ * {@link ArtifactStagingServiceImplBase} based on beam file system. {@link
+ * BeamFileSystemArtifactStagingService} requires {@link StagingSessionToken} in every method call.
+ * The manifest is put in {@link StagingSessionToken#getBasePath()}/{@link
+ * StagingSessionToken#getSessionId()} and artifacts are put in {@link
+ * StagingSessionToken#getBasePath()}/{@link StagingSessionToken#getSessionId()}/{@link
+ * BeamFileSystemArtifactStagingService#ARTIFACTS}.
+ *
+ * The returned token is the path to the manifest file.
+ *
+ * The manifest file is encoded in {@link ProxyManifest}.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+  public static final String 

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=95=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194573653
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,406 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=93=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194573173
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,302 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * This implementation is experimental.
+ *
+ * {@link ArtifactStagingServiceImplBase} based on beam file system. {@link
+ * BeamFileSystemArtifactStagingService} requires {@link StagingSessionToken} in every method call.
+ * The manifest is put in {@link StagingSessionToken#getBasePath()}/{@link
+ * StagingSessionToken#getSessionId()} and artifacts are put in {@link
+ * StagingSessionToken#getBasePath()}/{@link StagingSessionToken#getSessionId()}/{@link
+ * BeamFileSystemArtifactStagingService#ARTIFACTS}.
+ *
+ * The returned token is the path to the manifest file.
+ *
+ * The manifest file is encoded in {@link ProxyManifest}.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+  public static final String ARTIFACTS = "artifacts";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+  
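
The quoted diff is cut off at this point. As a reading aid, the directory layout described in the class Javadoc (manifest under basePath/sessionId, artifacts under basePath/sessionId/artifacts) can be sketched with plain java.nio paths. This is only an illustration: the class name `StagingLayoutSketch` and its helper methods are hypothetical, the example base path is made up, and the real service resolves Beam `ResourceId`s rather than `java.nio.file.Path`s.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of the staging layout; not part of the PR.
final class StagingLayoutSketch {

  // Same literal values as the MANIFEST and ARTIFACTS constants in the diff.
  static final String MANIFEST = "MANIFEST";
  static final String ARTIFACTS = "artifacts";

  // <basePath>/<sessionId>/MANIFEST
  static Path manifestPath(String basePath, String sessionId) {
    return Paths.get(basePath, sessionId, MANIFEST);
  }

  // <basePath>/<sessionId>/artifacts
  static Path artifactDir(String basePath, String sessionId) {
    return Paths.get(basePath, sessionId, ARTIFACTS);
  }

  public static void main(String[] args) {
    // "/tmp/staging" and "123" are example values only.
    System.out.println(manifestPath("/tmp/staging", "123"));
    System.out.println(artifactDir("/tmp/staging", "123"));
  }
}
```

Under this layout two sessions that share a base path never write to the same directory, which is the property the concurrent-session tests in this PR are meant to exercise.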

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=96=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-96
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 18:37
Start Date: 12/Jun/18 18:37
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194573619
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,406 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=19=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-19
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 12/Jun/18 16:43
Start Date: 12/Jun/18 16:43
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194810545
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws Exception {
     assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+    String stagingSession = "123";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken,
+                Paths.get(srcDir.toString(), fileName).toAbsolutePath().toString(), fileName);
+          } catch (Exception e) {
+            Assert.fail(e.getMessage());
+          }
+          metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+        });
+      }
+    } finally {
+      executorService.shutdown();
+      executorService.awaitTermination(2, TimeUnit.SECONDS);
+    }
+
+    String stagingToken = commitManifest(stagingSessionToken, metadata);
+    Assert.assertEquals(
+        Paths.get(destDir.toAbsolutePath().toString(), stagingSession, "MANIFEST").toString(),
+        stagingToken);
+    assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws 
Exception {
+String stagingSession1 = "123";
+String stagingSession2 = "abc";
+Map<String, Integer> files = new HashMap<>();
+files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+files.put("file1kb", DATA_1KB /*1 kb*/);
+files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+final String text = "abcdefghinklmop\n";
+files.forEach((fileName, size) -> {
+  Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+  try {
+Files.createDirectories(filePath.getParent());
+Files.write(filePath,
+Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / 
text.length())).intValue())
+.getBytes(CHARSET));
+  } catch (IOException ignored) {
+  }
+});
+String stagingSessionToken1 = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession1, 
destDir.toUri().getPath());
+String stagingSessionToken2 = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession2, 
destDir.toUri().getPath());
+
+List<ArtifactMetadata> metadata = new ArrayList<>();
+ExecutorService executorService = Executors.newFixedThreadPool(8);
+try {
+  for (String fileName : files.keySet()) {
+executorService.execute(() -> {
+  try {
+putArtifact(stagingSessionToken1,
 
 Review comment:
   I was actually thinking of something much simpler, i.e., upload file 1a, 
upload file 2a, upload file 1b, then check, but this should have the same coverage. 
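
The simpler sequential interleaving described in this comment could look roughly like the sketch below. It is not the Beam test: plain temporary directories stand in for staging sessions, and `stage`, `stagedNames` and `runInterleaved` are hypothetical helpers, not calls on `BeamFileSystemArtifactStagingService`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

class InterleavedStagingSketch {

  // Hypothetical stand-in for putArtifact: writes one artifact into a per-session directory.
  static void stage(Path root, String session, String name, String content) {
    try {
      Path target = root.resolve(session).resolve(name);
      Files.createDirectories(target.getParent());
      Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Lists the artifact names that ended up under one session directory.
  static Set<String> stagedNames(Path root, String session) {
    try (Stream<Path> paths = Files.list(root.resolve(session))) {
      Set<String> names = new TreeSet<>();
      paths.forEach(p -> names.add(p.getFileName().toString()));
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Upload file 1a, then file 2a, then file 1b; the caller checks the result.
  static Path runInterleaved() {
    try {
      Path root = Files.createTempDirectory("staging-sketch");
      stage(root, "session1", "file1a", "a");
      stage(root, "session2", "file2a", "b");
      stage(root, "session1", "file1b", "c");
      return root;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path root = runInterleaved();
    System.out.println("session1: " + stagedNames(root, "session1"));
    System.out.println("session2: " + stagedNames(root, "session2"));
  }
}
```

The check at the end only has to confirm that each session directory holds exactly its own files, which is the isolation property the concurrent test is also after.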



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110878
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 22:40
Start Date: 11/Jun/18 22:40
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5591: [BEAM-4290] Beam 
File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#issuecomment-396409541
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 110878)
Time Spent: 14h 40m  (was: 14.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.
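
One way to satisfy the "arbitrary filenames" requirement is to store each artifact under a hash of its name, so that slashes or other special characters never reach the filesystem. The sketch below is only an illustration under that assumption: it uses SHA-256 from the JDK, and `encode` is a hypothetical helper. The PR's own `encodedFileName` may work differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class EncodedNameSketch {

  // Maps an arbitrary artifact name to a fixed-length, filesystem-safe hex string.
  static String encode(String artifactName) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(artifactName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is a standard algorithm that Java platforms are required to provide.
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(encode("nested/file1kb"));
    System.out.println(encode("file1kb"));
  }
}
```

Because the encoding is deterministic, the same name always maps to the same location, which also leaves room for later skipping uploads of artifacts that are already present.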



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110876
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 22:31
Start Date: 11/Jun/18 22:31
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5591: [BEAM-4290] Beam 
File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#issuecomment-396407813
 
 
   retest please




Issue Time Tracking
---

Worklog Id: (was: 110876)
Time Spent: 14.5h  (was: 14h 20m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110846
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:25
Start Date: 11/Jun/18 21:25
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194551115
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws 
Exception {
 assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+String stagingSession = "123";
+Map<String, Integer> files = new HashMap<>();
+files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+files.put("file1kb", DATA_1KB /*1 kb*/);
+files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+final String text = "abcdefghinklmop\n";
+files.forEach((fileName, size) -> {
+  Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+  try {
+Files.createDirectories(filePath.getParent());
+Files.write(filePath,
+Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / 
text.length())).intValue())
+.getBytes(CHARSET));
+  } catch (IOException ignored) {
+  }
+});
+String stagingSessionToken = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession, 
destDir.toUri().getPath());
+
+List<ArtifactMetadata> metadata = new ArrayList<>();
+ExecutorService executorService = Executors.newFixedThreadPool(8);
+try {
+  for (String fileName : files.keySet()) {
+executorService.execute(() -> {
+  try {
+putArtifact(stagingSessionToken,
+Paths.get(srcDir.toString(), 
fileName).toAbsolutePath().toString(), fileName);
+  } catch (Exception e) {
+Assert.fail(e.getMessage());
+  }
+  
metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+});
+  }
+} finally {
+  executorService.shutdown();
+  executorService.awaitTermination(2, TimeUnit.SECONDS);
+}
+
+String stagingToken = commitManifest(stagingSessionToken, metadata);
+Assert.assertEquals(
+Paths.get(destDir.toAbsolutePath().toString(), stagingSession, 
"MANIFEST").toString(),
+stagingToken);
+assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws 
Exception {
+String stagingSession1 = "123";
+String stagingSession2 = "abc";
+Map<String, Integer> files = new HashMap<>();
+files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+files.put("file1kb", DATA_1KB /*1 kb*/);
+files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+final String text = "abcdefghinklmop\n";
+files.forEach((fileName, size) -> {
+  Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+  try {
+Files.createDirectories(filePath.getParent());
+Files.write(filePath,
+Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / 
text.length())).intValue())
+.getBytes(CHARSET));
+  } catch (IOException ignored) {
+  }
+});
+String stagingSessionToken1 = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession1, 
destDir.toUri().getPath());
+String stagingSessionToken2 = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession2, 
destDir.toUri().getPath());
+
+List<ArtifactMetadata> metadata = new ArrayList<>();
+ExecutorService executorService = Executors.newFixedThreadPool(8);
+try {
+  for (String fileName : files.keySet()) {
+executorService.execute(() -> {
+  try {
+putArtifact(stagingSessionToken1,
 
 Review comment:
   The API does not allow uploading multiple chunks of the same file in parallel.
   
   This test case simulates files being uploaded by two separate sessions in parallel. 
   
   I will create two sets of files here, which should make the sessions completely 
different.
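
A rough sketch of two sessions uploading disjoint file sets in parallel is shown below. It is an illustration, not the PR's test: an in-memory map stands in for the staging service, and `uploadConcurrently` is a hypothetical helper. Two choices in it are assumptions of this sketch: each upload is tracked through a `Future` so that a failure on a worker thread reaches the calling thread, and the shared result is a concurrent collection because several threads write to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ConcurrentSessionsSketch {

  // Uploads every file of every session on a shared pool; returns session -> staged names.
  static Map<String, Set<String>> uploadConcurrently(Map<String, List<String>> filesBySession) {
    Map<String, Set<String>> staged = new ConcurrentHashMap<>();
    ExecutorService executor = Executors.newFixedThreadPool(8);
    List<Future<?>> uploads = new ArrayList<>();
    try {
      for (Map.Entry<String, List<String>> entry : filesBySession.entrySet()) {
        String session = entry.getKey();
        for (String name : entry.getValue()) {
          // Stand-in for putArtifact(stagingSessionToken, path, name).
          uploads.add(executor.submit(() -> {
            staged.computeIfAbsent(session, s -> ConcurrentHashMap.newKeySet()).add(name);
          }));
        }
      }
      // Waiting on each Future rethrows any worker failure on the calling thread.
      for (Future<?> upload : uploads) {
        upload.get(10, TimeUnit.SECONDS);
      }
    } catch (Exception e) {
      throw new IllegalStateException("upload failed", e);
    } finally {
      executor.shutdownNow();
    }
    return staged;
  }

  public static void main(String[] args) {
    Map<String, List<String>> filesBySession = new LinkedHashMap<>();
    filesBySession.put("123", Arrays.asList("file1a", "file1b", "nested/file1c"));
    filesBySession.put("abc", Arrays.asList("file2a", "file2b"));
    System.out.println(uploadConcurrently(filesBySession));
  }
}
```

With disjoint file sets per session, any file that shows up under the wrong session is immediately visible in the result.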



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110843=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110843
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:15
Start Date: 11/Jun/18 21:15
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194550351
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws 
Exception {
 assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+String stagingSession = "123";
+Map<String, Integer> files = new HashMap<>();
+files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+files.put("file1kb", DATA_1KB /*1 kb*/);
+files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+final String text = "abcdefghinklmop\n";
+files.forEach((fileName, size) -> {
+  Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+  try {
+Files.createDirectories(filePath.getParent());
+Files.write(filePath,
+Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / 
text.length())).intValue())
+.getBytes(CHARSET));
+  } catch (IOException ignored) {
+  }
+});
+String stagingSessionToken = BeamFileSystemArtifactStagingService
+.generateStagingSessionToken(stagingSession, 
destDir.toUri().getPath());
+
+List<ArtifactMetadata> metadata = new ArrayList<>();
+ExecutorService executorService = Executors.newFixedThreadPool(8);
+try {
+  for (String fileName : files.keySet()) {
+executorService.execute(() -> {
+  try {
+putArtifact(stagingSessionToken,
+Paths.get(srcDir.toString(), 
fileName).toAbsolutePath().toString(), fileName);
+  } catch (Exception e) {
+Assert.fail(e.getMessage());
+  }
+  
metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+});
+  }
+} finally {
+  executorService.shutdown();
+  executorService.awaitTermination(2, TimeUnit.SECONDS);
+}
+
+String stagingToken = commitManifest(stagingSessionToken, metadata);
+Assert.assertEquals(
+Paths.get(destDir.toAbsolutePath().toString(), stagingSession, 
"MANIFEST").toString(),
+stagingToken);
+assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws 
Exception {
+String stagingSession1 = "123";
+String stagingSession2 = "abc";
+Map<String, Integer> files = new HashMap<>();
 
 Review comment:
   ImmutableMap.of(k, v) only supports up to 5 key-value pairs, and we have more 
than that here.
   I am using a builder to create the immutable map instead.
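
With Guava, the builder form is `ImmutableMap.<String, Integer>builder().put(...).build()`, which has no fixed-arity limit. The sketch below shows the same shape using only the JDK so that it runs without Guava: entries are added one by one and the result is wrapped as unmodifiable. The value of `DATA_1KB` is an assumption here, since the constant's definition is not part of the quoted diff.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class ImmutableFilesSketch {

  // Assumed value; the constant's definition is not shown in the quoted diff.
  static final int DATA_1KB = 1 << 10;

  // Builds the six-entry file map and returns a read-only view of it.
  static Map<String, Integer> files() {
    Map<String, Integer> files = new LinkedHashMap<>();
    files.put("file5cb", DATA_1KB / 2);
    files.put("file1kb", DATA_1KB);
    files.put("file15cb", (DATA_1KB * 3) / 2);
    files.put("nested/file1kb", DATA_1KB);
    files.put("file10kb", 10 * DATA_1KB);
    files.put("file100kb", 100 * DATA_1KB);
    return Collections.unmodifiableMap(files);
  }

  public static void main(String[] args) {
    files().forEach((name, size) -> System.out.println(name + " -> " + size));
  }
}
```

Unlike Guava's `ImmutableMap`, `Collections.unmodifiableMap` returns a read-only view; it behaves as immutable here only because the backing map never escapes the method.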




Issue Time Tracking
---

Worklog Id: (was: 110843)
Time Spent: 14h 10m  (was: 14h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110836
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194535246
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+FnService {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+  StreamObserver<PutArtifactResponse> responseObserver) {
+return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+  CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+try {
+  ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+  ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+  ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+  .setManifest(request.getManifest());
+  for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+proxyManifestBuilder.addLocation(Location.newBuilder()
+.setName(artifactMetadata.getName())
+.setUri(artifactDirResourceId
+.resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+.toString()).build());
+  }
+  try (WritableByteChannel manifestWritableByteChannel = FileSystems
+  .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110830
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194525812
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110835
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194535202
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110832
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194527254
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import 
org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import 
org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import 
org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import 
org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110833&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110833
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194547405
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
 
 Review comment:
   sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 110833)
Time Spent: 13h 40m  (was: 13.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110834&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110834
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194536154
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110837
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194534830
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110829
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194526541
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110831
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:04
Start Date: 11/Jun/18 21:04
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194531422
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110828
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:03
Start Date: 11/Jun/18 21:03
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194545548
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws Exception {
     assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+    String stagingSession = "123";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken,
+                Paths.get(srcDir.toString(), fileName).toAbsolutePath().toString(), fileName);
+          } catch (Exception e) {
+            Assert.fail(e.getMessage());
+          }
+          metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+        });
+      }
+    } finally {
+      executorService.shutdown();
+      executorService.awaitTermination(2, TimeUnit.SECONDS);
+    }
+
+    String stagingToken = commitManifest(stagingSessionToken, metadata);
+    Assert.assertEquals(
+        Paths.get(destDir.toAbsolutePath().toString(), stagingSession, "MANIFEST").toString(),
+        stagingToken);
+    assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws Exception {
+    String stagingSession1 = "123";
+    String stagingSession2 = "abc";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken1 = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession1, destDir.toUri().getPath());
+    String stagingSessionToken2 = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession2, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken1,
 
 Review comment:
   There should be at least *some* difference in what is being placed to 
actually verify there is no cross-staging interference. Granted, there's also 
no need to place a huge number of multi-chunk files here; a single file for one 
and two for the other would be perfectly fine. 
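
   A rough sketch of the shape such a test could take, with one file staged in
one session and two different files in the other. This is not the Beam service
or its test harness: `SessionIsolationSketch`, `stage` and `staged` are
hypothetical names, and a local temp directory with one sub-directory per
session stands in for the staging location.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical stand-in for a staging service: one directory per staging session. */
public class SessionIsolationSketch {

  /** Writes one artifact to root/session/name (assumed layout, for illustration only). */
  public static void stage(Path root, String session, String name, String content)
      throws IOException {
    Path target = root.resolve(session).resolve(name);
    Files.createDirectories(target.getParent());
    Files.write(target, content.getBytes(StandardCharsets.UTF_8));
  }

  /** Returns the names of the artifacts staged directly under one session. */
  public static Set<String> staged(Path root, String session) throws IOException {
    Set<String> names = new TreeSet<>();
    try (Stream<Path> paths = Files.list(root.resolve(session))) {
      paths.forEach(p -> names.add(p.getFileName().toString()));
    }
    return names;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("staging");
    // Session "123" gets a single file; session "abc" gets two different ones.
    stage(root, "123", "file1kb", "a");
    stage(root, "abc", "file5cb", "b");
    stage(root, "abc", "file10kb", "c");
    // Each session should list only what was staged into it.
    System.out.println("123 -> " + staged(root, "123"));
    System.out.println("abc -> " + staged(root, "abc"));
  }
}
```

   Because the two sessions hold disjoint file sets, any artifact that leaks
from one session into the other shows up as an unexpected name in the listing,
which identical file sets would not reveal.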



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110825
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:02
Start Date: 11/Jun/18 21:02
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194546744
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110827
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:02
Start Date: 11/Jun/18 21:02
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194544535
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws Exception {
     assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+    String stagingSession = "123";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken,
+                Paths.get(srcDir.toString(), fileName).toAbsolutePath().toString(), fileName);
+          } catch (Exception e) {
+            Assert.fail(e.getMessage());
+          }
+          metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+        });
+      }
+    } finally {
+      executorService.shutdown();
+      executorService.awaitTermination(2, TimeUnit.SECONDS);
+    }
+
+    String stagingToken = commitManifest(stagingSessionToken, metadata);
+    Assert.assertEquals(
+        Paths.get(destDir.toAbsolutePath().toString(), stagingSession, "MANIFEST").toString(),
+        stagingToken);
+    assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws Exception {
+    String stagingSession1 = "123";
+    String stagingSession2 = "abc";
+    Map<String, Integer> files = new HashMap<>();
 
 Review comment:
   Nit (here and above): I prefer using ImmutableMaps for constants like this. 
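
For illustration, the nit above could look roughly like the following. This is a minimal sketch that uses only the JDK (`Collections.unmodifiableMap` over a `LinkedHashMap`) as a stand-in so that it runs without dependencies; with Guava on the classpath, `ImmutableMap.of(...)` or `ImmutableMap.builder()` would be the direct equivalent of what the reviewer asks for. The class name `TestFiles` is hypothetical, and `DATA_1KB` is assumed to be an `int` byte count of 1024.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the test's file table; not part of the PR. */
final class TestFiles {
  // Assumption: the test defines DATA_1KB as an int byte count.
  static final int DATA_1KB = 1 << 10;

  // Built once, then wrapped so that no test can mutate the shared table.
  static final Map<String, Integer> FILES = buildFiles();

  private static Map<String, Integer> buildFiles() {
    Map<String, Integer> files = new LinkedHashMap<>();
    files.put("file5cb", DATA_1KB / 2);
    files.put("file1kb", DATA_1KB);
    files.put("file15cb", (DATA_1KB * 3) / 2);
    files.put("nested/file1kb", DATA_1KB);
    files.put("file10kb", 10 * DATA_1KB);
    files.put("file100kb", 100 * DATA_1KB);
    return Collections.unmodifiableMap(files);
  }

  private TestFiles() {}
}
```

Sharing one read-only table also removes the repeated `files.put(...)` blocks from each test method.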


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 110827)
Time Spent: 12h 50m  (was: 12h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.
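
One way to meet the "arbitrary filenames" requirement in the description is to store each artifact under a hash of its name rather than the name itself. The sketch below is only an assumption about how that could work: the PR has an `encodedFileName` helper and imports Guava's `Hashing`, but its body is not shown in this thread, so `ArtifactNames.encode` here is a hypothetical JDK-only stand-in.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; the PR's own encodedFileName is not shown in this thread. */
final class ArtifactNames {

  /** Maps an arbitrary artifact name to a fixed-length, path-safe file name. */
  static String encode(String artifactName) {
    try {
      MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
      byte[] digest = sha256.digest(artifactName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        // Two lowercase hex digits per byte.
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  private ArtifactNames() {}
}
```

A name such as `nested/file1kb` then becomes a 64-character hex string with no path separators, so it can be resolved as a single file under the session's artifact directory.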



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110826
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 11/Jun/18 21:02
Start Date: 11/Jun/18 21:02
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194545548
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -251,17 +258,129 @@ public void putArtifactsMultipleFilesTest() throws Exception {
     assertFiles(files.keySet(), stagingToken);
   }
 
+  @Test
+  public void putArtifactsMultipleFilesConcurrentlyTest() throws Exception {
+    String stagingSession = "123";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken,
+                Paths.get(srcDir.toString(), fileName).toAbsolutePath().toString(), fileName);
+          } catch (Exception e) {
+            Assert.fail(e.getMessage());
+          }
+          metadata.add(ArtifactMetadata.newBuilder().setName(fileName).build());
+        });
+      }
+    } finally {
+      executorService.shutdown();
+      executorService.awaitTermination(2, TimeUnit.SECONDS);
+    }
+
+    String stagingToken = commitManifest(stagingSessionToken, metadata);
+    Assert.assertEquals(
+        Paths.get(destDir.toAbsolutePath().toString(), stagingSession, "MANIFEST").toString(),
+        stagingToken);
+    assertFiles(files.keySet(), stagingToken);
+  }
+
+  @Test
+  public void putArtifactsMultipleFilesConcurrentSessionsTest() throws Exception {
+    String stagingSession1 = "123";
+    String stagingSession2 = "abc";
+    Map<String, Integer> files = new HashMap<>();
+    files.put("file5cb", (DATA_1KB / 2) /*500b*/);
+    files.put("file1kb", DATA_1KB /*1 kb*/);
+    files.put("file15cb", (DATA_1KB * 3) / 2  /*1.5 kb*/);
+    files.put("nested/file1kb", DATA_1KB /*1 kb*/);
+    files.put("file10kb", 10 * DATA_1KB /*10 kb*/);
+    files.put("file100kb", 100 * DATA_1KB /*100 kb*/);
+
+    final String text = "abcdefghinklmop\n";
+    files.forEach((fileName, size) -> {
+      Path filePath = Paths.get(srcDir.toString(), fileName).toAbsolutePath();
+      try {
+        Files.createDirectories(filePath.getParent());
+        Files.write(filePath,
+            Strings.repeat(text, Double.valueOf(Math.ceil(size * 1.0 / text.length())).intValue())
+                .getBytes(CHARSET));
+      } catch (IOException ignored) {
+      }
+    });
+    String stagingSessionToken1 = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession1, destDir.toUri().getPath());
+    String stagingSessionToken2 = BeamFileSystemArtifactStagingService
+        .generateStagingSessionToken(stagingSession2, destDir.toUri().getPath());
+
+    List<ArtifactMetadata> metadata = new ArrayList<>();
+    ExecutorService executorService = Executors.newFixedThreadPool(8);
+    try {
+      for (String fileName : files.keySet()) {
+        executorService.execute(() -> {
+          try {
+            putArtifact(stagingSessionToken1,
 
 Review comment:
   There should be at least *some* difference in what is being placed, to 
actually verify that there is no cross-staging interference. Granted, there is 
also no need to place a huge number of files here; a single file for one session 
and two for the other would be perfectly fine. 
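
The shape of test suggested above might be sketched as follows. This is not the PR's test: `SessionIsolationSketch`, `stage`, and `staged` are hypothetical names, and a plain local directory per session stands in for the staging service so that the example runs with the JDK alone. It stages one file into session `123` and two different files into session `abc`, and it collects `Future`s so that a worker failure is rethrown on the calling thread (an `Assert.fail` raised inside an executor task does not by itself fail the enclosing JUnit test).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

/** Hypothetical stand-in for the staging service: one directory per session. */
final class SessionIsolationSketch {

  /** Writes one artifact under root/session/name. */
  static void stage(Path root, String session, String name) throws IOException {
    Path target = root.resolve(session).resolve(name);
    Files.createDirectories(target.getParent());
    Files.write(target, (session + ":" + name).getBytes(StandardCharsets.UTF_8));
  }

  /** Lists the file names staged for a session. */
  static Set<String> staged(Path root, String session) throws IOException {
    Set<String> names = new TreeSet<>();
    try (Stream<Path> paths = Files.list(root.resolve(session))) {
      paths.forEach(p -> names.add(p.getFileName().toString()));
    }
    return names;
  }

  /** Stages different file sets into two sessions concurrently. */
  static void stageConcurrently(Path root) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<?>> pending = new ArrayList<>();
      // Session "123" gets one file; session "abc" gets two different ones.
      pending.add(pool.submit(() -> { stage(root, "123", "file1kb"); return null; }));
      for (String name : Arrays.asList("file5cb", "file10kb")) {
        pending.add(pool.submit(() -> { stage(root, "abc", name); return null; }));
      }
      // Future.get rethrows worker failures on the calling thread.
      for (Future<?> f : pending) {
        f.get();
      }
    } finally {
      pool.shutdown();
    }
  }

  private SessionIsolationSketch() {}
}
```

A test built this way would then assert that `staged(root, "123")` contains only `file1kb` and that `staged(root, "abc")` contains only the other two names.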



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110442=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110442
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238250
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110434=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110434
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238156
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110436
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238135
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110437=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110437
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238107
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
 
 Review comment:
   Oh nice, didn't know we already had such a proto.
   
   CC: @axelmagn 




Issue Time Tracking
---

Worklog Id: (was: 110437)
Time Spent: 11h 40m  (was: 11.5h)

> ArtifactStagingService that stages to a distributed filesystem
> 

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110440=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110440
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238203
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
 
 Review comment:
   Please document more about how this works: how it stores artifacts, manifests, etc.



Issue Time Tracking
---

Worklog Id: (was: 110440)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.
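
The layout requirement quoted above (arbitrary artifact names, one manifest per staging session) can be sketched as follows. This is only an illustration under stated assumptions, not the layout the pull request settled on: the class name `StagingLayoutSketch`, the `artifacts` directory name, the plain string paths, and the choice of a SHA-256 hex digest are all hypothetical. Only the `MANIFEST` file name and the helper name `encodedFileName` appear in the quoted diff, and the diff does not show how that helper is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a per-session staging layout:
//   <sessionDir>/MANIFEST
//   <sessionDir>/artifacts/<hex digest of the artifact name>
class StagingLayoutSketch {
  // File name taken from the quoted diff.
  static final String MANIFEST = "MANIFEST";
  // Assumed directory name, used only for this example.
  static final String ARTIFACTS = "artifacts";

  // Maps an arbitrary artifact name to a file name that is safe on any
  // filesystem: 64 lowercase hex characters, no separators or spaces.
  static String encodedFileName(String artifactName) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(artifactName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 must be provided by every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  static String manifestPath(String sessionDir) {
    return sessionDir + "/" + MANIFEST;
  }

  static String artifactPath(String sessionDir, String artifactName) {
    return sessionDir + "/" + ARTIFACTS + "/" + encodedFileName(artifactName);
  }

  public static void main(String[] args) {
    String sessionDir = "/tmp/staging/session-1"; // assumed example path
    System.out.println(manifestPath(sessionDir));
    System.out.println(artifactPath(sessionDir, "lib/../odd name?.jar"));
  }
}
```

Because the stored file name depends only on a digest of the artifact name, names containing slashes, spaces, or `..` cannot escape the session directory. Deriving the file name from a content hash instead would be one way to approach the "avoid re-uploading" goal, but that is not shown here.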



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110439 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238189

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId manifestResourceId = getManifestFileResourceId(request.getStagingSessionToken());
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder()
+          .setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder()
+            .setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+                .toString()).build());
+      }
+      try (WritableByteChannel manifestWritableByteChannel = FileSystems
+          .create(manifestResourceId, MimeTypes.TEXT)) {
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110443 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238229

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110435 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238157

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110438 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238271

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-09 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110441 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 20:32
Start Date: 09/Jun/18 20:32
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194238285

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,292 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110369 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: [BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194207605

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110363
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194207729
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110362
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194207275
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
 
 Review comment:
   Location is a proto, so I will keep it as it is.
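
   To illustrate what the quoted loop computes, here is a minimal stand-alone sketch. It assumes plain java.nio paths in place of Beam's ResourceId and a SHA-256-based file-name encoding; the class name, the `locations` helper, and the body of `encodedFileName` are hypothetical and are not taken from the pull request.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone illustration only; not the Beam implementation. */
class ManifestLocationsSketch {

  // Assumed encoding: hash the artifact name so that any name (including one
  // containing path separators) maps to a flat, file-system-safe file name.
  static String encodedFileName(String artifactName) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(artifactName.getBytes(StandardCharsets.UTF_8));
      StringBuilder name = new StringBuilder("artifact_");
      for (byte b : hash) {
        name.append(String.format("%02x", b));
      }
      return name.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be present on every Java platform.
      throw new IllegalStateException(e);
    }
  }

  // Mirrors the loop in commitManifest: one (name -> location) entry per artifact,
  // where the location is the encoded file name resolved against the artifact directory.
  static Map<String, String> locations(Path artifactDir, List<String> artifactNames) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String name : artifactNames) {
      result.put(name, artifactDir.resolve(encodedFileName(name)).toString());
    }
    return result;
  }

  public static void main(String[] args) {
    Path artifactDir = Paths.get("staging", "job-1", "artifacts");
    locations(artifactDir, Arrays.asList("pipeline.jar", "deps/lib.jar"))
        .forEach((name, uri) -> System.out.println(name + " -> " + uri));
  }
}
```

   In the real service the name/URI pairs are carried by Location protos inside the ProxyManifest; a plain map is used here only to keep the sketch free of dependencies.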
   

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110365
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194207894
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110366
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194206311
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 110366)

> ArtifactStagingService that stages to a 

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110368
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194209128
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110361&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110361
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194208734
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110364
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194207039
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
 
 Review comment:
   Makes sense.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110370
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194208961
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110367
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 02:03
Start Date: 09/Jun/18 02:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194208717
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110343
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194203941
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
 
 Review comment:
   Nit: for symmetry, maybe also encapsulate the above into a getManifestFileResourceId?
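
   A minimal sketch of what the suggested helper could look like. It is not
   the code from the pull request: java.nio.file.Path stands in for Beam's
   ResourceId so the example runs without Beam on the classpath, and the
   StagingPaths class, the "artifacts" directory name, and the mapping from
   staging session token to job directory are all assumptions made for
   illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Hypothetical helper that keeps all staging-path resolution in one place.
 * java.nio.file.Path is used here as a stand-in for Beam's ResourceId.
 */
class StagingPaths {
  static final String MANIFEST = "MANIFEST";
  // Assumed name of the per-job artifact directory; not taken from the PR.
  static final String ARTIFACTS = "artifacts";

  private final Path stagingRoot;

  StagingPaths(Path stagingRoot) {
    this.stagingRoot = stagingRoot;
  }

  /** Assumption: the job directory is simply named after the session token. */
  Path getJobDirResourceId(String stagingSessionToken) {
    return stagingRoot.resolve(stagingSessionToken);
  }

  /** Directory under the job directory that holds the staged artifacts. */
  Path getArtifactDirResourceId(String stagingSessionToken) {
    return getJobDirResourceId(stagingSessionToken).resolve(ARTIFACTS);
  }

  /** The helper the review comment asks for, symmetric with the one above. */
  Path getManifestFileResourceId(String stagingSessionToken) {
    return getJobDirResourceId(stagingSessionToken).resolve(MANIFEST);
  }

  public static void main(String[] args) {
    StagingPaths paths = new StagingPaths(Paths.get("staging"));
    System.out.println(paths.getManifestFileResourceId("session-1"));
    System.out.println(paths.getArtifactDirResourceId("session-1"));
  }
}
```

   With such a helper, commitManifest would only call
   getManifestFileResourceId(request.getStagingSessionToken()) and would no
   longer need to resolve MANIFEST against the job directory inline.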



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110349
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194205918
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110350
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194205879
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110345
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194204766
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110348&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110348
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194204644
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
 
 Review comment:
   Should location instead be a map?
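
A minimal sketch of the trade-off behind this question, using plain Java collections rather than the actual `ProxyManifest` proto. The `Location` class, `findUri`, `toMap`, and the example paths below are hypothetical stand-ins introduced only for illustration; nothing here is taken from the pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocationLookupSketch {

  /** Hypothetical stand-in for a (name, uri) location entry; not the Beam proto class. */
  static final class Location {
    final String name;
    final String uri;

    Location(String name, String uri) {
      this.name = name;
      this.uri = uri;
    }
  }

  /** List form: finding one artifact's URI requires a linear scan. */
  static String findUri(List<Location> locations, String name) {
    for (Location location : locations) {
      if (location.name.equals(name)) {
        return location.uri;
      }
    }
    return null;
  }

  /** Map form: index by artifact name; a later duplicate name overwrites the earlier one. */
  static Map<String, String> toMap(List<Location> locations) {
    Map<String, String> byName = new LinkedHashMap<>();
    for (Location location : locations) {
      byName.put(location.name, location.uri);
    }
    return byName;
  }

  public static void main(String[] args) {
    List<Location> locations = new ArrayList<>();
    locations.add(new Location("a.jar", "/staging/job/artifacts/a.jar"));
    locations.add(new Location("b.jar", "/staging/job/artifacts/b.jar"));
    System.out.println(findUri(locations, "b.jar"));
    System.out.println(toMap(locations).get("b.jar"));
  }
}
```

With a map, lookup by artifact name is direct and unique names are enforced by construction; a list keeps duplicates and insertion order but pushes the search onto every reader.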


This is an automated message from the Apache Git 

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110344
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194202354
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110340
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194204099
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
 
 Review comment:
   It would be easier to read if this were fully qualified. 
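
One reading of this comment, presumably, is that the bare `Builder` type should be written as `ProxyManifest.Builder`. A minimal sketch under that assumption follows; the `ProxyManifest` class here is a hypothetical plain-Java stand-in with a nested builder, not the generated Beam proto class, and `buildManifest` is an illustrative helper.

```java
public class QualifiedBuilderSketch {

  /** Hypothetical stand-in for a generated message class with a nested Builder. */
  static final class ProxyManifest {
    final String manifest;

    private ProxyManifest(String manifest) {
      this.manifest = manifest;
    }

    static Builder newBuilder() {
      return new Builder();
    }

    static final class Builder {
      private String manifest = "";

      Builder setManifest(String manifest) {
        this.manifest = manifest;
        return this;
      }

      ProxyManifest build() {
        return new ProxyManifest(manifest);
      }
    }
  }

  static ProxyManifest buildManifest(String manifest) {
    // Qualifying the nested type shows which Builder this is
    // without the reader having to check the import list.
    ProxyManifest.Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(manifest);
    return proxyManifestBuilder.build();
  }

  public static void main(String[] args) {
    System.out.println(buildManifest("example-manifest").manifest);
  }
}
```

Writing the type as `ProxyManifest.Builder` also removes the need for a separate import of the nested `Builder` class, a name shared by many generated message types.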



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110341&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110341
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194200497
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,285 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110346&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110346
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194204959
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,286 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110342
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194205152
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer<BeamFileSystemArtifactStagingService> server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110351
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194205685
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Strings;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.inprocess.InProcessChannelBuilder;
+import io.grpc.stub.StreamObserver;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link BeamFileSystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class BeamFileSystemArtifactStagingServiceTest {
+
+  public static final Joiner JOINER = Joiner.on("");
+  public static final Charset CHARSET = StandardCharsets.UTF_8;
+  private GrpcFnServer<BeamFileSystemArtifactStagingService> server;
+  private BeamFileSystemArtifactStagingService artifactStagingService;
+  private ArtifactStagingServiceStub stub;
+  private Path srcDir;
+  private Path destDir;
+
+  @Before
+  public void setUp() throws Exception {
+    artifactStagingService = new BeamFileSystemArtifactStagingService();
+    server = GrpcFnServer
+        .allocatePortAndCreateFor(artifactStagingService, InProcessServerFactory.create());
+    stub =
+        ArtifactStagingServiceGrpc.newStub(
+            InProcessChannelBuilder.forName(server.getApiServiceDescriptor().getUrl()).build());
+
+    srcDir = Files.createTempDirectory("BFSTemp");
+    destDir = Files.createTempDirectory("BFDTemp");
+
+  }
+
+  @After
+  public void tearDown() throws Exception {
+    if (server != null) {
+  

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110347=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110347
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 09/Jun/18 00:09
Start Date: 09/Jun/18 00:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194205444
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,273 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110328=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110328
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 08/Jun/18 22:45
Start Date: 08/Jun/18 22:45
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194193541
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingServiceTest.java
 ##
 @@ -0,0 +1,271 @@

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110327=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110327
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 08/Jun/18 22:45
Start Date: 08/Jun/18 22:45
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5591: 
[BEAM-4290] Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#discussion_r194193649
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##
 @@ -0,0 +1,285 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.hash.Hashing;
+import com.google.protobuf.util.JsonFormat;
+import io.grpc.stub.StreamObserver;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.Serializable;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+import java.util.Collections;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Builder;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest.Location;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase;
+import org.apache.beam.runners.fnexecution.FnService;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link ArtifactStagingServiceImplBase} based on beam file system.
+ */
+public class BeamFileSystemArtifactStagingService extends ArtifactStagingServiceImplBase implements
+    FnService {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(BeamFileSystemArtifactStagingService.class);
+  private static final ObjectMapper MAPPER = new ObjectMapper();
+  // Use UTF8 for all text encoding.
+  private static final Charset CHARSET = StandardCharsets.UTF_8;
+  public static final String MANIFEST = "MANIFEST";
+
+  @Override
+  public StreamObserver<PutArtifactRequest> putArtifact(
+      StreamObserver<PutArtifactResponse> responseObserver) {
+    return new PutArtifactStreamObserver(responseObserver);
+  }
+
+  @Override
+  public void commitManifest(
+      CommitManifestRequest request, StreamObserver<CommitManifestResponse> responseObserver) {
+    try {
+      ResourceId jobResourceDirId = getJobDirResourceId(request.getStagingSessionToken());
+      ResourceId manifestResourceId = jobResourceDirId
+          .resolve(MANIFEST, StandardResolveOptions.RESOLVE_FILE);
+      ResourceId artifactDirResourceId = getArtifactDirResourceId(request.getStagingSessionToken());
+      Builder proxyManifestBuilder = ProxyManifest.newBuilder().setManifest(request.getManifest());
+      for (ArtifactMetadata artifactMetadata : request.getManifest().getArtifactList()) {
+        proxyManifestBuilder.addLocation(Location.newBuilder().setName(artifactMetadata.getName())
+            .setUri(artifactDirResourceId
+                .resolve(encodedFileName(artifactMetadata), StandardResolveOptions.RESOLVE_FILE)
+
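
The loop quoted above pairs each artifact name in the manifest with a URI under the artifact directory. The quoted diff is cut off before `encodedFileName` is defined, so the following is only a sketch of one way such a mapping could work, assuming a SHA-256 hex digest of the artifact name as the file name. The class `ArtifactLocationSketch`, the `locations` helper, and the use of plain strings instead of `ResourceId` are all hypothetical and are not taken from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the location-building loop; not the PR's code. */
class ArtifactLocationSketch {

  /** Encodes an arbitrary artifact name as a fixed-length, filesystem-safe hex string. */
  static String encodedFileName(String artifactName) {
    try {
      // Assumption: SHA-256 of the UTF-8 name. The PR may use a different encoding.
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(artifactName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 should be available on every JVM", e);
    }
  }

  /** Maps each artifact name to a location under the given artifact directory. */
  static Map<String, String> locations(String artifactDir, List<String> artifactNames) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String name : artifactNames) {
      result.put(name, artifactDir + "/" + encodedFileName(name));
    }
    return result;
  }
}
```

Hashing the name keeps path separators and other special characters in artifact names out of the staged file names, at the cost of file names that are not human-readable. The manifest then carries the mapping back to the original names.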

[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110309=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110309
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 08/Jun/18 22:14
Start Date: 08/Jun/18 22:14
Worklog Time Spent: 10m 
  Work Description: angoenka opened a new pull request #5591: [BEAM-4290] 
Beam File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591
 
 
   An artifact staging service that uses Beam FileSystems to stage files on 
various file systems.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 110309)
Time Spent: 7h 50m  (was: 7h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.
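
The goal of "avoiding uploading artifacts that are already there" can be illustrated with a small sketch. It uses the local filesystem through `java.nio.file` as a stand-in for a distributed filesystem, and the class and method names (`StageIfAbsentSketch`, `stageIfAbsent`) are hypothetical. The issue leaves the physical layout TBD, so this is one possible approach under those assumptions, not the design that was adopted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: skip staging when the target file already exists. */
class StageIfAbsentSketch {

  /** Returns true if the bytes were written, false if the artifact was already staged. */
  static boolean stageIfAbsent(Path stagingDir, String fileName, byte[] contents) {
    try {
      Files.createDirectories(stagingDir);
      Path target = stagingDir.resolve(fileName);
      if (Files.exists(target)) {
        return false; // Already staged, so nothing is uploaded.
      }
      // Write to a temporary file first so readers never observe a partial artifact.
      Path temp = Files.createTempFile(stagingDir, "upload-", ".tmp");
      Files.write(temp, contents);
      // Two concurrent stagers may both reach this point; the later move wins.
      Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

If `fileName` is derived from a content hash rather than from the artifact name, an existing file implies identical contents, which is what makes skipping the upload safe.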



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-08 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=110310=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110310
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 08/Jun/18 22:14
Start Date: 08/Jun/18 22:14
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5591: [BEAM-4290] Beam 
File System based ArtifactStagingService
URL: https://github.com/apache/beam/pull/5591#issuecomment-395906244
 
 
   R: @bsidhom @jkff @axelmagn @robertwb 




Issue Time Tracking
---

Worklog Id: (was: 110310)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108920=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108920
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 05/Jun/18 02:23
Start Date: 05/Jun/18 02:23
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5489: [BEAM-4290] proto 
changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#issuecomment-394559684
 
 
   @angoenka This PR seems to have broken the build:
   
   ./gradlew build
   [...]
   
   Task :beam-sdks-go-container:resolveBuildDependencies
   
   Resolving ./github.com/apache/beam/sdks/go@/Users/herohde/go/src/github.com/apache/beam/sdks/go

   github.com/apache/beam/runners/gcp/gcsproxy/vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:133:32: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:145:58: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:148:7: md.Md5 undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Md5)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:149:66: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:149:81: md.Md5 undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Md5)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:153:12: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   




Issue Time Tracking
---

Worklog Id: (was: 108920)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108919
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 05/Jun/18 02:22
Start Date: 05/Jun/18 02:22
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5489: [BEAM-4290] proto 
changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#issuecomment-394559684
 
 
   @angoenka This PR seems to have broken the build:
   
   # ./gradlew build
   [...]
   
   > Task :beam-sdks-go-container:resolveBuildDependencies
   Resolving ./github.com/apache/beam/sdks/go@/Users/herohde/go/src/github.com/apache/beam/sdks/go
   # github.com/apache/beam/runners/gcp/gcsproxy/vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:133:32: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:145:58: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:148:7: md.Md5 undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Md5)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:149:66: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:149:81: md.Md5 undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Md5)
   vendor/github.com/apache/beam/sdks/go/pkg/beam/artifact/gcsproxy/staging.go:153:12: md.Name undefined (type *jobmanagement_v1.PutArtifactMetadata has no field or method Name)
   
   > Task :beam-runners-gcp-gcsproxy:buildLinuxAmd64 FAILED
   




Issue Time Tracking
---

Worklog Id: (was: 108919)
Time Spent: 7.5h  (was: 7h 20m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108773
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 21:41
Start Date: 04/Jun/18 21:41
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5489: [BEAM-4290] proto 
changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#issuecomment-394508969
 
 
   Yes, the missing "token" will be filled in by a subsequent PR where applicable.




Issue Time Tracking
---

Worklog Id: (was: 108773)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108740
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192501650
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) A token for artifact staging session.
 
 Review comment:
   Yes, staging_session_token is a session token, so all artifacts related 
to that session are expected to use the same token.
   
   Sure, I will update the documentation.
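
To make the "one token per session" expectation concrete, here is a minimal sketch in plain Java. It does not use the Beam protos. `StagingSessionSketch`, `putArtifact`, and `artifactsFor` are hypothetical names, and the in-memory map only models how a service might group artifacts by `staging_session_token`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model: artifacts are grouped by the staging session token they were put with. */
class StagingSessionSketch {

  private final Map<String, List<String>> artifactsByToken = new HashMap<>();

  /** Records an artifact under the session identified by the token. */
  void putArtifact(String stagingSessionToken, String artifactName) {
    if (stagingSessionToken == null || stagingSessionToken.isEmpty()) {
      // The proto comment marks the token as required.
      throw new IllegalArgumentException("staging_session_token is required");
    }
    artifactsByToken
        .computeIfAbsent(stagingSessionToken, token -> new ArrayList<>())
        .add(artifactName);
  }

  /** Returns the artifacts staged so far for the given session, in insertion order. */
  List<String> artifactsFor(String stagingSessionToken) {
    return Collections.unmodifiableList(
        artifactsByToken.getOrDefault(stagingSessionToken, Collections.emptyList()));
  }
}
```

Under this model, an artifact put with a different token simply lands in a different session and would not be visible when the first session's manifest is committed.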




Issue Time Tracking
---

Worklog Id: (was: 108740)
Time Spent: 6.5h  (was: 6h 20m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108745=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108745
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192509819
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
 ##
 @@ -145,7 +146,9 @@ public ArtifactMetadata get() throws Exception {
   StreamObserver requestObserver = stub.putArtifact(responseObserver);
   ArtifactMetadata metadata =
   ArtifactMetadata.newBuilder().setName(file.getStagingName()).build();
-  requestObserver.onNext(PutArtifactRequest.newBuilder().setMetadata(metadata).build());
+  PutArtifactMetadata putMetadata = PutArtifactMetadata.newBuilder().setMetadata(metadata)
+  .setStagingSessionToken("token").build();
 
 Review comment:
   I will add a TODO, since the token is not implemented yet.




Issue Time Tracking
---

Worklog Id: (was: 108745)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108741
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192502440
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -124,6 +127,8 @@ message PutArtifactResponse {
 message CommitManifestRequest {
   // (Required) The manifest to commit.
   Manifest manifest = 1;
+  // (Required) A token for artifact staging session.
+  string staging_session_token = 2;
 
 Review comment:
   Done




Issue Time Tracking
---

Worklog Id: (was: 108741)
Time Spent: 6h 40m  (was: 6.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108746=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108746
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192511665
 
 

 ##
 File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/TestJobService.java
 ##
 @@ -59,6 +59,7 @@ public void prepare(
 PrepareJobResponse.newBuilder()
 .setPreparationId(preparationId)
 .setArtifactStagingEndpoint(stagingEndpoint)
+.setStagingSessionToken("TestStagingToken")
 
 Review comment:
   At this point it only needs to be set, since it is not currently used.




Issue Time Tracking
---

Worklog Id: (was: 108746)
Time Spent: 7h  (was: 6h 50m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108744=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108744
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192872171
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/local_job_service.py
 ##
 @@ -87,7 +87,9 @@ def Prepare(self, request, context=None):
 use_grpc=self._use_grpc,
 sdk_harness_factory=sdk_harness_factory)
 logging.debug("Prepared job '%s' as '%s'", request.job_name, preparation_id)
-return beam_job_api_pb2.PrepareJobResponse(preparation_id=preparation_id)
+# TODO(angoenka): Pass an appropriate staging_session_token
 
 Review comment:
   done




Issue Time Tracking
---

Worklog Id: (was: 108744)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108738
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192509534
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
+
+  // (required) Token for the artifact staging. This token also represent an artifact
 
 Review comment:
   The job service is expected to generate the token somehow. Describing how it 
is generated here would amount to suggesting an implementation.
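
   Purely as an illustration of "generate the token somehow", and not as a
   proposal for the proto documentation, a job service could hand out an
   opaque random token. The sketch below is Python; the helper names and the
   dict standing in for PrepareJobResponse are hypothetical, and the token
   format is an assumption rather than anything this PR prescribes.

```python
import uuid


def new_staging_session_token():
    """Return an opaque, unique token (the format is an illustrative assumption)."""
    return uuid.uuid4().hex


def prepare_job_response(preparation_id):
    # A plain dict stands in for the PrepareJobResponse message here.
    return {
        "preparation_id": preparation_id,
        "staging_session_token": new_staging_session_token(),
    }


first = prepare_job_response("job-1")
second = prepare_job_response("job-2")
```

   Each call yields a different token, so two prepared jobs get separate
   staging sessions in this sketch.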




Issue Time Tracking
---

Worklog Id: (was: 108738)
Time Spent: 6h 10m  (was: 6h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108743
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192510656
 
 

 ##
 File path: runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/job/ReferenceRunnerJobService.java
 ##
 @@ -97,6 +97,7 @@ public void prepare(
   PrepareJobResponse.newBuilder()
   .setPreparationId(preparationId)
   .setArtifactStagingEndpoint(artifactStagingService.getApiServiceDescriptor())
+  .setStagingSessionToken(tempDir.toFile().getAbsolutePath())
 
 Review comment:
   Done




Issue Time Tracking
---

Worklog Id: (was: 108743)
Time Spent: 6h 50m  (was: 6h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108742=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108742
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192511287
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/InMemoryJobService.java
 ##
 @@ -114,6 +114,8 @@ public void prepare(
   .newBuilder()
   .setPreparationId(preparationId)
   .setArtifactStagingEndpoint(stagingServiceDescriptor)
+  // TODO: Pass the correct token for staging.
 
 Review comment:
   The correct token will depend on the implementation of 
ArtifactStagingService. We will have to revisit this in the next PR, where I 
will add the ArtifactStagingService.




Issue Time Tracking
---

Worklog Id: (was: 108742)
Time Spent: 6h 40m  (was: 6.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108739
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 04/Jun/18 20:53
Start Date: 04/Jun/18 20:53
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192510959
 
 

 ##
 File path: runners/direct-java/src/test/java/org/apache/beam/runners/direct/portable/artifact/LocalFileSystemArtifactStagerServiceTest.java
 ##
 @@ -88,7 +88,11 @@ public void singleDataPutArtifactSucceeds() throws Exception {
 String name = "my-artifact";
 requestObserver.onNext(
 ArtifactApi.PutArtifactRequest.newBuilder()
-.setMetadata(ArtifactApi.ArtifactMetadata.newBuilder().setName(name).build())
+.setMetadata(
 
 Review comment:
   Not really. LocalFileSystemArtifactStagerServiceTest does not use 
StagingSessionToken, so there is no need to verify it at the moment.




Issue Time Tracking
---

Worklog Id: (was: 108739)
Time Spent: 6h 20m  (was: 6h 10m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108174=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108174
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499066
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) A token for artifact staging session.
 
 Review comment:
   Are all PutArtifactMetadata messages within the same PutArtifactRequest 
stream required to use the same token?
   
   Also, could you document where this token is supposed to come from?




Issue Time Tracking
---

Worklog Id: (was: 108174)
Time Spent: 5h 20m  (was: 5h 10m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108176
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499216
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -124,6 +127,8 @@ message PutArtifactResponse {
 message CommitManifestRequest {
   // (Required) The manifest to commit.
   Manifest manifest = 1;
+  // (Required) A token for artifact staging session.
+  string staging_session_token = 2;
 
 Review comment:
   Document where this token comes from?




Issue Time Tracking
---

Worklog Id: (was: 108176)
Time Spent: 5h 40m  (was: 5.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108179=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108179
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499781
 
 

 ##
 File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/TestJobService.java
 ##
 @@ -59,6 +59,7 @@ public void prepare(
 PrepareJobResponse.newBuilder()
 .setPreparationId(preparationId)
 .setArtifactStagingEndpoint(stagingEndpoint)
+.setStagingSessionToken("TestStagingToken")
 
 Review comment:
   Ditto, does it need to be verified or only set?




Issue Time Tracking
---

Worklog Id: (was: 108179)
Time Spent: 6h  (was: 5h 50m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108177=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108177
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499599
 
 

 ##
 File path: runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/job/ReferenceRunnerJobService.java
 ##
 @@ -97,6 +97,7 @@ public void prepare(
   PrepareJobResponse.newBuilder()
   .setPreparationId(preparationId)
   .setArtifactStagingEndpoint(artifactStagingService.getApiServiceDescriptor())
+  .setStagingSessionToken(tempDir.toFile().getAbsolutePath())
 
 Review comment:
   Add a comment saying that we intentionally use the temp dir path as the 
staging token, and clarify who (what class) is going to interpret it this way?
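
   As an illustration of what "interpreting the temp dir path as the staging
   token" could look like on the receiving side, here is a minimal Python
   sketch. The function stage_artifact is hypothetical and not part of Beam;
   it only assumes that the token is an absolute directory path, as in the
   diff above.

```python
import os
import tempfile


def stage_artifact(staging_session_token, name, data):
    """Write an artifact, treating the token as a directory path (assumption)."""
    path = os.path.join(staging_session_token, name)
    with open(path, "wb") as f:
        f.write(data)
    return path


# The job service side: the temp directory's absolute path doubles as the token.
temp_dir = tempfile.mkdtemp()
token = os.path.abspath(temp_dir)

# The staging service side: the token alone tells it where to write.
staged_path = stage_artifact(token, "my-artifact", b"payload")
```

   A comment in the Java code naming the class that reads the token this way
   would make the same coupling explicit.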




Issue Time Tracking
---

Worklog Id: (was: 108177)
Time Spent: 5h 50m  (was: 5h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108180
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499671
 
 

 ##
 File path: runners/direct-java/src/test/java/org/apache/beam/runners/direct/portable/artifact/LocalFileSystemArtifactStagerServiceTest.java
 ##
 @@ -88,7 +88,11 @@ public void singleDataPutArtifactSucceeds() throws Exception {
 String name = "my-artifact";
 requestObserver.onNext(
 ArtifactApi.PutArtifactRequest.newBuilder()
-.setMetadata(ArtifactApi.ArtifactMetadata.newBuilder().setName(name).build())
+.setMetadata(
 
 Review comment:
   In this file we only set the token. Should we also verify it somewhere, or 
is that not needed?




Issue Time Tracking
---

Worklog Id: (was: 108180)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108172=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108172
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499486
 
 

 ##
 File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
 ##
 @@ -145,7 +146,9 @@ public ArtifactMetadata get() throws Exception {
   StreamObserver requestObserver = stub.putArtifact(responseObserver);
   ArtifactMetadata metadata =
   ArtifactMetadata.newBuilder().setName(file.getStagingName()).build();
-  requestObserver.onNext(PutArtifactRequest.newBuilder().setMetadata(metadata).build());
+  PutArtifactMetadata putMetadata = PutArtifactMetadata.newBuilder().setMetadata(metadata)
+  .setStagingSessionToken("token").build();
 
 Review comment:
   This looks like a dummy value in a non-test class: is this correct?




Issue Time Tracking
---

Worklog Id: (was: 108172)
Time Spent: 5h  (was: 4h 50m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108178=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108178
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499706
 
 

 ##
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/InMemoryJobService.java
 ##
 @@ -114,6 +114,8 @@ public void prepare(
   .newBuilder()
   .setPreparationId(preparationId)
   .setArtifactStagingEndpoint(stagingServiceDescriptor)
+  // TODO: Pass the correct token for staging.
 
 Review comment:
   Address this - what would be the correct token?




Issue Time Tracking
---

Worklog Id: (was: 108178)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108181
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499852
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/local_job_service.py
 ##
 @@ -87,7 +87,9 @@ def Prepare(self, request, context=None):
 use_grpc=self._use_grpc,
 sdk_harness_factory=sdk_harness_factory)
 logging.debug("Prepared job '%s' as '%s'", request.job_name, preparation_id)
-return beam_job_api_pb2.PrepareJobResponse(preparation_id=preparation_id)
+# TODO(angoenka): Pass an appropriate staging_session_token
 
 Review comment:
   Ditto




Issue Time Tracking
---

Worklog Id: (was: 108181)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108173=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108173
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192499261
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
+
+  // (required) Token for the artifact staging. This token also represent an artifact
 
 Review comment:
   Ditto




Issue Time Tracking
---

Worklog Id: (was: 108173)
Time Spent: 5h 10m  (was: 5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-06-01 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=108175=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-108175
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 01/Jun/18 19:57
Start Date: 01/Jun/18 19:57
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support staging_session_token
URL: https://github.com/apache/beam/pull/5489#discussion_r192500128
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_stager_test.py
 ##
 @@ -64,7 +64,9 @@ def _stage_files(self, files):
 test_port = server.add_insecure_port('[::]:0')
 server.start()
 stager = portable_stager.PortableStager(
-grpc.insecure_channel('localhost:%s' % test_port))
+artifact_service_channel=grpc.insecure_channel(
+'localhost:%s' % test_port),
+staging_session_token='token')
 
 Review comment:
   Ditto - verify that it's present in the requests
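
A minimal sketch of the kind of check requested here, assuming a recording fake in place of the real gRPC servicer. `FakeRequest`, `RecordingArtifactService`, `FakeStager`, and `tokens_seen` are hypothetical stand-ins, not the classes in `portable_stager_test.py`:

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FakeRequest:
    # Hypothetical stand-in for a staging request message.
    staging_session_token: str
    name: str


@dataclass
class RecordingArtifactService:
    """Fake service that keeps every request it receives."""
    received: List[FakeRequest] = field(default_factory=list)

    def put(self, request: FakeRequest) -> None:
        self.received.append(request)


class FakeStager:
    """Hypothetical stager that attaches the session token to each request."""

    def __init__(self, service: RecordingArtifactService,
                 staging_session_token: str) -> None:
        self._service = service
        self._token = staging_session_token

    def stage(self, names: List[str]) -> None:
        for name in names:
            self._service.put(FakeRequest(self._token, name))


def tokens_seen(service: RecordingArtifactService) -> Set[str]:
    """Collect the distinct tokens carried by the recorded requests."""
    return {request.staging_session_token for request in service.received}
```

A test would then stage a few files and assert that `tokens_seen(service)` contains only the token handed to the stager, rather than just checking that staging did not fail.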




Issue Time Tracking
---

Worklog Id: (was: 108175)
Time Spent: 5.5h  (was: 5h 20m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107937=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107937
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 23:25
Start Date: 31/May/18 23:25
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r192264472
 
 

 ##
 File path: sdks/go/gogradle.lock
 ##
 @@ -200,7 +200,7 @@ dependencies:
 - "g...@github.com:golang/protobuf.git"
 vcs: "git"
 name: "github.com/golang/protobuf"
-commit: "bbd03ef6da3a115852eaf24c8a1c46aeb39aa175"
+commit: "3a3da3a4e26776cc22a79ef46d5d58477532dede"
 
 Review comment:
   The new proto generator is incompatible with the old golang/protobuf.
   Reference discussion: https://github.com/google/protobuf/issues/4582




Issue Time Tracking
---

Worklog Id: (was: 107937)
Time Spent: 4h 50m  (was: 4h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107936=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107936
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 23:25
Start Date: 31/May/18 23:25
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5489: [BEAM-4290] proto 
changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#issuecomment-393713259
 
 
   The PR is ready for review. PTAL.




Issue Time Tracking
---

Worklog Id: (was: 107936)
Time Spent: 4h 40m  (was: 4.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107911
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 22:10
Start Date: 31/May/18 22:10
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r192252323
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
+  // (Required) The Artifact metadata.
+  ArtifactMetadata metadata = 2;
+}
+
 // A request to stage an artifact.
 message PutArtifactRequest {
   // (Required)
   oneof content {
-    // The Artifact metadata. The first message in a PutArtifact call must contain the name
-    // of the artifact.
-    ArtifactMetadata metadata = 1;
+    // The first message in a PutArtifact call must contain this field.
 
 Review comment:
   The structure of the proto makes it difficult to pass an additional field in there.
   To pass an additional field in PutArtifactRequest, we would have to do something like this:
   
   ```
   message PutArtifactRequest {
     // (Required)
     oneof content {
       // The staging session token. The FIRST message in a PutArtifact call must contain
       // this token.
       string staging_session_token = 1;
       // The Artifact metadata. The SECOND message in a PutArtifact call must contain the name
       // of the artifact.
       ArtifactMetadata metadata = 2;

       // A chunk of the artifact. All messages after the first two in a PutArtifact call must
       // contain a chunk.
       ArtifactChunk data = 3;
     }
   }
   ```
   To avoid this sequencing of fields, I prefer to define a separate message that is passed in the first request.
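
The preferred design can be sketched as follows, assuming plain Python stand-ins for the proto messages. The field name `staging_session_token` follows the naming settled on in this thread; the helpers `put_artifact_requests` and `receive` are hypothetical and only illustrate that the token and the artifact metadata travel together in the first message, with chunks after it:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple, Union


@dataclass
class ArtifactMetadata:
    name: str


@dataclass
class PutArtifactMetadata:
    # Hypothetical stand-in for the proposed proto message.
    staging_session_token: str
    metadata: ArtifactMetadata


@dataclass
class ArtifactChunk:
    data: bytes


# Stand-in for the `oneof content` of PutArtifactRequest.
PutArtifactRequest = Union[PutArtifactMetadata, ArtifactChunk]


def put_artifact_requests(token: str, name: str, payload: bytes,
                          chunk_size: int = 4) -> Iterator[PutArtifactRequest]:
    """Yield the request stream: one metadata message, then data chunks."""
    yield PutArtifactMetadata(token, ArtifactMetadata(name))
    for start in range(0, len(payload), chunk_size):
        yield ArtifactChunk(payload[start:start + chunk_size])


def receive(requests: Iterable[PutArtifactRequest]) -> Tuple[str, str, bytes]:
    """Server side: the first message carries token and name together."""
    stream = iter(requests)
    first = next(stream)
    if not isinstance(first, PutArtifactMetadata):
        raise ValueError("first message must be PutArtifactMetadata")
    data = bytearray()
    for message in stream:
        if not isinstance(message, ArtifactChunk):
            raise ValueError("only chunks may follow the first message")
        data.extend(message.data)
    return first.staging_session_token, first.metadata.name, bytes(data)
```

With this shape the receiver has a single ordering rule (metadata first, chunks after), instead of a token message, then a metadata message, then chunks.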
   




Issue Time Tracking
---

Worklog Id: (was: 107911)
Time Spent: 4.5h  (was: 4h 20m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107910
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 22:09
Start Date: 31/May/18 22:09
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r192252323
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
+  // (Required) The Artifact metadata.
+  ArtifactMetadata metadata = 2;
+}
+
 // A request to stage an artifact.
 message PutArtifactRequest {
   // (Required)
   oneof content {
-    // The Artifact metadata. The first message in a PutArtifact call must contain the name
-    // of the artifact.
-    ArtifactMetadata metadata = 1;
+    // The first message in a PutArtifact call must contain this field.
 
 Review comment:
   The structure of the proto makes it difficult to pass an additional field in there.
   To pass an additional field in PutArtifactRequest, we would have to do something like this:
   
   ```
   message PutArtifactRequest {
     // (Required)
     oneof content {
       // The staging session token. The FIRST message in a PutArtifact call must contain
       // this token.
       string staging_session_token = 1;
       // The Artifact metadata. The SECOND message in a PutArtifact call must contain the name
       // of the artifact.
       ArtifactMetadata metadata = 2;

       // A chunk of the artifact. All messages after the first two in a PutArtifact call must
       // contain a chunk.
       ArtifactChunk data = 3;
     }
   }
   ```
   
   




Issue Time Tracking
---

Worklog Id: (was: 107910)
Time Spent: 4h 20m  (was: 4h 10m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107877
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 20:49
Start Date: 31/May/18 20:49
Worklog Time Spent: 10m 
  Work Description: jkff commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r192168198
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
+  // (Required) The Artifact metadata.
+  ArtifactMetadata metadata = 2;
+}
+
 // A request to stage an artifact.
 message PutArtifactRequest {
   // (Required)
   oneof content {
-    // The Artifact metadata. The first message in a PutArtifact call must contain the name
-    // of the artifact.
-    ArtifactMetadata metadata = 1;
+    // The first message in a PutArtifact call must contain this field.
 
 Review comment:
   Any reason not to put the staging session token as a top-level field in 
PutArtifactRequest, instead of adding the new message PutArtifactMetadata? The 
latter feels confusing since the session token is not really associated with 
any particular artifact.




Issue Time Tracking
---

Worklog Id: (was: 107877)
Time Spent: 4h 10m  (was: 4h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-31 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107733=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107733
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 31/May/18 16:15
Start Date: 31/May/18 16:15
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r192155816
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   LGTM




Issue Time Tracking
---

Worklog Id: (was: 107733)
Time Spent: 4h  (was: 3h 50m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107403=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107403
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 22:28
Start Date: 30/May/18 22:28
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191942404
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   Just to reiterate on naming:
   I don't have a strong preference between `artifact_staging_id` and `staging_session_token`,
   but I will go ahead with `staging_session_token`.




Issue Time Tracking
---

Worklog Id: (was: 107403)
Time Spent: 3h 50m  (was: 3h 40m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107402=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107402
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 22:25
Start Date: 30/May/18 22:25
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191941805
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
 
 Review comment:
   Had an offline discussion with @herohde.
   Enhancing `ApiServiceDescriptor` is certainly something to consider, but it's out of scope for this PR.
   For this PR we are going with approach 3 mentioned in the document:
   
https://docs.google.com/document/d/12zNk3O2nhTB8Zmxw5U78qXrvlk5r42X8tqF248IDlpI/edit#heading=h.mvxjskcybk6q
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 107402)
Time Spent: 3h 40m  (was: 3.5h)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107392=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107392
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 21:41
Start Date: 30/May/18 21:41
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191932190
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   I see.  I guess I didn't realize it was that widespread.  In that case I 
retract my objections, since you're right that we should follow the prevailing 
style.




Issue Time Tracking
---

Worklog Id: (was: 107392)
Time Spent: 3.5h  (was: 3h 20m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107348=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107348
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 20:02
Start Date: 30/May/18 20:02
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191904156
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   We have a bunch of places where there is a token/id and a runner may choose to put something there.
   Migrating to a struct for all the tokens is a valid discussion to have, but I feel it should be
   separate from this PR, as this PR already copies existing behavior when it comes to tokens.




Issue Time Tracking
---

Worklog Id: (was: 107348)
Time Spent: 3h 20m  (was: 3h 10m)

> ArtifactStagingService that stages to a distributed filesystem
> --
>
> Key: BEAM-4290
> URL: https://issues.apache.org/jira/browse/BEAM-4290
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Using the job's staging directory from PipelineOptions.
> Physical layout on the distributed filesystem is TBD but it should allow for 
> arbitrary filenames and ideally for eventually avoiding uploading artifacts 
> that are already there.
> Handling credentials is TBD.





[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107342
 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:44
Start Date: 30/May/18 19:44
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191898873
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
 
 Review comment:
   Thanks for the link.
   I agree with @lukecwik on adding more information to `ApiServiceDescriptor`,
   since it is meant to hold all the relevant connection information. In that
   case it is fair to pack connection headers into `ApiServiceDescriptor`, which
   the framework should simply pass through.
   Enhancing `ApiServiceDescriptor` to carry headers (for now; credentials etc.
   later) is a more suitable approach.
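
A rough sketch of the idea above, under stated assumptions: `ServiceDescriptorSketch` and `call_metadata` are hypothetical names, not part of Beam or of the proposed proto change. It only illustrates a descriptor that carries headers which the framework forwards on every call without interpreting them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ServiceDescriptorSketch:
    """Hypothetical stand-in for a descriptor that also carries headers."""
    url: str
    headers: Dict[str, str] = field(default_factory=dict)


def call_metadata(descriptor: ServiceDescriptorSketch) -> List[Tuple[str, str]]:
    """Turn the descriptor's headers into (key, value) pairs for each call.

    Keys are lower-cased because gRPC metadata keys are lower-case. Values are
    passed through untouched, so the framework never needs to understand them.
    """
    return sorted((key.lower(), value) for key, value in descriptor.headers.items())
```

With this shape, a staging token or client id would be just another header entry, and later additions such as credentials would not require new plumbing in the callers.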


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 107342)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107337 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:36
Start Date: 30/May/18 19:36
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191896270
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   Okay. Understandable. Have we considered using an anonymous Struct for
   implementation-specific metadata, so that we can decouple it from the
   identifier?




Issue Time Tracking
---

Worklog Id: (was: 107337)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107336 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:34
Start Date: 30/May/18 19:34
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191895524
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
 
 Review comment:
   Sorry. Forgot to link to it: 
https://github.com/apache/beam/pull/5349#discussion_r190624823.




Issue Time Tracking
---

Worklog Id: (was: 107336)
Time Spent: 2h 50m  (was: 2h 40m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107333 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:19
Start Date: 30/May/18 19:19
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191891215
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   Given the diversity of ArtifactStagingService implementations, it's really
   hard to capture all the required metadata.
   One thing we are trying to do in the proto is to avoid enforcing any
   implementation details.
   From the proto's perspective, artifact_staging_id/token is just a text string.
   It can simply be an id that the service implementation looks up in another
   system, or it can be JSON, so that the implementation can extract all the
   relevant information from the token without referring to another system.
   This gives implementers the flexibility to build the service the way they
   want.
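
To make the two options above concrete, here is a minimal sketch. All names in it (`StagingSession`, `LookupTokenService`, `encode_json_token`, `decode_json_token`) and the two session fields are assumptions for illustration; the proto itself only sees an opaque string either way.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StagingSession:
    """Hypothetical details a staging service might need for one session."""
    base_dir: str
    session_id: str


class LookupTokenService:
    """Option 1: the token is a bare id; details live in another system."""

    def __init__(self, sessions: Dict[str, StagingSession]):
        # A plain dict stands in for the external system here.
        self._sessions = sessions

    def resolve(self, token: str) -> StagingSession:
        return self._sessions[token]


def encode_json_token(session: StagingSession) -> str:
    """Option 2: the token is self-describing JSON built by the job service."""
    return json.dumps(
        {"base_dir": session.base_dir, "session_id": session.session_id},
        sort_keys=True,
    )


def decode_json_token(token: str) -> StagingSession:
    """The staging service recovers everything from the token alone."""
    fields = json.loads(token)
    return StagingSession(base_dir=fields["base_dir"], session_id=fields["session_id"])
```

In both cases the client only copies the string into its requests; which option is used stays a private contract between the job service and the staging service.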




Issue Time Tracking
---

Worklog Id: (was: 107333)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107332 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:13
Start Date: 30/May/18 19:13
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191889464
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
 
 Review comment:
   Sorry, I don't have the context of that PR. Does the client_id refer to a
   common id shared across all gRPC connections to identify the client, or is it
   a service-specific id that lets each service identify the client connected to
   it? I think if it is the first, then we can pack all the relevant info in
   that id.
   However, I like the idea of having an explicit token for artifact staging.




Issue Time Tracking
---

Worklog Id: (was: 107332)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=107331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107331 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 19:11
Start Date: 30/May/18 19:11
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191888360
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   The discussion document alludes to the following items to be specified by 
the job service and passed to the staging service:
   
   - Base directory to put artifacts in.
   - TTL for the artifacts.
   - Authentication to submit artifacts.
   - Credentials to store artifacts in distributed file system.
   
   Of these, how much is going to fit into the `artifact_staging_id`?  Since 
this is already a metadata proto, why are we packing data into a string to be 
parsed later?  Is there a particular parser for it that we already have an 
implementation for?  If so, we need to document it as such.  Otherwise I'd 
recommend that any metadata contained within the `artifact_staging_id` should 
be made explicit as fields in `PutArtifactMetadata`.
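
As an illustration of the "explicit fields" alternative, the four items listed above could be modelled roughly as follows. This is only a sketch: the class name, field names, types, and which fields are required are all assumptions, since neither the PR nor the discussion document quoted here specifies them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExplicitPutArtifactMetadata:
    """Hypothetical metadata with one named field per item in the list above."""
    base_directory: str                            # where artifacts are put
    ttl_seconds: Optional[int] = None              # TTL for the artifacts
    submit_auth_token: Optional[str] = None        # authentication to submit
    filesystem_credentials: Optional[str] = None   # credentials for the DFS


def validation_errors(metadata: ExplicitPutArtifactMetadata) -> List[str]:
    """Return human-readable problems; an empty list means the metadata is usable."""
    errors = []
    if not metadata.base_directory:
        errors.append("base_directory is required")
    if metadata.ttl_seconds is not None and metadata.ttl_seconds <= 0:
        errors.append("ttl_seconds must be positive")
    return errors
```

Named fields can be validated and documented individually, at the cost of fixing in the API what every staging service implementation must understand.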
   




Issue Time Tracking
---

Worklog Id: (was: 107331)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-29 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=106967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106967 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 01:24
Start Date: 30/May/18 01:24
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191617046
 
 

 ##
 File path: model/job-management/src/main/proto/beam_job_api.proto
 ##
 @@ -69,12 +69,16 @@ message PrepareJobRequest {
 
 message PrepareJobResponse {
   // (required) The ID used to associate calls made while preparing the job. preparationId is used
-  // to run the job, as well as in other pre-execution APIs such as Artifact staging.
+  // to run the job.
   string preparation_id = 1;
 
   // An endpoint which exposes the Beam Artifact Staging API. Artifacts used by the job should be
   // staged to this endpoint, and will be available during job execution.
   org.apache.beam.model.pipeline.v1.ApiServiceDescriptor artifact_staging_endpoint = 2;
 
 Review comment:
   @lukecwik suggested in a separate PR to add a client_id to the 
ApiServiceDescriptor proto and always send that as a header. Would that ID be 
suitable for an artifact_staging_id separate from preparation_id? Then no other 
changes would be needed (except fixing the comment above) and the propagation 
would be done by general purpose logic.




Issue Time Tracking
---

Worklog Id: (was: 106967)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (BEAM-4290) ArtifactStagingService that stages to a distributed filesystem

2018-05-29 Thread ASF GitHub Bot (JIRA)


 [ https://issues.apache.org/jira/browse/BEAM-4290?focusedWorklogId=106957&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106957 ]

ASF GitHub Bot logged work on BEAM-4290:


Author: ASF GitHub Bot
Created on: 30/May/18 00:07
Start Date: 30/May/18 00:07
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #5489: 
[BEAM-4290] proto changes to support artifact_staging_id
URL: https://github.com/apache/beam/pull/5489#discussion_r191610707
 
 

 ##
 File path: model/job-management/src/main/proto/beam_artifact_api.proto
 ##
 @@ -102,13 +99,19 @@ message ArtifactChunk {
   bytes data = 1;
 }
 
+message PutArtifactMetadata {
+  // (Required) An identifier for artifact staging session.
+  string artifact_staging_id = 1;
 
 Review comment:
   It makes sense. And thanks for giving it a better name.
   I will rename `artifact_staging_id` to `staging_session_token`.




Issue Time Tracking
---

Worklog Id: (was: 106957)
Time Spent: 1h 50m  (was: 1h 40m)


