[GitHub] [hadoop] steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333723547

File path: hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md

@@ -905,6 +905,27 @@ the file, then verify the invariants in the `PathHandle` using
 `getFileStatus(Path)` to implement `CONTENT`. This could yield false
 positives and it requires additional RPC traffic.
+
+### `copyFile(URI srcFile, URI dstFile)`
+
+Copies a file `srcFile` to another file `dstFile`.
+
+Implementations without a compliant call SHOULD throw `UnsupportedOperationException`.
+
+#### Preconditions
+
+    if not exists(FS, srcFile) : raise FileNotFoundException
+
+    if not exists(parentFolder(dstFile)) : raise IllegalArgumentException
+
+    if isDirectory(srcFile) : raise IllegalArgumentException
+
+    if exists(dstFile) : raise FileAlreadyExistsException

Review comment:
why? Because with s3 you don't get an atomic check anyway... it's simplest just to do the copy and say "yes, we overwrite".

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
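The preconditions quoted above can be mirrored with plain `java.nio.file` to make the intended failure modes concrete. This is an illustrative sketch, not Hadoop code: the class name and the use of the JDK's exception types (rather than Hadoop's) are assumptions, and the final `FileAlreadyExistsException` check is exactly the one the reviewer questions for S3, where no atomic existence check is possible.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative check of the draft copyFile preconditions (hypothetical helper). */
public class CopyPreconditions {
  public static void check(Path srcFile, Path dstFile) throws IOException {
    if (!Files.exists(srcFile)) {
      throw new FileNotFoundException(srcFile.toString());
    }
    if (Files.isDirectory(srcFile)) {
      throw new IllegalArgumentException("source is a directory: " + srcFile);
    }
    Path parent = dstFile.getParent();
    if (parent == null || !Files.exists(parent)) {
      throw new IllegalArgumentException("destination parent missing: " + dstFile);
    }
    if (Files.exists(dstFile)) {
      // The review questions this clause: S3 cannot check-and-copy atomically,
      // so overwriting by default may be the simpler contract.
      throw new FileAlreadyExistsException(dstFile.toString());
    }
  }
}
```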
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333721693

File path: hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md

@@ -905,6 +905,27 @@ the file, then verify the invariants in the `PathHandle` using
 `getFileStatus(Path)` to implement `CONTENT`. This could yield false
 positives and it requires additional RPC traffic.
+
+### `copyFile(URI srcFile, URI dstFile)`
+
+Copies a file `srcFile` to another file `dstFile`.
+
+Implementations without a compliant call SHOULD throw `UnsupportedOperationException`.

Review comment:
MUST
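The SHOULD-vs-MUST point matters for callers: only a guaranteed `UnsupportedOperationException` lets them reliably fall back to a read-and-write copy. A minimal sketch of what a non-supporting store would do, with a hypothetical base class (this is not the actual `FileSystem` code):

```java
import java.net.URI;

/** Hypothetical base class: stores without a native copy reject the call. */
public abstract class BaseFileSystem {
  /** Subclasses with a server-side copy (e.g. the S3 COPY operation) override this. */
  public void copyFile(URI srcFile, URI dstFile) {
    // Per the review: when unsupported, this MUST (not merely SHOULD) be thrown,
    // so callers can detect the absence of the feature and fall back.
    throw new UnsupportedOperationException(
        "copyFile not supported by " + getClass().getName());
  }
}
```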
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333723714

File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java

@@ -82,6 +82,7 @@
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Preconditions;
 import com.google.common.util.concurrent.ListeningExecutorService;
+import org.apache.hadoop.fs.s3a.impl.CopyOperation;

Review comment:
needs to go into the org.apache imports
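The comment refers to Hadoop's import layout, which keeps `org.apache.*` imports in their own block rather than mixed in with third-party imports. Assuming the usual grouping (JDK, then third-party, then `org.apache.*`, then statics), the new import would move down a block:

```
import com.google.common.annotations.VisibleForTesting;
import com.google.common.base.Preconditions;
import com.google.common.util.concurrent.ListeningExecutorService;

import org.apache.hadoop.fs.s3a.impl.CopyOperation;
```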
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333722944

File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCopyTest.java

@@ -0,0 +1,149 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.contract;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.junit.Test;
+
+import static org.apache.hadoop.fs.CommonPathCapabilities.FS_NATIVE_COPY;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.createFile;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.touch;
+import static org.apache.hadoop.test.LambdaTestUtils.intercept;
+
+/**
+ * Test native copy - if supported.
+ */
+public abstract class AbstractContractCopyTest extends AbstractFSContractTestBase {
+
+  protected Path srcDir;
+  protected Path srcFile;
+  protected Path zeroByteFile;
+  protected Path dstDir;
+  protected Path dstFile;
+  protected FileSystem fs;
+
+  @Override
+  public void setup() throws Exception {
+    super.setup();
+    srcDir = absolutepath("source");
+    dstDir = absolutepath("target");
+
+    srcFile = new Path(srcDir, "source.txt");
+    dstFile = new Path(dstDir, "dst.txt");
+    zeroByteFile = new Path(srcDir, "zero.txt");
+
+    byte[] block = dataset(TEST_FILE_LEN, 0, 255);
+    createFile(getFileSystem(), srcFile, true, block);
+    touch(getFileSystem(), zeroByteFile);
+
+    fs = getFileSystem();
+
+    skipIfUnsupported(SUPPORTS_NATIVE_COPY);
+  }
+
+  @Override
+  public void teardown() throws Exception {
+    describe("\nTeardown\n");
+    super.teardown();
+  }
+
+  @Test
+  public void testBasicCopy() throws Throwable {
+    describe("Checks whether S3A FS supports native copy"
+        + " and copies a file from one location to another."
+        + " File deletion taken care of in teardown.");
+
+    assertPathExists(srcFile + " does not exist", srcFile);
+
+    fs.delete(dstFile, false);
+    // initiate copy
+    fs.copyFile(srcFile.toUri(), dstFile.toUri());
+
+    // Check if file is available
+    assertIsFile(dstFile);

Review comment:
you need to
* actually verify the data
* assert that the source file is still there
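The two missing assertions the reviewer asks for can be expressed with plain `java.nio.file` to show the intent (the class name is hypothetical; in the actual contract test, helpers from `ContractTestUtils` would play this role instead):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** After a copy, verify both the destination data and the surviving source. */
public class CopyVerifier {
  public static void verifyCopy(Path src, Path dst) throws IOException {
    // Reading the source also proves it still exists after the copy.
    byte[] expected = Files.readAllBytes(src);
    byte[] actual = Files.readAllBytes(dst);
    if (!Arrays.equals(expected, actual)) {
      throw new AssertionError("destination bytes differ from source");
    }
  }
}
```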
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333722583

File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractContractCopyTest.java

@@ -0,0 +1,149 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.contract;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.junit.Test;
+
+import static org.apache.hadoop.fs.CommonPathCapabilities.FS_NATIVE_COPY;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.createFile;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.dataset;
+import static org.apache.hadoop.fs.contract.ContractTestUtils.touch;
+import static org.apache.hadoop.test.LambdaTestUtils.intercept;
+
+/**
+ * Test native copy - if supported.
+ */
+public abstract class AbstractContractCopyTest extends AbstractFSContractTestBase {
+
+  protected Path srcDir;
+  protected Path srcFile;
+  protected Path zeroByteFile;
+  protected Path dstDir;
+  protected Path dstFile;
+  protected FileSystem fs;
+
+  @Override
+  public void setup() throws Exception {
+    super.setup();
+    srcDir = absolutepath("source");
+    dstDir = absolutepath("target");
+
+    srcFile = new Path(srcDir, "source.txt");
+    dstFile = new Path(dstDir, "dst.txt");
+    zeroByteFile = new Path(srcDir, "zero.txt");
+
+    byte[] block = dataset(TEST_FILE_LEN, 0, 255);
+    createFile(getFileSystem(), srcFile, true, block);
+    touch(getFileSystem(), zeroByteFile);
+
+    fs = getFileSystem();
+
+    skipIfUnsupported(SUPPORTS_NATIVE_COPY);

Review comment:
do this before creating files
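The point of "do this before creating files" is that an unsupported-feature skip should run before any expensive fixture creation, so skipped runs do no remote I/O at all. A generic, self-contained sketch of the ordering (all names hypothetical, standing in for the JUnit setup above):

```java
/** Setup ordering: probe for the optional feature before creating fixtures. */
public class SetupOrder {
  public int filesCreated = 0;
  private final boolean nativeCopySupported;

  public SetupOrder(boolean nativeCopySupported) {
    this.nativeCopySupported = nativeCopySupported;
  }

  /** Returns false (test skipped) without creating fixtures when unsupported. */
  public boolean setup() {
    if (!nativeCopySupported) {
      return false;       // skip first: no files are ever created
    }
    filesCreated += 2;    // only now pay for the test files (srcFile, zeroByteFile)
    return true;
  }
}
```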
steveloughran commented on a change in pull request #1591: HADOOP-16629: support copyFile in s3afilesystem
URL: https://github.com/apache/hadoop/pull/1591#discussion_r333722308

File path: hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md

@@ -905,6 +905,27 @@ the file, then verify the invariants in the `PathHandle` using
 `getFileStatus(Path)` to implement `CONTENT`. This could yield false
 positives and it requires additional RPC traffic.
+
+### `copyFile(URI srcFile, URI dstFile)`
+
+Copies a file `srcFile` to another file `dstFile`.
+
+Implementations without a compliant call SHOULD throw `UnsupportedOperationException`.
+
+#### Preconditions
+
+    if not exists(FS, srcFile) : raise FileNotFoundException
+
+    if not exists(parentFolder(dstFile)) : raise IllegalArgumentException
+
+    if isDirectory(srcFile) : raise IllegalArgumentException
+
+    if exists(dstFile) : raise FileAlreadyExistsException
+
+#### Postconditions
+
+`dstFile` is available in the filesystem.

Review comment:
afraid you are going to have to be a bit more detailed. Presumably the data in dstFile matches that of srcFile... currently that little detail isn't spelled out.
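One way to spell out the detail the reviewer asks for, in the state-transition notation filesystem.md already uses for other operations (a sketch of possible wording, not the merged spec):

```
FS' = FS where:
    exists(FS', srcFile)
    exists(FS', dstFile)
    data(FS', dstFile) == data(FS, srcFile)
```

This captures both review points at once: the destination holds a byte-for-byte copy of the source's data, and the source file itself is unchanged by the operation.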