[ https://issues.apache.org/jira/browse/HDFS-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878213#comment-17878213 ]
ASF GitHub Bot commented on HDFS-17381:
---------------------------------------

steveloughran commented on code in PR #6551:
URL: https://github.com/apache/hadoop/pull/6551#discussion_r1739192871


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WithErasureCoding.java:
##########
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs;
+
+import java.io.IOException;
+
+/**
+ * Filesystems that support EC can implement this interface.
+ */
+public interface WithErasureCoding {
+
+  /**
+   * Get the EC Policy name of the given file
+   * @param fileStatus object of the file whose ecPolicy needs to be obtained
+   * @return the ec Policy name
+   */
+  String getErasureCodingPolicyName(FileStatus fileStatus);

Review Comment:
   you're going to have to define this "fully", specifically
   * what if a filestatus for a different FS schema is passed in?
   * can a null string be returned
   * can an empty string be returned
   * if a file has no erasure coding, what is returned?


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WithErasureCoding.java:
##########
+  String getErasureCodingPolicyName(FileStatus fileStatus);
+
+  /**
+   * Set the given ecPolicy on the path

Review Comment:
   nit, add a `.`
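One way to pin those semantics down is sketched below. The Javadoc contract here is an assumption drafted for discussion, not wording from the PR; it answers the questions above for getErasureCodingPolicyName and anticipates the same questions for setErasureCodingPolicy raised in the next comment.

```java
package org.apache.hadoop.fs;

import java.io.IOException;

/**
 * Filesystems that support erasure coding can implement this interface.
 */
public interface WithErasureCoding {

  /**
   * Get the EC policy name of the given file.
   *
   * @param fileStatus status of the file whose EC policy is queried;
   *                   if it belongs to a different filesystem (scheme
   *                   mismatch), the implementation could return null
   *                   rather than throw (assumed contract).
   * @return the EC policy name, or null if the file is not erasure coded
   *         or the status is not recognised; never an empty string
   *         (assumed contract).
   */
  String getErasureCodingPolicyName(FileStatus fileStatus);

  /**
   * Set the given EC policy on the path.
   *
   * @param path path on which the EC policy is to be set.
   * @param ecPolicyName policy name; assumed to be non-null, non-empty,
   *                     and known to the target filesystem, otherwise an
   *                     IOException is raised (assumed contract).
   * @throws IOException if the policy cannot be set.
   */
  void setErasureCodingPolicy(Path path, String ecPolicyName)
      throws IOException;
}
```

Whatever the final answers are, returning null for "not erasure coded / not my FileStatus" keeps the distcp call sites simple, since the copy path in this PR already null-checks the result.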

##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WithErasureCoding.java:
##########
+  /**
+   * Set the given ecPolicy on the path
+   * @param path on which the EC policy needs to be set
+   * @throws IOException if the set is not successful
+   */
+  void setErasureCodingPolicy(Path path, String ecPolicyName) throws

Review Comment:
   * can null be supplied?
   * what if an empty string is passed in?
   * what if the policy is unknown?


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -207,9 +209,13 @@ private long copyToFile(Path targetPath, FileSystem targetFS,
     ErasureCodingPolicy ecPolicy = null;
     if (preserveEC && sourceStatus.isErasureCoded()
-        && sourceStatus instanceof HdfsFileStatus
-        && targetFS instanceof DistributedFileSystem) {
-      ecPolicy = ((HdfsFileStatus) sourceStatus).getErasureCodingPolicy();
+        && sourceFS instanceof WithErasureCoding
+        && targetFS instanceof WithErasureCoding) {
+      String ecPolicyName =
+          ((WithErasureCoding) sourceFS).getErasureCodingPolicyName(sourceStatus);
+      if (ecPolicyName != null) {
+        ecPolicy = SystemErasureCodingPolicies.getByName(ecPolicyName);

Review Comment:
   is this needed now the policy is being passed across as a string? As it is,
   it requires that HDFS defines which EC policies are available and mandates
   that hadoop-hdfs is on the classpath.


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java:
##########
@@ -3862,6 +3869,11 @@ protected EnumSet<CreateFlag> getFlags() {
      */
     @Override
     public FSDataOutputStream build() throws IOException {
+      if (getOptionalKeys().contains(
+          Options.OpenFileOptions.FS_OPTION_OPENFILE_EC_POLICY)) {
+        ecPolicyName(getOptions().get(

Review Comment:
   Just do
   ```
   policy = getOptions().get(FS_OPTION_OPENFILE_EC_POLICY, "")
   if (!policy.isEmpty()) {
     ecPolicyName(policy)
   }
   ```
   no need to look for the keys, just look up the value and act only on whether
   or not it was present.
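For illustration, the suggestion written out as it might sit at the top of the builder's build() method. This is a sketch only: it assumes getOptions() exposes the builder's collected options as a Configuration (as the surrounding diff implies) and that ecPolicyName(String) is the existing setter on HdfsDataOutputStreamBuilder.

```java
// Sketch of the suggested change inside
// DistributedFileSystem.HdfsDataOutputStreamBuilder#build():
// read the option value directly instead of scanning getOptionalKeys().
String policy = getOptions().get(
    Options.OpenFileOptions.FS_OPTION_OPENFILE_EC_POLICY, "");
if (!policy.isEmpty()) {
  // Only record a policy when the caller actually supplied one;
  // an absent key and an empty value are treated the same way.
  ecPolicyName(policy);
}
// ... the rest of the existing build() logic continues unchanged ...
```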

##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WithErasureCoding.java:
##########
+public interface WithErasureCoding {
+
+  /**
+   * Get the EC Policy name of the given file

Review Comment:
   nit: add a `.`


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableFileCopyCommand.java:
##########
@@ -24,18 +24,19 @@
 import java.util.EnumSet;

 import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.hdfs.DistributedFileSystem;
 import org.apache.hadoop.hdfs.protocol.ErasureCodingPolicy;
-import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
+import org.apache.hadoop.hdfs.protocol.SystemErasureCodingPolicies;
 import org.apache.hadoop.tools.DistCpOptions;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.CreateFlag;
+import org.apache.hadoop.fs.WithErasureCoding;

Review Comment:
   nit: move down to keep the imports in alphabetical order within the package


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/RetriableDirectoryCreateCommand.java:
##########
@@ -66,11 +68,12 @@ protected Object doExecute(Object... arguments) throws Exception {
     boolean preserveEC = getFileAttributeSettings(context)
         .contains(DistCpOptions.FileAttribute.ERASURECODINGPOLICY);
     if (preserveEC && sourceStatus.isErasureCoded()
-        && targetFS instanceof DistributedFileSystem) {
-      ErasureCodingPolicy ecPolicy =
-          ((HdfsFileStatus) sourceStatus).getErasureCodingPolicy();
-      DistributedFileSystem dfs = (DistributedFileSystem) targetFS;
-      dfs.setErasureCodingPolicy(target, ecPolicy.getName());
+        && targetFS instanceof WithErasureCoding) {
+      ErasureCodingPolicy ecPolicy = SystemErasureCodingPolicies.getByName(

Review Comment:
   propose logging at debug here


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/WithErasureCoding.java:
##########
+/**
+ * Filesystems that support EC can implement this interface.
+ */
+public interface WithErasureCoding {

Review Comment:
   This is going to be the "strict" definition of the API even if HDFS is the
   reference implementation. Now is the time to define those semantics so that
   if someone else implements it they know what to do - I assume Ozone wants to
   do this?

   Have a look at what HDFS does here and actually define pre/postconditions;
   see if you can think of ways to break HDFS that need tightening (e.g. what if
   a status from a different fs is passed in), and how to handle future changes
   such as different erasure codings across different stores. What happens if
   the source fs and dest fs ever have different codings?

   Also "require" that implementations MUST support the
   FS_OPTION_OPENFILE_EC_POLICY path capability on the path passed to
   setErasureCodingPolicy() in order to set it (and query?).
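As a rough illustration of that requirement from the caller's side, the hypothetical helper below (not part of the PR) probes the path capability before attempting to set a policy. It assumes the FS_OPTION_OPENFILE_EC_POLICY constant added in this PR doubles as the capability string an implementation would advertise.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.WithErasureCoding;

/** Hypothetical caller-side helper; illustrates the capability probe only. */
public final class EcPolicyHelper {

  private EcPolicyHelper() {
  }

  /**
   * Apply an EC policy only if the target filesystem implements
   * WithErasureCoding and declares the corresponding path capability.
   *
   * @return true if the policy was applied, false if the filesystem
   *         does not support it.
   */
  public static boolean trySetPolicy(FileSystem fs, Path path, String policy)
      throws IOException {
    if (fs instanceof WithErasureCoding
        && fs.hasPathCapability(path,
            Options.OpenFileOptions.FS_OPTION_OPENFILE_EC_POLICY)) {
      ((WithErasureCoding) fs).setErasureCodingPolicy(path, policy);
      return true;
    }
    return false;
  }
}
```

The same probe could back a query-side check before calling getErasureCodingPolicyName.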

> Distcp of EC files should not be limited to DFS.
> ------------------------------------------------
>
>                 Key: HDFS-17381
>                 URL: https://issues.apache.org/jira/browse/HDFS-17381
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Sadanand Shenoy
>            Assignee: Sadanand Shenoy
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, EC file support in distcp is limited to DFS: the code checks whether
> the given FS instance is a DistributedFileSystem. Ozone now supports EC as well,
> so this limitation can be removed; any filesystem that supports EC files should
> be able to take part by implementing a few interfaces/methods.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)