morrySnow commented on code in PR #63400:
URL: https://github.com/apache/doris/pull/63400#discussion_r3322797937


##########
fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3OutputStream.java:
##########
@@ -15,48 +15,39 @@
 // specific language governing permissions and limitations
 // under the License.
 
-package org.apache.doris.filesystem.s3;
-
-import org.apache.doris.filesystem.spi.RequestBody;
+package org.apache.doris.filesystem.spi;
 
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
 import java.io.OutputStream;
 
 /**
- * OutputStream that buffers writes in memory and uploads to S3 on close.
+ * OutputStream that buffers writes in memory and uploads to object storage on 
close.
  *
- * <p>This implementation is intentionally simple and suitable for small 
metadata files (manifests,
- * snapshots, job info, etc.). Writes are rejected when the in-memory buffer 
would exceed
- * {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large payloads.
- * For large file writes (Hive data files, Backup archives), multipart upload 
must be used instead.
+ * <p>This implementation is intentionally simple and suitable for small 
metadata files
+ * (manifests, snapshots, job info, etc.). Writes are rejected when the 
in-memory buffer
+ * would exceed {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large 
payloads.
+ * For large file writes (Hive data files, Backup archives), multipart upload 
must be
+ * used instead.
  *
- * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is 
called without any
- * preceding {@code write(...)} call, NO object is uploaded. This avoids 
polluting the bucket
- * with phantom 0-byte placeholders when the caller opens an output stream and 
aborts before
- * writing. To explicitly create a zero-byte object, call {@code write(new 
byte[0])} (or
- * {@code write(b, off, 0)}) prior to {@link #close()} — any write call, even 
of length 0,
- * marks the stream as "written" and triggers an upload of the empty buffer.
+ * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is 
called without
+ * any preceding {@code write(...)} call, no object is uploaded. This avoids 
polluting the
+ * bucket with phantom 0-byte placeholders when the caller opens an output 
stream and
+ * aborts before writing. To explicitly create a zero-byte object, call
+ * {@code write(new byte[0])} or {@code write(b, off, 0)} before {@link 
#close()}.
  */
-class S3OutputStream extends OutputStream {
+public class ObjectStorageOutputStream extends OutputStream {
 
-    /**
-     * Maximum in-memory buffer size before writes are rejected.
-     * S3 single-PUT limit is 5 GB, but we enforce a much smaller guard (256 
MB) so that an
-     * accidental large-file write fails early with a clear message rather 
than OOMing silently.
-     */

Review Comment:
   why remove this comment?



##########
fe/fe-filesystem/fe-filesystem-api/src/main/java/org/apache/doris/filesystem/capability/BatchDeleteCapability.java:
##########
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.capability;
+
+import org.apache.doris.filesystem.Location;
+
+import java.io.IOException;
+import java.util.List;
+
+/**
+ * Capability for deleting multiple file locations through a 
provider-optimized path.
+ */
+public interface BatchDeleteCapability {

Review Comment:
   maybe all Capability interface should extends from a interface named 
`Capability` and update `requireCapability`'s paramter to `Class<? extends 
Capability>` in `FileSystem`.



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectStorageFileSystem.java:
##########
@@ -0,0 +1,1086 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.spi;
+
+import org.apache.doris.filesystem.DorisInputFile;
+import org.apache.doris.filesystem.DorisInputStream;
+import org.apache.doris.filesystem.DorisOutputFile;
+import org.apache.doris.filesystem.FileEntry;
+import org.apache.doris.filesystem.FileIterator;
+import org.apache.doris.filesystem.GlobListing;
+import org.apache.doris.filesystem.Location;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.file.FileAlreadyExistsException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.regex.Pattern;
+
+/* Object-storage-backed FileSystem implementation for the Doris FE filesystem 
SPI.
+ * Does not depend on fe-core, fe-common, or fe-catalog.
+ */
+public class ObjectStorageFileSystem extends ObjFileSystem {

Review Comment:
   The names "ObjectStorageFileSystem" and "ObjFileSystem" are too similar. 
More distinct and descriptive names that reflect their actual functions are 
needed.



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectStorageOutputStream.java:
##########
@@ -84,7 +75,6 @@ public void close() throws IOException {
         }
         closed = true;
         if (!writeCalled) {
-            // No write call was ever made: skip upload to avoid creating a 
phantom 0-byte object.

Review Comment:
   why remove this comment?



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectStorageFileSystem.java:
##########
@@ -0,0 +1,1086 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.spi;
+
+import org.apache.doris.filesystem.DorisInputFile;
+import org.apache.doris.filesystem.DorisInputStream;
+import org.apache.doris.filesystem.DorisOutputFile;
+import org.apache.doris.filesystem.FileEntry;
+import org.apache.doris.filesystem.FileIterator;
+import org.apache.doris.filesystem.GlobListing;
+import org.apache.doris.filesystem.Location;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.file.FileAlreadyExistsException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.regex.Pattern;
+
+/* Object-storage-backed FileSystem implementation for the Doris FE filesystem 
SPI.
+ * Does not depend on fe-core, fe-common, or fe-catalog.
+ */
+public class ObjectStorageFileSystem extends ObjFileSystem {

Review Comment:
   maybe need to rename to S3CompatibleFileSystem. it should be abstract



##########
fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3OutputStream.java:
##########
@@ -15,48 +15,39 @@
 // specific language governing permissions and limitations
 // under the License.
 
-package org.apache.doris.filesystem.s3;
-
-import org.apache.doris.filesystem.spi.RequestBody;
+package org.apache.doris.filesystem.spi;
 
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
 import java.io.OutputStream;
 
 /**
- * OutputStream that buffers writes in memory and uploads to S3 on close.
+ * OutputStream that buffers writes in memory and uploads to object storage on 
close.
  *
- * <p>This implementation is intentionally simple and suitable for small 
metadata files (manifests,
- * snapshots, job info, etc.). Writes are rejected when the in-memory buffer 
would exceed
- * {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large payloads.
- * For large file writes (Hive data files, Backup archives), multipart upload 
must be used instead.
+ * <p>This implementation is intentionally simple and suitable for small 
metadata files
+ * (manifests, snapshots, job info, etc.). Writes are rejected when the 
in-memory buffer
+ * would exceed {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large 
payloads.
+ * For large file writes (Hive data files, Backup archives), multipart upload 
must be
+ * used instead.
  *
- * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is 
called without any
- * preceding {@code write(...)} call, NO object is uploaded. This avoids 
polluting the bucket
- * with phantom 0-byte placeholders when the caller opens an output stream and 
aborts before
- * writing. To explicitly create a zero-byte object, call {@code write(new 
byte[0])} (or
- * {@code write(b, off, 0)}) prior to {@link #close()} — any write call, even 
of length 0,
- * marks the stream as "written" and triggers an upload of the empty buffer.
+ * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is 
called without
+ * any preceding {@code write(...)} call, no object is uploaded. This avoids 
polluting the
+ * bucket with phantom 0-byte placeholders when the caller opens an output 
stream and
+ * aborts before writing. To explicitly create a zero-byte object, call
+ * {@code write(new byte[0])} or {@code write(b, off, 0)} before {@link 
#close()}.
  */
-class S3OutputStream extends OutputStream {
+public class ObjectStorageOutputStream extends OutputStream {
 
-    /**
-     * Maximum in-memory buffer size before writes are rejected.
-     * S3 single-PUT limit is 5 GB, but we enforce a much smaller guard (256 
MB) so that an
-     * accidental large-file write fails early with a clear message rather 
than OOMing silently.
-     */
     private static final long MAX_SINGLE_UPLOAD_BYTES = 256L * 1024 * 1024; // 
256 MB
 
     private final String remotePath;
-    private final S3ObjStorage objStorage;
+    private final ObjStorage<?> objStorage;
     private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
     private boolean closed = false;
-    // Tracks whether write(...) was called at least once. close() skips the 
upload entirely
-    // when this is false to avoid creating phantom 0-byte objects on 
accidental empty close.

Review Comment:
   why remove this comment?



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjStorage.java:
##########


Review Comment:
   if this is a S3 style API interface, it should reflacted on the interface's 
name



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectListOptions.java:
##########
@@ -0,0 +1,87 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.spi;
+
+/**
+ * Optional controls for object listing.
+ */
+public final class ObjectListOptions {
+
+    private final String continuationToken;
+    private final String startAfter;
+    private final int maxKeys;
+    private final String delimiter;

Review Comment:
   add comment to these attribute



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectStorageFileSystem.java:
##########
@@ -0,0 +1,1086 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.spi;
+
+import org.apache.doris.filesystem.DorisInputFile;
+import org.apache.doris.filesystem.DorisInputStream;
+import org.apache.doris.filesystem.DorisOutputFile;
+import org.apache.doris.filesystem.FileEntry;
+import org.apache.doris.filesystem.FileIterator;
+import org.apache.doris.filesystem.GlobListing;
+import org.apache.doris.filesystem.Location;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.file.FileAlreadyExistsException;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.regex.Pattern;
+
+/* Object-storage-backed FileSystem implementation for the Doris FE filesystem 
SPI.
+ * Does not depend on fe-core, fe-common, or fe-catalog.
+ */
+public class ObjectStorageFileSystem extends ObjFileSystem {
+
+    // Object stores do not have real directories; use a zero-byte marker with 
trailing slash.
+    private static final String DIR_MARKER_SUFFIX = "/";
+
+    private final boolean usePathStyle;
+
+    protected ObjectStorageFileSystem(String name, ObjStorage<?> objStorage, 
boolean usePathStyle) {
+        super(name, objStorage);
+        this.usePathStyle = usePathStyle;
+    }
+
+    @Override
+    protected boolean isNotFoundError(IOException e) {
+        return e instanceof java.io.FileNotFoundException;
+    }

Review Comment:
   same as super class



##########
fe/fe-filesystem/fe-filesystem-spi/src/main/java/org/apache/doris/filesystem/spi/ObjectStorageUri.java:
##########
@@ -0,0 +1,97 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.filesystem.spi;
+
+/**
+ * Parsed object-storage URI. Extracts bucket and key without any fe-core 
dependency.
+ */
+public final class ObjectStorageUri {
+
+    private final String bucket;
+    private final String key;
+
+    private ObjectStorageUri(String bucket, String key) {
+        this.bucket = bucket;
+        this.key = key;
+    }
+
+    /**
+     * Parses URIs of the form: scheme://bucket/key.
+     *
+     * <p>When {@code pathStyleAccess} is {@code true} <strong>and</strong> 
the scheme is
+     * {@code http} or {@code https}, the URI is treated as path-style:
+     * {@code scheme://endpoint/bucket/key}. The first path segment after the 
host is the
+     * bucket, the remainder is the key. For all other schemes the first 
authority
+     * component is always treated as the bucket, regardless of {@code 
pathStyleAccess}.
+     */
+    public static ObjectStorageUri parse(String path, boolean pathStyleAccess) 
{

Review Comment:
   There is an implicit dependency here, which requires that the path be 
escaped. Please note this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to