CalvinKirs commented on code in PR #61031:
URL: https://github.com/apache/doris/pull/61031#discussion_r2958773610
##########
be/src/common/config.cpp:
##########
@@ -1523,6 +1523,41 @@ DEFINE_mInt64(hive_sink_max_file_size, "1073741824"); // 1GB
/** Iceberg sink configurations **/
DEFINE_mInt64(iceberg_sink_max_file_size, "1073741824"); // 1GB
+// URI scheme to Doris file type mappings used by paimon-cpp DorisFileSystem.
+// Each entry uses the format "<scheme>=<file_type>", and file_type must be one of:
+// local, hdfs, s3, http, broker.
+DEFINE_Strings(paimon_file_system_scheme_mappings,
+ "file=local,hdfs=hdfs,viewfs=hdfs,local=hdfs,jfs=hdfs,"
+ "s3=s3,s3a=s3,s3n=s3,oss=s3,obs=s3,cos=s3,cosn=s3,gs=s3,"
+ "abfs=s3,abfss=s3,wasb=s3,wasbs=s3,http=http,https=http,"
+ "ofs=broker,gfs=broker");
+DEFINE_Validator(paimon_file_system_scheme_mappings,
+ ([](const std::vector<std::string>& mappings) -> bool {
+ doris::StringCaseUnorderedSet seen_schemes;
+ static const doris::StringCaseUnorderedSet supported_types = {
+ "local", "hdfs", "s3", "http", "broker"};
+ for (const auto& raw_entry : mappings) {
+ std::string_view entry = doris::trim(raw_entry);
+ size_t separator = entry.find('=');
+ if (separator == std::string_view::npos) {
+ return false;
+ }
+ std::string scheme = std::string(doris::trim(entry.substr(0, separator)));
+ std::string file_type =
+ std::string(doris::trim(entry.substr(separator + 1)));
Review Comment:
With Paimon JNI/Split execution, the FE plans only at the split level, while
the BE resolves the actual file paths at runtime. This means the BE currently
needs its own path translation logic to stay consistent with the FE. We should
consolidate that logic into a shared module going forward.
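
To make the idea of a shared module concrete, here is a minimal sketch of a standalone parser for the `<scheme>=<file_type>` entries. It is not the PR's code: `trim_view`, `to_lower`, and `parse_scheme_mappings` are hypothetical names, and because the validator in the hunk above is cut off, the rejection rules (empty scheme, unsupported file type, duplicate scheme) are assumptions inferred from `seen_schemes` and `supported_types`.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not doris::trim): strip ASCII whitespace from both ends.
inline std::string_view trim_view(std::string_view s) {
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.front()))) {
        s.remove_prefix(1);
    }
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.back()))) {
        s.remove_suffix(1);
    }
    return s;
}

// Hypothetical helper: ASCII lower-casing so that "S3A" and "s3a" compare equal.
inline std::string to_lower(std::string_view s) {
    std::string out(s);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Parses "<scheme>=<file_type>" entries into a scheme -> file_type map.
// Returns std::nullopt when an entry has no '=', an empty scheme, an
// unsupported file type, or a scheme that was already seen (assumed rules).
inline std::optional<std::unordered_map<std::string, std::string>> parse_scheme_mappings(
        const std::vector<std::string>& mappings) {
    static const std::unordered_set<std::string> supported_types = {
            "local", "hdfs", "s3", "http", "broker"};
    std::unordered_map<std::string, std::string> result;
    for (const auto& raw_entry : mappings) {
        std::string_view entry = trim_view(raw_entry);
        std::size_t separator = entry.find('=');
        if (separator == std::string_view::npos) {
            return std::nullopt;
        }
        std::string scheme = to_lower(trim_view(entry.substr(0, separator)));
        std::string file_type = to_lower(trim_view(entry.substr(separator + 1)));
        if (scheme.empty() || supported_types.count(file_type) == 0) {
            return std::nullopt;
        }
        // emplace() reports false in .second when the scheme is already mapped.
        if (!result.emplace(scheme, file_type).second) {
            return std::nullopt;
        }
    }
    return result;
}
```

Returning the parsed map instead of a bare `bool` would let the config validator and the runtime path translation share one code path, which is the consolidation suggested above.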
##########
be/src/format/table/paimon_doris_file_system.cpp:
##########
@@ -149,7 +163,7 @@ std::string normalize_path_for_type(const std::string& path, const std::string&
if (type == doris::TFileType::FILE_LOCAL) {
return normalize_local_path(path);
}
- if (type == doris::TFileType::FILE_S3 && scheme != "s3" && !is_http_scheme(scheme)) {
+ if (type == doris::TFileType::FILE_S3 && scheme != "s3") {
return replace_scheme(path, "s3");
}
Review Comment:
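Why was the `!is_http_scheme(scheme)` check dropped? Without it, an `http://` or `https://` path whose type resolves to `FILE_S3` will have its scheme replaced with `s3`.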
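
A small sketch of the behavioral difference, for reference. The helper bodies below are hypothetical stand-ins written for illustration (the real `is_http_scheme` and `replace_scheme` are not shown in this hunk), and `scheme_of`, `normalize_s3_path`, and the `keep_http_guard` flag exist only in this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in: returns the text before "://", or "" if there is none.
inline std::string scheme_of(const std::string& path) {
    const auto pos = path.find("://");
    return pos == std::string::npos ? std::string() : path.substr(0, pos);
}

// Hypothetical stand-in for the helper named in the diff.
inline bool is_http_scheme(const std::string& scheme) {
    return scheme == "http" || scheme == "https";
}

// Hypothetical stand-in: swaps the scheme and keeps everything from "://" on.
inline std::string replace_scheme(const std::string& path, const std::string& new_scheme) {
    const auto pos = path.find("://");
    if (pos == std::string::npos) {
        return path;
    }
    return new_scheme + path.substr(pos);
}

// Models only the FILE_S3 branch. keep_http_guard == true mirrors the removed
// line of the diff, keep_http_guard == false mirrors the added line.
inline std::string normalize_s3_path(const std::string& path, bool keep_http_guard) {
    const std::string scheme = scheme_of(path);
    const bool guarded = keep_http_guard && is_http_scheme(scheme);
    if (scheme != "s3" && !guarded) {
        return replace_scheme(path, "s3");
    }
    return path;
}
```

With the guard, `https://bucket.example.com/key` is returned unchanged; without it, the same input becomes `s3://bucket.example.com/key`. Whether that rewrite is intended depends on whether an `http`/`https` path can still reach this branch with `FILE_S3` under the new mappings.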
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]