westonpace commented on code in PR #13390:
URL: https://github.com/apache/arrow/pull/13390#discussion_r900279339


##########
cpp/src/arrow/engine/substrait/relation_internal.cc:
##########
@@ -106,25 +106,37 @@ Result<compute::Declaration> FromProto(const 
substrait::Rel& rel,
           path = item.uri_path_glob();
         }
 
+        ::arrow::internal::Uri uri;
+        RETURN_NOT_OK(uri.Parse(path));
         if (item.format() ==
             substrait::ReadRel::LocalFiles::FileOrFiles::FILE_FORMAT_PARQUET) {
           format = std::make_shared<dataset::ParquetFileFormat>();
-        } else if (util::string_view{path}.ends_with(".arrow")) {
+        } else if (uri.extension() == ".arrow") {

Review Comment:
   This "extension comparison" mechanism is short-lived.  The latest version of 
Substrait supports specifying IPC as a file format.  However, I'm interested in 
the thought of supporting `.arrows` as that would imply we are reading files 
from a filesystem where the files were written with the streaming mode.  Do you 
think that is a use case we might encounter?
   
   I purposely did not include support for the streaming mode when adding the 
Substrait support for IPC to LocalFiles because I didn't think it would make 
sense for a dataset of files stored on disk to have IPC files written with the 
streaming mode (if you're going to be writing to disk, you might as well use 
the file mode).  Do you think this is a situation we might encounter?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to