devinjdangelo commented on code in PR #9594:
URL: https://github.com/apache/arrow-datafusion/pull/9594#discussion_r1523217543


##########
datafusion/sql/src/statement.rs:
##########
@@ -855,6 +855,26 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         let file_type = try_infer_file_type(&mut options, &statement.target)?;
         let partition_by = take_partition_by(&mut options);
 
+        match &file_type {
+            // Renames un-prefixed keys to support legacy format specific options
+            FileType::CSV | FileType::JSON | FileType::PARQUET => {
+                let prefix = format!("{}", file_type);
+                let keys_to_rename: Vec<_> = options
+                    .keys()
+                    .filter(|key| !key.starts_with(&prefix))
+                    .cloned()
+                    .collect();
+
+                for key in keys_to_rename {
+                    if let Some(value) = options.remove(&key) {
+                        let new_key = format!("{}.{}", prefix, key);
+                        options.insert(new_key, value);
+                    }
+                }
+            }
+            _ => {}
+        }
+

Review Comment:
   The overall strategy here looks good, but I propose two changes:
   
   1. We can make the logic more concise by modifying options in a single pass with into_iter, map, and collect. This also avoids the clones.
   2. I think we should be a bit more conservative about the keys we modify. If the key has any namespace at all (rather than specifically the file format namespace), we can leave it as is. Downstream users may, for example, add a custom namespace, and we wouldn't want this code to modify it unexpectedly.
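To illustrate point 2, here is a minimal sketch contrasting the two rename rules. The function names are hypothetical and not part of DataFusion, and the prefix is passed as a plain string (the sketch assumes the file type displays as something like `csv`).

```rust
/// Rule in the current diff: rename every key that does not start with
/// the file format prefix. (Hypothetical helper for illustration only.)
fn renamed_by_prefix_rule(key: &str, prefix: &str) -> bool {
    !key.starts_with(prefix)
}

/// Rule proposed in this review: rename only keys that carry no
/// namespace at all, i.e. contain no '.' separator.
/// (Hypothetical helper for illustration only.)
fn renamed_by_namespace_rule(key: &str) -> bool {
    !key.contains('.')
}
```

Under the first rule a key such as `custom.option` would be rewritten (to `csv.custom.option` for a CSV target), while the second rule leaves it untouched. Both rules agree on a bare key such as `delimiter`.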



##########
datafusion/sql/src/statement.rs:
##########
@@ -855,6 +855,26 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         let file_type = try_infer_file_type(&mut options, &statement.target)?;
         let partition_by = take_partition_by(&mut options);
 
+        match &file_type {
+            // Renames un-prefixed keys to support legacy format specific options
+            FileType::CSV | FileType::JSON | FileType::PARQUET => {
+                let prefix = format!("{}", file_type);
+                let keys_to_rename: Vec<_> = options
+                    .keys()
+                    .filter(|key| !key.starts_with(&prefix))
+                    .cloned()
+                    .collect();
+
+                for key in keys_to_rename {
+                    if let Some(value) = options.remove(&key) {
+                        let new_key = format!("{}.{}", prefix, key);
+                        options.insert(new_key, value);
+                    }
+                }
+            }
+            _ => {}
+        }
+

Review Comment:
   ```suggestion
           let options = match &file_type {
               // Renames un-prefixed keys to support legacy format specific options
               FileType::CSV | FileType::JSON | FileType::PARQUET => {
                   options
                       .into_iter()
                       .map(|(k, v)| {
                           // If config does not belong to any namespace, assume it is
                           // a legacy option and apply file_type namespace for backwards
                           // compatibility.
                           if !k.contains('.') {
                               let new_key = format!("{}.{}", file_type, k);
                               (new_key, v)
                           } else {
                               (k, v)
                           }
                       })
                       .collect()
               }
               _ => options,
           };
   ```
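For anyone who wants to try the behavior outside the planner, the same single-pass rewrite can be sketched on a plain `HashMap<String, String>`. This is a standalone approximation, not the DataFusion code: `apply_legacy_namespace` is a hypothetical name, and the `prefix` argument stands in for the displayed `file_type`.

```rust
use std::collections::HashMap;

/// Consumes the map and rebuilds it in one pass, so no key or value is
/// cloned. Keys without a '.' are treated as legacy options and moved
/// under `prefix`; keys that already have a namespace are kept as is.
/// (Hypothetical helper; the real options type may differ.)
fn apply_legacy_namespace(
    options: HashMap<String, String>,
    prefix: &str,
) -> HashMap<String, String> {
    options
        .into_iter()
        .map(|(k, v)| {
            if k.contains('.') {
                (k, v)
            } else {
                (format!("{}.{}", prefix, k), v)
            }
        })
        .collect()
}
```

With a prefix of `csv`, the input keys `delimiter`, `csv.has_header`, and `custom.flag` come out as `csv.delimiter`, `csv.has_header`, and `custom.flag`. One caveat of this sketch: if the input holds both `delimiter` and `csv.delimiter`, the two collide after the rewrite, and which value survives depends on the map's iteration order.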


