OussamaSaoudi commented on code in PR #12981:
URL: https://github.com/apache/datafusion/pull/12981#discussion_r1817985620


##########
datafusion/core/src/execution/context/mod.rs:
##########
@@ -114,17 +114,20 @@ impl DataFilePaths for &String {
     }
 }
 
-impl<P> DataFilePaths for Vec<P>
-where
-    P: AsRef<str>,
-{
+impl DataFilePaths for Vec<&str> {

Review Comment:
   I'm not sure there's a clean way to implement this change without a breaking 
existing code.  Here are a few things I considered: 
   - I briefly considered  changing the `as_str` impl of `ListingTableChanges` 
to make it succeed with a `parse`, but this is both a breaking change and 
pretty silly.
   - I tried using a blanket implement using TryInto, and use 
`ListingTableUrl::parse` as a `TryInto<ListingTableUrl>` impl. However I hit 
some issues:
   ```rust
   impl<P> DataFilePaths for Vec<P>
   where P: TryInto<ListingTableUrl, Error=DataFusionError>
   ```
   The problem is the associated type for `TryInto`. A single type 
(ex`ListingTableUrl`) can implement `TryInto<ListingTableUrl, 
Error=Infallible>` , or `TryInto<ListingTableUrl, Error=DataFusionError>` (ex: 
through `parse`). This of course leads to conflicting types for my blanket impl 
approach.
   - I looked through `SessionContext::read_parquet`. We can't change the use 
of `DataFilePaths::to_urls()` since that's baked into the API:
   ```rust
       pub async fn read_parquet<P: DataFilePaths>(
           &self,
           table_paths: P,
           options: ParquetReadOptions<'_>,
   ```
   I would've preferred to separate the concerns of parsing and fetching 
parquet. An approach that parses strings into `ListingTableUrl` before calling 
`read_parquet` would be my preferred solution.
   
   If DataFusion is not looking to change these APIs (understandably), I can go 
ahead and close this PR :)
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to