trueleo opened a new issue, #8068:
URL: https://github.com/apache/arrow-datafusion/issues/8068

   ### Describe the bug
   
   With DataFusion 32, any listing scan seems to fail with `ObjectStore NotFound` when using a `LocalFileSystem` whose root path is not `/`. Somewhere during the query/scan flow, the root specified in the local fs object store is unconditionally applied to all parsed `ListingTableUrl` paths, so the root ends up in the resolved path twice.
   
   ```
   thread 'main' panicked at src/main.rs:45:14:
   called `Result::unwrap()` on an `Err` value: ObjectStore(NotFound { path: "/home/trueleo/rust/datafusion-localfs-test/home/trueleo/rust/datafusion-localfs-test/data.parquet", source: Os { code: 2, kind: NotFound, message: "No such file or directory" } })
   ```
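   
   The doubled path in the panic message is what one would get if the full URL path were handed to the prefixed store as a store-relative path. The std-only sketch below illustrates that under stated assumptions: `resolve_in_store` is a hypothetical stand-in, not the actual `object_store` or DataFusion code, and it assumes the store drops the leading `/` and appends the rest to its root.

   ```rust
   use std::path::{Path, PathBuf};

   /// Hypothetical stand-in for a prefixed local store: object paths are treated
   /// as relative to the store, so they are appended to the store root.
   fn resolve_in_store(root: &Path, object_path: &str) -> PathBuf {
       // A leading `/` is dropped before joining onto the root.
       root.join(object_path.trim_start_matches('/'))
   }

   fn main() {
       let root = Path::new("/home/trueleo/rust/datafusion-localfs-test");
       // Path component of the `file://` URL used in the reproduction.
       let url_path = "/home/trueleo/rust/datafusion-localfs-test/data.parquet";
       // The URL path already contains the root, so the root appears twice.
       println!("{}", resolve_in_store(root, url_path).display());
   }
   ```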
   
   This does not seem to be the case for S3 paths, where `s3://bucket` is used to find the right object store for a listed file; a path prefixed with it seems to work just fine.
   
   In the example below, I have manually registered a `LocalFileSystem` with its root path set to the example project's directory, where I have a `data.parquet` file.
   
   ### To Reproduce
   
   ```rust 
   
   use std::{env::current_dir, sync::Arc};
   
   use datafusion::{
       arrow::util::pretty::pretty_format_batches,
       datasource::{
           file_format::parquet::ParquetFormat,
           listing::{ListingOptions, ListingTable, ListingTableConfig, ListingTableUrl},
       },
       execution::{
           object_store::{DefaultObjectStoreRegistry, ObjectStoreRegistry},
           runtime_env::{RuntimeConfig, RuntimeEnv},
       },
       prelude::{SessionConfig, SessionContext},
   };
   use object_store::local::LocalFileSystem;
   
   #[tokio::main]
   async fn main() {
       let project_dir = current_dir().unwrap();
       let url = url::Url::parse(&format!("file://{}", project_dir.display())).unwrap();
       let store = Arc::new(LocalFileSystem::new_with_prefix(&project_dir).unwrap());
   
       let object_store_registry = Arc::new(DefaultObjectStoreRegistry::new());
       object_store_registry.register_store(&url, store);
   
       let runtime = Arc::new(
           RuntimeEnv::new(RuntimeConfig::new().with_object_store_registry(object_store_registry))
               .unwrap(),
       );
   
       let ctx = SessionContext::new_with_config_rt(SessionConfig::default(), runtime);
   
       let listing_options =
           ListingOptions::new(Arc::new(ParquetFormat::new())).with_file_extension(".parquet");
   
       let table = Arc::new(
           ListingTable::try_new(
               ListingTableConfig::new(
                   ListingTableUrl::parse(
                       "file:///home/trueleo/rust/datafusion-localfs-test/data.parquet",
                   )
                   .unwrap(),
               )
               .with_listing_options(listing_options)
               .infer_schema(&ctx.state())
               .await
               .unwrap(),
           )
           .unwrap(),
       );
   
       ctx.register_table("table", table).unwrap();
   
       let df = ctx.sql("select count(*) from table").await.unwrap();
       let res = df.collect().await.unwrap();
       println!("{}", pretty_format_batches(&res).unwrap())
   }
   
   ```
   
   ### Expected behavior
   
   The query should just work, given the registered object store prefix and the absolute path provided.
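   
   One way to picture the expected resolution is to strip the store's root from the URL path before the store applies its root again. This is only a std-only sketch of that idea, not a proposed patch: `resolve_under_root` is a hypothetical helper, and it assumes the registered URL and the store prefix point at the same directory.

   ```rust
   use std::path::{Path, PathBuf};

   /// Hypothetical helper: resolve the path component of a `file://` URL
   /// against a store rooted at `root`, without applying the root twice.
   fn resolve_under_root(root: &Path, url_path: &str) -> Option<PathBuf> {
       // Drop the store root from the URL path; `None` means the path lies
       // outside the registered store.
       let relative = Path::new(url_path).strip_prefix(root).ok()?;
       Some(root.join(relative))
   }

   fn main() {
       let root = Path::new("/home/trueleo/rust/datafusion-localfs-test");
       let url_path = "/home/trueleo/rust/datafusion-localfs-test/data.parquet";
       match resolve_under_root(root, url_path) {
           Some(path) => println!("{}", path.display()),
           None => println!("path is outside the registered store root"),
       }
   }
   ```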
   
   
   
   ### Additional context
   
   
   I've also tried the following paths, but the same error still occurs:
   - `/home/trueleo/rust/datafusion-localfs-test/data.parquet`
   - `data.parquet`
   

