thisisnic opened a new pull request, #12839:
URL: https://github.com/apache/arrow/pull/12839

   As discussed on #12826 
   
   I'm not sure how (or whether) to write tests for this, but I ran it locally against the CSV directory set up in `test-dataset-csv.R`, with and without this change. Without it, we get, e.g.:
   
   ```
   open_dataset(csv_dir)
   # Error in `handle_parquet_io_error()` at r/R/dataset.R:221:6:
   # ! Invalid: Error creating dataset. Could not read schema from '/tmp/RtmpuTyOD8/file5049dcf581a5/5/file1.csv': Could not open Parquet input source '/tmp/RtmpuTyOD8/file5049dcf581a5/5/file1.csv': Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
   # /home/nic2/arrow/cpp/src/arrow/dataset/file_parquet.cc:323  GetReader(source, scan_options). Is this a 'parquet' file?
   # /home/nic2/arrow/cpp/src/arrow/dataset/discovery.cc:40  InspectSchemas(std::move(options))
   # /home/nic2/arrow/cpp/src/arrow/dataset/discovery.cc:262  Inspect(options.inspect_options)
   # ℹ Did you mean to specify a 'format' other than the default (parquet)?
   ```
   
   With the change, the error is attributed to `open_dataset()` instead:
   
   ```
   open_dataset(csv_dir)
   # Error in `open_dataset()`:
   # ! Invalid: Error creating dataset. Could not read schema from '/tmp/RtmpLbqZs6/file4e4ca14fb5795/5/file1.csv': Could not open Parquet input source '/tmp/RtmpLbqZs6/file4e4ca14fb5795/5/file1.csv': Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.
   # /home/nic2/arrow/cpp/src/arrow/dataset/file_parquet.cc:323  GetReader(source, scan_options). Is this a 'parquet' file?
   # /home/nic2/arrow/cpp/src/arrow/dataset/discovery.cc:40  InspectSchemas(std::move(options))
   # /home/nic2/arrow/cpp/src/arrow/dataset/discovery.cc:262  Inspect(options.inspect_options)
   # ℹ Did you mean to specify a 'format' other than the default (parquet)?
   ```

