INRIX-Mark-Gershaft opened a new issue, #38812:
URL: https://github.com/apache/arrow/issues/38812

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   On Windows, pyarrow is reading a CSV file that contains a very, very, very long field (several million characters).
   Since we don't know the length of the column ahead of time, we keep retrying the read, doubling the buffer size each time.
   The issue is that when the read raises an exception, the file does not appear to be properly closed, leaving an open handle and preventing the temporary file(s) from being removed.
   Python code:
   ```
   import pyarrow as pa
   from pyarrow import csv

   # Retry the read, doubling block_size until the long field fits in one block.
   while True:
       try:
           table = csv.read_csv(csv_file_path,
                                read_options=read_options,
                                parse_options=parse_options,
                                convert_options=convert_options)
           break
       except pa.lib.ArrowInvalid:
           print(f'Doubling CSV block_size from {read_options.block_size}')
           read_options.block_size *= 2
   ```
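   
   A possible workaround sketch (assuming `csv.read_csv` is given an already-open file-like object, which its documented signature accepts, and reusing the `csv_file_path` / options names from above): opening the file explicitly in a `with` block keeps the handle lifetime under the caller's control, so the handle is released even when `ArrowInvalid` is raised.
   ```
   import pyarrow as pa
   from pyarrow import csv

   while True:
       # Open the file ourselves so the OS handle is always closed, even if
       # read_csv raises before pyarrow releases it internally.
       with open(csv_file_path, 'rb') as f:
           try:
               table = csv.read_csv(f,
                                    read_options=read_options,
                                    parse_options=parse_options,
                                    convert_options=convert_options)
               break
           except pa.lib.ArrowInvalid:
               print(f'Doubling CSV block_size from {read_options.block_size}')
               read_options.block_size *= 2
   ```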
   
   This issue is possibly related to https://github.com/apache/arrow/issues/31796, which also appears to involve properly closing file handles on Windows.
   
   ### Component(s)
   
   C++, Python

