jorisvandenbossche commented on PR #37821:
URL: https://github.com/apache/arrow/pull/37821#issuecomment-1782496266

   @joemarshall there is a Python test failure that happens consistently across the different builds and across the most recent commits:
   
   ```
   __________________ TestThreadedCSVTableRead.test_cancellation __________________
   
   self = <pyarrow.tests.test_csv.TestThreadedCSVTableRead object at 0x7f1cf2cbe760>
   
       def test_cancellation(self):
           if (threading.current_thread().ident !=
                   threading.main_thread().ident):
               pytest.skip("test only works from main Python thread")
           # Skips test if not available
           raise_signal = util.get_raise_signal()
           signum = signal.SIGINT
       
           def signal_from_thread():
               # Give our workload a chance to start up
               time.sleep(0.2)
               raise_signal(signum)
       
           # We start with a small CSV reading workload and increase its size
           # until it's large enough to get an interruption during it, even in
           # release mode on fast machines.
           last_duration = 0.0
           workload_size = 100_000
           attempts = 0
       
           while last_duration < 5.0 and attempts < 10:
               print("workload size:", workload_size)
               large_csv = b"a,b,c\n" + b"1,2,3\n" * workload_size
               exc_info = None
       
               try:
                   # We use a signal fd to reliably ensure that the signal
                   # has been delivered to Python, regardless of how exactly
                   # it was caught.
                   with util.signal_wakeup_fd() as sigfd:
                       try:
                           t = threading.Thread(target=signal_from_thread)
                           t.start()
                           t1 = time.time()
                           try:
   >                           self.read_bytes(large_csv)
   
   
opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_csv.py:1419:
 
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ 
   
   opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_csv.py:689: in read_bytes
       return self.read_csv(pa.py_buffer(b), **kwargs)
   
   opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_csv.py:684: in read_csv
       table = read_csv(csv, *args, **kwargs)
   pyarrow/_csv.pyx:1262: in pyarrow._csv.read_csv
       ???
   pyarrow/_csv.pyx:1271: in pyarrow._csv.read_csv
       ???
   pyarrow/error.pxi:154: in pyarrow.lib.pyarrow_internal_check_status
       ???
   _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ 
   
   >   ???
   E   pyarrow.lib.ArrowCancelled: Operation cancelled. Detail: received signal 2
   
   pyarrow/error.pxi:91: ArrowCancelled
   ```
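
To poke at this outside the Arrow test suite, here is a minimal stdlib-only sketch of the mechanism the test relies on: a helper thread raises `SIGINT` while the main thread is busy, and the main thread is expected to see it as `KeyboardInterrupt`. It does not use pyarrow at all. The function name `interrupt_during_work`, its parameters, and the busy loop standing in for the CSV read are all hypothetical, chosen only for illustration.

```python
import signal
import threading
import time


def interrupt_during_work(delay=0.05, timeout=5.0):
    """Return True if SIGINT interrupted the busy loop, False on timeout.

    Hypothetical helper: the busy loop stands in for the CSV read.
    """
    if threading.current_thread() is not threading.main_thread():
        # Same restriction as the test: handlers live on the main thread.
        raise RuntimeError("only works from the main Python thread")

    def signal_from_thread():
        # Give the workload a chance to start up
        time.sleep(delay)
        signal.raise_signal(signal.SIGINT)

    # Make sure SIGINT maps to KeyboardInterrupt while we run.
    previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    t = threading.Thread(target=signal_from_thread)
    try:
        try:
            t.start()
            deadline = time.monotonic() + timeout
            while time.monotonic() < deadline:
                sum(range(1000))  # pure-Python work, so signals get checked
        except KeyboardInterrupt:
            return True
        return False
    finally:
        t.join()
        # signal.signal() returns None if the old handler was not set
        # from Python; in that case there is nothing we can restore.
        if previous is not None:
            signal.signal(signal.SIGINT, previous)
```

Calling `interrupt_during_work()` from a plain script should return `True` almost immediately. The pyarrow case differs in that the signal arrives while native code is running, which is where the `ArrowCancelled` status in the traceback above comes from.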

