dxdc commented on issue #43892: URL: https://github.com/apache/arrow/issues/43892#issuecomment-2353695282
@amol- @rok any chance you're able to take a look at this issue? It's very simple to reproduce. The simplest reproduction is to run this stripped-down script against this file:

- [sample.txt](https://github.com/user-attachments/files/16821794/sample.txt)

```py
import pyarrow.csv as pv

# With use_threads=False, Python does not hang
read_options = pv.ReadOptions(encoding="big5", use_threads=True)
parse_options = pv.ParseOptions(delimiter="|")

with open("sample.txt", "rb") as f:
    table = pv.read_csv(f, read_options=read_options, parse_options=parse_options)
```

There appears to be a bug in how pyarrow's CSV reader uses threads: the script hangs with `use_threads=True` but not with `use_threads=False`. I now have an additional file I can use for testing on my side. I'm also willing to dig into it if you have a sense of where the issue may lie.
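For anyone triaging this, here is a rough sketch of one way to check for the hang without blocking a terminal: run the reproduction in a child interpreter and treat a timeout as a hang. The helper names `hangs` and `report`, the `REPRO` template, and the 30-second timeout are my own assumptions for illustration, not part of pyarrow. It also assumes `sample.txt` is in the working directory.

```py
import subprocess
import sys

# Same reproduction as in the issue, with use_threads left as a placeholder.
REPRO = """
import pyarrow.csv as pv
read_options = pv.ReadOptions(encoding="big5", use_threads={use_threads})
parse_options = pv.ParseOptions(delimiter="|")
with open("sample.txt", "rb") as f:
    pv.read_csv(f, read_options=read_options, parse_options=parse_options)
"""


def hangs(snippet: str, timeout_s: float = 30.0) -> bool:
    """Run `snippet` in a fresh interpreter.

    Returns True if it is still running after `timeout_s` seconds (the child
    is then killed). A snippet that exits early, even with an error such as a
    missing file, counts as "did not hang".
    """
    try:
        subprocess.run(
            [sys.executable, "-c", snippet],
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def report() -> None:
    """Print whether the reproduction hangs for each use_threads setting."""
    for use_threads in (True, False):
        result = hangs(REPRO.format(use_threads=use_threads))
        print(f"use_threads={use_threads}: hangs={result}")
```

Calling `report()` next to `sample.txt` should show whether the two settings behave differently. The timeout is only a heuristic, so a very slow but finite read would also be reported as a hang.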
