nateprewitt commented on PR #48971: URL: https://github.com/apache/arrow/pull/48971#issuecomment-3803479982
I did a whole write-up in the other issue with what I found so far, but then the answer dawned on me. It's a small enough fix that I'm just going to include it here.

These [two tests](https://github.com/apache/arrow/blob/5272a68c134deea82040f2f29bb6257ad7b52be0/cpp/src/arrow/filesystem/azurefs_test.cc#L2985-L2992) were hanging while trying to upload a single large chunk. At first it looked like an issue with the Azure SDK communicating with Azurite: the calls to create the blob would succeed, and then the upload would stall when writing the actual data.

Long story short, we define an int64_t [max blob size](https://github.com/apache/arrow/blob/main/cpp/src/arrow/filesystem/azurefs.cc#L973) as `4UL * 1024 * 1024 * 1024`. That's fine on Linux/macOS, where `unsigned long` is 64 bits, but on Windows [`unsigned long` is 32 bits](https://learn.microsoft.com/en-us/cpp/cpp/data-type-ranges?view=msvc-170). The multiplication therefore wraps around to 0 before the result is ever assigned to the int64_t, setting the max blob size to 0. That 0 is then used in [this min() statement](https://github.com/apache/arrow/blob/5272a68c134deea82040f2f29bb6257ad7b52be0/cpp/src/arrow/filesystem/azurefs.cc#L1201) to calculate the upload_size, producing an infinite loop that tries to write 0 bytes. Changing the suffix to `LL` (`long long`) forces the arithmetic into a signed 64-bit type, matching our declared type, and fixes the issue across platforms.
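Here's a minimal, self-contained sketch of both the overflow and the resulting loop behavior. The variable and loop names are illustrative stand-ins, not the actual azurefs.cc code:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>

int main() {
  // On MSVC `unsigned long` is 32 bits, so this product is computed in
  // 32-bit unsigned math and wraps to 0 before the int64_t assignment.
  // (On typical 64-bit Linux/macOS it comes out as 4294967296.)
  const int64_t kMaxBlockSizeBroken = 4UL * 1024 * 1024 * 1024;

  // `long long` is at least 64 bits on every platform, so the product
  // is computed in 64-bit math and survives the assignment intact.
  const int64_t kMaxBlockSizeFixed = 4LL * 1024 * 1024 * 1024;  // 4294967296

  std::cout << "broken: " << kMaxBlockSizeBroken << "\n"
            << "fixed:  " << kMaxBlockSizeFixed << "\n";

  // Sketch of the upload loop's failure mode: with a max of 0, min()
  // always yields an upload_size of 0, `remaining` never shrinks, and
  // the loop spins forever.
  int64_t remaining = 5LL * 1024 * 1024 * 1024;  // pretend 5 GiB upload
  while (remaining > 0) {
    int64_t upload_size = std::min(remaining, kMaxBlockSizeBroken);
    if (upload_size == 0) {
      std::cout << "upload_size is 0 -- this loop would never terminate\n";
      break;  // bail out so the demo itself doesn't hang
    }
    remaining -= upload_size;
  }
}
```

On MSVC the broken constant prints 0 and the loop hits the bail-out immediately; with the `LL` suffix the constant is 4294967296 on every platform and the loop makes normal progress.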
