NickCrews opened a new pull request, #48581: URL: https://github.com/apache/arrow/pull/48581
### Rationale for this change

Fixes https://github.com/apache/arrow/issues/22081

Efficiently and correctly create List arrays from numpy arrays with more than one dimension.

### What changes are included in this PR?

Before, `pa.array(np.arange(6).reshape(2,3))` would fail. Now it returns an array of length 2, where each element is a size-3 list. I think this is the only intuitive behavior, but if you can think of an alternative behavior a user might want or expect from this, then please let's talk about it.

I am not familiar enough with numpy/pyarrow memory layout internals to know whether there are other cases besides the C-contiguous memory layout where we could use zero-copy. Even if there are, I'm not sure we need to bother with them; I bet the C-contiguous case covers 95% of usage.

I am also not sure whether this is a good way to do this, or whether there is a more succinct way. This was written entirely by GitHub Copilot. You can see my dialog with Copilot, as I tweaked its directions and chose an implementation, in https://github.com/NickCrews/arrow/pull/3

Perhaps this logic should be pulled into its own `_from_n_dim_numpy(np_arr)` helper function to keep the larger control flow of the function clearer. Let me know if you think so.

### Are these changes tested?

Yes, I think adequately. The tests don't actually verify that the zero-copy path is used, just that the results are correct. I didn't really want to deal with monkeypatching or spying on things to detect the zero-copy path, but I can add this if we want to verify it.

The tests also just compare the results to the result obtained via the `.tolist()` path. Perhaps we should instead write out the actual expected values explicitly, so that it is even more obvious what the expected behavior is.

### Are there any user-facing changes?

No breaking changes, only newly supported features!
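For readers less familiar with the layout question raised above, here is a minimal numpy-only sketch of why a C-contiguous 2-D array can be viewed as flat values plus list offsets without copying. It is an illustration under stated assumptions, not the code in this PR: the helper name `flat_values_and_offsets` is hypothetical, and it does not call pyarrow at all.

```python
import numpy as np


def flat_values_and_offsets(arr):
    """Hypothetical helper: split a 2-D array into flat values and list offsets."""
    if arr.ndim != 2:
        raise ValueError("this sketch only handles 2-D input")
    n_rows, n_cols = arr.shape
    # ravel(order="C") returns a view when the data is already C-contiguous,
    # and falls back to a copy otherwise.
    values = arr.ravel(order="C")
    zero_copy = np.shares_memory(values, arr)
    # Row i occupies values[offsets[i]:offsets[i + 1]].
    offsets = np.arange(n_rows + 1, dtype=np.int32) * n_cols
    return values, offsets, zero_copy


arr = np.arange(6).reshape(2, 3)
values, offsets, zero_copy = flat_values_and_offsets(arr)

# Rebuild the nested rows from the flat layout; this matches arr.tolist().
rows = [values[offsets[i]:offsets[i + 1]].tolist() for i in range(len(offsets) - 1)]
print(rows, offsets.tolist(), zero_copy)
```

Passing a transposed array such as `np.arange(6).reshape(2, 3).T` to the same helper makes `ravel` copy, which is the kind of non-C-contiguous case the description leaves open.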
