[ https://issues.apache.org/jira/browse/ARROW-13690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403137#comment-17403137 ]

Joris Van den Bossche commented on ARROW-13690:
-----------------------------------------------

Somewhat related: using IPC for pickling would also help ensure that we don't pickle the full buffer of a sliced array -> ARROW-10739

(I don't know whether there are significant downsides to always using IPC. I suppose it will add some overhead for simple/small Arrays, since we need to put those in a RecordBatch to use the IPC machinery?)

> [Python] Use IPC writing code for pickling RecordBatches
> --------------------------------------------------------
>
>                 Key: ARROW-13690
>                 URL: https://issues.apache.org/jira/browse/ARROW-13690
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Micah Kornfield
>            Priority: Major
>
> For wide schemas in particular, the recursive nature of the current
> pickling algorithm for record batches makes it less efficient than using the
> IPC format (which can be done entirely in C++).
>
> Consider switching the mechanism to use the IPC format. I think this can be
> a backwards compatible change by leaving _reconstruct_record_batch in place,
> if we care about that.