Re: IO considerations for PyArrow

2016-06-17 Thread Uwe Korn
Hello Wes, the concept sounds sensible and really useful. The implementation will probably live entirely inside Arrow at first, but do you plan to split it out into a separate package later on? Cheers Uwe On 16.06.16 03:35, Wes McKinney wrote: Hi folks, I put some more thought

Re: IO considerations for PyArrow

2016-06-16 Thread pino patera
Looks like a good idea. To take advantage of async IO, it would be nice to have a concept of chunking for large objects and "pipelining", in the sense of starting the serialization/deserialization while the chunks are still being read/written. In some applications, it can be very useful when dealing with large
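
To illustrate the chunking and "pipelining" idea, here is a minimal Python sketch. It is not PyArrow's API: `read_chunks`, `process_pipelined`, and `CHUNK_SIZE` are hypothetical names, and upper-casing the bytes merely stands in for real deserialization. A reader thread fetches fixed-size chunks while the main thread processes the chunks that have already arrived.

```python
import io
import queue
import threading

CHUNK_SIZE = 4  # hypothetical chunk size in bytes; tiny so the demo is easy to follow


def read_chunks(source, out_queue, chunk_size=CHUNK_SIZE):
    """Producer: read fixed-size chunks and hand each one off as soon as it arrives."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        out_queue.put(chunk)
    out_queue.put(None)  # sentinel: no more data


def process_pipelined(source, handle_chunk, chunk_size=CHUNK_SIZE):
    """Consumer: process each chunk while the reader thread fetches the next one."""
    chunks = queue.Queue(maxsize=2)  # bounded, so the reader cannot run far ahead
    reader = threading.Thread(
        target=read_chunks, args=(source, chunks, chunk_size), daemon=True
    )
    reader.start()
    results = []
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        results.append(handle_chunk(chunk))
    reader.join()
    return results


# Stand-in for deserialization: decode each chunk and upper-case it.
decoded = process_pipelined(io.BytesIO(b"arrow-io-demo"), lambda c: c.decode().upper())
```

The bounded queue is one possible design choice: it caps memory use for large objects by making the reader wait when the consumer falls behind.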

Re: IO considerations for PyArrow

2016-06-15 Thread Wes McKinney
Hi folks, I put some more thought into the "IO problem" as it relates to Arrow in C++ (and transitively, Python) and wrote a short Google document with my thoughts on it: https://docs.google.com/document/d/16y-eyIgSVL8m5Q7Mmh-jIDRwlh-r0bYatYuDl4sbMIk/edit# Feedback greatly appreciated! This will