[GitHub] [arrow] n3world commented on pull request #10794: ARROW-13441: [C++][CSV] Skip empty batches in column decoder

GitBox Mon, 26 Jul 2021 07:00:13 -0700


n3world commented on pull request #10794:
URL: https://github.com/apache/arrow/pull/10794#issuecomment-886728438



   > The problem with this approach is that yields column chunks of differing types.
   > You can see this if you add the following test:
   
   Yes, it will, because the type is not yet known. This test seems artificial 
in that it doesn't follow how the column decoder is actually used. In practice, 
all empty record batches get discarded, so their types don't actually matter. 
That is why it works for the CSV streaming test, which I modified to have 
multiple empty blocks before a block with data.
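
   To illustrate why the differing types don't reach the consumer, here is a 
minimal standalone sketch. It does not use the Arrow API: `Chunk`, 
`DiscardEmpty` and `TypesAgree` are hypothetical names, and the "null" 
placeholder type for an empty chunk is an assumption made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a decoded column chunk; not an Arrow type.
struct Chunk {
  std::string type;                 // assumed placeholder "null" when no rows were seen
  std::vector<std::string> values;  // raw cell values of this chunk
};

// Drop chunks that contain no rows, so a placeholder type is never observed.
std::vector<Chunk> DiscardEmpty(const std::vector<Chunk>& chunks) {
  std::vector<Chunk> kept;
  for (const auto& chunk : chunks) {
    if (!chunk.values.empty()) kept.push_back(chunk);
  }
  return kept;
}

// True when all chunks share one type (vacuously true for an empty list).
bool TypesAgree(const std::vector<Chunk>& chunks) {
  for (const auto& chunk : chunks) {
    if (chunk.type != chunks.front().type) return false;
  }
  return true;
}
```

   With two empty chunks followed by one chunk of data, `TypesAgree` fails on 
the raw list but holds once `DiscardEmpty` has been applied.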
   
   That test would not work without this change either, so the change does not 
make anything better or worse as far as that test goes. I would argue it makes 
things better, because eventually you can get data from the converter.
   
   The only way I can think of to get that test to work would be to allow the 
API to defer returning a result until the type is known, but that depends on 
the CSV read-ahead being high enough to actually find data.
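
   A rough sketch of that dependency, again without the Arrow API (`Block` and 
`FirstBlockWithData` are hypothetical names, and `read_ahead` is assumed to be 
a count of blocks): the type can only be resolved if a block with data falls 
inside the read-ahead window.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed CSV block of a single column.
struct Block {
  std::vector<std::string> values;  // empty when the block holds no rows
};

// Index of the first block with data among the first `read_ahead` blocks,
// or std::nullopt when the window contains only empty blocks.
std::optional<std::size_t> FirstBlockWithData(const std::vector<Block>& blocks,
                                              std::size_t read_ahead) {
  const std::size_t limit = std::min(read_ahead, blocks.size());
  for (std::size_t i = 0; i < limit; ++i) {
    if (!blocks[i].values.empty()) return i;
  }
  return std::nullopt;
}
```

   If the window is too small, the result is empty and there is still no type 
to report for the leading empty blocks.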


