[ 
https://issues.apache.org/jira/browse/ARROW-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360295#comment-17360295
 ] 

Jonathan Keane commented on ARROW-13028:
----------------------------------------

Yeah, I tend to agree that if one needs or wants to manage that conversion, 
explicit column types are the way to go (and that interface has the added 
benefit of letting one control the types of other columns as well). 

This is an empirical question (and the answer will almost certainly vary with 
the data), but what would a miss look like performance-wise when trying 32-bit 
and then having to change to 64-bit after the fact? That would involve some 
computation, correct? Or can we do that conversion for free, without rewriting 
the representation?
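
To make the question concrete, here is a toy sketch of what a "miss" could look
like. It is not Arrow's implementation: it uses the standard-library `array`
module as a stand-in for a fixed-width column buffer, `infer_ints` is a
hypothetical helper, and it assumes typecode "i" is 4 bytes wide (true on
common platforms). In a layout like this, widening changes the element size,
so the values parsed so far have to be copied into a new buffer.

```python
from array import array

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def infer_ints(fields):
    """Parse text fields as 32-bit ints, widening to 64-bit on the first miss.

    Returns (values, rewritten), where `rewritten` is the number of values
    that had to be copied into the wider buffer.
    """
    values = array("i")  # signed C int, assumed to be 32 bits wide
    rewritten = 0
    for text in fields:
        n = int(text)
        if values.typecode == "i" and not (INT32_MIN <= n <= INT32_MAX):
            # Miss: the value does not fit, so re-encode everything parsed
            # so far as 64-bit ("q") before continuing.
            rewritten = len(values)
            values = array("q", values)
        values.append(n)
    return values, rewritten


wide, copied = infer_ints(["1", "2", "3000000000", "4"])
narrow, not_copied = infer_ints(["1", "2", "3"])
print(wide.typecode, copied, narrow.typecode, not_copied)
```

In this toy model the cost of a miss grows with how many values were already
parsed before the first out-of-range value shows up, which is why the answer
would depend on the data.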

> [C++] CSV add convert option to attempt 32bit number inferences
> ---------------------------------------------------------------
>
>                 Key: ARROW-13028
>                 URL: https://issues.apache.org/jira/browse/ARROW-13028
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nate Clark
>            Assignee: Nate Clark
>            Priority: Major
>
> When the CSV reader infers types, numbers are always 64-bit. For large 
> data sets it could be better to use 32-bit types to reduce overall memory 
> use. To do this, it would be useful to add an option to ConvertOptions to try 
> 32-bit numbers before 64-bit. By default this option would be disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
