[ 
https://issues.apache.org/jira/browse/ARROW-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antoine Pitrou updated ARROW-10132:
-----------------------------------
    Summary: [Rust] Considers scientific notation when inferring schema from 
csv  (was: Considers scientific notation when inferring schema from csv)

> [Rust] Considers scientific notation when inferring schema from csv
> -------------------------------------------------------------------
>
>                 Key: ARROW-10132
>                 URL: https://issues.apache.org/jira/browse/ARROW-10132
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust
>    Affects Versions: 1.0.1
>         Environment: Ubuntu
>            Reporter: Ziru Niu
>            Priority: Minor
>              Labels: easyfix
>
>  
> ||col||
> |1.2|
> |1.3e-2|
> |1.4|
> Currently this column is inferred as Utf8, because 
> csv::reader::DECIMAL_RE is defined as r"^-?(\d+\.\d+)$". Maybe we could 
> change this to r"^-?(\d+\.\d+)(e-?(\d+))?$" or something similar, so that 
> real numbers written in scientific notation are inferred as floats?
>  
> (The RE I currently propose doesn't handle "5e-4" correctly, though.)
>  
> I would also like "3." and ".3" to be inferred as floats. I will come up 
> with an exact RE when I get time.
>  
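To make the cases above concrete, here is a minimal sketch of the matching rule the report seems to be asking for. It is written with the standard library only, not as a regex, and `looks_like_float` is a hypothetical helper, not part of Arrow's csv reader. It assumes that plain integers such as "12" must still be rejected, so that they can be inferred as integers rather than floats.

```rust
/// Hypothetical helper, not part of Arrow: returns true when `s` looks like a
/// float literal, including scientific notation ("1.3e-2", "5e-4") and the
/// short forms "3." and ".3". Plain integers such as "12" are rejected.
fn looks_like_float(s: &str) -> bool {
    fn all_digits(t: &str) -> bool {
        t.bytes().all(|b| b.is_ascii_digit())
    }

    // Optional leading minus sign, as in the existing DECIMAL_RE.
    let s = s.strip_prefix('-').unwrap_or(s);

    // Split off an optional exponent introduced by 'e' or 'E'.
    let (mantissa, exponent) = match s.find(|c: char| c == 'e' || c == 'E') {
        Some(i) => (&s[..i], Some(&s[i + 1..])),
        None => (s, None),
    };

    // Mantissa: digits with at most one '.', and at least one digit overall.
    let (int_part, frac_part, has_dot) = match mantissa.find('.') {
        Some(i) => (&mantissa[..i], &mantissa[i + 1..], true),
        None => (mantissa, "", false),
    };
    if !all_digits(int_part) || !all_digits(frac_part) {
        return false;
    }
    if int_part.is_empty() && frac_part.is_empty() {
        return false;
    }

    match exponent {
        // Exponent: optional sign followed by one or more digits.
        Some(e) => {
            let digits = e.strip_prefix(|c: char| c == '-' || c == '+').unwrap_or(e);
            !digits.is_empty() && all_digits(digits)
        }
        // Without an exponent, require a '.' so that "12" is not a float.
        None => has_dot,
    }
}
```

Under these assumptions the rule is roughly the pattern r"^-?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?$", with the extra condition that the value contains either a '.' or an exponent. Accepting an uppercase 'E' and a '+' sign in the exponent goes beyond what the report proposes and is an assumption of this sketch.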



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
