[
https://issues.apache.org/jira/browse/ARROW-10132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine Pitrou updated ARROW-10132:
-----------------------------------
Summary: [Rust] Considers scientific notation when inferring schema from
csv (was: Considers scientific notation when inferring schema from csv)
> [Rust] Considers scientific notation when inferring schema from csv
> -------------------------------------------------------------------
>
> Key: ARROW-10132
> URL: https://issues.apache.org/jira/browse/ARROW-10132
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Affects Versions: 1.0.1
> Environment: Ubuntu
> Reporter: Ziru Niu
> Priority: Minor
> Labels: easyfix
>
>
> ||col||
> |1.2|
> |1.3e-2|
> |1.4|
> Currently this column is inferred as Utf8, because
> csv::reader::DECIMAL_RE is defined as r"^-?(\d+\.\d+)$". Could we
> change it to r"^-?(\d+\.\d+)(e-?(\d+))?$" or something similar, so that
> real numbers written in scientific notation are inferred as floats?
>
> (The RE proposed above doesn't handle "5e-4" correctly, though, because
> it still requires a decimal point in the mantissa.)
>
> I would also like "3." and ".3" to be inferred as floats. I will come up
> with an exact RE when I have time.
>
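
The cases listed in the report ("1.3e-2", "5e-4", "3." and ".3") can be
pinned down with a small std-only sketch. The helper name looks_like_float
is hypothetical and is not part of the Arrow crate. The regex quoted in the
comment is only one possible pattern and has not been tested against
csv::reader. Rejecting plain integers such as "12" is an assumption made
here so that a separate integer check could still claim them.

```rust
/// Returns true when `s` looks like a decimal float, including
/// scientific notation. Hypothetical helper, not Arrow's implementation.
///
/// Roughly the pattern r"^-?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?$" limited to
/// ASCII digits, plus one extra rule (an assumption): a plain integer such
/// as "12" is rejected so it can still be inferred as an integer.
fn looks_like_float(s: &str) -> bool {
    let bytes = s.as_bytes();
    let mut i = 0;

    // Optional leading minus sign.
    if i < bytes.len() && bytes[i] == b'-' {
        i += 1;
    }

    // Digits before the decimal point (may be empty, as in ".3").
    let int_start = i;
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    let int_digits = i - int_start;

    // Optional decimal point, then digits (may be empty, as in "3.").
    let mut has_point = false;
    let mut frac_digits = 0;
    if i < bytes.len() && bytes[i] == b'.' {
        has_point = true;
        i += 1;
        let frac_start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        frac_digits = i - frac_start;
    }

    // The mantissa needs at least one digit, so "." and "-" are rejected.
    if int_digits + frac_digits == 0 {
        return false;
    }

    // Optional exponent: 'e' or 'E', optional sign, at least one digit.
    let mut has_exponent = false;
    if i < bytes.len() && (bytes[i] == b'e' || bytes[i] == b'E') {
        has_exponent = true;
        i += 1;
        if i < bytes.len() && (bytes[i] == b'-' || bytes[i] == b'+') {
            i += 1;
        }
        let exp_start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        if i == exp_start {
            return false;
        }
    }

    // The whole input must be consumed, and a decimal point or an
    // exponent must be present; otherwise the value is a plain integer.
    i == bytes.len() && (has_point || has_exponent)
}
```

With this sketch, every value in the example column ("1.2", "1.3e-2",
"1.4") is accepted, so the column would no longer need to fall back to Utf8.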
--
This message was sent by Atlassian Jira
(v8.3.4#803005)