[
https://issues.apache.org/jira/browse/ARROW-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507488#comment-17507488
]
Joris Van den Bossche commented on ARROW-15665:
-----------------------------------------------
For cases 1 and 2, I agree that both should raise an error (which is already
what Arrow does now).
---
However, for case 1, you might have a typo in your example ("M" vs "m"): with
{{"%M"}} we are parsing minutes (and the missing month gets filled with 1):
{code}
>>> print(datetime.datetime.strptime("1999-12-31", "%Y-%d-%M"))
1999-01-12 00:31:00
{code}
(pandas does the same, and Arrow as well)
If I change that to use {{"%Y-%d-%m"}} (lower case m), Python and pandas give
an error for this:
{code}
>>> print(datetime.datetime.strptime("1999-12-31", "%Y-%d-%m"))
...
ValueError: unconverted data remains: 1
{code}
and so does Arrow ("Failed to parse string: '1999-12-31' ..")
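To make the two behaviours concrete, here is a minimal sketch in plain Python. The helper {{parse_timestamp}} and its {{errors}} argument are hypothetical names used only for illustration; they are not Arrow's API, and only the "raise" and "return NA" modes are shown:

```python
import datetime


def parse_timestamp(value, fmt, errors="raise"):
    """Hypothetical helper: parse `value` with strptime.

    errors="raise" re-raises the ValueError on a format mismatch,
    errors="null" returns None instead (standing in for NA).
    """
    if errors not in ("raise", "null"):
        raise ValueError(f"unknown errors mode: {errors!r}")
    try:
        return datetime.datetime.strptime(value, fmt)
    except ValueError:
        if errors == "raise":
            raise
        return None


# "%M" means minutes, so this parses; the missing month defaults to 1.
print(parse_timestamp("1999-12-31", "%Y-%d-%M"))

# "%m" means month; "31" is not a valid month, so this is a mismatch
# and the "null" mode returns None instead of raising.
print(parse_timestamp("1999-12-31", "%Y-%d-%m", errors="null"))
```

With {{errors="raise"}} the second call would instead propagate the same {{ValueError}} as in the Python example above.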
> [C++] Add error handling option to StrptimeOptions
> --------------------------------------------------
>
> Key: ARROW-15665
> URL: https://issues.apache.org/jira/browse/ARROW-15665
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Rok Mihevc
> Assignee: Rok Mihevc
> Priority: Major
> Labels: kernel, pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> We want to have an option to either raise, ignore or return NA in case of
> format mismatch.
> See
> [pandas.to_datetime|https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html]
> and lubridate
> [parse_date_time|https://lubridate.tidyverse.org/reference/parse_date_time.html]
> for examples.