[
https://issues.apache.org/jira/browse/ARROW-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joris Van den Bossche updated ARROW-8213:
-----------------------------------------
Description:
Even after the previous PRs related to local paths
(https://github.com/apache/arrow/pull/6643,
https://github.com/apache/arrow/pull/6655), I don't think the user experience
is optimal when you are working with local files and pass a wrong,
non-existent path (e.g. due to a typo).
Currently, you get this error:
{code}
>>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
...
ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
{code}
where "URI has empty scheme" is rather confusing for the user in case of a
non-existent path. I think ideally we should raise a "No such file or
directory" error.
I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can
also give other errors that we do want to propagate to the user.
The most straightforward approach I can think of right now is to check whether
"URI has empty scheme" is in the error message and then reword it, but that's
not very clean.
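A minimal sketch of that message-matching workaround, for illustration only. The helper name {{resolve_source}} and its {{from_uri}} callable argument are hypothetical and not part of pyarrow; the only assumptions about Arrow are the {{ArrowInvalid}} exception and the "URI has empty scheme" message shown above. Any other error is re-raised unchanged, so it still propagates to the user.

```python
import errno
import os

try:
    from pyarrow import ArrowInvalid
except ImportError:
    # Stand-in so the sketch runs without pyarrow installed (assumption).
    class ArrowInvalid(ValueError):
        pass


def resolve_source(path, from_uri):
    """Hypothetical helper: call `from_uri(path)` and reword one specific error.

    `from_uri` is any callable that may raise ArrowInvalid, for example a thin
    wrapper around FileSystem.from_uri.
    """
    try:
        return from_uri(path)
    except ArrowInvalid as exc:
        # Only reword the confusing case: no URI scheme AND no such local path.
        if "URI has empty scheme" in str(exc) and not os.path.exists(path):
            raise FileNotFoundError(
                errno.ENOENT, os.strerror(errno.ENOENT), path
            ) from exc
        # Every other ArrowInvalid is propagated as-is.
        raise
```

The drawback noted above still applies: the check depends on the exact wording of the C++ error message, so it breaks silently if that message changes.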
was:
Even after the previous PRs related to local paths
(https://github.com/apache/arrow/pull/6643,
https://github.com/apache/arrow/pull/6655), I don't the user experience optimal
in case you are working with local files, and pass a wrong, non-existent path
(eg due to a typo).
Currently, you get this error:
{code}
>>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
...
ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
{code}
where "URI has empty scheme" is rather confusing for the user in case of a
non-existent path. I think ideally we should raise a "No such file or
directory" error.
I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can
also give other errors that we do want to propagate to the user.
The most straightforward that I am now thinking of is checking if "URI has
empty scheme" is in the error message, and then rewording it, but that's not
very clean ..
> [Python][Dataset] Opening a dataset with a local incorrect path gives
> confusing error message
> ---------------------------------------------------------------------------------------------
>
> Key: ARROW-8213
> URL: https://issues.apache.org/jira/browse/ARROW-8213
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++ - Dataset, Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Fix For: 0.17.0
>
>
> Even after the previous PRs related to local paths
> (https://github.com/apache/arrow/pull/6643,
> https://github.com/apache/arrow/pull/6655), I don't think the user experience
> is optimal when you are working with local files and pass a wrong,
> non-existent path (e.g. due to a typo).
> Currently, you get this error:
> {code}
> >>> dataset = ds.dataset("data_with_typo.parquet", format="parquet")
> ...
> ArrowInvalid: URI has empty scheme: 'data_with_typo.parquet'
> {code}
> where "URI has empty scheme" is rather confusing for the user in case of a
> non-existent path. I think ideally we should raise a "No such file or
> directory" error.
> I am not fully sure what the best solution is, as {{FileSystem.from_uri}} can
> also give other errors that we do want to propagate to the user.
> The most straightforward approach I can think of right now is to check whether
> "URI has empty scheme" is in the error message and then reword it, but that's
> not very clean.