[jira] [Updated] (ARROW-17429) [R] Error messages are not helpful of read_csv_arrow with col_types option

SHIMA Tatsuya (Jira) Tue, 16 Aug 2022 02:27:18 -0700


     [ https://issues.apache.org/jira/browse/ARROW-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


SHIMA Tatsuya updated ARROW-17429:
----------------------------------
    Description: 
In the development version, the error message shown when a column cannot be 
converted to the type given in col_types is not helpful.

{code:r}
tbl <- tibble::tibble(time = c("1970-01-01T12:00:00+12:00"))
csv_file <- tempfile()
on.exit(unlink(csv_file))
write.csv(tbl, csv_file, row.names = FALSE)

arrow::read_csv_arrow(csv_file, col_types = "?", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <dttm>
#> 1 1970-01-01 00:00:00
arrow::read_csv_arrow(csv_file, col_types = "c", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <chr>
#> 1 1970-01-01T12:00:00+12:00
arrow::read_csv_arrow(csv_file, col_types = "i", col_names = "x", skip = 1)
#> Error in as.data.frame(tab): object 'tab' not found
arrow::read_csv_arrow(csv_file, col_types = "T", col_names = "x", skip = 1)
#> Error in as.data.frame(tab): object 'tab' not found
{code}

In arrow 9.0.0, the error messages name the failing column and value:

{code:r}
tbl <- tibble::tibble(time = c("1970-01-01T12:00:00+12:00"))
csv_file <- tempfile()
on.exit(unlink(csv_file))
write.csv(tbl, csv_file, row.names = FALSE)

arrow::read_csv_arrow(csv_file, col_types = "?", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <dttm>
#> 1 1970-01-01 00:00:00
arrow::read_csv_arrow(csv_file, col_types = "c", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <chr>
#> 1 1970-01-01T12:00:00+12:00
arrow::read_csv_arrow(csv_file, col_types = "i", col_names = "x", skip = 1)
#> Error:
#> ! Invalid: In CSV column #0: CSV conversion error to int32: invalid value '1970-01-01T12:00:00+12:00'
arrow::read_csv_arrow(csv_file, col_types = "T", col_names = "x", skip = 1)
#> Error:
#> ! Invalid: In CSV column #0: CSV conversion error to timestamp[ns]: expected no zone offset in '1970-01-01T12:00:00+12:00'
{code}


  was:
The error message displayed when a non-convertible type is specified does not 
seem to help in the development version.

{code:r}
tbl <- tibble::tibble(time = c("1970-01-01T12:00:00+12:00"))
csv_file <- tempfile()
on.exit(unlink(csv_file))
write.csv(tbl, csv_file, row.names = FALSE)

arrow::read_csv_arrow(csv_file, col_types = "?", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <dttm>
#> 1 1970-01-01 00:00:00
arrow::read_csv_arrow(csv_file, col_types = "c", col_names = "x", skip = 1)
#> # A tibble: 1 × 1
#>   x
#>   <chr>
#> 1 1970-01-01T12:00:00+12:00
arrow::read_csv_arrow(csv_file, col_types = "i", col_names = "x", skip = 1)
#> Error in as.data.frame(tab): object 'tab' not found
arrow::read_csv_arrow(csv_file, col_types = "T", col_names = "x", skip = 1)
#> Error in as.data.frame(tab): object 'tab' not found
{code}




> [R] Error messages are not helpful of read_csv_arrow with col_types option
> --------------------------------------------------------------------------
>
>                 Key: ARROW-17429
>                 URL: https://issues.apache.org/jira/browse/ARROW-17429
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 9.0.0
>            Reporter: SHIMA Tatsuya
>            Priority: Major



