joelnitta commented on issue #38903:
URL: https://github.com/apache/arrow/issues/38903#issuecomment-1907432572

   I would add that [the current documentation](https://arrow.apache.org/docs/r/reference/open_delim_dataset.html)
   says that a "compact string representation" of column types is allowed. This
   is very similar to the wording in
   [{readr}](https://readr.tidyverse.org/reference/read_delim.html), so without
   additional explanation I assumed it meant the same thing, but that does not
   seem to work:
   
   ``` r
   library(readr)
   library(arrow)
   #> 
   #> Attaching package: 'arrow'
   #> The following object is masked from 'package:utils':
   #> 
   #>     timestamp
   
   # works
   read_csv(readr_example("mtcars.csv"), col_types = paste(rep("c", 11), collapse = ""))
   #> # A tibble: 32 × 11
   #>    mpg   cyl   disp  hp    drat  wt    qsec  vs    am    gear  carb 
   #>    <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
   #>  1 21    6     160   110   3.9   2.62  16.46 0     1     4     4    
   #>  2 21    6     160   110   3.9   2.875 17.02 0     1     4     4    
   #>  3 22.8  4     108   93    3.85  2.32  18.61 1     1     4     1    
   #>  4 21.4  6     258   110   3.08  3.215 19.44 1     0     3     1    
   #>  5 18.7  8     360   175   3.15  3.44  17.02 0     0     3     2    
   #>  6 18.1  6     225   105   2.76  3.46  20.22 1     0     3     1    
   #>  7 14.3  8     360   245   3.21  3.57  15.84 0     0     3     4    
   #>  8 24.4  4     146.7 62    3.69  3.19  20    1     0     4     2    
   #>  9 22.8  4     140.8 95    3.92  3.15  22.9  1     0     4     2    
   #> 10 19.2  6     167.6 123   3.92  3.44  18.3  1     0     4     4    
   #> # ℹ 22 more rows
   
   # works
   open_csv_dataset(readr_example("mtcars.csv"))
   #> FileSystemDataset with 1 csv file
   #> mpg: double
   #> cyl: int64
   #> disp: double
   #> hp: int64
   #> drat: double
   #> wt: double
   #> qsec: double
   #> vs: int64
   #> am: int64
   #> gear: int64
   #> carb: int64
   
   # doesn't work
   open_csv_dataset(readr_example("mtcars.csv"), col_types = paste(rep("c", 11), collapse = ""))
   #> Error:
   #> ! Unsupported `col_types` specification.
   #> ℹ `col_types` must be NULL, or a <Schema>.
   #> Backtrace:
   #>      ▆
   #>   1. └─arrow (local) `<fn>`(...)
   #>   2.   └─arrow::open_dataset(...)
   #>   3.     └─DatasetFactory$create(...)
   #>   4.       └─FileFormat$create(...)
   #>   5.         └─CsvFileFormat$create(...)
   #>   6.           └─arrow:::check_csv_file_format_args(dots, partitioning = partitioning)
   #>   7.             ├─base::do.call(csv_file_format_convert_opts, args)
   #>   8.             └─arrow (local) `<fn>`(...)
   #>   9.               ├─base::do.call(csv_convert_options, opts)
   #>  10.               └─arrow (local) `<fn>`(...)
   #>  11.                 └─rlang::abort(c("Unsupported `col_types` specification.", i = "`col_types` must be NULL, or a <Schema>."))
   ```
   
   <sup>Created on 2024-01-24 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
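   
   Going by the error message above, a possible workaround is to pass an Arrow
   `Schema` instead of the compact string. The following is only a minimal
   sketch: the names `path`, `col_names`, and `string_schema` are made up for
   illustration, and it assumes that `col_types` accepts a `Schema` keyed by the
   column names found in the file's header row, which the error message suggests
   but which is not confirmed in this comment.
   
   ``` r
   library(readr)  # only used for the bundled example file
   library(arrow)
   
   path <- readr_example("mtcars.csv")
   
   # Take the column names from the dataset with inferred types
   col_names <- open_csv_dataset(path)$schema$names
   
   # Build a Schema that maps every column to a string type
   # (assumed to be the equivalent of readr's "ccccccccccc")
   string_schema <- do.call(
     schema,
     setNames(rep(list(string()), length(col_names)), col_names)
   )
   
   # Pass the Schema where the compact string was rejected
   open_csv_dataset(path, col_types = string_schema)
   ```
   
   If the assumption holds, the printed dataset should list every column as a
   string rather than the inferred `double` and `int64` types.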
   
   <details style="margin-bottom:10px;">
   <summary>
   Session info
   </summary>
   
   ``` r
   sessioninfo::session_info()
   #> ─ Session info ───────────────────────────────────────────────────────────────
   #>  setting  value
   #>  version  R version 4.3.2 (2023-10-31)
   #>  os       macOS Sonoma 14.1.2
   #>  system   aarch64, darwin20
   #>  ui       X11
   #>  language (EN)
   #>  collate  en_US.UTF-8
   #>  ctype    UTF-8
   #>  tz       Asia/Tokyo
   #>  date     2024-01-24
   #>  pandoc   3.1.2 @ /usr/local/bin/ (via rmarkdown)
   #> 
   #> ─ Packages ───────────────────────────────────────────────────────────────────
   #>  package     * version  date (UTC) lib source
   #>  arrow       * 14.0.0.2 2023-12-02 [1] CRAN (R 4.3.1)
   #>  assertthat    0.2.1    2019-03-21 [1] CRAN (R 4.3.0)
   #>  bit           4.0.5    2022-11-15 [1] CRAN (R 4.3.0)
   #>  bit64         4.0.5    2020-08-30 [1] CRAN (R 4.3.0)
   #>  cli           3.6.2    2023-12-11 [1] CRAN (R 4.3.1)
   #>  crayon        1.5.2    2022-09-29 [1] CRAN (R 4.3.0)
   #>  digest        0.6.33   2023-07-07 [1] CRAN (R 4.3.0)
   #>  evaluate      0.23     2023-11-01 [1] CRAN (R 4.3.1)
   #>  fansi         1.0.6    2023-12-08 [1] CRAN (R 4.3.1)
   #>  fastmap       1.1.1    2023-02-24 [1] CRAN (R 4.3.0)
   #>  fs            1.6.3    2023-07-20 [1] CRAN (R 4.3.0)
   #>  glue          1.6.2    2022-02-24 [1] CRAN (R 4.3.0)
   #>  hms           1.1.3    2023-03-21 [1] CRAN (R 4.3.0)
   #>  htmltools     0.5.7    2023-11-03 [1] CRAN (R 4.3.1)
   #>  knitr         1.45     2023-10-30 [1] CRAN (R 4.3.1)
   #>  lifecycle     1.0.4    2023-11-07 [1] CRAN (R 4.3.1)
   #>  magrittr      2.0.3    2022-03-30 [1] CRAN (R 4.3.0)
   #>  pillar        1.9.0    2023-03-22 [1] CRAN (R 4.3.0)
   #>  pkgconfig     2.0.3    2019-09-22 [1] CRAN (R 4.3.0)
   #>  purrr         1.0.2    2023-08-10 [1] CRAN (R 4.3.0)
   #>  R.cache       0.16.0   2022-07-21 [1] CRAN (R 4.3.0)
   #>  R.methodsS3   1.8.2    2022-06-13 [1] CRAN (R 4.3.0)
   #>  R.oo          1.25.0   2022-06-12 [1] CRAN (R 4.3.0)
   #>  R.utils       2.12.3   2023-11-18 [1] CRAN (R 4.3.1)
   #>  R6            2.5.1    2021-08-19 [1] CRAN (R 4.3.0)
   #>  readr       * 2.1.4    2023-02-10 [1] CRAN (R 4.3.0)
   #>  reprex        2.0.2    2022-08-17 [1] CRAN (R 4.3.0)
   #>  rlang         1.1.2    2023-11-04 [1] CRAN (R 4.3.1)
   #>  rmarkdown     2.25     2023-09-18 [1] CRAN (R 4.3.1)
   #>  sessioninfo   1.2.2    2021-12-06 [1] CRAN (R 4.3.0)
   #>  styler        1.10.2   2023-08-29 [1] CRAN (R 4.3.0)
   #>  tibble        3.2.1    2023-03-20 [1] CRAN (R 4.3.0)
   #>  tidyselect    1.2.0    2022-10-10 [1] CRAN (R 4.3.0)
   #>  tzdb          0.4.0    2023-05-12 [1] CRAN (R 4.3.0)
   #>  utf8          1.2.4    2023-10-22 [1] CRAN (R 4.3.1)
   #>  vctrs         0.6.5    2023-12-01 [1] CRAN (R 4.3.1)
   #>  vroom         1.6.5    2023-12-05 [1] CRAN (R 4.3.1)
   #>  withr         2.5.2    2023-10-30 [1] CRAN (R 4.3.1)
   #>  xfun          0.41     2023-11-01 [1] CRAN (R 4.3.1)
   #>  yaml          2.3.8    2023-12-11 [1] CRAN (R 4.3.1)
   #> 
   #>  [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
   #> 
   #> ──────────────────────────────────────────────────────────────────────────────
   ```
   
   </details>

