paleolimbot commented on code in PR #36992:
URL: https://github.com/apache/arrow/pull/36992#discussion_r1289138367


##########
r/tests/testthat/test-parquet.R:
##########
@@ -484,3 +484,30 @@ test_that("Can read Parquet files from a URL", {
   expect_true(tibble::is_tibble(pu))
   expect_identical(dim(pu), c(10L, 11L))
 })
+
+test_that("thrift string size can be specified when reading Parquet files", {

Review Comment:
   Should this test also include the container size limit?



##########
r/R/parquet.R:
##########
@@ -541,12 +542,13 @@ ParquetFileReader <- R6Class("ParquetFileReader",
 ParquetFileReader$create <- function(file,
                                      props = 
ParquetArrowReaderProperties$create(),
                                      mmap = TRUE,
+                                     reader_props = 
ParquetReaderProperties$create(),

Review Comment:
   Does this also need to be exposed in `open_dataset()` (i.e., if you 
`open_dataset("the big file.parquet")`, is the same workaround required?)



##########
r/tests/testthat/test-parquet.R:
##########
@@ -484,3 +484,30 @@ test_that("Can read Parquet files from a URL", {
   expect_true(tibble::is_tibble(pu))
   expect_identical(dim(pu), c(10L, 11L))
 })
+
+test_that("thrift string size can be specified when reading Parquet files", {
+  tf <- tempfile()
+  on.exit(unlink(tf))
+
+  table <- arrow_table(example_data)
+
+  write_parquet(table, tf)
+
+  reader_props <- ParquetReaderProperties$create()
+  reader_props$set_thrift_string_size_limit(1)
+
+  expect_identical(reader_props$thrift_string_size_limit(), 1L)
+
+  file <- make_readable_file(tf)
+  on.exit(file$close())
+
+  # We get an error if we set the Thrift string size limit too small
+  expect_error(ParquetFileReader$create(file, reader_props = reader_props))

Review Comment:
   I worry that `expect_error()` without a regex will also possibly swallow an 
unrelated error (if not now than possibly in the future when somebody updates 
some piece of relevant code).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to