Dewey Dunnington created ARROW-15097:
----------------------------------------
Summary: [R] Can't write dataset on minio local s3
Key: ARROW-15097
URL: https://issues.apache.org/jira/browse/ARROW-15097
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Dewey Dunnington
While trying to reproduce the "Odd behaviour when writing a dataset to s3" error
described here ( [https://github.com/apache/arrow/issues/11934] ), I ran into
problems writing to a local minio-backed bucket. This could be user error on my
part, since I'm unfamiliar with this kind of setup. If so, documenting how to
create a local test setup, as alluded to in the S3 vignette, might be a good
solution here.
The code I'm using to reproduce is:
{code:r}
library(arrow, warn.conflicts = FALSE)
dir <- tempfile()
dir.create(dir)
subdir <- file.path(dir, "some_subdir")
dir.create(subdir)
list.files(dir)
#> [1] "some_subdir"
minio_server <- processx::process$new(
  "minio",
  args = c("server", dir),
  supervise = TRUE
)
Sys.sleep(1)
stopifnot(minio_server$is_alive())
# make sure we can connect
s3_uri <-
  "s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000"
bucket <- s3_bucket(s3_uri)
bucket$ls("some_subdir")
#> character(0)
# write a dataset to minio (currently hangs or errors)
data <- data.frame(x = letters[1:5])
write_dataset(
  dataset = data,
  path = bucket$path("test_parquet")
)
#> Error: IOError: When creating bucket 'test_parquet': AWS Error [code 100]:
#> Unable to parse ExceptionName: InvalidBucketName Message: The specified bucket
#> is not valid.
minio_server$interrupt()
#> [1] TRUE
Sys.sleep(1)
stopifnot(!minio_server$is_alive())
{code}
The output of {{mc admin trace}} is:
{noformat}
$ mc alias set myminio http://localhost:9000 minioadmin minioadmin
Added `myminio` successfully.
$ mc admin trace myminio
2021-12-14T08:46:04:000 [200 OK] s3.ListBuckets localhost:9000/ ::1 444µs ↑ 156 B ↓ 685 B
2021-12-14T08:46:26:000 [400 Bad Request] s3.PutBucket localhost:9000/test_parquet ::1 127µs ↑ 187 B ↓ 625 B
{noformat}
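The 400 response to s3.PutBucket for localhost:9000/test_parquet suggests that, because the URI names no bucket, "test_parquet" may be treated as a bucket name rather than as a path inside a bucket. S3 bucket names may contain only lowercase letters, digits, dots, and hyphens, so the underscore would be consistent with the InvalidBucketName error. A minimal sketch of that check follows; is_valid_bucket_name is a hypothetical helper, not part of arrow, and it covers only the basic length and character rules rather than every S3 naming restriction.

```r
# Hypothetical helper (not an arrow function): simplified S3 bucket name check.
# Covers only: 3-63 characters, lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
is_valid_bucket_name <- function(name) {
  grepl("^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$", name, perl = TRUE)
}

is_valid_bucket_name("test_parquet")  # contains an underscore
is_valid_bucket_name("test-parquet")  # hyphen instead of underscore
```

If this is the cause, a path under an existing bucket with a valid name should avoid the error; this is untested against the setup above.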