[ 
https://issues.apache.org/jira/browse/ARROW-15041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456465#comment-17456465
 ] 

Dragoș Moldovan-Grünfeld commented on ARROW-15041:
--------------------------------------------------

This seems to be related to the order in which the two files are created 
and then read together by {{open_dataset()}}. Creating {{file2.csv}} 
first doesn't trigger the problem. 

{code:r}
  temp_dir <- make_temp_dir()
  writeLines("\xef\xbb\xbfa,b\n1,2\n", con = file.path(temp_dir, "file1.csv"))
  writeLines("\xef\xbb\xbfa,b\n3,4\n", con = file.path(temp_dir, "file2.csv"))

  expect_equal(
    open_dataset(temp_dir, format = "csv") %>% collect(),
    tibble(a = c(1, 3), b = c(2, 4))
  )
{code}
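
If the test only needs to check that the BOMs are stripped, and not the order in which the files are read, one option is to sort the collected rows before comparing. The following is a minimal sketch under that assumption, not necessarily the fix that will be applied. {{tempfile()}} and {{dir.create()}} stand in for the {{make_temp_dir()}} test helper, the {{write_bom_csv()}} function is hypothetical, and {{writeBin()}} is used instead of {{writeLines()}} so the BOM bytes are written exactly as given regardless of locale.

```r
library(arrow)
library(dplyr)

# Stand-in for the make_temp_dir() test helper (assumption: any empty
# directory works for this reproduction).
temp_dir <- tempfile()
dir.create(temp_dir)

# Hypothetical helper: write a CSV file that starts with a UTF-8 BOM (EF BB BF).
write_bom_csv <- function(path, text) {
  bom <- as.raw(c(0xef, 0xbb, 0xbf))
  writeBin(c(bom, charToRaw(text)), path)
}

write_bom_csv(file.path(temp_dir, "file1.csv"), "a,b\n1,2\n")
write_bom_csv(file.path(temp_dir, "file2.csv"), "a,b\n3,4\n")

# Sort after collecting so the result does not depend on the order in
# which the dataset happens to read the two files.
result <- open_dataset(temp_dir, format = "csv") %>%
  collect() %>%
  arrange(a)
```

With the rows sorted on {{a}}, the comparison against {{tibble(a = c(1, 3), b = c(2, 4))}} no longer depends on which file is read first. It still fails if a BOM leaks into the first column name, because {{arrange(a)}} would then not find a column called {{a}}.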

> [R] Flaky BOM removal test
> --------------------------
>
>                 Key: ARROW-15041
>                 URL: https://issues.apache.org/jira/browse/ARROW-15041
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Antoine Pitrou
>            Priority: Major
>             Fix For: 7.0.0
>
>
> The test introduced in ARROW-14644 appears to be flaky.
> See example failed runs:
> https://github.com/apache/arrow/runs/4466790381?check_suite_focus=true#step:8:21277
> https://github.com/apache/arrow/runs/4463832536?check_suite_focus=true#step:9:22039
> {code}
> ── Failure (test-dataset-csv.R:297:3): open_dataset() deals with BOMs (byte-order-marks) correctly ──
> `object` (`actual`) not equal to `expected` (`expected`).
> actual vs expected
>                 a b
> - actual[1, ]   3 4
> + expected[1, ] 1 2
> - actual[2, ]   1 2
> + expected[2, ] 3 4
>   `actual$a`: 3 1
> `expected$a`: 1 3
>   `actual$b`: 4 2
> `expected$b`: 2 4
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
