[ https://issues.apache.org/jira/browse/ARROW-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530059#comment-17530059 ]
David Li commented on ARROW-16417:
----------------------------------
This reproduces it reasonably consistently for me:
{code:python}
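import pyarrow as pa
import pyarrow._exec_plan as ep  # private module; import assumed to match test_exec_plan.py


def test_joins():
    # Result the join should produce (not compared in this reproducer).
    expected = pa.table({
        "colB": [1, 2],
        "col3": ["A", "B"]
    })
    t1 = pa.Table.from_pydict({
        "colA": [1, 2, 6],
        "col2": ["a", "b", "f"]
    })
    t2 = pa.Table.from_pydict({
        "colB": [99, 2, 1],
        "col3": ["Z", "B", "A"]
    })
    # The crash is sporadic, so repeat the join many times.
    for _ in range(1000):
        r = ep._perform_join("right semi", t1, "colA", t2, "colB",
                             use_threads=True, coalesce_keys=True)
        r = r.combine_chunks()
        r = r.sort_by("colB")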
Both {{combine_chunks}} and {{sort_by}} are necessary to trigger the crash.
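For reference, here is a plain-Python sketch of what a right semi join should return for this data: the rows of the right table whose key also appears in the left table. It only illustrates the semantics and is not how Arrow implements the join; {{right_semi_join}} is a hypothetical helper, not a pyarrow function.

```python
def right_semi_join(left_rows, left_key, right_rows, right_key):
    """Return the right-side rows whose key value appears on the left side."""
    left_keys = {row[left_key] for row in left_rows}
    return [row for row in right_rows if row[right_key] in left_keys]


# Same data as t1 and t2 in the reproducer, written row by row.
t1_rows = [
    {"colA": 1, "col2": "a"},
    {"colA": 2, "col2": "b"},
    {"colA": 6, "col2": "f"},
]
t2_rows = [
    {"colB": 99, "col3": "Z"},
    {"colB": 2, "col3": "B"},
    {"colB": 1, "col3": "A"},
]

joined = right_semi_join(t1_rows, "colA", t2_rows, "colB")
# A join does not promise an output order, so sort before comparing.
joined_sorted = sorted(joined, key=lambda row: row["colB"])
```

After sorting, the rows correspond to the {{expected}} table in the reproducer, which is why the test sorts by {{colB}} before comparing.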
> [C++][Python] Segfault in test_exec_plan.py / test_joins
> --------------------------------------------------------
>
> Key: ARROW-16417
> URL: https://issues.apache.org/jira/browse/ARROW-16417
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 8.0.0
> Reporter: David Li
> Priority: Major
>
> Occurs during wheel verification. It also happens on master. The failure is
> sporadic but fairly reliable. test_joins is parameterized; the failing
> parameters vary from run to run, but the crash consistently occurs in that
> test.
> The backtrace reaches into malloc_consolidate. Setting MALLOC_CHECK_ doesn't help.
> However:
> {noformat}
> (gdb) b main
> Breakpoint 1 at 0x11ea20: file /home/conda/feedstock_root/build_artifacts/python-split_1625973859697/work/Programs/python.c, line 15.
> (gdb) command 1
> Type commands for breakpoint(s) 1, one per line.
> End with a line saying just "end".
> >call mcheck(0)
> >continue
> >end
> {noformat}
> This fairly consistently fails with "memory clobbered before allocated block",
> but the location varies.
> This may be a red herring, though. I also tried LD_PRELOADing a secure build
> of mimalloc to see if it would catch any heap corruption, but the tests
> pass consistently with mimalloc.
>
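Because the failure is sporadic, it can help to measure how often it occurs by running the test repeatedly in fresh processes. The following is a minimal sketch, assuming a POSIX system; {{count_segfaults}} and {{EXAMPLE_CMD}} are hypothetical names, and the pytest arguments are placeholders to adjust for your checkout.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs):
    """Run cmd `runs` times in fresh processes; count exits caused by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a return code of -N means the child was killed by signal N.
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Hypothetical command line; the test path and selection are assumptions.
EXAMPLE_CMD = [sys.executable, "-m", "pytest", "-q",
               "test_exec_plan.py", "-k", "test_joins"]
```

Calling {{count_segfaults(EXAMPLE_CMD, runs=20)}} would then give a rough crash count out of 20 runs, which makes it easier to compare allocators or builds.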