[ 
https://issues.apache.org/jira/browse/ARROW-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17530059#comment-17530059
 ] 

David Li commented on ARROW-16417:
----------------------------------

This reproduces it reasonably consistently for me:

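{code:python}
import pyarrow as pa
import pyarrow._exec_plan as ep  # private module; assumed to be what `ep` refers to


def test_joins():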
    expected = pa.table({
        "colB": [1, 2],
        "col3": ["A", "B"]
    })

    t1 = pa.Table.from_pydict({
        "colA": [1, 2, 6],
        "col2": ["a", "b", "f"]
    })

    t2 = pa.Table.from_pydict({
        "colB": [99, 2, 1],
        "col3": ["Z", "B", "A"]
    })

    # The crash is sporadic, so repeat the join many times.
    for _ in range(1000):
        r = ep._perform_join("right semi", t1, "colA", t2, "colB",
                             use_threads=True, coalesce_keys=True)
        r = r.combine_chunks()
        r = r.sort_by("colB")
{code}
Both {{combine_chunks}} and {{sort_by}} are needed to trigger the crash.
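
Because a segfault takes down the whole interpreter, it can help to drive a reproduction like the one above from a separate process and count how often it dies. The following is only a minimal sketch of that idea, not part of the Arrow test suite: {{count_crashes}} is a hypothetical helper name, and it assumes a POSIX system, where {{subprocess}} reports a child killed by signal N as return code -N.

```python
import subprocess


def count_crashes(cmd, runs=20, env=None):
    """Run `cmd` `runs` times and count the runs that were killed by a signal.

    `count_crashes` is a hypothetical helper for illustration only. `env` is
    passed through to the child unchanged, e.g. to set variables such as
    LD_PRELOAD when comparing allocators.
    """
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX a negative return code -N means "terminated by signal N"
        # (SIGSEGV is 11 on Linux). Ordinary test failures exit with a
        # positive code and are not counted here.
        if proc.returncode < 0:
            crashes += 1
    return crashes
```

For example, {{count_crashes([sys.executable, "-m", "pytest", "repro_test.py"])}} would run the test in a fresh interpreter each time, where {{repro_test.py}} is a placeholder name for a file containing the snippet above.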

> [C++][Python] Segfault in test_exec_plan.py / test_joins
> --------------------------------------------------------
>
>                 Key: ARROW-16417
>                 URL: https://issues.apache.org/jira/browse/ARROW-16417
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 8.0.0
>            Reporter: David Li
>            Priority: Major
>
> Occurs during wheel verification. It also happens on master. The failure is 
> sporadic but fairly reliable. test_joins is parameterized; the crash does not 
> consistently occur on the same parameters, but it consistently occurs in that 
> test.
> The backtrace reaches into malloc_consolidate. Setting MALLOC_CHECK_ doesn't help. 
> However:
> {noformat}
> (gdb) b main
> Breakpoint 1 at 0x11ea20: file /home/conda/feedstock_root/build_artifacts/python-split_1625973859697/work/Programs/python.c, line 15.
> (gdb) command 1
> Type commands for breakpoint(s) 1, one per line.
> End with a line saying just "end".
> >call mcheck(0)
> >continue
> >end {noformat}
> This fairly consistently fails with "memory clobbered before allocated block", 
> but the location varies. 
> This may be a red herring, though. I also tried LD_PRELOADing a secure build 
> of mimalloc to see if it would catch any heap corruption, but instead 
> the tests pass consistently with mimalloc.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
