GitHub user sitalkedia commented on the pull request:
https://github.com/apache/spark/pull/13107#issuecomment-219762996
I am not 100% sure of the root cause, but I suspect this happens when the
JVM tries to allocate a very large buffer for the pointer array. The JVM may
be unable to place such a large buffer in a contiguous memory region on the
heap, and since the unsafe operations assume the objects occupy contiguous
memory, any unsafe operation on the large buffer results in memory corruption,
which manifests as the TimSort issue.
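As background, unsafe reads and writes address an array element as a base object plus a byte offset, so they are only valid if the JVM lays the element data out contiguously inside the object. A minimal sketch of that addressing model (illustration only, not Spark code; it uses `sun.misc.Unsafe` directly):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustration of the contiguity assumption: Unsafe computes the location of
// element i as baseOffset + i * indexScale within the array object, which is
// only correct if the elements are stored back-to-back in memory.
public class UnsafeArrayRead {
    public static void main(String[] args) throws Exception {
        // Grab the Unsafe singleton via reflection (standard trick on Java 8).
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long[] pointers = {10L, 20L, 30L};
        long base = unsafe.arrayBaseOffset(long[].class);
        long scale = unsafe.arrayIndexScale(long[].class);

        for (int i = 0; i < pointers.length; i++) {
            // Prints 10, 20, 30 -- but only because the longs are contiguous.
            System.out.println(unsafe.getLong(pointers, base + (long) i * scale));
        }
    }
}
```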
Unfortunately, this issue is not consistently reproducible and I am not
certain of the root cause, so I am not sure how we can write a regression test
for it.
Also, please note that this change itself is a no-op unless you override
the default value of `numElementsForSpillThreshold`, which is `Long.MAX_VALUE`.
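To make that concrete, here is a hypothetical sketch of the shape of such a spill guard; the class and member names below are assumptions for illustration, not code from this PR:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (names assumed, not the PR's actual code): spill the
// in-memory pointer array to disk once it reaches the configured threshold,
// so it never grows past what the heap can hand out contiguously.
public final class SpillingSorter {
    private final long numElementsForSpillThreshold; // Long.MAX_VALUE: never spills
    private final List<Long> pointerArray = new ArrayList<>();

    public SpillingSorter(long numElementsForSpillThreshold) {
        this.numElementsForSpillThreshold = numElementsForSpillThreshold;
    }

    public void insertRecord(long recordPointer) {
        if (pointerArray.size() >= numElementsForSpillThreshold) {
            spill(); // flush the current run and start over with a small array
        }
        pointerArray.add(recordPointer);
    }

    private void spill() {
        // Write the sorted in-memory run to disk (elided here) and free memory.
        pointerArray.clear();
    }
}
```

With the default of `Long.MAX_VALUE` the guard never fires, which is why the change behaves as a no-op unless the threshold is explicitly lowered.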