Github user viirya commented on the issue:
https://github.com/apache/spark/pull/13439
@rxin Hmm, I just think that if we can improve it simply by adding a
conditional check, it might be worth doing.
As for the performance impact, here is the benchmark for on-heap and off-heap
column vectors before this patch:
On Heap:

ColumnVector R/W:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
On Heap                                            39 / 47          1.1         946.8       1.0X

ColumnVector R/W:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
On Heap                                            41 / 46          1.0         995.5       1.0X

Off Heap:

ColumnVector R/W:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Off Heap                                           65 / 75          0.6        1598.2       1.0X

ColumnVector R/W:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
Off Heap                                           63 / 74          0.7        1532.5       1.0X

It looks like performance is not hurt significantly.
But if you still have concerns about this, we can close this.
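
To make the comparison easier to eyeball, here is a minimal Python sketch that computes the percentage difference between the two runs in each group. It assumes that, within each group, the first table is the baseline and the second is the run being compared; that pairing is an assumption, as are the helper names `pct_change` and `rate_m_per_s`. The figures are copied from the tables in this comment.

```python
# Figures copied from the benchmark tables: (best ms, avg ms, per-row ns).
# Assumption: in each group the first tuple is the baseline run and the
# second tuple is the run compared against it.
runs = {
    "On Heap": [(39, 47, 946.8), (41, 46, 995.5)],
    "Off Heap": [(65, 75, 1598.2), (63, 74, 1532.5)],
}


def pct_change(baseline, other):
    """Relative difference of `other` versus `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


def rate_m_per_s(per_row_ns):
    """Convert a per-row cost in nanoseconds to millions of rows per second."""
    return 1000.0 / per_row_ns


summary = {}
for name, (first, second) in runs.items():
    summary[name] = {
        "best_pct": pct_change(first[0], second[0]),
        "avg_pct": pct_change(first[1], second[1]),
        "per_row_pct": pct_change(first[2], second[2]),
    }
    print(
        f"{name}: best {summary[name]['best_pct']:+.1f}%, "
        f"avg {summary[name]['avg_pct']:+.1f}%, "
        f"per-row {summary[name]['per_row_pct']:+.1f}%"
    )
```

Under that pairing, the differences are small and point in opposite directions for the two groups, which fits run-to-run noise at least as well as a systematic slowdown. The `rate_m_per_s` helper shows how the Rate(M/s) column follows from Per Row(ns).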