srowen commented on pull request #29529:
URL: https://github.com/apache/spark/pull/29529#issuecomment-679126671
That is a very big difference. While that test may be a relatively obscure
path, and we don't want to lose the benefit of this optimization elsewhere, I
agree it's a concern. I wonder if there is a way to make it all faster.
I am really struggling to understand why the change could make so large a
difference. In the main, it should avoid an array copy, and that should be a
win. I wonder if somehow the `Seq` passed here is not an `IndexedSeq`, so that
indexing it is not constant time? If it were a `List`, for example, each
`dataTypes(i)` would be a linear walk and the loop as a whole would be quadratic.
Could we try this implementation instead to see if that is the case?
```
def this(dataTypes: Seq[DataType]) = {
  // SPARK-32550: fill the array with a plain loop instead of map.
  // foreach avoids dataTypes(i), which is only constant time for an IndexedSeq.
  this(new Array[MutableValue](dataTypes.length))
  var i = 0
  dataTypes.foreach { dt =>
    values(i) = dataTypeToMutableValue(dt)
    i += 1
  }
}
```
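To check the hypothesis outside of Spark, something like the following rough sketch might help. It is not Spark code: `SeqFillSketch`, `fillByIndex`, `fillByForeach`, and `isIndexed` are hypothetical names, and plain strings stand in for `DataType` and `MutableValue`. It only contrasts filling an array by index with filling it by traversal, for a `List` versus a `Vector`.

```scala
// Hypothetical stand-alone sketch; strings stand in for Spark's DataType / MutableValue.
object SeqFillSketch {
  // Index-based fill: dataTypes(i) is O(1) for an IndexedSeq but O(i) for a List,
  // so on a linked list the whole loop is quadratic.
  def fillByIndex(dataTypes: Seq[String]): Array[String] = {
    val values = new Array[String](dataTypes.length)
    var i = 0
    while (i < values.length) {
      values(i) = dataTypes(i).toUpperCase
      i += 1
    }
    values
  }

  // Traversal-based fill: a single pass regardless of the concrete Seq type.
  def fillByForeach(dataTypes: Seq[String]): Array[String] = {
    val values = new Array[String](dataTypes.length)
    var i = 0
    dataTypes.foreach { dt =>
      values(i) = dt.toUpperCase
      i += 1
    }
    values
  }

  // True only when the runtime collection supports constant-time indexing.
  def isIndexed(dataTypes: Seq[String]): Boolean =
    dataTypes.isInstanceOf[IndexedSeq[String]]

  // Crude wall-clock timing in milliseconds; not a substitute for a real benchmark.
  def timeMs(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1000000L
  }

  def main(args: Array[String]): Unit = {
    val asList: Seq[String] = List.fill(20000)("int")
    val asVector: Seq[String] = asList.toVector
    println(s"List is indexed: ${isIndexed(asList)}")
    println(s"Vector is indexed: ${isIndexed(asVector)}")
    println(s"List, by index: ${timeMs(fillByIndex(asList))} ms")
    println(s"List, by foreach: ${timeMs(fillByForeach(asList))} ms")
    println(s"Vector, by index: ${timeMs(fillByIndex(asVector))} ms")
  }
}
```

If the slow test does pass a `List` (or another linear `Seq`), I would expect the index-based variant to degrade with the number of fields while the `foreach` variant stays flat. A quick `isInstanceOf` check at the call site would confirm which case we are in.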