Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/6222#issuecomment-106152311
@rxin, I noticed that the old code (and this code) might have a small
inconsistency w.r.t how we handle `Option` values. In
`createToCatalystConverter`, the code extracts values from options, whereas
`convertToCatalyst` does not attempt to do that. Here's a test case which
demonstrates this:
```scala
test("option handling in convertToCatalyst") {
assert(CatalystTypeConverters.convertToCatalyst(Some(123)) ===
Some(123))
}
test("option handling in createToCatalystConverter") {
assert(CatalystTypeConverters.createToCatalystConverter(IntegerType)(Some(123))
=== 123)
}
```
I guess that the `convertToCatalyst` is only used in a couple of contexts
which aren't passed options. Maybe we should add another `case` statement to
handle those, though, just to keep these two methods consistent.
By the way, I figured out the minor microbenchmark perf. issue that I
mentioned offline: in the old code, `createToCatalystConverter` would return
the same function for all primitives, whereas the new code returns different
functions. If you were converting a row which only contained numeric
primitives, then the old code would have a monomorphic callsite to the
conversion function whereas the new code is polymorphic. This issue also
affects the common case of converting rows which only contain either primitive
types of strings, since the new code takes what used to be a bimorphic
calllsite and makes it polymorphic. I've found a simple fix for this, which
I'll push in a new commit.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]