Github user vlad17 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13778#discussion_r68789054
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
    @@ -349,11 +349,12 @@ object MapObjects {
       def apply(
           function: Expression => Expression,
           inputData: Expression,
    -      elementType: DataType): MapObjects = {
    +      elementType: DataType,
    +      inputDataType: Option[DataType] = None): MapObjects = {
    --- End diff --
    
    I've got quite a few problems with this default:
    1. It looks like it's here to avoid coding work at most call sites, not because it's an obvious value for apply() to take on.
    2. The additional parameter is completely undocumented, and call sites make no mention of it.
    
    Also, the use of Option in the case class constructor is a bit obtuse.
    
    Here is my suggestion:
    1. Remove the default.
    2. In apply(), pass inputDataType.getOrElse(inputData.dataType) as the inputDataType: DataType to the case class constructor, which then uses the parameter's data type without any hidden logic (as it does now).
    3. Document the fact that supplying None to apply() triggers this inference.
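    To make the shape concrete, here is a minimal sketch of the suggestion, using stand-in Expression/DataType types rather than the real Catalyst classes (the actual signatures in objects.scala differ; this only illustrates where the getOrElse inference would live):
    
    ```scala
    // Hypothetical stand-ins for the Catalyst types, for illustration only.
    sealed trait DataType
    case object IntegerType extends DataType
    case class Expression(dataType: DataType)
    
    // The case class takes a plain DataType: no Option, no hidden logic.
    case class MapObjects(
        function: Expression => Expression,
        inputData: Expression,
        elementType: DataType,
        inputDataType: DataType)
    
    object MapObjects {
      /**
       * No default for `inputDataType`; passing None means "infer the type
       * from inputData.dataType", and that inference happens here in
       * apply(), not in the case class constructor.
       */
      def apply(
          function: Expression => Expression,
          inputData: Expression,
          elementType: DataType,
          inputDataType: Option[DataType]): MapObjects =
        MapObjects(function, inputData, elementType,
          inputDataType.getOrElse(inputData.dataType))
    }
    ```
    
    With this shape the constructor stays dumb and the only place inference can happen is the documented apply() overload.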


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---