Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15072#discussion_r84580480
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -2725,4 +2725,14 @@ class Dataset[T] private[sql](
       @inline private def withTypedPlan[U : Encoder](logicalPlan: => LogicalPlan): Dataset[U] = {
         Dataset(sparkSession, logicalPlan)
       }
    +
    +  /** A convenient function to wrap a set based logical plan and produce a Dataset. */
    +  @inline private def withSetOperator[U : Encoder](logicalPlan: => LogicalPlan): Dataset[U] = {
    +    if (classTag.runtimeClass == classOf[Row]) {
    --- End diff --
    
    I _guess_ it'd be fine within Spark, as it always sets `clsTag` to `Row` via 
[RowEncoder.scala#L59](https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/RowEncoder.scala#L59) 
in the path that creates a `DataFrame`; however, it seems potentially problematic if 
a custom encoder is used with a subclass of `Row`, as I guess you meant, because 
the `==` comparison against `classOf[Row]` would then be false.
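
To illustrate the concern, here is a minimal sketch that does not depend on Spark. `Row` and `CustomRow` are hypothetical stand-ins for `org.apache.spark.sql.Row` and a user-defined subclass, and `ClassTagCheck` with its two helpers is invented for this example. It compares the exact-match check from the diff with a subtype-aware check based on `Class.isAssignableFrom`; this is only one possible way to handle the case, not necessarily what the PR ends up doing.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical stand-ins for org.apache.spark.sql.Row and a custom subclass.
trait Row
class CustomRow extends Row

object ClassTagCheck {
  // Exact-match check, as in the diff: true only when T is Row itself.
  def isExactlyRow[T: ClassTag]: Boolean =
    classTag[T].runtimeClass == classOf[Row]

  // Subtype-aware check: true for Row and for any subclass of Row.
  def isRowOrSubclass[T: ClassTag]: Boolean =
    classOf[Row].isAssignableFrom(classTag[T].runtimeClass)
}
```

With these definitions, `ClassTagCheck.isExactlyRow[CustomRow]` is false while `ClassTagCheck.isRowOrSubclass[CustomRow]` is true, which is the gap a custom encoder for a `Row` subclass could fall into.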
    
    Let me try to handle this. Thank you.


