[GitHub] spark pull request: [SPARK-11116] [SQL] First Draft of Dataset API

mateiz Wed, 21 Oct 2015 18:21:08 -0700

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9190#discussion_r42702980
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/Encoder.scala ---
    @@ -46,13 +47,27 @@ trait Encoder[T] {
     
       /**
        * Returns an object of type `T`, extracting the required values from the provided row.  Note that
    -   * you must bind the encoder to a specific schema before you can call this function.
    +   * you must `bind` an encoder to a specific schema before you can call this function.
        */
       def fromRow(row: InternalRow): T
     
       /**
        * Returns a new copy of this encoder, where the expressions used by `fromRow` are bound to the
    -   * given schema
    +   * given schema.
        */
       def bind(schema: Seq[Attribute]): Encoder[T]
    --- End diff --
    
    It would be nice if these were separate from the process of actually encoding values. Otherwise, users who want to write custom encoders will have to do a lot of work. It's also not clear at a glance what each of these APIs is for and when each will be called (i.e. bind vs. bindOrdinals vs. rebind).
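
A minimal sketch of the kind of separation suggested above, under loud assumptions: `Row`, `Attribute`, `Codec`, `BoundDecoder`, `Person`, and `PersonCodec` are hypothetical, simplified stand-ins invented for illustration. None of them are Spark or Catalyst APIs, and this is not how the PR implements encoders. The idea is only that a custom-encoder author writes conversion logic, while schema binding is written once elsewhere.

```scala
// Hypothetical stand-ins for Catalyst's Attribute and InternalRow (not Spark APIs).
final case class Attribute(name: String)
final case class Row(values: Vector[Any])

// What a custom-encoder author would implement: conversion logic only,
// with no knowledge of schemas, ordinals, or rebinding.
trait Codec[T] {
  def fieldNames: Seq[String]
  def encode(value: T): Map[String, Any]
  def decode(fields: Map[String, Any]): T
}

// Binding lives in one place: field names are resolved to ordinals once,
// when the decoder is constructed for a given schema.
final class BoundDecoder[T](codec: Codec[T], schema: Seq[Attribute]) {
  private val ordinals: Seq[(String, Int)] = codec.fieldNames.map { name =>
    val i = schema.indexWhere(_.name == name)
    require(i >= 0, s"field '$name' not found in schema")
    name -> i
  }

  // Reads the bound positions out of the row and hands them to the codec.
  def fromRow(row: Row): T =
    codec.decode(ordinals.map { case (name, i) => name -> row.values(i) }.toMap)
}

// Example user type and its codec (both hypothetical).
final case class Person(name: String, age: Int)

object PersonCodec extends Codec[Person] {
  val fieldNames: Seq[String] = Seq("name", "age")
  def encode(p: Person): Map[String, Any] =
    Map("name" -> p.name, "age" -> p.age)
  def decode(f: Map[String, Any]): Person =
    Person(f("name").asInstanceOf[String], f("age").asInstanceOf[Int])
}

object Demo extends App {
  // The schema lists columns in a different order than the case class fields.
  val schema = Seq(Attribute("age"), Attribute("name"))
  val decoder = new BoundDecoder(PersonCodec, schema)
  println(decoder.fromRow(Row(Vector(30, "Ada"))))
}
```

In this sketch the codec never sees ordinals, so the analogue of rebinding to a different schema is just constructing a new `BoundDecoder` around the same codec.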


