[GitHub] spark pull request: [SPARK-11116] [SQL] First Draft of Dataset API

mateiz Wed, 21 Oct 2015 18:27:39 -0700

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9190#discussion_r42703300
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/Encoder.scala ---
    @@ -46,13 +47,27 @@ trait Encoder[T] {
     
       /**
        * Returns an object of type `T`, extracting the required values from the provided row.  Note that
    -   * you must bind the encoder to a specific schema before you can call this function.
    +   * you must `bind` an encoder to a specific schema before you can call this function.
        */
       def fromRow(row: InternalRow): T
     
       /**
        * Returns a new copy of this encoder, where the expressions used by `fromRow` are bound to the
    -   * given schema
    +   * given schema.
        */
       def bind(schema: Seq[Attribute]): Encoder[T]
    --- End diff --
    
    To simplify it, maybe we can just use `schema` to figure out the order of
    field names this Encoder expects, and internally project the rows we pass to
    it so that they're in that order. It might be somewhat less efficient, I
    guess, but it would be nice if this API were closer to being openable,
    because some people might like to play with it in 1.6.
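
    A rough sketch of the projection idea, in plain Scala so it runs without
    Spark. Everything here is a hypothetical stand-in: `Row` is just an
    `IndexedSeq[Any]` and schemas are field-name lists, not `InternalRow` or
    `Seq[Attribute]`, and `ProjectionSketch`, `ordinals`, and `projector` are
    names made up for illustration.

```scala
// Hypothetical stand-ins for illustration only; not Spark's InternalRow/Attribute.
object ProjectionSketch {
  type Row = IndexedSeq[Any]

  /** For each field the encoder expects, find its position in the incoming schema. */
  def ordinals(expected: Seq[String], incoming: Seq[String]): IndexedSeq[Int] =
    expected.map { name =>
      val i = incoming.indexOf(name)
      require(i >= 0, s"field '$name' not found in incoming schema")
      i
    }.toIndexedSeq

  /** Build a function that reorders each row into the order the encoder expects. */
  def projector(expected: Seq[String], incoming: Seq[String]): Row => Row = {
    val idx = ordinals(expected, incoming) // resolved once, reused for every row
    row => idx.map(i => row(i))
  }
}
```

    The name lookup happens once per schema, but each row is still copied
    into a new sequence, which is probably where the extra cost would come
    from.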



