I actually already made a pull request adding support for arbitrary sequence types.

https://github.com/apache/spark/pull/16240

One small problem remains: Seq.toDS does not work for those types (I couldn't get implicits with multiple type parameters to resolve correctly), but createDataset works fine.

I would be glad if you could bring some attention to it. It's my first code-related pull request and no one has responded to it yet, so I'm wondering whether I'm doing something wrong on that front.

Michal Senkyr


On 16.12.2016 12:04, Jakub Dubovsky wrote:
I will give that a try. Thanks!

On Fri, Dec 16, 2016 at 12:45 AM, Michael Armbrust <mich...@databricks.com> wrote:

    I would have sworn there was a ticket, but I can't find it.  So
    here you go: https://issues.apache.org/jira/browse/SPARK-18891

    A workaround until that is fixed would be for you to manually
    specify the kryo encoder:
    http://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/Encoders.html#kryo%28scala.reflect.ClassTag%29
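A minimal sketch of that workaround, assuming a local SparkSession and a hypothetical case class `Record` standing in for an internally defined type with a List field (neither name comes from this thread). It only illustrates passing `Encoders.kryo` explicitly; it is not a definitive recipe:

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical case class standing in for an internally defined type
// that contains a scala.collection.immutable.List field.
case class Record(id: Int, tags: List[String])

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .appName("kryo-encoder-sketch")
      .master("local[*]")
      .getOrCreate()

    // Provide the Kryo encoder explicitly. spark.implicits._ is deliberately
    // not imported, so the derived case-class encoder is not picked up.
    implicit val recordEncoder: Encoder[Record] = Encoders.kryo[Record]

    val ds: Dataset[Record] =
      spark.createDataset(Seq(Record(1, List("a", "b")), Record(2, Nil)))

    // Typed operations still work; the result encoder is passed explicitly.
    val tagCounts: Array[Int] = ds.map(_.tags.size)(Encoders.scalaInt).collect()
    println(tagCounts.mkString(", "))

    spark.stop()
  }
}
```

One trade-off to keep in mind: a Kryo-encoded Dataset stores each object as a single binary column, so column-level operations on the individual fields are not available.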

    On Thu, Dec 15, 2016 at 8:18 AM, Jakub Dubovsky
    <spark.dubovsky.ja...@gmail.com> wrote:

        Hey,

        I want to ask whether there is any roadmap/plan for adding
        Encoders for further types in upcoming releases of Spark. Here
        is a list of currently supported types:
        http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types
        We would like to use Datasets with our internally defined case
        classes containing scala.collection.immutable.List(s). This
        does not work now because these lists are converted to
        ArrayType (Seq), and the constructor lookup then fails because
        a Seq is not a List.

        This means that for now we are stuck with using RDDs.

        Thanks for any insights!

        Jakub Dubovsky



