Thanks Ryan

On Tue, Feb 5, 2019 at 10:28 PM Ryan Blue <rb...@netflix.com> wrote:
> Shubham,
>
> DataSourceV2 passes Spark's internal representation to your source and
> expects Spark's internal representation back from the source. That's why
> you consume and produce InternalRow: "internal" indicates that Spark
> doesn't need to convert the values.
>
> Spark's internal representation for a date is the ordinal from the unix
> epoch date, 1970-01-01 = 0.
>
> rb
>
> On Tue, Feb 5, 2019 at 4:46 AM Shubham Chaurasia <shubh.chaura...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am using a custom DataSourceV2 implementation (*Spark version 2.3.2*).
>>
>> Here is how I am trying to pass in a *date type* from spark shell:
>>
>>> scala> val df = sc.parallelize(Seq("2019-02-05")).toDF("datetype").withColumn("datetype", col("datetype").cast("date"))
>>> scala> df.write.format("com.shubham.MyDataSource").save
>>
>> Below is the minimal write() method of my DataWriter implementation:
>>
>> @Override
>> public void write(InternalRow record) throws IOException {
>>   ByteArrayOutputStream format = streamingRecordFormatter.format(record);
>>   System.out.println("MyDataWriter.write: " + record.get(0, DataTypes.DateType));
>> }
>>
>> It prints an integer as output:
>>
>> MyDataWriter.write: 17039
>>
>> Is this a bug? Or am I doing something wrong?
>>
>> Thanks,
>> Shubham
>
> --
> Ryan Blue
> Software Engineer
> Netflix
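Ryan's answer can be checked directly with the JDK: Spark stores a DateType value inside an InternalRow as an int counting days since 1970-01-01, which is exactly the epoch-day convention `java.time.LocalDate.ofEpochDay` uses. A minimal sketch (the values 17039 and 2019-02-05 are taken from the thread above; `DateOrdinalDemo` is just an illustrative class name):

```java
import java.time.LocalDate;

public class DateOrdinalDemo {
    public static void main(String[] args) {
        // Spark's internal DateType representation is the day ordinal from the
        // unix epoch (1970-01-01 = 0), so LocalDate.ofEpochDay decodes it.
        System.out.println(LocalDate.ofEpochDay(17932)); // prints 2019-02-05, the date cast in the shell example
        System.out.println(LocalDate.ofEpochDay(17039)); // prints 2016-08-26, decoding the integer write() printed
    }
}
```

So the 17039 seen in write() is not a bug; it is the encoded date, and a writer that needs the calendar value can convert the int back with `LocalDate.ofEpochDay` (or `DateTimeUtils` inside Spark) before formatting it.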