I see, thank you :)

On Mon, Jul 13, 2015 at 11:02 PM, Jacques Nadeau <[email protected]> wrote:
> Wrong line in the code. Actual code:
>
> https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/vector/complex/extended.json#L8
>
> On Mon, Jul 13, 2015 at 3:48 PM, Jacques Nadeau <[email protected]> wrote:
>
> > If you use extended JSON in your JSON file, Drill will automatically
> > convert to TIMESTAMP_MILLIS. You can see an example of the JSON format
> > for this at [1].
> >
> > For checking, one of the parquet-tools options will solve this. I can't
> > remember which one offhand.
> >
> > [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/resources/vector/complex/extended.json#L30
> >
> > On Mon, Jul 13, 2015 at 9:54 AM, Stefán Baxter <[email protected]> wrote:
> >
> >> Hi Jacques,
> >>
> >> How can I tell if it has that notation, and is there a way for me to set
> >> the defaults for the conversion of JSON datetime fields?
> >>
> >> Regards,
> >> -Stefan
> >>
> >> On Mon, Jul 13, 2015 at 3:19 PM, Jacques Nadeau <[email protected]> wrote:
> >>
> >> > There are two different settings inside a Parquet file: physical storage
> >> > and logical annotation. A timestamp should be stored as a physical INT64
> >> > with the TIMESTAMP_MILLIS annotation. See here:
> >> >
> >> > https://github.com/apache/parquet-format/blob/master/src/thrift/parquet.thrift#L105
> >> >
> >> > On Mon, Jul 13, 2015 at 7:47 AM, Stefán Baxter <[email protected]> wrote:
> >> >
> >> > > Thank you.
> >> > >
> >> > > I had seen this. I was just expecting the list to say 'TIMESTAMP_MILLIS' :)
> >> > > (that would up the confidence level for a newbie)
> >> > >
> >> > > Regards,
> >> > > -Stefan
> >> > >
> >> > > On Mon, Jul 13, 2015 at 2:44 PM, Kristine Hahn <[email protected]> wrote:
> >> > >
> >> > > > Expected, I think.
> >> > > >
> >> > > > https://drill.apache.org/docs/parquet-format/#sql-types-to-parquet-logical-types
> >> > > > says that the timestamp type is mapped to the Parquet TIMESTAMP_MILLIS,
> >> > > > which is a Unix timestamp (int64). Take a look at
> >> > > > https://drill.apache.org/docs/data-type-conversion/#to_timestamp and the
> >> > > > Timezone Limitations section.
> >> > > >
> >> > > > On Monday, July 13, 2015, Stefán Baxter <[email protected]> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > I have a JSON file that contains a SQL timestamp.
> >> > > > >
> >> > > > > When I use it to create a Parquet file it seems to become an INT64:
> >> > > > >
> >> > > > > Jul 12, 2015 3:34:59 PM INFO: parquet.hadoop.ColumnChunkPageWriteStore:
> >> > > > > written 153,728B for [occurred_at] INT64: 28,910 values, 231,288B raw,
> >> > > > > 153,681B comp, 1 pages, encodings: [RLE, BIT_PACKED, PLAIN]
> >> > > > >
> >> > > > > Is that to be expected, or am I missing something that needs to be
> >> > > > > done for it to become a timestamp in Parquet?
> >> > > > >
> >> > > > > Regards,
> >> > > > > -Stefan
> >> > > >
> >> > > > --
> >> > > > Kristine Hahn
> >> > > > Sr. Technical Writer
> >> > > > 415-497-8107 @krishahn
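
For reference, the physical/logical distinction discussed in the thread can be illustrated with a minimal Python sketch. It is not Drill or Parquet code: the function names are hypothetical, and it assumes the SQL timestamp string is in UTC. It only shows why a timestamp column shows up as INT64 in the writer log: the stored value is a 64-bit count of milliseconds since the Unix epoch, and the TIMESTAMP_MILLIS annotation tells readers how to interpret it.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_millis(sql_timestamp: str) -> int:
    """Convert 'YYYY-MM-DD HH:MM:SS[.fff]' to epoch milliseconds.

    Assumption: the input has no zone information and is treated as UTC.
    """
    parsed = datetime.fromisoformat(sql_timestamp).replace(tzinfo=timezone.utc)
    delta = parsed - EPOCH
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    return delta.days * 86_400_000 + delta.seconds * 1_000 + delta.microseconds // 1_000


def from_timestamp_millis(millis: int) -> datetime:
    """Interpret an epoch-millisecond value as a UTC datetime (the logical view)."""
    return EPOCH + timedelta(milliseconds=millis)


millis = to_timestamp_millis("2015-07-12 15:34:59.000")
physical = struct.pack("<q", millis)  # the physical view: 8 bytes, signed 64-bit

print(millis)                         # the INT64 value
print(len(physical))                  # bytes per value
print(from_timestamp_millis(millis))  # the same value read as a timestamp
```

At 8 bytes per value, 28,910 values come to 231,280 bytes, which is within a few bytes of the 231,288B raw size reported in the log above.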
