Re: csv date/timestamp type inference in spark 2.0.1

2016-10-26 Thread Hyukjin Kwon
Hi Koert,


I am curious about your case. I guess the purpose of timestampFormat and
dateFormat is to infer timestamps/dates when parsing/inferring,
but not to exclude the type inference/parsing. Actually, it does try to
infer/parse in 2.0.0 as well (but it fails), so I guess there
wouldn't be a big performance difference.


I guess this is type inference, and therefore the right behaviour is that
it tries its best to infer the most appropriate type.

Why don't you just cast the timestamps to strings?
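
A minimal sketch of what I mean, in PySpark. The helper names
(temporal_columns, cast_temporal_to_string) are made up for illustration
and are not part of Spark; the code only relies on DataFrame.dtypes,
DataFrame.withColumn and Column.cast. It assumes the DataFrame was read
with inferSchema enabled:

```python
def temporal_columns(dtypes):
    """Return the names of columns typed as date or timestamp.

    `dtypes` is a list of (column_name, type_name) pairs, which is the
    shape that DataFrame.dtypes returns.
    """
    return [name for name, type_name in dtypes
            if type_name in ("date", "timestamp")]


def cast_temporal_to_string(df):
    """Cast every date/timestamp column of `df` to a string column."""
    for name in temporal_columns(df.dtypes):
        df = df.withColumn(name, df[name].cast("string"))
    return df


# Hypothetical usage, assuming an existing SparkSession called `spark`
# and an illustrative file name:
#   df = spark.read.option("header", "true") \
#                  .option("inferSchema", "true") \
#                  .csv("data.csv")
#   df = cast_temporal_to_string(df)
```

Note that the cast gives back Spark's own string rendering of the
timestamp, which may differ from the original text in the file. If you
need the raw text, passing an explicit schema with string columns (or
not enabling inferSchema at all) is probably the safer route.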


Thanks.


2016-10-27 9:47 GMT+09:00 Koert Kuipers:

> i tried setting both dateFormat and timestampFormat to impossible values
> (e.g. "~|.G~z~a|wW") and it still detected my data to be TimestampType
>
> On Wed, Oct 26, 2016 at 1:15 PM, Koert Kuipers wrote:
>
>> we had the inference of dates/timestamps when reading csv files disabled
>> in spark 2.0.0 by always setting dateFormat to something impossible (e.g.
>> dateFormat "~|.G~z~a|wW")
>>
>> i noticed in spark 2.0.1 that setting this impossible dateFormat does not
>> stop spark from inferring it is a date or timestamp type anyhow. is this
>> intentional? how do i disable inference of datetype/timestamp type now?
>>
>> thanks! koert
>>
>>
>


Re: csv date/timestamp type inference in spark 2.0.1

2016-10-26 Thread Koert Kuipers
i tried setting both dateFormat and timestampFormat to impossible values
(e.g. "~|.G~z~a|wW") and it still detected my data to be TimestampType

On Wed, Oct 26, 2016 at 1:15 PM, Koert Kuipers wrote:

> we had the inference of dates/timestamps when reading csv files disabled
> in spark 2.0.0 by always setting dateFormat to something impossible (e.g.
> dateFormat "~|.G~z~a|wW")
>
> i noticed in spark 2.0.1 that setting this impossible dateFormat does not
> stop spark from inferring it is a date or timestamp type anyhow. is this
> intentional? how do i disable inference of datetype/timestamp type now?
>
> thanks! koert
>
>


csv date/timestamp type inference in spark 2.0.1

2016-10-26 Thread Koert Kuipers
we had the inference of dates/timestamps when reading csv files disabled in
spark 2.0.0 by always setting dateFormat to something impossible (e.g.
dateFormat "~|.G~z~a|wW")

i noticed in spark 2.0.1 that setting this impossible dateFormat does not
stop spark from inferring it is a date or timestamp type anyhow. is this
intentional? how do i disable inference of datetype/timestamp type now?

thanks! koert