I used the queries below to create Parquet tables from two TSV files:

create table dfs.datatransfer.`ct_fremde/2015/07` as
select
to_timestamp(columns[0],'dd.MM.yyyy') as Datum,
columns[1] as Airline_In,
columns[2] as Trip_In,
columns[3] as Ac_Typ,
columns[4] as Ordertype,
to_time(columns[5],'HH:mm') as Start_Time,
columns[6] as End_Time,
columns[7] as Reg_In
from dfs.datatransfer.`CT_Fremde_Juli_2015.tsv`


create table dfs.datatransfer.`ct_fremde/2015/08` as
select
to_timestamp(columns[0],'dd.MM.yyyy') as Datum,
columns[1] as Airline_In,
columns[2] as Trip_In,
columns[3] as Ac_Typ,
columns[4] as Ordertype,
to_time(columns[5],'HH:mm') as Start_Time,
columns[6] as End_Time,
columns[7] as Reg_In
from dfs.datatransfer.`CT_Fremde_August_2015.tsv`


When I query the data using the following SQL:

select distinct dir0 from dfs.datatransfer.`ct_fremde/2015/*`

... I get 07 and 08 as the result.


When I run a group by query:

select dir0,count(3) from dfs.datatransfer.`ct_fremde/2015/*` group by dir0

... I get a count of 2115 for 07 and 2128 for 08.


Now, when I run the following query:

select * from dfs.datatransfer.`ct_fremde/2015/*` where dir0=7

... I get records back.


And when I run this query:

select * from dfs.datatransfer.`ct_fremde/2015/*` where dir0=8

... I do NOT get any records back.
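
One thing I have not ruled out: as far as I understand, Drill exposes dir0 as a VARCHAR, so dir0=8 compares the string '08' with an integer through an implicit cast. Below is a sketch of the same filter with the directory name quoted as a string literal, which avoids the cast. Whether the cast is the actual cause here is an assumption on my part.

```sql
-- Assumption: dir0 is a VARCHAR holding the directory name exactly as it
-- appears on disk ('07', '08'), so it is compared as a string, not a number.
select * from dfs.datatransfer.`ct_fremde/2015/*` where dir0 = '08'
```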


Am I doing something wrong here, or is something else going on?

Greetings,

Uwe

