====================Hive tables========================
fact table: ft
region_id
province_id
city_id

lookup table: lk1
region_id
region_name
province_id
province_name
city_id
city_name


====================Cube Design========================

JOIN:   inner

ft.region_id=lk1.region_id
and ft.province_id=lk1.province_id
and ft.city_id=lk1.city_id


Dimension

Hierarhy 1:  region_id -> province_id -> city_id  (NOTE: all of these 3
columns come from fact table: ft)
Derived 1: region_id, region_name (NOTE: it comes from lookup table: lk1)
Derived 2: province_id, province_name(NOTE: it comes from lookup table: lk1)
Derived 3: city_id, city_name(NOTE: it comes from lookup table: lk1)


is it correct to create the derived dimensions?


====================Cube Building========================
Build the cube:

when the first step, the log says as below:

CREATE EXTERNAL TABLE IF NOT EXISTS
kylin_intermediate_test_cube_1_20150101000000_20150601000000_635c985b_3212_4dd5_9ffa_f100e4c323e1
(
test_cube_1_DT string
,test_cube_1_region_id int
,test_cube_1_province_id int
,test_cube_1_city_id int
,test_cube_1_m1 bigint
,test_cube_1_m2 bigint
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
STORED AS SEQUENCEFILE


we can see that "region_name", "province_name", "city_name" don't appear in
the intermediate hive table created in the 1st step, instead, it only
contains "*_id" columns as listed above,  is it expected?

Reply via email to