A Flink SQL view-related question:
create view order_source as
select order_id, order_goods_id, user_id, ...
from (
  select ..., proctime,
         row_number() over (partition by order_id, order_goods_id
                            order by proctime desc) as rownum
  from hive.temp_dw.dm_trd_order_goods
  /*+ OPTIONS('properties.group.id'='flink_etl_kafka_hbase',
              'scan.startup.mode'='latest-offset') */
) where rownum = 1 and price > 0;
insert into hive.temp_dw.day_order_index
select rowkey, ROW(cast(saleN as BIGINT))
from (
  select order_date as rowkey,
         sum(amount) as saleN
  from order_source
  group by order_date
);
insert into hive.temp_dw.day_order_index
select rowkey, ROW(cast(saleN as BIGINT))
from (
  select order_hour as rowkey,
         sum(amount) as saleN
  from order_source
  group by order_hour
);
Problem: the same view, with the same consumer group but different sinks, produces two separate jobs. As a result, the two jobs effectively share one Kafka consumer group.
The generated jobs are: a. order_source -> sink 1; b. order_source -> sink 2.
The intent was to use the view order_source (which deduplicates the order data inside the view) so that both pipelines reuse the same full copy of the source data and, underneath, can reuse the same copy of state. How can this be achieved?
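One common approach (a sketch, not a confirmed fix for this exact setup) is to submit both INSERTs together as a single statement set, assuming a Flink version whose SQL client supports `BEGIN STATEMENT SET; ... END;` (Flink 1.13+; in the Table API the equivalent is `TableEnvironment.createStatementSet()`). The planner then compiles both sinks into one job and can share the scan of `order_source` and its deduplication state:

```sql
-- Sketch: wrap both INSERTs in one statement set so the planner builds
-- a single job, reusing one Kafka scan and one dedup state for the view.
BEGIN STATEMENT SET;

INSERT INTO hive.temp_dw.day_order_index
SELECT order_date AS rowkey, ROW(CAST(SUM(amount) AS BIGINT))
FROM order_source
GROUP BY order_date;

INSERT INTO hive.temp_dw.day_order_index
SELECT order_hour AS rowkey, ROW(CAST(SUM(amount) AS BIGINT))
FROM order_source
GROUP BY order_hour;

END;
```

Submitted this way, both pipelines belong to one job graph, so there is also only one consumer instance per partition rather than two jobs competing inside the same consumer group.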