Why does "cache table a as select * from b" shuffle and create 2 stages?

2015-12-14 Thread ant2nebula
Why does "cache table a as select * from b" perform a shuffle and create 2 stages?

Example: the table "ods_pay_consume" comes from "KafkaUtils.createDirectStream".

hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")

This code produces a DAG with 2 stages.

Why does "cache table a as select * from b" shuffle and create 2 stages?

2015-12-13 Thread ant2nebula
Why does "cache table a as select * from b" perform a shuffle and create 2 stages?

Example: the table "ods_pay_consume" comes from "KafkaUtils.createDirectStream".

hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")

This code produces a DAG with 2 stages.