With such a small dataset, the by-partition scan might be slower than a full table scan. You can try it on a really large data set, for example hundreds of GB.
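To illustrate the idea, here is a rough Python sketch. It is not Hive or Kylin code: the in-memory "table", the row counts and the function names are all made up for illustration. It only shows that a by-partition scan reads far fewer rows than a full scan with a filter. It does not model any fixed per-job cost, which is probably what dominates when the data is as small as in your test.

```python
from datetime import date, timedelta

# Hypothetical stand-in for a fact table: 30 days, 1000 rows per day.
START = date(2016, 4, 1)
rows = [(START + timedelta(days=d), i) for d in range(30) for i in range(1000)]

# Partitioned layout: one bucket per date, similar in spirit to one
# directory per partition value.
partitions = {}
for day, value in rows:
    partitions.setdefault(day, []).append((day, value))

def full_scan(lo, hi):
    """Unpartitioned table: every row is read, then filtered."""
    rows_read = len(rows)
    matched = [r for r in rows if lo <= r[0] < hi]
    return matched, rows_read

def partition_scan(lo, hi):
    """Partitioned table: only buckets whose key is in [lo, hi) are read."""
    matched, rows_read = [], 0
    for day, bucket in partitions.items():
        if lo <= day < hi:
            rows_read += len(bucket)
            matched.extend(bucket)
    return matched, rows_read

# An incremental build over a 3-day segment.
lo, hi = date(2016, 4, 10), date(2016, 4, 13)
full_matched, full_read = full_scan(lo, hi)
part_matched, part_read = partition_scan(lo, hi)
print(full_read, part_read)  # rows touched by each strategy
```

Both scans return the same rows, but the partitioned one touches only the three days in the segment. That saving grows with the size of the table, which is why it is worth re-testing on a much larger data set.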
You can also refer to this post:
http://blog.cloudera.com/blog/2014/08/improving-query-performance-using-partitioning-in-apache-hive/

2016-05-13 8:30 GMT+08:00 Mars J <[email protected]>:

> I have tested it: I created a partitioned Hive table and used the same
> partition column in Hive and Kylin, but the "create flat table" step took
> longer than it did without the partitioned table. In my test the data is
> very small: without the partitioned table it takes 2.31 mins and the data
> size is 130.96 MB; with the partitioned table it takes 3.17 mins and the
> data size is 21.62 MB (these 2 building processes have the same start date
> and different end dates).
>
> 2016-04-09 15:43 GMT+08:00 ShaoFeng Shi <[email protected]>:
>
>> It is recommended to use the same partition column in Hive and Kylin;
>> that would give better performance in the flat table generation step, but
>> this is not required.
>>
>> 2016-04-09 9:36 GMT+08:00 Mars J <[email protected]>:
>>
>>> Hi,
>>>
>>> Should Hive fact tables and dimension tables be partitioned by a
>>> date column when building incrementally by date?
>>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi
>>
>

--
Best regards,

Shaofeng Shi
