Hi,

I ran some tests on our cluster after installing hadoop-lzo and enabling LZO in Kylin. Kylin performs better with LZO.
I built cubes from two tables: a small one with 10,000 records (Small_Table) and a large one with 4,000,000 records (Large_Table). The cube sizes are clearly smaller with LZO, by about 26% for Large_Table and about 53% for Small_Table:

            Large_Table    Small_Table
No LZO      776.33m        16.15m
LZO         571.49m        7.53m

Because query durations are not very stable, I compared a time-consuming query on Kylin with and without LZO, running it 10 times in each setup. The query looks like this:

    SELECT A,B from Large_Table where A<'5000000000' and B>'5000000000' group by A,B order by A;

On Kylin with LZO, the 10 query durations were:
4.80s, 5.74s, 5.98s, 4.95s, 4.86s, 7.24s, 4.72s, 6.80s, 6.42s, 7.08s
The mean time was 5.859s.

On Kylin without LZO, the 10 query durations were:
11.66s, 6.31s, 7.17s, 6.37s, 6.78s, 6.43s, 7.47s, 5.62s, 7.60s, 6.47s
The mean time was 7.188s.

As for cube build time, I didn't see much improvement. This may be because I didn't build enough times to make an accurate comparison.

Could you please share your experience with Kylin and LZO?

Thanks

Best Regards,
George/倪春恩
Software Engineer/软件工程师
Mobile:+86-13501723787 | Fax:+8610-56842040
北京明略软件系统有限公司 (www.mininglamp.com)
北京市昌平区东小口镇中东路398号中煤建设集团大厦1号楼4层
F4, 1#, Zhongmei Construction Group Plaza, 398# Zhongdong Road, Changping District, Beijing, 102218
----------------------------------------------------------------------------------------------------------------------------
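For anyone who wants to re-check the figures in the message above, here is a minimal Python sketch that recomputes the two mean query times and the cube size reductions from the reported numbers. The variable and function names are illustrative, and the median and percentage figures are derived here, not part of the original report. The sizes are used exactly as reported, with the "m" unit left uninterpreted.

```python
from statistics import mean, median

# Query durations in seconds, copied from the message (10 runs each).
lzo = [4.80, 5.74, 5.98, 4.95, 4.86, 7.24, 4.72, 6.80, 6.42, 7.08]
no_lzo = [11.66, 6.31, 7.17, 6.37, 6.78, 6.43, 7.47, 5.62, 7.60, 6.47]


def summarize(samples):
    """Return mean, median, min and max of a list of durations."""
    return {
        "mean": round(mean(samples), 3),
        "median": round(median(samples), 3),
        "min": min(samples),
        "max": max(samples),
    }


def reduction_pct(before, after):
    """Relative size reduction in percent, rounded to one decimal."""
    return round(100.0 * (before - after) / before, 1)


if __name__ == "__main__":
    print("with LZO:   ", summarize(lzo))
    print("without LZO:", summarize(no_lzo))
    # Cube sizes as reported (No LZO -> LZO).
    print("Large_Table reduction %:", reduction_pct(776.33, 571.49))
    print("Small_Table reduction %:", reduction_pct(16.15, 7.53))
```

The median is included because it is less sensitive to a single slow run such as the 11.66s first query without LZO. Whether that run reflects a warm-up effect is not stated in the message, so both statistics are shown side by side.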
