> 2) We don't have a short-term plan for automatic-multi-partition
> insertion. However there is a simple workaround if you know the
> partition values (and Hive can do multiple inserts in a single
> map-reduce job!). "src" can be a sub query as well.
>
> FROM src
> INSERT OVERWRITE TABLE tgt PARTITION(pcol="2009-08-01") SELECT * WHERE
> ts = "2009-08-01"
> INSERT OVERWRITE TABLE tgt PARTITION(pcol="2009-08-02") SELECT * WHERE
> ts = "2009-08-02"
--------------------------------------------------------

In my case src is also partitioned by "ts", which means the two mappings
should be able to run at the same time, since the data is independent.
Instead, Hive (0.3) produces a linear, partition-by-partition sequence of
jobs. I also do a GROUP BY inside every insert...

Any ideas?

[This, together with the fact that hive --service thriftserver (at least
in 0.3) doesn't support multiple clients, makes it very hard to run some
queries effectively.]

-- 
Andraz Tori, CTO
Zemanta Ltd, New York, London, Ljubljana
www.zemanta.com
mail: [email protected]
tel: +386 41 515 767
twitter: andraz, skype: minmax_test
