Does anybody else have Hive experience they would like to share? Please? Any kind of suggestion is highly appreciated. Thanks in advance.
Renato M.

2010/7/19 Renato Marroquín Mogrovejo <[email protected]>

> Hi Ashish,
>
> I mean whether there are modeling best practices for obtaining better
> performance (related to buckets, partitions, and tables), e.g. maybe
> creating different partitions considering not just time frames but also
> partition size, or, for example, the list partitioning described in the
> Hive paper, which the compiler uses to know where to look for the data,
> or, I dunno, those kinds of modeling-related things.
> Or is it just a matter of choosing between the well-known Kimball and
> Inmon approaches?
> Thanks in advance.
>
>
> Renato M.
>
> 2010/7/16 Ashish Thusoo <[email protected]>
>
>> Hi Renato,
>>
>> Can you expand more on what exactly you mean by modelling?
>>
>> On the append side, Hive does not really support appends, though you can
>> create a new partition within the table for every run, and that could be
>> used as a workaround for appends.
>>
>> Ashish
>>
>> ------------------------------
>> *From:* Renato Marroquín Mogrovejo [mailto:[email protected]]
>> *Sent:* Thursday, July 15, 2010 2:53 PM
>> *To:* [email protected]
>> *Subject:* Hive Usage
>>
>> Hi there. I would like to know if there is anyone who has done some kind
>> of modelling on Hive and is willing to share some experiences, please.
>> Today is my first day with Hive, and I have several questions regarding
>> the modelling: whether I would have to do a special kind of modelling or
>> a regular DW one.
>> Another thing I wanted to know is whether Hive already has the append
>> option enabled, because I know there is a Hadoop branch with the append
>> option enabled, and a Cloudera release has it too (I think it is CDH3).
>> Any kind of suggestion or opinion is highly appreciated.
>>
>>
>> Renato M.
>>
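
For anyone reading the thread later, Ashish's workaround (one new partition per run instead of a true append) might look roughly like the sketch below. It only builds the HiveQL text for a single run. The table name `page_views`, the staging table `page_views_staging`, and the partition column `ds` are hypothetical names made up for illustration, not anything from the thread, and the sketch assumes the staging table has the same non-partition columns as the target.

```python
from datetime import date


def partition_per_run_statement(table: str, source: str, run_day: date) -> str:
    """Build a HiveQL statement that writes one run into its own partition.

    Assumptions (not from the thread): the target table is partitioned by a
    string column named `ds`, and `source` has the same non-partition
    columns as `table`. All names here are hypothetical.
    """
    ds = run_day.isoformat()  # e.g. 2010-07-19
    # INSERT OVERWRITE with a static partition spec only replaces that one
    # partition, so earlier runs stay untouched in their own partitions.
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION (ds='{ds}') "
        f"SELECT * FROM {source}"
    )


stmt = partition_per_run_statement(
    "page_views", "page_views_staging", date(2010, 7, 19)
)
print(stmt)
```

Each run writes to a fresh `ds` value, so the table grows partition by partition. Queries that filter on `ds` can then be limited to the matching partitions.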
