Could it be the addition of the memstoreTS? I forget whether that is in v1 as well.
Matt

On Tue, Aug 28, 2012 at 7:37 AM, Stack <[email protected]> wrote:
> On Mon, Aug 27, 2012 at 8:30 PM, anil gupta <[email protected]> wrote:
> > Hi All,
> >
> > Here are the steps i followed to load the table with HFilev1 format:
> > 1. Set the property hfile.format.version to 1.
> > 2. Updated the conf across the cluster.
> > 3. Restarted the cluster.
> > 4. Ran the bulk loader.
> >
> > Table has 34 million records and one column family.
> > Results:
> > HDFS space for one replica of table in HFilev2:39.8 GB
> > HDFS space for one replica of table in HFilev1:38.4 GB
> >
> > Ironically, as per the above results HFileV1 is taking 3.5% lesser space
> > than HFileV2 format. I also skimmed through the code and i saw references
> > to "hfile.format.version" in HFile.java class.
> >
> > It would be interesting to know what makes up the 3.5% difference?
>
> More metadata on the end of the file on v2?
>
> St.Ack
>
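
For reference, the gap quoted above can also be expressed per record, which may help when weighing the two guesses (a per-cell field versus trailing file metadata). This is only a rough sketch built from the numbers in the thread. It assumes 1 GB = 10^9 bytes, since the thread does not say which unit was reported, and the function and variable names are illustrative.

```python
# Figures quoted in the thread; nothing here is measured independently.
V2_GB = 39.8          # HDFS space for one replica, HFile v2
V1_GB = 38.4          # HDFS space for one replica, HFile v1
RECORDS = 34_000_000  # "34 million records"


def size_gap(v1_gb, v2_gb, records, bytes_per_gb=10**9):
    """Return (difference in GB, difference as % of v2, extra bytes per record).

    bytes_per_gb is an assumption: pass 2**30 if the sizes were binary gigabytes.
    """
    diff_gb = v2_gb - v1_gb
    pct_of_v2 = 100.0 * diff_gb / v2_gb
    bytes_per_record = diff_gb * bytes_per_gb / records
    return diff_gb, pct_of_v2, bytes_per_record


diff_gb, pct, per_record = size_gap(V1_GB, V2_GB, RECORDS)
print(f"{diff_gb:.1f} GB, {pct:.1f}% of v2, ~{per_record:.0f} bytes per record")
```

With decimal gigabytes this works out to roughly 41 extra bytes per record, or about 44 with `bytes_per_gb=2**30`. A record here may hold several cells, so the per-cell figure would be smaller by however many columns each row carries.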
