Hi,

Yup, after some digging I got to HFileOutputFormat and was relieved to
find that it does support compression. I was able to add code to set
the compression based on each column family's compression setting.

Will create a ticket and submit the patch after some more testing and a
pass over the coding guidelines. My code looks a little hacky because I
am passing the family-specific compression algorithm names as a single
","-delimited configuration item. I figure Configuration should have a
method to return all key/value pairs whose keys match a pattern; maybe
there are better ways to do this. Will get the details into the ticket.
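Roughly, the approach looks something like this. (The config key and
method names below are placeholders until the patch is up, and it
assumes family names never contain ',' or '='; the real patch should
probably escape the values.)

  import java.util.Map;
  import java.util.TreeMap;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.util.Bytes;

  public class FamilyCompressionSketch {

    // Client side: flatten each family's compression algorithm into a
    // single "family=algorithm" list on one configuration key.
    // NOTE: the key name here is just a placeholder for now.
    public static void configureCompression(HTableDescriptor tableDesc,
                                            Configuration conf) {
      StringBuilder sb = new StringBuilder();
      for (HColumnDescriptor family : tableDesc.getFamilies()) {
        if (sb.length() > 0) {
          sb.append(",");
        }
        sb.append(family.getNameAsString());
        sb.append("=");
        sb.append(family.getCompression().getName());
      }
      conf.set("hbase.hfileoutputformat.families.compression",
          sb.toString());
    }

    // Job side (inside HFileOutputFormat): parse the key back into a
    // per-family map; families missing from the map would fall back
    // to the existing "hfile.compression" default.
    public static Map<byte[], String> createFamilyCompressionMap(
        Configuration conf) {
      Map<byte[], String> map =
          new TreeMap<byte[], String>(Bytes.BYTES_COMPARATOR);
      String value =
          conf.get("hbase.hfileoutputformat.families.compression", "");
      for (String pair : value.split(",")) {
        if (pair.isEmpty()) {
          continue;
        }
        String[] familyAndAlgorithm = pair.split("=");
        map.put(Bytes.toBytes(familyAndAlgorithm[0]),
            familyAndAlgorithm[1]);
      }
      return map;
    }
  }

The RecordWriter then looks the family up in this map when it opens a
new HFile.Writer, instead of using the single "hfile.compression" value
for everything.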
Thanks and regards,
- Ashish

On Mon, 24 Jan 2011 11:12:06 -0800
Todd Lipcon <[email protected]> wrote:

> On Mon, Jan 24, 2011 at 9:50 AM, Stack <[email protected]> wrote:
>
> > In HFileOutputFormat it says this near top:
> >
> >   // Invented config.  Add to hbase-*.xml if other than default
> >   // compression.
> >   final String compression = conf.get("hfile.compression",
> >     Compression.Algorithm.NONE.getName());
> >
> > You might try messing with this config?
> >
>
> And would be great to file (and provide a patch for) a JIRA that
> automatically sets this based on the HTableDescriptor when you're
> loading into an existing table!
>
> -Todd
>
> > On Sun, Jan 23, 2011 at 9:38 PM, Ashish Shinde <[email protected]>
> > wrote:
> > > Hi,
> > >
> > > I have been importing data to hbase 0.90.0 using code from the
> > > bulk uploader (ImportTsv.java). The table has LZO compression
> > > set; however, unless major compaction is run on the table, it
> > > does not get compressed.
> > >
> > > Is there a way to compress the table as the bulk uploader creates
> > > the HFiles? This is important for us because we don't want a
> > > burst increase in our disk usage.
> > >
> > > Thanks and regards,
> > > - Ashish
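P.S. For a single table-wide algorithm, Stack's suggestion already
works without any patch; "messing with this config" before submitting
the job would look something like:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.io.hfile.Compression;

  // Table-wide workaround: every HFile the job writes will use LZO,
  // regardless of which column family it belongs to.
  Configuration conf = HBaseConfiguration.create();
  conf.set("hfile.compression", Compression.Algorithm.LZO.getName());

The per-family map is only needed when different families use
different algorithms.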
