You need to enable LZO compression on the target table (the table you are importing into), but I assume you did that.
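
For reference, here is a rough sketch of that step. The helper function import_with_lzo and its three arguments (table name, column family, export directory) are placeholders made up for illustration, not values from this thread. It assumes the hbase launcher is on the PATH and uses the stock org.apache.hadoop.hbase.mapreduce.Import job. As far as I know, Export writes the rows out as plain sequence files rather than copying the LZO-compressed store files, so the compression on the new table comes only from the settings of the table you import into.

```shell
# Sketch only: every name passed in is a placeholder chosen by the caller.
import_with_lzo() {
  table="$1"       # target table to create and import into
  family="$2"      # column family that should be LZO-compressed
  export_dir="$3"  # HDFS directory holding the files written by Export

  # 1. Create the target table with LZO enabled on the column family.
  echo "create '$table', {NAME => '$family', COMPRESSION => 'LZO'}" | hbase shell

  # 2. Print the schema so you can confirm COMPRESSION => 'LZO' before loading.
  echo "describe '$table'" | hbase shell

  # 3. Run the import MapReduce job into the pre-created table.
  hbase org.apache.hadoop.hbase.mapreduce.Import "$table" "$export_dir"
}
```

You would call it as, for example, import_with_lzo mytable content /user/root/mytable-export (again, hypothetical names), then flush and major_compact the table before comparing sizes.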

----- Original Message -----
From: Lord Khan Han <[email protected]>
To: [email protected]
Cc:
Sent: Saturday, December 10, 2011 10:09 AM
Subject: Re: Hbase export / import Why doubling the Table Size ?

When we export from an HBase table that has LZO compression on it, is the exported file decompressed, or is it written as is with the LZO-compressed columns?

On Sat, Dec 10, 2011 at 6:40 PM, Lord Khan Han <[email protected]> wrote:

> It is a success for both LZO and Snappy. The content is HTML documents
> (web documents).
>
> hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://localhost:8020/user/root/testfile.lzo lzo
>
> 11/12/10 18:37:04 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 11/12/10 18:37:04 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 2ad6654f3e9cad97d13f716e51a0509253c0aabb]
> 11/12/10 18:37:04 INFO compress.CodecPool: Got brand-new compressor
> SUCCESS
>
> On Sat, Dec 10, 2011 at 1:03 PM, Lars George <[email protected]> wrote:
>
>> Could you use the CompressionTest to verify that the library path is set
>> up properly?
>>
>> $ hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://<your-namenode>:8020/<some-writable-path>/test.lzo lzo
>>
>> Does it report OK? Same for Snappy? The reason I am asking is that when
>> it does not find the native libs it uses no compression at all, and if
>> your original was compressed then you will see the copied one being
>> uncompressed and therefore much larger.
>>
>> Also, what is the content like? How large are the cells that are stored?
>>
>> Lars
>>
>> On Dec 10, 2011, at 8:53 AM, Lord Khan Han wrote:
>>
>> > I will check the reverse export/import to CDH3B4 today to see whether
>> > it is the same size in that cluster.
>> >
>> > When we use Hadoop distcp, how can we deal with the .META. table? We
>> > are copying one table, not all of them, and .META. also holds region
>> > info, including their DNS, which is of course different in the new
>> > cluster.
>> >
>> > I tried the import again today with no compression. It doubled the
>> > exported file size! I mean, I have a 200 GB exported HBase table;
>> > when I import it without compression it goes to 400 GB. It is
>> > definitely writing something twice.
>> >
>> > thanks
>> >
>> > On Sat, Dec 10, 2011 at 2:19 AM, lars hofhansl <[email protected]> wrote:
>> >
>> >> There's CopyTable (also an MR job - written by J-D), but it reuses the
>> >> mapper class from Import.java, so it probably won't make a difference.
>> >>
>> >> What I meant to say below... When you export/import the table from your
>> >> CDH3u2 cluster back to your CDH3B4 cluster, is the size still doubled?
>> >>
>> >> If both clusters are shut down, you can use Hadoop's distcp to copy
>> >> directly on the filesystem level; in fact that might be your best
>> >> option.
>> >>
>> >> -- Lars
>> >>
>> >> ----- Original Message -----
>> >> From: Lord Khan Han <[email protected]>
>> >> To: [email protected]; lars hofhansl <[email protected]>
>> >> Cc:
>> >> Sent: Friday, December 9, 2011 4:05 PM
>> >> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>
>> >> Thanks for your time.
>> >>
>> >> Is there any reliable way to copy a table between these clusters
>> >> instead of export/import?
>> >>
>> >> On Sat, Dec 10, 2011 at 1:39 AM, lars hofhansl <[email protected]> wrote:
>> >>
>> >>> Hmm... I'm afraid I am out of options. If you want you can try to copy
>> >>> the table from CDH3u2 to your CDH3B4 system, and see if the size
>> >>> remains doubled.
>> >>>
>> >>> Does this happen with a very small table, too? If so, you could take a
>> >>> small sample HFile and upload it (both the CDH3B4 and CDH3u2 versions)
>> >>> somewhere so that we can have a look.
>> >>>
>> >>> -- Lars
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Lord Khan Han <[email protected]>
>> >>> To: [email protected]; lars hofhansl <[email protected]>
>> >>> Cc:
>> >>> Sent: Friday, December 9, 2011 2:45 PM
>> >>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>>
>> >>> In the identically configured cluster (carbon copy), when I ran the
>> >>> import there was no increase in size. Same size.
>> >>>
>> >>> The problem is in CDH3u2.
>> >>>
>> >>> On Sat, Dec 10, 2011 at 12:42 AM, lars hofhansl <[email protected]> wrote:
>> >>>
>> >>>> What happens when you export/import into the same (CDH3B4) cluster
>> >>>> using a new table name?
>> >>>> Does the size double as well?
>> >>>>
>> >>>> ----- Original Message -----
>> >>>> From: Lord Khan Han <[email protected]>
>> >>>> To: [email protected]; lars hofhansl <[email protected]>
>> >>>> Cc:
>> >>>> Sent: Friday, December 9, 2011 2:27 PM
>> >>>> Subject: Re: Hbase export / import Why doubling the Table Size ?
>> >>>>
>> >>>> I flushed and major_compacted. Nothing changed. I have been stuck on
>> >>>> this for the last two days :( Any ideas?
>> >>>>
>> >>>> On Sat, Dec 10, 2011 at 12:11 AM, Lord Khan Han <[email protected]> wrote:
>> >>>>
>> >>>>> Now flushed and compacting again.
>> >>>>>
>> >>>>> One more clue:
>> >>>>>
>> >>>>> I tested importing into CDH3B4 (same as the exporting cluster) with
>> >>>>> LZO. All is okay; the table size is the same.
>> >>>>> Then I upgraded to CDH3u2; the table is also OK and the same size.
>> >>>>>
>> >>>>> But when I try to import in CDH3u2, this size doubling happens.
>> >>>>>
>> >>>>> On Sat, Dec 10, 2011 at 12:07 AM, Lord Khan Han <[email protected]> wrote:
>> >>>>>
>> >>>>>> I ran major_compact but not flush. Will do it now with flush.
>> >>>>>>
>> >>>>>> On Fri, Dec 9, 2011 at 11:58 PM, lars hofhansl <[email protected]> wrote:
>> >>>>>>
>> >>>>>>> Can you try flushing and compacting the table? How did you measure
>> >>>>>>> the size?
>> >>>>>>>
>> >>>>>>> Both can be done from the shell using the 'flush' and
>> >>>>>>> 'major_compact' commands, resp.
>> >>>>>>>
>> >>>>>>> ----- Original Message -----
>> >>>>>>> From: Lord Khan Han <[email protected]>
>> >>>>>>> To: [email protected]
>> >>>>>>> Cc:
>> >>>>>>> Sent: Friday, December 9, 2011 1:50 PM
>> >>>>>>> Subject: Hbase export / import Why doubling the Table Size ?
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> We are using CDH3B4 and want to upgrade to CDH3u2. Before doing
>> >>>>>>> this we built a separate cluster with the same config and
>> >>>>>>> installed CDH3u2.
>> >>>>>>>
>> >>>>>>> We exported our HBase table from the CDH3B4 cluster and imported
>> >>>>>>> it into the new CDH3u2 cluster. The table is LZO and both cluster
>> >>>>>>> configs are the same.
>> >>>>>>>
>> >>>>>>> After the import finished, the HBase table size had doubled, even
>> >>>>>>> though it is configured to use LZO. We changed the table to
>> >>>>>>> Snappy, imported again, and got the same result. The table size
>> >>>>>>> is multiplied by 2 in the new CDH3u2 cluster.
>> >>>>>>>
>> >>>>>>> We did not find out why. Are there any ideas for this?
>> >>>>>>>
>> >>>>>>> thanks
>> >>>>>>>
>> >>>>>>> Khan
