Ah, that's a great piece of info J-D! I had 4 families just as a logical division. I don't think I'm really using the fact that we have 4 different families anywhere. Thanks a lot for the information.
thanks,
hari

On Thu, Nov 11, 2010 at 10:45 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Oh I see, you are using 4 families. An important thing to know (and
> it's not super obvious) is that the regions flush on the total size of
> the memstore across all families (there's one memstore per family,
> learn more here
> http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html).
>
> This is actually a deficiency which will be solved in the context of
> https://issues.apache.org/jira/browse/HBASE-3149
>
> Generally, I rarely see any reason to use more than 1 family. You
> really have to be in a case where the stored data is very different in
> nature and requires specific family-level configurations. Here across
> our 100+ tables, only 3-4 have more than one family and I'm sure that
> number should be lower.
>
> J-D
>
> On Thu, Nov 11, 2010 at 12:54 AM, Hari Sreekumar
> <[email protected]> wrote:
> > Here's the output of lsr on one of the tables:
> >
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33
> > /hbase/Webevent/1102232448
> > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:33
> > /hbase/Webevent/1102232448/.regioninfo
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Channel
> > -rw-r--r-- 3 hadoop supergroup 16943616 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Channel/7714679806810147132
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Customer
> > -rw-r--r-- 3 hadoop supergroup 19089809 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Customer/228422950590673569
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Event
> > -rw-r--r-- 3 hadoop supergroup 96925019 2010-11-11 13:33
> > /hbase/Webevent/1102232448/Event/3246797304454611713
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34
> > /hbase/Webevent/1102232448/User
> > -rw-r--r-- 3 hadoop supergroup 176008329 2010-11-11 13:33 > >
/hbase/Webevent/1102232448/User/6713166405821540696 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/1102232448/http > > -rw-r--r-- 3 hadoop supergroup 36644077 2010-11-11 13:34 > > /hbase/Webevent/1102232448/http/5528514474393215140 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/1181349092 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:40 > > /hbase/Webevent/1181349092/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Channel > > -rw-r--r-- 3 hadoop supergroup 14203831 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Channel/1711324265142021994 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Customer > > -rw-r--r-- 3 hadoop supergroup 14091927 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Customer/3269372098573435637 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Event > > -rw-r--r-- 3 hadoop supergroup 80842368 2010-11-11 13:40 > > /hbase/Webevent/1181349092/Event/1632526964097525926 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:41 > > /hbase/Webevent/1181349092/User > > -rw-r--r-- 3 hadoop supergroup 146490419 2010-11-11 13:40 > > /hbase/Webevent/1181349092/User/723684665063798772 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:41 > > /hbase/Webevent/1181349092/http > > -rw-r--r-- 3 hadoop supergroup 27612664 2010-11-11 13:41 > > /hbase/Webevent/1181349092/http/3591070734425406504 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:28 > > /hbase/Webevent/124990928 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:28 > > /hbase/Webevent/124990928/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/124990928/Channel > > -rw-r--r-- 3 hadoop supergroup 23700865 2010-11-11 13:35 > > /hbase/Webevent/124990928/Channel/3488091559288595522 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/124990928/Customer > > -rw-r--r-- 3 
hadoop supergroup 23572454 2010-11-11 13:35 > > /hbase/Webevent/124990928/Customer/522070966307001888 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:36 > > /hbase/Webevent/124990928/Event > > -rw-r--r-- 3 hadoop supergroup 126857284 2010-11-11 13:35 > > /hbase/Webevent/124990928/Event/8659573512216796018 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:36 > > /hbase/Webevent/124990928/User > > -rw-r--r-- 3 hadoop supergroup 229590074 2010-11-11 13:36 > > /hbase/Webevent/124990928/User/4169913968975354294 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:36 > > /hbase/Webevent/124990928/http > > -rw-r--r-- 3 hadoop supergroup 43849622 2010-11-11 13:36 > > /hbase/Webevent/124990928/http/798925777717846362 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:22 > > /hbase/Webevent/13518424 > > -rw-r--r-- 3 hadoop supergroup 2316 2010-11-11 13:22 > > /hbase/Webevent/13518424/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:24 > > /hbase/Webevent/13518424/Channel > > -rw-r--r-- 3 hadoop supergroup 11192244 2010-11-11 13:24 > > /hbase/Webevent/13518424/Channel/6283534518250465269 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:24 > > /hbase/Webevent/13518424/Customer > > -rw-r--r-- 3 hadoop supergroup 16335757 2010-11-11 13:24 > > /hbase/Webevent/13518424/Customer/8233555538562313638 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:25 > > /hbase/Webevent/13518424/Event > > -rw-r--r-- 3 hadoop supergroup 86782869 2010-11-11 13:24 > > /hbase/Webevent/13518424/Event/7296313542067955537 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:25 > > /hbase/Webevent/13518424/User > > -rw-r--r-- 3 hadoop supergroup 157614762 2010-11-11 13:25 > > /hbase/Webevent/13518424/User/5713897981539665344 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:25 > > /hbase/Webevent/13518424/http > > -rw-r--r-- 3 hadoop supergroup 31036461 2010-11-11 13:25 > > /hbase/Webevent/13518424/http/3276765473089850908 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:22 > > 
/hbase/Webevent/1397796225 > > -rw-r--r-- 3 hadoop supergroup 2144 2010-11-11 13:22 > > /hbase/Webevent/1397796225/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/1397796225/Channel > > -rw-r--r-- 3 hadoop supergroup 3937460 2010-11-11 13:30 > > /hbase/Webevent/1397796225/Channel/3684194843745008101 > > -rw-r--r-- 3 hadoop supergroup 13426908 2010-11-11 13:27 > > /hbase/Webevent/1397796225/Channel/5763776518727398923 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/1397796225/Customer > > -rw-r--r-- 3 hadoop supergroup 9358001 2010-11-11 13:30 > > /hbase/Webevent/1397796225/Customer/2373893879659383981 > > -rw-r--r-- 3 hadoop supergroup 15152448 2010-11-11 13:23 > > /hbase/Webevent/1397796225/Customer/5404281688196690956 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/1397796225/Event > > -rw-r--r-- 3 hadoop supergroup 49691275 2010-11-11 13:30 > > /hbase/Webevent/1397796225/Event/1611219478516160819 > > -rw-r--r-- 3 hadoop supergroup 80075191 2010-11-11 13:23 > > /hbase/Webevent/1397796225/Event/4491108423840726530 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/1397796225/User > > -rw-r--r-- 3 hadoop supergroup 235564578 2010-11-11 13:31 > > /hbase/Webevent/1397796225/User/1070607442453415896 > > -rw-r--r-- 3 hadoop supergroup 145355910 2010-11-11 13:23 > > /hbase/Webevent/1397796225/User/6446151707620200218 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:32 > > /hbase/Webevent/1397796225/http > > -rw-r--r-- 3 hadoop supergroup 46665707 2010-11-11 13:31 > > /hbase/Webevent/1397796225/http/2613117415168100829 > > -rw-r--r-- 3 hadoop supergroup 28997988 2010-11-11 13:24 > > /hbase/Webevent/1397796225/http/7620282531029987336 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/1568886745 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:30 > > /hbase/Webevent/1568886745/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 
2010-11-11 13:32 > > /hbase/Webevent/1568886745/Channel > > -rw-r--r-- 3 hadoop supergroup 22384663 2010-11-11 13:32 > > /hbase/Webevent/1568886745/Channel/3092028782443043693 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:32 > > /hbase/Webevent/1568886745/Customer > > -rw-r--r-- 3 hadoop supergroup 24060024 2010-11-11 13:32 > > /hbase/Webevent/1568886745/Customer/2143995643997658656 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:32 > > /hbase/Webevent/1568886745/Event > > -rw-r--r-- 3 hadoop supergroup 111172989 2010-11-11 13:32 > > /hbase/Webevent/1568886745/Event/606180646892333139 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/1568886745/User > > -rw-r--r-- 3 hadoop supergroup 201627486 2010-11-11 13:32 > > /hbase/Webevent/1568886745/User/1159084185112718235 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/1568886745/http > > -rw-r--r-- 3 hadoop supergroup 42824881 2010-11-11 13:33 > > /hbase/Webevent/1568886745/http/3005498889980823864 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:32 > > /hbase/Webevent/1585185360 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:32 > > /hbase/Webevent/1585185360/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1585185360/Channel > > -rw-r--r-- 3 hadoop supergroup 13146621 2010-11-11 13:38 > > /hbase/Webevent/1585185360/Channel/2384148253824087933 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1585185360/Customer > > -rw-r--r-- 3 hadoop supergroup 17772527 2010-11-11 13:38 > > /hbase/Webevent/1585185360/Customer/7079893521022823531 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:39 > > /hbase/Webevent/1585185360/Event > > -rw-r--r-- 3 hadoop supergroup 97860459 2010-11-11 13:38 > > /hbase/Webevent/1585185360/Event/4129421247504808018 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:39 > > /hbase/Webevent/1585185360/User > > -rw-r--r-- 3 hadoop supergroup 177262872 2010-11-11 13:39 > > 
/hbase/Webevent/1585185360/User/5689647586095222756 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:39 > > /hbase/Webevent/1585185360/http > > -rw-r--r-- 3 hadoop supergroup 38392938 2010-11-11 13:39 > > /hbase/Webevent/1585185360/http/1513015171284860625 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/1679169023 > > -rw-r--r-- 3 hadoop supergroup 1970 2010-11-11 13:31 > > /hbase/Webevent/1679169023/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Channel > > -rw-r--r-- 3 hadoop supergroup 16691718 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Channel/3995013105248642215 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Customer > > -rw-r--r-- 3 hadoop supergroup 18627546 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Customer/2697135409291299740 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Event > > -rw-r--r-- 3 hadoop supergroup 97721412 2010-11-11 13:37 > > /hbase/Webevent/1679169023/Event/5517850771377063599 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1679169023/User > > -rw-r--r-- 3 hadoop supergroup 177198181 2010-11-11 13:37 > > /hbase/Webevent/1679169023/User/1664697801534568988 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1679169023/http > > -rw-r--r-- 3 hadoop supergroup 35558386 2010-11-11 13:38 > > /hbase/Webevent/1679169023/http/2236900881608337670 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/1837902643 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:30 > > /hbase/Webevent/1837902643/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Channel > > -rw-r--r-- 3 hadoop supergroup 12956819 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Channel/7551397343290053516 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Customer > > 
-rw-r--r-- 3 hadoop supergroup 18017948 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Customer/1637842838964675843 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Event > > -rw-r--r-- 3 hadoop supergroup 99238886 2010-11-11 13:33 > > /hbase/Webevent/1837902643/Event/4961580175946952300 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/1837902643/User > > -rw-r--r-- 3 hadoop supergroup 179431668 2010-11-11 13:33 > > /hbase/Webevent/1837902643/User/8513763763938668916 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/1837902643/http > > -rw-r--r-- 3 hadoop supergroup 35275755 2010-11-11 13:34 > > /hbase/Webevent/1837902643/http/1801439100480395261 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:25 > > /hbase/Webevent/1840258192 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:25 > > /hbase/Webevent/1840258192/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/1840258192/Channel > > -rw-r--r-- 3 hadoop supergroup 15810928 2010-11-11 13:34 > > /hbase/Webevent/1840258192/Channel/8758451310929982789 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/1840258192/Customer > > -rw-r--r-- 3 hadoop supergroup 16184063 2010-11-11 13:34 > > /hbase/Webevent/1840258192/Customer/8209107027540853853 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/1840258192/Event > > -rw-r--r-- 3 hadoop supergroup 89893065 2010-11-11 13:34 > > /hbase/Webevent/1840258192/Event/2507733338503153306 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/1840258192/User > > -rw-r--r-- 3 hadoop supergroup 162202298 2010-11-11 13:35 > > /hbase/Webevent/1840258192/User/3877054643528147835 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/1840258192/http > > -rw-r--r-- 3 hadoop supergroup 30458950 2010-11-11 13:35 > > /hbase/Webevent/1840258192/http/7057895626422451135 > > drwxr-xr-x - 
hadoop supergroup 0 2010-11-11 13:28 > > /hbase/Webevent/1857066524 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:28 > > /hbase/Webevent/1857066524/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:28 > > /hbase/Webevent/1857066524/Channel > > -rw-r--r-- 3 hadoop supergroup 17158229 2010-11-11 13:28 > > /hbase/Webevent/1857066524/Channel/660294007043817390 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:28 > > /hbase/Webevent/1857066524/Customer > > -rw-r--r-- 3 hadoop supergroup 17982120 2010-11-11 13:28 > > /hbase/Webevent/1857066524/Customer/8154314358497892797 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:36 > > /hbase/Webevent/1857066524/Event > > -rw-r--r-- 3 hadoop supergroup 103807737 2010-11-11 13:28 > > /hbase/Webevent/1857066524/Event/8608458148878068560 > > -rw-r--r-- 3 hadoop supergroup 103807737 2010-11-11 13:36 > > /hbase/Webevent/1857066524/Event/8753716512365715611 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:29 > > /hbase/Webevent/1857066524/User > > -rw-r--r-- 3 hadoop supergroup 188208796 2010-11-11 13:28 > > /hbase/Webevent/1857066524/User/5807656088473870598 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:29 > > /hbase/Webevent/1857066524/http > > -rw-r--r-- 3 hadoop supergroup 35830676 2010-11-11 13:29 > > /hbase/Webevent/1857066524/http/4192260931766222885 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/1954991296 > > -rw-r--r-- 3 hadoop supergroup 2318 2010-11-11 13:31 > > /hbase/Webevent/1954991296/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1954991296/Channel > > -rw-r--r-- 3 hadoop supergroup 14723821 2010-11-11 13:38 > > /hbase/Webevent/1954991296/Channel/1271796192395132719 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1954991296/Customer > > -rw-r--r-- 3 hadoop supergroup 16998002 2010-11-11 13:38 > > /hbase/Webevent/1954991296/Customer/1871613240079217431 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 
13:38 > > /hbase/Webevent/1954991296/Event > > -rw-r--r-- 3 hadoop supergroup 90132913 2010-11-11 13:38 > > /hbase/Webevent/1954991296/Event/8627908912432238564 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1954991296/User > > -rw-r--r-- 3 hadoop supergroup 163362248 2010-11-11 13:38 > > /hbase/Webevent/1954991296/User/8343583184278031381 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:38 > > /hbase/Webevent/1954991296/http > > -rw-r--r-- 3 hadoop supergroup 37650515 2010-11-11 13:38 > > /hbase/Webevent/1954991296/http/783502764043910698 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:16 > > /hbase/Webevent/387441199 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:16 > > /hbase/Webevent/387441199/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:23 > > /hbase/Webevent/387441199/Channel > > -rw-r--r-- 3 hadoop supergroup 8751094 2010-11-11 13:22 > > /hbase/Webevent/387441199/Channel/6907788666949153760 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:23 > > /hbase/Webevent/387441199/Customer > > -rw-r--r-- 3 hadoop supergroup 16526400 2010-11-11 13:23 > > /hbase/Webevent/387441199/Customer/52924882214004995 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:23 > > /hbase/Webevent/387441199/Event > > -rw-r--r-- 3 hadoop supergroup 96466783 2010-11-11 13:23 > > /hbase/Webevent/387441199/Event/991918398642333797 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:24 > > /hbase/Webevent/387441199/User > > -rw-r--r-- 3 hadoop supergroup 173755411 2010-11-11 13:23 > > /hbase/Webevent/387441199/User/3697716047653972271 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:22 > > /hbase/Webevent/387441199/http > > -rw-r--r-- 3 hadoop supergroup 29164625 2010-11-11 13:19 > > /hbase/Webevent/387441199/http/2172660655272329198 > > -rw-r--r-- 3 hadoop supergroup 3505176 2010-11-11 13:22 > > /hbase/Webevent/387441199/http/9190482934578742068 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/480045516 > > 
-rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:30 > > /hbase/Webevent/480045516/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/480045516/Channel > > -rw-r--r-- 3 hadoop supergroup 14777812 2010-11-11 13:37 > > /hbase/Webevent/480045516/Channel/2328066899305806515 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/480045516/Customer > > -rw-r--r-- 3 hadoop supergroup 18953627 2010-11-11 13:37 > > /hbase/Webevent/480045516/Customer/2078047623290175963 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/480045516/Event > > -rw-r--r-- 3 hadoop supergroup 104229664 2010-11-11 13:37 > > /hbase/Webevent/480045516/Event/910211247163239598 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/480045516/User > > -rw-r--r-- 3 hadoop supergroup 189096799 2010-11-11 13:37 > > /hbase/Webevent/480045516/User/5717389634644419119 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:37 > > /hbase/Webevent/480045516/http > > -rw-r--r-- 3 hadoop supergroup 36533404 2010-11-11 13:37 > > /hbase/Webevent/480045516/http/8604372036650962237 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/601109706 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:40 > > /hbase/Webevent/601109706/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/601109706/Channel > > -rw-r--r-- 3 hadoop supergroup 14155967 2010-11-11 13:40 > > /hbase/Webevent/601109706/Channel/1819667230290028427 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/601109706/Customer > > -rw-r--r-- 3 hadoop supergroup 14563111 2010-11-11 13:40 > > /hbase/Webevent/601109706/Customer/7336170720169514891 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/601109706/Event > > -rw-r--r-- 3 hadoop supergroup 82278013 2010-11-11 13:40 > > /hbase/Webevent/601109706/Event/5064894617590864583 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 
13:40 > > /hbase/Webevent/601109706/User > > -rw-r--r-- 3 hadoop supergroup 149299853 2010-11-11 13:40 > > /hbase/Webevent/601109706/User/5997879119834564841 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:40 > > /hbase/Webevent/601109706/http > > -rw-r--r-- 3 hadoop supergroup 29266049 2010-11-11 13:40 > > /hbase/Webevent/601109706/http/3987271255931462679 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:25 > > /hbase/Webevent/666508206 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:25 > > /hbase/Webevent/666508206/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/666508206/Channel > > -rw-r--r-- 3 hadoop supergroup 22727461 2010-11-11 13:33 > > /hbase/Webevent/666508206/Channel/9154587641511700292 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:33 > > /hbase/Webevent/666508206/Customer > > -rw-r--r-- 3 hadoop supergroup 23277615 2010-11-11 13:33 > > /hbase/Webevent/666508206/Customer/3760018687145755911 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/666508206/Event > > -rw-r--r-- 3 hadoop supergroup 111133668 2010-11-11 13:33 > > /hbase/Webevent/666508206/Event/3598650053650721687 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/666508206/User > > -rw-r--r-- 3 hadoop supergroup 201631388 2010-11-11 13:34 > > /hbase/Webevent/666508206/User/3597127170470234124 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/666508206/http > > -rw-r--r-- 3 hadoop supergroup 39920111 2010-11-11 13:34 > > /hbase/Webevent/666508206/http/1455502897668123089 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:32 > > /hbase/Webevent/717393157 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:32 > > /hbase/Webevent/717393157/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/717393157/Channel > > -rw-r--r-- 3 hadoop supergroup 7937724 2010-11-11 13:34 > > /hbase/Webevent/717393157/Channel/4038125755496042580 > > drwxr-xr-x - 
hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/717393157/Customer > > -rw-r--r-- 3 hadoop supergroup 14666396 2010-11-11 13:34 > > /hbase/Webevent/717393157/Customer/8406371944316504992 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:34 > > /hbase/Webevent/717393157/Event > > -rw-r--r-- 3 hadoop supergroup 85611423 2010-11-11 13:34 > > /hbase/Webevent/717393157/Event/127456153926503346 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/717393157/User > > -rw-r--r-- 3 hadoop supergroup 154335622 2010-11-11 13:34 > > /hbase/Webevent/717393157/User/7421172344231467438 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:35 > > /hbase/Webevent/717393157/http > > -rw-r--r-- 3 hadoop supergroup 28943243 2010-11-11 13:35 > > /hbase/Webevent/717393157/http/7543152081662309456 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/902882312 > > -rw-r--r-- 3 hadoop supergroup 2317 2010-11-11 13:30 > > /hbase/Webevent/902882312/.regioninfo > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:30 > > /hbase/Webevent/902882312/Channel > > -rw-r--r-- 3 hadoop supergroup 9541469 2010-11-11 13:30 > > /hbase/Webevent/902882312/Channel/3254461494206070427 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/902882312/Customer > > -rw-r--r-- 3 hadoop supergroup 16270772 2010-11-11 13:30 > > /hbase/Webevent/902882312/Customer/3583245475353507819 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/902882312/Event > > -rw-r--r-- 3 hadoop supergroup 90805116 2010-11-11 13:31 > > /hbase/Webevent/902882312/Event/1032140072520109551 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/902882312/User > > -rw-r--r-- 3 hadoop supergroup 164990613 2010-11-11 13:31 > > /hbase/Webevent/902882312/User/5112158281218703912 > > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:31 > > /hbase/Webevent/902882312/http > > -rw-r--r-- 3 hadoop supergroup 38405659 2010-11-11 13:31 > > 
/hbase/Webevent/902882312/http/5928256232381135445
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:41
> > /hbase/Webevent/compaction.dir
> > drwxr-xr-x - hadoop supergroup 0 2010-11-11 13:36
> > /hbase/Webevent/compaction.dir/1857066524
> > -rw-r--r-- 3 hadoop supergroup 153276719 2010-11-11 13:36
> > /hbase/Webevent/compaction.dir/1857066524/8008135349377513409
> >
> > There are many smaller files of sizes < 20 MB which might actually be taking
> > up 64*3=192 MB after replication. And even for the larger files, a file of
> > 129 MB would use up 3 blocks, right? Or is it somehow optimized to minimize
> > space usage?
> >
> > On Wed, Nov 10, 2010 at 11:07 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >
> >> Can you pastebin the output of the lsr command on the table's dir?
> >>
> >> Thx
> >>
> >> J-D
> >>
> >> On Tue, Nov 9, 2010 at 10:54 PM, Hari Sreekumar
> >> <[email protected]> wrote:
> >> > I checked the "browse filesystem" link in the web interface (50070). HBase
> >> > creates a directory named after the table, and in the directory there are
> >> > files which are 5-6 MB in size, on average. Some are a few KB, and there are
> >> > some of 12-13 MB size, but most are around 6 MB. I was thinking these files
> >> > are stored in 64 MB blocks, leading to the space usage.
> >> >
> >> > hari
> >> >
> >> > On Wed, Nov 10, 2010 at 11:56 AM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >
> >> >> I'm pretty sure that's not how it's reported by the "du" command, but
> >> >> I wouldn't expect to see files of 5MB on average. Can you be more
> >> >> specific?
> >> >>
> >> >> J-D
> >> >>
> >> >> On Tue, Nov 9, 2010 at 9:58 PM, Hari Sreekumar <[email protected]> wrote:
> >> >> > Ah, so the bloat is not because of the files being 5-6 MB in size? Wouldn't
> >> >> > a 6 MB file occupy 64 MB if I set block size as 64 MB?
> >> >> >
> >> >> > hari
> >> >> >
> >> >> > On Wed, Nov 10, 2010 at 11:16 AM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >> >
> >> >> >> Each value is stored with its full key, e.g. row key + family +
> >> >> >> qualifier + timestamp + offsets. You don't give any information
> >> >> >> regarding how you stored the data, but if you have large enough keys
> >> >> >> then it should easily explain the bloat.
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Tue, Nov 9, 2010 at 9:21 PM, Hari Sreekumar <[email protected]> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > Data seems to be taking up too much space when I put it into HBase. E.g., I
> >> >> >> > have a 2 GB text file which seems to be taking up ~70 GB when I dump it into
> >> >> >> > HBase. I have block size set to 64 MB and replication=3, which I think is
> >> >> >> > the possible reason for this expansion. But if that is the case, how can I
> >> >> >> > prevent it? Decreasing the block size will have a negative impact on
> >> >> >> > performance, so is there a way I can increase the average size of
> >> >> >> > HBase-created files to be comparable to 64 MB? Right now they are ~5 MB on
> >> >> >> > average. Or is this an entirely different thing at work here?
> >> >> >> >
> >> >> >> > thanks,
> >> >> >> > hari
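
On the block-size question raised in the thread (would a 6 MB file occupy 64 MB, and a 129 MB file three full blocks?): to my understanding, the HDFS block size is an upper bound on how a file is chunked, not a preallocated unit, so the last block of a file only takes as much disk as the bytes it holds. Here is a rough sketch of that arithmetic, under that assumption. The function name `hdfs_usage` is made up for illustration, and checksum files and NameNode metadata are ignored.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB block size, as configured in the thread
REPLICATION = 3                # replication factor mentioned in the thread


def hdfs_usage(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (block_count, raw_bytes_on_disk) for a single file.

    Assumption: blocks are not padded, so raw usage is simply
    file_size * replication. Checksums and metadata are ignored.
    """
    # Integer ceiling division: number of blocks the file is split into.
    blocks = (file_size + block_size - 1) // block_size
    return blocks, file_size * replication


if __name__ == "__main__":
    # File sizes copied from the lsr listing in the thread.
    for size in (16943616, 176008329, 153276719):
        blocks, raw = hdfs_usage(size)
        print(f"{size:>10} bytes -> {blocks} block(s), {raw} raw bytes")
```

Under this assumption the number of blocks matters for NameNode bookkeeping, not for disk space, which would match J-D's remark that "du" does not report usage in whole blocks.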

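
J-D's point that each value is stored with its full key can be made concrete with a back-of-the-envelope estimate. The sketch below assumes the usual KeyValue layout as I understand it (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type, plus the row, family, qualifier and value bytes). The row key, qualifiers and values are hypothetical and not taken from the thread; only the family names come from the listing.

```python
# Fixed bytes per cell under the assumed KeyValue layout:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + key type (1)
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row, family, qualifier, value):
    """Estimated serialized size of one cell, before compression."""
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


# Hypothetical row: one long row key repeated for every column.
row = b"customer42|2010-11-11T13:33:00|session-0001"
columns = {
    (b"Event", b"event_type"): b"click",
    (b"User", b"user_agent"): b"Mozilla/5.0",
    (b"http", b"status_code"): b"200",
    (b"Channel", b"channel_id"): b"7",
}

raw_bytes = sum(len(value) for value in columns.values())
stored_bytes = sum(cell_size(row, f, q, v) for (f, q), v in columns.items())

if __name__ == "__main__":
    print(f"raw value bytes:    {raw_bytes}")
    print(f"estimated on disk:  {stored_bytes}")
    print(f"expansion factor:   {stored_bytes / raw_bytes:.1f}x (before replication)")
```

With short values and long row keys or qualifiers, the key bytes dominate each cell, and replication then multiplies the result by three.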