Mike, thanks for responding.

BTW, I have a small update. I succeeded in opening the table by setting
"hbase.master.assignment.timeoutmonitor.timeout" to 1 hour.
Now the table is hosted on a single region server, which is bad (see status
below). Should I compact the table and then split it?
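If that is the right approach, my plan would be roughly the following in the
HBase shell (the table name 'mytable' is from my status output below; I have
not actually run this yet, so please correct me if the order is wrong):

```
hbase(main):001:0> major_compact 'mytable'
hbase(main):002:0> split 'mytable'
```

My assumption is that compacting first merges the ~14K small store files, and
splitting afterwards lets the daughter regions be balanced across the three
region servers.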

>>>>  What did you set your max region size to be for this table?
I did not set it explicitly, so default settings of 0.90.3-cdh3u1 are used.
What setting should I use?
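For reference, if I were to set it explicitly, I believe it would go into
hbase-site.xml like this (the 1 GB value is just my guess at a starting point,
not something I've verified; the 0.90.x default is much smaller):

```
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 1 GB; assumption on my part, not a verified recommendation -->
  <value>1073741824</value>
</property>
```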

>>>> 14K files totalling 650GB means you have a lot of small files...
>>>>  On average ~45MB (rough calc).
Correct. I'd like to minimize this number, but I am not sure how.
Maybe the splits generated by my bulkloader MR job are just wrong, because
right now I have only one region with a bunch of small files.

>>How many regions?
Here is a status:
hbase(main):012:0> status 'detailed'
version 0.90.3-cdh3u1
0 regionsInTransition
3 live servers
    slave113:60020 1320067636128
        requests=0, regions=1, usedHeap=7296, maxHeap=16346
        mytable,,1319730467540.69e5825d3fea11030d9f370a9219328e.
            stores=2, storefiles=14917, storefileSizeMB=677337,
memstoreSizeMB=0, storefileIndexSizeMB=5774
    slave115:60020 1320067640784
        requests=0, regions=2, usedHeap=37, maxHeap=16346
        .META.,,1
            stores=1, storefiles=2, storefileSizeMB=0, memstoreSizeMB=0,
storefileIndexSizeMB=0
        -ROOT-,,0
            stores=1, storefiles=1, storefileSizeMB=0, memstoreSizeMB=0,
storefileIndexSizeMB=0
    slave114:60020 1320067640288
        requests=0, regions=1, usedHeap=30, maxHeap=16346
0 dead servers

>>>>  Do you have mslabs set up?
Nope. Should I?
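From what I can tell from the docs, enabling MSLAB would be a one-line change
in hbase-site.xml (please correct me if this property isn't honored in
0.90.3-cdh3u1):

```
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
```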

>>>> GC tuning?
Nope. Should I? I use: "-ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
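If it matters, I could drop CMSIncrementalMode and switch to something closer
to the commonly suggested CMS setup in hbase-env.sh (these flags are my guess
at a starting point, not tested on our cluster):

```
# In hbase-env.sh; assumed starting point, not verified on this cluster
export HBASE_OPTS="-ea -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:CMSInitiatingOccupancyFraction=70"
```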


Best regards,
   Matthew Tovbin =)



On Mon, Oct 31, 2011 at 15:48, Michel Segel <[email protected]> wrote:

> What did you set your max region size?
>
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Oct 31, 2011, at 5:07 AM, Matthew Tovbin <[email protected]> wrote:
>
> > Ted,  thanks for such a rapid response.
> >
> > You're right, we use hbase 0.90.3 from cdh3u1.
> >
> > So, I suppose I need to make bulk loading in smaller bulks then. Any
> other
> > suggestions?
> >
> >
> > Best regards,
> >    Matthew Tovbin =)
> >
> >>
> >>
> >> I assume you're using HBase 0.90.x where HBASE-4015 isn't available.
> >>
> >>>> 5. And so on, till some of Slaves fail with "java.net.SocketException:
> >> Too many open files".
> >> Do you have some monitoring setup so that you can know the number of
> open
> >> file handles ?
> >>
> >> Cheers
> >>
> >> On Sun, Oct 30, 2011 at 7:21 AM, Matthew Tovbin <[EMAIL PROTECTED]>
> wrote:
> >>
> >>> Hi guys,
> >>>
> >>>  I've bulkloaded a solid amount of data (650GB, ~14000 files) into
> Hbase
> >>> (1master + 3regions) and now enabling the table results the
> >>> following behavior on the cluster:
> >>>
> >>>  1. Master says that opening started  -
> >>>   "org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >>>  transition=RS_ZK_REGION_OPENING, server=slave..."
> >>>  2. Slaves report about opening files in progress -
> >>>  "org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://...."
> >>>  3. Then after ~10 mins the following error occurs on hmaster -
> >>>   "org.apache.hadoop.hbase.master.AssignmentManager: Regions in
> > transition
> >>>  timed out / Region has been OPENING for too long, reassigning
> > region=..."
> >>>  4. More slaves report about opening files in progress -
> >>>  "org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://...."
> >>>  5. And so on, till some of Slaves fail with "java.net.SocketException:
> >>>  Too many open files".
> >>>
> >>>
> >>> What I've done already to solve the issue (which DID NOT help though):
> >>>
> >>>  1. Set 'ulimit -n 65536' for hbase user
> >>>  2. Set hbase.hbasemaster.maxregionopen=3600000 (1 hour) in
> > hbase-site.xml
> >>>
> >>>
> >>> What else can I try?!
> >>>
> >>>
> >>> Best regards,
> >>>   Matthew Tovbin =)
> >>>
>
