In our cluster there are thousands of regions, and each region has 5-8 store files, so the number of connections is very high. If the store files are not closed after scanning, the number of connections may bring down some region servers. (In my cluster, a region server went down because of a socket OOM; our fd limit is set to 30000 and tcp_mem is set to 2G.) We want to reuse these connections in DFSClient. Do you have a more graceful solution?
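
To make the idea of reusing connections while capping their number more concrete, here is a rough sketch of one possible direction. It is only an illustration under assumptions: `BoundedReaderCache`, `Opener`, and `openCount` are hypothetical names, not part of HBase, HFile, or DFSClient, and the "reader" is any `Closeable` standing in for an open store file reader. It keeps at most `maxOpen` readers open and closes the least recently used one when the limit is exceeded.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache of open readers; not an HBase or HDFS class. */
class BoundedReaderCache<R extends Closeable> {

    /** Hypothetical callback that opens a reader for a store file path. */
    interface Opener<R> {
        R open(String path) throws IOException;
    }

    private final int maxOpen;
    private final Opener<R> opener;
    // accessOrder=true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, R> open =
            new LinkedHashMap<String, R>(16, 0.75f, true);

    BoundedReaderCache(int maxOpen, Opener<R> opener) {
        this.maxOpen = maxOpen;
        this.opener = opener;
    }

    /** Returns the cached reader for the path, opening it on first use. */
    synchronized R get(String path) throws IOException {
        R reader = open.get(path);
        if (reader == null) {
            reader = opener.open(path);
            open.put(path, reader);
            evictIfNeeded();
        }
        return reader;
    }

    /** Closes least recently used readers until the cap is respected. */
    private void evictIfNeeded() throws IOException {
        Iterator<Map.Entry<String, R>> it = open.entrySet().iterator();
        while (open.size() > maxOpen && it.hasNext()) {
            Map.Entry<String, R> eldest = it.next();
            it.remove();
            eldest.getValue().close();
        }
    }

    /** Number of readers currently held open. */
    synchronized int openCount() {
        return open.size();
    }
}
```

One caveat with this sketch: it may close a reader that a running scanner is still using, so a real version would probably need reference counting before eviction.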

Thanks & Best regards
LiuJinglong

On July 30, 2010 at 2:51 PM, Angus He <[email protected]> wrote:

> Hi Baggio,
>
> > 2. During scanning, it'll create new StoreFile objects one by one. In the
> > constructor of StoreFile, an HFile.Reader will be created. HFile.Reader
> > acts as a DFSClient; it'll keep a connection with the DataNode when
> > something should be read.
>
> Yes, there is an HFile.Reader opened in the StoreFile constructor.
> But I am afraid there is only one instance in HBase for each
> StoreFile. I just confirmed this on my dev machine (HBase running in
> standalone mode). After many scan operations, there is still only one
> instance for each StoreFile.
>
> > 1. After checking the code, I've seen that in the scanner close method,
> > HFile.Reader is not closed. Is that for reusing the HFile?
> > I tried to close the scanner explicitly, but it causes the meta region to
> > fail to load at startup. Both get() and scan() use the same scanner...
>
> It seems there is only one HFile.Reader for each StoreFile, and the
> HFile.Reader instance is shared by multiple scanners.
>
> As for closing the scanner explicitly, it works all right in our
> case. I really do not know what happened in your case.
>
> By the way, how do you get the scanner, by HTable or some other way?
>
> > Thanks & Best regards
> > LiuJinglong
> >
> > On July 30, 2010 at 11:33 AM, Angus He <[email protected]> wrote:
> >
> >> 1. Try to close the scanner explicitly?
> >>
> >> 2. I do not think HBase will issue a new connection for each StoreFile
> >> for the scan operation.
> >>
> >> 2010/7/29 baggio liu <[email protected]>:
> >> > Hi all,
> >> > We have 53 machines in our hbase cluster and run 6 clients to scan a
> >> > table. During scanning, we found that when a region is being scanned,
> >> > it'll create a new StoreFile object and create a connection to a
> >> > datanode (in fact, create an HFile Reader), so the number of
> >> > connections increases by the number of store files.
> >> > We have many store files (across several regions, and they have not
> >> > reached the minor compaction threshold), so too many connections have
> >> > been created. And after scanning, the connections are not closed.
> >> > As a result, a machine which acts as a region server has too high
> >> > system CPU, and hung for a long time.
> >> > My questions are:
> >> > 1. Why don't we close the connection (in fact, we don't close
> >> > HFile.Reader) after we complete the table scan? Do we want to reuse
> >> > the connection in the next scan?
> >> > 2. How can we limit the number of connections?
> >> >
> >> > Thanks & Best regards
> >> > LiuJinglong
> >>
> >> --
> >> Regards
> >> Angus
> >
> --
> Regards
> Angus
