Did you have any success with this problem? If not, you could try asynchbase, the asynchronous HBase client used in OpenTSDB.
On Mon, Jan 16, 2012 at 6:46 AM, Neil Yalowitz <[email protected]> wrote:

> I'm seeing something unusual here and I wanted to see if it has occurred
> for any other HBase 0.90 users. I've read several emails here that
> recommend NOT using multi-threading in an MR job, so that's certainly under
> consideration. If anyone could add to their experiences with
> multi-threading in an MR job it would be very helpful. We are testing both
> implementations (with threading and without), but the threaded solution is
> causing the problem.
>
> We are processing log files with PUTs in the Map and a followup
> incrementColumnValue() to a separate "counts" table in the Reducer. The
> reduce phase uses multi-threading. The Reducer initializes an HTablePool
> in the setup(), starts threads in the reduce() (to a
> Java BlockingQueue/CompletionService) which do the incrementColumnValue()
> and depending on the value returned create a PUT in the "counter" table,
> and in the cleanup() performs a completionService.take() which is ignored
> and flushes the PUTs queued by the threads.
>
> There are no issues for approximately the first 100GB of data inserted.
> After approximately 100GB however, every subsequent job has a freeze
> during the Reduce phase. What I see happening is at some point the Reduce
> (where the incrementColumnValue() takes place) tasks are "hung" and
> eventually killed with reason: task client has not responded for 600
> seconds. The counters in the reduce job seem to grow briefly but then all
> the tasks' counter stop increasing and the task is eventually killed.
>
> Oddly, the problem does not occur if compaction is completely disabled (not
> just major, but also setting hbase.hstore.compactionThreshold = 9999999
> and hbase.hstore.blockingStoreFiles = 9999999).
>
> Could there be a bug with HTablePool for large datasets and compaction?
> Again, this works as expected for approximately the first 100 jobs (1GB
> each) but consistently fails after that. Also to repeat, the problem does
> not occur with ALL compaction disabled.
>
> Difficult problem to describe, but I'm hoping someone may have some
> feedback and/or similar experiences. I can provide code examples if anyone
> is curious.
>
> Neil Yalowitz
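For reference, here is a minimal sketch of the threaded increment step described above, using only the JDK. It is an illustration under assumptions, not the original job's code and not a diagnosis of the freeze: `Counter`, `InMemoryCounter`, and `incrementAll` are hypothetical names, and `Counter.increment` merely stands in for the incrementColumnValue() call. The one deliberate difference from the description is that it waits for every submitted task with a timeout and rethrows worker exceptions, rather than performing a single take() whose result is ignored, so a stuck or failed increment would surface as an error.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

public class ThreadedIncrementSketch {

    /** Hypothetical stand-in for the incrementColumnValue() call; not an HBase API. */
    public interface Counter {
        long increment(String rowKey, long amount) throws Exception;
    }

    /** In-memory Counter, present only so the sketch runs without a cluster. */
    public static class InMemoryCounter implements Counter {
        private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

        @Override
        public long increment(String rowKey, long amount) {
            return counts.computeIfAbsent(rowKey, k -> new AtomicLong()).addAndGet(amount);
        }
    }

    /**
     * Submits one increment per row key, then waits for every task.
     * Returns the number of completed increments.
     */
    public static int incrementAll(List<String> rowKeys, Counter counter,
                                   int threads, long timeoutSeconds)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        ExecutorCompletionService<Long> completion = new ExecutorCompletionService<>(pool);
        try {
            for (String key : rowKeys) {
                completion.submit(() -> counter.increment(key, 1L));
            }
            int completed = 0;
            for (int i = 0; i < rowKeys.size(); i++) {
                // Bounded wait: a hung worker becomes a TimeoutException.
                Future<Long> done = completion.poll(timeoutSeconds, TimeUnit.SECONDS);
                if (done == null) {
                    throw new TimeoutException(
                            "increment did not finish within " + timeoutSeconds + "s");
                }
                done.get(); // rethrows any exception thrown inside the worker
                completed++;
            }
            return completed;
        } finally {
            pool.shutdownNow(); // stop workers even if we bail out early
        }
    }
}
```

In a real Reducer the draining loop would presumably live in cleanup(), with the timeout chosen well below the 600-second task timeout; that placement is an assumption on my part.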
