We were confronted with a MapReduce job, developed by an internal dev group, whose reducers wrote to HBase directly and consistently saw a fraction of the reducers killed due to timeout. Looking at jstacks, all seemed well: an RPC in progress, nothing remarkable on the client or server side. We were stumped until we looked at the client code. It turns out they would, in a data-driven way, occasionally submit really, really large lists. After we asked them to use the write buffer, all was well.

Best regards,
   - Andy

Problems worthy of attack prove their worth by hitting back.
   - Piet Hein (via Tom White)

>________________________________
>From: Doug Meil <[email protected]>
>To: "[email protected]" <[email protected]>
>Sent: Tuesday, July 26, 2011 7:14 PM
>Subject: HBASE-4142?
>
>Hi there-
>
>I just saw this in the build message…
>
>HBASE-4142 Advise against large batches in javadoc for HTable#put(List<Put>)
>
>… And I was curious as to why this was a bad thing. We do this and it
>actually is quite helpful (in concert with an internal utility class that
>later became HTableUtil).
>
>Doug Meil
>Chief Software Architect, Explorys
>[email protected]
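For readers following along: the failure mode above is a single HTable#put(List<Put>) call carrying an unbounded payload, so one RPC can outlive the task timeout. The client-side write buffer (in the HBase client of that era, HTable.setAutoFlush(false) plus flushCommits()) fixes this by flushing in bounded increments. A minimal sketch of the same idea as a generic, HBase-independent helper; the class name and batch-size cap are hypothetical, and in real code the cap would be tuned to the cluster:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {
    // Split a large list into bounded chunks so that no single
    // RPC (e.g. one HTable#put(List<Put>) call) carries an
    // unbounded payload. maxPerBatch is a hypothetical cap.
    static <T> List<List<T>> chunk(List<T> items, int maxPerBatch) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += maxPerBatch) {
            int end = Math.min(i + maxPerBatch, items.size());
            // Copy the sublist so each batch is independent of the source.
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }
}
```

Each chunk would then be submitted as its own put call (or, better, handed to the buffered write path), trading one giant RPC for several small, timeout-friendly ones.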
