We were confronted with a MapReduce job, developed by an internal dev group, 
whose reducers wrote to HBase directly and where a fraction of the reducers 
were consistently killed due to timeout. Looking at jstacks, all seemed well 
-- an RPC in progress, nothing remarkable client- or server-side. We were 
stumped until we looked at the client code. It turned out that, in a 
data-driven way, they would occasionally submit really, really large lists to 
put(). After we asked them to use the write buffer instead, all was well.
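For context, a minimal sketch of what "use the write buffer" means with the client API of that era (HTable#setAutoFlush and flushCommits). The table name, column family, and buffer size below are hypothetical, and this of course needs a running cluster to actually execute:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");   // hypothetical table name
    table.setAutoFlush(false);                    // buffer puts client-side
    table.setWriteBufferSize(2 * 1024 * 1024);    // flush roughly every 2 MB
    try {
      for (int i = 0; i < 100000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
        table.put(put);  // buffered; sent in chunks as the buffer fills
      }
    } finally {
      table.flushCommits();  // push any remaining buffered puts
      table.close();
    }
  }
}
```

The point is that the client ships writes in buffer-sized chunks rather than one enormous multi-put RPC, so no single call can blow past the RPC timeout.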
 
Best regards,


    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via 
Tom White)


>________________________________
>From: Doug Meil <[email protected]>
>To: "[email protected]" <[email protected]>
>Sent: Tuesday, July 26, 2011 7:14 PM
>Subject: HBASE-4142?
>
>Hi there-
>
>I just saw this in the build message…
>
>HBASE-4142 Advise against large batches in javadoc for HTable#put(List<Put>)
>
>… And I was curious as to why this was a bad thing.  We do this and it 
>actually is quite helpful (in concert with an internal utility class that 
>later became HTableUtil).
>
>
>
>Doug Meil
>Chief Software Architect, Explorys
>[email protected]
>
