Should we check the write buffer size inside the for loop, say after every
5 puts are added?
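
A minimal sketch of the idea, not a patch against HTable. Everything here is
a stand-in: the class name WriteBufferSketch and the checkInterval parameter
are hypothetical, each Put is modelled only by its heap size in bytes,
autoFlush is left out, and flushCommits() just clears the buffer instead of
making an RPC. It only shows how an in-loop check would bound the amount
buffered when a very large list is passed in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the client-side write buffer; not HBase code.
class WriteBufferSketch {
    private final long writeBufferSize;   // flush threshold in bytes
    private final int checkInterval;      // evaluate the threshold every N puts
    private final List<Long> writeBuffer = new ArrayList<>();
    private long currentWriteBufferSize = 0;
    private long maxBuffered = 0;         // largest buffered size observed
    private int flushCount = 0;

    WriteBufferSketch(long writeBufferSize, int checkInterval) {
        this.writeBufferSize = writeBufferSize;
        this.checkInterval = checkInterval;
    }

    // Each element of putHeapSizes models one Put by its heapSize().
    void doPut(List<Long> putHeapSizes) {
        int sinceCheck = 0;
        for (long heapSize : putHeapSizes) {
            writeBuffer.add(heapSize);
            currentWriteBufferSize += heapSize;
            maxBuffered = Math.max(maxBuffered, currentWriteBufferSize);
            // The proposed change: evaluate the threshold inside the loop.
            if (++sinceCheck >= checkInterval) {
                sinceCheck = 0;
                if (currentWriteBufferSize > writeBufferSize) {
                    flushCommits();
                }
            }
        }
        // Same final check as the existing doPut (minus autoFlush).
        if (currentWriteBufferSize > writeBufferSize) {
            flushCommits();
        }
    }

    // Stand-in for the real flush: drops the buffered entries, no RPC.
    private void flushCommits() {
        writeBuffer.clear();
        currentWriteBufferSize = 0;
        flushCount++;
    }

    int getFlushCount() { return flushCount; }
    long getMaxBuffered() { return maxBuffered; }
    int getPendingPuts() { return writeBuffer.size(); }

    public static void main(String[] args) {
        // 100 puts of 10 bytes each against a 100-byte buffer, checked every 5 puts.
        WriteBufferSketch sketch = new WriteBufferSketch(100, 5);
        sketch.doPut(Collections.nCopies(100, 10L));
        System.out.println("flushes=" + sketch.getFlushCount()
            + " peak=" + sketch.getMaxBuffered()
            + " pending=" + sketch.getPendingPuts());
    }
}
```

With the check only at the end of the loop, the same 100 puts would all sit
in the buffer (1000 bytes) before a single flush. With the check every 5
puts, the peak in this toy run stays at 150 bytes.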

On Jul 26, 2011, at 7:43 PM, Doug Meil <[email protected]> wrote:

> 
> But as long as autoFlush=false, both put() methods use the writebuffer.
> Was the issue that the flush evaluation doesn't happen on put(List<Put>)
> until the entire list was processed?
> 
> If that was the issue, then I think it would make sense to call that out
> explicitly in the Javadoc rather than just saying it may cause perf
> problems.
> 
> 
> 
>  public void put(final Put put) throws IOException {
>    doPut(Arrays.asList(put));
>  }
> 
>  /**
>   * {@inheritDoc}
>   */
>  @Override
>  public void put(final List<Put> puts) throws IOException {
>    doPut(puts);
>  }
> 
>  private void doPut(final List<Put> puts) throws IOException {
>    for (Put put : puts) {
>      validatePut(put);
>      writeBuffer.add(put);
>      currentWriteBufferSize += put.heapSize();
>    }
>    if (autoFlush || currentWriteBufferSize > writeBufferSize) {
>      flushCommits();
>    }
>  }
> 
> On 7/26/11 10:28 PM, "Andrew Purtell" <[email protected]> wrote:
> 
>> We were confronted with a MapReduce job, developed by an internal dev
>> group, with reducers writing to HBase directly that would see a fraction
>> of the reducers consistently killed due to timeout. Looking at jstacks
>> all seemed well -- RPC in progress, nothing remarkable client or server
>> side. We were stumped until we looked at the client code. It turns out
>> they would, in a data driven way, occasionally submit really really
>> really large lists. After we asked them to use the write buffer, all was
>> well.
>> 
>> Best regards,
>> 
>> 
>>   - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>> 
>> 
>>> ________________________________
>>> From: Doug Meil <[email protected]>
>>> To: "[email protected]" <[email protected]>
>>> Sent: Tuesday, July 26, 2011 7:14 PM
>>> Subject: HBASE-4142?
>>> 
>>> Hi there-
>>> 
>>> I just saw this in the build message...
>>> 
>>> HBASE-4142 Advise against large batches in javadoc for
>>> HTable#put(List<Put>)
>>> 
>>> ... And I was curious as to why this was a bad thing.  We do this and it
>>> actually is quite helpful (in concert with an internal utility class
>>> that later became HTableUtil).
>>> 
>>> 
>>> 
>>> Doug Meil
>>> Chief Software Architect, Explorys
>>> [email protected]
>>> 
>>> 
>>> 
> 
