I think that would be a good idea.
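
Something along these lines, perhaps. This is only a rough sketch of the idea, not the actual HTable code: the class name BufferedPutSketch, the checkInterval parameter, and the use of plain long heap sizes in place of real Put objects are all made up for illustration, and the autoFlush case is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the HTable write path; not the real HBase class. */
class BufferedPutSketch {
    private final List<Long> writeBuffer = new ArrayList<>();
    private final long writeBufferSize;   // flush threshold in bytes
    private final int checkInterval;      // assumed knob: check every N puts
    private long currentWriteBufferSize = 0;
    private int flushCount = 0;

    BufferedPutSketch(long writeBufferSize, int checkInterval) {
        this.writeBufferSize = writeBufferSize;
        this.checkInterval = checkInterval;
    }

    /** Each element stands for the heapSize() of one Put. */
    void doPut(List<Long> putHeapSizes) {
        int sinceLastCheck = 0;
        for (long heapSize : putHeapSizes) {
            writeBuffer.add(heapSize);
            currentWriteBufferSize += heapSize;
            // Evaluate the flush condition inside the loop, every checkInterval puts,
            // so a very large list cannot grow the buffer without bound.
            if (++sinceLastCheck >= checkInterval) {
                sinceLastCheck = 0;
                if (currentWriteBufferSize > writeBufferSize) {
                    flushCommits();
                }
            }
        }
        // Final check, same as the existing code does after the loop.
        if (currentWriteBufferSize > writeBufferSize) {
            flushCommits();
        }
    }

    /** Simulated flush: just empties the buffer and counts the call. */
    private void flushCommits() {
        writeBuffer.clear();
        currentWriteBufferSize = 0;
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }

    long getCurrentWriteBufferSize() {
        return currentWriteBufferSize;
    }
}
```

With a 100-byte threshold and a check every 5 puts, a list of twelve 30-byte puts would be flushed twice along the way instead of once at the very end, with the last two puts left in the buffer.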

On 7/26/11 10:52 PM, "Ted Yu" <[email protected]> wrote:

>Should we check writebuffer size in the for loop, say after every 5 puts
>are added ?
>
>On Jul 26, 2011, at 7:43 PM, Doug Meil <[email protected]>
>wrote:
>
>> But as long as autoFlush=false, both put() methods use the writebuffer.
>> Was the issue that the flush evaluation doesn't happen on put(List<Put>)
>> until the entire list was processed?
>>
>> If that was the issue, then I think it would make sense to call that out
>> explicitly in the Javadoc rather than just saying it may cause perf
>> problems.
>>
>>   public void put(final Put put) throws IOException {
>>     doPut(Arrays.asList(put));
>>   }
>>
>>   /**
>>    * {@inheritDoc}
>>    */
>>   @Override
>>   public void put(final List<Put> puts) throws IOException {
>>     doPut(puts);
>>   }
>>
>>   private void doPut(final List<Put> puts) throws IOException {
>>     for (Put put : puts) {
>>       validatePut(put);
>>       writeBuffer.add(put);
>>       currentWriteBufferSize += put.heapSize();
>>     }
>>     if (autoFlush || currentWriteBufferSize > writeBufferSize) {
>>       flushCommits();
>>     }
>>   }
>>
>> On 7/26/11 10:28 PM, "Andrew Purtell" <[email protected]> wrote:
>>
>>> We were confronted with a MapReduce job, developed by an internal dev
>>> group, with reducers writing to HBase directly that would see a
>>> fraction of the reducers consistently killed due to timeout. Looking
>>> at jstacks all seemed well -- RPC in progress, nothing remarkable
>>> client or server side. We were stumped until we looked at the client
>>> code. It turns out they would, in a data driven way, occasionally
>>> submit really really really large lists. After we asked them to use
>>> the write buffer, all was well.
>>>
>>> Best regards,
>>>
>>>    - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back.
>>>    - Piet Hein (via Tom White)
>>>
>>>> ________________________________
>>>> From: Doug Meil <[email protected]>
>>>> To: "[email protected]" <[email protected]>
>>>> Sent: Tuesday, July 26, 2011 7:14 PM
>>>> Subject: HBASE-4142?
>>>>
>>>> Hi there-
>>>>
>>>> I just saw this in the build message...
>>>>
>>>> HBASE-4142 Advise against large batches in javadoc for
>>>> HTable#put(List<Put>)
>>>>
>>>> ... And I was curious as to why this was a bad thing. We do this and it
>>>> actually is quite helpful (in concert with an internal utility class
>>>> that later became HTableUtil).
>>>>
>>>> Doug Meil
>>>> Chief Software Architect, Explorys
>>>> [email protected]
