ReplicationSink.batch() calls HTable.batch(list) without looking at the results. So the question is: should we allow something like HTableInterface.batch(List<>, null) for the cases where we don't need the results of the calls?
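To make the idea concrete, here is a minimal sketch of what accepting null for results could look like. It is a self-contained stand-in, not HBase code: the class NullableResultsSketch, its batch() method and the use of Callable for the actions are all made up for illustration, and only the null guard around the length check reflects the proposal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical stand-in, NOT HBase code: it only mimics the shape of
// batch(actions, results) to illustrate the proposed null handling.
final class NullableResultsSketch {

    /**
     * Runs every action in order. When results is null the outcomes are
     * dropped; otherwise results[i] receives the value returned by action i,
     * or the exception it threw. Returns the number of failed actions.
     */
    static int batch(List<? extends Callable<?>> actions, Object[] results) {
        // Only compare sizes when the caller actually passed an array,
        // so a null results argument no longer triggers an NPE here.
        if (results != null && results.length != actions.size()) {
            throw new IllegalArgumentException(
                "results must have the same size as actions");
        }
        int failures = 0;
        for (int i = 0; i < actions.size(); i++) {
            Object outcome;
            try {
                outcome = actions.get(i).call();
            } catch (Exception e) {
                outcome = e;
                failures++;
            }
            // Store the outcome only if the caller asked for it.
            if (results != null) {
                results[i] = outcome;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Callable<Long> ok = () -> 1L;
        Callable<Long> bad = () -> {
            throw new IllegalStateException("no such family");
        };
        List<Callable<Long>> actions = Arrays.asList(ok, bad);

        // Fire and forget: no array is allocated for the outcomes.
        System.out.println("failures: " + batch(actions, null));

        // Caller that does want the per-action outcomes.
        Object[] results = new Object[actions.size()];
        batch(actions, results);
        System.out.println(Arrays.toString(results));
    }
}
```

Returning the failure count is just one possible choice for the sketch; the real method could equally stay void and keep reporting failures through the exception.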
Today, if you pass null, HConnectionManager.processBatchCallback() gives you an NPE right on the first line, "if (results.length != list.size())". If I have 1000 increments to send and nothing planned in case they fail, do I really want to create an array of 1000 objects for nothing, iterate over it, etc., when it could simply have been dropped? If results == null, HConnectionManager.processBatchCallback can use workingList in step 3 instead of iterating again in step 4. I will try to explain that a bit more in the JIRA.

JM

2013/3/14 Ted Yu <[email protected]>:
> bq. Should we mark it as deprecated in the interface too?
>
> Yes. That was my intention.
>
> I am not clear about your second suggestion, though.
>
> Cheers
>
> On Thu, Mar 14, 2013 at 3:36 PM, Jean-Marc Spaggiari <[email protected]> wrote:
>
>> I agree.
>>
>> This method is also in the interface declaration. Should we mark it as
>> deprecated in the interface too?
>>
>> Also, if someone doesn't want to get the results, should we find a way
>> to allow the user to pass null for results?
>>
>> 2013/3/14 Ted Yu <[email protected]>:
>> > Looking at this batch() method in HTable:
>> >
>> > Object[] batch(final List<? extends Row> actions) throws IOException,
>> > InterruptedException;
>> > I think the above method should be deprecated due to the issue raised by
>> > Amit.
>> > The following method is more reliable:
>> >
>> > void batch(final List<? extends Row> actions, final Object[] results)
>> > throws IOException, InterruptedException;
>> > I plan to raise a JIRA for deprecating the first method, if I don't hear
>> > objections.
>> >
>> > Cheers
>> >
>> > On Thu, Mar 14, 2013 at 11:55 AM, Jean-Marc Spaggiari <[email protected]> wrote:
>> >
>> >> Amit, do it that way:
>> >>
>> >> Object[] res = new Object[batch.size()];
>> >> try {
>> >>   table.batch(batch, res);
>> >>
>> >> Then res will contain the results, including the exception, even if you
>> >> catch a RetriesExhaustedWithDetailsException because your batch got
>> >> one.
>> >>
>> >> JM
>> >>
>> >> 2013/3/14 Jean-Marc Spaggiari <[email protected]>:
>> >> > Can you paste the complete stacktrace here with the causes too?
>> >> >
>> >> > I will try your piece of code locally to try to reproduce.
>> >> >
>> >> > JM
>> >> >
>> >> > 2013/3/14 Amit Sela <[email protected]>:
>> >> >> I did look at HConnectionManager and that is the reason I expected the
>> >> >> scenario you just described but running the test I ran from the development
>> >> >> environment (IntelliJ IDEA) I did not get any returned value, instead the
>> >> >> exception is thrown and after I catch it the result is null...
>> >> >>
>> >> >> Object[] res = null;
>> >> >> try {
>> >> >>   res = table.batch(batch);
>> >> >> } catch (RetriesExhaustedWithDetailsException
>> >> >>     retriesExhaustedWithDetailsException) {
>> >> >>   retriesExhaustedWithDetailsException.printStackTrace();
>> >> >> }
>> >> >> if (res == null) {
>> >> >>   System.out.println("No results - returned null.");
>> >> >>   return;
>> >> >> }
>> >> >>
>> >> >> On Thu, Mar 14, 2013 at 7:52 PM, Jean-Marc Spaggiari <[email protected]> wrote:
>> >> >>
>> >> >>> Hi Amit,
>> >> >>>
>> >> >>> Just take a look at the processBatchCallback method in HConnectionManager.
>> >> >>>
>> >> >>> There you will see how the result is populated, and when an exception
>> >> >>> is returned.
>> >> >>>
>> >> >>> In your example below, if you look at the content of the returned
>> >> >>> array, you should see one cell with the result of the increment, and
>> >> >>> one cell with a Throwable into it.
>> >> >>>
>> >> >>> JM
>> >> >>>
>> >> >>> 2013/3/14 Amit Sela <[email protected]>:
>> >> >>> > Hi all,
>> >> >>> >
>> >> >>> > I did some testing with HTableInterface#batch() for batching Increments and
>> >> >>> > I was wondering about the returned value Object[].
>> >> >>> >
>> >> >>> > As I understand (or would expect), the returned value would be:
>> >> >>> >
>> >> >>> > null - all batch of increments failed.
>> >> >>> > An object in the array is null / is Exception - that increment has failed.
>> >> >>> >
>> >> >>> > So I ran some tests and executed a batch of two Increment Objects on two
>> >> >>> > different row keys, where one of them is valid and the other one has a
>> >> >>> > family that does not exist.
>> >> >>> > When calling HTableInterface#batch() I
>> >> >>> > get RetriesExhaustedWithDetailsException but looking at the counter in
>> >> >>> > HBase it looks like the valid increment was executed.
>> >> >>> >
>> >> >>> > Shouldn't I get an Object[2] where one of the objects is null
>> >> >>> > / RetriesExhaustedWithDetailsException ?
>> >> >>> >
>> >> >>> > How can I know # of success/failures in the batch ? What is the "contract"
>> >> >>> > here ?
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> >
>> >> >>> > Amit.
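
On the "how can I know # of success/failures" question from the original message: once the results array has been filled by batch(actions, results), counting is a simple loop. The sketch below is only an illustration. BatchResultSummary is a made-up helper, not an HBase class, and it assumes that a cell holding null or a Throwable means the corresponding action failed, which is the contract as discussed in this thread rather than something checked against the code.

```java
// Hypothetical helper, not part of HBase: summarizes a results array that
// was filled by a call such as table.batch(actions, results).
final class BatchResultSummary {
    final int succeeded;
    final int failed;

    private BatchResultSummary(int succeeded, int failed) {
        this.succeeded = succeeded;
        this.failed = failed;
    }

    /** Counts the cells of results that look like successes or failures. */
    static BatchResultSummary of(Object[] results) {
        int ok = 0;
        int bad = 0;
        for (Object cell : results) {
            // Assumption from the thread: a null cell or a Throwable cell
            // means that the action at this index did not succeed.
            if (cell == null || cell instanceof Throwable) {
                bad++;
            } else {
                ok++;
            }
        }
        return new BatchResultSummary(ok, bad);
    }

    public static void main(String[] args) {
        // Stand-in for the two-increment test: one result, one exception.
        Object[] res = { 42L, new RuntimeException("family does not exist") };
        BatchResultSummary summary = BatchResultSummary.of(res);
        System.out.println(summary.succeeded + " succeeded, "
            + summary.failed + " failed");
    }
}
```

In real client code the array would be inspected in the catch block (or a finally block) around table.batch(batch, res), since the exception is still thrown when at least one action fails.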
