[
https://issues.apache.org/jira/browse/PHOENIX-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699612#comment-17699612
]
ASF GitHub Bot commented on PHOENIX-6821:
-----------------------------------------
haridsv commented on code in PR #1570:
URL: https://github.com/apache/phoenix/pull/1570#discussion_r1133873408
##########
phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java:
##########
@@ -1993,19 +1992,31 @@ public void clearBatch() throws SQLException {
@Override
public int[] executeBatch() throws SQLException {
int i = 0;
+ int[] returnCodes = new int [batch.size()];
+ Arrays.fill(returnCodes, -1);
+ boolean autoCommit = connection.getAutoCommit();
+ connection.setAutoCommit(false);
try {
- int[] returnCodes = new int [batch.size()];
for (i = 0; i < returnCodes.length; i++) {
PhoenixPreparedStatement statement = batch.get(i);
- returnCodes[i] = statement.execute(true) ?
Statement.SUCCESS_NO_INFO : statement.getUpdateCount();
+ statement.executeForBatch();
+ returnCodes[i] = statement.getUpdateCount();
}
// Flush all changes in batch if auto flush is true
flushIfNecessary();
// If we make it all the way through, clear the batch
clearBatch();
return returnCodes;
- } catch (Throwable t) {
- throw new BatchUpdateExecution(t,i);
+ } catch (SQLException t) {
+ throw new BatchUpdateException(returnCodes, t);
Review Comment:
First of all, please note that I am actually initializing the array with
`-1` which is currently an invalid value. I know that it is not formally
reserved as an invalid value, but JDBC specifically leaves this out and uses
`-2` for `SUCCESS_NO_INFO` and `-3` for `EXECUTE_FAILED` so it seems like they
had this in their mind and so seems to be safe.
Regarding your thought, at this location, we would know which statement
failed, so we could very well set that index alone to `EXECUTE_FAILED`,
however, if the exception occurs in commit, you are right that we won't know.
However, this is going to be the case even when `autocommit` is `false` so I
think there is benefit in keeping the behavior same here as well.
> Batching with auto-commit connections
> -------------------------------------
>
> Key: PHOENIX-6821
> URL: https://issues.apache.org/jira/browse/PHOENIX-6821
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Kadir Ozdemir
> Assignee: Hari Krishna Dara
> Priority: Major
>
> Phoenix commits the commands of a batch individually when executeBatch() is
> called if auto commit is enabled on the connection. For example, if a batch
> of 100 upsert statements is created using addBatch() within an auto-commit
> mode connection then when executeBatch() is called, Phoenix creates 100 HBase
> batches each with a single mutation, i.e., one for each upsert. This defeats
> the purpose of batching. The correct behavior is to commit the entire batch
> of upsert statements using the minimum number of HBase batches. This means if
> the entire batch of upsert statements fits in a single HBase batch, then one
> HBase batch should be used.
> Please note for connections without auto-commit, Phoenix behaves correctly,
> that is, the entire batch of upsert commands is committed using the minimum
> number of HBase batches.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)