Chris McCubbin created ACCUMULO-1833:
----------------------------------------
Summary: MultiTableBatchWriterImpl.getBatchWriter() is not
performant for multiple threads
Key: ACCUMULO-1833
URL: https://issues.apache.org/jira/browse/ACCUMULO-1833
Project: Accumulo
Issue Type: Improvement
Affects Versions: 1.5.0, 1.6.0
Reporter: Chris McCubbin
This issue comes from profiling our application. We have a
MultiTableBatchWriter created by normal means. I am attempting to write to it
with multiple threads by doing things like the following:
{code}
batchWriter.getBatchWriter(table).addMutations(mutations);
{code}
In my test with 4 threads writing to one table, this call is quite inefficient
and results in a large performance degradation over a single BatchWriter.
I believe the culprit is the fact that the call is synchronized. Also there is
the possibility that the zookeeper call to Tables.getTableState on every call
is negatively affecting performance:
{code}
@Override
public synchronized BatchWriter getBatchWriter(String tableName) throws
AccumuloException, AccumuloSecurityException, TableNotFoundException {
ArgumentChecker.notNull(tableName);
String tableId = Tables.getNameToIdMap(instance).get(tableName);
if (tableId == null)
throw new TableNotFoundException(tableId, tableName, null);
if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
throw new TableOfflineException(instance, tableId);
BatchWriter tbw = tableWriters.get(tableId);
if (tbw == null) {
tbw = new TableBatchWriter(tableId);
tableWriters.put(tableId, tbw);
}
return tbw;
}
{code}
I recommend moving the synchronized block to happen only if the batchwriter is
not present, and also only checking if the table is online at that time:
{code}
@Override
public BatchWriter getBatchWriter(String tableName) throws AccumuloException,
AccumuloSecurityException, TableNotFoundException {
ArgumentChecker.notNull(tableName);
String tableId = Tables.getNameToIdMap(instance).get(tableName);
if (tableId == null)
throw new TableNotFoundException(tableId, tableName, null);
BatchWriter tbw = tableWriters.get(tableId);
if (tbw == null) {
if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
throw new TableOfflineException(instance, tableId);
tbw = new TableBatchWriter(tableId);
synchronized(tableWriters){
//only create a new table writer if we haven't been beaten to it.
if (tableWriters.get(tableId) == null)
tableWriters.put(tableId, tbw);
}
}
return tbw;
}
{code}
--
This message was sent by Atlassian JIRA
(v6.1#6144)