Chris McCubbin created ACCUMULO-1833:
----------------------------------------

             Summary: MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple threads
                 Key: ACCUMULO-1833
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1833
             Project: Accumulo
          Issue Type: Improvement
    Affects Versions: 1.5.0, 1.6.0
            Reporter: Chris McCubbin


This issue comes from profiling our application. We have a 
MultiTableBatchWriter created by normal means, and I am writing to it from 
multiple threads with calls like the following:

{code}
batchWriter.getBatchWriter(table).addMutations(mutations);
{code}

In my test with 4 threads writing to one table, this call is quite inefficient 
and results in a large performance degradation compared to a single BatchWriter.

I believe the culprit is that the method is synchronized, so every writer 
thread contends for the same lock. The ZooKeeper call to Tables.getTableState 
on every invocation may also be negatively affecting performance:

{code}
  @Override
  public synchronized BatchWriter getBatchWriter(String tableName) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
    ArgumentChecker.notNull(tableName);
    String tableId = Tables.getNameToIdMap(instance).get(tableName);
    if (tableId == null)
      throw new TableNotFoundException(tableId, tableName, null);
    
    if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
      throw new TableOfflineException(instance, tableId);
    
    BatchWriter tbw = tableWriters.get(tableId);
    if (tbw == null) {
      tbw = new TableBatchWriter(tableId);
      tableWriters.put(tableId, tbw);
    }
    return tbw;
  }
{code}

I recommend synchronizing only when the BatchWriter is not yet present, and 
checking whether the table is online only at that time:

{code}
  @Override
  public BatchWriter getBatchWriter(String tableName) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
    ArgumentChecker.notNull(tableName);
    String tableId = Tables.getNameToIdMap(instance).get(tableName);
    if (tableId == null)
      throw new TableNotFoundException(tableId, tableName, null);

    // Fast path without the lock. This assumes tableWriters tolerates
    // concurrent reads (for example a ConcurrentHashMap).
    BatchWriter tbw = tableWriters.get(tableId);
    if (tbw == null) {
      if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
        throw new TableOfflineException(instance, tableId);
      synchronized (tableWriters) {
        // Re-check under the lock; only create a new table writer if we
        // haven't been beaten to it, otherwise return the existing one.
        tbw = tableWriters.get(tableId);
        if (tbw == null) {
          tbw = new TableBatchWriter(tableId);
          tableWriters.put(tableId, tbw);
        }
      }
    }
    return tbw;
  }
{code}
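
For illustration only, the same "create only if absent" idea can be sketched 
outside of Accumulo with a ConcurrentHashMap, which avoids an explicit lock on 
the common path. WriterCache and Writer below are hypothetical stand-ins I made 
up for this sketch; they are not Accumulo classes, and the offline-table check 
is omitted:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the per-table writer cache; not Accumulo code. */
class WriterCache {

  /** Hypothetical placeholder for a per-table writer such as TableBatchWriter. */
  static final class Writer {
    final String tableId;

    Writer(String tableId) {
      this.tableId = tableId;
    }
  }

  private final ConcurrentMap<String, Writer> writers =
      new ConcurrentHashMap<String, Writer>();

  /** Returns the single shared writer for tableId, creating it on first use. */
  Writer get(String tableId) {
    // Fast path: no lock is taken when the writer already exists.
    Writer w = writers.get(tableId);
    if (w == null) {
      Writer candidate = new Writer(tableId);
      // putIfAbsent is atomic: exactly one thread's candidate is stored.
      Writer existing = writers.putIfAbsent(tableId, candidate);
      // If another thread won the race, use its writer and drop ours.
      w = (existing != null) ? existing : candidate;
    }
    return w;
  }
}
```

Note that under a race this sketch may construct a spare Writer that is 
immediately discarded, so it only fits if constructing the writer is cheap and 
has no side effects.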



