[ 
https://issues.apache.org/jira/browse/ACCUMULO-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815056#comment-13815056
 ] 

Josh Elser commented on ACCUMULO-1833:
--------------------------------------

bq. Nope, just wanted to handle the system clock jumping backwards. Hopefully 
Guava's cache would handle this.

Ok, cool. I'll try to get to this today. I'll try to peek at the implementation 
again, but I think it handles this ok.

Should this go in 1.6.0 or >=1.6.1? I figured I'd put it in 1.5.1 either way.

> MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple 
> threads
> ---------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-1833
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1833
>             Project: Accumulo
>          Issue Type: Improvement
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Chris McCubbin
>         Attachments: ACCUMULO-1833-test.patch, ZooKeeperThreadUtilization.png
>
>
> This issue comes from profiling our application. We have a 
> MultiTableBatchWriter created by normal means. I am attempting to write to it 
> with multiple threads by doing things like the following:
> {code}
> batchWriter.getBatchWriter(table).addMutations(mutations);
> {code}
> In my test with 4 threads writing to one table, this call is quite 
> inefficient and results in a large performance degradation over a single 
> BatchWriter.
> I believe the culprit is the fact that the call is synchronized. Also there 
> is the possibility that the zookeeper call to Tables.getTableState on every 
> call is negatively affecting performance:
> {code}
>   @Override
>   public synchronized BatchWriter getBatchWriter(String tableName) throws 
> AccumuloException, AccumuloSecurityException, TableNotFoundException {
>     ArgumentChecker.notNull(tableName);
>     String tableId = Tables.getNameToIdMap(instance).get(tableName);
>     if (tableId == null)
>       throw new TableNotFoundException(tableId, tableName, null);
>     
>     if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
>       throw new TableOfflineException(instance, tableId);
>     
>     BatchWriter tbw = tableWriters.get(tableId);
>     if (tbw == null) {
>       tbw = new TableBatchWriter(tableId);
>       tableWriters.put(tableId, tbw);
>     }
>     return tbw;
>   }
> {code}
> I recommend moving the synchronized block to happen only if the batchwriter 
> is not present, and also only checking if the table is online at that time:
> {code}
>   @Override
>   public BatchWriter getBatchWriter(String tableName) throws 
> AccumuloException, AccumuloSecurityException, TableNotFoundException {
>     ArgumentChecker.notNull(tableName);
>     String tableId = Tables.getNameToIdMap(instance).get(tableName);
>     if (tableId == null)
>       throw new TableNotFoundException(tableId, tableName, null);
>     BatchWriter tbw = tableWriters.get(tableId);
>     if (tbw == null) {
>       if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
>           throw new TableOfflineException(instance, tableId);
>       tbw = new TableBatchWriter(tableId);
>       synchronized(tableWriters){
>           //only create a new table writer if we haven't been beaten to it.
>           if (tableWriters.get(tableId) == null)      
>               tableWriters.put(tableId, tbw);
>       }
>     }
>     return tbw;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to