Hi Guys,
I have updated my code to use Apache commons-pool2 for connection pooling.
In this implementation, each connector has its own ZooKeeper instance...
Now my code looks like this:
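public void readTable(...) {
    Connector connector = null;
    try {
        connector = accumuloConnectionPool.getConnector();
        // Connector exposes createScanner(); setRange() is an instance method
        Scanner scanner = connector.createScanner(tableName, auths);
        try {
            scanner.setRange(range);
            for (Map.Entry<Key,Value> entry : scanner) {
                ...
            }
        } finally {
            scanner.close();
        }
    } finally {
        // getConnector() may have thrown, leaving connector null
        if (connector != null) {
            accumuloConnectionPool.releaseConnector(connector);
        }
    }
}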
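For anyone following along, the borrow/release pattern above can be sketched with only the JDK. This is a hypothetical stand-in, not the commons-pool2 code and not part of the Accumulo API: the class name BoundedPool, its methods, and the timeout parameter are all assumptions made for illustration. The pooled type is generic, so in a real service it would hold Connectors.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical fixed-size pool; illustrates borrow/release only.
final class BoundedPool<T> {
    private final BlockingQueue<T> idle;

    // Eagerly creates `size` objects using the supplied factory.
    BoundedPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Waits up to timeoutMillis for an idle object, then fails.
    T borrow(long timeoutMillis) {
        try {
            T item = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (item == null) {
                throw new IllegalStateException("pool exhausted");
            }
            return item;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before surfacing the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Returns an object to the pool; a null argument is ignored so that
    // callers can release unconditionally from a finally block.
    void release(T item) {
        if (item != null) {
            idle.offer(item);
        }
    }

    // Number of objects currently idle in the pool.
    int available() {
        return idle.size();
    }
}
```

The pool size is the knob to tune against concurrent request load. A caller would borrow inside try and release inside finally, as in readTable above.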
I've built a plugin for using Accumulo from the Play Framework (
www.playframework.org). If the above implementation looks good, I'll be
happy to publish a blog article about it.
Josh - It was nice meeting you at the Accumulo Summit last week.
Sincerely,
Jeff Schwartz
On Mon, May 19, 2014 at 10:45 PM, Josh Elser <[email protected]> wrote:
> Hi Jeff,
>
> Not a rookie question at all. This is an area in the API where we know we
> could make the lifecycle more obvious. We have a ticket somewhere for it.
>
> If you're using a single user/password to connect to Accumulo (not using
> special accounts per your QSL client), there's no reason you can't reuse
> Connectors. The number of Connectors you want to cache is likely relative
> to the concurrent user load of your service.
>
> The fun part here is that each Connector retains a reference to the
> Instance which it uses internally. There are synchronized calls inside each
> ZooKeeperInstance which may start to degrade when you get above maybe 50
> concurrent threads accessing it (ballpark guess).
>
> You also do not want to create a new ZooKeeperInstance for every request,
> as you're doing now. I believe it will cause you some issues in Java heap
> due to some nitty-gritty ZooKeeper details (ask if you're actually curious).
>
> In summary, definitely cache ZooKeeperInstances, but use some number
> relative to the number of users. Connectors can be cached too, but share
> Instances under the hood. Using HTTP benchmarking tools like JMeter with
> various client pool sizes should help you balance out these numbers.
>
> Hope this helps.
>
> - Josh
>
>
> On 5/19/14, 10:29 PM, Jeff Schwartz wrote:
>
>> Rookie Question... I've built a Query Service Layer (QSL) according to
>> the documentation from the Accumulo v1.6.0 User Manual. My question is
>> how often I should be getting a ZooKeeper Instance and Connector to
>> Accumulo. For example, here's some pseudocode for a typical service in
>> my QSL.
>>
>> public void readTable(...) {
>>     Instance instance = new ZooKeeperInstance(accumuloInstanceName,
>>             zooServers);
>>     Connector connector = instance.getConnector(username,
>>             passwordToken);
>>     Scanner scanner = connector.createScanner(tableName, auths);
>>     scanner.setRange(range);
>>     for (Map.Entry<Key,Value> entry : scanner) {
>>         ...
>>     }
>>     scanner.close();
>> }
>>
>> If I run these lines of code for every call in my RESTful service, then I
>> feel like that is generating a lot of extra connections to both
>> ZooKeeper and Accumulo. Additionally, I would assume that it will
>> have a negative impact on performance. Should I cache any Connectors or
>> ZooKeeper instances?
>>
>> Any suggestions or best practices would be greatly appreciated.
>>
>> Thanks in advance.
>>
>> Sincerely,
>> Jeff Schwartz
>>
>