I think in almost all cases you’d just get a RuntimeException that wraps the 
TTransportException. We could certainly do a better job at (automatic) fault 
recovery than we currently do, especially in the “simple” cases where, for 
example, the proxy goes down and comes back while you are not using the Instance 
code (a simple “reconnect” should suffice). In other cases, we might be able 
to provide a more API-friendly exception (e.g., throw a 
MutationRejectedException if a push of Mutations fails due to a network 
problem). But our use case was convenience, not production, so we didn’t 
worry too much about failures other than to make sure we knew they happened.
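
To make the "simple reconnect" idea concrete, here is a rough sketch of what 
a retry wrapper could look like. It is not code from ProxyInstance: 
TransportFailure is a hypothetical stand-in for Thrift's TTransportException 
(so the sketch compiles without Thrift), and Reconnectable and 
RetryOnTransportFailure are names invented for illustration only.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for Thrift's TTransportException (assumption, not the real class). */
class TransportFailure extends Exception {
    TransportFailure(String message) {
        super(message);
    }
}

/** Hypothetical hook for re-opening the connection; not part of the Instance API. */
interface Reconnectable {
    void reconnect();
}

final class RetryOnTransportFailure {
    private RetryOnTransportFailure() {}

    /** Returns true if any exception in the cause chain is a TransportFailure. */
    static boolean isTransportFailure(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof TransportFailure) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs op. If it fails with a RuntimeException that wraps a TransportFailure,
     * reconnects and tries again, for at most maxAttempts attempts in total.
     * Any other exception is rethrown immediately.
     */
    static <T> T call(Reconnectable conn, int maxAttempts, Supplier<T> op) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                if (!isTransportFailure(e)) {
                    throw e; // not a connectivity problem, so do not retry
                }
                last = e;
                if (attempt < maxAttempts) {
                    conn.reconnect();
                }
            }
        }
        throw last; // every attempt failed with a transport problem
    }
}
```

Something like this only makes sense for calls that are safe to repeat; a 
half-sent batch of Mutations would probably be better served by the more 
API-friendly exception mentioned above.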

The proxy currently maintains the TTransport reference for the life of the 
Instance (rather than reconnecting on every request). One of the (other) 
issues with doing this is that we need an explicit “close()” on the Instance 
(not in the Instance API) when you are done with it. We toyed with the idea of 
creating/closing/destroying the transport on every access, which would remove 
the need for an explicit close as well as handle many of the intermittent 
connectivity issues you raise. But that would come at the cost of runtime 
performance in what is (hopefully) the average case. Perhaps that performance 
hit would be OK (hey, if you really need it to work fast, just use the standard 
library!) but we tried to make the runtime at least tolerable! :-) Maybe 
someone has a brilliant idea of how we could handle it better…
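
For anyone weighing the two options, the trade-off can be sketched roughly as 
follows. This is an illustration only, not the ProxyInstance implementation: 
Transport is a hypothetical stand-in for a Thrift TTransport, and 
LongLivedClient and PerRequestClient are made-up names.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a Thrift transport (assumption, not the real TTransport). */
interface Transport extends AutoCloseable {
    String send(String request);

    @Override
    void close(); // narrowed so that closing throws no checked exception
}

/** Option A: one transport for the life of the client; the caller must call close(). */
final class LongLivedClient implements AutoCloseable {
    private final Transport transport;

    LongLivedClient(Supplier<Transport> factory) {
        this.transport = factory.get(); // opened once, up front
    }

    String request(String r) {
        return transport.send(r);
    }

    @Override
    public void close() {
        transport.close();
    }
}

/** Option B: open and close a transport on every request; no explicit close needed. */
final class PerRequestClient {
    private final Supplier<Transport> factory;

    PerRequestClient(Supplier<Transport> factory) {
        this.factory = factory;
    }

    String request(String r) {
        // try-with-resources closes the transport even if send() throws
        try (Transport t = factory.get()) {
            return t.send(r);
        }
    }
}
```

Option A pays the connection setup cost once but leaks the transport if 
close() is forgotten; option B pays it on every call but always starts from a 
fresh connection.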

Thanks,
Dennis


From: Keith Turner
Sent: Thursday, September 03, 2015 11:18 AM
Subject: Re: Accumulo ProxyInstance available

That's interesting. Curious what you do in the case of faults? I.e., if the proxy 
goes down for a bit and then comes back up, or if the network between an Accumulo 
ProxyInstance client and the proxy goes down for a bit.

On Thu, Sep 3, 2015 at 7:33 AM, Patrone, Dennis S. wrote:
I just wanted to let folks know about the availability of a new project called 
'Accumulo ProxyInstance'. It is a Java Instance implementation that 
communicates with an Accumulo cluster via the Thrift proxy server.

Basically, we have an isolated Accumulo cluster only accessible to the rest of 
the network through a single, dual-homed gatekeeper machine.  We started the 
Thrift proxy server on the gatekeeper.  We then created this Instance 
implementation to allow developers to access the cluster from their development 
machines through the gatekeeper/proxy while still using the traditional Java 
Instance APIs. This shields most of the client code from needing to know 
whether it is running on the cluster or connecting through the proxy server 
(with the notable exception of the actual Instance instantiation).

Now for convenience we can test, debug, and perform smaller queries using the 
ProxyInstance on our development machines through the proxy server.  When 
performance demands it, we can simply change one line of code to instantiate a 
ZooKeeperInstance instead, move the code to the isolated subnet, and 
communicate directly with the tablet servers using the traditional Java client 
library.

You can read more about the project here: 
http://jhuapl.github.io/accumulo-proxy-instance/proxy_instance_user_manual.html
The code is available on GitHub 
(https://github.com/JHUAPL/accumulo-proxy-instance) and the JAR is on Maven 
Central 
(http://search.maven.org/#artifactdetails|edu.jhuapl.accumulo|proxy-instance|1.0.0|jar).
It was developed and tested against Accumulo 1.6.2.

We’ve found it useful in our development environment given our isolated 
Accumulo cluster; perhaps others with a similar setup will find it useful 
as well.

Thanks,
Dennis


