> a) a copy of Zookeeper running on the machine from which I'm calling for data
> b) call the "local" zookeeper for data and let it connect to the remote node for the data?

No, a ZooKeeper server does not have to be machine local for you to use it. It just has to be reachable on the network.
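
If you want to rule out basic connectivity first, something like the following could be run from the client machine. This is only a rough sketch using plain JDK sockets, not anything from the Accumulo API: the class name ZkReachability and its helpers are made up for illustration, it assumes your zooServers string is a comma-separated list of host or host:port entries without a chroot suffix, and it assumes 2181 as the port when none is given. A successful TCP connect only shows that something is listening on that port, not that it is the ZooKeeper you expect.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks that each ZooKeeper host:port accepts TCP connections. */
class ZkReachability {
    // Assumed default; 2181 is the usual ZooKeeper client port.
    static final int DEFAULT_ZK_PORT = 2181;

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Unresolvable host, refused connection, timeout, or bad port.
            return false;
        }
    }

    /** Extracts the host from "host" or "host:port". */
    static String hostOf(String entry) {
        int colon = entry.lastIndexOf(':');
        return (colon < 0 ? entry : entry.substring(0, colon)).trim();
    }

    /** Extracts the port from "host:port", falling back to the assumed default. */
    static int portOf(String entry) {
        int colon = entry.lastIndexOf(':');
        return colon < 0 ? DEFAULT_ZK_PORT : Integer.parseInt(entry.substring(colon + 1).trim());
    }

    public static void main(String[] args) {
        // Pass the same string you give to ZooKeeperInstance, e.g. "zk1:2181,zk2:2181".
        String zooServers = args.length > 0 ? args[0] : "localhost:2181";
        for (String entry : zooServers.split(",")) {
            String host = hostOf(entry);
            int port = portOf(entry);
            boolean ok = isReachable(host, port, 3000);
            System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
        }
    }
}
```

If every entry reports as reachable and the client still fails or hangs, the problem is more likely in what the servers advertise (hostnames, internal vs. external addresses) than in ZooKeeper being remote.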

I'm sorry to say, I'm kind of at a loss; I'm not sure what you are running into. You could try remote debugging your application on the "other" cloud machine to see exactly how your code is converting the instance name into the instanceID (and confirm that the value in the TCredentials object is, in fact, different from what you expect it to be).
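
While you are in the debugger, a small comparison helper may make the check less error-prone than eyeballing two UUIDs. The sketch below is plain JDK code, not part of Accumulo: InstanceIdCheck and its methods are hypothetical names, and it assumes the data stored under /accumulo/instances/<name> is the instance ID as a UTF-8 string, which is how I understand the 1.6 client reads it.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper: compares the instance ID read from ZooKeeper with the expected one. */
class InstanceIdCheck {

    /** Parses znode data (assumed to be a UTF-8 UUID string) into a UUID. */
    static UUID parseInstanceId(byte[] znodeData) {
        String text = new String(znodeData, StandardCharsets.UTF_8).trim();
        return UUID.fromString(text); // throws IllegalArgumentException if not a UUID
    }

    /** Returns true only if both values parse as UUIDs and are equal. */
    static boolean matches(byte[] znodeData, String expectedId) {
        try {
            return parseInstanceId(znodeData).equals(UUID.fromString(expectedId.trim()));
        } catch (IllegalArgumentException e) {
            return false; // one side is not a well-formed UUID
        }
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the bytes read from ZooKeeper and
        // the instance ID shown at the top of the Accumulo monitor page.
        byte[] fromZooKeeper =
            "11111111-2222-3333-4444-555555555555".getBytes(StandardCharsets.UTF_8);
        String fromMonitor = "11111111-2222-3333-4444-555555555555";
        System.out.println("instance IDs match: " + matches(fromZooKeeper, fromMonitor));
    }
}
```

Comparing parsed UUIDs rather than raw strings sidesteps differences in case or trailing whitespace, so a reported mismatch means the two IDs really are different.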

As for your local Windows machine, I know some people have connected to Accumulo from Windows before, but it is a YMMV platform. Hopefully it works just fine because it's Java under the hood, but we have no tests to guarantee that it does.

David Patterson wrote:
Josh, thanks for your help.
1) Running on the machine that has the accumulo/hadoop/zookeeper code,
in the accumulo shell for the user name "dave" I see the UUID for my
instance.
2) Running on the "other" machine, launching the zookeeper client,
pointing to the ip address of the server and issuing the get
/accumulo/instance/{my-instance-name}, I see the same UUID for the
instance.
3) Running on the "other" machine, when I run my java code to connect to
the remote machine with the proper instance name, userid and password, I
get the INVALID_INSTANCEID as described in detail above.
4) Running on my normal machine (Windows) running eclipse where I've
developed the code, if I run the code as a Java Application, it hangs.
5) Running on my windows machine, if I debug the application, I can
interrupt it when it hangs up and it is waiting on the line with
      Connector connector = instance.getConnector( acUserName, new
PasswordToken( acPassword));

Can my application create a connector to a remote machine's
ZookeeperInstance and reference it from "afar"? Do I have to have:
a)  a copy of Zookeeper running on the machine from which I'm calling
for data
b) call the "local" zookeeper for data and let it connect to the remote
node for the data?

The code I'm writing receives a row identifier as a String parameter,
creates a Scanner, sets the range to a single row (same value for both
ends of the range) and iterates over the (one and only) row.

I'm using Accumulo 1.6.1, Hadoop 2.6.0, and zookeeper 3.4.6, Java 7
(Oracle). The two cloud machines are running Ubuntu 14.04.

Thanks.

Dave




On Tue, Feb 17, 2015 at 5:24 PM, Josh Elser wrote:

    Oops, sorry. I used '>' to denote the shell prompt. The bits below
    where it converted them to a quote is just meant to denote commands
    that are run inside the zkCli :)


    Josh Elser wrote:

        If you're using the same exact code on both machines, it sounds
        like you
        might have something unexpected going on with your networking.

        Accumulo can share ZooKeeper and HDFS instances -- it uses the
        notion of
        an InstanceID to do this. The InstanceID is a UUID assigned to an
        Accumulo instance during `accumulo init`. Because a UUID is hard to
        memorize, and you need to identify the Accumulo instance you want to
        connect to in the client API, there is also a mapping of some
        'easy-to-remember' name to that UUID. For example
        'daves_accumulo' maps
        to '12345678-1234-1234-123456789012'.

        The error you're seeing is because the UUID your client found
        from the
        `instanceName` is different than the instanceID the Accumulo
        server has.
        A quick sanity check is to look at ZooKeeper:

        zkCli.sh -server your_zk_host:2181

            get /accumulo/instances/your_instance_name


        Compare the value of that node (first line of output) with the
        instance
        ID displayed on the Accumulo monitor (top of the page). They
        should be
        the same.

        I don't think I've ever seen this personally, so I'm not sure
        what to
        guess at how it happened. It's possible you might have
        networking messed
        up and are talking to a different ZooKeeper than you think you are
        (common problem if you have misconfigured a quorum and each ZK
        node is
        acting independent instead of together). A quick fix would be to
        change
        the node in ZK to the correct instance ID.

        zkCli.sh -server your_zk_host:2181

            delete /accumulo/instances/your_instance_name
            create /accumulo/instances/your_instance_name instance_id_from_monitor


        If that doesn't help, please give us some more information (versions
        you're using, how you set up the system, anything special you did).

        David Patterson wrote:

            I'm running a very simple test configuration with one Ubuntu 14
            machine. If I run code on that machine I can read the data
            I've added.

            I'm only using column family name, (empty_text for the
            qualifier) and
            a value -- no authorizations.

            When I run the exact same program (identical jar) on another
            Ubuntu 14
            machine, I get

            org.apache.accumulo.core.client.AccumuloSecurityException: Error
            INVALID_INSTANCEID for user dave - Unknown security exception
            at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:63)
            at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:70)
            at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:240)
            at com.iai.diad.data.ImageDAO_A.<init>(ImageDAO_A.java:123)
            at com.iai.diad.data.ImageDAO_A.main(ImageDAO_A.java:63)
            Caused by: ThriftSecurityException(user:dave, code:INVALID_INSTANCEID)

            The error occurs on the instance.getConnector call (the
            second line
            below)

            instance = new ZooKeeperInstance(instanceName, zooServers);
            connector = instance.getConnector(acUserName, new PasswordToken(acPassword));

            One possible source for strangeness is that both of these
            machines are
            on a cloud server. Each of them has 2 ip addresses -- one
            that is
            available from the outside, and one that is available only
            inside the
            cloud. I'm using the outside-the-cloud ip address in the
            zooServers
            string.

            The /etc/hosts file on the machine with the Accumulo data
            has the
            external ip address as the name of the machine. It also has
            127.0.0.1
            defined as localhost.

            Any suggestions?

            Dave Patterson

