Hi John,

Yes, I changed the conf files for the slaves, masters, gc, tracers and monitor from 'localhost' to the server's domain name. In case it's relevant, the server is a team member's home server, and 'hostname --fqdn' returns Home.home.
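One thing that may be worth ruling out (a guess, not something confirmed in this thread) is whether the name now in those conf files resolves to a loopback address on the server itself, which would leave the processes bound the same way 'localhost' did. Below is a minimal sketch using only the JDK's java.net.InetAddress. The class name HostCheck and the method resolvesOnlyToLoopback are made up for illustration and are not part of Accumulo.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Accumulo: reports whether a host name
// resolves exclusively to loopback addresses (127.x.x.x or ::1).
class HostCheck {

    static boolean resolvesOnlyToLoopback(String host) {
        try {
            InetAddress[] addrs = InetAddress.getAllByName(host);
            for (InetAddress addr : addrs) {
                if (!addr.isLoopbackAddress()) {
                    return false; // at least one non-loopback address
                }
            }
            return addrs.length > 0;
        } catch (UnknownHostException e) {
            // The name does not resolve at all on this machine.
            throw new IllegalArgumentException("cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) {
        // Pass the name used in the conf files, e.g. the output of 'hostname --fqdn'.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " loopback-only: " + resolvesOnlyToLoopback(host));
    }
}
```

Running it on the server with the FQDN as the argument shows whether that name maps only to loopback there. If it does, the /etc/hosts entry for the name would be the first place to look.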
On Sunday, April 12, 2015, John Vines <[email protected]> wrote:
> Did you set the names in the slaves, master, etc. files to the server name
> from localhost?
>
> On Sat, Apr 11, 2015 at 7:47 PM Ryan <[email protected]> wrote:
>
>> Sorry to bring back an old thread but I'm working with Accumulo at a
>> hackathon and am running into this same issue with being unable to
>> connect to zookeeper from the local machine (the program hangs at
>> inst.getConnector).
>>
>> Dave, what did you fix in Hadoop to get it to work? I changed the 5
>> mentioned conf files to my server's domain name, deleted the hdfs accumulo
>> directory, and reinstalled using accumulo init. With the domain name there,
>> I'm now unable to even start Accumulo.
>>
>> Josh, I recall you helped me with this a little while back on RHEL. I've
>> been poring over my notes but have yet to find a solution.
>>
>> Any help would be greatly appreciated.
>>
>> Thanks!
>> Ryan
>>
>> On Tue, Mar 3, 2015 at 5:21 PM, Josh Elser <[email protected]> wrote:
>>
>>> Excellent! Happy to hear it.
>>>
>>> Simple problem, but multiple places to fix it in :)
>>>
>>> David Patterson wrote:
>>>
>>>> Josh, I just wanted to close the loop on this problem. I redid the
>>>> installation making sure there were no references to localhost or
>>>> 127.0.0.1. There was a problem in Hadoop that I was able to solve with
>>>> the help of the Hadoop user group.
>>>>
>>>> The combo of no localhosts and the correct hadoop configuration and
>>>> initialization has worked.
>>>>
>>>> I am now able to run code from my Windows machine in Eclipse that
>>>> references the Accumulo store in my cloud machine and get the correct
>>>> answers back.
>>>>
>>>> Thank you for your help.
>>>>
>>>> Dave Patterson
>>>>
>>>> On Thu, Feb 19, 2015 at 4:29 PM, Josh Elser <[email protected]> wrote:
>>>>
>>>> Ah! There's the rub.
>>>>
>>>> > At this point, I see that the ThriftTransportKey has a host name:
>>>> > "localhost" and a port of "9997".
>>>>
>>>> Double check your configuration files:
>>>> $ACCUMULO_CONF_DIR/{masters,monitor,slaves,gc,tracers}
>>>>
>>>> These files control what network interface your Accumulo processes
>>>> bind on. Because they only bound to localhost, your application
>>>> worked when run on the same machine, but not on any remote machine.
>>>>
>>>> Typically, you want to put the FQDN in these files.
>>>>
>>>> David Patterson wrote:
>>>>
>>>> Josh and anyone else interested,
>>>>
>>>> More data on this problem.
>>>>
>>>> I have tried debugging the code in Eclipse (running it on my Windows
>>>> machine). The ZooKeeperInstance is working fine in this remote mode. I
>>>> can query the instance, and get the instanceID, instance Name,
>>>> zookeepers string, and session timeout.
>>>>
>>>> I've also tried creating a ZooCache and a UUID object with the long
>>>> string value of my actual instance identification. If I do
>>>> String instanceName = ZooKeeperInstance.lookupInstance( zooCache uuid);
>>>> It is able to return the string name of the instance. So, that part of
>>>> the communication seems to be fine.
>>>>
>>>> The hang-up is still coming on the instance.getConnector( username, new
>>>> PasswordToken( password));
>>>>
>>>> It hangs, and when I ran my code in debug mode on Eclipse, I interrupted
>>>> it while it was doing nothing.
>>>>
>>>> I see a long string of calls going from
>>>> ZooKeeperInstance.getConnector
>>>> to ConnectorImpl constructor
>>>> to ServerClient.execute
>>>> to ServerClient.executeRaw
>>>> to ServerClient.getConnection(Instance)
>>>> to ServerClient.getConnection(Instance, boolean)
>>>> to ServerClient.getConnection(Instance, boolean, long)
>>>> to ThriftTransportPool.getAnyTransport(List<ThriftTransportKey>, boolean)
>>>>
>>>> At this point, I see that the ThriftTransportKey has a host name:
>>>> "localhost" and a port of "9997".
>>>>
>>>> From there, it goes to ThriftUtil.createClientTransport,
>>>> TTimeoutTransport.create(HostAndPort),
>>>> TTimeoutTransport(SocketAddress, long),
>>>> SocketAdapter.connect(SocketAddress),
>>>> SocketAdapter.connect(SocketAddress, int),
>>>> SocketChannelImpl.connect(SocketAddress),
>>>> Net.connect(FileDescriptor, InetAddress, int),
>>>> Net.connect(ProtocolFamily, FileDescriptor, InetAddress, int)
>>>> and finally
>>>> Net.connect0(boolean, FileDescriptor, InetAddress, int)
>>>>
>>>> I guess I don't understand why this is going into Thrift code.
>>>>
>>>> Is there some authorization I need to provide to let me do a remote
>>>> connection into Accumulo (Zookeeper seems happy to work, but is Accumulo
>>>> stopping me?)?
>>>>
>>>> If anyone wants line numbers, etc. I can supply more info.
>>>>
>>>> Dave Patterson
>>>>
>>>> On Wed, Feb 18, 2015 at 10:20 AM, Josh Elser <[email protected]> wrote:
>>>>
>>>> > a) a copy of Zookeeper running on the machine from which I'm
>>>> > calling for data
>>>> > b) call the "local" zookeeper for data and let it connect to the
>>>> > remote node for the data?
>>>>
>>>> No, a ZooKeeper server does not have to be machine local for you to
>>>> use it. It just has to be reachable on the network.
>>>>
>>>> I'm sorry to say, I'm kind of at a loss. I'm not sure what you are
>>>> running into. You could try remote debugging your application on the
>>>> "other" cloud machine to see how exactly your code is converting the
>>>> instance name into the instanceID (and confirm that the value in the
>>>> TCredentials object is, in fact, different than what you expect it
>>>> to be).
>>>>
>>>> As for your local windows machine, I know some people have connected
>>>> to Accumulo from Windows before, but it is a YMMV platform.
>>>> Hopefully it works just fine because it's Java under the hood, but
>>>> we have no tests to guarantee that this does work.
>>>>
>>>> David Patterson wrote:
>>>>
>>>> Josh, thanks for your help.
>>>> 1) Running on the machine that has the accumulo/hadoop/zookeeper code,
>>>> in the accumulo shell for the user name "dave" I see the UUID for my
>>>> instance.
>>>> 2) Running on the "other" machine, launching the zookeeper client,
>>>> pointing to the ip address of the server and issuing the get
>>>> /accumulo/instance/{my-instance-name}, I see the same UUID for the
>>>> instance.
>>>> 3) Running on the "other" machine, when I run my java code to connect to
>>>> the remote machine with the proper instance name, userid and password, I
>>>> get the INVALID_INSTANCEID as described in detail above.
>>>> 4) Running on my normal machine (Windows) running eclipse where I've
>>>> developed the code, if I run the code as a Java Application, it hangs.
>>>> 5) Running on my windows machine, if I debug the application, I can
>>>> interrupt it when it hangs up and it is waiting on the line with
>>>> Connector connector = instance.getConnector( acUserName, new
>>>> PasswordToken( acPassword));
>>>>
>>>> Can my application create a connector to a remote machine's
>>>> ZookeeperInstance and reference it from "afar"? Do I have to have:
>>>> a) a copy of Zookeeper running on the machine from which I'm calling
>>>> for data
>>>> b) call the "local" zookeeper for data and let it connect to the remote
>>>> node for the data?
>>>>
>>>> The code I'm writing receives a row identifier as a String parameter,
>>>> creates a Scanner, sets the range to a single row (same value for both
>>>> ends of the range) and iterates over the (one and only) row.
>>>>
>>>> I'm using Accumulo 1.6.1, Hadoop 2.6.0, and zookeeper 3.4.6, Java 7
>>>> (Oracle). The two cloud machines are running Ubuntu 14.04.
>>>>
>>>> Thanks.
>>>>
>>>> Dave
>>>>
>>>> On Tue, Feb 17, 2015 at 5:24 PM, Josh Elser <[email protected]> wrote:
>>>>
>>>> Oops, sorry. I used '>' to denote the shell prompt. The bits below
>>>> where it converted them to a quote is just meant to denote commands
>>>> that are run inside the zkCli :)
>>>>
>>>> Josh Elser wrote:
>>>>
>>>> If you're using the same exact code on both machines, it sounds like you
>>>> might have something unexpected going on with your networking.
>>>>
>>>> Accumulo can share ZooKeeper and HDFS instances -- it uses the notion of
>>>> an InstanceID to do this. The InstanceID is a UUID assigned to an
>>>> Accumulo instance during `accumulo init`. Because a UUID is hard to
>>>> memorize, and you need to identify the Accumulo instance you want to
>>>> connect to in the client API, there is also a mapping of some
>>>> 'easy-to-remember' name to that UUID. For example 'daves_accumulo' maps
>>>> to '12345678-1234-1234-123456789012'.
>>>>
>>>> The error you're seeing is because the UUID your client found from the
>>>> `instanceName` is different than the instanceID the Accumulo server has.
>>>> A quick sanity check is to look at ZooKeeper:
>>>>
>>>> zkCli.sh -server your_zk_host:2181
>>>>
>>>> > get /accumulo/instances/your_instance_name
>>>>
>>>> Compare the value of that node (first line of output) with the instance
>>>> ID displayed on the Accumulo monitor (top of the page). They should be
>>>> the same.
>>>>
>>>> I don't think I've ever seen this personally, so I'm not sure what to
>>>> guess at how it happened. It's possible you might have networking messed
>>>> up and are talking to a different ZooKeeper than you think you are
>>>> (common problem if you have misconfigured a quorum and each ZK node is
>>>> acting independent instead of together). A quick fix would be to change
>>>> the node in ZK to the correct instance ID.
>>>>
>>>> zkCli.sh -server your_zk_host:2181
>>>>
>>>> > delete /accumulo/instances/your_instance_name
>>>> > create /accumulo/instances/your_instance_name instance_id_from_monitor
>>>>
>>>> If that doesn't help, please give us some more information (versions
>>>> you're using, how you set up the system, anything special you did).
>>>>
>>>> David Patterson wrote:
>>>>
>>>> I'm running a very simple test configuration on an Ubuntu 14
>>>> machine. If I run code on that machine I can read the data I've added.
>>>>
>>>> I'm only using column family name, (empty_text for the qualifier) and
>>>> a value -- no authorizations.
>>>>
>>>> When I run the exact same program (identical jar) on another Ubuntu 14
>>>> machine, I get
>>>>
>>>> org.apache.accumulo.core.client.AccumuloSecurityException: Error
>>>> INVALID_INSTANCEID for user dave - Unknown security exception
>>>>     at org.apache.accumulo.core.client.impl.ServerClient.execute(ServerClient.java:63)
>>>>     at org.apache.accumulo.core.client.impl.ConnectorImpl.<init>(ConnectorImpl.java:70)
>>>>     at org.apache.accumulo.core.client.ZooKeeperInstance.getConnector(ZooKeeperInstance.java:240)
>>>>     at com.iai.diad.data.ImageDAO_A.<init>(ImageDAO_A.java:123)
>>>>     at com.iai.diad.data.ImageDAO_A.main(ImageDAO_A.java:63)
>>>> Caused by: ThriftSecurityException(user:dave, code:INVALID_INSTANCEID)
>>>>
>>>> The error occurs on the instance.getConnector call (the second line
>>>> below)
>>>>
>>>> instance = new ZooKeeperInstance(instanceName, zooServers);
>>>> connector = instance.getConnector( acUserName, new PasswordToken( acPassword));
>>>>
>>>> One possible source for strangeness is that both of these machines are
>>>> on a cloud server. Each of them has 2 ip addresses -- one that is
>>>> available from the outside, and one that is available only inside the
>>>> cloud. I'm using the outside-the-cloud ip address in the zooServers
>>>> string.
>>>>
>>>> The /etc/hosts file on the machine with the Accumulo data has the
>>>> external ip address as the name of the machine. It also has 127.0.0.1
>>>> defined as localhost.
>>>>
>>>> Any suggestions?
>>>>
>>>> Dave Patterson
