Hi @Haosdent,

We have multiple networks, which could be one of the problems. I tried with
all 3 of them and it still shows the same error. Can you help me understand
what exactly the hostname flag expects in such a scenario?
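
A minimal sketch of one way to test a candidate `--hostname` value. The browser errors quoted below are `net::ERR_NAME_NOT_RESOLVED`, so the assumption here is that the name has to resolve on the machine running the browser, not only on the master nodes. `resolves` and `check_all` are hypothetical helper names; the only library call is Python's standard `socket.getaddrinfo`, and 5050 is the `--port` value from the master flags in this thread.

```python
import socket
from typing import Callable, Dict, Iterable


def resolves(hostname: str, port: int = 5050,
             lookup: Callable = socket.getaddrinfo) -> bool:
    """Return True if `hostname` resolves on the machine running this script."""
    try:
        lookup(hostname, port)
        return True
    except socket.gaierror:
        # Name resolution failed; a browser on this machine would likely
        # report net::ERR_NAME_NOT_RESOLVED for the same name.
        return False


def check_all(hostnames: Iterable[str], port: int = 5050,
              lookup: Callable = socket.getaddrinfo) -> Dict[str, bool]:
    """Map each candidate --hostname value to whether it resolves."""
    return {name: resolves(name, port, lookup) for name in hostnames}
```

Calling `check_all(["nid00016", "nid00000.local"])` on the workstation where the browser runs would show which of those names that machine can resolve. The `lookup` parameter exists only so the resolver can be swapped out when testing without a network.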

On Thu, Dec 15, 2016 at 6:08 PM, haosdent <[email protected]> wrote:

> Hi, @haripriya. What hostname flag do you use to start the master?
> According to the screenshot you posted before, I think you need to set it
> to something like `socrates-nid000xxx.us.cray.com`.
> However, according to the error log you posted above, you set the hostname
> flag to nid00016, which could not be resolved.
>
> On Fri, Dec 16, 2016 at 6:51 AM, Haripriya Ayyalasomayajula <
> [email protected]> wrote:
>
>> Hello @Haosdent,
>>
>> After I set the hostname flag, I still see the error. This is the output
>> I see in Chrome's developer tools:
>>
>> Failed to load resource: the server responded with a status of 404 (Not
>> Found)
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._2 Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._3 Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._4 Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._5 Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._6 Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._7 Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._8 Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._9 Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._a Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._b Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._c Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._d Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._e Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._f Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._g Failed to
>> load resource: net::ERR_NAME_NOT_RESOLVED
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._h Failed
>> to load resource: net::ERR_NAME_NOT_RESOLVED
>> angular-1.2.3.min.js:70 GET
>> http://nid00016:5050/master/state?jsonp=angular.callbacks._i
>> net::ERR_NAME_NOT_RESOLVED
>> g @ angular-1.2.3.min.js:70
>> (anonymous function) @ angular-1.2.3.min.js:71
>> D @ angular-1.2.3.min.js:68
>> h @ angular-1.2.3.min.js:66
>> D @ angular-1.2.3.min.js:91
>> D @ angular-1.2.3.min.js:91
>> (anonymous function) @ angular-1.2.3.min.js:93
>> $eval @ angular-1.2.3.min.js:101
>> $digest @ angular-1.2.3.min.js:98
>> $apply @ angular-1.2.3.min.js:101
>> (anonymous function) @ angular-1.2.3.min.js:111
>> e @ angular-1.2.3.min.js:33
>> (anonymous function) @ angular-1.2.3.min.js:37
>> angular-1.2.3.min.js:70 GET
>> http://nid00016:5050/metrics/snapshot?jsonp=angular.callbacks._j
>> net::ERR_NAME_NOT_RESOLVED
>> g @ angular-1.2.3.min.js:70
>> (anonymous function) @ angular-1.2.3.min.js:71
>> D @ angular-1.2.3.min.js:68
>> h @ angular-1.2.3.min.js:66
>> D @ angular-1.2.3.min.js:91
>> D @ angular-1.2.3.min.js:91
>> (anonymous function) @ angular-1.2.3.min.js:93
>> $eval @ angular-1.2.3.min.js:101
>> $digest @ angular-1.2.3.min.js:98
>> $apply @ angular-1.2.3.min.js:101
>> (anonymous function) @ angular-1.2.3.min.js:111
>> e @ angular-1.2.3.min.js:33
>> (anonymous function) @ angular-1.2.3.min.js:37
>>
>>
>> Also, regarding the "cluster flag", here is my output:
>>
>> nid00016: root     14940  2.5  0.0 2080192 85012 ?       Ssl  16:44
>> 0:08 /usr/sbin/mesos-master --zk=zk://192.168.0.1:2181,192.168.0.17:2181,
>> 192.168.0.33:2181/mesos --port=5050 --log_dir=/var/log/mesos
>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>> --hostname=nid00016 --quorum=2 --work_dir=/var/lib/mesos
>>
>> nid00016: root     14965  0.0  0.0 107892   612 ?        S    16:44
>> 0:00 logger -p user.info -t mesos-master[14940]
>>
>> nid00016: root     14966  0.0  0.0 107892   692 ?        S    16:44
>> 0:00 logger -p user.err -t mesos-master[14940]
>>
>> nid00016: root     15892  0.0  0.0 113116  1604 ?        Ss   16:50
>> 0:00 bash -c ps -aux | grep mesos-master
>>
>> nid00016: root     15959  0.0  0.0 112644   948 ?        S    16:50
>> 0:00 grep mesos-master
>>
>> nid00032: root     30018  2.5  0.0 2670032 26480 ?       Ssl  16:44
>> 0:08 /usr/sbin/mesos-master --zk=zk://192.168.0.1:2181,192.168.0.17:2181,
>> 192.168.0.33:2181/mesos --port=5050 --log_dir=/var/log/mesos
>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>> --hostname=nid00032 --quorum=2 --work_dir=/var/lib/mesos
>>
>> nid00032: root     30043  0.0  0.0 107892   612 ?        S    16:44
>> 0:00 logger -p user.info -t mesos-master[30018]
>>
>> nid00032: root     30044  0.0  0.0 107892   692 ?        S    16:44
>> 0:00 logger -p user.err -t mesos-master[30018]
>>
>> nid00032: root     31091  0.0  0.0 113116  1604 ?        Ss   16:50
>> 0:00 bash -c ps -aux | grep mesos-master
>>
>> nid00032: root     31158  0.0  0.0 112644   948 ?        S    16:50
>> 0:00 grep mesos-master
>>
>> nid00000: root     49753  3.7  0.0 3259912 27584 ?       Ssl  16:44
>> 0:13 /usr/sbin/mesos-master --zk=zk://192.168.0.1:2181,192.168.0.17:2181,
>> 192.168.0.33:2181/mesos --port=5050 --log_dir=/var/log/mesos
>> --acls=/etc/mesos_acls.json --authenticate_frameworks=true
>> --cluster="socrates" --credentials=/etc/marathon-auth/credentials
>> --hostname=nid00000.local --quorum=2 --work_dir=/var/lib/mesos
>>
>> nid00000: root     49778  0.0  0.0 107892   612 ?        S    16:44
>> 0:00 logger -p user.info -t mesos-master[49753]
>>
>> nid00000: root     49779  0.0  0.0 107892   692 ?        S    16:44
>> 0:00 logger -p user.err -t mesos-master[49753]
>>
>> nid00000: root     50887  0.0  0.0 113116  1604 ?        Ss   16:50
>> 0:00 bash -c ps -aux | grep mesos-master
>>
>> nid00000: root     50954  0.0  0.0 112648   948 ?        S    16:50
>> 0:00 grep mesos-master
>>
>> On Tue, Dec 6, 2016 at 6:58 PM, haosdent <[email protected]> wrote:
>>
>>> Hi, @Haripriya. It looks like there are some problems in your master
>>> flags.
>>>
>>> > I'm attaching a snapshot of the error I've seen in Chrome with this
>>> > email. It'll be great if you can suggest if I'm missing any configuration
>>> > or if it's some bug.
>>>
>>> According to the screenshot you attached, the hostnames are incorrect on
>>> your servers. The Mesos WebUI depends on them to find the leading master.
>>> A workaround is to specify the `--hostname` flag when starting your
>>> masters. For example, launch your masters with
>>>
>>> ```
>>> $ mesos-master --hostname=socrates-nid000xxx.us.cray.com xxx
>>> ```
>>>
>>> > Is it something to do with a stale state of mesos anywhere or the way
>>> > I'm passing cluster? I have a config file named cluster in
>>> > /etc/mesos-master/ and when I restart the cluster it picks up the config
>>> > files.
>>>
>>> You need to ensure the flags of every master contain
>>> `--cluster=your_cluster_name`.
>>>
>>> Could you run `ps aux | grep mesos-master` on every master and paste
>>> the output here?
>>>
>>>
>>> On Wed, Dec 7, 2016 at 4:39 AM, Haripriya Ayyalasomayajula <
>>> [email protected]> wrote:
>>>
>>>> Hello, @Haosdent,
>>>>
>>>> Thanks for suggesting these.
>>>> I'm attaching a snapshot of the error I've seen in Chrome with this
>>>> email. It'll be great if you can suggest if I'm missing any configuration
>>>> or if it's some bug.
>>>>
>>>> And for the second part, my `/master/state` endpoint does not return
>>>> "cluster" anywhere. It returned 75k lines of JSON, so I'm not pasting all
>>>> of it.
>>>> {
>>>>     "activated_slaves": 37.0,
>>>>     "build_date": "2016-11-16 01:31:49",
>>>>     "build_time": 1479259909.0,
>>>>     "build_user": "centos",
>>>>     "completed_frameworks": [
>>>>         {
>>>>             "active": true,
>>>>   ..........
>>>>
>>>>
>>>>
>>>>     "start_time": 1480967418.42687,
>>>>     "unregistered_frameworks": [],
>>>>     "version": "1.1.0"
>>>> }
>>>>
>>>> Is it something to do with a stale state of mesos anywhere or the way
>>>> I'm passing cluster? I have a config file named cluster in
>>>> /etc/mesos-master/ and when I restart the cluster it picks up the config
>>>> files.
>>>>
>>>> On Mon, Dec 5, 2016 at 6:24 PM, haosdent <[email protected]> wrote:
>>>>
>>>>> Hi, @Haripriya
>>>>>
>>>>> > (less than 1 min though the  jobs are running just fine).
>>>>> > Is there any new configuration that has to be added?
>>>>>
>>>>> We changed the WebUI to send requests via JSONP since 1.0. May I have
>>>>> your error log in Safari, Chrome and Firefox?
>>>>> You could open it via
>>>>> https://developers.google.com/web/tools/chrome-devtools/console/
>>>>>
>>>>> > The UI does not display the name of the cluster despite using the
>>>>> > --cluster flag.
>>>>>
>>>>> The --cluster flag works fine for me. Could you paste the output of your
>>>>> `/master/state` endpoint in this email? I would like to check the value
>>>>> of the `cluster` field in it.
>>>>>
>>>>> On Tue, Dec 6, 2016 at 5:34 AM, Haripriya Ayyalasomayajula <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have two issues with the web UI in Mesos 1.1.
>>>>>>
>>>>>> 1. Earlier, when I was using Mesos 0.28, the Mesos web UI would try to
>>>>>> reconnect only when there were network issues or when a new leader was
>>>>>> elected. After the upgrade to 1.1, it does not work on Safari (it shows
>>>>>> that no leader is elected even when there is a leader elected and jobs
>>>>>> are running happily). It works on Chrome and Firefox but tries to
>>>>>> reconnect very often (less than 1 min, though the jobs are running just
>>>>>> fine).
>>>>>>
>>>>>> Is there any new configuration that has to be added?
>>>>>>
>>>>>>
>>>>>> 2. The UI does not display the name of the cluster despite using the
>>>>>> --cluster flag.
>>>>>>
>>>>>> /usr/sbin/mesos-master --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos
>>>>>> --port=5050 --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>>>> --authenticate_frameworks=true --cluster="cluster1"
>>>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>>>>
>>>>>>
>>>>>> I also tried adding the name of the cluster without quotes: cluster1
>>>>>> instead of "cluster1", but that doesn't work either.
>>>>>>
>>>>>> /usr/sbin/mesos-master --zk=zk://mesos1:2181,mesos2:2181,mesos3:2181/mesos
>>>>>> --port=5050 --log_dir=/var/log/mesos --acls=/etc/mesos_acls.json
>>>>>> --authenticate_frameworks=true --cluster=cluster1
>>>>>> --credentials=/etc/auth/credentials --quorum=2 --work_dir=/var/lib/mesos
>>>>>>
>>>>>> I greatly appreciate any help!
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Haripriya
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Haosdent Huang
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Haripriya
>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>>
>>
>>
>>
>> --
>> Regards,
>> Haripriya Ayyalasomayajula
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
>



-- 
Regards,
Haripriya Ayyalasomayajula
