Inline..
On Apr 20, 2016, at 3:35 AM, Jakub Pavlik
<[email protected]> wrote:
Hi Raj,
I am running latest build of 3.0 from yesterday. I have everything in active
state now, but still every 15 seconds in
/var/log/contrail/contrail-analytics-api.log:
04/20/2016 12:15:50 PM [contrail-analytics-api]: Exception TimeoutError in uve stream proc. Arguments:
('Timeout reading from socket',) : traceback Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/opserver/partition_handler.py", line 354, in _run
    for message in pb.listen():
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2215, in listen
    response = self.handle_message(self.parse_response(block=True))
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2150, in parse_response
    return self._execute(connection, connection.read_response)
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2132, in _execute
    return command(*args)
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 569, in read_response
    response = self._parser.read_response()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 224, in read_response
    response = self._buffer.readline()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 162, in readline
    self._read_from_socket()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 133, in _read_from_socket
    raise TimeoutError("Timeout reading from socket")
TimeoutError: Timeout reading from socket
I am not sure what's happening here. Did you check the redis-server logs? Is it
restarting?
I also have all config, analytics and database nodes in alert mode in the GUI,
with the message "Analytics Node config missing or incorrect".
The expectation is that each of the nodes is added to the config, and the GUI
checks for that. See
https://github.com/Juniper/contrail-controller/blob/master/src/config/utils/provision_config_node.py
Each node in cluster has following uve topics in kafka:
/usr/share/kafka/bin/kafka-topics.sh --zookeeper 172.16.20.101:2181 --list
-uve-0
-uve-1
-uve-10
-uve-11
-uve-12
-uve-13
-uve-14
-uve-2
-uve-3
-uve-4
-uve-5
-uve-6
-uve-7
-uve-8
-uve-9
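A listing like this can be checked for gaps mechanically. The following is a minimal sketch, assuming the cluster is expected to have 15 UVE partitions numbered 0 to 14 (as the listing suggests); the function name uve_partitions is made up for illustration and is not part of Contrail or Kafka.

```python
import re


def uve_partitions(listing):
    """Extract partition numbers from topic names ending in '-uve-<n>'."""
    found = set()
    for line in listing.splitlines():
        match = re.search(r"-uve-(\d+)$", line.strip())
        if match:
            found.add(int(match.group(1)))
    return found


# Output of `kafka-topics.sh --list` as pasted above.
listing = """\
-uve-0
-uve-1
-uve-10
-uve-11
-uve-12
-uve-13
-uve-14
-uve-2
-uve-3
-uve-4
-uve-5
-uve-6
-uve-7
-uve-8
-uve-9
"""

# Assumption: 15 partitions, numbered 0..14.
expected = set(range(15))
found = uve_partitions(listing)
missing = sorted(expected - found)
print("topics found:", len(found), "missing:", missing)
```

For the listing above this reports no missing topics, so the Kafka side appears complete and any gap would have to come from what the alarm-gen instances publish.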
Some alarm-gen instances have fewer partitions than others. I have 3 controllers:
{
"admin_state": "up",
"ep_id": "openstack-stg-ctl03",
"ep_type": "AlarmGenerator",
"hbcount": 0,
"heartbeat": 1461148260,
"in_use": 0,
"info": {
"instance-id": "0",
"ip-address": "172.16.20.103",
"partitions": "{\"1\": 1461141724586667, \"4\":
1461141724643808, \"6\": 1461141724644764, \"7\": 1461141724645688, \"9\":
1461141724646408, \"11\": 1461141724647377, \"12\": 1461141724651648, \"13\":
1461141724656747, \"14\": 1461141724662476}",
"redis-port": "6379"
},
"oper_state": "up",
"oper_state_msg": "",
"prov_state": "up",
"remote": "172.16.20.103",
"sequence": "1461148260openstack-stg-ctl02",
"service_id": "openstack-stg-ctl03:AlarmGenerator",
"service_type": "AlarmGenerator",
"status": "up",
"ts_created": 1461148260,
"ts_use": 49217,
"version": "1.0"
},
Is it correct that there are two entries for the same node, one with the FQDN
(openstack-stg-ctl01.openstack.tcpcloud.eu) and one with the hostname
(openstack-stg-ctl01)?
This seems incorrect. Maybe a restart of AlarmGen will clean it up; the
hostname may have been changed after alarmgen started up.
{
"admin_state": "up",
"ep_id": "openstack-stg-ctl01.openstack.tcpcloud.eu",
"ep_type": "AlarmGenerator",
"hbcount": 0,
"heartbeat": 1460968900,
"in_use": 0,
"info": {
"instance-id": "0",
"ip-address": "172.16.20.101",
"partitions": "{\"0\": 1460965891588811, \"10\":
1460965891603645, \"3\": 1460965891589092, \"5\": 1460965891599416}",
"redis-port": "6379"
},
{
"admin_state": "up",
"ep_id": "openstack-stg-ctl01",
"ep_type": "AlarmGenerator",
"hbcount": 0,
"heartbeat": 1461148265,
"in_use": 0,
"info": {
"instance-id": "0",
"ip-address": "172.16.20.101",
"partitions": "{\"0\": 1461147595465749, \"10\":
1461147595474188, \"3\": 1461147595478460, \"5\": 1461147595479857}",
"redis-port": "6379"
},
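The comparison of these entries can be made less error-prone with a small script. This is only a sketch under stated assumptions: it works on the three AlarmGenerator entries quoted above (the openstack-stg-ctl02 entry is not shown here, so a real check would need all entries), it assumes 15 partitions numbered 0 to 14 as in the Kafka listing, and the helper names make_entry and summarize are invented for illustration.

```python
import json

# Assumption: 15 UVE partitions, matching the -uve-0 .. -uve-14 topics.
EXPECTED_PARTITIONS = set(range(15))


def make_entry(ep_id, ip, partitions):
    """Build a trimmed-down discovery entry; 'partitions' is a JSON string, as in the dump."""
    return {"ep_id": ep_id, "ep_type": "AlarmGenerator",
            "info": {"ip-address": ip, "partitions": json.dumps(partitions)}}


def summarize(entries):
    """Return (missing partitions, partitions with >1 owner, IPs with >1 ep_id)."""
    owners, by_ip = {}, {}
    for entry in entries:
        if entry.get("ep_type") != "AlarmGenerator":
            continue
        info = entry["info"]
        by_ip.setdefault(info["ip-address"], []).append(entry["ep_id"])
        for part in json.loads(info["partitions"]):
            owners.setdefault(int(part), []).append(entry["ep_id"])
    missing = sorted(EXPECTED_PARTITIONS - set(owners))
    contested = {p: ids for p, ids in sorted(owners.items()) if len(ids) > 1}
    duplicate_hosts = {ip: ids for ip, ids in by_ip.items() if len(ids) > 1}
    return missing, contested, duplicate_hosts


# Partition maps copied from the entries quoted above.
entries = [
    make_entry("openstack-stg-ctl03", "172.16.20.103", {
        "1": 1461141724586667, "4": 1461141724643808, "6": 1461141724644764,
        "7": 1461141724645688, "9": 1461141724646408, "11": 1461141724647377,
        "12": 1461141724651648, "13": 1461141724656747, "14": 1461141724662476}),
    make_entry("openstack-stg-ctl01.openstack.tcpcloud.eu", "172.16.20.101", {
        "0": 1460965891588811, "10": 1460965891603645,
        "3": 1460965891589092, "5": 1460965891599416}),
    make_entry("openstack-stg-ctl01", "172.16.20.101", {
        "0": 1461147595465749, "10": 1461147595474188,
        "3": 1461147595478460, "5": 1461147595479857}),
]

missing, contested, duplicate_hosts = summarize(entries)
print("missing partitions:", missing)
print("partitions with more than one owner:", sorted(contested))
print("IPs registered under more than one ep_id:", duplicate_hosts)
```

With only these three entries, partitions 2 and 8 have no owner, partitions 0, 3, 5 and 10 are each claimed by both the FQDN and the short-hostname entry, and 172.16.20.101 is registered twice.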
Thanks,
Jakub
On 19.4.2016 22:30, Raj Reddy wrote:
There were some fixes in this area, what version of software are you using?
Can you verify if all alarm-gen are publishing partition numbers from 1 to 15
to discovery at <ip>:5998?
contrail-analytics-api reads this from discovery and complains if it’s
incomplete.
-Raj
On Apr 19, 2016, at 4:38 AM, Jakub Pavlik
<[email protected]> wrote:
Hello,
has anybody hit this issue with Contrail 3.0 and analytics-api? It shows up
even when all services are active:
== Contrail Analytics ==
supervisor-analytics: active
contrail-alarm-gen active
contrail-analytics-api initializing
(UvePartitions:UVE-Aggregation[Partitions:13] connection down)
contrail-analytics-nodemgr active
contrail-collector active
contrail-query-engine active
contrail-snmp-collector active
contrail-topology active
/var/log/contrail/contrail-analytics-api.log
04/19/2016 01:35:55 PM [contrail-analytics-api]: Exception TimeoutError in uve stream proc. Arguments:
('Timeout reading from socket',) : traceback Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/opserver/partition_handler.py", line 354, in _run
    for message in pb.listen():
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2215, in listen
    response = self.handle_message(self.parse_response(block=True))
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2150, in parse_response
    return self._execute(connection, connection.read_response)
  File "/usr/lib/python2.7/dist-packages/redis/client.py", line 2132, in _execute
    return command(*args)
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 569, in read_response
    response = self._parser.read_response()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 224, in read_response
    response = self._buffer.readline()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 162, in readline
    self._read_from_socket()
  File "/usr/lib/python2.7/dist-packages/redis/connection.py", line 133, in _read_from_socket
    raise TimeoutError("Timeout reading from socket")
TimeoutError: Timeout reading from socket
Redis server seems to be working fine. I see this problem in two independent
deployments.
Thanks,
jakub
_______________________________________________
Users mailing list
[email protected]
http://lists.opencontrail.org/mailman/listinfo/users_lists.opencontrail.org
--
Jakub Pavlik
CTO
[tcp ◕ cloud]
+420 602 177 027
[email protected]
tcp cloud a.s.
Thamova 16
186 00 Praha 8 - Karlin
Czech republic
http://tcpcloud.eu
http://opentcpcloud.org