[ 
https://issues.apache.org/jira/browse/HAWQ-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086862#comment-15086862
 ] 

zharui commented on HAWQ-323:
-----------------------------

After setting net.ipv4.tcp_timestamps=1 and net.ipv4.tcp_tw_recycle=1, the 
segmentation fault has disappeared, but the communication problem between the 
segments and the master resource manager still exists. I ran "netstat -apn | grep 
5437" and the results are as follows:

tcp        0      0 0.0.0.0:5437                0.0.0.0:*                   LISTEN      6409/postgres
tcp        0      1 192.168.3.2:12877           192.168.3.2:5437            SYN_SENT    6480/postgres
tcp        0      1 192.168.3.2:5437            192.168.3.3:21157           LAST_ACK    -
tcp        0      1 192.168.3.2:12881           192.168.3.2:5437            SYN_SENT    6480/postgres
tcp        0      1 192.168.3.2:12879           192.168.3.2:5437            SYN_SENT    6480/postgres
tcp        0      1 192.168.3.2:5437            192.168.3.3:21217           LAST_ACK    -

I see that the LAST_ACK connections are to 192.168.3.3, but there is no ACK to 
192.168.3.2. BTW, process 6480 is the segment resource manager running on the 
same node as the master.
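
In case it helps anyone compare nodes, here is a minimal Python sketch that tallies the netstat rows for port 5437 by TCP state and remote host. The function name summarize_states and the SAMPLE constant are my own illustrative names, not anything from HAWQ, and the parsing assumes the Linux "netstat -apn" column layout shown above.

```python
from collections import Counter

# Rows copied from the netstat output in this comment, one connection per line.
SAMPLE = """\
tcp 0 0 0.0.0.0:5437 0.0.0.0:* LISTEN 6409/postgres
tcp 0 1 192.168.3.2:12877 192.168.3.2:5437 SYN_SENT 6480/postgres
tcp 0 1 192.168.3.2:5437 192.168.3.3:21157 LAST_ACK -
tcp 0 1 192.168.3.2:12881 192.168.3.2:5437 SYN_SENT 6480/postgres
tcp 0 1 192.168.3.2:12879 192.168.3.2:5437 SYN_SENT 6480/postgres
tcp 0 1 192.168.3.2:5437 192.168.3.3:21217 LAST_ACK -
"""


def summarize_states(netstat_text, port=5437):
    """Count (state, remote host) pairs for TCP rows that involve `port`."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, remote, state, pid/program
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if not (local.endswith(suffix) or remote.endswith(suffix)):
            continue
        remote_host = remote.rsplit(":", 1)[0]
        counts[(state, remote_host)] += 1
    return counts


if __name__ == "__main__":
    for (state, host), n in sorted(summarize_states(SAMPLE).items()):
        print(state, host, n)
```

Running the same function on the output from each segment host should make it easier to see which peers are stuck in SYN_SENT or LAST_ACK.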



> Cannot query when cluster include more than 1 segment
> -----------------------------------------------------
>
>                 Key: HAWQ-323
>                 URL: https://issues.apache.org/jira/browse/HAWQ-323
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Core, Resource Manager
>    Affects Versions: 2.0.0-beta-incubating
>            Reporter: zharui
>            Assignee: Lei Chang
>
> The version I use is 2.0.0-beta-RC2. I can query data normally when the cluster 
> has just 1 segment. Once the cluster has more than 1 segment online, I 
> cannot finish any query and am informed that "ERROR:  failed to acquire 
> resource from resource manager, 7 of 8 segments are unavailable 
> (pquery.c:788)".
> I have read the segment logs and the resource manager source code. I 
> suspect this issue is caused by a communication failure between the segment 
> instances and the resource manager server. On the segment that connects to the 
> resource manager successfully, I can find log entries such as "AsyncComm framework 
> receives message 518 from FD5" and "Resource enforcer increases memory quota 
> to: total memory quota=65536 MB, delta memory quota = 65536 MB", but the 
> other online segments have no such log entries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
