[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()

2016-03-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-592:
---
Description: 
We first run a query to get some QEs. Then we kill one of them and run "set 
log_min_messages=DEBUG1" so that the QD enters 
executormgr_allocate_any_executor(). The QD then fails.
1. Run a query to get some QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id * 2 = t2.id and t1.id < t3.id;
 count
-------
  3725
(1 row)
{code}
{code}
$ ps -ef|grep postgres
  501 12817 1   0  4:41下午 ?? 0:00.36 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 12818 12817   0  4:41下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 12821 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 12822 12817   0  4:41下午 ?? 0:00.03 postgres: port  5432, writer 
process
  501 12823 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 12824 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 12825 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 12826 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 12827 12817   0  4:41下午 ?? 0:00.16 postgres: port  5432, master 
resource manager
  501 12844 1   0  4:41下午 ?? 0:00.57 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 12845 12844   0  4:41下午 ?? 0:00.01 postgres: port 4, logger 
process
  501 12856 12862   0  4:42下午 ?? 0:00.05 postgres: port  5432, 
wangchunling dispatch [local] con13 cmd10 idle [local]
  501 12872 12844   0  4:42下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 12873 12844   0  4:42下午 ?? 0:00.01 postgres: port 4, writer 
process
  501 12874 12844   0  4:42下午 ?? 0:00.00 postgres: port 4, 
checkpoint process
  501 12875 12844   0  4:42下午 ?? 0:00.03 postgres: port 4, segment 
resource manager
{code}
2. Kill a QE with kill -9 and wait for the segment to come back up.
{code}
$ ps -ef|grep postgres
  501 12817 1   0  4:41下午 ?? 0:00.91 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 12818 12817   0  4:41下午 ?? 0:00.05 postgres: port  5432, master 
logger process
  501 12844 1   0  4:41下午 ?? 0:01.52 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 12845 12844   0  4:41下午 ?? 0:00.04 postgres: port 4, logger 
process
  501 12872 12844   0  4:42下午 ?? 0:00.02 postgres: port 4, stats 
collector process
  501 12873 12844   0  4:42下午 ?? 0:00.19 postgres: port 4, writer 
process
  501 12874 12844   0  4:42下午 ?? 0:00.03 postgres: port 4, 
checkpoint process
  501 12875 12844   0  4:42下午 ?? 0:00.41 postgres: port 4, segment 
resource manager
  501 12932 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 12933 12817   0  4:52下午 ?? 0:00.01 postgres: port  5432, writer 
process
  501 12934 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 12935 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 12936 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 12937 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 12938 12817   0  4:52下午 ?? 0:00.04 postgres: port  5432, master 
resource manager
  501 12952 12817   0  4:53下午 ?? 0:00.00 postgres: port  5432, 
wangchunling dispatch [local] con30 idle [local]
{code}
{code}
dispatch=# select * from gp_segment_configuration;
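 registration_order | role | status | port  |          hostname           |           address           |            description
--------------------+------+--------+-------+-----------------------------+-----------------------------+------------------------------------
                  0 | m    | u      |  5432 | ChunlingdeMacBook-Pro.local | ChunlingdeMacBook-Pro.local |
                  1 | p    | d      |     4 | localhost                   | 127.0.0.1                   | resource manager process was reset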
(2 rows)

dispatch=# select * from gp_segment_configuration;
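 registration_order | role | status | port  |          hostname           |           address           | description
--------------------+------+--------+-------+-----------------------------+-----------------------------+-------------
                  0 | m    | u      |  5432 | 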

[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()

2016-03-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-592:
---
Affects Version/s: 2.0.0

> QD fails when connects to QE again in executormgr_allocate_any_executor()
> -
>
> Key: HAWQ-592
> URL: https://issues.apache.org/jira/browse/HAWQ-592
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> We first run a query to get some QEs. Then we kill one of them and run "set 
> log_min_messages=DEBUG1" so that the QD enters 
> executormgr_allocate_any_executor(). The QD then fails.


