[jira] [Closed] (HAWQ-559) QD hangs when QE is killed after connected to QD

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-559.
--

> QD hangs when QE is killed after connected to QD
> 
>
> Key: HAWQ-559
> URL: https://issues.apache.org/jira/browse/HAWQ-559
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0.0-incubating
> Environment: mac os X 10.10
>Reporter: Chunling Wang
>Assignee: Lili Ma
> Fix For: 2.0.0.0-incubating
>
>
> When the first query finishes, the QE is still alive. Then we run a second 
> query. After the QD thread is created and bound to the QE, but before it 
> sends data to the QE, we kill this QE and find that the QD hangs.
> Here is the backtrace when QD hangs:
> {code}
> * thread #1: tid = 0x1c4afd, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>   * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x00010745692c postgres`receiveChunksUDP [inlined] 
> udpSignalPoll + 42 at ic_udp.c:2882
> frame #2: 0x000107456902 postgres`receiveChunksUDP + 26 at 
> ic_udp.c:2715
> frame #3: 0x0001074568e8 postgres`receiveChunksUDP [inlined] 
> waitOnCondition(timeout_us=25) + 82 at ic_udp.c:1599
> frame #4: 0x000107456896 
> postgres`receiveChunksUDP(pTransportStates=0x7ff2a381ae48, 
> pEntry=0x7ff2a18f2230, motNodeID=, 
> srcRoute=0x7fff58c0ce96, conn=, inTeardown='\0') + 726 at 
> ic_udp.c:4039
> frame #5: 0x000107452a86 postgres`RecvTupleChunkFromAnyUDP [inlined] 
> RecvTupleChunkFromAnyUDP_Internal + 498 at ic_udp.c:4146
> frame #6: 0x000107452894 
> postgres`RecvTupleChunkFromAnyUDP(mlStates=, 
> transportStates=, motNodeID=1, srcRoute=0x7fff58c0ce96) + 
> 100 at ic_udp.c:4167
> frame #7: 0x000107442254 postgres`RecvTupleFrom [inlined] 
> processIncomingChunks(mlStates=0x7ff2a3812a30, 
> transportStates=0x7ff2a381ae48, motNodeID=1, srcRoute=) + 34 
> at cdbmotion.c:684
> frame #8: 0x000107442232 
> postgres`RecvTupleFrom(mlStates=0x7ff2a3812a30, 
> transportStates=, motNodeID=1, tup_i=0x7fff58c0cf00, 
> srcRoute=-100) + 370 at cdbmotion.c:610
> frame #9: 0x0001071c8778 postgres`ExecMotion [inlined] 
> execMotionUnsortedReceiver(node=) + 57 at nodeMotion.c:466
> frame #10: 0x0001071c873f postgres`ExecMotion(node=) + 
> 1071 at nodeMotion.c:298
> frame #11: 0x0001071a4835 
> postgres`ExecProcNode(node=0x7ff2a38164b8) + 613 at execProcnode.c:999
> frame #12: 0x0001071b9f82 postgres`ExecAgg + 104 at nodeAgg.c:1163
> frame #13: 0x0001071b9f1a postgres`ExecAgg + 316 at nodeAgg.c:1693
> frame #14: 0x0001071b9dde postgres`ExecAgg(node=0x7ff2a3815348) + 
> 126 at nodeAgg.c:1138
> frame #15: 0x0001071a4803 
> postgres`ExecProcNode(node=0x7ff2a3815348) + 563 at execProcnode.c:979
> frame #16: 0x00010719ecfd 
> postgres`ExecutePlan(estate=0x7ff2a3814e30, planstate=0x7ff2a3815348, 
> operation=CMD_SELECT, numberTuples=0, direction=, 
> dest=0x7ff2a28db178) + 1181 at execMain.c:3218
> frame #17: 0x00010719e619 
> postgres`ExecutorRun(queryDesc=0x7ff2a3811f00, 
> direction=ForwardScanDirection, count=0) + 569 at execMain.c:1213
> frame #18: 0x0001072e7fc2 postgres`PortalRun + 14 at pquery.c:1649
> frame #19: 0x0001072e7fb4 
> postgres`PortalRun(portal=0x7ff2a1893e30, count=, 
> isTopLevel='\x01', dest=, altdest=0x7ff2a28db178, 
> completionTag=0x7fff58c0d530) + 1124 at pquery.c:1471
> frame #20: 0x0001072e4a8e 
> postgres`exec_simple_query(query_string=0x7ff2a380fe30, 
> seqServerHost=0x, seqServerPort=-1) + 2078 at postgres.c:1745
> frame #21: 0x0001072e0c4c postgres`PostgresMain(argc=, 
> argv=, username=0x7ff2a201bcf0) + 9404 at postgres.c:4754
> frame #22: 0x00010729a002 postgres`ServerLoop [inlined] BackendRun + 
> 105 at postmaster.c:5889
> frame #23: 0x000107299f99 postgres`ServerLoop at postmaster.c:5484
> frame #24: 0x000107299f99 postgres`ServerLoop + 9593 at 
> postmaster.c:2163
> frame #25: 0x000107296f3b postgres`PostmasterMain(argc=, 
> argv=) + 5019 at postmaster.c:1454
> frame #26: 0x000107200ca9 postgres`main(argc=9, 
> argv=0x7ff2a141eef0) + 1433 at main.c:209
> frame #27: 0x7fff95e8c5c9 libdyld.dylib`start + 1
>   thread #2: tid = 0x1c4afe, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10
> frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x00010744d8e3 postgres`rxThreadFunc(arg=) + 
> 2163 at ic_udp.c:6251
> frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
> frame #3: 
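The QD above is blocked inside `poll` in the UDP interconnect, waiting for chunks from a QE that no longer exists. Below is a minimal sketch of the bounded-wait pattern that lets a caller periodically recheck peer liveness instead of blocking forever. This is illustrative only: `wait_readable` is an invented name, not HAWQ's `udpSignalPoll`/`waitOnCondition`, and the real interconnect logic is far more involved.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Sketch of a bounded wait: poll with a timeout so the caller can
 * recheck whether the peer (the QE) is still alive between waits,
 * rather than sleeping indefinitely on a dead connection.
 * Hypothetical helper, not HAWQ code. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0)
        return 0;   /* timed out: caller should recheck peer state */
    if (rc < 0)
        return -1;  /* error (e.g. interrupted by a signal) */
    return 1;       /* data is ready to read */
}
```

The hang reported here suggests the receive loop never escalates a dead peer into an error, so even a timed poll loops forever; the timeout only helps if the caller acts on it.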

[jira] [Closed] (HAWQ-524) Do not resolve the condition of 'executor->refResult = NULL' in executormgr_bind_executor_task()

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-524.
--

> Do not resolve the condition of 'executor->refResult = NULL' in 
> executormgr_bind_executor_task() 
> -
>
> Key: HAWQ-524
> URL: https://issues.apache.org/jira/browse/HAWQ-524
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0.0-incubating
>Reporter: Chunling Wang
>Assignee: Lili Ma
> Fix For: 2.0.0.0-incubating
>
>
> In executormgr.c, the code below should not be an Assert(). The case where 
> executor->refResult is NULL should be caught and handled at runtime.
> {code}
> bool
> executormgr_bind_executor_task(struct DispatchData *data,
>                                QueryExecutor *executor,
>                                SegmentDatabaseDescriptor *desc,
>                                struct DispatchTask *task,
>                                struct DispatchSlice *slice)
> {
>   ...
>   Assert(executor->refResult != NULL);
>   ...
> }
> {code}
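The direction the report suggests can be sketched as below. This is a minimal illustrative sketch, not HAWQ's actual code: the struct and function here are simplified stand-ins, and the real `executormgr_bind_executor_task()` takes more arguments. The point is that `Assert()` compiles away in release builds, so a NULL `refResult` must be treated as a runtime failure the caller can handle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the HAWQ struct; the real definition lives
 * in executormgr.c and differs from this sketch. */
typedef struct QueryExecutor {
    void *refResult;   /* result slot the dispatcher binds to */
} QueryExecutor;

/* Instead of Assert(executor->refResult != NULL), which disappears in
 * release builds, report a NULL refResult as a bind failure. */
bool bind_executor_task(QueryExecutor *executor)
{
    if (executor == NULL || executor->refResult == NULL) {
        fprintf(stderr, "bind failed: executor has no result slot\n");
        return false;   /* caught at runtime, not only in debug builds */
    }
    /* ... perform the actual binding ... */
    return true;
}
```

A caller can then propagate `false` up the dispatch path rather than crashing or silently continuing.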



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1145.
---

> After registering a partition table, if we want to insert some data into the 
> table, it fails.
> -
>
> Key: HAWQ-1145
> URL: https://issues.apache.org/jira/browse/HAWQ-1145
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.1.0.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.1.0.0-incubating
>
> Attachments: dbgen, dists.dss
>
>
> Reproduce Steps:
> 1. Create a partition table
> {code}
> CREATE TABLE parquet_LINEITEM_uncompressed(
>     L_ORDERKEY INT8,
>     L_PARTKEY BIGINT,
>     L_SUPPKEY BIGINT,
>     L_LINENUMBER BIGINT,
>     L_QUANTITY decimal,
>     L_EXTENDEDPRICE decimal,
>     L_DISCOUNT decimal,
>     L_TAX decimal,
>     L_RETURNFLAG CHAR(1),
>     L_LINESTATUS CHAR(1),
>     L_SHIPDATE date,
>     L_COMMITDATE date,

[jira] [Closed] (HAWQ-523) Dead code in executormgr_bind_executor_task()

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-523.
--

> Dead code in executormgr_bind_executor_task()
> -
>
> Key: HAWQ-523
> URL: https://issues.apache.org/jira/browse/HAWQ-523
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Dispatcher
>Affects Versions: 2.0.0.0-incubating
>Reporter: Chunling Wang
>Assignee: Lili Ma
> Fix For: 2.0.0.0-incubating
>
>
> In executormgr.c, the code below is never reached (dead code):
> {code}
> bool
> executormgr_bind_executor_task(struct DispatchData *data,
>                                QueryExecutor *executor,
>                                SegmentDatabaseDescriptor *desc,
>                                struct DispatchTask *task,
>                                struct DispatchSlice *slice)
> {
>   ...
>   if (desc == NULL)
>   {
>       executor->health = QEH_ERROR;
>       return false;
>   }
>   ...
> }
> {code}
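Why such a branch ends up dead can be sketched with a toy model. All names below are hypothetical, and the sketch assumes what the report implies: every call site already filters out a NULL descriptor before calling, so the in-callee NULL check can never fire, and an `Assert` documenting the invariant is the idiomatic replacement for the unreachable error handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for SegmentDatabaseDescriptor. */
typedef struct SegmentDesc { int seg_id; } SegmentDesc;

bool bind_task(SegmentDesc *desc)
{
    /* In the original, the (desc == NULL) branch here never executes,
     * because callers validate desc first.  An assert documents the
     * invariant instead of carrying dead error-handling code. */
    assert(desc != NULL);
    return desc->seg_id >= 0;
}

bool dispatch(SegmentDesc *desc)
{
    if (desc == NULL)          /* callers filter NULL out first ...  */
        return false;
    return bind_task(desc);    /* ... so bind_task never sees NULL   */
}
```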



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test

2017-09-21 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174367#comment-16174367
 ] 

Chunling Wang commented on HAWQ-619:


The PR is 
https://github.com/apache/incubator-hawq/commit/53a9f76f04d3f56684f3c0e3cb3dd17ba1ae1997

> Change 'gpextract' to 'hawqextract' for InputFormat unit test
> -
>
> Key: HAWQ-619
> URL: https://issues.apache.org/jira/browse/HAWQ-619
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Jiali Yao
> Fix For: 2.0.0.0-incubating
>
>
> Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for 
> InputFormat unit test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-619.
--

> Change 'gpextract' to 'hawqextract' for InputFormat unit test
> -
>
> Key: HAWQ-619
> URL: https://issues.apache.org/jira/browse/HAWQ-619
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Jiali Yao
> Fix For: 2.0.0.0-incubating
>
>
> Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for 
> InputFormat unit test.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-812) Activate standby master failed after create a new database

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-812.
--

> Activate standby master failed after create a new database
> --
>
> Key: HAWQ-812
> URL: https://issues.apache.org/jira/browse/HAWQ-812
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Chunling Wang
>Assignee: Ming LI
> Fix For: 2.0.0.0-incubating
>
>
> Activate standby master failed after creating a new database. However, it 
> succeeds if we do not create a new database, even if we create a new table 
> and insert data.
> 1. Create a new database 'gptest'
> {code}
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (3 rows)
> [gpadmin@test1 ~]$ createdb gptest
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  gptest| gpadmin | UTF8 |
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (4 rows)
> {code}
> 2. Stop HAWQ master
> {code}
> [gpadmin@test1 ~]$ hawq stop master -a
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
> stop'
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
> ['stop', 'master']
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
> connections to the database
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
> instance shutdown with mode='smart'
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
> 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
> successfully
> {code}
> 3. Activate standby master
> {code}
> [gpadmin@test1 ~]$ ssh test5 'source 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
>  hawq activate standby -a'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
> 'hawq activate'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
> in:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq 
> with args: ['activate', 'standby']
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
> activate standby master 'test5'
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is 
> not running, skip
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
> running segments
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
> standby
> 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master 
> host name in hawq-site.xml
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
> hawq_master_address_host already exist in hawq-site.xml
> Update it with value: test5
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
> standby from hawq-site.xml
> 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
> master only mode
> {code}
> It hangs and cannot start the master. The master log follows:
> {code}
> 2016-06-13 20:14:40.268022 
> PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database 
> system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205,
> 2016-06-13 20:14:40.268112 
> PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found 
> recovery.conf file indicating standby takeover recovery 
> needed",,,0,,"xlog.c",5485,
> 2016-06-13 20:14:40.268131 
> 

[jira] [Closed] (HAWQ-1034) add --repair option for hawq register

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1034.
---

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.1.0.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.1.0.0-incubating
>
>
> add --repair option for hawq register
> This will change both the file folder and the catalog table 
> pg_aoseg.pg_paqseg_$relid to the state that the .yml file describes. Note 
> that files newly generated since the checkpoint may be deleted here. Also 
> note that all the files listed in the .yml file should be under the table 
> folder on HDFS. Limitation: does not support hash table redistribution, 
> table truncate, or table drop. This is for the table rollback scenario: 
> take a checkpoint somewhere, then roll back to that previous checkpoint.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1418) Print executing command for hawq register

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1418.
---

> Print executing command for hawq register
> -
>
> Key: HAWQ-1418
> URL: https://issues.apache.org/jira/browse/HAWQ-1418
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> Print executing command for hawq register



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1426) hawq extract meets error after the table was reorganized.

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1426.
---

> hawq extract meets error after the table was reorganized.
> -
>
> Key: HAWQ-1426
> URL: https://issues.apache.org/jira/browse/HAWQ-1426
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.3.0.0-incubating
>
>
> After a table is reorganized, running hawq extract on it reports an error.
> Reproduce Steps:
> 1. create an AO table
> 2. insert several records into it
> 3. reorganize the table: "alter table a set with (reorganize=true);"
> 4. run hawq extract; an error is thrown.
> For the bug fix, we should also guarantee that hawq extract works if 
> the table is truncated and re-inserted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1525.
---

> Segmentation fault occurs if reindex database when loading data from Hive to 
> HAWQ using hcatalog
> 
>
> Key: HAWQ-1525
> URL: https://issues.apache.org/jira/browse/HAWQ-1525
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.3.0.0-incubating
>
>
> When we use hcatalog to load data from Hive to HAWQ, if the amount of data is 
> big enough, it will trigger automatic statistics collection, calling vacuum 
> analyze. At that time if we reindex the database, the system will panic on 
> the next auto analyze. Here is the call stack. 
> {code}
> 2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
> IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: 
> Master process received signal SIGSEGV",,,0,"
> 1    0x96f57c postgres <symbol not found> + 0x96f57c
> 2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
> 4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
> 5    0x973048 postgres FunctionCall2 + 0x8e
> 6    0xabefab postgres <symbol not found> + 0xabefab
> 7    0xabfee4 postgres InMemHeap_GetNext + 0x408
> 8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
> 9    0x4f7abc postgres systable_getnext + 0x50
> 10   0x953fb8 postgres SearchCatCache + 0x276
> 11   0x95ce10 postgres SearchSysCache + 0x93
> 12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
> 13   0x5a07fc postgres caql_getoid_plus + 0x176
> 14   0x5c4888 postgres LookupNamespaceId + 0x129
> 15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
> 16   0x687897 postgres <symbol not found> + 0x687897
> 17   0x687574 postgres CreateSchemaCommand + 0x8f
> 18   0x8952d1 postgres ProcessUtility + 0x4ff
> 19   0x5c5728 postgres <symbol not found> + 0x5c5728
> 20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
> 21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
> 22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
> 23   0x6e1401 postgres <symbol not found> + 0x6e1401
> 24   0x6deb2d postgres ExecutorStart + 0xb01
> 25   0x738594 postgres <symbol not found> + 0x738594
> 26   0x73809f postgres <symbol not found> + 0x73809f
> 27   0x7351a9 postgres SPI_execute + 0x13c
> 28   0x6490f2 postgres spiExecuteWithCallback + 0x130
> 29   0x64956b postgres <symbol not found> + 0x64956b
> 30   0x648be0 postgres <symbol not found> + 0x648be0
> 31   0x647be0 postgres analyzeStmt + 0x91d
> 32   0x647247 postgres analyzeStatement + 0xb1
> 33   0x6ca11d postgres vacuum + 0xe5
> 34   0x827910 postgres autostats_issue_analyze + 0x160
> 35   0x827e10 postgres auto_stats + 0x19b
> 36   0x8906b5 postgres <symbol not found> + 0x8906b5
> 37   0x8930f5 postgres <symbol not found> + 0x8930f5
> 38   0x892619 postgres PortalRun + 0x3e6
> 39   0x8884f6 postgres <symbol not found> + 0x8884f6
> {code}
> This is because the reindex command clears the relcache, and 
> inmemscan->rs_rd->rel in InMemHeap_GetNext() keeps using the address of this 
> heap relation from the relcache, which is not the same as the address after 
> the heap relation is reopened.
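The stale-pointer mechanism described above can be illustrated with a toy cache in C. This is a sketch under stated assumptions, not HAWQ's relcache code: all names below are invented, and the actual fix has to re-fetch the relation inside the in-memory heap scan rather than keep an address captured before the invalidation.

```c
#include <stdlib.h>

/* Toy model: a scan caches the address of a cache entry; rebuilding
 * the cache (the analogue of REINDEX clearing the relcache) frees and
 * reallocates the entry, so any previously captured address dangles. */
typedef struct RelCacheEntry { int relid; } RelCacheEntry;

static RelCacheEntry *cache_entry = NULL;

RelCacheEntry *relcache_lookup(int relid)
{
    if (cache_entry == NULL) {                 /* (re)build on demand */
        cache_entry = malloc(sizeof(RelCacheEntry));
        cache_entry->relid = relid;
    }
    return cache_entry;
}

void relcache_invalidate(void)                 /* REINDEX analogue */
{
    free(cache_entry);
    cache_entry = NULL;
}

/* Buggy pattern: hold a RelCacheEntry* across relcache_invalidate()
 * and keep dereferencing it.  Safe pattern: call relcache_lookup()
 * again after any possible invalidation, which is the shape of the
 * fix for the in-memory heap scan. */
```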



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

2017-09-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1525.
-
   Resolution: Fixed
Fix Version/s: 2.3.0.0-incubating

> Segmentation fault occurs if reindex database when loading data from Hive to 
> HAWQ using hcatalog
> 
>
> Key: HAWQ-1525
> URL: https://issues.apache.org/jira/browse/HAWQ-1525
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.3.0.0-incubating
>
>
> When we use hcatalog to load data from Hive to HAWQ, if the amount of data is 
> big enough, it will trigger automatic statistics collection, calling vacuum 
> analyze. At that time if we reindex the database, the system will panic on 
> the next auto analyze. Here is the call stack. 
> {code}
> 2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
> IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: 
> Master process received signal SIGSEGV",,,0,"
> 1    0x96f57c postgres <symbol not found> + 0x96f57c
> 2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
> 4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
> 5    0x973048 postgres FunctionCall2 + 0x8e
> 6    0xabefab postgres <symbol not found> + 0xabefab
> 7    0xabfee4 postgres InMemHeap_GetNext + 0x408
> 8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
> 9    0x4f7abc postgres systable_getnext + 0x50
> 10   0x953fb8 postgres SearchCatCache + 0x276
> 11   0x95ce10 postgres SearchSysCache + 0x93
> 12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
> 13   0x5a07fc postgres caql_getoid_plus + 0x176
> 14   0x5c4888 postgres LookupNamespaceId + 0x129
> 15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
> 16   0x687897 postgres <symbol not found> + 0x687897
> 17   0x687574 postgres CreateSchemaCommand + 0x8f
> 18   0x8952d1 postgres ProcessUtility + 0x4ff
> 19   0x5c5728 postgres <symbol not found> + 0x5c5728
> 20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
> 21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
> 22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
> 23   0x6e1401 postgres <symbol not found> + 0x6e1401
> 24   0x6deb2d postgres ExecutorStart + 0xb01
> 25   0x738594 postgres <symbol not found> + 0x738594
> 26   0x73809f postgres <symbol not found> + 0x73809f
> 27   0x7351a9 postgres SPI_execute + 0x13c
> 28   0x6490f2 postgres spiExecuteWithCallback + 0x130
> 29   0x64956b postgres <symbol not found> + 0x64956b
> 30   0x648be0 postgres <symbol not found> + 0x648be0
> 31   0x647be0 postgres analyzeStmt + 0x91d
> 32   0x647247 postgres analyzeStatement + 0xb1
> 33   0x6ca11d postgres vacuum + 0xe5
> 34   0x827910 postgres autostats_issue_analyze + 0x160
> 35   0x827e10 postgres auto_stats + 0x19b
> 36   0x8906b5 postgres <symbol not found> + 0x8906b5
> 37   0x8930f5 postgres <symbol not found> + 0x8930f5
> 38   0x892619 postgres PortalRun + 0x3e6
> 39   0x8884f6 postgres <symbol not found> + 0x8884f6
> {code}
> This is because the reindex command clears the relcache, and 
> inmemscan->rs_rd->rel in InMemHeap_GetNext() keeps using the address of this 
> heap relation from the relcache, which is not the same as the address after 
> the heap relation is reopened.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

2017-09-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1525:

Description: 
When we use hcatalog to load data from Hive to HAWQ, if the amount of data is 
big enough, it will trigger automatic statistics collection, calling vacuum 
analyze. At that time if we reindex the database, the system will panic on the 
next auto analyze. Here is the call stack. 

{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master 
process received signal SIGSEGV",,,0,"
1    0x96f57c postgres <symbol not found> + 0x96f57c
2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5    0x973048 postgres FunctionCall2 + 0x8e
6    0xabefab postgres <symbol not found> + 0xabefab
7    0xabfee4 postgres InMemHeap_GetNext + 0x408
8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9    0x4f7abc postgres systable_getnext + 0x50
10   0x953fb8 postgres SearchCatCache + 0x276
11   0x95ce10 postgres SearchSysCache + 0x93
12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13   0x5a07fc postgres caql_getoid_plus + 0x176
14   0x5c4888 postgres LookupNamespaceId + 0x129
15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
16   0x687897 postgres <symbol not found> + 0x687897
17   0x687574 postgres CreateSchemaCommand + 0x8f
18   0x8952d1 postgres ProcessUtility + 0x4ff
19   0x5c5728 postgres <symbol not found> + 0x5c5728
20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
23   0x6e1401 postgres <symbol not found> + 0x6e1401
24   0x6deb2d postgres ExecutorStart + 0xb01
25   0x738594 postgres <symbol not found> + 0x738594
26   0x73809f postgres <symbol not found> + 0x73809f
27   0x7351a9 postgres SPI_execute + 0x13c
28   0x6490f2 postgres spiExecuteWithCallback + 0x130
29   0x64956b postgres <symbol not found> + 0x64956b
30   0x648be0 postgres <symbol not found> + 0x648be0
31   0x647be0 postgres analyzeStmt + 0x91d
32   0x647247 postgres analyzeStatement + 0xb1
33   0x6ca11d postgres vacuum + 0xe5
34   0x827910 postgres autostats_issue_analyze + 0x160
35   0x827e10 postgres auto_stats + 0x19b
36   0x8906b5 postgres <symbol not found> + 0x8906b5
37   0x8930f5 postgres <symbol not found> + 0x8930f5
38   0x892619 postgres PortalRun + 0x3e6
39   0x8884f6 postgres <symbol not found> + 0x8884f6
{code}

This is because the reindex command clears the relcache, and 
inmemscan->rs_rd->rel in InMemHeap_GetNext() keeps using the address of this 
heap relation from the relcache, which is not the same as the address after 
the heap relation is reopened.

  was:
When we use hcatalog to load data from Hive to HAWQ, if the amount of data is 
big enough, it will trigger automatic statistics collection, calling vacuum 
analyze. At that time if we reindex the database, the system will panic on the 
next auto analyze. Here is the call stack. 

{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master 
process received signal SIGSEGV",,,0,"
1    0x96f57c postgres <symbol not found> + 0x96f57c
2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5    0x973048 postgres FunctionCall2 + 0x8e
6    0xabefab postgres <symbol not found> + 0xabefab
7    0xabfee4 postgres InMemHeap_GetNext + 0x408
8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9    0x4f7abc postgres systable_getnext + 0x50
10   0x953fb8 postgres SearchCatCache + 0x276
11   0x95ce10 postgres SearchSysCache + 0x93
12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13   0x5a07fc postgres caql_getoid_plus + 0x176
14   0x5c4888 postgres LookupNamespaceId + 0x129
15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
16   0x687897 postgres <symbol not found> + 0x687897
17   0x687574 postgres CreateSchemaCommand + 0x8f
18   0x8952d1 postgres ProcessUtility + 0x4ff
19   0x5c5728 postgres <symbol not found> + 0x5c5728
20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
23   0x6e1401 postgres <symbol not found> + 0x6e1401
24   0x6deb2d postgres ExecutorStart + 0xb01
25   0x738594 postgres <symbol not found> + 0x738594
26   0x73809f postgres <symbol not found> + 0x73809f
27   0x7351a9 postgres SPI_execute + 0x13c
28   0x6490f2 postgres spiExecuteWithCallback + 0x130
29   0x64956b postgres <symbol not found> + 0x64956b
30   0x648be0 postgres <symbol not found> + 0x648be0
31   0x647be0 postgres analyzeStmt + 0x91d
32   0x647247 postgres analyzeStatement + 0xb1
33   0x6ca11d postgres vacuum + 0xe5
34   0x827910 postgres autostats_issue_analyze + 0x160
35   0x827e10 postgres auto_stats + 0x19b
36   0x8906b5 postgres <symbol not found> + 0x8906b5
37   0x8930f5 postgres <symbol not found> + 0x8930f5
38   0x892619 postgres PortalRun + 0x3e6
39   0x8884f6 postgres <symbol not found> + 0x8884f6
{code}

This is because reindex command clear the syscache, and inmemscan->rs_rd->rel 
in InMemHeap_GetNext() using the address of this heap relation in syscache, 
which is not 

[jira] [Assigned] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

2017-09-11 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1525:
---

Assignee: Chunling Wang  (was: Lei Chang)

> Segmentation fault occurs if reindex database when loading data from Hive to 
> HAWQ using hcatalog
> 
>
> Key: HAWQ-1525
> URL: https://issues.apache.org/jira/browse/HAWQ-1525
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> When we use hcatalog to load data from Hive to HAWQ, if the amount of data is 
> big enough, it will trigger automatic statistics collection, calling vacuum 
> analyze. At that time if we reindex the database, the system will panic on 
> the next auto analyze. Here is the call stack. 
> {code}
> 2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
> IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: 
> Master process received signal SIGSEGV",,,0,"
> 1    0x96f57c postgres <symbol not found> + 0x96f57c
> 2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
> 4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
> 5    0x973048 postgres FunctionCall2 + 0x8e
> 6    0xabefab postgres <symbol not found> + 0xabefab
> 7    0xabfee4 postgres InMemHeap_GetNext + 0x408
> 8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
> 9    0x4f7abc postgres systable_getnext + 0x50
> 10   0x953fb8 postgres SearchCatCache + 0x276
> 11   0x95ce10 postgres SearchSysCache + 0x93
> 12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
> 13   0x5a07fc postgres caql_getoid_plus + 0x176
> 14   0x5c4888 postgres LookupNamespaceId + 0x129
> 15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
> 16   0x687897 postgres <symbol not found> + 0x687897
> 17   0x687574 postgres CreateSchemaCommand + 0x8f
> 18   0x8952d1 postgres ProcessUtility + 0x4ff
> 19   0x5c5728 postgres <symbol not found> + 0x5c5728
> 20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
> 21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
> 22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
> 23   0x6e1401 postgres <symbol not found> + 0x6e1401
> 24   0x6deb2d postgres ExecutorStart + 0xb01
> 25   0x738594 postgres <symbol not found> + 0x738594
> 26   0x73809f postgres <symbol not found> + 0x73809f
> 27   0x7351a9 postgres SPI_execute + 0x13c
> 28   0x6490f2 postgres spiExecuteWithCallback + 0x130
> 29   0x64956b postgres <symbol not found> + 0x64956b
> 30   0x648be0 postgres <symbol not found> + 0x648be0
> 31   0x647be0 postgres analyzeStmt + 0x91d
> 32   0x647247 postgres analyzeStatement + 0xb1
> 33   0x6ca11d postgres vacuum + 0xe5
> 34   0x827910 postgres autostats_issue_analyze + 0x160
> 35   0x827e10 postgres auto_stats + 0x19b
> 36   0x8906b5 postgres <symbol not found> + 0x8906b5
> 37   0x8930f5 postgres <symbol not found> + 0x8930f5
> 38   0x892619 postgres PortalRun + 0x3e6
> 39   0x8884f6 postgres <symbol not found> + 0x8884f6
> {code}
> This is because the reindex command clears the syscache, and 
> inmemscan->rs_rd->rel in InMemHeap_GetNext() keeps using the address of this 
> heap relation from the syscache, which is not the same as the address after 
> the heap relation is reopened.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

2017-09-11 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1525:
---

 Summary: Segmentation fault occurs if reindex database when 
loading data from Hive to HAWQ using hcatalog
 Key: HAWQ-1525
 URL: https://issues.apache.org/jira/browse/HAWQ-1525
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Chunling Wang
Assignee: Lei Chang


When we use hcatalog to load data from Hive to HAWQ, a sufficiently large load 
triggers automatic statistics collection, which calls vacuum analyze. If we 
reindex the database at that time, the system panics on the next auto analyze. 
Here is the call stack.

{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 
IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master 
process received signal SIGSEGV",,,0"
1    0x96f57c postgres  + 0x96f57c
2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
4    0x3a16a0f7e0 libpthread.so.0  + 0x16a0f7e0
5    0x973048 postgres FunctionCall2 + 0x8e
6    0xabefab postgres  + 0xabefab
7    0xabfee4 postgres InMemHeap_GetNext + 0x408
8    0x4f7bc6 postgres  + 0x4f7bc6
9    0x4f7abc postgres systable_getnext + 0x50
10   0x953fb8 postgres SearchCatCache + 0x276
11   0x95ce10 postgres SearchSysCache + 0x93
12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13   0x5a07fc postgres caql_getoid_plus + 0x176
14   0x5c4888 postgres LookupNamespaceId + 0x129
15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
16   0x687897 postgres  + 0x687897
17   0x687574 postgres CreateSchemaCommand + 0x8f
18   0x8952d1 postgres ProcessUtility + 0x4ff
19   0x5c5728 postgres  + 0x5c5728
20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21   0x6e43f3 postgres  + 0x6e43f3
22   0x6e49c4 postgres  + 0x6e49c4
23   0x6e1401 postgres  + 0x6e1401
24   0x6deb2d postgres ExecutorStart + 0xb01
25   0x738594 postgres  + 0x738594
26   0x73809f postgres  + 0x73809f
27   0x7351a9 postgres SPI_execute + 0x13c
28   0x6490f2 postgres spiExecuteWithCallback + 0x130
29   0x64956b postgres  + 0x64956b
30   0x648be0 postgres  + 0x648be0
31   0x647be0 postgres analyzeStmt + 0x91d
32   0x647247 postgres analyzeStatement + 0xb1
33   0x6ca11d postgres vacuum + 0xe5
34   0x827910 postgres autostats_issue_analyze + 0x160
35   0x827e10 postgres auto_stats + 0x19b
36   0x8906b5 postgres  + 0x8906b5
37   0x8930f5 postgres  + 0x8930f5
38   0x892619 postgres PortalRun + 0x3e6
39   0x8884f6 postgres  + 0x8884f6
{code}

This is because the reindex command clears the syscache, and inmemscan->rs_rd->rel 
in InMemHeap_GetNext() uses the address of this heap relation in the syscache, 
which is no longer the same once the heap relation is reopened.
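The failure pattern can be illustrated with a small self-contained C sketch. This is not HAWQ source: `CachedRel`, `lookup`, and `invalidate` are hypothetical stand-ins for a syscache entry, a catalog lookup, and the invalidation done by reindex. The point is that a pointer obtained before an invalidation dangles afterwards, so the safe pattern is to repeat the lookup rather than cache the address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a syscache entry; not HAWQ source code. */
typedef struct {
    int  generation;      /* bumped every time the cache is rebuilt */
    char relname[64];
} CachedRel;

static CachedRel *entry = NULL;
static int cache_generation = 0;

/* Look up (and lazily rebuild) the cached entry, as a scan would. */
CachedRel *lookup(const char *relname) {
    if (entry == NULL) {
        entry = calloc(1, sizeof(CachedRel));
        entry->generation = cache_generation;
        strncpy(entry->relname, relname, sizeof(entry->relname) - 1);
    }
    return entry;
}

/* Models reindex clearing the syscache: the old entry is freed, so any
 * pointer obtained before this call is dangling and must not be reused. */
void invalidate(void) {
    free(entry);
    entry = NULL;
    cache_generation++;
}
```

A scan that calls `lookup()` once and keeps the pointer across `invalidate()` is left holding a freed address, which is the pattern the panic above points at; calling `lookup()` again after the invalidation returns the rebuilt entry.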



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HAWQ-1426) hawq extract meets error after the table was reorganized.

2017-04-07 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1426:
---

Assignee: Chunling Wang  (was: Ed Espino)

> hawq extract meets error after the table was reorganized.
> -
>
> Key: HAWQ-1426
> URL: https://issues.apache.org/jira/browse/HAWQ-1426
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.3.0.0-incubating
>
>
> After a table is reorganized, running hawq extract on it fails with an error.
> Reproduce Steps:
> 1. create an AO table
> 2. insert several records into it
> 3. Get the table reorganized:  "alter table a set with (reorganize=true);"
> 4. run hawq extract; an error is thrown.
> For the bug fix, we should also guarantee that hawq extract works if the 
> table is truncated and re-inserted.
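> The steps above can be sketched in SQL (a hedged sketch: the table definition 
> and sample data are illustrative; only the ALTER TABLE command is taken from 
> the report):
> {code}
> -- 1. create an AO (append-only) table
> CREATE TABLE a (i int) WITH (appendonly=true);
> -- 2. insert several records into it
> INSERT INTO a SELECT generate_series(1, 100);
> -- 3. get the table reorganized
> ALTER TABLE a SET WITH (reorganize=true);
> {code}
> Step 4 is running hawq extract against table "a", which then reports the error.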



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (HAWQ-1418) Print executing command for hawq register

2017-03-28 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1418.
---

> Print executing command for hawq register
> -
>
> Key: HAWQ-1418
> URL: https://issues.apache.org/jira/browse/HAWQ-1418
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: backlog
>
>
> Print executing command for hawq register



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HAWQ-1418) Print executing command for hawq register

2017-03-28 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1418.
-
Resolution: Fixed

> Print executing command for hawq register
> -
>
> Key: HAWQ-1418
> URL: https://issues.apache.org/jira/browse/HAWQ-1418
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: backlog
>
>
> Print executing command for hawq register



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HAWQ-1418) Print executing command for hawq register

2017-03-28 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1418:
---

Assignee: Chunling Wang  (was: Ed Espino)

> Print executing command for hawq register
> -
>
> Key: HAWQ-1418
> URL: https://issues.apache.org/jira/browse/HAWQ-1418
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: backlog
>
>
> Print executing command for hawq register



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HAWQ-1418) Print executing command for hawq register

2017-03-28 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1418:
---

 Summary: Print executing command for hawq register
 Key: HAWQ-1418
 URL: https://issues.apache.org/jira/browse/HAWQ-1418
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Chunling Wang
Assignee: Ed Espino


Print executing command for hawq register



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1332.
---

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Alexander Denissov
> Fix For: 2.2.0.0-incubating
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.
> 1. create a new user "usertest1" in database:
> {code}
> $ psql postgres
> psql (8.2.15)
> Type "help" for help.
> postgres=# CREATE USER usertest1;
> NOTICE:  resource queue required -- using default resource queue "pg_default"
> CREATE ROLE
> postgres=#
> {code}
> 2. add user "usertest1" in pg_hba.conf
> {code}
> local all usertest1 trust
> {code}
> 3. set policy with database and schema included, with table excluded
> !screenshot-1.png|width=800,height=400!
> 4. connect database with user "usertest1" but failed with permission denied
> {code}
> $ psql postgres -U usertest1
> psql: FATAL:  permission denied for database "postgres"
> DETAIL:  User does not have CONNECT privilege.
> {code}
> 5. set policy with database, schema and table included
> !screenshot-2.png|width=800,height=400!
> 6. connect database with user "usertest1" and succeed
> {code}
> $ psql postgres -U usertest1
> psql (8.2.15)
> Type "help" for help.
> postgres=#
> {code}
> But if we do not set table to "*" and instead specify a table like "a", we 
> cannot access the database either.
> !screenshot-3.png|width=800,height=400!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1332.
-
Resolution: Not A Problem

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Alexander Denissov
> Fix For: 2.2.0.0-incubating
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.
> 1. create a new user "usertest1" in database:
> {code}
> $ psql postgres
> psql (8.2.15)
> Type "help" for help.
> postgres=# CREATE USER usertest1;
> NOTICE:  resource queue required -- using default resource queue "pg_default"
> CREATE ROLE
> postgres=#
> {code}
> 2. add user "usertest1" in pg_hba.conf
> {code}
> local all usertest1 trust
> {code}
> 3. set policy with database and schema included, with table excluded
> !screenshot-1.png|width=800,height=400!
> 4. connect database with user "usertest1" but failed with permission denied
> {code}
> $ psql postgres -U usertest1
> psql: FATAL:  permission denied for database "postgres"
> DETAIL:  User does not have CONNECT privilege.
> {code}
> 5. set policy with database, schema and table included
> !screenshot-2.png|width=800,height=400!
> 6. connect database with user "usertest1" and succeed
> {code}
> $ psql postgres -U usertest1
> psql (8.2.15)
> Type "help" for help.
> postgres=#
> {code}
> But if we do not set table to "*" and instead specify a table like "a", we 
> cannot access the database either.
> !screenshot-3.png|width=800,height=400!



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1367.
---

> hawq can access to user tables that have no permission with fallback check 
> table. 
> --
>
> Key: HAWQ-1367
> URL: https://issues.apache.org/jira/browse/HAWQ-1367
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Xiang Sheng
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> If a user has access to a catalog table but no access to user table b, he can 
> access table b using "select * from catalog_table, b;"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1367.
-
Resolution: Fixed

> hawq can access to user tables that have no permission with fallback check 
> table. 
> --
>
> Key: HAWQ-1367
> URL: https://issues.apache.org/jira/browse/HAWQ-1367
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Xiang Sheng
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> If a user has access to a catalog table but no access to user table b, he can 
> access table b using "select * from catalog_table, b;"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1377.
-
Resolution: Fixed

> Add more information for Ranger related GUCs in default hawq-site.xml
> -
>
> Key: HAWQ-1377
> URL: https://issues.apache.org/jira/browse/HAWQ-1377
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> We should add default GUCs for Ranger to the sample hawq-site.xml, just as the 
> resource manager does, so that users don't need to consult the documentation 
> for the exact GUC names.
> The output content should look like the following:
> {code}
> <configuration>
>   <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
>   </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml

2017-03-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1377.
---

> Add more information for Ranger related GUCs in default hawq-site.xml
> -
>
> Key: HAWQ-1377
> URL: https://issues.apache.org/jira/browse/HAWQ-1377
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> We should add default GUCs for Ranger to the sample hawq-site.xml, just as the 
> resource manager does, so that users don't need to consult the documentation 
> for the exact GUC names.
> The output content should look like the following:
> {code}
> <configuration>
>   <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
>   </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml

2017-03-06 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1377:
---

Assignee: Chunling Wang  (was: Ed Espino)

> Add more information for Ranger related GUCs in default hawq-site.xml
> -
>
> Key: HAWQ-1377
> URL: https://issues.apache.org/jira/browse/HAWQ-1377
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> We should add default GUCs for Ranger to the sample hawq-site.xml, just as the 
> resource manager does, so that users don't need to consult the documentation 
> for the exact GUC names.
> The output content should look like the following:
> {code}
> <configuration>
>   <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
>   </property>
>   <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
>   </property>
> </configuration>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml

2017-03-05 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1377:
---

 Summary: Add more information for Ranger related GUCs in default 
hawq-site.xml
 Key: HAWQ-1377
 URL: https://issues.apache.org/jira/browse/HAWQ-1377
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino
 Fix For: 2.2.0.0-incubating


We should add default GUCs for Ranger to the sample hawq-site.xml, just as the 
resource manager does, so that users don't need to consult the documentation 
for the exact GUC names.

The output content should look like the following:
{code}
<configuration>
  <property>
    <name>hawq_acl_type</name>
    <value>standalone</value>
  </property>

  <property>
    <name>hawq_rps_address_host</name>
    <value>localhost</value>
  </property>

  <property>
    <name>hawq_rps_address_suffix</name>
    <value>rps</value>
  </property>

  <property>
    <name>hawq_rps_address_port</name>
    <value>8432</value>
  </property>
</configuration>
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.

2017-02-28 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1367:
---

Assignee: Chunling Wang  (was: Ed Espino)

> hawq can access to user tables that have no permission with fallback check 
> table. 
> --
>
> Key: HAWQ-1367
> URL: https://issues.apache.org/jira/browse/HAWQ-1367
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Xiang Sheng
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> If a user has access to a catalog table but no access to user table b, he can 
> access table b using "select * from catalog_table, b;"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-15 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Description: 
We try to grant database connect and schema usage privileges to a non-super 
user so that it can connect to the database. We find that if we set a policy 
with database and schema included but table excluded, we cannot connect to the 
database. But if we include table, we can connect. We think there may be a bug 
in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.

1. create a new user "usertest1" in database:
{code}
$ psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# CREATE USER usertest1;
NOTICE:  resource queue required -- using default resource queue "pg_default"
CREATE ROLE
postgres=#
{code}

2. add user "usertest1" in pg_hba.conf
{code}
local all usertest1 trust
{code}

3. set policy with database and schema included, with table excluded
!screenshot-1.png|width=800,height=400!

4. connect database with user "usertest1" but failed with permission denied
{code}
$ psql postgres -U usertest1
psql: FATAL:  permission denied for database "postgres"
DETAIL:  User does not have CONNECT privilege.
{code}

5. set policy with database, schema and table included
!screenshot-2.png|width=800,height=400!

6. connect database with user "usertest1" and succeed
{code}
$ psql postgres -U usertest1
psql (8.2.15)
Type "help" for help.

postgres=#
{code}

But if we do not set table to "*" and instead specify a table like "a", we 
cannot access the database either.
!screenshot-3.png|width=800,height=400!

  was:
We try to grant database connect and schema usage privileges to a non-super 
user so that it can connect to the database. We find that if we set a policy 
with database and schema included but table excluded, we cannot connect to the 
database. But if we include table, we can connect. We think there may be a bug 
in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.

1. create a new user "usertest1" in database:
{code}
$ psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# CREATE USER usertest1;
NOTICE:  resource queue required -- using default resource queue "pg_default"
CREATE ROLE
postgres=#
{code}

2. add user "usertest1" in pg_hba.conf
{code}
local all usertest1 trust
{code}

3. set policy with database and schema included, with table excluded
!screenshot-1.png|width=800,height=400!

4. connect database with user "usertest1" but failed with permission denied
{code}
$ psql postgres -U usertest1
psql: FATAL:  permission denied for database "postgres"
DETAIL:  User does not have CONNECT privilege.
{code}

5. set policy with database, schema and table included
!screenshot-2.png|width=800,height=400!

6. connect database with user "usertest1" and succeed
{code}
$ psql postgres -U usertest1
psql (8.2.15)
Type "help" for help.

postgres=#
{code}


> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.
> 1. create a new user "usertest1" in database:
> {code}
> $ psql postgres
> psql (8.2.15)
> Type "help" for help.
> postgres=# CREATE USER usertest1;
> NOTICE:  resource queue required -- using default resource queue "pg_default"
> CREATE ROLE
> postgres=#
> {code}
> 2. add user "usertest1" in pg_hba.conf
> {code}
> local all usertest1 trust
> {code}
> 3. set policy with database and schema included, with table excluded
> !screenshot-1.png|width=800,height=400!
> 4. connect database with user "usertest1" but failed with permission denied
> {code}
> $ psql postgres -U usertest1
> psql: FATAL:  permission denied for database "postgres"
> DETAIL:  User does not have CONNECT privilege.
> {code}
> 5. set policy with database, schema and table included
> !screenshot-2.png|width=800,height=400!
> 6. connect database with user "usertest1" and succeed
> {code}
> $ psql postgres -U usertest1
> psql (8.2.15)
> Type "help" for help.
> postgres=#
> {code}
> But if we do not set table as "*", and specify table like "a", we can not 
> access database either.
> 

[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-15 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: screenshot-3.png

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.
> 1. create a new user "usertest1" in database:
> {code}
> $ psql postgres
> psql (8.2.15)
> Type "help" for help.
> postgres=# CREATE USER usertest1;
> NOTICE:  resource queue required -- using default resource queue "pg_default"
> CREATE ROLE
> postgres=#
> {code}
> 2. add user "usertest1" in pg_hba.conf
> {code}
> local all usertest1 trust
> {code}
> 3. set policy with database and schema included, with table excluded
> !screenshot-1.png|width=800,height=400!
> 4. connect database with user "usertest1" but failed with permission denied
> {code}
> $ psql postgres -U usertest1
> psql: FATAL:  permission denied for database "postgres"
> DETAIL:  User does not have CONNECT privilege.
> {code}
> 5. set policy with database, schema and table included
> !screenshot-2.png|width=800,height=400!
> 6. connect database with user "usertest1" and succeed
> {code}
> $ psql postgres -U usertest1
> psql (8.2.15)
> Type "help" for help.
> postgres=#
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Description: 
We try to grant database connect and schema usage privileges to a non-super 
user so that it can connect to the database. We find that if we set a policy 
with database and schema included but table excluded, we cannot connect to the 
database. But if we include table, we can connect. We think there may be a bug 
in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.

1. create a new user "usertest1" in database:
{code}
$ psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# CREATE USER usertest1;
NOTICE:  resource queue required -- using default resource queue "pg_default"
CREATE ROLE
postgres=#
{code}

2. add user "usertest1" in pg_hba.conf
{code}
local all usertest1 trust
{code}

3. set policy with database and schema included, with table excluded
!screenshot-1.png|width=800,height=400!

4. connect database with user "usertest1" but failed with permission denied
{code}
$ psql postgres -U usertest1
psql: FATAL:  permission denied for database "postgres"
DETAIL:  User does not have CONNECT privilege.
{code}

5. set policy with database, schema and table included
!screenshot-2.png|width=800,height=400!

6. connect database with user "usertest1" and succeed
{code}
$ psql postgres -U usertest1
psql (8.2.15)
Type "help" for help.

postgres=#
{code}

  was:We try to grant database connect and schema usage privileges to a 
non-super user so that it can connect to the database. We find that if we set a 
policy with database and schema included but table excluded, we cannot connect 
to the database. But if we include table, we can connect. We think there may be 
a bug in the Ranger Plugin Service or in Ranger.


> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger. Here are the steps to reproduce it.
> 1. create a new user "usertest1" in database:
> {code}
> $ psql postgres
> psql (8.2.15)
> Type "help" for help.
> postgres=# CREATE USER usertest1;
> NOTICE:  resource queue required -- using default resource queue "pg_default"
> CREATE ROLE
> postgres=#
> {code}
> 2. add user "usertest1" in pg_hba.conf
> {code}
> local all usertest1 trust
> {code}
> 3. set policy with database and schema included, with table excluded
> !screenshot-1.png|width=800,height=400!
> 4. connect database with user "usertest1" but failed with permission denied
> {code}
> $ psql postgres -U usertest1
> psql: FATAL:  permission denied for database "postgres"
> DETAIL:  User does not have CONNECT privilege.
> {code}
> 5. set policy with database, schema and table included
> !screenshot-2.png|width=800,height=400!
> 6. connect database with user "usertest1" and succeed
> {code}
> $ psql postgres -U usertest1
> psql (8.2.15)
> Type "help" for help.
> postgres=#
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: screenshot-1.png

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: screenshot-2.png

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: (was: screenshot-2.png)

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so that it can connect to the database. We find that if we set a policy 
> with database and schema included but table excluded, we cannot connect to the 
> database. But if we include table, we can connect. We think there may be a bug 
> in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: (was: screenshot-1.png)

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so it can connect to the database. We find that if we set a policy that 
> includes the database and schema but excludes the table, we cannot connect to 
> the database. But if we include the table, we can connect. We think there may 
> be a bug in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: screenshot-2.png

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so it can connect to the database. We find that if we set a policy that 
> includes the database and schema but excludes the table, we cannot connect to 
> the database. But if we include the table, we can connect. We think there may 
> be a bug in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Attachment: screenshot-1.png

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
> Attachments: screenshot-1.png
>
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so it can connect to the database. We find that if we set a policy that 
> includes the database and schema but excludes the table, we cannot connect to 
> the database. But if we include the table, we can connect. We think there may 
> be a bug in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1332:

Description: We try to grant database connect and schema usage privileges 
to a non-super user so it can connect to the database. We find that if we set a 
policy that includes the database and schema but excludes the table, we cannot 
connect to the database. But if we include the table, we can connect. We think 
there may be a bug in the Ranger Plugin Service or in Ranger.  (was: We try to 
grant database connect and schema usage privileges to a normal user so it can 
connect to the database. We find that if we set a policy that includes the 
database and schema but excludes the table, we cannot connect to the database. 
But if we include the table, we can connect. We think there may be a bug in the 
Ranger Plugin Service or in Ranger.)

> Can not grant database and schema privileges without table privileges in 
> ranger or ranger plugin service
> 
>
> Key: HAWQ-1332
> URL: https://issues.apache.org/jira/browse/HAWQ-1332
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Ed Espino
>
> We try to grant database connect and schema usage privileges to a non-super 
> user so it can connect to the database. We find that if we set a policy that 
> includes the database and schema but excludes the table, we cannot connect to 
> the database. But if we include the table, we can connect. We think there may 
> be a bug in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service

2017-02-14 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1332:
---

 Summary: Can not grant database and schema privileges without 
table privileges in ranger or ranger plugin service
 Key: HAWQ-1332
 URL: https://issues.apache.org/jira/browse/HAWQ-1332
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino


We try to grant database connect and schema usage privileges to a normal user 
so it can connect to the database. We find that if we set a policy that 
includes the database and schema but excludes the table, we cannot connect to 
the database. But if we include the table, we can connect. We think there may 
be a bug in the Ranger Plugin Service or in Ranger.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (HAWQ-1249) Don't do ACL checks on segments

2017-01-09 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1249.
---

> Don't do ACL checks on segments
> ---
>
> Key: HAWQ-1249
> URL: https://issues.apache.org/jira/browse/HAWQ-1249
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> HAWQ does ACL checks on segments, which we think is unnecessary for the QE 
> because there is no catalog data on segments. Even if a hacker connects to a 
> segdb with GP_ROLE_EXECUTE, he cannot run any queries, unlike on Greenplum, 
> where catalog data does exist on segments. Furthermore, for Ranger checks, if 
> all segments perform the same checks against the RPS as the master does, it 
> costs a lot and affects performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1249) Don't do ACL checks on segments

2017-01-09 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1249.
-
   Resolution: Fixed
Fix Version/s: (was: backlog)
   2.2.0.0-incubating

> Don't do ACL checks on segments
> ---
>
> Key: HAWQ-1249
> URL: https://issues.apache.org/jira/browse/HAWQ-1249
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> HAWQ does ACL checks on segments, which we think is unnecessary for the QE 
> because there is no catalog data on segments. Even if a hacker connects to a 
> segdb with GP_ROLE_EXECUTE, he cannot run any queries, unlike on Greenplum, 
> where catalog data does exist on segments. Furthermore, for Ranger checks, if 
> all segments perform the same checks against the RPS as the master does, it 
> costs a lot and affects performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1249) Don't do ACL checks on segments

2017-01-03 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1249:
---

Assignee: Chunling Wang  (was: Ed Espino)

> Don't do ACL checks on segments
> ---
>
> Key: HAWQ-1249
> URL: https://issues.apache.org/jira/browse/HAWQ-1249
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: backlog
>
>
> HAWQ does ACL checks on segments, which we think is unnecessary for the QE 
> because there is no catalog data on segments. Even if a hacker connects to a 
> segdb with GP_ROLE_EXECUTE, he cannot run any queries, unlike on Greenplum, 
> where catalog data does exist on segments. Furthermore, for Ranger checks, if 
> all segments perform the same checks against the RPS as the master does, it 
> costs a lot and affects performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1249) Don't do ACL checks on segments

2017-01-03 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1249:
---

 Summary: Don't do ACL checks on segments
 Key: HAWQ-1249
 URL: https://issues.apache.org/jira/browse/HAWQ-1249
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino


HAWQ does ACL checks on segments, which we think is unnecessary for the QE 
because there is no catalog data on segments. Even if a hacker connects to a 
segdb with GP_ROLE_EXECUTE, he cannot run any queries, unlike on Greenplum, 
where catalog data does exist on segments. Furthermore, for Ranger checks, if 
all segments perform the same checks against the RPS as the master does, it 
costs a lot and affects performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1239.
---

> Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" 
> or "requiredPerms == 0"
> 
>
> Key: HAWQ-1239
> URL: https://issues.apache.org/jira/browse/HAWQ-1239
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> In ExecCheckRTPermsWithRanger(), it should continue rather than return when 
> "rte->rtekind != RTE_RELATION" or "requiredPerms == 0".
> {code}
> if (rte->rtekind != RTE_RELATION)
>   return;
> requiredPerms = rte->requiredPerms;
> if (requiredPerms == 0)
>   return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1239.
-
Resolution: Fixed

> Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" 
> or "requiredPerms == 0"
> 
>
> Key: HAWQ-1239
> URL: https://issues.apache.org/jira/browse/HAWQ-1239
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> In ExecCheckRTPermsWithRanger(), it should continue rather than return when 
> "rte->rtekind != RTE_RELATION" or "requiredPerms == 0".
> {code}
> if (rte->rtekind != RTE_RELATION)
>   return;
> requiredPerms = rte->requiredPerms;
> if (requiredPerms == 0)
>   return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1237) Insert statement need "select" privilege in ranger check

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-1237.
-
Resolution: Fixed

> Insert statement need "select" privilege in ranger check 
> -
>
> Key: HAWQ-1237
> URL: https://issues.apache.org/jira/browse/HAWQ-1237
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded 
> and makes all statements require the "select" privilege in the Ranger check.
> {code}
>   //ListCell *cell;
>   //foreach(cell, arg_ptr->actions)
>   //{
>   char tmp[7] = "select";
>   json_object* jaction = json_object_new_string((char *)tmp);
>   //json_object* jaction = json_object_new_string((char 
> *)cell->data.ptr_value);
>   json_object_array_add(jactions, jaction);
>   //}
>   json_object_object_add(jelement, "privileges", jactions);
>   json_object_array_add(jaccess, jelement);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (HAWQ-1237) Insert statement need "select" privilege in ranger check

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1237.
---

> Insert statement need "select" privilege in ranger check 
> -
>
> Key: HAWQ-1237
> URL: https://issues.apache.org/jira/browse/HAWQ-1237
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded 
> and makes all statements require the "select" privilege in the Ranger check.
> {code}
>   //ListCell *cell;
>   //foreach(cell, arg_ptr->actions)
>   //{
>   char tmp[7] = "select";
>   json_object* jaction = json_object_new_string((char *)tmp);
>   //json_object* jaction = json_object_new_string((char 
> *)cell->data.ptr_value);
>   json_object_array_add(jactions, jaction);
>   //}
>   json_object_object_add(jelement, "privileges", jactions);
>   json_object_array_add(jaccess, jelement);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-1237) Insert statement need "select" privilege in ranger check

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1237:

Fix Version/s: 2.2.0.0-incubating

> Insert statement need "select" privilege in ranger check 
> -
>
> Key: HAWQ-1237
> URL: https://issues.apache.org/jira/browse/HAWQ-1237
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded 
> and makes all statements require the "select" privilege in the Ranger check.
> {code}
>   //ListCell *cell;
>   //foreach(cell, arg_ptr->actions)
>   //{
>   char tmp[7] = "select";
>   json_object* jaction = json_object_new_string((char *)tmp);
>   //json_object* jaction = json_object_new_string((char 
> *)cell->data.ptr_value);
>   json_object_array_add(jactions, jaction);
>   //}
>   json_object_object_add(jelement, "privileges", jactions);
>   json_object_array_add(jaccess, jelement);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"

2017-01-02 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1239:

Fix Version/s: 2.2.0.0-incubating

> Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" 
> or "requiredPerms == 0"
> 
>
> Key: HAWQ-1239
> URL: https://issues.apache.org/jira/browse/HAWQ-1239
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.2.0.0-incubating
>
>
> In ExecCheckRTPermsWithRanger(), it should continue rather than return when 
> "rte->rtekind != RTE_RELATION" or "requiredPerms == 0".
> {code}
> if (rte->rtekind != RTE_RELATION)
>   return;
> requiredPerms = rte->requiredPerms;
> if (requiredPerms == 0)
>   return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"

2016-12-27 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1239:
---

Assignee: Chunling Wang  (was: Ed Espino)

> Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" 
> or "requiredPerms == 0"
> 
>
> Key: HAWQ-1239
> URL: https://issues.apache.org/jira/browse/HAWQ-1239
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> In ExecCheckRTPermsWithRanger(), it should continue rather than return when 
> "rte->rtekind != RTE_RELATION" or "requiredPerms == 0".
> {code}
> if (rte->rtekind != RTE_RELATION)
>   return;
> requiredPerms = rte->requiredPerms;
> if (requiredPerms == 0)
>   return;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"

2016-12-27 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1239:
---

 Summary: Fail to call pg_rangercheck_batch() when 
"rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
 Key: HAWQ-1239
 URL: https://issues.apache.org/jira/browse/HAWQ-1239
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino


In ExecCheckRTPermsWithRanger(), it should continue rather than return when 
"rte->rtekind != RTE_RELATION" or "requiredPerms == 0".
{code}
if (rte->rtekind != RTE_RELATION)
  return;
requiredPerms = rte->requiredPerms;
if (requiredPerms == 0)
  return;
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1238) Can not get any data when the network is connected again after a while disconnected.

2016-12-27 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1238:
---

 Summary: Can not get any data when the network is connected again 
after a while disconnected.
 Key: HAWQ-1238
 URL: https://issues.apache.org/jira/browse/HAWQ-1238
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino


Cannot get any data when the network is reconnected after being disconnected 
for a while.
1. Run psql postgres, then "\d", and the relations in the database are listed.
{code}
psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# \d
  List of relations
 Schema |  Name  | Type  |Owner |   Storage
++---+--+-
 public | sales1 | table | wangchunling | append only
 public | sales1_1_prt_1 | table | wangchunling | append only
 public | sales1_1_prt_2 | table | wangchunling | append only
 public | t  | table | wangchunling | append only
 public | tv | view  | wangchunling | none
(5 rows)

{code}
2. Quit the session and disconnect the network. Then run psql postgres and 
"\d", and get the expected error.
{code}
$ psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# \d
WARNING:  curl_easy_perform() failed: Couldn't connect to server
LINE 1: select version()
   ^
WARNING:  curl_easy_perform() failed: Couldn't connect to server
ERROR:  permission denied for function version
WARNING:  curl_easy_perform() failed: Couldn't connect to server
ERROR:  permission denied for function version
WARNING:  curl_easy_perform() failed: Couldn't connect to server
LINE 5: FROM pg_catalog.pg_class c
 ^
ERROR:  permission denied for schema pg_catalog
LINE 5: FROM pg_catalog.pg_class c
 ^
{code}

3. Reconnect the network and run "\d", but no relations are found.
{code}
postgres=# \d
No relations found.
{code}

4. Quit the session again. Then run psql postgres and "\d", and the relations 
are listed correctly.
{code}
$ psql postgres
psql (8.2.15)
Type "help" for help.

postgres=# \d
  List of relations
 Schema |  Name  | Type  |Owner |   Storage
++---+--+-
 public | sales1 | table | wangchunling | append only
 public | sales1_1_prt_1 | table | wangchunling | append only
 public | sales1_1_prt_2 | table | wangchunling | append only
 public | t  | table | wangchunling | append only
 public | tv | view  | wangchunling | none
(5 rows)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1237) Insert statement need "select" privilege in ranger check

2016-12-26 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1237:
---

Assignee: Chunling Wang  (was: Ed Espino)

> Insert statement need "select" privilege in ranger check 
> -
>
> Key: HAWQ-1237
> URL: https://issues.apache.org/jira/browse/HAWQ-1237
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded 
> and makes all statements require the "select" privilege in the Ranger check.
> {code}
>   //ListCell *cell;
>   //foreach(cell, arg_ptr->actions)
>   //{
>   char tmp[7] = "select";
>   json_object* jaction = json_object_new_string((char *)tmp);
>   //json_object* jaction = json_object_new_string((char 
> *)cell->data.ptr_value);
>   json_object_array_add(jactions, jaction);
>   //}
>   json_object_object_add(jelement, "privileges", jactions);
>   json_object_array_add(jaccess, jelement);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1237) Insert statement need "select" privilege in ranger check

2016-12-26 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1237:
---

 Summary: Insert statement need "select" privilege in ranger check 
 Key: HAWQ-1237
 URL: https://issues.apache.org/jira/browse/HAWQ-1237
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Security
Reporter: Chunling Wang
Assignee: Ed Espino


The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded 
and makes all statements require the "select" privilege in the Ranger check.
{code}
//ListCell *cell;
//foreach(cell, arg_ptr->actions)
//{
char tmp[7] = "select";
json_object* jaction = json_object_new_string((char *)tmp);
//json_object* jaction = json_object_new_string((char 
*)cell->data.ptr_value);
json_object_array_add(jactions, jaction);
//}
json_object_object_add(jelement, "privileges", jactions);
json_object_array_add(jaccess, jelement);
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1149) Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node

2016-11-08 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1149:
---

 Summary: Built-in function gp_persistent_build_all loses data in 
gp_relfile_node and gp_persistent_relfile_node
 Key: HAWQ-1149
 URL: https://issues.apache.org/jira/browse/HAWQ-1149
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Core
Reporter: Chunling Wang
Assignee: Lei Chang


When we create a new table and insert data into it, there will be records in 
gp_relfile_node, gp_persistent_relfile_node, and gp_persistent_relation_node. 
But if we run the HAWQ built-in function gp_persistent_build_all, we find that 
the records in gp_relfile_node and gp_persistent_relfile_node for this table 
are lost. And if there is more than one file in this table, we get an error 
when we drop the table. Here are the steps to reproduce this bug:
1. Create table a, and insert data into it from two concurrent processes:
{code}
postgres=# create table a(id int);
CREATE TABLE
postgres=# insert into a select generate_series(1, 1000);
INSERT 0 1000
{code}
{code}
postgres=# insert into a select generate_series(1000, 2000);
INSERT 0 1001
{code}
2. Check the persistent table and find two files in this table's directory:
{code}
postgres=# select oid from pg_class where relname='a';
   oid
-
 3017232
(1 row)

postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
 relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
-+--++---
 3017232 |1 | (4,128)|855050
 3017232 |2 | (4,129)|855051
(2 rows)

postgres=# select * from gp_persistent_relation_node where 
relfilenode_oid=3017232;
 tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved 
| parent_xid | persistent_serial_num | previous_free_tid
+--+-+--+--++---+---
  16385 |16387 | 3017232 |2 |0 
|  0 |158943 | (0,0)
(1 row)

postgres=# select * from gp_persistent_relfile_node where 
relfilenode_oid=3017232;
 tablespace_oid | database_oid | relfilenode_oid | segment_file_num | 
relation_storage_manager | persistent_state | relation_bufpool_kind | 
parent_xid | persistent_serial_num | previous_free_tid
+--+-+--+--+--+---++---+---
  16385 |16387 | 3017232 |1 |   
 2 |2 | 0 |  0 |
855050 | (0,0)
  16385 |16387 | 3017232 |2 |   
 2 |2 | 0 |  0 |
855051 | (0,0)
(2 rows)

hadoop fs -ls /hawq_default/16385/16387/3017232
-rw---   3 wangchunling supergroup  100103584 2016-11-08 17:02 
/hawq_default/16385/16387/3017232/1
-rw---   3 wangchunling supergroup  100103600 2016-11-08 17:02 
/hawq_default/16385/16387/3017232/2
{code}

3. Rebuild the persistent tables.
{code}
postgres=# insert into a select generate_series(1000, 2000);
INSERT 0 1001
postgres=# select gp_persistent_reset_all();
 gp_persistent_reset_all
-
   1
(1 row)

postgres=# select gp_persistent_build_all(false);
 gp_persistent_build_all
-
   1
(1 row)
{code}

4. Check the persistent tables and find the data lost in gp_relfile_node and 
gp_persistent_relfile_node.
{code}
postgres=# select * from gp_relfile_node where relfilenode_oid=3017232;
 relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num
-+--++---
(0 rows)

postgres=# select * from gp_persistent_relation_node where 
relfilenode_oid=3017232;
 tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved 
| parent_xid | persistent_serial_num | previous_free_tid
+--+-+--+--++---+---
  16385 |16387 | 3017232 |2 |0 
|  0 |159020 | (0,0)
(1 row)

postgres=# select * from gp_persistent_relfile_node where 
relfilenode_oid=3017232;
 tablespace_oid | database_oid | relfilenode_oid | segment_file_num | 
relation_storage_manager | persistent_state | relation_bufpool_kind | 
parent_xid | persistent_serial_num | previous_free_tid

[jira] [Created] (HAWQ-1132) HAWQ should throw error when we insert data in a hash table and the virtual segment number is 1

2016-11-01 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1132:
---

 Summary: HAWQ should throw error when we insert data in a hash 
table and the virtual segment number is 1
 Key: HAWQ-1132
 URL: https://issues.apache.org/jira/browse/HAWQ-1132
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Core, Planner, Query Execution
Reporter: Chunling Wang
Assignee: Lei Chang


If we set the virtual segment number to 1 and create a hash table (the default 
bucket number is 6), we get only a warning message for a non-partitioned table 
when we insert a tuple, and no message at all for a partitioned table. When we 
select from the table, HAWQ throws an error.

No partition table:
{code}
postgres=# set enforce_virtual_segment_number = 1;
SET
postgres=# create table t(id int) DISTRIBUTED BY (id);
CREATE TABLE
postgres=# insert into t values(1);
WARNING:  skipping "t" --- error returned: file count 1 in catalog is not in 
proportion to the bucket number 6 of hash table with oid=2966724, some data may 
be lost, if you still want to continue the query by considering the table as 
random, set GUC allow_file_count_bucket_num_mismatch to on and try again.
INFO:  ANALYZE completed. Success: 0, Failure: 1 (t)
INSERT 0 1
postgres=# select * from t;
ERROR:  file count 1 in catalog is not in proportion to the bucket number 6 of 
hash table with oid=2966724, some data may be lost, if you still want to 
continue the query by considering the table as random, set GUC 
allow_file_count_bucket_num_mismatch to on and try again. 
(cdbdatalocality.c:3801)
postgres=#
{code}

Partition table:
{code}
postgres=# set enforce_virtual_segment_number = 1;
SET
postgres=# CREATE TABLE t (id int, rank int, year int, gender char(1), count 
int ) DISTRIBUTED BY (id) PARTITION BY LIST (gender) ( PARTITION girls 
VALUES ('F'), PARTITION boys VALUES ('M'), DEFAULT PARTITION other );
NOTICE:  CREATE TABLE will create partition "t_1_prt_girls" for table "t"
NOTICE:  CREATE TABLE will create partition "t_1_prt_boys" for table "t"
NOTICE:  CREATE TABLE will create partition "t_1_prt_other" for table "t"
CREATE TABLE
postgres=# insert into t values(51, 1, 1, 'F', 1);
INSERT 0 1
postgres=# select * from t;
ERROR:  file count 1 in catalog is not in proportion to the bucket number 6 of 
hash table with oid=2966703, some data may be lost, if you still want to 
continue the query by considering the table as random, set GUC 
allow_file_count_bucket_num_mismatch to on and try again. 
(cdbdatalocality.c:3801)
postgres=#
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1128) Support HAWQ register tables with same file name in different schema

2016-10-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1128:
---

Assignee: Chunling Wang  (was: Lei Chang)

> Support HAWQ register tables with same file name in different schema
> 
>
> Key: HAWQ-1128
> URL: https://issues.apache.org/jira/browse/HAWQ-1128
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: backlog
>
>
> Currently, HAWQ Register cannot distinguish tables with the same file name in 
> different schemas; they are regarded as the same table. We should save and 
> use schema information in HAWQ Register.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1128) Support HAWQ register tables with same file name in different schema

2016-10-31 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1128:
---

 Summary: Support HAWQ register tables with same file name in 
different schema
 Key: HAWQ-1128
 URL: https://issues.apache.org/jira/browse/HAWQ-1128
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Chunling Wang
Assignee: Lei Chang


Currently, HAWQ Register cannot distinguish tables with the same file name in 
different schemas; they are regarded as the same table. We should save and use 
schema information in HAWQ Register.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1034) add --repair option for hawq register

2016-10-31 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621371#comment-15621371
 ] 

Chunling Wang commented on HAWQ-1034:
-

The reason we removed this code is that deleting data from the table directory 
in repair mode risks data loss. So we decided not to remove data from the 
table directory in repair mode; force mode can be used instead.

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state that the .yml file configures. Note that files newly generated 
> since the checkpoint may be deleted here. Also note that all the files listed 
> in the .yml file should be under the table folder on HDFS. Limitation: does not 
> support hash table redistribution, table truncate, or table drop. This is for 
> the table rollback scenario: take checkpoints somewhere, then roll back to a 
> previous checkpoint. 
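The rollback behavior described above can be pictured as reconciling the HDFS file set against the .yml checkpoint: files not in the checkpoint go away, and files that grew are cut back. An illustrative sketch under those assumptions (function and field names are invented, not the real implementation):

```python
# Illustrative sketch of the --repair idea: compute what must change to roll
# the table's file set back to what the .yml checkpoint describes.

def plan_repair(yml_files, hdfs_files):
    """yml_files/hdfs_files map path -> size. Returns (to_delete, to_truncate)."""
    # files created after the checkpoint are deleted
    to_delete = [p for p in hdfs_files if p not in yml_files]
    # files that grew after the checkpoint are truncated to the recorded size
    to_truncate = [p for p, size in yml_files.items()
                   if p in hdfs_files and hdfs_files[p] > size]
    return to_delete, to_truncate

yml = {"/tbl/1": 254, "/tbl/2": 250}
hdfs = {"/tbl/1": 254, "/tbl/2": 300, "/tbl/3": 100}
print(plan_repair(yml, hdfs))  # (['/tbl/3'], ['/tbl/2'])
```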





[jira] [Commented] (HAWQ-1034) add --repair option for hawq register

2016-10-30 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621348#comment-15621348
 ] 

Chunling Wang commented on HAWQ-1034:
-

The code and test cases for repair mode have been removed by GitHub Pull 
Request #986.

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state that the .yml file configures. Note that files newly generated 
> since the checkpoint may be deleted here. Also note that all the files listed 
> in the .yml file should be under the table folder on HDFS. Limitation: does not 
> support hash table redistribution, table truncate, or table drop. This is for 
> the table rollback scenario: take checkpoints somewhere, then roll back to a 
> previous checkpoint. 





[jira] [Assigned] (HAWQ-1034) add --repair option for hawq register

2016-10-30 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1034:
---

Assignee: Chunling Wang  (was: hongwu)

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state that the .yml file configures. Note that files newly generated 
> since the checkpoint may be deleted here. Also note that all the files listed 
> in the .yml file should be under the table folder on HDFS. Limitation: does not 
> support hash table redistribution, table truncate, or table drop. This is for 
> the table rollback scenario: take checkpoints somewhere, then roll back to a 
> previous checkpoint. 





[jira] [Created] (HAWQ-1113) In force mode, hawq register error when files in yaml is disordered

2016-10-19 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1113:
---

 Summary: In force mode, hawq register error when files in yaml is 
disordered
 Key: HAWQ-1113
 URL: https://issues.apache.org/jira/browse/HAWQ-1113
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: Chunling Wang
Assignee: Lei Chang


In force mode, hawq register errors out when the files in the YAML file are out 
of order. For example, the file order in the YAML is as follows:
{code}
  Files:
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/2
size: 250
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/4
size: 250
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/5
size: 258
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/6
size: 270
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/3
size: 258
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW2@/1
size: 228
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/2
size: 215
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/3
size: 215
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/4
size: 220
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/1
size: 254
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/6
size: 215
  - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/5
size: 210
{code}
After hawq register succeeds, we select data from the table and get this error:
{code}
ERROR:  hdfs file length does not equal to metadata logic length! 
(cdbdatalocality.c:1102)
{code}
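One plausible workaround while the bug stands is to sort the Files entries by their trailing segment number before registering, so the segment files line up with the catalog rows again. A hypothetical sketch (the entries mirror the issue's example; this is not the actual tool code):

```python
# Hypothetical pre-processing step: order yaml Files entries by the segment
# number that ends each path, so registration sees them in segment order.

files = [
    {"path": "/hawq_default/16385/DB/OLD/2", "size": 250},
    {"path": "/hawq_default/16385/DB/OLD/4", "size": 250},
    {"path": "/hawq_default/16385/DB/OLD/1", "size": 254},
]

def seg_no(entry):
    # the segment number is the last path component
    return int(entry["path"].rsplit("/", 1)[1])

files.sort(key=seg_no)
print([f["path"].rsplit("/", 1)[1] for f in files])  # ['1', '2', '4']
```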





[jira] [Updated] (HAWQ-910) "hawq register": before registration, need check the consistency between the file and HAWQ table

2016-09-22 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-910:
---
Description: 
As a user,
I want to be notified during registration when the file being uploaded is not 
consistent with the table I want to register to,
so that I can make the necessary modifications as early as possible. 
There are two situations we need to check:
1. When hawq register is given a single file or folder, it should check 
consistency between the table and the uploaded files.
2. When hawq register is given a .yml file, it should check consistency among 
the table (if it exists), the .yml file, and the file(s) to be moved. 

  was:
As a user,
I can be notified that the uploading file is not consistent to the table I want 
to register to during registration
so that I can do corresponding modifications as early as possible.


> "hawq register": before registration, need check the consistency between the 
> file and HAWQ table
> 
>
> Key: HAWQ-910
> URL: https://issues.apache.org/jira/browse/HAWQ-910
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Storage
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: backlog
>
>
> As a user,
> I want to be notified during registration when the file being uploaded is not 
> consistent with the table I want to register to,
> so that I can make the necessary modifications as early as possible. 
> There are two situations we need to check:
> 1. When hawq register is given a single file or folder, it should check 
> consistency between the table and the uploaded files.
> 2. When hawq register is given a .yml file, it should check consistency among 
> the table (if it exists), the .yml file, and the file(s) to be moved. 
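The .yml case of the check above boils down to comparing the sizes recorded in the .yml file against the actual HDFS file sizes before touching the catalog. A minimal sketch of such a validator (illustrative only; names are invented):

```python
# Sketch of a pre-registration consistency check: every file the .yml lists
# must exist on HDFS with exactly the recorded size.

def check_consistency(yml_files, hdfs_sizes):
    """yml_files/hdfs_sizes map path -> size. Returns a list of problems;
    an empty list means the .yml and HDFS agree."""
    problems = []
    for path, expected in yml_files.items():
        actual = hdfs_sizes.get(path)
        if actual is None:
            problems.append("missing on HDFS: %s" % path)
        elif actual != expected:
            problems.append("size mismatch for %s: yml=%d hdfs=%d"
                            % (path, expected, actual))
    return problems

print(check_consistency({"/t/1": 254}, {"/t/1": 254}))  # []
print(check_consistency({"/t/1": 254}, {"/t/1": 300}))
```

Failing early like this would surface the "hdfs file length does not equal to metadata logic length" class of errors at registration time instead of at query time.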





[jira] [Updated] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-975:
---
Affects Version/s: 2.0.0.0-incubating

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73
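The 'multiple' column in the table above is simply the with/without ratio; a quick arithmetic check of the first row and the total:

```python
# Verify the 'multiple' column: running time with 'explain analyze'
# divided by running time without, rounded to two decimals.

def multiple(without_t, with_t):
    return round(with_t / float(without_t), 2)

print(multiple(311843, 818658))     # TPCH_Query_01 -> 2.63
print(multiple(4125684, 11254460))  # Total -> 2.73
```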





[jira] [Closed] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-975.
--
Resolution: Not A Bug

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.0.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Reopened] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reopened HAWQ-975:


> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Resolved] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-975.

Resolution: Not A Bug

It is a system configuration issue rather than a bug in HAWQ.

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Comment Edited] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451831#comment-15451831
 ] 

Chunling Wang edited comment on HAWQ-975 at 8/31/16 10:30 AM:
--

The performance of explain analyze on AWS is low because the vDSO on the AWS 
agents is not properly configured and does not work well. Specifically, 
gettimeofday() takes too much time.


was (Author: wcl14):
It is because that the VDSO on agents of AWS does not work well. So the 
execution time of function 'gettimeofday()' is too much.
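The per-call timestamp cost blamed above is easy to gauge empirically. A hypothetical micro-benchmark (illustrative only; on a host without a working vDSO each `gettimeofday()` traps into the kernel, so this number rises sharply):

```python
# Hypothetical micro-benchmark: average cost of one timestamp call.
# time.time() is gettimeofday()-backed on many platforms, so on a box
# with a misconfigured vDSO this per-call cost is much larger.

import time

def timestamp_overhead(n=100000):
    start = time.perf_counter()
    for _ in range(n):
        time.time()
    return (time.perf_counter() - start) / n  # seconds per call

per_call = timestamp_overhead()
print("%.0f ns per timestamp call" % (per_call * 1e9))
```

Since EXPLAIN ANALYZE wraps every plan node with timing calls, a slow clock source multiplies across the whole plan, which matches the 2-3x slowdowns in the table.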

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Assigned] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-975:
--

Assignee: Chunling Wang  (was: Lei Chang)

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Commented] (HAWQ-975) Queries run much slower with 'explain analyze' than which without 'explain analyze'

2016-08-31 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451831#comment-15451831
 ] 

Chunling Wang commented on HAWQ-975:


It is because the vDSO on the AWS agents does not work well, so the execution 
time of gettimeofday() is too long.

> Queries run much slower with 'explain analyze' than which without  'explain 
> analyze'
> 
>
> Key: HAWQ-975
> URL: https://issues.apache.org/jira/browse/HAWQ-975
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Chunling Wang
>Assignee: Lei Chang
>Priority: Critical
>  Labels: performance
> Fix For: 2.0.1.0-incubating
>
>
> When we run queries with 'explain analyze' in an AWS cluster, the total running 
> time is about 2-3 times longer than without 'explain analyze'.
> Here is a group of TPC-H results for queries with and without 'explain analyze'.
> ||query   ||without 'explain analyze' ||with 'explain analyze' ||multiple
> |TPCH_Query_01|   311843  |   818658  |   2.63
> |TPCH_Query_02|   34675   |   117884  |   3.40
> |TPCH_Query_03|   166155  |   422131  |   2.54
> |TPCH_Query_04|   157807  |   507143  |   3.21
> |TPCH_Query_05|   272657  |   710573  |   2.61
> |TPCH_Query_06|   12508   |   22276   |   1.78
> |TPCH_Query_07|   71893   |   370338  |   5.15
> |TPCH_Query_08|   12  |   672625  |   5.17
> |TPCH_Query_09|   575709  |   1171672 |   2.04
> |TPCH_Query_10|   93770   |   233391  |   2.49
> |TPCH_Query_11|   16252   |   58360   |   3.59
> |TPCH_Query_12|   142576  |   237270  |   1.66
> |TPCH_Query_13|   72682   |   343257  |   4.72
> |TPCH_Query_14|   10410   |   32337   |   3.11
> |TPCH_Query_15|   25719   |   98705   |   3.84
> |TPCH_Query_16|   21382   |   76877   |   3.60
> |TPCH_Query_17|   839683  |   2041169 |   2.43
> |TPCH_Query_18|   460570  |   1065940 |   2.31
> |TPCH_Query_19|   69075   |   82286   |   1.19
> |TPCH_Query_20|   78263   |   292041  |   3.73
> |TPCH_Query_21|   505606  |   1549690 |   3.07
> |TPCH_Query_22|   56450   |   329837  |   5.84
> |Total|   4125684 |   11254460|   2.73





[jira] [Updated] (HAWQ-1037) modify way to get HDFS port in TestHawqRegister

2016-08-31 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-1037:

Summary: modify way to get HDFS port in TestHawqRegister  (was: modify to 
get HDFS port in TestHawqRegister)

> modify way to get HDFS port in TestHawqRegister
> ---
>
> Key: HAWQ-1037
> URL: https://issues.apache.org/jira/browse/HAWQ-1037
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS 
> port from HdfsConfig.
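Getting the port from configuration rather than hard-coding it can be sketched as parsing `fs.defaultFS` out of a core-site.xml-style file. The sketch below follows Hadoop configuration conventions; the `HdfsConfig` class mentioned in the issue is part of the test framework and is not shown:

```python
# Illustrative sketch: read the HDFS port from a Hadoop-style XML config
# instead of hard-coding it in the tests.

import xml.etree.ElementTree as ET

CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>"""

def hdfs_port(xml_text):
    """Return the port embedded in fs.defaultFS, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return int(prop.findtext("value").rsplit(":", 1)[1])
    return None

print(hdfs_port(CORE_SITE))  # 8020
```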





[jira] [Assigned] (HAWQ-1037) modify to get HDFS port in TestHawqRegister

2016-08-30 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1037:
---

Assignee: Chunling Wang  (was: Jiali Yao)

> modify to get HDFS port in TestHawqRegister
> ---
>
> Key: HAWQ-1037
> URL: https://issues.apache.org/jira/browse/HAWQ-1037
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS 
> port from HdfsConfig.





[jira] [Created] (HAWQ-1037) modify to get HDFS port in TestHawqRegister

2016-08-30 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1037:
---

 Summary: modify to get HDFS port in TestHawqRegister
 Key: HAWQ-1037
 URL: https://issues.apache.org/jira/browse/HAWQ-1037
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Tests
Reporter: Chunling Wang
Assignee: Jiali Yao


In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS port 
from HdfsConfig.





[jira] [Closed] (HAWQ-969) Add getting configuration from HDFS and YARN

2016-08-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-969.
--

> Add getting configuration from HDFS and YARN
> 
>
> Key: HAWQ-969
> URL: https://issues.apache.org/jira/browse/HAWQ-969
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Add getting configuration from HDFS and YARN and writing xml file in 
> xml_parser.cpp.





[jira] [Closed] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse

2016-08-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-1020.
---

> Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and 
> TestCommonLib.TestYanConfig run in concourse
> ---
>
> Key: HAWQ-1020
> URL: https://issues.apache.org/jira/browse/HAWQ-1020
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.0.1.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and 
> TestCommonLib.TestYanConfig run in concourse





[jira] [Resolved] (HAWQ-969) Add getting configuration from HDFS and YARN

2016-08-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang resolved HAWQ-969.

Resolution: Fixed

> Add getting configuration from HDFS and YARN
> 
>
> Key: HAWQ-969
> URL: https://issues.apache.org/jira/browse/HAWQ-969
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Add getting configuration from HDFS and YARN and writing xml file in 
> xml_parser.cpp.





[jira] [Assigned] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse

2016-08-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-1020:
---

Assignee: Chunling Wang  (was: Jiali Yao)

> Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and 
> TestCommonLib.TestYanConfig run in concourse
> ---
>
> Key: HAWQ-1020
> URL: https://issues.apache.org/jira/browse/HAWQ-1020
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and 
> TestCommonLib.TestYanConfig run in concourse





[jira] [Created] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse

2016-08-25 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-1020:
---

 Summary: Fix bugs to let feature tests 
TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse
 Key: HAWQ-1020
 URL: https://issues.apache.org/jira/browse/HAWQ-1020
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Tests
Reporter: Chunling Wang
Assignee: Jiali Yao


Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and 
TestCommonLib.TestYanConfig run in concourse





[jira] [Assigned] (HAWQ-969) Add getting configuration from HDFS and YARN

2016-08-01 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang reassigned HAWQ-969:
--

Assignee: Chunling Wang  (was: Jiali Yao)

> Add getting configuration from HDFS and YARN
> 
>
> Key: HAWQ-969
> URL: https://issues.apache.org/jira/browse/HAWQ-969
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Add getting configuration from HDFS and YARN and writing xml file in 
> xml_parser.cpp.





[jira] [Created] (HAWQ-969) Add getting configuration from HDFS and YARN

2016-08-01 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-969:
--

 Summary: Add getting configuration from HDFS and YARN
 Key: HAWQ-969
 URL: https://issues.apache.org/jira/browse/HAWQ-969
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Tests
Reporter: Chunling Wang
Assignee: Jiali Yao


Add getting configuration from HDFS and YARN and writing xml file in 
xml_parser.cpp.





[jira] [Updated] (HAWQ-812) Activate standby master failed after create a new database

2016-06-15 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-812:
---
Component/s: (was: Backup & restore)

> Activate standby master failed after create a new database
> --
>
> Key: HAWQ-812
> URL: https://issues.apache.org/jira/browse/HAWQ-812
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> Activating the standby master fails after creating a new database. However, it 
> succeeds if we do not create a new database, even if we create a new table and 
> insert data. 
> 1. Create a new database 'gptest'
> {code}
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (3 rows)
> [gpadmin@test1 ~]$ createdb gptest
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  gptest| gpadmin | UTF8 |
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (4 rows)
> {code}
> 2. Stop HAWQ master
> {code}
> [gpadmin@test1 ~]$ hawq stop master -a
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
> stop'
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
> ['stop', 'master']
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
> connections to the database
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
> instance shutdown with mode='smart'
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
> 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
> successfully
> {code}
> 3. Activate standby master
> {code}
> [gpadmin@test1 ~]$ ssh test5 'source 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
>  hawq activate standby -a'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
> 'hawq activate'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
> in:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq 
> with args: ['activate', 'standby']
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
> activate standby master 'test5'
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is 
> not running, skip
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
> running segments
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
> standby
> 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master 
> host name in hawq-site.xml
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
> hawq_master_address_host already exist in hawq-site.xml
> Update it with value: test5
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
> standby from hawq-site.xml
> 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
> master only mode
> {code}
> It hangs and cannot start the master. The master log follows:
> {code}
> 2016-06-13 20:14:40.268022 
> PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database 
> system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205,
> 2016-06-13 20:14:40.268112 
> PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found 
> recovery.conf file indicating standby takeover recovery 
> needed",,,0,,"xlog.c",5485,
> 2016-06-13 20:14:40.268131 
> 

[jira] [Updated] (HAWQ-812) Activate standby master failed after create a new database

2016-06-13 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-812:
---
Description: 
Activating the standby master fails after creating a new database. However, 
it succeeds if we do not create a new database, even if we create a new 
table and insert data. 
1. Create a new database 'gptest'
{code}
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(3 rows)

[gpadmin@test1 ~]$ createdb gptest
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 gptest| gpadmin | UTF8 |
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(4 rows)
{code}
2. Stop HAWQ master
{code}
[gpadmin@test1 ~]$ hawq stop master -a
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
stop'
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
['stop', 'master']
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
connections to the database
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
instance shutdown with mode='smart'
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
successfully
{code}
3. Activate standby master
{code}
[gpadmin@test1 ~]$ ssh test5 'source 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
 hawq activate standby -a'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
'hawq activate'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
in:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with 
args: ['activate', 'standby']
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
activate standby master 'test5'
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not 
running, skip
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
running segments
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
standby
20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host 
name in hawq-site.xml
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
hawq_master_address_host already exist in hawq-site.xml
Update it with value: test5
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
standby from hawq-site.xml
20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
master only mode

{code}
It hangs and cannot start the master. The master log follows:
{code}
2016-06-13 20:14:40.268022 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database system 
was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205,
2016-06-13 20:14:40.268112 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found 
recovery.conf file indicating standby takeover recovery 
needed",,,0,,"xlog.c",5485,
2016-06-13 20:14:40.268131 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","checkpoint 
record is at 0/1C75EF0",,,0,,"xlog.c",6304,
2016-06-13 20:14:40.268143 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","redo record is 
at 0/1C75EF0; undo record is at 0/0; shutdown TRUE",,,0,,"xlog.c",6338,
2016-06-13 20:14:40.268155 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next 
transaction ID: 0/1003; next OID: 16508",,,0,,"xlog.c",6342,
2016-06-13 20:14:40.268165 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next 
MultiXactId: 1; next MultiXactOffset: 0",,,0,,"xlog.c",6345,

[jira] [Closed] (HAWQ-813) Activate standby master failed after create a new database

2016-06-13 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang closed HAWQ-813.
--
Resolution: Invalid

> Activate standby master failed after create a new database
> --
>
> Key: HAWQ-813
> URL: https://issues.apache.org/jira/browse/HAWQ-813
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Backup & restore
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> Activating the standby master fails after creating a new database. However, 
> it succeeds if we do not create a new database, even if we create a new 
> table and insert data. 
> 1. Create a new database 'gptest'
> {code}
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (3 rows)
> [gpadmin@test1 ~]$ createdb gptest
> [gpadmin@test1 ~]$ psql -l
>  List of databases
>Name|  Owner  | Encoding | Access privileges
> ---+-+--+---
>  gptest| gpadmin | UTF8 |
>  postgres  | gpadmin | UTF8 |
>  template0 | gpadmin | UTF8 |
>  template1 | gpadmin | UTF8 |
> (4 rows)
> {code}
> 2. Stop HAWQ master
> {code}
> [gpadmin@test1 ~]$ hawq stop master -a
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
> stop'
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:13:44:068559 
> hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
> ['stop', 'master']
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
> connections to the database
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
> instance shutdown with mode='smart'
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
> 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
> successfully
> {code}
> 3. Activate standby master
> {code}
> [gpadmin@test1 ~]$ ssh test5 'source 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
>  hawq activate standby -a'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
> 'hawq activate'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
> in:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:14:14:126841 
> hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq 
> with args: ['activate', 'standby']
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
> activate standby master 'test5'
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is 
> not running, skip
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
> running segments
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
> standby
> 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master 
> host name in hawq-site.xml
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
> hawq_master_address_host already exist in hawq-site.xml
> Update it with value: test5
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
> standby from hawq-site.xml
> 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
> master only mode
> {code}
> It hangs and cannot start the master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
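
Editor's note: the reproduction quoted above (create a database, stop the master, activate the standby) can be condensed into a short command-builder sketch. The GPHOME path and the standby host name below are assumptions standing in for the environment-specific values shown in the logs.

```python
# Sketch of the HAWQ-812/813 reproduction sequence. GPHOME and the
# standby host are hypothetical, environment-specific assumptions.
GPHOME = "/usr/local/hawq"   # assumed install prefix
STANDBY_HOST = "test5"       # standby master host, as in the logs

def repro_commands(dbname="gptest"):
    """Return the shell commands that reproduce the failure, in order."""
    return [
        f"createdb {dbname}",                    # step 1: create a new database
        "hawq stop master -a",                   # step 2: stop the HAWQ master
        # step 3: activate the standby on its own host
        f"ssh {STANDBY_HOST} 'source {GPHOME}/greenplum_path.sh; "
        "hawq activate standby -a'",
    ]

for cmd in repro_commands():
    print(cmd)
```

Per the report, step 3 hangs at "Start master in master only mode" only when step 1 was performed.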


[jira] [Updated] (HAWQ-812) Activate standby master failed after create a new database

2016-06-13 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-812:
---
Description: 
Activating the standby master fails after creating a new database. However, 
it succeeds if we do not create a new database, even if we create a new 
table and insert data. 
1. Create a new database 'gptest'
{code}
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(3 rows)

[gpadmin@test1 ~]$ createdb gptest
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 gptest| gpadmin | UTF8 |
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(4 rows)
{code}
2. Stop HAWQ master
{code}
[gpadmin@test1 ~]$ hawq stop master -a
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
stop'
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
['stop', 'master']
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
connections to the database
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
instance shutdown with mode='smart'
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
successfully
{code}
3. Activate standby master
{code}
[gpadmin@test1 ~]$ ssh test5 'source 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
 hawq activate standby -a'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
'hawq activate'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
in:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with 
args: ['activate', 'standby']
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
activate standby master 'test5'
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not 
running, skip
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
running segments
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
standby
20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host 
name in hawq-site.xml
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
hawq_master_address_host already exist in hawq-site.xml
Update it with value: test5
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
standby from hawq-site.xml
20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
master only mode

{code}
It hangs and cannot start the master. The master log follows:
{code}

2016-06-13 20:14:40.268022 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database system 
was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205,
2016-06-13 20:14:40.268112 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found 
recovery.conf file indicating standby takeover recovery 
needed",,,0,,"xlog.c",5485,
2016-06-13 20:14:40.268131 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","checkpoint 
record is at 0/1C75EF0",,,0,,"xlog.c",6304,
2016-06-13 20:14:40.268143 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","redo record is 
at 0/1C75EF0; undo record is at 0/0; shutdown TRUE",,,0,,"xlog.c",6338,
2016-06-13 20:14:40.268155 
PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next 
transaction ID: 0/1003; next OID: 16508",,,0,,"xlog.c",6342,
2016-06-13 20:14:40.268165 

[jira] [Created] (HAWQ-813) Activate standby master failed after create a new database

2016-06-13 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-813:
--

 Summary: Activate standby master failed after create a new database
 Key: HAWQ-813
 URL: https://issues.apache.org/jira/browse/HAWQ-813
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Backup & restore
Reporter: Chunling Wang
Assignee: Lei Chang


Activating the standby master fails after creating a new database. However, 
it succeeds if we do not create a new database, even if we create a new 
table and insert data. 
1. Create a new database 'gptest'
{code}
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(3 rows)

[gpadmin@test1 ~]$ createdb gptest
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 gptest| gpadmin | UTF8 |
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(4 rows)
{code}
2. Stop HAWQ master
{code}
[gpadmin@test1 ~]$ hawq stop master -a
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
stop'
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
['stop', 'master']
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
connections to the database
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
instance shutdown with mode='smart'
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
successfully
{code}
3. Activate standby master
{code}
[gpadmin@test1 ~]$ ssh test5 'source 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
 hawq activate standby -a'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
'hawq activate'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
in:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with 
args: ['activate', 'standby']
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
activate standby master 'test5'
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not 
running, skip
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
running segments
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
standby
20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host 
name in hawq-site.xml
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
hawq_master_address_host already exist in hawq-site.xml
Update it with value: test5
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
standby from hawq-site.xml
20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
master only mode

{code}
It hangs and cannot start the master.







[jira] [Created] (HAWQ-812) Activate standby master failed after create a new database

2016-06-13 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-812:
--

 Summary: Activate standby master failed after create a new database
 Key: HAWQ-812
 URL: https://issues.apache.org/jira/browse/HAWQ-812
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Backup & restore
Reporter: Chunling Wang
Assignee: Lei Chang


Activating the standby master fails after creating a new database. However, 
it succeeds if we do not create a new database, even if we create a new 
table and insert data. 
1. Create a new database 'gptest'
{code}
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(3 rows)

[gpadmin@test1 ~]$ createdb gptest
[gpadmin@test1 ~]$ psql -l
 List of databases
   Name|  Owner  | Encoding | Access privileges
---+-+--+---
 gptest| gpadmin | UTF8 |
 postgres  | gpadmin | UTF8 |
 template0 | gpadmin | UTF8 |
 template1 | gpadmin | UTF8 |
(4 rows)
{code}
2. Stop HAWQ master
{code}
[gpadmin@test1 ~]$ hawq stop master -a
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq 
stop'
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:13:44:068559 
hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: 
['stop', 'master']
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 
connections to the database
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master 
instance shutdown with mode='smart'
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped 
successfully
{code}
3. Activate standby master
{code}
[gpadmin@test1 ~]$ ssh test5 'source 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh;
 hawq activate standby -a'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 
'hawq activate'
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log 
in:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
20160613:20:14:14:126841 
hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with 
args: ['activate', 'standby']
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to 
activate standby master 'test5'
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not 
running, skip
20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the 
running segments
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running 
standby
20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host 
name in hawq-site.xml
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC 
hawq_master_address_host already exist in hawq-site.xml
Update it with value: test5
20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current 
standby from hawq-site.xml
20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in 
master only mode

{code}
It hangs and cannot start the master.







[jira] [Created] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test

2016-04-01 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-619:
--

 Summary: Change 'gpextract' to 'hawqextract' for InputFormat unit 
test
 Key: HAWQ-619
 URL: https://issues.apache.org/jira/browse/HAWQ-619
 Project: Apache HAWQ
  Issue Type: Task
  Components: Tests
Reporter: Chunling Wang
Assignee: Jiali Yao


Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for 
InputFormat unit test.





[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()

2016-03-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-592:
---
Description: 
We first run a query to get some QEs. Then we kill one and run "set 
log_min_messages=DEBUG1" to let QD get executormgr_allocate_any_executor(). We 
find QD failed.
1. Run a query to get some QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}
{code}
$ ps -ef|grep postgres
  501 12817 1   0  4:41下午 ?? 0:00.36 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 12818 12817   0  4:41下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 12821 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 12822 12817   0  4:41下午 ?? 0:00.03 postgres: port  5432, writer 
process
  501 12823 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 12824 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 12825 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 12826 12817   0  4:41下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 12827 12817   0  4:41下午 ?? 0:00.16 postgres: port  5432, master 
resource manager
  501 12844 1   0  4:41下午 ?? 0:00.57 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 12845 12844   0  4:41下午 ?? 0:00.01 postgres: port 4, logger 
process
  501 12856 12862   0  4:42下午 ?? 0:00.05 postgres: port  5432, 
wangchunling dispatch [local] con13 cmd10 idle [local]
  501 12872 12844   0  4:42下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 12873 12844   0  4:42下午 ?? 0:00.01 postgres: port 4, writer 
process
  501 12874 12844   0  4:42下午 ?? 0:00.00 postgres: port 4, 
checkpoint process
  501 12875 12844   0  4:42下午 ?? 0:00.03 postgres: port 4, segment 
resource manager
{code}
2. Kill -9 one QE and wait for the segment to come back up.
{code}
$ ps -ef|grep postgres
  501 12817 1   0  4:41下午 ?? 0:00.91 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 12818 12817   0  4:41下午 ?? 0:00.05 postgres: port  5432, master 
logger process
  501 12844 1   0  4:41下午 ?? 0:01.52 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 12845 12844   0  4:41下午 ?? 0:00.04 postgres: port 4, logger 
process
  501 12872 12844   0  4:42下午 ?? 0:00.02 postgres: port 4, stats 
collector process
  501 12873 12844   0  4:42下午 ?? 0:00.19 postgres: port 4, writer 
process
  501 12874 12844   0  4:42下午 ?? 0:00.03 postgres: port 4, 
checkpoint process
  501 12875 12844   0  4:42下午 ?? 0:00.41 postgres: port 4, segment 
resource manager
  501 12932 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 12933 12817   0  4:52下午 ?? 0:00.01 postgres: port  5432, writer 
process
  501 12934 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 12935 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 12936 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 12937 12817   0  4:52下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 12938 12817   0  4:52下午 ?? 0:00.04 postgres: port  5432, master 
resource manager
  501 12952 12817   0  4:53下午 ?? 0:00.00 postgres: port  5432, 
wangchunling dispatch [local] con30 idle [local]
{code}
{code}
dispatch=# select * from gp_segment_configuration;
 registration_order | role | status | port  |  hostname   | 
  address   |description
+--++---+-+-+
  0 | m| u  |  5432 | ChunlingdeMacBook-Pro.local | 
ChunlingdeMacBook-Pro.local |
  1 | p| d  | 4 | localhost   | 
127.0.0.1   | resource manager process was reset
(2 rows)

dispatch=# select * from gp_segment_configuration;
 registration_order | role | status | port  |  hostname   | 
  address   | description
+--++---+-+-+-
  0 | m| u  |  5432 | 

[jira] [Created] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()

2016-03-25 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-592:
--

 Summary: QD fails when connects to QE again in 
executormgr_allocate_any_executor()
 Key: HAWQ-592
 URL: https://issues.apache.org/jira/browse/HAWQ-592
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Dispatcher
Reporter: Chunling Wang
Assignee: Lei Chang


We first run a query to obtain some QEs. Then we kill one of them and run 
"set log_min_messages=DEBUG1" so that the QD reaches 
executormgr_allocate_any_executor(). We find that the QD fails.





[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()

2016-03-25 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-592:
---
Affects Version/s: 2.0.0

> QD fails when connects to QE again in executormgr_allocate_any_executor()
> -
>
> Key: HAWQ-592
> URL: https://issues.apache.org/jira/browse/HAWQ-592
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> We first run a query to obtain some QEs. Then we kill one of them and run 
> "set log_min_messages=DEBUG1" so that the QD reaches 
> executormgr_allocate_any_executor(). We find that the QD fails.





[jira] [Commented] (HAWQ-564) QD hangs when connecting to resource manager

2016-03-22 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206095#comment-15206095
 ] 

Chunling Wang commented on HAWQ-564:


'kill -6' causes the same result.
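
For context on the signals involved: kill -9 delivers SIGKILL and kill -6 delivers SIGABRT; both terminate the QE without any cleanup, which is presumably why they lead to the same hang. A quick check of the signal numbers:

```python
import signal

# kill -9 delivers SIGKILL (uncatchable); kill -6 delivers SIGABRT (the
# signal raised by abort()). Either one kills the QE without cleanup.
print(int(signal.SIGKILL))  # 9
print(int(signal.SIGABRT))  # 6
```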

> QD hangs when connecting to resource manager
> 
>
> Key: HAWQ-564
> URL: https://issues.apache.org/jira/browse/HAWQ-564
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> When we first inject a panic into a QE process, we run a query and the 
> segment goes down. After the segment comes back up, we run another query 
> and get the correct answer. Then we inject the same panic a second time. 
> After the segment goes down and comes up again, we run a query and find 
> that the QD process hangs when connecting to the resource manager. Here is 
> the backtrace when the QD hangs:
> {code}
> * thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>   * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at 
> rmcomm_AsyncComm.c:156
> frame #2: 0x000101db85f5 
> postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, 
> sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, 
> exprecvmsgid=2307, recvsmb=, errorbuf=0x00010230c1a0, 
> errorbufsize=) + 645 at rmcomm_SyncComm.c:122
> frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] 
> callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, 
> sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, 
> errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780
> frame #4: 0x000101db2d3c 
> postgres`acquireResourceFromRM(index=, sessionid=12, 
> slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, 
> preferred_nodes_size=, max_seg_count_fix=, 
> min_seg_count_fix=, errorbuf=, 
> errorbufsize=) + 572 at rmcomm_QD2RM.c:742
> frame #5: 0x000101c979e7 postgres`AllocateResource(life=QRL_ONCE, 
> slice_size=5, iobytes=134217728, max_target_segment_num=1, 
> min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 
> at pquery.c:796
> frame #6: 0x000101e8c60f 
> postgres`calculate_planner_segment_num(query=, 
> resourceLife=QRL_ONCE, fullRangeTable=, 
> intoPolicy=, sliceNum=5) + 14287 at cdbdatalocality.c:4207
> frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496
> frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, 
> cursorOptions=, boundParams=0x, 
> resourceLife=QRL_ONCE) + 311 at planner.c:310
> frame #9: 0x000101c8eb33 
> postgres`pg_plan_query(querytree=0x7f9c1a02a140, 
> boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837
> frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at 
> postgres.c:911
> frame #11: 0x000101c95699 
> postgres`exec_simple_query(query_string=0x7f9c1a028a30, 
> seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671
> frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, 
> argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754
> frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + 
> 105 at postmaster.c:5889
> frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484
> frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at 
> postmaster.c:2163
> frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, 
> argv=) + 5019 at postmaster.c:1454
> frame #17: 0x000101bb1aa9 postgres`main(argc=9, 
> argv=0x7f9c19c1eef0) + 1433 at main.c:209
> frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1
>   thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10
> frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + 
> 2163 at ic_udp.c:6251
> frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
> frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
> frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
>   thread #3: tid = 0x21d9c2, 0x7fff890343f6 
> libsystem_kernel.dylib`__select + 10
> frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
> frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + 
> 78 at pgsleep.c:43
> frame #2: 0x000101db1a66 
> postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at 
> rmcomm_QD2RM.c:1519
> frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
> frame #4: 0x7fff95e82279 

[jira] [Created] (HAWQ-572) Improve code coverage for dispatcher: fail_qe_after_connection & fail_qe_when_do_query & fail_qe_when_begin_parquet_scan

2016-03-22 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-572:
--

 Summary: Improve code coverage for dispatcher: 
fail_qe_after_connection & fail_qe_when_do_query & 
fail_qe_when_begin_parquet_scan
 Key: HAWQ-572
 URL: https://issues.apache.org/jira/browse/HAWQ-572
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Dispatcher
Reporter: Chunling Wang
Assignee: Lei Chang


Add these fault injections:
1. fail_qe_after_connection
2. fail_qe_when_do_query
3. fail_qe_when_begin_parquet_scan
And add the corresponding test cases.





[jira] [Updated] (HAWQ-568) After query finished, kill a QE but can still recv() data from this QE socket

2016-03-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-568:
---
Summary: After query finished, kill a QE but can still recv() data from 
this QE socket  (was: After query finished, kill a QE but can still recv() from 
this QE socket)

> After query finished, kill a QE but can still recv() data from this QE socket
> -
>
> Key: HAWQ-568
> URL: https://issues.apache.org/jira/browse/HAWQ-568
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> After a query finishes, we kill one QE while the other QEs remain in the QE 
> pool. When checking whether the connection to this QE is still alive, we 
> call recv() on its socket, but we can still receive data.
> 1. Run a query so that some QEs remain.
> {code}
> dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
> test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
>  count
> ---
>   3725
> (1 row)
> {code}
> {code}
> $ ps -ef|grep postgres
>   501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
> --silent-mode=true
>   501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
> logger process
>   501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
> collector process
>   501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
> process
>   501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
> seqserver process
>   501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 55711 55701   0  5:38下午 ?? 0:00.26 postgres: port  5432, master 
> resource manager
>   501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
> --silent-mode=true
>   501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
> process
>   501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
> collector process
>   501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
> process
>   501 55733 55727   0  5:38下午 ?? 0:00.01 postgres: port 4, 
> checkpoint process
>   501 55734 55727   0  5:38下午 ?? 0:00.09 postgres: port 4, 
> segment resource manager
>   501 55741 55748   0  5:38下午 ?? 0:00.05 postgres: port  5432, 
> wangchunling dispatch [local] con12 cmd6 idle [local]
>   501 55743 55727   0  5:38下午 ?? 0:00.36 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle
>   501 55770 55727   0  5:43下午 ?? 0:00.12 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle
>   501 55771 55727   0  5:44下午 ?? 0:00.11 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50855) con12 seg0 idle
>   501 55774 26980   0  5:44下午 ttys0080:00.00 grep postgres
> {code}
> 2. Kill one QE.
> {code}
> $ kill 55771
> $ ps -ef|grep postgres
>   501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
> --silent-mode=true
>   501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
> logger process
>   501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
> collector process
>   501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
> process
>   501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
> seqserver process
>   501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 55711 55701   0  5:38下午 ?? 0:00.27 postgres: port  5432, master 
> resource manager
>   501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
> --silent-mode=true
>   501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
> process
>   501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
> collector process
>   501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
> process
>   501 

[jira] [Created] (HAWQ-568) After query finished, kill a QE but can still recv() from this QE socket

2016-03-21 Thread Chunling Wang (JIRA)
Chunling Wang created HAWQ-568:
--

 Summary: After query finished, kill a QE but can still recv() from 
this QE socket
 Key: HAWQ-568
 URL: https://issues.apache.org/jira/browse/HAWQ-568
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Dispatcher
Reporter: Chunling Wang
Assignee: Lei Chang


After a query finishes, we kill one QE while the other QEs remain in the QE 
pool. When checking whether the connection to this QE is still alive, we call 
recv() on its socket, but we can still receive data.
1. Run a query so that some QEs remain.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}
{code}
$ ps -ef|grep postgres
  501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
process
  501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
checkpoint process
  501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL Send 
Server process
  501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 55711 55701   0  5:38下午 ?? 0:00.26 postgres: port  5432, master 
resource manager
  501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
process
  501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
process
  501 55733 55727   0  5:38下午 ?? 0:00.01 postgres: port 4, 
checkpoint process
  501 55734 55727   0  5:38下午 ?? 0:00.09 postgres: port 4, segment 
resource manager
  501 55741 55748   0  5:38下午 ?? 0:00.05 postgres: port  5432, 
wangchunling dispatch [local] con12 cmd6 idle [local]
  501 55743 55727   0  5:38下午 ?? 0:00.36 postgres: port 4, 
wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle
  501 55770 55727   0  5:43下午 ?? 0:00.12 postgres: port 4, 
wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle
  501 55771 55727   0  5:44下午 ?? 0:00.11 postgres: port 4, 
wangchunling dispatch 127.0.0.1(50855) con12 seg0 idle
  501 55774 26980   0  5:44下午 ttys0080:00.00 grep postgres
{code}
2. Kill one QE.
{code}
$ kill 55771
$ ps -ef|grep postgres
  501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
process
  501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
checkpoint process
  501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL Send 
Server process
  501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 55711 55701   0  5:38下午 ?? 0:00.27 postgres: port  5432, master 
resource manager
  501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
process
  501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
process
  501 55733 55727   0  5:38下午 ?? 0:00.01 postgres: port 4, 
checkpoint process
  501 55734 55727   0  5:38下午 ?? 0:00.09 postgres: port 4, segment 
resource manager
  501 55741 55748   0  5:38下午 ?? 0:00.05 postgres: port  5432, 
wangchunling dispatch [local] con12 cmd6 idle [local]
  501 55743 55727   0  5:38下午 ?? 0:00.36 postgres: port 4, 
wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle
  501 55770 55727   0  5:43下午 ?? 0:00.12 postgres: port 4, 
wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle
  501 55776 

[jira] [Updated] (HAWQ-568) After query finished, kill a QE but can still recv() from this QE socket

2016-03-21 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-568:
---
Affects Version/s: 2.0.0

> After query finished, kill a QE but can still recv() from this QE socket
> 
>
> Key: HAWQ-568
> URL: https://issues.apache.org/jira/browse/HAWQ-568
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Dispatcher
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> After a query finishes, we kill one QE while the other QEs remain in the QE 
> pool. When checking whether the connection to this QE is still alive, we 
> call recv() on its socket, but we can still receive data.
> 1. Run a query so that some QEs remain.
> {code}
> dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
> test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
>  count
> ---
>   3725
> (1 row)
> {code}
> {code}
> $ ps -ef|grep postgres
>   501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
> --silent-mode=true
>   501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
> logger process
>   501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
> collector process
>   501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
> process
>   501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
> seqserver process
>   501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 55711 55701   0  5:38下午 ?? 0:00.26 postgres: port  5432, master 
> resource manager
>   501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
> --silent-mode=true
>   501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
> process
>   501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
> collector process
>   501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
> process
>   501 55733 55727   0  5:38下午 ?? 0:00.01 postgres: port 4, 
> checkpoint process
>   501 55734 55727   0  5:38下午 ?? 0:00.09 postgres: port 4, 
> segment resource manager
>   501 55741 55748   0  5:38下午 ?? 0:00.05 postgres: port  5432, 
> wangchunling dispatch [local] con12 cmd6 idle [local]
>   501 55743 55727   0  5:38下午 ?? 0:00.36 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle
>   501 55770 55727   0  5:43下午 ?? 0:00.12 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle
>   501 55771 55727   0  5:44下午 ?? 0:00.11 postgres: port 4, 
> wangchunling dispatch 127.0.0.1(50855) con12 seg0 idle
>   501 55774 26980   0  5:44下午 ttys0080:00.00 grep postgres
> {code}
> 2. Kill one QE.
> {code}
> $ kill 55771
> $ ps -ef|grep postgres
>   501 55701 1   0  5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
> --silent-mode=true
>   501 55702 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, master 
> logger process
>   501 55705 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, stats 
> collector process
>   501 55706 55701   0  5:38下午 ?? 0:00.04 postgres: port  5432, writer 
> process
>   501 55707 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 55708 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, 
> seqserver process
>   501 55709 55701   0  5:38下午 ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 55710 55701   0  5:38下午 ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 55711 55701   0  5:38下午 ?? 0:00.27 postgres: port  5432, master 
> resource manager
>   501 55727 1   0  5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres 
> -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
> --silent-mode=true
>   501 55728 55727   0  5:38下午 ?? 0:00.06 postgres: port 4, logger 
> process
>   501 55731 55727   0  5:38下午 ?? 0:00.00 postgres: port 4, stats 
> collector process
>   501 55732 55727   0  5:38下午 ?? 0:00.04 postgres: port 4, writer 
> process
>   501 55733 55727   0  5:38下午 ?? 0:00.01 postgres: port 4, 
> checkpoint process
>   501 55734 55727   0  5:38下午 ?? 0:00.09 postgres: port 

[jira] [Commented] (HAWQ-564) QD hangs when connecting to resource manager

2016-03-21 Thread Chunling Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203864#comment-15203864
 ] 

Chunling Wang commented on HAWQ-564:


There is another way to trigger this bug without fault injection.
1. First run a query to spawn some QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}

{code}
$ ps -ef|grep postgres
  501 30190 1   0  2:34下午 ?? 0:00.31 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 30191 30190   0  2:34下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 30194 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 30195 30190   0  2:34下午 ?? 0:00.01 postgres: port  5432, writer 
process
  501 30196 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 30197 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 30198 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 30199 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 30200 30190   0  2:34下午 ?? 0:00.07 postgres: port  5432, master 
resource manager
  501 30216 1   0  2:34下午 ?? 0:00.37 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 30217 30216   0  2:34下午 ?? 0:00.02 postgres: port 4, logger 
process
  501 30220 30216   0  2:34下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 30221 30216   0  2:34下午 ?? 0:00.01 postgres: port 4, writer 
process
  501 30222 30216   0  2:34下午 ?? 0:00.00 postgres: port 4, 
checkpoint process
  501 30223 30216   0  2:34下午 ?? 0:00.03 postgres: port 4, segment 
resource manager
  501 30231 30190   0  2:35下午 ?? 0:00.03 postgres: port  5432, 
wangchunling dispatch [local] con12 cmd6 idle [local]
  501 30235 30216   0  2:35下午 ?? 0:00.13 postgres: port 4, 
wangchunling dispatch 127.0.0.1(65051) con12 seg0 idle
  501 30239 30216   0  2:35下午 ?? 0:00.06 postgres: port 4, 
wangchunling dispatch 127.0.0.1(65061) con12 seg0 idle
  501 30240 30216   0  2:35下午 ?? 0:00.06 postgres: port 4, 
wangchunling dispatch 127.0.0.1(65063) con12 seg0 idle
  501 30242 99560   0  2:36下午 ttys0000:00.00 grep postgres
{code}

2. Kill one QE; afterwards no QEs remain.
{code}
$ kill -9 30235
$ ps -ef|grep postgres
  501 30190 1   0  2:34下午 ?? 0:00.32 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 30191 30190   0  2:34下午 ?? 0:00.01 postgres: port  5432, master 
logger process
  501 30194 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, stats 
collector process
  501 30195 30190   0  2:34下午 ?? 0:00.01 postgres: port  5432, writer 
process
  501 30196 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, 
checkpoint process
  501 30197 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, 
seqserver process
  501 30198 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, WAL Send 
Server process
  501 30199 30190   0  2:34下午 ?? 0:00.00 postgres: port  5432, DFS 
Metadata Cache process
  501 30200 30190   0  2:34下午 ?? 0:00.08 postgres: port  5432, master 
resource manager
  501 30216 1   0  2:34下午 ?? 0:00.58 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 
--silent-mode=true
  501 30217 30216   0  2:34下午 ?? 0:00.03 postgres: port 4, logger 
process
  501 30231 30190   0  2:35下午 ?? 0:00.04 postgres: port  5432, 
wangchunling dispatch [local] con12 cmd6 idle [local]
  501 30248 30216   0  2:36下午 ?? 0:00.00 postgres: port 4, stats 
collector process
  501 30249 30216   0  2:36下午 ?? 0:00.00 postgres: port 4, writer 
process
  501 30250 30216   0  2:36下午 ?? 0:00.00 postgres: port 4, 
checkpoint process
  501 30251 30216   0  2:36下午 ?? 0:00.00 postgres: port 4, segment 
resource manager
  501 30255 99560   0  2:36下午 ttys0000:00.00 grep postgres
{code}
3. Run the query again and get some new QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}

{code}
$ ps -ef|grep postgres
  501 30190 1   0  2:34下午 ?? 0:00.33 /usr/local/hawq/bin/postgres 
-D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 
--silent-mode=true
  501 30191 30190   0  2:34下午 ?? 0:00.01 postgres: port  

[jira] [Updated] (HAWQ-564) QD hangs when connecting to resource manager

2016-03-20 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-564:
---
Description: 
When first inject panic in QE process, we run a query and segment is down. 
After the segment is up, we run another query and get correct answer. Then we 
inject the same panic second time. After the segment is down and then up again, 
we run a query and find QD process hangs when connecting to resource manager. 
Here is the backtrace when QD hangs:
{code}
* thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + 
10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at 
rmcomm_AsyncComm.c:156
frame #2: 0x000101db85f5 
postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, 
sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, exprecvmsgid=2307, 
recvsmb=, errorbuf=0x00010230c1a0, errorbufsize=) 
+ 645 at rmcomm_SyncComm.c:122
frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] 
callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, 
sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, 
errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780
frame #4: 0x000101db2d3c 
postgres`acquireResourceFromRM(index=, sessionid=12, 
slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, 
preferred_nodes_size=, max_seg_count_fix=, 
min_seg_count_fix=, errorbuf=, 
errorbufsize=) + 572 at rmcomm_QD2RM.c:742
frame #5: 0x000101c979e7 postgres`AllocateResource(life=QRL_ONCE, 
slice_size=5, iobytes=134217728, max_target_segment_num=1, 
min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 
at pquery.c:796
frame #6: 0x000101e8c60f 
postgres`calculate_planner_segment_num(query=, 
resourceLife=QRL_ONCE, fullRangeTable=, intoPolicy=, 
sliceNum=5) + 14287 at cdbdatalocality.c:4207
frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496
frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, 
cursorOptions=, boundParams=0x, 
resourceLife=QRL_ONCE) + 311 at planner.c:310
frame #9: 0x000101c8eb33 
postgres`pg_plan_query(querytree=0x7f9c1a02a140, 
boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837
frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at 
postgres.c:911
frame #11: 0x000101c95699 
postgres`exec_simple_query(query_string=0x7f9c1a028a30, 
seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671
frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, 
argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754
frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + 
105 at postmaster.c:5889
frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484
frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at 
postmaster.c:2163
frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, 
argv=) + 5019 at postmaster.c:1454
frame #17: 0x000101bb1aa9 postgres`main(argc=9, 
argv=0x7f9c19c1eef0) + 1433 at main.c:209
frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + 
2163 at ic_udp.c:6251
frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13

  thread #3: tid = 0x21d9c2, 0x7fff890343f6 libsystem_kernel.dylib`__select 
+ 10
frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + 
78 at pgsleep.c:43
frame #2: 0x000101db1a66 
postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at 
rmcomm_QD2RM.c:1519
frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
frame #4: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
frame #5: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
{code}

Here are the operations:
1. Before injection, the query answer is returned correctly.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}
2. Inject the panic; the fault is triggered and the segment goes down.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
ERROR:  fault triggered, fault 

[jira] [Updated] (HAWQ-564) QD hangs when connecting to resource manager

2016-03-20 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-564:
---
Affects Version/s: 2.0.0

> QD hangs when connecting to resource manager
> 
>
> Key: HAWQ-564
> URL: https://issues.apache.org/jira/browse/HAWQ-564
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Affects Versions: 2.0.0
>Reporter: Chunling Wang
>Assignee: Lei Chang
>
> After we first inject a panic into a QE process, we run a query and the 
> segment goes down. After the segment comes back up, we run another query and 
> get the correct answer. Then we inject the same panic a second time. After 
> the segment goes down and comes back up again, we run a query and find that 
> the QD process hangs while connecting to the resource manager. Here is the 
> backtrace when the QD hangs:
> {code}
> * thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>   * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at 
> rmcomm_AsyncComm.c:156
> frame #2: 0x000101db85f5 
> postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, 
> sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, 
> exprecvmsgid=2307, recvsmb=, errorbuf=0x00010230c1a0, 
> errorbufsize=) + 645 at rmcomm_SyncComm.c:122
> frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] 
> callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, 
> sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, 
> errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780
> frame #4: 0x000101db2d3c 
> postgres`acquireResourceFromRM(index=, sessionid=12, 
> slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, 
> preferred_nodes_size=, max_seg_count_fix=, 
> min_seg_count_fix=, errorbuf=, 
> errorbufsize=) + 572 at rmcomm_QD2RM.c:742
> frame #5: 0x000101c979e7 postgres`AllocateResource(life=QRL_ONCE, 
> slice_size=5, iobytes=134217728, max_target_segment_num=1, 
> min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 
> at pquery.c:796
> frame #6: 0x000101e8c60f 
> postgres`calculate_planner_segment_num(query=, 
> resourceLife=QRL_ONCE, fullRangeTable=, 
> intoPolicy=, sliceNum=5) + 14287 at cdbdatalocality.c:4207
> frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496
> frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, 
> cursorOptions=, boundParams=0x, 
> resourceLife=QRL_ONCE) + 311 at planner.c:310
> frame #9: 0x000101c8eb33 
> postgres`pg_plan_query(querytree=0x7f9c1a02a140, 
> boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837
> frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at 
> postgres.c:911
> frame #11: 0x000101c95699 
> postgres`exec_simple_query(query_string=0x7f9c1a028a30, 
> seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671
> frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, 
> argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754
> frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + 
> 105 at postmaster.c:5889
> frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484
> frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at 
> postmaster.c:2163
> frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, 
> argv=) + 5019 at postmaster.c:1454
> frame #17: 0x000101bb1aa9 postgres`main(argc=9, 
> argv=0x7f9c19c1eef0) + 1433 at main.c:209
> frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1
>   thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + 
> 10
> frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
> frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + 
> 2163 at ic_udp.c:6251
> frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
> frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
> frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
>   thread #3: tid = 0x21d9c2, 0x7fff890343f6 
> libsystem_kernel.dylib`__select + 10
> frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
> frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + 
> 78 at pgsleep.c:43
> frame #2: 0x000101db1a66 
> postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at 
> rmcomm_QD2RM.c:1519
> frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
> frame #4: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
> frame #5: 

[jira] [Updated] (HAWQ-564) QD hangs when connecting to resource manager

2016-03-20 Thread Chunling Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chunling Wang updated HAWQ-564:
---
Description: 
When we first inject a panic into the QE process, we run a query and the segment goes down. 
After the segment comes back up, we run another query and get the correct answer. Then we 
inject the same panic a second time. After the segment goes down and comes up again, 
we run a query and find that the QD process hangs while connecting to the resource manager. 
Here is the backtrace when the QD hangs:
{code}
* thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + 
10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at 
rmcomm_AsyncComm.c:156
frame #2: 0x000101db85f5 
postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, 
sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, exprecvmsgid=2307, 
recvsmb=, errorbuf=0x00010230c1a0, errorbufsize=) 
+ 645 at rmcomm_SyncComm.c:122
frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] 
callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, 
sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, 
errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780
frame #4: 0x000101db2d3c 
postgres`acquireResourceFromRM(index=, sessionid=12, 
slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, 
preferred_nodes_size=, max_seg_count_fix=, 
min_seg_count_fix=, errorbuf=, 
errorbufsize=) + 572 at rmcomm_QD2RM.c:742
frame #5: 0x000101c979e7 postgres`AllocateResource(life=QRL_ONCE, 
slice_size=5, iobytes=134217728, max_target_segment_num=1, 
min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 
at pquery.c:796
frame #6: 0x000101e8c60f 
postgres`calculate_planner_segment_num(query=, 
resourceLife=QRL_ONCE, fullRangeTable=, intoPolicy=, 
sliceNum=5) + 14287 at cdbdatalocality.c:4207
frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496
frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, 
cursorOptions=, boundParams=0x, 
resourceLife=QRL_ONCE) + 311 at planner.c:310
frame #9: 0x000101c8eb33 
postgres`pg_plan_query(querytree=0x7f9c1a02a140, 
boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837
frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at 
postgres.c:911
frame #11: 0x000101c95699 
postgres`exec_simple_query(query_string=0x7f9c1a028a30, 
seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671
frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, 
argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754
frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + 
105 at postmaster.c:5889
frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484
frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at 
postmaster.c:2163
frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, 
argv=) + 5019 at postmaster.c:1454
frame #17: 0x000101bb1aa9 postgres`main(argc=9, 
argv=0x7f9c19c1eef0) + 1433 at main.c:209
frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + 
2163 at ic_udp.c:6251
frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13

  thread #3: tid = 0x21d9c2, 0x7fff890343f6 libsystem_kernel.dylib`__select 
+ 10
frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + 
78 at pgsleep.c:43
frame #2: 0x000101db1a66 
postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at 
rmcomm_QD2RM.c:1519
frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
frame #4: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
frame #5: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
{code}

And here are the operations:
1. Before injection, get query answer correctly.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
---
  3725
(1 row)
{code}
2. Inject the panic; the fault is triggered and the segment goes down.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, 
test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
ERROR:  fault triggered, fault 
