[jira] [Closed] (HAWQ-559) QD hangs when QE is killed after connected to QD
[ https://issues.apache.org/jira/browse/HAWQ-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-559.
------------------------------

> QD hangs when QE is killed after connected to QD
> ------------------------------------------------
>
>                 Key: HAWQ-559
>                 URL: https://issues.apache.org/jira/browse/HAWQ-559
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Dispatcher
>    Affects Versions: 2.0.0.0-incubating
>         Environment: Mac OS X 10.10
>            Reporter: Chunling Wang
>            Assignee: Lili Ma
>             Fix For: 2.0.0.0-incubating
>
> When the first query finishes, the QE is still alive. Then we run the second
> query. After the QD thread is created and bound to the QE, but before it sends
> data to the QE, we kill this QE and find that the QD hangs.
> Here is the backtrace when the QD hangs:
> {code}
> * thread #1: tid = 0x1c4afd, 0x7fff890355be libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>   * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
>     frame #1: 0x00010745692c postgres`receiveChunksUDP [inlined] udpSignalPoll + 42 at ic_udp.c:2882
>     frame #2: 0x000107456902 postgres`receiveChunksUDP + 26 at ic_udp.c:2715
>     frame #3: 0x0001074568e8 postgres`receiveChunksUDP [inlined] waitOnCondition(timeout_us=25) + 82 at ic_udp.c:1599
>     frame #4: 0x000107456896 postgres`receiveChunksUDP(pTransportStates=0x7ff2a381ae48, pEntry=0x7ff2a18f2230, motNodeID=<unavailable>, srcRoute=0x7fff58c0ce96, conn=<unavailable>, inTeardown='\0') + 726 at ic_udp.c:4039
>     frame #5: 0x000107452a86 postgres`RecvTupleChunkFromAnyUDP [inlined] RecvTupleChunkFromAnyUDP_Internal + 498 at ic_udp.c:4146
>     frame #6: 0x000107452894 postgres`RecvTupleChunkFromAnyUDP(mlStates=<unavailable>, transportStates=<unavailable>, motNodeID=1, srcRoute=0x7fff58c0ce96) + 100 at ic_udp.c:4167
>     frame #7: 0x000107442254 postgres`RecvTupleFrom [inlined] processIncomingChunks(mlStates=0x7ff2a3812a30, transportStates=0x7ff2a381ae48, motNodeID=1, srcRoute=<unavailable>) + 34 at cdbmotion.c:684
>     frame #8: 0x000107442232 postgres`RecvTupleFrom(mlStates=0x7ff2a3812a30, transportStates=<unavailable>, motNodeID=1, tup_i=0x7fff58c0cf00, srcRoute=-100) + 370 at cdbmotion.c:610
>     frame #9: 0x0001071c8778 postgres`ExecMotion [inlined] execMotionUnsortedReceiver(node=<unavailable>) + 57 at nodeMotion.c:466
>     frame #10: 0x0001071c873f postgres`ExecMotion(node=<unavailable>) + 1071 at nodeMotion.c:298
>     frame #11: 0x0001071a4835 postgres`ExecProcNode(node=0x7ff2a38164b8) + 613 at execProcnode.c:999
>     frame #12: 0x0001071b9f82 postgres`ExecAgg + 104 at nodeAgg.c:1163
>     frame #13: 0x0001071b9f1a postgres`ExecAgg + 316 at nodeAgg.c:1693
>     frame #14: 0x0001071b9dde postgres`ExecAgg(node=0x7ff2a3815348) + 126 at nodeAgg.c:1138
>     frame #15: 0x0001071a4803 postgres`ExecProcNode(node=0x7ff2a3815348) + 563 at execProcnode.c:979
>     frame #16: 0x00010719ecfd postgres`ExecutePlan(estate=0x7ff2a3814e30, planstate=0x7ff2a3815348, operation=CMD_SELECT, numberTuples=0, direction=<unavailable>, dest=0x7ff2a28db178) + 1181 at execMain.c:3218
>     frame #17: 0x00010719e619 postgres`ExecutorRun(queryDesc=0x7ff2a3811f00, direction=ForwardScanDirection, count=0) + 569 at execMain.c:1213
>     frame #18: 0x0001072e7fc2 postgres`PortalRun + 14 at pquery.c:1649
>     frame #19: 0x0001072e7fb4 postgres`PortalRun(portal=0x7ff2a1893e30, count=<unavailable>, isTopLevel='\x01', dest=<unavailable>, altdest=0x7ff2a28db178, completionTag=0x7fff58c0d530) + 1124 at pquery.c:1471
>     frame #20: 0x0001072e4a8e postgres`exec_simple_query(query_string=0x7ff2a380fe30, seqServerHost=0x, seqServerPort=-1) + 2078 at postgres.c:1745
>     frame #21: 0x0001072e0c4c postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, username=0x7ff2a201bcf0) + 9404 at postgres.c:4754
>     frame #22: 0x00010729a002 postgres`ServerLoop [inlined] BackendRun + 105 at postmaster.c:5889
>     frame #23: 0x000107299f99 postgres`ServerLoop at postmaster.c:5484
>     frame #24: 0x000107299f99 postgres`ServerLoop + 9593 at postmaster.c:2163
>     frame #25: 0x000107296f3b postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 5019 at postmaster.c:1454
>     frame #26: 0x000107200ca9 postgres`main(argc=9, argv=0x7ff2a141eef0) + 1433 at main.c:209
>     frame #27: 0x7fff95e8c5c9 libdyld.dylib`start + 1
>   thread #2: tid = 0x1c4afe, 0x7fff890355be libsystem_kernel.dylib`poll + 10
>     frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
>     frame #1: 0x00010744d8e3 postgres`rxThreadFunc(arg=<unavailable>) + 2163 at ic_udp.c:6251
>     frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
>     frame #3:
[jira] [Closed] (HAWQ-524) Do not resolve the condition of 'executor->refResult = NULL' in executormgr_bind_executor_task()
[ https://issues.apache.org/jira/browse/HAWQ-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-524.
------------------------------

> Do not resolve the condition of 'executor->refResult = NULL' in executormgr_bind_executor_task()
> ------------------------------------------------------------------------------------------------
>
>                 Key: HAWQ-524
>                 URL: https://issues.apache.org/jira/browse/HAWQ-524
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Dispatcher
>    Affects Versions: 2.0.0.0-incubating
>            Reporter: Chunling Wang
>            Assignee: Lili Ma
>             Fix For: 2.0.0.0-incubating
>
> In executormgr.c, the check below should not be an Assert(); the condition
> 'executor->refResult == NULL' should be caught and handled at runtime.
>
> bool
> executormgr_bind_executor_task(struct DispatchData *data,
>                                QueryExecutor *executor,
>                                SegmentDatabaseDescriptor *desc,
>                                struct DispatchTask *task,
>                                struct DispatchSlice *slice)
> {
>     ...
>     Assert(executor->refResult != NULL);
>     ...
> }

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.
[ https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-1145.
-------------------------------

> After registering a partition table, if we want to insert some data into the
> table, it fails.
> ----------------------------------------------------------------------------
>
>                 Key: HAWQ-1145
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1145
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Command Line Tools
>    Affects Versions: 2.1.0.0-incubating
>            Reporter: Lili Ma
>            Assignee: Chunling Wang
>             Fix For: 2.1.0.0-incubating
>
>         Attachments: dbgen, dists.dss
>
> Reproduce Steps:
> 1. Create a partition table
> {code}
> CREATE TABLE parquet_LINEITEM_uncompressed(
>     L_ORDERKEY INT8,
>     L_PARTKEY BIGINT,
>     L_SUPPKEY BIGINT,
>     L_LINENUMBER BIGINT,
>     L_QUANTITY decimal,
>     L_EXTENDEDPRICE decimal,
>     L_DISCOUNT decimal,
>     L_TAX decimal,
>     L_RETURNFLAG CHAR(1),
>     L_LINESTATUS CHAR(1),
>     L_SHIPDATE date,
>     L_COMMITDATE date,
[jira] [Closed] (HAWQ-523) Dead code in executormgr_bind_executor_task()
[ https://issues.apache.org/jira/browse/HAWQ-523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-523.
------------------------------

> Dead code in executormgr_bind_executor_task()
> ---------------------------------------------
>
>                 Key: HAWQ-523
>                 URL: https://issues.apache.org/jira/browse/HAWQ-523
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Dispatcher
>    Affects Versions: 2.0.0.0-incubating
>            Reporter: Chunling Wang
>            Assignee: Lili Ma
>             Fix For: 2.0.0.0-incubating
>
> In executormgr.c, the branch below can never be reached:
>
> bool
> executormgr_bind_executor_task(struct DispatchData *data,
>                                QueryExecutor *executor,
>                                SegmentDatabaseDescriptor *desc,
>                                struct DispatchTask *task,
>                                struct DispatchSlice *slice)
> {
>     ...
>     if (desc == NULL)
>     {
>         executor->health = QEH_ERROR;
>         return false;
>     }
>     ...
> }

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test
[ https://issues.apache.org/jira/browse/HAWQ-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174367#comment-16174367 ]

Chunling Wang commented on HAWQ-619:
------------------------------------

The PR is https://github.com/apache/incubator-hawq/commit/53a9f76f04d3f56684f3c0e3cb3dd17ba1ae1997

> Change 'gpextract' to 'hawqextract' for InputFormat unit test
> -------------------------------------------------------------
>
>                 Key: HAWQ-619
>                 URL: https://issues.apache.org/jira/browse/HAWQ-619
>             Project: Apache HAWQ
>          Issue Type: Task
>          Components: Tests
>            Reporter: Chunling Wang
>            Assignee: Jiali Yao
>             Fix For: 2.0.0.0-incubating
>
> Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for the
> InputFormat unit test.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test
[ https://issues.apache.org/jira/browse/HAWQ-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-619.
------------------------------

> Change 'gpextract' to 'hawqextract' for InputFormat unit test
> -------------------------------------------------------------
>
>                 Key: HAWQ-619
>                 URL: https://issues.apache.org/jira/browse/HAWQ-619
>             Project: Apache HAWQ
>          Issue Type: Task
>          Components: Tests
>            Reporter: Chunling Wang
>            Assignee: Jiali Yao
>             Fix For: 2.0.0.0-incubating
>
> Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for the
> InputFormat unit test.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-812) Activate standby master failed after create a new database
[ https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-812.
------------------------------

> Activate standby master failed after create a new database
> ----------------------------------------------------------
>
>                 Key: HAWQ-812
>                 URL: https://issues.apache.org/jira/browse/HAWQ-812
>             Project: Apache HAWQ
>          Issue Type: Bug
>            Reporter: Chunling Wang
>            Assignee: Ming LI
>             Fix For: 2.0.0.0-incubating
>
> Activating the standby master fails after creating a new database. However, it
> succeeds if we do not create a new database, even if we create a new table and
> insert data.
> 1. Create a new database 'gptest'
> {code}
> [gpadmin@test1 ~]$ psql -l
>                 List of databases
>    Name    |  Owner  | Encoding | Access privileges
> -----------+---------+----------+-------------------
>  postgres  | gpadmin | UTF8     |
>  template0 | gpadmin | UTF8     |
>  template1 | gpadmin | UTF8     |
> (3 rows)
> [gpadmin@test1 ~]$ createdb gptest
> [gpadmin@test1 ~]$ psql -l
>                 List of databases
>    Name    |  Owner  | Encoding | Access privileges
> -----------+---------+----------+-------------------
>  gptest    | gpadmin | UTF8     |
>  postgres  | gpadmin | UTF8     |
>  template0 | gpadmin | UTF8     |
>  template1 | gpadmin | UTF8     |
> (4 rows)
> {code}
> 2. Stop the HAWQ master
> {code}
> [gpadmin@test1 ~]$ hawq stop master -a
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq stop'
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in:
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: ['stop', 'master']
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 connections to the database
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1
> 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master
> 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped successfully
> {code}
> 3. Activate the standby master
> {code}
> [gpadmin@test1 ~]$ ssh test5 'source /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; hawq activate standby -a'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 'hawq activate'
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log in:
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to:
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/.
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with args: ['activate', 'standby']
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to activate standby master 'test5'
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not running, skip
> 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the running segments
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-
> 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running standby
> 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host name in hawq-site.xml
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC hawq_master_address_host already exist in hawq-site.xml
> Update it with value: test5
> 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current standby from hawq-site.xml
> 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in master only mode
> {code}
> It hangs and cannot start the master. The master log is the following:
> {code}
> 2016-06-13 20:14:40.268022 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205,
> 2016-06-13 20:14:40.268112 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found recovery.conf file indicating standby takeover recovery needed",,,0,,"xlog.c",5485,
> 2016-06-13 20:14:40.268131
[jira] [Closed] (HAWQ-1034) add --repair option for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-1034.
-------------------------------

> add --repair option for hawq register
> -------------------------------------
>
>                 Key: HAWQ-1034
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1034
>             Project: Apache HAWQ
>          Issue Type: Sub-task
>          Components: Command Line Tools
>    Affects Versions: 2.1.0.0-incubating
>            Reporter: Lili Ma
>            Assignee: Chunling Wang
>             Fix For: 2.1.0.0-incubating
>
> Add a --repair option for hawq register.
> It changes both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid
> to the state that the .yml file describes. Note that files generated after the
> checkpoint may be deleted here. Also note that all the files listed in the .yml
> file must be under the table folder on HDFS.
> Limitation: does not support cases of hash-table redistribution, table truncate,
> or table drop.
> This is for the table-rollback scenario: take a checkpoint somewhere, then roll
> back to that previous checkpoint.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-1418) Print executing command for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-1418.
-------------------------------

> Print executing command for hawq register
> -----------------------------------------
>
>                 Key: HAWQ-1418
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1418
>             Project: Apache HAWQ
>          Issue Type: Sub-task
>          Components: Command Line Tools
>            Reporter: Chunling Wang
>            Assignee: Chunling Wang
>             Fix For: 2.2.0.0-incubating
>
> Print the executing command for hawq register.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-1426) hawq extract meets error after the table was reorganized.
[ https://issues.apache.org/jira/browse/HAWQ-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-1426.
-------------------------------

> hawq extract meets error after the table was reorganized.
> ---------------------------------------------------------
>
>                 Key: HAWQ-1426
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1426
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Command Line Tools
>            Reporter: Lili Ma
>            Assignee: Chunling Wang
>             Fix For: 2.3.0.0-incubating
>
> After a table is reorganized, running hawq extract on it meets an error.
> Reproduce Steps:
> 1. Create an AO table
> 2. Insert several records into it
> 3. Get the table reorganized: "alter table a set with (reorganize=true);"
> 4. Run hawq extract; an error is thrown.
> For the bug fix, we should also guarantee that hawq extract works if the table
> is truncated and re-inserted.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Closed] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
[ https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang closed HAWQ-1525.
-------------------------------

> Segmentation fault occurs if reindex database when loading data from Hive to
> HAWQ using hcatalog
> ----------------------------------------------------------------------------
>
>                 Key: HAWQ-1525
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1525
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: Chunling Wang
>            Assignee: Chunling Wang
>             Fix For: 2.3.0.0-incubating
>
> When we use hcatalog to load data from Hive to HAWQ, if the amount of data is
> big enough, it will trigger automatic statistics collection, calling vacuum
> analyze. At that time, if we reindex the database, the system will panic on the
> next auto analyze. Here is the call stack.
> {code}
> 2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,0
> 1  0x96f57c postgres <symbol not found> + 0x96f57c
> 2  0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3  0x88b04f postgres CdbProgramErrorHandler + 0xf1
> 4  0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
> 5  0x973048 postgres FunctionCall2 + 0x8e
> 6  0xabefab postgres <symbol not found> + 0xabefab
> 7  0xabfee4 postgres InMemHeap_GetNext + 0x408
> 8  0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
> 9  0x4f7abc postgres systable_getnext + 0x50
> 10 0x953fb8 postgres SearchCatCache + 0x276
> 11 0x95ce10 postgres SearchSysCache + 0x93
> 12 0x95cecb postgres SearchSysCacheKeyArray + 0x9f
> 13 0x5a07fc postgres caql_getoid_plus + 0x176
> 14 0x5c4888 postgres LookupNamespaceId + 0x129
> 15 0x5c475d postgres LookupInternalNamespaceId + 0x1d
> 16 0x687897 postgres <symbol not found> + 0x687897
> 17 0x687574 postgres CreateSchemaCommand + 0x8f
> 18 0x8952d1 postgres ProcessUtility + 0x4ff
> 19 0x5c5728 postgres <symbol not found> + 0x5c5728
> 20 0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
> 21 0x6e43f3 postgres <symbol not found> + 0x6e43f3
> 22 0x6e49c4 postgres <symbol not found> + 0x6e49c4
> 23 0x6e1401 postgres <symbol not found> + 0x6e1401
> 24 0x6deb2d postgres ExecutorStart + 0xb01
> 25 0x738594 postgres <symbol not found> + 0x738594
> 26 0x73809f postgres <symbol not found> + 0x73809f
> 27 0x7351a9 postgres SPI_execute + 0x13c
> 28 0x6490f2 postgres spiExecuteWithCallback + 0x130
> 29 0x64956b postgres <symbol not found> + 0x64956b
> 30 0x648be0 postgres <symbol not found> + 0x648be0
> 31 0x647be0 postgres analyzeStmt + 0x91d
> 32 0x647247 postgres analyzeStatement + 0xb1
> 33 0x6ca11d postgres vacuum + 0xe5
> 34 0x827910 postgres autostats_issue_analyze + 0x160
> 35 0x827e10 postgres auto_stats + 0x19b
> 36 0x8906b5 postgres <symbol not found> + 0x8906b5
> 37 0x8930f5 postgres <symbol not found> + 0x8930f5
> 38 0x892619 postgres PortalRun + 0x3e6
> 39 0x8884f6 postgres <symbol not found> + 0x8884f6
> {code}
> This is because the reindex command clears the relcache, and
> inmemscan->rs_rd->rel in InMemHeap_GetNext() uses the address of this heap
> relation in the relcache, which is not the same as the address after the heap
> relation is reopened.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Resolved] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
[ https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang resolved HAWQ-1525.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0.0-incubating

> Segmentation fault occurs if reindex database when loading data from Hive to
> HAWQ using hcatalog
> ----------------------------------------------------------------------------
>
>                 Key: HAWQ-1525
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1525
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: Chunling Wang
>            Assignee: Chunling Wang
>             Fix For: 2.3.0.0-incubating

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
[ https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang updated HAWQ-1525:
--------------------------------
    Description:

When we use hcatalog to load data from Hive to HAWQ, if the amount of data is big enough, it will trigger automatic statistics collection, calling vacuum analyze. At that time, if we reindex the database, the system will panic on the next auto analyze. Here is the call stack.
{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,0
1  0x96f57c postgres <symbol not found> + 0x96f57c
2  0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3  0x88b04f postgres CdbProgramErrorHandler + 0xf1
4  0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5  0x973048 postgres FunctionCall2 + 0x8e
6  0xabefab postgres <symbol not found> + 0xabefab
7  0xabfee4 postgres InMemHeap_GetNext + 0x408
8  0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9  0x4f7abc postgres systable_getnext + 0x50
10 0x953fb8 postgres SearchCatCache + 0x276
11 0x95ce10 postgres SearchSysCache + 0x93
12 0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13 0x5a07fc postgres caql_getoid_plus + 0x176
14 0x5c4888 postgres LookupNamespaceId + 0x129
15 0x5c475d postgres LookupInternalNamespaceId + 0x1d
16 0x687897 postgres <symbol not found> + 0x687897
17 0x687574 postgres CreateSchemaCommand + 0x8f
18 0x8952d1 postgres ProcessUtility + 0x4ff
19 0x5c5728 postgres <symbol not found> + 0x5c5728
20 0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21 0x6e43f3 postgres <symbol not found> + 0x6e43f3
22 0x6e49c4 postgres <symbol not found> + 0x6e49c4
23 0x6e1401 postgres <symbol not found> + 0x6e1401
24 0x6deb2d postgres ExecutorStart + 0xb01
25 0x738594 postgres <symbol not found> + 0x738594
26 0x73809f postgres <symbol not found> + 0x73809f
27 0x7351a9 postgres SPI_execute + 0x13c
28 0x6490f2 postgres spiExecuteWithCallback + 0x130
29 0x64956b postgres <symbol not found> + 0x64956b
30 0x648be0 postgres <symbol not found> + 0x648be0
31 0x647be0 postgres analyzeStmt + 0x91d
32 0x647247 postgres analyzeStatement + 0xb1
33 0x6ca11d postgres vacuum + 0xe5
34 0x827910 postgres autostats_issue_analyze + 0x160
35 0x827e10 postgres auto_stats + 0x19b
36 0x8906b5 postgres <symbol not found> + 0x8906b5
37 0x8930f5 postgres <symbol not found> + 0x8930f5
38 0x892619 postgres PortalRun + 0x3e6
39 0x8884f6 postgres <symbol not found> + 0x8884f6
{code}
This is because the reindex command clears the relcache, and inmemscan->rs_rd->rel in InMemHeap_GetNext() uses the address of this heap relation in the relcache, which is not the same as the address after the heap relation is reopened.

    was:

When we use hcatalog to load data from Hive to HAWQ, if the amount of data is big enough, it will trigger automatic statistics collection, calling vacuum analyze. At that time, if we reindex the database, the system will panic on the next auto analyze. Here is the call stack.
{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,0
1  0x96f57c postgres <symbol not found> + 0x96f57c
2  0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3  0x88b04f postgres CdbProgramErrorHandler + 0xf1
4  0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5  0x973048 postgres FunctionCall2 + 0x8e
6  0xabefab postgres <symbol not found> + 0xabefab
7  0xabfee4 postgres InMemHeap_GetNext + 0x408
8  0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9  0x4f7abc postgres systable_getnext + 0x50
10 0x953fb8 postgres SearchCatCache + 0x276
11 0x95ce10 postgres SearchSysCache + 0x93
12 0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13 0x5a07fc postgres caql_getoid_plus + 0x176
14 0x5c4888 postgres LookupNamespaceId + 0x129
15 0x5c475d postgres LookupInternalNamespaceId + 0x1d
16 0x687897 postgres <symbol not found> + 0x687897
17 0x687574 postgres CreateSchemaCommand + 0x8f
18 0x8952d1 postgres ProcessUtility + 0x4ff
19 0x5c5728 postgres <symbol not found> + 0x5c5728
20 0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21 0x6e43f3 postgres <symbol not found> + 0x6e43f3
22 0x6e49c4 postgres <symbol not found> + 0x6e49c4
23 0x6e1401 postgres <symbol not found> + 0x6e1401
24 0x6deb2d postgres ExecutorStart + 0xb01
25 0x738594 postgres <symbol not found> + 0x738594
26 0x73809f postgres <symbol not found> + 0x73809f
27 0x7351a9 postgres SPI_execute + 0x13c
28 0x6490f2 postgres spiExecuteWithCallback + 0x130
29 0x64956b postgres <symbol not found> + 0x64956b
30 0x648be0 postgres <symbol not found> + 0x648be0
31 0x647be0 postgres analyzeStmt + 0x91d
32 0x647247 postgres analyzeStatement + 0xb1
33 0x6ca11d postgres vacuum + 0xe5
34 0x827910 postgres autostats_issue_analyze + 0x160
35 0x827e10 postgres auto_stats + 0x19b
36 0x8906b5 postgres <symbol not found> + 0x8906b5
37 0x8930f5 postgres <symbol not found> + 0x8930f5
38 0x892619 postgres PortalRun + 0x3e6
39 0x8884f6 postgres <symbol not found> + 0x8884f6
{code}
This is because the reindex command clears the syscache, and inmemscan->rs_rd->rel in InMemHeap_GetNext() uses the address of this heap relation in the syscache, which is not
[jira] [Assigned] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
[ https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang reassigned HAWQ-1525:
-----------------------------------
    Assignee: Chunling Wang  (was: Lei Chang)

> Segmentation fault occurs if reindex database when loading data from Hive to
> HAWQ using hcatalog
> ----------------------------------------------------------------------------
>
>                 Key: HAWQ-1525
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1525
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: Chunling Wang
>            Assignee: Chunling Wang

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Created] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
Chunling Wang created HAWQ-1525:
-----------------------------------

             Summary: Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
                 Key: HAWQ-1525
                 URL: https://issues.apache.org/jira/browse/HAWQ-1525
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Query Execution
            Reporter: Chunling Wang
            Assignee: Lei Chang

When we use hcatalog to load data from Hive to HAWQ, if the amount of data is big enough, it will trigger automatic statistics collection, calling vacuum analyze. At that time, if we reindex the database, the system will panic on the next auto analyze. Here is the call stack.
{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,0
1  0x96f57c postgres <symbol not found> + 0x96f57c
2  0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3  0x88b04f postgres CdbProgramErrorHandler + 0xf1
4  0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5  0x973048 postgres FunctionCall2 + 0x8e
6  0xabefab postgres <symbol not found> + 0xabefab
7  0xabfee4 postgres InMemHeap_GetNext + 0x408
8  0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9  0x4f7abc postgres systable_getnext + 0x50
10 0x953fb8 postgres SearchCatCache + 0x276
11 0x95ce10 postgres SearchSysCache + 0x93
12 0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13 0x5a07fc postgres caql_getoid_plus + 0x176
14 0x5c4888 postgres LookupNamespaceId + 0x129
15 0x5c475d postgres LookupInternalNamespaceId + 0x1d
16 0x687897 postgres <symbol not found> + 0x687897
17 0x687574 postgres CreateSchemaCommand + 0x8f
18 0x8952d1 postgres ProcessUtility + 0x4ff
19 0x5c5728 postgres <symbol not found> + 0x5c5728
20 0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21 0x6e43f3 postgres <symbol not found> + 0x6e43f3
22 0x6e49c4 postgres <symbol not found> + 0x6e49c4
23 0x6e1401 postgres <symbol not found> + 0x6e1401
24 0x6deb2d postgres ExecutorStart + 0xb01
25 0x738594 postgres <symbol not found> + 0x738594
26 0x73809f postgres <symbol not found> + 0x73809f
27 0x7351a9 postgres SPI_execute + 0x13c
28 0x6490f2 postgres spiExecuteWithCallback + 0x130
29 0x64956b postgres <symbol not found> + 0x64956b
30 0x648be0 postgres <symbol not found> + 0x648be0
31 0x647be0 postgres analyzeStmt + 0x91d
32 0x647247 postgres analyzeStatement + 0xb1
33 0x6ca11d postgres vacuum + 0xe5
34 0x827910 postgres autostats_issue_analyze + 0x160
35 0x827e10 postgres auto_stats + 0x19b
36 0x8906b5 postgres <symbol not found> + 0x8906b5
37 0x8930f5 postgres <symbol not found> + 0x8930f5
38 0x892619 postgres PortalRun + 0x3e6
39 0x8884f6 postgres <symbol not found> + 0x8884f6
{code}
This is because the reindex command clears the syscache, and inmemscan->rs_rd->rel in InMemHeap_GetNext() uses the address of this heap relation in the syscache, which is not the same as the address after the heap relation is reopened.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Assigned] (HAWQ-1426) hawq extract meets error after the table was reorganized.
[ https://issues.apache.org/jira/browse/HAWQ-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1426: --- Assignee: Chunling Wang (was: Ed Espino) > hawq extract meets error after the table was reorganized. > - > > Key: HAWQ-1426 > URL: https://issues.apache.org/jira/browse/HAWQ-1426 > Project: Apache HAWQ > Issue Type: Bug > Components: Command Line Tools >Reporter: Lili Ma >Assignee: Chunling Wang > Fix For: 2.3.0.0-incubating > > > After a table is reorganized, running hawq extract on it fails with an error. > Reproduce Steps: > 1. create an AO table > 2. insert several records into it > 3. Get the table reorganized: "alter table a set with (reorganize=true);" > 4. run hawq extract; an error is thrown. > For the bug fix, we should also guarantee that hawq extract works if > the table is truncated and re-inserted. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (HAWQ-1418) Print executing command for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1418. --- > Print executing command for hawq register > - > > Key: HAWQ-1418 > URL: https://issues.apache.org/jira/browse/HAWQ-1418 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: backlog > > > Print executing command for hawq register -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HAWQ-1418) Print executing command for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1418. - Resolution: Fixed > Print executing command for hawq register > - > > Key: HAWQ-1418 > URL: https://issues.apache.org/jira/browse/HAWQ-1418 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: backlog > > > Print executing command for hawq register -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HAWQ-1418) Print executing command for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1418: --- Assignee: Chunling Wang (was: Ed Espino) > Print executing command for hawq register > - > > Key: HAWQ-1418 > URL: https://issues.apache.org/jira/browse/HAWQ-1418 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: backlog > > > Print executing command for hawq register -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HAWQ-1418) Print executing command for hawq register
Chunling Wang created HAWQ-1418: --- Summary: Print executing command for hawq register Key: HAWQ-1418 URL: https://issues.apache.org/jira/browse/HAWQ-1418 Project: Apache HAWQ Issue Type: Sub-task Components: Command Line Tools Reporter: Chunling Wang Assignee: Ed Espino Print executing command for hawq register -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1332. --- > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Alexander Denissov > Fix For: 2.2.0.0-incubating > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. Here are steps to reproduce it. > 1. create a new user "usertest1" in database: > {code} > $ psql postgres > psql (8.2.15) > Type "help" for help. > postgres=# CREATE USER usertest1; > NOTICE: resource queue required -- using default resource queue "pg_default" > CREATE ROLE > postgres=# > {code} > 2. add user "usertest1" in pg_hba.conf > {code} > local all usertest1 trust > {code} > 3. set policy with database and schema included, with table excluded > !screenshot-1.png|width=800,height=400! > 4. connect database with user "usertest1" but failed with permission denied > {code} > $ psql postgres -U usertest1 > psql: FATAL: permission denied for database "postgres" > DETAIL: User does not have CONNECT privilege. > {code} > 5. set policy with database, schema and table included > !screenshot-2.png|width=800,height=400! > 6. connect database with user "usertest1" and succeed > {code} > $ psql postgres -U usertest1 > psql (8.2.15) > Type "help" for help. > postgres=# > {code} > But if we do not set table as "*", and specify table like "a", we can not > access database either. 
> !screenshot-3.png|width=800,height=400! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1332. - Resolution: Not A Problem > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Alexander Denissov > Fix For: 2.2.0.0-incubating > > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. Here are steps to reproduce it. > 1. create a new user "usertest1" in database: > {code} > $ psql postgres > psql (8.2.15) > Type "help" for help. > postgres=# CREATE USER usertest1; > NOTICE: resource queue required -- using default resource queue "pg_default" > CREATE ROLE > postgres=# > {code} > 2. add user "usertest1" in pg_hba.conf > {code} > local all usertest1 trust > {code} > 3. set policy with database and schema included, with table excluded > !screenshot-1.png|width=800,height=400! > 4. connect database with user "usertest1" but failed with permission denied > {code} > $ psql postgres -U usertest1 > psql: FATAL: permission denied for database "postgres" > DETAIL: User does not have CONNECT privilege. > {code} > 5. set policy with database, schema and table included > !screenshot-2.png|width=800,height=400! > 6. connect database with user "usertest1" and succeed > {code} > $ psql postgres -U usertest1 > psql (8.2.15) > Type "help" for help. 
> postgres=# > {code} > But if we do not set table as "*", and specify table like "a", we can not > access database either. > !screenshot-3.png|width=800,height=400! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.
[ https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1367. --- > hawq can access to user tables that have no permission with fallback check > table. > -- > > Key: HAWQ-1367 > URL: https://issues.apache.org/jira/browse/HAWQ-1367 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Xiang Sheng >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > If a user has access to a catalog table but no access to user table b, > he can still access table b using "select * from catalog_table, b;" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.
[ https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1367. - Resolution: Fixed > hawq can access to user tables that have no permission with fallback check > table. > -- > > Key: HAWQ-1367 > URL: https://issues.apache.org/jira/browse/HAWQ-1367 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Xiang Sheng >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > If a user has access to a catalog table but no access to user table b, > he can still access table b using "select * from catalog_table, b;" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml
[ https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1377. - Resolution: Fixed > Add more information for Ranger related GUCs in default hawq-site.xml > - > > Key: HAWQ-1377 > URL: https://issues.apache.org/jira/browse/HAWQ-1377 > Project: Apache HAWQ > Issue Type: Improvement > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > We should add default GUCs for Ranger in the sample hawq-site.xml, just as the > resource manager does, so that users don't need to refer to the documents for > detailed GUC names. > The output content should be as follows: > {code}
> <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
> </property>
> <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
> </property>
> <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
> </property>
> <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
> </property>
> {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml
[ https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1377. --- > Add more information for Ranger related GUCs in default hawq-site.xml > - > > Key: HAWQ-1377 > URL: https://issues.apache.org/jira/browse/HAWQ-1377 > Project: Apache HAWQ > Issue Type: Improvement > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > We should add default GUCs for Ranger in the sample hawq-site.xml, just as the > resource manager does, so that users don't need to refer to the documents for > detailed GUC names. > The output content should be as follows: > {code}
> <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
> </property>
> <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
> </property>
> <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
> </property>
> <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
> </property>
> {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml
[ https://issues.apache.org/jira/browse/HAWQ-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1377: --- Assignee: Chunling Wang (was: Ed Espino) > Add more information for Ranger related GUCs in default hawq-site.xml > - > > Key: HAWQ-1377 > URL: https://issues.apache.org/jira/browse/HAWQ-1377 > Project: Apache HAWQ > Issue Type: Improvement > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > We should add default GUCs for Ranger in the sample hawq-site.xml, just as the > resource manager does, so that users don't need to refer to the documents for > detailed GUC names. > The output content should be as follows: > {code}
> <property>
>     <name>hawq_acl_type</name>
>     <value>standalone</value>
> </property>
> <property>
>     <name>hawq_rps_address_host</name>
>     <value>localhost</value>
> </property>
> <property>
>     <name>hawq_rps_address_suffix</name>
>     <value>rps</value>
> </property>
> <property>
>     <name>hawq_rps_address_port</name>
>     <value>8432</value>
> </property>
> {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HAWQ-1377) Add more information for Ranger related GUCs in default hawq-site.xml
Chunling Wang created HAWQ-1377: --- Summary: Add more information for Ranger related GUCs in default hawq-site.xml Key: HAWQ-1377 URL: https://issues.apache.org/jira/browse/HAWQ-1377 Project: Apache HAWQ Issue Type: Improvement Components: Security Reporter: Chunling Wang Assignee: Ed Espino Fix For: 2.2.0.0-incubating We should add default GUCs for Ranger in the sample hawq-site.xml, just as the resource manager does, so that users don't need to refer to the documents for detailed GUC names. The output content should be as follows: {code}
<property>
    <name>hawq_acl_type</name>
    <value>standalone</value>
</property>
<property>
    <name>hawq_rps_address_host</name>
    <value>localhost</value>
</property>
<property>
    <name>hawq_rps_address_suffix</name>
    <value>rps</value>
</property>
<property>
    <name>hawq_rps_address_port</name>
    <value>8432</value>
</property>
{code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HAWQ-1367) hawq can access to user tables that have no permission with fallback check table.
[ https://issues.apache.org/jira/browse/HAWQ-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1367: --- Assignee: Chunling Wang (was: Ed Espino) > hawq can access to user tables that have no permission with fallback check > table. > -- > > Key: HAWQ-1367 > URL: https://issues.apache.org/jira/browse/HAWQ-1367 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Xiang Sheng >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > If a user has access to a catalog table but no access to user table b, > he can still access table b using "select * from catalog_table, b;" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Description: We try to grant database connect and schema usage privileges to a non-super user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger. Here are steps to reproduce it. 1. create a new user "usertest1" in database: {code} $ psql postgres psql (8.2.15) Type "help" for help. postgres=# CREATE USER usertest1; NOTICE: resource queue required -- using default resource queue "pg_default" CREATE ROLE postgres=# {code} 2. add user "usertest1" in pg_hba.conf {code} local all usertest1 trust {code} 3. set policy with database and schema included, with table excluded !screenshot-1.png|width=800,height=400! 4. connect database with user "usertest1" but failed with permission denied {code} $ psql postgres -U usertest1 psql: FATAL: permission denied for database "postgres" DETAIL: User does not have CONNECT privilege. {code} 5. set policy with database, schema and table included !screenshot-2.png|width=800,height=400! 6. connect database with user "usertest1" and succeed {code} $ psql postgres -U usertest1 psql (8.2.15) Type "help" for help. postgres=# {code} But if we do not set table as "*", and specify table like "a", we can not access database either. !screenshot-3.png|width=800,height=400! was: We try to grant database connect and schema usage privileges to a non-super user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger. Here are steps to reproduce it. 1. 
create a new user "usertest1" in database: {code} $ psql postgres psql (8.2.15) Type "help" for help. postgres=# CREATE USER usertest1; NOTICE: resource queue required -- using default resource queue "pg_default" CREATE ROLE postgres=# {code} 2. add user "usertest1" in pg_hba.conf {code} local all usertest1 trust {code} 3. set policy with database and schema included, with table excluded !screenshot-1.png|width=800,height=400! 4. connect database with user "usertest1" but failed with permission denied {code} $ psql postgres -U usertest1 psql: FATAL: permission denied for database "postgres" DETAIL: User does not have CONNECT privilege. {code} 5. set policy with database, schema and table included !screenshot-2.png|width=800,height=400! 6. connect database with user "usertest1" and succeed {code} $ psql postgres -U usertest1 psql (8.2.15) Type "help" for help. postgres=# {code} > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. Here are steps to reproduce it. > 1. create a new user "usertest1" in database: > {code} > $ psql postgres > psql (8.2.15) > Type "help" for help. > postgres=# CREATE USER usertest1; > NOTICE: resource queue required -- using default resource queue "pg_default" > CREATE ROLE > postgres=# > {code} > 2. add user "usertest1" in pg_hba.conf > {code} > local all usertest1 trust > {code} > 3. 
set policy with database and schema included, with table excluded > !screenshot-1.png|width=800,height=400! > 4. connect database with user "usertest1" but failed with permission denied > {code} > $ psql postgres -U usertest1 > psql: FATAL: permission denied for database "postgres" > DETAIL: User does not have CONNECT privilege. > {code} > 5. set policy with database, schema and table included > !screenshot-2.png|width=800,height=400! > 6. connect database with user "usertest1" and succeed > {code} > $ psql postgres -U usertest1 > psql (8.2.15) > Type "help" for help. > postgres=# > {code} > But if we do not set table as "*", and specify table like "a", we can not > access database either. >
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: screenshot-3.png > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. Here are steps to reproduce it. > 1. create a new user "usertest1" in database: > {code} > $ psql postgres > psql (8.2.15) > Type "help" for help. > postgres=# CREATE USER usertest1; > NOTICE: resource queue required -- using default resource queue "pg_default" > CREATE ROLE > postgres=# > {code} > 2. add user "usertest1" in pg_hba.conf > {code} > local all usertest1 trust > {code} > 3. set policy with database and schema included, with table excluded > !screenshot-1.png|width=800,height=400! > 4. connect database with user "usertest1" but failed with permission denied > {code} > $ psql postgres -U usertest1 > psql: FATAL: permission denied for database "postgres" > DETAIL: User does not have CONNECT privilege. > {code} > 5. set policy with database, schema and table included > !screenshot-2.png|width=800,height=400! > 6. connect database with user "usertest1" and succeed > {code} > $ psql postgres -U usertest1 > psql (8.2.15) > Type "help" for help. > postgres=# > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Description: We try to grant database connect and schema usage privileges to a non-super user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger. Here are steps to reproduce it. 1. create a new user "usertest1" in database: {code} $ psql postgres psql (8.2.15) Type "help" for help. postgres=# CREATE USER usertest1; NOTICE: resource queue required -- using default resource queue "pg_default" CREATE ROLE postgres=# {code} 2. add user "usertest1" in pg_hba.conf {code} local all usertest1 trust {code} 3. set policy with database and schema included, with table excluded !screenshot-1.png|width=800,height=400! 4. connect database with user "usertest1" but failed with permission denied {code} $ psql postgres -U usertest1 psql: FATAL: permission denied for database "postgres" DETAIL: User does not have CONNECT privilege. {code} 5. set policy with database, schema and table included !screenshot-2.png|width=800,height=400! 6. connect database with user "usertest1" and succeed {code} $ psql postgres -U usertest1 psql (8.2.15) Type "help" for help. postgres=# {code} was:We try to grant database connect and schema usage privileges to a non-super user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger. 
> Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. Here are steps to reproduce it. > 1. create a new user "usertest1" in database: > {code} > $ psql postgres > psql (8.2.15) > Type "help" for help. > postgres=# CREATE USER usertest1; > NOTICE: resource queue required -- using default resource queue "pg_default" > CREATE ROLE > postgres=# > {code} > 2. add user "usertest1" in pg_hba.conf > {code} > local all usertest1 trust > {code} > 3. set policy with database and schema included, with table excluded > !screenshot-1.png|width=800,height=400! > 4. connect database with user "usertest1" but failed with permission denied > {code} > $ psql postgres -U usertest1 > psql: FATAL: permission denied for database "postgres" > DETAIL: User does not have CONNECT privilege. > {code} > 5. set policy with database, schema and table included > !screenshot-2.png|width=800,height=400! > 6. connect database with user "usertest1" and succeed > {code} > $ psql postgres -U usertest1 > psql (8.2.15) > Type "help" for help. > postgres=# > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: screenshot-1.png > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: screenshot-2.png > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: (was: screenshot-2.png) > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: (was: screenshot-1.png) > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: screenshot-2.png > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png, screenshot-2.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Attachment: screenshot-1.png > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > Attachments: screenshot-1.png > > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
[ https://issues.apache.org/jira/browse/HAWQ-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1332: Description: We try to grant database connect and schema usage privileges to a non-super user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger. (was: We try to grant database connect and schema usage privileges to a normal user to connect database. We find that if we set policy with database and schema included, but with table excluded, we can not connect database. But if we include table, we can connect to database. We think there may be bug in Ranger Plugin Service or Ranger.) > Can not grant database and schema privileges without table privileges in > ranger or ranger plugin service > > > Key: HAWQ-1332 > URL: https://issues.apache.org/jira/browse/HAWQ-1332 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Ed Espino > > We try to grant database connect and schema usage privileges to a non-super > user to connect database. We find that if we set policy with database and > schema included, but with table excluded, we can not connect database. But if > we include table, we can connect to database. We think there may be bug in > Ranger Plugin Service or Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HAWQ-1332) Can not grant database and schema privileges without table privileges in ranger or ranger plugin service
Chunling Wang created HAWQ-1332: --- Summary: Can not grant database and schema privileges without table privileges in ranger or ranger plugin service Key: HAWQ-1332 URL: https://issues.apache.org/jira/browse/HAWQ-1332 Project: Apache HAWQ Issue Type: Bug Components: Security Reporter: Chunling Wang Assignee: Ed Espino We try to grant database connect and schema usage privileges to a normal user so that it can connect to a database. We find that if we set a policy with the database and schema included but the table excluded, we cannot connect to the database. But if we include the table, we can connect. We think there may be a bug in the Ranger Plugin Service or in Ranger. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (HAWQ-1249) Don't do ACL checks on segments
[ https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1249. --- > Don't do ACL checks on segments > --- > > Key: HAWQ-1249 > URL: https://issues.apache.org/jira/browse/HAWQ-1249 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > HAWQ does ACL checks on segments, which we think is unnecessary for the QE > because there is no catalog data on segments. Even if a hacker connects to a > segdb with GP_ROLE_EXECUTE, he cannot run any queries, while he can on > Greenplum because Greenplum keeps catalog data on segments. Furthermore, for Ranger > checks, if all segments perform the same checks against RPS as the master, the cost is high > and performance suffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HAWQ-1249) Don't do ACL checks on segments
[ https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1249. - Resolution: Fixed Fix Version/s: (was: backlog) 2.2.0.0-incubating > Don't do ACL checks on segments > --- > > Key: HAWQ-1249 > URL: https://issues.apache.org/jira/browse/HAWQ-1249 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > HAWQ does ACL checks on segments, which we think is unnecessary for the QE > because there is no catalog data on segments. Even if a hacker connects to a > segdb with GP_ROLE_EXECUTE, he cannot run any queries, while he can on > Greenplum because Greenplum keeps catalog data on segments. Furthermore, for Ranger > checks, if all segments perform the same checks against RPS as the master, the cost is high > and performance suffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1249) Don't do ACL checks on segments
[ https://issues.apache.org/jira/browse/HAWQ-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1249: --- Assignee: Chunling Wang (was: Ed Espino) > Don't do ACL checks on segments > --- > > Key: HAWQ-1249 > URL: https://issues.apache.org/jira/browse/HAWQ-1249 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: backlog > > > HAWQ does ACL checks on segments, which we think is unnecessary for the QE > because there is no catalog data on segments. Even if a hacker connects to a > segdb with GP_ROLE_EXECUTE, he cannot run any queries, while he can on > Greenplum because Greenplum keeps catalog data on segments. Furthermore, for Ranger > checks, if all segments perform the same checks against RPS as the master, the cost is high > and performance suffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1249) Don't do ACL checks on segments
Chunling Wang created HAWQ-1249: --- Summary: Don't do ACL checks on segments Key: HAWQ-1249 URL: https://issues.apache.org/jira/browse/HAWQ-1249 Project: Apache HAWQ Issue Type: Sub-task Components: Security Reporter: Chunling Wang Assignee: Ed Espino HAWQ does ACL checks on segments, which we think is unnecessary for the QE because there is no catalog data on segments. Even if a hacker connects to a segdb with GP_ROLE_EXECUTE, he cannot run any queries, while he can on Greenplum because Greenplum keeps catalog data on segments. Furthermore, for Ranger checks, if all segments perform the same checks against RPS as the master, the cost is high and performance suffers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
[ https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1239. --- > Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" > or "requiredPerms == 0" > > > Key: HAWQ-1239 > URL: https://issues.apache.org/jira/browse/HAWQ-1239 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > In ExecCheckRTPermsWithRanger(), it should continue rather than return when > "rte->rtekind != RTE_RELATION" or "requiredPerms == 0". > {code} > if (rte->rtekind != RTE_RELATION) > return; > requiredPerms = rte->requiredPerms; > if (requiredPerms == 0) > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
[ https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1239. - Resolution: Fixed > Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" > or "requiredPerms == 0" > > > Key: HAWQ-1239 > URL: https://issues.apache.org/jira/browse/HAWQ-1239 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > In ExecCheckRTPermsWithRanger(), it should continue rather than return when > "rte->rtekind != RTE_RELATION" or "requiredPerms == 0". > {code} > if (rte->rtekind != RTE_RELATION) > return; > requiredPerms = rte->requiredPerms; > if (requiredPerms == 0) > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HAWQ-1237) Insert statement needs "select" privilege in ranger check
[ https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-1237. - Resolution: Fixed > Insert statement needs "select" privilege in ranger check > - > > Key: HAWQ-1237 > URL: https://issues.apache.org/jira/browse/HAWQ-1237 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded > and makes all statements require the "select" privilege in the Ranger check. > {code} > //ListCell *cell; > //foreach(cell, arg_ptr->actions) > //{ > char tmp[7] = "select"; > json_object* jaction = json_object_new_string((char *)tmp); > //json_object* jaction = json_object_new_string((char > *)cell->data.ptr_value); > json_object_array_add(jactions, jaction); > //} > json_object_object_add(jelement, "privileges", jactions); > json_object_array_add(jaccess, jelement); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (HAWQ-1237) Insert statement needs "select" privilege in ranger check
[ https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1237. --- > Insert statement needs "select" privilege in ranger check > - > > Key: HAWQ-1237 > URL: https://issues.apache.org/jira/browse/HAWQ-1237 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded > and makes all statements require the "select" privilege in the Ranger check. > {code} > //ListCell *cell; > //foreach(cell, arg_ptr->actions) > //{ > char tmp[7] = "select"; > json_object* jaction = json_object_new_string((char *)tmp); > //json_object* jaction = json_object_new_string((char > *)cell->data.ptr_value); > json_object_array_add(jactions, jaction); > //} > json_object_object_add(jelement, "privileges", jactions); > json_object_array_add(jaccess, jelement); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-1237) Insert statement needs "select" privilege in ranger check
[ https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1237: Fix Version/s: 2.2.0.0-incubating > Insert statement needs "select" privilege in ranger check > - > > Key: HAWQ-1237 > URL: https://issues.apache.org/jira/browse/HAWQ-1237 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded > and makes all statements require the "select" privilege in the Ranger check. > {code} > //ListCell *cell; > //foreach(cell, arg_ptr->actions) > //{ > char tmp[7] = "select"; > json_object* jaction = json_object_new_string((char *)tmp); > //json_object* jaction = json_object_new_string((char > *)cell->data.ptr_value); > json_object_array_add(jactions, jaction); > //} > json_object_object_add(jelement, "privileges", jactions); > json_object_array_add(jaccess, jelement); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
[ https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1239: Fix Version/s: 2.2.0.0-incubating > Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" > or "requiredPerms == 0" > > > Key: HAWQ-1239 > URL: https://issues.apache.org/jira/browse/HAWQ-1239 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.2.0.0-incubating > > > In ExecCheckRTPermsWithRanger(), it should continue rather than return when > "rte->rtekind != RTE_RELATION" or "requiredPerms == 0". > {code} > if (rte->rtekind != RTE_RELATION) > return; > requiredPerms = rte->requiredPerms; > if (requiredPerms == 0) > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
[ https://issues.apache.org/jira/browse/HAWQ-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1239: --- Assignee: Chunling Wang (was: Ed Espino) > Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" > or "requiredPerms == 0" > > > Key: HAWQ-1239 > URL: https://issues.apache.org/jira/browse/HAWQ-1239 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > > In ExecCheckRTPermsWithRanger(), it should continue rather than return when > "rte->rtekind != RTE_RELATION" or "requiredPerms == 0". > {code} > if (rte->rtekind != RTE_RELATION) > return; > requiredPerms = rte->requiredPerms; > if (requiredPerms == 0) > return; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1239) Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0"
Chunling Wang created HAWQ-1239: --- Summary: Fail to call pg_rangercheck_batch() when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0" Key: HAWQ-1239 URL: https://issues.apache.org/jira/browse/HAWQ-1239 Project: Apache HAWQ Issue Type: Bug Components: Security Reporter: Chunling Wang Assignee: Ed Espino In ExecCheckRTPermsWithRanger(), it should continue rather than return when "rte->rtekind != RTE_RELATION" or "requiredPerms == 0". {code} if (rte->rtekind != RTE_RELATION) return; requiredPerms = rte->requiredPerms; if (requiredPerms == 0) return; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1238) Can not get any data when the network is connected again after being disconnected for a while.
Chunling Wang created HAWQ-1238: --- Summary: Can not get any data when the network is connected again after being disconnected for a while. Key: HAWQ-1238 URL: https://issues.apache.org/jira/browse/HAWQ-1238 Project: Apache HAWQ Issue Type: Bug Components: Security Reporter: Chunling Wang Assignee: Ed Espino We cannot get any data when the network is connected again after being disconnected for a while. 1. Run psql postgres, then "\d", and see the relations in the database. {code} psql postgres psql (8.2.15) Type "help" for help. postgres=# \d List of relations Schema | Name | Type |Owner | Storage ++---+--+- public | sales1 | table | wangchunling | append only public | sales1_1_prt_1 | table | wangchunling | append only public | sales1_1_prt_2 | table | wangchunling | append only public | t | table | wangchunling | append only public | tv | view | wangchunling | none (5 rows) {code} 2. Quit the session and disconnect the network. Then run psql postgres and "\d", and get the expected errors. {code} $ psql postgres psql (8.2.15) Type "help" for help. postgres=# \d WARNING: curl_easy_perform() failed: Couldn't connect to server LINE 1: select version() ^ WARNING: curl_easy_perform() failed: Couldn't connect to server ERROR: permission denied for function version WARNING: curl_easy_perform() failed: Couldn't connect to server ERROR: permission denied for function version WARNING: curl_easy_perform() failed: Couldn't connect to server LINE 5: FROM pg_catalog.pg_class c ^ ERROR: permission denied for schema pg_catalog LINE 5: FROM pg_catalog.pg_class c ^ {code} 3. Reconnect the network and run "\d", but no relations are found. {code} postgres=# \d No relations found. {code} 4. Quit the session again. Then run psql postgres and "\d", and the relations are listed correctly. {code} $ psql postgres psql (8.2.15) Type "help" for help. 
postgres=# \d List of relations Schema | Name | Type |Owner | Storage ++---+--+- public | sales1 | table | wangchunling | append only public | sales1_1_prt_1 | table | wangchunling | append only public | sales1_1_prt_2 | table | wangchunling | append only public | t | table | wangchunling | append only public | tv | view | wangchunling | none (5 rows) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1237) Insert statement needs "select" privilege in ranger check
[ https://issues.apache.org/jira/browse/HAWQ-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1237: --- Assignee: Chunling Wang (was: Ed Espino) > Insert statement needs "select" privilege in ranger check > - > > Key: HAWQ-1237 > URL: https://issues.apache.org/jira/browse/HAWQ-1237 > Project: Apache HAWQ > Issue Type: Bug > Components: Security >Reporter: Chunling Wang >Assignee: Chunling Wang > > The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded > and makes all statements require the "select" privilege in the Ranger check. > {code} > //ListCell *cell; > //foreach(cell, arg_ptr->actions) > //{ > char tmp[7] = "select"; > json_object* jaction = json_object_new_string((char *)tmp); > //json_object* jaction = json_object_new_string((char > *)cell->data.ptr_value); > json_object_array_add(jactions, jaction); > //} > json_object_object_add(jelement, "privileges", jactions); > json_object_array_add(jaccess, jelement); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1237) Insert statement needs "select" privilege in ranger check
Chunling Wang created HAWQ-1237: --- Summary: Insert statement needs "select" privilege in ranger check Key: HAWQ-1237 URL: https://issues.apache.org/jira/browse/HAWQ-1237 Project: Apache HAWQ Issue Type: Bug Components: Security Reporter: Chunling Wang Assignee: Ed Espino The code in create_ranger_request_json_batch() in rangerrest.c is hard-coded and makes all statements require the "select" privilege in the Ranger check. {code} //ListCell *cell; //foreach(cell, arg_ptr->actions) //{ char tmp[7] = "select"; json_object* jaction = json_object_new_string((char *)tmp); //json_object* jaction = json_object_new_string((char *)cell->data.ptr_value); json_object_array_add(jactions, jaction); //} json_object_object_add(jelement, "privileges", jactions); json_object_array_add(jaccess, jelement); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1149) Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node
Chunling Wang created HAWQ-1149: --- Summary: Built-in function gp_persistent_build_all loses data in gp_relfile_node and gp_persistent_relfile_node Key: HAWQ-1149 URL: https://issues.apache.org/jira/browse/HAWQ-1149 Project: Apache HAWQ Issue Type: Bug Components: Core Reporter: Chunling Wang Assignee: Lei Chang When we create a new table and insert data into it, there will be records in gp_relfile_node, gp_persistent_relfile_node and gp_persistent_relation_node. But if we run the HAWQ built-in function gp_persistent_build_all, we find that the records in gp_relfile_node and gp_persistent_relfile_node for this table are lost. And if there is more than one file in this table, we get an error when we drop the table. Here are the steps to reproduce this bug: 1. Create table a, and insert data into a with two concurrent processes: {code} postgres=# create table a(id int); CREATE TABLE postgres=# insert into a select generate_series(1, 1000); INSERT 0 1000 {code} {code} postgres=# insert into a select generate_series(1000, 2000); INSERT 0 1001 {code} 2. 
Check the persistent table and find two files in this table's directory: {code} postgres=# select oid from pg_class where relname='a'; oid - 3017232 (1 row) postgres=# select * from gp_relfile_node where relfilenode_oid=3017232; relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num -+--++--- 3017232 |1 | (4,128)|855050 3017232 |2 | (4,129)|855051 (2 rows) postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid +--+-+--+--++---+--- 16385 |16387 | 3017232 |2 |0 | 0 |158943 | (0,0) (1 row) postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid +--+-+--+--+--+---++---+--- 16385 |16387 | 3017232 |1 | 2 |2 | 0 | 0 | 855050 | (0,0) 16385 |16387 | 3017232 |2 | 2 |2 | 0 | 0 | 855051 | (0,0) (2 rows) hadoop fs -ls /hawq_default/16385/16387/3017232 -rw--- 3 wangchunling supergroup 100103584 2016-11-08 17:02 /hawq_default/16385/16387/3017232/1 -rw--- 3 wangchunling supergroup 100103600 2016-11-08 17:02 /hawq_default/16385/16387/3017232/2 {code} 3. Rebuilt persistent tables. {code} postgres=# insert into a select generate_series(1000, 2000); INSERT 0 1001 postgres=# select gp_persistent_reset_all(); gp_persistent_reset_all - 1 (1 row) postgres=# select gp_persistent_build_all(false); gp_persistent_build_all - 1 (1 row) {code} 4. Check persistent table and find data lost in gp_relfile_node and gp_persistent_relfile_node. 
{code} postgres=# select * from gp_relfile_node where relfilenode_oid=3017232; relfilenode_oid | segment_file_num | persistent_tid | persistent_serial_num -+--++--- (0 rows) postgres=# select * from gp_persistent_relation_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | persistent_state | reserved | parent_xid | persistent_serial_num | previous_free_tid +--+-+--+--++---+--- 16385 |16387 | 3017232 |2 |0 | 0 |159020 | (0,0) (1 row) postgres=# select * from gp_persistent_relfile_node where relfilenode_oid=3017232; tablespace_oid | database_oid | relfilenode_oid | segment_file_num | relation_storage_manager | persistent_state | relation_bufpool_kind | parent_xid | persistent_serial_num | previous_free_tid
[jira] [Created] (HAWQ-1132) HAWQ should throw an error when we insert data into a hash table and the virtual segment number is 1
Chunling Wang created HAWQ-1132: --- Summary: HAWQ should throw an error when we insert data into a hash table and the virtual segment number is 1 Key: HAWQ-1132 URL: https://issues.apache.org/jira/browse/HAWQ-1132 Project: Apache HAWQ Issue Type: Bug Components: Core, Planner, Query Execution Reporter: Chunling Wang Assignee: Lei Chang If we set the virtual segment number to 1 and create a hash table (default hash bucket number is 6), we just get a warning message for a non-partitioned table when we insert a tuple, and we do not get any message at all for a partitioned table. When we select from this table, HAWQ throws an error. Non-partitioned table: {code} postgres=# set enforce_virtual_segment_number = 1; SET postgres=# create table t(id int) DISTRIBUTED BY (id); CREATE TABLE postgres=# insert into t values(1); WARNING: skipping "t" --- error returned: file count 1 in catalog is not in proportion to the bucket number 6 of hash table with oid=2966724, some data may be lost, if you still want to continue the query by considering the table as random, set GUC allow_file_count_bucket_num_mismatch to on and try again. INFO: ANALYZE completed. Success: 0, Failure: 1 (t) INSERT 0 1 postgres=# select * from t; ERROR: file count 1 in catalog is not in proportion to the bucket number 6 of hash table with oid=2966724, some data may be lost, if you still want to continue the query by considering the table as random, set GUC allow_file_count_bucket_num_mismatch to on and try again. 
(cdbdatalocality.c:3801) postgres=# {code} Partition table: {code} postgres=# set enforce_virtual_segment_number = 1; SET postgres=# CREATE TABLE t (id int, rank int, year int, gender char(1), count int ) DISTRIBUTED BY (id) PARTITION BY LIST (gender) ( PARTITION girls VALUES ('F'), PARTITION boys VALUES ('M'), DEFAULT PARTITION other ); NOTICE: CREATE TABLE will create partition "t_1_prt_girls" for table "t" NOTICE: CREATE TABLE will create partition "t_1_prt_boys" for table "t" NOTICE: CREATE TABLE will create partition "t_1_prt_other" for table "t" CREATE TABLE postgres=# insert into t values(51, 1, 1, 'F', 1); INSERT 0 1 postgres=# select * from t; ERROR: file count 1 in catalog is not in proportion to the bucket number 6 of hash table with oid=2966703, some data may be lost, if you still want to continue the query by considering the table as random, set GUC allow_file_count_bucket_num_mismatch to on and try again. (cdbdatalocality.c:3801) postgres=# {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1128) Support HAWQ register tables with same file name in different schema
[ https://issues.apache.org/jira/browse/HAWQ-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1128: --- Assignee: Chunling Wang (was: Lei Chang) > Support HAWQ register tables with same file name in different schema > > > Key: HAWQ-1128 > URL: https://issues.apache.org/jira/browse/HAWQ-1128 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: backlog > > > Currently, HAWQ register cannot distinguish tables that have the same file name > but live in different schemas; they are regarded as the same table. We should save and use > the schema information for HAWQ register. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1128) Support HAWQ register tables with same file name in different schema
Chunling Wang created HAWQ-1128: --- Summary: Support HAWQ register tables with same file name in different schema Key: HAWQ-1128 URL: https://issues.apache.org/jira/browse/HAWQ-1128 Project: Apache HAWQ Issue Type: Sub-task Components: Command Line Tools Reporter: Chunling Wang Assignee: Lei Chang Currently, HAWQ register cannot distinguish tables that have the same file name but live in different schemas; they are regarded as the same table. We should save and use the schema information for HAWQ register. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HAWQ-1034) add --repair option for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621371#comment-15621371 ] Chunling Wang commented on HAWQ-1034: - The reason we removed this code is that removing data from the table directory in repair mode risks losing data. So we decided not to remove data from the table directory in repair mode; force mode can be used instead. > add --repair option for hawq register > - > > Key: HAWQ-1034 > URL: https://issues.apache.org/jira/browse/HAWQ-1034 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Affects Versions: 2.0.1.0-incubating >Reporter: Lili Ma >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > add --repair option for hawq register > Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid to > the state that the .yml file configures. Note that files newly generated since > the checkpoint may be deleted here. Also note that all the files in the .yml file > should be under the table folder on HDFS. Limitation: does not support > hash table redistribution, table truncate, or table drop. This is for the > table rollback scenario: take checkpoints somewhere, and roll back to a > previous checkpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HAWQ-1034) add --repair option for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621348#comment-15621348 ] Chunling Wang commented on HAWQ-1034: - The code and test cases for repair mode have been removed by GitHub Pull Request #986. > add --repair option for hawq register > - > > Key: HAWQ-1034 > URL: https://issues.apache.org/jira/browse/HAWQ-1034 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Affects Versions: 2.0.1.0-incubating >Reporter: Lili Ma >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > add --repair option for hawq register > Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid to > the state that the .yml file configures. Note that files newly generated since > the checkpoint may be deleted here. Also note that all the files in the .yml file > should be under the table folder on HDFS. Limitation: does not support > hash table redistribution, table truncate, or table drop. This is for the > table rollback scenario: take checkpoints somewhere, and roll back to a > previous checkpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1034) add --repair option for hawq register
[ https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1034: --- Assignee: Chunling Wang (was: hongwu) > add --repair option for hawq register > - > > Key: HAWQ-1034 > URL: https://issues.apache.org/jira/browse/HAWQ-1034 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Command Line Tools >Affects Versions: 2.0.1.0-incubating >Reporter: Lili Ma >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > add --repair option for hawq register > Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid to > the state that the .yml file configures. Note that files newly generated since > the checkpoint may be deleted here. Also note that all the files in the .yml file > should be under the table folder on HDFS. Limitation: does not support > hash table redistribution, table truncate, or table drop. This is for the > table rollback scenario: take checkpoints somewhere, and roll back to a > previous checkpoint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1113) In force mode, hawq register errors when the files in the yaml are disordered
Chunling Wang created HAWQ-1113: --- Summary: In force mode, hawq register errors when the files in the yaml are disordered Key: HAWQ-1113 URL: https://issues.apache.org/jira/browse/HAWQ-1113 Project: Apache HAWQ Issue Type: Bug Components: Command Line Tools Reporter: Chunling Wang Assignee: Lei Chang In force mode, hawq register errors out when the file list in the yaml is out of order. For example, the file order in the yaml is as follows: {code} Files: - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/2 size: 250 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/4 size: 250 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/5 size: 258 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/6 size: 270 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/3 size: 258 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW2@/1 size: 228 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/2 size: 215 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/3 size: 215 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/4 size: 220 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/1 size: 254 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/6 size: 215 - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/5 size: 210 {code} After hawq register succeeds, we select data from the table and get the error: {code} ERROR: hdfs file length does not equal to metadata logic length! (cdbdatalocality.c:1102) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-910) "hawq register": before registration, need to check the consistency between the file and HAWQ table
[ https://issues.apache.org/jira/browse/HAWQ-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-910: --- Description: As a user, I want to be notified during registration when the uploaded file is not consistent with the table I want to register to, so that I can make the corresponding modifications as early as possible. There are two situations we need to check: 1. When hawq register registers a single file or folder, it should check the consistency between the table and the uploaded files. 2. When hawq register registers a .yml file, it should check the consistency among the table (if the table exists), the .yml file, and the file(s) to be moved. was: As a user, I want to be notified during registration when the uploaded file is not consistent with the table I want to register to, so that I can make the corresponding modifications as early as possible. > "hawq register": before registration, need to check the consistency between the > file and HAWQ table > > > Key: HAWQ-910 > URL: https://issues.apache.org/jira/browse/HAWQ-910 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Storage >Reporter: Lili Ma >Assignee: Lei Chang > Fix For: backlog > > > As a user, > I want to be notified during registration when the uploaded file is not consistent with the table I > want to register to, > so that I can make the corresponding modifications as early as possible. > There are two situations we need to check: > 1. When hawq register registers a single file or folder, it should check the consistency > between the table and the uploaded files. > 2. When hawq register registers a .yml file, it should check the consistency among the table > (if the table exists), the .yml file, and the file(s) to be moved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-975: --- Affects Version/s: 2.0.0.0-incubating > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Affects Versions: 2.0.0.0-incubating >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-975. -- Resolution: Not A Bug > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Affects Versions: 2.0.0.0-incubating >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reopened HAWQ-975: > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-975. Resolution: Not A Bug It is a system configuration issue other than a bug in HAWQ. > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451831#comment-15451831 ] Chunling Wang edited comment on HAWQ-975 at 8/31/16 10:30 AM: -- The performance of explain analyze on AWS is low because the VDSO on agents of AWS is not properly configured and does not work well. To be specific, gettimeofday() takes too much time. was (Author: wcl14): It is because that the VDSO on agents of AWS does not work well. So the execution time of function 'gettimeofday()' is too much. > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. 
> ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
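The vDSO diagnosis in the comment above can be sanity-checked by timing a tight loop of wall-clock reads: with a working vDSO, `gettimeofday()`-style calls complete in userspace in tens of nanoseconds, while a misconfigured host traps into the kernel on every call and the per-call cost jumps by roughly an order of magnitude. A rough Python sketch (the `time.time()` proxy and loop count are illustrative, not what HAWQ itself runs):

```python
# Rough probe for the gettimeofday() overhead blamed in HAWQ-975: measure the
# average cost of a wall-clock read. On hosts where the vDSO works, this is
# cheap; where every read is a real syscall, the per-call figure is far larger.
import time

def time_call_overhead_ns(calls=200_000):
    """Return approximate nanoseconds per time.time() call."""
    start = time.perf_counter()
    for _ in range(calls):
        time.time()  # backed by gettimeofday()/clock_gettime() on Linux
    elapsed = time.perf_counter() - start
    return elapsed / calls * 1e9

overhead = time_call_overhead_ns()
print(f"~{overhead:.0f} ns per wall-clock read")
```

Comparing this figure across the AWS agents and a known-good host would show whether timer reads, not HAWQ, account for the 2-3x slowdown under 'explain analyze'.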
[jira] [Assigned] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-975: -- Assignee: Chunling Wang (was: Lei Chang) > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Reporter: Chunling Wang >Assignee: Chunling Wang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HAWQ-975) Queries run much slower with 'explain analyze' than without 'explain analyze'
[ https://issues.apache.org/jira/browse/HAWQ-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451831#comment-15451831 ] Chunling Wang commented on HAWQ-975: It is because that the VDSO on agents of AWS does not work well. So the execution time of function 'gettimeofday()' is too much. > Queries run much slower with 'explain analyze' than which without 'explain > analyze' > > > Key: HAWQ-975 > URL: https://issues.apache.org/jira/browse/HAWQ-975 > Project: Apache HAWQ > Issue Type: Bug > Components: Core >Reporter: Chunling Wang >Assignee: Lei Chang >Priority: Critical > Labels: performance > Fix For: 2.0.1.0-incubating > > > When we run queries with 'explain analyze' in AWS cluster, the total running > time is about 2-3 times longer than which without 'explain analyze'. > Here is a group of TPC-H results for queries with 'explain analyze' and > queries without 'explain analyze'. > ||query ||without 'explain analyze' ||with 'explain analyze' > ||multiple > |TPCH_Query_01| 311843 | 818658 | 2.63 > |TPCH_Query_02| 34675 | 117884 | 3.40 > |TPCH_Query_03| 166155 | 422131 | 2.54 > |TPCH_Query_04| 157807 | 507143 | 3.21 > |TPCH_Query_05| 272657 | 710573 | 2.61 > |TPCH_Query_06| 12508 | 22276 | 1.78 > |TPCH_Query_07| 71893 | 370338 | 5.15 > |TPCH_Query_08| 12 | 672625 | 5.17 > |TPCH_Query_09| 575709 | 1171672 | 2.04 > |TPCH_Query_10| 93770 | 233391 | 2.49 > |TPCH_Query_11| 16252 | 58360 | 3.59 > |TPCH_Query_12| 142576 | 237270 | 1.66 > |TPCH_Query_13| 72682 | 343257 | 4.72 > |TPCH_Query_14| 10410 | 32337 | 3.11 > |TPCH_Query_15| 25719 | 98705 | 3.84 > |TPCH_Query_16| 21382 | 76877 | 3.60 > |TPCH_Query_17| 839683 | 2041169 | 2.43 > |TPCH_Query_18| 460570 | 1065940 | 2.31 > |TPCH_Query_19| 69075 | 82286 | 1.19 > |TPCH_Query_20| 78263 | 292041 | 3.73 > |TPCH_Query_21| 505606 | 1549690 | 3.07 > |TPCH_Query_22| 56450 | 329837 | 5.84 > |Total| 4125684 | 11254460| > 2.73 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-1037) modify way to get HDFS port in TestHawqRegister
[ https://issues.apache.org/jira/browse/HAWQ-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-1037: Summary: modify way to get HDFS port in TestHawqRegister (was: modify to get HDFS port in TestHawqRegister) > modify way to get HDFS port in TestHawqRegister > --- > > Key: HAWQ-1037 > URL: https://issues.apache.org/jira/browse/HAWQ-1037 > Project: Apache HAWQ > Issue Type: Bug > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > > In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS > port from HdfsConfig. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1037) modify to get HDFS port in TestHawqRegister
[ https://issues.apache.org/jira/browse/HAWQ-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1037: --- Assignee: Chunling Wang (was: Jiali Yao) > modify to get HDFS port in TestHawqRegister > --- > > Key: HAWQ-1037 > URL: https://issues.apache.org/jira/browse/HAWQ-1037 > Project: Apache HAWQ > Issue Type: Bug > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > > In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS > port from HdfsConfig. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1037) modify to get HDFS port in TestHawqRegister
Chunling Wang created HAWQ-1037: --- Summary: modify to get HDFS port in TestHawqRegister Key: HAWQ-1037 URL: https://issues.apache.org/jira/browse/HAWQ-1037 Project: Apache HAWQ Issue Type: Bug Components: Tests Reporter: Chunling Wang Assignee: Jiali Yao In test TestHawqRegister, the HDFS port is hard-coded. Now we get the HDFS port from HdfsConfig. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
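The fix described for HAWQ-1037 (read the HDFS port from configuration instead of hard-coding it) can be sketched as below. HAWQ's actual `HdfsConfig` helper is C++ in the feature-test library; this Python sketch only assumes the standard Hadoop `fs.defaultFS` property, and the sample XML is made up for illustration.

```python
# Sketch of the HAWQ-1037 idea: extract the NameNode port from a Hadoop-style
# XML config rather than hard-coding it in the test. 'fs.defaultFS' is the
# standard Hadoop property; the default port and sample file are illustrative.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def hdfs_port_from_config(xml_text, default=8020):
    """Return the port from fs.defaultFS, or a default if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            port = urlparse(prop.findtext("value")).port
            return port if port is not None else default
    return default

sample = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://localhost:9000</value></property>
</configuration>"""
print(hdfs_port_from_config(sample))  # 9000
```

A test can then build HDFS paths from this value instead of a literal port.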
[jira] [Closed] (HAWQ-969) Add getting configuration from HDFS and YARN
[ https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-969. -- > Add getting configuration from HDFS and YARN > > > Key: HAWQ-969 > URL: https://issues.apache.org/jira/browse/HAWQ-969 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > Add getting configuration from HDFS and YARN and writing xml file in > xml_parser.cpp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse
[ https://issues.apache.org/jira/browse/HAWQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-1020. --- > Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and > TestCommonLib.TestYanConfig run in concourse > --- > > Key: HAWQ-1020 > URL: https://issues.apache.org/jira/browse/HAWQ-1020 > Project: Apache HAWQ > Issue Type: Bug > Components: Tests >Affects Versions: 2.0.1.0-incubating >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and > TestCommonLib.TestYanConfig run in concourse -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HAWQ-969) Add getting configuration from HDFS and YARN
[ https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang resolved HAWQ-969. Resolution: Fixed > Add getting configuration from HDFS and YARN > > > Key: HAWQ-969 > URL: https://issues.apache.org/jira/browse/HAWQ-969 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > Add getting configuration from HDFS and YARN and writing xml file in > xml_parser.cpp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse
[ https://issues.apache.org/jira/browse/HAWQ-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-1020: --- Assignee: Chunling Wang (was: Jiali Yao) > Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and > TestCommonLib.TestYanConfig run in concourse > --- > > Key: HAWQ-1020 > URL: https://issues.apache.org/jira/browse/HAWQ-1020 > Project: Apache HAWQ > Issue Type: Bug > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > > Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and > TestCommonLib.TestYanConfig run in concourse -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-1020) Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse
Chunling Wang created HAWQ-1020: --- Summary: Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse Key: HAWQ-1020 URL: https://issues.apache.org/jira/browse/HAWQ-1020 Project: Apache HAWQ Issue Type: Bug Components: Tests Reporter: Chunling Wang Assignee: Jiali Yao Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and TestCommonLib.TestYanConfig run in concourse -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HAWQ-969) Add getting configuration from HDFS and YARN
[ https://issues.apache.org/jira/browse/HAWQ-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang reassigned HAWQ-969: -- Assignee: Chunling Wang (was: Jiali Yao) > Add getting configuration from HDFS and YARN > > > Key: HAWQ-969 > URL: https://issues.apache.org/jira/browse/HAWQ-969 > Project: Apache HAWQ > Issue Type: Sub-task > Components: Tests >Reporter: Chunling Wang >Assignee: Chunling Wang > Fix For: 2.0.1.0-incubating > > > Add getting configuration from HDFS and YARN and writing xml file in > xml_parser.cpp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-969) Add getting configuration from HDFS and YARN
Chunling Wang created HAWQ-969: -- Summary: Add getting configuration from HDFS and YARN Key: HAWQ-969 URL: https://issues.apache.org/jira/browse/HAWQ-969 Project: Apache HAWQ Issue Type: Sub-task Components: Tests Reporter: Chunling Wang Assignee: Jiali Yao Add getting configuration from HDFS and YARN and writing xml file in xml_parser.cpp. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
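For the "writing xml file" half of HAWQ-969, a minimal sketch of serializing settings into Hadoop-style `<configuration>` XML. The real implementation lives in `xml_parser.cpp`; the property name and value below are illustrative only.

```python
# Companion sketch for HAWQ-969: turn a dict of settings into Hadoop-style
# <configuration>/<property> XML, the format consumed by HDFS and YARN.
# The sample property name/value are placeholders, not HAWQ's actual output.
import xml.etree.ElementTree as ET

def to_hadoop_xml(settings):
    """Serialize {name: value} pairs into a <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")

xml_text = to_hadoop_xml({"yarn.resourcemanager.address": "rm-host:8032"})
print(xml_text)
```

Round-tripping this through the reader sketched earlier would exercise both directions that the test helper needs.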
[jira] [Updated] (HAWQ-812) Activate standby master failed after creating a new database
[ https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-812: --- Component/s: (was: Backup & restore) > Activate standby master failed after create a new database > -- > > Key: HAWQ-812 > URL: https://issues.apache.org/jira/browse/HAWQ-812 > Project: Apache HAWQ > Issue Type: Bug >Reporter: Chunling Wang >Assignee: Lei Chang > > Activate standby master failed after create a new database. However, it will > success if we do not create a new database even we create a new table and > insert data. > 1. Create a new database 'gptest' > {code} > [gpadmin@test1 ~]$ psql -l > List of databases >Name| Owner | Encoding | Access privileges > ---+-+--+--- > postgres | gpadmin | UTF8 | > template0 | gpadmin | UTF8 | > template1 | gpadmin | UTF8 | > (3 rows) > [gpadmin@test1 ~]$ createdb gptest > [gpadmin@test1 ~]$ psql -l > List of databases >Name| Owner | Encoding | Access privileges > ---+-+--+--- > gptest| gpadmin | UTF8 | > postgres | gpadmin | UTF8 | > template0 | gpadmin | UTF8 | > template1 | gpadmin | UTF8 | > (4 rows) > {code} > 2. Stop HAWQ master > {code} > [gpadmin@test1 ~]$ hawq stop master -a > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq > stop' > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: > 20160613:20:13:44:068559 > hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: > 20160613:20:13:44:068559 > hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: > ['stop', 'master'] > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 > connections to the database > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master > instance shutdown with mode='smart' > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master > 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped > successfully > {code} > 3. Activate standby master > {code} > [gpadmin@test1 ~]$ ssh test5 'source > /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; > hawq activate standby -a' > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do > 'hawq activate' > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log > in: > 20160613:20:14:14:126841 > hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: > 20160613:20:14:14:126841 > hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq > with args: ['activate', 'standby'] > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to > activate standby master 'test5' > 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is > not running, skip > 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the > running segments > 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- > 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running > standby > 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master > host name in hawq-site.xml > 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC > hawq_master_address_host already exist in hawq-site.xml > Update it with value: test5 > 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current > standby from hawq-site.xml > 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in > master only mode > {code} > It hangs and can not start master. And the master log is following: > {code} > 2016-06-13 20:14:40.268022 > PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database > system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205, > 2016-06-13 20:14:40.268112 > PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found > recovery.conf file indicating standby takeover recovery > needed",,,0,,"xlog.c",5485, > 2016-06-13 20:14:40.268131 >
[jira] [Updated] (HAWQ-812) Activate standby master failed after creating a new database
[ https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-812: --- Description: Activate standby master failed after create a new database. However, it will success if we do not create a new database even we create a new table and insert data. 1. Create a new database 'gptest' {code} [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (3 rows) [gpadmin@test1 ~]$ createdb gptest [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- gptest| gpadmin | UTF8 | postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (4 rows) {code} 2. Stop HAWQ master {code} [gpadmin@test1 ~]$ hawq stop master -a 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq stop' 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: ['stop', 'master'] 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 connections to the database 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart' 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped successfully {code} 3. 
Activate standby master {code} [gpadmin@test1 ~]$ ssh test5 'source /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; hawq activate standby -a' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 'hawq activate' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log in: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with args: ['activate', 'standby'] 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to activate standby master 'test5' 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not running, skip 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the running segments 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running standby 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host name in hawq-site.xml 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC hawq_master_address_host already exist in hawq-site.xml Update it with value: test5 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current standby from hawq-site.xml 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in master only mode {code} It hangs and can not start master. 
And the master log is following: {code} 2016-06-13 20:14:40.268022 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205, 2016-06-13 20:14:40.268112 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found recovery.conf file indicating standby takeover recovery needed",,,0,,"xlog.c",5485, 2016-06-13 20:14:40.268131 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","checkpoint record is at 0/1C75EF0",,,0,,"xlog.c",6304, 2016-06-13 20:14:40.268143 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","redo record is at 0/1C75EF0; undo record is at 0/0; shutdown TRUE",,,0,,"xlog.c",6338, 2016-06-13 20:14:40.268155 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next transaction ID: 0/1003; next OID: 16508",,,0,,"xlog.c",6342, 2016-06-13 20:14:40.268165 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next MultiXactId: 1; next MultiXactOffset: 0",,,0,,"xlog.c",6345,
[jira] [Closed] (HAWQ-813) Activate standby master failed after creating a new database
[ https://issues.apache.org/jira/browse/HAWQ-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang closed HAWQ-813. -- Resolution: Invalid > Activate standby master failed after create a new database > -- > > Key: HAWQ-813 > URL: https://issues.apache.org/jira/browse/HAWQ-813 > Project: Apache HAWQ > Issue Type: Bug > Components: Backup & restore >Reporter: Chunling Wang >Assignee: Lei Chang > > Activate standby master failed after create a new database. However, it will > success if we do not create a new database even we create a new table and > insert data. > 1. Create a new database 'gptest' > {code} > [gpadmin@test1 ~]$ psql -l > List of databases >Name| Owner | Encoding | Access privileges > ---+-+--+--- > postgres | gpadmin | UTF8 | > template0 | gpadmin | UTF8 | > template1 | gpadmin | UTF8 | > (3 rows) > [gpadmin@test1 ~]$ createdb gptest > [gpadmin@test1 ~]$ psql -l > List of databases >Name| Owner | Encoding | Access privileges > ---+-+--+--- > gptest| gpadmin | UTF8 | > postgres | gpadmin | UTF8 | > template0 | gpadmin | UTF8 | > template1 | gpadmin | UTF8 | > (4 rows) > {code} > 2. Stop HAWQ master > {code} > [gpadmin@test1 ~]$ hawq stop master -a > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq > stop' > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: > 20160613:20:13:44:068559 > hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log > 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: > 20160613:20:13:44:068559 > hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
> 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: > ['stop', 'master'] > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 > connections to the database > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master > instance shutdown with mode='smart' > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 > 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master > 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped > successfully > {code} > 3. Activate standby master > {code} > [gpadmin@test1 ~]$ ssh test5 'source > /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; > hawq activate standby -a' > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do > 'hawq activate' > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log > in: > 20160613:20:14:14:126841 > hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: > 20160613:20:14:14:126841 > hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
> 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq > with args: ['activate', 'standby'] > 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to > activate standby master 'test5' > 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is > not running, skip > 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the > running segments > 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- > 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running > standby > 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master > host name in hawq-site.xml > 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC > hawq_master_address_host already exist in hawq-site.xml > Update it with value: test5 > 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current > standby from hawq-site.xml > 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in > master only mode > {code} > It hangs and cannot start the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-812) Activate standby master failed after create a new database
[ https://issues.apache.org/jira/browse/HAWQ-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-812: --- Description: Activating the standby master fails after creating a new database. However, it will succeed if we do not create a new database, even if we create a new table and insert data. 1. Create a new database 'gptest' {code} [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (3 rows) [gpadmin@test1 ~]$ createdb gptest [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- gptest| gpadmin | UTF8 | postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (4 rows) {code} 2. Stop HAWQ master {code} [gpadmin@test1 ~]$ hawq stop master -a 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq stop' 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: ['stop', 'master'] 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 connections to the database 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart' 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped successfully {code} 3. 
Activate standby master {code} [gpadmin@test1 ~]$ ssh test5 'source /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; hawq activate standby -a' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 'hawq activate' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log in: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with args: ['activate', 'standby'] 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to activate standby master 'test5' 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not running, skip 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the running segments 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running standby 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host name in hawq-site.xml 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC hawq_master_address_host already exist in hawq-site.xml Update it with value: test5 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current standby from hawq-site.xml 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in master only mode {code} It hangs and can not start master. 
And the master log is as follows: {code} 2016-06-13 20:14:40.268022 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","database system was shut down at 2016-06-13 20:02:50 PDT",,,0,,"xlog.c",6205, 2016-06-13 20:14:40.268112 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","found recovery.conf file indicating standby takeover recovery needed",,,0,,"xlog.c",5485, 2016-06-13 20:14:40.268131 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","checkpoint record is at 0/1C75EF0",,,0,,"xlog.c",6304, 2016-06-13 20:14:40.268143 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","redo record is at 0/1C75EF0; undo record is at 0/0; shutdown TRUE",,,0,,"xlog.c",6338, 2016-06-13 20:14:40.268155 PDT,,,p127518,th-12124628160,,,seg-1,"LOG","0","next transaction ID: 0/1003; next OID: 16508",,,0,,"xlog.c",6342, 2016-06-13 20:14:40.268165
[jira] [Created] (HAWQ-813) Activate standby master failed after create a new database
Chunling Wang created HAWQ-813: -- Summary: Activate standby master failed after create a new database Key: HAWQ-813 URL: https://issues.apache.org/jira/browse/HAWQ-813 Project: Apache HAWQ Issue Type: Bug Components: Backup & restore Reporter: Chunling Wang Assignee: Lei Chang Activating the standby master fails after creating a new database. However, it will succeed if we do not create a new database, even if we create a new table and insert data. 1. Create a new database 'gptest' {code} [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (3 rows) [gpadmin@test1 ~]$ createdb gptest [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- gptest| gpadmin | UTF8 | postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (4 rows) {code} 2. Stop HAWQ master {code} [gpadmin@test1 ~]$ hawq stop master -a 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq stop' 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: ['stop', 'master'] 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 connections to the database 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart' 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped successfully {code} 3. Activate standby master {code} [gpadmin@test1 ~]$ ssh test5 'source /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; hawq activate standby -a' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 'hawq activate' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log in: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with args: ['activate', 'standby'] 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to activate standby master 'test5' 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not running, skip 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the running segments 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running standby 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host name in hawq-site.xml 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC hawq_master_address_host already exist in hawq-site.xml Update it with value: test5 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current standby from hawq-site.xml 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in master only mode {code} It hangs and cannot start the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-812) Activate standby master failed after create a new database
Chunling Wang created HAWQ-812: -- Summary: Activate standby master failed after create a new database Key: HAWQ-812 URL: https://issues.apache.org/jira/browse/HAWQ-812 Project: Apache HAWQ Issue Type: Bug Components: Backup & restore Reporter: Chunling Wang Assignee: Lei Chang Activating the standby master fails after creating a new database. However, it will succeed if we do not create a new database, even if we create a new table and insert data. 1. Create a new database 'gptest' {code} [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (3 rows) [gpadmin@test1 ~]$ createdb gptest [gpadmin@test1 ~]$ psql -l List of databases Name| Owner | Encoding | Access privileges ---+-+--+--- gptest| gpadmin | UTF8 | postgres | gpadmin | UTF8 | template0 | gpadmin | UTF8 | template1 | gpadmin | UTF8 | (4 rows) {code} 2. Stop HAWQ master {code} [gpadmin@test1 ~]$ hawq stop master -a 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Prepare to do 'hawq stop' 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-You can find log in: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_stop_20160613.log 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
20160613:20:13:44:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq with args: ['stop', 'master'] 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-There are 0 connections to the database 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart' 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Master host=test1 20160613:20:13:45:068559 hawq_stop:test1:gpadmin-[INFO]:-Stop hawq master 20160613:20:13:46:068559 hawq_stop:test1:gpadmin-[INFO]:-Master stopped successfully {code} 3. Activate standby master {code} [gpadmin@test1 ~]$ ssh test5 'source /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/./greenplum_path.sh; hawq activate standby -a' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Prepare to do 'hawq activate' 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-You can find log in: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/home/gpadmin/hawqAdminLogs/hawq_activate_20160613.log 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-GPHOME is set to: 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-mutilnodeparallel-wcl/product/hawq/. 
20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Activate hawq with args: ['activate', 'standby'] 20160613:20:14:14:126841 hawq_activate:test5:gpadmin-[INFO]:-Starting to activate standby master 'test5' 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-HAWQ master is not running, skip 20160613:20:14:15:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping all the running segments 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:- 20160613:20:14:21:126841 hawq_activate:test5:gpadmin-[INFO]:-Stopping running standby 20160613:20:14:23:126841 hawq_activate:test5:gpadmin-[INFO]:-Update master host name in hawq-site.xml 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-GUC hawq_master_address_host already exist in hawq-site.xml Update it with value: test5 20160613:20:14:31:126841 hawq_activate:test5:gpadmin-[INFO]:-Remove current standby from hawq-site.xml 20160613:20:14:39:126841 hawq_activate:test5:gpadmin-[INFO]:-Start master in master only mode {code} It hangs and cannot start the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HAWQ-619) Change 'gpextract' to 'hawqextract' for InputFormat unit test
Chunling Wang created HAWQ-619: -- Summary: Change 'gpextract' to 'hawqextract' for InputFormat unit test Key: HAWQ-619 URL: https://issues.apache.org/jira/browse/HAWQ-619 Project: Apache HAWQ Issue Type: Task Components: Tests Reporter: Chunling Wang Assignee: Jiali Yao Change 'gpextract' to 'hawqextract' in SimpleTableLocalTester.java for InputFormat unit test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()
[ https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-592: --- Description: We first run a query to get some QEs. Then we kill one and run "set log_min_messages=DEBUG1" to let QD reach executormgr_allocate_any_executor(). We find that QD fails. 1. Run query to get some QEs. {code} dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id; count --- 3725 (1 row) {code} {code} $ ps -ef|grep postgres 501 12817 1 0 4:41下午 ?? 0:00.36 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true 501 12818 12817 0 4:41下午 ?? 0:00.01 postgres: port 5432, master logger process 501 12821 12817 0 4:41下午 ?? 0:00.00 postgres: port 5432, stats collector process 501 12822 12817 0 4:41下午 ?? 0:00.03 postgres: port 5432, writer process 501 12823 12817 0 4:41下午 ?? 0:00.00 postgres: port 5432, checkpoint process 501 12824 12817 0 4:41下午 ?? 0:00.00 postgres: port 5432, seqserver process 501 12825 12817 0 4:41下午 ?? 0:00.00 postgres: port 5432, WAL Send Server process 501 12826 12817 0 4:41下午 ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process 501 12827 12817 0 4:41下午 ?? 0:00.16 postgres: port 5432, master resource manager 501 12844 1 0 4:41下午 ?? 0:00.57 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true 501 12845 12844 0 4:41下午 ?? 0:00.01 postgres: port 4, logger process 501 12856 12862 0 4:42下午 ?? 0:00.05 postgres: port 5432, wangchunling dispatch [local] con13 cmd10 idle [local] 501 12872 12844 0 4:42下午 ?? 0:00.00 postgres: port 4, stats collector process 501 12873 12844 0 4:42下午 ?? 0:00.01 postgres: port 4, writer process 501 12874 12844 0 4:42下午 ?? 0:00.00 postgres: port 4, checkpoint process 501 12875 12844 0 4:42下午 ?? 0:00.03 postgres: port 4, segment resource manager {code} 2. 
Kill -9 some QE and wait segment up. {code} $ ps -ef|grep postgres 501 12817 1 0 4:41下午 ?? 0:00.91 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true 501 12818 12817 0 4:41下午 ?? 0:00.05 postgres: port 5432, master logger process 501 12844 1 0 4:41下午 ?? 0:01.52 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true 501 12845 12844 0 4:41下午 ?? 0:00.04 postgres: port 4, logger process 501 12872 12844 0 4:42下午 ?? 0:00.02 postgres: port 4, stats collector process 501 12873 12844 0 4:42下午 ?? 0:00.19 postgres: port 4, writer process 501 12874 12844 0 4:42下午 ?? 0:00.03 postgres: port 4, checkpoint process 501 12875 12844 0 4:42下午 ?? 0:00.41 postgres: port 4, segment resource manager 501 12932 12817 0 4:52下午 ?? 0:00.00 postgres: port 5432, stats collector process 501 12933 12817 0 4:52下午 ?? 0:00.01 postgres: port 5432, writer process 501 12934 12817 0 4:52下午 ?? 0:00.00 postgres: port 5432, checkpoint process 501 12935 12817 0 4:52下午 ?? 0:00.00 postgres: port 5432, seqserver process 501 12936 12817 0 4:52下午 ?? 0:00.00 postgres: port 5432, WAL Send Server process 501 12937 12817 0 4:52下午 ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process 501 12938 12817 0 4:52下午 ?? 0:00.04 postgres: port 5432, master resource manager 501 12952 12817 0 4:53下午 ?? 0:00.00 postgres: port 5432, wangchunling dispatch [local] con30 idle [local] {code} {code} dispatch=# select * from gp_segment_configuration; registration_order | role | status | port | hostname | address |description +--++---+-+-+ 0 | m| u | 5432 | ChunlingdeMacBook-Pro.local | ChunlingdeMacBook-Pro.local | 1 | p| d | 4 | localhost | 127.0.0.1 | resource manager process was reset (2 rows) dispatch=# select * from gp_segment_configuration; registration_order | role | status | port | hostname | address | description +--++---+-+-+- 0 | m| u | 5432 |
[jira] [Created] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()
Chunling Wang created HAWQ-592: -- Summary: QD fails when connects to QE again in executormgr_allocate_any_executor() Key: HAWQ-592 URL: https://issues.apache.org/jira/browse/HAWQ-592 Project: Apache HAWQ Issue Type: Bug Components: Dispatcher Reporter: Chunling Wang Assignee: Lei Chang We first run a query to get some QEs. Then we kill one and run "set log_min_messages=DEBUG1" to let QD reach executormgr_allocate_any_executor(). We find that QD fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-592) QD fails when connects to QE again in executormgr_allocate_any_executor()
[ https://issues.apache.org/jira/browse/HAWQ-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-592: --- Affects Version/s: 2.0.0 > QD fails when connects to QE again in executormgr_allocate_any_executor() > - > > Key: HAWQ-592 > URL: https://issues.apache.org/jira/browse/HAWQ-592 > Project: Apache HAWQ > Issue Type: Bug > Components: Dispatcher >Affects Versions: 2.0.0 >Reporter: Chunling Wang >Assignee: Lei Chang > > We first run a query to get some QEs. Then we kill one and run "set > log_min_messages=DEBUG1" to let QD reach executormgr_allocate_any_executor(). > We find that QD fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HAWQ-564) QD hangs when connecting to resource manager
[ https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15206095#comment-15206095 ] Chunling Wang commented on HAWQ-564: And 'kill -6' causes the same result. > QD hangs when connecting to resource manager > > > Key: HAWQ-564 > URL: https://issues.apache.org/jira/browse/HAWQ-564 > Project: Apache HAWQ > Issue Type: Bug > Components: Resource Manager >Affects Versions: 2.0.0 >Reporter: Chunling Wang >Assignee: Lei Chang > > When we first inject a panic in a QE process, we run a query and the segment goes down. > After the segment is up, we run another query and get the correct answer. Then we > inject the same panic a second time. After the segment is down and then up > again, we run a query and find the QD process hangs when connecting to the resource > manager. Here is the backtrace when QD hangs: > {code} > * thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + > 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10 > frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at > rmcomm_AsyncComm.c:156 > frame #2: 0x000101db85f5 > postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, > sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, > exprecvmsgid=2307, recvsmb=, errorbuf=0x00010230c1a0, > errorbufsize=) + 645 at rmcomm_SyncComm.c:122 > frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] > callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, > sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, > errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780 > frame #4: 0x000101db2d3c > postgres`acquireResourceFromRM(index=, sessionid=12, > slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, > preferred_nodes_size=, max_seg_count_fix=, > min_seg_count_fix=, errorbuf=, > errorbufsize=) + 572 at rmcomm_QD2RM.c:742 > frame #5: 0x000101c979e7 
postgres`AllocateResource(life=QRL_ONCE, > slice_size=5, iobytes=134217728, max_target_segment_num=1, > min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 > at pquery.c:796 > frame #6: 0x000101e8c60f > postgres`calculate_planner_segment_num(query=, > resourceLife=QRL_ONCE, fullRangeTable=, > intoPolicy=, sliceNum=5) + 14287 at cdbdatalocality.c:4207 > frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496 > frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, > cursorOptions=, boundParams=0x, > resourceLife=QRL_ONCE) + 311 at planner.c:310 > frame #9: 0x000101c8eb33 > postgres`pg_plan_query(querytree=0x7f9c1a02a140, > boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837 > frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at > postgres.c:911 > frame #11: 0x000101c95699 > postgres`exec_simple_query(query_string=0x7f9c1a028a30, > seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671 > frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, > argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754 > frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + > 105 at postmaster.c:5889 > frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484 > frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at > postmaster.c:2163 > frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, > argv=) + 5019 at postmaster.c:1454 > frame #17: 0x000101bb1aa9 postgres`main(argc=9, > argv=0x7f9c19c1eef0) + 1433 at main.c:209 > frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1 > thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + > 10 > frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10 > frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + > 2163 at ic_udp.c:6251 > frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131 > frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176 > frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13 
> thread #3: tid = 0x21d9c2, 0x7fff890343f6 > libsystem_kernel.dylib`__select + 10 > frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10 > frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + > 78 at pgsleep.c:43 > frame #2: 0x000101db1a66 > postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at > rmcomm_QD2RM.c:1519 > frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131 > frame #4: 0x7fff95e82279
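The backtrace above shows the QD blocked inside poll() while waiting on the resource manager. A minimal Python sketch of why a bounded wait matters; this is an illustration only, not HAWQ code, and the socket pair and 100 ms timeout are invented for the example:

```python
import select
import socket

# Two connected sockets stand in for the QD <-> resource manager link;
# nothing is ever sent, mimicking a peer that stopped responding.
qd_side, rm_side = socket.socketpair()

# select.poll is available on Unix platforms.
poller = select.poll()
poller.register(qd_side, select.POLLIN)

# poll() with a timeout (in milliseconds) returns an empty event list
# instead of blocking forever, so the caller can retry or report an error.
events = poller.poll(100)
print(events)  # []
```

With no timeout argument, the same poll() call would block indefinitely, which is the shape of the hang reported here.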
[jira] [Created] (HAWQ-572) Improve code coverage for dispatcher: fail_qe_after_connection & fail_qe_when_do_query & fail_qe_when_begin_parquet_scan
Chunling Wang created HAWQ-572: -- Summary: Improve code coverage for dispatcher: fail_qe_after_connection & fail_qe_when_do_query & fail_qe_when_begin_parquet_scan Key: HAWQ-572 URL: https://issues.apache.org/jira/browse/HAWQ-572 Project: Apache HAWQ Issue Type: Sub-task Components: Dispatcher Reporter: Chunling Wang Assignee: Lei Chang Add the following fault injections: 1. fail_qe_after_connection 2. fail_qe_when_do_query 3. fail_qe_when_begin_parquet_scan, and add test cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HAWQ-568) After query finished, kill a QE but can still recv() data from this QE socket
[ https://issues.apache.org/jira/browse/HAWQ-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chunling Wang updated HAWQ-568: --- Summary: After query finished, kill a QE but can still recv() data from this QE socket (was: After query finished, kill a QE but can still recv() from this QE socket) > After query finished, kill a QE but can still recv() data from this QE socket > - > > Key: HAWQ-568 > URL: https://issues.apache.org/jira/browse/HAWQ-568 > Project: Apache HAWQ > Issue Type: Bug > Components: Dispatcher >Affects Versions: 2.0.0 >Reporter: Chunling Wang >Assignee: Lei Chang > > After the query finished, we kill a QE while other QEs remain in the QE pool. When > checking whether the connection to this QE is still alive, we call recv() on this QE > socket, but can still receive data. > 1. Run a query so that some QEs remain. > {code} > dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, > test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id; > count > --- > 3725 > (1 row) > {code} > {code} > $ ps -ef|grep postgres > 501 55701 1 0 5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres > -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 > --silent-mode=true > 501 55702 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, master > logger process > 501 55705 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, stats > collector process > 501 55706 55701 0 5:38下午 ?? 0:00.04 postgres: port 5432, writer > process > 501 55707 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, > checkpoint process > 501 55708 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, > seqserver process > 501 55709 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, WAL > Send Server process > 501 55710 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, DFS > Metadata Cache process > 501 55711 55701 0 5:38下午 ?? 0:00.26 postgres: port 5432, master > resource manager > 501 55727 1 0 5:38下午 ?? 
0:00.52 /usr/local/hawq/bin/postgres > -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 > --silent-mode=true > 501 55728 55727 0 5:38下午 ?? 0:00.06 postgres: port 4, logger > process > 501 55731 55727 0 5:38下午 ?? 0:00.00 postgres: port 4, stats > collector process > 501 55732 55727 0 5:38下午 ?? 0:00.04 postgres: port 4, writer > process > 501 55733 55727 0 5:38下午 ?? 0:00.01 postgres: port 4, > checkpoint process > 501 55734 55727 0 5:38下午 ?? 0:00.09 postgres: port 4, > segment resource manager > 501 55741 55748 0 5:38下午 ?? 0:00.05 postgres: port 5432, > wangchunling dispatch [local] con12 cmd6 idle [local] > 501 55743 55727 0 5:38下午 ?? 0:00.36 postgres: port 4, > wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle > 501 55770 55727 0 5:43下午 ?? 0:00.12 postgres: port 4, > wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle > 501 55771 55727 0 5:44下午 ?? 0:00.11 postgres: port 4, > wangchunling dispatch 127.0.0.1(50855) con12 seg0 idle > 501 55774 26980 0 5:44下午 ttys0080:00.00 grep postgres > {code} > 2. Kill one QE. > {code} > $ kill 55771 > $ ps -ef|grep postgres > 501 55701 1 0 5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres > -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 > --silent-mode=true > 501 55702 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, master > logger process > 501 55705 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, stats > collector process > 501 55706 55701 0 5:38下午 ?? 0:00.04 postgres: port 5432, writer > process > 501 55707 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, > checkpoint process > 501 55708 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, > seqserver process > 501 55709 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, WAL > Send Server process > 501 55710 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, DFS > Metadata Cache process > 501 55711 55701 0 5:38下午 ?? 0:00.27 postgres: port 5432, master > resource manager > 501 55727 1 0 5:38下午 ?? 
0:00.52 /usr/local/hawq/bin/postgres > -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 > --silent-mode=true > 501 55728 55727 0 5:38下午 ?? 0:00.06 postgres: port 4, logger > process > 501 55731 55727 0 5:38下午 ?? 0:00.00 postgres: port 4, stats > collector process > 501 55732 55727 0 5:38下午 ?? 0:00.04 postgres: port 4, writer > process > 501
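The behavior reported above, recv() still returning data from a QE that was already killed, matches ordinary stream-socket semantics: bytes queued in the kernel buffer are delivered before the EOF is seen. A small self-contained Python sketch (illustrative only, not HAWQ code; the socketpair stands in for the QD-to-QE connection):

```python
import socket

# qe_side stands in for the killed QE, qd_side for the QD probing liveness.
qd_side, qe_side = socket.socketpair()

qe_side.sendall(b"queued-bytes")  # data sent before the process dies
qe_side.close()                   # killing the QE closes its socket

# The liveness probe still sees the buffered data first...
print(qd_side.recv(64))  # b'queued-bytes'
# ...and only the next recv() reports EOF (empty bytes = peer gone).
print(qd_side.recv(64))  # b''
```

So a single successful recv() does not prove the QE is alive; only an empty return (EOF) or an error is conclusive.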
[jira] [Created] (HAWQ-568) After query finished, kill a QE but can still recv() from this QE socket
Chunling Wang created HAWQ-568: -- Summary: After query finished, kill a QE but can still recv() from this QE socket Key: HAWQ-568 URL: https://issues.apache.org/jira/browse/HAWQ-568 Project: Apache HAWQ Issue Type: Bug Components: Dispatcher Reporter: Chunling Wang Assignee: Lei Chang After the query finished, we kill a QE while other QEs remain in the QE pool. When checking whether the connection to this QE is still alive, we call recv() on this QE socket, but can still receive data. 1. Run a query so that some QEs remain. {code} dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id; count --- 3725 (1 row) {code} {code} $ ps -ef|grep postgres 501 55701 1 0 5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true 501 55702 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, master logger process 501 55705 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, stats collector process 501 55706 55701 0 5:38下午 ?? 0:00.04 postgres: port 5432, writer process 501 55707 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, checkpoint process 501 55708 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, seqserver process 501 55709 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, WAL Send Server process 501 55710 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process 501 55711 55701 0 5:38下午 ?? 0:00.26 postgres: port 5432, master resource manager 501 55727 1 0 5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true 501 55728 55727 0 5:38下午 ?? 0:00.06 postgres: port 4, logger process 501 55731 55727 0 5:38下午 ?? 0:00.00 postgres: port 4, stats collector process 501 55732 55727 0 5:38下午 ?? 0:00.04 postgres: port 4, writer process 501 55733 55727 0 5:38下午 ?? 0:00.01 postgres: port 4, checkpoint process 501 55734 55727 0 5:38下午 ?? 
0:00.09 postgres: port 4, segment resource manager 501 55741 55748 0 5:38下午 ?? 0:00.05 postgres: port 5432, wangchunling dispatch [local] con12 cmd6 idle [local] 501 55743 55727 0 5:38下午 ?? 0:00.36 postgres: port 4, wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle 501 55770 55727 0 5:43下午 ?? 0:00.12 postgres: port 4, wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle 501 55771 55727 0 5:44下午 ?? 0:00.11 postgres: port 4, wangchunling dispatch 127.0.0.1(50855) con12 seg0 idle 501 55774 26980 0 5:44下午 ttys0080:00.00 grep postgres {code} 2. Kill one QE. {code} $ kill 55771 $ ps -ef|grep postgres 501 55701 1 0 5:38下午 ?? 0:00.38 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true 501 55702 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, master logger process 501 55705 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, stats collector process 501 55706 55701 0 5:38下午 ?? 0:00.04 postgres: port 5432, writer process 501 55707 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, checkpoint process 501 55708 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, seqserver process 501 55709 55701 0 5:38下午 ?? 0:00.01 postgres: port 5432, WAL Send Server process 501 55710 55701 0 5:38下午 ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process 501 55711 55701 0 5:38下午 ?? 0:00.27 postgres: port 5432, master resource manager 501 55727 1 0 5:38下午 ?? 0:00.52 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true 501 55728 55727 0 5:38下午 ?? 0:00.06 postgres: port 4, logger process 501 55731 55727 0 5:38下午 ?? 0:00.00 postgres: port 4, stats collector process 501 55732 55727 0 5:38下午 ?? 0:00.04 postgres: port 4, writer process 501 55733 55727 0 5:38下午 ?? 0:00.01 postgres: port 4, checkpoint process 501 55734 55727 0 5:38下午 ?? 0:00.09 postgres: port 4, segment resource manager 501 55741 55748 0 5:38下午 ?? 
0:00.05 postgres: port 5432, wangchunling dispatch [local] con12 cmd6 idle [local] 501 55743 55727 0 5:38下午 ?? 0:00.36 postgres: port 4, wangchunling dispatch 127.0.0.1(50800) con12 seg0 idle 501 55770 55727 0 5:43下午 ?? 0:00.12 postgres: port 4, wangchunling dispatch 127.0.0.1(50853) con12 seg0 idle 501 55776
[jira] [Updated] (HAWQ-568) After query finished, kill a QE but can still recv() from this QE socket
[ https://issues.apache.org/jira/browse/HAWQ-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang updated HAWQ-568:
---
Affects Version/s: 2.0.0
[jira] [Commented] (HAWQ-564) QD hangs when connecting to resource manager
[ https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203864#comment-15203864 ]

Chunling Wang commented on HAWQ-564:
---
There is another way to trigger this bug, without fault injection.

1. First, run a query to create some QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
 ---
 3725
(1 row)
{code}
{code}
$ ps -ef|grep postgres
501 30190 1 0 2:34PM ?? 0:00.31 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true
501 30191 30190 0 2:34PM ?? 0:00.01 postgres: port 5432, master logger process
501 30194 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, stats collector process
501 30195 30190 0 2:34PM ?? 0:00.01 postgres: port 5432, writer process
501 30196 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, checkpoint process
501 30197 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, seqserver process
501 30198 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, WAL Send Server process
501 30199 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process
501 30200 30190 0 2:34PM ?? 0:00.07 postgres: port 5432, master resource manager
501 30216 1 0 2:34PM ?? 0:00.37 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true
501 30217 30216 0 2:34PM ?? 0:00.02 postgres: port 4, logger process
501 30220 30216 0 2:34PM ?? 0:00.00 postgres: port 4, stats collector process
501 30221 30216 0 2:34PM ?? 0:00.01 postgres: port 4, writer process
501 30222 30216 0 2:34PM ?? 0:00.00 postgres: port 4, checkpoint process
501 30223 30216 0 2:34PM ?? 0:00.03 postgres: port 4, segment resource manager
501 30231 30190 0 2:35PM ?? 0:00.03 postgres: port 5432, wangchunling dispatch [local] con12 cmd6 idle [local]
501 30235 30216 0 2:35PM ?? 0:00.13 postgres: port 4, wangchunling dispatch 127.0.0.1(65051) con12 seg0 idle
501 30239 30216 0 2:35PM ?? 0:00.06 postgres: port 4, wangchunling dispatch 127.0.0.1(65061) con12 seg0 idle
501 30240 30216 0 2:35PM ?? 0:00.06 postgres: port 4, wangchunling dispatch 127.0.0.1(65063) con12 seg0 idle
501 30242 99560 0 2:36PM ttys000 0:00.00 grep postgres
{code}
2. Kill one QE with -9; afterwards no QEs remain.
{code}
$ kill -9 30235
$ ps -ef|grep postgres
501 30190 1 0 2:34PM ?? 0:00.32 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true
501 30191 30190 0 2:34PM ?? 0:00.01 postgres: port 5432, master logger process
501 30194 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, stats collector process
501 30195 30190 0 2:34PM ?? 0:00.01 postgres: port 5432, writer process
501 30196 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, checkpoint process
501 30197 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, seqserver process
501 30198 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, WAL Send Server process
501 30199 30190 0 2:34PM ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process
501 30200 30190 0 2:34PM ?? 0:00.08 postgres: port 5432, master resource manager
501 30216 1 0 2:34PM ?? 0:00.58 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/segmentdd -i -M segment -p 4 --silent-mode=true
501 30217 30216 0 2:34PM ?? 0:00.03 postgres: port 4, logger process
501 30231 30190 0 2:35PM ?? 0:00.04 postgres: port 5432, wangchunling dispatch [local] con12 cmd6 idle [local]
501 30248 30216 0 2:36PM ?? 0:00.00 postgres: port 4, stats collector process
501 30249 30216 0 2:36PM ?? 0:00.00 postgres: port 4, writer process
501 30250 30216 0 2:36PM ?? 0:00.00 postgres: port 4, checkpoint process
501 30251 30216 0 2:36PM ?? 0:00.00 postgres: port 4, segment resource manager
501 30255 99560 0 2:36PM ttys000 0:00.00 grep postgres
{code}
3. Run the query again, which creates new QEs.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
 ---
 3725
(1 row)
{code}
{code}
$ ps -ef|grep postgres
501 30190 1 0 2:34PM ?? 0:00.33 /usr/local/hawq/bin/postgres -D /Users/wangchunling/hawq-data-directory/masterdd -i -M master -p 5432 --silent-mode=true
501 30191 30190 0 2:34PM ?? 0:00.01 postgres: port
[jira] [Updated] (HAWQ-564) QD hangs when connecting to resource manager
[ https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang updated HAWQ-564:
---
Description:
After injecting a panic into a QE process for the first time, we run a query and the segment goes down. After the segment comes back up, we run another query and get the correct answer. Then we inject the same panic a second time. After the segment goes down and comes up again, we run a query and find that the QD process hangs while connecting to the resource manager. Here is the backtrace when the QD hangs:
{code}
* thread #1: tid = 0x21d8be, 0x7fff890355be libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
    frame #1: 0x000101daeafe postgres`processAllCommFileDescs + 158 at rmcomm_AsyncComm.c:156
    frame #2: 0x000101db85f5 postgres`callSyncRPCRemote(hostname=0x7f9c19e00cd0, port=5437, sendbuff=0x7f9c1b918f50, sendbuffsize=80, sendmsgid=259, exprecvmsgid=2307, recvsmb=, errorbuf=0x00010230c1a0, errorbufsize=) + 645 at rmcomm_SyncComm.c:122
    frame #3: 0x000101db2d85 postgres`acquireResourceFromRM [inlined] callSyncRPCToRM(sendbuff=0x7f9c1b918f50, sendbuffsize=, sendmsgid=259, exprecvmsgid=2307, recvsmb=0x7f9c1b918e70, errorbuf=, errorbufsize=1024) + 73 at rmcomm_QD2RM.c:2780
    frame #4: 0x000101db2d3c postgres`acquireResourceFromRM(index=, sessionid=12, slice_size=462524016, iobytes=134217728, preferred_nodes=0x7f9c1a02d398, preferred_nodes_size=, max_seg_count_fix=, min_seg_count_fix=, errorbuf=, errorbufsize=) + 572 at rmcomm_QD2RM.c:742
    frame #5: 0x000101c979e7 postgres`AllocateResource(life=QRL_ONCE, slice_size=5, iobytes=134217728, max_target_segment_num=1, min_target_segment_num=1, vol_info=0x7f9c1a02d398, vol_info_size=1) + 631 at pquery.c:796
    frame #6: 0x000101e8c60f postgres`calculate_planner_segment_num(query=, resourceLife=QRL_ONCE, fullRangeTable=, intoPolicy=, sliceNum=5) + 14287 at cdbdatalocality.c:4207
    frame #7: 0x000101c0f671 postgres`planner + 106 at planner.c:496
    frame #8: 0x000101c0f607 postgres`planner(parse=0x7f9c1a02a140, cursorOptions=, boundParams=0x, resourceLife=QRL_ONCE) + 311 at planner.c:310
    frame #9: 0x000101c8eb33 postgres`pg_plan_query(querytree=0x7f9c1a02a140, boundParams=0x, resource_life=QRL_ONCE) + 99 at postgres.c:837
    frame #10: 0x000101c956ae postgres`exec_simple_query + 21 at postgres.c:911
    frame #11: 0x000101c95699 postgres`exec_simple_query(query_string=0x7f9c1a028a30, seqServerHost=0x, seqServerPort=-1) + 1577 at postgres.c:1671
    frame #12: 0x000101c91a4c postgres`PostgresMain(argc=, argv=, username=0x7f9c1b808cf0) + 9404 at postgres.c:4754
    frame #13: 0x000101c4ae02 postgres`ServerLoop [inlined] BackendRun + 105 at postmaster.c:5889
    frame #14: 0x000101c4ad99 postgres`ServerLoop at postmaster.c:5484
    frame #15: 0x000101c4ad99 postgres`ServerLoop + 9593 at postmaster.c:2163
    frame #16: 0x000101c47d3b postgres`PostmasterMain(argc=, argv=) + 5019 at postmaster.c:1454
    frame #17: 0x000101bb1aa9 postgres`main(argc=9, argv=0x7f9c19c1eef0) + 1433 at main.c:209
    frame #18: 0x7fff95e8c5c9 libdyld.dylib`start + 1
  thread #2: tid = 0x21d8bf, 0x7fff890355be libsystem_kernel.dylib`poll + 10
    frame #0: 0x7fff890355be libsystem_kernel.dylib`poll + 10
    frame #1: 0x000101dfe723 postgres`rxThreadFunc(arg=) + 2163 at ic_udp.c:6251
    frame #2: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
    frame #3: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
    frame #4: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
  thread #3: tid = 0x21d9c2, 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
    frame #0: 0x7fff890343f6 libsystem_kernel.dylib`__select + 10
    frame #1: 0x000101e9d42e postgres`pg_usleep(microsec=) + 78 at pgsleep.c:43
    frame #2: 0x000101db1a66 postgres`generateResourceRefreshHeartBeat(arg=0x7f9c19f02480) + 166 at rmcomm_QD2RM.c:1519
    frame #3: 0x7fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
    frame #4: 0x7fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
    frame #5: 0x7fff95e804b1 libsystem_pthread.dylib`thread_start + 13
{code}
And here are the operations:
1. Before injection, the query returns the correct answer.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
 count
 ---
 3725
(1 row)
{code}
2. Inject the panic; the fault is triggered and the segment goes down.
{code}
dispatch=# select count(*) from test_dispatch as t1, test_dispatch as t2, test_dispatch as t3 where t1.id *2 = t2.id and t1.id < t3.id;
ERROR: fault triggered, fault
[jira] [Updated] (HAWQ-564) QD hangs when connecting to resource manager
[ https://issues.apache.org/jira/browse/HAWQ-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang updated HAWQ-564:
---
Affects Version/s: 2.0.0