[jira] [Created] (HAWQ-1620) Push Down Target List Information To Parquet Scan For Bloomfilter
Lin Wen created HAWQ-1620:

Summary: Push Down Target List Information To Parquet Scan For Bloomfilter
Key: HAWQ-1620
URL: https://issues.apache.org/jira/browse/HAWQ-1620
Project: Apache HAWQ
Issue Type: Bug
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
Fix For: 2.4.0.0-incubating

In function CreateRuntimeFilterState(), only simple Var information is pushed down to the Parquet scan. Target list information (pi_targetlist in structure ProjectionInfo) should be pushed down too.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter
Lin Wen created HAWQ-1616:

Summary: Wrong Result of Hash Join When Enable Bloom filter
Key: HAWQ-1616
URL: https://issues.apache.org/jira/browse/HAWQ-1616
Project: Apache HAWQ
Issue Type: Bug
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang

Hash join returns a wrong result in some cases when the Bloom filter is enabled, e.g. when the join key "l_partkey" is not in the select list:

    select l_quantity, l_partkey, l_extendedprice from part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;

    select l_quantity, l_extendedprice from part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;
[jira] [Created] (HAWQ-1608) Implement Printing Runtime Filter Information For "explain" and "explain analyze"
Lin Wen created HAWQ-1608:

Summary: Implement Printing Runtime Filter Information For "explain" and "explain analyze"
Key: HAWQ-1608
URL: https://issues.apache.org/jira/browse/HAWQ-1608
Project: Apache HAWQ
Issue Type: Sub-task
Components: Planner, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
[jira] [Created] (HAWQ-1607) Implement Applying Bloom filter During Scan outer table
Lin Wen created HAWQ-1607:

Summary: Implement Applying Bloom filter During Scan outer table
Key: HAWQ-1607
URL: https://issues.apache.org/jira/browse/HAWQ-1607
Project: Apache HAWQ
Issue Type: Sub-task
Components: Optimizer, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang

This subtask will implement:
1. Pass the Bloom filter structure down to the outer table scan.
2. Check whether each tuple from the outer table is found in the Bloom filter structure.
[jira] [Created] (HAWQ-1606) Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table
Lin Wen created HAWQ-1606:

Summary: Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table
Key: HAWQ-1606
URL: https://issues.apache.org/jira/browse/HAWQ-1606
Project: Apache HAWQ
Issue Type: Sub-task
Components: Optimizer, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang

This subtask will implement:
1. Decide whether to create a Bloom filter during the query planning phase. If the hash join is suitable for a Bloom filter, some information is added to the hash join plan node.
2. During the query execution phase, create the Bloom filter structure for tuples from the inner table.
[jira] [Created] (HAWQ-1604) Add A New GUC hawq_hashjoin_bloomfilter
Lin Wen created HAWQ-1604:

Summary: Add A New GUC hawq_hashjoin_bloomfilter
Key: HAWQ-1604
URL: https://issues.apache.org/jira/browse/HAWQ-1604
Project: Apache HAWQ
Issue Type: Sub-task
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
Fix For: 2.4.0.0-incubating

1. Add a new GUC, hawq_hashjoin_bloomfilter, to indicate whether to use a Bloom filter for hash join.
2. Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join table; this legacy code has been verified not to improve hash join performance.
[jira] [Created] (HAWQ-1597) Implement Runtime Filter for Hash Join
Lin Wen created HAWQ-1597:

Summary: Implement Runtime Filter for Hash Join
Key: HAWQ-1597
URL: https://issues.apache.org/jira/browse/HAWQ-1597
Project: Apache HAWQ
Issue Type: New Feature
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang

A Bloom filter is a space-efficient probabilistic data structure, invented in 1970, that is used to test whether an element is a member of a set. Nowadays, Bloom filters are widely used in OLAP and other data-intensive applications to filter data quickly. In OLAP systems they are usually implemented for hash join.

The basic idea is as follows. When hash joining two tables, the build phase also builds a Bloom filter for the inner table. This Bloom filter is then pushed down to the scan of the outer table, so that fewer tuples from the outer table are returned to the hash join node and probed against the hash table. This can greatly improve hash join performance if the join is highly selective (few outer tuples match).
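The build-then-push-down idea described in this issue can be sketched in a few lines of Python. This is only an illustration of the general technique, not HAWQ's implementation: the class `BloomFilter`, the function `hash_join`, the filter size, and the SHA-256-based hashing are all assumptions chosen for clarity.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def hash_join(inner_rows, outer_rows, inner_key, outer_key):
    """Join two lists of dicts; outer rows are pre-filtered by a Bloom filter."""
    bloom = BloomFilter()
    hash_table = {}
    # Build phase: fill the hash table and the Bloom filter from the inner table.
    for row in inner_rows:
        hash_table.setdefault(row[inner_key], []).append(row)
        bloom.add(row[inner_key])

    joined, skipped = [], 0
    # Probe phase: the outer "scan" drops rows that the filter rules out,
    # so they never reach the hash table lookup.
    for row in outer_rows:
        if not bloom.might_contain(row[outer_key]):
            skipped += 1
            continue
        for match in hash_table.get(row[outer_key], []):
            joined.append({**match, **row})
    return joined, skipped
```

Because a Bloom filter never gives false negatives, the join result is the same with or without the filter; a false positive only costs one extra hash table lookup. The benefit comes from the rows counted in `skipped`, which is why the technique pays off mainly when few outer tuples have a match.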
[jira] [Created] (HAWQ-1521) Idle QE Processes Can't Quit After An Interval
Lin Wen created HAWQ-1521:

Summary: Idle QE Processes Can't Quit After An Interval
Key: HAWQ-1521
URL: https://issues.apache.org/jira/browse/HAWQ-1521
Project: Apache HAWQ
Issue Type: Bug
Reporter: Lin Wen
Assignee: Radar Lei

After a query finishes, some idle QE processes remain on the segments. These QE processes are expected to quit after a time interval, which is controlled by the GUC gp_vmem_idle_resource_timeout; the default value is 18 seconds. However, this doesn't work as expected. Idle QE processes on the segments persist until the QD process quits.

The reason is that, in postgres.c, the code that enables this timer never gets executed. Function gangsExist() always returns false, since the gang-related structures are all NULL.

    if (IdleSessionGangTimeout > 0 && gangsExist())
        if (!enable_sig_alarm( IdleSessionGangTimeout /* ms */, false))
            elog(FATAL, "could not set timer for client wait timeout");
[jira] [Created] (HAWQ-1469) Don't expose RPS warning messages to command line
Lin Wen created HAWQ-1469:

Summary: Don't expose RPS warning messages to command line
Key: HAWQ-1469
URL: https://issues.apache.org/jira/browse/HAWQ-1469
Project: Apache HAWQ
Issue Type: Sub-task
Components: Security
Reporter: Lin Wen
Assignee: Ed Espino

Exposing the RPS service address to end users is not secure, and we should not expose it.

**Case 1: When master RPS is down, changing to standby RPS**

Current behavior:
```
postgres=# select * from a;
WARNING: ranger plugin service from http://test1:8432/rps is unavailable : Couldn't connect to server, try another http://test5:8432/rps
ERROR: permission denied for relation(s): public.a
```
The warning should be removed.

Expected:
```
postgres=# select * from a;
ERROR: permission denied for relation(s): public.a
```

**Case 2: When both RPS are down, should only print that RPS is unavailable.**

Current behavior:
```
postgres=# select * from a;
WARNING: ranger plugin service from http://test5:8432/rps is unavailable : Couldn't connect to server, try another http://test1:8432/rps
ERROR: ranger plugin service from http://test1:8432/rps is unavailable : Couldn't connect to server. (rangerrest.c:463)
```

Expected:
```
postgres=# select * from a;
ERROR: ranger plugin service is unavailable : Couldn't connect to server. (rangerrest.c:463)
```

The warning message should be printed in the csv log file.
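As a rough illustration of the expected behavior in Case 2, the sketch below strips the RPS address from a message before it is shown to the end user, while the full text remains available for the log file. It is an assumption-laden example, not HAWQ code: the function name `user_facing_message` and the regular expression are hypothetical and only match the " from <url>" fragment that appears in the messages quoted in this issue.

```python
import re

# Hypothetical pattern: matches the " from http://host:port/rps" fragment
# seen in the messages quoted in this issue.
_RPS_ADDRESS = re.compile(r" from https?://\S+")


def user_facing_message(internal_message):
    """Return the message with the RPS address removed."""
    return _RPS_ADDRESS.sub("", internal_message)


internal = ("ranger plugin service from http://test1:8432/rps is unavailable : "
            "Couldn't connect to server.")
# The full message would go to the log file; only the redacted one to the client.
redacted = user_facing_message(internal)
```

Here `redacted` carries no host name or port, matching the "Expected" output of Case 2.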
[jira] [Created] (HAWQ-1325) Allow queries related to pg_temp if ranger is enable
Lin Wen created HAWQ-1325:

Summary: Allow queries related to pg_temp if ranger is enable
Key: HAWQ-1325
URL: https://issues.apache.org/jira/browse/HAWQ-1325
Project: Apache HAWQ
Issue Type: Sub-task
Reporter: Lin Wen
Assignee: Ed Espino
Fix For: 2.2.0.0-incubating

Queries on temporary objects send a request to RPS asking for privileges on schema "pg_temp_XXX", like this:

    ./hawq-2017-02-13_142852.csv:2017-02-13 14:29:29.718445 CST,"linw","postgres",p71787,th-1324481600,"[local]",,2017-02-13 14:29:01 CST, 8477,con13,cmd3,seg-1,,,x8477,sx1,"DEBUG3","0","send json request to ranger : { ""requestId"": ""3"", ""user"": ""linw"", ""clientIp"": ""127.0.0.1"", ""context"": ""select * from temp1;"", ""access"": [ { ""resource"": { ""database"": ""postgres"", ""schema"": ""pg_temp_13"", ""table"": ""temp1"" }, ""privileges"": [ ""select"" ] } ] }",,"select * from temp1;",0,,"rangerrest.c",454,

For better control, privilege checks on pg_temp_XX schemas and the objects in them should fall back to the catalog, without sending requests to RPS.
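The proposed fallback amounts to a routing decision on the schema name. A minimal sketch, assuming temporary schemas are always named "pg_temp_" followed by digits as in the log record above; the function name `choose_privilege_checker` and the return values "catalog" and "ranger" are hypothetical labels, not HAWQ identifiers.

```python
import re

# Assumption: temporary schemas are named "pg_temp_<number>", as in the log above.
_TEMP_SCHEMA = re.compile(r"^pg_temp_\d+$")


def choose_privilege_checker(schema):
    """Return "catalog" for temporary schemas and "ranger" for everything else."""
    return "catalog" if _TEMP_SCHEMA.match(schema) else "ranger"
```

With this rule, a query on `pg_temp_13.temp1` would be checked locally, while a query on `public.a` would still be sent to RPS.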
[jira] [Created] (HAWQ-1318) Can't start/stop master succesfully if ranger is enable and with a wrong RPS address
Lin Wen created HAWQ-1318:

Summary: Can't start/stop master succesfully if ranger is enable and with a wrong RPS address
Key: HAWQ-1318
URL: https://issues.apache.org/jira/browse/HAWQ-1318
Project: Apache HAWQ
Issue Type: Bug
Components: Security
Reporter: Lin Wen
Assignee: Ed Espino

If Ranger is enabled and configured with a wrong RPS address, hawq can start but can't start/stop successfully.

    Lins-MacBook-Pro:apache-hawq linw$ hawq stop cluster -a
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Prepare to do 'hawq stop'
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-You can find log in:
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawqAdminLogs/hawq_stop_20170209.log
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-GPHOME is set to:
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawq-bin
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq with args: ['stop', 'cluster']
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-No standby host configured
    20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq cluster
    20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Failed to connect to the running database, please check master status
    20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Or you can check hawq stop --help for other stop options

    501 43719 1 0 10:20AM ?? 0:00.58 /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/masterdd -i -M master -p 5432 --silent-mode=true
    501 43721 43719 0 10:20AM ?? 0:00.02 postgres: port 5432, master logger process
    501 43724 43719 0 10:20AM ?? 0:00.01 postgres: port 5432, stats collector process
    501 43725 43719 0 10:20AM ?? 0:00.07 postgres: port 5432, writer process
    501 43726 43719 0 10:20AM ?? 0:00.01 postgres: port 5432, checkpoint process
    501 43727 43719 0 10:20AM ?? 0:00.01 postgres: port 5432, seqserver process
    501 43728 43719 0 10:20AM ?? 0:00.01 postgres: port 5432, WAL Send Server process
    501 43729 43719 0 10:20AM ?? 0:00.00 postgres: port 5432, DFS Metadata Cache process
    501 43743 1 0 10:20AM ?? 0:00.79 /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/segmentdd -i -M segment -p 4 --silent-mode=true
    501 43744 43743 0 10:20AM ?? 0:00.02 postgres: port 4, logger process
    501 43747 43743 0 10:20AM ?? 0:00.01 postgres: port 4, stats collector process
    501 43748 43743 0 10:20AM ?? 0:00.07 postgres: port 4, writer process
    501 43749 43743 0 10:20AM ?? 0:00.01 postgres: port 4, checkpoint process
    501 43750 43743 0 10:20AM ?? 0:00.16 postgres: port 4, segment resource manager
    501 43830 43719 0 10:22AM ?? 0:00.01 postgres: port 5432, master resource manager
    501 43867 43719 0 10:24AM ?? 0:00.01 postgres: port 5432, linw template1 [local] cmd1 SELECT [local]
[jira] [Created] (HAWQ-1312) Forbid grant/revoke command in HAWQ side once Ranger is configured.
Lin Wen created HAWQ-1312:

Summary: Forbid grant/revoke command in HAWQ side once Ranger is configured.
Key: HAWQ-1312
URL: https://issues.apache.org/jira/browse/HAWQ-1312
Project: Apache HAWQ
Issue Type: Sub-task
Reporter: Lin Wen
Assignee: Ed Espino

When the Ranger check is enabled, GRANT and REVOKE commands should not be allowed to run. This work is expected to be done in the Ranger admin portal instead.
[jira] [Created] (HAWQ-1127) HAWQ should print error message instead of python function stack when yaml file is invalid.
Lin Wen created HAWQ-1127:

Summary: HAWQ should print error message instead of python function stack when yaml file is invalid.
Key: HAWQ-1127
URL: https://issues.apache.org/jira/browse/HAWQ-1127
Project: Apache HAWQ
Issue Type: Bug
Components: Command Line Tools
Reporter: Lin Wen
Assignee: Lei Chang

When an invalid yaml file is used to register, hawq prints a Python stack trace:

    [linw@linw-rhel feature]$ hawq register --force -d hawq_feature_test -c /home/linw/workspace/hawq_working/apache-hawq/src/test/feature/ManagementTool/partition/force_mode_normal.yml testhawqregister_testpartitionforcemodenormal.nt
    20161031:12:48:49:557022 hawqregister:linw-rhel:linw-[INFO]:-try to connect database localhost:5432 hawq_feature_test
    Traceback (most recent call last):
      File "/home/linw/hawq-bin/bin/hawqregister", line 1137, in <module>
        main(options, args)
      File "/home/linw/hawq-bin/bin/hawqregister", line 1093, in main
        ins.prepare()
      File "/home/linw/hawq-bin/bin/hawqregister", line 1021, in prepare
        self._option_parser_yml(options.yml_config)
      File "/home/linw/hawq-bin/bin/hawqregister", line 475, in _option_parser_yml
        partitions_constraint = [d['Constraint'] for d in params[Format_FileLocations]['Partitions']]
    KeyError: 'Constraint'

Instead, hawq should print an error message.
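One way to get an error message instead of a traceback is to validate the parsed yaml data before indexing into it. The sketch below shows the general pattern only; it is not the hawqregister fix, and the names `partition_constraints` and `run_register_check` as well as the message wording are hypothetical. It operates on an already-parsed list of partition dictionaries so that it needs no yaml library.

```python
import sys


def partition_constraints(partitions):
    """Collect each partition's 'Constraint'; raise ValueError if one is missing."""
    constraints = []
    for index, partition in enumerate(partitions):
        if "Constraint" not in partition:
            raise ValueError(
                f"partition {index} in yaml file has no 'Constraint' entry")
        constraints.append(partition["Constraint"])
    return constraints


def run_register_check(partitions):
    """Return 0 on success, or print one error line and return 1."""
    try:
        partition_constraints(partitions)
    except ValueError as err:
        # Report the problem as a single readable line, not a stack trace.
        print(f"[ERROR]: invalid yaml configuration: {err}", file=sys.stderr)
        return 1
    return 0
```

Calling `run_register_check([{"Name": "p1"}])` prints a single error line to stderr and returns 1, which a command-line tool could use as its exit status.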
[jira] [Created] (HAWQ-1112) Error message is not accurate when hawq register with single file and the size is larger than real size
Lin Wen created HAWQ-1112:

Summary: Error message is not accurate when hawq register with single file and the size is larger than real size
Key: HAWQ-1112
URL: https://issues.apache.org/jira/browse/HAWQ-1112
Project: Apache HAWQ
Issue Type: Bug
Components: Command Line Tools
Reporter: Lin Wen
Assignee: Lei Chang

The error message is not accurate when hawq register is run with a single file and the given size is larger than the real size.

    20161017:10:13:59:259787 hawqregister:linw-rhel:linw-[ERROR]:-File size(658) in yaml configuration file should not exceed actual length(657) of file hdfs://localhost:9000/hawq_register_hawq.paq

The phrase "in yaml configuration file" is not accurate when registering a single file.
[jira] [Created] (HAWQ-979) Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
Lin Wen created HAWQ-979:

Summary: Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
Key: HAWQ-979
URL: https://issues.apache.org/jira/browse/HAWQ-979
Project: Apache HAWQ
Issue Type: Bug
Components: Resource Manager
Reporter: Lin Wen
Assignee: Lei Chang

While HAWQ is running in YARN mode, the heartbeat thread of libyarn may sometimes fail (e.g. when the YARN RM restarts) and quit:

    2016-08-03 18:45:27.913838 PDT,,,p34645,th-12906104000,con4,,seg-1,"WARNING","01000","YARN mode resource broker failed to get YARN queue report of queue default. LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM heartbeat thread has stopped.",,,0,,"resourcebroker_LIBYARN_proc.c",1840,

The resource broker process should re-register HAWQ to YARN in this case, but it does not. The reason is that, in function handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport() fails, function sendRBGetClusterReportErrorData() is called, but sendRBGetClusterReportErrorData() returns OK (it should return RESBROK_ERROR_GRM).
[jira] [Created] (HAWQ-970) Provide More Accurate Information When LibYARN Meets an Exception
Lin Wen created HAWQ-970:

Summary: Provide More Accurate Information When LibYARN Meets an Exception
Key: HAWQ-970
URL: https://issues.apache.org/jira/browse/HAWQ-970
Project: Apache HAWQ
Issue Type: Bug
Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang

Sometimes when an exception happens in libyarn, the log information is not accurate enough. For example, the exception below is related to Kerberos ticket expiration, but that cannot be determined from the log.

    2016-07-06 01:47:51.945902 BST,,,p182375,th1403270400,con4,,seg-1,"WARNING","01000","YARN mode resource broker failed to get container report. LibYarnClient::getContainerReports, Catch the Exception:YarnIOException: Unexpected exception: when calling ApplicationClientProtocol::getContainers in /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnserver/ApplicationClientProtocol.cpp: 195",,,0,,"resourcebroker_LIBYARN_proc.c",1748,
[jira] [Created] (HAWQ-966) Adjust Libyarn Output Log
Lin Wen created HAWQ-966:

Summary: Adjust Libyarn Output Log
Key: HAWQ-966
URL: https://issues.apache.org/jira/browse/HAWQ-966
Project: Apache HAWQ
Issue Type: Bug
Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang

While HAWQ is running, libyarn generates a lot of logs. Some of them are useless or redundant for HAWQ users and should be condensed or removed, so that more meaningful log messages are provided to HAWQ users.
[jira] [Created] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations
Lin Wen created HAWQ-940:

Summary: Kerberos Ticket Expired for LibYARN Operations
Key: HAWQ-940
URL: https://issues.apache.org/jira/browse/HAWQ-940
Project: Apache HAWQ
Issue Type: Bug
Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang

HAWQ's libhdfs3 and libyarn use the same Kerberos keyfile. Whenever an HDFS operation is triggered, a function named login() is called, and in login() the ticket is initialized by "kinit". For libyarn, however, login() is only called when the resource broker process starts. So if HAWQ starts up and there is no query for a long period (24 hours in the Kerberos configuration file, krb5.conf), the ticket expires and HAWQ fails to register itself in Hadoop YARN.
[jira] [Created] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes
Lin Wen created HAWQ-918:

Summary: Fix memtuple forming bug when null-saved size is larger than 32763 bytes
Key: HAWQ-918
URL: https://issues.apache.org/jira/browse/HAWQ-918
Project: Apache HAWQ
Issue Type: Bug
Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang

When running a SQL script, an error occurs in a QE:

    psql:run.sql:24: ERROR: Query Executor Error in seg2 localhost:4 pid=55810: server closed the connection unexpectedly
    DETAIL: This probably means the server terminated abnormally before or while processing the request.

    2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,0"
    1  0x8b764e postgres + 0x8b764e
    2  0x3b66e0f710 libpthread.so.0 + 0x66e0f710
    3  0x3b6668995b libc.so.6 memcpy + 0x2eb
    4  0x8940d8 postgres textout + 0x58
    5  0x8c32d7 postgres DirectFunctionCall1 + 0x47
    6  0x88a126 postgres text_timestamp + 0xc6
    7  0x669b47 postgres + 0x669b47
    8  0x669fe9 postgres + 0x669fe9
    9  0x66f54e postgres ExecProject + 0x23e
    10 0x680695 postgres ExecAgg + 0x525
    11 0x6643b1 postgres ExecProcNode + 0x221
    12 0x689da8 postgres ExecLimit + 0x218
    13 0x664521 postgres ExecProcNode + 0x391
    14 0x68d549 postgres ExecMotion + 0x39
    15 0x6643c1 postgres ExecProcNode + 0x231
    16 0x660752 postgres + 0x660752
    17 0x6610ea postgres ExecutorRun + 0x4ca
    18 0x7e4c3a postgres PortalRun + 0x58a
    19 0x7dab64 postgres + 0x7dab64
    20 0x7dfaf5 postgres PostgresMain + 0x2b65
    21 0x790e7f postgres + 0x790e7f
    22 0x793b39 postgres PostmasterMain + 0x759
    23 0x4a19cf postgres main + 0x50f
    24 0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
    25 0x4a1a4d postgres + 0x4a1a4d
[jira] [Created] (HAWQ-899) Add feature test for nested null case with new test framework
Lin Wen created HAWQ-899:

Summary: Add feature test for nested null case with new test framework
Key: HAWQ-899
URL: https://issues.apache.org/jira/browse/HAWQ-899
Project: Apache HAWQ
Issue Type: Sub-task
Components: Tests
Reporter: Lin Wen
Assignee: Jiali Yao
[jira] [Created] (HAWQ-898) Add feature test for COPY with new test framework
Lin Wen created HAWQ-898:

Summary: Add feature test for COPY with new test framework
Key: HAWQ-898
URL: https://issues.apache.org/jira/browse/HAWQ-898
Project: Apache HAWQ
Issue Type: Sub-task
Components: Tests
Reporter: Lin Wen
Assignee: Jiali Yao
[jira] [Created] (HAWQ-897) Add feature test for create table distribution with new test framework
Lin Wen created HAWQ-897:

Summary: Add feature test for create table distribution with new test framework
Key: HAWQ-897
URL: https://issues.apache.org/jira/browse/HAWQ-897
Project: Apache HAWQ
Issue Type: Sub-task
Components: Tests
Reporter: Lin Wen
Assignee: Jiali Yao
[jira] [Created] (HAWQ-896) Add feature test for create table with new test framework
Lin Wen created HAWQ-896:

Summary: Add feature test for create table with new test framework
Key: HAWQ-896
URL: https://issues.apache.org/jira/browse/HAWQ-896
Project: Apache HAWQ
Issue Type: Sub-task
Components: Tests
Reporter: Lin Wen
Assignee: Jiali Yao
[jira] [Created] (HAWQ-894) Add feature test for polymorphism with new test framework
Lin Wen created HAWQ-894:

Summary: Add feature test for polymorphism with new test framework
Key: HAWQ-894
URL: https://issues.apache.org/jira/browse/HAWQ-894
Project: Apache HAWQ
Issue Type: Sub-task
Components: Tests
Reporter: Lin Wen
Assignee: Jiali Yao
[jira] [Created] (HAWQ-891) Refine libyarn source codes
Lin Wen created HAWQ-891:

Summary: Refine libyarn source codes
Key: HAWQ-891
URL: https://issues.apache.org/jira/browse/HAWQ-891
Project: Apache HAWQ
Issue Type: Bug
Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang

Some parts of the libyarn source code need to be refined.
1. Some exception definitions in exception.h are unused, e.g. FileAlreadyExistsException.
2. Some interfaces need to be changed, e.g. the constructor of class LibYarnNodeInfo should use C++ string instead of char*.
3. Fix misspellings, e.g. "contaier" should be "container".
[jira] [Created] (HAWQ-853) Master standby should avoid incomplete split operation
Lin Wen created HAWQ-853:

Summary: Master standby should avoid incomplete split operation
Key: HAWQ-853
URL: https://issues.apache.org/jira/browse/HAWQ-853
Project: Apache HAWQ
Issue Type: Bug
Components: Standby master
Reporter: Lin Wen
Assignee: Lei Chang

The master standby replays xlog records within a range of checkpoints, similar to crash recovery. During this replay, it may encounter an incomplete multi-step operation at the end of the range, and completing that operation generates xlog. But the master standby should not generate xlog, as it does not have the WAL subsystem active.
[jira] [Created] (HAWQ-839) Libyarn coredump when failover to standby RM
Lin Wen created HAWQ-839:

Summary: Libyarn coredump when failover to standby RM
Key: HAWQ-839
URL: https://issues.apache.org/jira/browse/HAWQ-839
Project: Apache HAWQ
Issue Type: Bug
Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang

Start hawq in YARN mode and kill the Hadoop YARN resource manager; a coredump happens. The stack is below:

    #0 0x003e054325e5 in raise () from /lib64/libc.so.6
    #1 0x003e05433dc5 in abort () from /lib64/libc.so.6
    #2 0x7f04980b1109 in libyarn::HandleYarnFailoverException (e=...) at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:170
    #3 0x7f04980b3211 in libyarn::ApplicationClient::getNewApplication (this=0x1f17cd0) at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:215
    #4 0x7f049809d639 in libyarn::LibYarnClient::createJob (this=0x1f1e500, jobName="hawq", queue="default", jobId="") at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClient.cpp:163
    #5 0x7f04980987b8 in createJob (client=0x1f25950, jobName=Unhandled dwarf expression opcode 0xf3 ) at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClientC.cpp:61
    #6 createJob (client=0x1f25950, jobName=Unhandled dwarf expression opcode 0xf3 ) at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClientC.cpp:180
    #7 0x008e1117 in RB2YARN_registerYARNApplication ()
    #8 0x008e31ad in RB2YARN_initializeConnection ()
    #9 0x008e358b in ResBrokerMainInternal ()
    #10 0x008e38e8 in ResBrokerMain ()
    #11 0x008dfb66 in RB_LIBYARN_start ()
    #12 0x0090ae5e in MainHandlerLoop ()
    #13 0x0090b46a in ResManagerMainServer2ndPhase ()
    #14 0x0090ba14 in ResManagerMain ()
    #15 0x0090bd71 in ResManagerProcessStartup ()
    #16 0x00767f98 in CommenceNormalOperations ()
    #17 0x00768d44 in do_reaper ()
    #18 0x0076dbed in ServerLoop ()
    #19 0x0076f73e in PostmasterMain ()
    #20 0x006c828a in main ()