[jira] [Commented] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-06-04 Thread Lin Wen (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16499944#comment-16499944
 ] 

Lin Wen commented on HAWQ-1597:
---

The Bloom filter for local hash join (no cross-slice motion on the outer table) has 
been finished. By default, this GUC is disabled; to use this feature, set 
hawq_hashjoin_bloomfilter to true.

I have run TPCH benchmark testing on a HAWQ cluster (1 master node, 1 standby 
node, 3 segment nodes) with 100G of data. The results are below:
 !111BA854-7318-46A7-8338-5F2993D60FA3.png!

My summary is:
 1. Several queries could benefit from the runtime filter because they join 
multiple tables: Q4, Q8, Q12, Q17, and Q18. However, only Q8 is improved (from 
19 seconds to 13 seconds).
 2. The hash joins in Q4, Q12, and Q18 cross a slice on the outer table, and 
the Bloom filter can't currently be pushed down across slices, so these queries 
are not improved. This is expected.
 3. Q17 has two forms of SQL. The original one is:
 select
     sum(l_extendedprice) / 7.0 as avg_yearly
 from
     lineitem,
     part
 where
     p_partkey = l_partkey
     and p_brand = 'Brand#54'
     and p_container = 'JUMBO CASE'
     and l_quantity < (
         select
             0.2 * avg(l_quantity)
         from
             lineitem
         where
             l_partkey = p_partkey
     );

Another one is:
 with q17_part as (
     select p_partkey from part
     where p_brand = 'Brand#23'
       and p_container = 'MED BOX'
 ),
 q17_avg as (
     select l_partkey as t_partkey, 0.2 * avg(l_quantity) as t_avg_quantity
     from lineitem
     where l_partkey IN (select p_partkey from q17_part)
     group by l_partkey
 ),
 q17_price as (
     select
         l_quantity,
         l_partkey,
         l_extendedprice
     from lineitem
     where l_partkey IN (select p_partkey from q17_part)
 )
 select cast(sum(l_extendedprice) / 7.0 as decimal(32,2)) as avg_yearly
 from q17_avg, q17_price
 where t_partkey = l_partkey
   and l_quantity < t_avg_quantity;

The query plan of the latter SQL is:
 !q17_modified_hawq.gif!

Since the hash join's left subtree is a parquet scan (no motion between slices), 
the Bloom filter can be used, and it improves this SQL a lot: from 16 seconds to 
8.3 seconds (Q17* in the first graph), a speedup of about 1.93X.

If we choose a simpler hash join query instead of a TPCH query, such as: select 
count(*) from part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' 
and p_container = 'MED BOX'; then with the Bloom filter enabled, the speedup can 
be more than 2X.
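The mechanism behind these speedups can be sketched as follows. This is an illustrative Python sketch, not HAWQ's C implementation; the table data, filter size m, and hash count k are invented for the example:

```python
# Sketch: a Bloom filter built on the inner (build) side of a hash join
# prunes outer-side tuples before they reach the join probe.

def bloom_build(keys, m=1024, k=3):
    """Build a simple m-bit Bloom filter over the inner join keys."""
    bits = 0
    for key in keys:
        for seed in range(k):
            bits |= 1 << (hash((seed, key)) % m)
    return bits

def bloom_maybe_contains(bits, key, m=1024, k=3):
    """False => key is definitely absent; True => key is possibly present."""
    return all((bits >> (hash((seed, key)) % m)) & 1 for seed in range(k))

# Build phase: hash table plus Bloom filter over the (small) inner table.
inner = {1: "Brand#23", 7: "Brand#54"}   # p_partkey -> p_brand (toy data)
bf = bloom_build(inner.keys())

# Probe phase: outer tuples that fail the Bloom test never reach the join,
# so the join node sees fewer tuples when selectivity is high.
outer = [(1, 3.00), (2, 6.00), (7, 38.00), (9, 49.00)]  # (l_partkey, l_quantity)
survivors = [t for t in outer if bloom_maybe_contains(bf, t[0])]
joined = [(k, q, inner[k]) for (k, q) in survivors if k in inner]
```

Inner keys always survive the filter (a Bloom filter has no false negatives), while most non-matching outer keys are rejected early; occasional false positives are still removed by the exact join probe.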

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime 
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, build 
> a Bloom filter for the inner table during the build phase, then push it down 
> to the scan of the outer table, so that fewer tuples from the outer table 
> are returned to the hash join node and joined against the hash table. This 
> can greatly improve hash join performance if the selectivity is high.
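The membership-test behavior described above can be sketched with a minimal Bloom filter. This is an illustrative Python sketch (not HAWQ's C implementation); the parameters m=256 bits and k=4 hash functions are arbitrary:

```python
class BloomFilter:
    """Minimal Bloom filter: k hash positions over an m-slot bit array."""

    def __init__(self, m=256, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive k positions by hashing (seed, item) pairs.
        return [hash((seed, item)) % self.m for seed in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        # False => definitely absent; True => possibly present.
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
for key in range(100):
    bf.add(key)
```

Every inserted key is reported present (no false negatives), while a key that was never inserted may still test positive with some probability; that false-positive rate is the price paid for the compact, fixed-size representation.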



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-06-04 Thread Lin Wen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1597:
--
Attachment: q17_modified_hawq.gif

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime 
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, build 
> a Bloom filter for the inner table during the build phase, then push it down 
> to the scan of the outer table, so that fewer tuples from the outer table 
> are returned to the hash join node and joined against the hash table. This 
> can greatly improve hash join performance if the selectivity is high.





[jira] [Updated] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-06-04 Thread Lin Wen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1597:
--
Attachment: 111BA854-7318-46A7-8338-5F2993D60FA3.png

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime 
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, build 
> a Bloom filter for the inner table during the build phase, then push it down 
> to the scan of the outer table, so that fewer tuples from the outer table 
> are returned to the hash join node and joined against the hash table. This 
> can greatly improve hash join performance if the selectivity is high.





[jira] [Resolved] (HAWQ-1620) Push Down Target List Information To Parquet Scan For Bloomfilter

2018-05-31 Thread Lin Wen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1620.
---
Resolution: Fixed

> Push Down Target List Information To Parquet Scan For Bloomfilter
> -
>
> Key: HAWQ-1620
> URL: https://issues.apache.org/jira/browse/HAWQ-1620
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> In function CreateRuntimeFilterState(), only simple Var information is pushed 
> down to the parquet scan; the target list information (pi_targetlist in the 
> ProjectionInfo structure) should be pushed down too.





[jira] [Assigned] (HAWQ-1620) Push Down Target List Information To Parquet Scan For Bloomfilter

2018-05-31 Thread Lin Wen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1620:
-

Assignee: Lin Wen  (was: Lei Chang)

> Push Down Target List Information To Parquet Scan For Bloomfilter
> -
>
> Key: HAWQ-1620
> URL: https://issues.apache.org/jira/browse/HAWQ-1620
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> In function CreateRuntimeFilterState(), only simple Var information is pushed 
> down to the parquet scan; the target list information (pi_targetlist in the 
> ProjectionInfo structure) should be pushed down too.





[jira] [Created] (HAWQ-1620) Push Down Target List Information To Parquet Scan For Bloomfilter

2018-05-30 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1620:
-

 Summary: Push Down Target List Information To Parquet Scan For 
Bloomfilter
 Key: HAWQ-1620
 URL: https://issues.apache.org/jira/browse/HAWQ-1620
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
 Fix For: 2.4.0.0-incubating


In function CreateRuntimeFilterState(), only simple Var information is pushed 
down to the parquet scan; the target list information (pi_targetlist in the 
ProjectionInfo structure) should be pushed down too.





[jira] [Resolved] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter

2018-05-27 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1616.
---
   Resolution: Fixed
Fix Version/s: 2.4.0.0-incubating

> Wrong Result of Hash Join When Enable Bloom filter
> --
>
> Key: HAWQ-1616
> URL: https://issues.apache.org/jira/browse/HAWQ-1616
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Hash join returns a wrong result when the Bloom filter is enabled in some 
> cases, e.g. when the join key "l_partkey" is not in the select list:
> select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
> p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
> limit 10; 
>  select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
> l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;
> The SQL statement and data are from TPCH workload, the correct result should 
> be:
>  l_quantity | l_extendedprice
> ------------+-----------------
>        3.00 |         5399.55
>        6.00 |         8318.58
>       38.00 |        57927.20
>       49.00 |        90545.63
>       44.00 |        76197.88
>       10.00 |        17146.20
>       26.00 |        34376.94
>       35.00 |        56332.85
>        9.00 |        11999.88
>       14.00 |        24020.92
> (10 rows)
> The projection information hasn't been pushed down to the parquet scan 
> correctly, so the current result is empty.





[jira] [Updated] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter

2018-05-24 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1616:
--
Description: 
Hash join returns a wrong result when the Bloom filter is enabled in some 
cases, e.g. when the join key "l_partkey" is not in the select list:

select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
limit 10; 

 select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;

The SQL statement and data are from TPCH workload, the correct result should be:
 l_quantity | l_extendedprice
------------+-----------------
       3.00 |         5399.55
       6.00 |         8318.58
      38.00 |        57927.20
      49.00 |        90545.63
      44.00 |        76197.88
      10.00 |        17146.20
      26.00 |        34376.94
      35.00 |        56332.85
       9.00 |        11999.88
      14.00 |        24020.92
(10 rows)

The projection information hasn't been pushed down to the parquet scan 
correctly, so the current result is empty.

  was:
Wrong result of Hash Join when enable Bloom filter in some cases, e.g join key 
"l_partkey" is not in select list:

select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
limit 10; 

 select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;


> Wrong Result of Hash Join When Enable Bloom filter
> --
>
> Key: HAWQ-1616
> URL: https://issues.apache.org/jira/browse/HAWQ-1616
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>
> Hash join returns a wrong result when the Bloom filter is enabled in some 
> cases, e.g. when the join key "l_partkey" is not in the select list:
> select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
> p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
> limit 10; 
>  select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
> l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;
> The SQL statement and data are from TPCH workload, the correct result should 
> be:
>  l_quantity | l_extendedprice
> ------------+-----------------
>        3.00 |         5399.55
>        6.00 |         8318.58
>       38.00 |        57927.20
>       49.00 |        90545.63
>       44.00 |        76197.88
>       10.00 |        17146.20
>       26.00 |        34376.94
>       35.00 |        56332.85
>        9.00 |        11999.88
>       14.00 |        24020.92
> (10 rows)
> The projection information hasn't been pushed down to the parquet scan 
> correctly, so the current result is empty.





[jira] [Assigned] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter

2018-05-24 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1616:
-

Assignee: Lin Wen  (was: Lei Chang)

> Wrong Result of Hash Join When Enable Bloom filter
> --
>
> Key: HAWQ-1616
> URL: https://issues.apache.org/jira/browse/HAWQ-1616
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>
> Hash join returns a wrong result when the Bloom filter is enabled in some 
> cases, e.g. when the join key "l_partkey" is not in the select list:
> select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
> p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
> limit 10; 
>  select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
> l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;





[jira] [Created] (HAWQ-1616) Wrong Result of Hash Join When Enable Bloom filter

2018-05-24 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1616:
-

 Summary: Wrong Result of Hash Join When Enable Bloom filter
 Key: HAWQ-1616
 URL: https://issues.apache.org/jira/browse/HAWQ-1616
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang


Hash join returns a wrong result when the Bloom filter is enabled in some 
cases, e.g. when the join key "l_partkey" is not in the select list:

select  l_quantity, l_partkey,  l_extendedprice  from part, lineitem where 
p_partkey = l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' 
limit 10; 

 select  l_quantity,  l_extendedprice  from part, lineitem where p_partkey = 
l_partkey and p_brand = 'Brand#23' and p_container = 'MED BOX' limit 10;





[jira] [Resolved] (HAWQ-1615) Access Invalid Memory When Run a Hash-join query with Bloomfilter Enable.

2018-05-24 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1615.
---
Resolution: Fixed

Fixed. 

> Access Invalid Memory When Run a Hash-join query with Bloomfilter Enable.
> -
>
> Key: HAWQ-1615
> URL: https://issues.apache.org/jira/browse/HAWQ-1615
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> How to reproduce:
> 1. set hawq_hashjoin_bloomfilter=true;
> 2. run a query on a TPCH database (1 master and 3 segments): select count(*) 
> from part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' and 
> p_container = 'MED BOX';
> The QE process accessed invalid memory.
> 2018-05-16 06:27:40.010566 GMT,,,p567902,th0,,,2018-05-16 06:26:43 
> GMT,0,con21,cmd5,seg1,slice2,"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0
> 1    0x969adb postgres  + 0x969adb
> 2    0x969ce4 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3    0x88540b postgres CdbProgramErrorHandler + 0xf1
> 4    0x7ff663359370 libpthread.so.0  + 0x63359370
> 5    0x7221ff postgres ExecEndTableScan + 0x2f
> 6    0x6e5864 postgres ExecEndNode + 0x2b1
> 7    0x70cffb postgres ExecEndHashJoin + 0xdb
> 8    0x6e5963 postgres ExecEndNode + 0x3b0
> 9    0x704455 postgres ExecEndAgg + 0x104
> 10   0x6e59a7 postgres ExecEndNode + 0x3f4
> 11   0x71613d postgres ExecEndMotion + 0x87
> 12   0x6e5a0d postgres ExecEndNode + 0x45a
> 13   0x704455 postgres ExecEndAgg + 0x104
> 14   0x6e59a7 postgres ExecEndNode + 0x3f4
> 15   0x6de6a5 postgres ExecEndPlan + 0x4b
> 16   0x6dc139 postgres ExecutorEnd + 0x2fc





[jira] [Assigned] (HAWQ-1615) Access Invalid Memory When Run a Hash-join query with Bloomfilter Enable.

2018-05-22 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1615:
-

Assignee: Lin Wen  (was: Lei Chang)

> Access Invalid Memory When Run a Hash-join query with Bloomfilter Enable.
> -
>
> Key: HAWQ-1615
> URL: https://issues.apache.org/jira/browse/HAWQ-1615
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> How to reproduce:
> 1. set hawq_hashjoin_bloomfilter=true;
> 2. run a query on a TPCH database (1 master and 3 segments): select count(*) 
> from part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' and 
> p_container = 'MED BOX';
> The QE process accessed invalid memory.
> 2018-05-16 06:27:40.010566 GMT,,,p567902,th0,,,2018-05-16 06:26:43 
> GMT,0,con21,cmd5,seg1,slice2,"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0
> 1    0x969adb postgres  + 0x969adb
> 2    0x969ce4 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
> 3    0x88540b postgres CdbProgramErrorHandler + 0xf1
> 4    0x7ff663359370 libpthread.so.0  + 0x63359370
> 5    0x7221ff postgres ExecEndTableScan + 0x2f
> 6    0x6e5864 postgres ExecEndNode + 0x2b1
> 7    0x70cffb postgres ExecEndHashJoin + 0xdb
> 8    0x6e5963 postgres ExecEndNode + 0x3b0
> 9    0x704455 postgres ExecEndAgg + 0x104
> 10   0x6e59a7 postgres ExecEndNode + 0x3f4
> 11   0x71613d postgres ExecEndMotion + 0x87
> 12   0x6e5a0d postgres ExecEndNode + 0x45a
> 13   0x704455 postgres ExecEndAgg + 0x104
> 14   0x6e59a7 postgres ExecEndNode + 0x3f4
> 15   0x6de6a5 postgres ExecEndPlan + 0x4b
> 16   0x6dc139 postgres ExecutorEnd + 0x2fc





[jira] [Created] (HAWQ-1615) Access Invalid Memory When Run a Hash-join query with Bloomfilter Enable.

2018-05-21 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1615:
-

 Summary: Access Invalid Memory When Run a Hash-join query with 
Bloomfilter Enable.
 Key: HAWQ-1615
 URL: https://issues.apache.org/jira/browse/HAWQ-1615
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
 Fix For: 2.4.0.0-incubating


How to reproduce:
1. set hawq_hashjoin_bloomfilter=true;
2. run a query on a TPCH database (1 master and 3 segments): select count(*) from 
part, lineitem where p_partkey = l_partkey and p_brand = 'Brand#23' and 
p_container = 'MED BOX';
The QE process accessed invalid memory.

2018-05-16 06:27:40.010566 GMT,,,p567902,th0,,,2018-05-16 06:26:43 
GMT,0,con21,cmd5,seg1,slice2,"PANIC","XX000","Unexpected internal error: 
Segment process received signal SIGSEGV",,,0
1    0x969adb postgres  + 0x969adb
2    0x969ce4 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88540b postgres CdbProgramErrorHandler + 0xf1
4    0x7ff663359370 libpthread.so.0  + 0x63359370
5    0x7221ff postgres ExecEndTableScan + 0x2f
6    0x6e5864 postgres ExecEndNode + 0x2b1
7    0x70cffb postgres ExecEndHashJoin + 0xdb
8    0x6e5963 postgres ExecEndNode + 0x3b0
9    0x704455 postgres ExecEndAgg + 0x104
10   0x6e59a7 postgres ExecEndNode + 0x3f4
11   0x71613d postgres ExecEndMotion + 0x87
12   0x6e5a0d postgres ExecEndNode + 0x45a
13   0x704455 postgres ExecEndAgg + 0x104
14   0x6e59a7 postgres ExecEndNode + 0x3f4
15   0x6de6a5 postgres ExecEndPlan + 0x4b
16   0x6dc139 postgres ExecutorEnd + 0x2fc





[jira] [Resolved] (HAWQ-1608) Implement Printing Runtime Filter Information For "explain analyze"

2018-05-15 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1608.
---
   Resolution: Fixed
Fix Version/s: 2.4.0.0-incubating

> Implement Printing Runtime Filter Information For "explain analyze"
> ---
>
> Key: HAWQ-1608
> URL: https://issues.apache.org/jira/browse/HAWQ-1608
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Planner, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>






[jira] [Commented] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-05-15 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475399#comment-16475399
 ] 

Lin Wen commented on HAWQ-1597:
---

Updated the design doc; added a specification about expressions on join filters.

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: HAWQ Runtime Filter Design.pdf, HAWQ Runtime Filter 
> Design.pdf
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, build 
> a Bloom filter for the inner table during the build phase, then push it down 
> to the scan of the outer table, so that fewer tuples from the outer table 
> are returned to the hash join node and joined against the hash table. This 
> can greatly improve hash join performance if the selectivity is high.





[jira] [Updated] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-05-15 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1597:
--
Attachment: HAWQ Runtime Filter Design.pdf

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: HAWQ Runtime Filter Design.pdf, HAWQ Runtime Filter 
> Design.pdf
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, build 
> a Bloom filter for the inner table during the build phase, then push it down 
> to the scan of the outer table, so that fewer tuples from the outer table 
> are returned to the hash join node and joined against the hash table. This 
> can greatly improve hash join performance if the selectivity is high.





[jira] [Updated] (HAWQ-1608) Implement Printing Runtime Filter Information For "explain analyze"

2018-05-14 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1608:
--
Summary: Implement Printing Runtime Filter Information For "explain 
analyze"  (was: Implement Printing Runtime Filter Information For "explain" and 
"explain analyze")

> Implement Printing Runtime Filter Information For "explain analyze"
> ---
>
> Key: HAWQ-1608
> URL: https://issues.apache.org/jira/browse/HAWQ-1608
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Planner, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>






[jira] [Assigned] (HAWQ-1608) Implement Printing Runtime Filter Information For "explain" and "explain analyze"

2018-05-09 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1608:
-

Assignee: Lin Wen  (was: Lei Chang)

> Implement Printing Runtime Filter Information For "explain" and "explain 
> analyze"
> -
>
> Key: HAWQ-1608
> URL: https://issues.apache.org/jira/browse/HAWQ-1608
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Planner, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>






[jira] [Resolved] (HAWQ-1607) Implement Applying Bloom filter During Scan outer table

2018-05-09 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1607.
---
   Resolution: Fixed
Fix Version/s: 2.3.0.0-incubating

> Implement Applying Bloom filter During Scan outer table
> ---
>
> Key: HAWQ-1607
> URL: https://issues.apache.org/jira/browse/HAWQ-1607
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Optimizer, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> This subtask will implement:
>  # Passing the Bloom filter structure down to the outer table scan;
>  # Checking whether a tuple from the outer table is found in the Bloom filter 
> structure.





[jira] [Assigned] (HAWQ-1607) Implement Applying Bloom filter During Scan outer table

2018-05-06 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1607:
-

Assignee: Lin Wen  (was: Lei Chang)

> Implement Applying Bloom filter During Scan outer table
> ---
>
> Key: HAWQ-1607
> URL: https://issues.apache.org/jira/browse/HAWQ-1607
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Optimizer, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>
> This subtask will implement:
>  # Passing the Bloom filter structure down to the outer table scan;
>  # Checking whether a tuple from the outer table is found in the Bloom filter 
> structure.





[jira] [Resolved] (HAWQ-1606) Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table

2018-05-04 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1606.
---
   Resolution: Fixed
Fix Version/s: 2.4.0.0-incubating

> Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom 
> filter For Inner Table 
> 
>
> Key: HAWQ-1606
> URL: https://issues.apache.org/jira/browse/HAWQ-1606
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Optimizer, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> This subtask will implement:
> 1. Deciding whether to create a Bloom filter during the query planning phase; 
> if the hash join is suitable for a Bloom filter, the relevant information is 
> added to the hash join plan node.
> 2. Creating the Bloom filter structure for tuples from the inner table during 
> the query execution phase.





[jira] [Created] (HAWQ-1608) Implement Printing Runtime Filter Information For "explain" and "explain analyze"

2018-04-16 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1608:
-

 Summary: Implement Printing Runtime Filter Information For 
"explain" and "explain analyze"
 Key: HAWQ-1608
 URL: https://issues.apache.org/jira/browse/HAWQ-1608
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Planner, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang








[jira] [Assigned] (HAWQ-1606) Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table

2018-04-15 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1606:
-

Assignee: Lin Wen  (was: Lei Chang)

> Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom 
> filter For Inner Table 
> 
>
> Key: HAWQ-1606
> URL: https://issues.apache.org/jira/browse/HAWQ-1606
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Optimizer, Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
>
> This subtask will implement:
> 1. Deciding whether to create a Bloom filter during the query planning phase; 
> if the hash join is suitable for a Bloom filter, the relevant information is 
> added to the hash join plan node.
> 2. Creating the Bloom filter structure for tuples from the inner table during 
> the query execution phase.





[jira] [Created] (HAWQ-1607) Implement Applying Bloom filter During Scan outer table

2018-04-12 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1607:
-

 Summary: Implement Applying Bloom filter During Scan outer table
 Key: HAWQ-1607
 URL: https://issues.apache.org/jira/browse/HAWQ-1607
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Optimizer, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang


This subtask will implement:
 # Passing the Bloom filter structure down to the outer table scan;
 # Checking whether a tuple from the outer table is found in the Bloom filter 
structure.





[jira] [Created] (HAWQ-1606) Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table

2018-04-12 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1606:
-

 Summary: Implement Deciding to Create Bloom Filter During Query 
Plan And Create Bloom filter For Inner Table 
 Key: HAWQ-1606
 URL: https://issues.apache.org/jira/browse/HAWQ-1606
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Optimizer, Query Execution
Reporter: Lin Wen
Assignee: Lei Chang


This subtask will implement:

1. Deciding during the query planning phase whether to create a Bloom filter; 
if the hash join is suitable for a Bloom filter, the necessary information is 
added to the hash join plan node.

2. Creating the Bloom filter structure for tuples from the inner table during 
the query execution phase.
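A rough sketch of these two steps in Python; the planner heuristic and all names below are hypothetical illustrations, not HAWQ's actual planner logic:

```python
def should_build_bloom_filter(inner_rows, outer_rows, ratio=0.1):
    # Step 1 (plan time): a Bloom filter only pays off when the inner (build)
    # side is much smaller than the outer (probe) side, i.e. the filter is
    # likely to reject many outer tuples.
    return outer_rows > 0 and inner_rows / outer_rows < ratio

def build_bloom_filter(inner_keys, nbits=1 << 14, nhashes=3):
    # Step 2 (execution time): hash each inner-table join key into a bit
    # array while the hash table for the join is being built.
    bits = bytearray(nbits // 8)
    for key in inner_keys:
        for seed in range(nhashes):
            pos = hash((seed, key)) % nbits
            bits[pos // 8] |= 1 << (pos % 8)
    return bits
```

The row-count ratio is only one possible plan-time criterion; a real planner would also weigh cross-slice motion, as noted elsewhere in this thread.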





[jira] [Resolved] (HAWQ-1604) Add A New GUC hawq_hashjoin_bloomfilter

2018-04-12 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1604.
---
Resolution: Fixed

> Add A New GUC hawq_hashjoin_bloomfilter
> ---
>
> Key: HAWQ-1604
> URL: https://issues.apache.org/jira/browse/HAWQ-1604
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> # Add a new GUC, hawq_hashjoin_bloomfilter, to indicate whether to use a 
> Bloom filter for hash join.
>  # Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join 
> table; this legacy code has been verified not to improve hash join 
> performance.
>  





[jira] [Assigned] (HAWQ-1604) Add A New GUC hawq_hashjoin_bloomfilter

2018-04-08 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1604:
-

Assignee: Lin Wen  (was: Lei Chang)

> Add A New GUC hawq_hashjoin_bloomfilter
> ---
>
> Key: HAWQ-1604
> URL: https://issues.apache.org/jira/browse/HAWQ-1604
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> # Add a new GUC, hawq_hashjoin_bloomfilter, to indicate whether to use a 
> Bloom filter for hash join.
>  # Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join 
> table; this legacy code has been verified not to improve hash join 
> performance.
>  





[jira] [Created] (HAWQ-1604) Add A New GUC hawq_hashjoin_bloomfilter

2018-04-08 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1604:
-

 Summary: Add A New GUC hawq_hashjoin_bloomfilter
 Key: HAWQ-1604
 URL: https://issues.apache.org/jira/browse/HAWQ-1604
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang
 Fix For: 2.4.0.0-incubating


# Add a new GUC, hawq_hashjoin_bloomfilter, to indicate whether to use a Bloom 
filter for hash join.
 # Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join table; 
this legacy code has been verified not to improve hash join performance.

 





[jira] [Assigned] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-03-26 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1597:
-

Assignee: Lin Wen  (was: Lei Chang)

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Attachments: HAWQ Runtime Filter Design.pdf
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970 that is used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, 
> build a Bloom filter for the inner table during the build phase, then push 
> this filter down to the scan of the outer table, so that fewer tuples from 
> the outer table are returned to the hash join node and joined against the 
> hash table. This can greatly improve hash join performance when the filter 
> is selective.





[jira] [Updated] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-03-25 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1597:
--
Attachment: HAWQ Runtime Filter Design.pdf

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lei Chang
>Priority: Major
> Attachments: HAWQ Runtime Filter Design.pdf
>
>
> Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970 that is used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive 
> applications to quickly filter data, and are commonly implemented in OLAP 
> systems for hash join. The basic idea is: when hash-joining two tables, 
> build a Bloom filter for the inner table during the build phase, then push 
> this filter down to the scan of the outer table, so that fewer tuples from 
> the outer table are returned to the hash join node and joined against the 
> hash table. This can greatly improve hash join performance when the filter 
> is selective.





[jira] [Created] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-03-14 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1597:
-

 Summary: Implement Runtime Filter for Hash Join
 Key: HAWQ-1597
 URL: https://issues.apache.org/jira/browse/HAWQ-1597
 Project: Apache HAWQ
  Issue Type: New Feature
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang


Bloom filter is a space-efficient probabilistic data structure invented in 
1970 that is used to test whether an element is a member of a set.
Nowadays, Bloom filters are widely used in OLAP and data-intensive 
applications to quickly filter data, and are commonly implemented in OLAP 
systems for hash join. The basic idea is: when hash-joining two tables, build 
a Bloom filter for the inner table during the build phase, then push this 
filter down to the scan of the outer table, so that fewer tuples from the 
outer table are returned to the hash join node and joined against the hash 
table. This can greatly improve hash join performance when the filter is 
selective.
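The effect can be simulated end to end in a few lines of Python (assumed names, not HAWQ code): a Bloom filter built from the inner side sharply reduces how many outer tuples reach the hash join node.

```python
def bloom_build(keys, nbits=4096, nhashes=3):
    positions = set()  # set-bit positions; a real filter uses a bit array
    for k in keys:
        positions.update(hash((s, k)) % nbits for s in range(nhashes))
    return positions

def bloom_probe(positions, key, nbits=4096, nhashes=3):
    return all(hash((s, key)) % nbits in positions for s in range(nhashes))

inner = list(range(100))       # build-side join keys
outer = list(range(10_000))    # probe-side join keys (only 100 actually match)

bits = bloom_build(inner)
survivors = [k for k in outer if bloom_probe(bits, k)]

assert all(k in survivors for k in inner)  # Bloom filters have no false negatives
assert len(survivors) < len(outer)         # far fewer tuples reach the join node
```

A small fraction of non-matching keys may survive as false positives, which is acceptable: they are simply rejected later by the hash table itself.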





[jira] [Commented] (HAWQ-1567) Unknown process holds the lock causes DROP TABLE hangs forever

2017-12-05 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16279634#comment-16279634
 ] 

Lin Wen commented on HAWQ-1567:
---

Hi, Kuien,

Have you seen any dispatcher error messages in the master log?
It might be related to HAWQ-1530: 
https://issues.apache.org/jira/browse/HAWQ-1530 


> Unknown process holds the lock causes DROP TABLE hangs forever
> --
>
> Key: HAWQ-1567
> URL: https://issues.apache.org/jira/browse/HAWQ-1567
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Kuien Liu
>Assignee: Radar Lei
>
> On HAWQ 2.2.0.0-incubating (Jun 2017), we have seen several times that a 
> query hangs for a long time:
> # 1. DROP TABLE hangs for tens of minutes because it waits for an 
> AccessExclusiveLock.
> # 2. BUT the lock is held by a ghost process (not alive, and little 
> information is available in the log file to explain what happened).
> A detailed context is pasted:
> postgres=# select procpid, sess_id, usesysid, xact_start, waiting, 
> current_query from pg_stat_activity where current_query <> '';
>  procpid | sess_id | usesysid |  xact_start   | waiting | 
>   
>   current_query
> -+-+--+---+-+---
>91321 |  120242 |   328199 | 2017-11-28 14:45:52.631739+08 | t   |  
> drop table if exists ads_is_svc_rcv_approval_detail_df
> postgres=# select * from pg_locks where pid = 91321;
>locktype| database | relation | page | tuple | transactionid | classid 
> | objid | objsubid | transaction |  pid  |mode | granted | 
> mppsessionid | mppiswriter | gp_segment_id
> ---+--+--+--+---+---+-+---+--+-+---+-+-+--+-+---
>  transactionid |  |  |  |   |  21867785 | 
> |   |  |21867785 | 91321 | ExclusiveLock   | t   |
>120242 | f   |-1
>  relation  |16510 | 2608 |  |   |   | 
> |   |  |21867785 | 91321 | RowExclusiveLock| t   |
>120242 | f   |-1
>  relation  |16510 | 1259 |  |   |   | 
> |   |  |21867785 | 91321 | RowExclusiveLock| t   |
>120242 | f   |-1
>  relation  |16510 |  3212612 |  |   |   | 
> |   |  |21867785 | 91321 | AccessExclusiveLock | f   |
>120242 | f   |-1
>  relation  |16510 | 1247 |  |   |   | 
> |   |  |21867785 | 91321 | RowExclusiveLock| t   |
>120242 | f   |-1
> (5 rows)
> postgres=# select * from pg_locks where relation = 3212612;
>  locktype | database | relation | page | tuple | transactionid | classid | 
> objid | objsubid | transaction |  pid   |mode | granted | 
> mppsessionid | mppiswriter | gp_segment_id
> --+--+--+--+---+---+-+---+--+-++-+-+--+-+---
>  relation |16510 |  3212612 |  |   |   | |
>|  |21867785 |  91321 | AccessExclusiveLock | f   |   
> 120242 | f   |-1
>  relation |16510 |  3212612 |  |   |   | |
>|  |   0 | 107940 | AccessShareLock | t   |   
> 120553 | f   |-1
> (2 rows)
> postgres=# select * from pg_stat_activity where procpid = 107940;
>  datid | datname | procpid | sess_id | usesysid | usename | current_query | 
> waiting | query_start | backend_start | client_addr | client_port | 
> application_name | xact_start | waiting_resource
> ---+-+-+-+--+-+---+-+-+---+-+-+--++--
> (0 rows)
> postgres=# select * from pg_locks  where pid = 107940 or mppsessionid = 
> 120553;
>  locktype | database | relation | page | tuple | 

[jira] [Commented] (HAWQ-1530) Illegally killing a JDBC select query causes locking problems

2017-11-03 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237193#comment-16237193
 ] 

Lin Wen commented on HAWQ-1530:
---

Hi, 
Do you have detailed reproduction steps for this bug?
I want to reproduce it in my environment; is it necessary to install Aqua 
Data Studio, or can we reproduce it by opening multiple psql sessions and 
running some queries?

Thanks!

> Illegally killing a JDBC select query causes locking problems
> -
>
> Key: HAWQ-1530
> URL: https://issues.apache.org/jira/browse/HAWQ-1530
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Transaction
>Reporter: Grant Krieger
>Assignee: Radar Lei
>Priority: Major
>
> Hi,
> When you perform a long-running select statement joining 2 HAWQ tables from 
> JDBC and illegally kill the JDBC client (CTRL ALT DEL) before completion of 
> the query, the 2 tables remain locked even after the query completes on the 
> server. 
> The lock is visible via pg_locks. One cannot kill the query via SELECT 
> pg_terminate_backend(393937). The only way to get rid of it is to kill -9 
> from Linux or restart HAWQ, but this can kill other things as well.
> The JDBC client I am using is Aqua Data Studio.
> I can provide exact steps to reproduce if required
> Thank you
> Grant 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HAWQ-1521) Idle QE Processes Can't Quit After An Interval

2017-08-29 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1521:
-

 Summary: Idle QE Processes Can't Quit After An Interval
 Key: HAWQ-1521
 URL: https://issues.apache.org/jira/browse/HAWQ-1521
 Project: Apache HAWQ
  Issue Type: Bug
Reporter: Lin Wen
Assignee: Radar Lei


After a query finishes, some idle QE processes remain on the segments. These 
QE processes are expected to quit after a time interval controlled by the GUC 
gp_vmem_idle_resource_timeout, whose default value is 18 seconds.

However, this doesn't work as expected: idle QE processes on the segments 
persist until the QD process quits. 

The reason is that in postgres.c, the code that enables this timer is never 
executed: gangsExist() always returns false, since the gang-related 
structures are all NULL.

if (IdleSessionGangTimeout > 0 && gangsExist())
    if (!enable_sig_alarm(IdleSessionGangTimeout /* ms */, false))
        elog(FATAL, "could not set timer for client wait timeout");






[jira] [Commented] (HAWQ-1498) Segments keep open file descriptors for deleted files

2017-07-07 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16078174#comment-16078174
 ] 

Lin Wen commented on HAWQ-1498:
---

Yes, I agree with you. Did your query run successfully or not? How many rows in 
table junk? And what's the rough size of one row? I may try to reproduce it in 
my environment. 

> Segments keep open file descriptors for deleted files
> -
>
> Key: HAWQ-1498
> URL: https://issues.apache.org/jira/browse/HAWQ-1498
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Harald Bögeholz
>Assignee: Radar Lei
> Fix For: 2.2.0.0-incubating
>
>
> I have been running some large computations in HAWQ using psql on the master. 
> These computations created temporary tables and dropped them again. 
> Nevertheless free disk space in HDFS decreased by much more than it should. 
> While the psql session on the master was still open I investigated on one of 
> the slave machines.
> HDFS is stored on /mds:
> {noformat}
> [root@mds-hdp-04 ~]# ls -l /mds
> total 36
> drwxr-xr-x. 3 root  root4096 Jun 14 04:23 falcon
> drwxr-xr-x. 3 root  root4096 Jun 14 04:42 hdfs
> drwx--. 2 root  root   16384 Jun  8 02:48 lost+found
> drwxr-xr-x. 5 storm hadoop  4096 Jun 14 04:45 storm
> drwxr-xr-x. 4 root  root4096 Jun 14 04:43 yarn
> drwxr-xr-x. 2 zookeeper hadoop  4096 Jun 14 04:39 zookeeper
> [root@mds-hdp-04 ~]# df /mds
> Filesystem 1K-blocks  Used Available Use% Mounted on
> /dev/vdc   515928320 314560220 175137316  65% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952  /mds
> {noformat}
> Note that there is a more than 200 GB difference between the disk space used 
> according to df and the sum of all files on that file system according to du.
> I have found the culprit to be several postgres processes running as gpadmin 
> and holding open file descriptors to deleted files. Here are the first few:
> {noformat}
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> postgres 665334 gpadmin   18r   REG 253,32 134217728 0  9438234 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482
>  (deleted)
> postgres 665334 gpadmin   34r   REG 253,32 24488 0  9438114 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398
>  (deleted)
> postgres 665334 gpadmin   35r   REG 253,32   199 0  9438115 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398_187044.meta
>  (deleted)
> postgres 665334 gpadmin   37r   REG 253,32 134217728 0  9438208 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446
>  (deleted)
> postgres 665334 gpadmin   38r   REG 253,32   1048583 0  9438209 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446_187092.meta
>  (deleted)
> postgres 665334 gpadmin   39r   REG 253,32   1048583 0  9438235 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482_187128.meta
>  (deleted)
> postgres 665334 gpadmin   40r   REG 253,32 134217728 0  9438262 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555
>  (deleted)
> postgres 665334 gpadmin   41r   REG 253,32   1048583 0  9438263 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555_187201.meta
>  (deleted)
> postgres 665334 gpadmin   42r   REG 253,32 134217728 0  9438285 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602
>  (deleted)
> postgres 665334 gpadmin   43r   REG 253,32   1048583 0  9438286 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602_187248.meta
>  (deleted)
> {noformat}
> As soon as I close the psql session on the master, the disk space is freed 
> on the slaves:
> {noformat}
> [root@mds-hdp-04 ~]# df /mds
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/vdc   515928320 89992720 399704816  19% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952  /mds
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> {noformat}
> I believe this to be a bug. At least for me it looks like a very undesirable 
> behavior.



--


[jira] [Commented] (HAWQ-1498) Segments keep open file descriptors for deleted files

2017-07-06 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077618#comment-16077618
 ] 

Lin Wen commented on HAWQ-1498:
---

Thanks for the information!
These processes are running on a segment node, right? Would you please run 
"ps -ef" for these processes, so that we can see in detail which processes 
hold these descriptors? Also run "pstack" on one of them to check where the 
process is.

postgres  76698 gpadmin   15r   REG 253,32 121425488 0  9437196 
/mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949177
 (deleted)
postgres  76698 gpadmin   16r   REG 253,32948647 0  9438394 
/mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949177_214295.meta
 (deleted)
postgres  76698 gpadmin   18r   REG 253,32212024 0  9438395 
/mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181
 (deleted)
postgres  76698 gpadmin   19r   REG 253,32  1667 0  9438396 
/mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir3/subdir41/blk_1073949181_214299.meta
 (deleted)




> Segments keep open file descriptors for deleted files
> -
>
> Key: HAWQ-1498
> URL: https://issues.apache.org/jira/browse/HAWQ-1498
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Harald Bögeholz
>Assignee: Radar Lei
> Fix For: 2.2.0.0-incubating
>
>
> I have been running some large computations in HAWQ using psql on the master. 
> These computations created temporary tables and dropped them again. 
> Nevertheless free disk space in HDFS decreased by much more than it should. 
> While the psql session on the master was still open I investigated on one of 
> the slave machines.
> HDFS is stored on /mds:
> {noformat}
> [root@mds-hdp-04 ~]# ls -l /mds
> total 36
> drwxr-xr-x. 3 root  root4096 Jun 14 04:23 falcon
> drwxr-xr-x. 3 root  root4096 Jun 14 04:42 hdfs
> drwx--. 2 root  root   16384 Jun  8 02:48 lost+found
> drwxr-xr-x. 5 storm hadoop  4096 Jun 14 04:45 storm
> drwxr-xr-x. 4 root  root4096 Jun 14 04:43 yarn
> drwxr-xr-x. 2 zookeeper hadoop  4096 Jun 14 04:39 zookeeper
> [root@mds-hdp-04 ~]# df /mds
> Filesystem 1K-blocks  Used Available Use% Mounted on
> /dev/vdc   515928320 314560220 175137316  65% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952  /mds
> {noformat}
> Note that there is a more than 200 GB difference between the disk space used 
> according to df and the sum of all files on that file system according to du.
> I have found the culprit to be several postgres processes running as gpadmin 
> and holding open file descriptors to deleted files. Here are the first few:
> {noformat}
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> postgres 665334 gpadmin   18r   REG 253,32 134217728 0  9438234 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482
>  (deleted)
> postgres 665334 gpadmin   34r   REG 253,32 24488 0  9438114 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398
>  (deleted)
> postgres 665334 gpadmin   35r   REG 253,32   199 0  9438115 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398_187044.meta
>  (deleted)
> postgres 665334 gpadmin   37r   REG 253,32 134217728 0  9438208 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446
>  (deleted)
> postgres 665334 gpadmin   38r   REG 253,32   1048583 0  9438209 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446_187092.meta
>  (deleted)
> postgres 665334 gpadmin   39r   REG 253,32   1048583 0  9438235 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482_187128.meta
>  (deleted)
> postgres 665334 gpadmin   40r   REG 253,32 134217728 0  9438262 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555
>  (deleted)
> postgres 665334 gpadmin   41r   REG 253,32   1048583 0  9438263 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555_187201.meta
>  (deleted)
> postgres 665334 gpadmin   42r   REG 253,32 134217728 0  9438285 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602
>  (deleted)
> postgres 665334 gpadmin   43r   REG 253,32   

[jira] [Commented] (HAWQ-1498) Segments keep open file descriptors for deleted files

2017-07-06 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076231#comment-16076231
 ] 

Lin Wen commented on HAWQ-1498:
---

Hi, Harald,
Thank you for reporting it! Would you like to provide more information? For 
example, list all the postgres processes on a segment while the disk space is 
not freed, or give concrete steps to reproduce it. I am wondering whether the 
query executed successfully. After a query finishes, the idle QEs on the 
segments should exit after a period of time (controlled by a GUC). If the QEs 
on the segments have exited, is the disk space still not freed?  
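The df/du discrepancy in this report can be demonstrated in a few lines of Python (a generic POSIX illustration, not HAWQ code): an unlinked file's blocks are not freed while any process still holds an open descriptor.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 4096)
os.unlink(path)                       # the directory entry is gone...
assert not os.path.exists(path)
assert os.fstat(fd).st_size == 4096   # ...but the data is still allocated
os.close(fd)                          # only now can the kernel free the blocks
```

This is why `du` (which walks directory entries) and `df` (which reads filesystem block counts) disagree until the holding processes exit or close their descriptors.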

> Segments keep open file descriptors for deleted files
> -
>
> Key: HAWQ-1498
> URL: https://issues.apache.org/jira/browse/HAWQ-1498
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Harald Bögeholz
>Assignee: Radar Lei
> Fix For: 2.2.0.0-incubating
>
>
> I have been running some large computations in HAWQ using psql on the master. 
> These computations created temporary tables and dropped them again. 
> Nevertheless free disk space in HDFS decreased by much more than it should. 
> While the psql session on the master was still open I investigated on one of 
> the slave machines.
> HDFS is stored on /mds:
> {noformat}
> [root@mds-hdp-04 ~]# ls -l /mds
> total 36
> drwxr-xr-x. 3 root  root4096 Jun 14 04:23 falcon
> drwxr-xr-x. 3 root  root4096 Jun 14 04:42 hdfs
> drwx--. 2 root  root   16384 Jun  8 02:48 lost+found
> drwxr-xr-x. 5 storm hadoop  4096 Jun 14 04:45 storm
> drwxr-xr-x. 4 root  root4096 Jun 14 04:43 yarn
> drwxr-xr-x. 2 zookeeper hadoop  4096 Jun 14 04:39 zookeeper
> [root@mds-hdp-04 ~]# df /mds
> Filesystem 1K-blocks  Used Available Use% Mounted on
> /dev/vdc   515928320 314560220 175137316  65% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952  /mds
> {noformat}
> Note that there is a more than 200 GB difference between the disk space used 
> according to df and the sum of all files on that file system according to du.
> I have found the culprit to be several postgres processes running as gpadmin 
> and holding open file descriptors to deleted files. Here are the first few:
> {noformat}
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> postgres 665334 gpadmin   18r   REG 253,32 134217728 0  9438234 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482
>  (deleted)
> postgres 665334 gpadmin   34r   REG 253,32 24488 0  9438114 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398
>  (deleted)
> postgres 665334 gpadmin   35r   REG 253,32   199 0  9438115 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922398_187044.meta
>  (deleted)
> postgres 665334 gpadmin   37r   REG 253,32 134217728 0  9438208 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446
>  (deleted)
> postgres 665334 gpadmin   38r   REG 253,32   1048583 0  9438209 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922446_187092.meta
>  (deleted)
> postgres 665334 gpadmin   39r   REG 253,32   1048583 0  9438235 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922482_187128.meta
>  (deleted)
> postgres 665334 gpadmin   40r   REG 253,32 134217728 0  9438262 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555
>  (deleted)
> postgres 665334 gpadmin   41r   REG 253,32   1048583 0  9438263 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir193/blk_1073922555_187201.meta
>  (deleted)
> postgres 665334 gpadmin   42r   REG 253,32 134217728 0  9438285 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602
>  (deleted)
> postgres 665334 gpadmin   43r   REG 253,32   1048583 0  9438286 
> /mds/hdfs/data/current/BP-23056860-118.138.237.114-1497415333069/current/finalized/subdir2/subdir194/blk_1073922602_187248.meta
>  (deleted)
> {noformat}
> As soon as I close the psql session on the master, the disk space is freed 
> on the slaves:
> {noformat}
> [root@mds-hdp-04 ~]# df /mds
> Filesystem 1K-blocks Used Available Use% Mounted on
> /dev/vdc   515928320 89992720 399704816  19% /mds
> [root@mds-hdp-04 ~]# du -s /mds
> 89918952  /mds
> [root@mds-hdp-04 ~]# lsof +L1 | grep /mds/hdfs | head -10
> {noformat}
> I believe this to be a bug. At least for me it looks like a very 

[jira] [Resolved] (HAWQ-1480) Packing a core file in hawq

2017-06-14 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1480.
---
   Resolution: Fixed
Fix Version/s: 2.3.0.0-incubating

> Packing a core file in hawq
> ---
>
> Key: HAWQ-1480
> URL: https://issues.apache.org/jira/browse/HAWQ-1480
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Shubham Sharma
>Assignee: Radar Lei
> Fix For: 2.3.0.0-incubating
>
>
> Currently there is no way to pack a core file together with its context – 
> executable, application and system shared libraries – in HAWQ. This 
> information can later be unpacked on another system and helps in debugging. 
> It is a useful feature to quickly gather all the data needed from a 
> crash/core generated on the system so it can be analyzed later.
> Another open source project, Greenplum, uses a script 
> [https://github.com/greenplum-db/gpdb/blob/master/gpMgmt/sbin/packcore] to 
> collect this information. I tested this script against HAWQ's installation 
> and it collects the information needed for debugging.
> Can this be merged into HAWQ? If yes, I can submit a pull request and test it.



--


[jira] [Commented] (HAWQ-1480) Packing a core file in hawq

2017-06-05 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036647#comment-16036647
 ] 

Lin Wen commented on HAWQ-1480:
---

looks good, a useful tool for debugging. 

> Packing a core file in hawq
> ---
>
> Key: HAWQ-1480
> URL: https://issues.apache.org/jira/browse/HAWQ-1480
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Shubham Sharma
>Assignee: Radar Lei
>
> Currently there is no way to pack a core file together with its context – 
> executable, application and system shared libraries – in HAWQ. This 
> information can later be unpacked on another system and helps in debugging. 
> It is a useful feature to quickly gather all the data needed from a 
> crash/core generated on the system so it can be analyzed later.
> Another open source project, Greenplum, uses a script 
> [https://github.com/greenplum-db/gpdb/blob/master/gpMgmt/sbin/packcore] to 
> collect this information. I tested this script against HAWQ's installation 
> and it collects the information needed for debugging.
> Can this be merged into HAWQ? If yes, I can submit a pull request and test it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HAWQ-1469) Don't expose RPS warning messages to command line

2017-05-17 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1469.
---
Resolution: Fixed
  Assignee: Lin Wen  (was: Ed Espino)

> Don't expose RPS warning messages to command line
> -
>
> Key: HAWQ-1469
> URL: https://issues.apache.org/jira/browse/HAWQ-1469
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Exposing the RPS service address to end users is not secure, so we should 
> not expose it.
> **Case 1: When master RPS is down, changing to standby RPS**
> Current behavior
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test5:8432/rps
> ERROR:  permission denied for relation(s): public.a
> ``` 
> Warning should be removed.
> Expected
> ```
> postgres=# select * from a;
> ERROR:  permission denied for relation(s): public.a
> ```
> **Case 2: When both RPS are down, should only print that RPS is unavailable.**
> Current Behavior:
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test5:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test1:8432/rps
> ERROR:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server. (rangerrest.c:463)
> ```
> Expected
> ```
> postgres=# select * from a;
> ERROR:  ranger plugin service is unavailable : Couldn't connect to server. 
> (rangerrest.c:463)
> ```
> The warning message should be printed in the csv log file.



--


[jira] [Comment Edited] (HAWQ-1469) Don't expose RPS warning messages to command line

2017-05-17 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015047#comment-16015047
 ] 

Lin Wen edited comment on HAWQ-1469 at 5/18/17 2:50 AM:


Yes. If both master RPS and standby RPS are unavailable, this message is 
printed to console.
ERROR: ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)

OK. We can make the message more descriptive. How about this?
ERROR: permission is unknown due to authorization failure, ranger plugin 
service is unavailable : Couldn't connect to server. (rangerrest.c:463)


was (Author: wlin):
Yes. If both master RPS and standby RPS are unavailable, this message is 
printed to console.
ERROR: ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)

OK. We can make the message more descriptive. How about this?
ERROR: authorization failed, ranger plugin service is unavailable : Couldn't 
connect to server. (rangerrest.c:463)

> Don't expose RPS warning messages to command line
> -
>
> Key: HAWQ-1469
> URL: https://issues.apache.org/jira/browse/HAWQ-1469
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Ed Espino
> Fix For: backlog
>
>
> Exposing the RPS service address to end users is not secure; we should not 
> expose it.
> **Case 1: When master RPS is down, changing to standby RPS**
> Current behavior
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test5:8432/rps
> ERROR:  permission denied for relation(s): public.a
> ``` 
> Warning should be removed.
> Expected
> ```
> postgres=# select * from a;
> ERROR:  permission denied for relation(s): public.a
> ```
> **Case 2: When both RPSes are down, only print that the RPS is unavailable.**
> Current Behavior:
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test5:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test1:8432/rps
> ERROR:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server. (rangerrest.c:463)
> ```
> Expected
> ```
> postgres=# select * from a;
> ERROR:  ranger plugin service is unavailable : Couldn't connect to server. 
> (rangerrest.c:463)
> ```
> The warning message should be printed in the csv log file.





[jira] [Comment Edited] (HAWQ-1469) Don't expose RPS warning messages to command line

2017-05-17 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015047#comment-16015047
 ] 

Lin Wen edited comment on HAWQ-1469 at 5/18/17 2:49 AM:


Yes. If both master RPS and standby RPS are unavailable, this message is 
printed to console.
ERROR: ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)

OK. We can make the message more descriptive. How about this?
ERROR: authorization failed, ranger plugin service is unavailable : Couldn't 
connect to server. (rangerrest.c:463)


was (Author: wlin):
Yes. If both master RPS and standby RPS are unavailable, this message is 
printed to console.
ERROR: ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)

OK. We can make the message more descriptive. How about this?
ERROR: authentication failed, ranger plugin service is unavailable : Couldn't 
connect to server. (rangerrest.c:463)

> Don't expose RPS warning messages to command line
> -
>
> Key: HAWQ-1469
> URL: https://issues.apache.org/jira/browse/HAWQ-1469
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Ed Espino
> Fix For: backlog
>
>
> Exposing the RPS service address to end users is not secure; we should not 
> expose it.
> **Case 1: When master RPS is down, changing to standby RPS**
> Current behavior
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test5:8432/rps
> ERROR:  permission denied for relation(s): public.a
> ``` 
> Warning should be removed.
> Expected
> ```
> postgres=# select * from a;
> ERROR:  permission denied for relation(s): public.a
> ```
> **Case 2: When both RPSes are down, only print that the RPS is unavailable.**
> Current Behavior:
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test5:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test1:8432/rps
> ERROR:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server. (rangerrest.c:463)
> ```
> Expected
> ```
> postgres=# select * from a;
> ERROR:  ranger plugin service is unavailable : Couldn't connect to server. 
> (rangerrest.c:463)
> ```
> The warning message should be printed in the csv log file.





[jira] [Commented] (HAWQ-1469) Don't expose RPS warning messages to command line

2017-05-17 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015047#comment-16015047
 ] 

Lin Wen commented on HAWQ-1469:
---

Yes. If both master RPS and standby RPS are unavailable, this message is 
printed to console.
ERROR: ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)

OK. We can make the message more descriptive. How about this?
ERROR: authentication failed, ranger plugin service is unavailable : Couldn't 
connect to server. (rangerrest.c:463)

> Don't expose RPS warning messages to command line
> -
>
> Key: HAWQ-1469
> URL: https://issues.apache.org/jira/browse/HAWQ-1469
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Ed Espino
> Fix For: backlog
>
>
> Exposing the RPS service address to end users is not secure; we should not 
> expose it.
> **Case 1: When master RPS is down, changing to standby RPS**
> Current behavior
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test5:8432/rps
> ERROR:  permission denied for relation(s): public.a
> ``` 
> Warning should be removed.
> Expected
> ```
> postgres=# select * from a;
> ERROR:  permission denied for relation(s): public.a
> ```
> **Case 2: When both RPSes are down, only print that the RPS is unavailable.**
> Current Behavior:
> ```
> postgres=# select * from a;
> WARNING:  ranger plugin service from http://test5:8432/rps is unavailable : 
> Couldn't connect to server, try another http://test1:8432/rps
> ERROR:  ranger plugin service from http://test1:8432/rps is unavailable : 
> Couldn't connect to server. (rangerrest.c:463)
> ```
> Expected
> ```
> postgres=# select * from a;
> ERROR:  ranger plugin service is unavailable : Couldn't connect to server. 
> (rangerrest.c:463)
> ```
> The warning message should be printed in the csv log file.





[jira] [Created] (HAWQ-1469) Don't expose RPS warning messages to command line

2017-05-17 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1469:
-

 Summary: Don't expose RPS warning messages to command line
 Key: HAWQ-1469
 URL: https://issues.apache.org/jira/browse/HAWQ-1469
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Security
Reporter: Lin Wen
Assignee: Ed Espino


Exposing the RPS service address to end users is not secure; we should not 
expose it.

**Case 1: When master RPS is down, changing to standby RPS**
Current behavior
```
postgres=# select * from a;
WARNING:  ranger plugin service from http://test1:8432/rps is unavailable : 
Couldn't connect to server, try another http://test5:8432/rps
ERROR:  permission denied for relation(s): public.a
``` 
Warning should be removed.
Expected
```
postgres=# select * from a;
ERROR:  permission denied for relation(s): public.a
```

**Case 2: When both RPSes are down, only print that the RPS is unavailable.**
Current Behavior:
```
postgres=# select * from a;
WARNING:  ranger plugin service from http://test5:8432/rps is unavailable : 
Couldn't connect to server, try another http://test1:8432/rps
ERROR:  ranger plugin service from http://test1:8432/rps is unavailable : 
Couldn't connect to server. (rangerrest.c:463)
```
Expected
```
postgres=# select * from a;
ERROR:  ranger plugin service is unavailable : Couldn't connect to server. 
(rangerrest.c:463)
```

The warning message should be printed in the csv log file.





[jira] [Resolved] (HAWQ-1396) HAWQ Sends Wrong Request to RPS for PXF Hcatalog

2017-03-28 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1396.
---
Resolution: Fixed

> HAWQ Sends Wrong Request to RPS for PXF Hcatalog
> 
>
> Key: HAWQ-1396
> URL: https://issues.apache.org/jira/browse/HAWQ-1396
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.2.0.0-incubating
>
>
> If Ranger mode is enabled, HAWQ sends the wrong request to RPS for PXF HCatalog.
> gpadmin=# select count(*) from hcatalog.default.twitterexampletextexample;
> ERROR:  permission denied for relation(s): default.twitterexampletextexample
> RPS log:
> ```{"repoType":101,"repo":"hawq","reqUser":"gpadmin","evtTime":"2017-03-17 
> 01:18:55.734","access":"select","resource":"gpadmin/default/twitterexampletextexample","resType":"table","action":"select","result":1,"policy":7,"enforcer":"ranger-acl","cliIP":"127.0.0.1","reqData":"select
>  count(*) from 
> hcatalog.default.twitterexampletextexample;","agentHost":"ip-10-32-126-158","logType":"RangerAudit","id":"b1d5137d-adc8-4196-a5e3-35912a43d243","seq_num":63,"event_count":1,"event_dur_ms":0,"tags":[]}
> ```
> Notice `resource":"gpadmin/default/twitterexampletextexample` where gpadmin 
> is my database name for the psql session. HAWQ should have sent 
> `resource":"hcatalog/default/twitterexampletextexample` to RPS for policy 
> check.
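The fix described above can be sketched as follows; a minimal C sketch under the assumption (not confirmed by the source) that a single helper builds the `database/schema/table` resource string, with `build_rps_resource` as a hypothetical name. When the relation comes from the PXF HCatalog integration, the pseudo database "hcatalog" replaces the session's current database.

```c
#include <stdio.h>
#include <stdbool.h>

/* Hypothetical sketch of the fix: when the relation is an HCatalog
 * reference, build the RPS resource string with the pseudo database
 * "hcatalog" instead of the psql session's database name. */
static void build_rps_resource(char *buf, size_t len, const char *session_db,
                               bool is_hcatalog, const char *schema,
                               const char *table)
{
    snprintf(buf, len, "%s/%s/%s",
             is_hcatalog ? "hcatalog" : session_db, schema, table);
}
```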





[jira] [Resolved] (HAWQ-1355) Namespace check may occur multiple times in first query.

2017-02-27 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1355.
---
   Resolution: Fixed
Fix Version/s: 2.2.0.0-incubating

> Namespace check may occur multiple times in first query.
> 
>
> Key: HAWQ-1355
> URL: https://issues.apache.org/jira/browse/HAWQ-1355
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
> Fix For: 2.2.0.0-incubating
>
>
> When running a query, HAWQ needs to check the namespace usage privilege in 
> the function recomputeNamespacePath. This function is called repeatedly, but 
> the check is skipped when last_query_sign equals current_query_sign.
> There is a bug: running the first query doesn't set last_query_sign.
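The caching pattern and the first-query pitfall can be illustrated as below; a minimal C sketch, not the actual HAWQ code, with `recompute_namespace_path` and the counter as hypothetical stand-ins. An explicit validity flag avoids relying on the initial value of the cached signature.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of the described bug and fix: the privilege check is
 * skipped when the current query signature matches the last one, but the
 * first query never recorded its signature, so the check could run more
 * than once.  Tracking validity explicitly avoids that. */
static uint64_t last_query_sign;
static bool     last_sign_valid;   /* false until the first check is recorded */
static int      checks_performed;  /* stand-in for the real namespace ACL check */

static void recompute_namespace_path(uint64_t current_query_sign)
{
    if (last_sign_valid && last_query_sign == current_query_sign)
        return;                    /* already checked for this query */
    checks_performed++;
    last_query_sign = current_query_sign;   /* the assignment the bug omitted */
    last_sign_valid = true;
}
```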





[jira] [Resolved] (HAWQ-1325) Allow queries related to pg_temp if ranger is enable

2017-02-15 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1325.
---
   Resolution: Fixed
Fix Version/s: (was: 2.3.0.0-incubating)
   2.2.0.0-incubating

> Allow queries related to pg_temp if ranger is enable
> 
>
> Key: HAWQ-1325
> URL: https://issues.apache.org/jira/browse/HAWQ-1325
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.2.0.0-incubating
>
>
> Queries related to temp tables will send a request to RPS, asking for the 
> privilege of schema "pg_temp_XXX", like this:
> ./hawq-2017-02-13_142852.csv:2017-02-13 14:29:29.718445 
> CST,"linw","postgres",p71787,th-1324481600,"[local]",,2017-02-13 14:29:01 
> CST,8477,con13,cmd3,seg-1,,,x8477,sx1,"DEBUG3","0","send json request 
> to ranger : { ""requestId"": ""3"", ""user"": ""linw"", ""clientIp"": 
> ""127.0.0.1"", ""context"": ""select * from temp1;"", ""access"": [ { 
> ""resource"": { ""database"": ""postgres"", ""schema"": ""pg_temp_13"", 
> ""table"": ""temp1"" }, ""privileges"": [ ""select"" ] } ] }",,"select * 
> from temp1;",0,,"rangerrest.c",454,
> For better control, checks on the pg_temp_XX schema and objects in that 
> schema should fall back to the catalog without sending requests to RPS.
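The routing decision can be sketched as a simple schema-name test; a minimal C sketch, assuming (hypothetically) that temporary schemas are recognized by their `pg_temp_` prefix, which is the naming convention visible in the log above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch: detect temporary schemas (pg_temp_<backend-id>) so
 * their privilege checks can fall back to the native catalog instead of
 * being sent to RPS. */
static bool fall_back_to_catalog(const char *schema_name)
{
    return strncmp(schema_name, "pg_temp_", strlen("pg_temp_")) == 0;
}
```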





[jira] [Assigned] (HAWQ-1325) Allow queries related to pg_temp if ranger is enable

2017-02-15 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1325:
-

Assignee: Lin Wen  (was: Ed Espino)

> Allow queries related to pg_temp if ranger is enable
> 
>
> Key: HAWQ-1325
> URL: https://issues.apache.org/jira/browse/HAWQ-1325
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.3.0.0-incubating
>
>
> Queries related to temp tables will send a request to RPS, asking for the 
> privilege of schema "pg_temp_XXX", like this:
> ./hawq-2017-02-13_142852.csv:2017-02-13 14:29:29.718445 
> CST,"linw","postgres",p71787,th-1324481600,"[local]",,2017-02-13 14:29:01 
> CST,8477,con13,cmd3,seg-1,,,x8477,sx1,"DEBUG3","0","send json request 
> to ranger : { ""requestId"": ""3"", ""user"": ""linw"", ""clientIp"": 
> ""127.0.0.1"", ""context"": ""select * from temp1;"", ""access"": [ { 
> ""resource"": { ""database"": ""postgres"", ""schema"": ""pg_temp_13"", 
> ""table"": ""temp1"" }, ""privileges"": [ ""select"" ] } ] }",,"select * 
> from temp1;",0,,"rangerrest.c",454,
> For better control, checks on the pg_temp_XX schema and objects in that 
> schema should fall back to the catalog without sending requests to RPS.





[jira] [Created] (HAWQ-1325) Allow queries related to pg_temp if ranger is enable

2017-02-13 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1325:
-

 Summary: Allow queries related to pg_temp if ranger is enable
 Key: HAWQ-1325
 URL: https://issues.apache.org/jira/browse/HAWQ-1325
 Project: Apache HAWQ
  Issue Type: Sub-task
Reporter: Lin Wen
Assignee: Ed Espino
 Fix For: 2.2.0.0-incubating


Queries related to temp tables will send a request to RPS, asking for the 
privilege of schema "pg_temp_XXX", like this:

./hawq-2017-02-13_142852.csv:2017-02-13 14:29:29.718445 
CST,"linw","postgres",p71787,th-1324481600,"[local]",,2017-02-13 14:29:01 CST,  
  8477,con13,cmd3,seg-1,,,x8477,sx1,"DEBUG3","0","send json request to 
ranger : { ""requestId"": ""3"", ""user"": ""linw"", ""clientIp"": 
""127.0.0.1"", ""context"": ""select * from temp1;"", ""access"": [ { 
""resource"": { ""database"": ""postgres"", ""schema"": ""pg_temp_13"", 
""table"": ""temp1"" }, ""privileges"": [ ""select"" ] } ] }",,"select * 
from temp1;",0,,"rangerrest.c",454,

For better control, checks on the pg_temp_XX schema and objects in that 
schema should fall back to the catalog without sending requests to RPS.





[jira] [Resolved] (HAWQ-1318) Can't start/stop master succesfully if ranger is enable and with a wrong RPS address

2017-02-09 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1318.
---
   Resolution: Fixed
Fix Version/s: 2.2.0.0-incubating

> Can't start/stop master succesfully if ranger is enable and with a wrong RPS 
> address
> 
>
> Key: HAWQ-1318
> URL: https://issues.apache.org/jira/browse/HAWQ-1318
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.2.0.0-incubating
>
>
> If Ranger is enabled with a wrong RPS address, hawq can start but can't be 
> started/stopped successfully.
> Lins-MacBook-Pro:apache-hawq linw$ hawq stop cluster -a
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Prepare to 
> do 'hawq stop'
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-You can find 
> log in:
> 20170209:10:21:51:043784 
> hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawqAdminLogs/hawq_stop_20170209.log
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-GPHOME is 
> set to:
> 20170209:10:21:51:043784 
> hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawq-bin
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq 
> with args: ['stop', 'cluster']
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-No standby 
> host configured
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq 
> cluster
> 20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Failed to 
> connect to the running database, please check master status
> 20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Or you can 
> check hawq stop --help for other stop options
>  501 43719 1   0 10:20AM ?? 0:00.58 
> /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/masterdd -i -M 
> master -p 5432 --silent-mode=true
>   501 43721 43719   0 10:20AM ?? 0:00.02 postgres: port  5432, master 
> logger process
>   501 43724 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, stats 
> collector process
>   501 43725 43719   0 10:20AM ?? 0:00.07 postgres: port  5432, writer 
> process
>   501 43726 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 43727 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, 
> seqserver process
>   501 43728 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 43729 43719   0 10:20AM ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 43743 1   0 10:20AM ?? 0:00.79 
> /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/segmentdd -i -M 
> segment -p 4 --silent-mode=true
>   501 43744 43743   0 10:20AM ?? 0:00.02 postgres: port 4, logger 
> process
>   501 43747 43743   0 10:20AM ?? 0:00.01 postgres: port 4, stats 
> collector process
>   501 43748 43743   0 10:20AM ?? 0:00.07 postgres: port 4, writer 
> process
>   501 43749 43743   0 10:20AM ?? 0:00.01 postgres: port 4, 
> checkpoint process
>   501 43750 43743   0 10:20AM ?? 0:00.16 postgres: port 4, 
> segment resource manager
>   501 43830 43719   0 10:22AM ?? 0:00.01 postgres: port  5432, master 
> resource manager
>   501 43867 43719   0 10:24AM ?? 0:00.01 postgres: port  5432, linw 
> template1 [local] cmd1 SELECT [local]





[jira] [Assigned] (HAWQ-1318) Can't start/stop master succesfully if ranger is enable and with a wrong RPS address

2017-02-09 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1318:
-

Assignee: Lin Wen  (was: Ed Espino)

> Can't start/stop master succesfully if ranger is enable and with a wrong RPS 
> address
> 
>
> Key: HAWQ-1318
> URL: https://issues.apache.org/jira/browse/HAWQ-1318
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> If Ranger is enabled with a wrong RPS address, hawq can start but can't be 
> started/stopped successfully.
> Lins-MacBook-Pro:apache-hawq linw$ hawq stop cluster -a
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Prepare to 
> do 'hawq stop'
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-You can find 
> log in:
> 20170209:10:21:51:043784 
> hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawqAdminLogs/hawq_stop_20170209.log
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-GPHOME is 
> set to:
> 20170209:10:21:51:043784 
> hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-/Users/linw/hawq-bin
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq 
> with args: ['stop', 'cluster']
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-No standby 
> host configured
> 20170209:10:21:51:043784 hawq_stop:Lins-MacBook-Pro:linw-[INFO]:-Stop hawq 
> cluster
> 20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Failed to 
> connect to the running database, please check master status
> 20170209:10:22:22:043784 hawq_stop:Lins-MacBook-Pro:linw-[ERROR]:-Or you can 
> check hawq stop --help for other stop options
>  501 43719 1   0 10:20AM ?? 0:00.58 
> /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/masterdd -i -M 
> master -p 5432 --silent-mode=true
>   501 43721 43719   0 10:20AM ?? 0:00.02 postgres: port  5432, master 
> logger process
>   501 43724 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, stats 
> collector process
>   501 43725 43719   0 10:20AM ?? 0:00.07 postgres: port  5432, writer 
> process
>   501 43726 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, 
> checkpoint process
>   501 43727 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, 
> seqserver process
>   501 43728 43719   0 10:20AM ?? 0:00.01 postgres: port  5432, WAL 
> Send Server process
>   501 43729 43719   0 10:20AM ?? 0:00.00 postgres: port  5432, DFS 
> Metadata Cache process
>   501 43743 1   0 10:20AM ?? 0:00.79 
> /Users/linw/hawq-bin/bin/postgres -D /Users/linw/hawq-data/segmentdd -i -M 
> segment -p 4 --silent-mode=true
>   501 43744 43743   0 10:20AM ?? 0:00.02 postgres: port 4, logger 
> process
>   501 43747 43743   0 10:20AM ?? 0:00.01 postgres: port 4, stats 
> collector process
>   501 43748 43743   0 10:20AM ?? 0:00.07 postgres: port 4, writer 
> process
>   501 43749 43743   0 10:20AM ?? 0:00.01 postgres: port 4, 
> checkpoint process
>   501 43750 43743   0 10:20AM ?? 0:00.16 postgres: port 4, 
> segment resource manager
>   501 43830 43719   0 10:22AM ?? 0:00.01 postgres: port  5432, master 
> resource manager
>   501 43867 43719   0 10:24AM ?? 0:00.01 postgres: port  5432, linw 
> template1 [local] cmd1 SELECT [local]





[jira] [Resolved] (HAWQ-1312) Forbid grant/revoke command in HAWQ side once Ranger is configured.

2017-02-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1312.
---
Resolution: Fixed

> Forbid grant/revoke command in HAWQ side once Ranger is configured.
> ---
>
> Key: HAWQ-1312
> URL: https://issues.apache.org/jira/browse/HAWQ-1312
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> When the ranger check is enabled, GRANT and REVOKE commands should not be 
> allowed to run. This work is expected to be done in the Ranger admin portal. 





[jira] [Commented] (HAWQ-1312) Forbid grant/revoke command in HAWQ side once Ranger is configured.

2017-02-07 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857246#comment-15857246
 ] 

Lin Wen commented on HAWQ-1312:
---

Grant/Revoke on some system catalog objects is allowed in ranger mode, since 
those checks are done natively in HAWQ. 
But for user objects, Grant/Revoke commands are not allowed. 
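The rule stated above reduces to a small predicate; a minimal C sketch with a hypothetical name, not the actual HAWQ check.

```c
#include <stdbool.h>

/* Hypothetical sketch of the rule: in ranger mode, GRANT and REVOKE stay
 * available for system catalog objects (checked natively in HAWQ) but are
 * rejected for user objects, which Ranger governs. */
static bool grant_revoke_allowed(bool ranger_enabled, bool is_system_catalog)
{
    return !ranger_enabled || is_system_catalog;
}
```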

> Forbid grant/revoke command in HAWQ side once Ranger is configured.
> ---
>
> Key: HAWQ-1312
> URL: https://issues.apache.org/jira/browse/HAWQ-1312
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> When ranger check is enable, GRANT and REVOKE commands should not be allowed 
> to run. This work is expected to be done in Ranger admin portal. 





[jira] [Created] (HAWQ-1312) Forbid grant/revoke command in HAWQ side once Ranger is configured.

2017-02-04 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1312:
-

 Summary: Forbid grant/revoke command in HAWQ side once Ranger is 
configured.
 Key: HAWQ-1312
 URL: https://issues.apache.org/jira/browse/HAWQ-1312
 Project: Apache HAWQ
  Issue Type: Sub-task
Reporter: Lin Wen
Assignee: Ed Espino


When the ranger check is enabled, GRANT and REVOKE commands should not be 
allowed to run. This work is expected to be done in the Ranger admin portal. 





[jira] [Assigned] (HAWQ-1312) Forbid grant/revoke command in HAWQ side once Ranger is configured.

2017-02-04 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1312:
-

Assignee: Lin Wen  (was: Ed Espino)

> Forbid grant/revoke command in HAWQ side once Ranger is configured.
> ---
>
> Key: HAWQ-1312
> URL: https://issues.apache.org/jira/browse/HAWQ-1312
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> When the ranger check is enabled, GRANT and REVOKE commands should not be 
> allowed to run. This work is expected to be done in the Ranger admin portal. 





[jira] [Assigned] (HAWQ-1284) HAWQ master is coredump when kill all process on master and standby

2017-01-19 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1284:
-

Assignee: Lin Wen  (was: Ed Espino)

> HAWQ master is coredump when kill all process on master and standby
> ---
>
> Key: HAWQ-1284
> URL: https://issues.apache.org/jira/browse/HAWQ-1284
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Lin Wen
>Assignee: Lin Wen
> Attachments: hawq-2017-01-17_054054.csv
>
>
> When the hawq cluster is running (no active queries), killing all postgres 
> processes on the master (with "killall postgres") and then all processes on 
> the standby (with "killall gpsyncmaster") makes the hawq master randomly 
> generate a core dump.
> The callstack is:
> #0  0x0032214325e5 in raise () from /lib64/libc.so.6
> #1  0x003221433dc5 in abort () from /lib64/libc.so.6
> #2  0x008cce7f in errfinish (dummy=Unhandled dwarf expression opcode 
> 0xf3
> ) at elog.c:686
> #3  0x008cf032 in elog_finish (elevel=Unhandled dwarf expression 
> opcode 0xf3
> ) at elog.c:1463
> #4  0x007d4912 in proc_exit_prepare (code=1) at ipc.c:153
> #5  0x007d4a38 in proc_exit (code=1) at ipc.c:93
> #6  0x008ccc7e in errfinish (dummy=Unhandled dwarf expression opcode 
> 0xf3
> ) at elog.c:670
> #7  0x0078dea1 in ServiceDoConnect (listenerPort=64556, 
> complain=Unhandled dwarf expression opcode 0xf3
> ) at service.c:165
> #8  0x004efd5a in XLogQDMirrorWrite (WriteRqst=, 
> flexible=0 '\000', xlog_switch=0 '\000') at xlog.c:1981
> #9  XLogWrite (WriteRqst=, flexible=0 '\000', 
> xlog_switch=0 '\000') at xlog.c:2354
> #10 0x004f2242 in XLogFlush (record=...) at xlog.c:2572
> #11 0x004f7288 in CreateCheckPoint (shutdown=Unhandled dwarf 
> expression opcode 0xf3
> ) at xlog.c:8136
> #12 0x004f9f72 in ShutdownXLOG (code=Unhandled dwarf expression 
> opcode 0xf3
> ) at xlog.c:7865
> #13 0x0078b2b0 in BackgroundWriterMain () at bgwriter.c:318
> #14 0x0055a870 in AuxiliaryProcessMain (argc=, 
> argv=0x7fff02330850) at bootstrap.c:467
> #15 0x0079b4f0 in StartChildProcess (type=Unhandled dwarf expression 
> opcode 0xf3
> ) at postmaster.c:6836
> #16 0x0079b7aa in CommenceNormalOperations () at postmaster.c:3618
> #17 0x0079fee4 in do_reaper () at postmaster.c:3831
> #18 ServerLoop () at postmaster.c:2136
> #19 0x007a2179 in PostmasterMain (argc=Unhandled dwarf expression 
> opcode 0xf3
> ) at postmaster.c:1454
> #20 0x004a4f99 in main (argc=9, argv=0x2a4f010) at main.c:226
> The reason is that the "WAL Send Server process" is killed first. When the 
> writer process gets a shutdown request, it begins to create a checkpoint and 
> sync the xlog to the standby master; however, at this point the wal send 
> server process has already been killed. So the writer process fails to 
> connect to the wal send server process and then ereports an ERROR:
> ereport(ERROR,
>         (errcode(ERRCODE_GP_INTERCONNECTION_ERROR),
>          errmsg("Could not connect to '%s': %s",
>                 serviceConfig->title,
>                 strerror(saved_err))));
> line:165, service.c
> From the call stack we can see that when ereport() is called, 
> proc_exit_prepare() is called. At line 152, CritSectionCount is larger than 
> 0, so a PANIC occurs and a core dump is generated. CritSectionCount is 
> incremented when the writer process calls XLogFlush().
>   if (CritSectionCount > 0)
>   elog(PANIC, "process is dying from critical section");
>  
> A possible solution: before the writer process writes the log to the 
> standby, check whether the wal send server process exists. If not, don't 
> call WalSendServerClientConnect() to connect to it. 
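The existence probe proposed above can be sketched with a standard POSIX idiom; a minimal C sketch, not the actual HAWQ patch, and `process_alive` is a hypothetical helper name.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical sketch of the proposed fix: before the writer process tries
 * to connect to the WAL send server during shutdown, probe whether that
 * process still exists.  kill(pid, 0) delivers no signal; it only checks
 * for existence (EPERM still means the process is there). */
static bool process_alive(pid_t pid)
{
    if (pid <= 0)
        return false;
    return kill(pid, 0) == 0 || errno == EPERM;
}
```

The writer would skip WalSendServerClientConnect() when this probe fails, avoiding the ERROR inside a critical section.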



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-1284) HAWQ master is coredump when kill all process on master and standby

2017-01-19 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1284:
--
Attachment: hawq-2017-01-17_054054.csv

master's log file

> HAWQ master is coredump when kill all process on master and standby
> ---
>
> Key: HAWQ-1284
> URL: https://issues.apache.org/jira/browse/HAWQ-1284
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Lin Wen
>Assignee: Ed Espino
> Attachments: hawq-2017-01-17_054054.csv
>
>
> When the hawq cluster is running (no active queries), killing all postgres 
> processes on the master (with "killall postgres") and then all processes on 
> the standby (with "killall gpsyncmaster") makes the hawq master randomly 
> generate a core dump.
> The callstack is:
> #0  0x0032214325e5 in raise () from /lib64/libc.so.6
> #1  0x003221433dc5 in abort () from /lib64/libc.so.6
> #2  0x008cce7f in errfinish (dummy=Unhandled dwarf expression opcode 
> 0xf3
> ) at elog.c:686
> #3  0x008cf032 in elog_finish (elevel=Unhandled dwarf expression 
> opcode 0xf3
> ) at elog.c:1463
> #4  0x007d4912 in proc_exit_prepare (code=1) at ipc.c:153
> #5  0x007d4a38 in proc_exit (code=1) at ipc.c:93
> #6  0x008ccc7e in errfinish (dummy=Unhandled dwarf expression opcode 
> 0xf3
> ) at elog.c:670
> #7  0x0078dea1 in ServiceDoConnect (listenerPort=64556, 
> complain=Unhandled dwarf expression opcode 0xf3
> ) at service.c:165
> #8  0x004efd5a in XLogQDMirrorWrite (WriteRqst=, 
> flexible=0 '\000', xlog_switch=0 '\000') at xlog.c:1981
> #9  XLogWrite (WriteRqst=, flexible=0 '\000', 
> xlog_switch=0 '\000') at xlog.c:2354
> #10 0x004f2242 in XLogFlush (record=...) at xlog.c:2572
> #11 0x004f7288 in CreateCheckPoint (shutdown=Unhandled dwarf 
> expression opcode 0xf3
> ) at xlog.c:8136
> #12 0x004f9f72 in ShutdownXLOG (code=Unhandled dwarf expression 
> opcode 0xf3
> ) at xlog.c:7865
> #13 0x0078b2b0 in BackgroundWriterMain () at bgwriter.c:318
> #14 0x0055a870 in AuxiliaryProcessMain (argc=, 
> argv=0x7fff02330850) at bootstrap.c:467
> #15 0x0079b4f0 in StartChildProcess (type=Unhandled dwarf expression 
> opcode 0xf3
> ) at postmaster.c:6836
> #16 0x0079b7aa in CommenceNormalOperations () at postmaster.c:3618
> #17 0x0079fee4 in do_reaper () at postmaster.c:3831
> #18 ServerLoop () at postmaster.c:2136
> #19 0x007a2179 in PostmasterMain (argc=Unhandled dwarf expression 
> opcode 0xf3
> ) at postmaster.c:1454
> #20 0x004a4f99 in main (argc=9, argv=0x2a4f010) at main.c:226
> The reason is that the "WAL Send Server process" is killed first. When the 
> writer process gets a shutdown request, it begins to create a checkpoint and 
> sync the xlog to the standby master; however, at this point the wal send 
> server process has already been killed. So the writer process fails to 
> connect to the wal send server process and then ereports an ERROR:
> ereport(ERROR,
>         (errcode(ERRCODE_GP_INTERCONNECTION_ERROR),
>          errmsg("Could not connect to '%s': %s",
>                 serviceConfig->title,
>                 strerror(saved_err))));
> line:165, service.c
> From the call stack we can see that when ereport() is called, 
> proc_exit_prepare() is called. At line 152, CritSectionCount is larger than 
> 0, so a PANIC occurs and a core dump is generated. CritSectionCount is 
> incremented when the writer process calls XLogFlush().
>   if (CritSectionCount > 0)
>   elog(PANIC, "process is dying from critical section");
>  
> A possible solution: before the writer process writes log to the standby, 
> check whether the WAL send server process exists. If it does not, do not 
> call WalSendServerClientConnect() to connect to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1284) HAWQ master is coredump when kill all process on master and standby

2017-01-19 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1284:
-

 Summary: HAWQ master is coredump when kill all process on master 
and standby
 Key: HAWQ-1284
 URL: https://issues.apache.org/jira/browse/HAWQ-1284
 Project: Apache HAWQ
  Issue Type: Bug
Reporter: Lin Wen
Assignee: Ed Espino


When the hawq cluster is running (with no active queries), killing all postgres 
processes on the master (with the command "killall postgres") and then all 
processes on the standby (with the command "killall gpsyncmaster") makes the 
hawq master generate a core dump randomly.

The callstack is:
#0  0x0032214325e5 in raise () from /lib64/libc.so.6
#1  0x003221433dc5 in abort () from /lib64/libc.so.6
#2  0x008cce7f in errfinish (dummy=Unhandled dwarf expression opcode 
0xf3
) at elog.c:686
#3  0x008cf032 in elog_finish (elevel=Unhandled dwarf expression opcode 
0xf3
) at elog.c:1463
#4  0x007d4912 in proc_exit_prepare (code=1) at ipc.c:153
#5  0x007d4a38 in proc_exit (code=1) at ipc.c:93
#6  0x008ccc7e in errfinish (dummy=Unhandled dwarf expression opcode 
0xf3
) at elog.c:670
#7  0x0078dea1 in ServiceDoConnect (listenerPort=64556, 
complain=Unhandled dwarf expression opcode 0xf3
) at service.c:165
#8  0x004efd5a in XLogQDMirrorWrite (WriteRqst=, 
flexible=0 '\000', xlog_switch=0 '\000') at xlog.c:1981
#9  XLogWrite (WriteRqst=, flexible=0 '\000', 
xlog_switch=0 '\000') at xlog.c:2354
#10 0x004f2242 in XLogFlush (record=...) at xlog.c:2572
#11 0x004f7288 in CreateCheckPoint (shutdown=Unhandled dwarf expression 
opcode 0xf3
) at xlog.c:8136
#12 0x004f9f72 in ShutdownXLOG (code=Unhandled dwarf expression opcode 
0xf3
) at xlog.c:7865
#13 0x0078b2b0 in BackgroundWriterMain () at bgwriter.c:318
#14 0x0055a870 in AuxiliaryProcessMain (argc=, 
argv=0x7fff02330850) at bootstrap.c:467
#15 0x0079b4f0 in StartChildProcess (type=Unhandled dwarf expression 
opcode 0xf3
) at postmaster.c:6836
#16 0x0079b7aa in CommenceNormalOperations () at postmaster.c:3618
#17 0x0079fee4 in do_reaper () at postmaster.c:3831
#18 ServerLoop () at postmaster.c:2136
#19 0x007a2179 in PostmasterMain (argc=Unhandled dwarf expression 
opcode 0xf3
) at postmaster.c:1454
#20 0x004a4f99 in main (argc=9, argv=0x2a4f010) at main.c:226

The reason is that the WAL send server process is killed first. When the writer 
process gets a shutdown request, it begins to create a checkpoint and sync xlog 
to the standby master, but at that point the WAL send server process has 
already been killed. So the writer process fails to connect to the WAL send 
server process and reports an ERROR:
ereport(ERROR,
        (errcode(ERRCODE_GP_INTERCONNECTION_ERROR),
         errmsg("Could not connect to '%s': %s",
                serviceConfig->title,
                strerror(saved_err))));
(service.c, line 165)

From the call stack we can see that when ereport() is called, 
proc_exit_prepare() is also called. At line 152, CritSectionCount is larger 
than 0, so a PANIC occurs and a core dump is generated. CritSectionCount is 
incremented when the writer process calls XLogFlush():
    if (CritSectionCount > 0)
        elog(PANIC, "process is dying from critical section");
 
A possible solution: before the writer process writes log to the standby, check 
whether the WAL send server process exists. If it does not, do not call 
WalSendServerClientConnect() to connect to it. 
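The failure sequence and the proposed guard can be sketched as follows. This is a minimal Python model of the logic, not HAWQ source; all names are illustrative stand-ins for the C symbols quoted above.

```python
# Model of the crash: an ERROR raised while CritSectionCount > 0 escalates
# to PANIC inside proc_exit_prepare(), as in the call stack above.

crit_section_count = 0

class PanicError(Exception):
    """Stands in for elog(PANIC, ...) followed by abort()."""

def proc_exit_prepare():
    # Mirrors the check quoted above: dying inside a critical section PANICs.
    if crit_section_count > 0:
        raise PanicError("process is dying from critical section")

def xlog_flush(wal_send_server_alive):
    """Sketch of XLogFlush -> XLogWrite -> connect to WAL send server."""
    global crit_section_count
    crit_section_count += 1              # START_CRIT_SECTION()
    try:
        if not wal_send_server_alive:
            # ereport(ERROR, ...) path: process exits, and on the way out
            # proc_exit_prepare() runs while still in the critical section.
            proc_exit_prepare()
    finally:
        crit_section_count -= 1          # END_CRIT_SECTION()

def xlog_flush_guarded(wal_send_server_alive):
    """Proposed fix: check that the WAL send server exists before entering
    the critical section, and skip the mirror write if it is gone."""
    if not wal_send_server_alive:
        return "skipped mirror write"    # no ERROR raised inside the section
    return "flushed"
```

With the guard, the shutdown checkpoint completes without a PANIC even when the WAL send server has already been killed.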



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1004) Implement calling Ranger REST Service using libcurl.

2016-12-19 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1004.
---
Resolution: Fixed

> Implement calling Ranger REST Service using libcurl.
> 
>
> Key: HAWQ-1004
> URL: https://issues.apache.org/jira/browse/HAWQ-1004
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Lili Ma
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Decide how HAWQ connects to Ranger: through which user, and how to connect 
> to the REST Server.
> Acceptance Criteria: 
> Provide an interface for HAWQ to connect to the Ranger REST Server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1144) Register into a 2-level partition table, hawq register didn't throw error, and indicates that hawq register succeed, but no data can be selected out.

2016-11-04 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1144.
---
Resolution: Fixed

> Register into a 2-level partition table, hawq register didn't throw error, 
> and indicates that hawq register succeed, but no data can be selected out.
> -
>
> Key: HAWQ-1144
> URL: https://issues.apache.org/jira/browse/HAWQ-1144
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> Register into a 2-level partition table, hawq register didn't throw error, 
> and indicates that hawq register succeed, but no data can be selected out.
> Reproduce Steps:
> 1. Create a one-level partition table
> {code}
>  create table parquet_wt (id SERIAL,a1 int,a2 char(5),a3 numeric,a4 boolean 
> DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 timestamp,a8 character 
> varying(705),a9 bigint,a10 date,a11 varchar(600),a12 text,a13 decimal,a14 
> real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with time zone,a19 
> timetz,a20 path,a21 box,a22 macaddr,a23 interval,a24 character 
> varying(800),a25 lseg,a26 point,a27 double precision,a28 circle,a29 int4,a30 
> numeric(8),a31 polygon,a32 date,a33 real,a34 money,a35 cidr,a36 inet,a37 
> time,a38 text,a39 bit,a40 bit varying(5),a41 smallint,a42 int )   WITH 
> (appendonly=true, orientation=parquet) distributed randomly  Partition by 
> range(a1) (start(1)  end(5000) every(1000) );
> {code}
> 2. insert some data into this table
> {code}
> insert into parquet_wt 
> (a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,a39,a40,a41,a42)
>  values(generate_series(1,20),'M',2011,'t','a','This is news of today: 
> Deadlock between Republicans and Democrats over how best to reduce the U.S. 
> deficit, and over what period, has blocked an agreement to allow the raising 
> of the $14.3 trillion debt ceiling','2001-12-24 02:26:11','U.S. House of 
> Representatives Speaker John Boehner, the top Republican in Congress who has 
> put forward a deficit reduction plan to be voted on later on Thursday said he 
> had no control over whether his bill would avert a credit 
> downgrade.',generate_series(2490,2505),'2011-10-11','The 
> Republican-controlled House is tentatively scheduled to vote on Boehner 
> proposal this afternoon at around 6 p.m. EDT (2200 GMT). The main Republican 
> vote counter in the House, Kevin McCarthy, would not say if there were enough 
> votes to pass the bill.','WASHINGTON:House Speaker John Boehner says his plan 
> mixing spending cuts in exchange for raising the nations $14.3 trillion debt 
> limit is not perfect but is as large a step that a divided government can 
> take that is doable and signable by President Barack Obama.The Ohio 
> Republican says the measure is an honest and sincere attempt at compromise 
> and was negotiated with Democrats last weekend and that passing it would end 
> the ongoing debt crisis. The plan blends $900 billion-plus in spending cuts 
> with a companion increase in the nations borrowing 
> cap.','1234.56',323453,generate_series(3452,3462),7845,'0011','2005-07-16 
> 01:51:15+1359','2001-12-13 
> 01:51:15','((1,2),(0,3),(2,1))','((2,3)(4,5))','08:00:2b:01:02:03','1-2','Republicans
>  had been working throughout the day Thursday to lock down support for their 
> plan to raise the nations debt ceiling, even as Senate Democrats vowed to 
> swiftly kill it if 
> passed.','((2,3)(4,5))','(6,7)',11.222,'((4,5),7)',32,3214,'(1,0,2,3)','2010-02-21',43564,'$1,000.00','192.168.1','126.1.3.4','12:30:45','Johnson
>  & Johnsons McNeil Consumer Healthcare announced the voluntary dosage 
> reduction today. Labels will carry new dosing instructions this fall.The 
> company says it will cut the maximum dosage of Regular Strength Tylenol and 
> other acetaminophen-containing products in 2012.Acetaminophen is safe when 
> used as directed, says Edwin Kuffner, MD, McNeil vice president of 
> over-the-counter medical affairs. But, when too much is taken, it can cause 
> liver damage.The action is intended to cut the risk of such accidental 
> overdoses, the company says in a news release.','1','0',12,23);
> {code}
> 3. extract the metadata out for the table
> {code}
> hawq extract -d postgres -o ~/parquet.yaml parquet_wt
> {code}
> 4. create a two-level partition table
> {code}
> CREATE TABLE parquet_wt_subpartgzip2  
> (id SERIAL,a1 
> int,a2 char(5),a3 numeric,a4 boolean DEFAULT false ,a5 char DEFAULT 'd',a6 
> text,a7 timestamp,a8 

[jira] [Resolved] (HAWQ-1127) HAWQ should print error message instead of python function stack when yaml file is invalid.

2016-10-31 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1127.
---
   Resolution: Fixed
Fix Version/s: 2.0.1.0-incubating

> HAWQ should print error message instead of python function stack when yaml 
> file is invalid.
> ---
>
> Key: HAWQ-1127
> URL: https://issues.apache.org/jira/browse/HAWQ-1127
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
> Attachments: force_mode_normal_tpl.yml
>
>
> When an invalid yaml file is used for register, hawq prints a Python stack 
> trace:
> [linw@linw-rhel feature]$ hawq register --force -d hawq_feature_test -c 
> /home/linw/workspace/hawq_working/apache-hawq/src/test/feature/ManagementTool/partition/force_mode_normal.yml
>  testhawqregister_testpartitionforcemodenormal.nt
> 20161031:12:48:49:557022 hawqregister:linw-rhel:linw-[INFO]:-try to connect 
> database localhost:5432 hawq_feature_test
> Traceback (most recent call last):
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1137, in 
> main(options, args)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1093, in main
> ins.prepare()
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1021, in prepare
> self._option_parser_yml(options.yml_config)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 475, in _option_parser_yml
> partitions_constraint = [d['Constraint'] for d in 
> params[Format_FileLocations]['Partitions']]
> KeyError: 'Constraint'
> Instead, hawq should print an error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1127) HAWQ should print error message instead of python function stack when yaml file is invalid.

2016-10-30 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1127:
-

Assignee: Lin Wen  (was: Lei Chang)

> HAWQ should print error message instead of python function stack when yaml 
> file is invalid.
> ---
>
> Key: HAWQ-1127
> URL: https://issues.apache.org/jira/browse/HAWQ-1127
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lin Wen
>Assignee: Lin Wen
> Attachments: force_mode_normal_tpl.yml
>
>
> When an invalid yaml file is used for register, hawq prints a Python stack 
> trace:
> [linw@linw-rhel feature]$ hawq register --force -d hawq_feature_test -c 
> /home/linw/workspace/hawq_working/apache-hawq/src/test/feature/ManagementTool/partition/force_mode_normal.yml
>  testhawqregister_testpartitionforcemodenormal.nt
> 20161031:12:48:49:557022 hawqregister:linw-rhel:linw-[INFO]:-try to connect 
> database localhost:5432 hawq_feature_test
> Traceback (most recent call last):
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1137, in 
> main(options, args)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1093, in main
> ins.prepare()
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1021, in prepare
> self._option_parser_yml(options.yml_config)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 475, in _option_parser_yml
> partitions_constraint = [d['Constraint'] for d in 
> params[Format_FileLocations]['Partitions']]
> KeyError: 'Constraint'
> Instead, hawq should print an error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-1127) HAWQ should print error message instead of python function stack when yaml file is invalid.

2016-10-30 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-1127:
--
Attachment: force_mode_normal_tpl.yml

> HAWQ should print error message instead of python function stack when yaml 
> file is invalid.
> ---
>
> Key: HAWQ-1127
> URL: https://issues.apache.org/jira/browse/HAWQ-1127
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lin Wen
>Assignee: Lei Chang
> Attachments: force_mode_normal_tpl.yml
>
>
> When an invalid yaml file is used for register, hawq prints a Python stack 
> trace:
> [linw@linw-rhel feature]$ hawq register --force -d hawq_feature_test -c 
> /home/linw/workspace/hawq_working/apache-hawq/src/test/feature/ManagementTool/partition/force_mode_normal.yml
>  testhawqregister_testpartitionforcemodenormal.nt
> 20161031:12:48:49:557022 hawqregister:linw-rhel:linw-[INFO]:-try to connect 
> database localhost:5432 hawq_feature_test
> Traceback (most recent call last):
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1137, in 
> main(options, args)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1093, in main
> ins.prepare()
>   File "/home/linw/hawq-bin/bin/hawqregister", line 1021, in prepare
> self._option_parser_yml(options.yml_config)
>   File "/home/linw/hawq-bin/bin/hawqregister", line 475, in _option_parser_yml
> partitions_constraint = [d['Constraint'] for d in 
> params[Format_FileLocations]['Partitions']]
> KeyError: 'Constraint'
> Instead, hawq should print an error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1127) HAWQ should print error message instead of python function stack when yaml file is invalid.

2016-10-30 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1127:
-

 Summary: HAWQ should print error message instead of python 
function stack when yaml file is invalid.
 Key: HAWQ-1127
 URL: https://issues.apache.org/jira/browse/HAWQ-1127
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: Lin Wen
Assignee: Lei Chang


When an invalid yaml file is used for register, hawq prints a Python stack 
trace:
[linw@linw-rhel feature]$ hawq register --force -d hawq_feature_test -c 
/home/linw/workspace/hawq_working/apache-hawq/src/test/feature/ManagementTool/partition/force_mode_normal.yml
 testhawqregister_testpartitionforcemodenormal.nt
20161031:12:48:49:557022 hawqregister:linw-rhel:linw-[INFO]:-try to connect 
database localhost:5432 hawq_feature_test
Traceback (most recent call last):
  File "/home/linw/hawq-bin/bin/hawqregister", line 1137, in <module>
main(options, args)
  File "/home/linw/hawq-bin/bin/hawqregister", line 1093, in main
ins.prepare()
  File "/home/linw/hawq-bin/bin/hawqregister", line 1021, in prepare
self._option_parser_yml(options.yml_config)
  File "/home/linw/hawq-bin/bin/hawqregister", line 475, in _option_parser_yml
partitions_constraint = [d['Constraint'] for d in 
params[Format_FileLocations]['Partitions']]
KeyError: 'Constraint'

Instead, hawq should print an error message.
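The proposed behaviour can be sketched like this. The helper below is hypothetical (not the actual hawqregister code); it shows catching the missing-key failure while parsing the yaml 'Partitions' section and emitting a one-line error instead of a Python traceback.

```python
def parse_partition_constraints(params, file_locations_key):
    """Return the 'Constraint' of each partition entry, or exit with a
    readable error when the yaml file is missing a required key."""
    try:
        return [d['Constraint']
                for d in params[file_locations_key]['Partitions']]
    except KeyError as missing:
        # Replaces the raw KeyError traceback with a hawqregister-style line.
        raise SystemExit(
            "[ERROR]:-invalid yaml configuration file: missing key %s "
            "in Partitions section" % missing)
```

The same pattern (catch, report, exit) would apply to the other yaml lookups in `_option_parser_yml`.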





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-1104) Add tupcount, varblockcount and eofuncompressed value in hawq extract yaml configuration, also add implementation in hawq register to recognize these values

2016-10-26 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1104.
---
Resolution: Fixed
  Assignee: Lin Wen  (was: hongwu)

> Add tupcount, varblockcount and eofuncompressed value in hawq extract yaml 
> configuration, also add implementation in hawq register to recognize these 
> values  
> --
>
> Key: HAWQ-1104
> URL: https://issues.apache.org/jira/browse/HAWQ-1104
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> Add tupcount, varblockcount and eofuncompressed values in the hawq extract 
> yaml configuration, and also add implementation in hawq register to 
> recognize these values so that the information in the catalog table 
> pg_aoseg.pg_aoseg_$relid or pg_aoseg.pg_paqseg_$relid becomes correct.
> After this work, the catalog information will be correct when we register a 
> table using the yaml configuration file generated from another table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-1117) can't start hawq cluster

2016-10-20 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590947#comment-15590947
 ] 

Lin Wen commented on HAWQ-1117:
---

When the hawq cluster is started, the RM process cleans up the segment history 
catalog table (in function cleanup_segment_config()), then receives heartbeat 
messages from the segments and adds each segment's information into the 
gp_segment_configuration table.
According to the log, the PQ connections used to manipulate the 
gp_configuration_history and gp_segment_configuration tables were not created 
for some reason, so updating the two catalog tables fails.

This is a known issue: when "--enable-cassert" is enabled, this error happens, 
and we can reproduce it here. [~yjin] is fixing it.
If "--enable-cassert" is disabled, this error does not happen. 


> can't start hawq  cluster
> -
>
> Key: HAWQ-1117
> URL: https://issues.apache.org/jira/browse/HAWQ-1117
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Devin Jia
>Assignee: Lei Chang
>
> After I upgraded hawq to 2.0.1 and built it, the hawq cluster can't start.
> 1.configure and build:
> {quote}
> ./configure --prefix=/opt/hawq-build --enable-depend --enable-cassert 
> --enable-debug
> make && make install
> {quote}
> 2. start error:
> {quote}
> [gpadmin@hmaster pg_log]$ more 
> /home/gpadmin/hawq-data-directory/masterdd/pg_log/hawq-2016-10-20_133056.csv 
> 2016-10-20 13:30:56.549712 
> CST,"gpadmin","template1",p3279,th-266811104,"[local]",,2016-10-20 13:30:56 
> CST,0,,,seg-1,"FATAL","57P03","the database system is in recovery 
> mode",,,
> 0,,"postmaster.c",2656,
> 2016-10-20 13:30:56.556630 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","database system 
> was interrupted at 2016-10-20 13:22:51 CST",,,0,,"xlog.c",6229,
> 2016-10-20 13:30:56.558414 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","checkpoint 
> record is at 0/857ED8",,,0,,"xlog.c",6306,
> 2016-10-20 13:30:56.558464 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","redo record is 
> at 0/857ED8; undo record is at 0/0; shutdown TRUE",,,0,,"xlog.c",6340,
> 2016-10-20 13:30:56.558495 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","next transaction 
> ID: 0/963; next OID: 10896",,,0,,"xlog.c",6344,
> 2016-10-20 13:30:56.558522 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","next 
> MultiXactId: 1; next MultiXactOffset: 0",,,0,,"xlog.c",6347,
> 2016-10-20 13:30:56.558559 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","database system 
> was not properly shut down; automatic recovery in 
> progress",,,0,,"xlog.c",6436,
> 2016-10-20 13:30:56.563303 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","record with zero 
> length at 0/857F28",,,0,,"xlog.c",4110,
> 2016-10-20 13:30:56.563348 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","no record for 
> redo after checkpoint, skip redo and proceed for recovery 
> pass",,,0,,"xlog.c",6500,
> 2016-10-20 13:30:56.563411 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","end of 
> transaction log location is 0/857F28",,,0,,"xlog.c",6584,
> 2016-10-20 13:30:56.568795 
> CST,,,p3280,th-2668111040,,,seg-1,"LOG","0","Finished startup 
> pass 1.  Proceeding to startup crash recovery passes 2 and 
> 3.",,,0,,"xlog.c",681
> 8,
> 2016-10-20 13:30:56.580641 
> CST,,,p3281,th-2668111040,,,seg-1,"LOG","0","Finished startup 
> crash recovery pass 2",,,0,,"xlog.c",6989,
> 2016-10-20 13:30:56.595325 
> CST,,,p3282,th-2668111040,,,seg-1,"LOG","0","recovery restart 
> point at 0/857ED8","xlog redo checkpoint: redo 0/857ED8; undo 0/0; tli 1; 
> xid 0/
> 963; oid 10896; multi 1; offset 0; shutdown
> REDO PASS 3 @ 0/857ED8; LSN 0/857F28: prev 0/857E88; xid 0: XLOG - 
> checkpoint: redo 0/857ED8; undo 0/0; tli 1; xid 0/963; oid 10896; multi 1; 
> offset 0; shutdown",,0,,"xlog.c",8331,
> 2016-10-20 13:30:56.595390 
> CST,,,p3282,th-2668111040,,,seg-1,"LOG","0","record with zero 
> length at 0/857F28",,,0,,"xlog.c",4110,
> 2016-10-20 13:30:56.595477 
> CST,,,p3282,th-2668111040,,,seg-1,"LOG","0","Oldest active 
> transaction from prepared transactions 963",,,0,,"xlog.c",5998,
> 2016-10-20 13:30:56.603266 
> CST,,,p3282,th-2668111040,,,seg-1,"LOG","0","database system 
> is ready",,,0,,"xlog.c",6024,
> 2016-10-20 13:30:56.603314 
> CST,,,p3282,th-2668111040,,,seg-1,"LOG","0","PostgreSQL 
> 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 2.0.1.0 build dev) on 
> x86_64-unknown-linux
> -gnu, compiled by GCC gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-15) compiled on 
> Oct 20 2016 12:27:04 (with assert checking)",,,0,,"xlog.c",6034,
> 

[jira] [Resolved] (HAWQ-1112) Error message is not accurate when hawq register with single file and the size is larger than real size

2016-10-18 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-1112.
---
   Resolution: Fixed
Fix Version/s: 2.0.1.0-incubating

> Error message is not accurate when hawq register with single file and the 
> size is larger than real size 
> 
>
> Key: HAWQ-1112
> URL: https://issues.apache.org/jira/browse/HAWQ-1112
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> Error message is not accurate when hawq register with single file and the 
> size is larger than real size. 
> 20161017:10:13:59:259787 hawqregister:linw-rhel:linw-[ERROR]:-File size(658) 
> in yaml configuration file should not exceed actual length(657) of file 
> hdfs://localhost:9000/hawq_register_hawq.paq
> "in yaml configuration file" is not accurate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-1112) Error message is not accurate when hawq register with single file and the size is larger than real size

2016-10-18 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-1112:
-

Assignee: Lin Wen  (was: Lei Chang)

> Error message is not accurate when hawq register with single file and the 
> size is larger than real size 
> 
>
> Key: HAWQ-1112
> URL: https://issues.apache.org/jira/browse/HAWQ-1112
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Error message is not accurate when hawq register with single file and the 
> size is larger than real size. 
> 20161017:10:13:59:259787 hawqregister:linw-rhel:linw-[ERROR]:-File size(658) 
> in yaml configuration file should not exceed actual length(657) of file 
> hdfs://localhost:9000/hawq_register_hawq.paq
> "in yaml configuration file" is not accurate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1112) Error message is not accurate when hawq register with single file and the size is larger than real size

2016-10-18 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-1112:
-

 Summary: Error message is not accurate when hawq register with 
single file and the size is larger than real size 
 Key: HAWQ-1112
 URL: https://issues.apache.org/jira/browse/HAWQ-1112
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: Lin Wen
Assignee: Lei Chang


The error message is not accurate when hawq register is run with a single file 
and the size is larger than the real size. 

20161017:10:13:59:259787 hawqregister:linw-rhel:linw-[ERROR]:-File size(658) in 
yaml configuration file should not exceed actual length(657) of file 
hdfs://localhost:9000/hawq_register_hawq.paq

"in yaml configuration file" is not accurate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-845) Parameterize kerberos principal name for HAWQ

2016-09-11 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483123#comment-15483123
 ] 

Lin Wen commented on HAWQ-845:
--

Hi, Matt,

HAWQ doesn't require HDFS owned by secured user in secure mode. But, the 
secureduser must have read/write permission on HDFS data directory.

> Parameterize kerberos principal name for HAWQ
> -
>
> Key: HAWQ-845
> URL: https://issues.apache.org/jira/browse/HAWQ-845
> Project: Apache HAWQ
>  Issue Type: Improvement
>Reporter: bhuvnesh chaudhary
>Assignee: Lei Chang
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> Currently HAWQ only accepts the principal 'postgres' for kerberos settings, 
> because it is hardcoded in gpcheckhdfs; we should ensure that it can be 
> parameterized.
> Also, it's better to change the default principal name from postgres to 
> gpadmin: it avoids having to change the hdfs directory to postgres while 
> securing the cluster, and avoids having to maintain the postgres user. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-899) Add feature test for nested null case with new test framework

2016-09-11 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-899.
--
Resolution: Fixed
  Assignee: Yi Jin  (was: Lin Wen)

> Add feature test for nested null case with new test framework
> -
>
> Key: HAWQ-899
> URL: https://issues.apache.org/jira/browse/HAWQ-899
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Yi Jin
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-898) Add feature test for COPY with new test framework

2016-09-11 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-898.
--
Resolution: Fixed

> Add feature test for COPY with new test framework 
> --
>
> Key: HAWQ-898
> URL: https://issues.apache.org/jira/browse/HAWQ-898
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-897) Add feature test for create table distribution with new test framework

2016-09-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-897.
--
Resolution: Fixed

> Add feature test for create table distribution with new test framework
> --
>
> Key: HAWQ-897
> URL: https://issues.apache.org/jira/browse/HAWQ-897
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-896) Add feature test for create table with new test framework

2016-09-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-896.
--
Resolution: Fixed

> Add feature test for create table with new test framework
> -
>
> Key: HAWQ-896
> URL: https://issues.apache.org/jira/browse/HAWQ-896
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-09-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-940.
--
Resolution: Fixed

> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use the same kerberos key file. 
> Whenever an hdfs operation is triggered, a function named login() is called, 
> and in login() the ticket is initialized with "kinit". 
> For libyarn, however, login() is only called when the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period (24 hours per the kerberos configuration file, krb.conf), the ticket 
> expires and HAWQ fails to register itself with Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-845) Parameterize kerberos principal name for HAWQ

2016-09-01 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-845.
--
   Resolution: Fixed
Fix Version/s: (was: backlog)
   2.0.1.0-incubating

> Parameterize kerberos principal name for HAWQ
> -
>
> Key: HAWQ-845
> URL: https://issues.apache.org/jira/browse/HAWQ-845
> Project: Apache HAWQ
>  Issue Type: Improvement
>Reporter: bhuvnesh chaudhary
>Assignee: Lei Chang
>Priority: Minor
> Fix For: 2.0.1.0-incubating
>
>
> Currently HAWQ only accepts the principal 'postgres' for kerberos settings, 
> because it is hardcoded in gpcheckhdfs; we should ensure that it can be 
> parameterized.
> Also, it's better to change the default principal name from postgres to 
> gpadmin: it avoids having to change the hdfs directory to postgres while 
> securing the cluster, and avoids having to maintain the postgres user. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-845) Parameterize kerberos principal name for HAWQ

2016-08-31 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15451291#comment-15451291
 ] 

Lin Wen commented on HAWQ-845:
--

For now I think we can keep 'postgres' as the default kerberos service name, 
but customers should be able to parameterize it with another name.
If a user wants to use a different name, the property/value below should be 
added to hawq-site.xml:

<property>
  <name>krb_srvname</name>
  <value>gpadmin</value>
</property>




> Parameterize kerberos principal name for HAWQ
> -
>
> Key: HAWQ-845
> URL: https://issues.apache.org/jira/browse/HAWQ-845
> Project: Apache HAWQ
>  Issue Type: Improvement
>Reporter: bhuvnesh chaudhary
>Assignee: Lei Chang
>Priority: Minor
> Fix For: backlog
>
>
> Currently HAWQ only accepts the principal 'postgres' for Kerberos settings, 
> because it is hardcoded in gpcheckhdfs; we should ensure that it can 
> be parameterized.
> Also, it's better to change the default principal name from postgres to 
> gpadmin. That avoids having to change the ownership of the HDFS directory to 
> postgres while securing the cluster, and avoids maintaining a separate 
> postgres user. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-17 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425935#comment-15425935
 ] 

Lin Wen edited comment on HAWQ-256 at 8/18/16 5:44 AM:
---

Hi, Don,

Since HAWQ will call the Ranger REST API to interact with Ranger, what kind of 
security method is supported by the Ranger REST API besides the common way? 
TLS, SSL, or Kerberos?
Thanks! 



was (Author: wlin):
Hi, Don,

Since HAWQ will call Ranger REST API to interact with Ranger, so what kind of 
security method is supported in REST API besides the common way? TLS, or SSL, 
or Kerberos?
Thanks! 


> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-17 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425935#comment-15425935
 ] 

Lin Wen commented on HAWQ-256:
--

Hi, Don,

Since HAWQ will call the Ranger REST API to interact with Ranger, what kind of 
security method is supported by the REST API besides the common way? TLS, SSL, 
or Kerberos?
Thanks! 


> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-979) Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report

2016-08-04 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-979.
--
Resolution: Fixed

> Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
> --
>
> Key: HAWQ-979
> URL: https://issues.apache.org/jira/browse/HAWQ-979
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> While HAWQ is running in YARN mode, the libyarn heartbeat thread may 
> sometimes fail (e.g. when the YARN RM restarts) and quit: 
> 2016-08-03 18:45:27.913838 
> PDT,,,p34645,th-12906104000,con4,,seg-1,"WARNING","01000","YARN 
> mode resource broker failed to get YARN queue report of queue default. 
> LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM 
> heartbeat thread has stopped.",,,0,,"resourcebroker_LIBYARN_proc.c",1840,
> The resource broker process should re-register HAWQ with YARN in this case, 
> but it does not.
> The reason is:
> In handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport() fails, 
> sendRBGetClusterReportErrorData() is called, but it returns OK (it should 
> return RESBROK_ERROR_GRM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-979) Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report

2016-08-04 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-979:


 Summary: Resource Broker Should Reconnect Hadoop Yarn When Failed 
to Get Cluster Report
 Key: HAWQ-979
 URL: https://issues.apache.org/jira/browse/HAWQ-979
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Resource Manager
Reporter: Lin Wen
Assignee: Lei Chang


While HAWQ is running in YARN mode, the libyarn heartbeat thread may sometimes 
fail (e.g. when the YARN RM restarts) and quit: 

2016-08-03 18:45:27.913838 
PDT,,,p34645,th-12906104000,con4,,seg-1,"WARNING","01000","YARN 
mode resource broker failed to get YARN queue report of queue default. 
LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM 
heartbeat thread has stopped.",,,0,,"resourcebroker_LIBYARN_proc.c",1840,

The resource broker process should re-register HAWQ with YARN in this case, but 
it does not.

The reason is:
In handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport() fails, 
sendRBGetClusterReportErrorData() is called, but it returns OK (it should 
return RESBROK_ERROR_GRM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-979) Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report

2016-08-04 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-979:


Assignee: Lin Wen  (was: Lei Chang)

> Resource Broker Should Reconnect Hadoop Yarn When Failed to Get Cluster Report
> --
>
> Key: HAWQ-979
> URL: https://issues.apache.org/jira/browse/HAWQ-979
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> While HAWQ is running in YARN mode, the libyarn heartbeat thread may 
> sometimes fail (e.g. when the YARN RM restarts) and quit: 
> 2016-08-03 18:45:27.913838 
> PDT,,,p34645,th-12906104000,con4,,seg-1,"WARNING","01000","YARN 
> mode resource broker failed to get YARN queue report of queue default. 
> LibYarnClient::getQueueInfo, Catch the Exception:LibYarnClient::libyarn AM 
> heartbeat thread has stopped.",,,0,,"resourcebroker_LIBYARN_proc.c",1840,
> The resource broker process should re-register HAWQ with YARN in this case, 
> but it does not.
> The reason is:
> In handleRM2RB_GetClusterReport(), when RB2YARN_getQueueReport() fails, 
> sendRBGetClusterReportErrorData() is called, but it returns OK (it should 
> return RESBROK_ERROR_GRM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-970) Provide More Accurate Information When LibYARN Meets an Exception

2016-08-03 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-970.
--
Resolution: Fixed

> Provide More Accurate Information When LibYARN Meets an Exception
> -
>
> Key: HAWQ-970
> URL: https://issues.apache.org/jira/browse/HAWQ-970
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Sometimes when an exception happens in libyarn, the log information is not 
> accurate enough. For example, below is an exception related to Kerberos 
> ticket expiration, but we cannot tell that from this log. 
> {code}
> 2016-07-06 01:47:51.945902 
> BST,,,p182375,th1403270400,con4,,seg-1,"WARNING","01000","YARN 
> mode resource broker failed to get container report. 
> LibYarnClient::getContainerReports, Catch the Exception:YarnIOException: 
> Unexpected exception: when calling ApplicationCl
> ientProtocol::getContainers in 
> /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnserver/ApplicationClientProtocol.cpp:
>  195",,,0,,"resourcebroker_LIBYARN_proc.c",1748,
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-970) Provide More Accurate Information When LibYARN Meets an Exception

2016-08-03 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-970:


Assignee: Lin Wen  (was: Lei Chang)

> Provide More Accurate Information When LibYARN Meets an Exception
> -
>
> Key: HAWQ-970
> URL: https://issues.apache.org/jira/browse/HAWQ-970
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Sometimes when an exception happens in libyarn, the log information is not 
> accurate enough. For example, below is an exception related to Kerberos 
> ticket expiration, but we cannot tell that from this log. 
> {code}
> 2016-07-06 01:47:51.945902 
> BST,,,p182375,th1403270400,con4,,seg-1,"WARNING","01000","YARN 
> mode resource broker failed to get container report. 
> LibYarnClient::getContainerReports, Catch the Exception:YarnIOException: 
> Unexpected exception: when calling ApplicationCl
> ientProtocol::getContainers in 
> /data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnserver/ApplicationClientProtocol.cpp:
>  195",,,0,,"resourcebroker_LIBYARN_proc.c",1748,
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-970) Provide More Accurate Information When LibYARN Meets an Exception

2016-08-01 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-970:


 Summary: Provide More Accurate Information When LibYARN Meets an 
Exception
 Key: HAWQ-970
 URL: https://issues.apache.org/jira/browse/HAWQ-970
 Project: Apache HAWQ
  Issue Type: Bug
  Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang


Sometimes when an exception happens in libyarn, the log information is not 
accurate enough. For example, below is an exception related to Kerberos ticket 
expiration, but we cannot tell that from this log. 

2016-07-06 01:47:51.945902 
BST,,,p182375,th1403270400,con4,,seg-1,"WARNING","01000","YARN mode 
resource broker failed to get container report. 
LibYarnClient::getContainerReports, Catch the Exception:YarnIOException: 
Unexpected exception: when calling ApplicationCl
ientProtocol::getContainers in 
/data1/pulse2-agent/agents/agent1/work/LIBYARN-main-opt/rhel5_x86_64/src/libyarnserver/ApplicationClientProtocol.cpp:
 195",,,0,,"resourcebroker_LIBYARN_proc.c",1748,



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-966) Adjust Libyarn Output Log

2016-07-31 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-966.
--
Resolution: Fixed

> Adjust Libyarn Output Log
> -
>
> Key: HAWQ-966
> URL: https://issues.apache.org/jira/browse/HAWQ-966
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> While HAWQ is running, libyarn generates a lot of logs. Some of them are 
> useless or duplicated for HAWQ users and should be compressed or reduced, so 
> that more meaningful log messages can be provided to HAWQ users. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-966) Adjust Libyarn Output Log

2016-07-31 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-966:


Assignee: Lin Wen  (was: Lei Chang)

> Adjust Libyarn Output Log
> -
>
> Key: HAWQ-966
> URL: https://issues.apache.org/jira/browse/HAWQ-966
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> While HAWQ is running, libyarn generates a lot of logs. Some of them are 
> useless or duplicated for HAWQ users and should be compressed or reduced, so 
> that more meaningful log messages can be provided to HAWQ users. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-966) Adjust Libyarn Output Log

2016-07-29 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-966:


 Summary: Adjust Libyarn Output Log
 Key: HAWQ-966
 URL: https://issues.apache.org/jira/browse/HAWQ-966
 Project: Apache HAWQ
  Issue Type: Bug
  Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang


While HAWQ is running, libyarn generates a lot of logs. Some of them are 
useless or duplicated for HAWQ users and should be compressed or reduced, so 
that more meaningful log messages can be provided to HAWQ users. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-20 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-940:


Assignee: Lin Wen  (was: Lei Chang)

> Kerberos Ticket Expired for LibYARN Operations
> --
>
> Key: HAWQ-940
> URL: https://issues.apache.org/jira/browse/HAWQ-940
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libyarn
>Reporter: Lin Wen
>Assignee: Lin Wen
> Fix For: 2.0.1.0-incubating
>
>
> HAWQ's libhdfs3 and libyarn use the same Kerberos keytab file. 
> Whenever an HDFS operation is triggered, a function named login() is called, 
> and in login() this ticket is initialized via "kinit". 
> But for libyarn, login() is only called when the resource broker 
> process starts. So if HAWQ starts up and there is no query for a long 
> period (24 hours by default in Kerberos's configuration file, krb5.conf), the 
> ticket will expire, and HAWQ fails to register itself with Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-940) Kerberos Ticket Expired for LibYARN Operations

2016-07-19 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-940:


 Summary: Kerberos Ticket Expired for LibYARN Operations
 Key: HAWQ-940
 URL: https://issues.apache.org/jira/browse/HAWQ-940
 Project: Apache HAWQ
  Issue Type: Bug
  Components: libyarn
Reporter: Lin Wen
Assignee: Lei Chang


HAWQ's libhdfs3 and libyarn use the same Kerberos keytab file. 
Whenever an HDFS operation is triggered, a function named login() is called, 
and in login() this ticket is initialized via "kinit". 
But for libyarn, login() is only called when the resource broker process 
starts. So if HAWQ starts up and there is no query for a long period (24 hours 
by default in Kerberos's configuration file, krb5.conf), the ticket will 
expire, and HAWQ fails to register itself with Hadoop YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-930) HAWQ RM can not work

2016-07-18 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381797#comment-15381797
 ] 

Lin Wen commented on HAWQ-930:
--

Hi,
The HAWQ RM log is also in this log file.
Before running "hawq restart cluster", was there any running query? 
Would you please run "ps -ef | grep postgres" to check, both before and after 
running "hawq restart cluster"?
Thanks!

> HAWQ RM can not work
> 
>
> Key: HAWQ-930
> URL: https://issues.apache.org/jira/browse/HAWQ-930
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 2.0.1.0-incubating
>Reporter: Biao Wu
>Assignee: Lei Chang
>
> The HAWQ version is "HAWQ version 2.0.1.0 build dev".
> Segment number: 17.
> After running `hawq restart cluster`, the pg_log shows:
> 2016-07-18 14:04:42.799428 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 151",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:43.799498 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 152",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:44.799569 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 153",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:45.799639 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 154",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:46.799709 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 155",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:47.799780 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 156",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:48.799850 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 157",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:49.799918 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 158",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:50.799988 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 159",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:51.800056 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 160",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:52.800126 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 161",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:53.800195 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 162",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:54.800263 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 163",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:55.800331 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 164",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:56.800399 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 165",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:57.800466 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 166",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:58.800535 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 167",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:04:59.800602 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 168",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:00.800669 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 169",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:01.800736 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 170",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:02.800803 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 171",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:03.800870 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 172",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:04.800938 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 173",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:05.801004 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 174",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:06.801073 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 175",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:07.801132 
> CST,,,p136927,th-17368286400,,,seg-1,"LOG","0","Wait for HAWQ 
> RM 176",,,0,,"resourcemanager.c",421,
> 2016-07-18 14:05:08.801224 
> 

[jira] [Resolved] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes

2016-07-12 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-918.
--
Resolution: Fixed

> Fix memtuple forming bug when null-saved size is larger than 32763 bytes
> 
>
> Key: HAWQ-918
> URL: https://issues.apache.org/jira/browse/HAWQ-918
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
> Attachments: run.sql
>
>
> When running a SQL query, an error occurs in the QE:
> psql:run.sql:24: ERROR:  Query Executor Error in seg2 localhost:4 
> pid=55810: server closed the connection unexpectedly
> DETAIL:
>   This probably means the server terminated abnormally
>   before or while processing the request.
> 2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 
> CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0"1    0x8b764e postgres 
>  + 0x8b764e
> 2    0x3b66e0f710 libpthread.so.0  + 0x66e0f710
> 3    0x3b6668995b libc.so.6 memcpy + 0x2eb
> 4    0x8940d8 postgres textout + 0x58
> 5    0x8c32d7 postgres DirectFunctionCall1 + 0x47
> 6    0x88a126 postgres text_timestamp + 0xc6
> 7    0x669b47 postgres  + 0x669b47
> 8    0x669fe9 postgres  + 0x669fe9
> 9    0x66f54e postgres ExecProject + 0x23e
> 10   0x680695 postgres ExecAgg + 0x525
> 11   0x6643b1 postgres ExecProcNode + 0x221
> 12   0x689da8 postgres ExecLimit + 0x218
> 13   0x664521 postgres ExecProcNode + 0x391
> 14   0x68d549 postgres ExecMotion + 0x39
> 15   0x6643c1 postgres ExecProcNode + 0x231
> 16   0x660752 postgres  + 0x660752
> 17   0x6610ea postgres ExecutorRun + 0x4ca
> 18   0x7e4c3a postgres PortalRun + 0x58a
> 19   0x7dab64 postgres  + 0x7dab64
> 20   0x7dfaf5 postgres PostgresMain + 0x2b65
> 21   0x790e7f postgres  + 0x790e7f
> 22   0x793b39 postgres PostmasterMain + 0x759
> 23   0x4a19cf postgres main + 0x50f
> 24   0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
> 25   0x4a1a4d postgres  + 0x4a1a4d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes

2016-07-12 Thread Lin Wen (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372686#comment-15372686
 ] 

Lin Wen commented on HAWQ-918:
--

compute_null_save() computes how many bytes a tuple saves by using the
null bitmap. Formerly the return value was of short type, so it could easily
overflow to a negative value, which corrupts memtuples and causes a SIGSEGV.

> Fix memtuple forming bug when null-saved size is larger than 32763 bytes
> 
>
> Key: HAWQ-918
> URL: https://issues.apache.org/jira/browse/HAWQ-918
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
> Attachments: run.sql
>
>
> When running a SQL query, an error occurs in the QE:
> psql:run.sql:24: ERROR:  Query Executor Error in seg2 localhost:4 
> pid=55810: server closed the connection unexpectedly
> DETAIL:
>   This probably means the server terminated abnormally
>   before or while processing the request.
> 2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 
> CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0"1    0x8b764e postgres 
>  + 0x8b764e
> 2    0x3b66e0f710 libpthread.so.0  + 0x66e0f710
> 3    0x3b6668995b libc.so.6 memcpy + 0x2eb
> 4    0x8940d8 postgres textout + 0x58
> 5    0x8c32d7 postgres DirectFunctionCall1 + 0x47
> 6    0x88a126 postgres text_timestamp + 0xc6
> 7    0x669b47 postgres  + 0x669b47
> 8    0x669fe9 postgres  + 0x669fe9
> 9    0x66f54e postgres ExecProject + 0x23e
> 10   0x680695 postgres ExecAgg + 0x525
> 11   0x6643b1 postgres ExecProcNode + 0x221
> 12   0x689da8 postgres ExecLimit + 0x218
> 13   0x664521 postgres ExecProcNode + 0x391
> 14   0x68d549 postgres ExecMotion + 0x39
> 15   0x6643c1 postgres ExecProcNode + 0x231
> 16   0x660752 postgres  + 0x660752
> 17   0x6610ea postgres ExecutorRun + 0x4ca
> 18   0x7e4c3a postgres PortalRun + 0x58a
> 19   0x7dab64 postgres  + 0x7dab64
> 20   0x7dfaf5 postgres PostgresMain + 0x2b65
> 21   0x790e7f postgres  + 0x790e7f
> 22   0x793b39 postgres PostmasterMain + 0x759
> 23   0x4a19cf postgres main + 0x50f
> 24   0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
> 25   0x4a1a4d postgres  + 0x4a1a4d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes

2016-07-12 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen updated HAWQ-918:
-
Attachment: run.sql

reproduce sql file

> Fix memtuple forming bug when null-saved size is larger than 32763 bytes
> 
>
> Key: HAWQ-918
> URL: https://issues.apache.org/jira/browse/HAWQ-918
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
> Attachments: run.sql
>
>
> When running a SQL query, an error occurs in the QE:
> psql:run.sql:24: ERROR:  Query Executor Error in seg2 localhost:4 
> pid=55810: server closed the connection unexpectedly
> DETAIL:
>   This probably means the server terminated abnormally
>   before or while processing the request.
> 2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 
> CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0"1    0x8b764e postgres 
>  + 0x8b764e
> 2    0x3b66e0f710 libpthread.so.0  + 0x66e0f710
> 3    0x3b6668995b libc.so.6 memcpy + 0x2eb
> 4    0x8940d8 postgres textout + 0x58
> 5    0x8c32d7 postgres DirectFunctionCall1 + 0x47
> 6    0x88a126 postgres text_timestamp + 0xc6
> 7    0x669b47 postgres  + 0x669b47
> 8    0x669fe9 postgres  + 0x669fe9
> 9    0x66f54e postgres ExecProject + 0x23e
> 10   0x680695 postgres ExecAgg + 0x525
> 11   0x6643b1 postgres ExecProcNode + 0x221
> 12   0x689da8 postgres ExecLimit + 0x218
> 13   0x664521 postgres ExecProcNode + 0x391
> 14   0x68d549 postgres ExecMotion + 0x39
> 15   0x6643c1 postgres ExecProcNode + 0x231
> 16   0x660752 postgres  + 0x660752
> 17   0x6610ea postgres ExecutorRun + 0x4ca
> 18   0x7e4c3a postgres PortalRun + 0x58a
> 19   0x7dab64 postgres  + 0x7dab64
> 20   0x7dfaf5 postgres PostgresMain + 0x2b65
> 21   0x790e7f postgres  + 0x790e7f
> 22   0x793b39 postgres PostmasterMain + 0x759
> 23   0x4a19cf postgres main + 0x50f
> 24   0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
> 25   0x4a1a4d postgres  + 0x4a1a4d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes

2016-07-12 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-918:


Assignee: Lin Wen  (was: Lei Chang)

> Fix memtuple forming bug when null-saved size is larger than 32763 bytes
> 
>
> Key: HAWQ-918
> URL: https://issues.apache.org/jira/browse/HAWQ-918
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> When running a SQL query, an error occurs in the QE:
> psql:run.sql:24: ERROR:  Query Executor Error in seg2 localhost:4 
> pid=55810: server closed the connection unexpectedly
> DETAIL:
>   This probably means the server terminated abnormally
>   before or while processing the request.
> 2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 
> CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: 
> Segment process received signal SIGSEGV",,,0"1    0x8b764e postgres 
>  + 0x8b764e
> 2    0x3b66e0f710 libpthread.so.0  + 0x66e0f710
> 3    0x3b6668995b libc.so.6 memcpy + 0x2eb
> 4    0x8940d8 postgres textout + 0x58
> 5    0x8c32d7 postgres DirectFunctionCall1 + 0x47
> 6    0x88a126 postgres text_timestamp + 0xc6
> 7    0x669b47 postgres  + 0x669b47
> 8    0x669fe9 postgres  + 0x669fe9
> 9    0x66f54e postgres ExecProject + 0x23e
> 10   0x680695 postgres ExecAgg + 0x525
> 11   0x6643b1 postgres ExecProcNode + 0x221
> 12   0x689da8 postgres ExecLimit + 0x218
> 13   0x664521 postgres ExecProcNode + 0x391
> 14   0x68d549 postgres ExecMotion + 0x39
> 15   0x6643c1 postgres ExecProcNode + 0x231
> 16   0x660752 postgres  + 0x660752
> 17   0x6610ea postgres ExecutorRun + 0x4ca
> 18   0x7e4c3a postgres PortalRun + 0x58a
> 19   0x7dab64 postgres  + 0x7dab64
> 20   0x7dfaf5 postgres PostgresMain + 0x2b65
> 21   0x790e7f postgres  + 0x790e7f
> 22   0x793b39 postgres PostmasterMain + 0x759
> 23   0x4a19cf postgres main + 0x50f
> 24   0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
> 25   0x4a1a4d postgres  + 0x4a1a4d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-918) Fix memtuple forming bug when null-saved size is larger than 32763 bytes

2016-07-12 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-918:


 Summary: Fix memtuple forming bug when null-saved size is larger 
than 32763 bytes
 Key: HAWQ-918
 URL: https://issues.apache.org/jira/browse/HAWQ-918
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Lin Wen
Assignee: Lei Chang


When running a SQL query, an error occurs in the QE:
psql:run.sql:24: ERROR:  Query Executor Error in seg2 localhost:4 
pid=55810: server closed the connection unexpectedly
DETAIL:
This probably means the server terminated abnormally
before or while processing the request.

2016-07-13 00:21:53.951987 CST,,,p34013,th0,,,2016-07-13 00:21:29 
CST,0,con33,cmd33,seg1,slice2"PANIC","XX000","Unexpected internal error: 
Segment process received signal SIGSEGV",,,0"1    0x8b764e postgres 
 + 0x8b764e
2    0x3b66e0f710 libpthread.so.0  + 0x66e0f710
3    0x3b6668995b libc.so.6 memcpy + 0x2eb
4    0x8940d8 postgres textout + 0x58
5    0x8c32d7 postgres DirectFunctionCall1 + 0x47
6    0x88a126 postgres text_timestamp + 0xc6
7    0x669b47 postgres  + 0x669b47
8    0x669fe9 postgres  + 0x669fe9
9    0x66f54e postgres ExecProject + 0x23e
10   0x680695 postgres ExecAgg + 0x525
11   0x6643b1 postgres ExecProcNode + 0x221
12   0x689da8 postgres ExecLimit + 0x218
13   0x664521 postgres ExecProcNode + 0x391
14   0x68d549 postgres ExecMotion + 0x39
15   0x6643c1 postgres ExecProcNode + 0x231
16   0x660752 postgres  + 0x660752
17   0x6610ea postgres ExecutorRun + 0x4ca
18   0x7e4c3a postgres PortalRun + 0x58a
19   0x7dab64 postgres  + 0x7dab64
20   0x7dfaf5 postgres PostgresMain + 0x2b65
21   0x790e7f postgres  + 0x790e7f
22   0x793b39 postgres PostmasterMain + 0x759
23   0x4a19cf postgres main + 0x50f
24   0x3b6661ed5d libc.so.6 __libc_start_main + 0xfd
25   0x4a1a4d postgres  + 0x4a1a4d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HAWQ-912) Skip Temporary Directory Checking for Master/standby

2016-07-11 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-912.
--
Resolution: Fixed

> Skip Temporary Directory Checking for Master/standby
> 
>
> Key: HAWQ-912
> URL: https://issues.apache.org/jira/browse/HAWQ-912
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Fault Tolerance
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> In function loadDynamicResourceManagerConfigure(), when HAWQ starts up, only 
> segments need to check temporary directories. 
> The master and standby should be skipped. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-912) Skip Temporary Directory Checking for Master/standby

2016-07-11 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-912:


Assignee: Lin Wen  (was: Lei Chang)

> Skip Temporary Directory Checking for Master/standby
> 
>
> Key: HAWQ-912
> URL: https://issues.apache.org/jira/browse/HAWQ-912
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Fault Tolerance
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> In function loadDynamicResourceManagerConfigure(), when HAWQ starts up, only 
> segments need to check temporary directories. 
> The master and standby should be skipped. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-912) Skip Temporary Directory Checking for Master/standby

2016-07-11 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-912:


 Summary: Skip Temporary Directory Checking for Master/standby
 Key: HAWQ-912
 URL: https://issues.apache.org/jira/browse/HAWQ-912
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Fault Tolerance
Reporter: Lin Wen
Assignee: Lei Chang


In function loadDynamicResourceManagerConfigure(), when HAWQ starts up, only 
segments need to check temporary directories. 
The master and standby should be skipped. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HAWQ-898) Add feature test for COPY with new test framework

2016-07-06 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-898:


Assignee: Lin Wen  (was: Jiali Yao)

> Add feature test for COPY with new test framework 
> --
>
> Key: HAWQ-898
> URL: https://issues.apache.org/jira/browse/HAWQ-898
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Lin Wen
>Assignee: Lin Wen
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

