[GitHub] incubator-hawq issue #1377: HAWQ-1627. Support setting the max protocol mess...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1377 +1 ---
[GitHub] incubator-hawq pull request #1377: HAWQ-1627. Support setting the max protoc...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1377#discussion_r196996343 --- Diff: depends/libhdfs3/src/rpc/RpcChannel.cpp --- @@ -768,7 +771,15 @@ void RpcChannelImpl::readOneResponse(bool writeLock) { buffer.resize(headerSize); in->readFully(&buffer[0], headerSize, readTimeout); -if (!curRespHeader.ParseFromArray(&buffer[0], headerSize)) { +// use CodedInputStream around the buffer, so we can set TotalBytesLimit on it +ArrayInputStream ais(&buffer[0], headerSize); +CodedInputStream cis(&ais); +cis.SetTotalBytesLimit(maxLength, maxLength/2); + +// use ParseFromCodedStream instead of ParseFromArray, so it can consume the above CodedInputStream +// +// if just use ParseFromArray, we have on chance to set TotalBytesLimit (64MB default) --- End diff -- "on" should be "no"? ---
[GitHub] incubator-hawq issue #1372: HAWQ-1619. Fix Vectorized Execution bugs
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1372 +1 ---
[GitHub] incubator-hawq issue #1373: HAWQ-1618. Segment panic at workfile_mgr_close_f...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1373 +1 ---
[GitHub] incubator-hawq issue #1371: HAWQ-1620. Push down target list information(pi_...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1371 merged to master. ---
[GitHub] incubator-hawq pull request #1371: HAWQ-1620. Push down target list informat...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1371 ---
[GitHub] incubator-hawq pull request #1371: HAWQ-1620. Push down target list informat...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1371 HAWQ-1620. Push down target list information(pi_targetlist in structure ProjectionInfo) HAWQ-1620. Push down target list information(pi_targetlist in structure ProjectionInfo) to scan when create Bloomfilter structure. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1620 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1371.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1371 commit 46b778b2b29418ae78561cf211f2d99d956a138f Author: Wen Lin Date: 2018-05-30T13:38:45Z HAWQ-1620. Push down target list information(pi_targetlist in structure ProjectionInfo) to scan when create Bloomfilter structure. ---
[GitHub] incubator-hawq issue #1368: HAWQ-1616. Fix the wrong result of hash join whe...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1368 merged. ---
[GitHub] incubator-hawq pull request #1368: HAWQ-1616. Fix the wrong result of hash j...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1368 ---
[GitHub] incubator-hawq pull request #1368: HAWQ-1616. Fix the wrong result of hash j...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1368 HAWQ-1616. Fix the wrong result of hash join when enable Bloom filter. The projection information of join keys hasn't been pushed down to parquet scan correctly. It causes computing a wrong hash value during parquet scan. You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1616 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1368.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1368 commit 452b95ecd36fd978d305fb4330ef7de2987c374c Author: Wen Lin <wlin@...> Date: 2018-05-25T02:07:16Z HAWQ-1616. Fix the wrong result of hash join when enable Bloom filter. The projection information of join keys hasn't been pushed down to parquet scan correctly. ---
[GitHub] incubator-hawq pull request #1366: HAWQ-1615. Fix accessing invalid memory w...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1366 HAWQ-1615. Fix accessing invalid memory when run a hash-join query with Bloomfilter enable. The BloomFilter structure in RuntimeFilterState should be allocated, instead of using the address of HashJoinTable's BloomFilter, since it may be released when function FreeScanRuntimefilterState() tries to access it. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1615b Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1366.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1366 commit 3c695200ba990476656b0c6a7ca94df49e851866 Author: Wen Lin <wlin@...> Date: 2018-05-21T07:44:36Z HAWQ-1615. Fix accessing invalid memory when run a hash-join query with Bloomfilter enable. The BloomFilter structure in RuntimeFilterState should be allocated, instead of using the address of HashJoinTable's BloomFilter, since it may be released when function FreeScanRuntimefilterState() tries to access it. ---
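The ownership problem described in HAWQ-1615 can be pictured with a small sketch. This is not HAWQ code: `SketchFilter`, `sketch_filter_new`, `sketch_filter_clone` and `sketch_filter_free` are hypothetical names, and a deep copy is only one possible way to give the scan state its own allocation. The point is that the scan-side structure no longer depends on the lifetime of the hash table's filter.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Bloom filter; not the HAWQ BloomFilter struct. */
typedef struct SketchFilter
{
    size_t   nbytes; /* size of the bit array in bytes */
    uint8_t *bits;   /* the bit array itself */
} SketchFilter;

/* Allocate a zeroed filter; returns NULL on failure or if nbytes is 0. */
static SketchFilter *
sketch_filter_new(size_t nbytes)
{
    SketchFilter *f;

    if (nbytes == 0)
        return NULL;
    f = malloc(sizeof(*f));
    if (f == NULL)
        return NULL;
    f->bits = calloc(nbytes, 1);
    if (f->bits == NULL)
    {
        free(f);
        return NULL;
    }
    f->nbytes = nbytes;
    return f;
}

/* Release a filter and its bit array; safe to call with NULL. */
static void
sketch_filter_free(SketchFilter *f)
{
    if (f != NULL)
    {
        free(f->bits);
        free(f);
    }
}

/*
 * Give the caller its own allocation instead of a borrowed pointer,
 * so freeing the source later cannot leave the copy dangling.
 */
static SketchFilter *
sketch_filter_clone(const SketchFilter *src)
{
    SketchFilter *dst = sketch_filter_new(src->nbytes);

    if (dst != NULL)
        memcpy(dst->bits, src->bits, src->nbytes);
    return dst;
}
```

With this shape, the side that builds the filter and the side that probes it can each free their own structure in any order.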
[GitHub] incubator-hawq pull request #1363: HAWQ-1608. Implement Printing Runtime Fil...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1363 HAWQ-1608. Implement Printing Runtime Filter Information For "explain analyze" Implement Printing Runtime Filter Information For "explain analyze" Change GUC hawq_hashjoin_bloomfilter to bool. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1608b Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1363.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1363 commit 25ed0ae26028a2d3f784fd75ad9db1d3f888539f Author: Wen Lin <wlin@...> Date: 2018-05-14T09:39:38Z HAWQ-1608. Implement Printing Runtime Filter Information For "explain analyze". Change GUC hawq_hashjoin_bloomfilter to bool. ---
[GitHub] incubator-hawq pull request #1360: HAWQ-1607. This commit implements applyin...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1360 ---
[GitHub] incubator-hawq issue #1360: HAWQ-1607. This commit implements applying Bloom...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1360 merged into master. ---
[GitHub] incubator-hawq issue #1360: HAWQ-1607. This commit implements applying Bloom...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1360 This commit doesn't contain test cases; they will be added with HAWQ-1608. Once HAWQ-1608 is finished, users can use the "explain analyze" statement to see whether the Bloom filter is used for a hash join. ---
[GitHub] incubator-hawq pull request #1360: HAWQ-1607. This commit implements applyin...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1360 HAWQ-1607. This commit implements applying Bloom filter during Scan outer table 1. Push down the Bloom filter structure to the outer table scan (only parquet is supported); 2. Check whether each tuple from the outer table is found in the Bloom filter structure. 3. Add a GUC hawq_hashjoin_bloomfilter_sampling_number. This GUC controls the Bloom filter sampling number: while scanning the outer table, for the first N tuples of the outer table, if the ratio is larger than hawq_hashjoin_bloomfilter_ratio, the remaining tuples will not be checked by the Bloom filter. 4. If there is any expression on the outer join keys other than T_Var (projection), such as fact.c1 + 1 = dim.c1, or if there are multiple join keys, e.g. fact.c1 = dim.c1 and fact.c2 = dim.c2, the Bloom filter won't be created, since these cases involve pushing down expression and projection information to the scan, which will be implemented later. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq_1607v2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1360.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1360 commit 08a0951af95ce4945cc67a5c7bc67acdc4e9b94e Author: Wen Lin <wlin@...> Date: 2018-05-06T13:19:14Z HAWQ-1607. This commit implements applying Bloom filter during Scan outer table, test cases will be added with HAWQ-1608. 1. Push down the Bloom filter structure to the outer table scan (only parquet is supported); 2. Check whether each tuple from the outer table is found in the Bloom filter structure. 3. Add a GUC hawq_hashjoin_bloomfilter_sampling_number. 
This GUC controls the Bloom filter sampling number: while scanning the outer table, for the first N tuples of the outer table, if the ratio is larger than hawq_hashjoin_bloomfilter_ratio, the remaining tuples will not be checked by the Bloom filter. 4. If there is any expression on the outer join keys other than T_Var (projection), such as fact.c1 + 1 = dim.c1, or if there are multiple join keys, e.g. fact.c1 = dim.c1 and fact.c2 = dim.c2, the Bloom filter won't be created, since these cases involve pushing down expression and projection information to the scan, which will be implemented later. ---
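The sampling rule in item 3 can be sketched as below. This is an illustration under stated assumptions, not the HAWQ implementation: `SketchProbeState` and `sketch_probe_outer` are hypothetical names, and "the ratio" is assumed here to mean the fraction of sampled outer tuples that were found in the Bloom filter, which the description does not spell out. The caller passes in the result of the Bloom filter lookup for each outer tuple.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-scan bookkeeping; field names are illustrative only. */
typedef struct SketchProbeState
{
    uint64_t sampling_number; /* N: how many outer tuples to sample */
    double   ratio_limit;     /* threshold playing the role of the ratio GUC */
    uint64_t checked;         /* outer tuples checked so far */
    uint64_t passed;          /* of those, how many were found in the filter */
    bool     filter_disabled; /* set once the filter is judged not worthwhile */
} SketchProbeState;

/*
 * Returns true if the outer tuple should be forwarded to the join.
 * found_in_filter is the result of the Bloom filter lookup for this tuple.
 */
static bool
sketch_probe_outer(SketchProbeState *st, bool found_in_filter)
{
    /* Filter already switched off: forward every tuple unchecked. */
    if (st->filter_disabled)
        return true;

    st->checked++;
    if (found_in_filter)
        st->passed++;

    /* After exactly N tuples, decide once whether to keep filtering. */
    if (st->checked == st->sampling_number)
    {
        double ratio = (double) st->passed / (double) st->checked;

        /* Too many tuples pass: the filter removes little, so stop using it. */
        if (ratio > st->ratio_limit)
            st->filter_disabled = true;
    }

    return found_in_filter;
}
```

The decision is made only once, after the first N tuples, so a scan that starts with a selective filter keeps it for the rest of the table.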
[GitHub] incubator-hawq issue #1356: HAWQ-1611. refactor the vtype in order to advanc...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1356 +1 ---
[GitHub] incubator-hawq pull request #1355: HAWQ-1606. Fix "make unittest-check" erro...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1355 HAWQ-1606. Fix "make unittest-check" error and set GUC error. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1606a Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1355.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1355 commit f68cac7e87949120aa1879650b196ea67348388b Author: Wen Lin <wlin@...> Date: 2018-04-18T03:57:25Z HAWQ-1606. Fix "make unittest-check" error and set GUC error ---
[GitHub] incubator-hawq pull request #1354: HAWQ-1606. Implement Deciding to Create B...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1354 ---
[GitHub] incubator-hawq pull request #1354: HAWQ-1606. Implement Deciding to Create B...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1354 HAWQ-1606. Implement Deciding to Create Bloom Filter During Query Plan And Create Bloom filter For Inner Table This commit decides during query planning whether to create a Bloom filter, and creates the Bloom filter for the inner table. It includes: 1. Introduce a GUC, hawq_hashjoin_bloomfilter_max_memory_size, which controls the maximum memory size for one Bloom filter in a hash join. 2. Introduce a GUC, hawq_hashjoin_bloomfilter_ratio: when the ratio (estimated number of hash join tuples)/(number of tuples in the outer table) is lower than this GUC, a Bloom filter can be used in the hash join. 3. Decide whether to create the Bloom filter during the query planning phase. 4. During the query execution phase, create the Bloom filter structure and populate it with tuples from the inner table. Please review it, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1606 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1354.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1354 commit 11b20b51026419cf91c71cabb9c17e0e467399f7 Author: Wen Lin <wlin@...> Date: 2018-04-15T11:29:19Z HAWQ-1606. This commit decides during query planning whether to create a Bloom filter, and creates the Bloom filter for the inner table. It includes: 1. Introduce a GUC, hawq_hashjoin_bloomfilter_max_memory_size, which controls the maximum memory size for one Bloom filter in a hash join. 2. Introduce a GUC, hawq_hashjoin_bloomfilter_ratio: when the ratio (estimated number of hash join tuples)/(number of tuples in the outer table) is lower than this GUC, a Bloom filter can be used in the hash join. 3. Decide whether to create the Bloom filter during the query planning phase. 4. 
During the query execution phase, create the Bloom filter structure and populate it with tuples from the inner table. ---
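The planning rule and the build step listed above can be sketched roughly as follows. This is a minimal illustration, not the HAWQ code: `SketchBloom`, `sketch_should_use_bloom`, `sketch_bloom_create`, `sketch_bloom_add` and `sketch_bloom_maybe_contains` are hypothetical names, join keys are assumed to be 64-bit integers, and the hash mixer (splitmix64) and the choice of two probes per key are assumptions made only for the example. Only the ideas (a ratio threshold, a memory cap, and populating the filter from inner-table keys) come from the description.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical Bloom filter; not the structure used by HAWQ. */
typedef struct SketchBloom
{
    uint8_t *bits;  /* bit array */
    uint64_t nbits; /* number of bits in the array */
} SketchBloom;

/* Planner-side rule: use the filter only if the join is selective enough. */
static bool
sketch_should_use_bloom(double est_join_rows, double outer_rows, double ratio_guc)
{
    if (outer_rows <= 0.0)
        return false;
    return (est_join_rows / outer_rows) < ratio_guc;
}

/* splitmix64 mixing step, used here only as an illustrative hash. */
static uint64_t
sketch_mix(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Allocate a filter of at most max_bytes (the memory-size cap). */
static SketchBloom *
sketch_bloom_create(size_t max_bytes)
{
    SketchBloom *bf;

    if (max_bytes == 0)
        return NULL;
    bf = malloc(sizeof(*bf));
    if (bf == NULL)
        return NULL;
    bf->bits = calloc(max_bytes, 1);
    if (bf->bits == NULL)
    {
        free(bf);
        return NULL;
    }
    bf->nbits = (uint64_t) max_bytes * 8;
    return bf;
}

static void
sketch_bloom_free(SketchBloom *bf)
{
    if (bf != NULL)
    {
        free(bf->bits);
        free(bf);
    }
}

/* Executor-side build step: insert one inner-table join key. */
static void
sketch_bloom_add(SketchBloom *bf, uint64_t key)
{
    uint64_t h1 = sketch_mix(key);
    uint64_t h2 = sketch_mix(h1);
    uint64_t p1 = h1 % bf->nbits;
    uint64_t p2 = h2 % bf->nbits;

    bf->bits[p1 / 8] |= (uint8_t) (1u << (p1 % 8));
    bf->bits[p2 / 8] |= (uint8_t) (1u << (p2 % 8));
}

/* False means "definitely absent"; true means "possibly present". */
static bool
sketch_bloom_maybe_contains(const SketchBloom *bf, uint64_t key)
{
    uint64_t h1 = sketch_mix(key);
    uint64_t h2 = sketch_mix(h1);
    uint64_t p1 = h1 % bf->nbits;
    uint64_t p2 = h2 % bf->nbits;

    return ((bf->bits[p1 / 8] >> (p1 % 8)) & 1u) &&
           ((bf->bits[p2 / 8] >> (p2 % 8)) & 1u);
}
```

A Bloom filter never reports an inserted key as absent, so dropping outer tuples for which the lookup returns false cannot change the join result; false positives only cost some extra probing of the hash table.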
[GitHub] incubator-hawq pull request #1352: HAWQ-1604. Add A New GUC hawq_hashjoin_bl...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1352 ---
[GitHub] incubator-hawq pull request #1352: HAWQ-1604. Add A New GUC hawq_hashjoin_bl...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1352 HAWQ-1604. Add A New GUC hawq_hashjoin_bloomfilter HAWQ-1604. Add a new GUC hawq_hashjoin_bloomfilter to indicate whether to use a Bloom filter for hash join. Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join table; this legacy code has been verified not to improve hash join performance. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1604 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1352.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1352 commit 2b05192b8112f6164496c4f2a9f8cedc5aa77be4 Author: Wen Lin <wlin@...> Date: 2018-04-08T08:49:36Z HAWQ-1604. Add a new GUC hawq_hashjoin_bloomfilter to indicate whether to use a Bloom filter for hash join. Remove gp_hashjoin_bloomfilter and the Bloom filter in the hash join table; this legacy code has been verified not to improve hash join performance. ---
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178483664 --- Diff: contrib/vexecutor/vcheck.h --- @@ -27,6 +27,8 @@ typedef struct vFuncMap Oid ntype; vheader* (* vtbuild)(int n); void (* vtfree)(vheader **vh); + Datum (* gettypeptr)(vheader *vh,int n); --- End diff -- Please fix indent here. ---
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178483630 --- Diff: contrib/vexecutor/parquet_reader.c --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +#include "parquet_reader.h" + +#include "executor/executor.h" +#include "tuplebatch.h" +#include "vcheck.h" + +extern bool getNextRowGroup(ParquetScanDesc scan); +static int +ParquetRowGroupReader_ScanNextTupleBatch( + TupleDesc tupDesc, + ParquetRowGroupReader *rowGroupReader, + int *hawqAttrToParquetColNum, + bool*projs, + TupleTableSlot *slot); + +static void +parquet_vgetnext(ParquetScanDesc scan, ScanDirection direction, TupleTableSlot *slot); + +TupleTableSlot * +ParquetVScanNext(ScanState *scanState) +{ + Assert(IsA(scanState, TableScanState) || IsA(scanState, DynamicTableScanState)); + ParquetScanState *node = (ParquetScanState *)scanState; + Assert(node->opaque != NULL && node->opaque->scandesc != NULL); + + parquet_vgetnext(node->opaque->scandesc, node->ss.ps.state->es_direction, node->ss.ss_ScanTupleSlot); + return node->ss.ss_ScanTupleSlot; +} + +static void +parquet_vgetnext(ParquetScanDesc scan, ScanDirection direction, TupleTableSlot *slot) +{ + + //AOTupleId aoTupleId; + Assert(ScanDirectionIsForward(direction)); + + for(;;) + { + if(scan->bufferDone) + { + /* +* Get the next row group. We call this function until we +* successfully get a block to process, or finished reading +* all the data (all 'segment' files) for this relation. +*/ + while(!getNextRowGroup(scan)) + { + /* have we read all this relation's data. done! */ + if(scan->pqs_done_all_splits) + { + ExecClearTuple(slot); + return /*NULL*/; + } + } + scan->bufferDone = false; + } + + int row_num = ParquetRowGroupReader_ScanNextTupleBatch( + scan->pqs_tupDesc, + &scan->rowGroupReader, + scan->hawqAttrToParquetColChunks, + scan->proj, + slot); + if(row_num > 0) + return; + + /* no more items in the row group, get new buffer */ + scan->bufferDone = true; + } +} + +/* + * Get next tuple batch from current row group into slot. + * + * Return false if current row group has no tuple left, true otherwise. 
+ */ +static int +ParquetRowGroupReader_ScanNextTupleBatch( + TupleDesc tupDesc, + ParquetRowGroupReader *rowGroupReader, + int *hawqAttrToParquetColNum, + bool*projs, + TupleTableSlot *slot) +{ + Assert(slot); + + if (rowGroupReader->rowRead >= rowGroupReader->rowCount) + { + ParquetRowGroupReader_FinishedScanRowGroup(rowGroupReader); + return false; + } + + /* +* g
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178484042 --- Diff: contrib/vexecutor/parquet_reader.c --- @@ -0,0 +1,194 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +#include "parquet_reader.h" + +#include "executor/executor.h" +#include "tuplebatch.h" +#include "vcheck.h" + +extern bool getNextRowGroup(ParquetScanDesc scan); +static int +ParquetRowGroupReader_ScanNextTupleBatch( + TupleDesc tupDesc, + ParquetRowGroupReader *rowGroupReader, + int *hawqAttrToParquetColNum, + bool*projs, + TupleTableSlot *slot); + +static void +parquet_vgetnext(ParquetScanDesc scan, ScanDirection direction, TupleTableSlot *slot); + +TupleTableSlot * +ParquetVScanNext(ScanState *scanState) +{ + Assert(IsA(scanState, TableScanState) || IsA(scanState, DynamicTableScanState)); + ParquetScanState *node = (ParquetScanState *)scanState; + Assert(node->opaque != NULL && node->opaque->scandesc != NULL); + + parquet_vgetnext(node->opaque->scandesc, node->ss.ps.state->es_direction, node->ss.ss_ScanTupleSlot); + return node->ss.ss_ScanTupleSlot; +} + +static void +parquet_vgetnext(ParquetScanDesc scan, ScanDirection direction, TupleTableSlot *slot) +{ + + //AOTupleId aoTupleId; + Assert(ScanDirectionIsForward(direction)); + + for(;;) + { + if(scan->bufferDone) + { + /* +* Get the next row group. We call this function until we +* successfully get a block to process, or finished reading +* all the data (all 'segment' files) for this relation. +*/ + while(!getNextRowGroup(scan)) + { + /* have we read all this relation's data. done! */ + if(scan->pqs_done_all_splits) + { + ExecClearTuple(slot); + return /*NULL*/; + } + } + scan->bufferDone = false; + } + + int row_num = ParquetRowGroupReader_ScanNextTupleBatch( + scan->pqs_tupDesc, + &scan->rowGroupReader, + scan->hawqAttrToParquetColChunks, + scan->proj, + slot); + if(row_num > 0) + return; + + /* no more items in the row group, get new buffer */ + scan->bufferDone = true; + } +} + +/* + * Get next tuple batch from current row group into slot. + * + * Return false if current row group has no tuple left, true otherwise. 
--- End diff -- According to the comment, this function returns true or false, but in the end it returns a number of rows. If the function returns a row count, it should not return false when it finishes scanning the row group; it should return 0 instead, since there are no rows. If the function returns a bool, it should not return a number. ---
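The convention suggested in this review comment (return a row count, with 0 meaning the row group is exhausted) might look like the sketch below. It is not the HAWQ function: `SketchRowGroup` and `sketch_scan_next_batch` are hypothetical names, and the reader is reduced to two counters for illustration.

```c
/* Hypothetical row-group cursor; the real reader carries much more state. */
typedef struct SketchRowGroup
{
    int row_count; /* total rows in the current row group */
    int rows_read; /* rows already handed out */
} SketchRowGroup;

/*
 * Get the next batch from the current row group.
 *
 * Returns the number of rows placed in the batch; 0 means the row
 * group has no rows left. The return type and the comment agree.
 */
static int
sketch_scan_next_batch(SketchRowGroup *rg, int batch_size)
{
    int remaining;
    int n;

    if (batch_size <= 0 || rg->rows_read >= rg->row_count)
        return 0;

    remaining = rg->row_count - rg->rows_read;
    n = remaining < batch_size ? remaining : batch_size;
    rg->rows_read += n;
    return n;
}
```

A caller can then keep the existing `if (row_num > 0) return;` check unchanged, and the "no rows" case reads as a count rather than as a boolean.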
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178483689 --- Diff: contrib/vexecutor/vcheck.h --- @@ -37,6 +39,7 @@ typedef struct VectorizedState { bool vectorized; PlanState *parent; +bool* proj; --- End diff -- Please fix indent here. ---
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178483545 --- Diff: contrib/vexecutor/ao_reader.c --- @@ -0,0 +1,78 @@ +#include "ao_reader.h" +#include "tuplebatch.h" +#include "utils/datum.h" + + +void +BeginVScanAppendOnlyRelation(ScanState *scanState) +{ +BeginScanAppendOnlyRelation(scanState); +VectorizedState* vs = (VectorizedState*)scanState->ps.vectorized; +TupleBatch tb = scanState->ss_ScanTupleSlot->PRIVATE_tb; +vs->proj = palloc0(sizeof(bool) * tb->ncols); +GetNeededColumnsForScan((Node* )scanState->ps.plan->targetlist,vs->proj,tb->ncols); +GetNeededColumnsForScan((Node* )scanState->ps.plan->qual,vs->proj,tb->ncols); + +} + +void +EndVScanAppendOnlyRelation(ScanState *scanState) +{ +VectorizedState* vs = (VectorizedState*)scanState->ps.vectorized; +pfree(vs->proj); +EndScanAppendOnlyRelation(scanState); +} + +TupleTableSlot * +AppendOnlyVScanNext(ScanState *scanState) +{ +TupleTableSlot *slot = scanState->ss_ScanTupleSlot; +TupleBatch tb = (TupleBatch)slot->PRIVATE_tb; +TupleDesc td = scanState->ss_ScanTupleSlot->tts_tupleDescriptor; +VectorizedState* vs = scanState->ps.vectorized; +int row = 0; + +for(;row < tb->batchsize;row ++) +{ +AppendOnlyScanNext(scanState); + +slot = scanState->ss_ScanTupleSlot; +if(TupIsNull(slot)) +break; + +for(int i = 0;i < tb->ncols ; i ++) +{ + + if(vs->proj[i]) +{ --- End diff -- redundant space here ---
[GitHub] incubator-hawq pull request #1350: HAWQ-1600. Parquet table data vectorized ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1350#discussion_r178483577 --- Diff: contrib/vexecutor/ao_reader.c --- @@ -0,0 +1,78 @@ +#include "ao_reader.h" +#include "tuplebatch.h" +#include "utils/datum.h" + + +void +BeginVScanAppendOnlyRelation(ScanState *scanState) +{ +BeginScanAppendOnlyRelation(scanState); +VectorizedState* vs = (VectorizedState*)scanState->ps.vectorized; +TupleBatch tb = scanState->ss_ScanTupleSlot->PRIVATE_tb; +vs->proj = palloc0(sizeof(bool) * tb->ncols); +GetNeededColumnsForScan((Node* )scanState->ps.plan->targetlist,vs->proj,tb->ncols); +GetNeededColumnsForScan((Node* )scanState->ps.plan->qual,vs->proj,tb->ncols); + +} + +void +EndVScanAppendOnlyRelation(ScanState *scanState) +{ +VectorizedState* vs = (VectorizedState*)scanState->ps.vectorized; +pfree(vs->proj); +EndScanAppendOnlyRelation(scanState); +} + +TupleTableSlot * +AppendOnlyVScanNext(ScanState *scanState) +{ +TupleTableSlot *slot = scanState->ss_ScanTupleSlot; +TupleBatch tb = (TupleBatch)slot->PRIVATE_tb; +TupleDesc td = scanState->ss_ScanTupleSlot->tts_tupleDescriptor; +VectorizedState* vs = scanState->ps.vectorized; +int row = 0; + +for(;row < tb->batchsize;row ++) +{ +AppendOnlyScanNext(scanState); + +slot = scanState->ss_ScanTupleSlot; +if(TupIsNull(slot)) +break; + +for(int i = 0;i < tb->ncols ; i ++) +{ + + if(vs->proj[i]) +{ +Oid hawqTypeID = slot->tts_tupleDescriptor->attrs[i]->atttypid; +Oid hawqVTypeID = GetVtype(hawqTypeID); +if(!tb->datagroup[i]) +tbCreateColumn(tb,i,hawqVTypeID); + +Datum *ptr = GetVFunc(hawqVTypeID)->gettypeptr(tb->datagroup[i],row); +*ptr = slot_getattr(slot,i + 1, &(tb->datagroup[i]->isnull[row])); + +/* if attribute is a reference, deep copy the data out to prevent ao table buffer free before vectorized scan batch done */ +if(!slot->tts_mt_bind->tupdesc->attrs[i]->attbyval) +*ptr = 
datumCopy(*ptr,slot->tts_mt_bind->tupdesc->attrs[i]->attbyval,slot->tts_mt_bind->tupdesc->attrs[i]->attlen); +} +} + +AppendOnlyScanDesc scanDesc = ((AppendOnlyScanState*)scanState)->aos_ScanDesc; +VarBlockHeader *header = scanDesc->executorReadBlock.varBlockReader.header; + +//if(row + 1 == VarBlockGet_itemCount(header)) --- End diff -- Please remove this commented-out code. ---
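The comment in the diff above explains why by-reference attributes are deep copied: the scan buffer may be reused or freed before the batch is consumed. The following is a minimal sketch of that idea in plain C, not the HAWQ implementation. The `Value` struct and the `value_deep_copy` and `value_free` helpers are hypothetical stand-ins for `Datum` and `datumCopy`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a column value: either an inline integer
 * (by value) or a pointer into a buffer owned by someone else. */
typedef struct {
    bool by_value;
    long inline_value;   /* valid when by_value is true */
    const char *ref;     /* valid when by_value is false */
    size_t ref_len;      /* number of bytes behind ref */
} Value;

/* Copy src into dst. By-reference payloads are duplicated so that dst
 * stays valid after the source buffer is reused or freed.
 * Returns false if allocation fails. */
bool value_deep_copy(Value *dst, const Value *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
    if (!src->by_value) {
        /* Allocate at least one byte so an empty payload still gets
         * its own, freeable allocation. */
        char *copy = malloc(src->ref_len > 0 ? src->ref_len : 1);
        if (copy == NULL)
            return false;
        if (src->ref_len > 0)
            memcpy(copy, src->ref, src->ref_len);
        dst->ref = copy;
    }
    return true;
}

/* Release memory owned by a value produced by value_deep_copy. */
void value_free(Value *v)
{
    assert(v != NULL);
    if (!v->by_value) {
        free((void *)v->ref);
        v->ref = NULL;
    }
}
```

By-value entries are copied with a plain struct assignment, so only by-reference entries pay for an allocation, which mirrors the `attbyval` check in the diff.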
[GitHub] incubator-hawq issue #1346: HAWQ-1594. Memory leak in standby master (gpsync...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1346 +1 ---
[GitHub] incubator-hawq issue #1331: HAWQ-1557. Concurrent drop should not report err...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1331 LGTM. ---
[GitHub] incubator-hawq pull request #1307: HAWQ-1544. prompt file count doesn't matc...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1307#discussion_r150475805 --- Diff: src/backend/cdb/cdbcat.c --- @@ -296,7 +296,7 @@ GpPolicyStore(Oid tbloid, const GpPolicy *policy) /* * Sets the policy of a table into the gp_distribution_policy table * from a GpPolicy structure. - * + * @param update_bucketnum, whether update the bucketnum field --- End diff -- This comment line is redundant and should be removed. ---
[GitHub] incubator-hawq issue #1308: HAWQ-1530. Illegally killing a JDBC select query...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1308 +1 ---
[GitHub] incubator-hawq issue #1290: HAWQ-1529. Fix segment resource manager hang whe...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1290 LGTM, +1 ---
[GitHub] incubator-hawq issue #1285: HAWQ-1520. Create filespace should also skip hdf...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1285 +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-hawq pull request #1282: HAWQ-1520. gpcheckhdfs should skip hdfs t...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1282#discussion_r136239626 --- Diff: src/bin/gpcheckhdfs/gpcheckhdfs.c --- @@ -271,6 +273,21 @@ int testHdfsConnect(hdfsFS * fsptr, const char * host, int iPort, return 0; } +/* + * check path is a trash directory + * path, e.g: /hawq_default/.Trash + */ +static int is_trash_directory(const char *path) { --- End diff -- I suggest naming this function "isTrashDirectory" instead, since the other function names follow this style: testFooBar, testXxxYyy, etc. ---
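The diff only shows the signature and the doc comment of the trash check, so the body below is an assumption: a minimal C sketch that treats a path as a trash directory when its last component is ".Trash", using the camelCase name suggested in the review. The real gpcheckhdfs logic may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when the last component of path is ".Trash",
 * e.g. "/hawq_default/.Trash" or "/hawq_default/.Trash/".
 * This matching rule is an assumption for illustration. */
bool isTrashDirectory(const char *path)
{
    static const char trash[] = ".Trash";
    const size_t trash_len = sizeof(trash) - 1;
    size_t len;

    assert(path != NULL);
    len = strlen(path);

    /* Ignore trailing slashes. */
    while (len > 0 && path[len - 1] == '/')
        len--;

    if (len < trash_len)
        return false;
    if (memcmp(path + len - trash_len, trash, trash_len) != 0)
        return false;

    /* The match must start a path component, so "my.Trash" is rejected. */
    return len == trash_len || path[len - trash_len - 1] == '/';
}
```

Checking the component boundary avoids false positives for names that merely end in ".Trash".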
[GitHub] incubator-hawq issue #1243: HAWQ-1458. Fix share input scan bug for writer p...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1243 +1 ---
[GitHub] incubator-hawq issue #1278: HAWQ-1498. Segments keep open file descriptors f...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1278 Please see the comments. The change LGTM. Thanks! ---
[GitHub] incubator-hawq pull request #1278: HAWQ-1498. Segments keep open file descri...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1278#discussion_r132620396 --- Diff: src/backend/storage/file/fd.c --- @@ -2403,8 +2479,22 @@ HdfsGetConnection(const char * path) } } - entry = (struct FsEntry *) hash_search(HdfsFsTable, location, - HASH_ENTER, ); + /* If this is for normal connection, check from normal table, otherwise, +* check the table for dropping. */ + if (!isForDrop) { + entry = (struct FsEntry *) hash_search(HdfsFsTable, + location, + HASH_ENTER, + ); + } + else + { + elog(LOG, "search 4 drop 1"); --- End diff -- Is this log message necessary? If it is, please make it more meaningful. ---
[GitHub] incubator-hawq pull request #1279: HAWQ-1310. Reformat resource_negotiator()...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1279#discussion_r132612133 --- Diff: src/backend/optimizer/plan/planner.c --- @@ -438,256 +438,224 @@ PlannedStmt *refineCachedPlan(PlannedStmt * plannedstmt, * */ -PlannedStmt * -planner(Query *parse, int cursorOptions, - ParamListInfo boundParams, QueryResourceLife resourceLife) -{ - PlannedStmt *result = NULL; - instr_time starttime, endtime; - ResourceNegotiatorResult *ppResult = (ResourceNegotiatorResult *) palloc(sizeof(ResourceNegotiatorResult)); - SplitAllocResult initResult = {NULL, NIL, 0, NIL, NULL}; - ppResult->saResult = initResult; - ppResult->stmt = NULL; - static int plannerLevel = 0; - bool resourceNegotiateDone = false; - QueryResource *savedQueryResource = GetActiveQueryResource(); - SetActiveRelType(NIL); - - bool isDispatchParallel = false; - /* -* Before doing the true query optimization, we first run a resource_negotiator to give -* us some sense of the complexity of the query, and allocate the appropriate -* resource to run this query. After gaining the resource, we can perform the -* actual optimization. 
-*/ - increase_planning_depth(); - - plannerLevel++; - if (!resourceNegotiateDone) - { - PG_TRY(); - { - START_MEMORY_ACCOUNT(MemoryAccounting_CreateAccount(0, MEMORY_OWNER_TYPE_Resource_Negotiator)); - { -resource_negotiator(parse, cursorOptions, boundParams, resourceLife, ); - - decrease_planning_depth(); - - if(ppResult->stmt && ppResult->stmt->planTree) - { - isDispatchParallel = ppResult->stmt->planTree->dispatch == DISPATCH_PARALLEL; - } - } - END_MEMORY_ACCOUNT(); - } - PG_CATCH(); - { - decrease_planning_depth(); - - if ((ppResult != NULL)) - { - pfree(ppResult); - ppResult = NULL; - } - plannerLevel = 0; - PG_RE_THROW(); - } - PG_END_TRY(); - } - SetActiveRelType(NIL); - if (plannerLevel >= 1) - { - resourceNegotiateDone = true; - gp_segments_for_planner = ppResult->saResult.planner_segments; - if (ppResult->saResult.resource) - { - SetActiveQueryResource(ppResult->saResult.resource); - SetActiveRelType(ppResult->saResult.relsType); - } - } +PlannedStmt * +planner(Query *parse, int cursorOptions, ParamListInfo boundParams, QueryResourceLife resourceLife) { +PlannedStmt *result = NULL; +instr_time starttime, endtime; +ResourceNegotiatorResult *ppResult = (ResourceNegotiatorResult *) palloc(sizeof(ResourceNegotiatorResult)); +SplitAllocResult initResult = { NULL, NIL, 0, NIL, NULL }; +ppResult->saResult = initResult; +ppResult->stmt = NULL; +static int plannerLevel = 0; +bool resourceNegotiateDone = false; +QueryResource *savedQueryResource = GetActiveQueryResource(); +SetActiveRelType(NIL); + +bool isDispatchParallel = false; +/* + * Before doing the true query optimization, we first run a resource_negotiator to give + * us some sense of the complexity of the query, and allocate the appropriate + * resource to run this query. After gaining the resource, we can perform the + * actual optimization. 
+ */ +increase_planning_depth(); + +plannerLevel++; +if (!resourceNegotiateDone) { +PG_TRY(); +{ +START_MEMORY_ACCOUNT(MemoryAccounting_CreateAccount(0, MEMORY_OWNER_TYPE_Resource_Negotiator)); +{ +resource_negotiator(parse, cursorOptions, boundParams, resourceLife, ); + +decrease_planning_depth(); + +if (ppResult->stmt && ppResult->stmt->planTree) { +isDispatchParallel = ppResult->stmt->planTree->dispatch == DISPATCH_PARALLEL; +} +} +END_MEMORY_ACCOUNT(); +}PG_CATCH(); +{ +decrease_planning_depth(); + +if ((ppResult != NULL)) { +pfree(ppResult); +ppResult = NULL; +} +plannerLevel = 0; +PG_RE_TH
[GitHub] incubator-hawq pull request #1279: HAWQ-1310. Reformat resource_negotiator()...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1279#discussion_r132612181 --- Diff: src/backend/optimizer/plan/planner.c --- @@ -438,256 +438,224 @@ PlannedStmt *refineCachedPlan(PlannedStmt * plannedstmt, * */ -PlannedStmt * -planner(Query *parse, int cursorOptions, - ParamListInfo boundParams, QueryResourceLife resourceLife) -{ - PlannedStmt *result = NULL; - instr_time starttime, endtime; - ResourceNegotiatorResult *ppResult = (ResourceNegotiatorResult *) palloc(sizeof(ResourceNegotiatorResult)); - SplitAllocResult initResult = {NULL, NIL, 0, NIL, NULL}; - ppResult->saResult = initResult; - ppResult->stmt = NULL; - static int plannerLevel = 0; - bool resourceNegotiateDone = false; - QueryResource *savedQueryResource = GetActiveQueryResource(); - SetActiveRelType(NIL); - - bool isDispatchParallel = false; - /* -* Before doing the true query optimization, we first run a resource_negotiator to give -* us some sense of the complexity of the query, and allocate the appropriate -* resource to run this query. After gaining the resource, we can perform the -* actual optimization. 
-*/ - increase_planning_depth(); - - plannerLevel++; - if (!resourceNegotiateDone) - { - PG_TRY(); - { - START_MEMORY_ACCOUNT(MemoryAccounting_CreateAccount(0, MEMORY_OWNER_TYPE_Resource_Negotiator)); - { -resource_negotiator(parse, cursorOptions, boundParams, resourceLife, ); - - decrease_planning_depth(); - - if(ppResult->stmt && ppResult->stmt->planTree) - { - isDispatchParallel = ppResult->stmt->planTree->dispatch == DISPATCH_PARALLEL; - } - } - END_MEMORY_ACCOUNT(); - } - PG_CATCH(); - { - decrease_planning_depth(); - - if ((ppResult != NULL)) - { - pfree(ppResult); - ppResult = NULL; - } - plannerLevel = 0; - PG_RE_THROW(); - } - PG_END_TRY(); - } - SetActiveRelType(NIL); - if (plannerLevel >= 1) - { - resourceNegotiateDone = true; - gp_segments_for_planner = ppResult->saResult.planner_segments; - if (ppResult->saResult.resource) - { - SetActiveQueryResource(ppResult->saResult.resource); - SetActiveRelType(ppResult->saResult.relsType); - } - } +PlannedStmt * +planner(Query *parse, int cursorOptions, ParamListInfo boundParams, QueryResourceLife resourceLife) { +PlannedStmt *result = NULL; +instr_time starttime, endtime; +ResourceNegotiatorResult *ppResult = (ResourceNegotiatorResult *) palloc(sizeof(ResourceNegotiatorResult)); +SplitAllocResult initResult = { NULL, NIL, 0, NIL, NULL }; +ppResult->saResult = initResult; +ppResult->stmt = NULL; +static int plannerLevel = 0; +bool resourceNegotiateDone = false; +QueryResource *savedQueryResource = GetActiveQueryResource(); +SetActiveRelType(NIL); + +bool isDispatchParallel = false; +/* + * Before doing the true query optimization, we first run a resource_negotiator to give + * us some sense of the complexity of the query, and allocate the appropriate + * resource to run this query. After gaining the resource, we can perform the + * actual optimization. + */ +increase_planning_depth(); + +plannerLevel++; +if (!resourceNegotiateDone) { +PG_TRY(); +{ --- End diff -- Indent here doesn't align. Please fix. 
[GitHub] incubator-hawq issue #1275: HAWQ-1333. Change access mode of source files fo...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1275 +1 ---
[GitHub] incubator-hawq issue #1274: HAWQ-1509. Support TDE read function.
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1274 +1 ---
[GitHub] incubator-hawq pull request #1273: HAWQ-1502. Add verification to support TD...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1273#discussion_r130778902 --- Diff: depends/libhdfs3/test/function/TestCInterface.cpp --- @@ -369,29 +448,154 @@ TEST(TestCInterfaceTDE, TestAppendWithTDELargeFiles_Success) { if (NULL == (out = hdfsOpenFile(fs, tdefile, O_WRONLY | O_APPEND, 0, 0, 1024))) { break; } -Hdfs::FillBuffer([0], buffer.size(), 1024); -buffer.push_back(0); +Hdfs::FillBuffer([0], 128 * 3, 1024); while (todo > 0) { -if (0 > (rc = hdfsWrite(fs, out, [offset], todo))) { +if (0 > (rc = hdfsWrite(fs, out, [offset], 128))) { break; } todo -= rc; offset += rc; } rc = hdfsCloseFile(fs, out); } while (0); + +//Read buffer from tdefile with hadoop API. +FILE *file = popen("hadoop fs -cat /TDEAppend3/testfile", "r"); +char bufGets[128]; +while (fgets(bufGets, sizeof(bufGets), file)) { +} +pclose(file); +//Check the buffer's md5 value is eaqual to the tdefile's md5 value. system("rm -rf ./testfile"); -system("hadoop fs -get /TDEAppend3/testfile ./"); +system("hadoop fs -get /TDEAppend3/testfile ./"); +char resultFile[33] = { 0 }; --- End diff -- Since 33 is the length of an MD5 hex string (32 characters) + 1, I suggest defining a constant or macro for it, so that we can avoid such magic numbers in this source file. ---
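One way to follow the suggestion above is to derive the buffer size from named constants instead of writing 33 directly. This is a minimal sketch in C. The macro names and the `md5_to_hex` helper are hypothetical and do not come from the libhdfs3 test file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names: an MD5 digest is 16 raw bytes, which print as
 * 32 hex characters, plus one byte for the terminating NUL. */
#define MD5_DIGEST_BYTES    16
#define MD5_HEX_LENGTH      (MD5_DIGEST_BYTES * 2)
#define MD5_HEX_BUFFER_SIZE (MD5_HEX_LENGTH + 1)

/* Format a raw 16-byte digest as a lowercase hex string in out. */
void md5_to_hex(const unsigned char digest[MD5_DIGEST_BYTES],
                char out[MD5_HEX_BUFFER_SIZE])
{
    assert(digest != NULL && out != NULL);
    for (int i = 0; i < MD5_DIGEST_BYTES; i++) {
        /* Each call writes two hex digits and a NUL; the NUL is
         * overwritten by the next pair, and the last one stays. */
        snprintf(out + 2 * i, 3, "%02x", digest[i]);
    }
}
```

With these macros, a declaration such as `char resultFile[MD5_HEX_BUFFER_SIZE] = { 0 };` documents where the size comes from.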
[GitHub] incubator-hawq pull request #1265: HAWQ-1500. HAWQ-1501. HAWQ-1502. Support ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1265#discussion_r127118606 --- Diff: depends/libhdfs3/src/client/CryptoCodec.cpp --- @@ -0,0 +1,163 @@ +/ + * 2014 - + * open source under Apache License Version 2.0 + / +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "CryptoCodec.h" +#include "Logger.h" + +using namespace Hdfs::Internal; + +namespace Hdfs { + +/** + * Construct a CryptoCodec instance. + * @param encryptionInfo the encryption info of file. + * @param kcp a KmsClientProvider instance to get key from kms server. + * @param bufSize crypto buffer size. + */ +CryptoCodec::CryptoCodec(FileEncryptionInfo *encryptionInfo, std::shared_ptr kcp, int32_t bufSize) : encryptionInfo(encryptionInfo), kcp(kcp), bufSize(bufSize) +{ + + /* Init global status. */ + ERR_load_crypto_strings(); + OpenSSL_add_all_algorithms(); + OPENSSL_config(NULL); + + /* Create cipher context. */ + encryptCtx = EVP_CIPHER_CTX_new(); + cipher = NULL; + +} + +/** + * Destroy a CryptoCodec instance. + */ +CryptoCodec::~CryptoCodec() +{ + if (encryptCtx) + EVP_CIPHER_CTX_free(encryptCtx); +} + +/** + * Get decrypted key from kms. 
+ */ +std::string CryptoCodec::getDecryptedKeyFromKms() +{ + ptree map = kcp->decryptEncryptedKey(*encryptionInfo); + std::string key = map.get("material"); + + int rem = key.length() % 4; + if (rem) { + rem = 4 - rem; + while (rem != 0) { + key = key + "="; + rem--; + } + } + + std::replace(key.begin(), key.end(), '-', '+'); + std::replace(key.begin(), key.end(), '_', '/'); + + LOG(INFO, "material is :%s", key.c_str()); --- End diff -- I suggest a clearer log message, and if this function is called frequently, use DEBUG3 instead of INFO. ---
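The padding and character replacement in the diff above look like a conversion from URL-safe Base64 to standard Base64. The following C sketch shows the same two steps under that assumption. The function name `base64url_to_base64` is hypothetical, and like the diff it pads to a multiple of four without validating the input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Convert URL-safe Base64 text to the standard alphabet and pad it
 * with '=' to a multiple of four characters, mirroring the steps in
 * the diff above. Returns a newly allocated string that the caller
 * must free, or NULL if allocation fails. */
char *base64url_to_base64(const char *in)
{
    size_t len, pad, i;
    char *out;

    assert(in != NULL);
    len = strlen(in);
    pad = (4 - len % 4) % 4;   /* 0 when len is already a multiple of 4 */

    out = malloc(len + pad + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (in[i] == '-')
            out[i] = '+';      /* URL-safe '-' maps to '+' */
        else if (in[i] == '_')
            out[i] = '/';      /* URL-safe '_' maps to '/' */
        else
            out[i] = in[i];
    }
    memset(out + len, '=', pad);
    out[len + pad] = '\0';
    return out;
}
```

Doing the conversion in a single pass into a fresh buffer avoids the repeated string concatenation used for padding in the original loop.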
[GitHub] incubator-hawq issue #1265: HAWQ-1500. HAWQ-1501. HAWQ-1502. Support TDE wri...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1265 Please unify the indentation. We should avoid mixing spaces and tabs for indentation in one source file. ---
[GitHub] incubator-hawq issue #1254: HAWQ-1373 - Added feature to reload GUC values u...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1254 +1 ---
[GitHub] incubator-hawq issue #1262: HAWQ-1493. Integrate Ranger lookup JAAS configur...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1262 This PR has been merged into master. Please close it. Thanks! ---
[GitHub] incubator-hawq issue #1262: HAWQ-1493. Integrate Ranger lookup JAAS configur...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1262 LGTM ---
[GitHub] incubator-hawq pull request #1262: HAWQ-1493. Integrate Ranger lookup JAAS c...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1262#discussion_r125064275 --- Diff: ranger-plugin/admin-plugin/src/main/java/org/apache/hawq/ranger/service/HawqClient.java --- @@ -74,7 +72,7 @@ private static final String DEFAULT_DATABASE = "postgres"; private static final String DEFAULT_DATABASE_TEMPLATE = "DBTOBEREPLACEDINJDBCURL"; private static final String JDBC_DRIVER_CLASS = "org.postgresql.Driver"; - +private static final String jaasApplicationName = "pgjdbc"; --- End diff -- I suggest using all capital letters for static final variables, following the naming convention above, like XXX_XXX_XXX. ---
[GitHub] incubator-hawq pull request #1262: HAWQ-1493. Integrate Ranger lookup JAAS c...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1262#discussion_r125064432 --- Diff: ranger-plugin/admin-plugin/src/main/java/org/apache/hawq/ranger/service/HawqClient.java --- @@ -90,6 +88,8 @@ public HawqClient(String serviceName, Map<String, String> connectionProperties) throws Exception { super(serviceName, connectionProperties); this.connectionProperties = connectionProperties; + --- End diff -- Two unnecessary empty lines. ---
[GitHub] incubator-hawq pull request #1258: HAWQ-1458. The maximum value of guc share...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1258 HAWQ-1458. The maximum value of guc share_input_scan_wait_lockfile_timeout should be greater than the default value. Fixes a bug that causes the HAWQ debug build to fail during initialization. You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq HAWQ-1458 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1258.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1258 commit 7df7f96f2c07b41a481d03833e0a1e6106c27e34 Author: Wen Lin <w...@pivotal.io> Date: 2017-06-22T07:35:08Z HAWQ-1458. The maximum value of guc share_input_scan_wait_lockfile_timeout should be greater than the default value. ---
[GitHub] incubator-hawq issue #1257: HAWQ-1487. Fix hang process due to deadlock when...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1257 +1 ---
[GitHub] incubator-hawq issue #1256: HAWQ-1485. Fix exception of decryptPassword twic...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1256 LGTM ---
[GitHub] incubator-hawq issue #1254: HAWQ-1373 - Added feature to reload GUC values u...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1254 Shubham, I think what you've done in this PR is to add a hawq command that reloads GUC configs without restarting the system. Currently, this is done with the command "hawq stop cluster --reload", which is a little bit ambiguous in my opinion. So if we all agree on using "hawq reload-config" instead, the old code related to the GUC reloading logic should be removed. @radarwave @vVineet ---
[GitHub] incubator-hawq issue #1251: HAWQ-1480 - Added feature for packing a core fil...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1251 Shubham, this PR has been merged into master. Would you please close it? Thanks! ---
[GitHub] incubator-hawq issue #1251: HAWQ-1480 - Added feature for packing a core fil...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1251 Merged; this PR can be closed now. Thanks! ---
[GitHub] incubator-hawq issue #1251: HAWQ-1480 - Added feature for packing a core fil...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1251 LGTM ---
[GitHub] incubator-hawq issue #1226: HAWQ-1447. Fix ranger build failure
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1226 I don't think explicitly copying jar files in the Makefile is a good approach; it should be done in the mvn build file. So Xiang, would you please check whether this failure still exists? ---
[GitHub] incubator-hawq issue #1251: HAWQ-1480 - Added feature for packing a core fil...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1251 Ed, I agree with you on adding documentation for this utility. I have some concerns about adding a test case: isn't it a little strange to keep a core dump file in the source repository (maybe two, for OSX and Linux)? Alternatively, we could write a test case that triggers a core dump and then runs this script. ---
[GitHub] incubator-hawq pull request #1251: HAWQ-1480 - Added feature for packing a c...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1251#discussion_r120526700 --- Diff: tools/sbin/packcore --- @@ -0,0 +1,262 @@ +#!/bin/env python +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# Copyright Pivotal 2014 --- End diff -- I think it is better to remove it, and also the "Copyright Pivotal" in hawqstandbywatch.py. What do you think, Roman? @rvs Thanks! ---
[GitHub] incubator-hawq pull request #1251: HAWQ-1480 - Added feature for packing a c...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1251#discussion_r120278044 --- Diff: tools/sbin/packcore --- @@ -0,0 +1,262 @@ +#!/bin/env python +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +# Copyright Pivotal 2014 --- End diff -- I don't suggest adding "Copyright Pivotal 2014" here, since hawq is under the Apache license. ---
[GitHub] incubator-hawq issue #1244: HAWQ-1443. Implement Ranger lookup for HAWQ with...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1244 +1 ---
[GitHub] incubator-hawq pull request #1242: HAWQ-1469. Don't expose warning messages ...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1242 ---
[GitHub] incubator-hawq pull request #1242: HAWQ-1469. Don't expose warning messages ...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1242#discussion_r116961874 --- Diff: src/backend/libpq/rangerrest.c --- @@ -453,23 +453,30 @@ static int call_ranger_rest(CURL_HANDLE curl_handle, const char* request) { if (retry > 1) { - elog(WARNING, "ranger plugin service from http://%s:%d/rps is unavailable : %s, try another http://%s:%d/rps\n", + /* Don't expose this warning message to client, just record in log. +* The value of whereToSendOutput is DestRemote, so set it to DestNone +* and set back after write a warning message in log file. +*/ + CommandDest commandDest = whereToSendOutput; + whereToSendOutput = DestNone; + elog(WARNING, "ranger plugin service from http://%s:%d/rps is unavailable : %s, " + "trying ranger plugin service at http://%s:%d/rps\n", --- End diff -- When the master RPS stops working for some reason and HAWQ begins to talk with the standby RPS, a warning message should be recorded in the log file so that administrators can fix the master RPS problem. elog(LOG, ...) won't be exposed to the console (by default, client_min_messages is NOTICE), but in this switch case a warning should be recorded. ---
[GitHub] incubator-hawq pull request #1241: HAWQ-1436. Print a message to command lin...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1241 ---
[GitHub] incubator-hawq pull request #1241: HAWQ-1436. Print a message to command lin...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1241#discussion_r116679611 --- Diff: src/backend/libpq/rangerrest.c --- @@ -464,6 +466,11 @@ static int call_ranger_rest(CURL_HANDLE curl_handle, const char* request) } else { + if (switchToMaster) + { + /* master's RPS has recovered, switch from standby's RPS to master's RPS */ + elog(NOTICE, "switch from standby's RPS to master's RPS"); --- End diff -- Thanks! You are right. If the switch to master fails again and the following request to the standby succeeds, this notice message is still printed; a condition should be added. ---
[GitHub] incubator-hawq pull request #1241: HAWQ-1436. Print a message to command lin...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1241 HAWQ-1436. Print a message to command line if hawq switches from standby RPS to master RPS This improvement applies to the following case: the HAWQ master is talking with the standby RPS; once the master's RPS has recovered, the master switches from the standby RPS back to the local RPS and a reminder message is printed. Please review. You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1436 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1241.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1241 commit 28e633dc99555a43b02f80c5e7c9ea40a090e561 Author: Wen Lin <w...@pivotal.io> Date: 2017-05-16T04:00:39Z HAWQ-1436. Print a message to command line if hawq switches from standby RPS to master RPS. ---
[GitHub] incubator-hawq issue #1235: HAWQ-1456. Copy RPS configuration files to stand...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1235 +1 ---
[GitHub] incubator-hawq issue #1238: HAWQ-1460. WAL Send Server process should exit i...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1238 +1 ---
[GitHub] incubator-hawq issue #1234: HAWQ-1436. Implement ranger pulgin service High ...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1234 Fixed and merged into master. ---
[GitHub] incubator-hawq pull request #1234: HAWQ-1436. Implement ranger pulgin servic...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1234 ---
[GitHub] incubator-hawq pull request #1235: HAWQ-1456. Copy RPS configuration files t...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1235#discussion_r115417102 --- Diff: ranger-plugin/scripts/enable-ranger-plugin.sh --- @@ -104,8 +157,20 @@ function get_hawq_password() { done } +function sync_rps_configuration() { --- End diff -- This should follow the 2-space indent. ---
[GitHub] incubator-hawq pull request #1235: HAWQ-1456. Copy RPS configuration files t...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1235#discussion_r115417172 --- Diff: ranger-plugin/scripts/enable-ranger-plugin.sh --- @@ -237,9 +307,14 @@ main() { if [[ $# -lt 1 ]]; then usage fi + if [[ -z "$GPHOME" ]]; then + GPHOME="/usr/local/hawq" + fi SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd -P)" parse_params "$@" validate_params + sync_rps_configuration + #exit 0 --- End diff -- Useless code. ---
[GitHub] incubator-hawq pull request #1235: HAWQ-1456. Copy RPS configuration files t...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1235#discussion_r115417038 --- Diff: ranger-plugin/scripts/enable-ranger-plugin.sh --- @@ -70,14 +70,47 @@ function get_ranger_password() { done } +# get tag value from hawq-site.xml +function get_value_bytag() { --- End diff -- I suggest using "property" instead of "tag". ---
[GitHub] incubator-hawq pull request #1234: HAWQ-1436. Implement ranger pulgin servic...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1234 HAWQ-1436. Implement ranger pulgin service High Availability. 1. The master will connect to the standby RPS for policy search if the RPS on the master has failed; 2. if the master has been talking to the standby RPS for a period (controlled by GUC hawq_rps_check_local_interval, 5 minutes by default), it will try to connect to the local RPS again. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq_1436 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1234.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1234 commit baf088d7187de4c757c7d44e7b174615c84303b1 Author: Wen Lin <w...@pivotal.io> Date: 2017-05-09T03:36:13Z HAWQ-1436. Implement ranger pulgin service High Availability. 1. master will the connect to standby RPS for policy search if RPS on master failed; 2. if master has been talking to standby RPS for a period(controlled by GUC hawq_rps_check_local_interval, 5 minutes by default), it will try to connect local RPS again. ---
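The failover rule described in pull request #1234 can be sketched roughly as follows. This is an illustrative Python model only, not the C code in rangerrest.c: the class name `RpsSelector`, its methods, and the injectable `clock` parameter are hypothetical, and the 300-second default merely mirrors the 5-minute default stated for hawq_rps_check_local_interval.

```python
import time


class RpsSelector:
    """Hypothetical model of choosing between the local and the standby RPS."""

    def __init__(self, check_local_interval=300.0, clock=time.monotonic):
        # Seconds to stay on the standby before retrying the local RPS
        # (assumed to correspond to hawq_rps_check_local_interval).
        self.check_local_interval = check_local_interval
        self.clock = clock
        self.using_standby = False
        self.switched_at = None

    def target(self):
        """Return 'local' or 'standby' for the next policy request."""
        if self.using_standby:
            elapsed = self.clock() - self.switched_at
            if elapsed >= self.check_local_interval:
                # The interval has passed: try the local RPS again.
                self.using_standby = False
                self.switched_at = None
        return "standby" if self.using_standby else "local"

    def report_failure(self, target):
        """Record a failed request; a local failure moves us to the standby."""
        if target == "local":
            self.using_standby = True
            self.switched_at = self.clock()


# Usage with a fake clock so the timing is deterministic.
now = [0.0]
selector = RpsSelector(check_local_interval=300.0, clock=lambda: now[0])
first = selector.target()          # local RPS is tried first
selector.report_failure(first)     # local RPS is down, switch to standby
during = selector.target()         # still on standby inside the interval
now[0] = 300.0
after = selector.target()          # interval elapsed, local RPS is retried
```

Passing the clock in as a parameter is only a convenience for testing the timing rule; the real implementation may track the interval differently.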
[GitHub] incubator-hawq issue #1233: HAWQ-1454. Exclude certain jars from Ranger Plug...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1233 +1 ---
[GitHub] incubator-hawq issue #1229: HAWQ-1451. HAWQ state should be able to report t...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1229 +1 ---
[GitHub] incubator-hawq issue #1231: HAWQ-1452. Remove hawq_rps_address_suffix and ha...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1231 +1 ---
[GitHub] incubator-hawq issue #1228: HAWQ-1449. HAWQ start/stop cluster should be abl...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1228 +1 ---
[GitHub] incubator-hawq issue #1220: HAWQ-1422. Resolve user groups using Hadoop conf...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1220 +1 LGTM ---
[GitHub] incubator-hawq issue #1219: HAWQ-1433. ALTER RESOURCE QUEUE DDL does not che...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1219 +1 ---
[GitHub] incubator-hawq issue #1219: HAWQ-1433. ALTER RESOURCE QUEUE DDL does not che...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1219 Does this fix mean that only the percentage format of MEMORY_CLUSTER_LIMIT and CORE_CLUSTER_LIMIT is supported in the ALTER RESOURCE QUEUE DDL? ---
[GitHub] incubator-hawq issue #1201: HAWQ-1418. Move print executing command after se...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1201 +1 ---
[GitHub] incubator-hawq issue #1194: Hawq 1396. Fix the bug when query hcatalog via P...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1194 +1 ---
[GitHub] incubator-hawq pull request #1194: Hawq 1396. Fix the bug when query hcatalo...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1194#discussion_r108129930 --- Diff: src/test/feature/Ranger/test_ranger.cpp --- @@ -314,6 +314,35 @@ TEST_F(TestHawqRanger, ResourceIncludeATest) { } } +TEST_F(TestHawqRanger, HcatalogTest) { + SQLUtility util; + if (util.getGUCValue("hawq_acl_type") == "ranger") + { + /* +* create a table in hive and populate some rows +*/ + clearEnv(, "pxf", 1); + string rootPath(util.getTestRootPath()); + string sqlPath = rootPath + "/Ranger/data/testhive.sql"; + auto cmd = hawq::test::stringFormat("hive -f %s", sqlPath.c_str()); + Command::getCommandStatus(cmd); + + /* +* create a user and query this table, fail. +*/ + addUser(, "pxf", 1, false); + runSQLFile(, "pxf", "fail", 1); + + /* +* add an allow policy for this user and query again, succeed. --- End diff -- This comment is not accurate; it should say "allow policies". ---
[GitHub] incubator-hawq pull request #1194: Hawq 1396. Fix the bug when query hcatalo...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1194#discussion_r108130047 --- Diff: src/test/feature/Ranger/test_ranger.cpp --- @@ -314,6 +314,35 @@ TEST_F(TestHawqRanger, ResourceIncludeATest) { } } +TEST_F(TestHawqRanger, HcatalogTest) { + SQLUtility util; + if (util.getGUCValue("hawq_acl_type") == "ranger") + { + /* +* create a table in hive and populate some rows +*/ + clearEnv(, "pxf", 1); --- End diff -- The env should also be cleaned for 2 and 3, which are added from line 340. ---
[GitHub] incubator-hawq pull request #1180: HAWQ-1396. Fix the bug when query hcatalo...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1180 ---
[GitHub] incubator-hawq pull request #1174: HAWQ-1359. Remove getRangerHost() functio...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1174 ---
[GitHub] incubator-hawq pull request #1174: HAWQ-1359. Remove getRangerHost() functio...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1174 HAWQ-1359. Remove getRangerHost() function from ranger test, still use environment variable to specify the Ranger Admin. Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1359 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1174.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1174 commit 59070b2c18d4edf7e0e5cd112647e384e76d93b6 Author: Wen Lin <w...@pivotal.io> Date: 2017-03-15T06:30:27Z HAWQ-1359. Remove getRangerHost() function from ranger test, still use environment variable to specify the Ranger Admin. ---
[GitHub] incubator-hawq pull request #1171: HAWQ-1359. Add test cases for Ranger supp...
Github user linwen closed the pull request at: https://github.com/apache/incubator-hawq/pull/1171 ---
[GitHub] incubator-hawq pull request #1171: HAWQ-1359. Add test cases for Ranger supp...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1171#discussion_r106084037 --- Diff: src/test/feature/Ranger/test_ranger.cpp --- @@ -24,19 +24,34 @@ #include "lib/command.h" #include "lib/gpfdist.h" -#include "lib/sql_util.h" #include "lib/string_util.h" using std::vector; using std::string; using hawq::test::SQLUtility; using hawq::test::Command; +TestHawqRanger::TestHawqRanger() +{ + initfile = hawq::test::stringFormat("Ranger/sql/init_file"); + rangerHost = getRangerHost(); +} + +std::string& TestHawqRanger::getRangerHost() --- End diff -- The RPS and the HAWQ master are required to be installed on the same node. This test case adds and removes policies through Ranger Admin, but HAWQ only knows about and interacts with the RPS, so it does not need to configure the Ranger Admin address/hostname. I am considering still using an environment variable to control it. ---
[GitHub] incubator-hawq pull request #1171: HAWQ-1359. Add test cases for Ranger supp...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1171#discussion_r106080149 --- Diff: src/test/feature/lib/sql_util.cpp --- @@ -224,7 +225,9 @@ const string SQLUtility::generateSQLFile(const string , bool usingDefaul EXPECT_TRUE(false) << "Error opening file " << newSqlFile; } out << "-- start_ignore" << std::endl; + printf("dd2d%s\n",schemaName.c_str()); if (!usingDefaultSchema) { + printf("ddd%s\n",schemaName.c_str()); --- End diff -- Thanks, I will check it. ---
[GitHub] incubator-hawq issue #1172: Hawq 1385. fixed hawq_ctl stop failed when maste...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1172 +1 ---
[GitHub] incubator-hawq issue #1171: HAWQ-1359. Add test cases for Ranger support, co...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1171 It has passed the Jenkins and Travis checks, so I think it can pass the license check. ---
[GitHub] incubator-hawq pull request #1172: Hawq 1385. fixed hawq_ctl stop failed whe...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1172#discussion_r105848316 --- Diff: tools/bin/hawq_ctl --- @@ -835,6 +835,8 @@ class HawqStop: acl_type = rows.next()[1] conn.close() except DatabaseError, ex: +# get hawq_acl_type from hawq_site.xml if connect db failed. +# avoid cannot stop segments where master is down. --- End diff -- This line could be removed. ---
[GitHub] incubator-hawq pull request #1172: Hawq 1385. fixed hawq_ctl stop failed whe...
Github user linwen commented on a diff in the pull request: https://github.com/apache/incubator-hawq/pull/1172#discussion_r105848224 --- Diff: tools/bin/hawq_ctl --- @@ -835,6 +835,8 @@ class HawqStop: acl_type = rows.next()[1] conn.close() except DatabaseError, ex: +# get hawq_acl_type from hawq_site.xml if connect db failed. --- End diff -- # get hawq_acl_type from local hawq_site.xml if connect master failed. ---
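The fallback discussed in pull request #1172 (read hawq_acl_type from the local site file when the master cannot be reached) can be illustrated with the sketch below. It is not the hawq_ctl code: the function names, the `query_master` callable, and the sample XML are hypothetical, the file layout is assumed to be the usual configuration/property/name/value structure, and it is written for Python 3 whereas the hawq_ctl excerpt uses Python 2 syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file content is an assumption.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hawq_acl_type</name>
    <value>ranger</value>
  </property>
</configuration>
"""


def get_property_from_site_xml(xml_text, name, default=None):
    """Return the value of the named property, or default if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        prop_name = (prop.findtext("name") or "").strip()
        if prop_name == name:
            return (prop.findtext("value") or "").strip()
    return default


def resolve_acl_type(query_master, xml_text, default=None):
    """Ask the master first; fall back to the local site file on failure."""
    try:
        return query_master()
    except Exception:
        # The master is unreachable, so use the locally configured value.
        return get_property_from_site_xml(xml_text, "hawq_acl_type", default)


def master_is_down():
    # Stand-in for a failed database connection.
    raise ConnectionError("cannot connect to master")


acl_type = resolve_acl_type(master_is_down, SAMPLE_SITE_XML)
```

Catching a broad `Exception` keeps the sketch short; a real tool would more likely catch only the database connection error, as the excerpt does with DatabaseError.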
[GitHub] incubator-hawq issue #1171: HAWQ-1359. Add test cases for Ranger support, co...
Github user linwen commented on the issue: https://github.com/apache/incubator-hawq/pull/1171 This file already has a license header. ---
[GitHub] incubator-hawq pull request #1171: HAWQ-1359. Add test cases for Ranger supp...
GitHub user linwen opened a pull request: https://github.com/apache/incubator-hawq/pull/1171 HAWQ-1359. Add test cases for Ranger support, combinations of differe… Please review, thanks! You can merge this pull request into a Git repository by running: $ git pull https://github.com/linwen/incubator-hawq hawq-1359 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-hawq/pull/1171.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1171 commit dc36e81b3a74d1c04c3b138f6cfc053a657b62d4 Author: Wen Lin <w...@pivotal.io> Date: 2017-03-14T07:17:32Z HAWQ-1359. Add test cases for Ranger support, combinations of different allow/exclude policies. ---