Re: Remain with HAWQ project or not?

2018-05-07 Thread Zhanwei Wang
Yes, I would like to remain a committer.


> On May 8, 2018, at 13:26, Hong  wrote:
> 
> Y
> 
> 2018-05-08 1:05 GMT-04:00 stanly sheng :
> 
>> Yes, I want to remain with HAWQ
>> 
>> 2018-05-08 12:16 GMT+08:00 Paul Guo :
>> 
>>> Yes. Thanks, Radar, for driving HAWQ graduation.
>>> 
>>> 2018-05-08 12:02 GMT+08:00 Lirong Jian :
>>> 
 Yes, I would like to remain a committer.
 
 Lirong
 
 Lirong Jian
 HashData Inc.
 
 2018-05-08 10:04 GMT+08:00 Hubert Zhang :
 
> Yes.
> 
> On Tue, May 8, 2018 at 9:30 AM, Lili Ma  wrote:
> 
>> Yes, of course I want to remain as PMC member!
>> 
>> Thanks Radar for the effort on HAWQ graduation:)
>> 
>> Best Regards,
>> Lili
>> 
>> 2018-05-07 20:07 GMT-04:00 Lisa Owen :
>> 
>>> yes, i would like to remain a committer.
>>> 
>>> 
>>> -lisa owen
>>> 
>>> On Mon, May 7, 2018 at 10:02 AM, Shubham Sharma <ssha...@pivotal.io>
>>> wrote:
>>> 
 Yes. I am looking forward to contributing to Hawq.
 
 On Mon, May 7, 2018 at 12:53 PM, Lav Jain 
 wrote:
 
> Yes. I am very excited about HAWQ.
> 
> Regards,
> 
> 
> *Lav Jain*
> *Pivotal Data*
> 
> lj...@pivotal.io
> 
> On Mon, May 7, 2018 at 6:51 AM, Alexander Denissov <adenis...@pivotal.io>
> wrote:
> 
>> Yes.
>> 
>>> On May 7, 2018, at 6:03 AM, Wen Lin 
>>> wrote:
>>> 
>>> Yes. I'd like to keep on contributing to HAWQ.
>>> 
 On Mon, May 7, 2018 at 5:21 PM, Ivan Weng <iw...@pivotal.io> wrote:
 
 Yes, I definitely would like to be with HAWQ.
 
 Regards,
 Ivan
 
> On Mon, May 7, 2018 at 5:12 PM, Hongxu Ma <inte...@outlook.com>
> wrote:
> 
> Yes, let's make HAWQ better.
> 
> Thanks.
> 
>> On 07/05/2018 16:11, Radar Lei wrote:
>> HAWQ committers,
>> 
>> Per the discussion in "Apache HAWQ graduation from incubator?" [1], we
>> want to set up the PMC as part of the HAWQ graduation resolution.
>> 
>> So we'd like to confirm whether you want to remain a committer/PMC
>> member of the Apache HAWQ project.
>> 
>> If you'd like to remain with the HAWQ project, you are welcome; please
>> *respond 'Yes'* in this thread, or *respond 'No'* if you are no longer
>> interested. Thanks.
>> 
>> This thread will be open for at least 72 hours; after that, we will
>> send individual confirmation emails.
>> 
>> [1]
>> https://lists.apache.org/thread.html/b4a0b5671ce377b3d51c9b7ab00496a1eebfcbf1696ce8b67e078c64@%3Cdev.hawq.apache.org%3E
>> 
>> Regards,
>> Radar
>> 
> 
> --
> Regards,
> Hongxu.
> 
> 
 
>> 
> 
 
 
 
 --
 Regards,
 Shubham Sharma
 Staff Customer Engineer
 Pivotal Global Support Services
 ssha...@pivotal.io
 Direct Tel: +1(510)-304-8201
 Office Hours: Mon-Fri 9:00 am to 5:00 pm PDT
 Out of Office Hours Contact +1 877-477-2269
 
>>> 
>> 
> 
> 
> 
> --
> Thanks
> 
> Hubert Zhang
> 
 
>>> 
>> 
>> 
>> 
>> --
>> Best Regards,
>> Xiang Sheng
>> 



Re: Questions about filesystem / filespace / tablespace

2017-03-15 Thread Zhanwei Wang
Hi Kyle,

Let me share some history of HAWQ. It goes back about six years…

When we started designing HAWQ, we first implemented a demo version. Of 
course it was not called HAWQ at that time; it was called GoH (Greenplum on 
HDFS). The first implementation was quite simple: we mounted HDFS on the local 
filesystem with FUSE and ran GPDB on it. We quickly found that the performance 
was unacceptable. 

We then decided to replace the storage layer of GPDB to make it work with 
HDFS. We implemented a “pluggable filesystem” layer and added the 
pg_filesystem object to GPDB. That was HAWQ in early 2012.  

At first we wanted to adopt the HDFS C API exactly, because it is almost the de 
facto standard, but we found that it could not meet our requirements. So we 
implemented a wrapper based on the HDFS C API as our API standard. Any dynamic 
library that implements this API can be loaded into HAWQ and registered in the 
pg_filesystem catalog, and is then used to access files on the target filesystem 
without modifying HAWQ code.
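
To make the registration idea above concrete, here is a minimal sketch in Python rather than HAWQ's C. It is only an illustration of the pattern (implementations registered under a protocol name and looked up at access time); the names `FileSystem`, `InMemoryFS`, `register_filesystem` and `resolve` are hypothetical and do not come from the HAWQ source, and the dictionary merely stands in for the pg_filesystem catalog.

```python
from typing import Dict, Protocol, Tuple


class FileSystem(Protocol):
    """Hypothetical minimal interface a pluggable filesystem would implement."""

    def read(self, path: str) -> bytes: ...

    def append(self, path: str, data: bytes) -> None: ...


class InMemoryFS:
    """Toy append-only filesystem, used only to exercise the registry."""

    def __init__(self) -> None:
        self._files: Dict[str, bytearray] = {}

    def read(self, path: str) -> bytes:
        return bytes(self._files[path])

    def append(self, path: str, data: bytes) -> None:
        self._files.setdefault(path, bytearray()).extend(data)


# Stand-in for the catalog: protocol name -> registered implementation.
_registry: Dict[str, FileSystem] = {}


def register_filesystem(protocol: str, fs: FileSystem) -> None:
    """Register an implementation under a protocol name, refusing duplicates."""
    if protocol in _registry:
        raise ValueError(f"filesystem {protocol!r} already registered")
    _registry[protocol] = fs


def resolve(uri: str) -> Tuple[FileSystem, str]:
    """Split 'proto://path' and return (implementation, path)."""
    protocol, sep, path = uri.partition("://")
    if not sep or protocol not in _registry:
        raise LookupError(f"no filesystem registered for {uri!r}")
    return _registry[protocol], path


register_filesystem("mem", InMemoryFS())
fs, path = resolve("mem://warehouse/t1/segfile.1")
fs.append(path, b"row1\n")
fs.append(path, b"row2\n")
print(fs.read(path))
```

Callers only ever go through `resolve`, so adding another backend means registering one more implementation and leaving the calling code alone, which is the property the wrapper API was meant to provide.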

But this “pluggable filesystem” was never officially marked as a feature of 
HAWQ. We never tested it with any filesystem other than HDFS. And as far as I 
know, some newer APIs were never added to the pg_filesystem catalog for 
historical reasons. So I do not think the “pluggable filesystem” can work now 
without changes and bug fixes.

A pluggable filesystem is appealing, but unfortunately it never got enough 
priority, and the previous design may no longer be suitable. I guess this is a 
good chance to rethink how we can achieve this goal and make it happen.




Zhanwei Wang

HashData
http://www.hashdata.cn



> On March 15, 2017, at 6:18 PM, Ming Li <m...@pivotal.io> wrote:
> 
> Hi Kyle,
> 
> 
>  If we keep all these filesystems similar to HDFS, supporting append
> only, then the changes needed should be much smaller. I think we can go ahead
> and implement a demo if we have the resources; we may encounter problems, but
> we can find solutions or workarounds for them.
> 
> 
>  For your question about the relationship between the 3 source code files,
> below is my understanding (because the code was not written by me, my
> opinion may not be completely correct).
> (1) bin/gpfilesystem/hdfs/gpfshdfs.c -- implements all the APIs used in the
> hdfs tuple in the pg_filesystem catalog; it directly calls the API in libhdfs3
> to access the HDFS file system. The reason for making it a wrapper is to define
> all these APIs as UDFs, so that we can easily support a similar filesystem by
> adding a similar tuple to pg_filesystem and adding code similar to this file,
> without changing any place that calls these APIs. Also, because they are UDFs,
> we can upgrade an old hawq binary to add a new file system.
> (2) backend/storage/file/filesystem.c -- because all the APIs in (1) are in the
> form of UDFs, we need a conversion if we want to call these APIs directly.
> This file is responsible for converting normal hdfs calls in the hawq kernel
> into UDF calls.
> (3) backend/storage/file/fd.c -- because the OS limits the number of open file
> descriptors, PostgreSQL/HAWQ uses an LRU buffer to cache all
> open file handles. All the hdfs APIs in this file manage file handles
> the same way as for native file systems. These functions call the APIs in (2)
> to interact with hdfs.
> 
> In a word, the call stack is:  (3) --> (2) --> (1) --> libhdfs3
> API.
> ---
> 
>The last question is about tablespaces. PostgreSQL introduced them so that
> users can map different tablespaces to different paths, and these paths can
> be mounted with different file systems on Linux. But all those filesystems'
> APIs are the same, and their functionality is the same (supporting UPDATE in
> place). So we cannot directly use tablespaces to handle this scenario. Also, I
> cannot estimate how much effort is needed because I did not participate in the
> hdfs file system support work in the original hawq release.
> 
> 
> That's my opinion; any corrections or suggestions are welcome! Hope it
> helps!  Thanks.
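
As an illustration of the handle caching described in (3) of Ming Li's message, here is a minimal LRU sketch in Python. It shows the general technique only and is not the code in fd.c; the class `HandleCache` and its `opener`/`closer` callbacks are hypothetical names introduced for this example.

```python
from collections import OrderedDict
from typing import Callable, List


class HandleCache:
    """Keep at most `capacity` handles open; close the least recently used."""

    def __init__(self, capacity: int,
                 opener: Callable[[str], object],
                 closer: Callable[[object], None]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._opener = opener
        self._closer = closer
        self._handles: "OrderedDict[str, object]" = OrderedDict()

    def get(self, path: str) -> object:
        """Return a handle for `path`, opening it and evicting if needed."""
        if path in self._handles:
            self._handles.move_to_end(path)  # mark as most recently used
            return self._handles[path]
        if len(self._handles) >= self._capacity:
            _, victim = self._handles.popitem(last=False)  # evict LRU entry
            self._closer(victim)
        handle = self._opener(path)
        self._handles[path] = handle
        return handle

    def open_paths(self) -> List[str]:
        """Paths currently held open, least recently used first."""
        return list(self._handles)


closed = []
cache = HandleCache(capacity=2,
                    opener=lambda p: f"handle:{p}",
                    closer=closed.append)
cache.get("a")
cache.get("b")
cache.get("a")  # "a" becomes the most recently used entry
cache.get("c")  # cache is full, so "b" is closed and evicted
print(cache.open_paths(), closed)
```

Because the opener and closer are injected, the same cache logic could sit in front of either local files or a remote filesystem client, which mirrors the point that hdfs handles are managed the same way as native ones.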
> 
> 
> On Wed, Mar 15, 2017 at 11:07 AM, Paul Guo <paul...@gmail.com> wrote:
> 
>> Hi Kyle,
>> 
>> I'm not sure whether I understand your point correctly, but with FUSE, which
>> allows userspace file system implementations on Linux, users use the
>> filesystem (e.g. S3 in your example) as block storage and access it via
>> standard sys calls like open, close, read, write, although some behaviours
>> or sys calls may not be supported. That means that for queries on a FUSE
>> fs, you are probably able to access the files using the interfaces in fd.c
>> directly (I'm not sure whether some hacking is needed), but for this kind of
>> distributed file system, library access is usually preferred over FUSE
>> access because of: 1) performance (You could search for the
Re: Turning off DoS protection in hawq

2016-12-10 Thread Zhanwei Wang
Hi Ruilong,


According to the previous stress test, under very high workload the OS will 
treat the connections from HAWQ to HDFS as a DoS attack and reject the 
connection requests, and then HAWQ queries will fail.

After disabling CO tables in HAWQ, we have significantly reduced the file write 
workload, so I think it is time to reconsider this OS setting.
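
When revisiting the setting, it can help to check what a host is actually configured with. The sketch below parses sysctl.conf-style text and reports the state of net.ipv4.tcp_syncookies, the key named in the original question. The helper names `parse_sysctl_conf` and `syncookies_state` are hypothetical, and the sample text is illustrative rather than copied from any HAWQ install guide.

```python
from typing import Dict


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and '#' or ';' comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def syncookies_state(settings: Dict[str, str]) -> str:
    """Classify the tcp_syncookies value: 0 is off, any other value is on."""
    value = settings.get("net.ipv4.tcp_syncookies")
    if value is None:
        return "unset"
    return "disabled" if value == "0" else "enabled"


# Illustrative content only; real files will contain other keys and values.
sample = """
# example settings
net.ipv4.tcp_syncookies = 0
; another comment style
vm.overcommit_memory = 2
"""

print(syncookies_state(parse_sysctl_conf(sample)))
```

On a Linux host the same parser can be fed the contents of /etc/sysctl.conf; the value currently in effect is exposed separately under /proc/sys/net/ipv4/tcp_syncookies.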



Best Regards

Zhanwei Wang
wan...@apache.org



> On December 11, 2016, at 8:38 AM, Ruilong Huo <r...@pivotal.io> wrote:
> 
> Hi hawq community,
> 
> Does anyone know why we turn off DoS protection in hawq by setting
> net.ipv4.tcp_syncookies to off in /etc/sysctl.conf? Is there any other reason
> from the hawq perspective besides the two below, which come more from the
> operating system perspective?
> 
> 1) increase the number of concurrent tcp clients
> 
> 2) reduce the cpu overhead of creating and processing the syncookies, though
> the overhead is tiny
> 
> BTW: HAWQ turns off DoS protection
> <http://hdb.docs.pivotal.io/210/hdb/install/install-cli.html> while
> Greenplum Database enables DoS protection
> <http://gpdb.docs.pivotal.io/4320/install_guide/prep_os_install_gpdb.html> by
> default.
> 
> Best regards,
> Ruilong Huo



Re: new committer: Paul Guo

2016-11-03 Thread Zhanwei Wang
Congratulations!



Best Regards

Zhanwei Wang
wan...@apache.org



> On November 4, 2016, at 1:25 PM, Ivan Weng <iw...@pivotal.io> wrote:
> 
> Congratulations!
> 
> 
> Regards,
> Ivan
> 
> On Fri, Nov 4, 2016 at 1:22 PM, Yi Jin <y...@pivotal.io> wrote:
> 
>> Congratulations! Paul! :)
>> 
>> On Fri, Nov 4, 2016 at 3:48 PM, Lili Ma <l...@pivotal.io> wrote:
>> 
>>> Congratulations, Paul!
>>> 
>>> On Fri, Nov 4, 2016 at 12:08 PM, Hong Wu <xunzhang...@gmail.com> wrote:
>>> 
>>>> Wow! Congrats Paul.
>>>> 
>>>> 
>>>>> On November 4, 2016, at 11:51 AM, Wen Lin <w...@pivotal.io> wrote:
>>>>> 
>>>>> Paul,
>>>>> Congratulations!
>>>>> 
>>>>>> On Fri, Nov 4, 2016 at 11:34 AM, Ruilong Huo <r...@pivotal.io> wrote:
>>>>>> 
>>>>>> The Project Management Committee (PMC) for Apache HAWQ (incubating) has
>>>>>> invited Paul Guo to become a committer and we are pleased to announce that
>>>>>> he has accepted.
>>>>>> 
>>>>>> Being a committer enables easier contribution to the project since there is
>>>>>> no need to go via the patch submission process. This should enable better
>>>>>> productivity.
>>>>>> 
>>>>>> Please join us in congratulating him and we are looking forward to
>>>>>> collaborating with him in the open source community.
>>>>>> 
>>>>>> His contributions include (but are not limited to):
>>>>>> 
>>>>>>  - *Direct contribution to code base*
>>>>>>    - *56 commits in total which span most of the key components of
>>>>>>      hawq. This demonstrates concrete knowledge and in-depth
>>>>>>      understanding of the product*
>>>>>>    - 56 closed PRs:
>>>>>>      https://github.com/apache/incubator-hawq/pulls?q=is%3Apr+is%3Aclosed+author%3Apaul-guo-
>>>>>>    - *9 features, enhancements and code refactors including storage and
>>>>>>      compression, command line tool and management utility, procedural
>>>>>>      language, configure and build system, test infrastructure and test,
>>>>>>      etc*
>>>>>>      - HAWQ-774 <https://issues.apache.org/jira/browse/HAWQ-774>
>>>>>>        Add snappy compression support to row oriented storage
>>>>>>      - HAWQ-984 <https://issues.apache.org/jira/browse/HAWQ-984>
>>>>>>        hawq config is too slow
>>>>>>      - HAWQ-775 <https://issues.apache.org/jira/browse/HAWQ-775>
>>>>>>        Provide a separate PLR package
>>>>>>      - HAWQ-751 <https://issues.apache.org/jira/browse/HAWQ-751>
>>>>>>        Add plr, pgcrypto, gporca into Apache HAWQ
>>>>>>      - HAWQ-744 <https://issues.apache.org/jira/browse/HAWQ-744>
>>>>>>        Add plperl code
>>>>>>      - HAWQ-1007 <https://issues.apache.org/jira/browse/HAWQ-1007>
>>>>>>        Add the pgcrypto code into hawq
>>>>>>      - HAWQ-394 <https://issues.apache.org/jira/browse/HAWQ-394>
>>>>>>        Remove pgcrypto from code base
>>>>>>      - HAWQ-914 <https://issues.apache.org/jira/browse/HAWQ-914>
>>>>>>        Improve user experience of HAWQ's build infrastructure
>>>>>>      - HAWQ-1081 <https://issues.apache.org/jira/browse/HAWQ-1081>
>>>>>>        Check missing perl modules (at least JSON) in configure
>>>>>>      - HAWQ-867 <https://issues.apache.org/jira/browse/HAWQ-867>
>>>>>>        Replace the git-submodule mechanism with git-clone
>>>>>>      - HAWQ-711 <https://issues.apache.org/jira/browse/HAWQ-711>
>>>>>>        Integrate libhdfs3 and libyarn makefile into hawq
&

Re: New Committer: Hong Wu

2016-11-03 Thread Zhanwei Wang
Congratulations!



Best Regards

Zhanwei Wang
wan...@apache.org



> On November 4, 2016, at 1:25 PM, Wen Lin <w...@pivotal.io> wrote:
> 
> Congratulations!
> 
> On Fri, Nov 4, 2016 at 1:21 PM, Yi Jin <y...@pivotal.io> wrote:
> 
>> Congratulations! Hong!
>> 
>> On Fri, Nov 4, 2016 at 3:47 PM, Lili Ma <lil...@apache.org> wrote:
>> 
>>> The Project Management Committee (PMC) for Apache HAWQ (incubating) has
>>> invited Hong Wu to become a committer and we are pleased to announce that
>>> he has accepted.
>>> 
>>> Being a committer enables easier contribution to the project since there is
>>> no need to go via the patch submission process. This should enable better
>>> productivity.
>>> 
>>> Please join us in congratulating him and we are looking forward to
>>> collaborating with him in the open source community.
>>> 
>>> His contributions include (but are not limited to):
>>> 
>>>   - *Direct contribution to code base:*
>>>     - *79 commits in total with most of the major components in hawq
>>>       involved. This shows that he has solid knowledge and skill of hawq.*
>>>     - 66 closed PRs:
>>>       https://github.com/apache/incubator-hawq/pulls?q=is%3Apr+user%3Axunzhang+author%3Axunzhang+is%3Aclosed
>>>     - *9 features and code refactors including hawq register, hawq
>>>       extract, orc, libhdfs3, test infrastructure, etc.*
>>>       - HAWQ-991 <https://issues.apache.org/jira/browse/HAWQ-991>.
>>>         Write hawqregister to support registering tables from yaml files
>>>       - HAWQ-1012 <https://issues.apache.org/jira/browse/HAWQ-1012>.
>>>         Check whether the input yaml file for hawq register is valid
>>>       - HAWQ-1011 <https://issues.apache.org/jira/browse/HAWQ-1011>.
>>>         Check whether the table to be registered exists
>>>       - HAWQ-1033 <https://issues.apache.org/jira/browse/HAWQ-1033>.
>>>         Add --force option for hawq register
>>>       - HAWQ-1050 <https://issues.apache.org/jira/browse/HAWQ-1050>.
>>>         Support help without dash for register
>>>       - HAWQ-1034 <https://issues.apache.org/jira/browse/HAWQ-1034>.
>>>         Implement --repair option for hawq register
>>>       - HAWQ-1024 <https://issues.apache.org/jira/browse/HAWQ-1024>.
>>>         Add rollback system in hawq register
>>>       - HAWQ-1060 <https://issues.apache.org/jira/browse/HAWQ-1060>.
>>>         Refactor hawq register with better readability and quality
>>>       - HAWQ-1005 <https://issues.apache.org/jira/browse/HAWQ-1005>.
>>>         Add schema info, distribution policy info with Parquet format in
>>>         hawqextract
>>>       - HAWQ-1025 <https://issues.apache.org/jira/browse/HAWQ-1025>.
>>>         Add bucket number in the yaml file of hawq extract, modify
>>>         the actual elf for usage1
>>>       - HAWQ-796 <https://issues.apache.org/jira/browse/HAWQ-796>.
>>>         Extend orc library to support reading files from HDFS
>>>       - HAWQ-618 <https://issues.apache.org/jira/browse/HAWQ-618>.
>>>         Import libhdfs3 for internal management
>>>       - HAWQ-707 <https://issues.apache.org/jira/browse/HAWQ-707>.
>>>         Remove google test dependency from libhdfs3 and libyarn folder
>>>       - HAWQ-873 <https://issues.apache.org/jira/browse/HAWQ-873>.
>>>         Improve checking time for Travis CI
>>>       - HAWQ-735 <https://issues.apache.org/jira/browse/HAWQ-735>.
>>>         Add --with-thrift to control building thrift inside or not
>>>       - HAWQ-721 <https://issues.apache.org/jira/browse/HAWQ-721>.
>>>         New Feature Test Skeleton
>>>       - HAWQ-911 <https://issues.apache.org/jira/browse/HAWQ-911>.
>>>         Optimize and refactor makefile for feature test framework
>>>       - HAWQ-810 <https://issues.apache.org/jira/browse/HAWQ-810>.
>>>         Add stringFormat utility
>>>       - HAWQ-804 <https://issues.apache.org/jira/browse/HAWQ-804>.
>>>         Add feature test case for error table
>>>       - HAWQ-805 <https://issues.apac

Re: libhdfs3 development is still going on outside of ASF

2016-09-16 Thread Zhanwei Wang
Hi Roman,

I have created JIRA HAWQ-1058 to track the release of a libhdfs3 tarball and 
HAWQ-1059 to separate it into a new ASF repository, and tagged both as backlog. 

I also updated the README of the current libhdfs3 GitHub repository to 
declare that it is in read-only mode and to let its users know where it is 
going. I hope they will not get lost.

https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/commit/59789c90f2f42726900eb2e417eea88a670f0b3c



Best Regards

Zhanwei Wang
wan...@apache.org



> On September 16, 2016, at 3:40 PM, Zhanwei Wang <wan...@apache.org> wrote:
> 
>> Now, that is NOT to say that when you DO release you shouldn't be producing
>> multiple source tarballs. That's more than appropriate and will give your 
>> users
>> maximum benefit on both HAWQ and libhdfs3 side of things.
> 
> 
> Releasing a separate tarball for libhdfs3 would make it much easier for 
> libhdfs3 users to access the code.  
> 
> 
>> you guys can't quite master the release process yet. Splitting the project
>> into multiple repos will only make it worse. Master the release mechanics
>> and then we can talk about multiple repos.
> 
> 
> HAWQ already has enough difficulties with its release. I agree with you that 
> we should separate the repository for libhdfs3 after HAWQ establishes its 
> release process. When that is done, the result will be good for both the ASF 
> and libhdfs3’s users.
> 
> 
> Best Regards
> 
> Zhanwei Wang
> wan...@apache.org
> 
> 
> 
>> On September 16, 2016, at 1:20 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>> 
>> On Wed, Sep 14, 2016 at 11:38 PM, Zhanwei Wang <wan...@apache.org> wrote:
>>> Hi Roman,
>>> 
>>> I think I have said enough about the benefits and drawbacks of merging two 
>>> independent projects.
>>> Let me propose an approach and see whether it can make both the ASF and 
>>> libhdfs3’s users happy. I need your advice.
>>> 
>>> 
>>> Is it possible to have two git repositories in the ASF for the HAWQ 
>>> incubator project? If it is, I propose solving the libhdfs3 issue like this:
>>> 
>>> 1) Create a new git repository in the ASF and push all of libhdfs3’s code 
>>> and branches from GitHub to the ASF.
>>> 2) Make libhdfs3’s GitHub repository a read-only mirror of the ASF 
>>> repository. We may need to transfer ownership of the GitHub repository from 
>>> Pivotal to the ASF on GitHub.
>>> 3) HAWQ keeps the stable version of the libhdfs3 code, or just a Git 
>>> reference.
>>> 
>>> 
>>> In this way, we keep libhdfs3 independent and keep all its pull requests, 
>>> wiki, issues and history. Most importantly, libhdfs3 can follow ASF 
>>> rules and process. People can file pull requests on GitHub and commit to 
>>> the ASF repository, and the commits are eventually mirrored to GitHub.
>>> 
>>> 
>>> Any comments?
>> 
>> It is possible, but at this point I will strongly recommend against
>> it. As it is,
>> you guys can't quite master the release process yet. Splitting the project
>> into multiple repos will only make it worse. Master the release mechanics
>> and then we can talk about multiple repos.
>> 
>> Now, that is NOT to say that when you DO release you shouldn't be producing
>> multiple source tarballs. That's more than appropriate and will give your 
>> users
>> maximum benefit on both HAWQ and libhdfs3 side of things.
>> 
>> Thanks,
>> Roman.
> 
> 



[jira] [Created] (HAWQ-1059) separate libhdfs3 into an independent repository

2016-09-16 Thread Zhanwei Wang (JIRA)
Zhanwei Wang created HAWQ-1059:
--

 Summary: separate libhdfs3 into an independent repository
 Key: HAWQ-1059
 URL: https://issues.apache.org/jira/browse/HAWQ-1059
 Project: Apache HAWQ
  Issue Type: Test
  Components: libhdfs
Reporter: Zhanwei Wang
Assignee: Lei Chang


As discussed on the mailing list: separate libhdfs3 into an independent 
repository after HAWQ establishes its release process.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1058) Create a separated tarball for libhdfs3

2016-09-16 Thread Zhanwei Wang (JIRA)
Zhanwei Wang created HAWQ-1058:
--

 Summary: Create a separated tarball for libhdfs3
 Key: HAWQ-1058
 URL: https://issues.apache.org/jira/browse/HAWQ-1058
 Project: Apache HAWQ
  Issue Type: Test
  Components: libhdfs
Reporter: Zhanwei Wang
Assignee: Lei Chang


As discussed on the dev mailing list: Roman proposed creating a separate 
tarball for libhdfs3 at each HAWQ release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: libhdfs3 development is still going on outside of ASF

2016-09-16 Thread Zhanwei Wang
> Now, that is NOT to say that when you DO release you shouldn't be producing
> multiple source tarballs. That's more than appropriate and will give your 
> users
> maximum benefit on both HAWQ and libhdfs3 side of things.


Releasing a separate tarball for libhdfs3 would make it much easier for 
libhdfs3 users to access the code.  


> you guys can't quite master the release process yet. Splitting the project
> into multiple repos will only make it worse. Master the release mechanics
> and then we can talk about multiple repos.


HAWQ already has enough difficulties with its release. I agree with you that we 
should separate the repository for libhdfs3 after HAWQ establishes its release 
process. When that is done, the result will be good for both the ASF and 
libhdfs3’s users.


Best Regards

Zhanwei Wang
wan...@apache.org



> On September 16, 2016, at 1:20 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> On Wed, Sep 14, 2016 at 11:38 PM, Zhanwei Wang <wan...@apache.org> wrote:
>> Hi Roman,
>> 
>> I think I have said enough about the benefits and drawbacks of merging two 
>> independent projects.
>> Let me propose an approach and see whether it can make both the ASF and 
>> libhdfs3’s users happy. I need your advice.
>> 
>> 
>> Is it possible to have two git repositories in the ASF for the HAWQ 
>> incubator project? If it is, I propose solving the libhdfs3 issue like this:
>> 
>> 1) Create a new git repository in the ASF and push all of libhdfs3’s code 
>> and branches from GitHub to the ASF.
>> 2) Make libhdfs3’s GitHub repository a read-only mirror of the ASF 
>> repository. We may need to transfer ownership of the GitHub repository from 
>> Pivotal to the ASF on GitHub.
>> 3) HAWQ keeps the stable version of the libhdfs3 code, or just a Git 
>> reference.
>> 
>> 
>> In this way, we keep libhdfs3 independent and keep all its pull requests, 
>> wiki, issues and history. Most importantly, libhdfs3 can follow ASF rules 
>> and process. People can file pull requests on GitHub and commit to the ASF 
>> repository, and the commits are eventually mirrored to GitHub.
>> 
>> 
>> Any comments?
> 
> It is possible, but at this point I will strongly recommend against
> it. As it is,
> you guys can't quite master the release process yet. Splitting the project
> into multiple repos will only make it worse. Master the release mechanics
> and then we can talk about multiple repos.
> 
> Now, that is NOT to say that when you DO release you shouldn't be producing
> multiple source tarballs. That's more than appropriate and will give your 
> users
> maximum benefit on both HAWQ and libhdfs3 side of things.
> 
> Thanks,
> Roman.



Re: libhdfs3 development is still going on outside of ASF

2016-09-15 Thread Zhanwei Wang
Hi Roman,

I think I have said enough about the benefits and drawbacks of merging two 
independent projects. 
Let me propose an approach and see whether it can make both the ASF and 
libhdfs3’s users happy. I need your advice.


Is it possible to have two git repositories in the ASF for the HAWQ incubator 
project? If it is, I propose solving the libhdfs3 issue like this:

1) Create a new git repository in the ASF and push all of libhdfs3’s code and 
branches from GitHub to the ASF.
2) Make libhdfs3’s GitHub repository a read-only mirror of the ASF repository. 
We may need to transfer ownership of the GitHub repository from Pivotal to the 
ASF on GitHub.
3) HAWQ keeps the stable version of the libhdfs3 code, or just a Git reference.


In this way, we keep libhdfs3 independent and keep all its pull requests, wiki, 
issues and history. Most importantly, libhdfs3 can follow ASF rules and 
process. People can file pull requests on GitHub and commit to the ASF 
repository, and the commits are eventually mirrored to GitHub.

 
Any comments?


Best Regards

Zhanwei Wang
wan...@apache.org



> On September 15, 2016, at 2:19 PM, Zhanwei Wang <wan...@apache.org> wrote:
> 
>> Open source is about community first.
> 
> Good point, Kyle. I strongly agree with you!
> 
> But unfortunately it seems no one in this thread cares about libhdfs3’s 
> community (users) except me. We are actively ignoring the frustration of 
> libhdfs3 users and are about to delete its repository.
> 
> 
> So let’s set the tone of this thread.
> 
> If we remove libhdfs3’s repository or make it read only:
>  a. What benefit do we get for BOTH HAWQ and libhdfs3’s users?
>  b. What are the drawbacks for BOTH HAWQ and libhdfs3’s users?
> 
> 
> 
> The following is my answer.
> 
> a. Benefit: For HAWQ, it seems the ASF governs its property under ASF rules. 
> For libhdfs3’s users, none.
> 
> b. Drawback: For HAWQ, irrelevant commits will come into HAWQ’s commit log. 
> JIRAs and pull requests will be filed in HAWQ but not related to HAWQ.  
> Furthermore, a commit in libhdfs3 may break HAWQ, and that is hard to debug; 
> I have experienced it often enough. It is important to use the stable version 
> of libhdfs3; HAWQ code should only keep the stable version of libhdfs3.
> 
> For libhdfs3’s users, they have to ask questions in HAWQ’s community. They 
> have to clone the entire HAWQ repository to build libhdfs3 and contribute.
> 
> Let’s think further. How do we schedule a release of libhdfs3 while HAWQ is 
> under development? Should we branch HAWQ for libhdfs3’s release? Should we 
> merge libhdfs3’s pull requests while we are releasing HAWQ? Do we have to sync 
> the release processes of HAWQ and libhdfs3, and how?
> 
> Maybe we should involve libhdfs3’s users in this thread. But 
> unfortunately they are not on HAWQ’s mailing list. See, this is another big 
> issue. Discussing dropping libhdfs3’s repository on HAWQ’s mailing list 
> without libhdfs3’s users involved seems odd. Imagine this: one day the 
> repository you are working with is gone and you did not even know about this 
> discussion.
> 
> If anyone wants to discuss whether we should drop libhdfs3’s repository, the 
> better place is libhdfs3’s repository.
> 
> In general, merging two independent projects introduces more trouble than 
> benefit. 
> 
> To be clear, I’m not against ASF rules. I deeply understand their importance. 
> Is there any way to keep HAWQ and libhdfs3 separate and make both the ASF 
> and libhdfs3’s users happy? Just as Kyle said, “HOW” is more important. 
> 
> @Roman, your mentoring is important.
> 
> 
> Any comments?
> 
> 
> Best Regards
> 
> Zhanwei Wang
> wan...@apache.org
> 
> 
> 
>> On September 15, 2016, at 12:54 PM, Kyle Dunn <kd...@pivotal.io> wrote:
>> 
>> Chiming in here only as a casual but concerned observer.
>> 
>> Open source is about community first. If the logistics around "where"
>> libhdfs3 lives rather than the much more important issue of "how" it lives
>> are the focus here, I think we've missed the real issue.
>> 
>> For what it's worth, I concur with others, let's move it to HAWQ
>> exclusively and move on to addressing the community, starting with the
>> decision being made and how/where future contributions can be made.
>> 
>> My brief scan of libhdfs3 shows numerous open pull requests (with
>> apparently useful contributions) and several loose ends "issues". We need
>> to communicate effectively to these contributors whether those PRs and
>> issues are valuable and relevant. This type of engagement is what OSS
>> projects live and die by. We need to be better, starting with libhdfs3,
>> into HAWQ, and beyond.
>> 
>> "Open source isn't someone else's job" - it's everyone's job. I'm
>> challenging everyone with commit resp

Re: libhdfs3 development is still going on outside of ASF

2016-09-15 Thread Zhanwei Wang
> Open source is about community first.

Good point, Kyle. I strongly agree with you!

But unfortunately it seems no one in this thread cares about libhdfs3’s 
community (users) except me. We are actively ignoring the frustration of 
libhdfs3 users and are about to delete its repository.


So let’s set the tone of this thread.

 If we remove libhdfs3’s repository or make it read only:
  a. What benefit do we get for BOTH HAWQ and libhdfs3’s users?
  b. What are the drawbacks for BOTH HAWQ and libhdfs3’s users?



The following is my answer.

a. Benefit: For HAWQ, it seems the ASF governs its property under ASF rules.  
For libhdfs3’s users, none.

b. Drawback: For HAWQ, irrelevant commits will come into HAWQ’s commit log. 
JIRAs and pull requests will be filed in HAWQ but not related to HAWQ.  
Furthermore, a commit in libhdfs3 may break HAWQ, and that is hard to debug; I 
have experienced it often enough. It is important to use the stable version of 
libhdfs3; HAWQ code should only keep the stable version of libhdfs3.

For libhdfs3’s users, they have to ask questions in HAWQ’s community. They 
have to clone the entire HAWQ repository to build libhdfs3 and contribute.

Let’s think further. How do we schedule a release of libhdfs3 while HAWQ is 
under development? Should we branch HAWQ for libhdfs3’s release? Should we merge 
libhdfs3’s pull requests while we are releasing HAWQ? Do we have to sync the 
release processes of HAWQ and libhdfs3, and how?

Maybe we should involve libhdfs3’s users in this thread. But 
unfortunately they are not on HAWQ’s mailing list. See, this is another big 
issue. Discussing dropping libhdfs3’s repository on HAWQ’s mailing list without 
libhdfs3’s users involved seems odd. Imagine this: one day the repository you 
are working with is gone and you did not even know about this discussion.

If anyone wants to discuss whether we should drop libhdfs3’s repository, the 
better place is libhdfs3’s repository.

In general, merging two independent projects introduces more trouble than 
benefit. 

To be clear, I’m not against ASF rules. I deeply understand their importance. 
Is there any way to keep HAWQ and libhdfs3 separate and make both the ASF and 
libhdfs3’s users happy? Just as Kyle said, “HOW” is more important. 

@Roman, your mentoring is important.


Any comments?


Best Regards

Zhanwei Wang
wan...@apache.org



> On September 15, 2016, at 12:54 PM, Kyle Dunn <kd...@pivotal.io> wrote:
> 
> Chiming in here only as a casual but concerned observer.
> 
> Open source is about community first. If the logistics around "where"
> libhdfs3 lives rather than the much more important issue of "how" it lives
> are the focus here, I think we've missed the real issue.
> 
> For what it's worth, I concur with others, let's move it to HAWQ
> exclusively and move on to addressing the community, starting with the
> decision being made and how/where future contributions can be made.
> 
> My brief scan of libhdfs3 shows numerous open pull requests (with
> apparently useful contributions) and several loose ends "issues". We need
> to communicate effectively to these contributors whether those PRs and
> issues are valuable and relevant. This type of engagement is what OSS
> projects live and die by. We need to be better, starting with libhdfs3,
> into HAWQ, and beyond.
> 
> "Open source isn't someone else's job" - it's everyone's job. I'm
> challenging everyone with commit responsibility on repos to value community
> input (both code and issues) as highly as your own backlog. Pay it forward
> and maybe the community will start shrinking your backlog unexpectedly.
> 
> 
> -Kyle
> 
> On Wed, Sep 14, 2016, 21:33 Lei Chang <chang.lei...@gmail.com> wrote:
> 
>> 
>> There was a short discussion before when we moved libhfds3 to HAWQ repo.
>> 
>> http://mail-archives.apache.org/mod_mbox/incubator-hawq-dev/201602.mbox/%3cCAE44UQe1xgcVOC76T_mgVbgGbR=Lx=xubpvw18zk4iz3euc...@mail.gmail.com%3e
>> I think it makes sense to keep libhdfs3 only in the HAWQ repo to simplify
>> Apache builds and releases in the current phase. This is what we have done in
>> the past. But it looks like not everyone is on the same page.
>> Cheers
>> Lei
>> 
>> On Thu, Sep 15, 2016 at 11:12 AM +0800, "Greg Chase" <g...@gregchase.com>
>> wrote:
>> 
>> It's fine if libhdfs3 is under a third-party license, and is treated that way.
>> 
>> However, why does Apache HAWQ want to be dependent on some strange 3rd
>> party library with no transparency?
>> 
>> We are having enough difficulties just getting our first release out.
>> 
>> Is there a compelling reason why we need to keep up with the independently
>> developed libhdfs3 project? 

Re: libhdfs3 development is still going on outside of ASF

2016-09-14 Thread Zhanwei Wang
> But my concern is the current users of libhdfs3 and all the pull requests,
> wiki docs and issues. Another uncertain aspect from my perspective is that
> although HAWQ could not run without libhdfs3, libhdfs3 could be used in
> other open source projects, that might be the true meaning of making
> libhdfs3 open source at the beginning.


That’s what I am concerned about. We should think about others before we take action. 
Users have already shown their frustration in HAWQ-1046. 

libhdfs3 was open sourced as an independent project before Apache HAWQ was born. People 
contributed to it before Apache HAWQ was born, and I do not think they all signed 
a contribution license agreement with the ASF. 

When Apache HAWQ started incubating, libhdfs3 was not part of it. HAWQ users 
had to build and install libhdfs3, like other third-party dependencies, before 
building Apache HAWQ. See the commit 
https://github.com/apache/incubator-hawq/commit/8b26974cd8d6e1d824f274eb4a68f950fd94156c


I really do not mind who governs it or what kind of process it follows. I care about 
what trouble will be caused for libhdfs3’s users if we drop libhdfs3’s 
repository. Please note HAWQ is not the only user.

Importing libhdfs3 into HAWQ also causes trouble for HAWQ. A commit to 
libhdfs3 is probably not related to HAWQ, but it will interfere with HAWQ. HAWQ 
should not always track the latest development version of libhdfs3; the stable 
version of libhdfs3 is best for HAWQ. If we import libhdfs3 into HAWQ, we have 
to schedule the release process of both HAWQ and libhdfs3, and a libhdfs3 commit 
may break HAWQ severely.

What is the benefit of deleting libhdfs3’s repository, other than the ASF declaring its 
governance (I also doubt whether libhdfs3 is included in the HAWQ donation license, 
but it is OK with me if it is governed by the ASF)? In my opinion it only causes 
trouble for libhdfs3’s users and HAWQ. 

Keeping the current status (treating libhdfs3 as a third-party dependency and copying 
its stable version into HAWQ) is best for HAWQ.




Best Regards

Zhanwei Wang
wan...@apache.org



> On Sep 15, 2016, at 10:54 AM, Hong Wu <xunzhang...@gmail.com> wrote:
> 
> In my opinion, I think it is reasonable to transfer the third-party repo of
> libhdfs3 totally into HAWQ, not only for the convenience of HAWQ build, but
> also for the consideration of ASF project. So for HAWQ project, I am with
> Roman.
> 
> But my concern is the current users of libhdfs3 and all the pull requests,
> wiki docs and issues. Another uncertain aspect from my perspective is that
> although HAWQ could not run without libhdfs3, libhdfs3 could be used in
> other open source projects, that might be the true meaning of making
> libhdfs3 open source at the beginning.
> 
> In summary, if it is really against the spirit of an ASF project for HAWQ, a
> suggested way might be marking the original libhdfs3 repo as a legacy repo
> instead of removing it.
> 
> Best
> Hong
> 
> 2016-09-15 10:04 GMT+08:00 Zhanwei Wang <wan...@apache.org>:
> 
>> Currently libhdfs3’s official code is not the same as the copy in HAWQ. Some
>> new code has not been copied into HAWQ. I do not think code changes to libhdfs3
>> should follow HAWQ’s commit process, because many changes are not related to
>> HAWQ.
>> 
>> From the HAWQ side, I suggest keeping the stable version of its third-party
>> libraries and copying new libhdfs3 code only when necessary.
>> 
>> libhdfs3 was open sourced years before HAWQ started incubating, with separate
>> permission from its authority. So in my opinion it is a third party, and it
>> actually was a third party before HAWQ started incubating. And HAWQ is not the
>> only user.
>> 
>> 
>> 
>> Best Regards
>> 
>> Zhanwei Wang
>> wan...@apache.org
>> 
>> 
>> 
>>> On Sep 15, 2016, at 9:35 AM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
>>> 
>>> On Wed, Sep 14, 2016 at 6:29 PM, Zhanwei Wang <wan...@apache.org> wrote:
>>>> Hi Roman
>>>> 
>>>> libhdfs3 works as a third-party library of HAWQ. Just for the convenience
>>>> of the HAWQ release process, we copy its code into HAWQ. The reason is that
>>>> HAWQ used to depend on a specific version of libhdfs3, libhdfs3 is only
>>>> distributed as source code, and its build process is complicated.
>>> 
>>> I actually don't buy this argument. libhdfs3 is not an optional dependency
>>> for HAWQ like ORCA is (for example). Without libhdfs3 it's pretty tough to
>>> imagine HAWQ. As such the code base needs to be governed as part of the ASF
>>> project, not a random GitHub dependency.
>>> 
>>> IOW, let me ask you this: were all the changes that went into libh

Re: libhdfs3 development is still going on outside of ASF

2016-09-14 Thread Zhanwei Wang
Currently libhdfs3’s official code is not the same as the copy in HAWQ. Some new 
code has not been copied into HAWQ. I do not think code changes to libhdfs3 should 
follow HAWQ’s commit process, because many changes are not related to HAWQ. 

From the HAWQ side, I suggest keeping the stable version of its third-party 
libraries and copying new libhdfs3 code only when necessary.

libhdfs3 was open sourced years before HAWQ started incubating, with separate 
permission from its authority. So in my opinion it is a third party, and it 
actually was a third party before HAWQ started incubating. And HAWQ is not the 
only user.



Best Regards

Zhanwei Wang
wan...@apache.org



> On Sep 15, 2016, at 9:35 AM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> On Wed, Sep 14, 2016 at 6:29 PM, Zhanwei Wang <wan...@apache.org> wrote:
>> Hi Roman
>> 
>> libhdfs3 works as a third-party library of HAWQ. Just for the convenience of
>> the HAWQ release process, we copy its code into HAWQ. The reason is that HAWQ
>> used to depend on a specific version of libhdfs3, libhdfs3 is only distributed
>> as source code, and its build process is complicated.
> 
> I actually don't buy this argument. libhdfs3 is not an optional dependency
> for HAWQ like ORCA is (for example). Without libhdfs3 it's pretty tough to
> imagine HAWQ. As such the code base needs to be governed as part of the ASF
> project, not a random GitHub dependency.
> 
> IOW, let me ask you this: were all the changes that went into the libhdfs3
> that is part of HAWQ discussed and reviewed via the ASF development process,
> or did you just import them from time to time as this comment suggests:
>
> https://issues.apache.org/jira/browse/HAWQ-1046?focusedCommentId=15489669&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15489669
> ?
> 
>> I do not think we have any reason to shut down a third party’s official 
>> repository.
> 
> You say 3rd party as though it's not just you guys maintaining it on the side.
> 
>> We also copy the Google Test source code into HAWQ, just as we did for 
>> libhdfs3.
> 
> But this is very different. You don't do any development (certainly
> you don't do any non-trivial development) of that code.
> 
>> libhdfs3 is open source under the Apache License, Version 2.0, the same as 
>> HAWQ. So I believe there is no license issue.
> 
> You're correct. There's no licensing issue but there's a pretty significant
> governance issue.
> 
> Thanks,
> Roman.
> 



Re: libhdfs3 development is still going on outside of ASF

2016-09-14 Thread Zhanwei Wang
Hi Roman

libhdfs3 works as a third-party library of HAWQ. Just for the convenience of the 
HAWQ release process, we copy its code into HAWQ. The reason is that HAWQ used to 
depend on a specific version of libhdfs3, libhdfs3 is only distributed as 
source code, and its build process is complicated.

I do not think we have any reason to shut down a third party’s official 
repository. We also copy the Google Test source code into HAWQ, just as we did 
for libhdfs3.

libhdfs3 is open source under the Apache License, Version 2.0, the same as HAWQ. 
So I believe there is no license issue. 


Best Regards

Zhanwei Wang
wan...@apache.org



> On Sep 15, 2016, at 8:42 AM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> 
> Hi!
> 
> a good discussion over at:
> https://issues.apache.org/jira/browse/HAWQ-1046
> highlighted the fact that there still seems to be
> non-trivial HAWQ development going on outside of
> ASF repos. This is contrary to my understanding and
> we have to come up with a plan of how to shut down
> that repo (or make it a read-only mirror).
> 
> If you really need a separate repo we can create
> an extra one on the ASF side, but at this point it is
> likely going to complicate your release mechanics
> which I really don't recommend until you get a few
> releases under your belt.
> 
> If there are any other outstanding issues -- let's discuss
> those on this thread.
> 
> Thanks,
> Roman.
> 



Re: enforce -Werror (if gcc) in hawq?

2016-09-05 Thread Zhanwei Wang
Hi Hong

I removed the -Werror flag when we pushed HAWQ to open source. At Pivotal we built 
HAWQ on a specific OS with a specific GCC and specific dependent libraries. The 
-Werror flag worked fine since we fixed all warnings. But for an open source 
project, it is impossible to make users and contributors build HAWQ on a specific 
OS and compiler like we did before. We cannot say we only support HAWQ on 
CentOS 6 with GCC 4.4.2.

So if we enforce the -Werror flag, I guess the build will fail in many environments.

I propose three solutions; let’s discuss which is better.

1) Do not enforce the -Werror flag in the build, but add it to our tests (Concourse 
and Travis) like this:
./configure --prefix=/path/to/install CFLAGS="-Werror"

This way we can enforce the -Werror flag in our tested environments.


2) Only enforce the -Werror flag on development builds, and remove it on release builds.


3) Enforce the -Werror flag and set up more test environments (different versions 
of CentOS, Ubuntu and SUSE with the default compiler and the latest GCC, and macOS 
with the default compiler and the latest clang).
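
To make option 2 concrete, here is a minimal Python sketch of how a build wrapper could add -Werror only for development builds. The helper name configure_args and its parameters are hypothetical and are not part of the HAWQ build system; the sketch only assembles the ./configure command line shown in option 1. Note that configure's own probe programs may emit warnings, so passing -Werror at configure time may need adjusting in practice.

```python
def configure_args(prefix, dev_build, extra_cflags=()):
    """Build a ./configure argument list (hypothetical helper, not HAWQ code).

    -Werror is appended to CFLAGS only when dev_build is True, so release
    builds stay tolerant of warnings from unfamiliar compilers.
    """
    cflags = list(extra_cflags)
    if dev_build:
        cflags.append("-Werror")

    args = ["./configure", "--prefix=" + prefix]
    if cflags:
        # CFLAGS is passed as a single VAR=value argument to configure.
        args.append("CFLAGS=" + " ".join(cflags))
    return args


if __name__ == "__main__":
    # Development build: warnings become errors.
    print(" ".join(configure_args("/path/to/install", dev_build=True)))
    # Release build: no -Werror.
    print(" ".join(configure_args("/path/to/install", dev_build=False)))
```

A CI job for a tested environment would call this with dev_build=True, while the release job would pass dev_build=False.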


Any comments?


Best Regards

Zhanwei Wang
wan...@apache.org



> On Sep 6, 2016, at 9:22 AM, Ed Espino <esp...@apache.org> wrote:
> 
> +1
> 
> Have we considered setting up separate public Concourse pipelines to try
> the various build scenarios?
> 
> -=e
> 
> On Tue, Sep 6, 2016 at 12:58 AM, Hong Wu <xunzhang...@gmail.com> wrote:
> 
>> Ming's comment makes sense, but I think it belongs in another thread. I have
>> already tried those[1], but there is some further work to do[2].
>> 
>> [1]
>> - clang analysis scan report: I uploaded the result for a not-so-fresh HAWQ
>> to my personal site; please check the report out here
>> <http://xunzhangthu.org/tmp/hawq_check> if you are interested.
>> - coverity scan: the latest report is here
>> <https://scan.coverity.com/projects/apache-incubator-hawq>. If you want to
>> see the defects in detail, you need to submit a permission request.
>> 
>> [2]
>> - Make the clang analysis scan report generate periodically and publish
>> to the public automatically. I suggest we donate a domain such as hawq.io
>> and a host for it. Also, to publish the report automatically, we need to
>> write a web service to answer requests and serve the HTML data.
>> 
>> - The Travis CI script has already integrated the coverity scan service using
>> a GitHub webhook. We need to create a coverity_scan branch for hawq and then
>> modify the .travis.yml file. I have done that under a Red Hat environment.
>> However, the only Linux environment the Travis servers support is
>> Ubuntu/Debian. Although hawq could be built under Ubuntu, it needs extra
>> effort to extend the .travis.yml script to support that. For the OS X
>> environment, I am not sure what the problem is; the issue is that the
>> report could not be sent to the coverity scan server automatically.
>> 
>> ps: I think Chunlin <https://github.com/wcl14> is starting to work on the
>> defects reported by coverity.
>> 
>> 
>> Back to the main thread started by Paul, I think we should just try to turn
>> on the flag and discuss the errors after enabling -Werror.
>> 
>> Best
>> Hong
>> 
>> 
>> 2016-09-05 21:51 GMT+08:00 Ming Li <m...@pivotal.io>:
>> 
>>> Good suggestion.
>>> 
>>> However, IMHO, we may need to first enable the coverity scan check or clang
>>> analysis scan. Also we should put the output of these checks on a public
>>> server so that all contributors can access them.
>>> 
>>> On Mon, Sep 5, 2016 at 6:05 PM, Paul Guo <paul...@gmail.com> wrote:
>>> 
>>>> -Werror
>>>> Make all warnings into errors.
>>>> I've seen many cases (not just hawq) before where ignoring gcc warnings
>>>> leads to bugs. I'm wondering whether we should add the option for the gcc
>>>> case. Given there may be a lot of warnings when building the common postgres
>>>> code in hawq, we could at least enforce it in our own code at first
>>>> (src/backend/cdb, src/backend/resourcemanager, src/test/feature, other
>>>> directories?)? Any suggestion?
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> *Ed Espino*



Re: Versioning for libhdfs3

2016-05-31 Thread Zhanwei Wang
The version of libhdfs3 can be found in its source code: 
https://github.com/apache/incubator-hawq/blob/master/depends/libhdfs3/src/CMakeLists.txt

Personally I do not like recording the version in a document, because if someone 
updates the version but not the document, the two become inconsistent. 

As libhdfs3 is a module of HAWQ, perhaps it is better to record all libhdfs3 
changes in the HAWQ release notes, including version changes.
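
For anyone who wants to read the version from the build file rather than from a document, here is a minimal Python sketch. It assumes the version is declared through SET(libhdfs3_VERSION_MAJOR ...), SET(libhdfs3_VERSION_MINOR ...) and SET(libhdfs3_VERSION_PATCH ...) lines; those variable names and the sample text are assumptions for illustration and should be checked against the actual CMakeLists.txt.

```python
import re

# Assumed variable naming; verify against the real CMakeLists.txt.
_VERSION_RE = re.compile(
    r"SET\s*\(\s*(?P<project>\w+)_VERSION_(?P<part>MAJOR|MINOR|PATCH)\s+(?P<value>\d+)\s*\)",
    re.IGNORECASE,
)


def read_version(cmake_text, project="libhdfs3"):
    """Return 'MAJOR.MINOR.PATCH' parsed from CMake text, or None if incomplete."""
    parts = {}
    for match in _VERSION_RE.finditer(cmake_text):
        if match.group("project").lower() == project.lower():
            parts[match.group("part").upper()] = int(match.group("value"))
    if set(parts) != {"MAJOR", "MINOR", "PATCH"}:
        return None
    return "{MAJOR}.{MINOR}.{PATCH}".format(**parts)


# Hypothetical sample input, not copied from the repository.
SAMPLE = """
SET(libhdfs3_VERSION_MAJOR 2)
SET(libhdfs3_VERSION_MINOR 2)
SET(libhdfs3_VERSION_PATCH 31)
"""

if __name__ == "__main__":
    print(read_version(SAMPLE))
```

Reading the value from the build file keeps a single source of truth, which avoids the document and code drifting apart.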



Best Regards

Zhanwei Wang
wan...@apache.org



> On May 31, 2016, at 9:44 PM, Kristopher Overholt <koverh...@continuum.io> wrote:
> 
> Is there some place where I can track libhdfs3 versions/releases that
> correspond to the commit history? If not, could something like this be
> added, perhaps in documentation within incubator-hawq/depends/libhdfs3 ?
> 
> On Tue, May 31, 2016 at 8:27 AM, Zhanwei Wang <wan...@apache.org> wrote:
> 
>> We moved the libhdfs3 code to HAWQ, but it is still an independent module of
>> HAWQ. I suggest keeping its version independent so that others can use
>> libhdfs3 independently.
>> 
>> 
>> 
>> 
>> Best Regards
>> 
>> Zhanwei Wang
>> wan...@apache.org
>> 
>> 
>> 
>>> On May 31, 2016, at 8:36 PM, Kristopher Overholt <koverh...@continuum.io> wrote:
>>> 
>>> Hi,
>>> 
>>> Thank you for your work on the libhdfs3 library.
>>> 
>>> Now that libhdfs3 has moved to the Apache HAWQ repository, how will the
>>> version of libhdfs3 be tagged? This older page lists v2.2.31 as the most
>>> recent libhdfs3 release:
>>> 
>>> https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/releases
>>> 
>>> But the releases on this page are only for the HAWQ project:
>>> 
>>> https://github.com/apache/incubator-hawq/releases
>>> 
>>> Is there a new page that I can find the release versions or tags for
>>> libhdfs3?
>> 
>> 



Re: Versioning for libhdfs3

2016-05-31 Thread Zhanwei Wang
We moved the libhdfs3 code to HAWQ, but it is still an independent module of HAWQ. 
I suggest keeping its version independent so that others can use libhdfs3 
independently.




Best Regards

Zhanwei Wang
wan...@apache.org



> On May 31, 2016, at 8:36 PM, Kristopher Overholt <koverh...@continuum.io> wrote:
> 
> Hi,
> 
> Thank you for your work on the libhdfs3 library.
> 
> Now that libhdfs3 has moved to the Apache HAWQ repository, how will the
> version of libhdfs3 be tagged? This older page lists v2.2.31 as the most
> recent libhdfs3 release:
> 
> https://github.com/Pivotal-Data-Attic/pivotalrd-libhdfs3/releases
> 
> But the releases on this page are only for the HAWQ project:
> 
> https://github.com/apache/incubator-hawq/releases
> 
> Is there a new page that I can find the release versions or tags for
> libhdfs3?



Re: [VOTE] HAWQ 2.0.0-beta-incubating RC4

2016-01-26 Thread Zhanwei Wang
+1

Downloaded, deployed and tested

On Thu, Jan 21, 2016 at 7:34 AM, Ting(Goden) Yao <t...@pivotal.io> wrote:

> This is the 1st release for Apache HAWQ (incubating), version:
> 2.0.0-beta-incubating
>
> *It fixes the following issues:*
> Clear all IP related issues for HAWQ and this is a source code tarball only
> release.
> Full list of JIRAs fixed/related to the release: link
> <
> https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Release+2.0.0-beta-incubating
> >
>
> *** Please download, review and vote by *Friday 6pm Jan 22, 2016 PST* ***
>
> *We're voting upon the source (tag):*
> 2.0.0-beta-incubating-RC4
>
> *Source Files:*
>
> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.0.0-beta-incubating.RC4
>
> *Tag to be voted upon:*
>
> https://git-wip-us.apache.org/repos/asf?p=incubator-hawq.git;a=commit;h=1b11926fef3a7ca445238c157571494c03276a82
>
>
> *KEYS file containing PGP Keys we use to sign the release:*
> https://dist.apache.org/repos/dist/dev/incubator/hawq/KEYS
> …
>



-- 
Best Regards
--

Zhanwei Wang


Re: Crashed while hawq init master

2015-12-08 Thread Zhanwei Wang
Hi Leon

HAWQ-233 has been created for this issue.

https://issues.apache.org/jira/browse/HAWQ-233

On Tue, Dec 8, 2015 at 4:28 PM, Zhanwei Wang <zw...@pivotal.io> wrote:

> Hi Leon
>
> I have reproduced the issue. I will file a JIRA for it.
>
>
>
> On Sat, Dec 5, 2015 at 10:15 AM, Leon Zhang <leonca...@gmail.com> wrote:
>
>> Hi, HAWQ dev:
>>
>> I recently rebuilt the latest hawq on CentOS 7; it crashed during "hawq
>> init master". The stack dump looks like this:
>>
>>The files belonging to this database system will be owned by user
>> "xiaolin".
>> This user must also own the server process.
>>
>> The database cluster will be initialized with locale en_US.utf8.
>>
>> fixing permissions on existing directory
>> /mnt/xiaolin/hawq-data-directory/masterdd ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 1280
>> selecting default shared_buffers/max_fsm_pages ... 125MB/20
>> creating configuration files ... ok
>> creating template1 database in
>> /mnt/xiaolin/hawq-data-directory/masterdd/base/1 ... 2015-12-05
>> 02:00:44.038224
>>
>> GMT,,,p106570,th20386063360,,,seg-1,"WARNING","01000","""fsync"":
>> can not be set by the user and will be
>> ignored.""set_config_option","guc.c",9990,
>> ok
>> loading file-system persistent tables for template1 ...
>> 2015-12-05 02:00:50.473854
>>
>> GMT,,,p106586,th20678272000,,,seg-1,"WARNING","01000","""fsync"":
>> can not be set by the user and will be
>> ignored.""set_config_option","guc.c",9990,
>> ok
>> initializing pg_authid ... 2015-12-05 02:00:51.873844
>>
>> GMT,,,p106590,th-12675251200,,,seg-1,"WARNING","01000","""fsync"":
>> can not be set by the user and will be
>> ignored.""set_config_option","guc.c",9990,
>> 2015-12-05 10:00:52.434633
>>
>> CST,,,p106590,th-12675251200,,cmd1,seg-1,,,x6,sx1,"FATAL","XX000","wrong
>> number of index expressions (index.c:1186)",,"CREATE TRIGGER
>> pg_sync_pg_database   AFTER INSERT OR UPDATE OR DELETE ON pg_database
>>  FOR
>> EACH STATEMENT EXECUTE PROCEDURE flatfile_update_trigger();
>> ",,"FormIndexDatum","index.c",1186,1    0x8c2b28 postgres errstart (elog.c:473)
>> 2    0x8c489b postgres elog_finish (elog.c:1421)
>> 3    0x5735e5 postgres FormIndexDatum (index.c:1186)
>> 4    0x575030 postgres CatalogIndexInsert (discriminator 2)
>> 5    0x562f14 postgres caql_insert (caqlaccess.c:830)
>> 6    0x63fb38 postgres CreateTrigger (trigger.c:427)
>> 7    0x7ed0ec postgres ProcessUtility (utility.c:1578)
>> 8    0x7e8d6e postgres  (pquery.c:1885)
>> 9    0x7ea54e postgres  (pquery.c:1989)
>> 10   0x7ec2a5 postgres PortalRun (pquery.c:1510)
>> 11   0x7e41f1 postgres  (postgres.c:1732)
>> 12   0x7e56b4 postgres PostgresMain (postgres.c:4697)
>> 13   0x4a2982 postgres main (main.c:204)
>> 14   0x7f1eb174faf5 libc.so.6 __libc_start_main (??:0)
>> 15   0x4a2a1d postgres  (??:?)
>>
>> child process exited with exit code 1
>> initdb: removing contents of data directory
>> "/mnt/xiaolin/hawq-data-directory/masterdd"
>> Master postgres initdb failed
>> 20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[INFO]:-Master
>> postgres
>> initdb failed
>> 20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[ERROR]:-Master init
>> failed, exit
>>
>>
>>
>>
>> I have no idea why it crashed, any help will be appreciated.
>>
>> Thanks.
>>
>
>
>
> --
> Best Regards
> --
>
> Zhanwei Wang
>
>


-- 
Best Regards
--

Zhanwei Wang


Re: Crashed while hawq init master

2015-12-08 Thread Zhanwei Wang
Hi Leon

I have reproduced the issue. I will file a JIRA for it.



On Sat, Dec 5, 2015 at 10:15 AM, Leon Zhang <leonca...@gmail.com> wrote:

> Hi, HAWQ dev:
>
> I recently rebuilt the latest hawq on CentOS 7; it crashed during "hawq
> init master". The stack dump looks like this:
>
>The files belonging to this database system will be owned by user
> "xiaolin".
> This user must also own the server process.
>
> The database cluster will be initialized with locale en_US.utf8.
>
> fixing permissions on existing directory
> /mnt/xiaolin/hawq-data-directory/masterdd ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 1280
> selecting default shared_buffers/max_fsm_pages ... 125MB/20
> creating configuration files ... ok
> creating template1 database in
> /mnt/xiaolin/hawq-data-directory/masterdd/base/1 ... 2015-12-05
> 02:00:44.038224
>
> GMT,,,p106570,th20386063360,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9990,
> ok
> loading file-system persistent tables for template1 ...
> 2015-12-05 02:00:50.473854
>
> GMT,,,p106586,th20678272000,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9990,
> ok
> initializing pg_authid ... 2015-12-05 02:00:51.873844
>
> GMT,,,p106590,th-12675251200,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9990,
> 2015-12-05 10:00:52.434633
>
> CST,,,p106590,th-12675251200,,cmd1,seg-1,,,x6,sx1,"FATAL","XX000","wrong
> number of index expressions (index.c:1186)",,"CREATE TRIGGER
> pg_sync_pg_database   AFTER INSERT OR UPDATE OR DELETE ON pg_database   FOR
> EACH STATEMENT EXECUTE PROCEDURE flatfile_update_trigger();
> ",,"FormIndexDatum","index.c",1186,1    0x8c2b28 postgres errstart (elog.c:473)
> 2    0x8c489b postgres elog_finish (elog.c:1421)
> 3    0x5735e5 postgres FormIndexDatum (index.c:1186)
> 4    0x575030 postgres CatalogIndexInsert (discriminator 2)
> 5    0x562f14 postgres caql_insert (caqlaccess.c:830)
> 6    0x63fb38 postgres CreateTrigger (trigger.c:427)
> 7    0x7ed0ec postgres ProcessUtility (utility.c:1578)
> 8    0x7e8d6e postgres  (pquery.c:1885)
> 9    0x7ea54e postgres  (pquery.c:1989)
> 10   0x7ec2a5 postgres PortalRun (pquery.c:1510)
> 11   0x7e41f1 postgres  (postgres.c:1732)
> 12   0x7e56b4 postgres PostgresMain (postgres.c:4697)
> 13   0x4a2982 postgres main (main.c:204)
> 14   0x7f1eb174faf5 libc.so.6 __libc_start_main (??:0)
> 15   0x4a2a1d postgres  (??:?)
>
> child process exited with exit code 1
> initdb: removing contents of data directory
> "/mnt/xiaolin/hawq-data-directory/masterdd"
> Master postgres initdb failed
> 20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[INFO]:-Master postgres
> initdb failed
> 20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[ERROR]:-Master init
> failed, exit
>
>
>
>
> I have no idea why it crashed, any help will be appreciated.
>
> Thanks.
>



-- 
Best Regards
--

Zhanwei Wang


Re: Install failed when initdb (creating template0 instead template1)

2015-11-08 Thread Zhanwei Wang
ignored.""set_config_option","guc.c",9434,
> ok
> creating information schema ... 2015-11-08 06:35:08.803969
> GMT,,,p591765,th-18576527040,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9434,
> 2015-11-08 06:35:10.726271
> GMT,,,p591767,th-11932896960,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9434,
> ok
> creating HAWQ schema ... 2015-11-08 06:35:11.565366
> GMT,,,p591769,th1595330880,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9434,
> ok
> vacuuming database template0 ... 2015-11-08 06:35:12.913042
> GMT,,,p591771,th438743360,,,seg-1,"WARNING","01000","""fsync"": can
> not be set by the user and will be
> ignored.""set_config_option","guc.c",9434,
> ok
> updating content id ... 2015-11-08 06:35:19.245955
> GMT,,,p591775,th1360875840,,,seg-1,"WARNING","01000","""fsync"":
> can not be set by the user and will be
> ignored.""set_config_option","guc.c",9434,
> ok
>
> WARNING: enabling "trust" authentication for local connections
> You can change this by editing pg_hba.conf or using the -A option the
> next time you run initdb.
>
> Success. You can now start the database server using:
>
> /usr/local/hawq/./bin/postgres -D /data/hawq/master/gpseg-1
> or
> /usr/local/hawq/./bin/pg_ctl -D /data/hawq/master/gpseg-1 -l logfile
> start
>
>
> Thank you very much.
>
> Yiming Liu.




-- 
Best Regards
--

Zhanwei Wang


Re: Commit messages

2015-09-29 Thread Zhanwei Wang
Hi Caleb

The PR will automatically close if HAWQ-XXX is in the commit message.
"closes N" is not required.
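
As an illustration only, here is a small Python sketch of how a commit subject in the formats discussed in this thread could be checked. The regular expression and the function name parse_commit_subject are hypothetical and are not the actual Apache commit hook; the sketch accepts either "." or ":" after the JIRA number and treats the trailing "(closes N)" as optional.

```python
import re

# Hypothetical pattern for subjects such as
#   "HAWQ-1: remove .p4ignore files (closes 2)"
COMMIT_RE = re.compile(
    r"^HAWQ-(?P<jira>\d+)(?P<sep>[.:])\s+(?P<summary>.+?)"
    r"(?:\s+\(closes\s+#?(?P<pr>\d+)\))?$"
)


def parse_commit_subject(subject):
    """Return the parsed fields of a commit subject, or None if it does not match."""
    match = COMMIT_RE.match(subject.strip())
    if match is None:
        return None
    pr = match.group("pr")
    return {
        "jira": "HAWQ-" + match.group("jira"),
        "separator": match.group("sep"),
        "summary": match.group("summary"),
        "closes_pr": int(pr) if pr else None,
    }


if __name__ == "__main__":
    print(parse_commit_subject("HAWQ-1: remove .p4ignore files (closes 2)"))
    print(parse_commit_subject("fix typo"))
```

A check like this could run locally before pushing, so a missing JIRA number is caught early.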



On Wed, Sep 30, 2015 at 7:44 AM, Caleb Welton <cwel...@pivotal.io> wrote:

> Another related observation. The git commit hooks are set up to
> automatically close the GitHub PRs if your commit message ends in (closes
> N) where N is your PR number - Roman please correct me if I have the syntax
> wrong.
>
> So when dealing with a commit from github you might use:
>
> HAWQ-1: remove .p4ignore files (closes 2)
>
> Then the PR will autoclose.
>
> > On Sep 30, 2015, at 12:31 AM, Jimmy Da <jd...@cornell.edu> wrote:
> >
> > Not a big deal, just the OCD part of me want to bring this up.
> >
> > According to
> > https://cwiki.apache.org/confluence/display/HAWQ/Contributing+to+HAWQ,
> our
> > commit message should look like
> >
> > HAWQ-###*.* Commit message blah blah blah
> >
> > We use period '.' to separate the JIRA number and the commit message.
> >
> > I don't quite feel good about the '.', so I looked up a bunch of other
> > Apache projects and found their separators:
> > hbase uses just a space
> > storm, hive, samza use ":"
> >
> > I really like the ":" separator as it conveys the relationship between
> > the JIRA number and the message explaining the JIRA number.
> >
> > Can we change the commit messages to
> > HAWQ-###: Commit message blah blah blah
>



-- 
Best Regards
--

Zhanwei Wang