Re: Share Web-Based HAWQ Admin Tool

2018-08-05 Thread Lei Chang
nice!

On Sun, Aug 5, 2018 at 11:03 PM, zhouzhan.chen <
zhouzhan.c...@iluvatar.ai.invalid> wrote:

> Hi All:
>
>
> Over the past month, our team has been adapting pgAdmin4 so that it can be
> used as a management tool for HAWQ.
>
>
> The adapted pgAdmin4 is now able to meet basic needs, so we plan to share it
> with the community and hope that HAWQ users can use it together with us.
>
>
> The source code repository can be found here:
> https://github.com/SkyAI/PGAdmin4HAWQ.git
>
>
> All users can send bug reports to us by email: dp...@iluvatar.ai.
>
>
> We also welcome you to discuss pgAdmin4 adapted for HAWQ, or contribute to
> the project.
>
>
> Best Regards
>
>
>
> *Zhouzhan Chen*
>
> Platform Business Unit
>
> Nanjing Iluvatar CoreX Technology Co., Ltd.
>
> Iluvatar CoreX Inc.
>
> Tel: 18115151186  Web: http://iluvatar.ai
>
> 4F, Building 5, No.180, Ruan Jian Ave, Yuhuatai District, Nanjing, Jiangsu 
> Province
>
>
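HAWQ is PostgreSQL-derived, and pgAdmin4 talks to it over the ordinary
libpq/PostgreSQL wire protocol, so a quick way to sanity-check connectivity
before registering a server in the adapted pgAdmin4 is a small script such as
the sketch below. The host, port, database, and user names are illustrative
assumptions for a default setup, not values taken from the announcement.

# Hypothetical pre-check before pointing the adapted pgAdmin4 at a HAWQ
# cluster: open a libpq connection to the HAWQ master and run a trivial query.
import psycopg2

def check_hawq_connection(host="localhost", port=5432,
                          dbname="postgres", user="gpadmin"):
    """Connect to the HAWQ master and print its version string."""
    conn = psycopg2.connect(host=host, port=port, dbname=dbname, user=user)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version();")
            print(cur.fetchone()[0])
    finally:
        conn.close()

if __name__ == "__main__":
    check_hawq_connection()

If this succeeds, the same connection parameters should work when adding the
server inside the adapted pgAdmin4.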


Re: [DISCUSS] Graduate Apache HAWQ (incubating) as a TLP

2018-07-27 Thread Lei Chang
ce software, for distribution at no charge to
> >the public, related to Hadoop native SQL query engine that
> >combines the key technological advantages of MPP database
> >with the scalability and convenience of Hadoop.
> >
> >NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> >Committee (PMC), to be known as the "Apache HAWQ Project",
> >be and hereby is established pursuant to Bylaws of the
> >Foundation; and be it further
> >
> >RESOLVED, that the Apache HAWQ Project be and hereby is
> >responsible for the creation and maintenance of software
> >related to Hadoop native SQL query engine that
> >combines the key technological advantages of MPP database
> >with the scalability and convenience of Hadoop;
> >and be it further
> >
> >RESOLVED, that the office of "Vice President, Apache HAWQ" be
> >and hereby is created, the person holding such office to
> >serve at the direction of the Board of Directors as the chair
> >of the Apache HAWQ Project, and to have primary responsibility
> >for management of the projects within the scope of
> >responsibility of the Apache HAWQ Project; and be it further
> >
> >RESOLVED, that the persons listed immediately below be and
> >hereby are appointed to serve as the initial members of the
> >Apache HAWQ Project:
> >
> > * Alan Gates   
> > * Alexander Denissov   
> > * Amy Bai  
> > * Atri Sharma  
> > * Bhuvnesh Chaudhary   
> > * Bosco
> > * Chunling Wang
> > * David Yozie  
> > * Ed Espino
> > * Entong Shen  
> > * Foyzur Rahman
> > * Goden Yao
> > * Gregory Chase
> > * Hong Wu  
> > * Hongxu Ma
> > * Hubert Zhang 
> > * Ivan Weng
> > * Jesse Zhang          
> > * Jiali Yao
> > * Jun Aoki 
> > * Kavinder Dhaliwal
> > * Lav Jain 
> > * Lei Chang
> > * Lili Ma  
> > * Lirong Jian  
> > * Lisa Owen
> > * Ming Li  
> > * Mohamed Soliman  
> > * Newton Alex  
> > * Noa Horn 
> > * Oleksandr Diachenko  
> > * Paul Guo 
> > * Radar Da Lei 
> > * Roman Shaposhnik 
> > * Ruilong Huo  
> > * Shivram Mani 
> > * Shubham Sharma   
> > * Tushar Pednekar  
> > * Venkatesh Raghavan   
> > * Vineet Goel  
> > * Wen Lin  
> > * Xiang Sheng  
> > * Yi Jin   
> > * Zhanwei Wang 
> > * Zhenglin Tao 
> >
> >NOW, THEREFORE, BE IT FURTHER RESOLVED, that Lei Chang
> >be appointed to the office of Vice President, Apache HAWQ, to
> >serve in accordance with and subject to the direction of the
> >Board of Directors and the Bylaws of the Foundation until
> >death, resignation, retirement, removal or disqualification,
> >or until a successor is appointed; and be it further
> >
> >RESOLVED, that the initial Apache HAWQ PMC be and hereby is
> >tasked with the creation of a set of bylaws intended to
> >encourage open development and increased participation in the
> >Apache HAWQ Project; and be it further
> >
> >RESOLVED, that the Apache HAWQ Project be and hereby
> >is tasked with the migration and rationalization of the Apache
> >Incubator HAWQ podling; and be it further
> >
> >RESOLVED, that all responsibilities pertaining to the Apache
> >Incubator HAWQ podling encumbered upon the Apache Incubator
> >Project are hereafter discharged.
>


Re: July incubator report for HAWQ

2018-07-06 Thread Lei Chang
Thanks Radar for the reminder. Looks like some emails were lost.

@Alan, I have fixed it. It was not a conflict; the content was too long, so I
had added a separator line "-". I have removed it now. Thanks.

Cheers
Lei




On Fri, Jul 6, 2018 at 5:06 PM, Radar Lei  wrote:

> Hi Lei,
>
> FYI in case you did not notice this message. Thanks.
>
> Regards,
> Radar
>
> -- Forwarded message --
> From: Alan Gates 
> Date: Fri, Jul 6, 2018 at 1:55 AM
> Subject: July incubator report
> To: dev@hawq.incubator.apache.org
> Cc: jmcl...@apache.org
>
>
> I went to sign off on the July incubator report, but it looks like there is
> an unresolved editing conflict. It wasn't clear which parts to keep and which
> parts to remove, so I did not fix it.
> If the original authors can take a look and clean it up then I and other
> mentors can sign off on it.
>
> Alan.
>
>


Re: Oushu and Trademark use

2018-07-06 Thread Lei Chang
@Jim, Thanks for pointing this out. +trademarks@ for help.

If there is any misuse of the trademark, we will change it soon.

Cheers
Lei



On Fri, Jul 6, 2018 at 11:42 AM, Jim Apple  wrote:

> > That is an interesting question actually. This seems to be a customer
> > quote, which means that even if the customer was confused, it's not like
> > you can change it without doing a round-trip with said customer.
>
> I'm not a lawyer, so I'm not sure to what extent Apache rules about
> trademarks reflect actual laws, but I think that, generally, there are some
> prohibited advertising statements that companies cannot make even by quoting
> a customer who said the forbidden phrase.
>
> So, if I sell iceberg lettuce, which does not cure the common cold, I
> don't think I can say that directly in my ads or quote someone who said
> that in my ads.
>
> > What's the issue with this? I guess a more specific statement would be:
> >"Oushu' is a database and AI company founded by the team who
> > founded Apache HAWQ."
> > (sans annoying founded...founded). Is this the kind of clarification
> you're
> > looking for?
>
> I am subscribed to board@, but I can't access the archives. A similar
> situation came up about a TLP and a company associated with it. If you were
> subscribed, search your mail for the phrase "So perhaps it not just
> marketing that need some education" on or around March 13, 2018.
>
> Maybe trademarks@ would better be able to say.
>



On Fri, Jul 6, 2018 at 10:54 AM, Roman Shaposhnik 
wrote:

> On Thu, Jul 5, 2018 at 7:45 PM, Jim Apple  wrote:
> > I don't think the following fit with Apache guideline about TM use:
> https://www.apache.org/foundation/marks/
> >
> > http://www.oushu.io/
> >
> > "Oushu database, the Apache HAWQ enterprise version, helps us building
> the most advanced data platform"
>
> That is an interesting question actually. This seems to be a customer quote,
> which means that even if the customer was confused, it's not like you can
> change it without doing a round-trip with said customer.
>
> Something to definitely clarify, tho -- thanks for pointing it out!
>
> > http://oushu.io/about-us.html
> >
> > "'Oushu' is a database and AI company founded by the team who created
> Apache HAWQ."
>
> What's the issue with this? I guess a more specific statement would be:
>"Oushu' is a database and AI company founded by the team who
> founded Apache HAWQ."
> (sans annoying founded...founded). Is this the kind of clarification you're
> looking for?
>
> Thanks,
> Roman.
>


Re: Podling Report Reminder - July 2018

2018-07-03 Thread Lei Chang
Hi Guys,

The following is the podling report draft. Please feel free to give your
comments and suggestions.

Cheers
Lei



HAWQ

HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a
robust and high-performance massively-parallel processing (MPP) SQL engine
evolved from Greenplum Database.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

Nothing at this time.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

 Nothing urgent at this time.

How has the community developed since the last report?

1. Conference Talks :

* HAWQ for data scientists,  Shanghai CIO forum (Speaker: Lei Chang,  June
23, 2018)


2.  "Graduate Apache HAWQ (incubating) as a TLP" discussion was started:
https://lists.apache.org/thread.html/67a2d52ef29cbf9e93d8050ed0193c
c110a919962dd92f8436b343b7@%3Cdev.hawq.apache.org%3E

With the 2.3.0.0-incubating release officially out, the Apache HAWQ
community and its mentors believe it is time to consider graduation to the
TLP:
https://lists.apache.org/thread.html/b4a0b5671ce377b3d51c9b7ab00496a1eebfcbf1696ce8b67e078c64@%3Cdev.hawq.apache.org%3E

Apache HAWQ entered incubation in September 2015; since then, the HAWQ
community has learned a lot about how to do things the Apache way. We now
have a healthy and engaged community, ready to help with all questions from
HAWQ users. We have delivered four releases, including two binary releases,
and can now drive releases ourselves at a good cadence. The PPMC has
demonstrated a good understanding of growing the community by electing 12
individuals as committers and PPMC members. The PPMC addressed the maturity
issues one by one, following the Apache Project Maturity Model, and all
license and IP issues are now resolved. This demonstrates our understanding
of the ASF's IP policies.

All in all, I believe this project is qualified to be a true TLP and we should
recognize this fact by formally awarding it such a status. This thread is
meant to open up the very same discussion that we had among the mentors and
the HAWQ community to the rest of the IPMC. It is a DISCUSS thread, so feel
free to ask questions.

To get you all going, here are a few data points which may help:

Project status: http://incubator.apache.org/projects/hawq.html

Project website: http://hawq.incubator.apache.org/

Project documentation:
http://hawq.incubator.apache.org/docs/userguide/2.3.0.0-incubating/overview/HAWQOverview.html
http://hawq.apache.org/#download

Maturity assessment:
https://cwiki.apache.org/confluence/display/HAWQ/ASF+Maturity+Evaluation

DRAFT of the board resolution is at the bottom of this email

Proposed PMC size: 45 members

Total number of committers: 45 members

PMC affiliation (* indicates chair):
Pivotal (20)
* Oushu (7)
Amazon (3)
Hashdata (2)
Autonomic (1)
Confluent (1)
Datometry (1)
Hortonworks (1)
Microsoft (1)
PETUUM (1)
Privacera (1)
Qubole (1)
Snowflake (1)
State Street (1)
Unifi (1)
Visa (1)
ZEDEDA (1)

1549 commits on develop
1375 PRs on GitHub
63 contributors across all branches

1624 issues created
1350 issues resolved

dev list averaged ~53 msgs/month over last 12 months
user list averaged ~6 msgs/month over last 12 months
129 unique posters


committer affiliations:
  active
    pivotal.io
    oushu.io
    hashdata.cn
  occasional
    amazon.com
    autonomic.ai
    confluent.io
    datometry.com
    hortonworks.com
    microsoft.com
    petuum.com
    privacera.com
    qubole.com
    snowflake.net
    statestreet.com
    unifisoftware.com
    visa.com
    zededa.com



How has the project developed since the last report?


1. The HAWQ 2.4.0 plan was finalized. It includes the following features.


   - New Feature: Pluggable Vectorized Execution Engine on HAWQ.
   - New Feature: Support Runtime Filter for HAWQ local hash join.
   - Bug fixes.



Project page link:
https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.4.0.0-incubating+Release


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

2018-03-12, Apache HAWQ 2.3.0.0

When were the last committers or PPMC members elected?

1) Lav Jain: April 5, 2018
2) Shubham Sharma: April 5, 2018


Signed-off-by:

  [ ](hawq) Alan Gates Comments:
  [ ](hawq) Justin Erenkrantz Comments:
  [ ](hawq) Thejas Nair Comments:
  [ ](hawq) Roman Shaposhnik Comments:



On Thu, Jun 28, 2018 at 5:41 AM,  wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 18 July 2018, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the

Re: [DISCUSS] Graduate Apache HAWQ (incubating) as a TLP

2018-06-27 Thread Lei Chang
across all branches
> > >>
> > >> 1624 issues created
> > >> 1350 issues resolved
> > >>
> > >> dev list averaged ~53 msgs/month over last 12 months
> > >> user list averaged ~6 msgs/month over last 12 months
> > >> 129 unique posters
> > >>
> > >>
> > >> committer affiliations:
> > >>active
> > >>  pivotal.io
> > >>  oushu.io
> > >>  hashdata.cn
> > >>occasional
> > >>  amazon.com
> > >>  autonomic.ai
> > >>  confluent.io
> > >>  datometry.com
> > >>  hortonworks.com
> > >>  microsoft.com
> > >>  petuum.com
> > >>  privacera.com
> > >>  qubole.com
> > >>  snowflake.net
> > >>  statestreet.com
> > >>  unifisoftware.com
> > >>  visa.com
> > >>  zededa.com
> > >>
> > >>
> > >> Thanks,
> > >> Radar
> > >>
> > >>
> > >>
> > >> ## Resolution to create a TLP from graduating Incubator podling
> > >>
> > >>X. Establish the Apache HAWQ Project
> > >>
> > >>   WHEREAS, the Board of Directors deems it to be in the best
> > >>   interests of the Foundation and consistent with the
> > >>   Foundation's purpose to establish a Project Management
> > >>   Committee charged with the creation and maintenance of
> > >>   open-source software, for distribution at no charge to
> > >>   the public, related to Hadoop native SQL query engine that
> > >>   combines the key technological advantages of MPP database
> > >>   with the scalability and convenience of Hadoop.
> > >>
> > >>   NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> > >>   Committee (PMC), to be known as the "Apache HAWQ Project",
> > >>   be and hereby is established pursuant to Bylaws of the
> > >>   Foundation; and be it further
> > >>
> > >>   RESOLVED, that the Apache HAWQ Project be and hereby is
> > >>   responsible for the creation and maintenance of software
> > >>   related to Hadoop native SQL query engine that
> > >>   combines the key technological advantages of MPP database
> > >>   with the scalability and convenience of Hadoop;
> > >>   and be it further
> > >>
> > >>   RESOLVED, that the office of "Vice President, Apache HAWQ"
> be
> > >>   and hereby is created, the person holding such office to
> > >>   serve at the direction of the Board of Directors as the
> chair
> > >>   of the Apache HAWQ Project, and to have primary
> responsibility
> > >>   for management of the projects within the scope of
> > >>   responsibility of the Apache HAWQ Project; and be it further
> > >>
> > >>   RESOLVED, that the persons listed immediately below be and
> > >>   hereby are appointed to serve as the initial members of the
> > >>   Apache HAWQ Project:
> > >>
> > >>* Alan Gates   
> > >>* Alexander Denissov   
> > >>* Amy Bai  
> > >>* Atri Sharma  
> > >>* Bhuvnesh Chaudhary   
> > >>* Bosco
> > >>* Chunling Wang
> > >>* David Yozie  
> > >>* Ed Espino
> > >>* Entong Shen  
> > >>* Foyzur Rahman
> > >>* Goden Yao
> > >>* Gregory Chase
> > >>* Hong Wu  
> > >>* Hongxu Ma
> > >>* Hubert Zhang 
> > >>* Ivan Weng
> > >>* Jesse Zhang  
> > >>* Ji

Re: [DISCUSS] Graduate Apache HAWQ (incubating) as a TLP

2018-06-20 Thread Lei Chang
Thanks Radar for the effort!

Looking forward to seeing the progress on graduation.

Cheers
Lei




On Thu, Jun 21, 2018 at 3:04 AM, Lili Ma  wrote:

> Thanks Radar for your great effort :)
>
> +1
>
> 2018-06-20 12:00 GMT-04:00 Ed Espino :
>
> > +1 on our march to Apache TLP status. With Lei Chang as our initial VP
> and
> > our wide global and diverse community supporting each other, I look
> forward
> > to the new and exciting chapter for the project.
> >
> > Radar - Thank you for the excellent leadership through this process.
> >
> > Regards,
> > -=e
> >
> > On Tue, Jun 19, 2018 at 2:40 AM Radar Lei  wrote:
> >
> > > Hi All,
> > >
> > > With the 2.3.0.0-incubating release officially out, the Apache HAWQ
> > > community and its mentors believe it is time to consider graduation to
> > the
> > > TLP:
> > > https://lists.apache.org/thread.html/b4a0b5671ce377b3d51c9b7
> > > ab00496a1eebfcbf1696ce8b67e078c64@%3Cdev.hawq.apache.org%3E
> > >
> > > Apache HAWQ entered incubation in September of 2015, since then, the
> HAWQ
> > > community learned a lot about how to do things in Apache ways. Now we
> > have
> > > a healthy and engaged community, ready to help with all questions from
> > the
> > > HAWQ community. We delivered four releases including two binary
> releases,
> > > now we can do self-driving releases in good cadence. The PPMC has
> > > demonstrated a good understanding of growing the community by electing
> 12
> > > individuals as committers and PPMC members. The PPMC addressed the
> > maturity
> > > issues one by one followed by Apache Project Maturity Model, currently
> > all
> > > the License and IP issues are resolved. This demonstrated our
> > understanding
> > > of ASF's IP policies.
> > >
> > > All in all, I believe this project is qualified as a true TLP and we
> > should
> > > recognize this fact by formally awarding it such a status. This thread
> > > means to open up the very same discussion that we had among the mentors
> > and
> > > HAWQ community to the rest of the IPMC. It is a DISCUSS thread so feel
> > free
> > > to ask questions.
> > >
> > > To get you all going, here are a few data points which may help:
> > >
> > > Project status:
> > >  http://incubator.apache.org/projects/hawq.html
> > >
> > > Project website:
> > >   http://hawq.incubator.apache.org/
> > >
> > > Project documentation:
> > >http://hawq.incubator.apache.org/docs/userguide/2.3.0.0-inc
> > > ubating/overview/HAWQOverview.html
> > >http://hawq.apache.org/#download
> > >
> > > Maturity assessment:
> > > https://cwiki.apache.org/confluence/display/HAWQ/ASF+
> Maturity+Evaluation
> > >
> > > DRAFT of the board resolution is at the bottom of this email
> > >
> > > Proposed PMC size: 45 members
> > >
> > > Total number of committers: 45 members
> > >
> > > PMC affiliation (* indicated chair):
> > >Pivotal (20)
> > >  * Oushu (7)
> > >Amazon (3)
> > >Hashdata (2)
> > >Autonomic (1)
> > >Confluent (1)
> > >Datometry (1)
> > >Hortonworks (1)
> > >Microsoft (1)
> > >PETUUM (1)
> > >Privacera (1)
> > >Qubole (1)
> > >Snowflake (1)
> > >State Street (1)
> > >Unifi (1)
> > >Visa (1)
> > >ZEDEDA (1)
> > >
> > > 1549 commits on develop
> > > 1375 PR”s on GitHub
> > > 63 contributors across all branches
> > >
> > > 1624 issues created
> > > 1350 issues resolved
> > >
> > > dev list averaged ~53 msgs/month over last 12 months
> > > user list averaged ~6 msgs/month over last 12 months
> > > 129 unique posters
> > >
> > >
> > > committer affiliations:
> > > active
> > >   pivotal.io
> > >   oushu.io
> > >   hashdata.cn
> > > occasional
> > >   amazon.com
> > >   autonomic.ai
> > >   confluent.io
> > >   datometry.com
> > >   hortonworks.com
> > >   microsoft.com
> > >   petuum.com
> > >   privacera.com
> > >   qubole.com
> > >   snowflake.net
> > >   

Re: Re: [DISCUSS] Apache HAWQ Graduation from Incubator

2018-06-04 Thread Lei Chang
Thank you guys for the nomination :-).

Great to work with the team on the HAWQ graduation and any future issues.

Cheers
Lei



On Fri, Jun 1, 2018 at 9:50 AM, 陶征霖  wrote:

> +1 to Lei. He is worthy of the title of Apache HAWQ PMC Chairman.
>
> 2018-06-01 9:44 GMT+08:00 Yi JIN :
>
> > +1 to Lei. I fully support Lei Chang as Apache HAWQ project PMC Chairman;
> > he deserves this role, not only for his outstanding contributions to this
> > project over a very long period, but also for his solid leadership and
> > vision.
> >
> > Best,
> > Yi  Jin
> >
> > On Fri, Jun 1, 2018 at 3:17 AM, Ed Espino  wrote:
> >
> > > Ruilong,
> > >
> > > I also give my full support for Lei Chang as the Apache HAWQ project's
> > > initial PMC Chairman. His leadership and vision have contributed
> > immensely
> > > to the project.
> > >
> > > Regards,
> > > -=e
> > >
> > > On Thu, May 31, 2018 at 6:32 AM, Ruilong Huo  wrote:
> > >
> > > > Great progress towards Apache HAWQ graduation! Thanks Radar for
> pushing
> > > > this forward!
> > > >
> > > > I would like to nominate Lei Chang as the PMC Chair. He initiated the
> > > > HAWQ project several years ago, led the development, brought it to
> > > > Apache incubation, and has always been active in the HAWQ community,
> > > > making it a world-leading big data product as well as a successful
> > > > Apache project. There is no doubt that he is perfect for the role, and
> > > > I believe he will continue to share his insights and go even further
> > > > with HAWQ after graduation.
> > > >
> > > >
> > > > Best regards,
> > > > Ruilong Huo
> > > >
> > > >
> > > > At 2018-05-31 15:40:42, "Radar Lei"  wrote:
> > > > >Just found some good material for nominating a chair in Roman's
> > > > >email; thanks Roman.
> > > > >
> > > > >I think we can follow this to nominate a Chair in this thread too.
> > > > >Guys, please help to nominate or self-nominate. Thanks a lot.
> > > > >
> > > > >See:
> > > > >At the very minimum your resolution will contain: 1. A name of the
> > > project
> > > > >2. A list of proposed PMC 3. A proposed PMC chair
> > > > >
> > > > >On #3 I typically recommend podlings I mentor to set up a rotating
> > > > >chair policy. This is, in no way, an ASF requirement, so feel free to
> > > > >ignore it, but it has worked well before. The chair will be expected
> > > > >to come up for rotation every year. It will be more than OK for the
> > > > >same person to self-nominate once the year is up -- but at the same
> > > > >time it will be up to the same person to actually kick off a thread
> > > > >asking if anybody else is interested in serving as chair for the next
> > > > >year. Of course, if there are multiple candidates there will have to
> > > > >be a vote.
> > > > >
> > > > >
> > > > >Regards,
> > > > >Radar
> > > > >
> > > > >On Tue, May 29, 2018 at 10:05 PM, Radar Lei 
> wrote:
> > > > >
> > > > >> Thanks Roman.
> > > > >>
> > > > >> This makes sense; I will start to draft the resolution. BTW, we
> > > > >> would need to nominate a chair -- I guess it's the last piece we
> > > > >> need before drafting the resolution.
> > > > >>
> > > > >> Regards,
> > > > >> Radar
> > > > >>
> > > > >> On Tue, May 29, 2018 at 11:24 AM, Roman Shaposhnik <
> > > > ro...@shaposhnik.org>
> > > > >> wrote:
> > > > >>
> > > > >>> On Sun, May 27, 2018 at 11:37 PM, Radar Lei 
> > wrote:
> > > > >>> > Hi Roman,
> > > > >>> >
> > > > >>> > We have confirmed with each HAWQ committer whether they want to
> > > > >>> > remain with the HAWQ project. In summary, 37 PPMC members
> > > > >>> > (including two mentors) and 7 committers confirmed they want to
> > > > >>> > remain with HAWQ. [1] The total committer number of 44 seems
> > > > >>> > pretty close to the PPMC member number of 37; is it good enough
> > > > >>> > to make PMC == committers in our graduation resolution?
> > > > >>>
> > > > >>> PMC == committers in this case makes perfect sense to me!
> > > > >>>
> > > > >>> > Should we update Whimsy and the project webpage now, or update
> > > > >>> > them after graduation?
> > > > >>>
> > > > >>> It really doesn't matter much. Your next step is to draft a
> > > resolution
> > > > >>> similar to:
> > > > >>> https://www.mail-archive.com/general@incubator.apache.org/ms
> > > > >>> g56982.html
> > > > >>>
> > > > >>> and start a [DISCUSS] thread similar to the above.
> > > > >>>
> > > > >>> Makes sense?
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Roman.
> > > > >>>
> > > > >>
> > > > >>
> > > >
> > >
> >
>


Re: Apache JIRA administration responsibilities for HAWQ project

2018-06-01 Thread Lei Chang
+1 for the suggestion.

Cheers
Lei




On Fri, Jun 1, 2018 at 7:39 AM, Ed Espino  wrote:

> As part of the graduation process, members of the HAWQ project will be
> assuming all JIRA administration tasks for the HAWQ project. After
> discussing with one of the project's mentors (Roman), we can start this
> process now. Amongst other administration tasks, this includes assigning
> the available roles (administrator, committer, and contributor) to a user.
> This is important so an issue can be assigned to a task. We currently have
> four administrators. I recommend that members of the dev community wanting
> higher access than a normal user send an email to the dev list, and one of
> the existing four administrators will fulfill the request. We can consider
> more administrators as time passes.
>
> I have updated the project's wiki with the corresponding instructions:
> https://cwiki.apache.org/confluence/display/HAWQ/Contributing+to+HAWQ#
> ContributingtoHAWQ-HAWQissuetracking-Jira
>
> Does this seem reasonable?
>
> Thanks,
> -=e
>


Re: how hawq auto active standby

2018-06-01 Thread Lei Chang
A JIRA has been created to track this feature:
https://issues.apache.org/jira/browse/HAWQ-1623

Cheers
Lei





On Sat, Jun 2, 2018 at 7:36 AM, Lei Chang  wrote:

>
> Let's start a JIRA about this. This has been implemented in the Oushu HAWQ
> version; we will contribute it to Apache.
>
> Cheers
> Lei
>
>
>
>
> On Fri, Jun 1, 2018 at 4:40 PM,  wrote:
>
>> Could you tell me more details about what I need to be careful of?
>>
>> By the way, is there any plan for this?
>>
>> Thanks
>>
>> --
>> *From: *"Radar Lei" 
>> *To: *"user" , "dev" <
>> dev@hawq.incubator.apache.org>
>> *Sent: *Wednesday, May 30, 2018 8:34:49 PM
>> *Subject: *Re: how hawq auto active standby
>>
>> There are no command line tools to activate the standby automatically yet.
>> Maybe you can write a script to do it yourself; you just need to be
>> careful.
>>
>>
>> Regards,
>> Radar
>>
>> On Wed, May 30, 2018 at 8:22 PM,  wrote:
>>
>>> Hi!
>>>
>>> I have added a standby node for HAWQ, and it can be activated as master
>>> by hand if the master fails.
>>>
>>> But I cannot find details about how to configure HAWQ to automatically
>>> activate the standby as master.
>>>
>>> Googling, I only found that with Ambari, HAWQ can automatically activate
>>> the standby as master.
>>>
>>> Is there any doc about how to do this from the command line?
>>>
>>> Thanks
>>>
>>
>>
>
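As a reference for the script Radar suggests in the thread above, a rough
watchdog sketch is shown below. It assumes the standard "hawq activate
standby" management command and a master reachable on the default port 5432;
the hostnames, thresholds, and health-check logic are illustrative
assumptions, not an existing Apache HAWQ feature, and a real deployment would
also need fencing so the standby is never promoted while the old master is
still alive.

# Rough failover watchdog sketch (run on the standby host): if the HAWQ
# master stops answering for several consecutive checks, promote the standby.
import socket
import subprocess
import time

MASTER_HOST = "hawq-master"   # assumed master hostname
MASTER_PORT = 5432            # default HAWQ master port (assumption)
CHECK_INTERVAL = 10           # seconds between health checks
MAX_FAILURES = 3              # consecutive failures before failover

def master_is_up(host, port, timeout=5):
    """Return True if a TCP connection to the master port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def activate_standby():
    """Promote the local standby master via the hawq management tool."""
    # Depending on the HAWQ version, a non-interactive flag may be needed.
    subprocess.check_call(["hawq", "activate", "standby"])

def main():
    failures = 0
    while True:
        if master_is_up(MASTER_HOST, MASTER_PORT):
            failures = 0
        else:
            failures += 1
            if failures >= MAX_FAILURES:
                activate_standby()
                break
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    main()

HAWQ-1623 (linked above) tracks doing this properly inside HAWQ itself rather
than with an external script.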


[jira] [Created] (HAWQ-1623) Automatic master node HA

2018-06-01 Thread Lei Chang (JIRA)
Lei Chang created HAWQ-1623:
---

 Summary: Automatic master node HA
 Key: HAWQ-1623
 URL: https://issues.apache.org/jira/browse/HAWQ-1623
 Project: Apache HAWQ
  Issue Type: New Feature
Reporter: Lei Chang
Assignee: Radar Lei


 

In current HAWQ, when the master node dies, a manual switch from master to
standby is needed. This is not convenient for end users. Let's add an
automatic failover mechanism.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: how hawq auto active standby

2018-06-01 Thread Lei Chang
Let's start a JIRA about this. This has been implemented in the Oushu HAWQ
version; we will contribute it to Apache.

Cheers
Lei




On Fri, Jun 1, 2018 at 4:40 PM,  wrote:

> Could you tell me more details about what I need to be careful of?
>
> By the way, is there any plan for this?
>
> Thanks
>
> --
> *From: *"Radar Lei" 
> *To: *"user" , "dev" <
> dev@hawq.incubator.apache.org>
> *Sent: *Wednesday, May 30, 2018 8:34:49 PM
> *Subject: *Re: how hawq auto active standby
>
> There are no command line tools to activate the standby automatically yet.
> Maybe you can write a script to do it yourself; you just need to be careful.
>
>
> Regards,
> Radar
>
> On Wed, May 30, 2018 at 8:22 PM,  wrote:
>
>> Hi!
>>
>> I have added a standby node for HAWQ, and it can be activated as master by
>> hand if the master fails.
>>
>> But I cannot find details about how to configure HAWQ to automatically
>> activate the standby as master.
>>
>> Googling, I only found that with Ambari, HAWQ can automatically activate
>> the standby as master.
>>
>> Is there any doc about how to do this from the command line?
>>
>> Thanks
>>
>
>


Re: Questions for HAWQ dev community: Pluggable storage formats and files systems vs. PXF profiles

2018-06-01 Thread Lei Chang
Hi Ed,

Thanks for bringing this up. The framework part of the pluggable storage
format has been committed, but the plugins still need some development work
and will take some time.

Thanks
Lei








On Wed, May 30, 2018 at 11:35 PM, Ed Espino  wrote:

> As HAWQ has matured, the introduction of the pluggable storage formats is
> giving users two methods to access external data. Is the project at a point
> where we can advise the community that we will recommend the use of the
> pluggable storage format instead of PXF? I believe this was discussed in a
> thread earlier in the year, but I haven't seen any further discussion or
> plan of action for it.
>
> I suggest we recommend HAWQ community users start the migration and upon
> graduation the project should indicate we will stop supporting PXF in favor
> of the Pluggable storage formats. Thoughts? Depending on feedback, I would
> like to put this to a vote for the dev community's consideration.
>
> email ref:
> https://lists.apache.org/thread.html/3a1e5c358bc0b0ecf04d49e38979d6bf0ae21e3a56188d34986fc8a5@%3Cdev.hawq.apache.org%3E
>
> For reference, PXF is also in use and under active development in another
> open source project: Greenplum Database
> ( https://github.com/greenplum-db/gpdb/tree/5X_STABLE/gpAux/extensions/pxf )
> and depends on the PXF server-side components that exist in HAWQ's source
> tree.
>
> Thanks,
> -=e
>
>


Re: Remain with HAWQ project or not?

2018-05-09 Thread Lei Chang
yes

On Wed, May 9, 2018 at 1:01 PM, Yi JIN  wrote:

> Yes, I would like to remain a committer/PMC member of Apache HAWQ project.
>
> Best,
> Yi (yjin)
>
> On Wed, May 9, 2018 at 12:41 AM, Ed Espino  wrote:
>
> > Yes I would like to remain a committer/PMC member of Apache HAWQ project.
> >
> > Regards,
> > -=e
> >
> > On Mon, May 7, 2018 at 1:11 AM, Radar Lei  wrote:
> >
> > > HAWQ committers,
> > >
> > > Per the discussion in "Apache HAWQ graduation from incubator?" [1], we
> > want
> > > to setup the PMC as part of HAWQ graduation resolution.
> > >
> > > So we'd like to confirm whether you want to remain as a committer/PMC
> > > member of Apache HAWQ project?
> > >
> > > If you'd like to remain with the HAWQ project, you are welcome; please
> > > *respond 'Yes'* in this thread, or *respond 'No'* if you are no longer
> > > interested. Thanks.
> > >
> > > This thread will be available for at least 72 hours, after that, we
> will
> > > send individual confirm emails.
> > >
> > > [1]
> > > https://lists.apache.org/thread.html/b4a0b5671ce377b3d51c9b7ab00496
> > > a1eebfcbf1696ce8b67e078c64@%3Cdev.hawq.apache.org%3E
> > >
> > > Regards,
> > > Radar
> > >
> >
>


Re: Podling Report Reminder - April 2018

2018-04-07 Thread Lei Chang
Thank you Radar. Please see the comments below.


On Thu, Apr 5, 2018 at 12:29 PM, Radar Lei <r...@pivotal.io> wrote:

> Thanks Lei, nice report.
>
> Please check below comments:
>
> Three most important issues to address in the move towards
> > graduation:
>
> Should be one?
>

This is the template. Currently we have only one issue left for the top
three.


>
> > Three committer candidates passed the voting process:
>
> Should be two.
>

Corrected.


>
> When were the last committers or PPMC members elected?
> >1) Amy BAI:  Nov 1, 2017
> >2) ChunLing WANG: Nov 1, 2017
> >3) Hongxu MA: Nov 4, 2017
>
>
Maybe we should update with the latest committer votes.
>


Updated.




>
> Regards,
> Radar
>
> On Wed, Apr 4, 2018 at 10:11 PM, Lei Chang <chang.lei...@gmail.com> wrote:
>
> > Hi Guys,
> >
> > Here is the podling report. Please feel free to add your comments.
> >
> > Cheers
> > Lei
> >
> >
> > 
> > 
> > HAWQ
> >
> > HAWQ is an advanced enterprise SQL on Hadoop analytic engine built
> around a
> > robust and high-performance massively-parallel processing (MPP) SQL
> engine
> > evolved from Greenplum Database.
> >
> > HAWQ has been incubating since 2015-09-04.
> >
> > Three most important issues to address in the move towards
> > graduation:
> >
> >   1. Continue to improve the project's release cadence. To this
> >  end we plan on expanding automation services to support
> >  increased developer participation. (HAWQ-127)
> >
> >
> > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need
> > to be aware of?
> >
> >  Nothing urgent at this time.
> >
> > How has the community developed since the last report?
> >
> > 1. Conference Talks :
> >
> >  * HAWQ on Microsoft Azure Cloud. Microsoft Incubator Talk (Speaker: Lei
> > Chang, Mar 21, 2018)
> >
> >
> > 2. Three committer candidates passed the voting process:
> >
> >1) Shubham SHARMA
> >2) Lav JAIN
> >
> >
> > How has the project developed since the last report?
> >
> >
> > 1. HAWQ 2.3 released. It includes the following features.
> >
> >   1) New Feature: HAWQ Core supports pluggable external storage framework.
> >   2) New Feature: HAWQ Ranger supports RPS HA.
> >   3) New Feature: HAWQ Ranger supports Kerberos authentication.
> >   4) New Feature: HAWQ Core supports HDFS TDE (Transparent Data
> Encryption)
> > through libHdfs3.
> >   5) Licenses: Fix PXF license files located in PXF jar files.
> >   6) Licenses: Check Apache HAWQ mandatory libraries to match LC20, LC30
> > license criteria.
> >   7) Build: Release build project
> >   8) Bug fixes.
> >
> > Project page link:
> > https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-
> > incubating+Release
> >
> > 2. HAWQ 2.4 release plan was proposed.
> >
> >   1) New Feature: Pluggable Vectorized Execution Engine on HAWQ.
> >   2) New Feature: Support Runtime Filter for HAWQ local hash join.
> >   3) New Feature: Support accessing Hive table data by the new Pluggable
> > Storage Framework.
> >   4) Bug fixes.
> >
> > Project page link:
> > https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.4.0.0-
> > incubating+Release
> >
> >
> > How would you assess the podling's maturity?
> > Please feel free to add your own commentary.
> >
> >   [ ] Initial setup
> >   [ ] Working towards first release
> >   [ ] Community building
> >   [X] Nearing graduation
> >   [ ] Other:
> >
> > Date of last release:
> >
> > 2018-03-12, Apache HAWQ 2.3.0.0
> >
> > When were the last committers or PPMC members elected?
> >
> >1) Amy BAI:  Nov 1, 2017
> >2) ChunLing WANG: Nov 1, 2017
> >3) Hongxu MA: Nov 4, 2017
> >
> >
> > Signed-off-by:
> >
> >   [](hawq) Alan Gates
> >  Comments:
> >   [ ](hawq) Konstantin Boudnik
> >  Comments:
> >   [ ](hawq) Justin Erenkrantz
> >  Comments:
> >   [ ](hawq) Thejas Nair
> >  Comments:
> >   [](hawq) Roman Shaposhnik
> >  Comments:
> >
> > On Wed, Apr 4, 2018 at 8:55 AM, <johndam...@apache.org> wrote:
> >
> > > Dear podling,
> > >
> > > This email was sent by an automat

Re: [ANNOUNCE] Apache HAWQ 2.3.0.0-incubating Release

2018-03-22 Thread Lei Chang
congrats!

Cheers
Lei



On Wed, Mar 21, 2018 at 1:58 PM, Yi JIN  wrote:

> Thanks John, I fixed it and resent the announcement.
>
> Best,
> Yi (yjin)
>
> On Wed, Mar 21, 2018 at 4:14 PM, Amy Bai  wrote:
>
> > Congratulations! Thanks Yi for this release!
> >
> > On Wed, Mar 21, 2018 at 11:15 AM, John D. Ament 
> > wrote:
> >
> > > Dropping announce@.
> > >
> > > On Tue, Mar 20, 2018 at 10:28 PM Yi JIN  wrote:
> > >
> > > > Apache HAWQ (incubating) Project Team is proud to announce Apache
> > > > HAWQ 2.3.0.0-incubating has been released.
> > > >
> > > > Apache HAWQ (incubating) combines exceptional MPP-based analytics
> > > > performance, robust ANSI SQL compliance, Hadoop ecosystem
> > > > integration and manageability, and flexible data-store format
> > > > support, all natively in Hadoop, no connectors required. Built
> > > > from a decade’s worth of massively parallel processing (MPP)
> > > > expertise developed through the creation of the Pivotal
> > > > Greenplum® enterprise database and open source PostgreSQL, HAWQ
> > > > enables you to swiftly and interactively query Hadoop data,
> > > > natively via HDFS.
> > > >
> > > > *Download Link*:
> > > >
> > > > https://dist.apache.org/repos/dist/release/incubator/hawq/2.
> > > 3.0.0-incubating/
> > > >
> > > >
> > > This has been a hot topic as of late.  This link is incorrect.  Please
> > > review http://www.apache.org/dev/release-publishing#distribution_dist
> > and
> > > related pages and fix the link.  Please resend this email once you've
> > > corrected it.
> > >
> > >
> > > > *About this release*
> > > > This is a release having both source code and binary
> > > >
> > > > All changes:
> > > >
> > > > https://cwiki.apache.org/confluence/display/HAWQ/
> Apache+HAWQ+2.3.0.0-
> > > incubating+Release
> > > >
> > > >
> > > > *HAWQ Resources:*
> > > >
> > > >- JIRA: https://issues.apache.org/jira/browse/HAWQ
> > > >- Wiki:
> > > > https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+Home
> > > >- Mailing list(s): dev@hawq.incubator.apache.org
> > > >   u...@hawq.incubator.apache.org
> > > >
> > > > *Know more about HAWQ:*
> > > > http://hawq.apache.org
> > > >
> > > > - Apache HAWQ (incubating) Team
> > > >
> > > > =
> > > > *Disclaimer*
> > > >
> > > > Apache HAWQ (incubating) is an effort undergoing incubation at The
> > > > Apache Software Foundation (ASF), sponsored by the name of Apache
> > > > Incubator PMC. Incubation is required of all newly accepted
> > > > projects until a further review indicates that the
> > > > infrastructure, communications, and decision making process have
> > > > stabilized in a manner consistent with other successful ASF
> > > > projects. While incubation status is not necessarily a reflection
> > > > of the completeness or stability of the code, it does indicate
> > > > that the project has yet to be fully endorsed by the ASF.
> > > >
> > >
> >
>


Re: [VOTE]: Apache HAWQ 2.3.0.0-incubating Release (RC2)

2018-03-04 Thread Lei Chang
+1. Installed and tested it.

Cheers
Lei




On Sun, Mar 4, 2018 at 9:54 AM, Chunling Wang 
wrote:

> +1
> Download source code, build and install, init, run simple queries.
>
> > On Mar 3, 2018, at 17:21, 陶征霖 wrote:
> >
> > Reviewed the version file, it's good now. +1
> >
> > 2018-03-03 14:58 GMT+08:00 Wen Lin :
> >
> >> Build from source, installed and run feature tests.
> >> +1
> >>
> >> On Sat, Mar 3, 2018 at 11:02 AM, Yi JIN  wrote:
> >>
> >>> Guys, this is a reminder, please vote asap. Thanks
> >>>
> >>> Best,
> >>> Yi (yjin)
> >>>
> >>> On Fri, Mar 2, 2018 at 1:07 PM, Hubert Zhang 
> wrote:
> >>>
>  +1
>  Build and Installed. Tests passed.
> 
>  On Thu, Mar 1, 2018 at 2:01 PM, Ruilong Huo  wrote:
> 
> > +1 for the HAWQ 2.3.0.0-incubating RC2
> >
> >
> > Here are the checks have been done:
> >
> >
> > 1. Reviewed LICENSE, NOTICE, DISCLAIMER, and pom.xml.
> >
> >
> > 2. Passed RAT configuration check successfully.
> >
> >
> > 3. Passed source and rpm package signature, md5 and sha256 checksum.
> >
> >
> > 4. Compiled from source tarball and installed RC2 with feature check
> > successful.
> >
> >
> > 5. Downloaded rpm tarball, installed hawq on CentOS 7.2 VM following
> > https://cwiki.apache.org/confluence/display/HAWQ/Build+Package+and+Install+with+RPM.
> > The initialization and basic query passed.
> >
> >
> > Best regards,
> > Ruilong Huo
> > At 2018-03-01 13:50:58, "Bai Jie"  wrote:
> >> Build from branch 2.3.0.0 source code, installed , init , run
> >> feature
>  test
> >> and simple queries. Looks good to me. +1
> >>
> >> On Wed, Feb 28, 2018 at 11:24 AM, Hongxu Ma 
>  wrote:
> >>
> >>> +1
> >>>
> >>> Both source and rpm package are verified in my environment: Red
> >> Hat
> >>> Enterprise Linux Server release 7.2
> >>> Include installation and execute some simple queries.
> >>>
> >>> On 27/02/2018 12:19, Yi JIN wrote:
>  Hi All,
> 
>  This is the vote for Apache HAWQ (incubating) 2.3.0.0-incubating
> > Release
>  Candidate 2 (RC2). It is a source release for HAWQ core, PXF,
> >> and
> > Ranger;
>  and binary release for HAWQ core,  PXF and Ranger. We have rpm
>  package
>  involved for the binary release.
> 
>  The vote will run for at least 72 hours and will close on
> >>> Saturday,
> > March
>  3rd, 2017. Thanks.
> 
>  1. Wiki page of the release:
>  https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release
>
>
>  2. Release Notes (Apache Jira generated):
>  https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12340262&styleName=Html&projectId=12318826
>
>
>  3. Release verification steps can be found at:
>  For source tarball:
>  https://cwiki.apache.org/confluence/display/HAWQ/Release+Process%3A+Step+by+step+guide#ReleaseProcess:Stepbystepguide-ValidatetheReleaseCandidate
>  For rpm package:
>  https://cwiki.apache.org/confluence/display/HAWQ/Build+Package+and+Install+with+RPM
>
>
>  4. Git release branch:
>  https://git-wip-us.apache.org/repos/asf?p=incubator-hawq.git;a=shortlog;h=refs/heads/2.3.0.0-incubating
>
>  5. Source and binary release tarballs with signatures:
>  https://dist.apache.org/repos/dist/dev/incubator/hawq/2.3.0.0-incubating.RC2/
>
>
>  6. Keys to verify the signature of the release artifact are available at:
>  https://dist.apache.org/repos/dist/dev/incubator/hawq/KEYS
> 
> 
>  7. The artifact(s) has been signed with Key ID: CE60F90D1333092A
> 
>  8. Fixed issues in RC2.
>  https://issues.apache.org/jira/browse/HAWQ-1589
>  https://issues.apache.org/jira/browse/HAWQ-1590
> 
>  REMINDER: Please provide details of what you have tried and
> >>> verified
> >>> before
>  your vote conclusion. Thanks!
> 
> 
>  Please vote accordingly:
>  [ ] +1 approve
>  [ ] +0 no opinion
>  [ ] -1 disapprove (and reason why)
> 
> 
>  Best regards,
>  Yi (yjin)
> 
> >>>
> >>> --
> >>> Regards,
> >>> Hongxu.
> >>>
> >>>
> >
> 
> 
> 
>  --
>  Thanks
> 
>  Hubert Zhang
> 
> >>>
> >>
>
>
>
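The release-candidate checks described in the vote thread above (RAT, license
files, signature, md5/sha256 checksums, build and install) are largely
mechanical. A minimal sketch of the checksum and signature portion is shown
below; the file names are illustrative assumptions, and it presumes the KEYS
file has already been imported with "gpg --import KEYS".

# Sketch of the checksum/signature part of RC verification. File names are
# assumptions; adjust to the actual artifacts downloaded from dist.apache.org.
import hashlib
import subprocess

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_release(tarball, sha256_file, signature_file):
    # 1. Checksum: compare the computed digest with the published value.
    with open(sha256_file) as f:
        expected = f.read().split()[0].strip().lower()
    actual = sha256_of(tarball)
    if actual != expected:
        raise SystemExit("SHA-256 mismatch: %s != %s" % (actual, expected))

    # 2. Signature: let gpg verify the detached .asc against the tarball.
    subprocess.check_call(["gpg", "--verify", signature_file, tarball])
    print("checksum and signature OK for %s" % tarball)

if __name__ == "__main__":
    verify_release(
        "apache-hawq-src-2.3.0.0-incubating.tar.gz",         # assumed name
        "apache-hawq-src-2.3.0.0-incubating.tar.gz.sha256",  # assumed name
        "apache-hawq-src-2.3.0.0-incubating.tar.gz.asc",     # assumed name
    )

Building from source and running the feature tests, as the voters describe,
would follow these checks.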


Re: Podling Report Reminder - January 2018

2018-01-02 Thread Lei Chang
Thanks Roman, please see the changes here
( https://wiki.apache.org/incubator/January2018 ) and comments inline:

On Wed, Jan 3, 2018 at 12:43 AM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> Looks good, but I'd like a bit more details around:
>   "Three most important issues to address in the move towards graduation:"
>

I added the JIRA number and another license related item to the list.

Three most important issues to address in the move towards graduation:

  1. Continue to improve the project's release cadence. To this
     end we plan on expanding automation services to support
     increased developer participation. (HAWQ-127)

  2. Licenses: Check Apache HAWQ mandatory libraries to match LC20, LC30
     license criteria. (HAWQ-1512)


>
> Also, I think it would be useful to highlight the PXF discussion that
> happened
> in the:
>"Questions for HAWQ dev community: Pluggable storage formats and
> files systems vs. PXF"
> thread recently.
>

Highlighted the discussion.


>
> Thanks,
> Roman.
>
>
>


>
> On Tue, Jan 2, 2018 at 6:31 AM, Lei Chang <chang.lei...@gmail.com> wrote:
> > https://wiki.apache.org/incubator/January2018
> >
> > On Tue, Jan 2, 2018 at 10:31 PM, Lei Chang <chang.lei...@gmail.com>
> wrote:
> >
> >>
> >> sorry, please use this link.
> >>
> >> Thanks
> >> Lei
> >>
> >>
> >> On Tue, Jan 2, 2018 at 3:28 PM, Hongxu Ma <inte...@outlook.com> wrote:
> >>
> >>> Hi Lei
> >>> The format is broken...
> >>>
> >>> 在 02/01/2018 14:38, Lei Chang 写道:
> >>>
> >>> Hi Guys,
> >>>
> >>> The following is the Podling Report. Please review and give your
> comments.
> >>>
> >>> Cheers
> >>> Lei
> >>>
> >>>
> >>> 
> >>>
> >>>
> >>> HAWQHAWQ is an advanced enterprise SQL on Hadoop analytic engine built
> >>> around arobust and high-performance massively-parallel processing
> >>> (MPP) SQL frameworkevolved from Greenplum Database.HAWQ has been
> >>> incubating since 2015-09-04.Three most important issues to address in
> >>> the move towardsgraduation:  1. Continue to improve the project's
> >>> release cadence. To this end we plan on expanding automation
> >>> services to support increased developer participation.Any issues
> >>> that the Incubator PMC (IPMC) or ASF Board wish/needto be aware of?
> >>>  Nothing urgent at this time.How has the community developed since the
> >>> last report?1. Conference Talks (2): * The nature of cloud database.
> >>> The 7th Data Technology Carnival (Speaker: Lei Chang, Nov 17, 2017)  *
> >>> New Data Warehouse: Apache HAWQ. 2017 Global Internet Technology
> >>> Conference (Speaker: Lei Chang, Nov 24, 2017) 2. Active contributions
> >>> from approximately 20 different community   contributors since the
> >>> last report.3. Three committer candidates passed the voting process:
> >>> 1) Amy BAI   2) ChunLing WANG3) Hongxu MA   How has the project
> >>> developed since the last report?1. The scope of 2.3 release is
> >>> finalized, and is under development  1) New Feature: HAWQ Ranger
> >>> supports RPS HA.  (Done)  2) New Feature: HAWQ Ranger supports
> >>> Kerberos authentication. (Done)  3) New Feature: HAWQ Core supports
> >>> plugable external storage framework. (Almost Done HAWQ-786)  4) New
> >>> Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
> >>> through libHdfs3. (Done, HAWQ-1193)  5) Licenses: Fix PXF license
> >>> files located in PXF jar files. (Done HAWQ-1496)  6) Licenses: Check
> >>> Apache HAWQ mandatory libraries to match LC20, LC30 license criteria.
> >>> (Not started HAWQ-1512)  7) Build: Release build project (On going
> >>> HAWQ-127)  8) Bug fixes. (On going)Project page link:
> https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-
> incubating+Release
> >>> How would you assess the podling's maturity?Please feel free to
> >>> add your own commentary.  [ ] Initial setup  [ ] Working towards first
> >>> release  [ ] Community building  [X] Nearing graduation  [ ]
> >>> Other:Date of last release:  2017-07-12, Apache HAWQ 2.2.0.0  When
> >>> were the last committers or PPMC members elected?   1) Amy BAI:  Nov
> >>> 1, 2017   2) ChunLing WANG: Nov 1, 2017   3) Hongxu MA: Nov 

Re: Podling Report Reminder - January 2018

2018-01-02 Thread Lei Chang
https://wiki.apache.org/incubator/January2018

On Tue, Jan 2, 2018 at 10:31 PM, Lei Chang <chang.lei...@gmail.com> wrote:

>
> sorry, please use this link.
>
> Thanks
> Lei
>
>
> On Tue, Jan 2, 2018 at 3:28 PM, Hongxu Ma <inte...@outlook.com> wrote:
>
>> Hi Lei
>> The format is broken...
>>
>> 在 02/01/2018 14:38, Lei Chang 写道:
>>
>> Hi Guys,
>>
>> The following is the Podling Report. Please review and give your comments.
>>
>> Cheers
>> Lei
>>
>>
>> 
>>
>>
>> HAWQHAWQ is an advanced enterprise SQL on Hadoop analytic engine built
>> around arobust and high-performance massively-parallel processing
>> (MPP) SQL frameworkevolved from Greenplum Database.HAWQ has been
>> incubating since 2015-09-04.Three most important issues to address in
>> the move towardsgraduation:  1. Continue to improve the project's
>> release cadence. To this end we plan on expanding automation
>> services to support increased developer participation.Any issues
>> that the Incubator PMC (IPMC) or ASF Board wish/needto be aware of?
>>  Nothing urgent at this time.How has the community developed since the
>> last report?1. Conference Talks (2): * The nature of cloud database.
>> The 7th Data Technology Carnival (Speaker: Lei Chang, Nov 17, 2017)  *
>> New Data Warehouse: Apache HAWQ. 2017 Global Internet Technology
>> Conference (Speaker: Lei Chang, Nov 24, 2017) 2. Active contributions
>> from approximately 20 different community   contributors since the
>> last report.3. Three committer candidates passed the voting process:
>> 1) Amy BAI   2) ChunLing WANG3) Hongxu MA   How has the project
>> developed since the last report?1. The scope of 2.3 release is
>> finalized, and is under development  1) New Feature: HAWQ Ranger
>> supports RPS HA.  (Done)  2) New Feature: HAWQ Ranger supports
>> Kerberos authentication. (Done)  3) New Feature: HAWQ Core supports
>> plugable external storage framework. (Almost Done HAWQ-786)  4) New
>> Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
>> through libHdfs3. (Done, HAWQ-1193)  5) Licenses: Fix PXF license
>> files located in PXF jar files. (Done HAWQ-1496)  6) Licenses: Check
>> Apache HAWQ mandatory libraries to match LC20, LC30 license criteria.
>> (Not started HAWQ-1512)  7) Build: Release build project (On going
>> HAWQ-127)  8) Bug fixes. (On going)Project page 
>> link:https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release
>> How would you assess the podling's maturity?Please feel free to
>> add your own commentary.  [ ] Initial setup  [ ] Working towards first
>> release  [ ] Community building  [X] Nearing graduation  [ ]
>> Other:Date of last release:  2017-07-12, Apache HAWQ 2.2.0.0  When
>> were the last committers or PPMC members elected?   1) Amy BAI:  Nov
>> 1, 2017   2) ChunLing WANG: Nov 1, 2017   3) Hongxu MA: Nov 4,
>> 2017Signed-off-by:  [ ](hawq) Alan Gates Comments:  [ ](hawq)
>> Konstantin Boudnik Comments:  [ ](hawq) Justin Erenkrantz
>> Comments:  [ ](hawq) Thejas Nair Comments:  [ ](hawq) Roman
>> Shaposhnik Comments: IPMC/Shepherd notes:
>>
>>
>> On Sat, Dec 30, 2017 at 10:30 PM, <johndam...@apache.org> 
>> <johndam...@apache.org> wrote:
>>
>>
>> Dear podling,
>>
>> This email was sent by an automated system on behalf of the Apache
>> Incubator PMC. It is an initial reminder to give you plenty of time to
>> prepare your quarterly board report.
>>
>> The board meeting is scheduled for Wed, 17 January 2018, 10:30 am PDT.
>> The report for your podling will form a part of the Incubator PMC
>> report. The Incubator PMC requires your report to be submitted 2 weeks
>> before the board meeting, to allow sufficient time for review and
>> submission (Wed, January 03).
>>
>> Please submit your report with sufficient time to allow the Incubator
>> PMC, and subsequently board members to review and digest. Again, the
>> very latest you should submit your report is 2 weeks prior to the board
>> meeting.
>>
>> Thanks,
>>
>> The Apache Incubator PMC
>>
>> Submitting your Report
>>
>> --
>>
>> Your report should contain the following:
>>
>> *   Your project name
>> *   A brief description of your project, which assumes no knowledge of
>> the project or necessarily of its field
>> *   A list of the three most important issues to address in the move
>> t

Re: Podling Report Reminder - January 2018

2018-01-02 Thread Lei Chang
sorry, please use this link.

Thanks
Lei


On Tue, Jan 2, 2018 at 3:28 PM, Hongxu Ma <inte...@outlook.com> wrote:

> Hi Lei
> The format is broken...
>
> 在 02/01/2018 14:38, Lei Chang 写道:
>
> Hi Guys,
>
> The following is the Podling Report. Please review and give your comments.
>
> Cheers
> Lei
>
>
> 
>
>
> HAWQHAWQ is an advanced enterprise SQL on Hadoop analytic engine built
> around arobust and high-performance massively-parallel processing
> (MPP) SQL frameworkevolved from Greenplum Database.HAWQ has been
> incubating since 2015-09-04.Three most important issues to address in
> the move towardsgraduation:  1. Continue to improve the project's
> release cadence. To this end we plan on expanding automation
> services to support increased developer participation.Any issues
> that the Incubator PMC (IPMC) or ASF Board wish/needto be aware of?
>  Nothing urgent at this time.How has the community developed since the
> last report?1. Conference Talks (2): * The nature of cloud database.
> The 7th Data Technology Carnival (Speaker: Lei Chang, Nov 17, 2017)  *
> New Data Warehouse: Apache HAWQ. 2017 Global Internet Technology
> Conference (Speaker: Lei Chang, Nov 24, 2017) 2. Active contributions
> from approximately 20 different community   contributors since the
> last report.3. Three committer candidates passed the voting process:
> 1) Amy BAI   2) ChunLing WANG3) Hongxu MA   How has the project
> developed since the last report?1. The scope of 2.3 release is
> finalized, and is under development  1) New Feature: HAWQ Ranger
> supports RPS HA.  (Done)  2) New Feature: HAWQ Ranger supports
> Kerberos authentication. (Done)  3) New Feature: HAWQ Core supports
> plugable external storage framework. (Almost Done HAWQ-786)  4) New
> Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
> through libHdfs3. (Done, HAWQ-1193)  5) Licenses: Fix PXF license
> files located in PXF jar files. (Done HAWQ-1496)  6) Licenses: Check
> Apache HAWQ mandatory libraries to match LC20, LC30 license criteria.
> (Not started HAWQ-1512)  7) Build: Release build project (On going
> HAWQ-127)  8) Bug fixes. (On going)Project page 
> link:https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release
> How would you assess the podling's maturity?Please feel free to
> add your own commentary.  [ ] Initial setup  [ ] Working towards first
> release  [ ] Community building  [X] Nearing graduation  [ ]
> Other:Date of last release:  2017-07-12, Apache HAWQ 2.2.0.0  When
> were the last committers or PPMC members elected?   1) Amy BAI:  Nov
> 1, 2017   2) ChunLing WANG: Nov 1, 2017   3) Hongxu MA: Nov 4,
> 2017Signed-off-by:  [ ](hawq) Alan Gates Comments:  [ ](hawq)
> Konstantin Boudnik Comments:  [ ](hawq) Justin Erenkrantz
> Comments:  [ ](hawq) Thejas Nair Comments:  [ ](hawq) Roman
> Shaposhnik Comments: IPMC/Shepherd notes:
>
>
> On Sat, Dec 30, 2017 at 10:30 PM, <johndam...@apache.org> 
> <johndam...@apache.org> wrote:
>
>
> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 17 January 2018, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, January 03).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
> https://wiki.apache.org/incubator/January2018
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>
>
>
> --
> Regards,
> Hongxu.
>
>


Re: Podling Report Reminder - January 2018

2018-01-01 Thread Lei Chang
Hi Guys,

The following is the Podling Report. Please review and give your comments.

Cheers
Lei





HAWQ

HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a
robust and high-performance massively-parallel processing (MPP) SQL framework
evolved from Greenplum Database.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

  1. Continue to improve the project's release cadence. To this end we plan
     on expanding automation services to support increased developer
     participation.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware of?

  Nothing urgent at this time.

How has the community developed since the last report?

1. Conference Talks (2):
  * The nature of cloud database. The 7th Data Technology Carnival
    (Speaker: Lei Chang, Nov 17, 2017)
  * New Data Warehouse: Apache HAWQ. 2017 Global Internet Technology
    Conference (Speaker: Lei Chang, Nov 24, 2017)

2. Active contributions from approximately 20 different community
   contributors since the last report.

3. Three committer candidates passed the voting process:
   1) Amy BAI
   2) ChunLing WANG
   3) Hongxu MA

How has the project developed since the last report?

1. The scope of 2.3 release is finalized, and is under development
  1) New Feature: HAWQ Ranger supports RPS HA. (Done)
  2) New Feature: HAWQ Ranger supports Kerberos authentication. (Done)
  3) New Feature: HAWQ Core supports pluggable external storage framework.
     (Almost Done HAWQ-786)
  4) New Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
     through libHdfs3. (Done, HAWQ-1193)
  5) Licenses: Fix PXF license files located in PXF jar files. (Done HAWQ-1496)
  6) Licenses: Check Apache HAWQ mandatory libraries to match LC20, LC30
     license criteria. (Not started HAWQ-1512)
  7) Build: Release build project (On going HAWQ-127)
  8) Bug fixes. (On going)

Project page link:
https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release

How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-07-12, Apache HAWQ 2.2.0.0

When were the last committers or PPMC members elected?

  1) Amy BAI: Nov 1, 2017
  2) ChunLing WANG: Nov 1, 2017
  3) Hongxu MA: Nov 4, 2017

Signed-off-by:

  [ ](hawq) Alan Gates Comments:
  [ ](hawq) Konstantin Boudnik Comments:
  [ ](hawq) Justin Erenkrantz Comments:
  [ ](hawq) Thejas Nair Comments:
  [ ](hawq) Roman Shaposhnik Comments:

IPMC/Shepherd notes:


On Sat, Dec 30, 2017 at 10:30 PM, <johndam...@apache.org> wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 17 January 2018, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, January 03).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/January2018
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>


Re: Requesting committer status for incubator-hawq

2017-12-28 Thread Lei Chang
Hi Lav,

Great to see your contributions to HAWQ. Here is the process to become a
committer:
https://cwiki.apache.org/confluence/display/HAWQ/Becoming+a+committer

Cheers
Lei



On Thu, Dec 28, 2017 at 3:04 AM, Lav Jain  wrote:

> Hello Hawq Community,
>
> I would like to request committer access to incubator-hawq project based on
> my contributions (10 commits) so far:
>
> 
> https://github.com/apache/incubator-hawq/commits?author=lavjain
>
> I have worked actively on Hawq integration with Ambari in the past and also
> have over 300 commits to hawq-ci project (private to Pivotal employees):
>
> https://github.com/apache/ambari/commits?author=lavjain
> https://github.com/Pivotal-DataFabric/hawq-ci/commits?author=lavjain
>
> This would allow speedier development on future JIRAs for Hawq :-)
>
> Regards,
>
>
> *Lav Jain*
> *Pivotal Data*
>
> lj...@pivotal.io
>


Re: Questions for HAWQ dev community: Pluggable storage formats and files systems vs. PXF

2017-12-04 Thread Lei Chang
Great to see Greenplum is using PXF. I think PXF is a very good fit for
Greenplum's current architecture.

To avoid duplicate maintenance cost, my suggestion is to only maintain PXF
code in one place: Greenplum.

From the HAWQ side, it can be deprecated in a future release after the new
framework is ready.

Thanks
Lei


On Tue, Dec 5, 2017 at 3:02 AM, Ed Espino  wrote:

> To the HAWQ dev community,
>
> I wanted to raise up an issue for discussion regarding JIRA HAWQ-786
> . This is a proposal for a
> new component/functionality (Framework to support pluggable formats and
> file systems) that appears to replace that currently provided by the PXF
> component.
>
> PXF was recently re-used in another open source project: Greenplum-DB (
> https://github.com/greenplum-db/gpdb/tree/5X_STABLE/gpAux/extensions/pxf )
> and depends on the server-side components that exist today in HAWQ’s source
> tree.
>
> The question I have for the community is: with the possibility of PXF being
> replaced by a new component in a future release of HAWQ, what should become
> of the PXF code? Older releases of HAWQ (2.3.0 >) will continue to use it
> but there is an outside project now depending on it.
>
> Does the HAWQ community want to maintain the PXF code in the HAWQ project
> or if not here, where? If the GPDB community forked PXF from HAWQ would
> that be ok?
>
> Regards,
>
> Ed Espino
>


Re: Apache HAWQ (incubating) 2.3.0.0 suggestion

2017-11-14 Thread Lei Chang
Cool. It looks like your team's work is quite related to the pluggable storage
feature (HAWQ-786).

The feature implements a native C interface for external formats, and it is
several times faster than the current Java interface. The feature has also
been validated in the Oushu version for some time.

If you guys are interested in the alignment of the feature, we can discuss
it on the JIRA.

Cheers
Lei




On Tue, Nov 14, 2017 at 2:02 PM, 刘奎恩(局外) <kuien@alibaba-inc.com> wrote:

>
> Missed out DEV.
>
> --
> From: 刘奎恩(局外) <kuien@alibaba-inc.com>
> Sent: Tuesday, November 14, 2017, 11:15
> To: Lei Chang <chang.lei...@gmail.com>
> Subject: Re: Apache HAWQ (incubating) 2.3.0.0 suggestion
>
> Hi Dr. Chang,
>
> Good, thanks for the useful suggestion. Now there is:
> https://issues.apache.org/jira/browse/HAWQ-1550
>
> We are trying to integrate the HAWQ engine onto MaxCompute (formerly named
> ODPS, within Aliyun Cloud), an effort called Seahawks led by Mr. Chen Xia. We
> developed a component, AXF (borrowing the idea from PXF), to connect data
> sources from Druid and MaxCompute. It works well for this 11.11 battle, but it
> is not a stand-alone program.
>
> -——
> Kuien Liu/奎恩
>
> --
> From: Lei Chang <chang.lei...@gmail.com>
> Sent: Tuesday, November 14, 2017, 10:28
> To: dev <dev@hawq.incubator.apache.org>; 刘奎恩(局外) <kuien@alibaba-inc.com>
> Subject: Re: Apache HAWQ (incubating) 2.3.0.0 suggestion
>
>
> Kuien, we welcome any good contributions from the community.
>
> Looks like hawq_log_master_concise is a good enhancement; I'd suggest you
> create a JIRA and we can discuss it there.
>
> Can you explain more about Druid Wrapper & MaxCompute Wrapper? What are
> the use cases here?
>
> Cheers
> Lei
>
>
>
>
>
> On Tue, Nov 14, 2017 at 10:05 AM, 刘奎恩(局外) <kuien@alibaba-inc.com>
> wrote:
> Hi Ed and Hawq,
> I have added two GUCs (log_max_size, log_max_age) to control the logfile
> size on the master; otherwise queries against hawq_toolkit.hawq_log_master_concise
> get slower and slower, and disk usage on the master is hard to constrain.
> If anyone is interested in this, I can submit it to HAWQ.
> Besides, our team has introduced many interesting features on HAWQ, but
> most of them are deeply coupled with Alibaba Cloud. Some of them may (or may
> not) be separated out, such as the Druid Wrapper and MaxCompute Wrapper. I
> will discuss this with my teammates Mr. Chen Xia, Mr. Zhiyong Dai et al.
>
> Best wishes! 刘奎恩/局外
> --
> From: Ed Espino <esp...@apache.org>
> Sent: Monday, November 13, 2017, 23:25
> To: dev <dev@hawq.incubator.apache.org>
> Subject: Apache HAWQ (incubating) 2.3.0.0 suggestion
> HAWQ,
>
> I feel we have sufficient content to warrant a 2.3.0.0
> release. It has been
> 4+ months since our last release (2.2.0.0). I suggest we include the
> pluggable storage feature (HAWQ-786 - assigned to Chiyang Wan -
> chiyang10...@gmail.com) plus others (as appropriate) in
> the 2.3.0.0 queue in
> the next release.
>
> Are there any outstanding issues that are mandatory for
> the 2.3.0.0 release?
> What do others think?
>
> -=e
>
> --
> *Ed Espino*
>
>
>
>


Re: Apache HAWQ (incubating) 2.3.0.0 suggestion

2017-11-13 Thread Lei Chang
Kuien, we welcome any good contributions from the community.

Looks like hawq_log_master_concise is a good enhancement; I'd suggest you
create a JIRA and we can discuss it there.

Can you explain more about Druid Wrapper & MaxCompute Wrapper? What are the
use cases here?

Cheers
Lei





On Tue, Nov 14, 2017 at 10:05 AM, 刘奎恩(局外)  wrote:

> Hi Ed and Hawq,
> I have added two GUCs (log_max_size, log_max_age) to control the logfile
> size on the master; otherwise queries against hawq_toolkit.hawq_log_master_concise
> get slower and slower, and disk usage on the master is hard to constrain.
> If anyone is interested in this, I can submit it to HAWQ.
> Besides, our team has introduced many interesting features on HAWQ, but
> most of them are deeply coupled with Alibaba Cloud. Some of them may (or may
> not) be separated out, such as the Druid Wrapper and MaxCompute Wrapper. I
> will discuss this with my teammates Mr. Chen Xia, Mr. Zhiyong Dai et al.
>
> Best wishes! 刘奎恩/局外
> --
> From: Ed Espino
> Sent: Monday, November 13, 2017, 23:25
> To: dev <dev@hawq.incubator.apache.org>
> Subject: Apache HAWQ (incubating) 2.3.0.0 suggestion
> HAWQ,
>
> I feel we have sufficient content to warrant a 2.3.0.0
> release. It has been
> 4+ months since our last release (2.2.0.0). I suggest we include the
> pluggable storage feature (HAWQ-786 - assigned to Chiyang Wan -
> chiyang10...@gmail.com) plus others (as appropriate) in
> the 2.3.0.0 queue in
> the next release.
>
> Are there any outstanding issues that are mandatory for
> the 2.3.0.0 release?
> What do others think?
>
> -=e
>
> --
> *Ed Espino*
>


Re: Hawq standby sync fails if there are existing connections to master

2017-11-13 Thread Lei Chang
Oh yes, that makes sense; I overlooked the -M option here.

Here we should respect the input "-M" option.

Cheers
Lei



On Tue, Nov 14, 2017 at 8:46 AM, Shubham Sharma <ssha...@pivotal.io> wrote:

> Thanks for the response Lei.
>
> I understand this completely and agree with the behavior that we should not
> brutally terminate connections.
>
> However, the case here is that I am deliberately choosing a stop mode with the
> command `hawq init standby -n -v -M fast`. At this point, as a user, I
> understand that connections will be terminated without warning. When
> passing the option `-M fast`, hawq should not interrupt me.
>
> Basically, the hawq command line is not respecting `-M fast`, whereas hawq init
> standby --help documents this option. Let me know if this makes sense.
>
>
> hawq init standby --help
>
> Usage: HAWQ management scripts options
>
>
> Options:
>
>   -h, --helpshow this help message and exit
>
>   -a, --prompt  Execute automatically
>
>   -M STOP_MODE, --mode=STOP_MODE
>
>         HAWQ stop mode: smart/fast/immediate
>
>
> On Mon, Nov 13, 2017 at 4:32 PM, Lei Chang <chang.lei...@gmail.com> wrote:
>
> > Hi Shubham,
> >
> > The behavior is intentional. If there are connections when HAWQ init
> > standby, it is better to warn the client instead of cutting the
> connections
> > brutally.
> >
> > Cheers
> > Lei
> >
> >
> >
> > On Mon, Nov 13, 2017 at 11:58 AM, Shubham Sharma <ssha...@pivotal.io>
> > wrote:
> >
> > > Hello folks,
> > >
> > > Recently observed a behaviour while re-syncing standby from hawq
> command
> > > line.
> > >
> > > Here are the reproduction steps -
> > >
> > > 1 - Open a client connection to hawq using psql
> > > 2 - From a different terminal run command - hawq init standby -n -v -M
> > fast
> > > 3 - Standby resync fails with error
> > >
> > >
> > > 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-There are
> > > other connections to this instance, shutdown mode smart aborted
> > >
> > > 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-Either
> > > remove connections, or use 'hawq stop master -M fast' or 'hawq stop
> > > master -M immediate'
> > >
> > > 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-See hawq
> > > stop --help for all options
> > >
> > > 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[ERROR]:-Active
> > > connections. Aborting shutdown...
> > >
> > > 20171113:03:49:21:158143 hawq_init:hdp3:gpadmin-[ERROR]:-Stop hawq
> > > cluster failed, exit
> > >
> > > 4 - My understanding is when -M (stop mode) is passed it should
> terminate
> > > existing client connections. Also, it seems like a good practice to
> > > terminate client connections before standby master resync.
> > >
> > > Is this an expected behavior in hawq ? If not, I can open a JIRA and
> work
> > > on a pull request to fix this.
> > >
> > > Looking forward to your thoughts on this.
> > > ​
> > > --
> > > Regards,
> > > Shubham Sharma
> > >
> >
>
>
>
> --
> Regards,
> Shubham Sharma
> Staff Customer Engineer
> Pivotal Global Support Services
> ssha...@pivotal.io
> Direct Tel: +1(510)-304-8201
> Office Hours: Mon-Fri 9:00 am to 5:00 pm PDT
> Out of Office Hours Contact +1 877-477-2269
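
The thread above boils down to one behavior: `hawq init standby` accepts
`-M smart/fast/immediate`, but the embedded stop step still aborts as if
`-M smart` were given. The Python sketch below shows the requested behavior of
threading the user-supplied mode through to the stop call; it is a minimal
illustration, not the actual hawq management-tool code, and the function names
and option handling are assumptions.

# Minimal sketch, assuming the management tool parses -M itself and then
# shells out to "hawq stop master". NOT the real hawq_ctl code.
import argparse
import subprocess
import sys

def parse_args(argv):
    parser = argparse.ArgumentParser(prog="hawq init standby")
    parser.add_argument("-n", dest="resync", action="store_true",
                        help="re-sync the standby instead of re-initializing it")
    parser.add_argument("-v", dest="verbose", action="store_true")
    parser.add_argument("-M", "--mode", dest="stop_mode", default="smart",
                        choices=["smart", "fast", "immediate"],
                        help="stop mode used when the master must be stopped")
    return parser.parse_args(argv)

def stop_master(stop_mode):
    # Honor the requested mode: "fast" terminates existing client connections
    # instead of aborting with "Active connections. Aborting shutdown...".
    return subprocess.call(["hawq", "stop", "master", "-a", "-M", stop_mode])

def main(argv):
    opts = parse_args(argv)
    if stop_master(opts.stop_mode) != 0:
        sys.exit("Stop hawq cluster failed, exit")
    # ... standby re-sync would continue here ...

if __name__ == "__main__":
    main(sys.argv[1:])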
>


Re: Hawq standby sync fails if there are existing connections to master

2017-11-13 Thread Lei Chang
Hi Shubham,

The behavior is intentional. If there are connections when HAWQ init
standby, it is better to warn the client instead of cutting the connections
brutally.

Cheers
Lei



On Mon, Nov 13, 2017 at 11:58 AM, Shubham Sharma  wrote:

> Hello folks,
>
> Recently observed a behaviour while re-syncing standby from hawq command
> line.
>
> Here are the reproduction steps -
>
> 1 - Open a client connection to hawq using psql
> 2 - From a different terminal run command - hawq init standby -n -v -M fast
> 3 - Standby resync fails with error
>
>
> 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-There are
> other connections to this instance, shutdown mode smart aborted
>
> 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-Either
> remove connections, or use 'hawq stop master -M fast' or 'hawq stop
> master -M immediate'
>
> 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-See hawq
> stop --help for all options
>
> 20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[ERROR]:-Active
> connections. Aborting shutdown...
>
> 20171113:03:49:21:158143 hawq_init:hdp3:gpadmin-[ERROR]:-Stop hawq
> cluster failed, exit
>
> 4 - My understanding is when -M (stop mode) is passed it should terminate
> existing client connections. Also, it seems like a good practice to
> terminate client connections before standby master resync.
>
> Is this an expected behavior in hawq ? If not, I can open a JIRA and work
> on a pull request to fix this.
>
> Looking forward to your thoughts on this.
> ​
> --
> Regards,
> Shubham Sharma
>


Re: I want to contribute code to Apache HAWQ

2017-11-12 Thread Lei Chang
Cool.  Welcome to Apache HAWQ.

And looking forwards to seeing your updates on the feature.

Cheers
Lei




On Mon, Nov 13, 2017 at 10:54 AM, Chiyang Wan 
wrote:

> Hello, everyone. I want to contribute code to Apache HAWQ, especially for
> HAWQ-786(Framework to support pluggable formats and file systems).
>


Re: New committer: Hongxu Ma

2017-11-01 Thread Lei Chang
Congrats!

Cheers
Lei


On Wed, Nov 1, 2017 at 11:33 PM, Brian Lu  wrote:

> Congrats Hongxu!
>
> Best Regards,
> Brian
>
> On Wed, Nov 1, 2017 at 10:44 PM, Shubham Sharma 
> wrote:
>
> > Congrats Hongxu !
> >
> > On Wed, Nov 1, 2017 at 3:15 AM, stanly sheng 
> > wrote:
> >
> > > Congrats, Hongxu!
> > >
> > > 2017-11-01 14:16 GMT+08:00 Hubert Zhang :
> > >
> > > > Congrats to Hongxu!
> > > >
> > > > On Wed, Nov 1, 2017 at 2:06 PM, Radar Lei  wrote:
> > > >
> > > > > The Project Management Committee (PMC) for Apache HAWQ (incubating)
> > has
> > > > > invited Hongxu Ma to become a committer and we are pleased to
> > announce
> > > > that
> > > > > he has accepted.
> > > > > Being a committer enables easier contribution to the project since
> > > there
> > > > is
> > > > > no need to go via the patch submission process. This should enable
> > > better
> > > > > productivity.
> > > > > Please join us in congratulating him and we are looking forward to
> > > > > collaborating with him in the open source community.
> > > > >
> > > > > His contribution includes (but not limited to):
> > > > >
> > > > > *Direct contribution to code base:*
> > > > >
> > > > >    - 26 commits in total with some major components in hawq involved,
> > > > >      including contributions to Apache Ranger integration, TDE support
> > > > >      and command line tools.
> > > > >      https://github.com/apache/incubator-hawq/commits?author=interma
> > > > >    - 27 closed PRs:
> > > > >      https://github.com/apache/incubator-hawq/pulls?utf8=%E2%9C%93=is%3Apr%20is%3Aclosed%20author%3Ainterma
> > > > >- 11 Jiras on Apache Ranger integration
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1493 Integrate
> > > Ranger
> > > > >   lookup JAAS configuration in ranger-admin plugin jar
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1485 Use
> > > > user/password
> > > > >   instead of credentials cache in Ranger lookup for HAWQ with
> > > > Kerberos
> > > > >   enabled.
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1477
> > Ranger-plugin
> > > > >   connect to Ranger admin under kerberos security.
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1456 Copy RPS
> > > > >   configuration files to standby in specific scenarios
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1436 Implement
> > RPS
> > > > High
> > > > >   availability on HAWQ
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1410 Add basic
> > test
> > > > > case
> > > > >   for hcatalog with ranger
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1405 'hawq stop
> > > > >   --reload' should not stop RPS
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1393 'hawq stop
> > > > > cluster'
> > > > >   failed when rps.sh have some path errors (e.g. CATALINA_HOME)
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1365 Print out
> > > > detailed
> > > > >   schema information for tables which the user doesn't have
> > > > privileges
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1279 Force to
> > > > recompute
> > > > >   namespace_path when enable_ranger
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1257 If user
> > > doesn't
> > > > >   have privileges on certain objects, need return user which
> > > specific
> > > > > table
> > > > >   he doesn't have right.
> > > > >- 4 Jiras on hawq TDE support
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1520
> gpcheckhdfs
> > > > should
> > > > >   skip hdfs trash directory
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1510 Add
> > > TDE-related
> > > > >   functionality into hawq command line tools
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1506 Support
> > > > >   multi-append a file within encryption zone
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1193 TDE
> support
> > in
> > > > > HAWQ
> > > > >- 5 Jiras on improvements including documentation, test, build,
> > > > command
> > > > >line tools, code refactor.
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1385 hawq_ctl
> > stop
> > > > >   failed when master is down
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-870 Allocate
> > > target's
> > > > >   tuple table slot in PortalHeapMemory during split partition
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-513 initdb.c
> > failed
> > > > on
> > > > >   OSX 10.11.3 due to fgets error
> > > > >   - https://issues.apache.org/jira/browse/HAWQ-1380 Keep
> > > > hawq_toolkit
> > > > >   schema 

Re: New committer: Chunling Wang

2017-11-01 Thread Lei Chang
Congrats!

Cheers
Lei


On Wed, Nov 1, 2017 at 11:42 PM, Ed Espino  wrote:

> Congratulations Chunling.
>
> Regards,
> -=e
>
> On Wed, Nov 1, 2017 at 8:33 AM, Brian Lu  wrote:
>
> > Congrats Chunling!
> >
> > Best Regards,
> > Brian
> >
> > On Wed, Nov 1, 2017 at 10:45 PM, Shubham Sharma 
> > wrote:
> >
> > > Congratulations Chunling !
> > >
> > > On Wed, Nov 1, 2017 at 3:15 AM, stanly sheng 
> > > wrote:
> > >
> > > > Congrats, Chunling !
> > > >
> > > > 2017-11-01 14:15 GMT+08:00 Hubert Zhang :
> > > >
> > > > > Congrats to Chunling
> > > > >
> > > > > On Wed, Nov 1, 2017 at 2:10 PM, yanqing weng 
> > wrote:
> > > > >
> > > > > > The Project Management Committee (PMC) for Apache HAWQ
> (incubating)
> > > has
> > > > > > invited Chunling Wang to become a committer and we are pleased to
> > > > > announce
> > > > > > that
> > > > > > she has accepted.
> > > > > > Being a committer enables easier contribution to the project
> since
> > > > there
> > > > > is
> > > > > > no need to go via the patch submission process. This should
> enable
> > > > better
> > > > > > productivity.
> > > > > > Please join us in congratulating her and we are looking forward
> to
> > > > > > collaborating with her in the open source community.
> > > > > >
> > > > > > Her contribution includes (but not limited to):
> > > > > >
> > > > > > *Direct contribution to code base:*
> > > > > >
> > > > > >    - *37 commits* in total with some major components in hawq involved,
> > > > > >      including contributions to Apache Ranger integration, hawq register
> > > > > >      and command line tools.
> > > > > >      https://github.com/apache/incubator-hawq/commits?author=wcl14
> > > > > >    - *29 closed PRs*:
> > > > > >      https://github.com/apache/incubator-hawq/pulls?utf8=%E2%9C%93=is%3Apr%20is%3Aclosed%20author%3Awcl14
> > > > > >    - *2 improvements* including documentation, test, build, command line
> > > > > >      tools, code refactor.
> > > > > >      - HAWQ-1358 <https://issues.apache.org/jira/browse/HAWQ-1358>.
> > > > > >        Refactor gpfdist library in featuretest.
> > > > > >      - HAWQ-1377 <https://issues.apache.org/jira/browse/HAWQ-1377>.
> > > > > >        Add more information for Ranger related GUCs in default
> > > > > >        hawq-site.xml.
> > > > > >    - *13 bug fixes* including test failure, ranger, register, core-dump,
> > > > > >      command line tools, build components.
> > > > > >      - HAWQ-1020 <https://issues.apache.org/jira/browse/HAWQ-1020>.
> > > > > >        Fix bugs to let feature tests TestCommonLib.TestHdfsConfig and
> > > > > >        TestCommonLib.TestYanConfig run in concourse
> > > > > >      - HAWQ-1037 <https://issues.apache.org/jira/browse/HAWQ-1037>.
> > > > > >        Modify way to get HDFS port in TestHawqRegister
> > > > > >      - HAWQ-1037 <https://issues.apache.org/jira/browse/HAWQ-1037>.
> > > > > >        Get HDFS namenode host from HAWQ catalog tables instead of from
> > > > > >        HDFS configuration file
> > > > > >      - HAWQ-1037 <https://issues.apache.org/jira/browse/HAWQ-1037>.
> > > > > >        Add get OS user in hdfs_config when it is not set
> > > > > >      - HAWQ-1113 <https://issues.apache.org/jira/browse/HAWQ-1113>.
> > > > > >        Fix bug when files in yaml are disordered, hawq register error
> > > > > >        in force mode.
> > > > > >      - HAWQ-1145 <https://issues.apache.org/jira/browse/HAWQ-1145>.
> > > > > >        Add UDF gp_relfile_insert_for_register and add insert metadata
> > > > > >        into gp_relfile_node and gp_persistent_relfile_node for HAWQ
> > > > > >        register
> > > > > >      - HAWQ-1237 <https://issues.apache.org/jira/browse/HAWQ-1237>.
> > > > > >        Modify hard code 'select' privilege in
> > > > > >        create_ranger_request_json_batch() in rangerrest.c
> > > > > >      - HAWQ-1239 <https://issues.apache.org/jira/browse/HAWQ-1239>.
> > > > > >        Fail to call pg_rangercheck_batch() when 'rte->rtekind !=
> > > > > >        RTE_RELATION' or 'requiredPerms == 0'
> > > > > >      - HAWQ-1356 <https://issues.apache.org/jira/browse/HAWQ-1356>.
> > > > > >        Add warning when user does not have usage privilege of namespace.
> > > > > >      - HAWQ-1357 <https://issues.apache.org/jira/browse/HAWQ-1357>.
> > > > > >        Super user also needs to check create privilege of public schema
> > > > > >        from Ranger.
> 

Re: New Committer: Amy Bai

2017-11-01 Thread Lei Chang
Congrats!

Cheers
Lei

On Wed, Nov 1, 2017 at 2:02 PM, Wen Lin  wrote:

> Hi,
>
> The Project Management Committee (PMC) for Apache HAWQ (incubating) has
> invited Amy Bai to become a committer and we are pleased to announce that
> she has accepted.
> Being a committer enables easier contribution to the project since there is
> no need to go via the patch submission process. This should enable better
> productivity. Please join us in congratulating her and we are looking
> forward to collaborating with her in the open source community. Her
> contribution includes (but not limited to):
> List contributions to code base, documentation, code review, discussion in
> mailing list, JIRA, etc.
>
> Regards!
>


Re: Podling Report Reminder - October 2017

2017-09-25 Thread Lei Chang
Thanks Hongxu, updated:


---
HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the
key technological advantages of MPP database with the scalability
and convenience of Hadoop. HAWQ reads data from and writes data
to HDFS natively.  HAWQ delivers industry-leading performance and
linear scalability. It provides users the tools to confidently
and successfully interact with petabyte range data sets. HAWQ
provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards
graduation:

  1. Continue to improve the project's release cadence. To this
 end we plan on expanding automation services to support
 increased developer participation.


Any issues that the Incubator PMC (IPMC) or ASF Board wish/need
to be aware of?

 Nothing urgent at this time.

How has the community developed since the last report?

1. Conference Talks (2):

 * Big data technology Trends. China Big Data Industry Ecosystem Conference
(Speaker: Lei Chang, Oushu Inc, August 2, 2017)

 * Future Data Warehouse. China CIO conference (Speaker: Lei Chang, Sep 16,
2017)

2. Active contributions from approximately 20 different community
   contributors since the last report (July 2017).

3. Yi JIN volunteered as the release manager for 2.3 release.

4. Three committer candidates are under voting process:

   1) Amy BAI
   2) ChunLing WANG
   3) Hongxu MA


How has the project developed since the last report?

1. Apache HAWQ 2.2.0.0 was released. HAWQ 2.2 is the first binary release.

   Release information:

   1) Release page:
  https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.2.0.0-
incubating+Release

   2) Issues/tasks fixed (80):
  https://issues.apache.org/jira/issues/?filter=12339844


2. The scope of 2.3 release is finalized:

1) New Feature: HAWQ Ranger supports RPS HA.  (Done)
2) New Feature: HAWQ Ranger supports Kerberos authentication. (Done)
3) New Feature: HAWQ Core supports plugable external storage framework. (On
going HAWQ-786)
4) New Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
through libHdfs3. (Done, HAWQ-1193)
5) Licenses: Fix PXF license files located in PXF jar files. (Not started
HAWQ-1496)
6) Licenses: Check Apache HAWQ mandatory libraries to match LC20, LC30
license criteria. (Not started HAWQ-1512)
7) Build: Release build project (On going HAWQ-127)
8) Bug fixes. (On going)

Project page link: https://cwiki.apache.org/confluence/display/HAWQ/
Apache+HAWQ+2.3.0.0-incubating+Release


3. Project mail list activity:

   Between July 1, 2017 and Sep 25, 2017:

   d...@hawq.apache.org & u...@hawq.apache.org
 155 emails sent
 53  participants


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-07-12, Apache HAWQ 2.2.0.0

When were the last committers or PPMC members elected?

  Podling committers (1) added:
Xiang Sheng (https://github.com/stanlyxiang), May 16, 2017

Signed-off-by:

  [ ](hawq) Alan Gates
 Comments:
  [ ](hawq) Konstantin Boudnik
 Comments:
  [ ](hawq) Justin Erenkrantz
 Comments:
  [ ](hawq) Thejas Nair
 Comments:
  [ ](hawq) Roman Shaposhnik
 Comments:



On Mon, Sep 25, 2017 at 5:11 PM, Hongxu Ma <inte...@outlook.com> wrote:

> Hi Dr.Chang
> > 4) New Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
> > through libHdfs3. (On going HAWQ-1193)
> HAWQ-1193 is already finished now, we forgot close this jira before.
> Thanks!
>
> --
> Regards,
> Hongxu.
>
>


Re: Podling Report Reminder - October 2017

2017-09-25 Thread Lei Chang
Hi Guys,

Please see the following draft podling report. Please feel free to add more
contents.

Cheers
Lei


---
HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the
key technological advantages of MPP database with the scalability
and convenience of Hadoop. HAWQ reads data from and writes data
to HDFS natively.  HAWQ delivers industry-leading performance and
linear scalability. It provides users the tools to confidently
and successfully interact with petabyte range data sets. HAWQ
provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards
graduation:

  1. Continue to improve the project's release cadence. To this
 end we plan on expanding automation services to support
 increased developer participation.


Any issues that the Incubator PMC (IPMC) or ASF Board wish/need
to be aware of?

 Nothing urgent at this time.

How has the community developed since the last report?

1. Conference Talks (2):

 * Big data technology Trends. China Big Data Industry Ecosystem Conference
(Speaker: Lei Chang, Oushu Inc, August 2, 2017)

 * Future Data Warehouse. China CIO conference (Speaker: Lei Chang, Sep 16,
2017)

2. Active contributions from approximately 20 different community
   contributors since the last report (July 2017).

3. Yi JIN volunteered as the release manager for 2.3 release.

4. Three committer candidates are under voting process:

   1) Amy BAI
   2) ChunLing WANG
   3) Hongxu MA


How has the project developed since the last report?

1. Apache HAWQ 2.2.0.0 was released. HAWQ 2.2 is the first binary release.

   Release information:

   1) Release page:

https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.2.0.0-incubating+Release

   2) Issues/tasks fixed (80):
  https://issues.apache.org/jira/issues/?filter=12339844


2. The scope of 2.3 release is finalized:

1) New Feature: HAWQ Ranger supports RPS HA.  (Done)
2) New Feature: HAWQ Ranger supports Kerberos authentication. (Done)
3) New Feature: HAWQ Core supports plugable external storage framework. (On
going HAWQ-786)
4) New Feature: HAWQ Core supports HDFS TDE (Transparent Data Encryption)
through libHdfs3. (On going HAWQ-1193)
5) Licenses: Fix PXF license files located in PXF jar files. (Not started
HAWQ-1496)
6) Licenses: Check Apache HAWQ mandatory libraries to match LC20, LC30
license criteria. (Not started HAWQ-1512)
7) Build: Release build project (On going HAWQ-127)
8) Bug fixes. (On going)

Project page link:
https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release


3. Project mail list activity:

   Between July 1, 2017 and Sep 25, 2017:

   d...@hawq.apache.org & u...@hawq.apache.org
 155 emails sent
 53  participants


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [ ] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-07-12, Apache HAWQ 2.2.0.0

When were the last committers or PPMC members elected?

  Podling committers (1) added:
Xiang Sheng (https://github.com/stanlyxiang), May 16, 2017

Signed-off-by:

  [ ](hawq) Alan Gates
 Comments:
  [ ](hawq) Konstantin Boudnik
 Comments:
  [ ](hawq) Justin Erenkrantz
 Comments:
  [ ](hawq) Thejas Nair
 Comments:
  [ ](hawq) Roman Shaposhnik
 Comments:




On Sat, Sep 23, 2017 at 9:10 PM, <johndam...@apache.org> wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 18 October 2017, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, October 04).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the pr

Re: Gitbox enables the full GitHub workflow

2017-08-07 Thread Lei Chang
cool. this will simplify the commit workflow a lot.

And if we can use github issues instead of apache JIRA, that will be more
convenient.

Cheers
Lei




On Tue, Aug 8, 2017 at 9:09 AM, Roman Shaposhnik 
wrote:

> Hi!
>
> it has just come to my attention that Gitbox at ASF
> has been enabling full GitHub workflow (with being
> able to click Merge this PR button, etc.) for quite
> some time.
>
> This basically allows a project to have GH as a R/W
> repo as opposed to R/O mirror of what we all currently
> have: https://gitbox.apache.org/repos/asf
>
> Personally I'm not sure I like GH workflow all that much,
> but if there's interest -- you can opt-in into Gitbox. Once
> you do -- your source of truth moves to GH. You can't
> have it both ways with git-wip-us.apache.org and Gitbox.
>
> Thanks,
> Roman.
>


Re: Re: [ANNOUNCE] Apache HAWQ 2.2.0.0-incubating Released

2017-07-13 Thread Lei Chang
Awesome!

I think this will be a very successful release with Yi managing this.

Cheers
Lei


On Thu, Jul 13, 2017 at 2:48 PM, Yi JIN <y...@apache.org> wrote:

> Hi Ruilong,
>
> I would like to take this responsibility as a volunteer for the next
> release. As a committer I used to contribute a lot of code to Apache HAWQ,
> consequently besides code work, if possible I would like to contribute more
> in another way and learn more about growing an Apache project.
>
> Best,
> Yi (yjin)
>
> On Thu, Jul 13, 2017 at 4:43 PM, HuoRuilong <huoruil...@163.com> wrote:
>
> > Great step towards a mature hawq and active community! Thanks everyone
> for
> > making this real, especially the help from Ed!
> >
> > To make it a more successful apache project and community, we need to
> keep
> > the release cadence. Who would like to be volunteer for the next release
> > manager and drive the effort? Thanks.
> >
> > Best regards,
> > Ruilong Huo
> >
> >
> > At 2017-07-13 14:39:21, "Lili Ma" <lil...@apache.org> wrote:
> > >Congratulations everyone :)
> > >
> > >We're stepping further towards graduation!
> > >
> > >Best Regards,
> > >Lili
> > >
> > >2017-07-13 13:16 GMT+08:00 Ed Espino <esp...@apache.org>:
> > >
> > >> Congratulations to everyone on the first Apache HAWQ release with
> > >> convenience binaries. Special thanks to Ruilong for his excellent
> > release
> > >> management guidance.
> > >>
> > >> I'm very proud to be part of a great dev team.
> > >>
> > >> Cheers,
> > >> -=e
> > >>
> > >> On Wed, Jul 12, 2017 at 10:00 PM, 陶征霖 <ztao1...@apache.org> wrote:
> > >>
> > >> > Congrats!
> > >> >
> > >> > 2017-07-13 9:55 GMT+08:00 Yandong Yao <y...@pivotal.io>:
> > >> >
> > >> > > Great achievement, Congrats!
> > >> > >
> > >> > > On Thu, Jul 13, 2017 at 8:46 AM, Lei Chang <
> chang.lei...@gmail.com>
> > >> > wrote:
> > >> > >
> > >> > > > Congrats!
> > >> > > >
> > >> > > > Cheers
> > >> > > > Lei
> > >> > > >
> > >> > > >
> > >> > > > On Wed, Jul 12, 2017 at 3:27 PM, Ruilong Huo <h...@apache.org>
> > >> wrote:
> > >> > > >
> > >> > > > > Hi All,
> > >> > > > >
> > >> > > > > The Apache HAWQ (incubating) Project Team is proud to announce
> > >> > > > > the release of Apache HAWQ 2.2.0.0-incubating.
> > >> > > > >
> > >> > > > > This is a source code and binary release.
> > >> > > > >
> > >> > > > > ABOUT HAWQ
> > >> > > > > Apache HAWQ (incubating) combines exceptional MPP-based
> > analytics
> > >> > > > > performance, robust ANSI SQL compliance, Hadoop ecosystem
> > >> integration
> > >> > > > > and manageability, and flexible data-store format support, all
> > >> > > > > natively in Hadoop, no connectors required.
> > >> > > > >
> > >> > > > > Built from a decade’s worth of massively parallel processing
> > (MPP)
> > >> > > > > expertise developed through the creation of open source
> > Greenplum®
> > >> > > > > Database and PostgreSQL, HAWQ enables you to
> > >> > > > > swiftly and interactively query Hadoop data, natively via
> HDFS.
> > >> > > > >
> > >> > > > > FEATURES AND ENHANCEMENTS INCLUDED IN THIS RELEASE
> > >> > > > > - CentOS 7.x support
> > >> > > > > Apache HAWQ is improved to be compatible with CentOS 7.x along
> > with
> > >> > > 6.x.
> > >> > > > >
> > >> > > > > - Apache Ranger integration
> > >> > > > > Integrate Apache HAWQ with Apache Ranger through HAWQ Ranger
> > Plugin
> > >> > > > Service
> > >> > > > > which is a RESTful service. It enables users to use Apache
> > Ranger
> > >> to
> > >> > > > > authorize
> > >> > > > > user access to Apache HAWQ r

Re: [ANNOUNCE] Apache HAWQ 2.2.0.0-incubating Released

2017-07-12 Thread Lei Chang
Congrats!

Cheers
Lei


On Wed, Jul 12, 2017 at 3:27 PM, Ruilong Huo  wrote:

> Hi All,
>
> The Apache HAWQ (incubating) Project Team is proud to announce
> the release of Apache HAWQ 2.2.0.0-incubating.
>
> This is a source code and binary release.
>
> ABOUT HAWQ
> Apache HAWQ (incubating) combines exceptional MPP-based analytics
> performance, robust ANSI SQL compliance, Hadoop ecosystem integration
> and manageability, and flexible data-store format support, all
> natively in Hadoop, no connectors required.
>
> Built from a decade’s worth of massively parallel processing (MPP)
> expertise developed through the creation of open source Greenplum®
> Database and PostgreSQL, HAWQ enables you to
> swiftly and interactively query Hadoop data, natively via HDFS.
>
> FEATURES AND ENHANCEMENTS INCLUDED IN THIS RELEASE
> - CentOS 7.x support
> Apache HAWQ is improved to be compatible with CentOS 7.x along with 6.x.
>
> - Apache Ranger integration
> Integrate Apache HAWQ with Apache Ranger through HAWQ Ranger Plugin Service
> which is a RESTful service. It enables users to use Apache Ranger to
> authorize
> user access to Apache HAWQ resources. It also manages all Hadoop
> components’
> authorization policies with the same user interface, policy store, and
> auditing
> stores.
>
> - PXF ORC profile
> Fully supports PXF with Optimized Row Columnar (ORC) file format.
>
> - Fixes and enhancements on Apache HAWQ resource manager, query execution,
> dispatcher,
> catalog, management utilities and more.
>
> JIRA GENERATED RELEASE NOTES
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12318826=12339641
>
> RELEASE ARTIFACTS ARE AVAILABLE AT
> http://apache.org/dyn/closer.cgi/incubator/hawq/2.2.0.0-incubating
>
> SHA256 & MD5 SIGNATURES (verify your downloads <
> https://www.apache.org/dyn/closer.cgi#verify>):
> https://dist.apache.org/repos/dist/release/incubator/hawq/2.
> 2.0.0-incubating
>
> PGP KEYS
> https://dist.apache.org/repos/dist/release/incubator/hawq/KEYS
>
> DOCUMENTATION
> http://hawq.incubator.apache.org/docs/userguide/2.2.0.0-incubating
>
> HAWQ RESOURCES
> - JIRA: https://issues.apache.org/jira/browse/HAWQ
> - Wiki: https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+Home
> - Mailing list:
> dev@hawq.incubator.apache.org
> u...@hawq.incubator.apache.org
>
> LEARN MORE ABOUT HAWQ
> http://hawq.apache.org
>
> Best regards,
> - Apache HAWQ (incubating) Team
>
> ==
> DISCLAIMER
>
> Apache HAWQ (incubating) is an effort undergoing incubation at the Apache
> Software Foundation (ASF), sponsored by the Apache Incubator PMC.
>
> Incubation is required of all newly accepted projects until a further
> review indicates that the infrastructure, communications, and decision
> making process have stabilized in a manner consistent with other
> successful ASF projects.
>
> While incubation status is not necessarily a reflection of the
> completeness or stability of the code, it does indicate that the
> project has yet to be fully endorsed by the ASF.
>
> Best regards,
> Ruilong Huo
>


Re: Updating HAWQ’s website

2017-07-12 Thread Lei Chang
Thanks Alan, It is very helpful :-)

Cheers
Lei


On Tue, Jul 11, 2017 at 11:45 PM, Alan Gates  wrote:

> A question came up recently on how to update HAWQ’s website.  In case
> anyone else has a similar question, i wanted to let you know that there are
> instructions for incubator projects updating their websites at
> http://incubator.apache.org/guides/website.html
>
> If you have any questions feel free to ask on this list or
> general@incubator
>
> Alan.
>


Re: GNU SASL Library - Libgsasl dependency for libhdfs3?

2017-07-07 Thread Lei Chang
Radar is correct. When we open sourced HAWQ, GPL/LGPL runtime/build
dependencies were considered.

Cheers
Lei




On Fri, Jul 7, 2017 at 1:54 PM, Radar Lei  wrote:

> Hi Ed,
>
> Per my understanding, libgsasl is a build and runtime dependency of HAWQ,
> but it's not included in the HAWQ source code or the HAWQ binary release tarball.
>
> If you check the HAWQ rpm dependences, you can see 'libgsasl' is listed in
> 'hawq.spec', user need to install it before install HAWQ.
>
> So I think we don't need to care whether it's Apache License compatible, and
> we don't need to include it in our license file. Thanks.
>
>
>
> Regards,
> Radar
>
> On Fri, Jul 7, 2017 at 1:34 PM, Ed Espino  wrote:
>
> > HAWQ Dev,
> >
> > I have been filling out the project's Apache Project Maturity Model at
> the
> > following location:
> > https://cwiki.apache.org/confluence/display/HAWQ/
> > Apache+Project+Maturity+Model
> >
> > In trying to gather information for the Licenses and Copyright section
> > (specifically: LC20 and LC30), I started to review dependent third party
> > components. I reviewed the project's LICENSE file (
> > https://github.com/apache/incubator-hawq/blob/master/LICENSE). In
> > addition,
> > I reviewed the build and install steps from the wiki. I noticed one
> > component (libgsasl 1.8.0) which is listed but isn't listed in the
> LICENSE
> > file. This component is a build dependency of the libhdfs3 component (
> > https://github.com/apache/incubator-hawq/blob/master/
> > depends/libhdfs3/README.md).
> > The component libhdfs3 will not build without the development
> > libgsasl (header and libs) available. The component libgsasl is the GNU
> > SASL Library (https://www.gnu.org/software/gsasl/).
> >
> > Here is it's licensing information:
> >
> > The core GNU SASL library, and most mechanisms, are licensed under the
> GNU
> > Lesser General Public version 2.1 (or later). It is distributed
> separately,
> > as the "libgsasl" package. The GNU SASL command line application, self
> test
> > suite and more are licensed under the GNU General Public License version
> 3
> > (or later). The "gsasl" package distribution includes the library part as
> > well, so you do not need to install two packages.
> >
> > To the best of my knowledge, this dependent component libgsasl is not
> > compatible with Apache Software projects. Have there been any discussions
> > in the past on its use in HAWQ? I couldn't find any in the mail
> archives. I
> > could be entirely wrong and this could be a false alarm. If anyone can
> > provide some information, it would be greatly appreciated.
> >
> > Regards,
> > -=e
> >
> > --
> > *Ed Espino*
> >
>


Re: how to report a bug?

2017-07-05 Thread Lei Chang
Hi Harald, Thanks for your contribution to HAWQ.

Yes, you can create a bug ticket in Jira (
https://issues.apache.org/jira/projects/HAWQ). The fix version can be set to backlog,
and if you are not sure about the component, please leave it blank.

Cheers
Lei



On Thu, Jul 6, 2017 at 10:22 AM, Harald Bögeholz <
harald.boegeh...@monash.edu> wrote:

> Hi,
>
>
> I am relatively new to HAWQ. Nevertheless I think I may have found a bug
> and would like to know how to go about reporting it. I have looked at
> Jira, but I'm new to that as well.
>
> Do I just create a bug ticket in Jira? I'm unsure what to put in the
> "Component/s" and "Fix Version/s" fields.
>
> Short description: I found that HAWQ segments keep open file handles on
> deleted files thereby keeping the disk space from being released. I
> believe this to be a bug since I can't imagine why this should be
> desirable behavior.
>
> What to do about it? Advice for a newbie would be appreciated.
>
>
> Harald
>
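
For the symptom Harald describes, one way to confirm it on Linux is to look for
file descriptors whose target files have been deleted but are still held open
by segment processes. The sketch below is a generic diagnostic, not part of
HAWQ; filtering on the process name "postgres" is an assumption (HAWQ segments
are PostgreSQL-derived) and may need adjusting.

# Generic Linux diagnostic sketch (not part of HAWQ): walk /proc/<pid>/fd and
# report descriptors whose targets have been deleted, i.e. files whose disk
# space cannot be released while the process keeps them open.
import os

def deleted_open_files(name_filter="postgres"):
    hits = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/comm" % pid) as f:
                comm = f.read().strip()
            if name_filter and name_filter not in comm:
                continue
            fd_dir = "/proc/%s/fd" % pid
            for fd in os.listdir(fd_dir):
                target = os.readlink(os.path.join(fd_dir, fd))
                if target.endswith("(deleted)"):
                    hits.append((int(pid), comm, target))
        except (OSError, IOError):
            continue  # process exited or permission denied; skip it
    return hits

if __name__ == "__main__":
    for pid, comm, target in deleted_open_files():
        print("%d %s %s" % (pid, comm, target))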


Re: Podling Report - July 2017

2017-07-05 Thread Lei Chang
Thanks Ed.

Cheers
Lei


On Wed, Jul 5, 2017 at 11:57 PM, Ed Espino  wrote:

> FYI: I performed a review of the recently posted HAWQ Podling Report - July
> 2017 (https://wiki.apache.org/incubator/July2017).  I made some edits.
> Please review and provide feedback as soon as possible.
>
> Cheers,
> -=e
>
> --
> *Ed Espino*
>


Re: Podling Report Reminder - July 2017

2017-06-28 Thread Lei Chang
Thanks Alexander, talk added.

Cheers
Lei


---

HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the key
technological advantages of MPP database with the scalability and
convenience
of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
delivers industry-leading performance and linear scalability. It provides
users the tools to confidently and successfully interact with petabyte range
data sets. HAWQ provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

  1. Continue to improve the project's release cadence. To this end we
 plan on expanding automation services to support increased
 developer participation.

  2. Continue to expand the community, by adding new contributors and
 focusing on making sure that there's a much more robust level of
 conversations and discussions happening around roadmaps and
 feature development on the public dev mailing list.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 Nothing urgent at this time.


How has the community developed since the last report?

1.  Conference Talks (6):

* The Big Data Engine in the Cloud Era, CSDN Cloud Computing Technology
Conference (Speaker: Zhenglin Tao, Oushu Inc, May 19, 2017)

* HAWQ Introduction. China International Big Data Industry Expo 2017
(Speaker: Lan Zhou, Oushu Inc, May 26, 2017)

* Apache HAWQ: Open Source MPP++ Database, The 12th China Open Source World
Summit, (Speaker: Lei Chang, Oushu Inc, June 21, 2017)

* Introduction to latest technology in HAWQ 2.X, the 8th Database
Technology Conference China (Speaker: Lili Ma, Pivotal, May 13, 2017)

* Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache
Ranger Extensibility Framework & Case Study for Integration
with Apache Hawq, DataWorks Summit, https://www.youtube.com/watch?
v=6SE32zrgIAU (Speaker: Ramesh Mani, Hortonworks & Alexander Denissov,
Pivotal, June 13, 2017)

* Hawq Meets Hive - Querying Unmanaged Data,  DataWorks Summit,
https://www.youtube.com/watch?v=sjlZJvHx1hM (Speaker: Shivram Mani
& Alex (Oleksandr) Diachenko, Pivotal, June 14, 2017)

* "Podling Shark" lightning session, ApacheCon, (Speaker: Aleksandr
Diachenko and Alexander Denissov, Pivotal, May 18, 2017)



2. Active contributions from around 20 different community contributors
since last report.


How has the project developed since the last report?

1. Apache HAWQ 2.2.0.0 release is in voting process on dev mailing list
2. HAWQ 2.2.0.0 release information:

1) release page: https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.2.0.0-incubating+Release
2) 80 issues/tasks fixed: https://issues.apache.org/jira/browse/HAWQ-1489?jql=filter%3D12339844%20%20


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [X] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-02-28


When were the last committers or PMC members elected?

  Committers (1) added:  Xiang Sheng (https://github.com/stanlyxiang)


Signed-off-by:

 [](hawq) Alan Gates
   Comments:
 [](hawq) Konstantin Boudnik
   Comments:
 [](hawq) Justin Erenkrantz
   Comments:
 [](hawq) Thejas Nair
   Comments:
 [](hawq) Roman Shaposhnik
   Comments:

IPMC/Shepherd notes:


On Thu, Jun 29, 2017 at 12:56 AM, Alexander Denissov <adenis...@pivotal.io>
wrote:

> For the conference talks, we (Aleksandr Diachenko and Alexander Denissov)
> conducted a "Podling Shark" lightning session, presenting HAWQ to Apache
> Board members at ApacheCon Miami 2017 on May 18, 2017.
>
> --
> Thanks,
> Alex.
>
> On Tue, Jun 27, 2017 at 11:55 PM, Lei Chang <chang.lei...@gmail.com>
> wrote:
>
> > Thanks Lili/Ed, Please see the revised version below, good suggestion and
> > added the contributor company information too.
> >
> > 
> > -
> > HAWQ
> >
> > Apache HAWQ is a Hadoop native SQL query engine that combines the key
> > technological advantages of MPP database with the scalability and
> > convenience
> > of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
> > delivers industry-leading performance and linear scalability. It provides
> > users the tools to confidently and successfully interact with petabyte
> > range
> > data sets. HAWQ provides users with a complete, standards compliant SQL
> > interface.
> >
> > HAWQ has been incubating since 2015-09-04.
> >
> > Three most important issues to address in the move to

Re: Podling Report Reminder - July 2017

2017-06-28 Thread Lei Chang
Thanks Lili/Ed, Please see the revised version below, good suggestion and
added the contributor company information too.


-
HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the key
technological advantages of MPP database with the scalability and
convenience
of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
delivers industry-leading performance and linear scalability. It provides
users the tools to confidently and successfully interact with petabyte range
data sets. HAWQ provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

  1. Continue to improve the project's release cadence. To this end we
 plan on expanding automation services to support increased
 developer participation.

  2. Continue to expand the community, by adding new contributors and
 focusing on making sure that there's a much more robust level of
 conversations and discussions happening around roadmaps and
 feature development on the public dev mailing list.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 Nothing urgent at this time.


How has the community developed since the last report?

1.  6 Conference Talks:

* The Big Data Engine in the Cloud Era, CSDN Cloud Computing Technology
Conference (Speaker: Zhenglin Tao, Oushu Inc, May 19, 2017)

* HAWQ Introduction. China International Big Data Industry Expo 2017
(Speaker: Lan Zhou, Oushu Inc, May 26, 2017)

* Apache HAWQ: Open Source MPP++ Database, The 12th China Open Source World
Summit, (Speaker: Lei Chang, Oushu Inc, June 21, 2017)

* Introduction to latest technology in HAWQ 2.X, the 8th Database
Technology Conference China (Speaker: Lili Ma, Pivotal, May 13, 2017)

* Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache
Ranger Extensibility Framework & Case Study for Integration
with Apache Hawq, DataWorks Summit, https://www.youtube.com/watch?
v=6SE32zrgIAU (Speaker: Ramesh Mani, Hortonworks & Alexander Denissov,
Pivotal, June 13, 2017)

* Hawq Meets Hive - Querying Unmanaged Data,  DataWorks Summit,
https://www.youtube.com/watch?v=sjlZJvHx1hM (Speaker: Shivram Mani
& Alex (Oleksandr) Diachenko, Pivotal, June 14, 2017)


2. Active contributions from around 20 different community contributors
since last report.


How has the project developed since the last report?

1. Apache HAWQ 2.2.0.0 release is in voting process on dev mailing list
2. HAWQ 2.2.0.0 release information:

1) release page: https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.2.0.0-incubating+Release
2) 80 issues/tasks fixed: https://issues.apache.org/jira/browse/HAWQ-1489?jql=filter%3D12339844%20%20


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [X] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-02-28


When were the last committers or PMC members elected?

  Committers (1) added:  Xiang Sheng (https://github.com/stanlyxiang)


Signed-off-by:

 [](hawq) Alan Gates
   Comments:
 [](hawq) Konstantin Boudnik
   Comments:
 [](hawq) Justin Erenkrantz
   Comments:
 [](hawq) Thejas Nair
   Comments:
 [](hawq) Roman Shaposhnik
   Comments:

IPMC/Shepherd notes:


On Wed, Jun 28, 2017 at 2:18 PM, Ed Espino <eesp...@pivotal.io> wrote:

> Nice start Lei.
>
> Here are two more conference talks that were given at the 2017
> DataWorks Summit. For transparency purposes, might it make sense to
> list the companies the dev community members are affiliated with?
>
> --
>
> 2017 DataWorks Summit
> Tuesday, June 13th
>
> Session: Extending Apache Ranger Authorization Beyond Hadoop: Review
> of Apache Ranger Extensibility Framework & Case Study for Integration
> with Apache Hawq
>
> Video: https://www.youtube.com/watch?v=6SE32zrgIAU
>
>   Ramesh Mani
>   Staff Software Engineer
>   Hortonworks
>
>   Alexander Denissov
>   Software Architect
>   Pivotal
>
> --
>
> 2017 DataWorks Summit
> Wednesday, June 14th
>
> Session: Hawq Meets Hive - Querying Unmanaged Data
>
> Video: https://www.youtube.com/watch?v=sjlZJvHx1hM
>
>   Shivram Mani
>   Principal Engineering Manager
>   Pivotal
>
>   Alex (Oleksandr) Diachenko
>   Staff Software Engineer
>   Pivotal
>
> --
>
>
> On Tue, Jun 27, 2017 at 11:08 PM, Lili Ma <lil...@apache.org> wrote:
>
>

Re: Podling Report Reminder - July 2017

2017-06-27 Thread Lei Chang
Hi, Guys, Please see the following podling report draft, comments welcomed!

-
HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the key
technological advantages of MPP database with the scalability and
convenience
of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
delivers industry-leading performance and linear scalability. It provides
users the tools to confidently and successfully interact with petabyte range
data sets. HAWQ provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

  1. Continue to improve the project's release cadence. To this end we
 plan on expanding automation services to support increased
 developer participation.

  2. Continue to expand the community, by adding new contributors and
 focusing on making sure that there's a much more robust level of
 conversations and discussions happening around roadmaps and
 feature development on the public dev mailing list.

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

 Nothing urgent at this time.


How has the community developed since the last report?

1. Three Conference Talks:

* The Big Data Engine in the Cloud Era, CSDN Cloud Computing Technology
Conference (Speaker: Zhenglin Tao, May 19, 2017)
* HAWQ Introduction, China International Big Data Industry Expo 2017
(Speaker: Lan Zhou, May 26, 2017)
* Apache HAWQ: Open Source MPP++ Database (Speaker: Lei Chang, The 12th
China Open Source World Summit)

2. Active contributions from around 20 different community contributors
since last report.


How has the project developed since the last report?

1. Apache HAWQ 2.2.0.0 release is in voting process on dev mailing list
2. HAWQ 2.2.0.0 release information:

1) release page:
https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.2.0.0-incubating+Release
2) 80 issues/tasks fixed:
https://issues.apache.org/jira/browse/HAWQ-1489?jql=filter%3D12339844%20%20


How would you assess the podling's maturity?
Please feel free to add your own commentary.

  [ ] Initial setup
  [ ] Working towards first release
  [X] Community building
  [X] Nearing graduation
  [ ] Other:

Date of last release:

  2017-02-28


When were the last committers or PMC members elected?

  Committers (3) added:
Lisa Owen: January 31, 2017
Jane Beckman: February 1, 2017
Kyle Dunn: March 2, 2017

  PPMC member (1) added:
Paul Guo: March 10, 2017

Signed-off-by:

 [](hawq) Alan Gates
   Comments:
 [](hawq) Konstantin Boudnik
   Comments:
 [](hawq) Justin Erenkrantz
   Comments:
 [](hawq) Thejas Nair
   Comments:
 [](hawq) Roman Shaposhnik
   Comments:

IPMC/Shepherd notes:



On Wed, Jun 28, 2017 at 7:54 AM, <johndam...@apache.org> wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 19 July 2017, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, July 05).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/July2017
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>


Re: Hawq repositories

2017-05-25 Thread Lei Chang
David, this looks like a great proposal; it will ease dev & doc synchronization
a lot.

Cheers
Lei


On Fri, May 26, 2017 at 6:25 AM, David Yozie  wrote:

> Great - my proposal then is to merge the full contents of the
> incubator-hawq-docs master branch into a new "hawq-docs" subdirectory of
> the incubator-hawq source repo.  This is to avoid any conflicts with the
> existing "docs" directory, which contains SGML files used for command-line
> tool help.
>
> After that, I can modify the README in incubator-hawq-docs to point to the
> correct location and/or make it a read-only repo.
>
> Let me know if there are any concerns about the above, or other suggestions
> for managing the docs source alongside the code.
>
> Thanks,
>
> -David
>
> On Thu, May 25, 2017 at 11:28 AM, Shivram Mani 
> wrote:
>
> > David, that would be great !
> > We could address incubator-hawq-site docs later given the nature of its
> > document source.
> >
> > On Wed, May 24, 2017 at 2:43 PM, David Yozie  wrote:
> >
> >> Shivram: I was just about to propose moving the incubator-hawq-docs
> >> source into a subdirectory of incubator-hawq. That organization seems to
> >> be more common with Apache projects and I'm hopeful that having docs
> >> source alongside the code will encourage more doc contributions (i.e., PRs
> >> with both the code changes and doc changes).
> >>
> >> Most of the docs on incubator-hawq-site are just HTML files generated
> >> from the source.  I don't think we'd necessarily want to combine that
> >> compiled output alongside source.
> >>
> >> -David
> >>
> >> On Wed, May 24, 2017 at 2:07 PM, Shivram Mani 
> >> wrote:
> >>
> >>> Why can't we integrate the hawq docs and the hawq incubator site docs
> >>> repositories as part of the incubator-hawq repository? It's a bit
> >>> cumbersome for new committers who intend to contribute to docs to keep
> >>> track of three different repositories and also request permission on
> >>> each of them.
> >>>
> >>> --
> >>> shivram mani
> >>>
> >>
> >>
> >
> >
> > --
> > shivram mani
> >
>


Re: Edit privileges for HAWQ wiki

2017-05-24 Thread Lei Chang
Hi Lav,

I just added you. you can have a try now.

Cheers
Lei


On Wed, May 24, 2017 at 11:09 AM, Ruilong Huo  wrote:

> You can shoot an email to gene...@incubator.apache.org for the permission.
>
> Best regards,
> Ruilong Huo
>
> On Wed, May 24, 2017 at 9:17 AM, Lav Jain  wrote:
>
> > Hi,
> >
> > I would like to contribute to the HAWQ wiki. Can you please grant me
> > access?
> >
> > My id is: *lavjain*
> >
> >
> > Regards,
> >
> > *Lav Jain*
> >
>


Re: Layout of LICENSE, NOTICE, and DISCLAIMER files for Apache HAWQ 2.2.0.0-incubating rpm binary release

2017-05-22 Thread Lei Chang
I think Option 2 is better; it is clearer.

Cheers
Lei



On Mon, May 22, 2017 at 10:15 AM, Ruilong Huo  wrote:

> Hi Roman,
>
> Please let us know if you get a chance to review this. Or someone else who
> can help on this? Thanks.
>
> Best regards,
> Ruilong Huo
>
> On Mon, May 15, 2017 at 3:12 PM, Ruilong Huo  wrote:
>
> > Hi Roman,
> >
> > Currently I am preparing LICENSE, NOTICE, and DISCLAIMER files for Apache
> > HAWQ 2.2.0.0-incubating rpm binary release. The components of the binary
> > package
> >  2.2.0.0-incubating.RC2/apache-hawq-rpm-2.2.0.0-incubating.tar.gz>
> > are as below:
> >
> > *> tar -xzvf apache-hawq-rpm-2.2.0.0-incubating.tar.gz; tree
> > hawq_rpm_packages*
> > hawq_rpm_packages
> > ├── apache-hawq-2.2.0.0-el7.x86_64.rpm
> > ├── apache-tomcat-7.0.62-el6.noarch.rpm
> > ├── hawq-ranger-plugin-2.2.0.0-1.el7.centos.noarch.rpm
> > ├── pxf-3.2.1.0-1.el6.noarch.rpm
> > ├── pxf-hbase-3.2.1.0-1.el6.noarch.rpm
> > ├── pxf-hdfs-3.2.1.0-1.el6.noarch.rpm
> > ├── pxf-hive-3.2.1.0-1.el6.noarch.rpm
> > ├── pxf-jdbc-3.2.1.0-1.el6.noarch.rpm
> > ├── pxf-json-3.2.1.0-1.el6.noarch.rpm
> > └── pxf-service-3.2.1.0-1.el6.noarch.rpm
> >
> > Given the LICENSE, NOTICE, and DISCLAIMER for Apache HAWQ source in top
> > directory:
> >
> > *> tree incubator-hawq/*
> > incubator-hawq/
> > ├── DISCLAIMER
> > ├── LICENSE
> > └── NOTICE
> >
> > We plan to put the LICENSE, NOTICE, and DISCLAIMER for the binary release in a
> > dedicated directory named dist under the top directory. Then these files
> > will be copied into the rpm packages during the packaging stage.
> >
> > Here are two options for the layout of the LICENSE, NOTICE, and
> DISCLAIMER
> > for the components:
> >
> > *Option 1: Combine the licenses of all the components into one LICENSE,
> > NOTICE, and DISCLAIMER respectively. For example:*
> >
> > *> cd $APACHE_HAWQ_TOP_DIR; tree dist*
> > dist
> > ├── DISCLAIMER
> > ├── LICENSE
> > └── NOTICE
> >
> > *Option 2: Keep the separated LICENSE, NOTICE, and DISCLAIMER for each of
> > the components. For example:*
> >
> > *> cd $APACHE_HAWQ_TOP_DIR; tree dist/*
> > dist/
> > ├── hawq
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-hbase
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-hdfs
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-hive
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-jdbc
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-json
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── pxf-service
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > ├── ranger-plugin
> > │   ├── DISCLAIMER
> > │   ├── LICENSE
> > │   └── NOTICE
> > └── tomcat
> >     ├── DISCLAIMER
> >     ├── LICENSE
> >     └── NOTICE
> >
> > For option 1, it is easier to maintain the LICENSE, NOTICE, and DISCLAIMER
> > files. However, they contain all the licenses for all the components, so
> > it is hard to identify which component carries which licenses.
> >
> > For option 2, it needs extra maintenance effort, but it is clear what the
> > licenses are for each of the components.
> >
> > Would you please share your comments and let us know which is better?
> > Thanks.
> >
> > Best regards,
> > Ruilong Huo
> >
>


Re: ALTER ROLE SET Statements

2017-05-11 Thread Lei Chang
In the HAWQ 1.0 release, time was limited, so a lot of functionality is not
supported.

There is no other special reason.

Cheers
Lei



On Thu, May 11, 2017 at 11:11 AM, Vineet Goel  wrote:

> Hi all,
>
> Does anyone have some historical context on why 'alter role set' statements
> are not supported ?
>
> https://github.com/apache/incubator-hawq/blob/master/src/backend/tcop/utility.c#L1654
>
> gpadmin=# alter role gpadmin SET search_path TO myschema;
> ERROR:  Cannot support alter role set statement yet
>
> Thanks
> Vineet
>


Re: [VOTE] New committer: Xiang Sheng

2017-05-09 Thread Lei Chang
Hi Wen, I think this vote should be on the private@ list.

Cheers
Lei



On Tue, May 9, 2017 at 4:15 PM, Lei Chang <chang.lei...@gmail.com> wrote:

>
> +1.
>
> On Tue, May 9, 2017 at 3:54 PM, Yi Jin <y...@pivotal.io> wrote:
>
>> +1
>>
>> Xiang has shown great code contributions to the Apache HAWQ project, including
>> fixing bugs and contributing new features. His work shows his deep insight
>> into this project and effective development work, and he is also active on both
>> the user and dev mailing lists. Therefore, I think he deserves to be an Apache
>> HAWQ committer.
>>
>> Best,
>> Yi
>>
>> On Tue, May 9, 2017 at 5:40 PM, Lili Ma <lil...@apache.org> wrote:
>>
>> > +1 for Xiang!
>> >
>> > Xiang has contributed a lot to the Apache HAWQ project, including Ranger
>> > integration, the HAWQ Register implementation, and Resource Manager bug
>> > fixes. Xiang has also answered a lot of HAWQ questions on the user/dev
>> > mailing lists and the StackOverflow channel, and he gave a tech talk at the
>> > Apache HAWQ Meetup.
>> >
>> > I think Xiang well deserves to become an Apache HAWQ committer.
>> >
>> > Thanks
>> > Lili
>> >
>> > 2017-05-09 14:18 GMT+08:00 Wen Lin <w...@pivotal.io>:
>> >
>> > > Hi All,
>> > >
>> > > This is a VOTE email for promoting candidate *Xiang Sheng* (with github
>> > > id *stanlyxiang*) from contributor to committer, who has been contributing
>> > > to Apache HAWQ (incubating) over the last one and a half years (from Nov.
>> > > 2015 to May 2017). Please give +1, 0 or -1 with reasons in this email thread.
>> > >
>> > > His contribution includes (but not limited to):
>> > > *Direct contribution to code base:*
>> > >
>> > > - 43 commits in total, with some major components of hawq involved,
>> > >   including contributions to Apache Ranger integration, hawq register and
>> > >   command line tools, and the resource manager:
>> > >   https://github.com/apache/incubator-hawq/commits?author=stanlyxiang
>> > > - 41 closed PRs:
>> > >   https://github.com/apache/incubator-hawq/pulls?q=is%3Apr+is%3Aclosed+author%3Astanlyxiang
>> > > - 13 improvements including documentation, test, build, command line
>> > >   tools, and code refactoring:
>> > >   - HAWQ-140 <https://issues.apache.org/jira/browse/HAWQ-140> Add more
>> > >     information in HAWQ build instructions file
>> > >   - HAWQ-143 <https://issues.apache.org/jira/browse/HAWQ-143> Add
>> > >     information in Apache-HAWQ README.md
>> > >   - HAWQ-154 <https://issues.apache.org/jira/browse/HAWQ-154> Update
>> > >     BUILD_INSTRUCTIONS file for dependency install method and ambiguous wording
>> > >   - HAWQ-203 <https://issues.apache.org/jira/browse/HAWQ-203> Add a guc
>> > >     for debug metadata, data locality time stats
>> > >   - HAWQ-265 <https://issues.apache.org/jira/browse/HAWQ-265> Change the
>> > >     metadata shared-memory flush strategy to prevent out-of-shared-memory
>> > >     problems when creating too many hdfs_file
>> > >   - HAWQ-279 <https://issues.apache.org/jira/browse/HAWQ-279> Add 2 gucs
>> > >     in template-hawq-site
>> > >   - HAWQ-284 <https://issues.apache.org/jira/browse/HAWQ-284> Add a udf
>> > >     for new metadata flush strategy testing
>> > >   - HAWQ-313 <https://issues.apache.org/jira/browse/HAWQ-313> Fix
>> > >     dereference of pointer before null check
>> > >   - HAWQ-475 <https://issues.apache.org/jira/browse/HAWQ-475> Add
>> > >     build_type gcov for code coverage
>> > >   - HAWQ-486 <https://issues.apache.org/jira/browse/HAWQ-486> gpcheck
>> > >     can't find namenode with Ambari-installed PHD
>> > >   - HAWQ-498 <https://issues.apache.org/jira/browse/HAWQ-498> Update
>> > >     property value in gpcheck.cnf

Re: [VOTE] New committer: Xiang Sheng

2017-05-09 Thread Lei Chang
+1.

On Tue, May 9, 2017 at 3:54 PM, Yi Jin  wrote:

> +1
>
> Xiang has shown great code contributions to the Apache HAWQ project, including
> fixing bugs and contributing new features. His work shows his deep insight
> into this project and effective development work, and he is also active on both
> the user and dev mailing lists. Therefore, I think he deserves to be an Apache
> HAWQ committer.
>
> Best,
> Yi
>
> On Tue, May 9, 2017 at 5:40 PM, Lili Ma  wrote:
>
> > +1 for Xiang!
> >
> > Xiang has contributed a lot to the Apache HAWQ project, including Ranger
> > integration, the HAWQ Register implementation, and Resource Manager bug
> > fixes. Xiang has also answered a lot of HAWQ questions on the user/dev
> > mailing lists and the StackOverflow channel, and he gave a tech talk at the
> > Apache HAWQ Meetup.
> >
> > I think Xiang well deserves to become an Apache HAWQ committer.
> >
> > Thanks
> > Lili
> >
> > 2017-05-09 14:18 GMT+08:00 Wen Lin :
> >
> > > Hi All,
> > >
> > > This is a VOTE email for promoting candidate *Xiang Sheng* (with github
> > > id *stanlyxiang*) from contributor to committer, who has been contributing
> > > to Apache HAWQ (incubating) over the last one and a half years (from Nov.
> > > 2015 to May 2017). Please give +1, 0 or -1 with reasons in this email thread.
> > >
> > > His contribution includes (but not limited to):
> > > *Direct contribution to code base:*
> > >
> > > - 43 commits in total, with some major components of hawq involved,
> > >   including contributions to Apache Ranger integration, hawq register and
> > >   command line tools, and the resource manager:
> > >   https://github.com/apache/incubator-hawq/commits?author=stanlyxiang
> > > - 41 closed PRs:
> > >   https://github.com/apache/incubator-hawq/pulls?q=is%3Apr+is%3Aclosed+author%3Astanlyxiang
> > > - 13 improvements including documentation, test, build, command line
> > >   tools, and code refactoring:
> > >   - HAWQ-140 Add more information in HAWQ build instructions file
> > >   - HAWQ-143 Add information in Apache-HAWQ README.md
> > >   - HAWQ-154 Update BUILD_INSTRUCTIONS file for dependency install
> > >     method and ambiguous wording
> > >   - HAWQ-203 Add a guc for debug metadata, data locality time stats
> > >   - HAWQ-265 Change the metadata shared-memory flush strategy to prevent
> > >     out-of-shared-memory problems when creating too many hdfs_file
> > >   - HAWQ-279 Add 2 gucs in template-hawq-site
> > >   - HAWQ-284 Add a udf for new metadata flush strategy testing
> > >   - HAWQ-313 Fix dereference of pointer before null check
> > >   - HAWQ-475 Add build_type gcov for code coverage
> > >   - HAWQ-486 gpcheck can't find namenode with Ambari-installed PHD
> > >   - HAWQ-498 Update property value in gpcheck.cnf
> > >   - HAWQ-1430 Update ranger-related log level to avoid log flood
> > >   - HAWQ-77 Fix source code comment for new ALTER/CREATE RESOURCE QUEUE
> > > - 10 bug fixes including test failures, resource manager, core dumps,
> > >   command line tools, and build components:
> > >   - HAWQ-295 New metadata flush strategy removes 1 entry every flush due
> > >     to a wrong flush condition
> > >   - HAWQ-998 Fix test for aggregate-with-null test
> > >   - HAWQ-1051 Failing in reverse DNS lookup causes resource manager core dump
> > >   - HAWQ-1076 Fixed USAGE privilege bug on nextval(sequence) when optimizer on
> > >   - HAWQ-1117 RM crash when init db after configure with param '--enable-cassert'
> > >   - HAWQ-1160 Hawq

Re: Impala vs Greenplum

2017-05-02 Thread Lei Chang
I remember a benchmark from several months ago (not public) comparing
HAWQ and other SQL-on-Hadoop engines on the TPC-DS benchmark; HAWQ was much
faster. Different vendors might get different benchmark results, since
different tuning is applied to different engines. And there were a lot of
discussions around how to improve the HAWQ executor before HAWQ was open
sourced, including vectorization, codegen, and new hardware, among others.

@Michael, I also think it is a good time to discuss how to build a new HAWQ
executor with various new optimizations. This may potentially improve the
query performance a lot.

I have started a JIRA on this topic (
https://issues.apache.org/jira/browse/HAWQ-1450). Hope that we can have a
design and start working on this soon.

Thanks
Lei


On Wed, May 3, 2017 at 6:03 AM, Michael André Pearce <
michael.andre.pea...@me.com> wrote:

> Indeed, the intent was much less about "mine is bigger than yours".
>
> It was more to question whether the result is real, and if so, whether there
> are ideas or improvements that could be learnt from the approaches Impala
> has taken that could be used in hawq?
>
> https://www.slideshare.net/mobile/cloudera/impala-performance-update
> http://www.sciencedirect.com/science/article/pii/S0164121216302400
>
> Likewise, are we benefitting at all from the upstream Greenplum sister
> project, e.g. codegen?
>
> Yes, we know it was Greenplum in the results, but HAWQ is its sister, and the
> results are indicative.
>
> Cheers
> Mike
>
>
>
> Sent from my iPad
>
> On 1 May 2017, at 23:27, Konstantin Boudnik  wrote:
>
> With my Apache hat on, I'd like to say that it is of little, if any at
> all, relevance to the Apache projects what companies like Cloudera say
> about their internal benchmarks.
>
> Apache projects do not compete between each other nor with any
> commercial products. While it is completely ok to say "official
> release of Apache Foo" was x percent faster than "official release of
> Apache Bar" somewhere in Apache Foo's blog or something, it is
> unacceptable for Apache Foo to get into pissing contest with something
> forked from Apache Bar and sold by a commercial entity as a part of
> their offering (sometimes it is even impossible to say what exactly
> the entity in question is selling).
>
> In other words - let's not get into one of these "My Hadoop is bigger
> than yours" [1] moments again.
>
> But by all means - let's discuss the technicalities of bringing more
> efficient code generation code into the project, etc.
>
> [1] https://gigaom.com/2011/12/19/my-hadoop-is-bigger-than-yours/
>
> --
> With regards,
>  Cos
>
> 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
>
> Disclaimer: Opinions expressed in this email are those of the author,
> and do not necessarily represent the views of any company the author
> might be affiliated with at the moment of writing.
>
>
> On Mon, May 1, 2017 at 2:59 PM, Michael André Pearce
>  wrote:
>
> No doubt if not already seen cloudera announced the following blog
>
>
> http://blog.cloudera.com/blog/2017/04/apache-impala-leads-
> traditional-analytic-database/
>
>
> A clear shot across the bows of hawq.
>
>
> Also how does hawq really compare? There is some old/dated hawq performance
>
> blogs, Should it be something that is updated?
>
>
> For the hawq community it be good to know how long till hawq would get
>
> upstream green plum improvements like codegen.
>
>
> Likewise what features or changes have impala implemented to make it leap
>
> frog greenplum/hawq soo much? Are any of the changes portable to hawq?
>
>
>
>
> Sent from my iPad
>
>


[jira] [Created] (HAWQ-1450) New HAWQ executor with vectorization & possible code generation

2017-05-02 Thread Lei Chang (JIRA)
Lei Chang created HAWQ-1450:
---

 Summary: New HAWQ executor with vectorization & possible code 
generation
 Key: HAWQ-1450
 URL: https://issues.apache.org/jira/browse/HAWQ-1450
 Project: Apache HAWQ
  Issue Type: New Feature
  Components: Query Execution
Reporter: Lei Chang
Assignee: Lei Chang
 Fix For: backlog



Most HAWQ executor code is inherited from postgres & gpdb. Let's discuss how to
build a new hawq executor with vectorization and possibly code generation.
These optimizations may potentially improve query performance a lot.
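
For intuition only, here is a minimal, hypothetical sketch (plain C, not HAWQ
code) of the batch-at-a-time style that "vectorization" refers to: an operator
consumes a whole column batch plus a selection vector per call, instead of one
tuple per call, so the inner loop stays tight and is easy for the compiler to
auto-vectorize. Code generation would go further and emit such loops
specialized for each query. The struct layout and names below are illustrative
assumptions, not an existing HAWQ API.

```
/* Concept sketch only, not HAWQ code: a batch-at-a-time ("vectorized")
 * filter + aggregate over a single column, driven by a selection vector. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define BATCH_SIZE 1024

typedef struct ColumnBatch
{
    int64_t vals[BATCH_SIZE];   /* one column, decomposed from tuples */
    uint8_t sel[BATCH_SIZE];    /* selection vector: 1 = row passed the qual */
    size_t  nrows;              /* number of rows present in this batch */
} ColumnBatch;

/* One call processes a whole batch instead of one tuple per call; the loop
 * body is branch-light, so the compiler can auto-vectorize it. */
static int64_t
sum_qualifying(const ColumnBatch *b)
{
    int64_t sum = 0;
    for (size_t i = 0; i < b->nrows; i++)
        sum += b->sel[i] ? b->vals[i] : 0;
    return sum;
}

int
main(void)
{
    ColumnBatch b = { .nrows = BATCH_SIZE };
    for (size_t i = 0; i < b.nrows; i++)
    {
        b.vals[i] = (int64_t) i;
        b.sel[i]  = (uint8_t) (i % 2 == 0);  /* pretend qual: even rows pass */
    }
    printf("sum of qualifying rows = %lld\n", (long long) sum_qualifying(&b));
    return 0;
}
```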





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Slow "make distclean"

2017-04-25 Thread Lei Chang
This user experience is not good, especially when the network is bad, which
is quite common.

Is there any chance we could just delete the output in the "make distclean"
step instead of downloading and then cleaning?

Cheers
Lei



On Wed, Apr 26, 2017 at 2:16 AM, Alex (Oleksandr) Diachenko <
odiache...@pivotal.io> wrote:

> Hi Paul,
>
> It's unlikely that you would run distclean without having built the project first,
> so once you have the jars in the Gradle cache, the download phase will be skipped.
>
> Regards, Alex.
>
> On Mon, Apr 24, 2017 at 8:28 PM, stanly sheng 
> wrote:
>
> > just the first time to download I think.
> >
> > 2017-04-25 10:51 GMT+08:00 Paul Guo :
> >
> > > I just tired to build a hawq from scratch, however "make distclean" is
> > > quite slow in the following step,
> > >
> > > make[1]: Leaving directory `/data/github/incubator-hawq2/src'
> > > make -C pxf distclean
> > > make[1]: Entering directory `/data/github/incubator-hawq2/pxf'
> > > ./gradlew clean
> > >   % Total% Received % Xferd  Average Speed   TimeTime Time
> > >  Current
> > >  Dload  Upload   Total   SpentLeft
> > >  Speed
> > >   0 00 00 0  0  0 --:--:--  0:00:01
> --:--:--
> > >   0
> > > 100 66.6M  100 66.6M0 0   372k  0  0:03:03  0:03:03
> --:--:--
> > >  606k
> > > ~/.gradle/wrapper/dists/gradle-2.13-all/f1e30dfd9ad21c8d22d8fa4c664648
> 3a
> > > /data/github/incubator-hawq2/pxf
> > > /data/github/incubator-hawq2/pxf
> > > Download
> > > http://repo1.maven.org/maven2/com/netflix/nebula/gradle-
> > > ospackage-plugin/2.2.6/gradle-ospackage-plugin-2.2.6.pom
> > > Download
> > > http://repo1.maven.org/maven2/de/undercouch/gradle-download-
> > > task/2.1.0/gradle-download-task-2.1.0.pom
> > > Download
> > > http://repo1.maven.org/maven2/com/netflix/nebula/gradle-
> > > aggregate-javadocs-plugin/2.2.1/gradle-aggregate-javadocs-
> > plugin-2.2.1.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/redline-rpm/redline/1.2.1/
> > > redline-1.2.1.pom
> > > Download http://repo1.maven.org/maven2/org/vafer/jdeb/1.4/jdeb-1.4.pom
> > > Download
> > > http://repo1.maven.org/maven2/com/bmuschko/gradle-docker-
> > > plugin/2.0.3/gradle-docker-plugin-2.0.3.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant/1.9.1/ant-1.9.1.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant-parent/1.9.
> > > 1/ant-parent-1.9.1.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.5/
> > > slf4j-api-1.7.5.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/slf4j/slf4j-parent/1.7.5/
> > > slf4j-parent-1.7.5.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/commons/commons-
> > > compress/1.6/commons-compress-1.6.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/commons/commons-
> > > parent/32/commons-parent-32.pom
> > > Download http://repo1.maven.org/maven2/org/tukaani/xz/1.4/xz-1.4.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/bouncycastle/bcpg-jdk15on/
> > > 1.50/bcpg-jdk15on-1.50.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/commons/commons-
> > > compress/1.8/commons-compress-1.8.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant/1.9.3/ant-1.9.3.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant-parent/1.9.
> > > 3/ant-parent-1.9.3.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant-launcher/1.
> > > 9.3/ant-launcher-1.9.3.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/bouncycastle/bcpg-jdk15on/
> > > 1.51/bcpg-jdk15on-1.51.pom
> > > Download
> > > http://repo1.maven.org/maven2/org/bouncycastle/bcprov-
> > > jdk15on/1.51/bcprov-jdk15on-1.51.pom
> > > Download http://repo1.maven.org/maven2/org/tukaani/xz/1.5/xz-1.5.pom
> > > Download
> > > http://repo1.maven.org/maven2/com/netflix/nebula/gradle-
> > > ospackage-plugin/2.2.6/gradle-ospackage-plugin-2.2.6.jar
> > > Download
> > > http://repo1.maven.org/maven2/de/undercouch/gradle-download-
> > > task/2.1.0/gradle-download-task-2.1.0.jar
> > > Download
> > > http://repo1.maven.org/maven2/com/netflix/nebula/gradle-
> > > aggregate-javadocs-plugin/2.2.1/gradle-aggregate-javadocs-
> > plugin-2.2.1.jar
> > > Download
> > > http://repo1.maven.org/maven2/org/redline-rpm/redline/1.2.1/
> > > redline-1.2.1.jar
> > > Download http://repo1.maven.org/maven2/org/vafer/jdeb/1.4/jdeb-1.4.jar
> > > Download
> > > http://repo1.maven.org/maven2/com/bmuschko/gradle-docker-
> > > plugin/2.0.3/gradle-docker-plugin-2.0.3.jar
> > > Download
> > > http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.5/
> > > slf4j-api-1.7.5.jar
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/commons/commons-
> > > compress/1.8/commons-compress-1.8.jar
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant/1.9.3/ant-1.9.3.jar
> > > Download
> > > http://repo1.maven.org/maven2/org/apache/ant/ant-launcher/1.
> > 

Re: s/gettimeofday/clock_gettime/ in hawq?

2017-04-25 Thread Lei Chang
Good catch. There are a lot of uses of gettimeofday() for timeouts.

Note that clock_gettime() is also affected by NTP if NTP moves time backwards,
unless the CLOCK_MONOTONIC clock is used.

Thanks
Lei
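
For illustration, a minimal sketch (assuming Linux and POSIX clock_gettime();
not HAWQ code) of a timeout check driven by CLOCK_MONOTONIC, which keeps
working even if NTP steps the wall clock backwards. On older glibc you may
need to link with -lrt; on platforms without CLOCK_MONOTONIC a different
monotonic source would be needed, which matches the note below about other
platforms.

```
/* Minimal sketch, not HAWQ code: a timeout check based on CLOCK_MONOTONIC,
 * which is not affected by NTP stepping the wall clock the way
 * gettimeofday() is. */
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() on strict libcs */

#include <stdbool.h>
#include <stdio.h>
#include <time.h>

static double
monotonic_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

int
main(void)
{
    double start   = monotonic_seconds();
    double timeout = 0.5;                  /* timeout budget in seconds */

    /* ... do the work that the timeout is guarding ... */

    bool timed_out = (monotonic_seconds() - start) > timeout;
    printf("timed out: %s\n", timed_out ? "yes" : "no");
    return 0;
}
```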


On Mon, Apr 24, 2017 at 1:36 PM, Paul Guo  wrote:

> Hi,
>
> HAWQ sometimes uses the gettimeofday() syscall for timeout checks in various
> modules; however, it can be affected by NTP, so the timeout-checking logic
> could occasionally be wrong. I would propose using clock_gettime() to replace
> it on Linux (I have not investigated the alternative on other platforms,
> e.g. Mac). Both gettimeofday() and clock_gettime() are fast vDSO syscalls,
> so I do not expect a performance loss in cases where there are
> frequent calls to gettimeofday(). By the way, I found some gettimeofday()
> calls in postgresql and gpdb, so they might have this issue as well.
>
> Regards,
>
> Paul
>


Re: Proposal for becoming committer in Apache HAWQ project

2017-04-16 Thread Lei Chang
Nice summary!

Cheers
Lei




On Mon, Apr 17, 2017 at 9:49 AM, Ruilong Huo  wrote:

> Hi All,
>
> To attract more contributions from the open source community and get
> more contributors involved in the Apache HAWQ project, we have the below
> proposal with criteria and a process for becoming a committer:
> https://cwiki.apache.org/confluence/display/HAWQ/Becoming+a+committer
>
> Feel free to share us with your feedback and comments. Thanks.
>


Re: New Committer: Kyle Dunn

2017-03-02 Thread Lei Chang
Congratulations! Kyle.

Cheers
Lei



On Fri, Mar 3, 2017 at 8:30 AM, Kyle Dunn  wrote:

> Thanks team! Honored to be invited to join this great community!
>
>
> -Kyle
>
> On Thu, Mar 2, 2017 at 10:07 AM Roman Shaposhnik 
> wrote:
>
> > Congrats Kyle! Great to have you onboard!
> >
> > Thanks,
> > Roman.
> >
> > On Thu, Mar 2, 2017 at 9:05 AM, Ed Espino  wrote:
> > > The Project Management Committee (PMC) for Apache HAWQ (incubating) has
> > > invited Kyle Dunn to become a committer and we are pleased to announce
> > that
> > > he has accepted.
> > >
> > > His contributions include (but not limited to):
> > >
> > >- Issues submitted: He has reported a total of 10 HAWQ issues
> > >  <https://issues.apache.org/jira/issues/?jql=project%20%3D%20%22Apache%20HAWQ%22%20and%20reporter%20%3D%20kdunn926>
> > >  across significant areas of the HAWQ product
> > >- He is active on the dev list.
> > >- He has recently requested and was granted update privileges to the
> > >  HAWQ wiki. He has updated the RHEL 6.X build instructions.
> > >- He has submitted 5 HAWQ PRs
> > >  <https://github.com/apache/incubator-hawq/pulls?utf8=%E2%9C%93=is%3Apr%20author%3Akdunn926%20>
> > >  covering new feature areas. Not all were accepted but it shows he is
> > >  committed to enhancing HAWQ feature set.
> > >
> > > Being a committer enables easier contribution to the project. This
> should
> > > enable improved productivity.
> > >
> > > Please join us in congratulating him.  We are looking forward to a
> closer
> > > collaboration with the open source community.
> > >
> > > *WELCOME and CONGRATULATIONS Kyle!!!*
> > >
> > > Cheers,
> > > -=e
> > >
> > > --
> > > *Ed Espino*
> >
> --
> *Kyle Dunn | Data Engineering | Pivotal*
> Direct: 303.905.3171 <3039053171> | Email: kd...@pivotal.io
>


Re: hawq Ambari integration

2017-03-02 Thread Lei Chang
I think there are some use cases where we need what Jon proposed.

For example, users who installed HAWQ via Ambari may still want to automate
configuration changes; it would be convenient to have the hawq CLI change the
configuration for them.

Cheers
Lei







On Fri, Mar 3, 2017 at 8:36 AM, Alex (Oleksandr) Diachenko <
odiache...@pivotal.io> wrote:

> I see, that makes sense.
> But is there any action users cannot do via Ambari?
>
> Ranger is also a good example, there we are making assumption,
> users either use Ranger or HAWQ's authorization engine.
>
> The same logic might be extrapolated to HAWQ/Ambari - users might use
> either Ambari or HAWQ CLI, but not both at the same time.
> In that way, we can keep things simple.
>
> Regards, Alex.
>
>
>
> On Thu, Mar 2, 2017 at 4:28 PM, Jon Roberts  wrote:
>
> > Right.  Just like HAWQ will be operational without Ranger.
> >
> > We have the hawq CLI and will obviously continue to have it.  Some people
> > use Ambari while others don't.  So just like with Ranger support,
> integrate
> > when possible but don't require it.
> >
> >
> > Jon
> >
> > On Thu, Mar 2, 2017 at 6:26 PM, Alex (Oleksandr) Diachenko <
> > odiache...@pivotal.io> wrote:
> >
> > > Not really, because HAWQ should be operational even without Ambari(if
> > > that's the case).
> > >
> > > On Thu, Mar 2, 2017 at 4:21 PM, Jon Roberts 
> wrote:
> > >
> > > > If that is the case, should we remove the "hawq" CLI?
> > > >
> > > > Jon
> > > >
> > > > On Thu, Mar 2, 2017 at 6:12 PM, Alex (Oleksandr) Diachenko <
> > > > odiache...@pivotal.io> wrote:
> > > >
> > > > > Hi Jon,
> > > > >
> > > > > I think it was designed so that Ambari is supposed to be the only
> > > > > source of truth.
> > > > > The whole purpose of the integration is to provide a user-friendly
> > > > > interface and avoid manually editing/distributing config files
> > > > > or running CLI commands.
> > > > > The idea of coupling the HAWQ master with Ambari doesn't seem clean.
> > > > >
> > > > > Regards, Alex.
> > > > >
> > > > > On Thu, Mar 2, 2017 at 4:05 PM, Jon Roberts 
> > > wrote:
> > > > >
> > > > > > It would be handy if "hawq config" also updated Ambari's database,
> > > > > > so that changes made in either place are retained.
> > > > > >
> > > > > > Register Ambari:
> > > > > > hawq ambari -u admin -w admin -h myhost -p 8080
> > > > > >
> > > > > > "hawq config" could then raise INFO/WARN messages about updating
> > > > Ambari.
> > > > > >
> > > > > > Example:
> > > > > > hawq config -c hawq_rm_stmt_vseg_memory -v 16gb
> > > > > > INFO: Updated Ambari with hawq_rm_stmt_vseg_memory=16gb
> > > > > > or
> > > > > > hawq config -c hawq_rm_stmt_vseg_memory -v 16gb
> > > > > > WARN: Failed to update Ambari with hawq_rm_stmt_vseg_memory=16gb.
> > > > Please
> > > > > > update Ambari credentials manually to retain this configuration
> > > change
> > > > > > after a restart.
> > > > > >
> > > > > > The implementation would require interacting with the Ambari APIs
> > and
> > > > > also
> > > > > > storing the credentials in an encrypted file on the HAWQ Master.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > >
> > > > > > Jon Roberts
> > > > > > Principal Engineer | jrobe...@pivotal.io | 615-426-8661
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Apache HAWQ & Pivotal HDB release alignment

2017-02-16 Thread Lei Chang
Agree with Greg. It looks like this is not an issue and should not be discussed
in Apache.

Thanks
Lei



On Fri, Feb 17, 2017 at 3:14 AM, Greg Chase  wrote:

> I'm confused here.  Are we voting on version numbering of a commercial
> distribution of HAWQ? That would not be a concern or in the jurisdiction of
> the Apache HAWQ community.
>
> Are we asking Apache HAWQ to change its version numbering to reflect that
> of a commercial distribution? That would not be appropriate.
>
> Either way, this either doesn't need to be voted on in the community, or
> shouldn't be.
>
> A commercial distribution is always welcome to take whatever version of the
> code lines it wants from Apache HAWQ.  However, there's a whole lot of
> benefits for the commercial distribution if they were to take established
> release versions from Apache HAWQ that likely have gone through IP checks
> and hopefully a degree of quality checks.
>
> It also helps improve transparency of the commercial version since users
> can look into the Jira and Github to see what new features and fixes are
> present in the open source code.
>
> -Greg
>
> On Thu, Feb 16, 2017 at 10:33 AM, Shivram Mani 
> wrote:
>
> > +1
> > I don't see any negative impact in bumping up the version to 2.2. The
> > positive outcome from this is that we will have more frequent Apache HAWQ
> > releases, since the majority of the committers who happen to also work on HDB
> > will be focused on the Apache release as the primary release channel.
> >
> > On Thu, Feb 16, 2017 at 1:25 AM, Michael Pearce 
> > wrote:
> >
> > > -1
> > >
> > > Whilst I agree that version alignment is important for Pivotal and
> users
> > > of HDB (my own self being a HDP client).
> > >
> > > We have to remember this is an open source Apache project and Pivotal
> are
> > > providing a downstream supported version, surely this should be a case
> of
> > > Pivotal aligning to the Apache Version, not the other-way around.
> > >
> > > Likewise, if any other company wished to provide a supported bundle of
> > > HAWQ, then I wouldn’t expect the open source Apache project to change
> > their
> > > versioning for a commercial enterprise. I see this much the same way
> > > multiple companies support postgres.
> > >
> > > Cheers
> > > Mike
> > >
> > > On 16/02/2017, 06:06, "Lili Ma"  wrote:
> > >
> > > +1 for version alignment
> > >
> > > 2017-02-16 13:43 GMT+08:00 Ruilong Huo :
> > >
> > > > Looks a good plan for the version alignment. +1
> > > >
> > > > Best regards,
> > > > Ruilong Huo
> > > >
> > > > On Thu, Feb 16, 2017 at 1:09 PM, Yandong Yao 
> > > wrote:
> > > >
> > > > > +1 for consistence
> > > > >
> > > > > On Thu, Feb 16, 2017 at 10:40 AM, Ed Espino  >
> > > wrote:
> > > > >
> > > > > > +1 to this recommendation. It has been a bit confusing
> keeping
> > > track of
> > > > > > versions. The Apache HAWQ version update is fairly simple.
> Now
> > > is the
> > > > > time
> > > > > > to make such an update. I imagine it will get harder the more
> > > time
> > > > passes
> > > > > > on and the more the community grows.
> > > > > >
> > > > > > This will impact Jira versioning for our upcoming Apache HAWQ
> > > > incubating
> > > > > > release. I will take care of that as part of the release
> > process.
> > > > > >
> > > > > > Thanks,
> > > > > > -=e
> > > > > >
> > > > > > On Wed, Feb 15, 2017 at 5:00 PM, Vineet Goel <
> > vvin...@apache.org
> > > >
> > > > wrote:
> > > > > >
> > > > > > > Hi HAWQ dev community,
> > > > > > >
> > > > > > >
> > > > > > > Over the last few months, many users in the HAWQ community
> > have
> > > > > expressed
> > > > > > > confusion about Apache HAWQ incubating release versions as
> > > compared
> > > > to
> > > > > > > Pivotal HDB release version numbering. Since Pivotal’s
> > > donation of
> > > > HAWQ
> > > > > > > codebase to Apache in September 2015, the community has
> > grown,
> > > and
> > > > > users
> > > > > > of
> > > > > > > Apache HAWQ as well as HDB have participated and sought
> help
> > > from the
> > > > > > HAWQ
> > > > > > > dev/user community via mailing lists.
> > > > > > >
> > > > > > >
> > > > > > > With my Pivotal representation on this topic, I’m proposing
> > > Pivotal
> > > > > team
> > > > > > to
> > > > > > > make an effort to align commercial releases of HDB based on
> > > Apache
> > > > HAWQ
> > > > > > > releases as much as possible. And, as part of the proposal,
> > the
> > > > > > commercial
> > > > > > > HDB versions should also be aligned with the Apache HAWQ
> > > release
> > > > > > > versioning. The net result of this alignment at Pivotal
> will
> > > likely
> > > > > > result
> > > > > > 

Re: Impact on HAWQ when used with Multi-Homed Nodes

2017-02-08 Thread Lei Chang
Hi Dino,

HAWQ 2.x needs NIC bonding. If there is no NIC bonding, only 1 NIC will be
used.

Thanks
Lei


On Tue, Feb 7, 2017 at 7:17 PM, Dino Bukvic  wrote:

> Hi all,
>
> We have a customer who is setting up 2 NICs on all cluster nodes that are
> configured to be in different VLANs which are not routable to each other.
>
> Will this have an impact on HAWQ when running (master and segments) on
> those nodes?
>
>
> Thanks in advance.
>
> Regards,
> Dino Bukvic *|* Advisory Solutions Architect *|* PDE EMEA *|* Pivotal
> *mail *dbuk...@pivotal.io *web* www.pivotal.io
> *mobile* +49 (0)152 03459731
>
> GoPivotal Deutschland GmbH
> Hauptverwaltung und Sitz: Am Kronberger Hang 2a, 65824 Schwalbach/Ts.
> Registergericht: Amtsgericht Königstein im Taunus, HRB 8433
> Geschäftsführer: Andrew Michael Cohen, Paul Thomas Dacier
>
> 
>


Re: build fail on arch linux

2017-01-29 Thread Lei Chang
Hi Dmitry,

Which bison version are you using? This looks like a known issue when compiling
hawq with the latest bison (3.x). Bison 2.x should work.

Thanks
Lei




On Mon, Jan 30, 2017 at 3:41 AM, Dmitry Bouzolin <
dbouzo...@yahoo.com.invalid> wrote:

> Hi All,
> Yes, I know arch linux is not supported, however I appreciate any clues on
> why the build would fail like so:
>
> make -C caql allmake[4]: Entering directory '/data/src/incubator-hawq/src/
> backend/catalog/caql'
> gcc -O3 -std=gnu99  -Wall -Wmissing-prototypes -Wpointer-arith
> -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv
> -fno-aggressive-loop-optimizations  -I/usr/include/libxml2
> -I../../../../src/include -D_GNU_SOURCE  -I/data/src/incubator-hawq/
> depends/libhdfs3/build/install/opt/hawq/include
> -I/data/src/incubator-hawq/depends/libyarn/build/install/opt/hawq/include
> -c -o gram.o gram.c
> gram.c: In function ‘caql_yyparse’:
> gram.c:1368:41: error: ‘yyscanner’ undeclared (first use in this function)
>yychar = yylex (, , yyscanner);
>  ^
> gram.c:1368:41: note: each undeclared identifier is reported only once for
> each function it appears in
> : recipe for target 'gram.o' failed
>
> If I build on CentOS, I get different make like for this target and build
> succeeds:
> make -C caql all
> make[4]: Entering directory `/data/src/incubator-hawq/src/
> backend/catalog/caql'
> gcc -O3 -std=gnu99  -Wall -Wmissing-prototypes -Wpointer-arith
> -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv
> -fno-aggressive-loop-optimizations  -I/usr/include/libxml2
> -I../../../../src/include -D_GNU_SOURCE  -I/data/src/incubator-hawq/
> depends/libhdfs3/build/install/opt/hawq/include
> -I/data/src/incubator-hawq/depends/libyarn/build/install/opt/hawq/include
> -c -o caqlanalyze.o caqlanalyze.c
>
> The difference is in input and output file. The same line in Arch
> completes successfully. All dependencies are in place.
>
> Thanks, Dmitry.
>


Re: hawq_rm_stmt_vseg_memory and hawq_rm_nvseg_perquery_perseg_limit

2017-01-20 Thread Lei Chang
hawq_rm_stmt_vseg_memory and hawq_rm_stmt_nvseg need to be used together to
set a specific number of virtual segments and the per-vseg memory. And
hawq_rm_stmt_nvseg should be less than hawq_rm_nvseg_perquery_perseg_limit.

set hawq_rm_stmt_vseg_memory = '2GB';
set hawq_rm_stmt_nvseg = 6;

I agree that 16GB looks somewhat small for big dedicated machines: if 16GB is
the per-virtual-segment memory and 8 segments are used, that only uses 128GB.

Cheers
Lei


On Fri, Jan 20, 2017 at 9:11 PM, Jon Roberts  wrote:

> Why is there a limit of 16GB for hawq_rm_stmt_vseg_memory?  A cluster with
> 256GB per node and dedicated for HAWQ may certainly want to utilize more
> memory per segment.  Is there something I'm missing regarding statement
> memory?
>
> Secondly, does the number of vsegs for a query get influenced by the
> statement memory or does it just look at the plan and
> hawq_rm_nvseg_perquery_perseg_limit?
>
>
> Jon Roberts
>


[jira] [Created] (HAWQ-1255) Looks "segment size with penalty" number in "explain analyze" not correct

2017-01-04 Thread Lei Chang (JIRA)
Lei Chang created HAWQ-1255:
---

 Summary: Looks "segment size with penalty" number in "explain 
analyze" not correct
 Key: HAWQ-1255
 URL: https://issues.apache.org/jira/browse/HAWQ-1255
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
        Reporter: Lei Chang
        Assignee: Lei Chang



"segment size" is about 500MB, while "segment size with penalty" is about 
100MB. Looks not reasonable.

How to reproduce:
On a laptop with 1GB of TPC-H data, the lineitem table is created as hash
distributed with 2 buckets, and the orders table is randomly distributed.


```
postgres=# explain analyze SELECT l_orderkey, count(l_quantity)
FROM lineitem_b2, orders
WHERE l_orderkey = o_orderkey
GROUP BY l_orderkey;


QUERY PLAN  


  
--
 Gather Motion 2:1  (slice2; segments: 2)  (cost=291580.96..318527.67 
rows=1230576 width=16)
   Rows out:  Avg 150.0 rows x 1 workers at destination.  
Max/Last(seg-1:changlei.local/seg-1:changlei.local) 150/150 rows with 
2209/2209 ms to first row, 2577/2577 ms to end, start offset by 1.429/1.429 ms.
   ->  HashAggregate  (cost=291580.96..318527.67 rows=615288 width=16)
 Group By: lineitem_b2.l_orderkey
 Rows out:  Avg 75.0 rows x 2 workers.  
Max/Last(seg1:changlei.local/seg1:changlei.local) 75/75 rows with 
2243/2243 ms to first row, 2498/2498 ms to end, start offset by 2.615/2.615 ms.
 Executor memory:  56282K bytes avg, 56282K bytes max 
(seg1:changlei.local).
 ->  Hash Join  (cost=70069.00..250010.38 rows=3000608 width=15)
   Hash Cond: lineitem_b2.l_orderkey = orders.o_orderkey
   Rows out:  Avg 3000607.5 rows x 2 workers.  
Max/Last(seg0:changlei.local/seg1:changlei.local) 3001300/215 rows with 
350/350 ms to first row, 1611/1645 ms to end, start offset by 3.819/3.816 ms.
   Executor memory:  49153K bytes avg, 49153K bytes max 
(seg1:changlei.local).
   Work_mem used:  23438K bytes avg, 23438K bytes max 
(seg1:changlei.local). Workfile: (0 spilling, 0 reused)
   (seg0)   Hash chain length 1.7 avg, 3 max, using 434205 of 
524341 buckets.
   ->  Append-only Scan on lineitem_b2  (cost=0.00..89923.15 
rows=3000608 width=15)
 Rows out:  Avg 3000607.5 rows x 2 workers.  
Max/Last(seg0:changlei.local/seg1:changlei.local) 3001300/215 rows with 
4.460/4.757 ms to first row, 546/581 ms to end, start offset by 350/349 ms.
   ->  Hash  (cost=51319.00..51319.00 rows=75 width=8)
 Rows in:  Avg 75.0 rows x 2 workers.  
Max/Last(seg1:changlei.local/seg0:changlei.local) 75/75 rows with 
341/344 ms to end, start offset by 8.114/5.610 ms.
 ->  Redistribute Motion 2:2  (slice1; segments: 2)  
(cost=0.00..51319.00 rows=75 width=8)
   Hash Key: orders.o_orderkey
   Rows out:  Avg 75.0 rows x 2 workers at 
destination.  Max/Last(seg1:changlei.local/seg0:changlei.local) 75/75 
rows with 0.052/2.461 ms to first row, 207/207 ms to end, start offset by 
8.114/5.611 ms.
   ->  Append-only Scan on orders  (cost=0.00..21319.00 
rows=75 width=8)
 Rows out:  Avg 75.0 rows x 2 workers.  
Max/Last(seg1:changlei.local/seg0:changlei.local) 75/75 rows with 
4.773/4.987 ms to first row, 166/171 ms to end, start offset by 2.911/2.697 ms.
 Slice statistics:
   (slice0)Executor memory: 281K bytes.
   (slice1)Executor memory: 319K bytes avg x 2 workers, 319K bytes max 
(seg1:changlei.local).
   (slice2)Executor memor

Re: Podling Report Reminder - January 2017

2016-12-30 Thread Lei Chang
What do you want to add? There is an item about PXF in the report.

Cheers
Lei




On Fri, Dec 30, 2016 at 1:58 AM +0800, "晓 胡" <huxia...@icloud.com> wrote:

Nice version. How about adding something about PXF?
> On 29 Dec 2016, at 9:49 PM, Lei Chang  wrote:
> 
> Good suggestion. Please see the revised version.
> 
> ---
> 
> HAWQ
> 
> Apache HAWQ is a Hadoop native SQL query engine that combines the key
> technological advantages of MPP database with the scalability and
> convenience
> of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
> delivers industry-leading performance and linear scalability. It provides
> users the tools to confidently and successfully interact with petabyte range
> data sets. HAWQ provides users with a complete, standards compliant SQL
> interface.
> 
> HAWQ has been incubating since 2015-09-04.
> 
> Three most important issues to address in the move towards graduation:
> 
> 1. Expand the community, by adding new
> contributors and focusing on making sure that there's a much more robust
> level
> of conversations and discussions happening around roadmaps and feature
> development on the public dev mailing list
> 
> 2. Infrastructure migration: create
> Jenkins projects that build HAWQ binary, source tarballs, and run feature
> tests including at least installcheck-good tests for each commit (HAWQ-127).
> 
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
> of?
> 
> Everything seems to be smooth, nothing urgent at this time.
> 
> How has the community developed since the last report?
> 
> 1. The community has become more open to discussion around the roadmap,
> various features that committers are working on, and infrastructure
> enhancements for the project.
> 
> 2. Two talks:
> 
> * The SQL-on-Hadoop engine that replaces traditional data warehouses: HAWQ,
> China Open Source Conference Oct, 2016, Lei Chang
> * Apache HAWQ on cloud: the easiest way to cloud from traditional data
> warehouses, Big Data Technology Conferences, Dec, 2016, Lei Chang
> 
> 3. Ed volunteered as an RM for the upcoming 2.1.0.0 release
> 
> 4. Interesting discussions and work around Docker for HAWQ. HAWQ has an
> account on docker hub: https://hub.docker.com/u/hawq/
> 
> How has the project developed since the last report?
> 
> 1. Apache HAWQ 2.0.0.0 released.
> 
> 2. HAWQ 2.1.0.0 release proposed:
> https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Release+2.1.0.0-incubating+Release
> 
> * Critical HAWQ Register bug fixes
> * Move HAWQ Ambari plugin to Apache HAWQ:   HAWQ-1013 RESOLVED
> * Introduction of the PXF ORC support
> * Many bug fixes
> 
> 
> Date of last release: Oct 8, 2016
> 
> 
> When were the last committers or PMC members elected?
> 
> Two committers added: Hong Wu and Paul Guo
> 
> 
> Signed-off-by:
> 
> [ ](hawq) Alan Gates
> [ ](hawq) Konstantin Boudnik
> [ ](hawq) Justin Erenkrantz
> [ ](hawq) Thejas Nair
> [ ](hawq) Roman Shaposhnik
> 
> On Thu, Dec 29, 2016 at 8:41 AM, Roman Shaposhnik 
> wrote:
> 
>> Nice report! My only suggestion is perhaps to add a note about Ed
>> volunteering as an RM for the upcoming release and also highlighting
>> the work that's happening around Docker.
>> 
>> Thanks,
>> Roman.
>> 
>> On Wed, Dec 28, 2016 at 4:33 PM, Lei Chang  wrote:
>>> Hi Guys,
>>> 
>>> Please see the following report draft, and feel free to add more contents
>>> or give your comments:
>>> 
>>> ---
>>> 
>>> HAWQ
>>> 
>>> Apache HAWQ is a Hadoop native SQL query engine that combines the key
>>> technological advantages of MPP database with the scalability and
>>> convenience
>>> of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
>>> delivers industry-leading performance and linear scalability. It provides
>>> users the tools to confidently and successfully interact with petabyte
>> range
>>> data sets. HAWQ provides users with a complete, standards compliant SQL
>>> interface.
>>> 
>>> HAWQ has been incubating since 2015-09-04.
>>> 
>>> Three most important issues to address in the move towards graduation:
>>> 
>>> 1. Expand the community, by adding new
>>> contributors and focusing on making sure that there's a much more robust
>>> level
>>> of conversations and discussions happening around roadmaps and feature
>>> development on the public dev mailing list
>>> 
>>> 2

Re: Podling Report Reminder - January 2017

2016-12-28 Thread Lei Chang
Hi Guys,

Please see the following report draft, and feel free to add more contents
or give your comments:

---

HAWQ

Apache HAWQ is a Hadoop native SQL query engine that combines the key
technological advantages of MPP database with the scalability and
convenience
of Hadoop. HAWQ reads data from and writes data to HDFS natively.  HAWQ
delivers industry-leading performance and linear scalability. It provides
users the tools to confidently and successfully interact with petabyte range
data sets. HAWQ provides users with a complete, standards compliant SQL
interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

1. Expand the community, by adding new
contributors and focusing on making sure that there's a much more robust
level
of conversations and discussions happening around roadmaps and feature
development on the public dev mailing list

2. Infrastructure migration: create
Jenkins projects that build HAWQ binary, source tarballs, and run feature
tests including at least installcheck-good tests for each commit (HAWQ-127).

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be aware
of?

Everything seems to be smooth, nothing urgent at this time.

How has the community developed since the last report?

1. The community has become more open to discussion around the roadmap,
various features that committers are working on, and infrastructure
enhancements for the project.

2. Two talks:

* The SQL-on-Hadoop engine that replaces traditional data warehouses:
HAWQ, China Open Source Conference, Oct 2016, Lei Chang
* Apache HAWQ on cloud: The easiest way to cloud from traditional data
warehouses, Big Data Technology Conferences, Dec, 2016, Lei Chang


How has the project developed since the last report?

1. Apache HAWQ 2.0.0.0 released.

2. HAWQ 2.1.0.0 release proposed:
https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Release+2.1.0.0-incubating+Release

* Critical HAWQ Register bug fixes
* Move HAWQ Ambari plugin to Apache HAWQ:   HAWQ-1013 RESOLVED
* Introduction of the PXF ORC support
* Many bug fixes


Date of last release: Oct 8, 2016


When were the last committers or PMC members elected?

Two committers added: Hong Wu and Paul Guo


Signed-off-by:

 [ ](hawq) Alan Gates
 [ ](hawq) Konstantin Boudnik
 [ ](hawq) Justin Erenkrantz
 [ ](hawq) Thejas Nair
 [ ](hawq) Roman Shaposhnik


Cheers
Lei



On Wed, Dec 28, 2016 at 10:40 PM, <johndam...@apache.org> wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 18 January 2017, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, January 04).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/January2017
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>


Re: Publishing HAWQ dev docker images

2016-12-28 Thread Lei Chang
Great to see we have a formal account on docker hub now.

Cheers
Lei



On Thu, Dec 29, 2016 at 3:43 AM, Roman Shaposhnik 
wrote:

> Hi!
>
> On Tue, Dec 27, 2016 at 9:55 PM, Richard Guo 
> wrote:
> > Hi everyone,
> >
> > The HAWQ team is building HAWQ dev docker images. The purpose is to provide
> > an out-of-the-box way for developers to set up a build and test environment
> > for HAWQ.
>
> First of all I think it is a great idea and will go a long way of
> simplifying the developer
> experience.
>
> Based on my observations of how Docker containers are used for build
> and development
> purposes by various other projects at ASF I'd like to share some of my
> experience:
>* while it may seem counterintuitive, Dockerfiles (for various
> supported platforms) are
>  much more important than a docker container on Docker Hub. This
> is partially because
>  despite all the deafening hype not everybody is in love with
> Docker as much as
>  the tech press would make you believe. Having a Dockerfile lets
> folks who are not
>  using Docker recreate a build environment based on what's in
> there as part of RUN, etc.
>  Which brings me to my next point:
>
>* extract your actual environment setup logic into a script that
> can easily be run outside
>  of Docker environment, then copy this script into a Docker
> container and RUN it. See
>  this example from Docker best practices:
>   https://docs.docker.com/engine/userguide/eng-image/
> dockerfile_best-practices/#/add-or-copy
>
>* make sure to use your containers for actual builds on
> builds.apache.org otherwise they
>  tend to bitrot.
>
> > A demo can be found on github (hawq-docker). Currently only CentOS 7
> > is supported; CentOS 6 will be supported soon. It is based on Zhanwei's
> > work.
>
> This is a good start that needs to be folded into the HAWQ code base itself
> and ideally hooked to the top level Make logic.
>
> Also, if you could consider separating the logic into a separate script
> that'd
> be super awesome and would allow us for a much better integration with
> downstream consumers such as Linux distributions and Bigtop.
>
> > The idea is to predefine all the environment setup steps in the
> dockerfile
> > and then build the image from dockerfile with tools provided by docker.
> > After that, users can simply create containers with the docker image and
> > then do the HAWQ build and test jobs.
> > Also a Makefile is provided to simplify this process. Please refer to the
> > README in the github repository for more details.
> >
> > Regarding the place to host the docker images and dockerfiles, does
> anyone
> > have any idea? Comments and discussions are welcomed.
>
> These types of build environments are full of GPL things and thus need to
> be
> kept separately from ASF artifacts. I suggest the same approach we used in
> Bigtop: a community owned, public GitHub repo. E.g.:
>  https://hub.docker.com/u/bigtop/
>
> Initially I'd suggest a pretty tight ownership of who can modify
> containers in your
> account. The best way we found so far is to have a single volunteer
> who establishes
> credentials and makes sure to push containers into the account everytime
> there's
> a change. If you couple it with setting the email on the account to
> private@hawq.i.a.o
> you've got yourself a pretty secure AND community friendly setup.
>
> In fact, while checking if hawq was available on Docker hub I took the
> liberty of creating
> an account for you ;-) Let me know if you like it:
>  https://hub.docker.com/u/hawq/
>
> Thanks,
> Roman.
>
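
To illustrate the pattern Roman describes above, here is a minimal sketch of a
standalone setup script that a Dockerfile would simply COPY and RUN; the file name,
user name, and package list are illustrative, not taken from the actual HAWQ
repository:

#!/bin/bash
# setup-build-env.sh: runnable directly on a bare CentOS host, or from a
# Dockerfile via:
#   COPY setup-build-env.sh /tmp/
#   RUN bash /tmp/setup-build-env.sh
set -e

# Build toolchain plus a few common HAWQ build dependencies (illustrative list)
yum install -y gcc gcc-c++ make cmake autoconf automake libtool bison flex git \
    bzip2 libcurl-devel libxml2-devel openssl-devel readline-devel krb5-devel

# Create the unprivileged build user if it does not exist (name is illustrative)
id gpadmin >/dev/null 2>&1 || useradd -m gpadmin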


Re: hawq安装求助 (request for help installing HAWQ)

2016-12-22 Thread Lei Chang
I'd suggest using CentOS 7 to install and build; on CentOS 6 many of the dependencies cannot be installed via yum.

Cheers
Lei




On Fri, Dec 23, 2016 at 11:24 AM +0800, "wuxing...@cmdi.chinamobile.com" 
 wrote:










Hello,

I saw your post online (http://www.wtoutiao.com/p/2caBcDl.html) and tried to follow it
to install HAWQ, but I ran into problems when installing the dependency packages. The
steps were as follows: I downloaded the epel-release rpm package and installed it
normally. There are no other repo files under /etc/yum.repos.d, and I also ran the
yum clean all and yum update commands.

I have also searched several other posts online about installing HAWQ, and all of them
hit failures installing the dependency packages. I am using CentOS 6.8.

Could you give me some advice? Thank you!

***
[root@slave1 yum.repos.d]# curl 
https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm -o 
epel-release-latest-6.noarch.rpm
  % Total% Received % Xferd  Average Speed   TimeTime Time  Current
 Dload  Upload   Total   SpentLeft  Speed
100 14540  100 145400 0   5276  0  0:00:02  0:00:02 --:--:--  7771

[root@slave1 yum.repos.d]# rpm -ivh epel-release-latest-6.noarch.rpm
Preparing...### [100%]
package epel-release-6-8.noarch is already installed

[root@slave1 yum.repos.d]# yum install -y man passwd sudo tar which git mlocate 
links make bzip2 net-tools autoconf automake libtool m4 gcc gcc-c++ gdbbison 
flex cmake gperf maven indent libuuid-devel krb5-devel libgsasl-devel 
expat-devellibxml2-devel perl-ExtUtils-Embed pam-devel python-devel 
libcurl-develsnappy-devel thrift-devel libyaml-devel libevent-devel 
bzip2-developenssl-devel openldap-devel protobuf-devel readline-devel 
net-snmp-develapr-devel libesmtp-devel xerces-c-devel python-pip json-c-devel 
apache-ivyjava-1.7.0-openjdk-devel openssh-clients openssh-server python-pip
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Install Process
Loading mirror speeds from cached hostfile
No package git available.
No package links available.
No package autoconf available.
No package automake available.
No package libtool available.
No package gdbbison available.
No package cmake available.
No package maven available.
No package indent available.
No package expat-devellibxml2-devel available.
No package perl-ExtUtils-Embed available.
No package pam-devel available.
No package libcurl-develsnappy-devel available.
No package thrift-devel available.
No package bzip2-developenssl-devel available.
No package openldap-devel available.
No package protobuf-devel available.
No package net-snmp-develapr-devel available.
No package xerces-c-devel available.
No package python-pip available.
No package json-c-devel available.
No package apache-ivyjava-1.7.0-openjdk-devel available.
No package python-pip available.
Nothing to do
***


wuxing...@cmdi.chinamobile.com
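

As a side note on the transcript above: several package names appear to have been run
together by mail line wrapping (e.g. "gdbbison", "expat-devellibxml2-devel"), which is
one reason yum reports them as unavailable. A corrected invocation with the names split
apart would look roughly like the following, though on CentOS 6 some of these packages
may still be missing even with EPEL enabled, which is why CentOS 7 is suggested above:

yum install -y man passwd sudo tar which git mlocate links make bzip2 net-tools \
    autoconf automake libtool m4 gcc gcc-c++ gdb bison flex cmake gperf maven \
    indent libuuid-devel krb5-devel libgsasl-devel expat-devel libxml2-devel \
    perl-ExtUtils-Embed pam-devel python-devel libcurl-devel snappy-devel \
    thrift-devel libyaml-devel libevent-devel bzip2-devel openssl-devel \
    openldap-devel protobuf-devel readline-devel net-snmp-devel apr-devel \
    libesmtp-devel xerces-c-devel python-pip json-c-devel apache-ivy \
    java-1.7.0-openjdk-devel openssh-clients openssh-server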







Re: DISCUSSION NEEDED: 2.1.0.0-incubating - what to do with unresolved JIRAs

2016-12-22 Thread Lei Chang
I think if we are pretty sure that the issue will be fixed in a version, it
is better to label it with that specific version. If we are not sure, it is
better to label it as "backlog".

From the external user's side, it is much better since they will know when the
issues they are watching will be fixed.

Cheers
Lei


On Fri, Dec 23, 2016 at 9:02 AM, Vineet Goel <vvin...@apache.org> wrote:

> Agree with Ruilong.
>
> And for JIRA owners, please remember to update the Fix Version when you
> close a JIRA. There are many JIRAs which are resolved/closed but Fix
> Version = Backlog, which is very confusing for someone looking to see
> which release has the fix.
>
>
> On Thu, Dec 22, 2016 at 4:52 PM Ruilong Huo <r...@pivotal.io> wrote:
>
> > As these issues are not resolved, I think it is more reasonable to mark
> the
> > affected version as 2.1.0.0-incubating and make fix version as backlog.
> > Then the correct fix version can be marked at the time when the issue is
> > actually resolved.
> >
> > Best regards,
> > Ruilong Huo
> >
> > On Fri, Dec 23, 2016 at 8:24 AM, Lei Chang <lei_ch...@apache.org> wrote:
> >
> > > Sounds a good idea!
> > >
> > > Cheers
> > > Lei
> > >
> > >
> > > On Fri, Dec 23, 2016 at 5:50 AM, Ed Espino <esp...@apache.org> wrote:
> > >
> > > > We have a significant number of JIRAs opened and currently with a fix
> > > > version of 2.1.0.0-incubating. As this is a source only release, I am
> > > > inclined to start the branch/tagging release process and update these
> > > JIRAs
> > > > with the next release version (2.2.0.0-incubating currently
> available).
> > > > Thoughts?
> > > >
> > > > HAWQ 2.1.0.0-incubating (Open, In Progress, Reopened)
> > > > https://issues.apache.org/jira/browse/HAWQ-1232?filter=12339113
> > > >
> > > > Thanks,
> > > > -=e
> > > >
> > > > FYI: Although we may change the version for the next release, I am
> > using
> > > > this as a place holder for the next release JIRAs.
> > > >
> > > > --
> > > > *Ed Espino*
> > > > *esp...@apache.org <esp...@apache.org>*
> > > >
> > >
> >
>


Re: overcommit_memory setting in cluster with hawq and hadoop deployed

2016-12-16 Thread Lei Chang
This issue has been raised many times. I think Taylor gave a good proposal.

In the long term, I think we should add more tests around killing processes
randomly.

If it leads to corruption, I think it is a bug. From a database perspective,
we should not assume that processes cannot be killed under some specific
conditions or at some particular time.

Thanks
Lei


On Sat, Dec 17, 2016 at 1:43 AM, Taylor Vesely  wrote:

> Hi Ruilong,
>
> I've been brainstorming the issue, and this is my proposed solution. Please
> tell me what you think.
>
> Segments are stateless. In Greenplum, we are worried about catalog corruption
> when a segment dies. In HAWQ, all of the data nodes are stateless. Even if
> OOM killer ends up killing a segment, we shouldn't need to worry about
> catalog corruption. *Only the master has a catalog that matters. *
>
> My proposition:
>
> Because the catalog matters on the master, we should probably continue to
> run master nodes with vm.overcommit=2. On the segments, however, I think
> that we shouldn't worry so much about an OOM event. The problem still
> remains that all queries across the cluster will be canceled if a data node
> goes offline (at least until HAWQ is able to restart failed query
> executors).
> If we *really* want to prevent the segments from being killed, we could
> tell the kernel to prefer killing the other processes on the node via the
> /proc/<pid>/oom_score_adj facility. Because Hadoop processes are generally
> resilient enough to restart failed containers, most Java processes can be
> treated as more expendable than HAWQ processes.
>
> /proc/<pid>/oom_score_adj ref:
> https://www.kernel.org/doc/Documentation/filesystems/proc.txt
>
> Thanks,
>
> Taylor Vesely
>
> On Fri, Dec 16, 2016 at 7:01 AM, Ruilong Huo  wrote:
>
> > Hi HAWQ Community,
> >
> > overcommit_memory setting in linux control the behaviour of memory
> > allocation. In a cluster deployed with both HAWQ and Hadoop, the
> > overcommit_memory setting for the nodes is controversial. To be specific,
> > HAWQ recommends using overcommit strategy 2, while Hadoop recommends 1 or 0.
> >
> > This thread is to start the discussion regarding the options to make a
> > reasonable choice here so that it is good with both products.
> >
> > *1. From HAWQ perspective*
> >
> > It is recommended to use vm.overcommit_memory = 2 (rather than 0 or 1) to
> > prevent random kill of HAWQ process and thus backend reset.
> >
> > If nodes of the cluster are set to overcommit_memory  = 0 or 1, there is
> > risk that running query might get terminated due to backend reset. Even
> > worse, with overcommit_memory = 1, there is chance that data file and
> > transaction log might get corrupted due to insufficient cleanup during
> > process exit when OOM happens. More details on the overcommit_memory setting in
> > HAWQ can be found in the article
> > "Linux-Overcommit-strategies-and-Pivotal-Greenplum-GPDB-Pivotal-HDB".
> >
> > *2. From Hadoop perspective*
> >
> > The crash of datanode usually happens when there is not enough heap
> memory
> > for JVM. To be specific, JVM allocates more heap (via a malloc or mmap
> > system call) and the address space has been exhausted. When
> > overcommit_memory = 2 and we run out of available address space, the
> system
> > will return ENOMEM for the system call, and the JVM will crash.
> >
> > This is due to the fact that Java is very address-space greedy. It
> will
> > allocate large regions of address space that it isn't actually using. The
> > overcommit_memory = 2 setting doesn't actually restrict physical memory
> > use, it restricts address space use. Many applications (especially java)
> > actually allocate sparse pages of memory, and rely on the kernel/OS to
> > actually provide the memory as soon as a page fault occurs.
> >
> > Best regards,
> > Ruilong Huo
> >
>
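
A minimal sketch of the oom_score_adj adjustment Taylor proposes, making co-located
Hadoop JVMs more attractive OOM-killer victims than HAWQ segment processes; the
process-name patterns and score values are illustrative and would need to be adapted
to the actual process names on the node (run as root):

#!/bin/bash
# oom_score_adj ranges from -1000 (never kill) to 1000 (kill first).
# Lower the score of HAWQ segment postmaster processes (pattern is illustrative).
for pid in $(pgrep -f "postgres.*segment"); do
    echo -500 > /proc/${pid}/oom_score_adj
done

# Optionally raise the score of more expendable Hadoop daemons, e.g. the DataNode:
# for pid in $(pgrep -f "java.*DataNode"); do echo 200 > /proc/${pid}/oom_score_adj; done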


Re: Apache HAWQ release manager volunteer

2016-12-07 Thread Lei Chang
Awesome!

Cheers
Lei


On Wed, Dec 7, 2016 at 11:12 AM, Ed Espino  wrote:

> HAWQ dev community,
>
> I am volunteering to be the release manager for the second Apache HAWQ
> release.  I will be following the guidelines set forth in the project's
> Release Management wiki section as well as the Apache documentation:
>
> o Apache HAWQ wiki Release Management:
> https://cwiki.apache.org/confluence/display/HAWQ/Release+Management
>
> o A Guide To Release Management During Incubation (DRAFT):
> http://incubator.apache.org/guides/releasemanagement.html
>
> NOTE: As needed, I will be updating the project's Release Management wiki
> to reflect updates and filling possible gaps to the release process.
>
> Assuming there are no objections, I will do my best to send a release plan
> within 48 hours.
>
> Regards,
> -=e
>
> --
> *Ed Espino*
> *esp...@apache.org *
>


Re: Help for HAWQ wiki/jira permission

2016-11-11 Thread Lei Chang
welcome to hawq, done!

Cheers
Lei


On Thu, Nov 10, 2016 at 5:40 PM, Ivan Weng  wrote:

> Hongxu, welcome to contribute to Apache HAWQ!!!
>
>
> Hi Roman, Lei,
>
> Would you help on that ?
>
>
> Regards,
> Ivan
>
>
>
>
>
> On Thu, Nov 10, 2016 at 5:26 PM, Hongxu Ma  wrote:
>
>> Hi everyone,
>>
>> I dived into HAWQ recently, wish to share some ideas to wiki (e.g. build
>> and install), and assign jira issues(e.g. HAWQ-513) to myself.
>>
>> Could someone give me the permission to edit/access the HAWQ wiki/jira?
>>
>> Thanks, below is my info:
>>
>> JIRA:
>>
>>  * username: hongxu ma
>>  * mail: inte...@outlook.com 
>>
>> Confluence:
>>
>>  * username: interma
>>  * mail: inte...@outlook.com 
>>
>>
>> --
>> Regards,
>> Hongxu.
>>
>>
>


Re: Default assigned for HAWQ JIRAs

2016-11-06 Thread Lei Chang
I think it is better to leave the default assignee as "unassigned"; it is
clearer for triaging, and it also makes clear to the community which issues are
looking for contributions. Otherwise, with a default assignee, the community may
be confused about whether the default assignee is actually working on the issue
or not.

Cheers
Lei


On Sat, Nov 5, 2016 at 9:22 AM, Roman Shaposhnik  wrote:

> Hi!
>
> I just noticed when filing a JIRA that
> the default assignee for the project is
> Lei. Now, the point of a default assignee
> is to have somebody who can do an
> effective triage on top of that triage
> typically happens within the context
> of the next release. Given that Ed is
> our RM for the next release of HAWQ
> wouldn't it make sense for him to
> be that?
>
> Thanks,
> Roman.
>


Re: Podling Report Reminder - October 2016

2016-10-04 Thread Lei Chang
Hi Guys,

Please see the following report. Looking forward to your comments &
suggestions!

---
Apache HAWQ is a Hadoop native SQL query engine that combines the key
technological advantages of MPP database with the scalability and
convenience of Hadoop. HAWQ reads data from and writes data to HDFS
natively.  HAWQ delivers industry-leading performance and linear
scalability. It provides users the tools to confidently and successfully
interact with petabyte range data sets. HAWQ provides users with a
complete, standards compliant SQL interface.

HAWQ has been incubating since 2015-09-04.

Three most important issues to address in the move towards graduation:

1. Produce our first Apache Release
2. Expand the community, by adding new contributors and focusing on making
sure that there's a much more robust level of conversations and discussions
happening around roadmaps and feature development on the public dev mailing
list
3. Infrastructure migration: create Jenkins projects that build HAWQ
binary, source tarballs, and run feature tests including at least
installcheck-good tests for each commit (HAWQ-127).

Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
aware of?

Everything seems to be smooth, nothing urgent at this time.

How has the community developed since the last report?

1. We have seen a significant increase in communication and collaboration
across the whole community in the past 3 months. The community has become more
open to discussion around the roadmap, the various features that committers
are working on, and infrastructure enhancements for the project.

2. One talk at trust cloud computing summit: Apache HAWQ: The leading
SQL-on-Hadoop Query Engine. (http://www.cnii.com.cn/
technology/img/4598.files/yicheng.html)

How has the project developed since the last report?

1. The release candidate (Apache HAWQ 2.0.0.0-incubating RC4) has been
proposed and the voting process on the dev mailing list completed. The main
target of the first release is to clear all IP related issues for HAWQ and
this is a source code tarball only
release. Full list of JIRAs fixed/related to the release: link


2. New features added include snappy compression for AO tables and the HAWQ
register feature for registering data into HAWQ native tables.


Date of last release:

We have not had a release yet.


When were the last committers or PMC members elected?

Added one committer in September: Kavinder Dhaliwal


Signed-off-by:

  [ ](hawq) Alan Gates
  [ ](hawq) Konstantin Boudnik
  [ ](hawq) Justin Erenkrantz
  [ ](hawq) Thejas Nair
  [ ](hawq) Roman Shaposhnik

Shepherd/Mentor notes:

On Tue, Oct 4, 2016 at 10:47 AM,  wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 19 October 2016, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, October 05).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
>
> This should be appended to the Incubator Wiki page at:
>
> http://wiki.apache.org/incubator/October2016
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>


Re: PXF question with HAWQInputFormat to migrate data 1.x -> 2.x

2016-09-25 Thread Lei Chang

I think it might be possible to use the HAWQ 1.x MR InputFormat to develop a 2.0 PXF
plugin. Then we do not need to run the two versions together.

Cheers
Lei





On Mon, Sep 26, 2016 at 4:27 AM +0800, "Goden Yao"  wrote:










+ dev mailing list, modified the title.

Hi Kyle,

Based on your description, your scenario is (as I understand it):
1. HAWQ 1.x cluster installed.
2. HAWQ 2.x cluster installed on the same nodes.
3. Data migration (ETL) from HAWQ 1.x files to HAWQ 2.x using PXF (from the 2.x
installation).

Is that correct? So you want to develop a custom PXF plugin that can read HAWQ
1.x parquet data as external tables on HDFS, then insert into a new HAWQ 2.x
native table?

According to the 1.3 doc:
http://hdb.docs.pivotal.io/131/topics/HAWQInputFormatforMapReduce.html#hawqinputformatexample

1) To use HAWQInputFormat, it'll require you to also run HAWQ 1.x (as it requires
a database URL to access metadata), so this means you need to run 1.x and 2.x side
by side. In theory it should be doable, but configuration-wise, no one has
tried this.

2) If you run HAWQ side by side, this means PXF will run side by side as well -
you have to make sure there are no conflicts in ports or ambiguity about which
version of PXF you are invoking.

That's all I can think of for now.
-Goden
On Fri, Sep 23, 2016 at 12:10 PM Kyle Dunn  wrote:

Glad to hear Resolver is the only other piece - should work out nicely.

So I'm looking at bolting HAWQInputFormat onto PXF (which actually looks quite
straightforward) and I just want to ensure as many column types are supported
as possible. This is motivated by needing to be able to read orphaned HAWQ 1.x
files with PXF in HDB/HAWQ 2.x. It will make "in-place" upgrades much simpler.

Here is the list of datatypes HAWQInputFormat supports, and the potential
mapping to PXF types:


On Fri, Sep 23, 2016 at 12:51 PM Goden Yao  wrote:

Thanks for the wishes. Are you talking about developing a new plugin (a new data
source)? Mapping a data type has 2 parts:

1. What PXF recognizes from HAWQ; this is
https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-api/src/main/java/org/apache/hawq/pxf/api/io/DataType.java

2. What plugins recognize and want to convert to a HAWQ type (Resolver); sample:
https://github.com/apache/incubator-hawq/blob/master/pxf/pxf-hive/src/main/java/org/apache/hawq/pxf/plugins/hive/HiveResolver.java

Basically, 1 provides a type list, and 2 selects from that list to decide which
data type should be converted to a HAWQ-recognized type.

If you're developing a new plugin with a new type mapping in HAWQ, you need to
do both 1 and 2.

Which specific primitive type do you need that is not on the list? BTW, you can
also mail the dev mailing list so answers will be archived in public for everyone :)
-Goden

On Fri, Sep 23, 2016 at 11:43 AM Kyle Dunn  wrote:

Hey Goden -

I'm looking at extending PXF for a new data source and noticed only a subset of
the HAWQ-supported primitive datatypes are implemented in PXF. Is this as
trivial as mapping a type to the corresponding OID in "api/io/DataType.java" or
is there something more I'm missing?

Hope the new adventure is starting well.

-Kyle
--
Kyle Dunn | Data Engineering | Pivotal
Direct: 303.905.3171 | Email: kd...@pivotal.io








Re: HAWQ I/O Scheduler

2016-09-23 Thread Lei Chang
No performance benchmarking has been done for HAWQ before to compare the two
schedulers.

For a typical transactional database workload, deadline is better (
https://blog.pgaddict.com/posts/postgresql-io-schedulers-cfq-noop-deadline).
But the HAWQ workload is typically sequential IO rather than random reads.

So I think it deserves a benchmark to show which is better.

Cheers
Lei


On Fri, Sep 23, 2016 at 12:57 AM, Taylor Vesely  wrote:

> Hi All,
>
> I was running hawq check on a system, and I hit the following error:
>
> 20160909:16:34:48:339941 gpcheck:hdw1:gpadmin-[ERROR]:-host(hdw1): on
> device (sdd) IO scheduler 'cfq' does not match expected value 'deadline'
> 20160909:16:34:48:339941 gpcheck:hdw1-[ERROR]:-host(hdw1): on device (sde)
> IO scheduler 'cfq' does not match expected value 'deadline'
>
> I did a bit of research, and generally I see hadoop hardware guides
> recommend cfq as the I/O scheduler, rather than deadline.
>
> http://amd-dev.wpengine.netdna-cdn.com/wordpress/
> media/2012/10/Hadoop_Tuning_Guide-Version5.pdf
> - Page 18
>
> http://www.datanubes.com/mediac/HadoopTuningDHT.pdf - Page 9
>
> Have we done any actual benchmarking for HAWQ I/O schedulers? Did we
> account for different use cases? Is deadline actually recommended for
> systems that run HAWQ, or is this recommendation just a holdover from the
> port from Greenplum?
>
> Thanks,
>
> Taylor Vesely
>
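
For anyone who wants to run that benchmark, the scheduler can be switched per block
device at runtime, so cfq and deadline can be compared back to back on the same host;
the device name below is an example:

# Show the current scheduler for a device; the active one appears in brackets
cat /sys/block/sdd/queue/scheduler
# e.g.: noop anticipatory deadline [cfq]

# Switch to deadline for the duration of a benchmark run (as root)
echo deadline > /sys/block/sdd/queue/scheduler

# To make the choice persistent, set elevator=deadline on the kernel command
# line (grub) or re-apply the echo from rc.local / a udev rule after reboot.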


Re: Network interconnect settings in IaaS environments

2016-09-16 Thread Lei Chang
please see the comments inline

On Sat, Sep 17, 2016 at 3:07 AM, Kyle Dunn  wrote:

> In an ongoing evaluation of HAWQ in Azure, we've encountered some
> sub-optimal network performance. It would be great to get some additional
> information about a few server parameters related to the network:
>
> - gp_max_packet_size
>The default is documented at 8192. Why was this number chosen? Should
> this value be aligned with the network infrastructure's configured MTU,
> accounting for the packet header size of the chosen interconnect type?
>  (Azure only support MTU 1500 and has been showing better reliability using
> TCP in Greenplum)
>

8K is an empirical value from when we evaluated the interconnect performance on
physical hardware; it showed the optimal performance there.

But it has not been benchmarked on Azure, and it looks like UDP on Azure is not
stable. You can set "gp_interconnect_log_stats" to see the statistics about the
queries, and you can also use ifconfig to see the packet errors.

If the network is not stable, it is worth trying to decrease the value to
less than 1500 to align the user-space packet size with the maximal kernel
packet size. But decreasing the value increases the CPU cost
of marshalling/unmarshalling the packets. There is a tradeoff here.


>
> - gp_interconnect_type
> The docs claim UDPIFC is the default, UDP is the observed default. Do
> the recommendations around which setting to use vary in an IaaS environment
> (AWS or Azure)?
>

Which doc? When we released UDPIFC for GPDB, we kept the old UDP and added
UDPIFC to avoid potential regressions, since there were a lot of UDP
deployments of GPDB at that time. After UDPIFC was released, it turned out that
UDPIFC is much more stable and performs better than UDP. So when we released
HAWQ, we simply replaced UDP with UDPIFC but kept UDP as the name. So UDP is
UDPIFC in HAWQ.

There are two flow control methods in UDPIFC that I'd suggest you try:
gp_interconnect_fc_method (INTERCONNECT_FC_METHOD_CAPACITY and
INTERCONNECT_FC_METHOD_LOSS).


> - gp_interconnect_queue_depth
>My naive read of this is performance can be traded off for (potentially
> significant) RAM utilization. Is there additional detail around turning
> this knob? How does the interaction between this and the underlying NIC
> queue depth affect performance? As an example, in Azure, disabling TX
> queuing (ifconfig eth0 txqueue 0) on the virtual NIC improved benchmark
> performance, as the underlying HyperV host is doing its own queuing
> anyway.
>
>
This queue is an application-level queue, used for caching and for handling
out-of-order and lost packets.

According to our past performance testing on physical hardware, increasing
it to a large value does not show a lot of benefit. A value that is too small
does impact performance. But it needs more testing on Azure, I think.


>
> Thanks,
> Kyle
> --
> *Kyle Dunn | Data Engineering | Pivotal*
> Direct: 303.905.3171 <3039053171> | Email: kd...@pivotal.io
>
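
A sketch of how the knobs discussed above could be exercised during an Azure benchmark
run, assuming these GUCs are settable at the session level in your build; the values
and the test query are illustrative starting points, not recommendations:

psql -d postgres <<'SQL'
-- Log per-query interconnect statistics so packet loss and retransmits show up
SET gp_interconnect_log_stats = on;

-- Try the loss-based flow control method as an alternative to capacity-based
SET gp_interconnect_fc_method = 'loss';

-- Experiment with a packet size that fits the 1500-byte MTU minus headers
SET gp_max_packet_size = 1400;

-- Re-run the workload under test (the table name here is a placeholder)
SELECT count(*) FROM lineitem;
SQL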


Re: libhdfs3 development is still going on outside of ASF

2016-09-14 Thread Lei Chang

There was a short discussion earlier, when we moved libhdfs3 into the HAWQ repo:
http://mail-archives.apache.org/mod_mbox/incubator-hawq-dev/201602.mbox/%3cCAE44UQe1xgcVOC76T_mgVbgGbR=Lx=xubpvw18zk4iz3euc...@mail.gmail.com%3e

I think it makes sense to keep libhdfs3 only in the HAWQ repo to simplify the
Apache build and releases in the current phase. This is what we have done in the
past. But it looks like not everyone is on the same page.

Cheers
Lei






On Thu, Sep 15, 2016 at 11:12 AM +0800, "Greg Chase"  wrote:










It's fine if libhdfs3 is under a third-party license and is treated that way.

However, why does Apache HAWQ want to be dependent on some strange 3rd
party library with no transparency?

We are having enough difficulties just getting our first release out.

Is there a compelling reason why we need to keep up with the independently
developed libhdfs3 project?  Are they willing to make necessary changes so
that they are compatible with ASF's strict-for-a-good-reason policies?

Can we fork libhdfs3 for Apache HAWQ's purposes within Apache?

If any libhdfs3 committers are also part of Apache HAWQ, perhaps you can
shed some light on the viability of this as an independent project since I
only see 4 contributors.

-Greg

On Wed, Sep 14, 2016 at 7:54 PM, Hong Wu  wrote:

> In my opinion, I think it is reasonable to transfer the third-party repo of
> libhdfs3 totally into HAWQ, not only for the convenience of HAWQ build, but
> also for the consideration of ASF project. So for HAWQ project, I am with
> Roman.
>
> But my concern is the current users of libhdfs3 and all the pull requests,
> wiki docs and issues. Another uncertain aspect from my perspective is that
> although HAWQ could not run without libhdfs3, libhdfs3 could be used in
> other open source projects, that might be the true meaning of making
> libhdfs3 open source at the beginning.
>
> In summary, if it is really against the spirit of an ASF project for HAWQ, a
> suggested way might be to mark the original libhdfs3 repo as a legacy repo
> instead of removing it.
>
> Best
> Hong
>
> 2016-09-15 10:04 GMT+08:00 Zhanwei Wang :
>
> > Currently libhdfs3’s official code is not the same as in HAWQ. Some new
> > code has not been copied into HAWQ. I do not think code changes to libhdfs3
> > should follow HAWQ’s commit process, because many changes are not related to
> > HAWQ.
> >
> > From HAWQ side, I suggest to keep the stable version of its third-party
> > libraries and copy new libhdfs3’s code only when it is necessary.
> >
> > libhdfs3 was open source years before HAWQ incubating with a separated
> > permission of its authority. So in my opinion it is a third party and it
> > actually was a third party before HAWQ incubating. And HAWQ is not the
> only
> > user.
> >
> >
> >
> > Best Regards
> >
> > Zhanwei Wang
> > wan...@apache.org
> >
> >
> >
> > > On Sep 15, 2016, at 9:35 AM, Roman Shaposhnik wrote:
> > >
> > > On Wed, Sep 14, 2016 at 6:29 PM, Zhanwei Wang 
> wrote:
> > >> Hi Roman
> > >>
> > >> libhdfs3 works as third-party library of HAWQ, Just for the
> convenience
> > of HAWQ release
> > >> process we copy its code into HAWQ.  The reason is that HAWQ used to
> > dependent on
> > >> specific version of libhdfs3 and libhdfs3 only distribute as source
> > code and the build process is complicated.
> > >
> > > I actually don't buy this argument. libhdfs3 is not an optional
> > > dependency for HAWQ
> > > like ORCA is (for example). Without libhdfs3 there's pretty tough to
> > > imagine HAWQ.
> > > As such the code base needs to be governed as part of the ASF project,
> > > not a random
> > > GitHub dependency.
> > >
> > > IOW, let me ask you this: were all the changes that went into libhdfs3
> > > that is part of
> > > HAWQ discussed and reviewed via the ASF development process or did you
> > just
> > > import them from time to time as this comment suggests:
> > >https://issues.apache.org/jira/browse/HAWQ-1046?
> > focusedCommentId=15489669=com.atlassian.jira.
> > plugin.system.issuetabpanels:comment-tabpanel#comment-15489669
> > > ?
> > >
> > >> I do not think we have any reason to shutdown a third party’s official
> > repository.
> > >
> > > You say 3d party as though its not just you guys maintaining it on the
> > side.
> > >
> > >> We also copy google test source code into HAWQ, just as what we did
> for
> > libhdfs3.
> > >
> > > But this is very different. You don't do any development (certainly
> > > you don't do any
> > > non-trivial development) of that code.
> > >
> > >> libhdfs3 open source under Apache license version 2 just the same as
> > HAWQ. So I believe there is no license issue.
> > >
> > > You're correct. There's no licensing issue but there's a pretty
> > significant
> > > governance issue.
> > >
> > > Thanks,
> > > Roman.
> > >
> >
> >
>







Re: Please append 'close #PR_id' to commit message when you merge other's pull request

2016-09-09 Thread Lei Chang
Ming, I think Roman has already suggested a solution: add "close #" in the
commit message body instead of the message title. Does that solve your concerns?

Cheers
Lei




On Fri, Sep 9, 2016 at 1:09 PM +0800, "Ming Li" <m...@pivotal.io> wrote:










I think we should offer a solution for this problem, even if the solution
is not good enough. If you find a better solution for it, we can  enhance
it afterward.

On Thu, Sep 8, 2016 at 3:16 PM, Ming Li  wrote:

> Hi Roman,
>
> The problem is that someone may still forget to close a PR, and we can't reach
> them with only an email notification.
>
> As for your suggestion, could you please share with us the exact steps how
> to do it? Thanks.
>
> On Thu, Sep 8, 2016 at 2:13 PM, Roman Shaposhnik 
> wrote:
>
>> On Wed, Sep 7, 2016 at 11:08 PM, Lei Chang  wrote:
>> > @ming, there is a discussion on this mailing list before about what
>> should
>> > be included in the commit message.
>> >
>> > Appending "close #" makes the commit message very messy.
>> >
>> > So the conclusion at that time is to not append "close #" to a commit
>> > message.
>> >
>> > If someone forgets closing a pull request, looks better to add a
>> reminder
>> > to the pull request.
>>
>> Not to reopen that old discussion, but have you guys considered adding
>> it to the body of the commit? That way it won't mess up git log and such
>> but will still have the desired effect.
>>
>> Thanks,
>> Roman.
>>
>
>







Re: enforce -Werror (if gcc) in hawq?

2016-09-08 Thread Lei Chang
I think consistent behavior is much better, to avoid confusion.

Different options in different places look strange. And only people on
this thread might be able to follow this discussion; maybe someday they
will forget it too.

As new contributors/committers are added, it is much easier for them to
follow simple, default options.

Cheers
Lei




On Thu, Sep 8, 2016 at 2:12 PM, Hong Wu  wrote:

> I still have some reservations and do not recommend that. For the original
> purpose of adding the -Werror option, I think the following alternative solutions
> are better:
> 1. Try to avoid warnings during the review process.
> 2. Add -Werror in the Travis CI script. If it fails to compile in Travis, the
> pull request is not allowed to be checked in.
>
> Above solutions could both be added and not conflict with reasons mentioned
> by Paul. Any comments?
>
> Best
> Hong
>
> 2016-09-08 13:41 GMT+08:00 Paul Guo :
>
> > For HAWQ, it is an open source project, so I think from this perspective we
> > should
> > try to make all contributors have the same development/test experience,
> > else
> > main contributors (e.g. committer) will have to waste time to fix those
> > errors introduced
> > by some contributors who know nothing about the context. Making the
> > build/test
> > less complex is our target. This is my answer to Hong and concern 1) of
> > Zhanwei
> >
> > Back to the -Wall and -Werror details. To my best of knowledge,
> > 1) gcc has been adjusting (probably mainly adding) the warning cases.
> > 2) -Wall actually enables just part of warnings and the set of warning
> > could
> > vary in various gcc versions.
> >
> > But I still think it should be enabled by default. The reasons are:
> >
> > 1) We actually just mainly tested on centos6.x by now. We actually did
> not
> > test on
> > mac and centos7 that much. So "officially" we do not support that
> many
> > os and
> > the fix will be not that tedious.
> >
> > 2) In the future if we want to have more os support, "gcc -Werror" is
> just
> > one small
> > step of the whole work. I do not think this introduces much more
> work.
> >
> > 3) Although different gcc versions have different -Wall definitions, I
> > think the fixes
> > for various gcc versions should be quite mutually tolerated.
> >
> > 4) Even if the above solutions do not work in the end, we could specify common
> > warnings
> > via -Werror=, e.g. -Werror=unused-variable, ...
> >
> >
> >
> > 2016-09-06 10:31 GMT+08:00 Hong Wu :
> >
> > > I strongly agree with ZhanWei's opinion, I think it is much more
> flexible
> > > to honor environment variable to do that. Take CMake system for
> example,
> > it
> > > is usual to honor "-DCMAKE_XXX" when doing configuration.
> > >
> > > As ZhanWei's recommendation, I think we should try it ourselves to make
> > > sure good coding style and code quality. For customers and contributors
> > > from community, it is better not doing that by default.
> > >
> > > Thanks
> > > Hong
> > >
> > > 2016-09-06 9:52 GMT+08:00 Zhanwei Wang :
> > >
> > > > Hi Hong
> > > >
> > > > I removed the -Werror flag when we pushed HAWQ to open source. In Pivotal we
> > > > build HAWQ on a specific OS with a specific GCC and dependent libraries.
> > > > The -Werror flag worked fine since we fixed all warnings. But for an open
> > > > source project, it is impossible to make users and contributors build HAWQ on
> > > > a specific OS and compiler just like what we did before. We cannot say we
> > > > just support HAWQ on CentOS 6 with GCC 4.4.2.
> > > >
> > > > So if we enforce -Werror flag, I guess the build will fail on many
> > > > environments.
> > > >
> > > > I propose three solutions and let’s discuss which is better.
> > > >
> > > > 1) Do not enforce -Werror flag but add it to our test (concourse and
> > > > Travis) like this.
> > > > ./configure --prefix=/path/to/install CFLAGS="-Werror"
> > > >
> > > > By this way we can enforce -Werror flag on our tested environment.
> > > >
> > > >
> > > > 2) Only enforce -Werror flag on development build, remove it on
> release
> > > > build.
> > > >
> > > >
> > > > 3) Enforce -Werror flag and we setup more test environments(different
> > > > versions of CentOS Ubuntu SUSE with default compiler and latest GCC
> and
> > > > MacOS with default compiler and latest clang)
> > > >
> > > >
> > > > Any comments?
> > > >
> > > >
> > > > Best Regards
> > > >
> > > > Zhanwei Wang
> > > > wan...@apache.org
> > > >
> > > >
> > > >
> > > > > 在 2016年9月6日,上午9:22,Ed Espino  写道:
> > > > >
> > > > > +1
> > > > >
> > > > > Have we considered setting up separate public Concourse pipelines
> to
> > > try
> > > > > the various build scenarios.
> > > > >
> > > > > -=e
> > > > >
> > > > > On Tue, Sep 6, 2016 at 12:58 AM, Hong Wu 
> > > wrote:
> > > > >
> > > > >> Ming's comment makes sense, but I think it is another thread. I
> have
> > > > 

Re: Please append 'close #PR_id' to commit message when you merge other's pull request

2016-09-08 Thread Lei Chang
Looks like a great idea, Roman.

Cheers
Lei


On Thu, Sep 8, 2016 at 2:13 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> On Wed, Sep 7, 2016 at 11:08 PM, Lei Chang <lei_ch...@apache.org> wrote:
> > @ming, there is a discussion on this mailing list before about what
> should
> > be included in the commit message.
> >
> > Appending "close #" makes the commit message very messy.
> >
> > So the conclusion at that time is to not append "close #" to a commit
> > message.
> >
> > If someone forgets closing a pull request, looks better to add a reminder
> > to the pull request.
>
> Not to reopen that old discussion, but have you guys considered adding
> it to the body of the commit? That way it won't mess up git log and such
> but will still have the desired effect.
>
> Thanks,
> Roman.
>
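
A small sketch of what Roman suggests: keep the reference in the commit body so
`git log --oneline` stays clean while the pull request is still closed automatically
on merge; the JIRA id and PR number are made up, and the exact phrase the tooling
recognizes may be "close #12", "closes #12", or "This closes #12":

# Each -m adds a separate paragraph: the first is the subject, the rest form the body
git commit -m "HAWQ-1234. Fix typo in resource manager log message" \
           -m "This closes #12"

# Or amend the most recent commit and append the line to the message body by hand
git commit --amend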


Re: Please append 'close #PR_id' to commit message when you merge other's pull request

2016-09-08 Thread Lei Chang
@ming, there was a discussion on this mailing list earlier about what should
be included in the commit message.

Appending "close #" makes the commit message very messy.

So the conclusion at that time is to not append "close #" to a commit
message.

If someone forgets closing a pull request, looks better to add a reminder
to the pull request.

Cheers
Lei


On Thu, Sep 8, 2016 at 11:42 AM, Ming Li  wrote:

> Hi all committers,
>
> It seems that we cannot close others' pull requests conveniently (
> https://issues.apache.org/jira/browse/INFRA-12580), so it is better to have the
> git hook close the PR automatically.
>
> Here is the steps I added into the wiki(
> https://cwiki.apache.org/confluence/display/HAWQ/Contributing+to+HAWQ):
> # In case a contributor forgets to close a pull request,
> # append to the commit message so the pull request is closed automatically after
> the code is merged. E.g., if the PR number is 12:
> run `git commit --amend` and append "(close #12)" to the commit message.
>


Re: enforce -Werror (if gcc) in hawq?

2016-09-05 Thread Lei Chang
In the past, it was enabled by default. It looks like this is a regression from the
open-sourcing and build refactoring process.

Cheers
Lei




On Mon, Sep 5, 2016 at 6:05 PM +0800, "Paul Guo"  wrote:










-Werror
Make all warnings into errors.
I've seen many cases before (not just hawq) where ignoring gcc warnings leads
to bugs. I'm wondering whether we should add the option for the gcc case. Given
there may be a lot of warnings when building the common postgres code in
hawq, we could at least enforce it in our own code at first
(src/backend/cdb, src/backend/resourcemanager, src/test/feature, other
directories?)? Any suggestion?







Re: HAWQ and Azure blob storage

2016-08-18 Thread Lei Chang
Yes, it is pretty tricky. A blob store is not HDFS, and people are simulating
file system interfaces on top of the blob store.

hopefully, our pluggable file system design can abstract the common part
out, and add special handling for semantic differences.

Cheers
Lei



On Fri, Aug 19, 2016 at 9:59 AM, Roman Shaposhnik 
wrote:

> On Wed, Aug 17, 2016 at 8:30 PM, Vineet Goel  wrote:
> > Recently, a question came up whether HAWQ can support Azure with it's
> blob
> > storage (WASB), given that WASB is HDFS compatible.
>
> This is a pretty tricky statement. More here on blob stores vs. file
> systems:
> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-
> dist/hadoop-common/filesystem/introduction.html#Object_
> Stores_vs._Filesystems
>
> Thanks,
> Roman.
>


Re: HAWQ and Azure blob storage

2016-08-18 Thread Lei Chang
I think this is quite related to HAWQ-786 ("Framework to support
pluggable formats and file systems").

Append should not be a problem: when we do the update & delete feature,
we need to do a "file merge" anyway.

And for the single-writer guarantee, HAWQ currently enforces this on the master
side. We only need to handle some corner cases, for example avoiding split-brain
issues. Then we can support the WASB feature.

Cheers
Lei







On Thu, Aug 18, 2016 at 11:30 AM, Vineet Goel  wrote:

> Recently, a question came up whether HAWQ can support Azure with it's blob
> storage (WASB), given that WASB is HDFS compatible. I wanted to get the
> developer community's thoughts on HAWQ compatibility. It seems to me that
> HAWQ will not work as-is with WASB, right?
>
> Some investigation and discussion:
> Windows Azure Storage Blob (WASB) is an extension built on top of the HDFS
> APIs. Upon further investigation, I was told that WASB added support for
> append relatively recently in HADOOP-12635
> , but with limitations
> in its semantics as compared to HDFS.  Unlike HDFS, it does not enforce a
> single-writer guarantee.  Instead, responsibility is pushed to the
> application to guarantee mutually exclusive access to the file being
> appended.  Failure to do so can result in data loss or corruption.  If HAWQ
> relies on the traditional HDFS single-writer semantics, then WASB’s append
> implementation won’t be suitable. Also, WASB has no support for HDFS
> Truncate.  Attempts to call truncate will fail with an exception.
>


Re: [ANNOUNCE] Donation of HAWQ documentation

2016-08-15 Thread Lei Chang
awesome!

Cheers
Lei


On Tue, Aug 16, 2016 at 9:09 AM, Roman Shaposhnik  wrote:

> Hi!
>
> I am pleased to announce the donation of HAWQ documentaion
> to the HAWQ ASF community. This source code donation
> includes everything one needs to build a full set of documentation
> for Apache HAWQ (incubating).
>
> The Software Grant Agreement for this code has been accepted by the ASF
> secretary.
>
> The donated code currently sits in a separate git repo:
>   https://github.com/Pivotal-DataFabric/docs-apache-hawq-md
> and is awaiting community review.  I encourage everyone in the HAWQ
> community to review this donation and provide feedback.  In particular
> your input on build/staging would be really helpful.
>
> As the permanent home for this code, I'd like to propose a brand new
> repo:
>  http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs.git
> To avoid bike-shed painting, I'll assume the lazy consensus on the
> name and location of the repo -- speak now or forever hold your peace ;-)
>
> As always, your suggestions are most welcome!
>
> Thanks,
> Roman.
>


Re: Options Usage of "hawq register"

2016-08-15 Thread Lei Chang
I think this is a very useful feature for backup/restore, disaster
recovery and some other scenarios.

From the usage side, "hawq register" follows the typical "hawq command"
design pattern, that is, "hawq action object". But for "hawq register",
there is no "object" here.

---
hawq extract -o t1.yml t1;
hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
---

Cheers
Lei


On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu  wrote:

> Hi HAWQ developers,
>
> This thread means to confirm the option usage of hawq register.
>
> There will be two scenarios for users to use the hawq register tool so far.
> - I. Register external parquet data into HAWQ. For example, users want to
> migrate parquet tables from HIVE to HAWQ as quick as possible. In this
> case, only parquet format is supported and the original parquet files in
> hive are moved.
>
> - II. Users should be able to use hawq register to register table files into
> a new HAWQ cluster. It is a kind of protection against corruption from the
> users' perspective. Users use the last-known-good metadata to update the
> portion of the catalog managing HDFS blocks. The table files or directory
> should be backed up (such as by using distcp) into the same path in the new HDFS
> setting. And in this case, both AO and Parquet formats are supported.
>
> Considering above cases, the designed options for hawq register looks
> below:
>
> hawq register [-h hostname] [-p port] [-U username] [-d database] [-t
> tablename] [-f filepath] [-c config]
> Note that the -h, -p, -U options are optional; the -c option and the -t, -f
> options are mutually exclusive which are corresponding to two different
> cases above. Consequently, the expected usage of hawq register should be
> like below:
>
> - Case I
> hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
> hdfs://localhost:8020/test_data.paq;
>
> create table t1(i int) with (appendonly = true, orientation=parquet);
>
> hawq register -h localhost -p 5432 -u me -d postgres -t t1 -f
> hdfs://localhost:8020/test_data.paq;
>
> - Case II
> hawq extract -o t1.yml t1;
>
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
>
> Incorrect usage (in each of these cases, hawq register will print an error
> and then exit):
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1;
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -f
> hdfs://localhost:8020/test_data.paq;
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1 -f
> hdfs://localhost:8020/test_data.paq;
>
> Does this design make sense, any comments? Thanks.
>
> Best
> Hong
>


Re: Replace python module paramiko with pexpect

2016-08-11 Thread Lei Chang
looks great.

Cheers
Lei


On Fri, Aug 12, 2016 at 12:17 AM, Radar Da lei  wrote:

> Hi All,
>
> Currently HAWQ uses 'paramiko' to sync password-less ssh keys between the
> cluster nodes. It works fine, but 'paramiko' has license compatibility issues
> with Apache HAWQ. So we removed that part of the code, and now users need to
> install it manually via pip install.
>
> Only 'hawq ssh-exkeys ...' command used it.
>
> I did some research and find we can use 'pexpect' and
> 'ptyprocess'(submodule of pexpect) to replace 'paramiko'.
>
> They seem to all be compatible with the Apache HAWQ license. So I propose to
> include these two python modules in the HAWQ code, so users won't need to do
> manual install work.
>
> For their licenses, please see below link:
>
> Licenses:
> pexpect:
> https://github.com/pexpect/pexpect/blob/master/LICENSE
>
> ptyprocess:
> https://github.com/pexpect/ptyprocess/blob/master/LICENSE
>
> Any comments? Thanks.
>
> Regards,
> Radar
>
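
Until that is in place, the password-less setup that `hawq ssh-exkeys` automates can
also be done by hand with stock OpenSSH tools; a rough sketch for a small cluster,
with example host names:

# On the master, generate a key once (no passphrase) if one does not already exist
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Push the public key to every segment host (asks for the password once per host)
for host in segment1 segment2 segment3; do
    ssh-copy-id "$host"
done

# Verify that key-based login now works without a password prompt
ssh segment1 hostname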


[jira] [Created] (HAWQ-982) PL/Python with psycopg2 cannot connect to remote postgres

2016-08-05 Thread Lei Chang (JIRA)
Lei Chang created HAWQ-982:
--

 Summary: PL/Python with psycopg2 cannot connect to remote postgres
 Key: HAWQ-982
 URL: https://issues.apache.org/jira/browse/HAWQ-982
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Core
Reporter: Lei Chang
Assignee: Lei Chang



For one use case I want to connect to an external PostgreSQL database from a HAWQ
PL/Python procedure.
I use the python psycopg2 library.
The remote PostgreSQL server rejects the connection from HAWQ with
this error:  FATAL: unsupported frontend protocol 28675.0: server supports 1.0
to 3.0.
The same python code runs fine at the OS level.

I wonder if it is a HAWQ- or PostgreSQL PL/Python interpreter-related issue.
Any help or pointers would be great.

---
my code below:

CREATE OR REPLACE FUNCTION dchoma.connection_test( ) RETURNS text AS
$$
import psycopg2

try:
conn = psycopg2.connect("dbname='database_name' user='user' 
host='remote_host' password='pass' port=5432")
return "Connection successful "
except Exception , msg :
return "Exception: {m}".format(m=msg)
$$
LANGUAGE 'plpythonu' VOLATILE;

select * from dchoma.connection_test();


HAWQ version 2.0.1.0 build dev ( compiled from github)
Remote database version:  PostgreSQL 9.2.15 on x86_64-redhat-linux-gnu, 
compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit
OS: CentOS 7-1511

I found a similar issue here, but the problem is not solved.
https://discuss.zendesk.com/hc/en-us/community/posts/200793368-greenplum-dblink-postgresql-remote-is-error




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Removal from committer?

2016-08-01 Thread Lei Chang
Anyone can create a JIRA. I just checked the JIRA committer list and you are not
in it; I have just added you to the list.

Thanks
Lei


On Tue, Aug 2, 2016 at 9:36 AM, Gregory Chase <gch...@pivotal.io> wrote:

> Probably just an infrastructure issue...
>
> On Mon, Aug 1, 2016 at 6:35 PM, Caleb Welton <cwel...@pivotal.io> wrote:
>
> > Now looks fixed.  Weird.
> >
> > On Mon, Aug 1, 2016 at 6:26 PM, Caleb Welton <cwel...@pivotal.io> wrote:
> >
> > > I haven't explicitly tested committing a change to the repository, but
> I
> > > can no longer create jira's.
> > >
> > >
> > > On Mon, Aug 1, 2016 at 6:07 PM, Lei Chang <lei_ch...@apache.org>
> wrote:
> > >
> > >> you cannot commit now? who made the change?
> > >>
> > >> I think it is not a general practice too.
> > >>
> > >> Thanks
> > >> Lei
> > >>
> > >>
> > >>
> > >> On Tue, Aug 2, 2016 at 9:02 AM, Caleb Welton <cwel...@pivotal.io>
> > wrote:
> > >>
> > >> > It looks like I have been removed as a committer in HAWQ?
> > >> >
> > >> > I did not think that was general practice for an Apache project.  Is
> > >> there
> > >> > a reason that this change was made?
> > >> >
> > >> > Regards,
> > >> >   Caleb
> > >> >
> > >>
> > >
> > >
> >
>
>
>
> --
> Greg Chase
>
> Global Head, Big Data Communities
> http://www.pivotal.io/big-data
>
> Pivotal Software
> http://www.pivotal.io/
>
> 650-215-0477
> @GregChase
> Blog: http://geekmarketing.biz/
>


Re: Removal from committer?

2016-08-01 Thread Lei Chang
You cannot commit now? Who made the change?

I think it is not a general practice too.

Thanks
Lei



On Tue, Aug 2, 2016 at 9:02 AM, Caleb Welton  wrote:

> It looks like I have been removed as a committer in HAWQ?
>
> I did not think that was general practice for an Apache project.  Is there
> a reason that this change was made?
>
> Regards,
>   Caleb
>


Re: [VOTE] HAWQ 2.0.0.0-incubating (RC2)

2016-07-22 Thread Lei Chang
The source tarball compiled and rat check passed. +1

Cheers
Lei


On Fri, Jul 22, 2016 at 5:35 PM, Lili Ma  wrote:

> +1. I downloaded the source code tarball, compiled successfully, started a
> single node cluster, and tested with a few queries, and it worked!
>
>
> Cheers,
> Lili
>
>
> On Fri, Jul 22, 2016 at 4:52 PM, Radar Da lei  wrote:
>
> > I downloaded the source code tarball, compiled successfully, started a
> > single cluster successfully.  +1
> >
> > Regards,
> > Radar
> >
> > On Fri, Jul 22, 2016 at 4:50 PM, Amy Bai  wrote:
> >
> > > I downloaded the source tarball, and compiled successfully. +1
> > >
> > > Thanks
> > > Amy
> > >
> > >
> > > From: Goden Yao 
> > > > Date: Wed, Jul 20, 2016 at 8:57 AM
> > > > Subject: [VOTE] HAWQ 2.0.0.0-incubating (RC2)
> > > > To: dev 
> > > >
> > > >
> > > > This is the 1st release for Apache HAWQ (incubating), version:
> > > > 2.0.0.0-incubating
> > > >
> > > > *It fixes the following issues:*
> > > > Clear all IP related issues for HAWQ and this is a source code
> tarball
> > > only
> > > > release.
> > > > Full list of JIRAs fixed/related to the release: link
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Release+2.0.0.0-incubating
> > > > >
> > > >
> > > > *** Please download, review and vote by *Friday 6pm July 29, 2016
> PST*
> > > ***
> > > > or When we have enough votes to bring this source tarball to IPMC
> > > >
> > > > *We're voting upon the release branch:*
> > > > 2.0.0.0-incubating
> > > > HEAD: commit
> > > > <
> > > >
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=incubator-hawq.git;a=commit;h=9bdad43ebbbcefce23db193c3a7dd62ea6a3d805
> > > > >
> > > >
> > > >
> > > > *Source Files:*
> > > >
> > > >
> > >
> >
> https://dist.apache.org/repos/dist/dev/incubator/hawq/2.0.0.0-incubating.RC2
> > > >
> > > > *KEYS file containing PGP Keys we use to sign the release:*
> > > > https://dist.apache.org/repos/dist/dev/incubator/hawq/KEYS
> > > >
> > > > Thanks
> > > > -Goden
> > > >
> > > >
> > >
> >
>


Re: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)

2016-07-14 Thread Lei Chang
I got the email now; it looks like there are very big delays here, sometimes
several hours.

Thanks
Lei


2016-07-15 0:50 GMT+08:00 Roman Shaposhnik <ro...@shaposhnik.org>:

> You may want to check your spam filter. I saw all these messages.
>
> Thanks,
> Roman.
>
> 2016-07-14 6:03 GMT-07:00 Lei Chang <lch...@pivotal.io>:
>
> >
> > FYI, looks dev mailing list does not get this email. forward to dev@hawq
> > again.
> >
> >
> > -- Forwarded message --
> > From: 吴彪 <sywub...@jd.com>
> > Date: 2016-07-14 19:05 GMT+08:00
> > Subject: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)
> > To: dev@hawq.incubator.apache.org
> > Cc: 周龙波 <zhoulon...@jd.com>, Brian Lu <b...@pivotal.io>, Lei Chang <
> > lch...@pivotal.io>
> >
> >
> > Hello,
> >
> > I ran into the following problems when using PXF to read HDFS data; the data on
> > HDFS was generated by Hive:
> >
> > 1. Some String-type fields in Hive are NULL, and the HAWQ query reports the error
> > below; after the NULL fields are converted to non-NULL values in Hive, the query
> > works fine in HAWQ.
> >
> >
> > 2. Some Bigint-type fields in Hive are NULL, and the HAWQ query reports the error
> > below:
> >
> >
> > Is there a good way to handle data containing NULLs directly? Thanks.
> >
> >
> > 吴彪 (Wu Biao)
> >
> > JD.COM [Big Data Department - Big Data R&D - Platform Infrastructure R&D]
> >
> > ---
> >
> > Address: 10F, Building A, Beichen Century Center, Chaoyang District, Beijing
> >
> > Postcode: 100195
> >
> > Mobile: 18810848987
> >
> > Email: *sywub...@jd.com <sywub...@jd.com?subject=sywub...@jd.com>*
> >
> > 
> >
> > [image: j39]
> >
> >
> >
> >
>


Re: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)

2016-07-14 Thread Lei Chang
Resent to see whether you guys can see images in the email

-

Hello,

I ran into the following problems when using PXF to read HDFS data; the data on HDFS
was generated by Hive:

1. Some String-type fields in Hive are NULL, and the HAWQ query reports the error
below; after the NULL fields are converted to non-NULL values in Hive, the query
works fine in HAWQ.


2. Some Bigint-type fields in Hive are NULL, and the HAWQ query reports the error
below:


Is there a good way to handle data containing NULLs directly? Thanks.


吴彪 (Wu Biao)

JD.COM <http://jd.com/> [Big Data Department - Big Data R&D - Platform Infrastructure R&D]

---

Address: 10F, Building A, Beichen Century Center, Chaoyang District, Beijing

Postcode: 100195

Mobile: 18810848987

Email: *sywub...@jd.com <sywub...@jd.com?subject=sywub...@jd.com>*

2016-07-15 1:31 GMT+08:00 Goden Yao <goden...@apache.org>:

> Also I cannot see the screenshots attached in the email, can you ask the
> issue reporter to resend (or you can help to resend the picture, Lei?)
>
> On Thu, Jul 14, 2016 at 10:28 AM Goden Yao <goden...@apache.org> wrote:
>
> > I believe the sender needs to be part of dev mailing list to send email.
> >
> >
> > On Thu, Jul 14, 2016 at 6:13 AM Lei Chang <lei_ch...@apache.org> wrote:
> >
> >>
> >> It looks like our mailing list has some issues; no one saw this email. Forwarding
> >> from my apache email.
> >>
> >>
> >> -- Forwarded message --
> >> From: 吴彪 <sywub...@jd.com>
> >> Date: 2016-07-14 19:05 GMT+08:00
> >> Subject: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)
> >> To: dev@hawq.incubator.apache.org
> >> Cc: 周龙波 <zhoulon...@jd.com>, Brian Lu <b...@pivotal.io>, Lei Chang <
> >> lch...@pivotal.io>
> >>
> >>
> >> Hello,
> >>
> >> I ran into the following problems when using PXF to read HDFS data; the data on
> >> HDFS was generated by Hive:
> >>
> >> 1. Some String-type fields in Hive are NULL, and the HAWQ query reports the error
> >> below; after the NULL fields are converted to non-NULL values in Hive, the query
> >> works fine in HAWQ.
> >>
> >>
> >> 2. Some Bigint-type fields in Hive are NULL, and the HAWQ query reports the error
> >> below:
> >>
> >>
> >> Is there a good way to handle data containing NULLs directly? Thanks.
> >>
> >>
> >> 吴彪 (Wu Biao)
> >>
> >> JD.COM [Big Data Department - Big Data R&D - Platform Infrastructure R&D]
> >>
> >> ---
> >>
> >> Address: 10F, Building A, Beichen Century Center, Chaoyang District, Beijing
> >>
> >> Postcode: 100195
> >>
> >> Mobile: 18810848987
> >>
> >> Email: *sywub...@jd.com <sywub...@jd.com?subject=sywub...@jd.com>*
> >>
> >> 
> >>
> >> [image: j39]
> >>
> >>
> >>
> >>
> >>
>


Fwd: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)

2016-07-14 Thread Lei Chang
It looks like our mailing list has some issues; no one saw this email. Forwarding
from my apache email.


-- Forwarded message --
From: 吴彪 <sywub...@jd.com>
Date: 2016-07-14 19:05 GMT+08:00
Subject: PXF读取HDFS数据NULL问题 (NULL issue when reading HDFS data via PXF)
To: dev@hawq.incubator.apache.org
Cc: 周龙波 <zhoulon...@jd.com>, Brian Lu <b...@pivotal.io>, Lei Chang <
lch...@pivotal.io>


Hello,

I ran into the following problems when using PXF to read HDFS data; the data on HDFS
was generated by Hive:

1. Some String-type fields in Hive are NULL, and the HAWQ query reports the error
below; after the NULL fields are converted to non-NULL values in Hive, the query
works fine in HAWQ.


2. Some Bigint-type fields in Hive are NULL, and the HAWQ query reports the error
below:


Is there a good way to handle data containing NULLs directly? Thanks.


吴彪 (Wu Biao)

JD.COM [Big Data Department - Big Data R&D - Platform Infrastructure R&D]

---

Address: 10F, Building A, Beichen Century Center, Chaoyang District, Beijing

Postcode: 100195

Mobile: 18810848987

Email: *sywub...@jd.com <sywub...@jd.com?subject=sywub...@jd.com>*



[image: j39]


Re: [Propose] More data skipping technology for IO intensive performance enhancement

2016-07-13 Thread Lei Chang
It looks like we have consensus on this enhancement. I just created a JIRA to track
it: https://issues.apache.org/jira/browse/HAWQ-923

I also added it to the performance enhancements section on the roadmap page.

Cheers
Lei


On Mon, Jul 11, 2016 at 2:00 PM, Ming Li  wrote:

> It seems the dynamic partition pruning in Impala is different from the DPE
> (dynamic partition elimination) in HAWQ; below is the feature description
> from the Impala roadmap (http://impala.io/overview.html).
>
>
>- Dynamic partition pruning - to perform data elimination of queries
>where the partition filters are in dimension tables instead of the fact
>tables
>
>
> On Fri, Jul 8, 2016 at 9:56 PM, Ruilong Huo  wrote:
>
> > Strongly agree with Ming's proposal.
> >
> > We do have DPE (dynamic partition elimination) in HAWQ, but it is a kind
> > of high-level skipping that is conducted at the planning phase.
> > If fine-grained filtering can be done at runtime in the execution phase,
> > there might be more performance gain for I/O-intensive workloads.
> >
> > Looking forward to seeing a plan for it soon :)
> >
> > Best regards,
> > Ruilong Huo
> >
> > On Fri, Jul 8, 2016 at 7:02 AM, Ivan Weng  wrote:
> >
> > > Thanks Ming, data skipping technology is really what HAWQ needs.
> > > Hope to see the design and maybe a prototype soon.
> > >
> > > On Thu, Jul 7, 2016 at 10:33 AM, Wen Lin  wrote:
> > >
> > > > Thanks for sharing with us!
> > > > It's really a good investigation and proposal.
> > > > Looking forward to a design draft.
> > > >
> > > > On Thu, Jul 7, 2016 at 10:16 AM, Lili Ma  wrote:
> > > >
> > > > > What about we work out a draft design describing how to implement
> > data
> > > > > skipping technology for HAWQ?
> > > > >
> > > > >
> > > > > Thanks
> > > > > Lili
> > > > >
> > > > > On Wed, Jul 6, 2016 at 7:23 PM, Gmail 
> wrote:
> > > > >
> > > > > > BTW, could you create some related issues in JIRA?
> > > > > >
> > > > > > Thanks
> > > > > > xunzhang
> > > > > >
> > > > > > Send from my iPhone
> > > > > >
> > > > > > > On Jul 2, 2016, at 23:19, Ming Li wrote:
> > > > > > >
> > > > > > > Data skipping technology can avoid a great deal of unnecessary
> > > > > > > IO, so it can significantly enhance performance for IO-intensive
> > > > > > > queries. Besides eliminating scans of unnecessary table
> > > > > > > partitions according to the partition key range, I think more
> > > > > > > options are available now:
> > > > > > >
> > > > > > > (1) The Parquet / ORC formats introduce lightweight metadata
> > > > > > > such as Min/Max/Bloom filters for each block; such metadata can
> > > > > > > be exploited when the predicate/filter info can be fetched
> > > > > > > before executing the scan.
> > > > > > >
> > > > > > > However, in HAWQ today all data in a Parquet table needs to be
> > > > > > > scanned into memory before the predicate/filter is processed. We
> > > > > > > don't generate the metadata when INSERTing into a Parquet table,
> > > > > > > and the scan executor doesn't utilize the metadata either. Maybe
> > > > > > > some scan APIs need to be refactored so that we can get the
> > > > > > > predicate/filter info before executing the base relation scan.
> > > > > > >
> > > > > > > (2) Based on technology (1), especially with Bloom filters, more
> > > > > > > optimizer technology can be explored further. E.g. Impala
> > > > > > > implemented runtime filtering (
> > > > > > > https://www.cloudera.com/documentation/enterprise/latest/topics/impala_runtime_filtering.html
> > > > > > > ), which can be used for:
> > > > > > > - dynamic partition pruning
> > > > > > > - converting join predicates into base relation predicates
> > > > > > >
> > > > > > > It tells the executor to wait for a moment (the interval can be
> > > > > > > set in a GUC) before executing the base relation scan. If the
> > > > > > > interesting values (e.g. when the column in the join predicate
> > > > > > > has only a very small value set) arrive in time, it can use
> > > > > > > these values to filter the scan; if they don't arrive in time,
> > > > > > > it scans without the filter, which doesn't affect result
> > > > > > > correctness.
> > > > > > >
> > > > > > > Unlike technology (1), this technology cannot be used in every
> > > > > > > case; it only outperforms in some cases. So it just adds some
> > > > > > > more query plan choices/paths, and the optimizer needs to
> > > > > > > calculate the cost based on statistics and apply the filter when
> > > > > > > it lowers the cost.
> > > > > > >
> > > > > > > All in all, maybe more similar technologies can be adopted
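
A hedged illustration of the block-skipping idea above, with a made-up table;
the DDL is ordinary HAWQ Parquet storage syntax, while the skipping behaviour
described in the comments is the proposed enhancement, not what the executor
does today:

    -- Hypothetical Parquet table; names and dates are illustrative only.
    CREATE TABLE sales_parquet (
        sale_id   bigint,
        sale_date date,
        amount    numeric
    )
    WITH (appendonly = true, orientation = parquet)
    DISTRIBUTED RANDOMLY;

    -- If per-block min/max metadata were written at INSERT time and checked
    -- before the scan, a query like this could skip every block whose
    -- [min(sale_date), max(sale_date)] range does not overlap the filter,
    -- instead of reading all blocks into memory first.
    SELECT sum(amount)
    FROM   sales_parquet
    WHERE  sale_date BETWEEN '2016-06-01' AND '2016-06-30';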

Re: A question of HAWQ

2016-07-13 Thread Lei Chang
please see items on the roadmap page here:
https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Roadmap

Cheers
Lei


On Thu, Jul 14, 2016 at 8:44 AM, Paul Guo  wrote:

> HAWQ does not support UPDATE and DELETE yet, but the feature is on the
> roadmap.
>
> 2016-07-13 23:13 GMT+08:00 Wales Wang :
>
> > HAWQ is append-only.
> > Actian VectorH has full features, including index, update, and delete support.
> >
> > Wales Wang
> >
> > On 2016-07-13, at 10:10 AM, jinzhy wrote:
> >
> > > Hello everybody,
> > >
> > >   Can HAWQ or Pivotal HD support the 'delete' or 'update' operation on
> > > HDFS now? I can only create append-only tables on my machine.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> ---
> > > Confidentiality Notice: The information contained in this e-mail and
> any
> > accompanying attachment(s)
> > > is intended only for the use of the intended recipient and may be
> > confidential and/or privileged of
> > > Neusoft Corporation, its subsidiaries and/or its affiliates. If any
> > reader of this communication is
> > > not the intended recipient, unauthorized use, forwarding, printing,
> > storing, disclosure or copying
> > > is strictly prohibited, and may be unlawful.If you have received this
> > communication in error,please
> > > immediately notify the sender by return e-mail, and delete the original
> > message and all copies from
> > > your system. Thank you.
> > >
> >
> ---
> >
>
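
A short sketch of what "append-only" means for the question above; the table
name is hypothetical, and the statements marked as unsupported are commented
out because this storage type accepts inserts but rejects row-level updates
and deletes:

    -- Hypothetical append-only table.
    CREATE TABLE events (
        event_id bigint,
        payload  text
    )
    WITH (appendonly = true)
    DISTRIBUTED RANDOMLY;

    INSERT INTO events VALUES (1, 'ok');                    -- supported
    -- UPDATE events SET payload = 'x' WHERE event_id = 1;  -- not supported
    -- DELETE FROM events WHERE event_id = 1;               -- not supported

    -- A common workaround is to rebuild the data with CTAS and swap tables:
    CREATE TABLE events_new WITH (appendonly = true) AS
    SELECT event_id, payload FROM events WHERE event_id <> 1;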


Re: [Propose] Create a new HAWQ roadmap page

2016-07-13 Thread Lei Chang
Added to the page:
https://cwiki.apache.org/confluence/display/HAWQ/HAWQ+Roadmap

It can be incrementally updated.

Cheers
Lei




On Wed, Jun 29, 2016 at 5:23 PM, Lei Chang <lei_ch...@apache.org> wrote:

>
> I classified the items into the following categories. appreciate your
> comments.
>
> Cloud related:
> [HAWQ-308] - S3 Integration
> [HAWQ-310] - Snapshot support
>
> Data Management Functionality Enhancement
> [HAWQ-786] - Framework to support pluggable formats and file systems
> [HAWQ-864] - Support ORC as a native file format
> [HAWQ-150] - External tables can be designated for both READ and WRITE
> [HAWQ-304] - Support update and delete on non-heap tables
> [HAWQ-401] - json type support
> [HAWQ-319] - REST API for HAWQ
> [HAWQ-312] - Multiple active master support
>
> Performance enhancement
> [HAWQ-303] - Index support for non-heap tables
>
> Languages & Analytics
> [HAWQ-321] - Support plpython3u
>
> Ecosystem:
> [HAWQ-256] - Integrate Security with Apache Ranger
> [HAWQ-29] - Refactor HAWQ InputFormat to support Spark/Scala
>
> Management & Build
> [HAWQ-8] - Installing the HAWQ Software thru the Apache Ambari
> [HAWQ-311] - Data Transfer tool
> [HAWQ-326] - Support RPM build for HAWQ
>
> Cheers
> Lei
>
>
>
>
> On Fri, Jun 24, 2016 at 5:10 PM, Lei Chang <lei_ch...@apache.org> wrote:
>
>>
>> Nice, I created a page and we can discuss the items and put them on the
>> page.
>>
>> For the items, I think it makes sense to add at least items in the jira
>> roadmap panel, here are some major ones I extracted from the panel. looks
>> better to classify them into categories.
>>
>> [HAWQ-786] - Framework to support pluggable formats and file systems
>> [HAWQ-864] - Support ORC as a native file format
>> [HAWQ-308] - S3 Integration
>> [HAWQ-256] - Integrate Security with Apache Ranger
>> [HAWQ-150] - External tables can be designated for both READ and WRITE
>> [HAWQ-303] - Index support for non-heap tables
>> [HAWQ-304] - Support update and delete on non-heap tables
>> [HAWQ-310] - Snapshot support
>> [HAWQ-312] - Multiple active master support
>> [HAWQ-319] - REST API for HAWQ
>> [HAWQ-321] - Support plpython3u
>> [HAWQ-29] - Refactor HAWQ InputFormat to support Spark/Scala
>> [HAWQ-311] - Data Transfer tool
>> [HAWQ-326] - Support RPM build for HAWQ
>> [HAWQ-8] - Installing the HAWQ Software thru the Apache Ambari
>> [HAWQ-752] - build pxf compatible with Apache Hadoop
>> [HAWQ-401] - json type support
>>
>> Cheers
>> Lei
>>
>>
>>
>> On Thu, Jun 23, 2016 at 11:23 PM, Vineet Goel <vvin...@apache.org> wrote:
>>
>>> +1 too
>>>
>>> I can help start a draft on the wiki based on historical user requests
>>> and
>>> trends in the ecosystem. And of course, the roadmap is a living and
>>> breathing document which will continue to evolve over time based on
>>> continuous feedback, and more.
>>>
>>> -Vineet
>>>
>>>
>>> On Thu, Jun 23, 2016 at 8:18 AM, Kavinder Dhaliwal <kdhali...@pivotal.io
>>> >
>>> wrote:
>>>
>>> > +1 I'm in favor of this. The Zeppelin roadmap is very community driven
>>> and
>>> > having something similar for HAWQ will go a long way to getting more
>>> > feedback about the overall direction and goals of HAWQ.
>>> >
>>> > On Thu, Jun 23, 2016 at 2:02 AM, Lei Chang <lei_ch...@apache.org>
>>> wrote:
>>> >
>>> > > Hi Guys,
>>> > >
>>> > > I noticed there are a lot of requests about hawq roadmaps coming
>>> from the
>>> > > offline hawq activities (meetup et al).
>>> > >
>>> > > Although we have the list of backlog JIRAs on our JIRA page
>>> > > <
>>> > >
>>> >
>>> https://issues.apache.org/jira/browse/HAWQ/?selectedTab=com.atlassian.jira.jira-projects-plugin:roadmap-panel
>>> > > >.
>>> > > But it does not give a high level description. A good example from
>>> other
>>> > > communities is here:
>>> > >
>>> https://cwiki.apache.org/confluence/display/ZEPPELIN/Zeppelin+Roadmap
>>> > >
>>> > > So I am proposing we have a similar HAWQ Roadmap page maintained on
>>> our
>>> > > wiki page.
>>> > >
>>> > > Thoughts?
>>> > >
>>> > > Cheers
>>> > > Lei
>>> > >
>>> >
>>>
>>
>>
>


Re: Question on hawq_rm_nvseg_perquery_limit

2016-07-13 Thread Lei Chang
On Wed, Jul 13, 2016 at 3:16 PM, Vineet Goel  wrote:

> This leads me to another question on Apache Ambari UI integration.
>
> It seems the need to tune hawq_rm_nvseg_perquery_limit is minimal, as we
> seem to prescribe a limit of 512 regardless of cluster size. If that's the
> case, two options come to mind:
>
> 1) Either the "default" hawq_rm_nvseg_perquery_limit should be the lower
> value between (6 * segment host count) and 512. This way, it's less
> confusing to users and there is a logic behind the value.
>

If Ambari uses the lower value, it becomes difficult to change
hawq_rm_nvseg_perquery_perseg_limit afterwards.

For example, if we want to change hawq_rm_nvseg_perquery_perseg_limit to 8
for better performance on a lower-concurrency workload, that is no longer
doable.


>
> 2) Or, the parameter should not be exposed on the UI, leaving the default
> to 512. When/why would a user want to change this value?
>

I think this is an advanced configuration used only in some cases, so not
exposing it is fine, but I think we still need a way to change it.

If users want to increase the maximum degree of parallelism, they should
change this. For example, if the end-user workload consists of just some
simple, easy-to-scale queries on a large cluster, it is fine to tune the
value up.


>
> Thoughts?
>
> Vineet
>
>
> On Tue, Jul 12, 2016 at 11:51 PM, Hubert Zhang  wrote:
>
> > +1 with Yi's answer.
> > Vseg numbers are controlled by the Resource Negotiator (a module before
> > the planner); all the vseg-related GUCs affect the behaviour of the RN,
> > and some of them also affect the Resource Manager.
> > To be specific, hawq_rm_nvseg_perquery_limit and
> > hawq_rm_nvseg_perquery_perseg_limit are both considered by the Resource
> > Negotiator (RN) and the Resource Manager (RM), while
> > default_hash_table_bucket_number is only considered by the RN.
> > As a result, suppose default_hash_table_bucket_number = 60: a query like
> > "select * from hash_table" will request 60 vsegs in the RN, and if
> > hawq_rm_nvseg_perquery_limit is less than 60, the RM will not be able to
> > allocate 60 vsegs.
> >
> > So we need to ensure default_hash_table_bucket_number is less than the
> > capacity of RM.
> >
> > On Wed, Jul 13, 2016 at 1:40 PM, Yi Jin  wrote:
> >
> > > Hi Vineet,
> > >
> > > A few comments from me.
> > >
> > > For question 1.
> > > Yes,
> > > perquery_limit is introduced mainly to restrict resource usage in a
> > > large-scale cluster; perquery_perseg_limit is to avoid allocating too
> > > many processes in one segment, which may cause serious performance
> > > issues. So, the two GUCs address different performance aspects. As the
> > > cluster scale varies, only one of the two limits actually takes effect;
> > > we don't have to keep both active for resource allocation.
> > >
> > > For question 2.
> > >
> > > In fact, perquery_perseg_limit is a general resource restriction for all
> > > queries, not only hash table queries and external table queries; this is
> > > why this GUC is not merged with the other one. For example, when we run
> > > some queries on randomly distributed tables, it does not make sense to
> > > let the resource manager consult a GUC meant for hash tables.
> > >
> > > For the last topic item.
> > >
> > > In my opinion, it is not necessary to adjust
> > > hawq_rm_nvseg_perquery_limit; we can just leave it unchanged, and it is
> > > effectively inactive until we really want to run a large-scale HAWQ
> > > cluster, for example, 100+ nodes.
> > >
> > > Best,
> > > Yi
> > >
> > > On Wed, Jul 13, 2016 at 1:18 PM, Vineet Goel  wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m trying to document some GUC usage in detail and have questions on
> > > > hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit
> > > > tuning.
> > > >
> > > > *hawq_rm_nvseg_perquery_limit* (default value = 512). Let’s call it
> > > > *perquery_limit* for short.
> > > > *hawq_rm_nvseg_perquery_perseg_limit* (default value = 6). Let’s call
> > > > it *perquery_perseg_limit* for short.
> > > >
> > > >
> > > > 1) Is there ever any benefit in having perquery_limit *greater than*
> > > > (perquery_perseg_limit * segment host count)?
> > > > For example, in a 10-node cluster HAWQ will never allocate more than
> > > > (GUC default 6 * 10 =) 60 v-segs, so the perquery_limit default of 512
> > > > doesn’t have any effect. It seems perquery_limit overrides (takes
> > > > effect over) perquery_perseg_limit only when its value is less than
> > > > (perquery_perseg_limit * segment host count).
> > > >
> > > > Is that the correct assumption? That would make sense, as users may
> > > > want to keep a check on how much processing a single query can take up
> > > > (which implies that the limit must be lower than the total possible
> > > > v-segs). Or, it may make sense in large clusters (100 nodes or more)
> > > > where we need to limit the pressure on HDFS.
> > > >
> > > >
> > > > 
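
A small sketch of how the GUCs discussed above can be inspected and adjusted;
the chosen values only echo the numbers in this thread, and whether a given
GUC is honored at the session level (versus requiring a hawq-site.xml change
plus a restart) should be confirmed for the HAWQ version in use:

    -- Inspect the current limits from psql.
    SHOW hawq_rm_nvseg_perquery_limit;           -- default 512
    SHOW hawq_rm_nvseg_perquery_perseg_limit;    -- default 6
    SHOW default_hash_table_bucket_number;

    -- Example: raise the per-segment limit for a low-concurrency workload,
    -- as in the discussion above (assumes session-level SET is allowed).
    SET hawq_rm_nvseg_perquery_perseg_limit = 8;

    -- Keep default_hash_table_bucket_number within the resource manager's
    -- capacity, i.e. no larger than hawq_rm_nvseg_perquery_limit.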

[jira] [Created] (HAWQ-919) pxf has some issues on RAT check

2016-07-12 Thread Lei Chang (JIRA)
Lei Chang created HAWQ-919:
--

 Summary: pxf has some issues on RAT check
 Key: HAWQ-919
 URL: https://issues.apache.org/jira/browse/HAWQ-919
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: PXF
Reporter: Lei Chang
Assignee: Goden Yao






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: sanity-check before running cases in feature-test

2016-07-12 Thread Lei Chang
I think the better way is to let test cases run only when their
prerequisites are met.

For example, pl/python is optional; if the user did not run configure with
the pl/python option, the tests that depend on pl/python should not run.

Cheers
Lei



On Tue, Jul 12, 2016 at 2:15 PM, Ivan Weng  wrote:

> Agreed with Hong. A test case should check the environment it needs. If
> the check fails, it should terminate the execution and report the error.
>
> On Tue, Jul 12, 2016 at 2:04 PM, Hong Wu  wrote:
>
> > It is the users/developers themselves who should take care of this. Say,
> > if you write a test case that is related to plpython, why don't you
> > configure HAWQ with the "--with-python" option? We should write a README
> > for feature-test that guides users through running these tests, for
> > example, telling them to source "greenplum.sh" before running the tests.
> >
> > Consequently, I think adding such a sanity check is a little bit of
> > over-engineering that will bring extra problems and complexity.
> >
> > Best
> > xunzhang
> >
> > 2016-07-12 13:47 GMT+08:00 Paul Guo :
> >
> > > I have run into feature test failures more than once because of
> > > missing dependencies.
> > >
> > > e.g.
> > >
> > > 1. I did not have pl/python installed in my hawq build, so
> > >UDF/sql/function_set_returning.sql fails at "create language
> > >plpythonu", which makes the case fail.
> > >
> > > 2. Sometimes I forgot to source greenplum.sh, and then all cases failed
> > > due to the missing psql.
> > >
> > > We seem to be able to improve this.
> > >
> > > 1) Sanity-check the existence of some files in common code, e.g.
> > > psql, gpdiff.pl.
> > >
> > > 2) Some cases could do a sanity check in their own test constructor
> > > functions, e.g. if a case uses the plpython extension, the test case
> > > should check for it itself.
> > >
> > > More thoughts?
> > >
> >
>
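
One possible shape for such a pre-check, sketched as the SQL a test harness
could run before deciding whether to execute the pl/python cases; the catalog
query itself is standard, but wiring it into the feature-test framework is an
assumption, not something specified in this thread:

    -- Hypothetical pre-check: skip (rather than fail) pl/python cases when
    -- the language is not installed in the database under test.
    SELECT count(*) AS plpythonu_installed
    FROM   pg_language
    WHERE  lanname = 'plpythonu';
    -- A similar up-front check that psql and gpdiff.pl are on the PATH would
    -- catch the "forgot to source greenplum.sh" failure mode early.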


Re: Replace git submodule with git clone + file with commit number?

2016-07-07 Thread Lei Chang
I think what Roman means is the new "2.0.0.0-incubating" branch that will
be cut.

Cheers
Lei



On Fri, Jul 8, 2016 at 10:27 AM, Paul Guo  wrote:

> Thanks Hong,
>
> To be clear,
>
> The original patch is:
> commit 497ae5db996094150e475659e06eea929e209841
> Author: Paul Guo 
> Date:   Thu Jun 30 18:34:27 2016 +0800
>
> HAWQ-867. Replace the git-submobule mechanism with git-clone
>
> Following further patch is trivial,
> commit 3ba7d8ef4b27b11bec02beefe0dd037698687175
> Author: Paul Guo 
> Date:   Tue Jul 5 18:16:17 2016 +0800
>
> HAWQ-888. Remove some dummy files which were used to keep related
> "empty" directories.
>
>
> 2016-07-08 10:10 GMT+08:00 Hong Wu :
>
> > Hi roman,
> >
> > I found his related pull requests for this thread and listed below,
> >
> > https://github.com/apache/incubator-hawq/pull/773
> > https://github.com/apache/incubator-hawq/pull/771
> > https://github.com/apache/incubator-hawq/pull/762
> >
> > All of the above pull requests have been merged into master because they
> > blocked the latest release on GitHub. Further comments are welcome,
> > thanks.
> >
> > Best
> > xunzhang
> >
> > 2016-07-08 2:04 GMT+08:00 Roman Shaposhnik :
> >
> > > On Thu, Jul 7, 2016 at 3:20 AM, Paul Guo  wrote:
> > > > For gporca it is OK to pre-build it and pass the orca installation
> > > > path to hawq, but for pgcrypto and plr, having a script to run before
> > > > building hawq does not seem to be a good idea, technically speaking.
> > > >
> > > > plr/pgcrypto depend on the configure options and configure checks
> > > > (e.g. with and without the openssl option in configure, the pgcrypto
> > > > build results will be different).
> > > >
> > > > That means the building of these features is not 100% independent of
> > > > the building of hawq.
> > >
> > > The above makes sense, but there's way too many ways to interpret the
> > > particulars of it. Before we move ahead, how about I take a look at the
> > > branch that is being cut (see the other thread) and provide you more
> > > technical feedback?
> > >
> > > Thanks,
> > > Roman.
> > >
> >
>


Re: Can we create one wiki page for FAQ?

2016-07-07 Thread Lei Chang
You can just click "insert", and then select "table of contents".

I added it.

Cheers
Lei



On Thu, Jul 7, 2016 at 5:53 PM, Ming Li  wrote:

> Thanks all for your great ideas.
>
> The problem is I don't know how to create an auto-updating index in the
> wiki. If you know how to do it, please update the wiki directly. The
> contents also need your input. Really appreciate your contributions!
>

