Re: Remain with HAWQ project or not?

2018-05-08 Thread jiali yao
Yes, I want to remain with HAWQ.

Thanks
Jiali

On Tue, May 8, 2018 at 1:40 PM, Zhanwei Wang  wrote:

> Yes, I would like to remain a committer.
>
>
> > On May 8, 2018, at 13:26, Hong  wrote:
> >
> > Y
> >
> > 2018-05-08 1:05 GMT-04:00 stanly sheng :
> >
> >> Yes, I want to remain with HAWQ
> >>
> >> 2018-05-08 12:16 GMT+08:00 Paul Guo :
> >>
> >>> Yes. Thanks, Radar, for driving HAWQ graduation.
> >>>
> >>> 2018-05-08 12:02 GMT+08:00 Lirong Jian :
> >>>
>  Yes, I would like to remain a committer.
> 
>  Lirong
> 
>  Lirong Jian
>  HashData Inc.
> 
>  2018-05-08 10:04 GMT+08:00 Hubert Zhang :
> 
> > Yes.
> >
> > On Tue, May 8, 2018 at 9:30 AM, Lili Ma  wrote:
> >
> >> Yes, of course I want to remain as PMC member!
> >>
> >> Thanks Radar for the effort on HAWQ graduation:)
> >>
> >> Best Regards,
> >> Lili
> >>
> >> 2018-05-07 20:07 GMT-04:00 Lisa Owen :
> >>
> >>> yes, i would like to remain a committer.
> >>>
> >>>
> >>> -lisa owen
> >>>
> >>> On Mon, May 7, 2018 at 10:02 AM, Shubham Sharma <
> >>> ssha...@pivotal.io>
> >>> wrote:
> >>>
>  Yes. I am looking forward to contributing to HAWQ.
> 
>  On Mon, May 7, 2018 at 12:53 PM, Lav Jain 
>  wrote:
> 
> > Yes. I am very excited about HAWQ.
> >
> > Regards,
> >
> >
> > *Lav Jain*
> > *Pivotal Data*
> >
> > lj...@pivotal.io
> >
> > On Mon, May 7, 2018 at 6:51 AM, Alexander Denissov <
> >>> adenis...@pivotal.io
> >
> > wrote:
> >
> >> Yes.
> >>
> >>> On May 7, 2018, at 6:03 AM, Wen Lin 
> >>> wrote:
> >>>
> >>> Yes. I'd like to keep on contributing to HAWQ.
> >>>
>  On Mon, May 7, 2018 at 5:21 PM, Ivan Weng <
> >>> iw...@pivotal.io
> >
> >>> wrote:
> 
>  Yes, I definitely would like to be with HAWQ.
> 
>  Regards,
>  Ivan
> 
> > On Mon, May 7, 2018 at 5:12 PM, Hongxu Ma <
> > inte...@outlook.com
> >>>
> > wrote:
> >
> > Yes, let's make HAWQ better.
> >
> > Thanks.
> >
> >> On 07/05/2018 16:11, Radar Lei wrote:
> >> HAWQ committers,
> >>
> >> Per the discussion in "Apache HAWQ graduation from incubator?" [1],
> >> we want to set up the PMC as part of the HAWQ graduation resolution.
> >>
> >> So we'd like to confirm: do you want to remain a committer/PMC
> >> member of the Apache HAWQ project?
> >>
> >> If you'd like to remain with the HAWQ project, you are welcome;
> >> please *respond 'Yes'* in this thread, or *respond 'No'* if you are
> >> no longer interested. Thanks.
> >>
> >> This thread will be available for at least 72 hours; after that, we
> >> will send individual confirmation emails.
> >>
> >> [1]
> >> https://lists.apache.org/thread.html/b4a0b5671ce377b3d51c9b7ab00496a1eebfcbf1696ce8b67e078c64@%3Cdev.hawq.apache.org%3E
> >>
> >> Regards,
> >> Radar
> >>
> >
> > --
> > Regards,
> > Hongxu.
> >
> >
> 
> >>
> >
> 
> 
> 
>  --
>  Regards,
>  Shubham Sharma
>  Staff Customer Engineer
>  Pivotal Global Support Services
>  ssha...@pivotal.io
>  Direct Tel: +1(510)-304-8201
>  Office Hours: Mon-Fri 9:00 am to 5:00 pm PDT
>  Out of Office Hours Contact +1 877-477-2269
> 
> >>>
> >>
> >
> >
> >
> > --
> > Thanks
> >
> > Hubert Zhang
> >
> 
> >>>
> >>
> >>
> >>
> >> --
> >> Best Regards,
> >> Xiang Sheng
> >>
>
>


Re: [ANNOUNCE] Apache HAWQ 2.3.0.0-incubating Release

2018-03-20 Thread jiali yao
Cool!!

Thanks Yi and all the contributors for the release.



On Wed, Mar 21, 2018 at 10:41 AM, stanly sheng 
wrote:

> Great!!! Thanks Yi and all the contributors for the release.
>
> 2018-03-21 10:27 GMT+08:00 Yi JIN :
>
> > Apache HAWQ (incubating) Project Team is proud to announce Apache
> > HAWQ 2.3.0.0-incubating has been released.
> >
> > Apache HAWQ (incubating) combines exceptional MPP-based analytics
> > performance, robust ANSI SQL compliance, Hadoop ecosystem
> > integration and manageability, and flexible data-store format
> > support, all natively in Hadoop, no connectors required. Built
> > from a decade’s worth of massively parallel processing (MPP)
> > expertise developed through the creation of the Pivotal
> > Greenplum® enterprise database and open source PostgreSQL, HAWQ
> > enables you to swiftly and interactively query Hadoop data,
> > natively via HDFS.
> >
> > *Download Link*:
> > https://dist.apache.org/repos/dist/release/incubator/hawq/2.3.0.0-incubating/
> >
> > *About this release*
> > This release contains both source code and binaries.
> >
> > All changes:
> > https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+2.3.0.0-incubating+Release
> >
> >
> > *HAWQ Resources:*
> >
> >- JIRA: https://issues.apache.org/jira/browse/HAWQ
> >- Wiki: https://cwiki.apache.org/confluence/display/HAWQ/Apache+HAWQ+Home
> >- Mailing list(s): dev@hawq.incubator.apache.org
> >   u...@hawq.incubator.apache.org
> >
> > *Know more about HAWQ:*
> > http://hawq.apache.org
> >
> > - Apache HAWQ (incubating) Team
> >
> > =
> > *Disclaimer*
> >
> > Apache HAWQ (incubating) is an effort undergoing incubation at The
> > Apache Software Foundation (ASF), sponsored by the Apache
> > Incubator PMC. Incubation is required of all newly accepted
> > projects until a further review indicates that the
> > infrastructure, communications, and decision making process have
> > stabilized in a manner consistent with other successful ASF
> > projects. While incubation status is not necessarily a reflection
> > of the completeness or stability of the code, it does indicate
> > that the project has yet to be fully endorsed by the ASF.
> >
>
>
>
> --
> Best Regards,
> Xiang Sheng
>


Re: Re: Re: [ANNOUNCE] Apache HAWQ 2.2.0.0-incubating Released

2017-07-13 Thread jiali yao
Great! Congratulations to the team!

+1 for Yi volunteering for the next release.

Looking forward to the next release, and to HAWQ's future!



Thanks
Jiali

On Thu, Jul 13, 2017 at 3:38 PM, Ruilong Huo  wrote:

> Thanks for your tremendous contribution to the project, and for volunteering!
> You deserve the #1 committer spot, with 87k+ lines of code changed in 160+
> commits!
>
> From my perspective, it will be great to have you as the release manager
> for the next release. I hope we can reach graduation as the project
> matures and the diversity of the community grows fast, with committers
> from every corner of the world.
>
>
> You have my full support through the release, as well as that of the
> HAWQ community!
>
>
> Best regards,
> Ruilong Huo
>
>
> At 2017-07-13 15:16:31, "Wen Lin"  wrote:
> >Congratulations!
> >Thanks Ruilong for all the efforts on release!
> >Thanks Yi for volunteering for next release!
> >
> >Regards!
> >
> >On Thu, Jul 13, 2017 at 3:04 PM, Ed Espino  wrote:
> >
> >> Yi,
> >>
> >> +1 to your offer to be the Release Manager for the next Apache HAWQ
> >> release. We all know your past contributions well. Thank you for
> >> volunteering.
> >>
> >> Regards,
> >> -=e
> >>
> >> On Wed, Jul 12, 2017 at 11:48 PM, Yi JIN  wrote:
> >>
> >> > Hi Ruilong,
> >> >
> >> > I would like to take this responsibility as a volunteer for the next
> >> > release. As a committer I have contributed a lot of code to Apache
> >> > HAWQ; besides code work, if possible I would like to contribute in
> >> > another way and learn more about growing an Apache project.
> >> >
> >> > Best,
> >> > Yi (yjin)
> >> >
> >> > On Thu, Jul 13, 2017 at 4:43 PM, HuoRuilong 
> wrote:
> >> >
> >> > > Great step towards a mature HAWQ and an active community! Thanks
> >> > > everyone for making this real, and special thanks for the help from Ed!
> >> > >
> >> > > To make this a more successful Apache project and community, we need
> >> > > to keep up the release cadence. Who would like to volunteer to be the
> >> > > next release manager and drive the effort? Thanks.
> >> > >
> >> > > Best regards,
> >> > > Ruilong Huo
> >> > >
> >> > >
> >> > > At 2017-07-13 14:39:21, "Lili Ma"  wrote:
> >> > > >Congratulations everyone :)
> >> > > >
> >> > > >We're stepping further towards graduation!
> >> > > >
> >> > > >Best Regards,
> >> > > >Lili
> >> > > >
> >> > > >2017-07-13 13:16 GMT+08:00 Ed Espino :
> >> > > >
> >> > > >> Congratulations to everyone on the first Apache HAWQ release with
> >> > > >> convenience binaries. Special thanks to Ruilong for his excellent
> >> > > release
> >> > > >> management guidance.
> >> > > >>
> >> > > >> I'm very proud to be part of a great dev team.
> >> > > >>
> >> > > >> Cheers,
> >> > > >> -=e
> >> > > >>
> >> > > >> On Wed, Jul 12, 2017 at 10:00 PM, 陶征霖 
> wrote:
> >> > > >>
> >> > > >> > Congrats!
> >> > > >> >
> >> > > >> > 2017-07-13 9:55 GMT+08:00 Yandong Yao :
> >> > > >> >
> >> > > >> > > Great achievement, Congrats!
> >> > > >> > >
> >> > > >> > > On Thu, Jul 13, 2017 at 8:46 AM, Lei Chang <
> >> > chang.lei...@gmail.com>
> >> > > >> > wrote:
> >> > > >> > >
> >> > > >> > > > Congrats!
> >> > > >> > > >
> >> > > >> > > > Cheers
> >> > > >> > > > Lei
> >> > > >> > > >
> >> > > >> > > >
> >> > > >> > > > On Wed, Jul 12, 2017 at 3:27 PM, Ruilong Huo <
> h...@apache.org
> >> >
> >> > > >> wrote:
> >> > > >> > > >
> >> > > >> > > > > Hi All,
> >> > > >> > > > >
> >> > > >> > > > > The Apache HAWQ (incubating) Project Team is proud to
> >> announce
> >> > > >> > > > > the release of Apache HAWQ 2.2.0.0-incubating.
> >> > > >> > > > >
> >> > > >> > > > > This is a source code and binary release.
> >> > > >> > > > >
> >> > > >> > > > > ABOUT HAWQ
> >> > > >> > > > > Apache HAWQ (incubating) combines exceptional MPP-based
> >> > > analytics
> >> > > >> > > > > performance, robust ANSI SQL compliance, Hadoop ecosystem
> >> > > >> integration
> >> > > >> > > > > and manageability, and flexible data-store format
> support,
> >> all
> >> > > >> > > > > natively in Hadoop, no connectors required.
> >> > > >> > > > >
> >> > > >> > > > > Built from a decade’s worth of massively parallel
> processing
> >> > > (MPP)
> >> > > >> > > > > expertise developed through the creation of open source
> >> > > Greenplum®
> >> > > >> > > > > Database and PostgreSQL, HAWQ enables you to
> >> > > >> > > > > swiftly and interactively query Hadoop data, natively via
> >> > HDFS.
> >> > > >> > > > >
> >> > > >> > > > > FEATURES AND ENHANCEMENTS INCLUDED IN THIS RELEASE
> >> > > >> > > > > - CentOS 7.x support
> >> > > >> > > > > Apache HAWQ is improved to be compatible with CentOS 7.x
> >> along
> >> > > with
> >> > > >> > > 6.x.
> >> > > >> > > > >
> >> > > >> > > > > - Apache Ranger integration
> >> > > >> > > > > Integrate Apache 

Re: [GitHub] incubator-hawq pull request #1240: HAWQ-1465. Disable alter schema help doc

2017-05-15 Thread jiali yao
Looks good to me

On Mon, May 15, 2017 at 10:02 AM, ztao1987  wrote:

> GitHub user ztao1987 opened a pull request:
>
> https://github.com/apache/incubator-hawq/pull/1240
>
> HAWQ-1465. Disable alter schema help doc
>
> testdb=# ALTER SCHEMA test1 owner to test1;
> ERROR: Cannot support alter schema owner statement yet
> testdb=# \h alter schema
> Command: ALTER SCHEMA
> Description: change the definition of a schema
> Syntax:
> ALTER SCHEMA name RENAME TO newname
> ALTER SCHEMA name OWNER TO newowner
>
> Disable this part of help doc
>
> You can merge this pull request into a Git repository by running:
>
> $ git pull https://github.com/ztao1987/incubator-hawq HAWQ-1465
>
> Alternatively you can review and apply these changes as the patch at:
>
> https://github.com/apache/incubator-hawq/pull/1240.patch
>
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
>
> This closes #1240
>
> 
> commit b206088da631de52933d45d370df4d13e65c9cff
> Author: ztao1987 
> Date:   2017-05-15T01:51:17Z
>
> HAWQ-1465. Disable alter schema help doc
>
> 
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
> with INFRA.
> ---
>


Re: HAWQ Ranger Integration Design Doc

2016-07-28 Thread Jiali Yao
Good to see ranger in HAWQ.

I have some questions:
1. If we want to use Ranger, must it be enabled from the initial phase, or
can we enable Ranger authorization on a database that is already running?
A similar question applies to upgrading from a non-Ranger HAWQ to a
Ranger-enabled HAWQ. Also, in the second part of "create user", if we only
have gpadmin, how do we map existing users?

2. If we use Ranger, will "create user in LDAP" be the only entry point for
user creation? Will we still support "create user" in HAWQ? If yes, will it
trigger a sync when a user is created?

3. How do we handle "drop user"? Will it drop all related policies in
Ranger? What about the user in Linux/LDAP?

Thanks

Jiali


On Thu, Jul 28, 2016 at 4:48 PM, Hubert Zhang  wrote:

> @ruilong
> Q1: Yes, you can tune the sync interval parameter in the UserSync conf
> file; the default is 5 minutes for Unix.
> Q2: If Ranger is down, queries in HAWQ cannot get their privileges checked
> and will be refused. New connections to HAWQ should be refused too.
>


Re: Question on hawq_rm_nvseg_perquery_limit

2016-07-13 Thread Jiali Yao
+1 for the detailed explanation.
One more note: normally we do not suggest setting
default_hash_table_bucket_number greater than hawq_rm_nvseg_perquery_limit
(512). When initializing a large cluster, default_hash_table_bucket_number
will be adjusted accordingly: if default_hash_table_bucket_number >
hawq_rm_nvseg_perquery_limit, it will be adjusted down to
(hawq_rm_nvseg_perquery_limit / hostnumber) * hostnumber.
If the cluster is expanded, the value also needs to be reset properly.
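For illustration (numbers assumed, and assuming the division above is
integer division): on a 100-node cluster with hawq_rm_nvseg_perquery_limit
= 512, a default_hash_table_bucket_number of 800 would be adjusted to
(512 / 100) * 100 = 5 * 100 = 500, i.e. the largest multiple of the host
count that stays within the limit.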

Jiali


On Wed, Jul 13, 2016 at 1:40 PM, Yi Jin  wrote:

> Hi Vineet,
>
> Some my comment.
>
> For question 1.
> Yes,
> perquery_limit is introduced mainly to restrict resource usage in
> large-scale clusters; perquery_perseg_limit is to avoid allocating too many
> processes in one segment, which may cause serious performance issues. So
> the two GUCs address different performance aspects. As cluster scale
> varies, only one of the two limits actually takes effect; we don't have to
> keep both active for resource allocation.
>
> For question 2.
>
> In fact, perquery_perseg_limit is a general resource restriction for all
> queries, not only hash table queries and external table queries; this is
> why this GUC is not merged with the other one. For example, when we run
> queries on randomly distributed tables, it does not make sense to let the
> resource manager consult a GUC meant for hash tables.
>
> For the last topic item.
>
> In my opinion, it is not necessary to adjust hawq_rm_nvseg_perquery_limit;
> we can just leave it unchanged (and effectively inactive) until we really
> want to run a large-scale HAWQ cluster, for example 100+ nodes.
>
> Best,
> Yi
>
> On Wed, Jul 13, 2016 at 1:18 PM, Vineet Goel  wrote:
>
> > Hi all,
> >
> > I’m trying to document some GUC usage in detail and have questions on
> > hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit
> > tuning.
> >
> > *hawq_rm_nvseg_perquery_limit* (default value = 512). Let's call it
> > *perquery_limit* for short.
> > *hawq_rm_nvseg_perquery_perseg_limit* (default value = 6). Let's call it
> > *perquery_perseg_limit* for short.
> >
> >
> > 1) Is there ever any benefit in having perquery_limit *greater than*
> > (perquery_perseg_limit * segment host count) ?
> > For example in a 10-node cluster, HAWQ will never allocate more than (GUC
> > default 6 * 10 =) 60 v-segs, so the perquery_limit default of 512 doesn’t
> > have any effect. It seems perquery_limit overrides (takes effect over)
> > perquery_perseg_limit only when its value is less than
> > (perquery_perseg_limit * segment host count).
> >
> > Is that the correct assumption? That would make sense, as users may want
> to
> > keep a check on how much processing a single query can take up (that
> > implies that the limit must be lower than the total possible v-segs). Or,
> > it may make sense in large clusters (100-nodes or more) where we need to
> > limit the pressure on HDFS.
> >
> >
> > 2) Now, if the purpose of hawq_rm_nvseg_perquery_limit is to keep a check
> > on single-query resource usage (by limiting the # of v-segs), doesn't it
> > affect default_hash_table_bucket_number, since queries will fail when
> > *default_hash_table_bucket_number* is greater than
> > hawq_rm_nvseg_perquery_limit? In that case, the purpose of
> > hawq_rm_nvseg_perquery_limit conflicts with the ability to run queries on
> > HASH-distributed tables. This then means that tuning
> > hawq_rm_nvseg_perquery_limit down is not a good idea, which seems to
> > conflict with the purpose of the GUC (in relation to the other GUC).
> >
> >
> > Perhaps someone can provide some examples of *how and when you would
> > tune hawq_rm_nvseg_perquery_limit* in this 10-node example:
> >
> > *Defaults on a 10-node cluster are:*
> > a) *hawq_rm_nvseg_perquery_perseg_limit* = 6 (hence the ability to spin
> > up 6 * 10 = 60 total v-segs for random tables)
> > b) *hawq_rm_nvseg_perquery_limit* = 512 (but HAWQ will never dispatch
> > more than 60 v-segs on random tables, so the value of 512 does not seem
> > practical)
> > c) *default_hash_table_bucket_number* = 60 (6 * 10)
> >
> >
> >
> > Thanks
> > Vineet
> >
>


Re: Rename "greenplum" to "hawq"

2016-07-13 Thread Jiali Yao
I think that would be more confusing, since there would be two files and
two environment variables with the same purpose.
I think it is worth investigating the impact on code and users, and then
replacing greenplum.

Jiali


On Wed, Jul 13, 2016 at 2:01 PM, Lili Ma  wrote:

> For the command line tool names, what about keeping greenplum_path.sh and
> adding a hawq_env.sh that calls greenplum_path.sh internally? Then existing
> users won't need to change their behavior, and new users can directly use
> the new hawq-named script.
>
> For the environment variables, for example GPHOME, can we create a new
> variable HAWQHOME and set it to the same value as GPHOME?
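> A minimal sketch of what that wrapper could look like (file contents
> assumed from the proposal above, not an existing script; the install path
> is a placeholder):
>
>     # hawq_env.sh -- thin wrapper so new users never have to know about
>     # greenplum_path.sh; existing users keep their current workflow.
>     source /usr/local/hawq/greenplum_path.sh   # sets GPHOME etc.
>     export HAWQHOME="$GPHOME"                  # new name, same value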
>
>
> Thanks
> Lili
>
> 2016-07-13 13:55 GMT+08:00 Yi Jin :
>
> > I think it is a must-do, but there are some concerns about customers'
> > usage conventions and legacy applications, scripts, etc.
> >
> > On Wed, Jul 13, 2016 at 1:44 PM, 陶征霖  wrote:
> >
> > > Good idea, but it needs quite a lot of effort and may also affect
> > > customer behavior. We should handle it carefully.
> > >
> > > 2016-07-13 9:54 GMT+08:00 Ivan Weng :
> > >
> > > > Agree with this good idea. But as Paul said, there may already be
> > > > many users using greenplum_path.sh or something else in their
> > > > environment, so we need to think about it.
> > > >
> > > >
> > > > Regards,
> > > > Ivan
> > > >
> > > > On Wed, Jul 13, 2016 at 9:31 AM, Paul Guo  wrote:
> > > >
> > > > > I've asked this before. It seems that it affects some old users;
> > > > > I'm not sure about the details.
> > > > > I agree that we should change it to a better name in a release.
> > > > >
> > > > > 2016-07-13 9:25 GMT+08:00 Roman Shaposhnik :
> > > > >
> > > > > > On Tue, Jul 12, 2016 at 6:21 PM, Xiang Sheng 
> > > > wrote:
> > > > > > > Agree, @xunzhang.
> > > > > > > However, some greenplum strings can be easily replaced, but
> > > > > > > there are too many in the code and comments. Changing all of
> > > > > > > them costs too much effort.
> > > > > > >
> > > > > > > So changing the strings that users can see is enough.
> > > > > >
> > > > > > Huge +1 to this! Btw, is this something we may be able to tackle
> in
> > > our
> > > > > > next Apache release?
> > > > > >
> > > > > > Thanks,
> > > > > > Roman.
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Zhenglin
> > >
> >
>


Re: sanity-check before running cases in feature-test

2016-07-12 Thread Jiali Yao
For the test case checking, I think it should report "SKIPPED" instead of
ERROR. The test case should check whether the feature is supported or not:
if supported, run the case; otherwise skip it.
Agreed that we should add this to the common lib.
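A minimal sketch of such a common-lib check (the helper name is made up for
illustration):

    # Report SKIPPED and exit cleanly instead of failing when a dependency
    # such as psql is missing.
    require_or_skip() {
        command -v "$1" >/dev/null 2>&1 || { echo "SKIPPED: $1 not found"; exit 0; }
    }
    require_or_skip psql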

On the other topic, I think sourcing greenplum_path.sh is a must; it is
environment related.

Thanks

Jiali



On Tue, Jul 12, 2016 at 2:19 PM, Lei Chang  wrote:

> I think the better way is to let test cases run under some conditions.
>
> for example, pl/python is optional: if the user did not run configure
> with the pl/python option, the tests for pl/python should not run.
>
> Cheers
> Lei
>
>
>
> On Tue, Jul 12, 2016 at 2:15 PM, Ivan Weng  wrote:
>
> > Agree with Hong. A test case should check the environment it needs. If
> > the check fails, it should terminate execution and report the error.
> >
> > On Tue, Jul 12, 2016 at 2:04 PM, Hong Wu  wrote:
> >
> > > It is users/developers themselves who should take care of this. Say
> > > you write a test case related to plpython: why wouldn't you configure
> > > HAWQ with the "--with-python" option? We should write a README for
> > > feature-test that guides users in running these tests, for example
> > > telling them to source "greenplum.sh" before running tests.
> > >
> > > Consequently, I think adding such a sanity-check is a little bit of
> > > over-engineering that will bring extra problems and complexity.
> > >
> > > Best
> > > xunzhang
> > >
> > > 2016-07-12 13:47 GMT+08:00 Paul Guo :
> > >
> > > > I have more than once encountered feature test failures due to
> > > > missing dependencies.
> > > >
> > > > e.g.
> > > >
> > > > 1. I did not have pl/python installed in my hawq build, so
> > > >UDF/sql/function_set_returning.sql fails at "create language
> > > >plpythonu". This makes the case fail.
> > > >
> > > > 2. Sometimes I forget to source greenplum.sh, and then all cases
> > > > fail due to missing psql.
> > > >
> > > > We seem to be able to improve.
> > > >
> > > > 1) Sanity-check some file existence in common code, e.g.
> > > > psql, gpdiff.pl,
> > > >
> > > > 2) Some cases could do sanity-check in their own test constructor
> > > > functions,
> > > > e.g. if the case uses the extension plpython, the test case
> should
> > > > check it itself.
> > > >
> > > > More thoughts?
> > > >
> > >
> >
>


Re: question on pg_hba.conf updates

2016-07-11 Thread Jiali Yao
Hi Radar and Vineet,

One more thing: I think it would be good to make pg_hba.conf visible and
editable in Ambari.
It should also be synced automatically to the standby when pg_hba.conf is
modified on the master via Ambari.
Thoughts?
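For illustration, this is the kind of manually added entry such a sync
would need to carry over to the standby (all values hypothetical):

    host  all  analyst  10.1.2.0/24  md5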

Jiali


On Tue, Jul 12, 2016 at 12:24 PM, Radar Da lei <r...@pivotal.io> wrote:

> Hi Jiali,
>
> It would be great if we can have this in our document. We can list some
> frequently used scenarios and add to document together.
>
> Regards,
> Radar
>
> On Tue, Jul 12, 2016 at 11:32 AM, Jiali Yao <j...@pivotal.io> wrote:
>
> > Hi Radar,
> >
> > I think we should document the scenario Ming described, for the case
> > when the master crashes after a user adds entries to pg_hba.conf.
> > Thoughts?
> >
> > Jiali
> >
> >
> > On Tue, Jul 12, 2016 at 11:12 AM, Radar Da lei <r...@pivotal.io> wrote:
> >
> > > Hi Ming,
> > >
> > > If a user manually added some entries into pg_hba.conf and they want
> > > them to be available after the standby is activated, then the user
> > > needs to add them to the standby's pg_hba.conf as well.
> > >
> > > We do not sync anything for such a case. Thanks.
> > >
> > > Regards,
> > > Radar
> > >
> > > On Tue, Jul 12, 2016 at 11:05 AM, Ming Li <m...@pivotal.io> wrote:
> > >
> > > > Hi Radar,
> > > >
> > > > If a user manually added some entries into the master's pg_hba.conf,
> > > > then after we activate the standby as the new master, those entries
> > > > should exist on the new master (the previous standby).
> > > >
> > > > So my question is: when are those entries copied? At standby init or
> > > > at standby activation?
> > > >
> > > > Thanks.
> > > >
> > > > On Tue, Jul 12, 2016 at 10:49 AM, Radar Da lei <r...@pivotal.io>
> > wrote:
> > > >
> > > > > Hi Vineet,
> > > > >
> > > > > While doing HAWQ cluster init, we collect master/standby/segment
> > > > > IP addresses and update pg_hba.conf. After hawq init, the content
> > > > > of 'pg_hba.conf' on each node should be different.
> > > > >
> > > > > When a cluster adds a new standby, pg_hba.conf gets updated on all
> > > > > the nodes. Activating the standby will not update pg_hba.conf,
> > > > > since all entries are already in the standby's and segments'
> > > > > pg_hba.conf.
> > > > >
> > > > > Adding a new segment will only generate a new pg_hba.conf for that
> > > > > segment itself.
> > > > >
> > > > > This is what I can think of for the moment. Thanks.
> > > > >
> > > > > Regards,
> > > > > Radar
> > > > >
> > > > > On Tue, Jul 12, 2016 at 10:08 AM, Vineet Goel <vvin...@apache.org>
> > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Question related to integration with Apache Ambari.
> > > > > >
> > > > > > It would be nice to make pg_hba.conf visible and editable in
> > Ambari,
> > > so
> > > > > > that Ambari allows one single interface for admins to update HAWQ
> > and
> > > > > > System configs such as hawq-site.xml, hawq-check.conf,
> sysctl.conf,
> > > > > > limits.conf, hdfs-client.xml, yarn-client.xml etc. Rollback and
> > > > > > version-history of config files is always a bonus in Ambari.
> > > > > >
> > > > > > Are there any backend HAWQ utilities (such as activate-standby or
> > > > others)
> > > > > > that update pg_hba.conf file in any way, ever? It would be nice
> to
> > > know
> > > > > so
> > > > > > that config file change conflicts are managed appropriately.
> > > > > >
> > > > > > Thanks
> > > > > > Vineet
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: question on pg_hba.conf updates

2016-07-11 Thread Jiali Yao
Hi Radar,

I think we should document the scenario Ming described, for the case when
the master crashes after a user adds entries to pg_hba.conf.
Thoughts?

Jiali


On Tue, Jul 12, 2016 at 11:12 AM, Radar Da lei  wrote:

> Hi Ming,
>
> If a user manually added some entries into pg_hba.conf and they want them
> to be available after the standby is activated, then the user needs to add
> them to the standby's pg_hba.conf as well.
>
> We do not sync anything for such a case. Thanks.
>
> Regards,
> Radar
>
> On Tue, Jul 12, 2016 at 11:05 AM, Ming Li  wrote:
>
> > Hi Radar,
> >
> > If a user manually added some entries into the master's pg_hba.conf,
> > then after we activate the standby as the new master, those entries
> > should exist on the new master (the previous standby).
> >
> > So my question is: when are those entries copied? At standby init or at
> > standby activation?
> >
> > Thanks.
> >
> > On Tue, Jul 12, 2016 at 10:49 AM, Radar Da lei  wrote:
> >
> > > Hi Vineet,
> > >
> > > While doing HAWQ cluster init, we collect master/standby/segment IP
> > > addresses and update pg_hba.conf. After hawq init, the content of
> > > 'pg_hba.conf' on each node should be different.
> > >
> > > When a cluster adds a new standby, pg_hba.conf gets updated on all the
> > > nodes. Activating the standby will not update pg_hba.conf, since all
> > > entries are already in the standby's and segments' pg_hba.conf.
> > >
> > > Adding a new segment will only generate a new pg_hba.conf for that
> > > segment itself.
> > >
> > > This is what I can think of for the moment. Thanks.
> > >
> > > Regards,
> > > Radar
> > >
> > > On Tue, Jul 12, 2016 at 10:08 AM, Vineet Goel 
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Question related to integration with Apache Ambari.
> > > >
> > > > It would be nice to make pg_hba.conf visible and editable in Ambari,
> so
> > > > that Ambari allows one single interface for admins to update HAWQ and
> > > > System configs such as hawq-site.xml, hawq-check.conf, sysctl.conf,
> > > > limits.conf, hdfs-client.xml, yarn-client.xml etc. Rollback and
> > > > version-history of config files is always a bonus in Ambari.
> > > >
> > > > Are there any backend HAWQ utilities (such as activate-standby or
> > others)
> > > > that update pg_hba.conf file in any way, ever? It would be nice to
> know
> > > so
> > > > that config file change conflicts are managed appropriately.
> > > >
> > > > Thanks
> > > > Vineet
> > > >
> > >
> >
>


Re: Confusion around HAWQ versions in JIRA

2016-07-06 Thread Jiali Yao
+1 for consolidating the versions.

For the 4-digit number: from the concept described above, I think 4 digits
make more sense. From the number alone, a user can easily tell whether a
specific upgrade process is needed or whether a plain binary switch is fine.
Based on that, among "2.0.0", "2.0.0-incubating", and "2.0.0.0-incubating",
I prefer 2.0.0.0-incubating, since it would be consistent between JIRA and
the code.

Thanks
Jiali


On Wed, Jul 6, 2016 at 3:56 PM, Lei Chang  wrote:

> On Wed, Jul 6, 2016 at 3:17 PM, Vineet Goel  wrote:
>
> > Apologies for any confusion. Let me expand further:
> >
> > 1) My proposal was to update the JIRA versions. I didn't think
> > 2.0.0-incubating and 2.0.0 were the same; we should either consolidate
> > them as one, or change the JIRA version numbers to be numerically
> > different.
> > Version 2.0.0 shows 5 open JIRAs that may or may not belong to
> > "2.0.0-incubating" release. See link:
> >
> >
> https://issues.apache.org/jira/browse/HAWQ/fixforversion/12334195/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel
> > vs
> >
> >
> https://issues.apache.org/jira/browse/HAWQ/fixforversion/12334000/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-summary-panel
> >
> > We should update the 5 JIRAs listed in 2.0.0 with the correct status and
> > fix versions. This will make it easy to track the upcoming release.
> >
> >
> >
> Agree. What I meant is also to consolidate the two into "2.0.0-incubating"
> or "2.0.0.0-incubating" depending on which version schema we will choose.
>
>
>
> > 2) Regarding the 4-digit versioning in the code, that's a good discussion
> > to have.
> > What is the proposed convention for managing the 4 digits, and what
> > sort of code/API changes trigger a change in a specific digit? It would
> > be good to discuss the details.
> >
>
>
> The 4-digit x.y.z.w versioning is:
>
> x: major release
> y: minor release
> z: bug fix release
> w: hot fix release
>
> Catalog and data format changes require an x or y change, so from the
> number change alone end users know whether a hawq upgrade is needed. In
> this scheme, API changes are not reflected in the number. For 3-digit
> semantic versioning, the rules for increasing the numbers are quite
> different: a number change reflects API changes, not catalog or data
> format changes.
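> For example (version numbers illustrative only): under this scheme a
> catalog change would bump 2.1.0.0 to 2.2.0.0, telling users an upgrade is
> needed, while a hot fix would bump 2.1.0.0 to 2.1.0.1, a plain binary
> switch.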
>
>
> >
> > Thanks
> > -Vineet
> >
> >
> > On Tue, Jul 5, 2016 at 11:35 PM, Ruilong Huo  wrote:
> >
> > > I would prefer option 1, keeping the 4-digit versions. This mechanism
> > > addresses library compatibility issues in a more proper manner.
> > >
> > > PS, here is some background on the hawq versioning policy which might
> > > help: Postgres-based systems, including GPDB and HAWQ, have the notion
> > > of "MODULE_MAGIC", which is intended to guarantee version
> > > compatibility. In addition to the "MAGIC NUMBER", defined as the
> > > Major.Minor version, GPDB and HAWQ also have the notion of a "MAGIC
> > > PRODUCT", which GPDB uses to differentiate itself from Postgres and to
> > > provide clear messages along the lines of "this library was built
> > > against Postgres". This mechanism could easily be employed to
> > > differentiate HAWQ and GPDB, and to allow basing the "MAGIC NUMBER" on
> > > the HAWQ version instead of the GPDB version, as it does today.
> > >
> > > Best regards,
> > > Ruilong Huo
> > >
> > > On Wed, Jul 6, 2016 at 2:26 PM, Radar Da lei  wrote:
> > >
> > > > For Lei's proposal, I would prefer option 1 for below reasons:
> > > >
> > > > 1. It saves time we might otherwise spend solving incompatibility
> > > > issues.
> > > > 2. It will be hard to maintain a semantic version if we increase the
> > > > major version every time we change the catalog or an interface; if
> > > > we did, the HAWQ version would reach 10.0.0 very soon.
> > > >
> > > > Thanks.
> > > >
> > > > Regards,
> > > > Radar
> > > >
> > > > On Wed, Jul 6, 2016 at 1:58 PM, Lei Chang 
> > wrote:
> > > >
> > > > > This is indeed a confusing issue. I am even confused by what Vineet
> > > > > proposed.
> > > > >
> > > > > There are several versions currently used across the systems:
> > > > >
> > > > > 1) the 3-digit JIRA versions: currently there are 2.0.0-incubating
> > > > > and 2.0.0, and I think they are the same; "2.0.0-incubating" is
> > > > > more formal for an incubating project.
> > > > >
> > > > > 2) the 4-digit versions in the code, which are inherited from
> > > > > postgres and are shown by the "select version()" command; they are
> > > > > somewhat related to library compatibility and also to third-party
> > > > > tools. Some tools may read and parse versions, and changing from 4
> > > > > digits to 3 might introduce some unknown incompatibility issues.
> > > > >
> > > > >
> > > > > So currently there are 2 options:
> > > > >
> > > > > 1. Keep 4-digit version 

Re: Replace git submodule with git clone + file with commit number?

2016-06-29 Thread Jiali Yao
Hi Paul

Generally agree on the approach, though I have some questions:

1. Do we need to add branch handling? For example, as a developer of
gp-xerces, I need to test my branch code.

2. Let me double-confirm the build process. Below is the gp-xerces repo,
and HAWQ currently uses commit a. So I set the commit file gp-xerces.commit
in the HAWQ repo to indicate we use commit a. Then some time later, when we
want to use b, we update gp-xerces.commit to b, right? And we will now have
one .commit file for each submodule, right?
gp-xerces:
  commit c
  ...
  commit b
  ...
  commit a
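If that understanding is right, the day-to-day workflow would be roughly
(the target name is taken from the Makefile snippet quoted below; the file
path and SHA are placeholders):

    # Pin gp-xerces to commit b, then let make detect the change and rebuild.
    echo "<commit-b-sha>" > <ORCA_SRC_PATH>/gp-xerces.commit
    make gp-xerces_prepare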

Thanks
Jiali

On Thu, Jun 30, 2016 at 11:03 AM, Radar Da lei  wrote:

> +1
> Since 'git clone' is called automatically in the Makefile, I feel this is
> much better than using submodules.
>
> Regards,
> Radar
>
> On Thu, Jun 30, 2016 at 10:07 AM, Lili Ma  wrote:
>
> > Hi Paul,
> >
> > I have one question. You mentioned "if I update the gp-xerces commit
> > number in the commit file gp-xerces.commit, make will trigger an
> > auto-build". Could we set $(gp_xerces_commit) to any commit number? What
> > I mean is, could we set it to a commit number already existing in my
> > local code, or one from remote code which I have not yet fetched?
> >
> > If the answer is yes, I totally agree this suggestion.
> >
> > Thanks
> > Lili
> >
> > On Wed, Jun 29, 2016 at 11:52 PM, Kavinder Dhaliwal <
> kdhali...@pivotal.io>
> > wrote:
> >
> > > +1. I am in favor of this approach, especially if the submodules make
> > > the source tarball difficult to build. I also agree it will make a
> > > developer's life much easier.
> > >
> > > On Wed, Jun 29, 2016 at 3:18 AM, Guo Gang  wrote:
> > >
> > > > I'm proposing this change because:
> > > >
> > > > 1) We are ready to publish the first Apache release with a source
> > > > tarball, but submodules are not friendly to a source tarball, since
> > > > git submodules require a git parent.
> > > >
> > > > 2) With more and more development, I have found that the submodule
> > > > mechanism is not that friendly for development, e.g.:
> > > >
> > > >If the commit number of a submodule is modified, it is hard for
> > > >the Makefile to detect this; we need to update it manually in an
> > > >old repo. If using "git clone", we could easily detect the update
> > > >by setting a commit-number file as a dependency, and thus easily
> > > >update the code.
> > > >
> > > >Some developers have complained about the annoying output in "git
> > > >status" after building submodules. With "git clone" we can easily
> > > >mask those directories via .gitignore.
> > > >
> > > >It is hard for developers who are not familiar with submodules
> > > >(frankly speaking, I really do not think submodules are friendly)
> > > >to manipulate the related directories when necessary.
> > > >
> > > > With the "git clone" solution, we save the commit of each previous
> > > > submodule in a file, and the related Makefile code change is rather
> > > > simple, e.g. for gp-xerces:
> > > >
> > > > $(ORCA_BLD_PATH)/gp-xerces_prepare_timestamp: $(ORCA_SRC_PATH)/gp-xerces.commit
> > > > rm -f $(ORCA_BLD_PATH)/gp-xerces_prepare_timestamp
> > > >
> > > > gp-xerces_prepare: $(ORCA_BLD_PATH)/gp-xerces_prepare_timestamp
> > > > if [ ! -f $(ORCA_BLD_PATH)/gp-xerces_prepare_timestamp ]; then \
> > > > [ "x$(gp_xerces_commit)" != "x" ] || exit 1; \
> > > > cd $(abs_top_srcdir)/$(subdir); mkdir -p gp-xerces; cd gp-xerces; \
> > > > [ ! -d .git ] && git clone https://github.com/greenplum-db/gp-xerces.git .; \
> > > > git reset --hard $(gp_xerces_commit) || exit 2; \
> > > > touch $(ORCA_BLD_PATH)/gp-xerces_prepare_timestamp; \
> > > > fi
> > > >
> > > > With the above code change, if I update the gp-xerces commit number
> > > > in the commit file gp-xerces.commit, make will trigger an auto-build.
> > > > If I messed up the gp-xerces directory, I can easily remove the whole
> > > > gp-xerces directory, or just remove the timestamp file
> > > > gp-xerces_prepare_timestamp, to trigger an auto-build.
> > > >
> > > > Any suggestion? Thanks.
> > > >
> > >
> >
>


Re: Use Travis instead of Jenkins to ensure building success from pull request?

2016-06-27 Thread Jiali Yao
@Hong, thanks for your comments.
For 1, Ubuntu can be integrated. Thoughts?
For 2, since the time depends on resources and CPU, it is not stable from
run to run. But I do not think it is a blocker so far.

Thanks
Jiali


On Mon, Jun 27, 2016 at 11:16 AM, hong wu <xunzhang...@gmail.com> wrote:

> @Jiali, see my comments inline below:
>
> 1. Besides the Mac build, should we add other platform builds such as Redhat?
> > Travis CI only supports ubuntu for Linux
> > (https://docs.travis-ci.com/user/ci-environment/#Virtualization-environments).
> > Does it make sense to integrate the HAWQ build on Ubuntu?
> 2. About build time, is it possible to figure out ways to make it
> quicker? 25 minutes seems a little long.
> > Actually I haven't optimized the build time yet, and it is possible to
> > make it faster. What time would be suitable, 20 minutes?
>
> Thanks
>
> 2016-06-27 10:50 GMT+08:00 Jiali Yao <j...@pivotal.io>:
>
> > Great point to have more test in Apache CI.
> > While I want to add some comments
> > 1. Besides the Mac build, should we add other platform builds such as Redhat?
> > 2. About build time, is it possible to figure out ways to make it
> > quicker? 25 minutes seems a little long.
> >
> > Thanks
> > Jiali
> >
> >
> > On Fri, Jun 24, 2016 at 4:43 PM, Ming Li <m...@pivotal.io> wrote:
> >
> > > Moreover, we can put other simple build processes onto free github
> > > integrated services.
> > >
> > > E.g., after enabling travis_ci, we can further enable a Coverity Scan
> > > build (https://scan.coverity.com/travis_ci).
> > >
> > > More free services can be exploited (like coverage report builds); we
> > > need to investigate how to use them.
> > >
> > > On Fri, Jun 24, 2016 at 4:17 PM, Ming Li <m...@pivotal.io> wrote:
> > >
> > > > Agree.
> > > >
> > > > BTW, one more problem: right now we only have Jenkins for testing
> > > > pull requests, but we don't test against the latest code on the
> > > > master branch. It would also be better to keep monitoring the build
> > > > status on the main page at https://github.com/apache/incubator-hawq,
> > > > so that we can easily find build errors.
> > > >
> > > > On Fri, Jun 24, 2016 at 3:06 PM, hong wu <xunzhang...@gmail.com>
> > wrote:
> > > >
> > > >> Hi HAWQ committers,
> > > >>
> > > >> Recently, since the Jenkins service integrated into the Apache HAWQ
> > > >> project is problematic, could we enable the Travis service instead?
> > > >> The .travis.yml file already exists and has worked in self-forked
> > > >> HAWQ repos (such as
> > > >> https://travis-ci.org/xunzhang/incubator-hawq/builds). The original
> > > >> Jenkins script had problems and didn't even check compilation.
> > > >>
> > > >> Some pros:
> > > >>  - The Travis CI script is visible to developers/users, which is
> > > >> much more friendly and easier to maintain (compared to Jenkins).
> > > >>  - It makes sure every pull request is valid (compared to the
> > > >> current status).
> > > >>
> > > >> Some cons:
> > > >>  - Admins cannot log into the Travis machine to debug.
> > > >>  - The current travis script only checks the build status on osx.
> > > >> Because the osx resources on Travis machines are limited (some
> > > >> pending time, plus not that many CPUs), it takes about 25 minutes
> > > >> to get through the full HAWQ build process.
> > > >>
> > > >> Also, I am not sure whether an Apache project must use Jenkins for
> > > >> its open-source CI. Any comments? Thanks.
> > > >>
> > > >> Best
> > > >> xunzhang
> > > >>
> > > >
> > > >
> > >
> >
>


Re: About the commit

2016-06-27 Thread Jiali Yao
+1 on the steps below:
git pull --rebase $upstream master
git push -f $my_repo $my_branch
Then, on GitHub, create a pull request.
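(The force push only rewrites the feature branch on your own fork,
$my_repo, not the shared Apache repo, so it is safe in this workflow.)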

This will keep the git log clean and preserve the correct commit order.

Thanks
Jiali

On Tue, Jun 28, 2016 at 9:37 AM, Guo Gang  wrote:

> I usually do the following things before creating a pull request.
>
> git pull --rebase $upstream master
> git push -f $my_repo $my_branch
> Then on github, create a pull request
>
> This removes "Merge" commits, keeps the commits clean, and makes
> pull requests easy.
>
> In the local repo:
> keep only one commit for multiple check-ins: git commit --amend
> or
> merge N commits into one: git reset --soft HEAD~N + git commit.
>
> FYI.
>
>
> 2016-06-25 9:43 GMT+08:00 hong wu :
>
> > FYI: in the new version of GitHub, committers can handle the squash
> > process much more conveniently.
> >
> > xunzhang
> >
> > 2016-06-25 9:30 GMT+08:00 hong wu :
> >
> > > Hi HAWQ committers,
> > >
> > > I notice that there are some informal commits in recent check-ins.
> > > For example:
> > > ce3f7c6b5c0315b97298d651f5d5f7383000491a
> > > <
> >
> https://github.com/apache/incubator-hawq/commit/ce3f7c6b5c0315b97298d651f5d5f7383000491a
> > >
> > > 4d44097085fd139002a255b1032082dc0b030414
> > > <
> >
> https://github.com/apache/incubator-hawq/commit/4d44097085fd139002a255b1032082dc0b030414
> > >
> > > 817249a4605abd4415fc0de8e6a545bf88d2aa2e
> > > <
> >
> https://github.com/apache/incubator-hawq/commit/817249a4605abd4415fc0de8e6a545bf88d2aa2e
> > >
> > > ...
> > >
> > > I think we'd better ensure every commit message starts with
> > > `HAWQ-#JIRA`. In the following, I try to list the cases developers
> > > should pay attention to:
> > > 1. To avoid automatically generated commits (for example
> > > ce3f7c6b5c0315b97298d651f5d5f7383000491a
> > > <https://github.com/apache/incubator-hawq/commit/ce3f7c6b5c0315b97298d651f5d5f7383000491a>),
> > > do not merge or pull from upstream after local commits. Sync with
> > > upstream master before your local commits.
> > > 2. Check the commit info before pushing to master. If you find a
> > > commit with an empty code change, rebase and squash it.
> > > 3. If a pull request contains some informal commits (maybe for code
> > > review convenience), committers should rebase and squash these
> > > temporary commits before pushing to master.
> > > 4. If your local development branch is behind upstream, that's OK;
> > > there are some acceptable reasons for it. For example, during the
> > > discussion of your pull request, the master branch gets updated. But
> > > in this case asfgit cannot resolve it, which means developers must
> > > close the pull request manually. So I recommend attaching the pull
> > > request number in your commit comments (for example
> > > a57cc9523f97e471a69b658556c989d13ad88661
> > > <https://github.com/apache/incubator-hawq/commit/a57cc9523f97e471a69b658556c989d13ad88661>).
> > >
> > > Best
> > > xunzhang
> > >
> > >
> >
>


Re: Integrate ICG to GTest

2016-06-26 Thread Jiali Yao
Hi all,

Thanks very much for the comments so far.
We will move the ICG tests over step by step using google test.

Still looking forward to your comments.

Thanks

Jiali


On Sat, Jun 25, 2016 at 9:09 AM, Lei Chang <lei_ch...@apache.org> wrote:

> first of all, +1 for the proposal to consolidate the test frameworks; it
> makes existing developers' daily work easier and also lowers the barrier
> for contributions from new contributors.
>
> I'd like to suggest we move the tests step by step, so we can release
> continuously.
>
> Cheers
> Lei
>
>
> On Fri, Jun 24, 2016 at 1:31 PM, Jiali Yao <j...@pivotal.io> wrote:
>
> > +1 for the doc
> >
> https://github.com/google/googletest/blob/master/googletest/docs/Primer.md
> >
> > For tests, we suggest that all new tests use google test, and that
> > existing install-check-good tests be moved to google test.
> > What are your thoughts?
> >
> > Thanks
> >
> > Jiali
> >
> >
> > On Fri, Jun 24, 2016 at 1:20 AM, Shivram Mani <shivram.m...@gmail.com>
> > wrote:
> >
> > > Thumbs up on transitioning to Google test framework. This is a doc that
> > has
> > > detailed benefits of Googletest framework
> > >
> >
> https://github.com/google/googletest/blob/master/googletest/docs/Primer.md
> > > .
> > > How do we plan to transition tests to the GTest framework? Are we
> > > only going to target adding new tests in the framework, or do we plan
> > > to do a bulk transfer of existing tests?
> > >
> > > On Fri, Jun 17, 2016 at 1:34 AM, Jiali Yao <j...@pivotal.io> wrote:
> > >
> > > > Hi all,
> > > >
> > > > In HAWQ testing, google test is used for the libyarn and libhdfs
> > > > tests. The install-check test framework is used for smoke tests and
> > > > has a lot of limitations. To make testing easy to learn and to
> > > > consolidate the tests, we want to unify the two frameworks.
> > > > Considering the factors below, we want to use google test:
> > > >
> > > >- It supports more functions, so a developer can write more
> > > >complex tests that are not limited to SQL tests.
> > > >- Google test supports running tests in parallel.
> > > >- Google Mock is an extension of google test; it can also be used
> > > >for unit tests.
> > > >
> > > > For detail information please refer to
> > > > https://issues.apache.org/jira/browse/HAWQ-832
> > > >
> > > > Looking forward to your comments.
> > > >
> > > > Thanks
> > > >
> > > > Jiali
> > > >
> > >
> > >
> > >
> > > --
> > > shivram mani
> > >
> >
>


Re: Use Travis instead of Jenkins to ensure building success from pull request?

2016-06-26 Thread Jiali Yao
Great point about having more tests in the Apache CI.
I want to add some comments, though:
1. Besides the Mac build, should we add other platform builds such as Redhat?
2. About build time, is it possible to figure out ways to make it quicker?
25 minutes seems a little long.

Thanks
Jiali


On Fri, Jun 24, 2016 at 4:43 PM, Ming Li  wrote:

> Moreover, we can put other simple build processes onto free github
> integrated services.
>
> E.g., after enabling travis_ci, we can further enable a Coverity Scan
> build (https://scan.coverity.com/travis_ci).
>
> More free services can be exploited (like coverage report builds); we
> need to investigate how to use them.
>
> On Fri, Jun 24, 2016 at 4:17 PM, Ming Li  wrote:
>
> > Agree.
> >
> > BTW, one more problem: right now we only have Jenkins for testing pull
> > requests, but we don't test against the latest code on the master
> > branch. It would also be better to keep monitoring the build status on
> > the main page at https://github.com/apache/incubator-hawq, so that we
> > can easily find build errors.
> >
> > On Fri, Jun 24, 2016 at 3:06 PM, hong wu  wrote:
> >
> >> Hi HAWQ committers,
> >>
> >> Recently, since the Jenkins service integrated into the Apache HAWQ
> >> project is problematic, could we enable the Travis service instead?
> >> The .travis.yml file already exists and has worked in self-forked HAWQ
> >> repos (such as
> >> https://travis-ci.org/xunzhang/incubator-hawq/builds). The original
> >> Jenkins script had problems and didn't even check compilation.
> >>
> >> Some pros:
> >>  - The Travis CI script is visible to developers/users, which is much
> >> more friendly and easier to maintain (compared to Jenkins).
> >>  - It makes sure every pull request is valid (compared to the current
> >> status).
> >>
> >> Some cons:
> >>  - Admins cannot log into the Travis machine to debug.
> >>  - The current travis script only checks the build status on osx.
> >> Because the osx resources on Travis machines are limited (some pending
> >> time, plus not that many CPUs), it takes about 25 minutes to get
> >> through the full HAWQ build process.
> >>
> >> Also, I am not sure whether an Apache project must use Jenkins for its
> >> open-source CI. Any comments? Thanks.
> >>
> >> Best
> >> xunzhang
> >>
> >
> >
>


Re: Improve code organization for HAWQ?

2016-06-25 Thread Jiali Yao
Cool suggestion. Then we can easily keep build results for different
configurations.

Thanks
Jiali


On Sat, Jun 25, 2016 at 1:53 AM, Roman Shaposhnik 
wrote:

> Great points and +1 on the suggestions!
>
> Thanks,
> Roman.
>
> On Fri, Jun 24, 2016 at 12:38 AM, hong wu  wrote:
> > The current code organization has some building disadvantages for
> > developers.
> >
> > Developers can only compile HAWQ code in HAWQ_HOME, which means that if
> > we build HAWQ in another folder, the make process will fail.
> >
> > The make system of HAWQ comes from Postgres, and I tried building
> > Postgres in a temporary new build folder; that works.
> >
> > This kind of limitation has some disadvantages:
> > 1. It is not neat to mix generated build files with source code.
> > 2. We need to type make distclean if we want a different configuration
> > environment. An ideal way, for example, is to keep a build_opt folder
> > to hold the release-mode build and a build_dev folder to hold the debug
> > mode.
> >
> > Best
> > xunzhang
>


Re: Integrate ICG to GTest

2016-06-23 Thread Jiali Yao
+1 for the doc
https://github.com/google/googletest/blob/master/googletest/docs/Primer.md

For tests, we suggest that all new tests use google test, and that existing
install-check-good tests be moved to google test.
What are your thoughts?

Thanks

Jiali


On Fri, Jun 24, 2016 at 1:20 AM, Shivram Mani <shivram.m...@gmail.com>
wrote:

> Thumbs up on transitioning to Google test framework. This is a doc that has
> detailed benefits of Googletest framework
> https://github.com/google/googletest/blob/master/googletest/docs/Primer.md
> .
> How do we plan to transition tests to the GTest framework? Are we only
> going to target adding new tests in the framework, or do we plan to do a
> bulk transfer of existing tests?
>
> On Fri, Jun 17, 2016 at 1:34 AM, Jiali Yao <j...@pivotal.io> wrote:
>
> > Hi all,
> >
> > In HAWQ testing, google test is used for the libyarn and libhdfs tests.
> > The install-check test framework is used for smoke tests and has a lot
> > of limitations. To make testing easy to learn and to consolidate the
> > tests, we want to unify the two frameworks. Considering the factors
> > below, we want to use google test:
> >
> >- It supports more functions, so a developer can write more complex
> >tests that are not limited to SQL tests.
> >- Google test supports running tests in parallel.
> >- Google Mock is an extension of google test; it can also be used
> >for unit tests.
> >
> > For detail information please refer to
> > https://issues.apache.org/jira/browse/HAWQ-832
> >
> > Looking forward to your comments.
> >
> > Thanks
> >
> > Jiali
> >
>
>
>
> --
> shivram mani
>


Integrate ICG to GTest

2016-06-17 Thread Jiali Yao
Hi all,

In HAWQ testing, google test is used for the libyarn and libhdfs tests. The
install-check test framework is used for smoke tests and has a lot of
limitations. To make testing easy to learn and to consolidate the tests, we
want to unify the two frameworks. Considering the factors below, we want to
use google test:

   - It supports more functions, so a developer can write more complex
   tests that are not limited to SQL tests.
   - Google test supports running tests in parallel (see the sketch below).
   - Google Mock is an extension of google test; it can also be used for
   unit tests.
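As a small sketch of the parallel-run point (the test binary name is
hypothetical; the sharding environment variables are standard google test
features):

    # Split one googletest binary into two shards and run them in parallel.
    GTEST_TOTAL_SHARDS=2 GTEST_SHARD_INDEX=0 ./feature-test &
    GTEST_TOTAL_SHARDS=2 GTEST_SHARD_INDEX=1 ./feature-test &
    wait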

For detail information please refer to
https://issues.apache.org/jira/browse/HAWQ-832

Looking forward to your comments.

Thanks

Jiali


[jira] [Created] (HAWQ-832) Integrate ICG to Gtest

2016-06-17 Thread Jiali Yao (JIRA)
Jiali Yao created HAWQ-832:
--

 Summary: Integrate ICG to Gtest
 Key: HAWQ-832
 URL: https://issues.apache.org/jira/browse/HAWQ-832
 Project: Apache HAWQ
  Issue Type: Test
  Components: Tests
Reporter: Jiali Yao
Assignee: Jiali Yao
 Attachments: GoogleTest.pdf

In HAWQ testing, google test is used for the libyarn and libhdfs tests. The
install-check test framework is used for smoke tests and has a lot of
limitations. To make testing easy to learn and to consolidate the tests, we
want to unify the two frameworks. Considering the factors below, we want to
use google test:
It supports more functions, so a developer can write more complex tests
that are not limited to SQL tests.
Google test supports running tests in parallel.
Google Mock is an extension of google test; it can also be used for unit tests.






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-809) Change libhdfs3 function test port to hadoop default port

2016-06-13 Thread Jiali Yao (JIRA)
Jiali Yao created HAWQ-809:
--

 Summary: Change libhdfs3 function test port to hadoop default port
 Key: HAWQ-809
 URL: https://issues.apache.org/jira/browse/HAWQ-809
 Project: Apache HAWQ
  Issue Type: Test
  Components: libhdfs
Reporter: Jiali Yao
Assignee: Lei Chang


The libhdfs3 function test currently uses HDFS port 9000; we need to change
it to the default HDFS port.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Performance issue about HAWQ 2.0 beta

2015-11-30 Thread Jiali Yao
Hi Leon,

I did not see the schema, but I do see different plans for HAWQ 1.3 and
HAWQ 2.0.
In general, the performance difference may come from several areas:

1. Hash vs. random distribution (in HAWQ 1.3 the default is hash, while in
HAWQ 2.0 it is random). To check this, please look at the definitions of
the related tables.
2. Default optimizer. I do not see which planner you used. If you use the
open source version, it should be the planner, while the enterprise version
and HAWQ 1.3 use ORCA. This can be checked via the GUC optimizer (on or
off).
3. Segment configuration. When comparing performance, we need comparable
segment configurations between HAWQ 2.0 and HAWQ 1.x. Your cluster has 5
servers with one segment per node in HAWQ 1.x. Also, the number of vsegs
should be based on your hardware: on a normal physical server (e.g. 64 GB
memory, 8-core CPU) we suggest 8, but if you use VMs you can set a lower
value.
4. default_segment_num setting: I see in your previous email that you set
default_segment_num to 160. If your cluster has only 5 nodes, that value is
not right. Normally we suggest it should be cluster size * 8.
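(For the 5-node cluster here, that rule of thumb gives 5 * 8 = 40, far
below the configured 160.)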

So for your case, let us establish the same configuration for 1.3 and HAWQ
2.0 and compare again.
Thanks

Jiali




On Mon, Nov 30, 2015 at 3:22 PM, Leon Zhang  wrote:

> Hi, Martin Visser
>
>Thanks for your quick reply. I attached the "explain analyze" in my
> last email of this thread.
>
>   And because hawq-2.0 introduces the "virtual segment", and we
> configured 8 virtual segments for each node, we can see different segment
> numbers.
>
> On Fri, Nov 27, 2015 at 4:58 PM, Martin Visser  wrote:
>
> > Hi Leon,
> >
> > looking at the 2.0 plan, you're perhaps missing stats on some of the
> > tables, for example:
> > -> Parquet table Scan on catalog_sales  (cost=0.00..23885.35 rows=1
> > width=197)
> > -> Parquet table Scan on web_sales  (cost=0.00..11982.30 rows=1
> width=197)
> >
> > Can you check, or run explain analyze? Also, the number of segments
> > differs: 1.3 shows 5 segs and 2.0 shows 40 segs.
> >
> > On Fri, Nov 27, 2015 at 7:43 AM, Leon Zhang  wrote:
> >
> > > Hi, HAWQ Developers:
> > >
> > >As my previous email hinted, I ran the TPC-DS test on our
> > > development cluster. Compared with the previous version 1.3.x, we can
> > > see a performance improvement on most queries.
> > >
> > >But the problem is a performance reduction for *some* queries. For
> > > example, for query64 the running time increased from 10754.688 ms to
> > > 68884.731 ms. I am not sure whether any changes were made that would
> > > increase the running time?
> > >
> > >In order to discuss this issue in detail, I would like to use
> > > query10. Its running time increased from 1795.746 ms to 744919.251
> > > ms. I also attach the SQL for this query and its query plan.
> > >
> > >Thanks
> > >
> > >
> >
>


Re: Performance issue about HAWQ 2.0 beta

2015-11-27 Thread Jiali Yao
Hi Leon

Thanks for providing this. The result is not what we expected; in our
performance tests, we found the performance comparable with 1.3.
Could you please provide some more information:
1. Get the segment configuration information from 1.3 and 2.0:
select * from gp_segment_configuration;
2. Could you please run "explain analyze" to get more statistics?
3. To confirm: the results were run in YARN mode, right? Also, I saw your
previous email indicating there were some errors in YARN; are these queries
also from that test round?

Thanks

Jiali

On Fri, Nov 27, 2015 at 3:43 PM, Leon Zhang  wrote:

> Hi, HAWQ Developers:
>
>As my previous email hinted, I ran the TPC-DS test on our development
> cluster. Compared with the previous version 1.3.x, we can see a
> performance improvement on most queries.
>
>But the problem is a performance reduction for *some* queries. For
> example, for query64 the running time increased from 10754.688 ms to
> 68884.731 ms. I am not sure whether any changes were made that would
> increase the running time?
>
>In order to discuss this issue in detail, I would like to use query10.
> Its running time increased from 1795.746 ms to 744919.251 ms. I also
> attach the SQL for this query and its query plan.
>
>Thanks
>
>