I have no problem substituting local testing as long as we test all the environments that Travis does. I've done that to get around this problem in the past. It takes a while to run each Maven test profile, but it works.

rb

On 01/26/2016 09:44 PM, Wes McKinney wrote:
Also, things have been made much worse by Travis CI continuing to have
infrastructure problems. The ASF build queue on Travis CI had completely
stalled by this morning so that no builds were completing; fortunately
their support is quite responsive and they've resolved the queue
blockage, so builds are executing again.

On Tue, Jan 26, 2016 at 4:00 PM, Wes McKinney <[email protected]> wrote:

    There are 3 more patches outstanding that are causing blockage (418,
    433, and 451/453), so I think if we get them merged today or
    tomorrow, we should be able to proceed with some parallel
    efforts without quite as much conflict.

    On Tue, Jan 26, 2016 at 3:56 PM, Nong Li <[email protected]> wrote:

        I'm going to try to be more active this week, but I admittedly don't
        have a lot of time to work on this. I understand we need to get
        critical mass in committers, code, etc. to keep this going, but I
        think we're making good progress.

        On Tue, Jan 26, 2016 at 3:27 PM, Julien Le Dem <[email protected]> wrote:

            Also, as Nong mentioned, PRs should be prefixed by the JIRA
            id followed by a ":", as in "PARQUET-X: description". That's
            just to have the reference in the git changelog. The merge
            script enforces it.
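
The "PARQUET-X: description" convention can be sketched as a small check. This is only an illustration under stated assumptions: the helper name `is_valid_pr_title` and the exact pattern are hypothetical, it assumes the prefix is checked on the PR title, and it is not the project's actual merge script.

```python
import re

# Assumed pattern (hypothetical): the JIRA id "PARQUET-<number>",
# then a colon and a space, then a non-empty description.
_TITLE_RE = re.compile(r"^PARQUET-\d+: \S.*$")


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title looks like "PARQUET-X: description"."""
    return _TITLE_RE.match(title) is not None
```

A title such as "PARQUET-123: description" would pass this check, while a title with no JIRA id or no colon after the id would not.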


            On Tue, Jan 26, 2016 at 3:24 PM, Julien Le Dem <[email protected]> wrote:

                I'm happy too with Aliaksei, Deepak, Wes, etc reviewing
                each other.
                I see Nong (who's a committer) has been doing some
                reviews already.

                When you guys reach a consensus on a PR and want it
                merged please mention it in the PR (+1, LGTM) and
                mention us directly (@julienledem, ...) to have it merged.

                Right now I see that #19 and #21 have been committed
                (thanks Nong) but it is not clear to me in what order
                the others should be committed.

                For example Deepak should comment directly on #22 to
                approve it. Right now he mentioned it on another PR:
                https://github.com/apache/parquet-cpp/pull/24#issuecomment-174354139
                Similarly Wes could confirm on that PR whether it looks
                good.

                Tomorrow is the Parquet sync up if you want to discuss
                further:
                https://plus.google.com/u/0/events/cvgi67jmoptmgb1i488re8scbuo


                On Mon, Jan 25, 2016 at 4:20 PM, Ryan Blue <[email protected]> wrote:

                    Aliaksei, thanks for being understanding here.

                    I agree with you that it is too difficult. We really
                    want to get the cpp side bootstrapped as soon as
                    possible. Let's go with what you suggested, to have
                    contributors review one another's patches and then
                    ask a committer for a final review once both
                    contributors reach a consensus.

                    If there are issues that are easy to review, maybe
                    some of us other than Nong can take a look.

                    rb


                    On 01/25/2016 02:33 PM, Aliaksei Sandryhaila wrote:

                        Hi Ryan,

                        This sounds very reasonable. I am not arguing that we
                        should disregard the standard Apache approach to
                        promoting contributors to committers. I am just
                        pointing out that without input from current
                        committers it is hard for us to contribute
                        productively to the project. As a consequence, it is
                        hard for us to demonstrate our fit to become
                        committers in the future. This leaves us in a
                        deadlock, which can be resolved either by increased
                        feedback from existing committers or by making us
                        committers sooner.

                        I understand that most committers on the Parquet
                        project are working on the Java implementation, so it
                        can be harder for them to review patches for
                        parquet-cpp. In this regard, how about the following
                        protocol for parquet-cpp pull requests: after
                        contributors review and revise a pull request and
                        agree that it is in good shape, we will ask a
                        designated committer to review and commit the pull
                        request. So far we have been asking Nong; if there is
                        a better designated committer for parquet-cpp, please
                        let us know.

                        Thank you,
                        Aliaksei.


                        On 01/25/2016 04:54 PM, Ryan Blue wrote:

                            Hi everyone,

                            Sorry about the current backlog on the
                            parquet-cpp side. Most of the
                            current committer base works on the Java
                            implementation so it's either
                            slow or not reliable for us to do those reviews.

                            I think the best way to move forward is for
                            contributors to review each other's patches.
                            That will keep those issues progressing, make it
                            easy for committers to validate the commit, and
                            -- most importantly -- build a trail of
                            contributions that we can look at to vote in new
                            committers.

                            I completely sympathize with the need for
                            committers on the CPP
                            project, but I don't think this will take a
                            long time given the
                            current level of activity. We're really just
                            trying to build
                            confidence that:

                            1. You produce quality contributions and
                            understand the codebase
                            2. You give friendly, thoughtful reviews and
                            don't rubber-stamp
                            3. You defer judgment and ask others when
                            you don't know
                            4. You respect others and interact
                            professionally

                            I don't think any of those are that hard to
                            demonstrate, but I'd be
                            uncomfortable not validating committers like
                            we normally do.
                            Especially in this situation, where I could
                            easily see the amount of
                            work you guys are doing adding up pretty
                            quickly!

                            Does that sound like a reasonable path forward?

                            rb


                            On 01/25/2016 12:46 PM, Aliaksei Sandryhaila
                            wrote:

                                Hi Nong and Julien,

                                As Wes has pointed out, we have a number
                                of patches for parquet-cpp
                                outstanding. Wes, Deepak, and I have
                                been reviewing each other's pull
                                requests. At this point, the patches
                                need to be reviewed and approved by
                                Parquet committers in order to be
                                committed to master.

                                Unfortunately, there is not much
                                activity on this side of the project.
                                The lack of response from current
                                committers is holding us back, and we
                                have to repeatedly rebase our patches,
                                merge multiple pull requests
                                together, and overall step on each
                                other's toes.

                                Is it possible to make Wes, Deepak, and
                                me committers on the project, so
                                we can contribute to parquet-cpp more
                                efficiently?

                                Thanks,
                                Aliaksei.


                                On 01/23/2016 06:07 PM, Wes McKinney wrote:

                                    Folks,

                                    We're working on a pretty solid
                                    patch queue.

                                    independent patches
                                    PARQUET-449: https://github.com/apache/parquet-cpp/pull/21

                                    interdependent patches (order to apply patches)
                                    PARQUET-437 (MOSTLY REVIEWED): https://github.com/apache/parquet-cpp/pull/19
                                    PARQUET-418: https://github.com/apache/parquet-cpp/pull/18
                                    PARQUET-434: https://github.com/apache/parquet-cpp/pull/20
                                    PARQUET-433: https://github.com/apache/parquet-cpp/pull/22
                                    PARQUET-451 & PARQUET-453: https://github.com/apache/parquet-cpp/pull/23
                                    PARQUET-428 (needs to be rebased on top of PARQUET-433): https://github.com/apache/parquet-cpp/pull/24

                                    I'm going to take a breather and work on
                                    some other things this weekend, but I'll
                                    be available for code reviews and fixes
                                    to try to move along this patch queue.

                                    Thanks,
                                    Wes

                                    On Fri, Jan 15, 2016 at 8:18 AM, Wes McKinney <[email protected]> wrote:

                                        Great to meet you all!

                                        I've recently been collaborating with
                                        the Apache Drill team to spin out the
                                        ValueVector columnar in-memory data
                                        structure into a new standalone
                                        project that will be called Arrow
                                        [1] [2]. A brief summary of
                                        Arrow/ValueVectors is that it permits
                                        O(1) random access on nested columnar
                                        structures and is efficient for
                                        projections and scans in a columnar
                                        SQL setting.

                                        I'm very interested in making Parquet
                                        read/write support available to
                                        Python programmers via C/C++
                                        extensions, so I'm going to be
                                        working over the next few months on a
                                        Parquet->Arrow->Python toolchain,
                                        along with some tools to manipulate
                                        in-memory columnar tables in the
                                        style of Python's pandas library.

                                        I will propose patches as needed to
                                        parquet-cpp to improve its
                                        performance and add functionality for
                                        writing Parquet files as well. The
                                        details of converting to/from
                                        Parquet's repetition/definition level
                                        representation of nested data will
                                        stay separate in the arrow-parquet
                                        adapter code.

                                        cheers,
                                        Wes

                                        [1]: http://mail-archives.apache.org/mod_mbox/drill-dev/201510.mbox/%3CCAJrw0OSVoirU_EUrBBqKY12uDi_f8U9MP7J_6Puuh_DmcyzS9g%40mail.gmail.com%3E

                                        [2]: http://permalink.gmane.org/gmane.comp.apache.incubator.drill.devel/16490


                                        On Fri, Jan 15, 2016 at 1:22 AM, Mickaël Lacour <[email protected]> wrote:

                                            Hi,

                                            I'm very interested in this
                                            subject because I would like to
                                            export Parquet data from HDFS to
                                            Vertica (using VSQL).
                                            I'm planning to work on it next
                                            quarter, but I will be very happy
                                            to help you on this subject
                                            (review, testing).

                                            Have a nice day,
                                            --
                                            Mickaël Lacour
                                            Senior Software Engineer
                                            Analytics Infrastructure
                                            team @Scalability

                                            
                                            ________________________________________
                                            From: Walkauskas, Stephen Gregory (Vertica) <[email protected]>
                                            Sent: Thursday, January 14, 2016 3:23 PM
                                            To: Sandryhaila, Aliaksei; [email protected]; Majeti, Deepak; [email protected]; Wes McKinney
                                            Subject: Re: Parquet-cpp

                                            Yes, thanks for the introduction,
                                            Julien.

                                            Nong and Wes,

                                            It'd be interesting to know your
                                            goals for parquet-cpp.

                                            The Vertica database already
                                            supports optimized reads of ORC
                                            files (fast C++ parser, predicate
                                            pushdown, column selection, etc.).
                                            We'd like to do the same for
                                            Parquet.

                                            Cheers,
                                            Stephen

                                            On 01/13/2016 05:53 PM,
                                            Sandryhaila, Aliaksei wrote:

                                                Thank you for the
                                                introduction, Julien!

                                                Hello Nong and Wes,

                                                Stephen, Deepak and I are
                                                developing a C++ library to
                                                support Parquet in Vertica
                                                RDBMS. We are using
                                                Parquet-cpp as a starting
                                                point and are expanding its
                                                functionality as well as
                                                improving it and fixing bugs.
                                                We would like to contribute
                                                these improvements back to
                                                the open-source community. We
                                                plan to do this through the
                                                usual process of creating
                                                JIRAs that justify and
                                                explain a code change, and
                                                then submitting pull
                                                requests. We look forward to
                                                working with you on
                                                Parquet-cpp and to your
                                                feedback and suggestions.

                                                Best regards,
                                                Aliaksei.


                                                On 01/13/2016 02:54 PM, Julien Le Dem wrote:

                                                    Hello Nong, Wes, Stephen,
                                                    Deepak and Aliaksei,
                                                    I wanted to introduce you
                                                    to each other as you are
                                                    all looking at
                                                    Parquet-cpp.

                                                    I'd recommend opening
                                                    JIRAs in the parquet-cpp
                                                    component to collaborate
                                                    (I see you already doing
                                                    this):
                                                    https://issues.apache.org/jira/browse/PARQUET-418?jql=project%20%3D%20PARQUET%20AND%20component%20%3D%20parquet-cpp

                                                    Nong is a committer and
                                                    can merge pull requests
                                                    (he also understands that
                                                    code base very well).
                                                    Other committers can too;
                                                    feel free to ping us if
                                                    you need help.
                                                    Obviously, you don't need
                                                    to be a committer to give
                                                    others reviews (you just
                                                    need one to approve and
                                                    merge).








                    --
                    Ryan Blue
                    Software Engineer
                    Cloudera, Inc.




                --
                Julien




            --
            Julien






--
Ryan Blue
Software Engineer
Cloudera, Inc.
