GitHub user ypcat opened a pull request:

    https://github.com/apache/spark/pull/5525

    [SPARK-6352] [SQL] Custom parquet output committer

    Add a new config, "spark.sql.parquet.output.committer.class", that allows a
    custom Parquet output committer, and add an output committer class intended
    for use on S3.
    Fix the compilation error introduced by https://github.com/apache/spark/pull/5042.
    Respect the ParquetOutputFormat.ENABLE_JOB_SUMMARY flag.
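
    A minimal sketch of how these settings might be applied from a driver
    program. It rests on assumptions that the description above does not
    confirm: that the committer key is read from the Hadoop configuration
    (it may instead be read from the SQL conf), and that the committer's fully
    qualified name matches the source path listed in the commits below. The
    object name and the local master are illustrative only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver object; the name is for illustration only.
object CommitterConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("committer-config-sketch")
      .setMaster("local[2]") // local master so the sketch runs standalone
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Assumption: the key is looked up in the Hadoop configuration.
    // The class name is inferred from the path
    // sql/core/src/main/scala/org/apache/spark/sql/parquet/DirectParquetOutputCommitter.scala
    sc.hadoopConfiguration.set(
      "spark.sql.parquet.output.committer.class",
      "org.apache.spark.sql.parquet.DirectParquetOutputCommitter")

    // ParquetOutputFormat.ENABLE_JOB_SUMMARY is the key
    // "parquet.enable.summary-metadata"; false skips the summary files.
    sc.hadoopConfiguration.setBoolean("parquet.enable.summary-metadata", false)

    // Parquet writes issued through sqlContext after this point would be
    // expected to pick up the settings above.
    sc.stop()
  }
}
```

    If the key turns out to be read from the SQL conf instead, the equivalent
    call would be sqlContext.setConf with the same key and value.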


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ypcat/spark spark-6352

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5525
    
----
commit f75e261c5652e4d6fa69e6f790f5b4a9238ad29e
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-16T06:44:00Z

    DirectParquetOutputCommitter

commit 769bd6737acc3f13a8678688e842e902fc38e802
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-13T03:48:03Z

    DirectParquetOutputCommitter

commit 0fc03ca563c38ae0b625898c59730cdedfb0534b
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-17T07:27:44Z

    [SPARK-6532] [SQL] hide class DirectParquetOutputCommitter

commit c42468c9b207edd995524afc2ebb1f723e375d20
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-17T07:28:28Z

    [SPARK-6352] [SQL] add test case

commit 0d540b9cfc03fc71d228616123f8cad4602e8f14
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-17T07:56:02Z

    [SPARK-6352] [SQL] add license

commit 9ae7545701f522702f2d0240367fc6fba06b7c26
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-23T10:32:15Z

    [SPARL-6352] [SQL] Change to allow custom parquet output committer.
    
    Add a new configuration key: spark.sql.parquet.output.committer.class
    which should be a sub-class of ParquetOutputCommitter

commit e17bf474ab3db112958cf67318d13305bde60788
Author: Pei-Lun Lee <[email protected]>
Date:   2015-03-23T10:37:55Z

    Merge branch 'master' of https://github.com/apache/spark into spark-6352
    
    Conflicts:
        sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetIOSuite.scala

commit fe659151c7f8e2547404fe8a93c6010ceebb865a
Author: Pei-Lun Lee <[email protected]>
Date:   2015-04-14T08:22:07Z

    add support for parquet config parquet.enable.summary-metadata

commit 8413fcd7ec11d2c1892283643f89bd1bf0bbe062
Author: Pei-Lun Lee <[email protected]>
Date:   2015-04-14T08:31:17Z

    Merge branch 'master' of https://github.com/apache/spark into spark-6352
    
    Conflicts:
        sql/core/src/main/scala/org/apache/spark/sql/parquet/DirectParquetOutputCommitter.scala

commit 9ece5c5cb366ba34fd542fe207dcdd6564385448
Author: Pei-Lun Lee <[email protected]>
Date:   2015-04-15T06:10:28Z

    compatibility with hadoop 1.x

commit ddd0f69258a3d61552fbebe3fafeffd364fca322
Author: Pei-Lun Lee <[email protected]>
Date:   2015-04-15T09:45:54Z

    Merge branch 'master' of https://github.com/apache/spark into spark-6352
    
    Conflicts:
        sql/core/src/main/scala/org/apache/spark/sql/parquet/DirectParquetOutputCommitter.scala
        sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetIOSuite.scala

commit 472870e290ffb9e264889d72d6275e7abc6c231f
Author: Pei-Lun Lee <[email protected]>
Date:   2015-04-15T10:19:02Z

    add back custom parquet output committer

----

