GitHub user orenmazor opened a pull request:

    https://github.com/apache/spark/pull/4341

    Sha jar staging

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Shopify/spark sha_jar_staging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4341
    
----
commit 007ae14b0f9feb9555655012ae5c60eccdb4c4a3
Author: Cheng Hao <[email protected]>
Date:   2014-12-12T06:51:49Z

    [SPARK-4825] [SQL] CTAS fails to resolve when created using saveAsTable
    
    Fixes a bug for queries like:
    ```
      test("save join to table") {
        val testData = sparkContext.parallelize(1 to 10).map(i => TestData(i, i.toString))
        sql("CREATE TABLE test1 (key INT, value STRING)")
        testData.insertInto("test1")
        sql("CREATE TABLE test2 (key INT, value STRING)")
        testData.insertInto("test2")
        testData.insertInto("test2")
        sql("SELECT COUNT(a.value) FROM test1 a JOIN test2 b ON a.key = b.key").saveAsTable("test")
        checkAnswer(
          table("test"),
          sql("SELECT COUNT(a.value) FROM test1 a JOIN test2 b ON a.key = b.key").collect().toSeq)
      }
    ```
    
    Author: Cheng Hao <[email protected]>
    
    Closes #3673 from chenghao-intel/spark_4825 and squashes the following 
commits:
    
    e8cbd56 [Cheng Hao] alternate the pattern matching order for logical 
plan:CTAS
    e004895 [Cheng Hao] fix bug

commit da13b63936b26d11dd8c17afd133db9405731c21
Author: Sasaki Toru <[email protected]>
Date:   2014-12-12T06:54:21Z

    [SPARK-4742][SQL] The name of Parquet File generated by 
AppendingParquetOutputFormat should be zero padded
    
    When writing a Parquet file via ParquetOutputFormat#getDefaultWorkFile, the file name is not zero-padded, whereas RDD#saveAsTextFile does zero-pad its output file names.
    
    Author: Sasaki Toru <[email protected]>
    
    Closes #3602 from sasakitoa/parquet-zeroPadding and squashes the following 
commits:
    
    6b0e58f [Sasaki Toru] Merge branch 'master' of 
git://github.com/apache/spark into parquet-zeroPadding
    20dc79d [Sasaki Toru] Fixed the name of Parquet File generated by 
AppendingParquetOutputFormat
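The zero-padding this commit describes can be sketched in plain Python (the helper name is hypothetical, not Spark's actual code); saveAsTextFile-style part files pad the partition id to five digits:

```python
def part_file_name(partition: int) -> str:
    # Zero-pad the partition id to five digits, matching names like part-r-00002
    return f"part-r-{partition:05d}"

print(part_file_name(2))  # part-r-00002
```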

commit 53b74f57fc22fc5645f1e313349542cf77044b13
Author: Daoyuan Wang <[email protected]>
Date:   2014-12-12T06:56:42Z

    [SPARK-4829] [SQL] add rule to fold count(expr) if expr is not null
    
    Author: Daoyuan Wang <[email protected]>
    
    Closes #3676 from adrian-wang/countexpr and squashes the following commits:
    
    dc5765b [Daoyuan Wang] add rule to fold count(expr) if expr is not null
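The idea behind the rule can be illustrated with a hypothetical mini-AST sketch (plain Python, not Catalyst): COUNT over a non-nullable expression never skips a row, so it can be folded to COUNT(1).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attr:
    name: str
    nullable: bool

ONE = Attr("1", nullable=False)  # stands in for the literal 1

def fold_count(child):
    # If the counted expression can never be null, counting it is
    # equivalent to counting the constant 1.
    return ONE if not child.nullable else child
```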

commit 7cdebe97d433fc4346e6b983ae2dd408608b7ccc
Author: Peter Klipfel <[email protected]>
Date:   2014-12-14T08:01:16Z

    fixed spelling errors in documentation
    
    changed "form" to "from" in 3 documentation entries for Kafka integration
    
    Author: Peter Klipfel <[email protected]>
    
    Closes #3691 from peterklipfel/master and squashes the following commits:
    
    0fe7fc5 [Peter Klipfel] fixed spelling errors in documentation

commit d89b26c97941bc9c026671a1380ca18036e26e3f
Author: Patrick Wendell <[email protected]>
Date:   2014-12-15T18:54:45Z

    HOTFIX: Disabling failing block manager test

commit 7cf4d7a4dedb5fe3bd6e108990f70bd848e68066
Author: Yuu ISHIKAWA <[email protected]>
Date:   2014-12-15T21:44:15Z

    [SPARK-4494][mllib] IDFModel.transform() add support for single vector
    
    I improved `IDFModel.transform` to allow using a single vector.
    
    [[SPARK-4494] IDFModel.transform() add support for single vector - ASF 
JIRA](https://issues.apache.org/jira/browse/SPARK-4494)
    
    Author: Yuu ISHIKAWA <[email protected]>
    
    Closes #3603 from yu-iskw/idf and squashes the following commits:
    
    256ff3d [Yuu ISHIKAWA] Fix typo
    a3bf566 [Yuu ISHIKAWA] - Fix typo - Optimize import order - Aggregate the 
assertion tests - Modify `IDFModel.transform` API for pyspark
    d25e49b [Yuu ISHIKAWA] Add the implementation of `IDFModel.transform` for a 
term frequency vector
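What the new single-vector path computes can be sketched elementwise in plain Python (a simplified stand-in for `IDFModel.transform`, not the MLlib implementation): multiply a term-frequency vector by the model's precomputed IDF weights.

```python
def idf_transform(tf, idf):
    # Elementwise product of a term-frequency vector and IDF weights
    if len(tf) != len(idf):
        raise ValueError("dimension mismatch")
    return [t * w for t, w in zip(tf, idf)]
```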

commit 8347fbe06ca08b83c1e4a4291e326b394f5eb24c
Author: Josh Rosen <[email protected]>
Date:   2014-12-15T22:33:43Z

    [SPARK-4826] Fix generation of temp file names in WAL tests
    
    This PR should fix SPARK-4826, an issue where a bug in how we generate 
temp. file names was causing spurious test failures in the write ahead log 
suites.
    
    Closes #3695.
    Closes #3701.
    
    Author: Josh Rosen <[email protected]>
    
    Closes #3704 from JoshRosen/SPARK-4826 and squashes the following commits:
    
    f2307f5 [Josh Rosen] Use Spark Utils class for directory creation/deletion
    a693ddb [Josh Rosen] remove unused Random import
    b275e41 [Josh Rosen] Move creation of temp. dir to beforeEach/afterEach.
    9362919 [Josh Rosen] [SPARK-4826] Fix bug in generation of temp file names. 
in WAL suites.
    86c1944 [Josh Rosen] Revert "HOTFIX: Disabling failing block manager test"

commit 9d3276737facf5838eec8e08404a2fd7f52940db
Author: Ilya Ganelin <[email protected]>
Date:   2014-12-15T22:51:15Z

    [SPARK-1037] The name of findTaskFromList & findTask in 
TaskSetManager.scala is confusing
    
    Hi all - I've renamed the methods referenced in this JIRA to clarify that 
they modify the provided arrays (find vs. deque).
    
    Author: Ilya Ganelin <[email protected]>
    
    Closes #3665 from ilganeli/SPARK-1037B and squashes the following commits:
    
    64c177c [Ilya Ganelin] Renamed deque to dequeue
    f27d85e [Ilya Ganelin] Renamed private methods to clarify that they modify 
the provided parameters
    683482a [Ilya Ganelin] Renamed private methods to clarify that they modify 
the provided parameters

commit be7d6ea34e5f38c5f77d00014a6425224d505e4e
Author: Ryan Williams <[email protected]>
Date:   2014-12-15T22:52:17Z

    [SPARK-4668] Fix some documentation typos.
    
    Author: Ryan Williams <[email protected]>
    
    Closes #3523 from ryan-williams/tweaks and squashes the following commits:
    
    d2eddaa [Ryan Williams] code review feedback
    ce27fc1 [Ryan Williams] CoGroupedRDD comment nit
    c6cfad9 [Ryan Williams] remove unnecessary if statement
    b74ea35 [Ryan Williams] comment fix
    b0221f0 [Ryan Williams] fix a gendered pronoun
    c71ffed [Ryan Williams] use names on a few boolean parameters
    89954aa [Ryan Williams] clarify some comments in {Security,Shuffle}Manager
    e465dac [Ryan Williams] Saved building-spark.md with Dillinger.io
    83e8358 [Ryan Williams] fix pom.xml typo
    dc4662b [Ryan Williams] typo fixes in tuning.md, configuration.md

commit 6c185e926d26bd6ca25cab85e431641cc2e9a4a8
Author: Sean Owen <[email protected]>
Date:   2014-12-16T00:06:15Z

    SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions
    
    This looked like perhaps a simple and important one. `combineByKey` looks 
like it should clean its arguments' closures, and that in turn covers 
apparently all remaining functions in `PairRDDFunctions` which delegate to it.
    
    Author: Sean Owen <[email protected]>
    
    Closes #3690 from srowen/SPARK-785 and squashes the following commits:
    
    8df68fe [Sean Owen] Clean context of most remaining functions in 
PairRDDFunctions, which ultimately call combineByKey

commit cb465e0fbb3da3b8f4abaf49d071eb7a9692a6cc
Author: wangfei <[email protected]>
Date:   2014-12-16T00:46:21Z

    [Minor][Core] fix comments in MapOutputTracker
    
    Using "driver" and "executor" in the comments of `MapOutputTracker` is clearer.
    
    Author: wangfei <[email protected]>
    
    Closes #3700 from scwf/commentFix and squashes the following commits:
    
    aa68524 [wangfei] master and worker should be driver and executor

commit e147c42320c3a8b89940ec93110e6c83fc9954ba
Author: Sean Owen <[email protected]>
Date:   2014-12-16T01:12:05Z

    SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError 
from Hive's LazyBinaryInteger
    
    This enables assertions for the Maven and SBT build, but overrides the Hive 
module to not enable assertions.
    
    Author: Sean Owen <[email protected]>
    
    Closes #3692 from srowen/SPARK-4814 and squashes the following commits:
    
    caca704 [Sean Owen] Disable assertions just for Hive
    f71e783 [Sean Owen] Enable assertions for SBT and Maven build

commit f2c7c90830ecca9603ce53c43ec26dc0a3cdbad2
Author: meiyoula <[email protected]>
Date:   2014-12-16T06:30:18Z

    [SPARK-4792] Add error message when making local dir unsuccessfully
    
    Author: meiyoula <[email protected]>
    
    Closes #3635 from XuTingjun/master and squashes the following commits:
    
    dd1c66d [meiyoula] when old is deleted, it will throw an exception where 
call it
    2a55bc2 [meiyoula] Update DiskBlockManager.scala
    1483a4a [meiyoula] Delete multiple retries to make dir
    67f7902 [meiyoula] Try some times to make dir maybe more reasonable
    1c51a0c [meiyoula] Update DiskBlockManager.scala

commit b1f3e00a550c65ed11dbdccc92ef0ebecebb225a
Author: Davies Liu <[email protected]>
Date:   2014-12-16T06:58:26Z

    [SPARK-4841] fix zip with textFile()
    
    UTF8Deserializer cannot be used in BatchedSerializer, so always use PickleSerializer() when changing batchSize in zip().
    
    Also, if two RDDs already have the same batch size, they do not need to be re-serialized.
    
    Author: Davies Liu <[email protected]>
    
    Closes #3706 from davies/fix_4841 and squashes the following commits:
    
    20ce3a3 [Davies Liu] fix bug in _reserialize()
    e3ebf7c [Davies Liu] add comment
    379d2c8 [Davies Liu] fix zip with textFile()
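The invariant behind the fix can be sketched without Spark (plain Python, hypothetical helper name): two batched streams can only be zipped element-by-element when both sides use the same batch size; otherwise one side must be re-serialized to match.

```python
def rebatch(xs, batch_size):
    # Regroup a flat sequence into batches of the given size
    return [xs[i:i + batch_size] for i in range(0, len(xs), batch_size)]

left = rebatch([1, 2, 3, 4], 2)
right = rebatch(["a", "b", "c", "d"], 2)
# With equal batch sizes, batches line up and elements pair correctly
zipped = [pair for l, r in zip(left, right) for pair in zip(l, r)]
```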

commit 6ce251a008345a26ab94fc6ee567e070523560b9
Author: Davies Liu <[email protected]>
Date:   2014-12-16T19:19:36Z

    [SPARK-4437] update doc for WholeCombineFileRecordReader
    
    update doc for WholeCombineFileRecordReader
    
    Author: Davies Liu <[email protected]>
    Author: Josh Rosen <[email protected]>
    
    Closes #3301 from davies/fix_doc and squashes the following commits:
    
    1d7422f [Davies Liu] Merge pull request #2 from 
JoshRosen/whole-text-file-cleanup
    dc3d21a [Josh Rosen] More genericization in 
ConfigurableCombineFileRecordReader.
    95d13eb [Davies Liu] address comment
    bf800b9 [Davies Liu] update doc for WholeCombineFileRecordReader

commit 01d123ffbf3e5e517ac1ad8f2207cfebac2e8bdd
Author: jbencook <[email protected]>
Date:   2014-12-16T19:37:23Z

    [SPARK-4855][mllib] testing the Chi-squared hypothesis test
    
    This PR tests the pyspark Chi-squared hypothesis test from this commit: c8abddc5164d8cf11cdede6ab3d5d1ea08028708 and moves some of the error messaging into Python.
    
    It is a port of the Scala tests here: 
[HypothesisTestSuite.scala](https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/mllib/stat/HypothesisTestSuite.scala)
    
    Hopefully, SPARK-2980 can be closed.
    
    Author: jbencook <[email protected]>
    
    Closes #3679 from jbencook/master and squashes the following commits:
    
    44078e0 [jbencook] checking that bad input throws the correct exceptions
    f12ee10 [jbencook] removing checks for ValueError since input tests are on 
the Scala side
    7536cf1 [jbencook] removing python checks for invalid input
    a17ee84 [jbencook] [SPARK-2980][mllib] adding unit tests for the pyspark 
chi-squared test
    3aeb0d9 [jbencook] [SPARK-2980][mllib] bringing Chi-squared error messages 
to the python side

commit d52330cdfffb689a8bc80e277c5e30556459cc0a
Author: Mike Jennings <[email protected]>
Date:   2014-12-16T20:13:21Z

    [SPARK-3405] add subnet-id and vpc-id options to spark_ec2.py
    
    Based on this gist:
    https://gist.github.com/amar-analytx/0b62543621e1f246c0a2
    
    We use security group ids instead of security group names to get around this issue:
    https://github.com/boto/boto/issues/350
    
    Author: Mike Jennings <[email protected]>
    Author: Mike Jennings <[email protected]>
    
    Closes #2872 from mvj101/SPARK-3405 and squashes the following commits:
    
    be9cb43 [Mike Jennings] `pep8 spark_ec2.py` runs cleanly.
    4dc6756 [Mike Jennings] Remove duplicate comment
    731d94c [Mike Jennings] Update for code review.
    ad90a36 [Mike Jennings] Merge branch 'master' of 
https://github.com/apache/spark into SPARK-3405
    1ebffa1 [Mike Jennings] Merge branch 'master' into SPARK-3405
    52aaeec [Mike Jennings] [SPARK-3405] add subnet-id and vpc-id options to 
spark_ec2.py

commit 47a43c0475676ab1a0a8abd800b00dbc45e1c49d
Author: Judy Nash <[email protected]>
Date:   2014-12-16T20:37:26Z

    [SQL] SPARK-4700: Add HTTP protocol spark thrift server
    
    Add HTTP protocol support and test cases to the Spark Thrift server, so users can deploy the Thrift server in either TCP or HTTP mode.
    
    Author: Judy Nash <[email protected]>
    Author: judynash <[email protected]>
    
    Closes #3672 from judynash/master and squashes the following commits:
    
    526315d [Judy Nash] correct spacing on startThriftServer method
    31a6520 [Judy Nash] fix code style issues and update sql programming guide 
format issue
    47bf87e [Judy Nash] modify withJdbcStatement method definition to meet less 
than 100 line length
    2e9c11c [Judy Nash] add thrift server in http mode documentation on sql 
programming guide
    1cbd305 [Judy Nash] Merge remote-tracking branch 'upstream/master'
    2b1d312 [Judy Nash] updated http thrift server support based on feedback
    377532c [judynash] add HTTP protocol spark thrift server

commit 4b5fd5f1c632843434b6e61ea263471e75998958
Author: Peter Vandenabeele <[email protected]>
Date:   2014-12-16T21:57:55Z

    [DOCS][SQL] Add a Note on jsonFile having separate JSON objects per line
    
    * This commit hopes to avoid the confusion I faced when trying
      to submit a regular, valid multi-line JSON file, also see
    
      
http://apache-spark-user-list.1001560.n3.nabble.com/Loading-JSON-Dataset-fails-with-com-fasterxml-jackson-databind-JsonMappingException-td20041.html
    
    Author: Peter Vandenabeele <[email protected]>
    
    Closes #3517 from petervandenabeele/pv-docs-note-on-jsonFile-format/01 and 
squashes the following commits:
    
    1f98e52 [Peter Vandenabeele] Revert to people.json and simple Note text
    6b6e062 [Peter Vandenabeele] Change the "JSON" connotation to "txt"
    fca7dfb [Peter Vandenabeele] Add a Note on jsonFile having separate JSON 
objects per line
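The format the Note documents can be shown with a minimal sketch (plain Python standing in for what jsonFile expects): the input must contain one complete JSON object per line, not a single pretty-printed multi-line document.

```python
import json

# One complete, self-contained JSON object per line ("JSON Lines" style)
good_json_file = "\n".join([
    '{"name": "Michael"}',
    '{"name": "Andy", "age": 30}',
])

# Each line parses independently; a pretty-printed multi-line object would not
records = [json.loads(line) for line in good_json_file.splitlines()]
```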

commit 88bef9990fb3ef3cd53faf780c618a24ffd0a42d
Author: jerryshao <[email protected]>
Date:   2014-12-16T22:08:28Z

    [SPARK-4847][SQL]Fix "extraStrategies cannot take effect in SQLContext" 
issue
    
    Author: jerryshao <[email protected]>
    
    Closes #3698 from jerryshao/SPARK-4847 and squashes the following commits:
    
    4741130 [jerryshao] Make later added extraStrategies effect when calling 
strategies

commit 0574fd8c2012da8fccb95f4f519426deb807d906
Author: zsxwing <[email protected]>
Date:   2014-12-16T22:13:40Z

    [SPARK-4812][SQL] Fix the initialization issue of 'codegenEnabled'
    
    The problem is that `codegenEnabled` is a `val` whose initializer reads another `val`, `sqlContext`, which can be overridden by subclasses. Here is a simple example that shows the issue.
    
    ```Scala
    scala> :paste
    // Entering paste mode (ctrl-D to finish)
    
    abstract class Foo {
    
      protected val sqlContext = "Foo"
    
      val codegenEnabled: Boolean = {
        println(sqlContext) // it will call the subclass's `sqlContext`, which has not yet been initialized
        if (sqlContext != null) {
          true
        } else {
          false
        }
      }
    }
    
    class Bar extends Foo {
      override val sqlContext = "Bar"
    }
    
    println(new Bar().codegenEnabled)
    
    // Exiting paste mode, now interpreting.
    
    null
    false
    defined class Foo
    defined class Bar
    ```
    
    We should make `sqlContext` `final` to prevent subclasses from overriding 
it incorrectly.
    
    Author: zsxwing <[email protected]>
    
    Closes #3660 from zsxwing/SPARK-4812 and squashes the following commits:
    
    1cbb623 [zsxwing] Make `sqlContext` final to prevent subclasses from 
overriding it incorrectly

commit d136891dc408795aacdcd11634e2e968c28bc96b
Author: Holden Karau <[email protected]>
Date:   2014-12-16T22:37:04Z

    SPARK-4767: Add support for launching in a specified placement group to 
spark_ec2
    
    Placement groups are cool and all the cool kids are using them. Let's add support for them to spark_ec2.py because I'm lazy.
    
    Author: Holden Karau <[email protected]>
    
    Closes #3623 from 
holdenk/SPARK-4767-add-support-for-launching-in-a-specified-placement-group-to-spark-ec2-scripts
 and squashes the following commits:
    
    111a5fd [Holden Karau] merge in master
    70ace25 [Holden Karau] Placement groups are cool and all the cool kids are 
using them. Lets add support for them to spark_ec2.py because I'm lazy

commit 010e1a66fd2a0843704f52232e1d5dbcc6e779a1
Author: wangxiaojing <[email protected]>
Date:   2014-12-16T22:45:56Z

    [SPARK-4527][SQL] Add BroadcastNestedLoopJoin operator selection testsuite
    
    In `JoinSuite` add BroadcastNestedLoopJoin operator selection testsuite
    
    Author: wangxiaojing <[email protected]>
    
    Closes #3395 from wangxiaojing/SPARK-4527 and squashes the following 
commits:
    
    ea0e495 [wangxiaojing] change style
    53c3952 [wangxiaojing] Add BroadcastNestedLoopJoin operator selection 
testsuite

commit 6d0d264b6b2b2d080a74b855a01e47ad94150569
Author: tianyi <[email protected]>
Date:   2014-12-16T23:22:29Z

    [SPARK-4483][SQL] Reduce memory costs during HashOuterJoin
    
    In `HashOuterJoin.scala`, Spark reads data from both sides of the join before zipping them together, which wastes memory. This patch instead reads data from only one side, puts it into a hashmap, and then generates each `JoinedRow` with data from the other side one row at a time.
    Currently this optimization only covers `left outer join` and `right outer join`; `full outer join` will be handled in a separate issue.
    
    Benchmark setup:
    - table test_csv contains 1 million records
    - table dim_csv contains 10 thousand records
    
    SQL:
    `select * from test_csv a left outer join dim_csv b on a.key = b.key`
    
    the result is:
    master:
    ```
    CSV: 12671 ms
    CSV: 9021 ms
    CSV: 9200 ms
    Current Mem Usage:787788984
    ```
    after patch:
    ```
    CSV: 10382 ms
    CSV: 7543 ms
    CSV: 7469 ms
    Current Mem Usage:208145728
    ```
    
    Author: tianyi <[email protected]>
    Author: tianyi <[email protected]>
    
    Closes #3375 from tianyi/SPARK-4483 and squashes the following commits:
    
    72a8aec [tianyi] avoid having mutable state stored inside of the task
    99c5c97 [tianyi] performance optimization
    d2f94d7 [tianyi] fix bug: missing output when the join-key is null.
    2be45d1 [tianyi] fix spell bug
    1f2c6f1 [tianyi] remove commented codes
    a676de6 [tianyi] optimize some codes
    9e7d5b5 [tianyi] remove commented old codes
    838707d [tianyi] Optimization about reduce memory costs during the 
HashOuterJoin
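The build/stream idea the patch describes can be sketched in plain Python (a simplified model, not the Spark operator): hash only one side into a map, then stream the other side row by row, emitting joined rows without materializing both inputs.

```python
def left_outer_join(left, right):
    # Build a hashmap over the right (build) side only
    build = {}
    for k, v in right:
        build.setdefault(k, []).append(v)
    # Stream the left side, emitting one joined row at a time
    out = []
    for k, v in left:
        matches = build.get(k)
        if matches:
            out.extend((k, (v, m)) for m in matches)
        else:
            out.append((k, (v, None)))  # unmatched left rows keep a null right side
    return out
```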

commit b4abf0f384513b8403df29adf8f0b9a91a79bb7c
Author: Michael Armbrust <[email protected]>
Date:   2014-12-16T23:31:19Z

    [SPARK-4827][SQL] Fix resolution of deeply nested Project(attr, 
Project(Star,...)).
    
    Since `AttributeReference` resolution and `*` expansion are currently in separate rules, each pair requires a full iteration instead of being resolved in a single pass. Since it's pretty easy to construct queries that have many of these in a row, this PR combines them into a single rule.
    
    Author: Michael Armbrust <[email protected]>
    
    Closes #3674 from marmbrus/projectStars and squashes the following commits:
    
    d83d6a1 [Michael Armbrust] Fix resolution of deeply nested Project(attr, 
Project(Star,...)).

commit b2ff56dd71e11a4f0fac1dda1c7492e5eeeaafd0
Author: Jacky Li <[email protected]>
Date:   2014-12-16T23:34:59Z

    [SPARK-4269][SQL] make wait time configurable in BroadcastHashJoin
    
    In BroadcastHashJoin, a hard-coded value (5 minutes) is currently used to wait for the execution and broadcast of the small table.
    This should be a configurable value, since the broadcast may take longer than 5 minutes in some cases, such as a busy or congested network environment.
    
    Author: Jacky Li <[email protected]>
    
    Closes #3133 from jackylk/timeout-config and squashes the following commits:
    
    733ac08 [Jacky Li] add spark.sql.broadcastTimeout in SQLConf.scala
    557acd4 [Jacky Li] switch to sqlContext.getConf
    81a5e20 [Jacky Li] make wait time configurable in BroadcastHashJoin
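The mechanism can be sketched in plain Python (a hypothetical model of the join awaiting its broadcast future, not Spark's Scala code): the wait timeout is read from configuration instead of being hard-coded.

```python
from concurrent.futures import ThreadPoolExecutor

# Timeout now comes from configuration (seconds) rather than a fixed 5 minutes
conf = {"spark.sql.broadcastTimeout": "600"}
timeout = int(conf["spark.sql.broadcastTimeout"])

with ThreadPoolExecutor(max_workers=1) as pool:
    # Stand-in for broadcasting the small table asynchronously
    future = pool.submit(lambda: "small-table")
    broadcast_result = future.result(timeout=timeout)
```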

commit b58814c59bd172a817d42b283b107a87eafde15b
Author: Andrew Or <[email protected]>
Date:   2014-12-17T01:55:27Z

    [Release] Major improvements to generate contributors script
    
    This commit introduces several major improvements to the script
    that generates the contributors list for release notes, notably:
    
    (1) Use release tags instead of a range of commits. Across branches,
    commits are not actually strictly two-dimensional, and so it is not
    sufficient to specify a start hash and an end hash. Otherwise, we
    end up counting commits that were already merged in an older branch.
    
    (2) Match PR numbers in addition to commit hashes. This is related
    to the first point in that if a PR is already merged in an older
    minor release tag, it should be filtered out here. This requires us
    to do some intelligent regex parsing on the commit description in
    addition to just relying on the GitHub API.
    
    (3) Relax author validity check. The old code fails on a name that
    has many middle names, for instance. The test was just too strict.
    
    (4) Use GitHub authentication. This allows us to make far more
    requests through the GitHub API than before (5000 as opposed to 60
    per hour).
    
    (5) Translate from Github username, not commit author name. This is
    important because the commit author name is not always configured
    correctly by the user. For instance, the username "falaki" used to
    resolve to just "Hossein", which was treated as a github username
    and translated to something else that is completely arbitrary.
    
    (6) Add an option to use the untranslated name. If there is not
    a satisfactory candidate to replace the untranslated name with,
    at least allow the user to not translate it.

commit 5ef3a1b37b6ee385f84431fd2c17ace1b119a67d
Author: Andrew Or <[email protected]>
Date:   2014-12-17T03:28:43Z

    [Release] Cache known author translations locally
    
    This bypasses unnecessary calls to the Github and JIRA API.
    Additionally, having a local cache allows us to remember names
    that we had to manually discover ourselves.

commit bdf561cc4ce2c4f5b398b38b57cf33cc0388b156
Author: Cheng Lian <[email protected]>
Date:   2014-12-17T05:16:03Z

    [SPARK-4798][SQL] A new set of Parquet testing API and test suites
    
    This PR provides a set of Parquet testing APIs (see trait `ParquetTest`) that enables developers to write more concise test cases. A new set of Parquet test suites built upon this API is added, aiming to replace the old `ParquetQuerySuite`. To avoid potential merge conflicts, the old testing code is not removed yet. The following classes can be safely removed after most Parquet-related PRs are handled:
    
    - `ParquetQuerySuite`
    - `ParquetTestData`
    
    
    Author: Cheng Lian <[email protected]>
    
    Closes #3644 from liancheng/parquet-tests and squashes the following 
commits:
    
    800e745 [Cheng Lian] Enforces ordering of test output
    3bb8731 [Cheng Lian] Refactors HiveParquetSuite
    aa2cb2e [Cheng Lian] Decouples ParquetTest and TestSQLContext
    7b43a68 [Cheng Lian] Updates ParquetTest Scaladoc
    7f07af0 [Cheng Lian] Adds a new set of Parquet test suites

commit 5db941c9d05c575c5414b544e3ee6fb6491cec38
Author: Cheng Hao <[email protected]>
Date:   2014-12-17T05:18:39Z

    [SPARK-4744] [SQL] Short circuit evaluation for AND & OR in CodeGen
    
    Author: Cheng Hao <[email protected]>
    
    Closes #3606 from chenghao-intel/codegen_short_circuit and squashes the 
following commits:
    
    f466303 [Cheng Hao] short circuit for AND & OR
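Short-circuit evaluation, the behavior this commit brings to the generated code, can be illustrated in plain Python (null handling in the real codegen is omitted): the right operand is evaluated only when the left side does not already decide the result.

```python
calls = []

def right_side():
    # Records that the right operand was actually evaluated
    calls.append("right")
    return True

def short_circuit_and(left, right):
    # If the left side is False, the AND is decided without touching the right
    return right() if left else False

and_false = short_circuit_and(False, right_side)  # right_side never runs
and_true = short_circuit_and(True, right_side)    # right_side runs once
```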

----

