GitHub user andrewor14 opened a pull request:
https://github.com/apache/spark/pull/8111
[SPARK-9580] [SQL] [WIP] Replace singletons TestSQLContext and TestData
A fundamental limitation of the existing SQL tests is that *there is simply
no way to create your own `SparkContext`*. This is a serious problem because
a test may need to use a different master or config. As a case in point,
`BroadcastJoinSuite` is entirely commented out because there is no way to make
it pass with the existing infrastructure.
This patch removes the singletons `TestSQLContext` and `TestData`, and
instead introduces a `SharedSQLContext` that starts a context per suite.
Unfortunately, the singletons were so ingrained in the SQL tests that this
patch had to touch *all* the SQL test files.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewor14/spark sql-tests-refactor
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8111.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8111
----
commit 16ca7acca881e167a3d42c903e6f3872a72c9f6a
Author: Andrew Or <[email protected]>
Date: 2015-08-04T23:14:40Z
It compiles!!
commit 5e6a20c37562fcc792c56fcdf8e9df16a661daf1
Author: Andrew Or <[email protected]>
Date: 2015-08-05T01:17:44Z
Refactor SQLTestUtils to reduce duplication
commit d1d1449f3c01c8bb08cc56e9169180313f187bf3
Author: Andrew Or <[email protected]>
Date: 2015-08-05T17:30:31Z
Remove HiveTest singleton
This allows us to use custom SparkContexts in hive tests.
commit d4aafb16c4201fc31cc0875906f751eee50a9dc1
Author: Andrew Or <[email protected]>
Date: 2015-08-05T18:15:03Z
Avoid the need to switch to HiveContexts
This is a cleanup that refactors the helper test traits and abstract
classes so that they are accessible to hive tests.
commit 6345cee8f4fca7d5819bfddad51cdf6fe5d4bfe8
Author: Andrew Or <[email protected]>
Date: 2015-08-05T18:46:25Z
Clean up JsonSuite
commit 68ac6fe3994cbe3aa4a51909cf9e130e8c46cfb1
Author: Andrew Or <[email protected]>
Date: 2015-08-05T19:05:35Z
Rename the test traits properly
MyTestSQLContext -> SharedSQLContext
MyTestHiveContext -> SharedHiveContext
LocalSQLContext -> TestSQLContext
TestData -> TestSQLData
commit b15fdc6713c125b8f3ea0047c752351bcbc84d0c
Author: Andrew Or <[email protected]>
Date: 2015-08-05T20:04:21Z
Stop SparkContext in Java SQL tests
commit 0d74a72b9547c2eb0bf9ca592e9e56c2e46f0836
Author: Andrew Or <[email protected]>
Date: 2015-08-05T21:20:07Z
Load test data early in case tables are accessed by name
The test data is currently defined as a set of lazy vals.
If a table is accessed by name, however, the corresponding val is
never referenced, so the data won't be loaded automatically. This
patch adds an explicit method call that loads the data if necessary.
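The lazy-val issue described above can be reproduced without Spark. The following is only a minimal sketch: `Catalog`, `TestData`, `numbers`, and `loadTestData` are hypothetical stand-ins chosen for illustration, not the names or API used in this patch.

```scala
import scala.collection.mutable

object LazyTestDataSketch {
  // Hypothetical stand-in for a table catalog keyed by table name.
  final class Catalog {
    private val tables = mutable.Map.empty[String, Seq[Int]]
    def register(name: String, rows: Seq[Int]): Unit = tables(name) = rows
    def contains(name: String): Boolean = tables.contains(name)
  }

  final class TestData(catalog: Catalog) {
    // The table is registered only when this val is first dereferenced.
    lazy val numbers: Seq[Int] = {
      val rows = Seq(1, 2, 3)
      catalog.register("numbers", rows)
      rows
    }

    // Explicit hook: referencing each lazy val forces its registration.
    def loadTestData(): Unit = {
      numbers
      ()
    }
  }

  def main(args: Array[String]): Unit = {
    val catalog = new Catalog
    val data = new TestData(catalog)
    println(catalog.contains("numbers")) // false: no one has touched the lazy val
    data.loadTestData()
    println(catalog.contains("numbers")) // true: lookup by name now succeeds
  }
}
```

A suite that only ever refers to a table by its string name would call the explicit load method once before its tests run.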
commit eee415d1899162534753aab4774a69374d19cf5f
Author: Andrew Or <[email protected]>
Date: 2015-08-10T22:18:16Z
Refactor implicits into SQLTestUtils
This commit allows us to call `import testImplicits._` in the test
constructor and use implicit methods properly. This was previously
not possible without also starting a SQLContext in the constructor.
Instead, now we can properly use implicits *while* starting the
SQLContext in `beforeAll`.
However, there is currently an issue with tests using the test
data prepared in advance. This will be fixed in the subsequent
commit.
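The idea of importing implicits in the constructor while the context is only created in `beforeAll` can be sketched in plain Scala. This is an illustration under assumptions, not the code in this patch: `FakeContext`, `SharedContext`, `SeqOps`, and `describe` are hypothetical, and only the name `testImplicits` is taken from the commit message. The key point is that the implicits reach the context through a `def`, so nothing is dereferenced at import time.

```scala
object ImplicitsSketch {
  // Hypothetical stand-in for a SQLContext.
  final class FakeContext(val name: String)

  trait SharedContext {
    private var _ctx: FakeContext = null

    // Accessed lazily through a def; null until beforeAll() has run.
    protected def ctx: FakeContext = _ctx

    def beforeAll(): Unit = { _ctx = new FakeContext("per-suite") }
    def afterAll(): Unit = { _ctx = null }

    // Importing from this object is resolved at compile time and does not
    // touch `ctx`; the context is read only when an implicit method runs.
    protected object testImplicits {
      implicit class SeqOps(rows: Seq[Int]) {
        def describe: String = s"${rows.size} rows on ${ctx.name}"
      }
    }
  }

  class ExampleSuite extends SharedContext {
    import testImplicits._ // constructor time: the context does not exist yet

    def runTest(): String = Seq(1, 2, 3).describe
  }

  def main(args: Array[String]): Unit = {
    val suite = new ExampleSuite
    suite.beforeAll()
    println(suite.runTest()) // 3 rows on per-suite
    suite.afterAll()
  }
}
```

Calling `runTest()` before `beforeAll()` would throw a `NullPointerException` in this sketch, which is why anything that materializes data has to wait until the context is ready.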
commit 55d0b1bd314dcd61a9808b92cf8099edb315cb9b
Author: Andrew Or <[email protected]>
Date: 2015-08-10T22:30:56Z
Fix Java not serializable exception in tests
Tests that use test data failed before this commit. This is
because the underlying case classes, being nested in the trait,
would drag the entire `SQLTestData` trait along with them when
serialized. This no longer happens after we move the case classes
outside of the trait.
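The serialization failure mode can be shown with plain Java serialization and no Spark at all. This is a sketch under assumptions: `DataTrait`, `NestedRecord`, and `StandaloneRecord` are hypothetical names, not the classes in this patch. An instance of a case class nested in a trait keeps a hidden reference to the enclosing instance, so serializing it also tries to serialize that instance.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object SerializationSketch {
  // Defined outside any trait: carries no reference to an enclosing instance.
  case class StandaloneRecord(key: Int, value: String)

  // Hypothetical stand-in for a test-data trait; it is not Serializable.
  trait DataTrait {
    // Each instance holds a hidden reference to the enclosing DataTrait.
    case class NestedRecord(key: Int, value: String)
  }

  // Returns true if Java serialization of `obj` succeeds.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val holder = new DataTrait {}
    println(canSerialize(holder.NestedRecord(1, "a"))) // false: the outer instance is not serializable
    println(canSerialize(StandaloneRecord(1, "a")))    // true
  }
}
```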
commit 4f59beef61c7aceac8fe6400b720e1ef12bc4beb
Author: Andrew Or <[email protected]>
Date: 2015-08-10T23:33:28Z
Fix DataSourceTest et al.
Test suites that extend DataSourceTest used to have this weird
implicit SQLContext that was created in the constructor. This was
failing tests because the base SQLContext is not ready until after
the first test is run. A minor refactor was required to fix the
resulting NPEs.
This commit also fixes test suites that need to materialize the
test data. These suites were materializing them in the constructor
before the SQLContext was ready.
commit 88d4f16f543e65dc709b99034fd169687fd0b2b1
Author: Andrew Or <[email protected]>
Date: 2015-08-11T02:35:22Z
Fix hive tests to use the same pattern
This makes hive tests use the same pattern as SQL tests, i.e.
everything inherits HiveTestUtils, and those that want to use
implicits can do `import testImplicits._`.
commit 5fe4bfb7f151e14b409b57b3782dfcc3ee91e562
Author: Andrew Or <[email protected]>
Date: 2015-08-11T03:30:07Z
Merge branch 'master' of github.com:apache/spark into sql-tests-refactor
Conflicts:
sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/RowFormatConvertersSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlSerializer2Suite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeKVExternalSorterSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/TestJsonData.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetAvroCompatibilitySuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetCompatibilityTest.scala
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/sources/SaveLoadSuite.scala
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala
sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHiveContext.scala
sql/hive/src/test/java/test/org/apache/spark/sql/hive/JavaDataFrameSuite.java
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveMetastoreCatalogSuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/ParquetHiveCompatibilitySuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveExplainSuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala
sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala
sql/hive/src/test/scala/org/apache/spark/sql/sources/ParquetHadoopFsRelationSuite.scala
commit c4a22bcb8d26362f741c9b24f31552694b3d4f64
Author: Andrew Or <[email protected]>
Date: 2015-08-11T03:39:14Z
Fix compile after resolving merge conflicts
commit d9a839093adc6bd6106493df56b664e0c7ec8e1b
Author: Andrew Or <[email protected]>
Date: 2015-08-11T17:19:38Z
Merge branch 'master' of github.com:apache/spark into sql-tests-refactor
Conflicts:
sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/execution/joins/SemiJoinSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/test/SQLTestUtils.scala
commit f5619f8201f8dff2b335d42e9fa1b303735de4a8
Author: Andrew Or <[email protected]>
Date: 2015-08-11T20:08:42Z
Fix test compile after resolving merge conflicts
commit 9395cfad98eafc817d1347ac26491aae87fb8e81
Author: Andrew Or <[email protected]>
Date: 2015-08-11T20:26:48Z
Merge branch 'master' of github.com:apache/spark into sql-tests-refactor
Conflicts:
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
----