GitHub user andrewor14 opened a pull request:

    https://github.com/apache/spark/pull/11189

    [SPARK-13080] [SQL] [WIP] Implement new Catalog API using Hive

    This is a step towards merging `SQLContext` and `HiveContext`. A new 
internal `Catalog` API was introduced in #10982 and extended in #11069. This 
patch introduces an implementation of this API using `HiveClient`, an existing 
interface to Hive. It also extends `HiveClient` with additional calls to Hive 
that are needed to complete the catalog implementation.
    
    The new class hierarchy is as follows:
    ```
    org.apache.spark.sql.catalyst.catalog.Catalog
      - org.apache.spark.sql.catalyst.catalog.InMemoryCatalog
      - org.apache.spark.sql.hive.HiveCatalog
    ```
    Note that, as of this patch, none of these classes is used anywhere yet. 
They will be put to use in a follow-up change before the Spark 2.0 release.
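    
    To make the hierarchy above concrete, here is a minimal sketch of how the two implementations might relate. It is not the code in this patch: `CatalogDatabase` is reduced to two fields, only three database operations are shown, and `MetastoreClient` is a hypothetical stand-in for `HiveClient` whose method names are assumptions made for illustration.
    
    ```scala
    import scala.collection.mutable
    
    // Simplified, hypothetical database descriptor (the real one carries more fields).
    case class CatalogDatabase(name: String, description: String)
    
    // A small subset of the catalog interface; method names follow the commit log.
    abstract class Catalog {
      def createDatabase(db: CatalogDatabase, ignoreIfExists: Boolean): Unit
      def databaseExists(name: String): Boolean
      def listDatabases(): Seq[String]
    }
    
    // Keeps all metadata in a local map; no external system involved.
    class InMemoryCatalog extends Catalog {
      private val databases = mutable.HashMap.empty[String, CatalogDatabase]
    
      override def createDatabase(db: CatalogDatabase, ignoreIfExists: Boolean): Unit = {
        if (databases.contains(db.name)) {
          if (!ignoreIfExists) {
            throw new IllegalArgumentException(s"Database ${db.name} already exists")
          }
        } else {
          databases.put(db.name, db)
        }
      }
    
      override def databaseExists(name: String): Boolean = databases.contains(name)
    
      override def listDatabases(): Seq[String] = databases.keys.toSeq.sorted
    }
    
    // Hypothetical stand-in for HiveClient: the calls the catalog needs from Hive.
    trait MetastoreClient {
      def createDatabase(db: CatalogDatabase): Unit
      def getDatabaseOption(name: String): Option[CatalogDatabase]
      def listDatabases(): Seq[String]
    }
    
    // Delegates every operation to the metastore client instead of holding state itself.
    class HiveCatalog(client: MetastoreClient) extends Catalog {
      override def createDatabase(db: CatalogDatabase, ignoreIfExists: Boolean): Unit = {
        if (client.getDatabaseOption(db.name).isDefined) {
          if (!ignoreIfExists) {
            throw new IllegalArgumentException(s"Database ${db.name} already exists")
          }
        } else {
          client.createDatabase(db)
        }
      }
    
      override def databaseExists(name: String): Boolean =
        client.getDatabaseOption(name).isDefined
    
      override def listDatabases(): Seq[String] = client.listDatabases()
    }
    ```
    
    Callers written against `Catalog` would not need to know which implementation they hold, which is presumably what makes this a step towards merging `SQLContext` and `HiveContext`.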
    
    WIP pending tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewor14/spark hive-catalog

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11189
    
----
commit 3b6660578f23c69abfb59fae6796ee10bf4d482d
Author: Andrew Or <[email protected]>
Date:   2016-02-10T21:16:30Z

    Add skeleton for HiveCatalog

commit f3e094ad21bd38d400f90b93898995182a508e9b
Author: Andrew Or <[email protected]>
Date:   2016-02-10T21:34:36Z

    Implement createDatabase

commit 4b09a7da8ddcc17a813e494d868a6ea55f01cd2e
Author: Andrew Or <[email protected]>
Date:   2016-02-10T21:48:00Z

    Fix style

commit 526f278d78664c49572fd1b48495ca99d12d1896
Author: Andrew Or <[email protected]>
Date:   2016-02-10T21:59:02Z

    Implement dropDatabase

commit 4aa6e66b5ee9fa2e5f8e4b9955ed98de5b35a57c
Author: Andrew Or <[email protected]>
Date:   2016-02-10T22:06:08Z

    Implement alterDatabase

commit 433d180260c57a905e226f0b8686eeb92d5dc938
Author: Andrew Or <[email protected]>
Date:   2016-02-10T22:14:15Z

    Implement getDatabase, listDatabases and databaseExists

commit ff5c5bea8d4d84ae56acd4caf225e59231b946ba
Author: Andrew Or <[email protected]>
Date:   2016-02-10T23:18:53Z

    Implement createTable
    
    This required converting o.a.s.sql.catalyst.catalog.Table to its
    counterpart, o.a.s.sql.hive.client.HiveTable. That in turn required
    making o.a.s.sql.hive.client.TableType an enum because we need
    to create one of these from its name.

commit ff49f0cf6fabc645121b43b5746017c838a3551d
Author: Andrew Or <[email protected]>
Date:   2016-02-10T23:22:38Z

    Explicitly mark methods with override in HiveCatalog

commit ca98c00264564717ddd427282bfff301ebdb6c70
Author: Andrew Or <[email protected]>
Date:   2016-02-10T23:25:27Z

    Implement dropTable

commit 71f99646cdf30a68a8e592b80ef5a6f40685551b
Author: Andrew Or <[email protected]>
Date:   2016-02-10T23:40:37Z

    Implement renameTable, alterTable

commit 13795d83c325a69fb35260c300b379e2e55725aa
Author: Andrew Or <[email protected]>
Date:   2016-02-12T00:51:36Z

    Remove intermediate representation of tables, columns etc.
    
    Currently there are three table representations: the catalog
    table, the Spark table used in the hive module, and the Hive table.
    To avoid converting back and forth between them, we kill the
    intermediate one, which is the one currently used throughout
    HiveClient and friends.

commit af5ffc0ee84f3dc3c2b9249228293ae7285f916e
Author: Andrew Or <[email protected]>
Date:   2016-02-12T01:34:24Z

    Remove TableType enum
    
    Instead, this commit introduces CatalogTableType that serves
    the same purpose. This adds some type-safety and keeps the code
    clean.

commit d7b18e628374659f0a792d5c5a9154711fc9073b
Author: Andrew Or <[email protected]>
Date:   2016-02-12T01:48:30Z

    Re-implement all table operations after the refactor

commit a915d01eac651994c4d69b961299b476fe40f77d
Author: Andrew Or <[email protected]>
Date:   2016-02-12T20:50:39Z

    Implement all partition operations

commit 3ceb88d51a6e6af92cff2e90622ba235d0d107e9
Author: Andrew Or <[email protected]>
Date:   2016-02-12T22:04:45Z

    Implement all function operations

commit 07332ad6803e578d9a61cc4693d8ce665ad8c29a
Author: Andrew Or <[email protected]>
Date:   2016-02-12T22:10:33Z

    Simplify alterDatabase
    
    The operation doesn't support renaming anyway, so it doesn't
    make sense to pass in a name AND a CatalogDatabase that always
    has the same name.
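    
    A rough sketch of the simplification described above, assuming a
    hypothetical in-memory `DatabaseStore` and a cut-down
    `CatalogDatabase`; the actual signatures in the patch may differ:
    
    ```scala
    import scala.collection.mutable
    
    // Hypothetical, cut-down database descriptor.
    case class CatalogDatabase(name: String, properties: Map[String, String])
    
    // Hypothetical store used only to illustrate the signature change.
    class DatabaseStore {
      private val databases = mutable.HashMap.empty[String, CatalogDatabase]
    
      def createDatabase(db: CatalogDatabase): Unit = databases.put(db.name, db)
    
      def getDatabase(name: String): Option[CatalogDatabase] = databases.get(name)
    
      // Before (assumed shape): alterDatabase(name: String, db: CatalogDatabase),
      // where db.name always had to equal name.
      // After: the name is read from the descriptor itself, so the two can
      // never disagree and renaming is visibly out of scope.
      def alterDatabase(db: CatalogDatabase): Unit = {
        require(databases.contains(db.name), s"Database ${db.name} does not exist")
        databases.put(db.name, db)
      }
    }
    ```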

----

