GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/5183

    [SPARK-6450] [SQL] Fixes metastore Parquet table conversion

    The `ParquetConversions` analysis rule builds a hash map that maps 
the original `MetastoreRelation` instances to the newly created 
`ParquetRelation2` instances. However, `MetastoreRelation.equals` doesn't 
compare output attributes. Thus, if a single metastore Parquet table appears 
multiple times in a query, only a single entry ends up in the hash map, and 
the conversion is not performed correctly.
    
    The proper fix for this issue would be to override `equals` and `hashCode` 
for `MetastoreRelation`. Unfortunately, that change breaks more tests than 
expected; it's possible that those tests were ill-formed from the very 
beginning. As the 1.3.1 release is approaching, we'd like to keep this change 
surgical to avoid potential regressions. The fix proposed here is to use both 
the metastore relation and its output attributes as the key of the hash map 
used in `ParquetConversions`.
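
    The sketch below is a minimal Scala illustration of the keying change, not 
the actual Spark source; `Attribute`, `MetastoreRelation`, `ParquetRelation2`, 
and the `buildConversionMap` helper are simplified stand-ins for the real 
Catalyst/Hive classes, and only the shape of the map key matters:

    // Simplified stand-ins for the real classes; the real MetastoreRelation's
    // equals/hashCode likewise ignore the output attributes, which is why two
    // occurrences of the same table collapse into a single map entry.
    case class Attribute(name: String, exprId: Long)

    case class MetastoreRelation(db: String, table: String, output: Seq[Attribute]) {
      override def equals(other: Any): Boolean = other match {
        case that: MetastoreRelation => db == that.db && table == that.table
        case _ => false
      }
      override def hashCode(): Int = (db, table).hashCode()
    }

    case class ParquetRelation2(paths: Seq[String], output: Seq[Attribute])

    // Before: Map[MetastoreRelation, ParquetRelation2] -- both occurrences of a
    // self-joined table share one entry, so one side is never rewritten.
    // After: key on (relation, output attributes), so each occurrence keeps its
    // own converted ParquetRelation2.
    def buildConversionMap(
        toBeConverted: Seq[(MetastoreRelation, ParquetRelation2)])
      : Map[(MetastoreRelation, Seq[Attribute]), ParquetRelation2] =
      toBeConverted.map { case (m, p) => (m, m.output) -> p }.toMap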

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-6450

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5183
    
----
commit 353678059ed1c776632f22b234f8d4cdfc19ccb6
Author: Cheng Lian <[email protected]>
Date:   2015-03-25T05:13:33Z

    Fixes metastore Parquet table conversion

----


