Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1439#issuecomment-49483743 I'll just add that the `HiveTableReader` vs `HiveTableScan` separation is purely artificial, and the split is based on what code was stolen from Shark vs what code was written for Spark SQL. It would be reasonable to combine them at some point. However, for this PR it would be great to just fix the bug at hand. If we are going to do major refactoring, I'd want to see benchmarks showing that we aren't introducing any performance regressions. It would also be nice to see a test case that currently fails but passes once this PR is applied.