[
https://issues.apache.org/jira/browse/SPARK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheng Hao updated SPARK-5941:
-----------------------------
Comment: was deleted
(was: Eagerly resolving the table probably causes side effects in some
scenarios, so let's keep the same behavior (deferred resolution) as the other
DF APIs.
In the meantime, I noticed that during the leftsemijoin unit test the table
sales is loaded twice, so we get duplicated records (double the records).
This makes the unit test fail once the DataFrameImpl code is updated to use
UnresolvedRelation instead of calling lookupRelation eagerly.
The root cause is that leftsemijoin.q contains a data loading command for
table sales, but in TestHive the table sales is also registered as a
TestTable. An UnresolvedRelation triggers loading of the table data if the
table is registered as a TestTable, whereas a ResolvedRelation does not.)
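
The double-loading mechanism described in the deleted comment can be sketched
as a toy model. This is plain Python, not Spark code: ToyTestHive and its
methods (register_test_table, load_data, resolve, lookup) are hypothetical
names invented for illustration, and the sample rows are made up. It only
assumes what the comment states: the deferred (unresolved) path triggers the
harness loader for a registered test table, while the eager lookup path does
not.

```python
# Toy model only: hypothetical stand-ins, not Spark's or TestHive's real API.

class ToyTestHive:
    """Mimics a test harness that lazily loads pre-registered test tables."""

    def __init__(self):
        self.test_tables = {}   # name -> rows the harness loads on first use
        self.loaded = set()     # test tables whose harness loader already ran
        self.storage = {}       # name -> rows currently stored in the table

    def register_test_table(self, name, rows):
        self.test_tables[name] = list(rows)

    def load_data(self, name, rows):
        # Stands in for a data loading command inside a .q script.
        self.storage.setdefault(name, []).extend(rows)

    def resolve(self, name):
        # Deferred path: resolving an unresolved relation runs the harness
        # loader the first time a registered test table is referenced.
        if name in self.test_tables and name not in self.loaded:
            self.loaded.add(name)
            self.load_data(name, self.test_tables[name])
        return self.storage.get(name, [])

    def lookup(self, name):
        # Eager path: returns the table as it is, without the harness loader.
        return self.storage.get(name, [])


rows = [("a", 1), ("b", 2)]  # made-up sample data

eager = ToyTestHive()
eager.register_test_table("sales", rows)
eager.load_data("sales", rows)           # the script's own load command
eager_count = len(eager.lookup("sales"))

deferred = ToyTestHive()
deferred.register_test_table("sales", rows)
deferred.load_data("sales", rows)        # the script's own load command
deferred_count = len(deferred.resolve("sales"))

print(eager_count, deferred_count)
```

In this sketch the eager path sees only the rows loaded by the script, while
the deferred path sees them twice, which matches the doubled record count
reported above.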
> Unit Test loads the table `src` twice for leftsemijoin.q
> --------------------------------------------------------
>
> Key: SPARK-5941
> URL: https://issues.apache.org/jira/browse/SPARK-5941
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Cheng Hao
>
> leftsemijoin.q already contains a data loading command for table sales,
> but TestHive also creates the table sales, which causes duplicated
> records to be inserted into sales.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)