[ 
https://issues.apache.org/jira/browse/SPARK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated SPARK-5941:
-----------------------------
    Comment: was deleted

(was: Eagerly resolving the table may cause side effects in some 
scenarios, so let's keep the same behavior (deferred resolution) as the other 
DF APIs.

In the meantime, I noticed that during the leftsemijoin unit test the table 
sales is loaded twice, so we get duplicated records (double the expected 
count). This makes the unit test fail once the DataFrameImpl code is changed 
to use UnresolvedRelation instead of calling lookupRelation eagerly.

The root cause is that leftsemijoin.q contains a data loading command for 
table sales, but in TestHive the table sales has also been registered as a 
TestTable. An UnresolvedRelation triggers the table data load when the table 
is registered as a TestTable, whereas the ResolvedRelation does not.)
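
The double load described in the comment above can be illustrated with a toy 
model. This is only a sketch under stated assumptions: ToyTestCatalog, its 
methods, and the sample rows are hypothetical names made up for illustration, 
not Spark or TestHive APIs, and the ordering of the two loads is assumed.

```python
# Toy model only: these classes are hypothetical, not Spark or TestHive APIs.

SALES_ROWS = [("Hank", 2), ("Joe", 2)]  # made-up sample rows


class ToyTestCatalog:
    """Mimics a test harness that loads a registered table on first analysis."""

    def __init__(self):
        self.tables = {}          # table name -> rows currently stored
        self.registered = {}      # table name -> rows the harness would load
        self.auto_loaded = set()  # registered tables already auto-loaded

    def register_test_table(self, name, rows):
        self.registered[name] = list(rows)
        self.tables.setdefault(name, [])

    def load_data(self, name, rows):
        # Stands in for the data loading command inside the .q file.
        self.tables.setdefault(name, []).extend(rows)

    def analyze_unresolved(self, name):
        # Deferred path: analysis sees an unresolved name and triggers the
        # harness load for a registered table (at most once).
        if name in self.registered and name not in self.auto_loaded:
            self.auto_loaded.add(name)
            self.load_data(name, self.registered[name])
        return list(self.tables[name])

    def lookup_resolved(self, name):
        # Eager path: the relation is already resolved, so no load is triggered.
        return list(self.tables[name])


def row_count(deferred):
    catalog = ToyTestCatalog()
    catalog.register_test_table("sales", SALES_ROWS)
    catalog.load_data("sales", SALES_ROWS)  # the script's own load command
    if deferred:
        return len(catalog.analyze_unresolved("sales"))
    return len(catalog.lookup_resolved("sales"))


print(row_count(deferred=True))   # 4: script load plus harness load
print(row_count(deferred=False))  # 2: script load only
```

In this sketch the deferred path doubles the row count because both the 
script and the harness load the same data, which matches the symptom 
reported for the sales table.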

> Unit Test loads the table `src` twice for leftsemijoin.q
> --------------------------------------------------------
>
>                 Key: SPARK-5941
>                 URL: https://issues.apache.org/jira/browse/SPARK-5941
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Cheng Hao
>
> In leftsemijoin.q there is already a data loading command for table sales, 
> but TestHive also creates the table sales, which causes duplicate 
> records to be inserted into sales.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
