[ https://issues.apache.org/jira/browse/SPARK-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

PJ Fanning closed SPARK-16716.
------------------------------
    Resolution: Duplicate

This looks like it was fixed by SPARK-16664

> calling cache on joined dataframe can lead to data being blanked
> ----------------------------------------------------------------
>
>                 Key: SPARK-16716
>                 URL: https://issues.apache.org/jira/browse/SPARK-16716
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.2
>            Reporter: PJ Fanning
>
> I have reproduced the issue in Spark 1.6.2 and latest 1.6.3-SNAPSHOT code.
> The code works ok on Spark 1.6.1.
> I have a notebook up on Databricks Community Edition that demonstrates the
> issue. The notebook depends on the library com.databricks:spark-csv_2.10:1.4.0
> The code uses some custom code to join 4 dataframes.
> It calls show on this dataframe and the data is as expected.
> After calling .cache, the data is blanked.
> https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/5458351705459939/3760010872339805/5521341683971298/latest.html

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)