DataFrame/Dataset join not producing correct results in Spark 2.0/Yarn

2016-10-12 Thread shankinson
Hi, We have a cluster running Apache Spark 2.0 on Hadoop 2.7.2, Centos 7.2. We had written some new code using the Spark DataFrame/DataSet APIs but are noticing incorrect results on a join after writing and then reading data to Windows Azure Storage Blobs (The default HDFS location). I've been abl

DataFrame/Dataset join not producing correct results in Spark 2.0/Yarn

2016-10-12 Thread Stephen Hankinson
Hi, We have a cluster running Apache Spark 2.0 on Hadoop 2.7.2, Centos 7.2. We had written some new code using the Spark DataFrame/DataSet APIs but are noticing incorrect results on a join after writing and then reading data to Windows Azure Storage Blobs (The default HDFS location). I've been abl