Hello, I am running into a consistent error in Spark 2.0.0 when working with DataFrames that are the result of a series of joins and other transformations. The error is:
    PartitioningCollection requires all of its partitionings have the same numPartitions

It seems to happen after I join two DataFrames that are each perfectly reasonable on their own; operations on the joined DataFrame can then yield this error. I am really just trying to understand why this error appears and what it means, as I can't find any documentation on it.

The following invocation results in the exception:

    val resultDataframe = dataFrame1
      .join(dataFrame2, $"first_column" === $"second_column")
      .take(2)

but I can certainly call dataFrame1.take(2) and dataFrame2.take(2) individually without any problem.

I also tried repartitioning the DataFrames, using Dataset.repartition(numPartitions) or Dataset.coalesce(numPartitions) on dataFrame1 and dataFrame2 before joining, and on resultDataframe after the join, but nothing affected the error.

I cannot determine, nor easily reproduce, the circumstances surrounding the error, so this message is mostly asking why it might appear at all. I posted essentially this question on Stack Overflow, which I will link here since there was a small amount of discussion there that I can't easily reproduce in this message (I hope it is not frowned upon to link to an external page in help requests):

http://stackoverflow.com/questions/39780784/spark-2-0-0-error-partitioningcollection-requires-all-of-its-partitionings-have/39793449

So far the issue seems to be confirmed by at least one other user there, but I was not able to find other mentions of it on this list or elsewhere through some cursory googling.

Thanks for any help
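P.S. For concreteness, this is roughly the shape of the repartitioning attempt described above; the partition count, column names, and dataFrame1/dataFrame2 are placeholders standing in for my actual (non-reproducible) pipeline:

    import spark.implicits._   // spark-shell / a SparkSession named `spark`, for the $"..." syntax

    val n = 200   // arbitrary target partition count; no value I tried made a difference

    // force both sides to the same number of partitions before joining...
    val joined = dataFrame1.repartition(n)
      .join(dataFrame2.repartition(n), $"first_column" === $"second_column")

    // ...and/or fix the partitioning of the result after the join
    joined.repartition(n).take(2)   // still throws
    joined.coalesce(n).take(2)      // still throws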