pyspark not starting

2022-08-11 Thread Kelum Perera
Dear All, I'm on Windows and can't start pyspark. I get the error message "" [image: image.png]. I downloaded spark-3.3.0-bin-hadoop3, jre-8u341-windows-x64.tar, the relevant winutils.exe, and python-3.7.7-embed-amd64, and placed them on the C:\ drive. I also added the system environment paths [image:
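
When pyspark fails to start on Windows, a common first check is whether the environment variables point at real directories. The message preview above is cut off before it shows which variables were set, so the sketch below assumes the usual names (JAVA_HOME, SPARK_HOME, HADOOP_HOME) and the usual winutils.exe location under HADOOP_HOME\bin. The function name check_spark_env is hypothetical and is not part of Spark.

```python
import os
from typing import Dict, List


def check_spark_env(env: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems found in the given environment.

    Assumption: the three variable names below are the conventional ones;
    the original message does not say which variables were actually set.
    """
    problems: List[str] = []

    for name in ("JAVA_HOME", "SPARK_HOME", "HADOOP_HOME"):
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")

    # On Windows, Spark's Hadoop libraries look for bin\winutils.exe
    # under HADOOP_HOME.
    hadoop_home = env.get("HADOOP_HOME")
    if hadoop_home and os.path.isdir(hadoop_home):
        winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
        if not os.path.isfile(winutils):
            problems.append(f"winutils.exe not found at: {winutils}")

    return problems


if __name__ == "__main__":
    # Report problems for the current process environment.
    for problem in check_spark_env(dict(os.environ)):
        print(problem)
```

An empty result only means these basic checks passed; it does not rule out other causes, such as an incompatible Java or Python version.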

Joins internally

2022-08-11 Thread Sid
Hi Team, Assume we have a large dataset and that sort-merge join is the default join Spark applies to it. I want to understand how joins work internally. How does this join, or any join, work? Assume that the data is already shuffled and sorted by key. So let's say that
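
For the merge step the question asks about, a minimal sketch in plain Python may help. It assumes both inputs are already sorted by key, as stated above, and performs an inner join on a single partition. It illustrates the general sort-merge technique, not Spark's actual implementation; the name sort_merge_join and the (key, value) row shape are hypothetical.

```python
from typing import Any, Iterator, List, Tuple

Row = Tuple[Any, Any]  # (key, value); illustrative shape only


def sort_merge_join(left: List[Row], right: List[Row]) -> Iterator[Tuple[Any, Any, Any]]:
    """Inner-join two lists of (key, value) rows that are already sorted by key."""
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # left key has no match yet; advance the left cursor
        elif left_key > right_key:
            j += 1  # right key has no match yet; advance the right cursor
        else:
            # Find the end of the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row in the run with every right row in the run.
            while i < len(left) and left[i][0] == left_key:
                for k in range(j, j_end):
                    yield (left_key, left[i][1], right[k][1])
                i += 1
            j = j_end


if __name__ == "__main__":
    left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
    for joined in sort_merge_join(left_rows, right_rows):
        print(joined)
```

Because both sides are sorted, each cursor only moves forward, so the merge is a single pass over each input plus the work of emitting matches for duplicate keys.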