Hi Sebastian, Do you have any updates on this issue? I ran into pretty much the same problem, and disabling Kryo plus raising spark.network.timeout to 600s helped. For my job it takes about 5 minutes to broadcast the variable (~5 GB in my case), but after that it's fast. I mean much faster than shuffling with a regular join, anyway. Hope it helps.
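In case it's useful, here is a rough sketch of how those two settings could be passed to spark-submit. I'm assuming that "disabling Kryo" means going back to Spark's default Java serializer; the helper name `spark_submit_command` and the file `my_job.py` are made up for illustration, so adjust them to your setup.

```python
import shlex

# The two settings mentioned above. Assumption: "disabling Kryo" means
# falling back to Spark's default Java serializer.
BROADCAST_CONF = {
    "spark.serializer": "org.apache.spark.serializer.JavaSerializer",
    "spark.network.timeout": "600s",
}


def spark_submit_command(app, conf=BROADCAST_CONF):
    """Build a spark-submit argument list with one --conf flag per setting.

    Hypothetical helper, not part of Spark itself.
    """
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # "my_job.py" is a placeholder for the actual application file.
    print(shlex.join(spark_submit_command("my_job.py")))
```

The same key/value pairs can also go into spark-defaults.conf or be set on the SparkConf in the job itself, whichever fits your deployment better.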
Thanks, Alex.