UnspecifiedDistribution Error using AQE

Jesse Lord Tue, 03 Aug 2021 11:37:09 -0700

Hello Spark users,

I have hit an error that I would like to report as a Spark 3.1.1 bug, but I do 
not know how to create a reproducible example. I can provide a full stack trace 
if desired, but the most useful part seems to be:


E                   py4j.protocol.Py4JJavaError: An error occurred while calling o3301.toJavaRDD.
E                   : java.lang.IllegalStateException: UnspecifiedDistribution does not have default partitioning.
E                       at org.apache.spark.sql.catalyst.plans.physical.UnspecifiedDistribution$.createPartitioning(partitioning.scala:52)
E                       at org.apache.spark.sql.execution.exchange.EnsureRequirements$.$anonfun$ensureDistributionAndOrdering$1(EnsureRequirements.scala:54)

This error happens when spark.sql.adaptive.enabled=true but not when I set it 
to false. It happens both in one of my unit tests (~30 rows) and with 
production data. Another workaround is to cache the DataFrame before calling 
collect/toJSON.
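
For reference, the two workarounds might look roughly like this in PySpark. 
This is only a sketch: the helper names are hypothetical, `spark` is assumed to 
be an existing SparkSession, and `df` is assumed to be the DataFrame whose 
collect/toJSON call fails. It does not explain or fix the underlying error.

```python
AQE_KEY = "spark.sql.adaptive.enabled"


def collect_json_with_cache(df):
    """Workaround 1 (hypothetical helper): cache the DataFrame first.

    `df` is assumed to be a pyspark.sql.DataFrame. cache() returns the
    DataFrame itself, so the calls can be chained.
    """
    return df.cache().toJSON().collect()


def collect_json_without_aqe(spark, df):
    """Workaround 2 (hypothetical helper): turn AQE off for this one call.

    `spark` is assumed to be a pyspark.sql.SparkSession. The previous
    setting is restored afterwards so other queries keep using AQE.
    """
    previous = spark.conf.get(AQE_KEY)
    spark.conf.set(AQE_KEY, "false")
    try:
        return df.toJSON().collect()
    finally:
        spark.conf.set(AQE_KEY, previous)
```

The second helper restores the original setting in a `finally` block, so AQE 
stays disabled only for the failing call and not for the rest of the session.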

I was not able to find any information about this kind of error on the Spark 
JIRA or on Stack Exchange. Has anyone seen this error before in connection 
with AQE, and does anyone have suggestions for how to report it?

Thanks,
Jesse
