Hello Spark users,

I have an error that I would like to report as a Spark 3.1.1 bug, but I do not
know how to create a reproducible example. I can provide a full stack trace if
desired, but the most useful part seems to be:

E                   py4j.protocol.Py4JJavaError: An error occurred while calling o3301.toJavaRDD.
E                   : java.lang.IllegalStateException: UnspecifiedDistribution does not have default partitioning.
E                       at org.apache.spark.sql.catalyst.plans.physical.UnspecifiedDistribution$.createPartitioning(partitioning.scala:52)
E                       at org.apache.spark.sql.execution.exchange.EnsureRequirements$.$anonfun$ensureDistributionAndOrdering$1(EnsureRequirements.scala:54)

This error happens when spark.sql.adaptive.enabled=true but does not happen
when I set it to false. It happens both in one of my unit tests (~30 rows) and
with production data. Another work-around is to cache the DataFrame before
calling the collect/toJSON statement.
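
For reference, the first work-around can be scoped to just the failing call
instead of the whole session. This is only a minimal sketch, assuming a PySpark
session object named `spark`; the helper name `aqe_disabled` is made up for
illustration and is not part of Spark. It relies only on `spark.conf.get` and
`spark.conf.set`.

```python
from contextlib import contextmanager

AQE_KEY = "spark.sql.adaptive.enabled"


@contextmanager
def aqe_disabled(spark):
    """Temporarily turn off adaptive query execution for one block of code.

    `spark` is assumed to be a SparkSession (anything exposing
    conf.get(key, default) and conf.set(key, value) works).
    """
    # Remember the current setting so it can be restored afterwards.
    previous = spark.conf.get(AQE_KEY, "false")
    spark.conf.set(AQE_KEY, "false")
    try:
        yield spark
    finally:
        # Restore the original value even if the wrapped code raises.
        spark.conf.set(AQE_KEY, previous)
```

With this, `with aqe_disabled(spark): rows = df.toJSON().collect()` would run
the collect with AQE off and restore the previous setting afterwards, where
`df` stands for the affected DataFrame. The second work-around would simply be
calling `df.cache()` before that same statement.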

I was not able to find any information about this kind of error on the Spark
JIRA or on Stack Exchange. Has anyone seen this error before in connection
with adaptive query execution (AQE), and does anyone have suggestions for how
to report it?

Thanks,
Jesse
