In PostgreSQL and Presto, the query below works:
sql> create table t1 (id int);
sql> create table t2 (id int);
sql> select * from t1 join t2 on t1.id = floor(random() * 9) + t2.id;

But in Spark it throws "Error in query: nondeterministic expressions are only
allowed in Project, Filter, Aggregate or Window". Why doesn't Spark support
random expressions in a join condition?
The purpose of adding a random term to the join key is to work around a data
skew problem.
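
One workaround that might apply, since the error message allows
nondeterministic expressions in a Project, is to compute the random offset in
a subquery and join on the resulting column. This is only a sketch: the
alias salted_id is a made-up name, rand() is used as Spark SQL's random
function, and the offset is assumed to be acceptable when drawn once per row
of t2 rather than once per compared row pair, which may differ from what
PostgreSQL or Presto do.

```sql
-- Sketch only: salted_id is a hypothetical alias, not a name from the original query.
-- The nondeterministic rand() is evaluated in the subquery's projection,
-- so the join condition itself only compares two plain columns.
select t1.id as t1_id, s.id as t2_id
from t1
join (
  select id, floor(rand() * 9) + id as salted_id
  from t2
) s
on t1.id = s.salted_id;
```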

Thanks,
Lantao
