advancedxy commented on code in PR #42255:
URL: https://github.com/apache/spark/pull/42255#discussion_r1280215039


##########
python/pyspark/sql/tests/test_dataframe.py:
##########
@@ -645,6 +645,35 @@ def test_generic_hints(self):
             df1.join(df2.hint("broadcast"), "id").explain(True)
             self.assertEqual(1, buf.getvalue().count("BroadcastHashJoin"))
 
+    def test_partitioning_hint_with_columns(self):

Review Comment:
   I prefer to handle the parsing in the Scala side for two reasons:
   1. The current hint method is hard to use, and existing hints require all 
params to be `UnresolvedAttribute`. Even on the Scala side, users have to 
write `df.hint("REBALANCE", 123, $"id".expr)`; note the `$"id".expr` 
expression. It would be even more verbose to specify on the Java side.
   2. If we handle this on the Scala side, both PySpark and SparkR can 
benefit from it.
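
   To illustrate the idea (this is a hypothetical sketch, not Spark's actual implementation): if the engine side parses the raw hint parameters, callers could pass plain ints and column-name strings, and the parsing step would classify them, instead of every client wrapping columns in expressions like `$"id".expr` themselves. The function name `normalize_hint_params` and the tagged-tuple output are invented for illustration only.

   ```python
   # Hypothetical sketch of engine-side hint-parameter parsing.
   # An int is treated as a partition count and a string as a column name,
   # so callers never have to construct expression objects themselves.
   def normalize_hint_params(params):
       normalized = []
       for p in params:
           if isinstance(p, int):
               normalized.append(("numPartitions", p))
           elif isinstance(p, str):
               normalized.append(("column", p))
           else:
               raise TypeError(f"unsupported hint parameter: {p!r}")
       return normalized

   # A caller could then write the equivalent of hint("REBALANCE", 123, "id")
   # and let the engine side do the classification:
   print(normalize_hint_params([123, "id"]))
   ```

   Doing this once on the engine side means every language binding (Scala, Java, Python, R) gets the friendlier call signature for free, which is the point of the second reason above.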



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

