parthchandra commented on code in PR #1901:
URL: https://github.com/apache/datafusion-comet/pull/1901#discussion_r2164519556


##########
spark/src/test/scala/org/apache/comet/exec/CometJoinSuite.scala:
##########
@@ -54,25 +54,6 @@ class CometJoinSuite extends CometTestBase {
         .toSeq)
   }
 
-  test("SortMergeJoin with unsupported key type should fall back to Spark") {

Review Comment:
   The test you have added is great. Thank you!
   
   However this removed test exercises a few things that we should also check 
for - Timestamps read from Parquet, and the actual plan created. For the latter 
you can simply change the test to use `checkSparkAnswerAndOperator` and remove 
the check that the canonicalized plans are the same. 
   
   Also, to really make sure that we are testing timestamps, we really should 
have the left side and the right side of the  join use timestamps with 
different timezones. 
   
   To create timestamps with different timezones, we can modify this test to 
create the test files separately - 
   ```
     withSQLConf(
         SQLConf.SESSION_LOCAL_TIMEZONE.key -> "Asia/Kathmandu",
         SQLConf.ADAPTIVE_AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
         SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
         withTable("t1", "t2") {
           withSQLconf( SQLConf.SESSION_LOCAL_TIMEZONE.key -> 
"Australia/Darwin") {
               sql("CREATE TABLE t1(name STRING, time TIMESTAMP) USING PARQUET")
               sql("INSERT OVERWRITE t1 VALUES('a', timestamp'2019-01-01 
11:11:11')")
           }
           withSQLconf( SQLConf.SESSION_LOCAL_TIMEZONE.key -> "Canada/Pacific") 
{
               sql("CREATE TABLE t2(name STRING, time TIMESTAMP) USING PARQUET")
               sql("INSERT OVERWRITE t2 VALUES('a', timestamp'2019-01-01 
11:11:11')")
           }
           ...
   ```
   The join above with different timezones will return zero rows since the 
timezones are not the same.
   
   Also, we could rename the test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to