sandugood commented on PR #4760:
URL: 
https://github.com/apache/datafusion-comet/pull/4760#issuecomment-4842874790

   I'll try to give more context, as it may be relevant here (although it might 
be related to a different issue):
   
   When running the same query (as presented in the issue that this PR tackles), 
the logs alone show that Comet skips a rather large stage. 
   
   1. Default Spark has both of these stages at the beginning of the execution:
   `[Stage 0:> (0 + 0) / 7350][Stage 1:> (0 + 0) / 7350]`
   2. Comet has only one stage with 7350 tasks.
   
   It might be related to the FULLOUTER join: when we form features with 
30-day, 180-day, and 365-day windows, everything seems fine and the resulting 
values are the same across both engines. However, when performing the FULLOUTER 
join for the end result, we get significantly smaller values on Comet's side.
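
   One way to narrow this down might be to collect the final join output from 
both engines and diff it per key, to see whether Comet drops unmatched rows or 
only changes values. The following is a minimal sketch, not part of the actual 
job: `diff_by_key` and the `{key: value}` input shape are hypothetical, and it 
assumes the results are small enough to collect into plain dictionaries.

```python
import math


def diff_by_key(spark_rows, comet_rows, rel_tol=1e-9):
    """Compare two {key: value} result sets (hypothetical shape).

    Returns keys present only in the Spark result, keys present only in
    the Comet result, and keys whose values differ between the two.
    """
    missing_in_comet = sorted(k for k in spark_rows if k not in comet_rows)
    missing_in_spark = sorted(k for k in comet_rows if k not in spark_rows)

    mismatched = {}
    for key in spark_rows.keys() & comet_rows.keys():
        a, b = spark_rows[key], comet_rows[key]
        if a is None or b is None:
            # A NULL on one side only counts as a mismatch.
            if a is not b:
                mismatched[key] = (a, b)
        elif not math.isclose(a, b, rel_tol=rel_tol):
            mismatched[key] = (a, b)

    return missing_in_comet, missing_in_spark, mismatched
```

   If most of the difference shows up as keys missing on one side, that would 
point at unmatched rows of the FULLOUTER join; if the keys match but the values 
differ, the problem is more likely in how the joined columns are combined.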


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

