jerqi commented on issue #2409:
URL: https://github.com/apache/uniffle/issues/2409#issuecomment-2754316762

   > Indeed, it is hard to see from the error stack itself what's going on here. 
That's the actual problem here. In the grand scheme of things, the stack trace 
doesn't matter. The real point is that Uniffle and Delta expect 
things to work differently.
   > 
   > **There are two gaps:** 1. Uniffle uses DUMMY_HOST and DUMMY_PORT for the 
blocks, making it hard for Delta to actually determine where the blocks are. 2. 
Delta uses its own reader to read the blocks. Even if Uniffle pointed at the right 
block locations, it wouldn't really work, because Uniffle has a custom 
reader and writer implementation for shuffle blocks.
   > 
   > There are improvements that need to happen in both Uniffle and Delta; 
these are a bit tangential to the exact .
   > 
   > 1. Delta needs to allow for a custom reader to be plugged in.
   > 2. Uniffle needs to have a way to publish sizes of partitions (and also 
blocks of super huge partitions).
   > 
   > These two are enough to optimize the subsequent stages, in this case the 
stage that writes files of a certain size.
   
   Actually, the Delta Lake reader is a datasource reader. It shouldn't read the 
shuffle data.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

