GitHub user zhangxffff added a comment to the discussion: Add a new backend: 
Bolt

> > Bolt has improved it by treating Shuffle as an operator in Bolt and 
> > offloading it from Gluten to Bolt for parallel processing.
> 
> @frankobe would you give more explanation about `treat shuffle as an 
> operator`? Does Bolt still fetch shuffle data from Spark's Java iterator?

When a shuffle writer or shuffle reader is adjacent to other operators that run 
in Gluten, it is offloaded into Bolt and treated as a Bolt operator.

For the shuffle reader, Bolt fetches the shuffle data directly from Spark's 
InputStream. It reads the raw bytes, does decompression and deserialization 
inside the shuffle reader operator, and then outputs the result as a normal 
operator instead of using Iterator[ColumnarBatch].
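
To make the reader flow concrete, here is a rough Python sketch of the idea, not Bolt's actual code. The frame layout (a 4-byte length followed by a zlib-compressed payload), the newline-separated row encoding, and the names `write_frame`, `ShuffleReaderOperator` and `get_output` are all assumptions made up for illustration.

```python
import io
import struct
import zlib


def write_frame(out, rows):
    """Helper for the demo: write one batch in a hypothetical wire format
    (4-byte big-endian length, then a zlib-compressed payload)."""
    payload = zlib.compress("\n".join(rows).encode("utf-8"))
    out.write(struct.pack(">I", len(payload)))
    out.write(payload)


class ShuffleReaderOperator:
    """Toy operator: reads raw bytes from a stream and decodes them itself."""

    def __init__(self, stream):
        self.stream = stream  # stands in for the InputStream handed over by Spark

    def get_output(self):
        """Return the next batch as a list of rows, or None at end of stream."""
        header = self.stream.read(4)
        if len(header) < 4:
            return None
        (size,) = struct.unpack(">I", header)
        payload = self.stream.read(size)
        # Decompression and deserialization both happen inside the operator.
        return zlib.decompress(payload).decode("utf-8").split("\n")


buf = io.BytesIO()
write_frame(buf, ["a,1", "b,2"])
write_frame(buf, ["c,3"])
buf.seek(0)

reader = ShuffleReaderOperator(buf)
while (batch := reader.get_output()) is not None:
    print(batch)
```

The downstream operator only ever sees decoded batches coming out of `get_output`, the same way it would from any other operator in the pipeline.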

For the shuffle writer, Bolt also constructs a shuffle writer operator. It 
receives data directly from the upstream operator instead of wrapping it into 
an Iterator[ColumnarBatch].
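
The writer side can be sketched the same way. Again this is only an illustration under assumptions: the push-style `add_input` method, the CRC32 hash partitioning, the frame format and the `FilterOperator` upstream are hypothetical and are not taken from Bolt.

```python
import struct
import zlib


class ShuffleWriterOperator:
    """Toy operator: partitions, serializes and compresses incoming batches."""

    def __init__(self, num_partitions):
        self.buffers = [bytearray() for _ in range(num_partitions)]

    def add_input(self, batch):
        """batch is a list of (key, value) rows pushed by the upstream operator."""
        groups = {}
        for key, value in batch:
            pid = zlib.crc32(key.encode("utf-8")) % len(self.buffers)
            groups.setdefault(pid, []).append(f"{key},{value}")
        for pid, rows in groups.items():
            payload = zlib.compress("\n".join(rows).encode("utf-8"))
            # One length-prefixed frame per partition per input batch.
            self.buffers[pid] += struct.pack(">I", len(payload)) + payload


class FilterOperator:
    """Hypothetical upstream operator that pushes straight into the writer."""

    def __init__(self, predicate, downstream):
        self.predicate = predicate
        self.downstream = downstream

    def add_input(self, batch):
        kept = [row for row in batch if self.predicate(row)]
        if kept:
            self.downstream.add_input(kept)


writer = ShuffleWriterOperator(num_partitions=2)
upstream = FilterOperator(lambda row: row[1] > 0, writer)
upstream.add_input([("a", 1), ("b", -1), ("c", 3)])
print([len(b) for b in writer.buffers])
```

The point of the sketch is that the batch goes from `FilterOperator` to `ShuffleWriterOperator` with a plain method call inside one pipeline, with no iterator wrapper in between.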

GitHub link: 
https://github.com/apache/incubator-gluten/discussions/10929#discussioncomment-15036189

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
