This is awesome news! Is there anything we can do to help? We are currently facing huge performance penalties due to this issue.
Thanks,
David

On Wed, Dec 19, 2018 at 5:43 PM Ilan Filonenko <[email protected]> wrote:

> Recently, the community has actively been working on this. The JIRA to
> follow is: https://issues.apache.org/jira/browse/SPARK-25299. A group of
> companies including Bloomberg and Palantir is working on a WIP solution
> that implements a variant of Option #5 (elaborated in the Google doc
> linked in the JIRA summary).
>
> On Wed, Dec 19, 2018 at 5:20 AM <[email protected]> wrote:
>
>> Hi everyone,
>> we are facing the same problems Facebook had, where the shuffle service
>> is a bottleneck. For now we have worked around it with a large task size
>> (2g) to reduce shuffle I/O.
>>
>> I saw a very nice presentation from Brian Cho on optimizing shuffle I/O
>> at large scale [1]. It is an implementation of a white paper [2].
>> At the end of the talk, Brian Cho kindly mentioned plans to contribute
>> it back to Spark [3]. I checked the mailing list and the Spark JIRA and
>> didn't find any ticket on this topic.
>>
>> Please, does anyone have a contact for someone at Facebook who might
>> know more about this? Or are there plans to bring a similar optimization
>> to Spark?
>>
>> [1] https://databricks.com/session/sos-optimizing-shuffle-i-o
>> [2] https://haoyuzhang.org/publications/riffle-eurosys18.pdf
>> [3] https://image.slidesharecdn.com/5brianchoerginseyfe-180613004126/95/sos-optimizing-shuffle-io-with-brian-cho-and-ergin-seyfe-30-638.jpg?cb=1528850545
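For anyone hitting the same bottleneck, the "large task size" workaround quoted above can be sketched as a Spark configuration. This is a minimal sketch, not the original poster's actual settings: the specific values below are illustrative assumptions, and the right numbers depend on your data volume and executor memory.

```properties
# Sketch of the workaround: fewer, larger partitions mean fewer shuffle
# files and less shuffle-service I/O. Values are illustrative assumptions.

# Lower the shuffle partition count so each reduce task handles more data
# (default is 200; tune to your cluster).
spark.sql.shuffle.partitions        200

# Read larger input splits, approximating the ~2g task size mentioned
# above (value in bytes; default is 128 MB).
spark.sql.files.maxPartitionBytes   2147483648

# Larger tasks need more executor memory; size accordingly.
spark.executor.memory               8g
```

This trades parallelism for shuffle efficiency, so it helps most when the shuffle service, not CPU, is the bottleneck, as described in the Riffle paper [2].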
