[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352396#comment-14352396 ]
liyunzhang_intel commented on SPARK-5682:
-----------------------------------------

Hi [~srowen]:

Encrypted shuffle makes the shuffle process safer, and I think it is necessary in Spark. The previous design reused the Hadoop encrypted shuffle algorithm to enable Spark encrypted shuffle. That design has a big problem: it imports many crypto classes, such as CryptoInputStream and CryptoOutputStream, which are marked "private" in Hadoop.

My teammates and I have now decided to write the crypto classes in Spark, so there is no dependency on Hadoop 2.6. We will not copy Hadoop code directly into Spark; we only reference the crypto algorithms that Hadoop uses, such as JCE/AES-NI.

Maybe I need to rename the JIRA from "Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle" to "Add encrypted shuffle in spark". Any advice is welcome.

> Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle
> --------------------------------------------------------------------------
>
>                 Key: SPARK-5682
>                 URL: https://issues.apache.org/jira/browse/SPARK-5682
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle
>            Reporter: liyunzhang_intel
>         Attachments: Design Document of Encrypted Spark Shuffle_20150209.docx
>
>
> Encrypted shuffle is enabled in Hadoop 2.6, which makes the process of
> shuffling data safer. This feature is necessary in Spark. We reuse the Hadoop
> encrypted shuffle feature in Spark, and because UGI credential info is
> necessary for encrypted shuffle, we first enable encrypted shuffle on the
> spark-on-yarn framework.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)