[
https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352396#comment-14352396
]
liyunzhang_intel commented on SPARK-5682:
-----------------------------------------
Hi [~srowen]:
Encrypted shuffle makes the shuffle process safer, and I think it is
necessary in Spark. The previous design reused the Hadoop encrypted shuffle
algorithm to enable encrypted shuffle in Spark. That design has a big problem:
it imports many crypto classes, such as CryptoInputStream and
CryptoOutputStream, which are marked "private" in Hadoop. My teammates and I
have now decided to write the crypto classes in Spark, so that there is no
dependency on Hadoop 2.6. We will not copy Hadoop code directly into Spark; we
will only draw on the crypto algorithms that Hadoop uses, such as JCE/AES-NI.
Maybe I need to rename the JIRA from "Reuse hadoop encrypted shuffle algorithm
to enable spark encrypted shuffle" to "Add encrypted shuffle in spark". Any
advice is welcome.
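
For illustration only, here is a minimal sketch of what stream-level
encryption built directly on the JCE could look like, without depending on
Hadoop's private crypto classes. It is not taken from the design document or
from any patch. The class name ShuffleCryptoSketch, its method names, the
choice of AES/CTR/NoPadding, and the in-memory key and IV handling are all
assumptions made for the example; a real implementation would need proper key
distribution between executors.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: wraps plain streams with JCE cipher streams.
public class ShuffleCryptoSketch {
    // Assumed transformation; CTR mode needs no padding and works on streams.
    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    private static Cipher newCipher(int mode, byte[] key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Everything written to the returned stream is encrypted before reaching 'out'.
    static OutputStream encrypting(OutputStream out, byte[] key, byte[] iv) throws Exception {
        return new CipherOutputStream(out, newCipher(Cipher.ENCRYPT_MODE, key, iv));
    }

    // Everything read from the returned stream is decrypted after leaving 'in'.
    static InputStream decrypting(InputStream in, byte[] key, byte[] iv) throws Exception {
        return new CipherInputStream(in, newCipher(Cipher.DECRYPT_MODE, key, iv));
    }

    // Encrypts 'plain' into an in-memory buffer under the given key and IV.
    static byte[] encrypt(byte[] plain, byte[] key, byte[] iv) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = encrypting(sink, key, iv)) {
            out.write(plain);
        }
        return sink.toByteArray();
    }

    // Decrypts 'cipherText' that was produced with the same key and IV.
    static byte[] decrypt(byte[] cipherText, byte[] key, byte[] iv) throws Exception {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = decrypting(new ByteArrayInputStream(cipherText), key, iv)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        }
        return result.toByteArray();
    }

    // Encrypts and then decrypts 'plain' with a fresh random 128-bit key and IV.
    static byte[] roundTrip(byte[] plain) throws Exception {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        random.nextBytes(key);
        random.nextBytes(iv);
        return decrypt(encrypt(plain, key, iv), key, iv);
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = "shuffle block".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(plain, roundTrip(plain)));
    }
}
```

Because the helpers only wrap java.io streams, a sketch like this would not
need any Hadoop class on the classpath, which is the point of the proposed
change.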
> Reuse hadoop encrypted shuffle algorithm to enable spark encrypted shuffle
> --------------------------------------------------------------------------
>
> Key: SPARK-5682
> URL: https://issues.apache.org/jira/browse/SPARK-5682
> Project: Spark
> Issue Type: New Feature
> Components: Shuffle
> Reporter: liyunzhang_intel
> Attachments: Design Document of Encrypted Spark Shuffle_20150209.docx
>
>
> Encrypted shuffle is enabled in Hadoop 2.6, which makes the shuffle process
> safer. This feature is necessary in Spark. We reuse the Hadoop encrypted
> shuffle feature in Spark, and because UGI credential info is necessary for
> encrypted shuffle, we first enable encrypted shuffle on the Spark-on-YARN
> framework.