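Spark writes its scratch data (shuffle map output and RDD blocks stored on disk) to the directories listed in spark.local.dir:
http://spark.apache.org/docs/latest/configuration.html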
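
To make the spark.local.dir pointer a bit more concrete, here is a minimal sketch of the settings involved, rendered as spark-submit arguments. The mount points and the helper names (scratch_conf, to_submit_args) are hypothetical, not anything from your cluster. spark.io.encryption.enabled is, as far as I know, available from Spark 2.1 and encrypts the temporary shuffle and spill data itself, so it may reduce how many disks you need to encrypt; please check it against the documentation for your Spark version.

```python
import shlex

# Assumption: these are hypothetical mount points on volumes that are
# already encrypted at the operating-system level.
ENCRYPTED_DIRS = ["/mnt/encrypted1/spark", "/mnt/encrypted2/spark"]


def scratch_conf(local_dirs, encrypt_io=True):
    """Return Spark settings that pin scratch space to the given directories."""
    if not local_dirs:
        raise ValueError("at least one local directory is required")
    conf = {
        # Comma-separated list of directories used for shuffle and spill files.
        "spark.local.dir": ",".join(local_dirs),
    }
    if encrypt_io:
        # Assumed to apply to Spark 2.1+: encrypt temporary data written to local disk.
        conf["spark.io.encryption.enabled"] = "true"
    return conf


def to_submit_args(conf):
    """Render the settings as spark-submit --conf arguments, sorted by key."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    args = to_submit_args(scratch_conf(ENCRYPTED_DIRS))
    # Print a shell-safe argument string to paste after "spark-submit".
    print(" ".join(shlex.quote(a) for a in args))
```

Note that the configuration page says spark.local.dir is overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN) when the cluster manager sets them. On YARN the scratch directories therefore follow yarn.nodemanager.local-dirs, and those are the disks to look at.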

On Fri, Apr 28, 2017 at 8:51 AM, Shashi Vishwakarma <shashi.vish...@gmail.com> wrote:

> Yes, I am using HDFS. Just trying to understand a couple of points.
>
> There are two kinds of encryption that would be required:
>
> 1. Data in Motion - This could be achieved by enabling SSL -
> https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_spark-component-guide/content/spark-encryption.html
>
> 2. Data at Rest - HDFS Encryption can be applied.
>
> Apart from this, when Spark executes a job, each disk available on every
> node needs to be encrypted.
>
> I can have multiple disks on each node, and encrypting all of them could be
> a costly operation. Therefore I was trying to identify the possible folders
> where Spark can spill data during job execution.
>
> Once these are identified, those specific disks can be encrypted.
>
> Thanks
> Shashi
>
> On Fri, Apr 28, 2017 at 4:34 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Why don't you use whole disk encryption?
>> Are you using HDFS?
>>
>> On 28. Apr 2017, at 16:57, Shashi Vishwakarma <shashi.vish...@gmail.com>
>> wrote:
>>
>> Agreed Jorn. Disk encryption is one option that will help to secure data,
>> but how do I know at which locations Spark is spilling temp files, shuffle
>> data and application data?
>>
>> Thanks
>> Shashi
>>
>> On Fri, Apr 28, 2017 at 3:54 PM, Jörn Franke <jornfra...@gmail.com>
>> wrote:
>>
>>> You can use disk encryption as provided by the operating system.
>>> Additionally, you may think about shredding disks after they are not used
>>> anymore.
>>>
>>> > On 28. Apr 2017, at 14:45, Shashi Vishwakarma
>>> > <shashi.vish...@gmail.com> wrote:
>>> >
>>> > Hi All
>>> >
>>> > I am dealing with a Spark requirement where the client (a banking
>>> > client, for whom security is a major concern) needs all Spark
>>> > processing to happen securely.
>>> >
>>> > For example, all communication between the Spark client and server
>>> > (driver and executor communication) should be on a secure channel.
>>> > Even when Spark spills to disk based on storage level (Mem+Disk), it
>>> > should not be written in unencrypted format on local disk, or there
>>> > should be some workaround to prevent the spill.
>>> >
>>> > I did some research but could not find any concrete solution. Let me
>>> > know if someone has done this.
>>> >
>>> > Any guidance would be a great help.
>>> >
>>> > Thanks
>>> > Shashi