[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-18 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-157908221
  
@winningsix could you update the patch so it merges and compiles?

Also, the PR title and description need to be updated after the latest 
changes. While you're at it, please follow the PR title convention (see 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-PullRequest).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155804412
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155804445
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155805866
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/45640/





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155805171
  
**[Test build #45640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45640/consoleFull)** for PR 8880 at commit [`d08af32`](https://github.com/apache/spark/commit/d08af32b3d8fe2a2c4fa5614ddfc86758b2eeeb8).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155805865
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155805861
  
**[Test build #45640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45640/consoleFull)** for PR 8880 at commit [`d08af32`](https://github.com/apache/spark/commit/d08af32b3d8fe2a2c4fa5614ddfc86758b2eeeb8).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155079340
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155079079
  
**[Test build #45369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45369/consoleFull)** for PR 8880 at commit [`3c810ac`](https://github.com/apache/spark/commit/3c810ace342ac77bfbecc50e1def635c6afa8151).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155077100
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155077066
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155075773
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155075706
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r44361794
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CipherSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+/**
+ * Defines properties of a CipherSuite. Modeled after the ciphers in
+ * [[javax.crypto.Cipher]].
+ * @param name name of the cipher suite, as in [[javax.crypto.Cipher]]
+ * @param algoBlockSize size of an algorithm block in bytes
+ */
+private[spark] case class CipherSuite(name: String, algoBlockSize: Int) {
+  private var _unknownValue: Integer = _
+
+  def unknownValue: Integer = _unknownValue
+
+  def unknownValue_=(unknownValue: Integer): Unit = {
--- End diff --

Isn't this just the same as declaring a public `var _unknownValue: Integer 
= _`?
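The equivalence vanzin points out follows from Scala's uniform access principle: a public `var` already compiles to a private field plus a generated getter `value` and setter `value_=`, so writing them out by hand adds nothing. A minimal sketch (hypothetical `Verbose`/`Concise` classes, not from the patch):

```scala
class Verbose {
  private var _value: Integer = _            // explicit backing field
  def value: Integer = _value                // hand-written getter
  def value_=(v: Integer): Unit = {          // hand-written setter
    _value = v
  }
}

class Concise {
  var value: Integer = _  // compiler generates an equivalent getter and setter
}

object Demo {
  def main(args: Array[String]): Unit = {
    val a = new Verbose
    val b = new Concise
    a.value = 42
    b.value = 42
    // Both classes expose the identical value / value_= interface.
    println(a.value == b.value)
  }
}
```

The explicit form only earns its keep when the setter does more than assign (validation, side effects); otherwise the plain public `var` is preferable.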





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155117504
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-155117279
  
**[Test build #45369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45369/consoleFull)** for PR 8880 at commit [`3c810ac`](https://github.com/apache/spark/commit/3c810ace342ac77bfbecc50e1def635c6afa8151).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r44362026
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
--- End diff --

They're either constants or variables...





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-11-09 Thread winningsix
Github user winningsix commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r44354373
  
--- Diff: core/src/test/scala/org/apache/spark/crypto/JceAesCtrCryptoCodecSuite.scala ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{ByteArrayInputStream, BufferedOutputStream, ByteArrayOutputStream}
+import java.security.SecureRandom
+
+import org.apache.spark.crypto.CommonConfigurationKeys._
+import org.apache.spark.{SparkFunSuite, SparkConf, Logging}
+
+/**
+ * test JceAesCtrCryptoCodec
+ */
+class JceAesCtrCryptoCodecSuite extends SparkFunSuite with Logging {
+
+  test("TestCryptoCodecSuite") {
+    val random = new SecureRandom
+    val dataLen = 1000
+    val inputData = new Array[Byte](dataLen)
+    val outputData = new Array[Byte](dataLen)
+    random.nextBytes(inputData)
+    // encrypt
+    val sparkConf = new SparkConf
+    sparkConf.set(SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY,
+      classOf[JceAesCtrCryptoCodec].getName)
+    val cipherSuite = new CipherSuite("AES/CTR/NoPadding", 16)
+    val codec = new JceAesCtrCryptoCodec(sparkConf)
+    val aos = new ByteArrayOutputStream
+    val bos = new BufferedOutputStream(aos)
+    val key = new Array[Byte](16)
+    val iv = new Array[Byte](16)
+    random.nextBytes(key)
+    random.nextBytes(iv)
+
+    val cos = new CryptoOutputStream(bos, codec, 1024, key, iv)
+    cos.write(inputData, 0, inputData.length)
--- End diff --

Tests were added at the CryptoCodec level in my latest update to the pull request.
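For context, the round-trip property a CryptoCodec-level test checks can be sketched with plain `javax.crypto`, independent of the PR's `CryptoOutputStream` wrapper (a minimal sketch; `AesCtrRoundTrip` and `aesCtr` are illustrative names, not from the patch):

```scala
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}

object AesCtrRoundTrip {
  // Apply AES/CTR with the given key and IV. In CTR mode the same keystream
  // transformation performs both encryption and decryption.
  def aesCtr(mode: Int, key: Array[Byte], iv: Array[Byte], data: Array[Byte]): Array[Byte] = {
    val cipher = Cipher.getInstance("AES/CTR/NoPadding")
    cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv))
    cipher.doFinal(data)
  }

  def main(args: Array[String]): Unit = {
    val random = new SecureRandom
    val key = new Array[Byte](16)  // AES-128 key
    val iv = new Array[Byte](16)   // initial counter block
    random.nextBytes(key)
    random.nextBytes(iv)

    val plain = new Array[Byte](1000)
    random.nextBytes(plain)

    val cipherText = aesCtr(Cipher.ENCRYPT_MODE, key, iv, plain)
    val roundTrip = aesCtr(Cipher.DECRYPT_MODE, key, iv, cipherText)

    // CTR is a stream mode: no padding, so lengths match, and decrypting
    // with the same key/IV restores the original bytes exactly.
    assert(cipherText.length == plain.length)
    assert(java.util.Arrays.equals(plain, roundTrip))
    println("round trip ok")
  }
}
```

CTR turns AES into a stream cipher, which is why the suite name says `NoPadding`; note that reusing a key/IV pair across two streams breaks confidentiality, so each encrypted stream needs a fresh IV.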





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150155027
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150155055
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150178787
  
**[Test build #44144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44144/consoleFull)** for PR 8880 at commit [`135b380`](https://github.com/apache/spark/commit/135b3809982c9b66f479a855fac944799e56f7fc).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150178851
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150178852
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44144/





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-150156455
  
**[Test build #44144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44144/consoleFull)** for PR 8880 at commit [`135b380`](https://github.com/apache/spark/commit/135b3809982c9b66f479a855fac944799e56f7fc).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149942075
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149942029
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149945368
  
**[Test build #44068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44068/consoleFull)** for PR 8880 at commit [`a75931f`](https://github.com/apache/spark/commit/a75931fdc57101ac4cef5f4467f1a5ccc6947b0a).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149980692
  
**[Test build #44068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44068/consoleFull)** for PR 8880 at commit [`a75931f`](https://github.com/apache/spark/commit/a75931fdc57101ac4cef5f4467f1a5ccc6947b0a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149980939
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44068/





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149980937
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149850877
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44052/





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149850819
  
**[Test build #44052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44052/consoleFull)** for PR 8880 at commit [`de92e60`](https://github.com/apache/spark/commit/de92e60abc053ceb50064878fd93ff67c4bb2eec).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149850876
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149822285
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149822365
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-149824547
  
**[Test build #44052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44052/consoleFull)** for PR 8880 at commit [`de92e60`](https://github.com/apache/spark/commit/de92e60abc053ceb50064878fd93ff67c4bb2eec).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42050263
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CipherSuite.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import com.google.common.base.Objects
+
+/**
+ * Defines properties of a CipherSuite. Modeled after the ciphers in [[javax.crypto.Cipher]]
+ * @param name name of cipher suite, as in [[javax.crypto.Cipher]]
+ * @param algoBlockSize size of an algorithm block in bytes
+ */
+private[spark] case class CipherSuite(name: String, algoBlockSize: Int) {
+  private var _unknownValue: Integer = _
+
+  def unknownValue: Integer = _unknownValue
+
+  def unknownValue_=(unknownValue: Integer): Unit = {
+_unknownValue = unknownValue
+  }
+
+  override def toString(): String = {
+    Objects.toStringHelper(this).add("name", name).add("algoBlockSize", algoBlockSize).toString
--- End diff --

This is a case class so I'm not sure you even need to override `toString`. 
(Unless you really want the property names there.)

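Since the discussion is about the synthesized `toString`, here is a hedged sketch of what the default case-class output looks like (the values below are illustrative, not from the patch):

```scala
// Sketch only: Scala already synthesizes toString for a case class from
// its constructor arguments; the Objects.toStringHelper override only
// adds the "name=value" labels on top of that.
case class CipherSuite(name: String, algoBlockSize: Int)

object CaseClassToStringDemo {
  def main(args: Array[String]): Unit = {
    val suite = CipherSuite("AES/CTR/NoPadding", 16)
    println(suite)  // prints "CipherSuite(AES/CTR/NoPadding,16)"
  }
}
```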




[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051744
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
+    ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
--- End diff --

Not used anywhere.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051693
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
+    ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job.encryptedIntermediateData.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "128"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
--- End diff --

This is not only not used anywhere, but probably not cross-platform.

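As a hedged illustration of a portable alternative, `java.security.SecureRandom` selects a platform-appropriate entropy source, so no device path has to be configured (the IV size below is just an example):

```scala
import java.security.SecureRandom

object IvSketch {
  def main(args: Array[String]): Unit = {
    val rng = new SecureRandom()   // no /dev/urandom path hard-coded
    val iv = new Array[Byte](16)   // e.g. one AES block worth of IV
    rng.nextBytes(iv)              // fills the array with random bytes
    println(iv.length)             // prints 16
  }
}
```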




[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057147
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 1:1
+ * mapping. The decryption is buffer based. The key points of the decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+in: InputStream,
+private[this] val codec: CryptoCodec,
+bufferSizeVal: Integer,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+private[this] var streamOffset: Long,// Underlying stream offset.
+isDirectBuf: Boolean)
+extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * to inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit()
+   */
+  val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize) Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  bufferSize: Integer,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, bufferSize, key, iv, 0, true)
+  }
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, CryptoStreamUtils.getBufferSize, key, iv)
+  }
+
+  def getWrappedStream(): InputStream = in
+
+  /**
+   * Decryption is buffer based.
+   * If there is data in [[outBuffer]], then read it out of this buffer.
+   * If there is no data in [[outBuffer]], then read more from 
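The counter and padding formulas in the class comment above can be sketched standalone; a 16-byte AES block size is assumed here, and none of the names below come from the patch:

```scala
// Illustrative sketch of the CTR-mode position math described in the
// CryptoInputStream comment: which counter block a stream position falls
// in, and its byte offset (the "padding") within that block.
object CtrPositionMath {
  val AlgoBlockSize = 16L  // AES block size in bytes (assumption)

  // counter = base + pos / (algorithm blocksize)
  def counter(base: Long, pos: Long): Long = base + pos / AlgoBlockSize

  // padding = pos % (algorithm blocksize)
  def padding(pos: Long): Int = (pos % AlgoBlockSize).toInt

  def main(args: Array[String]): Unit = {
    // Position 35 lies at offset 3 within block 2 (bytes 32..47).
    println(counter(0L, 35L))  // prints 2
    println(padding(35L))      // prints 3
  }
}
```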

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057527
  
[The message quotes the same CryptoInputStream.scala diff hunk as the previous comment; the archived message is truncated before the inline review note.]

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057655
  
[The message quotes the same CryptoInputStream.scala diff hunk as the previous comment; the archived message is truncated before the inline review note.]

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42059715
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoOutputStream.scala ---
@@ -0,0 +1,225 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, FilterOutputStream, OutputStream}
+import java.nio.ByteBuffer
+import java.security.GeneralSecurityException
+
+import com.google.common.base.Preconditions
+
+import org.apache.spark.Logging
+
+/**
+ * CryptoOutputStream encrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 1:1
+ * mapping. The encryption is buffer based. The key points of the encryption are
+ * (1) calculating counter and (2) padding through stream position.
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoOutputStream(
+out: OutputStream,
+private[spark] val codec: CryptoCodec,
+bufferSizeVal: Int,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+streamOffsetVal: Long,
+isDirectBuf: Boolean) extends FilterOutputStream(out: OutputStream) with Logging {
+  var encryptor: Encryptor = null
+  var streamOffset: Long = 0
+  /**
+   * Padding = pos%(algorithm blocksize); Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put input data
+   * at proper position.
+   */
+  var padding: Byte = 0
+  var closed: Boolean = false
+  var key: Array[Byte] = null
+  var initIV: Array[Byte] = null
+  var iv: Array[Byte] = null
+  val oneByteBuf: Array[Byte] = new Array[Byte](1)
+
+  CryptoStreamUtils.checkCodec(codec)
+  var bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+
+  key = keyVal.clone()
+  initIV = ivVal.clone()
+  iv = ivVal.clone()
+
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  lazy val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSize)
+  } else {
+ByteBuffer.allocate(bufferSize)
+  }
+
+  /**
+   * Encrypted data buffer. The data starts at outBuffer.position() and ends at
+   * outBuffer.limit();
+   */
+  lazy val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSize)
+  } else {
+ByteBuffer.allocate(bufferSize)
+  }
+  streamOffset = streamOffsetVal
+  encryptor = codec.createEncryptor()
+
+  updateEncryptor()
+
+  def this(
--- End diff --

BTW are all these constructors really needed? If everything goes through 
`createCryptoOutputStream` and `createCryptoInputStream`, you should only need 
one constructor in each class.

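The suggestion can be sketched with Scala default arguments, which typically make such auxiliary constructors unnecessary; `StreamSketch` below is a stand-in, not the patch's API:

```scala
// Stand-in sketch: one primary constructor with default values replaces
// the this(...) overloads for bufferSize, streamOffset and isDirectBuf.
class StreamSketch(
    val bufferSize: Int = 8192,
    val streamOffset: Long = 0L,
    val isDirectBuf: Boolean = true)

object DefaultArgsDemo {
  def main(args: Array[String]): Unit = {
    val a = new StreamSketch()                  // all defaults
    val b = new StreamSketch(bufferSize = 4096) // override one value
    println(a.bufferSize)    // prints 8192
    println(b.streamOffset)  // prints 0
  }
}
```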




[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051545
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
+    ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job.encryptedIntermediateData.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "128"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = "spark.job.encryptedIntermediateData"
--- End diff --

This is not used anywhere.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42052097
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" + ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job.encryptedIntermediateData.buffer.kb"
--- End diff --

Why `spark.job`? It's related to encrypted shuffle, so it should be in 
`spark.shuffle.crypto`. For example, `spark.shuffle.crypto.bufferSize`. We also 
don't use units in config keys anymore; you can use `SparkConf.getSizeAsX` to 
translate values to the units you want, and the user can specify the values in 
the units they prefer.

I'd also rename the constant itself. "INTERMEDIATE" is both confusing and 
really not necessary.
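The suggestion above is to keep units out of the config key and parse them from the value instead (Spark's `SparkConf.getSizeAsKb` and friends do this). As an illustration only, here is a minimal Java imitation of that behavior; the key name `spark.shuffle.crypto.bufferSize` is vanzin's example, and the parsing logic is an assumption-level sketch, not Spark's actual `Utils.byteStringAsKb` implementation:

```java
import java.util.Locale;

public class SizeConfDemo {
    // Sketch of unit-suffix parsing: the config key carries no unit
    // ("spark.shuffle.crypto.bufferSize"), and the value may be "128k",
    // "1m", or a bare number interpreted in a caller-chosen default unit.
    static long sizeAsKb(String value) {
        String v = value.trim().toLowerCase(Locale.ROOT);
        if (v.endsWith("m")) {
            // megabytes -> kilobytes
            return Long.parseLong(v.substring(0, v.length() - 1)) * 1024;
        } else if (v.endsWith("k")) {
            return Long.parseLong(v.substring(0, v.length() - 1));
        } else {
            return Long.parseLong(v); // bare number: default unit is KB here
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAsKb("128k")); // 128
        System.out.println(sizeAsKb("1m"));   // 1024
        System.out.println(sizeAsKb("64"));   // 64
    }
}
```

With this scheme the user writes `spark.shuffle.crypto.bufferSize=128k` or `=1m` and the code asks for the unit it wants, so the key never has to encode `.kb`.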





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42053251
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CryptoCodec.scala ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY
+
+/**
+ * Crypto codec class, encapsulates encryptor/decryptor pair.
+ */
+private[spark] abstract class CryptoCodec {
+  /**
+   *
+   * @return the CipherSuite for this codec.
+   */
+  def getCipherSuite(): CipherSuite
+
+  /**
+   * This interface is only for Counter (CTR) mode. Generally the Encryptor
+   * or Decryptor calculates the IV and maintain encryption context 
internally.
+   * For example a [[javax.crypto.Cipher]] will maintain its encryption
+   * context internally when we do encryption/decryption using the
+   * Cipher#update interface.
+   *
+   * The IV can be calculated by combining the initial IV and the counter 
with
+   * a lossless operation (concatenation, addition, or XOR).
+   * @see 
http://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_.28CTR.29
+   * @param initIV
--- End diff --

Can you explain what each argument is? For example, from the 
implementation, `iv` seems to be output only; why can't it be a return value?
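For context on the "initial IV plus counter" combination the scaladoc describes: a common way to compute the per-position IV in CTR mode is big-endian addition of the block counter into the IV, as in Hadoop's `JceAesCtrCryptoCodec.calculateIV`. The sketch below is modeled on that approach (an assumption; it is not a verbatim copy of this PR's code), and it returns the IV rather than filling an output parameter, as the comment suggests:

```java
public class CtrIvDemo {
    // IV for block `counter` = initIV + counter, computed big-endian with
    // carry. Only the low 8 bytes of the 16-byte IV absorb the long counter.
    static byte[] calculateIV(byte[] initIV, long counter) {
        byte[] iv = new byte[initIV.length];
        int i = iv.length;
        int j = 0;
        int sum = 0;
        while (i-- > 0) {
            // (sum >>> Byte.SIZE) is the carry from the previous byte
            sum = (initIV[i] & 0xff) + (sum >>> Byte.SIZE);
            if (j++ < 8) { // big-endian; long is 8 bytes
                sum += (byte) counter & 0xff;
                counter >>>= 8;
            }
            iv[i] = (byte) sum;
        }
        return iv;
    }

    public static void main(String[] args) {
        byte[] iv = calculateIV(new byte[16], 2);
        System.out.println(iv[15]); // 2: the counter lands in the lowest byte
    }
}
```

Because the operation is lossless, a reader seeking to stream position `pos` can recompute the exact IV from `initIV` and `pos / blockSize` without any saved cipher state.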





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42054231
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 
1:1
+ * mapping. The decryption is buffer based. The key points of the 
decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+in: InputStream,
+private[this] val codec: CryptoCodec,
+bufferSizeVal: Integer,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+private[this] var streamOffset: Long,// Underlying stream offset.
+isDirectBuf: Boolean)
+extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit()
+   */
+  val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize) Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the 
input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new 
ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new 
ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  bufferSize: Integer,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, bufferSize, key, iv, 0, true)
+  }
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, CryptoStreamUtils.getBufferSize, key, iv)
+  }
+
+  def getWrappedStream(): InputStream = in
+
+  /**
+   * Decryption is buffer based.
+   * If there is data in [[outBuffer]], then read it out of this buffer.
+   * If there is no data in [[outBuffer]], then read more from 

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42059236
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala 
---
@@ -27,7 +27,7 @@ import org.apache.spark.{Logging, SparkException, 
TaskContext}
 import org.apache.spark.network.buffer.ManagedBuffer
 import org.apache.spark.network.shuffle.{BlockFetchingListener, 
ShuffleClient}
 import org.apache.spark.shuffle.FetchFailedException
-import org.apache.spark.util.Utils
+import org.apache.spark.util.{CompletionIterator, Utils}
--- End diff --

This change does not seem necessary?





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051371
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
--- End diff --

I'm a little confused about these config options. Could you write 
documentation for them in `configuration.md` or in this class's scaladoc?

This one `SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY` 
particularly looks like a mouthful. If I understand correctly it would be 
"spark.security.crypto.codec.classes.aes.ctr.nopadding".

My question is: how is this expected to be used?

From what I can understand, you have two configs: one to choose the cipher 
suite, one to choose the codec. Why can't you have just two configs:

`spark.security.crypto.cipher.suite`
`spark.security.crypto.codec.classes`

And that's it? Why do you need the second one to have a separate, dynamic 
config key depending on the chosen cipher suite?

Also, minor, but I'd be explicit about these options being related to the 
shuffle, so `spark.shuffle` instead of `spark.security`.
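To make the two-flat-keys proposal concrete, here is a minimal Java sketch of how codec resolution could look with a single, static `codec.classes` key holding a preference-ordered list (key names follow the suggestion above, not what the PR actually merged; `com.example.OpensslAesCtrCryptoCodec` is a hypothetical entry, and a `Map` stands in for `SparkConf`):

```java
import java.util.Map;

public class CodecConfDemo {
    // One static key lists codec classes in preference order; no per-suite
    // dynamic key suffix is needed. First entry in the list wins.
    static String chooseCodec(Map<String, String> conf) {
        String classes = conf.getOrDefault(
            "spark.shuffle.crypto.codec.classes",
            "org.apache.spark.crypto.JceAesCtrCryptoCodec"); // default codec
        return classes.split(",")[0].trim();
    }

    public static void main(String[] args) {
        // User prefers a hypothetical OpenSSL-backed codec, falling back to JCE.
        String chosen = chooseCodec(Map.of(
            "spark.shuffle.crypto.cipher.suite", "AES/CTR/NoPadding",
            "spark.shuffle.crypto.codec.classes",
            "com.example.OpensslAesCtrCryptoCodec, org.apache.spark.crypto.JceAesCtrCryptoCodec"));
        System.out.println(chosen);
    }
}
```

The point of the design question: the cipher suite and the codec list are independent settings, so neither key needs to embed the other's value.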





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42052623
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 
1:1
+ * mapping. The decryption is buffer based. The key points of the 
decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+in: InputStream,
+private[this] val codec: CryptoCodec,
+bufferSizeVal: Integer,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+private[this] var streamOffset: Long,// Underlying stream offset.
+isDirectBuf: Boolean)
+extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit()
+   */
+  val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize) Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the 
input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new 
ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new 
ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
--- End diff --

Constructors should be at the top of the file.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42054578
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 
1:1
+ * mapping. The decryption is buffer based. The key points of the 
decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+in: InputStream,
+private[this] val codec: CryptoCodec,
+bufferSizeVal: Integer,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+private[this] var streamOffset: Long,// Underlying stream offset.
+isDirectBuf: Boolean)
+extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit()
+   */
+  val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize) Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the 
input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new 
ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new 
ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  bufferSize: Integer,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, bufferSize, key, iv, 0, true)
+  }
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, CryptoStreamUtils.getBufferSize, key, iv)
+  }
+
+  def getWrappedStream(): InputStream = in
+
+  /**
+   * Decryption is buffer based.
+   * If there is data in [[outBuffer]], then read it out of this buffer.
+   * If there is no data in [[outBuffer]], then read more from 

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42058261
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoOutputStream.scala ---
@@ -0,0 +1,225 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, FilterOutputStream, OutputStream}
+import java.nio.ByteBuffer
+import java.security.GeneralSecurityException
+
+import com.google.common.base.Preconditions
+
+import org.apache.spark.Logging
+
+/**
+ * CryptoOutputStream encrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 
1:1
+ * mapping. The encryption is buffer based. The key points of the 
encryption are
+ * (1) calculating counter and (2) padding through stream position.
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoOutputStream(
+out: OutputStream,
+private[spark] val codec: CryptoCodec,
+bufferSizeVal: Int,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+streamOffsetVal: Long,
+isDirectBuf: Boolean) extends FilterOutputStream(out: OutputStream) 
with Logging {
+  var encryptor: Encryptor = null
+  var streamOffset: Long = 0
+  /**
+   * Padding = pos%(algorithm blocksize); Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put input 
data
+   * at proper position.
+   */
+  var padding: Byte = 0
+  var closed: Boolean = false
+  var key: Array[Byte] = null
+  var initIV: Array[Byte] = null
+  var iv: Array[Byte] = null
+  val oneByteBuf: Array[Byte] = new Array[Byte](1)
+
+  CryptoStreamUtils.checkCodec(codec)
+  var bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+
+  key = keyVal.clone()
+  initIV = ivVal.clone()
+  iv = ivVal.clone()
+
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  lazy val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSize)
+  } else {
+ByteBuffer.allocate(bufferSize)
+  }
+
+  /**
+   * Encrypted data buffer. The data starts at outBuffer.position() and 
ends at
+   * outBuffer.limit();
+   */
+  lazy val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSize)
+  } else {
+ByteBuffer.allocate(bufferSize)
+  }
+  streamOffset = streamOffsetVal
+  encryptor = codec.createEncryptor()
+
+  updateEncryptor()
+
+  def this(
--- End diff --

Constructors go at the top of the file.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42059164
  
--- Diff: 
core/src/test/scala/org/apache/spark/crypto/JceAesCtrCryptoCodecSuite.scala ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{ByteArrayInputStream, BufferedOutputStream, 
ByteArrayOutputStream}
+import java.security.SecureRandom
+
+import org.apache.spark.crypto.CommonConfigurationKeys._
+import org.apache.spark.{SparkFunSuite, SparkConf, Logging}
+
+/**
+ * test JceAesCtrCryptoCodec
+ */
+class JceAesCtrCryptoCodecSuite extends SparkFunSuite with Logging {
+
+  test("TestCryptoCodecSuite") {
+val random = new SecureRandom
+val dataLen = 1000
+val inputData = new Array[Byte](dataLen)
+val outputData = new Array[Byte](dataLen)
+random.nextBytes(inputData)
+// encrypt
+val sparkConf = new SparkConf
+
sparkConf.set(SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY,
+  classOf[JceAesCtrCryptoCodec].getName)
+val cipherSuite = new CipherSuite("AES/CTR/NoPadding", 16)
+val codec = new JceAesCtrCryptoCodec(sparkConf)
+val aos = new ByteArrayOutputStream
+val bos = new BufferedOutputStream(aos)
+val key = new Array[Byte](16)
+val iv = new Array[Byte](16)
+random.nextBytes(key)
+random.nextBytes(iv)
+
+val cos = new CryptoOutputStream(bos, codec, 1024, key, iv)
+cos.write(inputData, 0, inputData.length)
--- End diff --

Just pinging about this issue; the test doesn't really exercise all 
existing code paths and it would be really nice to get more coverage given the 
code is not that trivial.
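The core property the suite should cover is the encrypt/decrypt round trip. The following self-contained Java sketch exercises it with plain `javax.crypto` rather than the PR's `CryptoOutputStream`/`CryptoInputStream`, so it illustrates the property under test, not the PR's own API:

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrRoundTrip {
    // Run AES/CTR/NoPadding in the given mode over `data`.
    static byte[] run(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[16];
        byte[] input = new byte[1000]; // same data length the existing test uses
        random.nextBytes(key);
        random.nextBytes(iv);
        random.nextBytes(input);

        byte[] encrypted = run(Cipher.ENCRYPT_MODE, key, iv, input);
        byte[] decrypted = run(Cipher.DECRYPT_MODE, key, iv, encrypted);

        System.out.println(Arrays.equals(input, decrypted)); // round trip holds
        System.out.println(Arrays.equals(input, encrypted)); // ciphertext differs
    }
}
```

A fuller suite would additionally hit the buffered paths the review mentions: partial reads, `skip`, reads starting at a non-zero stream offset (forcing the counter/padding logic), and the single-byte `read()` path.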





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42059206
  
--- Diff: 
core/src/test/scala/org/apache/spark/crypto/JceAesCtrCryptoCodecSuite.scala ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{ByteArrayInputStream, BufferedOutputStream, 
ByteArrayOutputStream}
+import java.security.SecureRandom
+import java.util
+
+import org.scalatest.prop.TableDrivenPropertyChecks
+
+import org.apache.spark.{SparkFunSuite, SparkConf, Logging}
+import org.apache.spark.crypto.CommonConfigurationKeys._
+
+/**
+ * test JceAesCtrCryptoCodec
+ */
+private[crypto] class JceAesCtrCryptoCodecSuite extends SparkFunSuite with 
Logging with
+TableDrivenPropertyChecks {
--- End diff --

nit: indent this line





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051574
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
+    ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job.encryptedIntermediateData.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "128"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = "spark.job.encryptedIntermediateData"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS = "spark.job" +
+    ".encryptedIntermediateDataKeySizeBits"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS = 128
+  val SPARK_SHUFFLE_KEYGEN_ALGORITHM = "spark.shuffle.keygen.algorithm"
--- End diff --

Should be `spark.shuffle.crypto.keygen.algorithm`.
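As an illustration of how constants like these get resolved at runtime, here is a small sketch using a plain `Map` in place of `SparkConf` (the helper object and its `resolve` method are ours, not part of the PR):

```scala
// Hypothetical sketch: resolving crypto config keys against user-supplied
// settings, falling back to the defaults from CommonConfigurationKeys.
object CryptoConfSketch {
  val defaults = Map(
    "spark.security.crypto.cipher.suite" -> "AES/CTR/NoPadding",
    "spark.job.encryptedIntermediateData.buffer.kb" -> "128",
    "spark.job.encryptedIntermediateDataKeySizeBits" -> "128"
  )

  // User settings win; otherwise fall back to the compiled-in default.
  def resolve(userConf: Map[String, String], key: String): String =
    userConf.getOrElse(key, defaults(key))

  def main(args: Array[String]): Unit = {
    val user = Map("spark.job.encryptedIntermediateData.buffer.kb" -> "256")
    println(resolve(user, "spark.security.crypto.cipher.suite"))          // AES/CTR/NoPadding
    println(resolve(user, "spark.job.encryptedIntermediateData.buffer.kb")) // 256
  }
}
```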





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42051668
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
+    ".algorithm"
+  val SPARK_SECURITY_CRYPTO_JCE_PROVIDER_KEY = "spark.security.crypto.jce.provider"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job.encryptedIntermediateData.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "128"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
--- End diff --

This is not used anywhere.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42053618
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 1:1
+ * mapping. The decryption is buffer based. The key points of the decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+    in: InputStream,
+    private[this] val codec: CryptoCodec,
+    bufferSizeVal: Integer,
+    keyVal: Array[Byte],
+    ivVal: Array[Byte],
+    private[this] var streamOffset: Long, // Underlying stream offset.
+    isDirectBuf: Boolean)
+  extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+    ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+    ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit().
+   */
+  val outBuffer = if (isDirectBuf) {
+    ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+    ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize). Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
+      codec: CryptoCodec,
+      bufferSize: Integer,
+      key: Array[Byte],
+      iv: Array[Byte]) {
+    this(in, codec, bufferSize, key, iv, 0, true)
+  }
+
+  def this(in: InputStream,
+      codec: CryptoCodec,
+      key: Array[Byte],
+      iv: Array[Byte]) {
+    this(in, codec, CryptoStreamUtils.getBufferSize, key, iv)
+  }
+
+  def getWrappedStream(): InputStream = in
+
+  /**
+   * Decryption is buffer based.
+   * If there is data in [[outBuffer]], then read it out of this buffer.
+   * If there is no data in [[outBuffer]], then read more from 
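The counter/padding arithmetic described in the class comment above can be checked in isolation. This sketch assumes the 16-byte AES block size; the object and method names are ours for illustration, not part of the PR.

```scala
// Sketch of the CTR positioning math from the CryptoInputStream comment:
//   counter = base + pos / blockSize;  padding = pos % blockSize
// The padding bytes fed into the cipher before real data realign the
// keystream with an arbitrary stream offset.
object CtrPositionSketch {
  val blockSize = 16 // AES block size in bytes

  def counterFor(base: Long, pos: Long): Long = base + pos / blockSize
  def paddingFor(pos: Long): Int = (pos % blockSize).toInt

  def main(args: Array[String]): Unit = {
    // Seeking to byte 40 lands 8 bytes into the third block (blocks 0, 1, 2),
    // so the counter advances by 2 and 8 padding bytes are needed.
    println(counterFor(0L, 40L)) // 2
    println(paddingFor(40L))     // 8
  }
}
```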

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42054099
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42054088
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42054837
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057302
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057285
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42057228
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CryptoInputStream.scala ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{IOException, InputStream, FilterInputStream}
+import java.nio.ByteBuffer
+import java.nio.channels.ReadableByteChannel
+import java.security.GeneralSecurityException
+import java.util.Queue
+import java.util.concurrent.ConcurrentLinkedQueue
+
+import com.google.common.base.Preconditions
+
+/**
+ * CryptoInputStream decrypts data. It is not thread-safe. AES CTR mode is
+ * required in order to ensure that the plain text and cipher text have a 
1:1
+ * mapping. The decryption is buffer based. The key points of the 
decryption
+ * are (1) calculating the counter and (2) padding through stream position:
+ *
+ * counter = base + pos/(algorithm blocksize);
+ * padding = pos%(algorithm blocksize);
+ *
+ * The underlying stream offset is maintained as state.
+ */
+private[spark] class CryptoInputStream(
+in: InputStream,
+private[this] val codec: CryptoCodec,
+bufferSizeVal: Integer,
+keyVal: Array[Byte],
+ivVal: Array[Byte],
+private[this] var streamOffset: Long,// Underlying stream offset.
+isDirectBuf: Boolean)
+extends FilterInputStream(in: InputStream) with ReadableByteChannel {
+  val oneByteBuf = new Array[Byte](1)
+
+  val bufferSize = CryptoStreamUtils.checkBufferSize(codec, bufferSizeVal)
+  /**
+   * Input data buffer. The data starts at inBuffer.position() and ends at
+   * to inBuffer.limit().
+   */
+  val inBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * The decrypted data buffer. The data starts at outBuffer.position() and
+   * ends at outBuffer.limit()
+   */
+  val outBuffer = if (isDirectBuf) {
+ByteBuffer.allocateDirect(bufferSizeVal)
+  } else {
+ByteBuffer.allocate(bufferSizeVal)
+  }
+
+  /**
+   * Whether the underlying stream supports
+   * [[org.apache.hadoop.fs.ByteBufferReadable]]
+   */
+  var usingByteBufferRead = false
+  var usingByteBufferReadInitialized = false
+  /**
+   * Padding = pos%(algorithm blocksize) Padding is put into [[inBuffer]]
+   * before any other data goes in. The purpose of padding is to put the 
input
+   * data at proper position.
+   */
+  var padding: Byte = '0'
+  var closed: Boolean = false
+  var key: Array[Byte] = keyVal.clone()
+  var initIV: Array[Byte] = ivVal.clone()
+  var iv: Array[Byte] = ivVal.clone()
+  var isReadableByteChannel: Boolean = in.isInstanceOf[ReadableByteChannel]
+
+  /** DirectBuffer pool */
+  var bufferPool: Queue[ByteBuffer] = new 
ConcurrentLinkedQueue[ByteBuffer]()
+  /** Decryptor pool */
+  var decryptorPool: Queue[Decryptor] = new 
ConcurrentLinkedQueue[Decryptor]()
+
+  lazy val tmpBuf: Array[Byte] = new Array[Byte](bufferSize)
+  var decryptor: Decryptor = getDecryptor()
+  CryptoStreamUtils.checkCodec(codec)
+  resetStreamOffset(streamOffset)
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  bufferSize: Integer,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, bufferSize, key, iv, 0, true)
+  }
+
+  def this(in: InputStream,
+  codec: CryptoCodec,
+  key: Array[Byte],
+  iv: Array[Byte]) {
+this(in, codec, CryptoStreamUtils.getBufferSize, key, iv)
+  }
+
+  def getWrappedStream(): InputStream = in
+
+  /**
+   * Decryption is buffer based.
+   * If there is data in [[outBuffer]], then read it out of this buffer.
+   * If there is no data in [[outBuffer]], then read more from 
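The counter/padding arithmetic described in the class comment above can be illustrated standalone. This is a sketch only (the object and method names are hypothetical, not part of the patch), assuming AES's 16-byte block size:

```scala
// Illustrative sketch (hypothetical names) of the AES/CTR seek arithmetic
// described in the CryptoInputStream scaladoc, assuming a 16-byte AES block.
object CtrPosition {
  val BlockSize = 16

  // Number of whole blocks preceding `pos`; added to the base IV counter.
  def counterFor(pos: Long): Long = pos / BlockSize

  // Bytes of filler needed so `pos` lands at its offset inside the block.
  def paddingFor(pos: Long): Int = (pos % BlockSize).toInt
}
```

Seeking to byte 35, for example, advances the counter by two blocks and pads three bytes into the input buffer before decrypting, which is what lets CTR decryption start at an arbitrary stream offset.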

[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42059483
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+    SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + CipherSuite.AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY = "spark.security.java.secure.random" +
--- End diff --

As with other settings, this is shuffle related, so `spark.shuffle.crypto`, not `spark.security`.
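The rename the review asks for amounts to building every key under a shuffle-scoped prefix. A minimal sketch of that convention — the object and key names here are illustrative, not the final configuration:

```scala
// Hypothetical sketch of shuffle-scoped key construction, per the review:
// crypto settings live under "spark.shuffle.crypto", not "spark.security".
object ShuffleCryptoKeys {
  private val Prefix = "spark.shuffle.crypto"

  // Build a fully qualified key from its suffix.
  def key(suffix: String): String = s"$Prefix.$suffix"

  val CipherSuiteKey = key("cipher.suite")
}
```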


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42060005
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CryptoStreamUtils.scala ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.{InputStream, OutputStream}
+import java.nio.ByteBuffer
+
+import com.google.common.base.Preconditions
+import sun.misc.Cleaner
+import sun.nio.ch.DirectBuffer
+
+import org.apache.spark.crypto.CommonConfigurationKeys._
+import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.SparkConf
+
+/**
+ * A utility class for CryptoInputStream and CryptoOutputStream.
+ */
+private[spark] object CryptoStreamUtils {
+  val MIN_BUFFER_SIZE: Int = 512
+
+  /** Forcibly free the direct buffer. */
+  def freeDB(buffer: ByteBuffer): Unit = {
+    if (buffer.isInstanceOf[DirectBuffer]) {
+      val bufferCleaner: Cleaner = (buffer.asInstanceOf[DirectBuffer]).cleaner
+      bufferCleaner.clean
--- End diff --

nit: `clean()`
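Forcibly freeing direct buffers goes hand in hand with the buffer pools the streams keep, which exist to avoid repeated allocate/free cycles. A simplified sketch of that pooling pattern (assumptions: this is not the PR's implementation, and it silently drops pooled buffers that are too small):

```scala
import java.nio.ByteBuffer
import java.util.concurrent.ConcurrentLinkedQueue

// Simplified sketch of the direct-buffer pooling the crypto streams use:
// reuse released buffers instead of allocating (and later force-freeing) new ones.
object BufferPool {
  private val pool = new ConcurrentLinkedQueue[ByteBuffer]()

  def acquire(size: Int): ByteBuffer = {
    val pooled = pool.poll()
    if (pooled != null && pooled.capacity >= size) {
      pooled.clear() // reset position/limit before reuse
      pooled
    } else {
      ByteBuffer.allocateDirect(size)
    }
  }

  def release(buffer: ByteBuffer): Unit = pool.offer(buffer)
}
```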





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42052749
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+private[spark] object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
--- End diff --

Another one that's a bit confusing. You return this value from a method (`CryptoStreamUtils.getBufferSize`), so that method doesn't really seem necessary. From the code it also seems related to `SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB`, but both have different default values.

Looking at the way the input and output streams are created, it seems to me that these two configs are indeed the same. Could you clean this up?
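Collapsing the two overlapping settings means one key with one default. A sketch of what that could look like — the key name is assumed, and only the 8192-byte default and 512-byte minimum come from the quoted code:

```scala
// Sketch (assumed key name) of a single consolidated buffer-size setting:
// one key, one default, floored at a minimum and rounded down to the
// 16-byte AES block size.
object CryptoBufferSize {
  val Default = 8192
  val Min = 512
  val BlockSize = 16

  def resolve(conf: Map[String, String]): Int = {
    val requested = conf.get("spark.shuffle.crypto.buffer.size")
      .map(_.toInt).getOrElse(Default)
    math.max(Min, requested - (requested % BlockSize))
  }
}
```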





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r42058945
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -823,6 +822,14 @@ private[spark] class Client(
     val securityManager = new SecurityManager(sparkConf)
     amContainer.setApplicationACLs(
       YarnSparkHadoopUtil.getApplicationAclsForYarn(securityManager).asJava)
+
+    if (CryptoConf.isShuffleEncryted(sparkConf)) {
+      CryptoConf.initSparkShuffleCredentials(sparkConf, credentials)
+      val dob: DataOutputBuffer = new DataOutputBuffer
--- End diff --

Everything after this line seems unnecessary; that's exactly what `setupSecurityToken` already does.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread vanzin
Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-148217525
  
Hi @winningsix, I think this patch needs a lot of cleanup around the configuration part. It's really confusing at the moment, and among other issues it seems like there are duplicate, conflicting settings.

The input stream implementation is also hard to follow, although I think I convinced myself it's doing the right thing. It would be nice to try to simplify it, so that others can more easily understand the code.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-14 Thread winningsix
Github user winningsix commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-148256354
  
Hi @vanzin, thank you for your review and comments. I will work on refactoring the crypto input/output stream part. Before that, I will update the patch to resolve the configuration-related issues and complete configuration.md.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147441714
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147443941
  
  [Test build #43569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43569/consoleFull) for PR 8880 at commit [`6116b2c`](https://github.com/apache/spark/commit/6116b2c7adc7f54aa152fbd89f7a914d24aedb9e).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147441746
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147473474
  
  [Test build #43569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43569/console) for PR 8880 at commit [`6116b2c`](https://github.com/apache/spark/commit/6116b2c7adc7f54aa152fbd89f7a914d24aedb9e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147473627
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147473629
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43569/
Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147576477
  
  [Test build #43608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43608/consoleFull) for PR 8880 at commit [`89736c0`](https://github.com/apache/spark/commit/89736c05867c7edd7ece62cd8b674cb6a1f1a012).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147575412
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147575419
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147598348
  
  [Test build #43608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43608/console) for PR 8880 at commit [`89736c0`](https://github.com/apache/spark/commit/89736c05867c7edd7ece62cd8b674cb6a1f1a012).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147598394
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147598395
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43608/
Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147292960
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147292917
  
  [Test build #43555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43555/console) for PR 8880 at commit [`843fd72`](https://github.com/apache/spark/commit/843fd72713f56d5447a5988c33cec05ae6348549).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147292961
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43555/
Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147281265
  
  [Test build #43555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43555/consoleFull) for PR 8880 at commit [`843fd72`](https://github.com/apache/spark/commit/843fd72713f56d5447a5988c33cec05ae6348549).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147280813
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147280801
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147056390
  
  [Test build #43527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43527/consoleFull) for PR 8880 at commit [`2f31c3b`](https://github.com/apache/spark/commit/2f31c3baceaa9711dbb9d90a9797e6257e78e765).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147050557
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147068395
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147068396
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43527/
Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147068377
  
  [Test build #43527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43527/console) for PR 8880 at commit [`2f31c3b`](https://github.com/apache/spark/commit/2f31c3baceaa9711dbb9d90a9797e6257e78e765).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class JceAesCtrCipher(mode: Int, provider: String) extends Encryptor with Decryptor`






[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147069789
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43525/
Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147069788
  
Build finished. Test FAILed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147069754
  
  [Test build #43525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43525/console) for PR 8880 at commit [`37d6121`](https://github.com/apache/spark/commit/37d612111e0d1897e70d3acd97580d4405868807).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
   * `class JceAesCtrCipher(mode: Int, provider: String) extends Encryptor with Decryptor`






[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147047849
  
 Build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147047862
  
Build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147048107
  
  [Test build #43525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43525/consoleFull) for PR 8880 at commit [`37d6121`](https://github.com/apache/spark/commit/37d612111e0d1897e70d3acd97580d4405868807).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8880#issuecomment-147050562
  
Merged build started.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-09 Thread winningsix
Github user winningsix commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41608824
  
--- Diff: core/src/main/scala/org/apache/spark/crypto/CryptoCodec.scala ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import scala.reflect.runtime.universe
+
+import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY
+import org.apache.spark.crypto.CommonConfigurationKeys.SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY
+
+/**
+ * Crypto codec class, encapsulates encryptor/decryptor pair.
+ */
+abstract class CryptoCodec() {
+  /**
+   *
+   * @return the CipherSuite for this codec.
+   */
+  def getCipherSuite(): CipherSuite
+
+  /**
+   * This interface is only for Counter (CTR) mode. Generally the Encryptor
+   * or Decryptor calculates the IV and maintains the encryption context
+   * internally. For example, a {@link javax.crypto.Cipher} maintains its
+   * encryption context internally when we do encryption/decryption using the
+   * Cipher#update interface.
+   *
+   * The IV can be calculated by combining the initial IV and the counter with
+   * a lossless operation (concatenation, addition, or XOR).
+   * @see http://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_.28CTR.29
+   * @param initIV initial IV
+   * @param counter counter for the current block
+   * @param iv output buffer for the calculated initialization vector
+   */
+  def calculateIV(initIV: Array[Byte], counter: Long, iv: Array[Byte])
+
+  /**
+   * @return Encryptor the encryptor
+   */
+  def createEncryptor: Encryptor
+
+  /**
+   * @return Decryptor the decryptor
+   */
+  def createDecryptor: Decryptor
+
+  /**
+   * Generate a number of secure, random bytes suitable for cryptographic use.
+   * This method needs to be thread-safe.
+   * @param bytes byte array to populate with random data
+   */
+  def generateSecureRandom(bytes: Array[Byte])
+}
+
+object CryptoCodec extends Logging {
+  def getInstance(conf: SparkConf): CryptoCodec = {
+    val name = conf.get(SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY,
+      SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT)
+    getInstance(conf, CipherSuite.apply(name))
+  }
+
+  def getInstance(conf: SparkConf, cipherSuite: CipherSuite): CryptoCodec = {
+    getCodecClasses(conf, cipherSuite).toIterator.map {
+      case name if name == classOf[JceAesCtrCryptoCodec].getName =>
+        new JceAesCtrCryptoCodec(conf)
--- End diff --

Yes, I will add some TODOs here. As discussed in
https://github.com/apache/spark/pull/5307, we will add an AES-NI cipher once we
decide how to do it (include Chimera in Spark or just leverage the library).
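
The CTR IV derivation described in the Scaladoc above (combining the initial
IV with the block counter via a lossless operation) can be sketched roughly as
follows. This is an illustrative implementation modeled on the big-endian
add-with-carry approach used by Hadoop's AES-CTR codec, not the code under
review; the object name `CtrIv` is hypothetical:

```scala
object CtrIv {
  // AES block size: the IV is 16 bytes, with the counter in the low 8 bytes.
  private val IvLength = 16

  /** Add `counter` (treated as a big-endian 8-byte value) into the low bytes
    * of `initIV`, propagating carries upward, and write the result to `iv`. */
  def calculateIV(initIV: Array[Byte], counter: Long, iv: Array[Byte]): Unit = {
    require(initIV.length == IvLength && iv.length == IvLength)
    var ctr = counter
    var sum = 0
    var i = IvLength
    var j = 0
    while (i > 0) {
      i -= 1
      // (sum >>> 8) is the carry out of the previous (less significant) byte
      sum = (initIV(i) & 0xff) + (sum >>> 8)
      if (j < 8) { // only the low 8 IV bytes receive counter bytes
        sum += (ctr & 0xff).toInt
        ctr >>>= 8
        j += 1
      }
      iv(i) = sum.toByte
    }
  }
}
```

With an all-zero initial IV and counter 1, only the low byte of the result is
set; when the low bytes of the initial IV are 0xff, the carry propagates into
the more significant bytes, which is what makes the operation lossless.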




