[GitHub] spark pull request #16052: [SPARK-18617][CORE][STREAMING] Close "kryo auto p...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r90306255

--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ---
@@ -869,6 +891,31 @@ object TestReceiver {
   val counter = new AtomicInteger(1)
 }
+class FakeByteArrayReceiver extends Receiver[Array[Byte]](StorageLevel.MEMORY_ONLY) with Logging {
--- End diff --

Any class in [primitiveAndPrimitiveArrayClassTags](https://github.com/apache/spark/blob/3f03c90a807872d47588f3c3920769b8978033bf/core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala#L46) can trigger this issue. That's why you can also use the existing TestReceiver. It's `Receiver[Int]`.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r90191359

--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ---
@@ -869,6 +891,31 @@ object TestReceiver {
   val counter = new AtomicInteger(1)
 }
+class FakeByteArrayReceiver extends Receiver[Array[Byte]](StorageLevel.MEMORY_ONLY) with Logging {
--- End diff --

BTW, the existing unit test could cover cases other than the `Array[Byte]` type.
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r90191009

--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ---
@@ -869,6 +891,31 @@ object TestReceiver {
   val counter = new AtomicInteger(1)
 }
+class FakeByteArrayReceiver extends Receiver[Array[Byte]](StorageLevel.MEMORY_ONLY) with Logging {
--- End diff --

@zsxwing yes, the failure occurs when the receiver stores `Array[Byte]` data: the automatic serializer selection picks JavaSerializer, because the element type has been erased to Object. However, after the block is fetched from the remote executor, the input-stream data is deserialized with KryoSerializer, because the Task knows the proper data type, leading to **com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 13994**.
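The write/read asymmetry described above can be sketched as follows. This is a minimal, self-contained illustration, not Spark's actual code: `SerializerChoice` is a hypothetical name, and `canUseKryo` is abbreviated to two class tags rather than Spark's full `primitiveAndPrimitiveArrayClassTags` set.

```scala
import scala.reflect.ClassTag

// Illustrative sketch of why the write and read paths disagree: Spark
// auto-picks Kryo only for primitive/primitive-array/String class tags,
// but the receiver's write path only sees an erased Object tag.
object SerializerChoice {
  // Abbreviated stand-in for SerializerManager.canUseKryo.
  def canUseKryo(ct: ClassTag[_]): Boolean =
    ct == ClassTag(classOf[Array[Byte]]) || ct == ClassTag(classOf[String])

  // Returns the serializer name that would be auto-picked for this class tag.
  def pick(ct: ClassTag[_]): String =
    if (canUseKryo(ct)) "KryoSerializer" else "JavaSerializer"
}
```

Under this sketch, the write side (erased `Object` tag) gets JavaSerializer while the read side (known `Array[Byte]` tag) gets KryoSerializer, which is the mismatch behind the KryoException.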
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r90182581

--- Diff: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ---
@@ -869,6 +891,31 @@ object TestReceiver {
   val counter = new AtomicInteger(1)
 }
+class FakeByteArrayReceiver extends Receiver[Array[Byte]](StorageLevel.MEMORY_ONLY) with Logging {
--- End diff --

nit: Why create a new class? Is there any concern with just using TestReceiver?
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16052
Github user uncleGen commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r89983105

--- Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala ---
@@ -155,7 +155,12 @@ private[spark] class SerializerManager(defaultSerializer: Serializer, conf: Spar
       outputStream: OutputStream,
       values: Iterator[T]): Unit = {
     val byteStream = new BufferedOutputStream(outputStream)
-    val ser = getSerializer(implicitly[ClassTag[T]]).newInstance()
+    val ser = blockId match {
--- End diff --

Ah, yours is better.
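The shape of the `blockId match` change under discussion can be sketched as follows. This is illustrative, not the merged diff: `AutoPick` is a hypothetical object, and `StreamBlockId`/`RDDBlockId` are simplified stand-ins for Spark's block id classes.

```scala
// Sketch: auto-pick is disabled exactly when the block holds streaming
// receiver data, since the receiver write path loses the element type
// to erasure and would otherwise pick a different serializer than the reader.
object AutoPick {
  sealed trait BlockId
  final case class RDDBlockId(rddId: Int, splitIndex: Int) extends BlockId
  final case class StreamBlockId(streamId: Int, uniqueId: Long) extends BlockId

  def autoPickFor(blockId: BlockId): Boolean = blockId match {
    case _: StreamBlockId => false // receiver block: stick to the default serializer
    case _                => true  // other blocks: Kryo may be auto-picked by class tag
  }
}
```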
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16052#discussion_r89954432

--- Diff: core/src/main/scala/org/apache/spark/serializer/SerializerManager.scala ---
@@ -77,8 +77,8 @@ private[spark] class SerializerManager(defaultSerializer: Serializer, conf: Spar
     primitiveAndPrimitiveArrayClassTags.contains(ct) || ct == stringClassTag
   }
-  def getSerializer(ct: ClassTag[_]): Serializer = {
-    if (canUseKryo(ct)) {
+  def getSerializer(ct: ClassTag[_], autoPick: Boolean): Serializer = {
--- End diff --

document what this means
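A documented version of the new parameter might look like the sketch below. The doc wording, the `DocSketch` wrapper, and the one-tag `canUseKryo` are all assumptions for illustration, not the committed code.

```scala
import scala.reflect.ClassTag

// Sketch of a documented getSerializer(ct, autoPick) signature.
object DocSketch {
  trait Serializer
  object kryo extends Serializer
  object default extends Serializer

  // Abbreviated stand-in for SerializerManager.canUseKryo.
  def canUseKryo(ct: ClassTag[_]): Boolean = ct == ClassTag(classOf[String])

  /**
   * Picks the serializer for values tagged with `ct`.
   *
   * @param autoPick whether Kryo may be chosen automatically when `ct` is a
   *                 primitive, primitive-array, or String class tag. Callers
   *                 pass `false` for streaming receiver blocks, whose
   *                 write-side element type has been erased and may not match
   *                 the class tag seen on the read side.
   */
  def getSerializer(ct: ClassTag[_], autoPick: Boolean): Serializer =
    if (autoPick && canUseKryo(ct)) kryo else default
}
```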
GitHub user uncleGen opened a pull request:

    https://github.com/apache/spark/pull/16052

[SPARK-18617][CORE][STREAMING] Close "kryo auto pick" feature for Spark Streaming

## What changes were proposed in this pull request?

#15992 provided a solution to fix the bug, i.e. **receiver data cannot be deserialized properly**. As @zsxwing said, it is a critical bug, but we should not break APIs between maintenance releases. It may be a rational choice to disable the "kryo auto pick" feature for Spark Streaming as a first step.

## How was this patch tested?

Existing unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark SPARK-18617

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16052.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16052

commit e4416bd155f1b3a973fd37f03a60d61f1a0612f3
Author: uncleGen
Date: 2016-11-29T06:11:07Z

    SPARK-18617: Close "kryo auto pick" feature for Spark Streaming