Hi everyone,

I'm continuing to work with Spark and am enjoying it a lot. I have, however, come across an infelicity in Spark Streaming as it pertains to Kafka. One of the functions in KafkaFunctions is:
    def kafkaStream[T, D <: kafka.serializer.Decoder[_]](
        kafkaParams: Map[String, String],
        topics: Map[String, Int],
        storageLevel: StorageLevel
      )(implicit arg0: ClassManifest[T], arg1: Manifest[D]): DStream[T]

Note that it implicitly takes a Manifest of a subtype of Decoder, and that the implementation then instantiates that type via reflection, assuming a nullary constructor. Please don't do that. I'll spare you (this time) the rant about the use of reflection being an anti-pattern in general, and for construction in particular. :-) For now, I'll only note that the implicit Manifest here offers a false economy, since the call site is overwhelmingly likely to have to name the Decoder type explicitly anyway. Why not just take a Decoder as a parameter instead?

More significantly, kafka.serializer.Decoder is a Scala trait: it has no constructors at all, let alone a nullary one. So the implementation of kafkaStream() presumes to possess information about my design that it cannot possibly have, which is annoying both for philosophical reasons and because it doesn't work. :-) (A sketch of the failure mode is in the P.S. below.)

Ideally, I could pass in a Decoder[T] and get a DStream[T] back. This would require the -Ydependent-method-types option at compile time, but it seems worthwhile to me.

Thanks!
Paul
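P.S. For anyone curious, here is a minimal, self-contained sketch of the reflective construction pattern I'm objecting to, and why it can't work when the manifested type is a trait. The Decoder here is my own stand-in, not the actual Spark or Kafka source:

    // Hypothetical stand-in for kafka.serializer.Decoder, just to keep
    // the sketch self-contained.
    trait Decoder[T] {
      def toEvent(bytes: Array[Byte]): T
    }

    // The pattern in question: recover the erased class from the implicit
    // Manifest and invoke its nullary constructor reflectively.
    def instantiate[D <: Decoder[_]](implicit m: Manifest[D]): D =
      m.erasure.newInstance().asInstanceOf[D]

    // This compiles happily, but blows up at runtime with a
    // java.lang.InstantiationException, because a trait has no
    // constructors to invoke:
    //
    //   instantiate[Decoder[String]]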
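P.P.S. To make the suggestion concrete, here is roughly the shape of API I have in mind, sketched with an ordinary type parameter. The names and stub types are mine, purely for illustration:

    // Stand-ins so the sketch stands alone; in Spark these would be the
    // real StorageLevel and DStream types.
    class StorageLevel
    class DStream[T]

    trait Decoder[T] {
      def toEvent(bytes: Array[Byte]): T
    }

    // Take the decoder as an ordinary value parameter: no reflection, no
    // Manifest, and the element type T of the result is inferred from the
    // decoder the caller actually passes in.
    def kafkaStream[T](
        kafkaParams: Map[String, String],
        topics: Map[String, Int],
        storageLevel: StorageLevel,
        decoder: Decoder[T]): DStream[T] =
      sys.error("sketch only; the real implementation would build the stream")

    // Example call site: the caller supplies a concrete decoder, and the
    // compiler works out DStream[String] on its own.
    val utf8Decoder = new Decoder[String] {
      def toEvent(bytes: Array[Byte]) = new String(bytes, "UTF-8")
    }
    // val lines: DStream[String] =
    //   kafkaStream(params, Map("events" -> 1), new StorageLevel, utf8Decoder)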
