Due to a bug in Spark we have a nasty workaround for Spark 1.2.1, so I'm trying 1.3.0.
However, they have redesigned rdd.saveAsSequenceFile in
SequenceFileRDDFunctions. The class now expects the K and V Writable
classes to be supplied to the constructor:
class SequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable: ClassTag](
    self: RDD[(K, V)],
    _keyWritableClass: Class[_ <: Writable],   // <======== new
    _valueWritableClass: Class[_ <: Writable]) // <======== new
  extends Logging with Serializable {
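For concreteness, here is a minimal sketch (not Mahout code; the types, path,
and app name are made up) of feeding the new constructor directly with the
Writable classes it now requires:

    import org.apache.hadoop.io.{IntWritable, Text}
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.SequenceFileRDDFunctions

    val sc = new SparkContext(
      new SparkConf().setAppName("seqfile-sketch").setMaster("local[2]"))

    // Build the Writables inside a map so nothing non-serializable is shipped
    // from the driver; Text and IntWritable already satisfy the K <% Writable
    // and V <% Writable view bounds.
    val rdd = sc.parallelize(Seq("a" -> 1, "b" -> 2))
      .map { case (k, v) => (new Text(k), new IntWritable(v)) }

    // The key and value Writable classes are now passed explicitly.
    new SequenceFileRDDFunctions(rdd, classOf[Text], classOf[IntWritable])
      .saveAsSequenceFile("/tmp/seqfile-sketch")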
as explained in the commit log:

    [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs
    to make them be activated automatically

    Try to redesign the "primitive type => Writable" implicit APIs to make them
    be activated automatically and without breaking binary compatibility.
    However, this PR will breaking the source compatibility if people use
    `xxxToXxxWritable` occasionally. See the unit test in `graphx`.

    Author: zsxwing

    Closes #3642 from zsxwing/SPARK-4795 and squashes the following commits:
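As far as I can tell, the upshot for plain primitive pairs is that the
conversions are now found without the old SparkContext._ import. A tiny
sketch of my reading (untested):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("implicit-sketch").setMaster("local[2]"))

    // Note: no `import org.apache.spark.SparkContext._` here. Per SPARK-4795
    // the primitive => Writable conversions activate automatically, so this
    // should still compile: Int => IntWritable and String => Text are implied.
    val pairs = sc.parallelize(Seq(1 -> "one", 2 -> "two"))
    pairs.saveAsSequenceFile("/tmp/pairs-sketch")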
Since Andy, Gokhan, and Dmitriy have been messing with the Key type recently,
I didn't want to plow ahead with this before consulting you. It appears that
the Writable classes now need to be available to the constructor when the RDD
is written, which breaks every use of rdd.saveAsSequenceFile in Mahout. One
way to contain the change is sketched below.
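Purely as a sketch (the helper name saveSeq is mine, not a decided design),
a single wrapper that all Mahout call sites go through would absorb the new
constructor arguments in one place:

    import scala.reflect.ClassTag
    import org.apache.hadoop.io.Writable
    import org.apache.spark.rdd.{RDD, SequenceFileRDDFunctions}

    // Hypothetical shim: call sites pass their Writable classes once, here,
    // instead of each constructing SequenceFileRDDFunctions themselves.
    def saveSeq[K <% Writable: ClassTag, V <% Writable: ClassTag](
        rdd: RDD[(K, V)],
        keyClass: Class[_ <: Writable],
        valueClass: Class[_ <: Writable],
        path: String): Unit =
      new SequenceFileRDDFunctions(rdd, keyClass, valueClass)
        .saveAsSequenceFile(path)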
Where is the best place to fix this?