Right, I'm clearly facing headwinds. getPartitions returns Array[Partition], so subclassing HadoopPartition wouldn't help. Maybe I'm better off just having a custom InputFormat. I'll explore that option some more. Thanks for your input.
Ameet

On Fri, Feb 21, 2014 at 3:24 PM, Jey Kottalam <j...@cs.berkeley.edu> wrote:
> What's the motivation for having your own subclass of HadoopPartition?
> As far as I know, that's not a supported use case either.
>
> On Fri, Feb 21, 2014 at 11:54 AM, Ameet Kini <ameetk...@gmail.com> wrote:
> > The use case is to control the partitions as they come out of the
> > HadoopRDD.
> > 1. Have my own HadoopPartition that has fields specific to my
> > application. These fields would then be used by other RDD operations
> > (also overridden by me). This is why I was looking to extend
> > HadoopPartition.
> > 2. Have my own getPartitions which has slightly different partitioning
> > logic. This can almost be solved by subclassing InputFormat and its
> > getSplits method, but I still need to have getPartitions create
> > MyHadoopPartition instead of HadoopPartition.
> >
> > Ameet
> >
> >
> > On Fri, Feb 21, 2014 at 2:37 PM, Jey Kottalam <j...@cs.berkeley.edu> wrote:
> >>
> >> What's the motivation for subclassing HadoopRDD? I don't believe
> >> that's a supported use case. Is it not possible to do what you need
> >> with a Hadoop InputFormat?
> >>
> >> On Fri, Feb 21, 2014 at 11:16 AM, Ameet Kini <ameetk...@gmail.com> wrote:
> >> > I'm looking to subclass HadoopRDD and was hoping to subclass
> >> > NextIterator in compute().
> >> >
> >> > Thanks,
> >> > Ameet