NamedVector is already supported in VectorWritable; do we need a new Writable?
Ah, well, if VectorWritable can already support this, then there definitely isn't any need for another Writable. I took a look at VectorWritable a while back and didn't see anything that could help; is there some sort of label I could use?

Is the issue that you are doing joins? It's still possible without 
CompositeInputFormat, and we use the pattern elsewhere. You need some 
cleverness with a custom key and partitioner: the partitioner sends key x 
from source A and key x from source B to the same reducer, while the key 
itself carries a bit that indicates whether the record came from A or B.
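
A minimal sketch of that pattern in plain Java, with no Hadoop dependency, might look like the following. All of the names here (TaggedJoinSketch, TaggedKey, partition, innerJoin) are hypothetical, and the sketch assumes at most one value per key per source. In a real job the same logic would presumably live in a custom WritableComparable key and a custom Partitioner; this only simulates the shuffle in memory to show the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TaggedJoinSketch {

    /** Hypothetical composite key: the join key plus a bit saying which source it came from. */
    static final class TaggedKey implements Comparable<TaggedKey> {
        final int joinKey;
        final boolean fromB; // false = source A, true = source B

        TaggedKey(int joinKey, boolean fromB) {
            this.joinKey = joinKey;
            this.fromB = fromB;
        }

        /** Sort by join key first, then A before B, so a reducer sees A's record first. */
        @Override
        public int compareTo(TaggedKey other) {
            int c = Integer.compare(joinKey, other.joinKey);
            return c != 0 ? c : Boolean.compare(fromB, other.fromB);
        }
    }

    /** Partition on the join key only; the source bit must not affect the target reducer. */
    static int partition(TaggedKey key, int numReducers) {
        return (Integer.hashCode(key.joinKey) & Integer.MAX_VALUE) % numReducers;
    }

    /** Simulates map, shuffle, and reduce for an inner join of sources A and B. */
    static List<String> innerJoin(Map<Integer, String> a, Map<Integer, String> b, int numReducers) {
        // One sorted bucket per simulated reducer.
        List<TreeMap<TaggedKey, String>> reducers = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducers.add(new TreeMap<>());
        }
        // "Map" phase: tag each record with its source and route it by join key.
        for (Map.Entry<Integer, String> e : a.entrySet()) {
            TaggedKey k = new TaggedKey(e.getKey(), false);
            reducers.get(partition(k, numReducers)).put(k, e.getValue());
        }
        for (Map.Entry<Integer, String> e : b.entrySet()) {
            TaggedKey k = new TaggedKey(e.getKey(), true);
            reducers.get(partition(k, numReducers)).put(k, e.getValue());
        }
        // "Reduce" phase: emit a pair whenever a B record directly follows
        // an A record that has the same join key.
        List<String> out = new ArrayList<>();
        for (TreeMap<TaggedKey, String> bucket : reducers) {
            TaggedKey lastA = null;
            String lastAValue = null;
            for (Map.Entry<TaggedKey, String> e : bucket.entrySet()) {
                TaggedKey k = e.getKey();
                if (!k.fromB) {
                    lastA = k;
                    lastAValue = e.getValue();
                } else if (lastA != null && lastA.joinKey == k.joinKey) {
                    out.add(k.joinKey + ":" + lastAValue + "," + e.getValue());
                }
            }
        }
        return out;
    }
}
```

The point of the design is that partition() looks only at joinKey, so the tagged keys for x from A and x from B always land in the same bucket, while the sort order puts A's record immediately before B's.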

Yes, the issue is joins. I'm effectively trying to replace this one line of code:

    conf.set("mapred.join.expr", CompositeInputFormat.compose(
          "inner", SequenceFileInputFormat.class, aPath, bPath));

If this can be done without CompositeInputFormat, or the partitioner can be modified to definitively assign specific/custom keys and values to specific nodes, then that would be perfect. Should I look into Hadoop's Partitioner/MapPartitioner/MapTask classes for this, or is there somewhere else I should look?

Thanks for the feedback!

Shannon
