Somewhat like that, since we are using the same approach, but I was thinking of it more as PTables.asPTable(PCollection<V>, KeyFinder<V>, PType<K>), returning a PTable<K,V>.
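To make the idea concrete, here is a rough sketch of what I have in mind. It is not Crunch code: plain java.util types stand in for PCollection and PTable so it runs on its own, and the names KeyFinderSketch and asTable are made up for illustration. I have also given KeyFinder two type parameters (K and V) so that findKey can return a typed key, which is an assumption on my part. In Crunch itself, the finder would presumably be wrapped in a MapFn and handed to something like the PCollection#by method mentioned in the quoted thread.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class KeyFinderSketch {

    /** Hypothetical interface from this proposal: derives a key K from a value V. */
    public interface KeyFinder<K, V> {
        K findKey(V value);
    }

    /**
     * Stand-in for the proposed asPTable: pairs every value with its key.
     * List<V> plays the role of PCollection<V>, and the list of entries
     * plays the role of PTable<K, V>. Values pass through unchanged.
     */
    public static <K, V> List<Map.Entry<K, V>> asTable(List<V> values, KeyFinder<K, V> finder) {
        List<Map.Entry<K, V>> table = new ArrayList<>();
        for (V value : values) {
            table.add(new SimpleImmutableEntry<>(finder.findKey(value), value));
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("crunch", "join", "table");
        // Key each word by its length; the word itself stays as the value.
        List<Map.Entry<Integer, String>> byLength = asTable(words, String::length);
        System.out.println(byLength);
    }
}
```

The point of the separate interface is that callers only say how to find the key; the pairing of key and value is done once, in one place.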
Basically, KeyFinder<V> is an interface with some kind of method like findKey(V) that returns a K, either taken from that V or calculated in whatever way it wants, because the only thing the MapFn will be emitting is a Pair<K,V>, right? This holds provided V does not change to V1.

On Thu, Feb 20, 2014 at 12:07 AM, Gabriel Reid <[email protected]> wrote:

> On 20 Feb 2014, at 05:11, Jinal Shah <[email protected]> wrote:
>
> > I didn't knew that, but I was more talking about something like this
> > PCollection<V> to PTable<K,V> basically.
>
> I think what you want is the PCollection#by method. It takes a MapFn that
> maps each value V to a key, and returns a PTable<K,V>
>
> - Gabriel
>
> >> On Wed, Feb 19, 2014 at 5:49 PM, Josh Wills <[email protected]> wrote:
> >>
> >> org.apache.crunch.lib.PTables.asPTable is likely what you want.
> >>
> >> On Wed, Feb 19, 2014 at 3:47 PM, Jinal Shah <[email protected]> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> Is there a generic way of converting PCollection to PTable? If not, Can we
> >>> create a generic class? Because we are having lot of places where we want
> >>> to perform a join on 2 PCollections so we have to convert it into PTables
> >>> and then do a join and then convert it into a PCollection. So i was
> >>> wondering is there a better way of doing this.
> >>>
> >>> Thanks
> >>
> >> --
> >> Director of Data Science
> >> Cloudera <http://www.cloudera.com>
> >> Twitter: @josh_wills <http://twitter.com/josh_wills>
