Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by OlgaN:
http://wiki.apache.org/pig/JoinFramework

------------------------------------------------------------------------------
  
  Note that `GROUP` can take advantage of this knowledge as well.
  
- [Discussion of different data layout options.]
+ To support this type of join, the data can be laid out in two ways. First, the 
data can be globally sorted on the join key, with a range index available. Second, 
the data can be statically partitioned into a fixed number of buckets, using the 
same partitioning function for each input. The first approach is more flexible, 
since it allows an arbitrary level of parallelism when processing the data, but 
the data is more complex and expensive to generate. Also, an open question for 
the second approach is how to identify matching partitions; relying on file 
names seems like a pretty fragile approach.
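
As a rough illustration of the second layout only (not Pig's actual implementation), the sketch below partitions two inputs into the same fixed number of buckets with one shared function and then joins bucket i of one input against bucket i of the other. The names `NUM_BUCKETS`, `bucket_of`, `partition`, and `bucketed_join` are hypothetical, and in-memory lists stand in for files. Matching partitions are identified here by bucket index, which sidesteps the open question about how to identify them on disk.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical fixed bucket count, shared by every input


def bucket_of(key, num_buckets=NUM_BUCKETS):
    """Map a join key to a bucket number using a process-independent hash."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % num_buckets


def partition(records, num_buckets=NUM_BUCKETS):
    """Split (key, value) records into num_buckets lists by bucket_of(key)."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in records:
        buckets[bucket_of(key, num_buckets)].append((key, value))
    return buckets


def join_buckets(left_bucket, right_bucket):
    """Join one pair of matching buckets with an in-memory hash index."""
    index = defaultdict(list)
    for key, value in right_bucket:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left_bucket for rv in index.get(key, [])]


def bucketed_join(left, right, num_buckets=NUM_BUCKETS):
    """Join two inputs by pairing bucket i of left with bucket i of right."""
    left_parts = partition(left, num_buckets)
    right_parts = partition(right, num_buckets)
    joined = []
    for i in range(num_buckets):
        # Equal keys always land in the same bucket number on both sides,
        # so no other bucket pair needs to be examined.
        joined.extend(join_buckets(left_parts[i], right_parts[i]))
    return joined
```

Because both inputs use the same function and the same bucket count, each bucket pair can be processed independently, but the degree of parallelism is capped at the number of buckets, which is the flexibility trade-off noted above.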
  
  === Fragment Replicate Join (FRJ) ===
  
