replicated join gets extra job

2013-11-11 Thread Dexin Wang
Hi, I'm running a job like this:

    raw_large = LOAD 'lots_of_files' AS (...);
    raw_filtered = FILTER raw_large BY ...;
    large_table = FOREACH raw_filtered GENERATE f1, f2, f3,;
    joined_1 = JOIN large_table BY (key1) LEFT, config_table_1 BY (key2) USING 'replicated';
    joined_2 = JOIN join1

Re: replicated join gets extra job

2013-11-11 Thread Pradeep Gollakota
Use the ILLUSTRATE or EXPLAIN keywords to look at the details of the physical execution plan. At first glance it doesn't look like you'd need a second job to do the joins, but if you can post the output of ILLUSTRATE/EXPLAIN, we can look into it.

On Mon, Nov 11, 2013 at 4:36 PM, Dexin Wang
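For reference, here is a minimal sketch of how that inspection might look for the first join. The input paths and field schemas below are hypothetical placeholders, since the original script elides them; only the alias names and the join clause come from the post.

```pig
-- Hypothetical paths and schemas; the original script does not show them.
large_table    = LOAD 'large_input' AS (key1:chararray, f1:chararray);
config_table_1 = LOAD 'config_1'    AS (key2:chararray, v1:chararray);

-- Same join shape as in the original post.
joined_1 = JOIN large_table BY (key1) LEFT, config_table_1 BY (key2) USING 'replicated';

-- Print the logical, physical, and MapReduce plans for joined_1
-- without running the job.
EXPLAIN joined_1;

-- Run the pipeline on a small sample and show example rows at each step.
ILLUSTRATE joined_1;
```

The MapReduce plan section of the EXPLAIN output lists each job Pig intends to launch, so that is probably the most useful part to post when asking why an extra job appears.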