Hi folks, I have a somewhat obvious question that needs asking (for my own sake).
Pig can do joins, I realise that. But take for example:

Table_1
----------------------
| ID | fileName  |
| 1  | foo.dat   |
| 2  | bar.dat   |
| 3  | harry.dat |

Table_2
----------------------
| ID | fileName  |
| 1  | tom.dat   |
| 2  | bar.dat   |
| 3  | gamma.dat |

SQL syntax for a conditional select:

    select t1.fileName from Table_1 t1, Table_2 t2 where t1.fileName = t2.fileName

Result
--------
bar.dat

How is such a query represented in Pig?

    tableOne = LOAD 'input1.dat' USING PigStorage() AS (id:int, filename:chararray);
    tableTwo = LOAD 'input2.dat' USING PigStorage() AS (id:int, filename:chararray);
    [Now what??]
    STORE query INTO 'Output.pig' USING PigStorage();

As a bonus question, can anybody tell me if this sort of conditional select query is possible to write in Java MapReduce?

thanks,
Rob Stewart
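
For reference, one possible way to fill in the [Now what??] step is an inner JOIN on the filename column followed by a projection. This is only a minimal sketch, assuming both input files are tab-delimited (the PigStorage default) and match the schemas declared above; the alias `joined` is a hypothetical name introduced here and is not part of the original script.

```pig
-- Load both tables exactly as in the script above.
tableOne = LOAD 'input1.dat' USING PigStorage() AS (id:int, filename:chararray);
tableTwo = LOAD 'input2.dat' USING PigStorage() AS (id:int, filename:chararray);

-- Inner join on filename; 'joined' is a hypothetical alias.
-- Only rows whose filename appears in both tables survive.
joined = JOIN tableOne BY filename, tableTwo BY filename;

-- Project the filename from the first table, mirroring "select t1.fileName".
-- The :: operator disambiguates the two filename columns after the join.
query = FOREACH joined GENERATE tableOne::filename AS filename;

STORE query INTO 'Output.pig' USING PigStorage();
```

With the sample tables above, this should leave the single row bar.dat in the output, matching the SQL result. Like the SQL version, it would emit a filename once per matching pair if either table contained duplicates.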
