Many operators, such as join and group by, are not implemented by a single physical operation. They are also spread through the code base, because each has logical components and physical components. The logical component of join is in org.apache.pig.newplan.logical.relational.LOJoin.java. That gets translated into three physical operators: POLocalRearrange, POPackage, and POForEach. All of the physical operators are in org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.
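
To make the three-step translation concrete, here is a rough conceptual sketch in Python. It is not Pig's code: the function names are made up for illustration and only mimic the role each physical operator plays (tag rows with their key and source input, collect rows per key, then cross the per-input bags). It also assumes a plain inner join on a single key column.

```python
from collections import defaultdict
from itertools import product


def local_rearrange(rows, key_index, input_index):
    """Hypothetical stand-in for the local rearrange step:
    tag each row with its join key and the input it came from."""
    return [(row[key_index], input_index, row) for row in rows]


def package(tagged_rows, num_inputs):
    """Hypothetical stand-in for the package step:
    for each key, collect the rows into one bag per input."""
    groups = defaultdict(lambda: [[] for _ in range(num_inputs)])
    for key, input_index, row in tagged_rows:
        groups[key][input_index].append(row)
    return groups


def foreach_flatten(groups):
    """Hypothetical stand-in for the foreach step:
    cross the bags for each key and flatten each combination.
    A key with an empty bag produces no rows (inner join)."""
    joined = []
    for key in sorted(groups):
        for combo in product(*groups[key]):
            joined.append(tuple(value for row in combo for value in row))
    return joined


def join(left, right, left_key=0, right_key=0):
    """Chain the three steps for two inputs."""
    tagged = (local_rearrange(left, left_key, 0)
              + local_rearrange(right, right_key, 1))
    return foreach_flatten(package(tagged, 2))


if __name__ == "__main__":
    users = [(1, "alice"), (2, "bob")]
    orders = [(1, "book"), (1, "pen"), (3, "ink")]
    for row in join(users, orders):
        print(row)
```

In this sketch, group by would reuse the first two steps on a single input and skip the flattening cross product, which is one way to see why these operators are built from shared pieces rather than one class per keyword.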

Alan.

On Oct 5, 2012, at 11:01 AM, Brian Stempin wrote:

> Thanks Russell -- That's really useful.
>
> Just for kicks and giggles: Where would I look in the code base to see how
> the JOIN keyword is implemented? I've found the built in functions, but not
> the keywords (JOIN, GROUP, etc). Perhaps that would give me some hints.
> Perhaps it'll show me that a UDF might not be the best option for my set of
> problems.
>
> Thanks once again,
> Brian
