Dmitriy V. Ryaboy commented on PIG-845:
Alan, Ashutosh -- maybe I am misunderstanding where null keys come from in the
Indexer. I assumed this was due to the processing that happens in the plan the
indexer deserializes and attaches to its POLocalRearrange.
With regard to errors, I was referring to this:
int errCode = 2034;
String msg = "Error compiling operator " +
throw new MRCompilerException(msg, errCode, PigException.BUG, e);
The only central place for error codes seems to be the Wiki. A class with a
bunch of static final error code constants would be a better place.
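To illustrate the idea, here is a minimal sketch of such a constants class. The class name ErrorCodes and the constant name are hypothetical, not existing Pig code; only the value 2034 comes from the snippet quoted above.

```java
/**
 * Hypothetical sketch: a single home for error codes, so they are
 * documented in the source tree rather than only on the Wiki.
 * Neither the class name nor the constant name exists in Pig.
 */
final class ErrorCodes {
    // Constants holder only; never instantiated.
    private ErrorCodes() {}

    /** Used with the "Error compiling operator" message in the MR compiler. */
    static final int ERROR_COMPILING_OPERATOR = 2034;
}
```

A call site would then read `int errCode = ErrorCodes.ERROR_COMPILING_OPERATOR;` instead of using a bare literal, and a search for the constant would find every place the code is raised.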
Ashutosh, I completely disagree with you on changing all tests to run in MR
mode. The tests are already impossible to run on a laptop (people, myself
included, actually submit patches to jira just to see if tests pass). Running
in MR mode will incur significant overhead per test. Only things that actually
rely on the MR bits should be tested in MR (and use mock objects if possible..
there's been some advancement on that front in Hadoop 20, I haven't looked at
Would love to see a more efficient indexing MR job (which will reduce load on
the JT, keep schedules less busy, and incur less overhead in task startups by
requiring fewer tasks), but perhaps not before 0.4 is out the door with
existing functionality. Just to be clear, I don't think more than 1 record per
block is necessary, but more than one block per task would probably be a good
Any thoughts on how to choose which of two relations to index? We get locality
on the non-indexed relation, but not on the indexed one, which probably throws
a kink in the normal way of thinking about this.
> PERFORMANCE: Merge Join
> Key: PIG-845
> URL: https://issues.apache.org/jira/browse/PIG-845
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Ashutosh Chauhan
> Attachments: merge-join.patch
> This join would work if the data for both tables is sorted on the join key.