Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by OlgaN:
http://wiki.apache.org/pig/ProposedProjects

------------------------------------------------------------------------------
  || Execution || Pig currently executes scripts by building a pipeline of 
pre-built operators and running data through those operators in map reduce 
jobs.  We need to investigate instead having Pig generate Java code specific to 
a job, then compiling that code and using it to run the map reduce jobs. || 
|| || Many conference attendees || gates ||
  || Language || Currently only DISTINCT, ORDER BY, and FILTER are allowed 
inside FOREACH.  All operators should be allowed in FOREACH. (Limit is being 
worked on [https://issues.apache.org/jira/browse/PIG-741 741]) || || || gates || 
||
  || Optimization || Speed up comparison of tuples during shuffle for ORDER BY 
|| [https://issues.apache.org/jira/browse/PIG-659 659] || || olgan || ||
- || Optimization || Order by should be changed to not use POPackage to put all 
of the tuples in a bag on the reduce side, as the bag is just immediately 
flattened.  It can instead work like join does for the last input in the join. 
|| || || gates || ||
+ || Optimization || Order by should be changed to not use POPackage to put all 
of the tuples in a bag on the reduce side, as the bag is just immediately 
flattened.  It can instead work like join does for the last input in the join. 
|| [https://issues.apache.org/jira/browse/PIG-802 802] || || gates || olgan ||
  || Optimization || Often in a Pig script that produces a chain of MR jobs, 
the map phases of the 2nd and subsequent jobs do very little.  What little 
they do should be pushed into the preceding reduce and the map replaced by the 
identity mapper.  Initial tests showed that the identity mapper was 50% faster 
than using a Pig mapper (because Pig uses the loader to parse out tuples even 
if the map itself is empty). || [https://issues.apache.org/jira/browse/PIG-480 
480] || || olgan || gates ||
  || Optimization || Use hand crafted calls to do string to integer or float 
conversions.  Initial tests showed these could be done about 8x faster than 
String.toInteger() and String.toFloat(). || 
[https://issues.apache.org/jira/browse/PIG-482 482] || || olgan || gates ||
  || Optimization || Currently Pig always samples for an ORDER BY to determine 
how to partition, and then runs another job to do the sort.  For small enough 
inputs, it should just sort with a single reducer. || 
[https://issues.apache.org/jira/browse/PIG-483 483] || || olgan || ||
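
To make the PIG-482 row above more concrete, here is a minimal sketch of what a 
hand-crafted string-to-integer conversion could look like.  It is not the code 
from the JIRA or from Pig itself; the class name FastParse and its method are 
hypothetical and exist only for illustration.  The sketch shows the general 
technique only (a single pass over the characters with explicit overflow 
checks) and makes no claim about reproducing the 8x figure quoted in the table.

```java
// Hypothetical helper for illustration only; not part of Pig.
final class FastParse {
    private FastParse() {}

    /**
     * Parses an optionally signed decimal int in one pass over the characters.
     * Throws NumberFormatException on empty input, non-digits, or overflow.
     */
    static int parseInt(CharSequence s) {
        int len = s.length();
        if (len == 0) {
            throw new NumberFormatException("empty input");
        }
        int i = 0;
        boolean negative = false;
        char first = s.charAt(0);
        if (first == '-' || first == '+') {
            negative = (first == '-');
            i = 1;
            if (len == 1) {
                throw new NumberFormatException("sign without digits");
            }
        }
        // Accumulate as a negative value so Integer.MIN_VALUE is representable.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int multLimit = limit / 10;
        int result = 0;
        for (; i < len; i++) {
            int digit = s.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            if (result < multLimit) {
                throw new NumberFormatException("overflow");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("overflow");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }
}
```

Taking a CharSequence rather than a String is a design choice of this sketch: 
it lets a caller pass a view over an existing buffer instead of allocating a 
new String for every field.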
