We are going roughly the same route as what Ciemo described, but our decisions will only be as good as the statistics. Currently the stats object contains the file size but no compression information, so bz files will look artificially small. Perhaps we should specify "uncompressed file size" in the spec.
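
To make the concern concrete, here is a rough sketch of a size-based reducer estimate. This is not Pig code: the function name, the 1 GB-per-reducer target, the cap of 999, and the idea of a per-file compression ratio are all assumptions made up for illustration.

```python
import math

def estimate_reducers(file_size_bytes, compression_ratio=1.0,
                      bytes_per_reducer=1_000_000_000, max_reducers=999):
    """Hypothetical estimator: pick a reducer count from input size.

    compression_ratio is an assumed (uncompressed / on-disk) factor.
    With no compression info in the stats it stays at 1.0, which is
    exactly the case where compressed input is underestimated.
    """
    uncompressed_bytes = file_size_bytes * compression_ratio
    wanted = math.ceil(uncompressed_bytes / bytes_per_reducer)
    # Always use at least one reducer, and never exceed the cap.
    return max(1, min(max_reducers, wanted))
```

With only the on-disk size, a 2 GB bz file gets the same estimate as 2 GB of plain text, even though it may expand to several times that. If the stats carried an uncompressed size (or a ratio, as assumed here), the estimate would scale accordingly.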

-----Original Message-----
From: "Santhosh Srinivasan" <[email protected]>
To: [email protected]
Cc: "David (Ciemo) Ciemiewicz" <[email protected]>
Sent: 11/12/2009 12:57 PM
Subject: RE: Could pig dynamic change the reduce number according the mapper task number ?

I was hoping that the cost based optimizer being developed by Ashutosh and Dmitriy will address this issue.

Santhosh

-----Original Message-----
From: Alan Gates [mailto:[email protected]]
Sent: Thursday, November 12, 2009 8:26 AM
To: [email protected]
Cc: David (Ciemo) Ciemiewicz
Subject: Re: Could pig dynamic change the reduce number according the mapper task number ?

I agree that it would be very useful to have a dynamic number of reducers. However, I'm not s [truncated by sender]
