Hi Alan,

Your language proposal sounds good.

For the implementation proposal, it depends on which kind of sorting we are talking about: sorting a whole table, or sorting the bags nested within a foreach. The two are implemented differently. The former uses Hadoop, while the latter is done entirely in our own code.

Your implementation proposal looks good for the latter. One question, though: why create a new type of eval spec? We could change the sortdistinct spec to take an optional comparator argument.
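To make the optional-comparator idea concrete, here is a rough sketch of what I have in mind. SortSpecSketch and sortBag are made-up names for illustration, not the actual eval spec classes in our code, and a plain List stands in for a bag.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the existing sort/distinct eval spec; not Pig's real class.
final class SortSpecSketch<T extends Comparable<T>> {
    // null means "no user comparator given", so fall back to the natural ordering.
    private final Comparator<T> userComparator;

    // Existing behaviour: no comparator argument.
    SortSpecSketch() {
        this(null);
    }

    // New behaviour: the caller supplies an ordering.
    SortSpecSketch(Comparator<T> userComparator) {
        this.userComparator = userComparator;
    }

    // Sorts a copy of the nested bag and leaves the input untouched.
    List<T> sortBag(List<T> bag) {
        List<T> sorted = new ArrayList<>(bag);
        Comparator<T> cmp = (userComparator != null)
                ? userComparator
                : Comparator.<T>naturalOrder();
        sorted.sort(cmp);
        return sorted;
    }
}
```

With this shape, existing callers that pass no comparator keep their current behaviour, and only the new syntax would need to reach the second constructor.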

For sorting the outer bag, we need to find a way to pass the user-defined comparator to Hadoop. Can someone more familiar with Hadoop internals shed some light on this? Right now, it seems to me the only way would be to generate a class that wraps the user-defined comparator, because Hadoop uses the compareTo method of the keyClass.
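Roughly, the generated key class could look like the sketch below. Everything here is hypothetical: LongestFirst stands in for whatever comparator the user writes, LongestFirstKey is the class we would generate for it, and a String stands in for the tuple. I have left out the Hadoop serialization interface so the example stands alone, so treat this as an illustration of the delegation only.

```java
import java.util.Comparator;

// Hypothetical user-supplied ordering: longer strings first, ties broken alphabetically.
final class LongestFirst implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        int byLength = Integer.compare(b.length(), a.length());
        return byLength != 0 ? byLength : a.compareTo(b);
    }
}

// Hypothetical generated key class: its compareTo delegates to the user's comparator.
final class LongestFirstKey implements Comparable<LongestFirstKey> {
    // The comparator is fixed in the class itself, since the framework
    // only ever sees the key class, never a comparator instance.
    private static final Comparator<String> USER_COMPARATOR = new LongestFirst();

    String value; // stand-in payload; a real key would carry a tuple

    // No-arg constructor, on the assumption that keys are created reflectively.
    LongestFirstKey() {
    }

    LongestFirstKey(String value) {
        this.value = value;
    }

    @Override
    public int compareTo(LongestFirstKey other) {
        return USER_COMPARATOR.compare(value, other.value);
    }
}
```

The downside is one generated class per user comparator, which is why I would like to hear whether there is a cleaner hook.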

Utkarsh



On Nov 2, 2007, at 4:05 PM, Alan Gates wrote:

All,

I've posted a proposal at http://wiki.apache.org/pig/UserDefinedOrdering for how to add user defined ordering to pig. This is being urgently requested by some of our users. Utkarsh, please review this and make sure I properly understood how to hook things together in the logical and physical plans. I'm not 100% confident what I proposed will work in the current framework.

Alan.
