[ https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753828#action_12753828 ]
Dmitriy V. Ryaboy commented on PIG-953:
---------------------------------------

bq. Pradeep: Pig only guarantees order with limit following order - for any other relational operator following order there are no guarantees. Today it is true that filter or a column pruning foreach would also preserve order but this can change if needed in the future. There is explicit code to ensure order-limit combination works by preserving order - there is no such explicit check for other operators (keeping it open for change in the future)

That actually tells me that an orderPreserving property on a LogicalOperator is a really good idea. That way we can set it to true on all operators that are order-preserving at the moment (limit, filter, column-pruning foreach), without committing to maintain that contract forever. If filter starts changing order, the patch will simply have to include a change to set orderPreserving to false in POFilter, and everything will work automagically.

> Enable merge join in pig to work with loaders and store functions which can internally index sorted data
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-953
>                 URL: https://issues.apache.org/jira/browse/PIG-953
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.3.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>         Attachments: PIG-953.patch
>
>
> Currently merge join implementation in pig includes construction of an index
> on sorted data and use of that index to seek into the "right input" to
> efficiently perform the join operation. Some loaders (notably the zebra
> loader) internally implement an index on sorted data and can perform this
> seek efficiently using their index. So the use of the index needs to be
> abstracted in such a way that when the loader supports indexing, pig uses it
> (indirectly through the loader) and does not construct an index.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
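
The orderPreserving idea from the comment above could look roughly like the sketch below. It is only an illustration under stated assumptions: the classes Operator, Limit, Filter and Group and the methods isOrderPreserving() and orderSurvives() are hypothetical stand-ins, not Pig's actual LogicalOperator or POFilter API. Each operator declares whether it currently keeps its input order, and a planner-side check asks every operator that follows the sort.

```java
import java.util.List;

public class OrderPreservingSketch {

    // Hypothetical stand-in for a plan operator; not Pig's real class.
    static abstract class Operator {
        // Default to false: an operator promises nothing about order
        // unless it explicitly opts in.
        boolean isOrderPreserving() {
            return false;
        }
    }

    // Operators that keep input order today opt in. If one of them
    // stops doing so, only its own override has to change.
    static class Limit extends Operator {
        @Override
        boolean isOrderPreserving() {
            return true;
        }
    }

    static class Filter extends Operator {
        @Override
        boolean isOrderPreserving() {
            return true;
        }
    }

    // Assumed here to reorder its input, so it keeps the default.
    static class Group extends Operator {
    }

    // True only if every operator that follows the sort keeps order.
    static boolean orderSurvives(List<Operator> afterOrderBy) {
        for (Operator op : afterOrderBy) {
            if (!op.isOrderPreserving()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(orderSurvives(List.of(new Filter(), new Limit())));
        System.out.println(orderSurvives(List.of(new Filter(), new Group())));
    }
}
```

With this shape, a caller such as the merge join check would ask orderSurvives() instead of hard-coding a list of operator types, which is what lets the contract change later without touching the callers.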