[ https://issues.apache.org/jira/browse/PHOENIX-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269517#comment-14269517 ]
Gabriel Reid commented on PHOENIX-1561:
---------------------------------------

{quote}If it's possible to catch some of these conditions and alert the user that's great, but documenting them is also sufficient because they are only used when they are specified in the pig script.{quote}

FWIW, I'd definitely be in favor of catching these situations (e.g. throwing an exception in {{getSplitComparable()}} and/or {{ensureAllKeyInstancesInSameSplit()}}) over just documenting them.

The semantics of HBase row keys (i.e. ordering within a region, etc) are pretty well known, whereas for Phoenix they aren't necessarily as well known. The row key contains all primary key values, each PK column can have its own ordering rules, and there are potentially two "hidden" values within the row key as well (tenant id and salt bucket).

> Pig optimized joins
> -------------------
>
>                 Key: PHOENIX-1561
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1561
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.2
>            Reporter: Brian Johnson
>            Assignee: Brian Johnson
>         Attachments: 0001-PHOENIX-1561-Optimizing-Joins.patch, patch
>
>
> PhoenixHBaseLoader should implement both OrderedLoadFunc and
> CollectableLoadFunc just like HBaseStorage. There is nothing special that
> needs to be done other than implementing a single method. As in HBaseStorage,
> it is up to the user to ensure that the required constraints are not
> violated.
> {code:java}
>     public void ensureAllKeyInstancesInSameSplit() throws IOException {
>         /**
>          * no-op because hbase keys are unique
>          * This will also work with things like DelimitedKeyPrefixRegionSplitPolicy
>          * if you need a partial key match to be included in the split
>          */
>         LOG.debug("ensureAllKeyInstancesInSameSplit");
>     }
> {code}
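
To make the "throw instead of document" suggestion from the comment concrete, here is a minimal standalone sketch of such a guard. It is not taken from the attached patch and deliberately uses no Phoenix or Pig APIs: {{SplitGuard}}, {{TableInfo}} and its three flags are hypothetical names, and which conditions should actually be rejected is an assumption based only on the properties mentioned in the comment (salt bucket, tenant id, per-column ordering).

```java
import java.io.IOException;

/** Hypothetical helper for illustration only; not part of Phoenix or Pig. */
final class SplitGuard {

    /** Hypothetical summary of the table properties called out in the comment. */
    static final class TableInfo {
        final boolean salted;          // row key is prefixed with a salt bucket byte
        final boolean multiTenant;     // row key carries a tenant id
        final boolean hasDescPkColumn; // at least one PK column is sorted descending

        TableInfo(boolean salted, boolean multiTenant, boolean hasDescPkColumn) {
            this.salted = salted;
            this.multiTenant = multiTenant;
            this.hasDescPkColumn = hasDescPkColumn;
        }
    }

    /**
     * Returns a description of the first property that might break the
     * ordering assumptions of an optimized join, or null if none is set.
     * Assumption: all three properties are treated as unsafe here.
     */
    static String firstProblem(TableInfo table) {
        if (table.salted) {
            return "table is salted";
        }
        if (table.hasDescPkColumn) {
            return "a primary key column uses descending order";
        }
        if (table.multiTenant) {
            return "row key contains a tenant id";
        }
        return null;
    }

    /** Fails fast instead of leaving the constraint to documentation. */
    static void ensureSafeForOptimizedJoin(TableInfo table) throws IOException {
        String problem = firstProblem(table);
        if (problem != null) {
            throw new IOException("Optimized join not supported: " + problem);
        }
    }
}
```

A loader could call a check like this at the top of {{ensureAllKeyInstancesInSameSplit()}} or {{getSplitComparable()}}, so a script that requests a merge or collected join against an unsuitable table fails with a clear message rather than producing wrong results.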