[ 
https://issues.apache.org/jira/browse/PHOENIX-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269334#comment-14269334
 ] 

Brian Johnson commented on PHOENIX-1561:
----------------------------------------

Generally, with most of these joins it's up to the end user to ensure that the 
required conditions hold. For instance, it's possible to use IndexedStorage 
for a join that requires OrderedLoadFunc without sorting the data first. This 
would of course produce incorrect results, but there is nothing to stop you 
from doing it. For CollectableLoadFunc, the only key you can use is the rowkey 
(since scans are per-region) or some other key that is also unique per region, 
i.e. no key value shows up in more than one region. If it's possible to catch 
some of these conditions and alert the user, that's great, but documenting 
them is also sufficient, because these code paths are only used when they are 
explicitly requested in the Pig script. Those exceptions could be thrown from 
the interface methods, because Pig only calls those when it's doing an 
optimized join.
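
To make the last point concrete, here is a rough sketch of what throwing from 
the interface method might look like. It is not taken from the attached patch. 
The local interface is a stand-in that mirrors the single method of Pig's 
CollectableLoadFunc so the snippet compiles without Pig on the classpath, and 
the class name and the joinKeyIsRowKey flag are hypothetical. How a real 
loader would find out which key the script groups on is left open here.

```java
import java.io.IOException;

// Stand-in for org.apache.pig.CollectableLoadFunc (same single method),
// declared locally so this sketch compiles on its own.
interface CollectableLoadFuncSketch {
    void ensureAllKeyInstancesInSameSplit() throws IOException;
}

// Hypothetical loader fragment; not the actual PhoenixHBaseLoader.
class GuardedLoaderSketch implements CollectableLoadFuncSketch {

    // Assumption: the loader has somehow been told whether the key used by
    // the script is the rowkey (or another key that never spans regions).
    private final boolean joinKeyIsRowKey;

    GuardedLoaderSketch(boolean joinKeyIsRowKey) {
        this.joinKeyIsRowKey = joinKeyIsRowKey;
    }

    @Override
    public void ensureAllKeyInstancesInSameSplit() throws IOException {
        // Only reached when the script asks for the optimized code path,
        // so ordinary loads are unaffected by this check.
        if (!joinKeyIsRowKey) {
            throw new IOException(
                "Key is not unique per region; instances of one key could "
                + "end up in different splits");
        }
        // Otherwise nothing to do: each rowkey lives in exactly one region.
    }
}
```

With the flag set to true the method stays a no-op, which matches the 
behaviour proposed in the issue description. The check only adds a fail-fast 
path for the case that would otherwise silently produce wrong results.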

> Pig optimized joins
> -------------------
>
>                 Key: PHOENIX-1561
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1561
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.2
>            Reporter: Brian Johnson
>            Assignee: Brian Johnson
>         Attachments: 0001-PHOENIX-1561-Optimizing-Joins.patch, patch
>
>
> PhoenixHBaseLoader should implement both OrderedLoadFunc and 
> CollectableLoadFunc just like HBaseStorage. There is nothing special that 
> needs to be done other than implementing a single method. As in HBaseStorage, 
> it is up to the user to ensure that the required constraints are not 
> violated. 
> {code:java}
>     public void ensureAllKeyInstancesInSameSplit() throws IOException {
>         /*
>          * No-op because HBase keys are unique.
>          * This will also work with things like DelimitedKeyPrefixRegionSplitPolicy
>          * if you need a partial key match to be included in the split.
>          */
>         LOG.debug("ensureAllKeyInstancesInSameSplit");
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
