[ 
https://issues.apache.org/jira/browse/PHOENIX-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157453#comment-16157453
 ] 

ASF GitHub Bot commented on PHOENIX-4010:
-----------------------------------------

Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/268#discussion_r137629055
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/BaseJoinIT.java ---
    @@ -30,14 +30,19 @@
     import java.util.Map;
     import java.util.regex.Pattern;
     
    +import org.apache.hadoop.hbase.HConstants;
    +import org.apache.phoenix.cache.ServerCacheClient;
    +import org.apache.phoenix.end2end.HashJoinCacheIT.InvalidateHashCacheRandomly;
    +import org.apache.phoenix.util.ReadOnlyProps;
     import org.apache.phoenix.util.SchemaUtil;
     import org.apache.phoenix.util.StringUtil;
     import org.junit.Before;
    +import org.junit.BeforeClass;
     
     import com.google.common.collect.ImmutableMap;
     import com.google.common.collect.Maps;
     
    -public abstract class BaseJoinIT extends ParallelStatsDisabledIT {
    +public abstract class BaseJoinIT extends BaseUniqueNamesOwnClusterIT {
    --- End diff --
    
    Can we keep BaseJoinIT derived from ParallelStatsDisabledIT? Otherwise,
    many more mini clusters get created, which keeps increasing our
    end-to-end test run time.
    
    If the reason is that you're adding InvalidateHashCacheRandomly, just
    add it dynamically to the table created in your test instead. To prevent
    flapping, you need to poll until the descriptor matches what you've
    changed. See PartialIndexRebuilderIT.addWriteFailingCoprocessor() for an
    example you can generalize and/or copy/paste.


> Hash Join cache may not be send to all regionservers when we have stale HBase 
> meta cache
> ----------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4010
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4010
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>             Fix For: 4.12.0
>
>         Attachments: PHOENIX-4010.patch, PHOENIX-4010_v1.patch, 
> PHOENIX-4010_v2.patch, PHOENIX-4010_v2_rebased_1.patch, 
> PHOENIX-4010_v2_rebased.patch
>
>
>  If the region locations changed and our HBase meta cache is not updated then 
> we might not be sending hash join cache to all region servers hosting the 
> regions.
> ConnectionQueryServicesImpl#getAllTableRegions
> {code}
> boolean reload = false;
>         while (true) {
>             try {
>                 // We could surface the package protected HConnectionImplementation.getNumberOfCachedRegionLocations
>                 // to get the sizing info we need, but this would require a new class in the same package and a cast
>                 // to this implementation class, so it's probably not worth it.
>                 List<HRegionLocation> locations = Lists.newArrayList();
>                 byte[] currentKey = HConstants.EMPTY_START_ROW;
>                 do {
>                     HRegionLocation regionLocation = connection.getRegionLocation(
>                             TableName.valueOf(tableName), currentKey, reload);
>                     locations.add(regionLocation);
>                     currentKey = regionLocation.getRegionInfo().getEndKey();
>                 } while (!Bytes.equals(currentKey, HConstants.EMPTY_END_ROW));
>                 return locations;
> {code}
> Skipping duplicate servers in ServerCacheClient#addServerCache
> {code}
> List<HRegionLocation> locations = services.getAllTableRegions(cacheUsingTable.getPhysicalName().getBytes());
>             int nRegions = locations.size();
>
> .....
>  if (!servers.contains(entry) &&
>                         keyRanges.intersectRegion(regionStartKey, regionEndKey,
>                                 cacheUsingTable.getIndexType() == IndexType.LOCAL)) {
>                     // Call RPC once per server
>                     servers.add(entry);
> {code}
> For example, table 'T' has two regions, R1 and R2, originally hosted on
> region server RS1. While the Phoenix/HBase connection is still active, R2
> moves to RS2, but the stale meta cache still returns the old region
> locations, i.e. R1 and R2 on RS1. When we start copying the hash table, we
> copy it for R1 and skip R2 because both appear to be hosted on the same
> region server. The query on the table then fails because it cannot find
> the hash table cache on RS2 when processing region R2.
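
The R1/R2 scenario described in the issue can be reproduced with a toy model. This is a sketch only: regions and servers are plain strings rather than HBase types, and `StaleMetaCacheDemo` and its methods are hypothetical names introduced for illustration. It assumes, as the excerpt suggests, that the cache is sent once per distinct server found in the cached region locations.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model for illustration; not Phoenix code.
final class StaleMetaCacheDemo {
    private StaleMetaCacheDemo() {}

    /** Servers that receive the hash cache: one send per distinct server. */
    static Set<String> serversReceivingCache(Map<String, String> cachedRegionToServer) {
        Set<String> servers = new LinkedHashSet<>();
        for (String server : cachedRegionToServer.values()) {
            // A server that is already in the set is skipped, mirroring the
            // "call RPC once per server" check in the excerpt.
            servers.add(server);
        }
        return servers;
    }

    /** Regions whose actual host never received the cache. */
    static List<String> regionsMissingCache(Map<String, String> cachedRegionToServer,
                                            Map<String, String> actualRegionToServer) {
        Set<String> sentTo = serversReceivingCache(cachedRegionToServer);
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> e : actualRegionToServer.entrySet()) {
            if (!sentTo.contains(e.getValue())) {
                missing.add(e.getKey());
            }
        }
        return missing;
    }
}
```

With a cached view of R1 and R2 both on RS1 and an actual layout of R1 on RS1 and R2 on RS2, `regionsMissingCache` reports R2, which corresponds to the failing region in the description.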



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
