Ashish,

Here is a bunch of analysis on that scenario.
-------------
No redundancy
-------------

With partitioned regions, no redundancy and no filter, the function is sent to every member that hosts buckets. In that case, you see this kind of behavior (I have 3 servers in my test).

The argument containing the keys is sent to every member. In this case, I have 24 keys:

keysSize=24; keys=[44, 67, 59, 49, 162, 261, 284, 473, 397, 475, 376, 387, 101, 157, 366, 301, 469, 403, 427, 70, 229, 108, 50, 85]

When you call PartitionRegionHelper.getLocalData or getLocalPrimaryData, you get back a LocalDataSet. Calling get or getAll on a LocalDataSet returns null for any key whose bucket is not in that LocalDataSet. That keeps all the get calls local, but it also puts a bunch of nulls in the result. If I print the LocalDataSet and the value of getAll in all three servers, I see the 24 non-null results spread across the servers (10 + 8 + 6).

Server1
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[2, 3, 7, 10, 11, 15, 16, 19, 20, 25, 26, 29, 31, 34, 40, 41, 46, 49, 52, 56, 57, 58, 60, 65, 67, 68, 71, 75, 78, 82, 84, 87, 92, 93, 102, 104, 110]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=10; localDataKeysAndValues={44=44, 67=67, 59=null, 49=49, 162=162, 261=null, 284=null, 473=473, 397=null, 475=null, 376=null, 387=387, 101=null, 157=157, 366=366, 301=null, 469=null, 403=null, 427=427, 70=70, 229=null, 108=null, 50=null, 85=null}; nonNullLocalDataKeysAndValues={44=44, 473=473, 67=67, 387=387, 157=157, 366=366, 49=49, 427=427, 70=70, 162=162}

Server2
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[0, 5, 6, 8, 13, 18, 21, 24, 27, 32, 35, 36, 37, 38, 43, 45, 48, 51, 55, 61, 64, 66, 69, 72, 74, 79, 80, 83, 86, 91, 94, 96, 99, 100, 105, 106, 108, 112]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=8; localDataKeysAndValues={44=null, 67=null, 59=59, 49=null, 162=null, 261=null, 284=284, 473=null, 397=397, 475=null, 376=null, 387=null, 101=101, 157=null, 366=null, 301=301, 469=null, 403=403, 427=null, 70=null, 229=null, 108=108, 50=null, 85=85}; nonNullLocalDataKeysAndValues={397=397, 101=101, 59=59, 301=301, 403=403, 108=108, 85=85, 284=284}

Server3
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[1, 4, 9, 12, 14, 17, 22, 23, 28, 30, 33, 39, 42, 44, 47, 50, 53, 54, 59, 62, 63, 70, 73, 76, 77, 81, 85, 88, 89, 90, 95, 97, 98, 101, 103, 107, 109, 111]]
localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=6; localDataKeysAndValues={44=null, 67=null, 59=null, 49=null, 162=null, 261=261, 284=null, 473=null, 397=null, 475=475, 376=376, 387=null, 101=null, 157=null, 366=null, 301=null, 469=469, 403=null, 427=null, 70=null, 229=229, 108=null, 50=50, 85=null}; nonNullLocalDataKeysAndValues={475=475, 376=376, 469=469, 229=229, 50=50, 261=261}
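For reference, the pattern being described looks roughly like this minimal sketch (not my exact test code; the class name LocalGetAllFunction and the key type are just illustrative, and "data-2" is the other region from the output above):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;

public class LocalGetAllFunction implements Function<List<Object>> {

  @Override
  public void execute(FunctionContext<List<Object>> context) {
    // The keys arrive as the function argument, so every member sees all 24 of them.
    List<Object> keys = context.getArguments();

    Region<Object, Object> region = context.getCache().getRegion("data-2");

    // getLocalData wraps the region in a LocalDataSet; getAll only consults the
    // buckets hosted on this member, so keys in remote buckets come back as null.
    Region<Object, Object> localData = PartitionRegionHelper.getLocalData(region);
    Map<Object, Object> localDataKeysAndValues = localData.getAll(keys);

    context.getResultSender().lastResult(new HashMap<>(localDataKeysAndValues));
  }

  @Override
  public String getId() {
    return getClass().getSimpleName();
  }
}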
---------------------------------
Redundancy optimizeForWrite false
---------------------------------

If I change both regions to be redundant, I see very different behavior. First, with optimizeForWrite returning false (the default), only 2 of the 3 servers invoke the function. In the optimizeForWrite false case, the function is sent to the smallest set of servers whose buckets cover all the buckets.

keysSize=24; keys=[390, 292, 370, 261, 250, 273, 130, 460, 274, 452, 123, 388, 113, 268, 455, 400, 159, 435, 314, 429, 51, 419, 84, 43]

As you saw, getLocalData will produce duplicates, since some of the buckets overlap between the servers. In that case, you'll see all the data. If you call getLocalPrimaryData instead, you probably won't see all the data, since some of the primaries will be on the server that doesn't execute the function. You can see below that the local data sets return 35 entries for the 24 keys, while the primary data sets return only 15.

Server1
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[0, 2, 3, 4, 7, 9, 10, 12, 15, 16, 17, 19, 21, 22, 23, 24, 26, 28, 29, 31, 32, 33, 34, 36, 37, 38, 40, 41, 43, 45, 46, 47, 48, 49, 51, 53, 54, 56, 58, 61, 62, 63, 66, 67, 68, 69, 70, 72, 74, 75, 76, 78, 80, 81, 82, 84, 86, 87, 88, 89, 90, 91, 93, 94, 95, 97, 98, 99, 100, 101, 103, 105, 106, 110, 111]]
nonNullLocalDataKeysAndValuesSize=19; nonNullLocalDataKeysAndValues={390=390, 292=292, 261=261, 250=250, 273=273, 130=130, 460=460, 274=274, 452=452, 123=123, 388=388, 113=113, 400=400, 159=159, 435=435, 429=429, 51=51, 84=84, 43=43}
primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[0, 66, 4, 68, 70, 7, 10, 74, 12, 76, 78, 17, 82, 84, 21, 87, 24, 90, 28, 95, 32, 33, 99, 37, 101, 38, 103, 106, 43, 45, 46, 110, 49, 51, 54, 58, 62, 63]]
nonNullPrimaryDataKeysAndValuesSize=7; nonNullPrimaryDataKeysAndValues={452=452, 388=388, 435=435, 429=429, 51=51, 250=250, 274=274}

Server2
-------
localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[0, 1, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 16, 18, 20, 21, 23, 25, 27, 28, 29, 30, 33, 35, 36, 39, 40, 41, 42, 44, 46, 49, 50, 52, 53, 55, 56, 57, 59, 60, 61, 64, 65, 66, 67, 69, 71, 72, 73, 75, 76, 77, 78, 79, 81, 82, 83, 85, 87, 88, 91, 92, 93, 96, 97, 98, 101, 102, 103, 104, 105, 107, 108, 109, 111, 112]]
nonNullLocalDataKeysAndValuesSize=16; nonNullLocalDataKeysAndValues={370=370, 261=261, 250=250, 130=130, 460=460, 274=274, 388=388, 113=113, 268=268, 455=455, 400=400, 435=435, 314=314, 419=419, 84=84, 43=43}
primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION ;bucketIds=[64, 1, 67, 5, 69, 72, 9, 11, 75, 15, 79, 16, 81, 18, 83, 20, 23, 88, 91, 29, 93, 30, 97, 98, 35, 36, 102, 40, 41, 105, 108, 109, 111, 50, 53, 56, 61]]
nonNullPrimaryDataKeysAndValuesSize=8; nonNullPrimaryDataKeysAndValues={113=113, 400=400, 419=419, 84=84, 261=261, 130=130, 460=460, 43=43}
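For reference, which members get the function is controlled by the Function's optimizeForWrite method. The next section uses a variant that flips it to true and reads the primary data, roughly like this sketch (again just illustrative, not my exact test code):

import java.util.HashMap;
import java.util.List;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;

public class PrimaryGetAllFunction implements Function<List<Object>> {

  @Override
  public boolean optimizeForWrite() {
    // true routes the function to the members hosting the primary buckets,
    // instead of the smallest set of members that covers all the buckets.
    return true;
  }

  @Override
  public void execute(FunctionContext<List<Object>> context) {
    List<Object> keys = context.getArguments();
    Region<Object, Object> region = context.getCache().getRegion("data-2");

    // getLocalPrimaryData only sees the buckets this member is primary for, so
    // each key is returned non-null by exactly one member.
    Region<Object, Object> primaryData = PartitionRegionHelper.getLocalPrimaryData(region);
    context.getResultSender().lastResult(new HashMap<>(primaryData.getAll(keys)));
  }

  @Override
  public String getId() {
    return getClass().getSimpleName();
  }
}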
--------------------------------
Redundancy optimizeForWrite true
--------------------------------

With optimizeForWrite returning true, I see exactly 24 results. That's because the function is executed on every member, and each key is primary on only one of them. The approach still won't work, though, if you have servers with no primaries.

keysSize=24; keys=[13, 58, 25, 181, 491, 183, 282, 173, 294, 495, 122, 365, 222, 486, 476, 146, 236, 139, 70, 449, 60, 51, 63, 43]

nonNullPrimaryDataKeysAndValuesSize=8; nonNullPrimaryDataKeysAndValues={365=365, 146=146, 236=236, 139=139, 491=491, 183=183, 173=173, 43=43}
nonNullPrimaryDataKeysAndValuesSize=7; nonNullPrimaryDataKeysAndValues={13=13, 222=222, 449=449, 282=282, 51=51, 294=294, 63=63}
nonNullPrimaryDataKeysAndValuesSize=9; nonNullPrimaryDataKeysAndValues={495=495, 122=122, 486=486, 58=58, 25=25, 476=476, 70=70, 181=181, 60=60}

-------
Options
-------

There are a few things you could look into:

1. Plan for duplicates by defining a custom ResultCollector that filters them out.
2. Co-locate your regions, which means the same buckets (and the same primaries) will be on the same servers. Then use a filter instead of an argument and return true for optimizeForWrite. In this case, it shouldn't matter how you get the other region. (A rough sketch follows the output below.)
3. If you can't co-locate your regions, don't use either getLocalData or getLocalPrimaryData. Still use a filter, but just use the region directly in that case. Some of the gets will be remote.

The result of option 2 on three servers with 24 keys looks like below. The keys are split among the servers, region.getAll is entirely local, and all the values are returned.

keysSize=9; keys=[222, 179, 476, 147, 346, 259, 128, 81, 153]
regionKeysAndValues=9; regionKeysAndValues={222=222, 179=179, 476=476, 147=147, 346=346, 259=259, 128=128, 81=81, 153=153}

keysSize=9; keys=[133, 45, 387, 343, 58, 234, 107, 9, 141]
regionKeysAndValues=9; regionKeysAndValues={133=133, 45=45, 387=387, 343=343, 58=58, 234=234, 107=107, 9=9, 141=141}

keysSize=6; keys=[122, 279, 237, 381, 131, 351]
regionKeysAndValues=6; regionKeysAndValues={122=122, 279=279, 237=237, 381=381, 131=131, 351=351}
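And for what it's worth, option 2 could look roughly like this sketch (the class name MergeFunction and the region names are placeholders; it assumes the two regions are co-located and the keys are passed as the filter):

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Execution;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.FunctionService;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.execute.ResultCollector;

public class MergeFunction implements Function<Void> {

  @Override
  public void execute(FunctionContext<Void> context) {
    RegionFunctionContext rfc = (RegionFunctionContext) context;

    // The keys for this member arrive as the filter, so each key is only
    // processed on the member hosting its primary bucket.
    Set<?> keys = rfc.getFilter();

    // Because the regions are co-located, these keys also live in local buckets
    // of the other region, so a plain getAll stays entirely local.
    Region<Object, Object> other = rfc.getCache().getRegion("data-2");
    Map<Object, Object> regionKeysAndValues = other.getAll(keys);

    rfc.getResultSender().lastResult(new HashMap<>(regionKeysAndValues));
  }

  @Override
  public boolean optimizeForWrite() {
    return true;
  }

  @Override
  public String getId() {
    return getClass().getSimpleName();
  }

  // Caller side: pass the keys as a filter instead of an argument.
  public static ResultCollector<?, ?> invoke(Region<?, ?> region, Set<Object> keys) {
    Execution execution = FunctionService.onRegion(region).withFilter(keys);
    return execution.execute(new MergeFunction());
  }
}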
Thanks,
Barry Oglesby

On Mon, Aug 19, 2019 at 2:18 PM aashish choudhary <aashish.choudha...@gmail.com> wrote:

> You have a data-aware function (invoked by onRegion) from which you call getAll in region X. That's correct.
>
> Is region X the region on which the function is executed? Or is it another region? X is a different region.
>
> If multiple regions are involved, are they co-located? Not colocated.
>
> How do you determine the keys to getAll? Let's just say that the key passed to both regions is the same; we basically merge the data and return the result.
>
> Are they passed into the function? If so, as a filter or as an argument? As an argument. With a filter could have been a better approach.
>
> What does optimizeForWrite return? How many members are running? Have to check and confirm. We have 12 nodes running.
>
> On Tue, Aug 20, 2019, 2:32 AM Barry Oglesby <bogle...@pivotal.io> wrote:
>
>> Ashish,
>>
>> Sorry for all the questions, but I want to make sure I understand the scenario. You have a data-aware function (invoked by onRegion) from which you call getAll in region X. Is region X the region on which the function is executed? Or is it another region? If multiple regions are involved, are they co-located? How do you determine the keys to getAll? Are they passed into the function? If so, as a filter or as an argument? What does optimizeForWrite return? How many members are running?
>>
>> Thanks,
>> Barry Oglesby
>>
>> On Mon, Aug 19, 2019 at 1:19 PM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>>
>>> We use a data-aware function, and we make a call to region X from that function using the getLocalData API and then do a getAll. Recently we introduced redundancy for our partitioned region, and now we are getting duplicate entries for that region X in the function response. My hunch is that it is because of the getLocalData + getAll call, so if we change it to getLocalPrimaryData (hope the name is correct) for region X, it should only do gets for primary copies. Is that the correct way of handling it?
>>>
>>> With best regards,
>>> Ashish