Thanks Barry for the thorough analysis. I am somewhat in favour of option 1, since we are not sure about colocating the regions. I am not sure I fully follow option 3: "Still use a filter, but just use the region in that case. Some of the gets will be remote".
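For option 1, a minimal sketch of a de-duplicating ResultCollector could look like the one below. It assumes each function execution sends back a Map of the entries it found (that shape, and the class name DedupResultCollector, are illustrative assumptions, not anything from Barry's mail), and a real implementation would typically block in getResult until endResults is called.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.apache.geode.cache.execute.FunctionException;
import org.apache.geode.cache.execute.ResultCollector;
import org.apache.geode.distributed.DistributedMember;

// Sketch of option 1: merge each member's Map result so duplicate keys collapse.
public class DedupResultCollector
    implements ResultCollector<Map<Object, Object>, Map<Object, Object>> {

  private final Map<Object, Object> merged = new HashMap<>();

  @Override
  public synchronized void addResult(DistributedMember member, Map<Object, Object> oneResult) {
    if (oneResult == null) {
      return;
    }
    // Skip nulls (getAll on a LocalDataSet returns null for non-local keys);
    // the same key arriving from two members collapses into one entry here.
    for (Map.Entry<Object, Object> entry : oneResult.entrySet()) {
      if (entry.getValue() != null) {
        merged.put(entry.getKey(), entry.getValue());
      }
    }
  }

  @Override
  public synchronized Map<Object, Object> getResult() throws FunctionException {
    return merged;
  }

  @Override
  public synchronized Map<Object, Object> getResult(long timeout, TimeUnit unit)
      throws FunctionException {
    return merged;
  }

  @Override
  public void endResults() {
    // Nothing to finalize in this sketch.
  }

  @Override
  public synchronized void clearResults() {
    merged.clear();
  }
}

It would be plugged in with withCollector(new DedupResultCollector()) on the Execution before calling execute.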
So you are saying that if I make the onRegion function call with a filter (withFilter), then the duplicates we are getting from the other region X call will be resolved? And is that irrespective of whether optimizeForWrite returns true or false? I was also thinking about making a nested function call to region X, but I am not sure if that is recommended or whether it could run into some distributed lock situation.

On Tue, Aug 20, 2019, 4:49 AM Barry Oglesby <bogle...@pivotal.io> wrote:

> Ashish,
>
> Here is a bunch of analysis on that scenario.
>
> -------------
> No redundancy
> -------------
> With partitioned regions, no redundancy and no filter, the function is being sent to every member that contains buckets.
>
> In that case, you see this kind of behavior (I have 3 servers in my test):
>
> The argument containing the keys is sent to every member. In this case, I have 24 keys.
>
> keysSize=24; keys=[44, 67, 59, 49, 162, 261, 284, 473, 397, 475, 376, 387, 101, 157, 366, 301, 469, 403, 427, 70, 229, 108, 50, 85]
>
> When you call PartitionRegionHelper.getLocalData or getLocalPrimaryData, you're getting back a LocalDataSet. Calling get or getAll on a LocalDataSet returns null if the value is not in that LocalDataSet. This causes all the get calls to be local and a bunch of nulls in the result.
>
> If I print the LocalDataSet and the value of getAll in all three servers, I see 24 non-null results across the servers.
>
> Server1
> -------
> localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[2, 3, 7, 10, 11, 15, 16, 19, 20, 25, 26, 29, 31, 34, 40, 41, 46, 49, 52, 56, 57, 58, 60, 65, 67, 68, 71, 75, 78, 82, 84, 87, 92, 93, 102, 104, 110]]
> localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=10;
> localDataKeysAndValues={44=44, 67=67, 59=null, 49=49, 162=162, 261=null, 284=null, 473=473, 397=null, 475=null, 376=null, 387=387, 101=null, 157=157, 366=366, 301=null, 469=null, 403=null, 427=427, 70=70, 229=null, 108=null, 50=null, 85=null}; nonNullLocalDataKeysAndValues={44=44, 473=473, 67=67, 387=387, 157=157, 366=366, 49=49, 427=427, 70=70, 162=162}
>
> Server2
> -------
> localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[0, 5, 6, 8, 13, 18, 21, 24, 27, 32, 35, 36, 37, 38, 43, 45, 48, 51, 55, 61, 64, 66, 69, 72, 74, 79, 80, 83, 86, 91, 94, 96, 99, 100, 105, 106, 108, 112]]
> localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=8;
> localDataKeysAndValues={44=null, 67=null, 59=59, 49=null, 162=null, 261=null, 284=284, 473=null, 397=397, 475=null, 376=null, 387=null, 101=101, 157=null, 366=null, 301=301, 469=null, 403=403, 427=null, 70=null, 229=null, 108=108, 50=null, 85=85}; nonNullLocalDataKeysAndValues={397=397, 101=101, 59=59, 301=301, 403=403, 108=108, 85=85, 284=284}
>
> Server3
> -------
> localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[1, 4, 9, 12, 14, 17, 22, 23, 28, 30, 33, 39, 42, 44, 47, 50, 53, 54, 59, 62, 63, 70, 73, 76, 77, 81, 85, 88, 89, 90, 95, 97, 98, 101, 103, 107, 109, 111]]
> localDataKeysAndValuesSize=24; nonNullLocalDataKeysAndValuesSize=6;
> localDataKeysAndValues={44=null, 67=null, 59=null, 49=null, 162=null, 261=261, 284=null, 473=null, 397=null, 475=475, 376=376, 387=null, 101=null, 157=null, 366=null, 301=null, 469=469, 403=null, 427=null, 70=null, 229=229, 108=null, 50=50, 85=null}; nonNullLocalDataKeysAndValues={475=475, 376=376, 469=469, 229=229, 50=50, 261=261}
>
> ---------------------------------
> Redundancy optimizeForWrite false
> ---------------------------------
> If I change both regions to be redundant, I see very different behavior.
>
> First, with optimizeForWrite returning false (the default), only 2 of the servers invoke the function. In the optimizeForWrite false case, the function is sent to the fewest number of servers that include all the buckets.
>
> keysSize=24; keys=[390, 292, 370, 261, 250, 273, 130, 460, 274, 452, 123, 388, 113, 268, 455, 400, 159, 435, 314, 429, 51, 419, 84, 43]
>
> As you saw, getLocalData will produce duplicates since some of the buckets overlap between the servers. In this case, you'll see all the data. If you call getLocalPrimaryData, you probably won't see all the data since some of the primaries will be in the server that doesn't execute the function.
>
> You can see below the local data set returns 35 entries for the 24 keys; the primary set returns only 15.
>
> Server1
> -------
> localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[0, 2, 3, 4, 7, 9, 10, 12, 15, 16, 17, 19, 21, 22, 23, 24, 26, 28, 29, 31, 32, 33, 34, 36, 37, 38, 40, 41, 43, 45, 46, 47, 48, 49, 51, 53, 54, 56, 58, 61, 62, 63, 66, 67, 68, 69, 70, 72, 74, 75, 76, 78, 80, 81, 82, 84, 86, 87, 88, 89, 90, 91, 93, 94, 95, 97, 98, 99, 100, 101, 103, 105, 106, 110, 111]]
> nonNullLocalDataKeysAndValuesSize=19;
> nonNullLocalDataKeysAndValues={390=390, 292=292, 261=261, 250=250, 273=273, 130=130, 460=460, 274=274, 452=452, 123=123, 388=388, 113=113, 400=400, 159=159, 435=435, 429=429, 51=51, 84=84, 43=43}
>
> primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[0, 66, 4, 68, 70, 7, 10, 74, 12, 76, 78, 17, 82, 84, 21, 87, 24, 90, 28, 95, 32, 33, 99, 37, 101, 38, 103, 106, 43, 45, 46, 110, 49, 51, 54, 58, 62, 63]]
> nonNullPrimaryDataKeysAndValuesSize=7;
> nonNullPrimaryDataKeysAndValues={452=452, 388=388, 435=435, 429=429, 51=51, 250=250, 274=274}
>
> Server2
> -------
> localData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[0, 1, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 16, 18, 20, 21, 23, 25, 27, 28, 29, 30, 33, 35, 36, 39, 40, 41, 42, 44, 46, 49, 50, 52, 53, 55, 56, 57, 59, 60, 61, 64, 65, 66, 67, 69, 71, 72, 73, 75, 76, 77, 78, 79, 81, 82, 83, 85, 87, 88, 91, 92, 93, 96, 97, 98, 101, 102, 103, 104, 105, 107, 108, 109, 111, 112]]
> nonNullLocalDataKeysAndValuesSize=16;
> nonNullLocalDataKeysAndValues={370=370, 261=261, 250=250, 130=130, 460=460, 274=274, 388=388, 113=113, 268=268, 455=455, 400=400, 435=435, 314=314, 419=419, 84=84, 43=43}
>
> primaryData=org.apache.geode.internal.cache.LocalDataSet[path='/data-2';scope=DISTRIBUTED_NO_ACK';dataPolicy=PARTITION;bucketIds=[64, 1, 67, 5, 69, 72, 9, 11, 75, 15, 79, 16, 81, 18, 83, 20, 23, 88, 91, 29, 93, 30, 97, 98, 35, 36, 102, 40, 41, 105, 108, 109, 111, 50, 53, 56, 61]]
> nonNullPrimaryDataKeysAndValuesSize=8;
> nonNullPrimaryDataKeysAndValues={113=113, 400=400, 419=419, 84=84, 261=261, 130=130, 460=460, 43=43}
>
> --------------------------------
> Redundancy optimizeForWrite true
> --------------------------------
> With optimizeForWrite returning true, I see exactly 24 results. That's because the function is executed on all members, and each key is primary in only one. The approach still won't work though if you have some servers with no primaries.
>
> keysSize=24; keys=[13, 58, 25, 181, 491, 183, 282, 173, 294, 495, 122, 365, 222, 486, 476, 146, 236, 139, 70, 449, 60, 51, 63, 43]
>
> nonNullPrimaryDataKeysAndValuesSize=8;
> nonNullPrimaryDataKeysAndValues={365=365, 146=146, 236=236, 139=139, 491=491, 183=183, 173=173, 43=43}
> nonNullPrimaryDataKeysAndValuesSize=7;
> nonNullPrimaryDataKeysAndValues={13=13, 222=222, 449=449, 282=282, 51=51, 294=294, 63=63}
> nonNullPrimaryDataKeysAndValuesSize=9;
> nonNullPrimaryDataKeysAndValues={495=495, 122=122, 486=486, 58=58, 25=25, 476=476, 70=70, 181=181, 60=60}
>
> -------
> Options
> -------
> There are a few things you could look into:
>
> 1. Plan for duplicates by defining a custom ResultCollector that filters the duplicates.
> 2. Co-locate your regions, which will mean that the same buckets will be on the same servers and they will be primary on the same servers. Then use a filter instead of an argument and return true for optimizeForWrite. In this case, it shouldn't matter how you get the other region.
> 3. If you can't colocate your regions, then don't use either getLocalData or getLocalPrimaryData. Still use a filter, but just use the region in that case. Some of the gets will be remote.
>
> The result of option 2 on three servers with 24 keys looks like below. The keys are split among the servers, region.getAll is entirely local, and all the values are returned.
>
> keysSize=9; keys=[222, 179, 476, 147, 346, 259, 128, 81, 153]
> regionKeysAndValues=9; regionKeysAndValues={222=222, 179=179, 476=476, 147=147, 346=346, 259=259, 128=128, 81=81, 153=153}
>
> keysSize=9; keys=[133, 45, 387, 343, 58, 234, 107, 9, 141]
> regionKeysAndValues=9; regionKeysAndValues={133=133, 45=45, 387=387, 343=343, 58=58, 234=234, 107=107, 9=9, 141=141}
>
> keysSize=6; keys=[122, 279, 237, 381, 131, 351]
> regionKeysAndValues=6; regionKeysAndValues={122=122, 279=279, 237=237, 381=381, 131=131, 351=351}
>
> Thanks,
> Barry Oglesby
>
>
> On Mon, Aug 19, 2019 at 2:18 PM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>
>> You have a data-aware function (invoked by onRegion) from which you call getAll in region X. That's correct.
>>
>> Is region X the region on which the function is executed? Or is it another region? X is a different region.
>> If multiple regions are involved, are they co-located? Not colocated.
>>
>> How do you determine the keys to getAll? Let's just say that the key passed to both regions is the same; we basically merge the data and return the result.
>>
>> Are they passed into the function? If so, as a filter or as an argument? As an argument. A filter could have been a better approach.
>>
>> What does optimizeForWrite return? How many members are running? Have to check and confirm. We have 12 nodes running.
>>
>> On Tue, Aug 20, 2019, 2:32 AM Barry Oglesby <bogle...@pivotal.io> wrote:
>>
>>> Ashish,
>>>
>>> Sorry for all the questions, but I want to make sure I understand the scenario. You have a data-aware function (invoked by onRegion) from which you call getAll in region X. Is region X the region on which the function is executed? Or is it another region? If multiple regions are involved, are they co-located? How do you determine the keys to getAll? Are they passed into the function? If so, as a filter or as an argument?
>>> What does optimizeForWrite return? How many members are running?
>>>
>>> Thanks,
>>> Barry Oglesby
>>>
>>>
>>> On Mon, Aug 19, 2019 at 1:19 PM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>>>
>>>> We use a data-aware function, and we make a call to region X from that function using the getLocalData API and then do a getAll. Recently we introduced redundancy for our partitioned region, and now we are getting duplicate entries for region X in the function response. My hunch is that it is because of the getLocalData + getAll call, so if we change it to getLocalPrimaryData (hope the name is correct) for region X it should only do gets for the primary copies. Is that the correct way of handling it?
>>>>
>>>> With best regards,
>>>> Ashish
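For reference, a minimal sketch of the filter-based approach described in options 2 and 3 could look like the following. The class name MergeFromRegionXFunction and the helper method are made up for illustration; region "X" and the routing region are the two regions from the thread, and whether the getAll stays local depends on colocation, as Barry notes (option 2: local; option 3: some gets go remote).

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.FunctionService;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.execute.ResultCollector;

// Illustrative data-aware function: the keys arrive as the filter, the values
// are read from region X, and the resulting map is sent back.
public class MergeFromRegionXFunction implements Function<Object> {

  @Override
  public void execute(FunctionContext<Object> context) {
    RegionFunctionContext rfc = (RegionFunctionContext) context;

    // The subset of filter keys routed to this member's buckets.
    Set<?> keys = rfc.getFilter();

    // Option 3 style: use the plain region. If region X is colocated with the
    // routing region (option 2), these gets resolve locally.
    Region<Object, Object> regionX = context.getCache().getRegion("X");
    Map<Object, Object> values = regionX.getAll(keys);

    context.getResultSender().lastResult(new HashMap<>(values));
  }

  @Override
  public boolean hasResult() {
    return true;
  }

  @Override
  public boolean optimizeForWrite() {
    // With colocation, each filter key is then handled only on its primary,
    // so no duplicates are produced.
    return true;
  }

  @Override
  public String getId() {
    return "MergeFromRegionXFunction";
  }

  // Caller side: pass the keys as a filter rather than as an argument.
  // Assumes the function is registered on the servers.
  @SuppressWarnings("unchecked")
  public static Map<Object, Object> fetch(Region<?, ?> routingRegion, Set<Object> keys) {
    ResultCollector<?, ?> collector = FunctionService.onRegion(routingRegion)
        .withFilter(keys)
        .execute("MergeFromRegionXFunction");
    Map<Object, Object> merged = new HashMap<>();
    for (Object memberResult : (List<?>) collector.getResult()) {
      merged.putAll((Map<Object, Object>) memberResult);
    }
    return merged;
  }
}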