Thanks Udo. I am also looking at this example, which is implemented the
same way you explained earlier.

https://www.github.com/apache/geode-examples/tree/develop/colocation%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fgeode_examples%2Fcolocation%2FOrderPartitionResolver.java

I have a few more questions; please help clarify.

Does this partition resolver class need to be present on the classpath of the
applications that do gets and puts on the region? We have one set of
applications that ingest data into Geode regions and another set that only
reads data from Geode regions via functions.

Is there no need to deploy this partition resolver class on the server side?

In your previous email you said that the implementation I shared will hit
the resolver on every request and may degrade performance. How do the
example and the code snippet you shared avoid hitting the resolver every
time? I am trying to understand how that is handled internally.
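
To make the comparison concrete, here is a minimal sketch of the two routing
approaches discussed in this thread. It does not use the Geode API: the
RoutingSketch class, its method names, and the trimmed-down ComplexKey are
hypothetical stand-ins, and the bucket formula (hash of the routing object
modulo 113 buckets) is only an illustrative assumption, not a statement about
Geode internals. It shows that both approaches produce the same routing value,
and that the String approach has to parse the key on each call.

```java
public class RoutingSketch {
  // Assumption: 113 buckets, used here only to illustrate the mapping.
  static final int TOTAL_BUCKETS = 113;

  /** Hypothetical, trimmed-down version of the ComplexKey from this thread. */
  static final class ComplexKey {
    final String commonPartitioningKey;
    final String key;

    ComplexKey(String commonPartitioningKey, String key) {
      this.commonPartitioningKey = commonPartitioningKey;
      this.key = key;
    }
  }

  /** Approach 1: parse the routing value out of a delimited String key. */
  static Object routeByStringKey(String key) {
    // split() allocates an array and substrings every time it is called.
    return key.split("_")[0];
  }

  /** Approach 2: read the routing value from a field, with no parsing. */
  static Object routeByComplexKey(ComplexKey key) {
    return key.commonPartitioningKey;
  }

  /** Illustrative bucket mapping: routing object hash modulo bucket count. */
  static int bucketFor(Object routingObject) {
    return Math.abs(routingObject.hashCode() % TOTAL_BUCKETS);
  }

  public static void main(String[] args) {
    int fromString = bucketFor(routeByStringKey("CUST42_ORDER7"));
    int fromComplex =
        bucketFor(routeByComplexKey(new ComplexKey("CUST42", "ORDER7")));
    // Equal routing objects map to the same bucket in this sketch.
    System.out.println(fromString == fromComplex);
  }
}
```

Under these assumptions, keys that share the same first part land in the same
bucket either way; the difference is only in how the routing value is obtained.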

With best regards,
Ashish

On Fri, Jul 17, 2020, 1:33 AM Udo Kohlmeyer <u...@vmware.com> wrote:

> Hi Ashish,
>
> I think (purely from the perspective of avoiding an unnecessary String
> operation) a complex key object would be better.
>
> But your way could work as well. I feel you might hit a performance wall
> with the string manipulation. Also, for every get you will hit the
> resolver to determine which bucket to route to, so once again more
> performance (and memory) problems.
>
> —Udo
> On Jul 16, 2020, 12:01 PM -0700, aashish choudhary <
> aashish.choudha...@gmail.com>, wrote:
>
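> Thanks Udo for your inputs. I am planning to do something like this in
> the PartitionResolver implementation.
>
> import org.apache.geode.cache.EntryOperation;
> import org.apache.geode.cache.PartitionResolver;
>
> public class CustomPartitionResolver implements PartitionResolver<String, Object> {
>   @Override
>   public Object getRoutingObject(EntryOperation<String, Object> opDetails) {
>     // Route on the part of the key before the first underscore.
>     return opDetails.getKey().split("_")[0];
>   }
>
>   @Override
>   public String getName() {
>     return getClass().getName();
>   }
> }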
>
>
>
>
> On Thu, 16 Jul 2020 at 10:56 PM, Udo Kohlmeyer <u...@vmware.com> wrote:
>
>> Hi there Ashish,
>>
>> I think it is safe to assume that once you change the PartitionResolver
>> strategy that you might have to reload the data.
>>
>> I will not commit to a definitive, “Yes, you have to reload the data and
>> cannot load it again from disk” answer, but I think that answer will become
>> self-evident when you change the region configuration, as some settings on
>> the region cannot be amended after creation.
>>
>> I don’t know if you have considered this yet, but it sounds like you have
>> some “complex” string key that you try to parse for the common part. Have
>> you considered using an object like
>>
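>> import java.io.DataInput;
>> import java.io.DataOutput;
>> import java.io.IOException;
>> import org.apache.geode.DataSerializable;
>>
>> public class ComplexKey implements DataSerializable {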
>>   private String commonPartitioningKey;
>>   private String key;
>>
>>   public ComplexKey() {}
>>
>>   public ComplexKey(String commonPartitioningKey, String key) {
>>     this.commonPartitioningKey = commonPartitioningKey;
>>     this.key = key;
>>   }
>>
>>   @Override
>>   public int hashCode() {
>>     return key.hashCode();
>>   }
>>
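>>   @Override
>>   public boolean equals(Object obj) {
>>     if (this == obj) {
>>       return true;
>>     }
>>     // Guard against null and foreign types before casting.
>>     if (!(obj instanceof ComplexKey)) {
>>       return false;
>>     }
>>     return key.equals(((ComplexKey) obj).key);
>>   }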
>>
>>   public Object getCommonPartitioningKey() {
>>     return commonPartitioningKey;
>>   }
>>
>>   public void setCommonPartitioningKey(String commonPartitioningKey) {
>>     this.commonPartitioningKey = commonPartitioningKey;
>>   }
>>
>>   public Object getKey() {
>>     return key;
>>   }
>>
>>   public void setKey(String key) {
>>     this.key = key;
>>   }
>>
>>   @Override
>>   public void toData(DataOutput out) throws IOException {
>>     out.writeUTF(commonPartitioningKey);
>>     out.writeUTF(key);
>>   }
>>
>>   @Override
>>   public void fromData(DataInput in) throws IOException, ClassNotFoundException {
>>     commonPartitioningKey = in.readUTF();
>>     key = in.readUTF();
>>   }
>> }
>>
>>
>> That way you can still do a get using the natural key of the object, but the
>> PartitionResolver can partition according to the partitioningKey. Imo, it
>> just cleanly separates the partitioning and natural key logic.
>>
>> BE AWARE, you should not use PDX serialization for keys, so stick to
>> Serializable or DataSerializable.
>>
>> As for functions, you should see no difference. Colocation just means
>> that the same bucket numbers of colocated regions are stored on the same
>> server. What you can now use is the notion of “local” data across
>> colocated regions, so you don’t need to go across the network to access
>> colocated data. Functions can possibly run using local data only, without
>> a network hop when they need data from another region. It might improve
>> performance a little.
>>
>> Anyway, lots of information. Reach out if you get stuck or don’t
>> understand something.
>>
>>
>> —Udo
>> On Jul 16, 2020, 9:38 AM -0700, aashish choudhary <
>> aashish.choudha...@gmail.com>, wrote:
>>
>> Hi,
>>
>> We are seeing some performance issues with partitioned regions: when we
>> execute a data-aware function, some of the calls it makes to other regions
>> go to different nodes for further processing. So we are trying to
>> implement data colocation between those regions.
>>
>> We will be using custom partitioning of data by implementing the
>> PartitionResolver interface.
>>
>> Questions
>>
>> I believe we would need to import/export data again after creating
>> regions with colocation. Please confirm.
>>
>> Our regions have different keys, but the first part of the key (separated
>> by _) is common to all regions, so in the class implementing the partition
>> resolver we just take the first part of the key for routing. Will this
>> partition the data correctly?
>>
>> Do we need to make any changes when reading data in functions after
>> enabling data colocation?
>>
>>
>> With best regards,
>> Ashish
>>
>> --
> With Best Regards,
> Ashish
>
>
