Retaining local PDX types in the client after a disconnect will cause problems,
as you observed.  Take a look at the “ON_DISCONNECT_CLEAR_PDXTYPEIDS” property
to improve this.
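
As a minimal sketch, assuming the property is read as a JVM system property
with the usual "gemfire." prefix (please check the exact name against your
Geode version), it could be set before the client cache is created. The class
and method names here are only illustrative:

```java
public class ClearPdxTypesOnDisconnect {
    // Assumption: the client reads this as a boolean JVM system property.
    // The "gemfire." prefix is assumed; verify it for your Geode version.
    static final String PROPERTY = "gemfire.ON_DISCONNECT_CLEAR_PDXTYPEIDS";

    // Hypothetical helper: call this before creating the client cache,
    // so the value is already in place when the connection pool starts.
    static void enable() {
        System.setProperty(PROPERTY, "true");
    }

    public static void main(String[] args) {
        enable();
        // Create the ClientCache and regions after this point.
        System.out.println(PROPERTY + "=" + Boolean.getBoolean(PROPERTY));
    }
}
```

Passing -Dgemfire.ON_DISCONNECT_CLEAR_PDXTYPEIDS=true on the client's java
command line should be equivalent, under the same assumption.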

Anthony


> On May 4, 2021, at 4:36 AM, Mario Salazar de Torres 
> <mario.salazar.de.tor...@est.tech> wrote:
> 
> Hi everyone,
> 
> While debugging some coredumps in the native client related to 
> PdxTypeRegistry cleanup, I tried to reproduce the scenario with the Java 
> client API to see how it was handled.
> The thing is, I've noticed that this scenario in the Java client might lead
> to Geode storing a corrupted entry, meaning that queries won't work on
> regions containing corrupted entries.
> By corrupted entries, I mean entries that use a missing PdxType. The
> scenario involves a cluster restart. It's described below:
> 
>  1.  Start a cluster with 1 locator and 3 servers, and persistence is 
> disabled for PdxTypes.
>  2.  Set up a region called "test-region" with persistence disabled. It 
> doesn't matter whether it is replicated or partitioned.
>  3.  In the client, instantiate the client region with PROXY region shortcut 
> and establish the connection toward the cluster.
>  4.  In the client, create a PdxInstance and put it into the "test-region" 
> with key "test".
>  5.  In the client, get the entry whose key is "test", which turns out to be 
> the PdxInstance inserted in step 4.
>  6.  At this point, the cluster is restarted, meaning that all the data is 
> lost, including PdxTypes.
>  7.  In the client, the PdxInstance obtained in step 5 is put into 
> "test-region" with key "test2"
>  8.  In the client, the following query is executed: "SELECT * FROM 
> /test-region WHERE value = -1".
> This query fails with the message "Unknown pdx type=<PdxType ID>" and it 
> won't work until the corrupted entry is removed.
> 
> Also, the above scenario could be solved by enabling persistence for 
> PdxTypes, but if you have an unrecoverable issue in your cluster and you need 
> to spin up a backup,
> it could happen that the PdxType of the PdxInstance obtained in step 5 is not 
> present in the backup, so the entry is inserted but, yet again, its PdxType 
> is missing.
> 
> It's worth mentioning that in the native client, this scenario currently 
> results in a coredump, but no data corruption,
> given that after losing the connection to the cluster the PdxTypeRegistry is 
> cleaned up and PdxTypes are looked up by their ID, rather than directly using 
> the object.
> 
> My questions are:
> 
>  *   Have you seen this issue before?
>  *   Is there a way to verify that PdxTypes are present in the cluster before 
> writing an entry which holds some PdxInstances?
> 
> Thanks,
> Mario.
