When the NPE occurs, has the server completed its bootstrapping from cluster configuration yet?
Anthony

> On Jul 6, 2021, at 12:06 AM, Mario Kevo <mario.k...@est.tech> wrote:
>
> Hi Geode devs,
>
> I opened a new ticket https://issues.apache.org/jira/browse/GEODE-9409
> regarding a NullPointerException when creating a region while one of the
> servers is restarting.
> If we run the "create region" command through gfsh while the server is
> starting, it passes, but if the server is being restarted, it fails. The
> difference is that on a restart we kill the server and start it again. As it
> already has a server directory, it takes more time to get the server up, as
> expected.
> In that case, if we run the "create region" command, it can happen that the
> cache is not fully created when we try to operate on it. That can lead to the
> NullPointerException, as creating a region gets pdxRegistry from the cache
> while doing findDiskStore, but sometimes it is not initialized in the cache
> yet. So every method called on it will throw a NullPointerException.
> This is the part of the code where the exception is thrown:
>
>   DiskStoreImpl findDiskStore(RegionAttributes regionAttributes,
>       InternalRegionArguments internalRegionArgs) {
>     // validate that persistent type registry is persistent
>     if (getAttributes().getDataPolicy().withPersistence()) {
>       getCache().getPdxRegistry().creatingPersistentRegion();
>     }
>
> As I already mentioned, getPdxRegistry (LocalRegion.java) will return null if
> it is not yet initialized in create (CacheCreation.java):
>
>   DiskStoreAttributesCreation pdxRegDSC = initializePdxDiskStore(cache);
>
>   cache.initializePdxRegistry();
>
>   createDiskStores(cache, pdxRegDSC);
>
> I tried to do some fixes, but without success. 🙁
> It passes if we add some retry and sleep, but that is not acceptable.
>
> Does anyone have an idea how to wait until pdxRegistry is initialized, or
> something else that would help us avoid this problem?
>
> BR,
> Mario
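
For reference, the "wait until pdxRegistry is initialized" idea from the quoted message can be done with a latch instead of retry and sleep. The sketch below is only an illustration of that pattern. FakeCache, FakeRegistry, awaitPdxRegistry and createPersistentRegion are hypothetical stand-ins, not Geode classes or methods, and whether the real cache can expose such a hook is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins only; none of these types are Geode APIs.
class PdxRegistryWaitSketch {

  /** Stand-in for the PDX type registry. */
  static class FakeRegistry {
    void creatingPersistentRegion() {
      // The real method validates the registry; nothing to do in this sketch.
    }
  }

  /** Stand-in for the cache: a latch is released once the registry exists. */
  static class FakeCache {
    private volatile FakeRegistry pdxRegistry;
    private final CountDownLatch pdxReady = new CountDownLatch(1);

    /** Called by the startup path, mirroring cache.initializePdxRegistry(). */
    void initializePdxRegistry() {
      pdxRegistry = new FakeRegistry();
      pdxReady.countDown();
    }

    /** Unguarded accessor: returns null until initialization has run. */
    FakeRegistry getPdxRegistry() {
      return pdxRegistry;
    }

    /** Blocks until the registry is initialized or the timeout expires. */
    FakeRegistry awaitPdxRegistry(long timeout, TimeUnit unit) {
      try {
        if (!pdxReady.await(timeout, unit)) {
          throw new IllegalStateException("PDX registry not initialized within timeout");
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for PDX registry", e);
      }
      return pdxRegistry;
    }
  }

  /** Region creation path that waits for the registry instead of dereferencing null. */
  static boolean createPersistentRegion(FakeCache cache) {
    cache.awaitPdxRegistry(5, TimeUnit.SECONDS).creatingPersistentRegion();
    return true;
  }

  public static void main(String[] args) throws Exception {
    FakeCache cache = new FakeCache();
    // Simulate a slow restart: the registry is initialized a little later.
    Thread startup = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      cache.initializePdxRegistry();
    });
    startup.start();
    System.out.println("region created: " + createPersistentRegion(cache));
    startup.join();
  }
}
```

With this shape, a "create region" request that arrives too early either blocks until startup reaches initializePdxRegistry or fails with a descriptive IllegalStateException after the timeout, rather than a NullPointerException.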