Hi Kapil,

By default, Geode writes data synchronously to the other cluster members.  If 
a node crashes, as in your test, the cluster preserves the update even without 
persistence.  Synchronous disk writes can be turned on (see [1]), but many 
users prefer to avoid the fsync performance penalty.
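
As a rough illustration of where a synchronous-write setting lives, here is a 
cache.xml sketch.  The disk-store name and directory are taken from your gfsh 
command; the region name exampleRegion is made up and the data policy is an 
assumption about your setup.  The disk-synchronous attribute controls whether 
a region's disk writes are done inline or queued; whether that alone forces an 
fsync is not something this fragment shows, so please check [1] for details.

```xml
<!-- Fragment meant to sit inside the <cache> element of cache.xml. -->
<disk-store name="nsxDiskStore">
  <disk-dirs>
    <disk-dir>/common/nsxapi/data/self</disk-dir>
  </disk-dirs>
</disk-store>

<!-- "exampleRegion" is a hypothetical name used only for illustration. -->
<region name="exampleRegion">
  <region-attributes data-policy="persistent-replicate"
                     disk-store-name="nsxDiskStore"
                     disk-synchronous="true"/>
</region>
```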

Anthony

[1] https://cwiki.apache.org/confluence/display/GEODE/Native+Disk+Persistence

> On Oct 13, 2016, at 6:46 PM, Kapil Goyal <goy...@vmware.com> wrote:
> 
> Hi Folks,
> 
> I am doing some crash testing with a single GemFire cache node, where I 
> power off the VM on which the cache is running and then bring it back up. 
> Upon restart, GemFire refuses to come up and reports this error:
> 
> Caused by: java.lang.NullPointerException
>         at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.keyHash(CustomEntryConcurrentHashMap.java:228) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractRegionEntry$HashRegionEntryCreator.keyHashCode(AbstractRegionEntry.java:934) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.get(CustomEntryConcurrentHashMap.java:1447) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractRegionMap.getEntry(AbstractRegionMap.java:368) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractLRURegionMap.getEntry(AbstractLRURegionMap.java:47) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PlaceHolderDiskRegion.getDiskEntry(PlaceHolderDiskRegion.java:93) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.readModifyEntry(Oplog.java:2779) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.readCrf(Oplog.java:1957) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.recoverCrf(Oplog.java:2270) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverOplogs(PersistentOplogSet.java:459) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverRegionsThatAreReady(PersistentOplogSet.java:367) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.recoverRegionsThatAreReady(DiskStoreImpl.java:2065) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.initializeIfNeeded(DiskStoreImpl.java:2052) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.doInitialRecovery(DiskStoreImpl.java:2057) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:135) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.createDiskStore(CacheCreation.java:650) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:425) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:331) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4248) ~[gemfire-8.2.0.2.jar:?]
>         at org.springframework.data.gemfire.CacheFactoryBean.init(CacheFactoryBean.java:306) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
>         at org.springframework.data.gemfire.CacheFactoryBean.getObject(CacheFactoryBean.java:455) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
> 
> It hints at GemFire data on disk being corrupted, so I used 'gfsh' to verify:
> 
> gfsh>validate offline-disk-store --name=nsxDiskStore --disk-dirs=/common/nsxapi/data/self
> 
> Validating nsxDiskStore
> /nsx_sys/ArrayListIDPriorityModel: entryCount=0
> /nsx_sys/Crl: entryCount=0
> /nsx_sys/Certificate: entryCount=1
> ……
> Error in validating disk store nsxDiskStore is : null
> 
> This confirms that the disk-store is corrupted, but doesn't give any more 
> information to debug this further. How do I go about debugging this? Have you 
> seen this before and are there any fixes/workarounds available?
> 
> Thanks
> Kapil
