It is highly unusual to use Geode with just a single cache node. A big part
of the value of an in-memory data grid is that it can provide fault
tolerance and high availability for your data. Please consider running at
least three nodes in your tests, since that is the smallest configuration
in which Geode would likely be used in the real world.

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: 631-835-4771

On Sat, Oct 15, 2016 at 4:59 AM, Kapil Goyal <goy...@vmware.com> wrote:

> Thanks Anthony.
>
> We have already enabled synchronous disk writes to minimize data loss in
> the event of a crash.
>
> From: Anthony Baker <aba...@pivotal.io>
> Reply-To: <user@geode.incubator.apache.org>
> Date: Thursday, October 13, 2016 at 8:31 PM
> To: <user@geode.incubator.apache.org>
> Subject: Re: GemFire persisted data corruption - how to debug?
>
> Hi Kapil,
>
> Geode (by default) writes data synchronously to other cluster members.  If
> a node crashes as in your test, the update is preserved by the cluster
> even in the absence of persistence.  Synchronous disk writes can be turned
> on (see [1]), but many users prefer to avoid the fsync performance penalty.
>
> Anthony
>
> [1] https://cwiki.apache.org/confluence/display/GEODE/Native+Disk+Persistence
>
> On Oct 13, 2016, at 6:46 PM, Kapil Goyal <goy...@vmware.com> wrote:
>
> Hi Folks,
>
> I am doing some crash testing with a single cache node of GemFire, where I
> power off the VM where the cache is running and then bring it back up. Upon
> restart, GemFire refuses to come up with this error:
>
> Caused by: java.lang.NullPointerException
>         at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.keyHash(CustomEntryConcurrentHashMap.java:228) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractRegionEntry$HashRegionEntryCreator.keyHashCode(AbstractRegionEntry.java:934) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.util.concurrent.CustomEntryConcurrentHashMap.get(CustomEntryConcurrentHashMap.java:1447) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractRegionMap.getEntry(AbstractRegionMap.java:368) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.AbstractLRURegionMap.getEntry(AbstractLRURegionMap.java:47) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PlaceHolderDiskRegion.getDiskEntry(PlaceHolderDiskRegion.java:93) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.readModifyEntry(Oplog.java:2779) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.readCrf(Oplog.java:1957) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.Oplog.recoverCrf(Oplog.java:2270) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverOplogs(PersistentOplogSet.java:459) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.PersistentOplogSet.recoverRegionsThatAreReady(PersistentOplogSet.java:367) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.recoverRegionsThatAreReady(DiskStoreImpl.java:2065) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.initializeIfNeeded(DiskStoreImpl.java:2052) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreImpl.doInitialRecovery(DiskStoreImpl.java:2057) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:135) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.createDiskStore(CacheCreation.java:650) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:425) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:331) ~[gemfire-8.2.0.2.jar:?]
>         at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4248) ~[gemfire-8.2.0.2.jar:?]
>         at org.springframework.data.gemfire.CacheFactoryBean.init(CacheFactoryBean.java:306) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
>         at org.springframework.data.gemfire.CacheFactoryBean.getObject(CacheFactoryBean.java:455) ~[spring-data-gemfire-1.5.2.RELEASE.jar:1.5.2.RELEASE]
>
> It hints at GemFire data on disk being corrupted, so I used 'gfsh' to
> verify:
>
> gfsh>validate offline-disk-store --name=nsxDiskStore --disk-dirs=/common/nsxapi/data/self
>
> Validating nsxDiskStore
> /nsx_sys/ArrayListIDPriorityModel: entryCount=0
> /nsx_sys/Crl: entryCount=0
> /nsx_sys/Certificate: entryCount=1
> ……
> Error in validating disk store nsxDiskStore is : null
>
> This confirms that the disk-store is corrupted, but doesn't give any more
> information to debug this further. How do I go about debugging this? Have
> you seen this before and are there any fixes/workarounds available?
>
> Thanks
> Kapil
>
>
>
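
The fsync penalty that Anthony mentions in the thread can be illustrated without Geode. The following is a minimal sketch using only java.nio; it is not Geode's own disk-store code, and the class name FsyncSketch, the method writeRecords, and the record contents are all hypothetical names chosen for illustration. It appends the same records to two temporary files, once leaving the data in the operating system's buffers and once calling FileChannel.force after every record, and prints the elapsed time for each.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not part of Geode or GemFire.
class FsyncSketch {

    /**
     * Appends `count` small records to `file`. When `sync` is true, each
     * record is forced to the storage device before the next is written.
     * Returns the elapsed time in nanoseconds.
     */
    static long writeRecords(Path file, int count, boolean sync) throws IOException {
        byte[] record = "key=value\n".getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(record);
                // A single write() call may be partial, so loop until drained.
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                if (sync) {
                    // Ask the OS to flush file content and metadata to the device.
                    ch.force(true);
                }
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path buffered = Files.createTempFile("buffered", ".log");
        Path synced = Files.createTempFile("synced", ".log");
        try {
            long bufferedNanos = writeRecords(buffered, 1000, false);
            long syncedNanos = writeRecords(synced, 1000, true);
            System.out.printf("buffered: %d us, forced per record: %d us%n",
                    bufferedNanos / 1000, syncedNanos / 1000);
        } finally {
            Files.deleteIfExists(buffered);
            Files.deleteIfExists(synced);
        }
    }
}
```

The timings depend heavily on the storage device and file system, so no particular ratio should be expected. The trade-off is the one described in the thread: forcing each write narrows the window in which a power-off can lose data, at the cost of waiting on the device for every record.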
