[ https://issues.apache.org/jira/browse/GEODE-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094606#comment-17094606 ]

Juan Ramos commented on GEODE-8029:
-----------------------------------

Thanks [~jagan23527001],

Looks like the validation fails with the same exception, which is expected. My 
current theory is that the {{disk-store}} contains a *huge* number of deleted 
records that should have been compacted but are still there, preventing the 
member from starting up. If that's correct, you should be able to execute an 
{{offline compaction}} instead of the steps I've previously shared. The steps 
are below:

# For member {{provServerHO2}}, copy all files under 
{{/app/provServerHO2/data/}} to another directory, just as a backup.
# For member {{provServerHO2}}, execute {{compact offline-disk-store 
--name=geodeStore --disk-dirs=/app/provServerHO2/data}}.
# Try to start member {{provServerHO2}} again; it should come up just fine.
# At this point the cluster should be fully operational, so you can go ahead 
and run your internal verifications to double-check that everything is correct.
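
As a rough shell sketch of steps 1 and 2 (the paths and disk-store name come 
from the steps above; the sandbox fallback and the {{BACKUPgeodeStore.if}} 
stand-in file are my own additions so the sketch runs on a machine without the 
real layout):

```shell
#!/bin/sh
# Sketch of the offline-compaction workaround; provServerHO2 must be stopped.
if [ -d /app/provServerHO2/data ]; then
  DATA_DIR=/app/provServerHO2/data
else
  # Sandbox stand-in so the sketch runs anywhere; the file name only mimics
  # a typical disk-store metadata file and is an assumption.
  DATA_DIR=$(mktemp -d)
  : > "$DATA_DIR/BACKUPgeodeStore.if"
fi
BACKUP_DIR=${DATA_DIR}.bak

# Step 1: copy every disk-store file to a backup directory first.
mkdir -p "$BACKUP_DIR"
cp -a "$DATA_DIR"/. "$BACKUP_DIR"/

# Step 2: run the offline compaction against the same directory. If gfsh is
# not on PATH, print the command instead so the sketch still completes.
if command -v gfsh >/dev/null 2>&1; then
  gfsh compact offline-disk-store --name=geodeStore --disk-dirs="$DATA_DIR"
else
  echo "run in gfsh: compact offline-disk-store --name=geodeStore --disk-dirs=$DATA_DIR"
fi
```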

If the above steps don't work (they should; I'm just adding another option as a 
workaround), you can go ahead and execute the steps I've shared previously, 
which guarantee that the member starts fresh and gets the data from the already 
running servers. The steps are below again, just for your reference:

# Make sure {{provServerHO1}} and {{provServerHO3}} are fully up and running, 
without any exceptions in the logs. If you notice any exceptions or weirdness 
within these members' logs, don't continue with the rest of the steps.
# For member {{provServerHO2}}, copy all files under 
{{/app/provServerHO2/data/}} to another directory, just as a backup.
# For member {{provServerHO2}}, remove all files under 
{{/app/provServerHO2/data/}}.
# Try to start member {{provServerHO2}} again; during startup the member 
should be able to get the latest data from the other running members 
({{provServerHO1}} and {{provServerHO3}}).
# If the above steps finished correctly, execute the [{{gfsh 
rebalance}}|https://geode.apache.org/docs/guide/112/tools_modules/gfsh/command-pages/rebalance.html]
 command to make sure buckets are evenly distributed across the three members 
(this is an expensive operation, so you might want to go through [Rebalancing 
Partitioned Region 
Data|https://geode.apache.org/docs/guide/112/developing/partitioned_regions/rebalancing_pr_data.html]
 to fully understand the implications and requirements).
# At this point the cluster should be fully operational, so you can go ahead 
and run your internal verifications to double-check that everything is correct.
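
Assuming the same layout, steps 2 through 5 might be sketched like this (the 
sandbox fallback, the stand-in file name, and the locator address are all 
placeholders I'm inventing for illustration, not details from this issue):

```shell
#!/bin/sh
# Sketch of the wipe-and-recover path; run only after confirming that
# provServerHO1 and provServerHO3 are healthy (step 1).
if [ -d /app/provServerHO2/data ]; then
  DATA_DIR=/app/provServerHO2/data
else
  # Sandbox stand-in so the sketch runs anywhere (file name is an example).
  DATA_DIR=$(mktemp -d)
  : > "$DATA_DIR/BACKUPgeodeStore.if"
fi
BACKUP_DIR=${DATA_DIR}.bak

# Step 2: back up everything; step 3: empty the data directory so the member
# starts fresh and re-images its data from provServerHO1/provServerHO3.
mkdir -p "$BACKUP_DIR"
cp -a "$DATA_DIR"/. "$BACKUP_DIR"/
rm -rf "${DATA_DIR:?}"/*

# Steps 4-5: restart the member the usual way, then rebalance from gfsh.
# Printed rather than executed here, since both need the live cluster; the
# locator address below is a placeholder.
echo "run in gfsh: connect --locator=<locator-host>[10334]"
echo "run in gfsh: rebalance"
```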

Please let me know how it goes.
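
As background on where the number in the exception comes from: the stack trace 
shows that during {{.drf}} recovery every deleted entry id is added to an 
{{OplogEntryIdSet}} backed by a fastutil {{IntOpenHashSet}}, whose backing 
array is capped at 2^30 slots. The snippet below is my reconstruction of that 
size check from the exception message (a sketch, not Geode or fastutil 
source), showing why 805306401 expected elements at load factor 0.75 is over 
the limit:

```java
// Reconstruction (an assumption, not the actual library source) of the size
// check behind "Too large (805306401 expected elements with load factor 0.75)".
public class TooLargeDemo {

    // Smallest power of two >= x.
    static long nextPowerOfTwo(long x) {
        if (x <= 1) return 1;
        x--;
        x |= x >> 1;  x |= x >> 2;  x |= x >> 4;
        x |= x >> 8;  x |= x >> 16; x |= x >> 32;
        return x + 1;
    }

    // Table size needed for `expected` elements at load factor `f`;
    // open-addressing hash tables of this kind are capped at 2^30 slots.
    static int arraySize(int expected, float f) {
        long s = Math.max(2, nextPowerOfTwo((long) Math.ceil(expected / f)));
        if (s > (1 << 30)) {
            throw new IllegalArgumentException(
                "Too large (" + expected + " expected elements with load factor " + f + ")");
        }
        return (int) s;
    }

    public static void main(String[] args) {
        // 805306401 elements at load factor 0.75 need a 2^31-slot table,
        // one power of two past the 2^30 cap, hence the failure on startup.
        try {
            arraySize(805_306_401, 0.75f);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In other words, recovery was tracking roughly 805 million deleted-record ids, 
which is why purging those tombstones via offline compaction is the first 
thing to try.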


> java.lang.IllegalArgumentException: Too large (805306401 expected elements 
> with load factor 0.75)
> -------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-8029
>                 URL: https://issues.apache.org/jira/browse/GEODE-8029
>             Project: Geode
>          Issue Type: Bug
>          Components: configuration, core, gfsh
>    Affects Versions: 1.9.0
>            Reporter: Jagadeesh sivasankaran
>            Assignee: Juan Ramos
>            Priority: Major
>              Labels: GeodeCommons, caching-applications
>         Attachments: Screen Shot 2020-04-27 at 12.21.19 PM.png, server02.log
>
>
> We have a cluster of three Geode locators and three cache servers running on 
> CentOS. Today (April 27), after patching our CentOS servers, all locators 
> and two servers came up, but one cache server would not start. The exception 
> details are below. Please let me know how to resolve this issue and whether 
> any disk-store configuration changes are needed.
>  
>  
> Starting a Geode Server in /app/provServerHO2...
> ....................................................................................................................................................................................................................
> The Cache Server process terminated unexpectedly with exit status 1. Please 
> refer to the log file in /app/provServerHO2 for full details.
> Exception in thread "main" java.lang.IllegalArgumentException: Too large (805306401 expected elements with load factor 0.75)
> at it.unimi.dsi.fastutil.HashCommon.arraySize(HashCommon.java:222)
> at it.unimi.dsi.fastutil.ints.IntOpenHashSet.add(IntOpenHashSet.java:308)
> at org.apache.geode.internal.cache.DiskStoreImpl$OplogEntryIdSet.add(DiskStoreImpl.java:3474)
> at org.apache.geode.internal.cache.Oplog.readDelEntry(Oplog.java:3007)
> at org.apache.geode.internal.cache.Oplog.recoverDrf(Oplog.java:1500)
> at org.apache.geode.internal.cache.PersistentOplogSet.recoverOplogs(PersistentOplogSet.java:445)
> at org.apache.geode.internal.cache.PersistentOplogSet.recoverRegionsThatAreReady(PersistentOplogSet.java:369)
> at org.apache.geode.internal.cache.DiskStoreImpl.recoverRegionsThatAreReady(DiskStoreImpl.java:2053)
> at org.apache.geode.internal.cache.DiskStoreImpl.initializeIfNeeded(DiskStoreImpl.java:2041)
> at org.apache.geode.internal.cache.DiskStoreImpl.doInitialRecovery(DiskStoreImpl.java:2046)
> at org.apache.geode.internal.cache.DiskStoreFactoryImpl.initializeDiskStore(DiskStoreFactoryImpl.java:184)
> at org.apache.geode.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:150)
> at org.apache.geode.internal.cache.xmlcache.CacheCreation.createDiskStore(CacheCreation.java:794)
> at org.apache.geode.internal.cache.xmlcache.CacheCreation.initializePdxDiskStore(CacheCreation.java:785)
> at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:509)
> at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:337)
> at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4272)
> at org.apache.geode.internal.cache.ClusterConfigurationLoader.applyClusterXmlConfiguration(ClusterConfigurationLoader.java:197)
> at org.apache.geode.internal.cache.GemFireCacheImpl.applyJarAndXmlFromClusterConfig(GemFireCacheImpl.java:1240)
> at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1206)
> at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:207)
> at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:164)
> at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:139)
> at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
> at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:869)
> at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:786)
> at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:716)
> at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:236)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
