One thing I run into more often than not is how teams shut down Geode. If
the shutdown process kills each process one by one (gfsh stop or
kill -9 <pid>), it actually creates distrust among the remaining
members. Try using the gfsh shutdown command instead and see how much
things improve.

The shutdown command allows every distributed system member to save its
state in a consistent manner. Then, on startup, the system only waits
for the known members/quorum before it allows clients to access data.

Note: shutdown also assumes we are starting everything back up in
parallel - let's not cause a timeout waiting for the last members to join.
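
As a sketch, the orderly shutdown and parallel restart might look like the
following (the locator port, member names, and hostnames here are
illustrative, not from this thread):

```
# Shut down the whole cluster cleanly so every member records a
# consistent view of its persistent data; include the locators.
gfsh -e "connect --locator=localhost[10334]" -e "shutdown --include-locators=true"

# On restart, bring up the locator first, then start all servers in
# parallel (backgrounded) so the persistent members can find each
# other instead of timing out waiting for the last one to join.
gfsh start locator --name=locator1 --port=10334
gfsh start server --name=server1 --locators=localhost[10334] &
gfsh start server --name=server2 --locators=localhost[10334] &
gfsh start server --name=server3 --locators=localhost[10334] &
wait
```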

Regards,

Charlie

On Wed, Oct 17, 2018 at 8:44 AM Swapnil Bawaskar <[email protected]>
wrote:

> I would first try to resolve the 75-minute startup for the clean cluster.
> Are you seeing the recovery messages above for a clean start? If so, then
> the start does not look clean.
> Also, the idea behind the disk store concept was to utilize multiple disks
> on a single machine to get better throughput for writing and recovery. I
> don't know if you still get that advantage on mounted volumes in the cloud,
> but you could try mounting two disks and then pointing each disk store at
> its own disk.
>
>
> On Wed, Oct 17, 2018 at 7:15 AM Pieter van Zyl <[email protected]>
> wrote:
>
>> Hi Jens.
>>
>> I am using GCP to fire up 3 servers. The import is quick enough, and the
>> cluster and network look ok at that point.
>> Speed between the 3 nodes also looks fine.
>>
>> I have these properties enabled when I start the server:
>>
>> java -server
>> -agentpath:/home/r2d2/yourkit/bin/linux-x86-64/libyjpagent.so
>> -javaagent:lib/aspectj/lib/aspectjweaver.jar -Dgemfire.EXPIRY_THREADS=20
>> -Dgemfire.PREFER_SERIALIZED=false
>> -Dgemfire.enable.network.partition.detection=false
>> -Dgemfire.autopdx.ignoreConstructor=true
>> -Dgemfire.ALLOW_PERSISTENT_TRANSACTIONS=true
>> -Dgemfire.member-timeout=600000 -Xmx90G -Xms90G -Xmn30G -XX:SurvivorRatio=1
>> -XX:MaxTenuringThreshold=15 -XX:CMSInitiatingOccupancyFraction=78
>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC
>> -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps
>> -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -verbose:gc
>> -Xloggc:/home/r2d2/rdb-geode-server/gc/gc-server.log
>> -Djava.rmi.server.hostname='localhost'
>> -Dcom.sun.management.jmxremote.port=9010
>> -Dcom.sun.management.jmxremote.rmi.port=9010
>> -Dcom.sun.management.jmxremote.local.only=false
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -Dcom.sun.management.jmxremote.ssl=false
>> .....org.rdb.geode.server.GeodeServer
>>
>> Could this setting influence the cluster:
>> -Dgemfire.enable.network.partition.detection=false
>>
>> I am seeing a lot of recovery messages:
>>
>> [info 2018/10/16 15:32:26.867 UTC  <Recovery thread for bucket
>>> _B__net.lautus.gls.domain.life.instruction.instruction.rebalance.
>>> AggregatePortfolioRebalanceChoice_92> tid=0x42c9] Initialization of
>>> region _B__net.lautus.gls.domain.life.instruction.instruction.rebalance.
>>> AggregatePortfolioRebalanceChoice_92 completed
>>> [info 2018/10/14 11:19:17.329 SAST  <RedundancyLogger for region
>>> net.lautus.gls.domain.life.additionalfields.AdditionalFieldConfiguration>
>>> tid=0x1858] Region
>>> /net.lautus.gls.domain.life.additionalfields.AdditionalFieldConfiguration
>>> (and any colocated sub-regions) has potentially stale data.  Buckets [3]
>>> are waiting for another offline member to recover the latest data.
>>>   My persistent id is:
>>>     DiskStore ID: 932530bc-4c45-4926-b4a1-6fe5fe1f0493
>>>     Name:
>>>     Location: /10.154.0.2:/home/r2d2/rdb-geode-server/geode/tauDiskStore
>>>
>>>   Offline members with potentially new data:
>>>   [
>>>     DiskStore ID: c09e4cce-51e9-4111-8643-fe582677f49f
>>>     Location: /10.154.0.4:/home/r2d2/rdb-geode-server/geode/tauDiskStore
>>>     Buckets: [3]
>>>   ]
>>>   Use the "gfsh show missing-disk-stores" command to see all disk stores
>>> that are being waited on by other members.
>>> [info 2018/10/14 11:19:35.250 SAST  <Pooled Waiting Message Processor 7>
>>> tid=0x1318] Configured redundancy of 1 copies has been restored to
>>> /net.lautus.gls.domain.life.additionalfields.AdditionalFieldConfiguration
>>
>>
>> Btw using Apache Geode 1.7.0.
>>
>> Kindly
>>
>> Pieter
>>
>>
>> On Wed, Oct 17, 2018 at 3:56 PM Jens Deppe <[email protected]> wrote:
>>
>>> Hi Pieter,
>>>
>>> Your startup times are definitely too long - probably by at least an order
>>> of magnitude. My first guess is that this is network related. It may
>>> either be a DNS lookup issue or, if the cluster is isolated from the
>>> internet, it may be some problem with XSD validation needing internet
>>> access (even though we do bundle the XSD files with Geode - the same
>>> should be true for Spring). I will see if I can find any potential XSD issue.
>>>
>>> --Jens
>>>
>>> On Wed, Oct 17, 2018 at 3:22 AM Pieter van Zyl <
>>> [email protected]> wrote:
>>>
>>>> Good day.
>>>>
>>>> We are currently running a 3 node Geode cluster.
>>>>
>>>> We are running the locator from gfsh and then starting up 3 servers with
>>>> Spring that connect to the central locator.
>>>>
>>>> We are using persistence on all the regions and have basically one data
>>>> and pdx store per node.
>>>>
>>>> The problem we are experiencing is that with no data, aka a clean
>>>> cluster, it takes 75 minutes to start up.
>>>>
>>>> Once data has been imported into the cluster and we shut down all
>>>> nodes/servers and start up again, it takes 128 to 160 minutes.
>>>> This is very slow.
>>>>
>>>> The question is: is there any way to improve the startup speed? Is
>>>> this normal and expected?
>>>>
>>>> We have a 100gig database distributed across the 3 nodes.
>>>> Server 1: 100 gig memory and 90 gig assigned heap and db size of 49gig
>>>> and 32 cores.
>>>> Server 2: 64 gig memory and 60 gig assigned heap and db size of 34gig
>>>> and 16 cores
>>>> Server 3: 64 gig memory and 60 gig assigned heap and db size of 34gig
>>>> and 16 cores
>>>>
>>>> Should we have more disk stores? Maybe separate stores for the
>>>> partitioned vs. replicated regions?
>>>>
>>>> <gfe:disk-store id="pdx-disk-store" allow-force-compaction="true"
>>>> auto-compact="true" max-oplog-size="1024">
>>>>     <gfe:disk-dir location="geode/pdx"/>
>>>> </gfe:disk-store>
>>>>
>>>> <gfe:disk-store id="tauDiskStore" allow-force-compaction="true"
>>>> auto-compact="true" max-oplog-size="5120"
>>>>                 compaction-threshold="90">
>>>>     <gfe:disk-dir location="geode/tauDiskStore"/>
>>>> </gfe:disk-store>
>>>>
>>>> We have a mix of regions:
>>>>
>>>> Example partitioned region:
>>>>
>>>> <gfe:partitioned-region
>>>> id="net.lautus.gls.domain.life.accounting.Account"
>>>> disk-store-ref="tauDiskStore"
>>>> statistics="true"
>>>> persistent="true"><!--<gfe:cache-listener ref="cacheListener"/>-->
>>>>     <gfe:eviction type="HEAP_PERCENTAGE" action="OVERFLOW_TO_DISK"/>
>>>> </gfe:partitioned-region>
>>>>
>>>> Example replicated region:
>>>> <gfe:replicated-region
>>>> id="org.rdb.internal.session.rootmap.RootMapHolder"
>>>>                        disk-store-ref="tauDiskStore"
>>>>                        statistics="true" persistent="true"
>>>> >
>>>>     <!--<gfe:cache-listener ref="cacheListener"/>-->
>>>>     <gfe:eviction type="ENTRY_COUNT" action="OVERFLOW_TO_DISK"
>>>> threshold="100">
>>>>         <gfe:object-sizer ref="objectSizer"/>
>>>>     </gfe:eviction>
>>>> </gfe:replicated-region>
>>>>
>>>>
>>>> Any advice would be appreciated
>>>>
>>>> Kindly
>>>> Pieter
>>>>
--
[email protected] | +1.858.480.9722
Principal Realtime Data Engineer