[
https://issues.apache.org/jira/browse/GEODE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726332#comment-16726332
]
Brian Rowe commented on GEODE-6233:
-----------------------------------
With 1M entries, we were able to get some stack traces during the pre
population phase. These are the interesting traces:
In server1
{noformat}
"pool-3-thread-16" #91 prio=5 os_prio=0 tid=0x00007f5f0489f000 nid=0x26b9
runnable [0x00007f5ee07ca000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:51)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
- locked <0x0000000740066180> (a java.lang.Object)
at
org.apache.geode.internal.tcp.Connection.nioWriteFully(Connection.java:3280)
- locked <0x00000007400661b0> (a java.lang.Object)
at
org.apache.geode.internal.tcp.Connection.sendPreserialized(Connection.java:2517)
at
org.apache.geode.internal.tcp.MsgStreamer.realFlush(MsgStreamer.java:315)
at
org.apache.geode.internal.tcp.MsgStreamer.ensureCapacity(MsgStreamer.java:266)
at
org.apache.geode.internal.tcp.MsgStreamer.writeAsSerializedByteArray(MsgStreamer.java:917)
at
org.apache.geode.DataSerializer.writeObjectAsByteArray(DataSerializer.java:1306)
at
org.apache.geode.internal.cache.DistributedPutAllOperation$PutAllEntryData.toData(DistributedPutAllOperation.java:415)
at
org.apache.geode.internal.cache.DistributedPutAllOperation$PutAllMessage.toData(DistributedPutAllOperation.java:1238)
at
org.apache.geode.internal.InternalDataSerializer.invokeToData(InternalDataSerializer.java:2398)
at
org.apache.geode.internal.InternalDataSerializer.writeDSFID(InternalDataSerializer.java:1517)
at
org.apache.geode.internal.tcp.MsgStreamer.writeMessage(MsgStreamer.java:234)
at
org.apache.geode.distributed.internal.direct.DirectChannel.sendToMany(DirectChannel.java:394)
at
org.apache.geode.distributed.internal.direct.DirectChannel.sendToOne(DirectChannel.java:251)
at
org.apache.geode.distributed.internal.direct.DirectChannel.send(DirectChannel.java:616)
at
org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.directChannelSend(GMSMembershipManager.java:1686)
at
org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.send(GMSMembershipManager.java:1864)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.sendViaMembershipManager(ClusterDistributionManager.java:2860)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.sendOutgoing(ClusterDistributionManager.java:2780)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.sendMessage(ClusterDistributionManager.java:2819)
at
org.apache.geode.distributed.internal.ClusterDistributionManager.putOutgoing(ClusterDistributionManager.java:1523)
at
org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:555)
at
org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:263)
at
org.apache.geode.internal.cache.DistributedRegion.postPutAllSend(DistributedRegion.java:3244)
at
org.apache.geode.internal.cache.LocalRegionDataView.postPutAll(LocalRegionDataView.java:326)
at
org.apache.geode.internal.cache.LocalRegion.basicPutAll(LocalRegion.java:9698)
at
org.apache.geode.internal.cache.DistributedRegion.basicPutAll(DistributedRegion.java:3270)
at
org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9446)
at
org.apache.geode.internal.cache.LocalRegion.putAll(LocalRegion.java:9458)
at
org.apache.geode.benchmark.tasks.PrePopulateRegion.doPuts(PrePopulateRegion.java:133)
at
org.apache.geode.benchmark.tasks.PrePopulateRegion.lambda$run$0(PrePopulateRegion.java:107)
at
org.apache.geode.benchmark.tasks.PrePopulateRegion$$Lambda$74/530302711.run(Unknown
Source)
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
In server-2
{noformat}
"p2pDestreamer for P2P message reader@574d2058 msgId=476" #159 daemon prio=10
os_prio=0 tid=0x00007f3164009000 nid=0x2678 in Object.wait()
[0x00007f319d010000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at
org.apache.geode.internal.tcp.MsgDestreamer$DestreamerIS.waitForData(MsgDestreamer.java:383)
- locked <0x0000000752e1ce78> (a java.lang.Object)
at
org.apache.geode.internal.tcp.MsgDestreamer$DestreamerIS.waitForAvailableData(MsgDestreamer.java:462)
at
org.apache.geode.internal.tcp.MsgDestreamer$DestreamerIS.read(MsgDestreamer.java:503)
at java.io.DataInputStream.readByte(DataInputStream.java:265)
at
org.apache.geode.internal.InternalDataSerializer.readArrayLength(InternalDataSerializer.java:2586)
at
org.apache.geode.DataSerializer.readByteArray(DataSerializer.java:1345)
at
org.apache.geode.internal.cache.DistributedPutAllOperation$PutAllEntryData.<init>(DistributedPutAllOperation.java:338)
at
org.apache.geode.internal.cache.DistributedPutAllOperation$PutAllMessage.fromData(DistributedPutAllOperation.java:1204)
at
org.apache.geode.internal.InternalDataSerializer.invokeFromData(InternalDataSerializer.java:2470)
at org.apache.geode.internal.DSFIDFactory.create(DSFIDFactory.java:990)
at
org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2611)
at
org.apache.geode.internal.InternalDataSerializer.readDSFID(InternalDataSerializer.java:2625)
at
org.apache.geode.internal.tcp.MsgDestreamer$DestreamerThread.run(MsgDestreamer.java:259)
{noformat}
In the stats, we do not appear to be low on memory. CPU usage and GC look
normal.
> Prepopulate phase of benchmarks is taking too long
> --------------------------------------------------
>
> Key: GEODE-6233
> URL: https://issues.apache.org/jira/browse/GEODE-6233
> Project: Geode
> Issue Type: Bug
> Components: benchmarks
> Reporter: Brian Rowe
> Priority: Major
>
> We are seeing very strange behavior with the prepopulate phase of the
> benchmarks. With 10K keys, the prepopulate phase occasionally takes several
> minutes.
> Looking at the stats, we see that almost all of the data is loaded
> immediately, but there is a long tail of just a few new entries being created
> per second. This appears to be the result a single or very small number of
> PutAll operations having a hard time completing their last few values.
> When we bump up the number of keys to 1M, we see similar behavior, but the
> initial burst is an even lower fraction of the data, and we're still only
> around 500K entries created after an hour of prepopulation.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)