[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-18 Thread Stefania (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691164#comment-16691164
 ] 

Stefania commented on CASSANDRA-14616:
--

[~Yarnspinner], [~jay.zhuang], are you OK with committing Jay's approach? I 
don't mind much which approach we pick; both are OK, and it's just a matter of 
choosing a default value.

Jay, you are a committer, correct? So if Jeremy is OK with committing your patch, I 
assume you would prefer to merge it yourself?

> cassandra-stress write hangs with default options
> -
>
> Key: CASSANDRA-14616
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14616
> Project: Cassandra
>  Issue Type: Bug
>  Components: Stress
>Reporter: Chris Lohfink
>Assignee: Jeremy
>Priority: Major
>
> Cassandra stress sits there for incredibly long time after connecting to JMX. 
> To reproduce {code}./tools/bin/cassandra-stress write{code}
> If you give it a -n its not as bad which is why dtests etc dont seem to be 
> impacted. Does not occur in 3.0 branch but does in 3.11 and trunk



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Stefania (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687400#comment-16687400
 ] 

Stefania commented on CASSANDRA-14616:
--

[~jay.zhuang] , [~Yarnspinner] , thanks for fixing this problem and writing the 
test.

I checked both patches, and they are both good. Perhaps Jay's approach of assuming 
50,000 iterations for duration tests is slightly preferable, since that was the 
old behavior.

 




[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687347#comment-16687347
 ] 

Jay Zhuang commented on CASSANDRA-14616:


The uTest failure is caused by CASSANDRA-14891.




[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-11-14 Thread Jay Zhuang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687300#comment-16687300
 ] 

Jay Zhuang commented on CASSANDRA-14616:


Hi [~Yarnspinner], the fix looks good. I had a similar fix, which restores the 
{{warm-up}} iteration count to 50k as before 
([{{StressAction.java}}|https://github.com/apache/cassandra/commit/6a1b1f26b7174e8c9bf86a96514ab626ce2a4117#diff-fd2f2d2364937fcb1c0d73c8314f1418L90]).

|Branch|uTest|dTest|
|[14890-3.0|https://github.com/cooldoger/cassandra/tree/14890-3.0]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.0]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/661/]|
|[14890-3.11|https://github.com/cooldoger/cassandra/tree/14890-3.11]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-3.11]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/662/]|
|[14890-trunk|https://github.com/cooldoger/cassandra/tree/14890-trunk]|[!https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk.svg?style=svg!|https://circleci.com/gh/cooldoger/cassandra/tree/14890-trunk]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/663/]|

Here is a dTest to reproduce the problem:
|[14890|https://github.com/cooldoger/cassandra-dtest/tree/14890]|




[jira] [Commented] (CASSANDRA-14616) cassandra-stress write hangs with default options

2018-08-13 Thread Jeremy (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578028#comment-16578028
 ] 

Jeremy commented on CASSANDRA-14616:


Hello, I would like to try solving this issue.

I have done some preliminary testing, and the hang appears to be caused by 
cassandra-stress waiting for the uncertainty to stabilize. The trace from jstack 
is included below.

{quote}
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00076d873e20> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at org.apache.cassandra.stress.util.Uncertainty$WaitForTargetUncertainty.await(Uncertainty.java:56)
at org.apache.cassandra.stress.util.Uncertainty.await(Uncertainty.java:85)
at org.apache.cassandra.stress.report.StressMetrics.waitUntilConverges(StressMetrics.java:135)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:269)
at org.apache.cassandra.stress.StressAction.warmup(StressAction.java:121)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:70)
at org.apache.cassandra.stress.Stress.run(Stress.java:143)
at org.apache.cassandra.stress.Stress.main(Stress.java:62)
{quote}

I also added some println statements for debugging in 3.11:
{quote}
uncertainty: NaN targetUncertainty: 0.02 
measurements: 1 minMeasurements: 30 
measurements: 1 maxMeasurements: 200 
uncertainty: NaN targetUncertainty: 0.02 
measurements: 2 minMeasurements: 30 
measurements: 2 maxMeasurements: 200 
...
uncertainty: NaN targetUncertainty: 0.02 
measurements: 200 minMeasurements: 30 
measurements: 200 maxMeasurements: 200 
{quote}

In the warmup phase, the program waits until either the uncertainty falls below 
0.02 or 200 measurements have been taken. Because the uncertainty is always NaN, 
it ends up waiting for all 200 measurements. The same problem doesn't occur in 
3.0 because the Runnable 
(https://github.com/apache/cassandra/blob/cassandra-3.0/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java#L86) 
calls wakeAll after 2 iterations. However, the uncertainty is still always NaN 
in 3.0.

The problem arises in 3.11 and trunk because that Runnable loop was refactored 
into reportingLoop, which waits for all 200 measurements first: 
https://github.com/apache/cassandra/blob/cassandra-3.11/tools/stress/src/org/apache/cassandra/stress/report/StressMetrics.java#L154

Here's what it looks like for 3.0.
{quote}
Warming up WRITE with 0 iterations...
___
Updated value: NaN 
uncertainty: NaN targetUncertainty: 0.02 
measurements: 1 minMeasurements: 30 
measurements: 1 maxMeasurements: 200 
___
Updated value: NaN 
uncertainty: NaN targetUncertainty: 0.02 
measurements: 2 minMeasurements: 30 
measurements: 2 maxMeasurements: 200 
latch counted down via wakeall
wakeAll via line 123 in stressmetrics
WARNING: uncertainty mode (err<) results in uneven workload between thread 
runs, so should be used for high level analysis only
Running with 4 threadCount
{quote}
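
The stopping rule described earlier (uncertainty below 0.02, or 200 measurements) can be illustrated with a minimal sketch. This is not the cassandra-stress code: the class and method names are hypothetical, and gating convergence on the minimum of 30 measurements is an assumption based on the minMeasurements value in the debug output. It only shows why a NaN uncertainty can never satisfy the convergence test, so the loop runs until the maximum.

```java
// Hypothetical sketch; names are illustrative and not part of cassandra-stress.
final class ConvergenceSketch {

    // Returns true once the run may stop: either the uncertainty has converged
    // (assumed to require at least minMeasurements), or maxMeasurements was reached.
    static boolean shouldStop(double uncertainty, double targetUncertainty,
                              int measurements, int minMeasurements, int maxMeasurements) {
        // Every ordered comparison involving NaN is false in Java,
        // so this is never true while the uncertainty is NaN.
        boolean converged = measurements >= minMeasurements && uncertainty < targetUncertainty;
        return converged || measurements >= maxMeasurements;
    }

    // Counts how many measurements are taken before shouldStop allows the run to end,
    // using the values from the debug output (target 0.02, min 30, max 200).
    static int measurementsUntilStop(double uncertainty) {
        int measurements = 0;
        do {
            measurements++;
        } while (!shouldStop(uncertainty, 0.02, measurements, 30, 200));
        return measurements;
    }

    public static void main(String[] args) {
        System.out.println(measurementsUntilStop(Double.NaN)); // prints 200
        System.out.println(measurementsUntilStop(0.01));       // prints 30
    }
}
```

With a NaN uncertainty the only way out is the measurement cap, which matches the 200 measurements seen in the 3.11 output.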

I think this is caused by the warmup having 0 iterations. The number of 
iterations is computed at the start as 
{{Math.min(5, (int) (settings.command.count * 0.25)) * settings.node.nodes.size();}} 
(https://github.com/apache/cassandra/blob/cassandra-3.11/tools/stress/src/org/apache/cassandra/stress/StressAction.java#L108).

When {{./tools/bin/cassandra-stress write}} is called without any arguments, 
settings.command.count evaluates to -1. The cast truncates -0.25 to 0, so 
{{Math.min(5, (int) (settings.command.count * 0.25)) * settings.node.nodes.size();}} 
evaluates to 0 and we always end up with 0 iterations.
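
The arithmetic can be checked in isolation. The sketch below reproduces the expression as quoted above, with the cap, command count, and node count passed as plain parameters; the class and method names are hypothetical stand-ins for the settings objects.

```java
// Hypothetical sketch of the quoted expression; not the actual StressAction code.
final class WarmupIterationsSketch {

    // cap:          first argument of Math.min in the quoted expression
    // commandCount: stands in for settings.command.count (-1 when no count is given)
    // nodeCount:    stands in for settings.node.nodes.size()
    static int warmupIterations(int cap, long commandCount, int nodeCount) {
        // The (int) cast truncates toward zero, so -0.25 becomes 0.
        return Math.min(cap, (int) (commandCount * 0.25)) * nodeCount;
    }

    public static void main(String[] args) {
        System.out.println(warmupIterations(5, -1, 1));   // prints 0
        System.out.println(warmupIterations(5, 1000, 3)); // prints 15
    }
}
```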

One proposed fix is to choose a minimum nonzero value for iterations in the 
warmup phase, something like 
https://github.com/yarnspinnered/cassandra/commit/33cf059f63b56ac17a3f66869615d3d7cc52f8a9
 . I tried this and it no longer hangs, but I'm not sure about the exact value or 
whether there is a better way to fix this.
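
As a rough illustration only, two ways of avoiding a zero warmup count are sketched below: a floor on the computed value, as proposed here, and a fixed fallback when no count is given, along the lines of the 50,000-iteration default discussed elsewhere in this thread. Neither method is taken from the linked patches; the names, the floor of 1, and the parameter layout are all assumptions.

```java
// Hypothetical sketch; neither variant is copied from the actual patches.
final class WarmupFixSketch {

    // Variant 1 (assumed): never let the per-node warmup count drop below a floor.
    static int withFloor(int cap, long commandCount, int nodeCount, int floor) {
        int perNode = Math.max(floor, Math.min(cap, (int) (commandCount * 0.25)));
        return perNode * nodeCount;
    }

    // Variant 2 (assumed): when no count is given (negative), use a fixed fallback.
    static int withFallback(int cap, long commandCount, int nodeCount, int fallback) {
        int perNode = commandCount >= 0
                ? Math.min(cap, (int) (commandCount * 0.25))
                : fallback;
        return perNode * nodeCount;
    }

    public static void main(String[] args) {
        // No count given (-1), single node, cap as quoted in the expression above.
        System.out.println(withFloor(5, -1, 1, 1));          // prints 1
        System.out.println(withFallback(5, -1, 1, 50_000));  // prints 50000
    }
}
```

The two variants differ only in the default they imply when no count is supplied, which is the choice the discussion in this thread comes down to.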
