Philip, btw & FYI: I got a "Recipient address rejected: Your host is blacklisted." error when including your p...@s... email account directly in the To: field of the email below.

On 12/01/2012 9:06 AM, Roel Van de Paar wrote:
Hi Philip,

I found what the issue is: table creation takes a long time under Valgrind (and all the more so when running multiple threads, I presume).

Hence, before we even get to run time, a significant amount of time has passed:

# 2012-01-11T23:49:44 # Creating MySQL table: test.A; engine: innodb; rows: 100 .
# 2012-01-11T23:50:39 # Creating MySQL table: test.AA; engine: innodb; rows: 100 .
# 2012-01-11T23:51:33 # Creating MySQL table: test.B; engine: innodb; rows: 100 .
# 2012-01-11T23:52:33 # Creating MySQL table: test.BB; engine: innodb; rows: 100 .
# 2012-01-11T23:53:26 # Creating MySQL table: test.C; engine: innodb; rows: 100 .
# 2012-01-11T23:54:33 # Creating MySQL table: test.CC; engine: innodb; rows: 100 .
# 2012-01-11T23:55:30 # Creating MySQL table: test.D; engine: innodb; rows: 100 .
[... etc ...]
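
To put a number on the delay, here is a minimal Python sketch that computes the gap between consecutive "Creating MySQL table" lines quoted above. The helper name creation_gaps is made up for this illustration and is not part of RQG; the parsing simply assumes the log line layout shown in this message.

```python
from datetime import datetime

# Log lines copied verbatim from the excerpt above.
LOG = """\
# 2012-01-11T23:49:44 # Creating MySQL table: test.A; engine: innodb; rows: 100 .
# 2012-01-11T23:50:39 # Creating MySQL table: test.AA; engine: innodb; rows: 100 .
# 2012-01-11T23:51:33 # Creating MySQL table: test.B; engine: innodb; rows: 100 .
# 2012-01-11T23:52:33 # Creating MySQL table: test.BB; engine: innodb; rows: 100 .
# 2012-01-11T23:53:26 # Creating MySQL table: test.C; engine: innodb; rows: 100 .
# 2012-01-11T23:54:33 # Creating MySQL table: test.CC; engine: innodb; rows: 100 .
# 2012-01-11T23:55:30 # Creating MySQL table: test.D; engine: innodb; rows: 100 .
"""


def creation_gaps(log_text):
    """Return (table, seconds until the next table creation started) pairs.

    Hypothetical helper for this illustration only; it assumes the
    whitespace-separated layout of the lines quoted above.
    """
    entries = []
    for line in log_text.splitlines():
        parts = line.split()
        # Expected: '#', timestamp, '#', 'Creating', 'MySQL', 'table:', 'test.A;', ...
        if len(parts) < 7 or parts[3] != "Creating":
            continue
        stamp = datetime.strptime(parts[1], "%Y-%m-%dT%H:%M:%S")
        entries.append((parts[6].rstrip(";"), stamp))
    # Pair each entry with its successor and take the time difference.
    return [
        (table, int((next_stamp - stamp).total_seconds()))
        for (table, stamp), (_, next_stamp) in zip(entries, entries[1:])
    ]


gaps = creation_gaps(LOG)
for table, seconds in gaps:
    print(f"{table}: {seconds}s")
print("total:", sum(seconds for _, seconds in gaps), "s")
```

For the seven lines quoted here the gaps range from 53 to 67 seconds per table, 346 seconds in total for the first six tables alone.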

I wonder if running an optimized build under Valgrind, instead of a Valgrind-instrumented build, may work better here.


On 26/12/2011 8:21 PM, Roel Van de Paar wrote:
Hi Philip,

OK, I'll check with Bernt to see if the issue is related to the new options implemented.

In this case, I am just using a replication grammar; I am not using a replication setup. Hence, 15 threads.

(btw, ignore the binlog settings in the cc file, not really relevant here)

I have found many threads + random + short runs to work well, including for reproducing sporadic issues, as I was trying to do here.

For sporadic issues, a [slightly] overloaded server also tends to help.


On 26/12/2011 7:38 PM, Philip Stoev wrote:
Roel,

Bernt should comment on this particular problem, as he implemented those options in combinations.pl. I personally never use --parallel, and I have never had any hangs.

However, you are trying to run 30 mysqld servers under valgrind. What are the specs of the machine that is doing that? Even if it had 30 cores, does it have 30 hard drives and 30 separate memory channels to ensure that enough useful work happens within 600 seconds? My guess is that for some of the test runs, replication barely started before the 600 seconds were up.

Philip Stoev

----- Original Message ----- From: "Roel Van de Paar" <[email protected]>
To: <[email protected]>
Sent: Monday, December 26, 2011 5:18 AM
Subject: [Randgen] combinations.pl: 1 hour+ instead of 10 minutes


Hi All,

I am running into something odd: a combinations.pl run takes 6-7 times as long
as it should:

Relevant switches to combinations.pl:
============
  --run-all-combinations-once
  --parallel=15
  --force
============

Relevant settings in the .cc file:
============
 ['
  --mysqld=--log-output=none
  --mysqld=--sql_mode=ONLY_FULL_GROUP_BY
  --mysqld=--default-time-zone=UTC
  --duration=600
  --queries=100000000
  --querytimeout=5
  --reporters=Shutdown,Backtrace,QueryTimeout,ErrorLog,ErrorLogAlarm,ValgrindErrors
  --short_column_names
  --strict_fields
  --threads=1
  --valgrind
  --validators=MarkErrorLog
  --seed=132
  --mysqld=--binlog-format=MIXED
  --mysqld=--log-bin=binlog
  --mysqld=--log-bin-index=binlog.index
 '
 ],[
  '','','','','','','','','','','','','','',''
 ]
============

Result:
============
bash-4.1$ ./108.run
# 2011-12-26T03:56:30 /randgen Revno: 912
[...]
# 2011-12-26T03:56:30 Started thread [1] pid=25834
# 2011-12-26T03:56:30 [1] Running combination 1/15
# 2011-12-26T03:56:30 Started thread [2] pid=25835
# 2011-12-26T03:56:30 [2] Running combination 2/15
[... 15 threads, all at once in parallel, as expected ...]
# 2011-12-26T05:03:12 [15] runall.pl exited with exit status STATUS_OK(0), see /.../trial15.log
[...]
# 2011-12-26T05:05:14 [8] runall.pl exited with exit status STATUS_VALGRIND_FAILURE(108), see /.../trial8.log
[...]
# 2011-12-26T05:05:45 [14] runall.pl exited with exit status STATUS_OK(0), see /.../trial14.log
[...]
# 2011-12-26T05:05:45 ./combinations.pl will exit with exit status STATUS_VALGRIND_FAILURE(108)
============

It took more than one hour while --duration was set to 600!
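
As a quick check of the numbers, here is a small Python sketch that compares the wall-clock time between the first and last timestamps in the result log with the configured --duration. The function name elapsed_seconds is invented for this illustration; the timestamps and the 600 second duration are taken from the output and settings above.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"


def elapsed_seconds(start, end):
    """Wall-clock seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


# First and last timestamps from the result log above.
start = "2011-12-26T03:56:30"
end = "2011-12-26T05:05:45"
duration = 600  # value of --duration in the .cc file

wall = elapsed_seconds(start, end)
ratio = wall / duration
print(f"wall clock: {wall}s, configured: {duration}s, ratio: {ratio:.1f}x")
```

That is 4155 seconds of wall-clock time against a configured 600, so the whole run took close to 7 times the requested duration.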

I think this happens more often.

Some ideas:
- Is there some function which "delays" terminating RQG if not enough
"real" time has been processed or something?
- Could it be related to Valgrind runs?
- A 1 hour offset somewhere which causes 1+ hour runs?

Any input/ideas?

-- 
Kind regards,
God Bless,

Roel Van de Paar | Senior QA Engineer
Oracle MySQL Server QA
Oracle Australia | NSW 2440





_______________________________________________
Mailing list: https://launchpad.net/~randgen
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~randgen
More help   : https://help.launchpad.net/ListHelp
