Hi Loic,

We rebased our teuthology/ceph-qa-suite and re-ran the LRC test against 
current master.
Unfortunately, we got the same result as before (a timeout error).

[test conditions]
Target: Ceph-9.0.0-971-gd49d816
https://github.com/kawaguchi-s/teuthology
https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886-lrc

[teuthology log]

2015-05-25 10:18:23     # start

2015-05-25 11:59:52,106.106 INFO:teuthology.orchestra.run.RX35-1:Running: 
'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph 
status --format=json-pretty'
2015-05-25 11:59:52,564.564 INFO:tasks.ceph.ceph_manager:no progress seen, 
keeping timeout for now
2015-05-25 11:59:52,565.565 INFO:tasks.thrashosds.thrasher:Traceback (most 
recent call last):
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in 
wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in 
do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in 
wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired

Traceback (most recent call last):
  File 
"/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py",
 line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in 
wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in 
do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in 
wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired <Greenlet at 
0x36cacd0: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher 
instance at 0x36df3f8>>> failed with AssertionError
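
For what it's worth, the assertion comes from the thrasher's recovery wait: 
it polls the cluster status until all PGs have recovered or the configured 
timeout expires. A minimal sketch of that pattern (illustrative names only, 
not the actual ceph_manager.py code):

```python
import time

def wait_for_recovery(is_clean, timeout, poll_interval=1):
    """Poll is_clean() until it returns True or `timeout` seconds elapse.

    Mirrors the pattern behind the AssertionError above: if the cluster
    never reports full recovery within the window, the wait fails.
    (Illustrative sketch, not the real ceph_manager.py implementation.)
    """
    start = time.time()
    while not is_clean():
        assert time.time() - start < timeout, \
            'failed to recover before timeout expired'
        time.sleep(poll_interval)

# Example: a "cluster" that becomes clean after three polls recovers in time.
state = {'polls': 0}

def fake_is_clean():
    state['polls'] += 1
    return state['polls'] >= 3

wait_for_recovery(fake_is_clean, timeout=30, poll_interval=0)
```

Since our thrashosds timeout is 1200 seconds, raising it temporarily might 
tell us whether recovery is merely slow or genuinely stuck.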

Best regards,
Takeshi Miyamae

-----Original Message-----
From: Loic Dachary [mailto:l...@dachary.org] 
Sent: Thursday, May 21, 2015 6:38 PM
To: Miyamae, Takeshi/宮前 剛; Ceph Development
Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; 
Shiozawa, Kensuke/塩沢 賢輔
Subject: Re: teuthology timeout error

Hi,

[sorry the previous mail was sent by accident, here is the full mail]

On 21/05/2015 10:32, Miyamae, Takeshi wrote:
> Hi Loic,
> 
>> Could you please share the teuthology/ceph-qa-suite repository you 
>> are using to run these tests so I can try to reproduce / diagnose the 
>> problem ?
> 
> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
> 

When compared against master they show differences that indicate it would be 
good to rebase:

https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886

I think the teuthology commit on top of wip-10886 is a mistake:

https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5

Do you really need to modify teuthology? It should just be necessary to use 
the latest master branch.

It looks like the

https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3

commit in your ceph-qa-suite is not what you intended. However

https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44

look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml, it would 
be better to use the same kind of naming you see at 
https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads,
 that is, a file name made of the distinctive parameters for the shec plugin 
(parameters left at their defaults can be omitted).
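
For instance, a hypothetical workload file following that convention 
(illustrative path and parameter values, not an existing file):

```yaml
# hypothetical: suites/rados/thrash-erasure-code-shec/workloads/
#   ec-rados-plugin=shec-k=4-m=3-c=2.yaml
tasks:
- rados:
    clients: [client.0]
    ec_pool: true
    erasure_code_profile:
      name: shecprofile
      plugin: shec
      k: 4
      m: 3
      c: 2
      ruleset-failure-domain: osd
    objects: 50
    op_weights:
      read: 100
      write: 100
```

That way the file name itself documents how the test differs from the 
defaults.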

Cheers

> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:l...@dachary.org]
> Sent: Wednesday, May 20, 2015 4:49 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 
> 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>> When we fixed our own issue and restarted teuthology,
> 
> Great !
> 
>> we encountered another issue (a timeout error), which occurs in the case of 
>> an LRC pool as well.
>> Do you have any information about that ?
> 
> Could you please share the teuthology/ceph-qa-suite repository you are using 
> to run these tests so I can try to reproduce / diagnose the problem ?
> 
> Thanks
> 
>>
>> [error messages (in case of LRC pool)]
>>
>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 
>> 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph 
>> status --format=json-pretty'
>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress 
>> seen, keeping timeout for now
>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most 
>> recent call last):
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in 
>> wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in 
>> do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in 
>> wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>>
>> Traceback (most recent call last):
>>   File 
>> "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py",
>>  line 390, in run
>>     result = self._run(*self.args, **self.kwargs)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in 
>> wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in 
>> do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in 
>> wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired <Greenlet at 
>> 0x2a7d550: <bound method Thrasher.do_thrash of 
>> <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with 
>> AssertionError
>>
>> [ceph version]
>> 0.93-952-gfe28daa
>>
>> [teuthology, ceph-qa-suite]
>> newest version at 3/25/2015
>>
>> [configurations]
>>   check-locks: false
>>   overrides:
>>     ceph:
>>       conf:
>>         global:
>>           ms inject socket failures: 5000
>>         osd:
>>           osd heartbeat use min delay socket: true
>>           osd sloppy crc: true
>>       fs: xfs
>>   roles:
>>   - - mon.a
>>     - osd.0
>>     - osd.4
>>     - osd.8
>>     - osd.12
>>   - - mon.b
>>     - osd.1
>>     - osd.5
>>     - osd.9
>>     - osd.13
>>   - - mon.c
>>     - osd.2
>>     - osd.6
>>     - osd.10
>>     - osd.14
>>   - - osd.3
>>     - osd.7
>>     - osd.11
>>     - osd.15
>>     - client.0
>>   targets:
>>     ubu...@rx35-1.primary.ceph-poc.fsc.net:
>>     ubu...@rx35-2.primary.ceph-poc.fsc.net:
>>     ubu...@rx35-3.primary.ceph-poc.fsc.net:
>>     ubu...@rx35-4.primary.ceph-poc.fsc.net:
>>   tasks:
>>   - ceph:
>>       conf:
>>         osd:
>>           osd debug reject backfill probability: 0.3
>>           osd max backfills: 1
>>           osd scrub max interval: 120
>>           osd scrub min interval: 60
>>       log-whitelist:
>>       - wrongly marked me down
>>       - objects unfound and apparently lost
>>   - thrashosds:
>>       chance_pgnum_grow: 1
>>       chance_pgpnum_fix: 1
>>       min_in: 4
>>       timeout: 1200
>>   - rados:
>>       clients:
>>       - client.0
>>       ec_pool: true
>>       erasure_code_profile:
>>         k: 4
>>         l: 3
>>         m: 2
>>         name: lrcprofile
>>         plugin: lrc
>>         ruleset-failure-domain: osd
>>       objects: 50
>>       op_weights:
>>         append: 100
>>         copy_from: 50
>>         delete: 50
>>         read: 100
>>         rmattr: 25
>>         rollback: 50
>>         setattr: 25
>>         snap_create: 50
>>         snap_remove: 50
>>         write: 0
>>       ops: 190000
>>
>> Best regards,
>> Takeshi Miyamae
>>
> 

--
Loïc Dachary, Artisan Logiciel Libre


