nightlies on firefly removed
As we are nearing EOL for the firefly release, all nightlies run on the firefly branch have been disabled. Let me know if this presents any problems.

Thx
YuriW
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
nightlies moved to ovh (openstack)
In preparation for the sepia lab move, the nightlies' schedules were moved to the ovh (openstack) lab. See details here: http://tracker.ceph.com/projects/ceph-releases/wiki/ovh

Please let me know if you see any problems.

PS: I will optimize times/frequencies over the next several days/weeks.

Thx
YuriW
Re: v0.80.11 QE validation status
This release's QE validation took longer due to the additional fixing/testing of #11104 and the related issues discovered along the way, #13794 and #13622.

We agreed to release v0.80.11 based on the test results.

Thx
YuriW

On Wed, Oct 28, 2015 at 9:04 AM, Yuri Weinstein <ywein...@redhat.com> wrote:
> Summary of suites executed for this release can be found in
> http://tracker.ceph.com/issues/11644
>
> rados - 1/7th passed
> rbd - http://tracker.ceph.com/issues/11104
> rgw - http://tracker.ceph.com/issues/11104
> fs - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13630
> krbd - http://tracker.ceph.com/issues/13631
> kcephfs - http://tracker.ceph.com/issues/13631, http://tracker.ceph.com/issues/13630
> samba - http://tracker.ceph.com/issues/6613, same as in v0.80.10, which was approved for release
> ceph-deploy(ubuntu_) - almost passed, 1 job is still running
> ceph-deploy(distros) - http://tracker.ceph.com/issues/13367
> upgrade/dumpling-x (to firefly)(distros) - passed
> upgrade/firefly(distros) - passed
> upgrades to giant - deprecated
> upgrade/firefly-x (to hammer)(distros) - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13632
> powercycle - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13631
>
> All found problems seem unrelated to the product; however, they
> prevented some tests from running. In particular, #11104 is widespread
> and has to be fixed (see also http://tracker.ceph.com/issues/13622 for a
> proposed workaround).
>
> I suggest rerunning the failed tests after addressing the issues above.
>
> Thx
> YuriW
Re: v0.80.11 QE validation status
Loic, I am not actually sure about resolving #11104. Warren?

Thx
YuriW

On Mon, Nov 16, 2015 at 1:04 PM, Loic Dachary <ldach...@redhat.com> wrote:
> Hi Yuri,
>
> Thanks for the update :-) Should we mark #11104 as resolved?
>
> Cheers
>
> On 16/11/2015 19:45, Yuri Weinstein wrote:
>> This release's QE validation took longer due to the additional
>> fixing/testing of #11104 and the related issues discovered along the
>> way, #13794 and #13622.
>>
>> We agreed to release v0.80.11 based on the test results.
>>
>> Thx
>> YuriW
>>
>> On Wed, Oct 28, 2015 at 9:04 AM, Yuri Weinstein <ywein...@redhat.com> wrote:
>>> Summary of suites executed for this release can be found in
>>> http://tracker.ceph.com/issues/11644
>>>
>>> rados - 1/7th passed
>>> rbd - http://tracker.ceph.com/issues/11104
>>> rgw - http://tracker.ceph.com/issues/11104
>>> fs - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13630
>>> krbd - http://tracker.ceph.com/issues/13631
>>> kcephfs - http://tracker.ceph.com/issues/13631, http://tracker.ceph.com/issues/13630
>>> samba - http://tracker.ceph.com/issues/6613, same as in v0.80.10, which was approved for release
>>> ceph-deploy(ubuntu_) - almost passed, 1 job is still running
>>> ceph-deploy(distros) - http://tracker.ceph.com/issues/13367
>>> upgrade/dumpling-x (to firefly)(distros) - passed
>>> upgrade/firefly(distros) - passed
>>> upgrades to giant - deprecated
>>> upgrade/firefly-x (to hammer)(distros) - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13632
>>> powercycle - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13631
>>>
>>> All found problems seem unrelated to the product; however, they
>>> prevented some tests from running. In particular, #11104 is widespread
>>> and has to be fixed (see also http://tracker.ceph.com/issues/13622 for
>>> a proposed workaround).
>>>
>>> I suggest rerunning the failed tests after addressing the issues above.
>>>
>>> Thx
>>> YuriW
suites' runs on jewel added to the schedule
(rados suite/jewel - on hold for the time being to avoid queue overload)

Other suites have been added to the schedule: http://tracker.ceph.com/projects/ceph-releases/wiki/Sepia

Pls let me know if you see any problems or issues.

Thx
YuriW
giant suites removed from nightlies
As giant was declared EOL, all related suites have been removed from the schedule:

#giant EOL 15 18 * * 3,6 teuthology-suite -v -c giant -k distro -m vps -s upgrade/dumpling-firefly-x ~/vps.yaml
#giant EOL 18 18 * * 3,6 teuthology-suite -v -c giant -k distro -m vps -s upgrade/firefly-x ~/vps.yaml
#giant EOL 05 17 * * 1,5 teuthology-suite -v -c hammer -k distro -m vps -s upgrade/giant-x ~/vps.yaml

Thx
YuriW
v0.80.11 QE validation status
Summary of suites executed for this release can be found in http://tracker.ceph.com/issues/11644

rados - 1/7th passed
rbd - http://tracker.ceph.com/issues/11104
rgw - http://tracker.ceph.com/issues/11104
fs - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13630
krbd - http://tracker.ceph.com/issues/13631
kcephfs - http://tracker.ceph.com/issues/13631, http://tracker.ceph.com/issues/13630
samba - http://tracker.ceph.com/issues/6613, same as in v0.80.10, which was approved for release
ceph-deploy(ubuntu_) - almost passed, 1 job is still running
ceph-deploy(distros) - http://tracker.ceph.com/issues/13367
upgrade/dumpling-x (to firefly)(distros) - passed
upgrade/firefly(distros) - passed
upgrades to giant - deprecated
upgrade/firefly-x (to hammer)(distros) - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13632
powercycle - http://tracker.ceph.com/issues/11104, http://tracker.ceph.com/issues/13631

All found problems seem unrelated to the product; however, they prevented some tests from running. In particular, #11104 is widespread and has to be fixed (see also http://tracker.ceph.com/issues/13622 for a proposed workaround).

I suggest rerunning the failed tests after addressing the issues above.

Thx
YuriW
Re: timeout 120 teuthology-kill is highly recommended
I was thinking of teuthology-nuke though!

Thx
YuriW

- Original Message -
From: Yuri Weinstein ywein...@redhat.com
To: Loic Dachary l...@dachary.org
Cc: Ceph Development ceph-devel@vger.kernel.org
Sent: Tuesday, July 21, 2015 9:33:26 AM
Subject: Re: timeout 120 teuthology-kill is highly recommended

Loic, I don't use teuthology-kill simultaneously, only sequentially.

As far as run time goes, just as a note: when we use the 'stale' arg and it invokes the ipmitool interface, it does take a while to finish.

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Ceph Development ceph-devel@vger.kernel.org
Sent: Tuesday, July 21, 2015 9:13:04 AM
Subject: timeout 120 teuthology-kill is highly recommended

Hi Ceph,

Today I did something wrong and that blocked the lab for a good half hour:

a) I ran two teuthology-kill simultaneously, and that made them deadlock each other
b) I let them run unattended, only to come back to the terminal 30 minutes later and see them stuck

Sure, two simultaneous teuthology-kill runs should not deadlock, and that needs to be fixed. But the easy workaround to avoid that trouble is to just not let it run forever. Even for ~200 jobs it takes at most a minute or two, and if it takes longer it probably means another teuthology-kill is competing and it should be interrupted and restarted later.

From now on I'll do "timeout 120 teuthology-kill || echo FAIL!" as a generic safeguard.

Apologies for the troubles.

--
Loïc Dachary, Artisan Logiciel Libre
Re: timeout 120 teuthology-kill is highly recommended
Loic, I don't use teuthology-kill simultaneously, only sequentially.

As far as run time goes, just as a note: when we use the 'stale' arg and it invokes the ipmitool interface, it does take a while to finish.

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Ceph Development ceph-devel@vger.kernel.org
Sent: Tuesday, July 21, 2015 9:13:04 AM
Subject: timeout 120 teuthology-kill is highly recommended

Hi Ceph,

Today I did something wrong and that blocked the lab for a good half hour:

a) I ran two teuthology-kill simultaneously, and that made them deadlock each other
b) I let them run unattended, only to come back to the terminal 30 minutes later and see them stuck

Sure, two simultaneous teuthology-kill runs should not deadlock, and that needs to be fixed. But the easy workaround to avoid that trouble is to just not let it run forever. Even for ~200 jobs it takes at most a minute or two, and if it takes longer it probably means another teuthology-kill is competing and it should be interrupted and restarted later.

From now on I'll do "timeout 120 teuthology-kill || echo FAIL!" as a generic safeguard.

Apologies for the troubles.

--
Loïc Dachary, Artisan Logiciel Libre
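[Archive note] The safeguard described above generalizes to a tiny wrapper; this is a sketch in the spirit of the mail (the `run_bounded` name and FAIL message are illustrative, not from the thread, and `sleep` stands in for a `teuthology-kill` invocation):

```shell
#!/bin/sh
# Bound any potentially-hanging command with a timeout, as the mail
# suggests doing for teuthology-kill so a deadlock cannot block the lab.
# Usage: run_bounded TIMEOUT_SECONDS COMMAND [ARGS...]
run_bounded() {
    t=$1; shift
    # GNU coreutils timeout kills the command after $t seconds (exit 124)
    timeout "$t" "$@" || echo "FAIL: $* did not finish within ${t}s"
}

run_bounded 120 true       # finishes instantly, prints nothing
run_bounded 1 sleep 5      # killed after 1s, prints the FAIL line
```

In the mail the concrete form is `timeout 120 teuthology-kill || echo FAIL!`; the helper just parameterizes the same pattern so the timeout can be tuned per command.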
Re: VPS memory
I am all for it!

Thx
YuriW

- Original Message -
From: Sage Weil s...@newdream.net
To: Yuri Weinstein ywein...@redhat.com
Cc: se...@ceph.com, ceph-devel@vger.kernel.org, Loic Dachary ldach...@redhat.com, Xinxin Shu xinxin@intel.com, Alfredo Deza ad...@redhat.com
Sent: Wednesday, June 17, 2015 8:13:17 PM
Subject: VPS memory

On Wed, 17 Jun 2015, Yuri Weinstein wrote:
> - upgrade/dumpling-firefly-x (to hammer)(distros) - runs out of memory on vps and is unreliable

How about we:
- halve the number of vps's per node
- double the default ram per vps instance
- double the cpu?

It'll mean lower test throughput, but (hopefully) reliable test results. Given the amount of time we waste sifting through noisy results, that seems like a better path?

sage
firefly v0.80.10 QE validation completed
firefly v0.80.10 is ready for publishing (Sage, Alfredo, Loic, Xinxin FYI)

All results details were summarized in http://tracker.ceph.com/issues/11090

Notes:
- upgrade/dumpling-firefly-x (to hammer)(distros) - runs out of memory on vps and is unreliable
- #11957 fixed (thanks Ilya, Zack!) and tested

Thx
YuriW

- Original Message -
From: Yuri Weinstein ywein...@redhat.com
To: Ceph Development ceph-devel@vger.kernel.org
Cc: Loic Dachary ldach...@redhat.com, Xinxin Shu xinxin@intel.com
Sent: Monday, June 15, 2015 9:37:20 AM
Subject: firefly v0.80.10 QE validation status 6/15/2015

QE validation is almost completed (there are a couple of jobs that are still running).

All status details were summarized in http://tracker.ceph.com/issues/11090

Highlights (by suite/issue):
rados - #11914, needs Sam's approval
kcephfs - n/a, needs Greg's approval
samba - #6613, needs Greg's approval

Nice-to-have fixes (no blockers):
upgrade/firefly(distros) - #11957 (env noise)

Thx
YuriW
Re: v9.0.1 released
Sage, we are still running nightlies on next and branches. Just wanted to reaffirm that it is not yet time to start scheduling suites on infernalis?

Thx
YuriW

- Original Message -
From: Sage Weil sw...@redhat.com
To: ceph-annou...@ceph.com, ceph-devel@vger.kernel.org, ceph-us...@ceph.com, ceph-maintain...@ceph.com
Sent: Thursday, June 11, 2015 10:06:38 AM
Subject: v9.0.1 released

This development release is delayed a bit due to tooling changes in the build environment. As a result the next one (v9.0.2) will have a bit more work than is usual. Highlights here include lots of RGW Swift fixes, RBD feature work surrounding the new object map feature, more CephFS snapshot fixes, and a few important CRUSH fixes.

Notable Changes
---------------
* auth: cache/reuse crypto lib key objects, optimize msg signature check (Sage Weil)
* build: allow tcmalloc-minimal (Thorsten Behrens)
* build: do not build ceph-dencoder with tcmalloc (#10691 Boris Ranto)
* build: fix pg ref disabling (William A. Kennington III)
* build: install-deps.sh improvements (Loic Dachary)
* build: misc fixes (Boris Ranto, Ken Dreyer, Owen Synge)
* ceph-authtool: fix return code on error (Gerhard Muntingh)
* ceph-disk: fix zap sgdisk invocation (Owen Synge, Thorsten Behrens)
* ceph-disk: pass --cluster arg on prepare subcommand (Kefu Chai)
* ceph-fuse, libcephfs: drop inode when rmdir finishes (#11339 Yan, Zheng)
* ceph-fuse,libcephfs: fix uninline (#11356 Yan, Zheng)
* ceph-monstore-tool: fix store-copy (Huangjun)
* common: add perf counter descriptions (Alyona Kiseleva)
* common: fix throttle max change (Henry Chang)
* crush: fix crash from invalid 'take' argument (#11602 Shiva Rkreddy, Sage Weil)
* crush: fix divide-by-2 in straw2 (#11357 Yann Dupont, Sage Weil)
* deb: fix rest-bench-dbg and ceph-test-dbg dependencies (Ken Dreyer)
* doc: document region hostnames (Robin H. Johnson)
* doc: update release schedule docs (Loic Dachary)
* init-radosgw: run radosgw as root (#11453 Ken Dreyer)
* librados: fadvise flags per op (Jianpeng Ma)
* librbd: allow additional metadata to be stored with the image (Haomai Wang)
* librbd: better handling for dup flatten requests (#11370 Jason Dillaman)
* librbd: cancel in-flight ops on watch error (#11363 Jason Dillaman)
* librbd: default new images to format 2 (#11348 Jason Dillaman)
* librbd: fast diff implementation that leverages object map (Jason Dillaman)
* librbd: fix snapshot creation when other snap is active (#11475 Jason Dillaman)
* librbd: new diff_iterate2 API (Jason Dillaman)
* librbd: object map rebuild support (Jason Dillaman)
* logrotate.d: prefer service over invoke-rc.d (#11330 Win Hierman, Sage Weil)
* mds: avoid getting stuck in XLOCKDONE (#11254 Yan, Zheng)
* mds: fix integer truncation on large client ids (Henry Chang)
* mds: many snapshot and stray fixes (Yan, Zheng)
* mds: persist completed_requests reliably (#11048 John Spray)
* mds: separate safe_pos in Journaler (#10368 John Spray)
* mds: snapshot rename support (#3645 Yan, Zheng)
* mds: warn when clients fail to advance oldest_client_tid (#10657 Yan, Zheng)
* misc cleanups and fixes (Danny Al-Gaaf)
* mon: fix average utilization calc for 'osd df' (Mykola Golub)
* mon: fix variance calc in 'osd df' (Sage Weil)
* mon: improve callout to crushtool (Mykola Golub)
* mon: prevent bucket deletion when referenced by a crush rule (#11602 Sage Weil)
* mon: prime pg_temp when CRUSH map changes (Sage Weil)
* monclient: flush_log (John Spray)
* msgr: async: many many fixes (Haomai Wang)
* msgr: simple: fix clear_pipe (#11381 Haomai Wang)
* osd: add latency perf counters for tier operations (Xinze Chi)
* osd: avoid multiple hit set insertions (Zhiqiang Wang)
* osd: break PG removal into multiple iterations (#10198 Guang Yang)
* osd: check scrub state when handling map (Jianpeng Ma)
* osd: fix endless repair when object is unrecoverable (Jianpeng Ma, Kefu Chai)
* osd: fix pg resurrection (#11429 Samuel Just)
* osd: ignore non-existent osds in unfound calc (#10976 Mykola Golub)
* osd: increase default max open files (Owen Synge)
* osd: prepopulate needs_recovery_map when only one peer has missing (#9558 Guang Yang)
* osd: relax reply order on proxy read (#11211 Zhiqiang Wang)
* osd: skip promotion for flush/evict op (Zhiqiang Wang)
* osd: write journal header on clean shutdown (Xinze Chi)
* qa: run-make-check.sh script (Loic Dachary)
* rados bench: misc fixes (Dmitry Yatsushkevich)
* rados: fix error message on failed pool removal (Wido den Hollander)
* radosgw-admin: add 'bucket check' function to repair bucket index (Yehuda Sadeh)
* rbd: allow unmapping by spec (Ilya Dryomov)
* rbd: deprecate --new-format option (Jason Dillaman)
* rgw: do not set content-type if length is 0 (#11091 Orit Wasserman)
* rgw: don't use end_marker for namespaced object listing (#11437 Yehuda Sadeh)
* rgw: fail if parts not specified on multipart upload (#11435 Yehuda Sadeh)
* rgw: fix GET on swift account when
firefly v0.80.10 QE validation status 6/15/2015
QE validation is almost completed (there are a couple of jobs that are still running).

All status details were summarized in http://tracker.ceph.com/issues/11090

Highlights (by suite/issue):
rados - #11914, needs Sam's approval
kcephfs - n/a, needs Greg's approval
samba - #6613, needs Greg's approval

Nice-to-have fixes (no blockers):
upgrade/firefly(distros) - #11957 (env noise)

Thx
YuriW
Re: firefly branch for v0.80.10 ready for QE
Then it is still in the queue, and I will reschedule it to pick up the latest code.

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Xinxin Shu xinxin@intel.com
Sent: Friday, June 5, 2015 10:26:20 AM
Subject: Re: firefly branch for v0.80.10 ready for QE

On 05/06/2015 16:54, Yuri Weinstein wrote:
> Loic
> Thx for the heads up. Would that touch only the rgw suite?

Yes.

> Thx
> YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Xinxin Shu xinxin@intel.com
Sent: Friday, June 5, 2015 2:11:44 AM
Subject: Re: firefly branch for v0.80.10 ready for QE

Hi Yuri,

A month has passed since this mail was sent, and the firefly branch has a few additional commits. All but one (https://github.com/ceph/ceph/pull/4829, with more information at http://tracker.ceph.com/issues/11890) have been tested. This exception seems harmless, but I thought you should know.

For the record, the head of the firefly branch is now https://github.com/ceph/ceph/commit/d0f9c5f47024f53b4eccea2e0fde9b7844746362 and http://tracker.ceph.com/issues/11090#Release-information has been updated accordingly.

Cheers

On 29/05/2015 18:18, Loic Dachary wrote:
> Hi Yuri,
>
> The firefly branch for v0.80.10 as found at https://github.com/ceph/ceph/commits/firefly has been approved by Greg, Yehuda, Josh and Sam and is ready for QE.
>
> For the record, the head is https://github.com/ceph/ceph/commit/071c94385ee71b86c5ed8363d56cf299da1aa7b3 and the details of the tests run are at http://tracker.ceph.com/issues/11090
>
> Cheers

--
Loïc Dachary, Artisan Logiciel Libre
Re: firefly branch for v0.80.10 ready for QE
Loic, thx for the heads up. Would that touch only the rgw suite?

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Xinxin Shu xinxin@intel.com
Sent: Friday, June 5, 2015 2:11:44 AM
Subject: Re: firefly branch for v0.80.10 ready for QE

Hi Yuri,

A month has passed since this mail was sent, and the firefly branch has a few additional commits. All but one (https://github.com/ceph/ceph/pull/4829, with more information at http://tracker.ceph.com/issues/11890) have been tested. This exception seems harmless, but I thought you should know.

For the record, the head of the firefly branch is now https://github.com/ceph/ceph/commit/d0f9c5f47024f53b4eccea2e0fde9b7844746362 and http://tracker.ceph.com/issues/11090#Release-information has been updated accordingly.

Cheers

On 29/05/2015 18:18, Loic Dachary wrote:
> Hi Yuri,
>
> The firefly branch for v0.80.10 as found at https://github.com/ceph/ceph/commits/firefly has been approved by Greg, Yehuda, Josh and Sam and is ready for QE.
>
> For the record, the head is https://github.com/ceph/ceph/commit/071c94385ee71b86c5ed8363d56cf299da1aa7b3 and the details of the tests run are at http://tracker.ceph.com/issues/11090
>
> Cheers

--
Loïc Dachary, Artisan Logiciel Libre
Re: hammer branch for v0.94.2 ready for QE
QE validation is complete and this release is ready for publishing.

(Greg, I assumed that you approved this release with the failures in the knfs suite, re: unresolved #11789, marked for v0.94.3 backport.)

Summary of all tests performed for this release and notes can be found in http://tracker.ceph.com/issues/11492

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Abhishek L abhishek.lekshma...@gmail.com
Sent: Monday, May 18, 2015 5:42:12 AM
Subject: hammer branch for v0.94.2 ready for QE

Hi Yuri,

The hammer branch for v0.94.2 as found at https://github.com/ceph/ceph/commits/hammer has been approved by Greg, Yehuda, Josh and Sam and is ready for QE.

For the record, the head is https://github.com/ceph/ceph/commit/63832d4039889b6b704b88b86eaba4aadcfceb2e and the details of the tests run are at http://tracker.ceph.com/issues/11492

Note that it has two more commits compared to what you tested before:

https://github.com/ceph/ceph/commit/293affe992118ed6e04f685030b2d83a794ca624 fixing http://tracker.ceph.com/issues/11622
https://github.com/ceph/ceph/commit/a43d24861089a02f3b42061e482e05016a0021f6 fixing http://tracker.ceph.com/issues/11604

which address two blockers that you listed at http://tracker.ceph.com/issues/11492#QE-Validation

These two new commits only have influence, directly or indirectly, on rgw. They do not require or deserve a new run of the rados, fs or rbd suites, because none of them depend on rgw, directly or indirectly.

The other two issues listed as blockers:

http://tracker.ceph.com/issues/11613#note-4 does not need a backport to hammer
http://tracker.ceph.com/issues/11591 is a teuthology-related issue that can be worked around and does not need to be a blocker for hammer

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
Re: teuthology job priorities
I usually use priority [90,100] for point release validations.

This is a good thread to bring up for open approval/disapproval. Does that sound reasonable?

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Ceph Development ceph-devel@vger.kernel.org
Sent: Thursday, May 28, 2015 2:32:29 AM
Subject: teuthology job priorities

Hi,

This morning I'll schedule a job with priority 50, assuming nobody will get mad at me for using such a low priority because the associated bug fix blocks the release of v0.94.2 (http://tracker.ceph.com/issues/11546), and also assuming no one uses a priority lower than 100 just to get in front of the nightlies [1].

In my imagination:

priority [0,100] is for emergencies
priority [100,1000] is to schedule a job with higher priority than the nightlies
priority 1000 (the default) is for all automated tests that no human being waits on (the nightlies, for instance)

Does someone have a different mapping in mind?

Cheers

[1] the nightlies: http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO_monitor_the_automated_tests_AKA_nightlies

--
Loïc Dachary, Artisan Logiciel Libre
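[Archive note] The priority bands proposed above can be written down as a small classifier; this is an illustrative sketch (the `priority_band` function and band names are mine, not a teuthology feature, and since the intervals in the mail overlap at their boundaries the sketch uses half-open ranges):

```shell
#!/bin/sh
# Mapping proposed in the thread above, read as half-open intervals:
#   [0,100)    emergencies (e.g. release blockers like the priority-50 run)
#   [100,1000) human-scheduled jobs that should get in front of the nightlies
#   1000+      the default, for automated runs nobody waits on (nightlies)
priority_band() {
    p=$1
    if [ "$p" -lt 100 ]; then
        echo emergency
    elif [ "$p" -lt 1000 ]; then
        echo ahead-of-nightlies
    else
        echo nightlies
    fi
}

priority_band 50      # -> emergency (Loic's v0.94.2 blocker run)
priority_band 95      # -> emergency (inside Yuri's [90,100] point-release range)
priority_band 500     # -> ahead-of-nightlies
priority_band 1000    # -> nightlies (the default)
```

In practice the priority would be passed to teuthology when scheduling (teuthology-suite accepts a priority option); the function is only a way to sanity-check where a given number lands in the proposed mapping.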
Re: hammer branch for v0.94.2 ready for QE
QE validation status: all detailed information is summarized in http://tracker.ceph.com/issues/11492

Team leads, pls review for a go/no-go decision. Issues to be considered:

rados - passed ~2.8K jobs; the listed issues (#11660, #11661) are not blockers (NOTE: we also agreed to use the 0/7th rule for future point releases, e.g. passing --subset 0/7 will be sufficient for a release)

knfs - #11789, #11790 - per Sage, not blockers; Greg, John - agreed?

samba - I assumed that the failures in http://pulpito.ceph.com/teuthology-2015-05-18_13:46:55-samba-hammer-testing-basic-multi/ are due to #6613; Greg, pls confirm.

upgrade/client-upgrade - blocked by #11546 (3 jobs passed)
upgrade/firefly-x - blocked by #11546
upgrade/dumpling-firefly-x - blocked by #11546

Sage, Loic, are you willing to push this release out without the upgrade suites run, due to packaging issues (NOTE: upgrade/giant-x - hammer passed on all distros)?

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Abhishek L abhishek.lekshma...@gmail.com
Sent: Monday, May 18, 2015 5:42:12 AM
Subject: hammer branch for v0.94.2 ready for QE

Hi Yuri,

The hammer branch for v0.94.2 as found at https://github.com/ceph/ceph/commits/hammer has been approved by Greg, Yehuda, Josh and Sam and is ready for QE.

For the record, the head is https://github.com/ceph/ceph/commit/63832d4039889b6b704b88b86eaba4aadcfceb2e and the details of the tests run are at http://tracker.ceph.com/issues/11492

Note that it has two more commits compared to what you tested before:

https://github.com/ceph/ceph/commit/293affe992118ed6e04f685030b2d83a794ca624 fixing http://tracker.ceph.com/issues/11622
https://github.com/ceph/ceph/commit/a43d24861089a02f3b42061e482e05016a0021f6 fixing http://tracker.ceph.com/issues/11604

which address two blockers that you listed at http://tracker.ceph.com/issues/11492#QE-Validation

These two new commits only have influence, directly or indirectly, on rgw. They do not require or deserve a new run of the rados, fs or rbd suites, because none of them depend on rgw, directly or indirectly.

The other two issues listed as blockers:

http://tracker.ceph.com/issues/11613#note-4 does not need a backport to hammer
http://tracker.ceph.com/issues/11591 is a teuthology-related issue that can be worked around and does not need to be a blocker for hammer

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
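[Archive note] On the "0/7th rule" above: as I understand teuthology's `--subset i/n` option, it deterministically schedules roughly 1/n of the suite's job matrix. A simplified stand-in over plain job names (`pick_subset` is my own illustrative helper, not teuthology code) showing that kind of 1-in-n selection:

```shell
#!/bin/sh
# Keep every job whose index is congruent to I modulo N -- a simplified
# stand-in for the matrix thinning that --subset I/N performs.
# Usage: pick_subset I N JOB [JOB...]
pick_subset() {
    i=$1; n=$2; shift 2
    k=0
    for job in "$@"; do
        # print the job only if its index falls in this subset
        [ $((k % n)) -eq "$i" ] && echo "$job"
        k=$((k + 1))
    done
}

# With 14 jobs and subset 0/7 we keep 2 of them (indices 0 and 7):
pick_subset 0 7 j00 j01 j02 j03 j04 j05 j06 j07 j08 j09 j10 j11 j12 j13
```

The real option would be passed on the scheduling command line (e.g. `teuthology-suite ... --subset 0/7`); the point of the rule in the mail is that one seventh of the rados matrix was judged sufficient coverage for a point release.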
Re: hammer branch for v0.94.2 ready for QE
Loic

There are infrastructure-related issues, and, for example, the rados suite needs review as many jobs failed. See, for example, the lines with "need review" notes, which I am suggesting be reviewed by the team leads.

rados runs on typica and magna (for example):
http://pulpito-rdu.front.sepia.ceph.com/teuthology-2015-05-15_12:42:53-rados-hammer-distro-basic-typica/
http://pulpito.ceph.redhat.com/teuthology-2015-05-11_20:22:13-rados-hammer-distro-basic-magna/

I think it's more efficient to review such results, as it feels like the amount of failures is disproportionately high.

Thx
YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Abhishek L abhishek.lekshma...@gmail.com
Sent: Tuesday, May 26, 2015 9:25:25 AM
Subject: Re: hammer branch for v0.94.2 ready for QE

Hi Yuri,

If I'm not mistaken, http://tracker.ceph.com/issues/11660 is the last issue blocking v0.94.2. Is there another one I don't see?

Cheers

On 26/05/2015 18:13, Yuri Weinstein wrote:
> Loic
>
> This hammer release QE validation is taking an unusually long time and has issues that have to be clarified. All test results were summarized in http://tracker.ceph.com/issues/11492
>
> There are several reasons contributing to the slowness of this validation, product-related as well as infrastructure-related; the high number of tests makes turnaround time slower as well.
>
> I think some suites, e.g. rados and the upgrades, will have to be re-run after the issues have been clarified/fixed. The rados, krbd, knfs and samba suite test results need reviews by the team leads.
>
> Thx
> YuriW

- Original Message -
From: Loic Dachary l...@dachary.org
To: Yuri Weinstein ywein...@redhat.com
Cc: Ceph Development ceph-devel@vger.kernel.org, Abhishek L abhishek.lekshma...@gmail.com
Sent: Monday, May 18, 2015 5:42:12 AM
Subject: hammer branch for v0.94.2 ready for QE

Hi Yuri,

The hammer branch for v0.94.2 as found at https://github.com/ceph/ceph/commits/hammer has been approved by Greg, Yehuda, Josh and Sam and is ready for QE.

For the record, the head is https://github.com/ceph/ceph/commit/63832d4039889b6b704b88b86eaba4aadcfceb2e and the details of the tests run are at http://tracker.ceph.com/issues/11492

Note that it has two more commits compared to what you tested before:

https://github.com/ceph/ceph/commit/293affe992118ed6e04f685030b2d83a794ca624 fixing http://tracker.ceph.com/issues/11622
https://github.com/ceph/ceph/commit/a43d24861089a02f3b42061e482e05016a0021f6 fixing http://tracker.ceph.com/issues/11604

which address two blockers that you listed at http://tracker.ceph.com/issues/11492#QE-Validation

These two new commits only have influence, directly or indirectly, on rgw. They do not require or deserve a new run of the rados, fs or rbd suites, because none of them depend on rgw, directly or indirectly.

The other two issues listed as blockers:

http://tracker.ceph.com/issues/11613#note-4 does not need a backport to hammer
http://tracker.ceph.com/issues/11591 is a teuthology-related issue that can be worked around and does not need to be a blocker for hammer

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
Re: giant branch for v0.87.2 ready for QE
QE validation of this release has been completed and it's ready for next steps. All test results and notes were summarized in the QE Validation section of http://tracker.ceph.com/issues/11153 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org, Abhishek L abhishek.lekshma...@gmail.com Sent: Friday, April 17, 2015 2:17:16 PM Subject: giant branch for v0.87.2 ready for QE Hi Yuri, The giant branch for v0.87.2 as found at https://github.com/ceph/ceph/commits/giant has been approved by Greg, Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/c1301e84aee0f399db85e2d37818a66147a0ce78 and the details of the tests run are at http://tracker.ceph.com/issues/11153 Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: client/cluster compatibility testing
We have a PR, https://github.com/ceph/ceph-qa-suite/pull/414, that addresses part of this issue, e.g. adds hammer-x to the mix (it's ready to be merged). We also have two tickets where the requirements for this suite were captured: http://tracker.ceph.com/issues/11413 http://tracker.ceph.com/issues/11414 Per Josh's comment ("Also I think we'll want to start doing mixed-client-version tests, particularly for things like rbd's exclusive locking"), I assigned #11414 for next steps. Question/request to the team leads - pls either agree that we need specific tests for mixed-client testing (and add tickets as you feel necessary) or suggest otherwise. I am guessing: rbd - confirmed by Josh, we need those; rados - Sam, Sage? cephfs - Greg? rgw - Yehuda? I am sure I am missing lots of others... What do you think? Thx YuriW - Original Message - From: Josh Durgin jdur...@redhat.com To: Sage Weil sw...@redhat.com, ceph-devel@vger.kernel.org Sent: Thursday, April 16, 2015 1:59:11 PM Subject: Re: client/cluster compatibility testing On 04/16/2015 09:42 AM, Sage Weil wrote: I think the simplest way to address this is to talk about compatibility in terms of the upstream stable releases (firefly, hammer, etc.), and test that compatibility with teuthology tests from ceph-qa-suite.git. We have some basic inter-version client/cluster tests already in suites/upgrade/client-upgrade. Currently these test new (version x) clients against a given release (dumpling, firefly). I think we just need to add hammer to that mix, and then add a second set of tests that do the reverse: test clients from a given release (dumpling, firefly, hammer) against an arbitrary cluster version (x). The suites in suites/upgrade/$version-x do this, and use a mixed version cluster rather than a purely version x cluster. It seems like people would want that intra-cluster version coverage for smooth upgrades. 
Just need to add hammer-x there too (Yuri's renaming the client ones to be $version-client-x for less confusion). Also I think we'll want to start doing mixed-client-version tests, particularly for things like rbd's exclusive locking: http://tracker.ceph.com/issues/11405 Josh
Re: client/cluster compatibility testing
Yea, Sage, that sounds reasonable. I added a ticket to capture this plan (http://tracker.ceph.com/issues/11413) and will add those tests soon. Please add your comments to the ticket above. I am assuming that it will look something like this for dumpling, firefly and hammer: dumpling(stable) - client-x firefly(stable) - client-x hammer(stable) - client-x and reverse dumpling-client(stable) - cluster-x firefly-cluster(stable) - cluster-x hammer-cluster(stable) - cluster-x Yes? Thx YuriW - Original Message - From: Sage Weil sw...@redhat.com To: ceph-devel@vger.kernel.org Sent: Thursday, April 16, 2015 9:42:29 AM Subject: client/cluster compatibility testing Now that there are several different vendors shipping and supporting Ceph in their products, we'll invariably have people running different versions of Ceph that are interested in interoperability. If we focus just on client - cluster compatibility, I think the issues are (1) compatibility between upstream ceph versions (firefly vs hammer) and (2) ensuring that any downstream changes the vendor makes don't break that compatibility. I think the simplest way to address this is to talk about compatibility in terms of the upstream stable releases (firefly, hammer, etc.), and test that compatibility with teuthology tests from ceph-qa-suite.git. We have some basic inter-version client/cluster tests already in suites/upgrade/client-upgrade. Currently these test new (version x) clients against a given release (dumpling, firefly). I think we just need to add hammer to that mix, and then add a second set of tests that do the reverse: test clients from a given release (dumpling, firefly, hammer) against an arbitrary cluster version (x). We'll obviously run these tests on upstream releases to ensure that we are not breaking compatibility (or are doing so in known, explicit ways). 
Downstream folks can run the same test suites against any changes they make as well to ensure that their product is compatible with firefly clients, or whatever. Does that sound reasonable? sage
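[Editorial note] Sage's plan amounts to a cross-product of the stable releases against an arbitrary version x, in both directions: new (version x) clients against each stable cluster, and each stable client against a version-x cluster. As a rough illustration only, the matrix could be enumerated like this; the `compat_matrix` helper is hypothetical, not part of teuthology or ceph-qa-suite:

```python
# Stable releases named in the thread; "x" stands for the arbitrary
# version under test, as in suites/upgrade/$version-x.
STABLE = ["dumpling", "firefly", "hammer"]

def compat_matrix(stable=STABLE):
    """Yield (client, cluster) pairs in both directions:
    a version-x client against each stable cluster, and each
    stable client against a version-x cluster."""
    for release in stable:
        yield ("x", release)   # new client vs. stable cluster
        yield (release, "x")   # stable client vs. new cluster

pairs = list(compat_matrix())
```

For three stable releases this yields six pairs; in practice each pair maps to a teuthology suite rather than code, but enumerating them makes it easy to see which combinations lack coverage.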
Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases
How will that go for the next run of upgrade/giant-x ? I was thinking that as soon as, for example, this suite passed, #11189 gets resolved and thus indicates that it's ready for the hammer release cut. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Sunday, March 22, 2015 5:35:19 PM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases On 22/03/2015 17:16, Yuri Weinstein wrote: Loic, I think the idea was to do more process driven approach for releasing hammer, e.g. keep track of suites vs. results and open issues, so we can have a high level view on status at any time before the final cut day. Do you have any suggestions or objections? Reading http://tracker.ceph.com/issues/11189 I see it has one run, and a run of failed tests, and got resolved because all passed. The title is hammer: upgrade/giant-x. How will that go for the next run of upgrade/giant-x ? 
I use a python snippet to display the errors in a redmine format (http://workbench.dachary.org/dachary/ceph-workbench/issues/2) $ python ../fail.py teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps ** *'mkdir -p -- /home/ubuntu/cephtest/mnt.1/client.1/tmp cd -- /home/ubuntu/cephtest/mnt.1/client.1/tmp CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=giant TESTDIR=/home/ubuntu/cephtest CEPH_ID=1 PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.1/cls/test_cls_rgw.sh'* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/parallel_run/{ec-rados-parallel.yaml rados_api.yaml rados_loadgenbig.yaml test_cache-pool-snaps.yaml test_rbd_api.yaml test_rbd_python.yaml} 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814081 ** *2015-03-20 23:04:51.042345 mon.0 10.214.130.49:6789/0 3 : cluster [WRN] message from mon.1 was stamped 14400.248297s in the future, clocks not synchronized in cluster log* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/test_rbd_api.yaml 3-upgrade-sequence/upgrade-all.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/centos_6.5.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814155 ** *Could not reconnect to ubu...@vpm169.front.sepia.ceph.com* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/ec-rados-default.yaml 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml 
rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814108 ** *Could not reconnect to ubu...@vpm166.front.sepia.ceph.com* *** upgrade:giant-x/stress-split-erasure-code/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/ec-rados-default.yaml 6-next-mon/monb.yaml 8-next-mon/monc.yaml 9-workload/ec-rados-plugin=jerasure-k=3-m=1.yaml distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814194 ** *'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i a'* *** upgrade:giant-x/stress-split-erasure-code-x86_64/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/ec-rados-default.yaml 6-next-mon/monb.yaml 8-next-mon/monc.yaml 9-workload/ec-rados-plugin=isa-k=2-m=1.yaml distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814197 ** *timed out waiting for admin_socket to appear after osd.13 restart* *** upgrade:giant-x/stress-split/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/{rbd-cls.yaml rbd-import-export.yaml readwrite.yaml snaps-few-objects.yaml} 6-next-mon/monb.yaml 7-workload/{radosbench.yaml rbd_api.yaml} 8-next-mon/monc.yaml 9-workload/{rbd-python.yaml rgw-swift.yaml snaps-many-objects.yaml} distros/rhel_6.5.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814186 Thx YuriW - Original Message - From: Loic Dachary l
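[Editorial note] The fail.py snippet itself is not reproduced in the thread; purely as an illustration, a script producing that redmine-style output might look like the sketch below, assuming the failed jobs have already been collected as (description, url, reason) tuples. All names here are invented for the example, not the actual ceph-workbench code:

```python
def redmine_failures(jobs):
    """Group failed teuthology jobs by failure reason and render them
    in a redmine-style list: the reason in bold ("** *...*"), then one
    "*** description:url" line per matching job."""
    by_reason = {}
    for desc, url, reason in jobs:
        by_reason.setdefault(reason, []).append((desc, url))
    lines = []
    for reason in sorted(by_reason):
        lines.append("** *%s*" % reason)
        for desc, url in by_reason[reason]:
            lines.append("*** %s:%s" % (desc, url))
    return "\n".join(lines)
```

Grouping by failure reason first, as the output above does, makes it easy to spot environment-wide problems (clock skew, unreachable VPS nodes) versus one-off job failures.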
Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases
Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Monday, March 23, 2015 8:40:02 AM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases Hi Yuri, On 23/03/2015 16:09, Yuri Weinstein wrote: How will that go for the next run of upgrade/giant-x ? I was thinking that as soon as, for example, this suite passed, #11189 gets resolved and thus indicates that it's ready for the hammer release cut. If the following happens: * hammer: upgrade/giant-x runs and passes * a dozen more commits are added because problems are fixed * hammer: upgrade/giant-x runs and passes That leaves us with two issues with the same name but with different update dates. So if I look at the hammer: upgrade/giant-x issues in chronological order, I have a complete history of the successive runs and I can check the latest one to see how it went. Or older ones if I need to dig the history. This is good :-) After hammer is released, the same will presumably happen for point releases. Instead of naming them hammer: upgrade/giant-x which would be confusing, I guess we could name them v0.94.1: upgrade/giant-x instead. Does that sound right ? Yes, we can alternatively name the set of those tasks as hammer v0.94.1 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Sunday, March 22, 2015 5:35:19 PM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases On 22/03/2015 17:16, Yuri Weinstein wrote: Loic, I think the idea was to do more process driven approach for releasing hammer, e.g. keep track of suites vs. results and open issues, so we can have a high level view on status at any time before the final cut day. Do you have any suggestions or objections? 
Reading http://tracker.ceph.com/issues/11189 I see it has one run, and a run of failed tests, and got resolved because all passed. The title is hammer: upgrade/giant-x. How will that go for the next run of upgrade/giant-x ? I use a python snippet to display the errors in a redmine format (http://workbench.dachary.org/dachary/ceph-workbench/issues/2) $ python ../fail.py teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps ** *'mkdir -p -- /home/ubuntu/cephtest/mnt.1/client.1/tmp cd -- /home/ubuntu/cephtest/mnt.1/client.1/tmp CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=giant TESTDIR=/home/ubuntu/cephtest CEPH_ID=1 PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.1/cls/test_cls_rgw.sh'* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/parallel_run/{ec-rados-parallel.yaml rados_api.yaml rados_loadgenbig.yaml test_cache-pool-snaps.yaml test_rbd_api.yaml test_rbd_python.yaml} 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814081 ** *2015-03-20 23:04:51.042345 mon.0 10.214.130.49:6789/0 3 : cluster [WRN] message from mon.1 was stamped 14400.248297s in the future, clocks not synchronized in cluster log* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/test_rbd_api.yaml 3-upgrade-sequence/upgrade-all.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/centos_6.5.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814155 ** *Could not reconnect to 
ubu...@vpm169.front.sepia.ceph.com* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/ec-rados-default.yaml 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814108 ** *Could not reconnect to ubu...@vpm166.front.sepia.ceph.com* *** upgrade:giant-x/stress-split-erasure-code/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/ec-rados-default.yaml 6-next-mon/monb.yaml 8-next-mon/monc.yaml 9-workload/ec-rados-plugin=jerasure-k=3-m=1.yaml distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03
Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases
Loic, done, pls review and edit. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Monday, March 23, 2015 9:10:20 AM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases On 23/03/2015 16:44, Yuri Weinstein wrote: Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Monday, March 23, 2015 8:40:02 AM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases Hi Yuri, On 23/03/2015 16:09, Yuri Weinstein wrote: How will that go for the next run of upgrade/giant-x ? I was thinking that as soon as, for example, this suite passed, #11189 gets resolved and thus indicates that it's ready for the hammer release cut. If the following happens: * hammer: upgrade/giant-x runs and passes * a dozen more commits are added because problems are fixed * hammer: upgrade/giant-x runs and passes That leaves us with two issues with the same name but with different update dates. So if I look at the hammer: upgrade/giant-x issues in chronological order, I have a complete history of the successive runs and I can check the latest one to see how it went. Or older ones if I need to dig the history. This is good :-) After hammer is released, the same will presumably happen for point releases. Instead of naming them hammer: upgrade/giant-x which would be confusing, I guess we could name them v0.94.1: upgrade/giant-x instead. Does that sound right ? Yes, we can alternatively name the set of those tasks as hammer v0.94.1 Great ! Would you like me to add a section at http://tracker.ceph.com/projects/ceph-releases/wiki/Wiki to summarize this conversation ? 
Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Sage Weil sw...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Sunday, March 22, 2015 5:35:19 PM Subject: Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases On 22/03/2015 17:16, Yuri Weinstein wrote: Loic, I think the idea was to do more process driven approach for releasing hammer, e.g. keep track of suites vs. results and open issues, so we can have a high level view on status at any time before the final cut day. Do you have any suggestions or objections? Reading http://tracker.ceph.com/issues/11189 I see it has one run, and a run of failed tests, and got resolved because all passed. The title is hammer: upgrade/giant-x. How will that go for the next run of upgrade/giant-x ? I use a python snippet to display the errors in a redmine format (http://workbench.dachary.org/dachary/ceph-workbench/issues/2) $ python ../fail.py teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps ** *'mkdir -p -- /home/ubuntu/cephtest/mnt.1/client.1/tmp cd -- /home/ubuntu/cephtest/mnt.1/client.1/tmp CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=giant TESTDIR=/home/ubuntu/cephtest CEPH_ID=1 PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.1/cls/test_cls_rgw.sh'* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/parallel_run/{ec-rados-parallel.yaml rados_api.yaml rados_loadgenbig.yaml test_cache-pool-snaps.yaml test_rbd_api.yaml test_rbd_python.yaml} 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/rhel_7.0.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814081 ** *2015-03-20 23:04:51.042345 mon.0 10.214.130.49:6789/0 3 
: cluster [WRN] message from mon.1 was stamped 14400.248297s in the future, clocks not synchronized in cluster log* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/test_rbd_api.yaml 3-upgrade-sequence/upgrade-all.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml} distros/centos_6.5.yaml}:http://pulpito.ceph.com/teuthology-2015-03-20_17:05:02-upgrade:giant-x-hammer-distro-basic-vps/814155 ** *Could not reconnect to ubu...@vpm169.front.sepia.ceph.com* *** upgrade:giant-x/parallel/{0-cluster/start.yaml 1-giant-install/giant.yaml 2-workload/sequential_run/ec-rados-default.yaml 3-upgrade-sequence/upgrade-mon-osd-mds.yaml 4-final-workload/{rados-snaps-few-objects.yaml rados_loadgenmix.yaml rados_mon_thrash.yaml rbd_cls.yaml rbd_import_export.yaml rgw_swift.yaml
Re: hammer tasks in http://tracker.ceph.com/projects/ceph-releases
Loic, I think the idea was to do more process driven approach for releasing hammer, e.g. keep track of suites vs. results and open issues, so we can have a high level view on status at any time before the final cut day. Do you have any suggestions or objections? Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Sage Weil sw...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Sunday, March 22, 2015 1:54:06 AM Subject: hammer tasks in http://tracker.ceph.com/projects/ceph-releases Hi Sage, You have created a few hammer related tasks at http://tracker.ceph.com/projects/ceph-releases/issues . What did you have in mind ? Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: firefly integration branch for v0.80.9 ready for QE
QE validation is finished for this release and v0.80.9 is ready for next steps. A summary of all runs, with details, is in http://tracker.ceph.com/issues/10641 The following suites were executed and passed as part of this validation: rados, rbd, rgw, fs, krbd, kcephfs, samba, ceph-deploy, upgrade/firefly, upgrade/dumpling-firefly-x (to giant), powercycle. Alfredo, the ticket is in your hands. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, March 3, 2015 8:30:42 AM Subject: firefly integration branch for v0.80.9 ready for QE Hi Yuri, The firefly branch for v0.80.9 as found at https://github.com/ceph/ceph/commits/firefly has been approved by Greg, Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/edd37e39d155fbe36012008df3d49e33ec3117cc and the details of the tests run are at http://tracker.ceph.com/issues/10641 Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: v0.80.8 and librbd performance
Ken, pls see http://tracker.ceph.com/issues/10641 for more details. Thx YuriW - Original Message - From: Ken Dreyer kdre...@redhat.com To: Sage Weil sw...@redhat.com, ceph-devel@vger.kernel.org, ceph-us...@ceph.com Sent: Tuesday, March 3, 2015 3:28:02 PM Subject: Re: v0.80.8 and librbd performance On 03/03/2015 04:19 PM, Sage Weil wrote: Hi, This is just a heads up that we've identified a performance regression in v0.80.8 from previous firefly releases. A v0.80.9 is working its way through QA and should be out in a few days. If you haven't upgraded yet you may want to wait. Thanks! sage Hi Sage, I've seen a couple Redmine tickets on this (eg http://tracker.ceph.com/issues/9854 , http://tracker.ceph.com/issues/10956). It's not totally clear to me which of the 70+ unreleased commits on the firefly branch fix this librbd issue. Is it only the three commits in https://github.com/ceph/ceph/pull/3410 , or are there more? - Ken
Re: re-running teuthology jobs
Loic In case you want to add some comments - http://tracker.ceph.com/issues/10945 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Ceph Development ceph-devel@vger.kernel.org Sent: Saturday, February 28, 2015 7:01:29 AM Subject: Re: re-running teuthology jobs The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in the config.yaml file. For instance, running the rados failed jobs http://tracker.ceph.com/issues/10641#rados failed jobs: $ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email l...@dachary.org --owner l...@dachary.org --ceph firefly-backports 2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b 2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise 2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master 2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly 2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly 2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly 2015-02-28 
15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered) 2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml} Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145 2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml} Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146 2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml} Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147 2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml} Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148 2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml} Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs. 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out. 
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150 Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs. On 28/02/2015 11:28, Loic Dachary wrote: Hi, A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like: teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id 781457 ... and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exist. I will therefore manually do what such a command would do, for each failed job: * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite * cd /srv/ceph-qa-suite ; git
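[Editorial note] The manual procedure described above starts from each dead job's orig.config.yaml. As a sketch only (the `config_urls` helper is hypothetical, not an existing teuthology command; the run name and job ids are the ones from the message), the URLs to fetch could be built like this:

```python
# Base URL and run name are taken from Loic's message; the job ids are
# the dead jobs he wants to re-run.
BASE = "http://qa-proxy.ceph.com/teuthology"

def config_urls(run, job_ids):
    """Build the orig.config.yaml URL for each job to re-schedule."""
    return ["%s/%s/%s/orig.config.yaml" % (BASE, run, j) for j in job_ids]

urls = config_urls(
    "loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi",
    ["781444", "781457"],
)
# Each fetched yaml would then be handed to teuthology-schedule,
# per the per-job steps Loic lists above.
```

This is just the URL-building half of the loop; the later messages in the thread show that `teuthology-suite --filter` is usually the simpler way to achieve the same re-run.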
Re: dumpling integration branch for v0.67.12 ready for QE
All issues in http://tracker.ceph.com/issues/10560 updated. Loic - #10801 can be resolved. v0.67.12 is ready for release. Thx YuriW - Original Message - From: Yuri Weinstein ywein...@redhat.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org, Sage Weil s...@redhat.com, Tamil Muthamizhan tmuth...@redhat.com, Zack Cerza z...@redhat.com, Sandon Van Ness svann...@redhat.com Sent: Wednesday, February 18, 2015 9:38:19 AM Subject: Re: dumpling integration branch for v0.67.12 ready for QE Hi all, I updated all issues in http://tracker.ceph.com/issues/10560 Based on what is listed there, we have http://tracker.ceph.com/issues/10801 - Yehuda pls comment http://tracker.ceph.com/issues/10694 - Sam pls re-confirm rbd - Josh, I understood that we are good to go, pls re-confirm. I can re-run some suites if you'd like and we can make a call on this release. Loic - back to you, let me know what you think. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org, Sage Weil s...@redhat.com, Tamil Muthamizhan tmuth...@redhat.com, Zack Cerza z...@redhat.com, Sandon Van Ness svann...@redhat.com Sent: Thursday, February 12, 2015 2:17:49 PM Subject: Re: dumpling integration branch for v0.67.12 ready for QE On 12/02/2015 23:06, Yuri Weinstein wrote: I linked all issues related to this release testing to the ticket http://tracker.ceph.com/issues/10560 After the team leads make a call on those, including environment issues, I suggest re-running the suites that failed. Loic, I'd re-run them in the Octo, since we already started there, if you agree ? 
Sure :-) Thx YuriW - Original Message - From: Yuri Weinstein ywein...@redhat.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org, Sage Weil s...@redhat.com, Tamil Muthamizhan tmuth...@redhat.com Sent: Wednesday, February 11, 2015 2:24:33 PM Subject: Re: dumpling integration branch for v0.67.12 ready for QE I replied to individual suites runs, but just wanted to summarize QE validation status. The following suites were executed in the Octo lab (we will use Sepia in the future if nobody objects). upgrade:dumpling ['45493'] http://tracker.ceph.com/issues/10694 - Known Won't fix Assertion: osd/Watch.cc: 290: FAILED assert(!cb) *** Sam - pls confirm the Won't fix status. ['45495', '45496', '45498', '45499', '45500'] http://tracker.ceph.com/issues/10838 s3tests failed *** Yehuda - need your verdict on s3tests. fs All green ! rados ['45054'] http://tracker.ceph.com/issues/10841 Issued certificate has expired *** Sandon pls comment. ['45168', '45169'] http://tracker.ceph.com/issues/10840 coredump ceph_test_filestore_idempotent_sequence *** Sam - pls comment ['45215'] Missing packages - no ticket FYI Failed to fetch http://apt-mirror.front.sepia.ceph.com/archive.ubuntu.com/ubuntu/dists/trusty-updates/universe/binary-i386/Packages Hash Sum mismatch *** Zack, Sandon ? ceph-deploy Travis - pls suggest In general I am not sure if we needed to test this - Sage? rbd ['45365', '45366', '45367'] http://tracker.ceph.com/issues/10842 unable to connect to apt-mirror.front.sepia.ceph.com ['45349', '45350', '45351', '45355', '45356', '45357', '45363'] http://tracker.ceph.com/issues/10802 error: image still has watchers (duplicate of 10680) *** Zack, Sandon, Josh - all environment noise, pls comment. rgw ['45382', '45390'] http://tracker.ceph.com/issues/10843 s3tests failed - could be related or duplicate of 10838 *** Yehuda - same as issues in upgrades? I am standing by for you analysis/replies and recommendations for next steps. 
Loic - let me know if you want to follow specific items in our backport testing process that I missed here. PS: I would think that you could've wanted to assign the release ticket to QE (me) for validation and at this point I could've re-assigned it back to devel (you), a? Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 9:05:31 AM Subject: dumpling integration branch for v0.67.12 ready for QE Hi Yuri, The dumpling integration branch for v0.67.12 as found at https://github.com/ceph/ceph/commits/dumpling-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 I think it would be best for the QE tests to use the dumpling-backports. The alternative would be to merge dumpling-backports into dumpling. However, since testing may take a long time
Re: giant integration branch for v0.87.1 ready for QE
Team leads, Please review QE validation results summary in http://tracker.ceph.com/issues/10501 Loic - this RC looks ready for release (in my opinion) ! Thx YuriW - Original Message - From: Yuri Weinstein ywein...@redhat.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Monday, February 16, 2015 9:33:44 AM Subject: Re: giant integration branch for v0.87.1 ready for QE I have completed suites execution on giant branch (v0.87.1 RC) All results are summarized in http://tracker.ceph.com/issues/10501 under QE VALIDATION section. Some suites had to be run more than once due to environment noise. Two suites are being re-run now - upgrade:firefly-x and powercycle. Next steps: - the team leads to review/confirm results - Loic - can you review and triage issues as needed. - two suites require results analysis: multimds rados (two known tickets, but need more checking) ## 10209, 9891 krbd (two new tickets, but need more checking) ## 10889, 10890 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Wednesday, February 11, 2015 7:30:06 AM Subject: Re: giant integration branch for v0.87.1 ready for QE Hi Yuri, The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 Cheers On 10/02/2015 18:20, Loic Dachary wrote: Hi Yuri, The giant integration branch for v0.87.1 as found at https://github.com/ceph/ceph/commits/giant-backports has been approved by Yehuda, Josh and Sam and is ready for QE. 
For the record, the head is https://github.com/ceph/ceph/commit/6b08a729540c61f3c8b15c5a3ce9382634bf800c Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Clocks out of sync
David, We had an ntp server issue not too long ago, could be the same or new http://tracker.ceph.com/issues/10675 Thx YuriW - Original Message - From: David Zafman dzaf...@redhat.com To: ceph-devel@vger.kernel.org Sent: Friday, February 20, 2015 3:08:29 PM Subject: Clocks out of sync On 2 of my rados thrash runs, clocks went out of sync. Is this an occasional issue or did we have an infrastructure problem? On burnupi19 and burnupi25: 2015-02-20 12:52:52.636017 mon.1 10.214.134.14:6789/0 177 : cluster [WRN] message from mon.0 was stamped 0.501458s in the future, clocks not synchronized On plana62 and plana64: 2015-02-20 10:00:56.842533 mon.0 10.214.132.14:6789/0 3 : cluster [WRN] message from mon.1 was stamped 0.855106s in the future, clocks not synchronized -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
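As a quick triage aid, something like the following (an illustrative sketch, not our actual tooling) extracts the reported skew from such a [WRN] line and flags it against 0.05s, which I believe is the monitor's default mon_clock_drift_allowed - verify against your cluster config:

```shell
# Sample [WRN] line taken from the report above.
line='2015-02-20 12:52:52.636017 mon.1 10.214.134.14:6789/0 177 : cluster [WRN] message from mon.0 was stamped 0.501458s in the future, clocks not synchronized'
# Pull out the "stamped <N>s" offset, then compare against the 0.05s drift threshold.
offset=$(printf '%s\n' "$line" | grep -o 'stamped [0-9.]*s' | grep -o '[0-9.]*')
awk -v o="$offset" 'BEGIN { exit !(o > 0.05) }' && echo "skew ${offset}s exceeds allowed drift"
```

On the affected hosts themselves, `ntpq -p` would show whether ntpd lost its peers.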
Re: Disk failing plana74
I ran smart and it came back good, hmm ubuntu@plana74:~$ /usr/libexec/smart.pl All 4 drives happy as clams Thx YuriW - Original Message - From: David Zafman dzaf...@redhat.com To: Sandon Van Ness svann...@redhat.com Cc: ceph-devel@vger.kernel.org Sent: Friday, February 20, 2015 1:10:48 PM Subject: Disk failing plana74 A recent test run had an EIO on the following disk: plana74 /dev/sdb The machine is locked right now. David Zafman Senior Developer -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dumpling integration branch for v0.67.12 ready for QE
Hi all I updated all issues in http://tracker.ceph.com/issues/10560 Based on what is listed there, we have http://tracker.ceph.com/issues/10801 - Yehuda pls comment http://tracker.ceph.com/issues/10694 - Sam pls re-confirm rbd - Josh, I understood that we are good to go, pls re-confirm. I can re-run some suites if you'd like and we can make a call on this release. Loic - back to you, let me know what you think. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org, Sage Weil s...@redhat.com, Tamil Muthamizhan tmuth...@redhat.com, Zack Cerza z...@redhat.com, Sandon Van Ness svann...@redhat.com Sent: Thursday, February 12, 2015 2:17:49 PM Subject: Re: dumpling integration branch for v0.67.12 ready for QE On 12/02/2015 23:06, Yuri Weinstein wrote: I linked all issues related to this release testing to the ticket http://tracker.ceph.com/issues/10560 After the team leads make a call on those, including environment issues, I suggest re-running the suites that failed. Loic, I'd re-run them in the Octo, since we already started there, if you agree ? Sure :-) Thx YuriW - Original Message - From: Yuri Weinstein ywein...@redhat.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org, Sage Weil s...@redhat.com, Tamil Muthamizhan tmuth...@redhat.com Sent: Wednesday, February 11, 2015 2:24:33 PM Subject: Re: dumpling integration branch for v0.67.12 ready for QE I replied to individual suite runs, but just wanted to summarize QE validation status. The following suites were executed in the Octo lab (we will use Sepia in the future if nobody objects). upgrade:dumpling ['45493'] http://tracker.ceph.com/issues/10694 - Known Won't fix Assertion: osd/Watch.cc: 290: FAILED assert(!cb) *** Sam - pls confirm the Won't fix status. 
['45495', '45496', '45498', '45499', '45500'] http://tracker.ceph.com/issues/10838 s3tests failed *** Yehuda - need your verdict on s3tests. fs All green ! rados ['45054'] http://tracker.ceph.com/issues/10841 Issued certificate has expired *** Sandon pls comment. ['45168', '45169'] http://tracker.ceph.com/issues/10840 coredump ceph_test_filestore_idempotent_sequence *** Sam - pls comment ['45215'] Missing packages - no ticket FYI Failed to fetch http://apt-mirror.front.sepia.ceph.com/archive.ubuntu.com/ubuntu/dists/trusty-updates/universe/binary-i386/Packages Hash Sum mismatch *** Zack, Sandon ? ceph-deploy Travis - pls suggest In general I am not sure if we needed to test this - Sage? rbd ['45365', '45366', '45367'] http://tracker.ceph.com/issues/10842 unable to connect to apt-mirror.front.sepia.ceph.com ['45349', '45350', '45351', '45355', '45356', '45357', '45363'] http://tracker.ceph.com/issues/10802 error: image still has watchers (duplicate of 10680) *** Zack, Sandon, Josh - all environment noise, pls comment. rgw ['45382', '45390'] http://tracker.ceph.com/issues/10843 s3tests failed - could be related or duplicate of 10838 *** Yehuda - same as issues in upgrades? I am standing by for your analysis/replies and recommendations for next steps. Loic - let me know if you want to follow specific items in our backport testing process that I missed here. PS: I would think that you could've wanted to assign the release ticket to QE (me) for validation and at this point I could've re-assigned it back to devel (you), a? Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 9:05:31 AM Subject: dumpling integration branch for v0.67.12 ready for QE Hi Yuri, The dumpling integration branch for v0.67.12 as found at https://github.com/ceph/ceph/commits/dumpling-backports has been approved by Yehuda, Josh and Sam and is ready for QE. 
For the record, the head is https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 I think it would be best for the QE tests to use the dumpling-backports. The alternative would be to merge dumpling-backports into dumpling. However, since testing may take a long time and require more patches, it probably is better to not do that iterative process on the dumpling branch itself. As it is now, there already are a number of commits in the dumpling branch that should really be in the dumpling-backports: they do not belong to v0.67.11 and are going to be released in v0.67.12. In the future though, the dumpling branch will only receive commits that have been carefully tested and all the integration work will be on the dumpling-backports branch exclusively. So that third parties do not have
Re: giant integration branch for v0.87.1 ready for QE
I have completed suites execution on giant branch (v0.87.1 RC) All results are summarized in http://tracker.ceph.com/issues/10501 under QE VALIDATION section. Some suites had to be run more than once due to environment noise. Two suites are being re-run now - upgrade:firefly-x and powercycle. Next steps: - the team leads to review/confirm results - Loic - can you review and triage issues as needed. - two suites require results analysis: multimds rados (two known tickets, but need more checking) ## 10209, 9891 krbd (two new tickets, but need more checking) ## 10889, 10890 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Wednesday, February 11, 2015 7:30:06 AM Subject: Re: giant integration branch for v0.87.1 ready for QE Hi Yuri, The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 Cheers On 10/02/2015 18:20, Loic Dachary wrote: Hi Yuri, The giant integration branch for v0.87.1 as found at https://github.com/ceph/ceph/commits/giant-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/6b08a729540c61f3c8b15c5a3ce9382634bf800c Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: giant integration branch for v0.87.1 ready for QE
Greg, thx, so noted. Thx YuriW - Original Message - From: Gregory Farnum g...@gregs42.com To: Yuri Weinstein ywein...@redhat.com, Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Monday, February 16, 2015 10:44:35 AM Subject: Re: giant integration branch for v0.87.1 ready for QE The multimds suite has never passed and is strictly informational at this point. You shouldn't worry about it. (We use it only to make sure we don't completely break multimds systems, but we don't expect it to pass. It's just nice to have a rough idea how far off we are.) -Greg On Mon, Feb 16, 2015 at 9:34 AM Yuri Weinstein ywein...@redhat.com wrote: I have completed suites execution on giant branch (v0.87.1 RC) All results are summarized in http://tracker.ceph.com/issues/10501 under QE VALIDATION section. Some suites had to be run more than once due to environment noise. Two suites are being re-run now - upgrade:firefly-x and powercycle. Next steps: - the team leads to review/confirm results - Loic - can you review and triage issues as needed. - two suites require results analysis: multimds rados (two known tickets, but need more checking) ## 10209, 9891 krbd (two new tickets, but need more checking) ## 10889, 10890 Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Wednesday, February 11, 2015 7:30:06 AM Subject: Re: giant integration branch for v0.87.1 ready for QE Hi Yuri, The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 Cheers On 10/02/2015 18:20, Loic Dachary wrote: Hi Yuri, The giant integration branch for v0.87.1 as found at https://github.com/ceph/ceph/commits/giant-backports has been approved by Yehuda, Josh and Sam and is ready for QE. 
For the record, the head is https://github.com/ceph/ceph/commit/6b08a729540c61f3c8b15c5a3ce9382634bf800c Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Ceph-qa] 1 hung, 11 passed in teuthology-2015-02-11_16:13:01-samba-giant-distro-basic-multi
Yeah. Well, the last run alone isn't so important; we want to see a string of clean runs because a lot of issues aren't reproduced in every run. My hope was that we can see all green results for say this giant release/backport, but I agree that we would need to make our go/no-go decision based on multiple run results, as I am not sure if we can get them all green due to complexity, time needed to execute, environment state etc.. We could though modify our process a bit: 1. after backport-branch is ready for QE, merge it to the named branch (say 'giant' in this example) - that is what we did now 2. cut a release numbered branch (maybe it's a tag, not sure), say v0.87.1 3. run all QE suites on v0.87.1 and get it to all passed state 4. make sure that commits to v0.87.1 are committed to the named branch ('giant') #2 is what we have not done this time. Thx YuriW - Original Message - From: Gregory Farnum g...@gregs42.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Friday, February 13, 2015 11:56:18 PM Subject: Re: [Ceph-qa] 1 hung, 11 passed in teuthology-2015-02-11_16:13:01-samba-giant-distro-basic-multi On Fri, Feb 13, 2015 at 10:34 PM, Loic Dachary l...@dachary.org wrote: Hi Greg, I'm curious to know how you handle the flow of mails from QA runs. Here is a wild guess: * from time to time check that the nightlies run the suites that should be run Uh, I guess? * read the ceph-qa reports daily Yeah * for each failed job, either relate it to an issue or create one or declare it noise Yeah * if a job fails on an existing ticket store a link to the job if it's a rare occurrence and the cause is not yet known Yeah, or just to make clear it's still happening or whatever * bi-weekly bug scrub makes sure no issue, old or new, is forgotten Hopefully! 
* at release time you decide that it is ready based on: ** the list of urgent/immediate issues that you can browse to ensure no issue is a blocker ** the last run of each suite to ensure they are recent enough and environmental noise did not permanently shadow anything Yeah. Well, the last run alone isn't so important; we want to see a string of clean runs because a lot of issues aren't reproduced in every run. -Greg -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Ceph-qa] 1 hung, 11 passed in teuthology-2015-02-11_16:13:01-samba-giant-distro-basic-multi
Loic, +1 - I like the way you're discussing: v0.87.1-rc2 v0.87.1-rcX = v0.87.1 - is it easy to make this look like this after the validation is completed? BTW: When I re-run suites now for validation I use -s named_branch arg in the command line. Maybe I should be using SHA ref instead? I never tried this way, but guessing it should work, what do you think? Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Saturday, February 14, 2015 2:12:05 PM Subject: Re: [Ceph-qa] 1 hung, 11 passed in teuthology-2015-02-11_16:13:01-samba-giant-distro-basic-multi On 14/02/2015 22:53, Loic Dachary wrote: Hi Yuri, On 14/02/2015 17:22, Yuri Weinstein wrote: Yeah. Well, the last run alone isn't so important; we want to see a string of clean runs because a lot of issues aren't reproduced in every run. My hope was that we can see all green results for say this giant release/backport, but I agree that we would need to make our go/no-go decision based on multiple run results, as I am not sure if we can get them all green due to complexity, time needed to execute, environment state etc.. We could though modify our process a bit: 1. after backport-branch is ready for QE, merge it to the named branch (say 'giant' in this example) - that is what we did now 2. cut a release numbered branch (maybe it's a tag, not sure), say v0.87.1 3. run all QE suites on v0.87.1 and get it to all passed state 4. make sure that commits to v0.87.1 are committed to the named branch ('giant') That makes sense to me, only with s/v0.87.1/78c71b9200da5e7d832ec58765478404d31ae6b5/. #2 is what we have not done this time. We have not done #2 but we have cut the branch at given SHA ( 78c71b9200da5e7d832ec58765478404d31ae6b5 ) instead, which can be referenced by a tag if and when it is released. 
In the mail Re: giant integration branch for v0.87.1 ready for QE dated 11th February 2015 I wrote: The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 We cannot add a v0.87.1 tag to the branch before the release process is complete because we won't be able to change it afterwards (people rely on the fact that the history of the giant branch is not rewritten and that tag references are not changed). If during the QE test process we discover that a backport must be included (I'm thinking about https://github.com/ceph/ceph/pull/3731 for instance), 78c71b9200da5e7d832ec58765478404d31ae6b5 won't be v0.87.1 after all. In a nutshell I think we're having the same view of the process, modulo the timing of the tagging of the release. We could also have tags like: v0.87.1-rc1 = 78c71b9200da5e7d832ec58765478404d31ae6b5 v0.87.1-rc2 = whatever SHA includes more backports and if v0.87.1-rc2 turns out to be good, the release notes and other non-code changes could be committed. This naming scheme is common; is there a downside to it? It's easier to talk about v0.87.1-rc1 rather than 78c71b9200da5e7d832ec58765478404d31ae6b5 ;-) Cheers Cheers Thx YuriW - Original Message - From: Gregory Farnum g...@gregs42.com To: Loic Dachary l...@dachary.org Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Friday, February 13, 2015 11:56:18 PM Subject: Re: [Ceph-qa] 1 hung, 11 passed in teuthology-2015-02-11_16:13:01-samba-giant-distro-basic-multi On Fri, Feb 13, 2015 at 10:34 PM, Loic Dachary l...@dachary.org wrote: Hi Greg, I'm curious to know how you handle the flow of mails from QA runs. Here is a wild guess: * from time to time check that the nightlies run the suites that should be run Uh, I guess? 
* read the ceph-qa reports daily Yeah * for each failed job, either relate it to an issue or create one or declare it noise Yeah * if a job fails on an existing ticket store a link to the job if it's rare occurrence and the cause is not yet known Yeah, or just to make clear it's still happening or whatever * bi-weekly bug scrub makes sure no issue, old or new, is forgotten Hopefully! * at release time you decide that it is ready based on: ** the list of urgent/immediate issues that you can browse to ensure no issue is a blocker ** the last run of each suite to ensure they are recent enough and environmental noise did not permanently shadow anything Yeah. Well, the last run alone isn't so important; we want to see a string of clean runs because a lot of issues aren't reproduced in every run. -Greg -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Loïc Dachary, Artisan Logiciel Libre
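The rc-tag scheme discussed above can be sketched in a throwaway repo; everything here is a placeholder (empty commits stand in for the real backports, and the tag names just mirror the example):

```shell
set -e
# Build a disposable repo: rc tags pin candidate SHAs and can be superseded
# by -rc2, -rc3, ...; the final tag is created only once, after validation,
# so published tags are never moved.
repo=$(mktemp -d) && cd "$repo" && git init -q .
gitc() { git -c user.name=qa -c user.email=qa@example.com "$@"; }
gitc commit -q --allow-empty -m 'giant-backports merged'
git tag v0.87.1-rc1                # candidate; re-cut if more backports land
gitc commit -q --allow-empty -m 'release notes'
git tag v0.87.1                    # final tag, cut only after QE signs off
git tag --list 'v0.87.1*'          # shows both tags
```

The key property is the one Loic describes: the rc tags are cheap to abandon, while v0.87.1 itself only ever points at the validated commit.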
Re: giant integration branch for v0.87.1 ready for QE
Loic Just to double check - giant is *ready* for testing? (you said below "which is not ready for testing" - maybe you wanted to say *now*?) Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Wednesday, February 11, 2015 7:30:06 AM Subject: Re: giant integration branch for v0.87.1 ready for QE Hi Yuri, The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 Cheers On 10/02/2015 18:20, Loic Dachary wrote: Hi Yuri, The giant integration branch for v0.87.1 as found at https://github.com/ceph/ceph/commits/giant-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/6b08a729540c61f3c8b15c5a3ce9382634bf800c Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dumpling integration branch for v0.67.12 ready for QE
I replied to individual suite runs, but just wanted to summarize QE validation status. The following suites were executed in the Octo lab (we will use Sepia in the future if nobody objects). upgrade:dumpling ['45493'] http://tracker.ceph.com/issues/10694 - Known Won't fix Assertion: osd/Watch.cc: 290: FAILED assert(!cb) *** Sam - pls confirm the Won't fix status. ['45495', '45496', '45498', '45499', '45500'] http://tracker.ceph.com/issues/10838 s3tests failed *** Yehuda - need your verdict on s3tests. fs All green ! rados ['45054'] http://tracker.ceph.com/issues/10841 Issued certificate has expired *** Sandon pls comment. ['45168', '45169'] http://tracker.ceph.com/issues/10840 coredump ceph_test_filestore_idempotent_sequence *** Sam - pls comment ['45215'] Missing packages - no ticket FYI Failed to fetch http://apt-mirror.front.sepia.ceph.com/archive.ubuntu.com/ubuntu/dists/trusty-updates/universe/binary-i386/Packages Hash Sum mismatch *** Zack, Sandon ? ceph-deploy Travis - pls suggest In general I am not sure if we needed to test this - Sage? rbd ['45365', '45366', '45367'] http://tracker.ceph.com/issues/10842 unable to connect to apt-mirror.front.sepia.ceph.com ['45349', '45350', '45351', '45355', '45356', '45357', '45363'] http://tracker.ceph.com/issues/10802 error: image still has watchers (duplicate of 10680) *** Zack, Sandon, Josh - all environment noise, pls comment. rgw ['45382', '45390'] http://tracker.ceph.com/issues/10843 s3tests failed - could be related or duplicate of 10838 *** Yehuda - same as issues in upgrades? I am standing by for your analysis/replies and recommendations for next steps. Loic - let me know if you want to follow specific items in our backport testing process that I missed here. PS: I would think that you could've wanted to assign the release ticket to QE (me) for validation and at this point I could've re-assigned it back to devel (you), a? 
Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 9:05:31 AM Subject: dumpling integration branch for v0.67.12 ready for QE Hi Yuri, The dumpling integration branch for v0.67.12 as found at https://github.com/ceph/ceph/commits/dumpling-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 I think it would be best for the QE tests to use the dumpling-backports. The alternative would be to merge dumpling-backports into dumpling. However, since testing may take a long time and require more patches, it probably is better to not do that iterative process on the dumpling branch itself. As it is now, there already are a number of commits in the dumpling branch that should really be in the dumpling-backports: they do not belong to v0.67.11 and are going to be released in v0.67.12. In the future though, the dumpling branch will only receive commits that have been carefully tested and all the integration work will be on the dumpling-backports branch exclusively. So that third parties do not have to worry that the dumpling branch contains commits that have not been tested yet. Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: giant integration branch for v0.87.1 ready for QE
I am planning to run a complete set of tests for this release in Sepia. Will temporarily disable all *giant* suites in crontab before 4 pm today and schedule all suites to run with high priority. Pls let me know if you have concerns or have emergency need for resources in Sepia lab. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Wednesday, February 11, 2015 7:30:06 AM Subject: Re: giant integration branch for v0.87.1 ready for QE Hi Yuri, The giant-backports pull requests were merged into https://github.com/ceph/ceph/tree/giant which is not ready for testing. For the record, the head is https://github.com/ceph/ceph/commit/78c71b9200da5e7d832ec58765478404d31ae6b5 Cheers On 10/02/2015 18:20, Loic Dachary wrote: Hi Yuri, The giant integration branch for v0.87.1 as found at https://github.com/ceph/ceph/commits/giant-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/6b08a729540c61f3c8b15c5a3ce9382634bf800c Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
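Disabling the *giant* schedules amounts to commenting them out of the teuthology crontab; here is an illustrative sketch with made-up entries (the real nightly schedule and commands will differ):

```shell
# Made-up stand-ins for the real nightly crontab entries.
cron='0 2 * * * teuthology-suite -s rados -c giant
0 3 * * * teuthology-suite -s rados -c firefly'
# Comment out every line mentioning giant; everything else is untouched.
printf '%s\n' "$cron" | sed '/giant/s/^/#/'
```

Against the live schedule this would be roughly `crontab -l | sed '/giant/s/^/#/' | crontab -`, after saving a copy of the original so the entries can be restored afterwards.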
Re: dumpling giant backports update
Hi Loic I was thinking that as soon as one of the branches is declared ready we will merge *-backports with the main branch and execute the appropriate set of suites as we run them in the Octo lab for released branches. For dumpling: rados rbd rgw fs ceph-deploy upgrade/dumpling For giant: rados rbd rgw fs krbd kcephfs knfs hadoop samba rest multimds multi-version upgrade/giant powercycle Not sure if we need to run more tests, e.g. giant-x kind of suites for upgrades. Do you agree with this plan? Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein yuri.weinst...@inktank.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 7:25:55 AM Subject: dumpling giant backports update Hi Yuri, The dumpling integration branch https://github.com/ceph/ceph/commits/dumpling-backports is ready for Josh and Sam and we are expecting approval from Yehuda (the details are here http://tracker.ceph.com/issues/10560). The giant integration branch https://github.com/ceph/ceph/commits/giant-backports is ready for Sam and we are expecting approval from Josh and Yehuda (the details are here http://tracker.ceph.com/issues/10501). It is likely that we get approval for one branch or the other in the next 48h. When we do, I assume you will be conducting your own round of testing, using the dumpling-backports and/or giant-backports branch. Do you have a list of suites you plan to run already ? I'm also curious to understand who will be analyzing the results and ultimately declare that it is ready to be released. Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
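For reference, the dumpling list above could be scheduled in one loop; this sketch only prints the commands rather than submitting them, and the teuthology-suite flags (-s for the suite, -c for the ceph branch) are as I remember them from that era's CLI and may not match the current one:

```shell
# Suites for the dumpling release, as listed in the plan above.
dumpling_suites='rados rbd rgw fs ceph-deploy upgrade/dumpling'
# Print (rather than run) one hypothetical scheduling command per suite.
for s in $dumpling_suites; do
    echo teuthology-suite -s "$s" -c dumpling
done
```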
Re: dumpling integration branch for v0.67.12 ready for QE
Loic, The only difference between options if we run suites on merged dumpling vs dumpling-backports first - is time. We will have to run suites on the final branch after the merge anyway. Unless I hear otherwise, I will schedule suites to run on dumpling-backports first (as you are suggesting, with higher priority) and then assuming that we resolved all issues, we will run on the merged dumpling. Sage, pls correct if this is not what has to be done. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 9:05:31 AM Subject: dumpling integration branch for v0.67.12 ready for QE Hi Yuri, The dumpling integration branch for v0.67.12 as found at https://github.com/ceph/ceph/commits/dumpling-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 I think it would be best for the QE tests to use the dumpling-backports. The alternative would be to merge dumpling-backports into dumpling. However, since testing may take a long time and require more patches, it probably is better to not do that iterative process on the dumpling branch itself. As it is now, there already are a number of commits in the dumpling branch that should really be in the dumpling-backports: they do not belong to v0.67.11 and are going to be released in v0.67.12. In the future though, the dumpling branch will only receive commits that have been carefully tested and all the integration work will be on the dumpling-backports branch exclusively. So that third parties do not have to worry that the dumpling branch contains commits that have not been tested yet. 
Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dumpling integration branch for v0.67.12 ready for QE
On 10/02/2015 18:19, Yuri Weinstein wrote: Loic, The only difference between options if we run suites on merged dumpling vs dumpling-backports first - is time. We will have to run suites on the final branch after the merge anyway. Could you explain why ? After merging dumpling and dumpling-backports will be exactly the same. Loic - I feel that final QE validation should be done on the code that gets actually released to customers, e.g. dumpling branch after the merge. I do see your point about branches being identical and ready to change my mind if you insist. Does my reasoning make sense? Please advise, how we should proceed. Unless I hear otherwise, I will schedule suites to run on dumpling-backports first (as you are suggesting, with higher priority) and then assuming that we resolved all issues, we will run on the merged dumpling. Sage, pls correct if this is not what has to be done. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Yuri Weinstein ywein...@redhat.com Cc: Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 9:05:31 AM Subject: dumpling integration branch for v0.67.12 ready for QE Hi Yuri, The dumpling integration branch for v0.67.12 as found at https://github.com/ceph/ceph/commits/dumpling-backports has been approved by Yehuda, Josh and Sam and is ready for QE. For the record, the head is https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 I think it would be best for the QE tests to use the dumpling-backports. The alternative would be to merge dumpling-backports into dumpling. However, since testing may take a long time and require more patches, it probably is better to not do that iterative process on the dumpling branch itself. As it is now, there already are a number of commits in the dumpling branch that should really be in the dumpling-backports: they do not belong to v0.67.11 and are going to be released in v0.67.12. 
In the future though, the dumpling branch will only receive commits that have been carefully tested, and all the integration work will happen on the dumpling-backports branch exclusively, so that third parties do not have to worry that the dumpling branch contains commits that have not been tested yet. Cheers -- Loïc Dachary, Artisan Logiciel Libre -- To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: dumpling integration branch for v0.67.12 ready for QE
Great! As soon as it's merged I will schedule suites to run as listed somewhere below ... dumpling with higher priority and then giant. Thx YuriW - Original Message - From: Loic Dachary l...@dachary.org To: Sage Weil s...@newdream.net, Gregory Farnum g...@gregs42.com Cc: Yuri Weinstein ywein...@redhat.com, Ceph Development ceph-devel@vger.kernel.org Sent: Tuesday, February 10, 2015 11:06:43 AM Subject: Re: dumpling integration branch for v0.67.12 ready for QE Hi, That's too much information for me to digest quickly. Instead of stalling I will go ahead and merge the dumpling pull requests into the dumpling branch so that Yuri can proceed. And I'll take time to revise my understanding of the backport workflow with your input. Cheers On 10/02/2015 19:37, Sage Weil wrote: On Tue, 10 Feb 2015, Gregory Farnum wrote: On Tue, Feb 10, 2015 at 10:04 AM, Loic Dachary l...@dachary.org wrote: On 10/02/2015 18:29, Yuri Weinstein wrote: On 10/02/2015 18:19, Yuri Weinstein wrote: Loic, The only difference between the two options - running suites on the merged dumpling vs. dumpling-backports first - is time. We will have to run suites on the final branch after the merge anyway. Could you explain why? After merging, dumpling and dumpling-backports will be exactly the same. Loic - I feel that final QE validation should be done on the code that actually gets released to customers, i.e. the dumpling branch after the merge. I do see your point about the branches being identical and am ready to change my mind if you insist. Does my reasoning make sense? Please advise how we should proceed. The dumpling-backports branch currently is at https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 After a successful test run from QE and a merge into dumpling, the dumpling branch will be at https://github.com/ceph/ceph/commit/3944c77c404c4a05886fe8276d5d0dd7e4f20410 as well. In other words they are identical and there is no point in running the tests again.
The only reason why they could be different is if a commit is inadvertently added to the dumpling branch while testing happens on the dumpling-backports branch. In this case the presence of this new commit would indeed be reason enough to run another round of tests. So the process could be: If tests are ok and the merge can fast-forward, then release. If tests are ok and the merge cannot fast-forward, send it back to Loic because a commit was added by accident and needs to be approved by the leads. If testing happens on the dumpling branch, adding a commit to the dumpling branch would have side effects that could taint the results of the tests or, even worse, go unnoticed. There is zero chance that someone adds a commit to the dumpling-backports branch, and that gives us something stable. On the contrary, the odds that someone adds a commit to the dumpling branch are high, especially if the tests take a few weeks to complete. As Greg mentioned, merging into dumpling does not matter much for this round because it is what has been done in the past. And to be honest, I would not mind if an additional commit taints the process by accident. However, unless there is a reason not to, it would be good to establish a process that is solid if we can. I've witnessed Alfredo's pain on each release, and an additional benefit of having a dumpling-backports branch that nobody tampers with just occurred to me. When and if QE finds that dumpling-backports is fit for release, instead of merging it into dumpling it could be handed over to Alfredo for release. And he would be able to proceed knowing it is stable and won't be moving forward. Once the release is done and the tag set to the proper commit, the dumpling branch can be reset to dumpling-backports. If commits were added during the process, their authors could be notified that they were discarded and need to be merged again.
That would not work for the master branch but it would definitely be possible for the stable branches because such out-of-process commits are rarely added. I've not thought this through, but the more I think about it the more I like the idea of using dumpling-backports as a staging area until the release is final. What's the purpose of even having a dumpling branch at that point? We're not using it for anything under your model. Yeah, it seems to me like the same general process we use for 'next' and 'master' would work here:
- prepare a batch of backports, say dumpling-rgw-next
- run it through the rgw suite
- if that is okay, merge to dumpling
- run regular tests on dumpling (all suites)
so that dumpling acts as an integration branch the same way the others do. This is reasonably lightweight on process and means that our periodic scheduled runs are doing double duty for the integration testing and catching long-tail bugs. After talking through the last release vs 'next' branch race
QE validation of dumpling and giant releases
I am planning to schedule suites with high priority in the Octo lab and temporarily disable the schedule in crontab today until the validations are finished. Please let me know if you have any concerns about this. Thx YuriW
Re: dumpling giant backports update
Loic, Thanks for the updates! YuriW On 2/2/15 3:09 PM, Loic Dachary wrote: Hi Yuri, There is one remaining issue in the dumpling backports (the details are here http://tracker.ceph.com/issues/10560). The giant integration branch has been updated today with all the pending pull requests (rgw in particular) and the rbd, rados and rgw suites scheduled (the details are here http://tracker.ceph.com/issues/10501). I'll analyze the results as soon as one of them finishes. The previous run was good and I'm hopeful the additional backports won't create unexpected difficulties. Cheers P.S. I moved the branches inventory to a wiki updatable via git to save the tedious copy / paste. They are here now: http://workbench.dachary.org/ceph/ceph-backports/wikis/pages
Re: upcoming dumpling v0.67.12
Loic, Here is the run from sepia http://pulpito.front.sepia.ceph.com/ubuntu-2015-01-26_09:26:27-upgrade:dumpling-dumpling-distro-basic-vps/ Two failures seem like env noise. Thx YuriW On Mon, Jan 26, 2015 at 9:49 AM, Loic Dachary l...@dachary.org wrote: Thanks for letting me know about the upgrade tests results, it's encouraging :-) I'll let you know when the tests make progress. On 26/01/2015 18:00, Yuri Weinstein wrote: Loic, Thanks for the update. I ran upgrade/dumpling last week (and all 42 jobs passed in octo and sepia) to establish a baseline. And today I am running another one, assuming it will pick up the already merged pull requests. Let me know when you're ready for next steps. Thx YuriW On Mon, Jan 26, 2015 at 7:37 AM, Loic Dachary l...@dachary.org wrote: Hi Yuri, Here is a short update on the progress of the upcoming dumpling v0.67.12. It is tracked with http://tracker.ceph.com/issues/10560. In the inventory part, there is a list of all pull requests that are already merged in the dumpling branch. There is only one pull request waiting to be merged and three issues waiting for backports. While these last three are being worked on, I started the rbd, rgw and rados suites. I chose to display the inventory by pull request because I figured it would be more convenient to read, since sometimes a single pull request spans multiple issues ( https://github.com/ceph/ceph/pull/2611 for instance fixes two issues ). Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: ceph-qa analysis output
Hi Loic, I'd love to. Let's chat on Monday to finalize the format. ( PS: I think the format would be easier to maintain with new lines enforced, e.g. '701575', '701576', '701582', '701590' http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) jobs,... issue RE description (this I suppose can be retrieved from the tracker, right? If yes, we may not need it at all) ) Would it be possible to feed the output of those machine-digested emails (and hopefully others) into this doc - https://docs.google.com/a/inktank.com/spreadsheets/d/1S01gkuA149U5XSLStuzEoh-14tK2ICGm5tsCUomuzTo/edit#gid=403616374 (I granted you access as it's still under development/review) ? PPS: Again, this ticket http://tracker.ceph.com/issues/10455 would be helpful in what we are discussing here. Thx YuriW On Sat, Jan 17, 2015 at 5:07 AM, Loic Dachary l...@dachary.org wrote: Hi Yuri, It would be great if the analysis you compile daily was machine readable. For instance, in a mail you sent to ceph-qa I read '701575', '701576', '701582', '701590' - known issue http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) (duplicate of http://tracker.ceph.com/issues/10430) which could be something like: '701575', '701576', '701582', '701590': http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) or any other format you find easier to use consistently. The reason I ask is because it would help me write a script that associates redmine tickets with your findings, in the context of backporting. It's just an idea, not a request ;-) Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: ceph-qa analysis output
Sounds good, Loic. Are you aware, BTW, of the scrape tool written by John Spray? https://github.com/jcsp/scrape I use it for test run analysis often. Just FYI Thx YuriW On Sat, Jan 17, 2015 at 3:27 PM, Loic Dachary l...@dachary.org wrote: On 17/01/2015 23:08, Yuri Weinstein wrote: Hi Loic, I'd love to. Let's chat on Monday to finalize the format. ( PS: I think the format would be easier to maintain with new lines enforced, e.g. '701575', '701576', '701582', '701590' http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) Yes, as long as it's consistent enough to be machine readable, that works :-) PPS: Again, this ticket http://tracker.ceph.com/issues/10455 would be helpful in what we are discussing here. I'm not sure where the output of such a parsing would go. The redmine update API is currently broken (the read API works ok) but if it was fixed the tickets could indeed be updated. Cheers Thx YuriW On Sat, Jan 17, 2015 at 5:07 AM, Loic Dachary l...@dachary.org wrote: Hi Yuri, It would be great if the analysis you compile daily was machine readable. For instance, in a mail you sent to ceph-qa I read '701575', '701576', '701582', '701590' - known issue http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) (duplicate of http://tracker.ceph.com/issues/10430) which could be something like: '701575', '701576', '701582', '701590': http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num) or any other format you find easier to use consistently. The reason I ask is because it would help me write a script that associates redmine tickets with your findings, in the context of backporting. It's just an idea, not a request ;-) Cheers -- Loïc Dachary, Artisan Logiciel Libre
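The machine-readable line format Loic proposes above could be consumed by a short script along these lines. This is a sketch only: the exact format was still being finalized in this thread, so the regex below is an assumption rather than the format that was eventually adopted.

```python
import re

# Parse lines of the proposed shape:
#   '701575', '701576', ...: http://tracker.ceph.com/issues/NNNNN <description>
# into a mapping of teuthology job id -> (tracker URL, failure description).
LINE_RE = re.compile(r"^(?P<jobs>(?:'\d+',?\s*)+):\s*(?P<url>\S+)\s*(?P<desc>.*)$")

def parse_line(line):
    """Return {job_id: (tracker_url, description)} or None if the line
    does not match the expected format."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    jobs = re.findall(r"\d+", m.group("jobs"))
    return {job: (m.group("url"), m.group("desc")) for job in jobs}

line = ("'701575', '701576', '701582', '701590': "
        "http://tracker.ceph.com/issues/10543 FAILED assert(m_seed < old_pg_num)")
print(parse_line(line))
```

A script like this could then cross-reference the extracted tracker URLs with redmine tickets, which is the use case Loic describes.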
Re: Testing the next giant release
Look for them in the new Octo lab - http://pulpito.ceph.redhat.com/ On Wed, Jan 7, 2015 at 3:03 PM, Loic Dachary l...@dachary.org wrote: Thanks Yuri and Tamil! One last question: http://pulpito.ceph.com/?branch=giant does not show a run of rbd or rgw. They would be useful to figure out what kind of errors I should expect. Are past results archived elsewhere by any chance? On 07/01/2015 23:53, Tamil Muthamizhan wrote: yes, just those suites will do, Loic. On Wed, Jan 7, 2015 at 2:34 PM, Yuri Weinstein yuri.weinst...@inktank.com wrote: Loic, I think if you run those on bare metal (not vps) they will run on whatever machines are available in the octo or sepia labs. Thx YuriW On Wed, Jan 7, 2015 at 2:28 PM, Loic Dachary l...@dachary.org wrote: On 07/01/2015 23:20, Tamil Muthamizhan wrote: hi Loic, we have a suite to perform smoke tests for rados/rbd/rgw. maybe you can try that to make sure things work. Are these the suites I should try: https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados https://github.com/ceph/ceph-qa-suite/tree/master/suites/rbd https://github.com/ceph/ceph-qa-suite/tree/master/suites/rgw any specific settings, or just all three without restrictions (i.e. all os versions etc.)? Cheers once it looks good, we can have them scheduled to run using teuthology for a more elaborate run. Thanks, Tamil On Wed, Jan 7, 2015 at 6:02 AM, Loic Dachary l...@dachary.org wrote: Hi Tamil, I've merged / integrated the giant backports found at https://github.com/ceph/ceph/pull/3186 https://github.com/ceph/ceph/pull/3178 https://github.com/ceph/ceph/pull/2954 https://github.com/ceph/ceph/pull/3191 https://github.com/ceph/ceph/pull/3168 https://github.com/ceph/ceph/pull/3289 into http://workbench.dachary.org/ceph/ceph/commit/0ea20e6c51208d6710f469454ab3f964bfa7c9d2 and successfully ran make check on it http://workbench.dachary.org:8080/projects/10?ref=giant-backports If I'm not mistaken the next step would be to run teuthology.
If I'm to do it, would you be so kind as to let me know which suites are most relevant? If someone else will take care of it, should I push the integration branch somewhere? Cheers -- Loïc Dachary, Artisan Logiciel Libre -- Regards, Tamil
Re: Swift tests failing randomly
Here is what we have in vps.yaml now:

overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40

What do we want to add? On Mon, Aug 11, 2014 at 10:13 AM, Sage Weil sw...@redhat.com wrote: On Mon, 11 Aug 2014, Yehuda Sadeh wrote: Yeah, looking at these logs, it really seems that things are just going slow on these machines and hitting timeouts. The fix is ok with me, although I'd rather have it adjusted per machine type (somehow). There is a vps.yaml that bumps up another timeout, so we could put it there. Right now it lives on the teuthology machine (~teuthworker/vps.yaml I think?), but perhaps we should stick it in ceph-qa-suite.git somewhere ... sage Yehuda On Mon, Aug 11, 2014 at 9:21 AM, Loic Dachary l...@dachary.org wrote: Hi Yehuda, It looks like increasing the rgw idle timeout makes the problem go away ( https://github.com/ceph/ceph-qa-suite/pull/79 and http://tracker.ceph.com/issues/8988 ). It previously was 300 sec, which looks like a large value already. Does this fix / workaround make sense to you? Cheers On 10/08/2014 10:46, Loic Dachary wrote: Hi Yehuda, In the past few months the swift tests have failed randomly and I was unfortunately unable to figure out why. Here are a few examples: http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406944 http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406941 http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406946 http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-basic-vps/406947 and it has happened on every upgrade test run since I can remember. I fail to see a pattern and cannot figure out what the real problem is. It would be really great if you could take a look.
Even a hunch or a tip would be greatly appreciated :-) You can find more context in http://tracker.ceph.com/issues/8988 http://tracker.ceph.com/issues/8016 http://tracker.ceph.com/issues/7799 and discussions at http://www.spinics.net/lists/ceph-devel/msg19933.html Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: Swift tests failing randomly
I thought we could do the same at run-time for vps'es only. Sage? On Mon, Aug 11, 2014 at 11:47 AM, Loic Dachary l...@dachary.org wrote: On 11/08/2014 19:34, Yuri Weinstein wrote: Here is what we have in vps.yaml now: overrides: ceph: conf: global: osd heartbeat grace: 40 What do we want to add? I think the idle_timeout values at https://github.com/ceph/ceph-qa-suite/pull/79/files
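For reference, the resulting vps.yaml would look roughly like this. This is a sketch only: the osd heartbeat grace override is quoted in the thread above, but the placement and value of the idle timeout key are assumptions on my part; the actual change is in ceph-qa-suite pull request #79.

```yaml
# Existing override quoted in the thread:
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
  # Hypothetical addition: bump the rgw idle timeout for slow VPS machines.
  # Key name and value are assumptions, not taken from PR #79.
  rgw:
    default_idle_timeout: 1200
```

Keeping this in ceph-qa-suite.git rather than on the teuthology machine, as Sage suggests, would make the per-machine-type override visible and reviewable.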
Re: Firefly upgrade tests
I killed several runs that had been running for 2-3 days; hopefully it will speed up your runs. Thx YuriW On Sat, Jul 5, 2014 at 6:46 AM, Loic Dachary l...@dachary.org wrote: Hi, It looks like there is a shortage of VPS for some reason: http://pulpito.ceph.com/loic-2014-07-03_11:24:33-upgrade:firefly-x:stress-split-wip-8475-testing-basic-vps/ has a number of tests scheduled since ~48h and not making progress. Cheers On 04/07/2014 00:39, Loic Dachary wrote: Hi Ceph, The firefly-x upgrade test suite is designed to check that upgrading from Firefly to a newer version (master or a branch) works as expected. It was created by copying dumpling-x and can be browsed at https://github.com/ceph/ceph-qa-suite/tree/master/suites/upgrade/firefly-x To establish a baseline, a run was scheduled to upgrade from firefly to firefly (i.e. no upgrade really ;-) and it should therefore show that when nothing happens all is well. It however fails in various ways, as can be seen here: ./virtualenv/bin/teuthology-suite --suite upgrade/firefly-x/stress-split --suite-dir ~/software/ceph/ceph-qa-suite --ceph firefly --machine-type vps --email l...@dachary.org http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/ * Command failed on vpm105 with status 1: 'sudo yum install -y http://gitbuilder.ceph.com/kernel-rpm-redhatenterpriseserver6-x86_64-basic/sha1/8102ce7556a99f6348067c60583320d308f36362/kernel.x86_64.rpm' Does that mean kernels are not ready yet for this distribution and the tests should be skipped?
* Command failed on vpm058 with status 1: SWIFT_TEST_CONFIG_FILE=/home/ubuntu/cephtest/archive/testswift.client.0.conf /home/ubuntu/cephtest/swift/virtualenv/bin/nosetests -w /home/ubuntu/cephtest/swift/test/functional -v -a '!fails_on_rgw' http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338941 Although it looks like http://tracker.ceph.com/issues/7808 which is a duplicate of http://tracker.ceph.com/issues/7799 it is slightly different, and http://tracker.ceph.com/issues/8735 was created to keep track of it. * Command failed on vpm070 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-osd -f -i 1' http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338904/ Although the root of the error seems to be that osd 1 cannot be killed by the thrasher, I don't see meaningful error messages. http://tracker.ceph.com/issues/8736 was filed to keep track of this condition.
* timed out waiting for admin_socket to appear after osd.1 restart http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338908/ It looks like a race: the osd is killed at the same time it is restarted by the thrasher, and http://tracker.ceph.com/issues/8737 was opened for this * hang on INFO:teuthology.task.rados:joining rados http://pulpito.ceph.com/loic-2014-07-02_23:05:05-upgrade:firefly-x:stress-split-firefly-testing-basic-vps/338915/ It looks like a bug and http://tracker.ceph.com/issues/8740 was filed When the same suite is run to upgrade from firefly to master it gives http://pulpito.ceph.com/loic-2014-07-02_22:04:23-upgrade:firefly-x:stress-split-master-testing-basic-vps/ which shows the following errors: * Command failed on vpm105 with status 1: 'sudo yum install -y http://gitbuilder.ceph.com/kernel-rpm-redhatenterpriseserver6-x86_64-basic/sha1/8102ce7556a99f6348067c60583320d308f36362/kernel.x86_64.rpm' (same as above) * Could not reconnect to ubu...@vpm042.front.sepia.ceph.com : it looks like a transient timeout problem that can be ignored http://pulpito.ceph.com/loic-2014-07-02_22:04:23-upgrade:firefly-x:stress-split-master-testing-basic-vps/338891/ 2014-07-02T18:52:24.546 INFO:teuthology.orchestra.connection:{'username': u'ubuntu', 'hostname': u'vpm042.front.sepia.ceph.com', 'timeout': 60} * Command failed on vpm017 with status 1: SWIFT_TEST_CONFIG_FILE=/home/ubuntu/cephtest/archive/testswift.client.0.conf /home/ubuntu/cephtest/swift/virtualenv/bin/nosetests -w /home/ubuntu/cephtest/swift/test/functional -v -a '!fails_on_rgw' One of which looks exactly like http://tracker.ceph.com/issues/7799 which was re-opened * hang on INFO:teuthology.task.rados:joining rados (same as above) Cheers -- Loïc Dachary, Artisan Logiciel Libre
Re: teuthology task waiting for machines ( 8h)
Technically yes. If the queue is busy, patience is needed - assuming that there are no runs in the queue which are hung. Zack is diligently looking into and fixing things to prevent hung tests. If we see runs older than, say, one day, we kill them (although 'teuthology-kill' is not working for me today :( ) Another option to speed up a run is to use PRIO (for priority) when scheduling it, and/or to use machines other than plana, as they are in high demand. Thx YuriW On Sat, Jun 28, 2014 at 3:27 AM, Loic Dachary l...@dachary.org wrote: Hi Zack, http://pulpito.ceph.com/loic-2014-06-27_18:45:37-upgrade:firefly-x:stress-split-wip-8475-testing-basic-plana/329515/ seems to indicate that the task cannot obtain the machines it needs: 2014-06-27T17:55:19.072 INFO:teuthology.task.internal:Locking machines... 2014-06-27T17:55:19.110 INFO:teuthology.task.internal:waiting for more machines to be free (need 3 see 5)... 2014-06-27T17:55:29.175 INFO:teuthology.task.internal:waiting for more machines to be free (need 3 see 5)... ... 2014-06-28T03:22:13.745 INFO:teuthology.task.internal:waiting for more machines to be free (need 3 see 0)... 2014-06-28T03:22:23.787 INFO:teuthology.task.internal:waiting for more machines to be free (need 3 see 0)... Is it something expected (for instance when tasks with a higher priority take precedence)? If it is, then all that's needed is patience, right? Cheers -- Loïc Dachary, Artisan Logiciel Libre
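Yuri's one-day rule for spotting hung runs is simple to automate. The sketch below is illustrative only: the run list and its field layout are invented for the example and are not teuthology's actual queue API.

```python
from datetime import datetime, timedelta

def stale_runs(runs, now, max_age=timedelta(days=1)):
    """Return the names of runs older than max_age (candidates to kill)."""
    return [name for name, started in runs if now - started > max_age]

# Hypothetical queue snapshot; names and timestamps are made up.
now = datetime(2014, 6, 28, 12, 0)
runs = [
    ("loic-2014-06-25_09:00:00-upgrade-vps", datetime(2014, 6, 25, 9, 0)),  # ~3 days old
    ("yuriw-2014-06-28_08:00:00-rados-vps", datetime(2014, 6, 28, 8, 0)),   # 4 hours old
]
print(stale_runs(runs, now))  # → ['loic-2014-06-25_09:00:00-upgrade-vps']
```

In practice the run names returned by such a filter would be fed to teuthology-kill, as Yuri describes.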
Re: Recommended teuthology upgrade test
Loic, I don't intend to answer all the questions, but here is some info, see inline. On Fri, Jun 27, 2014 at 8:16 AM, Loic Dachary l...@dachary.org wrote: Hi Sam, TL;DR: what oneliner do you recommend to run upgrade tests for https://github.com/ceph/ceph/pull/1890 ? Running the rados suite can be done with: ./schedule_suite.sh rados wip-8071 testing l...@dachary.org basic master plana It was replaced with teuthology-suite, see --help for more info or something else, since ./schedule_suite.sh was recently obsoleted ( http://tracker.ceph.com/issues/8678 ). Running something similar for upgrade will presumably run all of https://github.com/ceph/ceph-qa-suite/tree/master/suites/upgrade Is there a way to run minimal tests by limiting the upgrade suite so that it only focuses on a firefly cluster that upgrades to https://github.com/ceph/ceph/pull/1890, so that it checks the behavior when running a mixed cluster (firefly + master with the change)? You can run a smaller suite by specifying it as the argument, like this: dumpling-x/parallel It looks like http://pulpito.ceph.com/?suite=upgrade was never run ( at least that's what appears to cause http://tracker.ceph.com/issues/8681 ) Is http://pulpito.ceph.com/?suite=upgrade-rados a good fit? If so, is there a way to figure out how it was created? Cheers -- Loïc Dachary, Artisan Logiciel Libre