Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-22 Thread Igor Fedotov

Hi Florian,


On 11/21/2018 7:01 PM, Florian Engelmann wrote:

Hi Igor,

Sad to say, but I failed to build the tool. I tried to build the whole
project as documented here:


http://docs.ceph.com/docs/mimic/install/build-ceph/

But as my workstation is running Ubuntu, the binary fails on SLES:

./ceph-bluestore-tool --help
./ceph-bluestore-tool: symbol lookup error: ./ceph-bluestore-tool: undefined symbol: _ZNK7leveldb6Status8ToStringB5cxx11Ev


I copied all libraries to ~/lib and exported LD_LIBRARY_PATH, but that 
did not solve the problem.


Is there any simple way to build just ceph-bluestore-tool, standalone 
and statically linked?



Unfortunately, I don't know of such a method.

Maybe try hex editing instead?
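
If you do want to retry the build: the undefined _ZNK7leveldb6Status8ToStringB5cxx11Ev symbol suggests a libstdc++ C++11-ABI mismatch between the Ubuntu-built binary and the leveldb shipped with SLES, so building on (or in a container of) the target distribution is likely the more reliable path. A rough, untested sketch of building only that one target:

    # on a SLES/openSUSE host or container matching the OSD nodes (assumption)
    git clone --recursive -b v12.2.8 https://github.com/ceph/ceph.git
    cd ceph
    ./install-deps.sh                       # install build dependencies for this distro
    ./do_cmake.sh                           # configure a build tree in ./build
    cd build
    make -j"$(nproc)" ceph-bluestore-tool   # build only this target, not the whole tree
    ./bin/ceph-bluestore-tool --help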


All the best,
Florian


On 11/21/18 at 9:34 AM, Igor Fedotov wrote:
Actually (given that your devices are already expanded) you don't 
need to expand them again - you can just update the size labels with 
my new PR.


For new migrations, though, you can use the updated bluefs expand 
command, which sets the size label automatically.



Thanks,
Igor
On 11/21/2018 11:11 AM, Florian Engelmann wrote:
Great support, Igor, both thumbs up! We will try to build the tool 
today and expand those bluefs devices again.



On 11/20/18 at 6:54 PM, Igor Fedotov wrote:

FYI: https://github.com/ceph/ceph/pull/25187


On 11/20/2018 8:13 PM, Igor Fedotov wrote:


On 11/20/2018 7:05 PM, Florian Engelmann wrote:

On 11/20/18 at 4:59 PM, Igor Fedotov wrote:



On 11/20/2018 6:42 PM, Florian Engelmann wrote:

Hi Igor,



what's your Ceph version?


12.2.8 (SES 5.5 - patched to the latest version)



Can you also check the output for

ceph-bluestore-tool show-label -p 


ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0/
infering bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-0//block": {
    "osd_uuid": "1e5b3908-20b1-41e4-b6eb-f5636d20450b",
    "size": 8001457295360,
    "btime": "2018-06-29 23:43:12.088842",
    "description": "main",
    "bluefs": "1",
    "ceph_fsid": "a146-6561-307e-b032-c5cee2ee520c",
    "kv_backend": "rocksdb",
    "magic": "ceph osd volume v026",
    "mkfs_done": "yes",
    "ready": "ready",
    "whoami": "0"
    },
    "/var/lib/ceph/osd/ceph-0//block.wal": {
    "osd_uuid": "1e5b3908-20b1-41e4-b6eb-f5636d20450b",
    "size": 524288000,
    "btime": "2018-06-29 23:43:12.098690",
    "description": "bluefs wal"
    },
    "/var/lib/ceph/osd/ceph-0//block.db": {
    "osd_uuid": "1e5b3908-20b1-41e4-b6eb-f5636d20450b",
    "size": 524288000,
    "btime": "2018-06-29 23:43:12.098023",
    "description": "bluefs db"
    }
}





It should report a 'size' label for every volume; please check that 
they contain the new values.




That's exactly the problem: neither "ceph-bluestore-tool 
show-label" nor "ceph daemon osd.0 perf dump|jq '.bluefs'" 
recognizes the new sizes. But we are 100% sure the new devices 
are in use, as we already deleted the old ones...


We tried to delete the key "size" in order to add one with the new 
value, but:


ceph-bluestore-tool rm-label-key --dev /var/lib/ceph/osd/ceph-0/block.db -k size

key 'size' not present

even though:

ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-0/block.db

{
    "/var/lib/ceph/osd/ceph-0/block.db": {
    "osd_uuid": "1e5b3908-20b1-41e4-b6eb-f5636d20450b",
    "size": 524288000,
    "btime": "2018-06-29 23:43:12.098023",
    "description": "bluefs db"
    }
}

So it looks like the key "size" is "read-only"?


There was a bug in updating specific keys, see
https://github.com/ceph/ceph/pull/24352

This PR also eliminates the need to set sizes manually on 
bdev-expand.


I thought it had been backported to Luminous, but it looks like 
it hasn't.

Will submit a PR shortly.




Thank you so much, Igor! So we have to decide how to proceed. 
Maybe you could help us here as well.


Option A: Wait for this fix to be available. -> could take weeks 
or even months

If you can build a custom version of ceph-bluestore-tool, then this 
is a short path. I'll submit a patch today or tomorrow, which you 
would need to integrate into your private build.

Then you only need to upgrade the tool and apply the new sizes.
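
Applying the new sizes with the patched tool could then look something like this (a sketch only - it assumes the fixed set-label-key accepts the 'size' key, which is exactly what the bug above prevented):

    OSD=0
    for part in block.db block.wal; do
        dev=/var/lib/ceph/osd/ceph-$OSD/$part
        # the label must carry exactly the size of the new partition
        newsize=$(blockdev --getsize64 "$(realpath "$dev")")
        ceph-bluestore-tool set-label-key --dev "$dev" -k size -v "$newsize"
    done
    # verify the labels now show the new sizes
    ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-$OSD/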



Option B: Recreate OSDs "one-by-one". -> will take a very long 
time as well

No need for that IMO.


Option C: Is there some "low-level" command allowing us to fix 
those sizes?

Well, a hex editor might help here as well. What you need is just to 
update the 64-bit size value in the block.db and block.wal files. In 
my lab I can find it at offset 0x52. Most probably this is a fixed 
location, but it's better to check beforehand - the existing value 
should correspond to the one reported by show-label. Or I can do that 
for you - please send me the first 4K chunks along with the 
corresponding label report.
Then update them with the new values - the field has to contain 
exactly the same size as your new partition.
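
A minimal sketch of that hex edit (assumptions: the size field really is the 8-byte little-endian value at offset 0x52, as in Igor's lab - verify that the value you read back matches show-label before writing anything, and keep a backup of the first 4K):

    DEV=/var/lib/ceph/osd/ceph-0/block.db   # example device
    OFF=$((0x52))                           # assumed offset of the size field

    # back up the label area first
    dd if="$DEV" of=/root/osd0-block.db.label.bak bs=4096 count=1

    # read the current 64-bit little-endian value - it should equal the old
    # size reported by show-label (524288000 here)
    old=$(( 16#$(xxd -s "$OFF" -l 8 -p "$DEV" | fold -w2 | tac | tr -d '\n') ))
    echo "label size currently: $old"

    # the new value has to be exactly the size of the new partition
    NEW=$(blockdev --getsize64 "$(realpath "$DEV")")

    # write it back as 8 little-endian bytes
    printf '%016x' "$NEW" | fold -w2 | tac | tr -d '\n' | xxd -r -p | \
        dd of="$DEV" bs=1 seek="$OFF" count=8 conv=notrunc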










Thanks,

Igor


Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov

Hi Florian,

what's your Ceph version?

Can you also check the output for

ceph-bluestore-tool show-label -p 


It should report a 'size' label for every volume; please check that 
they contain the new values.



Thanks,

Igor


On 11/20/2018 5:29 PM, Florian Engelmann wrote:

Hi,

today we migrated all of our RocksDB and WAL devices to new ones. The 
new ones are much bigger (500MB for WAL/DB -> 60GB DB and 2GB WAL) and 
LVM based.

We migrated like this:

    export OSD=x

    # stop the OSD before touching its DB/WAL devices
    systemctl stop ceph-osd@$OSD

    # create the new, bigger LVs for DB and WAL
    lvcreate -n db-osd$OSD -L60g data || exit 1
    lvcreate -n wal-osd$OSD -L2g data || exit 1

    # copy the old DB/WAL contents onto the new LVs
    dd if=/var/lib/ceph/osd/ceph-$OSD/block.wal of=/dev/data/wal-osd$OSD bs=1M || exit 1
    dd if=/var/lib/ceph/osd/ceph-$OSD/block.db of=/dev/data/db-osd$OSD bs=1M || exit 1

    # point the OSD at the new devices
    rm -v /var/lib/ceph/osd/ceph-$OSD/block.db || exit 1
    rm -v /var/lib/ceph/osd/ceph-$OSD/block.wal || exit 1
    ln -vs /dev/data/db-osd$OSD /var/lib/ceph/osd/ceph-$OSD/block.db || exit 1
    ln -vs /dev/data/wal-osd$OSD /var/lib/ceph/osd/ceph-$OSD/block.wal || exit 1

    # fix ownership on the new devices and the symlinks
    chown -c ceph:ceph $(realpath /dev/data/db-osd$OSD) || exit 1
    chown -c ceph:ceph $(realpath /dev/data/wal-osd$OSD) || exit 1
    chown -ch ceph:ceph /var/lib/ceph/osd/ceph-$OSD/block.db || exit 1
    chown -ch ceph:ceph /var/lib/ceph/osd/ceph-$OSD/block.wal || exit 1

    # let BlueFS grow into the larger devices
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$OSD/ || exit 1

    systemctl start ceph-osd@$OSD


Everything went fine, but it looks like the DB and WAL sizes are still 
the old ones:


ceph daemon osd.0 perf dump|jq '.bluefs'
{
  "gift_bytes": 0,
  "reclaim_bytes": 0,
  "db_total_bytes": 524279808,
  "db_used_bytes": 330301440,
  "wal_total_bytes": 524283904,
  "wal_used_bytes": 69206016,
  "slow_total_bytes": 320058949632,
  "slow_used_bytes": 13606322176,
  "num_files": 220,
  "log_bytes": 44204032,
  "log_compactions": 0,
  "logged_bytes": 31145984,
  "files_written_wal": 1,
  "files_written_sst": 1,
  "bytes_written_wal": 37753489,
  "bytes_written_sst": 238992
}


Even though the new block devices are recognized correctly:

2018-11-20 11:40:34.653524 7f70219b8d00  1 bdev(0x5647ea9ce200 /var/lib/ceph/osd/ceph-0/block.db) open size 64424509440 (0xf00000000, 60GiB) block_size 4096 (4KiB) non-rotational
2018-11-20 11:40:34.653532 7f70219b8d00  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-0/block.db size 60GiB

2018-11-20 11:40:34.662385 7f70219b8d00  1 bdev(0x5647ea9ce600 /var/lib/ceph/osd/ceph-0/block.wal) open size 2147483648 (0x80000000, 2GiB) block_size 4096 (4KiB) non-rotational
2018-11-20 11:40:34.662406 7f70219b8d00  1 bluefs add_block_device bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 2GiB



Are we missing some command to "notify" rocksdb about the new device 
size?


All the best,
Florian
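
For reference, a quick way to cross-check what BlueFS reports against what the devices and labels actually say (a sketch, not part of the original mail; it assumes jq and blockdev are available on the OSD host):

    OSD=0
    # what BlueFS currently believes
    ceph daemon osd.$OSD perf dump | jq '.bluefs | {db_total_bytes, wal_total_bytes}'
    # what the devices actually provide
    blockdev --getsize64 "$(realpath /var/lib/ceph/osd/ceph-$OSD/block.db)"
    blockdev --getsize64 "$(realpath /var/lib/ceph/osd/ceph-$OSD/block.wal)"
    # what the on-disk labels claim
    ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-$OSD/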


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com