Re: [ceph-users] Looking for the best way to utilize 1TB NVMe added to the host with 8x3TB HDD OSDs

2019-09-22 Thread Wladimir Mutel

Ashley Merrick wrote:

Correct, in a large cluster no problem.

I was talking about Wladimir's setup, where they are running a single node with 
a failure domain of OSD. There, losing the NVMe would mean losing all OSDs and all data.


	Sure, I am aware that running with 1 NVMe is risky, so we have a plan to 
add a mirroring NVMe to it at some point in the future. I hope this can be 
solved with a simple mdadm+LVM scheme.
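
	Something along these lines, I imagine (a sketch only; device names and 
sizes are assumptions, and building the mirror wipes both devices, so it has 
to happen before the DB/WAL goes onto it):

# assumed: /dev/nvme0n1 is the current NVMe, /dev/nvme1n1 the future second one
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
pvcreate /dev/md0
vgcreate nvme_mirror /dev/md0
# one DB/WAL logical volume per OSD; 100G is an arbitrary size here
lvcreate -L 100G -n osd0-db nvme_mirror
# ...repeat for the remaining OSDs and point each OSD's block.db at its LV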


	Btw, are there any recommendations on the cheapest Ceph node hardware ? Now 
I understand that 8x3TB HDDs in a single host is quite a centralized 
setup, and I have a feeling that a good Ceph cluster should have more 
hosts than OSDs in each host. Like, with 8 OSDs per host, at least 8 
hosts. Or at least 3 hosts with 3 OSDs in each. Right ? Then it 
would be reasonable to add a single NVMe per host, to allow any component 
of the host to fail within failure domain=host.
I am still thinking within the cheapest concept of multiple HDDs plus a 
single NVMe per host.


 On Sun, 22 Sep 2019 03:42:52 +0800 solarflow99 <solarflo...@gmail.com> wrote 


Now my understanding is that an NVMe drive is recommended to help
speed up bluestore.  If it were to fail then those OSDs would be
lost, but assuming there is 3x replication and enough OSDs I don't
see the problem here.  There are other scenarios where a whole
server might be lost; it doesn't mean the total loss of the cluster.


On Sat, Sep 21, 2019 at 5:27 AM Ashley Merrick
<singap...@amerrick.co.uk> wrote:

Placing it as a Journal / Bluestore DB/WAL will help mostly with writes;
by the sounds of it you want to increase read performance? How
important is the data on this Ceph cluster?

If you place it as a Journal / DB/WAL, any failure of it will cause
total data loss, so I would very much advise against this unless
this is purely for testing and total data loss is not an issue.

In that case it is worth moving to BlueStore by rebuilding each
OSD with the DB/WAL placed on an SSD partition. You can do this one
OSD at a time, but there is no in-place migration path, so you would
need to wait for the data to rebuild after each OSD change before
moving on to the next.
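
A rough sketch of cycling one OSD that way (OSD id, device and partition
names are placeholders; wait for the cluster to return to HEALTH_OK before
touching the next one):

ceph osd out 3
# wait for the data to rebalance away, then:
systemctl stop ceph-osd@3
ceph osd purge 3 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdd --destroy
ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p4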

If you need to make sure your data is safe then you're really
limited to using it as a read-only cache, but I think even then
most setups would cause all OSDs to go offline until you
manually removed the failed disk from the read-only cache.
bcache/dm-cache may handle that automatically, but it is still a
risk that I personally wouldn't want to take.
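
For reference, a cache-safe bcache setup (reads cached, writes passed through
to the HDD) would look roughly like this; device names are assumptions, and
make-bcache comes from bcache-tools:

make-bcache -C /dev/nvme0n1p1 -B /dev/sdb
# writethrough keeps every write on the HDD, so losing the cache device loses no data
echo writethrough > /sys/block/bcache0/bcache/cache_mode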

Also, what the best option is really depends on your use case for
Ceph and the expected I/O activity.



 On Fri, 20 Sep 2019 14:56:12 +0800 Wladimir Mutel
<m...@mwg.dp.ua> wrote 

 Dear everyone,

 Last year I set up an experimental Ceph cluster (still
single node,
failure domain = osd, MB Asus P10S-M WS, CPU Xeon E3-1235L,
RAM 64 GB,
HDDs WD30EFRX, Ubuntu 18.04, now with kernel 5.3.0 from
Ubuntu mainline
PPA and Ceph 14.2.4 from
http://download.ceph.com/debian-nautilus/dists/bionic ).
I set up JErasure 2+1 pool, created some RBDs using that
as data pool
and exported them by iSCSI (using tcmu-runner, gwcli and
associated
packages). But with HDD-only setup their performance was
less than
stellar, not saturating even 1Gbit Ethernet on RBD reads.

 This year my experiment was funded with Gigabyte PCIe
NVMe 1TB SSD
(GP-ASACNE2100TTTDR). Now it is plugged in the MB and is
visible as a
storage device to lsblk. Also I can see its 4 interrupt
queues in
/proc/interrupts, and its transfer measured by hdparm -t is
about 2.3GB/sec.

 And now I want to ask your advice on how to best
include it into this
already existing setup. Should I allocate it for OSD
journals and
databases ? Is there a way to reconfigure existing OSD in
this way
without destroying and recreating it ? Or are there plans to
ease this
kind of migration ? Can I add it as a write-absorbing cache to
individual RBD images ? To individual block devices at the
level of
bcache/dm-cache ? What about speeding up RBD reads ?

 I would appreciate your opinions and
recommendations.
 (Just want to warn you that in this situation I don't
have the financial
option of going full-SSD.)

 

[ceph-users] Looking for the best way to utilize 1TB NVMe added to the host with 8x3TB HDD OSDs

2019-09-20 Thread Wladimir Mutel

Dear everyone,

	Last year I set up an experimental Ceph cluster (still single node, 
failure domain = osd, MB Asus P10S-M WS, CPU Xeon E3-1235L, RAM 64 GB, 
HDDs WD30EFRX, Ubuntu 18.04, now with kernel 5.3.0 from Ubuntu mainline 
PPA and Ceph 14.2.4 from download.ceph.com/debian-nautilus/dists/bionic 
). I set up JErasure 2+1 pool, created some RBDs using that as data pool 
and exported them by iSCSI (using tcmu-runner, gwcli and associated 
packages). But with HDD-only setup their performance was less than 
stellar, not saturating even 1Gbit Ethernet on RBD reads.


	This year my experiment was funded with Gigabyte PCIe NVMe 1TB SSD 
(GP-ASACNE2100TTTDR). Now it is plugged in the MB and is visible as a 
storage device to lsblk. Also I can see its 4 interrupt queues in 
/proc/interrupts, and its transfer measured by hdparm -t is about 2.3GB/sec.


	And now I want to ask your advice on how to best include it into this 
already existing setup. Should I allocate it for OSD journals and 
databases ? Is there a way to reconfigure existing OSD in this way 
without destroying and recreating it ? Or are there plans to ease this 
kind of migration ? Can I add it as a write-absorbing cache to 
individual RBD images ? To individual block devices at the level of 
bcache/dm-cache ? What about speeding up RBD reads ?
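
	One possibly relevant option: recent Nautilus releases include
ceph-bluestore-tool subcommands for attaching a new DB device to an existing
BlueStore OSD in place. A rough, untested sketch, with the OSD id and NVMe
partition as placeholders:

systemctl stop ceph-osd@0
ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-0 --dev-target /dev/nvme0n1p1
# make sure the new block.db symlink/device ends up owned by ceph:ceph
systemctl start ceph-osd@0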


I would appreciate your opinions and recommendations.
	(Just want to warn you that in this situation I don't have the financial 
option of going full-SSD.)


Thank you all in advance for your response


Re: [ceph-users] Mimic - EC and crush rules - clarification

2018-11-01 Thread Wladimir Mutel

David Turner wrote:
Yes, when creating an EC profile, it automatically creates a CRUSH rule 
specific for that EC profile.  You are also correct that 2+1 doesn't 
really have any resiliency built in.  2+2 would allow 1 node to go down 
while still having your data accessible.  It will use 2x data to raw as 


Is not EC 2+2 the same as 2x replication (i.e. RAID1) in terms of raw space used ?
Is not the benefit and intention of EC to allow equivalent replication
factors to be chosen between >1 and <2 ?
That's why it is recommended to have m>=2
and k>m. Overall, your reliability in Ceph comes down to the
cluster rebuild/performance-degradation time after
up to m OSD failures, provided that no more than m OSDs
(or larger failure domains) have failed at once.
Sure, EC is beneficial only when you have enough failure domains
(i.e. hosts). My criterion is that you should have more hosts
than you have individual OSDs within a single host,
i.e. at least 8 (and better >8) hosts when you have 8 OSDs
per host.
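
	For concreteness, a hedged sketch of creating such a profile and an RBD
data pool on it (profile/pool names and PG counts are arbitrary examples):

ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
ceph osd pool create ecdata 128 128 erasure ec42
ceph osd pool set ecdata allow_ec_overwrites true   # required for RBD on an EC data pool
# image metadata stays in the replicated 'rbd' pool, data goes to the EC pool
rbd create --size 100G --data-pool ecdata rbd/testimage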

opposed to the 1.5x of 2+1, but it gives you resiliency.  The example in 
your command of 3+2 is not possible with your setup.  May I ask why you 
want EC on such a small OSD count?  I'm guessing to not use as much 
storage on your SSDs, but I would just suggest going with replica with 
such a small cluster.  If you have a larger node/OSD count, then you can 
start seeing if EC is right for your use case, but if this is production 
data... I wouldn't risk it.



When setting the crush rule, it wants the name of it, ssdrule, not 2.






[ceph-users] Upgrade Ceph 13.2.0 -> 13.2.1 and Windows iSCSI clients breakup

2018-07-28 Thread Wladimir Mutel

Dear all,

	I want to share some experience of upgrading my experimental 1-host 
Ceph cluster from v13.2.0 to v13.2.1.
 First, I fetched new packages and installed them using 'apt 
dist-upgrade', which went smooth as usual.
 Then I noticed from 'lsof', that Ceph daemons were not restarted after 
the upgrade ('ceph osd versions' still showed 13.2.0).
 Using instructions on Luminous->Mimic upgrade, I decided to restart 
ceph-{mon,mgr,osd}.targets.
 And surely, on restarting ceph-osd.target, iSCSI sessions had been 
broken on tcmu-runner side ('Timing out cmd', 'Handler connection 
lost'), and Windows (2008 R2) clients lost their iSCSI devices.

 But that was only the beginning of the surprises that followed.
Looking into Windows Disk Management, I noticed that the iSCSI disks were 
re-detected with a size about 0.12 GB larger, i.e. 2794.52 GB instead of 
2794.40 GB, and of course the system lost sight of their GPT labels. 
I quickly checked 'rbd info' on the Ceph side and did not notice any 
increase in the RBD images' size. They were still exactly 715398 4MB objects, 
as I intended initially.
 Restarting iSCSI initiator service on Windows did not help. Restarting 
the whole Windows did not help. Restarting tcmu-runner on Ceph side did 
not help. What resolved the problem, to my great surprise, was 
_removing/re-adding MPIO feature and re-adding iSCSI multipath support_.
After that, Windows detected iSCSI disks with proper size again, and 
restored visibility of GPT partitions, dynamic disk metadata and all the 
rest.


Ok, I avoided data loss at this time, but I have some remaining 
questions :

1. Can Ceph minor version upgrades be made less disruptive and 
traumatic? Like, some kind of staged/rolling OSD daemon restart within a 
single upgraded host, without losing librbd sessions ?
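
	Restarting the OSDs one at a time with noout set, rather than the whole
ceph-osd.target, at least keeps PGs available throughout; librbd/tcmu sessions
still see short stalls, but not a full outage. A sketch:

ceph osd set noout
for id in 0 1 2 3 4 5 6 7; do        # OSD ids on this host
    systemctl restart ceph-osd@"$id"
    # let peering/recovery settle before the next one
    while ! ceph health | grep -q HEALTH_OK; do sleep 10; done
done
ceph osd unset noout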


2. Is Windows (2008 R2) MPIO support really that screwed & crippled ? 
Were there any improvements in Win2012/2016 ? I have physical servers 
with Windows 2008 R2, and I would like to mirror their volumes to Ceph 
iSCSI targets, then convert them into QEMU/KVM virtual machines where 
the same data will be accessed with librbd. During my initial 
experiments, I found that reinstalling MPIO & re-enabling iSCSI 
multipath would fix most problems in Windows iSCSI access, but I would 
like to have a faster way of resetting iSCSI+MPIO state when something 
is going wrong on Windows side like in my case.


3. Does anybody have an idea where these extra 0.12 GB (probably 120 or 128 MB) 
came from ?


Thank you in advance for your responses.


Re: [ceph-users] RBD image repurpose between iSCSI and QEMU VM, how to do properly ?

2018-07-15 Thread Wladimir Mutel

Jason Dillaman wrote:


  I am doing more experiments with Ceph iSCSI gateway and I am a bit
confused on how to properly repurpose an RBD image from iSCSI target
into QEMU virtual disk and back


This isn't really a use case that we support nor intend to support. Your 
best bet would be to use an initiator in your linux host to connect to 
the same LUN as is being exported over iSCSI (just make sure the NTFS 
file system is quiesced / frozen).


	After some more tries I found a way which is convenient enough for me: 
for every step which requires changing an RBD's role from qemu/librbd to 
iscsi or back, I create an RBD snapshot and a clone of this snapshot under 
some new name, and assign the clone to qemu/librbd or to gwcli/iscsi 
accordingly. Then I can easily drop the original RBDs once they are unneeded.


The new Mimic functionality which frees me from mandatory snapshot protection 
is of great help here.
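
	In commands, the workflow looks roughly like this (image and snapshot names
are examples only, and it assumes the cluster already allows v2 clones as
described above):

rbd snap create libvirt/win-disk@handover
rbd clone libvirt/win-disk@handover libvirt/win-disk-for-iscsi
# with clone v2 the snapshot does not have to be protected first
rbd flatten libvirt/win-disk-for-iscsi   # optional: detach from the parent so it can be dropped later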




[ceph-users] chkdsk /b fails on Ceph iSCSI volume

2018-07-15 Thread Wladimir Mutel

Hi,

	I cloned an NTFS filesystem with bad blocks from a USB HDD onto a Ceph RBD 
volume (using ntfsclone, so the copy has sparse regions), and decided to 
clear the bad blocks within the copy. I ran chkdsk /b from Windows and it 
fails on free space verification (step 5 of 5).

In tcmu-runner.log I see that command 8f (SCSI Verify) is not supported.
Does it mean that I should not try to run chkdsk /b on this volume at 
all ? (it seems that bad blocks were re-verified and cleared)

Are there any plans to make user:rbd backstore support verify requests ?

Thanks in advance for your replies.


[ceph-users] RBD image repurpose between iSCSI and QEMU VM, how to do properly ?

2018-07-02 Thread Wladimir Mutel
 Dear all,

 I am doing more experiments with Ceph iSCSI gateway and I am a bit confused on 
how to properly repurpose an RBD image from iSCSI target into QEMU virtual disk 
and back

 First, I create an RBD image and set it up as iSCSI backstore in gwcli, 
specifying its size exactly to avoid unwanted resizes
 Next, I connect Windows 2008 R2 to this image (enable MPIO before connect and 
select MPIO policy 'Failover only' for the accessed device)
 Then in Windows Disk Management I initialize the physical disk with GPT, 
convert it into Dynamic disk and create a simple NTFS volume in its free space
 Then in the same console I put the disk 'offline', and in iSCSI control panel 
I disconnect the session from Windows side

 Then I attach the same RBD image to QEMU/KVM virtual machine with Ubuntu 18.04 
as virtio/librbd storage drive
 Then I boot Ubuntu 18.04 VM, find NTFS filesystem using 'ldmtool create all', 
and during ntfsclone from external disk I discover that RBD image is mapped 
read-only
 Ok, I stop Ubuntu VM, do 'rbd lock rm' for this image (lock is held by 
tcmu-runner, I suppose), restart Ubuntu, restart ntfsclone, and this time it is 
going well.
 Btw, ntfsclone onto device-mapper target created by ldmtool is going about 2x 
faster than directly onto Virtio Disk (vdN), so it transferred my 1600+GB in 
just 13+ hours instead of 27+ 

 Ok, external NTFS is cloned seemingly well, I shutdown Ubuntu VM (it properly 
removed the RBD lock on shutdown) and try to access it from Windows by iSCSI 
again.
 And at this moment I stumble into trouble. First, I don't see added RBD image 
in 'Devices' on iSCSI initiator control panel. This I tried to resolve by 
restarting tcmu-runner.
 After reconnect from Windows side, RBD image became visible in devices (and 
RBD lock from tcmu side was reacquired), 
 but its MPIO button was disabled, so I could not check or change MPIO policy 
(surely I enable MPIO in 'Connect' dialog).
 I tried also to restart rbd-target-gw but this also did not help. Restarting 
Windows server also did not improve the situation (MPIO button still disabled).
 What should I try to restart next, to avoid restarting the whole Ceph host ? 
Maybe unload/reload some kernel modules ?

 Thanks in advance for your help. Hope I could determine and resolve the 
problem myself, but this could take more time than getting help from you.


Re: [ceph-users] RBD gets resized when used as iSCSI target

2018-06-29 Thread Wladimir Mutel

Wladimir Mutel wrote:

it back to gwcli/disks), I discover that its size is rounded up to 3 
TiB, i.e. 3072 GiB or 786432*4M Ceph objects. As we know, GPT is stored 


'targetcli ls /' (there, it is still 3.0T). Also, when I restart 
rbd-target-gw.service, it gets resized back up to 3.0T as shown by 


Well, I see this size in /etc/target/saveconfig.json,
and I see how the RBD is extended in /var/log/tcmu-runner.log.
	And I remember that once I lazily added a 2.7T RBD specifying its size as 
3T in gwcli. Now trying to fix that without deleting/recreating the RBD...




[ceph-users] RBD gets resized when used as iSCSI target

2018-06-29 Thread Wladimir Mutel

Dear all,

	I create an RBD to be used as iSCSI target, with size close to the most 
popular 3TB HDD size, 5860533168 512-byte sectors, or 715398*4M Ceph 
objects (2.7 TB or 2794.4 GB). Then I add it into gwcli/disks (having to 
specify the same size, 2861592M), and then, after some manipulations 
which I do not remember exactly (like, remove it from gwcli conf, use it 
for some time as RBD target in QEMU VM, then re-add it back to 
gwcli/disks), I discover that its size is rounded up to 3 TiB, i.e. 3072 
GiB or 786432*4M Ceph objects. As we know, GPT is stored at the end of 
block device, so when we increase its size in this way, GPT becomes 
inaccessible and partition limits need to be guessed anew in some other way.


	I can shrink this gratuitously-increased RBD by 'rbd resize', and this 
is reflected in 'gwcli ls /' (3.0T becomes 2.7T). But not in 'targetcli 
ls /' (there, it is still 3.0T). Also, when I restart 
rbd-target-gw.service, it gets resized back up to 3.0T as shown by 
'gwcli ls /' and to 786432 objects in 'rbd info'. I look into 
rbd/gateway.conf RADOS object, and don't see any explicit size specified 
there. Where does it take this 3.0T size from ? My last suspicion is RBD 
name which is 'tower-prime-e-3tb'. Can its '3tb' suffix be the culprit ?


Thank you in advance for your replies.
I am getting lost and slowly infuriated with this behavior.


[ceph-users] HDD-only performance, how far can it be sped up ?

2018-06-20 Thread Wladimir Mutel

Dear all,

I set up a minimal 1-node Ceph cluster to evaluate its performance. 
We tried to save as much as possible on the hardware, so now the box has 
Asus P10S-M WS motherboard, Xeon E3-1235L v5 CPU, 64 GB DDR4 ECC RAM and 
8x3TB HDDs (WD30EFRX) connected to on-board SATA ports. Also we are 
trying to save on storage redundancy, so for most of our RBD images we 
use erasure-coded data-pool (default profile, jerasure 2+1) instead of 
3x replication. I started with Luminous/Xenial 12.2.5 setup which 
initialized my OSDs as Bluestore during deploy, then updated it to 
Mimic/Bionic 13.2.0. Base OS is Ubuntu 18.04 with kernel updated to 
4.17.2 from Ubuntu mainline PPA.


With this setup, I created a number of RBD images to test iSCSI, 
rbd-nbd and QEMU+librbd performance (running QEMU VMs on the same box). 
And that worked moderately well, as long as the data volume transferred within 
one session was limited. The fastest transfers I had were with 'rbd import', 
which pulled an ISO image file at up to 25 MBytes/sec from the remote 
CIFS share over Gigabit Ethernet and stored it into the EC data-pool. 
Windows 2008 R2 & 2016 setup, update installation, Win 2008 upgrade to 
2012 and to 2016 within QEMU VM also went through tolerably well. I 
found that cache=writeback gives the best performance with librbd, 
unlike cache=unsafe which gave the best performance with VMs on plain 
local SATA drives. Also I have a subjective feeling (not confirmed by 
exact measurements) that providing a huge libRBD cache (like, 
cache size = 1GB, max dirty = 7/8GB, max dirty age = 60) improved 
Windows VM performance on bursty writes (like, during Windows update 
installations) as well as on reboots (due to cached reads).
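
	For reference, those librbd cache knobs live in the [client] section of 
ceph.conf; the values below just restate the ones mentioned (a sketch, not a 
recommendation):

[client]
rbd cache = true
rbd cache size = 1073741824        # 1 GB
rbd cache max dirty = 939524096    # ~7/8 of the cache size
rbd cache max dirty age = 60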


Now, what discouraged me, was my next attempt to clone an NTFS 
partition of ~2TB from a physical drive (via USB3-SATA3 convertor) to a 
partition on an RBD image. I tried to map RBD image with rbd-nbd either 
locally or remotely over Gigabit Ethernet, and the fastest speed I got 
with ntfsclone was about 8 MBytes/sec. Which means that it could spend 
up to 3 days copying these ~2TB of NTFS data. I thought about running
ntfsclone /dev/sdX1 -o - | rbd import ... - , but ntfsclone needs to 
rewrite a part of existing RBD image starting from certain offset, so I 
decided this was not a solution in my situation. Now I am thinking about 
taking out one of OSDs and using it as a 'bcache' for this operation, 
but I am not sure how good is bcache performance with cache on rotating 
HDD. I know that keeping OSD logs and RocksDB on the same HDD creates a 
seeky workload which hurts overall transfer performance.


Also I am thinking about a number of near-term possibilities, and 
I would like to hear your opinions on the benefits and drawbacks of each 
of them.


1. Would iSCSI access to that RBD image improve my performance 
(compared to rbd-nbd) ? I did not check that yet, but I noticed that 
Windows transferred about 2.5 MBytes/sec while formatting NTFS volume on 
this RBD attached to it by iSCSI. So, for seeky/sparse workloads like 
NTFS formatting the performance was not great.


2. Would it help to run ntfsclone in Linux VM, with RBD image 
accessed through QEMU+librbd ? (also going to measure that myself)


3. Are there any performance benefits in using Ceph cache-tier pools 
with my setup ? I hear use of this technique is now advised against, no?


4. We have an unused older box (Supermicro X8SIL-F mobo, Xeon X3430 
CPU, 32 GB of DDR3 ECC RAM, 6 onboard SATA ports, used from 2010 to 
2017, in perfectly working condition) which can be stuffed with up to 6 
SATA HDDs and added to this Ceph cluster, so far with only Gigabit 
network interconnect. Like, move 4 OSDs out of first box into it, to 
have 2 boxes with 4 HDDs each. Is this going to improve Ceph performance 
with the setup described above ?


5. I hear that RAID controllers like Adaptec 5805, LSI 2108 provide 
better performance with SATA HDDs exported as JBODs than onboard SATA 
AHCI controllers due to more aggressive caching and reordering requests. 
Is this true ?


6. On the local market we can buy a Kingston KC1000/960GB NVMe drive 
for a moderately reasonable price. Its specification gives a rewrite limit of 
1 PB and 0.58 DWPD (drive writes per day). Are there any 
contraindications against using it in a production Ceph setup (e.g., too 
low a rewrite limit, look for 8+PB) ? What is the difference between using 
it as a 'bcache' or as specifically-designed OSD log+RocksDB storage ? 
Can it be used as a single shared partition for all OSD daemons, or will 
it require splitting into 8 separate partitions ?
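
	If it does need splitting, one hedged way to carve it up is with LVM; sizes 
and names below are arbitrary, and each OSD gets its own block.db volume:

pvcreate /dev/nvme0n1
vgcreate nvme_db /dev/nvme0n1
for i in 0 1 2 3 4 5 6 7; do lvcreate -L 110G -n db-$i nvme_db; done
# then, per OSD: ceph-volume lvm create --bluestore --data /dev/sdX --block.db nvme_db/db-$i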


Thank you in advance for your replies.


Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-15 Thread Wladimir Mutel

Jason Dillaman wrote:


[1] http://docs.ceph.com/docs/master/rbd/iscsi-initiator-win/



 I don't use either MPIO or MCS on the Windows 2008 R2 or Windows 10
initiator (not Win2016, but I hope there is not much difference). I am trying to make
it work with a single session first. Also, right now I only have a single
iSCSI gateway/portal (single host, single IP, single port).
Or is MPIO mandatory to use with a Ceph target ?



It's mandatory even if you only have a single path since MPIO is
responsible for activating the paths.


Who would know ? I installed MPIO, enabled it for iSCSI
(required Windows reboot), set MPIO policy to 'Failover only',
and now my iSCSI target is readable !
Thanks a lot for your help !
Probably this should be written with bigger and redder letters
in Ceph docs.

Next question, would it be possible for iPXE loader to boot
from such iSCSI volumes ? I am going to experiment with that
but if the answer is known in advance, it would be good to know.
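
	For the record, the iPXE side would look something like this (untested 
here; the portal IP and target IQN are taken from my earlier test setup, and 
the initiator IQN and CHAP credentials are placeholders):

#!ipxe
set initiator-iqn iqn.2018-06.host.test:client1
set username mychapuser
set password mychapsecret
sanboot iscsi:192.168.200.230::::iqn.2018-06.host.test:test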


Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-15 Thread Wladimir Mutel

Jason Dillaman wrote:


чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.726 54121 [DEBUG] 
dbus_name_acquired:461: name org.kernel.TCMUService1 acquired
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.521 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 1a 0 
3f 0 c0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.523 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 1a 0 
3f 0 c0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.543 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.550 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0
чер 13 08:38:47 p10s tcmu-runner[54121]: 2018-06-13 08:38:47.944 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0



 Wikipedia says that 1A is 'mode sense' and 9e is 'service action in'.
 These records are logged when I try to put the disk online or 
initialize it with a GPT/MBR partition table in Windows Disk Management (and 
Windows reports errors after that).
 What to check next ? Any importance of the missing 'SSD' device class ?



Did you configure MPIO within Windows [1]? Any errors recorded in the
Windows Event Viewer?



The "SSD" device class isn't important -- it's just a way to describe
the LUN as being backed by non-rotational media (e.g. VMware will show
a different icon).



[1] http://docs.ceph.com/docs/master/rbd/iscsi-initiator-win/


	I don't use either MPIO or MCS on the Windows 2008 R2 or Windows 10 
initiator (not Win2016, but I hope there is not much difference). I am trying to 
make it work with a single session first. Also, right now I only have a 
single iSCSI gateway/portal (single host, single IP, single port).

Or is MPIO mandatory to use with a Ceph target ?



Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-13 Thread Wladimir Mutel
On Tue, Jun 12, 2018 at 10:39:59AM -0400, Jason Dillaman wrote:

> > So, my usual question is - where to look and what logs to enable
> > to find out what is going wrong ?

> If not overridden, tcmu-runner will default to 'client.admin' [1] so
> you shouldn't need to add any additional caps. In the short-term to
> debug your issue, you can perhaps increase the log level for
> tcmu-runner to see if it's showing an error [2].

So, I put 'log_level = 5' into /etc/tcmu/tcmu.conf , restarted 
tcmu-runner and see only this in its logs :

чер 13 08:38:14 p10s systemd[1]: Starting LIO Userspace-passthrough daemon...
чер 13 08:38:14 p10s tcmu-runner[54121]: Inotify is watching 
"/etc/tcmu/tcmu.conf", wd: 1, mask: IN_ALL_EVENTS
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.634 54121 [DEBUG] 
load_our_module:531: Module 'target_core_user' is already loaded
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.634 54121 [DEBUG] 
main:1087: handler path: /usr/lib/x86_64-linux-gnu/tcmu-runner
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.657 54121 [DEBUG] 
main:1093: 2 runner handlers found
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.658 54121 [DEBUG] 
tcmu_block_device:404 rbd/libvirt.tower-prime-e-3tb: blocking kernel device
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.658 54121 [DEBUG] 
tcmu_block_device:410 rbd/libvirt.tower-prime-e-3tb: block done
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.658 54121 [DEBUG] 
dev_added:769 rbd/libvirt.tower-prime-e-3tb: Got block_size 512, size in bytes 
3000596692992
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.658 54121 [DEBUG] 
tcmu_rbd_open:829 rbd/libvirt.tower-prime-e-3tb: tcmu_rbd_open config 
rbd/libvirt/tower-prime-e-3tb;osd_op_timeout=30 block size 512 num lbas 
5860540416.
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.665 54121 [DEBUG] 
timer_check_and_set_def:383 rbd/libvirt.tower-prime-e-3tb: The cluster's 
default osd op timeout(0.00), osd heartbeat grace(20) interval(6)
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.672 54121 [DEBUG] 
tcmu_rbd_detect_device_class:300 rbd/libvirt.tower-prime-e-3tb: Pool libvirt 
using crush rule "replicated_rule"
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.672 54121 [DEBUG] 
tcmu_rbd_detect_device_class:316 rbd/libvirt.tower-prime-e-3tb: SSD not a 
registered device class.
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.715 7f30e08c9880  
1 mgrc service_daemon_register tcmu-runner.p10s:libvirt/tower-prime-e-3tb 
metadata {arch=x86_64,ceph_release=mimic,ceph_version=ceph version 13.2.0 
(79a10589f1f80dfe21e8f9794365ed98143071c4) mimic 
(stable),ceph_version_short=13.2.0,cpu=Intel(R) Xeon(R) CPU E3-1235L v5 @ 
2.00GHz,distro=ubuntu,distro_description=Ubuntu 18.04 
LTS,distro_version=18.04,hostname=p10s,image_id=25c21238e1f29,image_name=tower-prime-e-3tb,kernel_description=#201806032231
 SMP Sun Jun 3 22:33:34 UTC 
2018,kernel_version=4.17.0-041700-generic,mem_swap_kb=15622140,mem_total_kb=65827836,os=Linux,pool_name=libvirt}
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.721 54121 [DEBUG] 
tcmu_unblock_device:422 rbd/libvirt.tower-prime-e-3tb: unblocking kernel device
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.721 54121 [DEBUG] 
tcmu_unblock_device:428 rbd/libvirt.tower-prime-e-3tb: unblock done
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.724 54121 [DEBUG] 
dbus_bus_acquired:445: bus org.kernel.TCMUService1 acquired
чер 13 08:38:14 p10s systemd[1]: Started LIO Userspace-passthrough daemon.
чер 13 08:38:14 p10s tcmu-runner[54121]: 2018-06-13 08:38:14.726 54121 [DEBUG] 
dbus_name_acquired:461: name org.kernel.TCMUService1 acquired
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.521 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 1a 0 
3f 0 c0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.523 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 1a 0 
3f 0 c0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.543 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0
чер 13 08:38:30 p10s tcmu-runner[54121]: 2018-06-13 08:38:30.550 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0
чер 13 08:38:47 p10s tcmu-runner[54121]: 2018-06-13 08:38:47.944 54121 
[DEBUG_SCSI_CMD] tcmu_print_cdb_info:1205 rbd/libvirt.tower-prime-e-3tb: 9e 10 
0 0 0 0 0 0 0 0 0 0 0 c 0 0

Wikipedia says that 1A is 'mode sense' and 9e is 'service action in'. 
These records are logged when I try to put the disk online or 
initialize it with a GPT/MBR partition table in Windows Disk Management (and 
Windows reports errors after that).
What to check next ? Any importance of the missing 'SSD' device class ?

Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-12 Thread Wladimir Mutel

Hi everyone again,

I continue set up of my testing Ceph cluster (1-node so far).
I changed 'chooseleaf' from 'host' to 'osd' in CRUSH map
to make it run healthy on 1 node. For the same purpose,
I also set 'minimum_gateways = 1' for Ceph iSCSI gateway.
Also I upgraded Ubuntu 18.04 kernel to mainline v4.17 to get
up-to-date iSCSI attributes support required by gwcli
(qfull_time_out and probably something else).

I was able to add client host IQNs and configure their CHAP
authentication. I was able to add iSCSI LUNs referring to RBD
images, and to assign LUNs to clients. 'gwcli ls /' and
'targetcli ls /' show nice diagrams without signs of errors.
iSCSI initiators from Windows 10 and 2008 R2 can log in to the
portal with CHAP auth and list their assigned LUNs.
And authenticated sessions are also shown in '*cli ls' printout

But:

in Windows Disk Management, the mapped LUN is shown in the 'offline'
state. When I try to bring it online or to initialize the disk
with an MBR or GPT partition table, I get messages like
'device not ready' on Win10, or 'driver detected controller error
 on \device\harddisk\dr5', or the like.

So, my usual question is - where to look and what logs to enable
to find out what is going wrong ?

My setup specifics are that I create my RBDs in a non-default pool
('libvirt' instead of 'rbd'). Also I create them with an erasure
data-pool (I called it 'jerasure21', as configured in the default
erasure profile). Should I add explicit access to these pools
for some Ceph client I am not aware of ? I know that 'gwcli'
logs into Ceph as 'client.admin', but I am not sure
about tcmu-runner and/or the user:rbd backstore provider.

Thank you in advance for your useful directions
out of my problem.

Wladimir Mutel wrote:


Failed : disk create/update failed on p10s. LUN allocation failure



Well, this was fixed by updating kernel to v4.17 from Ubuntu 
kernel/mainline PPA
Going on...



Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-12 Thread Wladimir Mutel

Jason Dillaman wrote:


  One more question, how should I set the profile 'rbd-read-only' properly
? I tried to set it for 'client.iso' on both the 'iso' and 'jerasure21' pools,
and this did not work. Setting the profile on both pools to 'rbd' worked. But I
don't want my iso images to be accidentally modified by virtual guests. Can
this be solved with Ceph auth, or in some other way ? (in fact, I am looking for
a Ceph equivalent of 'chattr +i')



QEMU doesn't currently handle the case for opening RBD images in
read-only mode, so if you attempt to use 'profile rbd-read-only', I
suspect attempting to open the image will fail. You could perhaps take
a middle ground and just apply 'profile rbd-read-only pool=jerasure21'
to protect the contents of the image.


	For QEMU I found that profile 'rbd-read-only' currently does not work. 
So, I use 'profile rbd' for both replicated and erasure pools, and hope 
that 'readonly' configuration in QEMU disk would help.
	In my past experience I found that running 'kvm ... -cdrom 
something.iso' sometimes would modify that .iso-file, so I had to set 
immutable attribute on the FS level.



Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel

Jason Dillaman wrote:


The caps for those users looks correct for Luminous and later
clusters. Any chance you are using data pools with the images? It's
just odd that you have enough permissions to open the RBD image but
cannot read its data objects.



 Yes, I use erasure-pool as data-pool for these images
 (to save on replication overhead).
 Should I add it to the [osd] profile list ?



Indeed, that's the problem since the libvirt and/or iso user doesn't
have access to the data-pool.


This really helped, thanks !

client.iso
key: AQBp...gA==
caps: [mon] profile rbd
caps: [osd] profile rbd pool=iso, profile rbd pool=jerasure21
client.libvirt
key: AQBt...IA==
caps: [mon] profile rbd
caps: [osd] profile rbd pool=libvirt, profile rbd pool=jerasure21
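
	For reference, caps like these can be set with something along these 
lines (a sketch):

ceph auth caps client.libvirt \
        mon 'profile rbd' \
        osd 'profile rbd pool=libvirt, profile rbd pool=jerasure21'
ceph auth caps client.iso \
        mon 'profile rbd' \
        osd 'profile rbd pool=iso, profile rbd pool=jerasure21'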

Now I can boot the VM from the .iso image and install Windows.

	One more question, how should I set the profile 'rbd-read-only' properly ? 
I tried to set it for 'client.iso' on both the 'iso' and 'jerasure21' pools, 
and this did not work. Setting the profile on both pools to 'rbd' worked. 
But I don't want my iso images to be accidentally modified by virtual 
guests. Can this be solved with Ceph auth, or in some other way ? (in 
fact, I am looking for a Ceph equivalent of 'chattr +i')



Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel

Jason Dillaman wrote:

The caps for those users looks correct for Luminous and later
clusters. Any chance you are using data pools with the images? It's
just odd that you have enough permissions to open the RBD image but
cannot read its data objects.


Yes, I use erasure-pool as data-pool for these images
(to save on replication overhead).
Should I add it to the [osd] profile list ?


Re: [ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel

Jason Dillaman wrote:

Can you run "rbd --id libvirt --pool libvirt win206-test-3tb " w/o error? It sounds like your CephX caps for
client.libvirt are not permitting read access to the image data
objects.


I tried to run 'rbd export' with these params,
but it said it was unable to find a keyring.
Is keyring file mandatory for every client ?

'ceph auth ls' shows these accounts with seemingly-proper
permissions :

client.iso
key: AQBp...gA==
caps: [mon] profile rbd
caps: [osd] profile rbd pool=iso
client.libvirt
key: AQBt...IA==
caps: [mon] profile rbd
caps: [osd] profile rbd pool=libvirt

And these same keys are listed in /etc/libvirt/secrets :

/etc/libvirt/secrets# ls | while read a ; do echo $a : $(cat $a) ; done
ac1d8d7b-d243-4474-841d-91c26fd93a14.base64 : AQBt...IA==

ac1d8d7b-d243-4474-841d-91c26fd93a14.xml : <secret ... private='yes'> 
<uuid>ac1d8d7b-d243-4474-841d-91c26fd93a14</uuid> 
<description>CEPH passphrase example</description> 
<usage type='ceph'> <name>ceph_example</name> </usage> </secret>


cf00c7e4-740a-4935-9d7c-223d3c81871f.base64 : AQBp...gA==

cf00c7e4-740a-4935-9d7c-223d3c81871f.xml : <secret ... private='yes'> 
<uuid>cf00c7e4-740a-4935-9d7c-223d3c81871f</uuid> 
<description>CEPH ISO pool</description> 
<usage type='ceph'> <name>ceph_iso</name> </usage> </secret>


I just thought this should be enough. no ?


[ceph-users] QEMU maps RBD but can't read them

2018-06-06 Thread Wladimir Mutel


Dear all,

I installed QEMU, libvirtd and its RBD plugins and now trying
to make QEMU use my Ceph storage. I created 'iso' pool
and imported Windows installation image there (rbd import).
Also I created 'libvirt' pool and there, created 2.7-TB image
for Windows installation. I created client.iso and
client.libvirt accounts for Ceph authentication,
and configured their secrets for pool access in virsh
(as told in http://docs.ceph.com/docs/master/rbd/libvirt/ ).
Then I started pools and checked that I can list their contents
from virsh. Then I created a VM with dummy HDD and optical
drive, and edited them using 'virsh edit' :


  [the <disk> XML was stripped by the list archive: two <disk type='network'> 
devices using protocol='rbd' with libvirt secret auth; the CD-ROM's source was 
name='iso/SW_DVD9_Win_Server_STD_CORE_2016_64Bit_Russian_-4_DC_STD_MLF_X21-70539.ISO']

Now I see this in the systemd journalctl :

чер 06 16:24:12 p10s qemu-system-x86_64[4907]: 2018-06-06 16:24:12.147 
7f40f37fe700 -1 librbd::io::ObjectRequest: 0x7f40d4010500 
handle_read_object: failed to read from object: (1) Operation not permitted


What should I check and where ?
I can map the same RBD using rbd-nbd and read sectors
from the mapped device. If I map using kernel RBD driver
(I know this is not recommended to do on the same host),
I get :

чер 06 16:27:54 p10s kernel: rbd: image 
SW_DVD9_Win_Server_STD_CORE_2016_64Bit_Russian_-4_DC_STD_MLF_X21-70539.ISO: 
image uses unsupported features: 0x38


and

RBD image feature set mismatch. You can disable features unsupported by 
the kernel with "rbd feature disable 
iso/SW_DVD9_Win_Server_STD_CORE_2016_64Bit_Russian_-4_DC_STD_MLF_X21-70539.ISO 
object-map fast-diff deep-flatten".


Probably I need to change some attributes for the RBD
to be usable with QEMU. Please give some hints.
Thank you in advance.



Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-04 Thread Wladimir Mutel
On Mon, Jun 04, 2018 at 11:12:58AM +0300, Wladimir Mutel wrote:
>   /disks> create pool=rbd image=win2016-3tb-1 size=2861589M 
> CMD: /disks/ create pool=rbd image=win2016-3tb-1 size=2861589M count=1 
> max_data_area_mb=None
> pool 'rbd' is ok to use
> Creating/mapping disk rbd/win2016-3tb-1
> Issuing disk create request
> Failed : disk create/update failed on p10s. LUN allocation failure

>   Surely I could investigate what is happening by studying gwcli sources,
>   but if anyone already knows how to fix that, I would appreciate your 
> response.

Well, this was fixed by updating kernel to v4.17 from Ubuntu 
kernel/mainline PPA
Going on...


Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-04 Thread Wladimir Mutel
On Fri, Jun 01, 2018 at 08:20:12PM +0300, Wladimir Mutel wrote:
> 
>   And still, when I do '/disks create ...' in gwcli, it says
>   that it wants 2 existing gateways. Probably this is related
>   to the created 2-TPG structure and I should look for more ways
>   to 'improve' that json config so that rbd-target-gw loads it
>   as I need on single host.

Well, I decided to bond my network interfaces and assign a single IP to 
them (as mchristi@ suggested).
Also I put 'minimum_gateways = 1' into /etc/ceph/iscsi-gateway.cfg and 
got rid of 'At least 2 gateways required' in gwcli.
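
For the record, the bond itself can be brought up ad hoc with iproute2 like 
this (NIC names are assumptions; on Ubuntu 18.04 the persistent equivalent 
belongs in a netplan bond stanza):

ip link add bond0 type bond mode 802.3ad   # or balance-alb if the switch cannot do LACP
ip link set eno1 down && ip link set eno1 master bond0
ip link set eno2 down && ip link set eno2 master bond0
ip link set bond0 up
ip addr add 192.168.200.230/24 dev bond0
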
But now I have one more stumble :

gwcli -d
Adding ceph cluster 'ceph' to the UI
Fetching ceph osd information
Querying ceph for state information
Refreshing disk information from the config object
- Scanning will use 8 scan threads
- rbd image scan complete: 0s
Refreshing gateway & client information
- checking iSCSI/API ports on p10s
Querying ceph for state information
Gathering pool stats for cluster 'ceph'

/disks> create pool=rbd image=win2016-3tb-1 size=2861589M 
CMD: /disks/ create pool=rbd image=win2016-3tb-1 size=2861589M count=1 
max_data_area_mb=None
pool 'rbd' is ok to use
Creating/mapping disk rbd/win2016-3tb-1
Issuing disk create request
Failed : disk create/update failed on p10s. LUN allocation failure

Surely I could investigate what is happening by studying gwcli sources,
but if anyone already knows how to fix that, I would appreciate your 
response.


Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-01 Thread Wladimir Mutel

Ok, I looked into Python sources of ceph-iscsi-{cli,config} and
found that per-host configuration sections use short host name
(returned by this_host() function) as their primary key.
So I can't trick gwcli with alternative host name like p10s2
which I put into /etc/hosts to denote my second IP,
as this_host() calls gethostname() and further code
disregards alternative host names at all.
I added 192.168.201.231 into trusted_ip_list,
but after 'create p10s2 192.168.201.231 skipchecks=true'
I got KeyError 'p10s2' in gwcli/gateway.py line 571

Fortunately, I found a way to edit the Ceph iSCSI configuration
as a text file (rados --pool rbd get gateway.conf gateway.conf).
I added the needed IP to the appropriate JSON lists
(."gateways"."ip_list" and ."gateways"."p10s"."gateway_ip_list"),
put the file back into RADOS and restarted rbd-target-gw,
in the hope everything would go well.
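
In concrete commands, that was roughly (a sketch):

rados --pool rbd get gateway.conf gateway.conf
#   edit the JSON: add the second IP to "ip_list" and to "p10s"."gateway_ip_list"
rados --pool rbd put gateway.conf gateway.conf
systemctl restart rbd-target-gw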

Unfortunately, I found (by running 'targetcli ls')
that it now creates 2 TPGs with a single IP portal in each of them.
Also, it disables the 1st TPG but enables the 2nd one, like this :

  o- iscsi  [Targets: 1]
  | o- iqn.2018-06.domain.p10s:p10s [TPGs: 2]
  |   o- tpg1   [disabled]
  |   | o- portals  [Portals: 1]
  |   |   o- 192.168.200.230:3260   [OK]
  |   o- tpg2   [no-gen-acls, no-auth]
  | o- portals  [Portals: 1]
  |   o- 192.168.201.231:3260   [OK]

And still, when I do '/disks create ...' in gwcli, it says
that it wants 2 existing gateways. Probably this is related
to the created 2-TPG structure and I should look for more ways
to 'improve' that json config so that rbd-target-gw loads it
as I need on single host.


Wladimir Mutel wrote:
 Well, ok, I moved second address into different subnet 
(192.168.201.231/24) and also reflected that in 'hosts' file


 But that did not help much :

/iscsi-target...test/gateways> create p10s2 192.168.201.231 skipchecks=true
OS version/package checks have been bypassed
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
Failed : Gateway creation failed, gateway(s) 
unavailable:192.168.201.231(UNKNOWN state)


/disks> create pool=replicated image=win2016-3gb size=2861589M
Failed : at least 2 gateways must exist before disk operations are 
permitted


 I see this mentioned in Ceph-iSCSI-CLI GitHub issues
https://github.com/ceph/ceph-iscsi-cli/issues/54 and
https://github.com/ceph/ceph-iscsi-cli/issues/59
 but apparently without a solution

 So, would anybody propose an idea
 on how I can start using iSCSI over Ceph on the cheap,
 with the single P10S host I have in my hands right now?

 An additional host and 10GbE hardware would require additional
 funding, which would be possible only at some point in the future.

 Thanks in advance for your responses

Wladimir Mutel wrote:


 I have both its Ethernets connected to the same LAN,
 with different IPs in the same subnet
 (like, 192.168.200.230/24 and 192.168.200.231/24)



192.168.200.230 p10s
192.168.200.231 p10s2




Re: [ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-01 Thread Wladimir Mutel
	Well, ok, I moved second address into different subnet 
(192.168.201.231/24) and also reflected that in 'hosts' file


But that did not help much :

/iscsi-target...test/gateways> create p10s2 192.168.201.231 skipchecks=true
OS version/package checks have been bypassed
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
Failed : Gateway creation failed, gateway(s) 
unavailable:192.168.201.231(UNKNOWN state)


/disks> create pool=replicated image=win2016-3gb size=2861589M
Failed : at least 2 gateways must exist before disk operations are permitted

I see this mentioned in Ceph-iSCSI-CLI GitHub issues
https://github.com/ceph/ceph-iscsi-cli/issues/54 and
https://github.com/ceph/ceph-iscsi-cli/issues/59
but apparently without a solution

So, would anybody propose an idea
on how I can start using iSCSI over Ceph on the cheap,
with the single P10S host I have in my hands right now?

An additional host and 10GbE hardware would require additional
funding, which would be possible only at some point in the future.

Thanks in advance for your responses

Wladimir Mutel wrote:


 I have both its Ethernets connected to the same LAN,
 with different IPs in the same subnet
 (like, 192.168.200.230/24 and 192.168.200.231/24)



192.168.200.230 p10s
192.168.200.231 p10s2




[ceph-users] iSCSI to a Ceph node with 2 network adapters - how to ?

2018-06-01 Thread Wladimir Mutel

Dear all,

I am experimenting with Ceph setup. I set up a single node
(Asus P10S-M WS, Xeon E3-1235 v5, 64 GB RAM, 8x3TB SATA HDDs,
Ubuntu 18.04 Bionic, Ceph packages from
http://download.ceph.com/debian-luminous/dists/xenial/
and iscsi parts built manually per
http://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install/)
Also I changed 'chooseleaf ... host' into 'chooseleaf ... osd'
in the CRUSH map to run with a single host.

I have both its Ethernets connected to the same LAN,
with different IPs in the same subnet
(like, 192.168.200.230/24 and 192.168.200.231/24)
mon_host in ceph.conf is set to 192.168.200.230,
and ceph daemons (mgr, mon, osd) are listening to this IP.

What I would like to finally achieve is to provide multipath
iSCSI access to Ceph RBDs through both these Ethernets,
but apparently gwcli does not allow me to add a second
gateway to the same target. It goes like this :

/iscsi-target> create iqn.2018-06.host.test:test
ok
/iscsi-target> cd iqn.2018-06.host.test:test/gateways
/iscsi-target...test/gateways> create p10s 192.168.200.230 skipchecks=true
OS version/package checks have been bypassed
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
ok
/iscsi-target...test/gateways> create p10s2 192.168.200.231 skipchecks=true
OS version/package checks have been bypassed
Adding gateway, sync'ing 0 disk(s) and 0 client(s)
Failed : Gateway creation failed, gateway(s) 
unavailable:192.168.200.231(UNKNOWN state)


host names are defined in /etc/hosts as follows :

192.168.200.230 p10s
192.168.200.231 p10s2

	so I suppose that something does not listen on 192.168.200.231, but I 
don't have an idea what that thing is or how to make it listen there. 
Or how to achieve this goal (utilization of both Ethernets for iSCSI) in a 
different way. Should I aggregate the Ethernets into a 'bond' interface with a 
single IP ? Should I build and use the 'lrbd' tool instead of 'gwcli' ? Is 
it acceptable that I run kernel 4.15, not 4.16+ ?

What other directions could you give me on this task ?
Thanks in advance for your replies.