Re: Issue with Ceph File System and LIO

2015-12-17 Thread Eric Eastman
I patched the 4.4rc4 kernel source and restarted the test.  Shortly
after starting it, this showed up in dmesg:

[Thu Dec 17 03:29:55 2015] WARNING: CPU: 0 PID: 2547 at
fs/ceph/addr.c:1162 ceph_write_begin+0xfb/0x120 [ceph]()
[Thu Dec 17 03:29:55 2015] Modules linked in: iscsi_target_mod
vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
target_core_file target_core_iblock target_core_pscsi target_core_user
target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
ipmi_ssif drm_kms_helper drm coretemp kvm gpio_ich i2c_algo_bit
i7core_edac fb_sys_fops syscopyarea edac_core sysfillrect sysimgblt
ipmi_si input_leds hpilo ipmi_msghandler shpchp acpi_power_meter
irqbypass serio_raw 8250_fintek lpc_ich mac_hid ceph bonding libceph
lp parport libcrc32c fscache mlx4_en vxlan ip6_udp_tunnel udp_tunnel
ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse bnx2 fjes
scsi_transport_sas [last unloaded: target_core_mod]
[Thu Dec 17 03:29:55 2015] CPU: 0 PID: 2547 Comm: iscsi_trx Tainted: G
   W I 4.4.0-rc4-ede1 #1
[Thu Dec 17 03:29:55 2015] Hardware name: HP ProLiant DL360 G6, BIOS
P64 01/22/2015
[Thu Dec 17 03:29:55 2015]  c020cd47 8805f1e97958
813ad644 
[Thu Dec 17 03:29:55 2015]  8805f1e97990 81079702
8805f1e97a50 015dd000
[Thu Dec 17 03:29:55 2015]  880c034df800 0200
eab26a80 8805f1e979a0
[Thu Dec 17 03:29:55 2015] Call Trace:
[Thu Dec 17 03:29:55 2015]  [] dump_stack+0x44/0x60
[Thu Dec 17 03:29:55 2015]  [] warn_slowpath_common+0x82/0xc0
[Thu Dec 17 03:29:55 2015]  [] warn_slowpath_null+0x1a/0x20
[Thu Dec 17 03:29:55 2015]  []
ceph_write_begin+0xfb/0x120 [ceph]
[Thu Dec 17 03:29:55 2015]  []
generic_perform_write+0xbf/0x1a0
[Thu Dec 17 03:29:55 2015]  []
ceph_write_iter+0xf5c/0x1010 [ceph]
[Thu Dec 17 03:29:55 2015]  [] ? __enqueue_entity+0x6c/0x70
[Thu Dec 17 03:29:55 2015]  [] ?
iov_iter_get_pages+0x113/0x210
[Thu Dec 17 03:29:55 2015]  [] ?
skb_copy_datagram_iter+0x122/0x250
[Thu Dec 17 03:29:55 2015]  [] vfs_iter_write+0x63/0xa0
[Thu Dec 17 03:29:55 2015]  []
fd_do_rw.isra.5+0xc9/0x1b0 [target_core_file]
[Thu Dec 17 03:29:55 2015]  []
fd_execute_rw+0xc5/0x2a0 [target_core_file]
[Thu Dec 17 03:29:55 2015]  []
sbc_execute_rw+0x22/0x30 [target_core_mod]
[Thu Dec 17 03:29:55 2015]  []
__target_execute_cmd+0x1f/0x70 [target_core_mod]
[Thu Dec 17 03:29:55 2015]  []
target_execute_cmd+0x195/0x2a0 [target_core_mod]
[Thu Dec 17 03:29:55 2015]  []
iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
[Thu Dec 17 03:29:55 2015]  []
iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
[Thu Dec 17 03:29:55 2015]  []
iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
[Thu Dec 17 03:29:55 2015]  [] ? __switch_to+0x1cd/0x570
[Thu Dec 17 03:29:55 2015]  [] ?
iscsi_target_tx_thread+0x1c0/0x1c0 [iscsi_target_mod]
[Thu Dec 17 03:29:55 2015]  [] kthread+0xc9/0xe0
[Thu Dec 17 03:29:55 2015]  [] ?
kthread_create_on_node+0x180/0x180
[Thu Dec 17 03:29:55 2015]  [] ret_from_fork+0x3f/0x70
[Thu Dec 17 03:29:55 2015]  [] ?
kthread_create_on_node+0x180/0x180
[Thu Dec 17 03:29:55 2015] ---[ end trace 382a45986961da4e ]---

There are WARNINGs on both line 125 and 1162. I will attach the
whole set of dmesg output to tracker ticket 14086.

I wanted to note that file system snapshots are enabled and being used
on this file system.

Thanks
Eric

On Wed, Dec 16, 2015 at 8:15 AM, Eric Eastman
 wrote:
>>>
>> This warning is really strange. Could you try the attached debug patch.
>>
>> Regards
>> Yan, Zheng
>
> I will try the patch and get back to the list.
>
> Eric


Re: Issue with Ceph File System and LIO

2015-12-17 Thread Yan, Zheng
On Thu, Dec 17, 2015 at 4:56 PM, Eric Eastman
 wrote:
> I patched the 4.4rc4 kernel source and restarted the test.  Shortly
> after starting it, this showed up in dmesg:
>
> [Thu Dec 17 03:29:55 2015] WARNING: CPU: 0 PID: 2547 at
> fs/ceph/addr.c:1162 ceph_write_begin+0xfb/0x120 [ceph]()
> [Thu Dec 17 03:29:55 2015] Modules linked in: iscsi_target_mod
> vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
> target_core_file target_core_iblock target_core_pscsi target_core_user
> target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
> ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
> ipmi_ssif drm_kms_helper drm coretemp kvm gpio_ich i2c_algo_bit
> i7core_edac fb_sys_fops syscopyarea edac_core sysfillrect sysimgblt
> ipmi_si input_leds hpilo ipmi_msghandler shpchp acpi_power_meter
> irqbypass serio_raw 8250_fintek lpc_ich mac_hid ceph bonding libceph
> lp parport libcrc32c fscache mlx4_en vxlan ip6_udp_tunnel udp_tunnel
> ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse bnx2 fjes
> scsi_transport_sas [last unloaded: target_core_mod]
> [Thu Dec 17 03:29:55 2015] CPU: 0 PID: 2547 Comm: iscsi_trx Tainted: G
>W I 4.4.0-rc4-ede1 #1
> [Thu Dec 17 03:29:55 2015] Hardware name: HP ProLiant DL360 G6, BIOS
> P64 01/22/2015
> [Thu Dec 17 03:29:55 2015]  c020cd47 8805f1e97958
> 813ad644 
> [Thu Dec 17 03:29:55 2015]  8805f1e97990 81079702
> 8805f1e97a50 015dd000
> [Thu Dec 17 03:29:55 2015]  880c034df800 0200
> eab26a80 8805f1e979a0
> [Thu Dec 17 03:29:55 2015] Call Trace:
> [Thu Dec 17 03:29:55 2015]  [] dump_stack+0x44/0x60
> [Thu Dec 17 03:29:55 2015]  [] 
> warn_slowpath_common+0x82/0xc0
> [Thu Dec 17 03:29:55 2015]  [] warn_slowpath_null+0x1a/0x20
> [Thu Dec 17 03:29:55 2015]  []
> ceph_write_begin+0xfb/0x120 [ceph]
> [Thu Dec 17 03:29:55 2015]  []
> generic_perform_write+0xbf/0x1a0
> [Thu Dec 17 03:29:55 2015]  []
> ceph_write_iter+0xf5c/0x1010 [ceph]
> [Thu Dec 17 03:29:55 2015]  [] ? __enqueue_entity+0x6c/0x70
> [Thu Dec 17 03:29:55 2015]  [] ?
> iov_iter_get_pages+0x113/0x210
> [Thu Dec 17 03:29:55 2015]  [] ?
> skb_copy_datagram_iter+0x122/0x250
> [Thu Dec 17 03:29:55 2015]  [] vfs_iter_write+0x63/0xa0
> [Thu Dec 17 03:29:55 2015]  []
> fd_do_rw.isra.5+0xc9/0x1b0 [target_core_file]
> [Thu Dec 17 03:29:55 2015]  []
> fd_execute_rw+0xc5/0x2a0 [target_core_file]
> [Thu Dec 17 03:29:55 2015]  []
> sbc_execute_rw+0x22/0x30 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> __target_execute_cmd+0x1f/0x70 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> target_execute_cmd+0x195/0x2a0 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  [] ? __switch_to+0x1cd/0x570
> [Thu Dec 17 03:29:55 2015]  [] ?
> iscsi_target_tx_thread+0x1c0/0x1c0 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  [] kthread+0xc9/0xe0
> [Thu Dec 17 03:29:55 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 03:29:55 2015]  [] ret_from_fork+0x3f/0x70
> [Thu Dec 17 03:29:55 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 03:29:55 2015] ---[ end trace 382a45986961da4e ]---


Could you please apply the new incremental patch and try again.


Regards
Yan, Zheng


>
> There are WARNINGs on both line 125 and 1162. I will attached the
> whole set of dmesg output to the tracker ticket 14086
>
> I wanted to note that file system snapshots are enabled and being used
> on this file system.
>
> Thanks
> Eric
>
> On Wed, Dec 16, 2015 at 8:15 AM, Eric Eastman
>  wrote:

>>> This warning is really strange. Could you try the attached debug patch.
>>>
>>> Regards
>>> Yan, Zheng
>>
>> I will try the patch and get back to the list.
>>
>> Eric


cephfs1.patch
Description: Binary data


Client still connects to the failed leader after that mon is down

2015-12-17 Thread Jaze Lee
Hello cephers:
In our test there are three monitors. We find that running a ceph
command from a client is slow when the leader mon is down. Even after a
long time, the first ceph command a client runs is still slow.
From strace, we see that the client first tries to connect to the leader,
then after 3s it connects to the second monitor.
After some searching we found that the quorum has not changed and the
leader is still the down monitor.
Is that normal?  Or is there something I missed?

Thanks a lot



-- 
谦谦君子 ("a modest gentleman")


Re: puzzling disappearance of /dev/sdc1

2015-12-17 Thread Loic Dachary
Hi Sage,

On 17/12/2015 14:31, Sage Weil wrote:
> On Thu, 17 Dec 2015, Loic Dachary wrote:
>> Hi Ilya,
>>
>> This is another puzzling behavior (the log of all commands is at 
>> http://tracker.ceph.com/issues/14094#note-4). in a nutshell, after a 
>> series of sgdisk -i commands to examine various devices including 
>> /dev/sdc1, the /dev/sdc1 file disappears (and I think it will showup 
>> again although I don't have a definitive proof of this).
>>
>> It looks like a side effect of a previous partprobe command, the only 
>> command I can think of that removes / re-adds devices. I thought calling 
>> udevadm settle after running partprobe would be enough to ensure 
>> partprobe completed (and since it takes as much as 2mn30 to return, I 
>> would be shocked if it does not ;-).
>>
>> Any idea ? I desperately try to find a consistent behavior, something 
>> reliable that we could use to say : "wait for the partition table to be 
>> up to date in the kernel and all udev events generated by the partition 
>> table update to complete".
> 
> I wonder if the underlying issue is that we shouldn't be calling udevadm 
> settle from something running from udev.  Instead, of a udev-triggered 
> run of ceph-disk does something that changes the partitions, it 
> should just exit and let udevadm run ceph-disk again on the new 
> devices...?

Unless I missed something, this is on CentOS 7 and ceph-disk is only called from 
udev as ceph-disk trigger, which does nothing but asynchronously delegate 
the work to systemd (roughly as in the sketch below). Therefore there is no 
udevadm settle from within udev (which would deadlock and time out every 
time... I hope ;-).
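
For readers not familiar with that path, the hand-off amounts to something
like the minimal sketch below. It only illustrates what "asynchronously
delegate the work to systemd" means here; the unit name (ceph-disk@.service)
and the device-name escaping are assumptions for the example, not necessarily
what ceph-disk actually does.

    # Illustration only: a udev-invoked "ceph-disk trigger" that hands the
    # device to a templated systemd unit and returns immediately, so udev
    # itself never blocks on the real work.
    import subprocess
    import sys

    def trigger(dev):
        # assumed unit name/escaping, e.g. /dev/sdc1 -> ceph-disk@dev-sdc1.service
        unit = "ceph-disk@%s.service" % dev.lstrip("/").replace("/", "-")
        # --no-block: start the unit but do not wait for it to finish
        subprocess.check_call(["systemctl", "start", "--no-block", unit])

    if __name__ == "__main__":
        trigger(sys.argv[1])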

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre





Re: Client still connects to the failed leader after that mon is down

2015-12-17 Thread Sage Weil
On Thu, 17 Dec 2015, Jaze Lee wrote:
> Hello cephers:
> In our test, there are three monitors. We find client run ceph
> command will slow when the leader mon is down. Even after long time, a
> client run ceph command will also slow in first time.
> >From strace, we find that the client first to connect the leader, then
> after 3s, it connect the second.
> After some search we find that the quorum is not change, the leader is
> still the down monitor.
> Is that normal?  Or is there something i miss?

It's normal.  Even when the quorum does change, the client doesn't 
know that.  It should be contacting a random mon on startup, though, so I 
would expect the 3s delay 1/3 of the time.

A long-standing low-priority feature request is to have the client contact 
2 mons in parallel so that it can still connect quickly if one is down.  
It requires some non-trivial work in mon/MonClient.{cc,h} though and I 
don't think anyone has looked at it seriously.
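
To make the idea concrete, here is a minimal sketch of the behaviour being
asked for: open connections to two monitors in parallel and keep whichever
answers first. The addresses are placeholders, and this is not how
mon/MonClient.{cc,h} is structured -- just the shape of the feature.

    # Minimal sketch of "contact 2 mons in parallel": race two TCP connects
    # and keep whichever succeeds first.  Monitor addresses are made up.
    import socket
    from concurrent.futures import ThreadPoolExecutor, as_completed

    MONS = [("192.168.0.1", 6789), ("192.168.0.2", 6789)]   # placeholders
    TIMEOUT = 3.0   # mirrors the 3s connect timeout seen in the strace

    def try_connect(addr):
        return addr, socket.create_connection(addr, timeout=TIMEOUT)

    with ThreadPoolExecutor(max_workers=len(MONS)) as pool:
        futures = [pool.submit(try_connect, m) for m in MONS]
        for fut in as_completed(futures):
            try:
                addr, sock = fut.result()
                print("using monitor %s:%d" % addr)
                break   # first responder wins; a real client would close the loser
            except OSError:
                continue   # that monitor is down; wait for the other attempt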

sage



Re: puzzling disappearance of /dev/sdc1

2015-12-17 Thread Sage Weil
On Thu, 17 Dec 2015, Loic Dachary wrote:
> Hi Ilya,
> 
> This is another puzzling behavior (the log of all commands is at 
> http://tracker.ceph.com/issues/14094#note-4). in a nutshell, after a 
> series of sgdisk -i commands to examine various devices including 
> /dev/sdc1, the /dev/sdc1 file disappears (and I think it will showup 
> again although I don't have a definitive proof of this).
> 
> It looks like a side effect of a previous partprobe command, the only 
> command I can think of that removes / re-adds devices. I thought calling 
> udevadm settle after running partprobe would be enough to ensure 
> partprobe completed (and since it takes as much as 2mn30 to return, I 
> would be shocked if it does not ;-).
> 
> Any idea ? I desperately try to find a consistent behavior, something 
> reliable that we could use to say : "wait for the partition table to be 
> up to date in the kernel and all udev events generated by the partition 
> table update to complete".

I wonder if the underlying issue is that we shouldn't be calling udevadm 
settle from something running from udev.  Instead, if a udev-triggered 
run of ceph-disk does something that changes the partitions, it 
should just exit and let udevadm run ceph-disk again on the new 
devices...?

sage


Re: puzzling disappearance of /dev/sdc1

2015-12-17 Thread Ilya Dryomov
On Thu, Dec 17, 2015 at 3:10 PM, Loic Dachary  wrote:
> Hi Sage,
>
> On 17/12/2015 14:31, Sage Weil wrote:
>> On Thu, 17 Dec 2015, Loic Dachary wrote:
>>> Hi Ilya,
>>>
>>> This is another puzzling behavior (the log of all commands is at
>>> http://tracker.ceph.com/issues/14094#note-4). in a nutshell, after a
>>> series of sgdisk -i commands to examine various devices including
>>> /dev/sdc1, the /dev/sdc1 file disappears (and I think it will showup
>>> again although I don't have a definitive proof of this).
>>>
>>> It looks like a side effect of a previous partprobe command, the only
>>> command I can think of that removes / re-adds devices. I thought calling
>>> udevadm settle after running partprobe would be enough to ensure
>>> partprobe completed (and since it takes as much as 2mn30 to return, I
>>> would be shocked if it does not ;-).

Yeah, IIRC partprobe goes through every slot in the partition table,
trying to first remove and then add the partition back.  But, I don't
see any mention of partprobe in the log you referred to.

Should udevadm settle for a few vd* devices be taking that much time?
I'd investigate that regardless of the issue at hand.

>>>
>>> Any idea ? I desperately try to find a consistent behavior, something
>>> reliable that we could use to say : "wait for the partition table to be
>>> up to date in the kernel and all udev events generated by the partition
>>> table update to complete".
>>
>> I wonder if the underlying issue is that we shouldn't be calling udevadm
>> settle from something running from udev.  Instead, of a udev-triggered
>> run of ceph-disk does something that changes the partitions, it
>> should just exit and let udevadm run ceph-disk again on the new
>> devices...?

>
> Unless I missed something this is on CentOS 7 and ceph-disk is only called 
> from udev as ceph-disk trigger which does nothing else but asynchronously 
> delegate the work to systemd. Therefore there is no udevadm settle from 
> within udev (which would deadlock and timeout every time... I hope ;-).

That's a sure lockup, until one of them times out.

How are you delegating to systemd?  Is it to avoid long-running udev
events?  I'm probably missing something - udevadm settle wouldn't block
on anything other than udev, so if you are shipping work off to
somewhere else, udev can't be relied upon for waiting.
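
For reference, the settle-guarded partprobe sequence Loic describes above
boils down to something like the sketch below -- settle, partprobe, retry
while the kernel refuses the re-read, settle again. This is only an
illustration of the sequence, not ceph-disk's actual code; the retry count
and sleep are arbitrary, and the 600s settle timeout is taken from the log.

    # Illustration of the udevadm-settle-guarded partprobe call under
    # discussion.  Not ceph-disk's real implementation.
    import subprocess
    import time

    def reread_partition_table(dev, retries=5, delay=60):
        subprocess.check_call(["udevadm", "settle", "--timeout=600"])
        for _ in range(retries):
            # partprobe asks the kernel to drop and re-add each partition
            if subprocess.call(["partprobe", dev]) == 0:
                break
            # the kernel still considers a partition in use; wait and retry
            time.sleep(delay)
        else:
            raise RuntimeError("kernel never accepted the new table on " + dev)
        # wait for the udev events generated by the re-added partitions
        subprocess.check_call(["udevadm", "settle", "--timeout=600"])

    reread_partition_table("/dev/vdb")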

Thanks,

Ilya


Re: Issue with Ceph File System and LIO

2015-12-17 Thread Minfei Huang
Hi.

It may be helpful for addressing this issue if we turn on debugging.

Thanks
Minfei

On 12/17/15 at 01:56P, Eric Eastman wrote:
> I patched the 4.4rc4 kernel source and restarted the test.  Shortly
> after starting it, this showed up in dmesg:
> 
> [Thu Dec 17 03:29:55 2015] WARNING: CPU: 0 PID: 2547 at
> fs/ceph/addr.c:1162 ceph_write_begin+0xfb/0x120 [ceph]()
> [Thu Dec 17 03:29:55 2015] Modules linked in: iscsi_target_mod
> vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
> target_core_file target_core_iblock target_core_pscsi target_core_user
> target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
> ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
> ipmi_ssif drm_kms_helper drm coretemp kvm gpio_ich i2c_algo_bit
> i7core_edac fb_sys_fops syscopyarea edac_core sysfillrect sysimgblt
> ipmi_si input_leds hpilo ipmi_msghandler shpchp acpi_power_meter
> irqbypass serio_raw 8250_fintek lpc_ich mac_hid ceph bonding libceph
> lp parport libcrc32c fscache mlx4_en vxlan ip6_udp_tunnel udp_tunnel
> ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse bnx2 fjes
> scsi_transport_sas [last unloaded: target_core_mod]
> [Thu Dec 17 03:29:55 2015] CPU: 0 PID: 2547 Comm: iscsi_trx Tainted: G
>W I 4.4.0-rc4-ede1 #1
> [Thu Dec 17 03:29:55 2015] Hardware name: HP ProLiant DL360 G6, BIOS
> P64 01/22/2015
> [Thu Dec 17 03:29:55 2015]  c020cd47 8805f1e97958
> 813ad644 
> [Thu Dec 17 03:29:55 2015]  8805f1e97990 81079702
> 8805f1e97a50 015dd000
> [Thu Dec 17 03:29:55 2015]  880c034df800 0200
> eab26a80 8805f1e979a0
> [Thu Dec 17 03:29:55 2015] Call Trace:
> [Thu Dec 17 03:29:55 2015]  [] dump_stack+0x44/0x60
> [Thu Dec 17 03:29:55 2015]  [] 
> warn_slowpath_common+0x82/0xc0
> [Thu Dec 17 03:29:55 2015]  [] warn_slowpath_null+0x1a/0x20
> [Thu Dec 17 03:29:55 2015]  []
> ceph_write_begin+0xfb/0x120 [ceph]
> [Thu Dec 17 03:29:55 2015]  []
> generic_perform_write+0xbf/0x1a0
> [Thu Dec 17 03:29:55 2015]  []
> ceph_write_iter+0xf5c/0x1010 [ceph]
> [Thu Dec 17 03:29:55 2015]  [] ? __enqueue_entity+0x6c/0x70
> [Thu Dec 17 03:29:55 2015]  [] ?
> iov_iter_get_pages+0x113/0x210
> [Thu Dec 17 03:29:55 2015]  [] ?
> skb_copy_datagram_iter+0x122/0x250
> [Thu Dec 17 03:29:55 2015]  [] vfs_iter_write+0x63/0xa0
> [Thu Dec 17 03:29:55 2015]  []
> fd_do_rw.isra.5+0xc9/0x1b0 [target_core_file]
> [Thu Dec 17 03:29:55 2015]  []
> fd_execute_rw+0xc5/0x2a0 [target_core_file]
> [Thu Dec 17 03:29:55 2015]  []
> sbc_execute_rw+0x22/0x30 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> __target_execute_cmd+0x1f/0x70 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> target_execute_cmd+0x195/0x2a0 [target_core_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  []
> iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  [] ? __switch_to+0x1cd/0x570
> [Thu Dec 17 03:29:55 2015]  [] ?
> iscsi_target_tx_thread+0x1c0/0x1c0 [iscsi_target_mod]
> [Thu Dec 17 03:29:55 2015]  [] kthread+0xc9/0xe0
> [Thu Dec 17 03:29:55 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 03:29:55 2015]  [] ret_from_fork+0x3f/0x70
> [Thu Dec 17 03:29:55 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 03:29:55 2015] ---[ end trace 382a45986961da4e ]---
> 
> There are WARNINGs on both line 125 and 1162. I will attached the
> whole set of dmesg output to the tracker ticket 14086
> 
> I wanted to note that file system snapshots are enabled and being used
> on this file system.
> 
> Thanks
> Eric
> 
> On Wed, Dec 16, 2015 at 8:15 AM, Eric Eastman
>  wrote:
> >>>
> >> This warning is really strange. Could you try the attached debug patch.
> >>
> >> Regards
> >> Yan, Zheng
> >
> > I will try the patch and get back to the list.
> >
> > Eric


Re: understanding partprobe failure

2015-12-17 Thread Ilya Dryomov
On Thu, Dec 17, 2015 at 1:19 PM, Loic Dachary  wrote:
> Hi Ilya,
>
> I'm seeing a partprobe failure right after a disk was zapped with sgdisk 
> --clear --mbrtogpt -- /dev/vdb:
>
> partprobe /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been 
> written, but we have been unable to inform the kernel of the change, probably 
> because it/they are in use. As a result, the old partition(s) will remain in 
> use. You should reboot now before making further changes.
>
> waiting 60 seconds (see the log below) and trying again succeeds. The 
> partprobe call is guarded by udevadm settle to prevent udev actions from 
> racing and nothing else goes on in the machine.
>
> Any idea how that could happen ?
>
> Cheers
>
> 2015-12-17 11:46:10,356.356 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:get_dm_uuid
>  /dev/vdb uuid path is /sys/dev/block/253:16/dm/uuid
> 2015-12-17 11:46:10,357.357 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Zapping
>  partition table on /dev/vdb
> 2015-12-17 11:46:10,358.358 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/sgdisk --zap-all -- /dev/vdb
> 2015-12-17 11:46:10,365.365 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>  invalid backup GPT header, but valid main header; regenerating
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:backup 
> header from main header.
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>  Main and backup partition tables differ! Use the 'c' and 'e' options
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:on the 
> recovery & transformation menu to examine the two tables.
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>  One or more CRCs don't match. You should repair the disk!
> 2015-12-17 11:46:10,368.368 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:11,413.413 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>  Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:verification
>  and recovery are STRONGLY recommended.
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:11,415.415 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>  The kernel is still using the old partition table.
> 2015-12-17 11:46:11,415.415 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
> table will be used at the next reboot.
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:GPT 
> data structures destroyed! You may now partition the disk using fdisk or
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:other 
> utilities.
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/vdb
> 2015-12-17 11:46:12,504.504 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Creating
>  new GPT entries.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>  The kernel is still using the old partition table.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
> table will be used at the next reboot.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
> operation has completed successfully.
> 2015-12-17 11:46:12,506.506 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Calling
>  partprobe on zapped device /dev/vdb
> 2015-12-17 11:46:12,507.507 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/bin/udevadm settle --timeout=600
> 2015-12-17 11:46:15,427.427 
> 

CoprHD Integrating Ceph

2015-12-17 Thread Patrick McGarry
Hey cephers,

In the pursuit of openness I wanted to share a ceph-related bit of
work that is happening beyond our immediate sphere of influence and
see who is already contributing, or might be interested in the
results.

https://groups.google.com/forum/?hl=en#!topic/coprhddevsupport/llZeiTWxddM

EMC’s CoprHD initiative continues to try to expand their influence
through open contribution. Currently there is work to integrate Ceph
support into their SB SDK. So, a few questions for anyone who wishes
to weigh in:

1) Is this inherently interesting to you?
2) Are you already contributing to this effort (or would you be
interested in contributing to this effort)?
3) Would you want to see this made a priority by the core team to
review and “bless” an integration?

Just want to get an idea to see if anyone is really excited about this
and just hasn’t expressed it yet. If nothing else I wanted people to
be aware that it was an option that was floating around out there.
Thanks.



Best Regards,


Patrick McGarry
pmcga...@gmail.com


Re: rgw subuser create and admin api

2015-12-17 Thread Yehuda Sadeh-Weinraub
On Thu, Dec 17, 2015 at 9:04 AM, Derek Yarnell  wrote:
> I am having an issue with the 'radosgw-admin subuser create' command
> doing something different than the '/{admin}/user?subuser=json'
> admin API.  I want to leverage subusers in S3 which looks to be possible
> in my testing for bit more control without resorting to ACLs.
>
> radosgw-admin subuser create --uid=-staff --subuser=test1
> --access-key=a --secret=z --access=read
>
> This command will work and create a both a subuser -staff:test1 with
> permission read and a s3 key with the the correct access and secret key set.
>
> The Admin API will not allow me to do this it would seem as the
> following is accepted and a subuser is created however a swift_key is
> created instead.
>
> DEBUG:requests.packages.urllib3.connectionpool:"PUT
> /admin/user?subuser=json=-staff=test2=b=cc=read
> HTTP/1.1" 200 130
>
> The documentation for the admin API[0] does not seem to indicate that
> access-key is accepted at all.  Also if you pass key-type=s3 it will
> return a 400 with InvalidArgument although the documentation says it
> should accept the key type s3.
>
> Bug? Design?

Somewhat a bug. The whole idea of subusers that use S3 was unintentional,
so when creating the subuser API we didn't think of needing the access
key. For some reason we do get the key type. Can you open a ceph
tracker issue for that?

You can try using the metadata API to modify the user once it has been
created (get the user info, add the S3 key to the structure, then put
the user info back).
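
As a rough sketch, that flow with python-requests could look like the
following. It assumes the metadata GET/PUT endpoints that come up later in
this thread (GET and PUT /admin/metadata/user?key=<uid>), that the HTTP
response mirrors what 'radosgw-admin metadata get user:cephtest' prints, and
that the S3Auth signing helper from the requests-aws package is available;
the uid and key values are placeholders.

    # Rough sketch of the get / modify / put flow described above.
    # Endpoint, auth helper and key values are assumptions/placeholders.
    import requests
    from awsauth import S3Auth   # assumed: pip install requests-aws

    RGW = "http://localhost:7480"
    auth = S3Auth("ADMIN_ACCESS_KEY", "ADMIN_SECRET_KEY", "localhost:7480")

    # 1. fetch the user's metadata record
    r = requests.get(RGW + "/admin/metadata/user",
                     params={"key": "cephtest", "format": "json"}, auth=auth)
    record = r.json()

    # 2. add the desired S3 key pair to the record
    record["data"]["keys"].append({
        "user": "cephtest:test1",      # assumed format for a subuser's S3 key
        "access_key": "MYACCESSKEY",   # placeholder
        "secret_key": "MYSECRETKEY",   # placeholder
    })

    # 3. write the record back; a 204 No Content response means success
    r = requests.put(RGW + "/admin/metadata/user",
                     params={"key": "cephtest"}, json=record, auth=auth)
    assert r.status_code in (200, 204)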

>
> One other issue is that a command that uses the --purge-keys from
> radosgw-admin seems to have no effect.  The following command removes
> the subuser and leaves the swift keys it has (but also any s3 keys too).
>
> radosgw-admin subuser rm --uid=-staff --subuser=test2 --purge-keys
>

It's a known issue, and it will be fixed soon (so it seems).

Thanks,
Yehuda

>
> [0] - http://docs.ceph.com/docs/master/radosgw/adminops/#create-subuser
>
>
> --
> Derek T. Yarnell
> University of Maryland
> Institute for Advanced Computer Studies


rgw subuser create and admin api

2015-12-17 Thread Derek Yarnell
I am having an issue with the 'radosgw-admin subuser create' command
doing something different than the '/{admin}/user?subuser=json'
admin API.  I want to leverage subusers in S3, which looks to be possible
in my testing, for a bit more control without resorting to ACLs.

radosgw-admin subuser create --uid=-staff --subuser=test1
--access-key=a --secret=z --access=read

This command will work and create both a subuser -staff:test1 with
permission read and an s3 key with the correct access and secret keys set.

The Admin API will not allow me to do this, it would seem, as the
following is accepted and a subuser is created; however, a swift_key is
created instead.

DEBUG:requests.packages.urllib3.connectionpool:"PUT
/admin/user?subuser=json=-staff=test2=b=cc=read
HTTP/1.1" 200 130

The documentation for the admin API[0] does not seem to indicate that
access-key is accepted at all.  Also, if you pass key-type=s3 it will
return a 400 with InvalidArgument, although the documentation says it
should accept the key type s3.

Bug? Design?

One other issue is that the --purge-keys option to radosgw-admin seems
to have no effect.  The following command removes the subuser but leaves
the swift keys it has (and any s3 keys too).

radosgw-admin subuser rm --uid=-staff --subuser=test2 --purge-keys


[0] - http://docs.ceph.com/docs/master/radosgw/adminops/#create-subuser


-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies


Re: rgw subuser create and admin api

2015-12-17 Thread Yehuda Sadeh-Weinraub
On Thu, Dec 17, 2015 at 12:06 PM, Derek Yarnell  wrote:
> On 12/17/15 2:36 PM, Yehuda Sadeh-Weinraub wrote:
>> Try 'section=user=cephtests'
>
> Doesn't seem to work either.
>
> # radosgw-admin metadata get user:cephtest
> {
> "key": "user:cephtest",
> "ver": {
> "tag": "_dhpzgdOjqJI-OsR1MsYV5-p",
> "ver": 1
> },
> "mtime": 1450378246,
> "data": {
> "user_id": "cephtest",
> "display_name": "Ceph Test",
> "email": "",
> "suspended": 0,
> "max_buckets": 1000,
> "auid": 0,
> "subusers": [],
> "keys": [
> {
> "user": "cephtest",
> "access_key": "eee",
> "secret_key": ""
> },
> {
> "user": "cephtest",
> "access_key": "aaa",
> "secret_key": ""
> }
> ],
> "swift_keys": [],
> "caps": [],
> "op_mask": "read, write, delete",
> "default_placement": "",
> "placement_tags": [],
> "bucket_quota": {
> "enabled": false,
> "max_size_kb": -1,
> "max_objects": -1
> },
> "user_quota": {
> "enabled": false,
> "max_size_kb": -1,
> "max_objects": -1
> },
> "temp_url_keys": []
> }
> }
>
>
> 2015-12-17 15:03:41.126024 7f88ef7e6700 20 RGWEnv::set(): HTTP_HOST:
> localhost:7480
> 2015-12-17 15:03:41.126056 7f88ef7e6700 20 RGWEnv::set(): HTTP_DATE:
> Thu, 17 Dec 2015 20:03:41 GMT
> 2015-12-17 15:03:41.126059 7f88ef7e6700 20 RGWEnv::set(): HTTP_ACCEPT: */*
> 2015-12-17 15:03:41.126064 7f88ef7e6700 20 RGWEnv::set():
> HTTP_ACCEPT_ENCODING: gzip, deflate
> 2015-12-17 15:03:41.126066 7f88ef7e6700 20 RGWEnv::set():
> HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
> 2015-12-17 15:03:41.126070 7f88ef7e6700 20 RGWEnv::set():
> HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
> 2015-12-17 15:03:41.126071 7f88ef7e6700 20 RGWEnv::set():
> HTTP_X_FORWARDED_FOR: 192.168.86.254
> 2015-12-17 15:03:41.126073 7f88ef7e6700 20 RGWEnv::set():
> HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
> 2015-12-17 15:03:41.126075 7f88ef7e6700 20 RGWEnv::set():
> HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
> 2015-12-17 15:03:41.126077 7f88ef7e6700 20 RGWEnv::set():
> HTTP_CONNECTION: Keep-Alive
> 2015-12-17 15:03:41.126079 7f88ef7e6700 20 RGWEnv::set():
> REQUEST_METHOD: GET
> 2015-12-17 15:03:41.126080 7f88ef7e6700 20 RGWEnv::set(): REQUEST_URI:
> /admin/metadata/get
> 2015-12-17 15:03:41.126081 7f88ef7e6700 20 RGWEnv::set(): QUERY_STRING:
> section=user=cephtest
> 2015-12-17 15:03:41.126082 7f88ef7e6700 20 RGWEnv::set(): REMOTE_USER:
> 2015-12-17 15:03:41.126083 7f88ef7e6700 20 RGWEnv::set(): SCRIPT_URI:
> /admin/metadata/get
> 2015-12-17 15:03:41.126089 7f88ef7e6700 20 RGWEnv::set(): SERVER_PORT: 7480
> 2015-12-17 15:03:41.126090 7f88ef7e6700 20 HTTP_ACCEPT=*/*
> 2015-12-17 15:03:41.126093 7f88ef7e6700 20 HTTP_ACCEPT_ENCODING=gzip,
> deflate
> 2015-12-17 15:03:41.126094 7f88ef7e6700 20 HTTP_AUTHORIZATION=AWS
> RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
> 2015-12-17 15:03:41.126094 7f88ef7e6700 20 HTTP_CONNECTION=Keep-Alive
> 2015-12-17 15:03:41.126095 7f88ef7e6700 20 HTTP_DATE=Thu, 17 Dec 2015
> 20:03:41 GMT
> 2015-12-17 15:03:41.126095 7f88ef7e6700 20 HTTP_HOST=localhost:7480
> 2015-12-17 15:03:41.126096 7f88ef7e6700 20
> HTTP_USER_AGENT=python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
> 2015-12-17 15:03:41.126097 7f88ef7e6700 20
> HTTP_X_FORWARDED_FOR=192.168.86.254
> 2015-12-17 15:03:41.126097 7f88ef7e6700 20
> HTTP_X_FORWARDED_HOST=ceph.umiacs.umd.edu
> 2015-12-17 15:03:41.126098 7f88ef7e6700 20
> HTTP_X_FORWARDED_SERVER=cephproxy00.umiacs.umd.edu
> 2015-12-17 15:03:41.126099 7f88ef7e6700 20
> QUERY_STRING=section=user=cephtest
> 2015-12-17 15:03:41.126099 7f88ef7e6700 20 REMOTE_USER=
> 2015-12-17 15:03:41.126100 7f88ef7e6700 20 REQUEST_METHOD=GET
> 2015-12-17 15:03:41.126101 7f88ef7e6700 20 REQUEST_URI=/admin/metadata/get
> 2015-12-17 15:03:41.126101 7f88ef7e6700 20 SCRIPT_URI=/admin/metadata/get
> 2015-12-17 15:03:41.126102 7f88ef7e6700 20 SERVER_PORT=7480
> 2015-12-17 15:03:41.126104 7f88ef7e6700 20 RGWEnv::set(): HTTP_HOST:
> localhost:7480
> 2015-12-17 15:03:41.126105 7f88ef7e6700 20 RGWEnv::set(): HTTP_DATE:
> Thu, 17 Dec 2015 20:03:41 GMT
> 2015-12-17 15:03:41.126107 7f88ef7e6700 20 RGWEnv::set(): HTTP_ACCEPT: */*
> 2015-12-17 15:03:41.126108 7f88ef7e6700 20 RGWEnv::set():
> HTTP_ACCEPT_ENCODING: gzip, deflate
> 2015-12-17 15:03:41.126110 7f88ef7e6700 20 RGWEnv::set():
> HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
> 2015-12-17 15:03:41.126113 7f88ef7e6700 20 RGWEnv::set():
> HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 

Re: rgw subuser create and admin api

2015-12-17 Thread Derek Yarnell
On 12/17/15 2:36 PM, Yehuda Sadeh-Weinraub wrote:
> Try 'section=user=cephtests'

Doesn't seem to work either.

# radosgw-admin metadata get user:cephtest
{
    "key": "user:cephtest",
    "ver": {
        "tag": "_dhpzgdOjqJI-OsR1MsYV5-p",
        "ver": 1
    },
    "mtime": 1450378246,
    "data": {
        "user_id": "cephtest",
        "display_name": "Ceph Test",
        "email": "",
        "suspended": 0,
        "max_buckets": 1000,
        "auid": 0,
        "subusers": [],
        "keys": [
            {
                "user": "cephtest",
                "access_key": "eee",
                "secret_key": ""
            },
            {
                "user": "cephtest",
                "access_key": "aaa",
                "secret_key": ""
            }
        ],
        "swift_keys": [],
        "caps": [],
        "op_mask": "read, write, delete",
        "default_placement": "",
        "placement_tags": [],
        "bucket_quota": {
            "enabled": false,
            "max_size_kb": -1,
            "max_objects": -1
        },
        "user_quota": {
            "enabled": false,
            "max_size_kb": -1,
            "max_objects": -1
        },
        "temp_url_keys": []
    }
}


2015-12-17 15:03:41.126024 7f88ef7e6700 20 RGWEnv::set(): HTTP_HOST:
localhost:7480
2015-12-17 15:03:41.126056 7f88ef7e6700 20 RGWEnv::set(): HTTP_DATE:
Thu, 17 Dec 2015 20:03:41 GMT
2015-12-17 15:03:41.126059 7f88ef7e6700 20 RGWEnv::set(): HTTP_ACCEPT: */*
2015-12-17 15:03:41.126064 7f88ef7e6700 20 RGWEnv::set():
HTTP_ACCEPT_ENCODING: gzip, deflate
2015-12-17 15:03:41.126066 7f88ef7e6700 20 RGWEnv::set():
HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
2015-12-17 15:03:41.126070 7f88ef7e6700 20 RGWEnv::set():
HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 15:03:41.126071 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_FOR: 192.168.86.254
2015-12-17 15:03:41.126073 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
2015-12-17 15:03:41.126075 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
2015-12-17 15:03:41.126077 7f88ef7e6700 20 RGWEnv::set():
HTTP_CONNECTION: Keep-Alive
2015-12-17 15:03:41.126079 7f88ef7e6700 20 RGWEnv::set():
REQUEST_METHOD: GET
2015-12-17 15:03:41.126080 7f88ef7e6700 20 RGWEnv::set(): REQUEST_URI:
/admin/metadata/get
2015-12-17 15:03:41.126081 7f88ef7e6700 20 RGWEnv::set(): QUERY_STRING:
section=user=cephtest
2015-12-17 15:03:41.126082 7f88ef7e6700 20 RGWEnv::set(): REMOTE_USER:
2015-12-17 15:03:41.126083 7f88ef7e6700 20 RGWEnv::set(): SCRIPT_URI:
/admin/metadata/get
2015-12-17 15:03:41.126089 7f88ef7e6700 20 RGWEnv::set(): SERVER_PORT: 7480
2015-12-17 15:03:41.126090 7f88ef7e6700 20 HTTP_ACCEPT=*/*
2015-12-17 15:03:41.126093 7f88ef7e6700 20 HTTP_ACCEPT_ENCODING=gzip,
deflate
2015-12-17 15:03:41.126094 7f88ef7e6700 20 HTTP_AUTHORIZATION=AWS
RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
2015-12-17 15:03:41.126094 7f88ef7e6700 20 HTTP_CONNECTION=Keep-Alive
2015-12-17 15:03:41.126095 7f88ef7e6700 20 HTTP_DATE=Thu, 17 Dec 2015
20:03:41 GMT
2015-12-17 15:03:41.126095 7f88ef7e6700 20 HTTP_HOST=localhost:7480
2015-12-17 15:03:41.126096 7f88ef7e6700 20
HTTP_USER_AGENT=python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 15:03:41.126097 7f88ef7e6700 20
HTTP_X_FORWARDED_FOR=192.168.86.254
2015-12-17 15:03:41.126097 7f88ef7e6700 20
HTTP_X_FORWARDED_HOST=ceph.umiacs.umd.edu
2015-12-17 15:03:41.126098 7f88ef7e6700 20
HTTP_X_FORWARDED_SERVER=cephproxy00.umiacs.umd.edu
2015-12-17 15:03:41.126099 7f88ef7e6700 20
QUERY_STRING=section=user=cephtest
2015-12-17 15:03:41.126099 7f88ef7e6700 20 REMOTE_USER=
2015-12-17 15:03:41.126100 7f88ef7e6700 20 REQUEST_METHOD=GET
2015-12-17 15:03:41.126101 7f88ef7e6700 20 REQUEST_URI=/admin/metadata/get
2015-12-17 15:03:41.126101 7f88ef7e6700 20 SCRIPT_URI=/admin/metadata/get
2015-12-17 15:03:41.126102 7f88ef7e6700 20 SERVER_PORT=7480
2015-12-17 15:03:41.126104 7f88ef7e6700 20 RGWEnv::set(): HTTP_HOST:
localhost:7480
2015-12-17 15:03:41.126105 7f88ef7e6700 20 RGWEnv::set(): HTTP_DATE:
Thu, 17 Dec 2015 20:03:41 GMT
2015-12-17 15:03:41.126107 7f88ef7e6700 20 RGWEnv::set(): HTTP_ACCEPT: */*
2015-12-17 15:03:41.126108 7f88ef7e6700 20 RGWEnv::set():
HTTP_ACCEPT_ENCODING: gzip, deflate
2015-12-17 15:03:41.126110 7f88ef7e6700 20 RGWEnv::set():
HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:F6wMKxSrrFhl2m3fyo/M0yXIGT8=
2015-12-17 15:03:41.126113 7f88ef7e6700 20 RGWEnv::set():
HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 15:03:41.126115 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_FOR: 192.168.86.254
2015-12-17 15:03:41.126117 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
2015-12-17 15:03:41.126119 7f88ef7e6700 20 RGWEnv::set():
HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu

Re: rgw subuser create and admin api

2015-12-17 Thread Derek Yarnell
On 12/17/15 1:09 PM, Yehuda Sadeh-Weinraub wrote:
>> Bug? Design?
> 
> Somewhat a bug. The whole subusers that use s3 was unintentional, so
> when creating the subuser api, we didn't think of needing the access
> key. For some reason we do get the key type. Can you open a ceph
> tracker issue for that?
> 
> You can try using the metadata api to modify the user once it has been
> created (need to get the user info, add the s3 key to the structure,
> put the user info).
> 

This will actually create a subuser and the S3 keys correctly (but not
let you specify the access_key and secret_key).

DEBUG:requests.packages.urllib3.connectionpool:"PUT
/admin/user?subuser=json=-staff=test3=s3=read=True
HTTP/1.1" 200 87

I know about 'GET /admin/metadata/user?format=json' to get the list of
users from the admin ops API.  I see I can do things like 'radosgw-admin
metadata get user:cephtest', but I can't seem to get something like this
to work.

DEBUG:requests.packages.urllib3.connectionpool:"GET
/admin/metadata/get?format=json=user%3Acephtest HTTP/1.1" 404 20
ERROR:rgwadmin.rgw:{u'Code': u'NoSuchKey'}


-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies


Re: rgw subuser create and admin api

2015-12-17 Thread Yehuda Sadeh-Weinraub
On Thu, Dec 17, 2015 at 11:05 AM, Derek Yarnell  wrote:
> On 12/17/15 1:09 PM, Yehuda Sadeh-Weinraub wrote:
>>> Bug? Design?
>>
>> Somewhat a bug. The whole subusers that use s3 was unintentional, so
>> when creating the subuser api, we didn't think of needing the access
>> key. For some reason we do get the key type. Can you open a ceph
>> tracker issue for that?
>>
>> You can try using the metadata api to modify the user once it has been
>> created (need to get the user info, add the s3 key to the structure,
>> put the user info).
>>
>
> This will actually create a subuser and the S3 keys correctly (but not
> let you specify the access_key and secret_key).
>
> DEBUG:requests.packages.urllib3.connectionpool:"PUT
> /admin/user?subuser=json=-staff=test3=s3=read=True
> HTTP/1.1" 200 87
>
> I know about a 'GET /admin/metadata/user?format=json' to get the list of
> users from the adminops.  I see I can do things like 'radosgw-admin
> metadata get user:cephtest' but I can't seem to get something like this
> to work.
>
> DEBUG:requests.packages.urllib3.connectionpool:"GET
> /admin/metadata/get?format=json=user%3Acephtest HTTP/1.1" 404 20
> ERROR:rgwadmin.rgw:{u'Code': u'NoSuchKey'}
>

Try 'section=user=cephtests'

>
> --
> Derek T. Yarnell
> University of Maryland
> Institute for Advanced Computer Studies


Re: Issue with Ceph File System and LIO

2015-12-17 Thread Eric Eastman
With cephfs.patch and cephfs1.patch applied, I am now seeing:

[Thu Dec 17 14:27:59 2015] [ cut here ]
[Thu Dec 17 14:27:59 2015] WARNING: CPU: 0 PID: 3036 at
fs/ceph/addr.c:1171 ceph_write_begin+0xfb/0x120 [ceph]()
[Thu Dec 17 14:27:59 2015] Modules linked in: iscsi_target_mod
vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
target_core_file target_core_iblock target_core_pscsi target_core_user
target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
drm_kms_helper drm ipmi_ssif coretemp gpio_ich i2c_algo_bit kvm
fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp input_leds ceph
irqbypass i7core_edac serio_raw hpilo edac_core ipmi_si
ipmi_msghandler 8250_fintek lpc_ich acpi_power_meter libceph mac_hid
libcrc32c fscache bonding lp parport mlx4_en vxlan ip6_udp_tunnel
udp_tunnel ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse
bnx2 fjes scsi_transport_sas [last unloaded: target_core_mod]
[Thu Dec 17 14:27:59 2015] CPU: 0 PID: 3036 Comm: iscsi_trx Tainted: G
   W I 4.4.0-rc4-ede2 #1
[Thu Dec 17 14:27:59 2015] Hardware name: HP ProLiant DL360 G6, BIOS
P64 01/22/2015
[Thu Dec 17 14:27:59 2015]  c02b2e37 880c0289b958
813ad644 
[Thu Dec 17 14:27:59 2015]  880c0289b990 81079702
880c0289ba50 000846c21000
[Thu Dec 17 14:27:59 2015]  880c009ea200 1000
ea00122ed700 880c0289b9a0
[Thu Dec 17 14:27:59 2015] Call Trace:
[Thu Dec 17 14:27:59 2015]  [] dump_stack+0x44/0x60
[Thu Dec 17 14:27:59 2015]  [] warn_slowpath_common+0x82/0xc0
[Thu Dec 17 14:27:59 2015]  [] warn_slowpath_null+0x1a/0x20
[Thu Dec 17 14:27:59 2015]  []
ceph_write_begin+0xfb/0x120 [ceph]
[Thu Dec 17 14:27:59 2015]  []
generic_perform_write+0xbf/0x1a0
[Thu Dec 17 14:27:59 2015]  []
ceph_write_iter+0xf5c/0x1010 [ceph]
[Thu Dec 17 14:27:59 2015]  [] ? __schedule+0x386/0x9c0
[Thu Dec 17 14:27:59 2015]  [] ? schedule+0x35/0x80
[Thu Dec 17 14:27:59 2015]  [] ? __slab_free+0xb5/0x290
[Thu Dec 17 14:27:59 2015]  [] ?
iov_iter_get_pages+0x113/0x210
[Thu Dec 17 14:27:59 2015]  [] vfs_iter_write+0x63/0xa0
[Thu Dec 17 14:27:59 2015]  []
fd_do_rw.isra.5+0xc9/0x1b0 [target_core_file]
[Thu Dec 17 14:27:59 2015]  []
fd_execute_rw+0xc5/0x2a0 [target_core_file]
[Thu Dec 17 14:27:59 2015]  []
sbc_execute_rw+0x22/0x30 [target_core_mod]
[Thu Dec 17 14:27:59 2015]  []
__target_execute_cmd+0x1f/0x70 [target_core_mod]
[Thu Dec 17 14:27:59 2015]  []
target_execute_cmd+0x195/0x2a0 [target_core_mod]
[Thu Dec 17 14:27:59 2015]  []
iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
[Thu Dec 17 14:27:59 2015]  []
iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
[Thu Dec 17 14:27:59 2015]  []
iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
[Thu Dec 17 14:27:59 2015]  [] ? __switch_to+0x1cd/0x570
[Thu Dec 17 14:27:59 2015]  [] ?
iscsi_target_tx_thread+0x1c0/0x1c0 [iscsi_target_mod]
[Thu Dec 17 14:27:59 2015]  [] kthread+0xc9/0xe0
[Thu Dec 17 14:27:59 2015]  [] ?
kthread_create_on_node+0x180/0x180
[Thu Dec 17 14:27:59 2015]  [] ret_from_fork+0x3f/0x70
[Thu Dec 17 14:27:59 2015]  [] ?
kthread_create_on_node+0x180/0x180
[Thu Dec 17 14:27:59 2015] ---[ end trace 8346192e3f29ed5d ]---

Each WARNING on line 1171 is followed by a WARNING on line 125.
The dmesg output is attached to tracker ticket 14086.

Regards,
Eric

On Thu, Dec 17, 2015 at 2:38 AM, Yan, Zheng  wrote:
> On Thu, Dec 17, 2015 at 4:56 PM, Eric Eastman
>  wrote:
>> I patched the 4.4rc4 kernel source and restarted the test.  Shortly
>> after starting it, this showed up in dmesg:
>>
>> [Thu Dec 17 03:29:55 2015] WARNING: CPU: 0 PID: 2547 at
>> fs/ceph/addr.c:1162 ceph_write_begin+0xfb/0x120 [ceph]()
>> [Thu Dec 17 03:29:55 2015] Modules linked in: iscsi_target_mod
>> vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
>> target_core_file target_core_iblock target_core_pscsi target_core_user
>> target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
>> ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
>> ipmi_ssif drm_kms_helper drm coretemp kvm gpio_ich i2c_algo_bit
>> i7core_edac fb_sys_fops syscopyarea edac_core sysfillrect sysimgblt
>> ipmi_si input_leds hpilo ipmi_msghandler shpchp acpi_power_meter
>> irqbypass serio_raw 8250_fintek lpc_ich mac_hid ceph bonding libceph
>> lp parport libcrc32c fscache mlx4_en vxlan ip6_udp_tunnel udp_tunnel
>> ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse bnx2 fjes
>> scsi_transport_sas [last unloaded: target_core_mod]
>> [Thu Dec 17 03:29:55 2015] CPU: 0 PID: 2547 Comm: iscsi_trx Tainted: G
>>W I 4.4.0-rc4-ede1 #1
>> [Thu Dec 17 03:29:55 2015] Hardware name: HP ProLiant DL360 G6, BIOS
>> P64 01/22/2015
>> [Thu Dec 17 03:29:55 2015]  c020cd47 8805f1e97958
>> 813ad644 
>> [Thu Dec 17 03:29:55 2015]  

Re: Issue with Ceph File System and LIO

2015-12-17 Thread Yan, Zheng
On Fri, Dec 18, 2015 at 2:23 PM, Eric Eastman
 wrote:
>> Hi Yan Zheng, Eric Eastman
>>
>> Similar bug was reported in f2fs, btrfs, it does affect 4.4-rc4, the fixing
>> patch was merged into 4.4-rc5, dfd01f026058 ("sched/wait: Fix the signal
>> handling fix").
>>
>> Related report & discussion was here:
>> https://lkml.org/lkml/2015/12/12/149
>>
>> I'm not sure the current reported issue of ceph was related to that though,
>> but at least try testing with an upgraded or patched kernel could verify it.
>> :)
>>
>> Thanks,
>>
>>> -Original Message-
>>> From: ceph-devel-ow...@vger.kernel.org 
>>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of
>>> Yan, Zheng
>>> Sent: Friday, December 18, 2015 12:05 PM
>>> To: Eric Eastman
>>> Cc: Ceph Development
>>> Subject: Re: Issue with Ceph File System and LIO
>>>
>>> On Fri, Dec 18, 2015 at 3:49 AM, Eric Eastman
>>>  wrote:
>>> > With cephfs.patch and cephfs1.patch applied and I am now seeing:
>>> >
>>> > [Thu Dec 17 14:27:59 2015] [ cut here ]
>>> > [Thu Dec 17 14:27:59 2015] WARNING: CPU: 0 PID: 3036 at
>>> > fs/ceph/addr.c:1171 ceph_write_begin+0xfb/0x120 [ceph]()
>>> > [Thu Dec 17 14:27:59 2015] Modules linked in: iscsi_target_mod
> ...
>>> >
>>>
>>> The page gets unlocked mystically. I still don't find any clue. Could
>>> you please try the new patch (not incremental patch). Besides, please
>>> enable CONFIG_DEBUG_VM when compiling the kernel.
>>>
>>> Thanks you very much
>>> Yan, Zheng
>>
> I have just installed the cephfs_new.patch and have set
> CONFIG_DEBUG_VM=y on a new 4.4rc4 kernel and restarted the ESXi iSCSI
> test to my Ceph File System gateway.  I plan to let it run overnight
> and report the status tomorrow.
>
> Let me know if I should move on to 4.4rc5 with or without patches and
> with or without  CONFIG_DEBUG_VM=y
>

Please try the rc5 kernel without patches and with DEBUG_VM=y.

Regards
Yan, Zheng


> Looking at the network traffic stats on my iSCSI gateway, with
> CONFIG_DEBUG_VM=y, throughput seems to be down by a factor of at least
> 10 compared to my last test without setting CONFIG_DEBUG_VM=y
>
> Regards,
> Eric


Re: Issue with Ceph File System and LIO

2015-12-17 Thread Eric Eastman
> Hi Yan Zheng, Eric Eastman
>
> Similar bug was reported in f2fs, btrfs, it does affect 4.4-rc4, the fixing
> patch was merged into 4.4-rc5, dfd01f026058 ("sched/wait: Fix the signal
> handling fix").
>
> Related report & discussion was here:
> https://lkml.org/lkml/2015/12/12/149
>
> I'm not sure the current reported issue of ceph was related to that though,
> but at least try testing with an upgraded or patched kernel could verify it.
> :)
>
> Thanks,
>
>> -Original Message-
>> From: ceph-devel-ow...@vger.kernel.org 
>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of
>> Yan, Zheng
>> Sent: Friday, December 18, 2015 12:05 PM
>> To: Eric Eastman
>> Cc: Ceph Development
>> Subject: Re: Issue with Ceph File System and LIO
>>
>> On Fri, Dec 18, 2015 at 3:49 AM, Eric Eastman
>>  wrote:
>> > With cephfs.patch and cephfs1.patch applied and I am now seeing:
>> >
>> > [Thu Dec 17 14:27:59 2015] [ cut here ]
>> > [Thu Dec 17 14:27:59 2015] WARNING: CPU: 0 PID: 3036 at
>> > fs/ceph/addr.c:1171 ceph_write_begin+0xfb/0x120 [ceph]()
>> > [Thu Dec 17 14:27:59 2015] Modules linked in: iscsi_target_mod
...
>> >
>>
>> The page gets unlocked mystically. I still don't find any clue. Could
>> you please try the new patch (not incremental patch). Besides, please
>> enable CONFIG_DEBUG_VM when compiling the kernel.
>>
>> Thanks you very much
>> Yan, Zheng
>
I have just installed the cephfs_new.patch and have set
CONFIG_DEBUG_VM=y on a new 4.4rc4 kernel and restarted the ESXi iSCSI
test to my Ceph File System gateway.  I plan to let it run overnight
and report the status tomorrow.

Let me know if I should move on to 4.4rc5 with or without patches and
with or without  CONFIG_DEBUG_VM=y

Looking at the network traffic stats on my iSCSI gateway, with
CONFIG_DEBUG_VM=y, throughput seems to be down by a factor of at least
10 compared to my last test without setting CONFIG_DEBUG_VM=y

Regards,
Eric


Re: rgw subuser create and admin api

2015-12-17 Thread Derek Yarnell
On 12/17/15 3:15 PM, Yehuda Sadeh-Weinraub wrote:
> 
> Right. Reading the code again:
> 
> Try:
> GET /admin/metadata/user=cephtest

Thanks, this is very helpful and works, and I was able to also get the PUT
working.  The only question: is it expected to return a 204 No Content?

2015-12-17 17:42:39.422612 7f88f47f0700 20 RGWEnv::set(): HTTP_HOST:
localhost:7480
2015-12-17 17:42:39.422619 7f88f47f0700 20 RGWEnv::set():
HTTP_ACCEPT_ENCODING: gzip, deflate
2015-12-17 17:42:39.422621 7f88f47f0700 20 RGWEnv::set(): HTTP_ACCEPT: */*
2015-12-17 17:42:39.422623 7f88f47f0700 20 RGWEnv::set():
HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 17:42:39.422625 7f88f47f0700 20 RGWEnv::set(): HTTP_DATE:
Thu, 17 Dec 2015 22:42:39 GMT
2015-12-17 17:42:39.422627 7f88f47f0700 20 RGWEnv::set(): CONTENT_TYPE:
application/json
2015-12-17 17:42:39.422629 7f88f47f0700 20 RGWEnv::set():
HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
2015-12-17 17:42:39.422630 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_FOR: 192.168.86.254
2015-12-17 17:42:39.422632 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
2015-12-17 17:42:39.422634 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
2015-12-17 17:42:39.422636 7f88f47f0700 20 RGWEnv::set():
HTTP_CONNECTION: Keep-Alive
2015-12-17 17:42:39.422637 7f88f47f0700 20 RGWEnv::set():
CONTENT_LENGTH: 1531
2015-12-17 17:42:39.422638 7f88f47f0700 20 RGWEnv::set():
REQUEST_METHOD: PUT
2015-12-17 17:42:39.422640 7f88f47f0700 20 RGWEnv::set(): REQUEST_URI:
/admin/metadata/user
2015-12-17 17:42:39.422641 7f88f47f0700 20 RGWEnv::set(): QUERY_STRING:
key=-staff
2015-12-17 17:42:39.422643 7f88f47f0700 20 RGWEnv::set(): REMOTE_USER:
2015-12-17 17:42:39.422644 7f88f47f0700 20 RGWEnv::set(): SCRIPT_URI:
/admin/metadata/user
2015-12-17 17:42:39.422651 7f88f47f0700 20 RGWEnv::set(): SERVER_PORT: 7480
2015-12-17 17:42:39.422652 7f88f47f0700 20 CONTENT_LENGTH=1531
2015-12-17 17:42:39.422654 7f88f47f0700 20 CONTENT_TYPE=application/json
2015-12-17 17:42:39.422655 7f88f47f0700 20 HTTP_ACCEPT=*/*
2015-12-17 17:42:39.422655 7f88f47f0700 20 HTTP_ACCEPT_ENCODING=gzip,
deflate
2015-12-17 17:42:39.422656 7f88f47f0700 20 HTTP_AUTHORIZATION=AWS
RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
2015-12-17 17:42:39.422657 7f88f47f0700 20 HTTP_CONNECTION=Keep-Alive
2015-12-17 17:42:39.422658 7f88f47f0700 20 HTTP_DATE=Thu, 17 Dec 2015
22:42:39 GMT
2015-12-17 17:42:39.422658 7f88f47f0700 20 HTTP_HOST=localhost:7480
2015-12-17 17:42:39.422659 7f88f47f0700 20
HTTP_USER_AGENT=python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 17:42:39.422660 7f88f47f0700 20
HTTP_X_FORWARDED_FOR=192.168.86.254
2015-12-17 17:42:39.422660 7f88f47f0700 20
HTTP_X_FORWARDED_HOST=ceph.umiacs.umd.edu
2015-12-17 17:42:39.422661 7f88f47f0700 20
HTTP_X_FORWARDED_SERVER=cephproxy00.umiacs.umd.edu
2015-12-17 17:42:39.422662 7f88f47f0700 20 QUERY_STRING=key=-staff
2015-12-17 17:42:39.422662 7f88f47f0700 20 REMOTE_USER=
2015-12-17 17:42:39.422663 7f88f47f0700 20 REQUEST_METHOD=PUT
2015-12-17 17:42:39.422664 7f88f47f0700 20 REQUEST_URI=/admin/metadata/user
2015-12-17 17:42:39.422664 7f88f47f0700 20 SCRIPT_URI=/admin/metadata/user
2015-12-17 17:42:39.422665 7f88f47f0700 20 SERVER_PORT=7480
2015-12-17 17:42:39.422667 7f88f47f0700 20 RGWEnv::set(): HTTP_HOST:
localhost:7480
2015-12-17 17:42:39.422668 7f88f47f0700 20 RGWEnv::set():
HTTP_ACCEPT_ENCODING: gzip, deflate
2015-12-17 17:42:39.422670 7f88f47f0700 20 RGWEnv::set(): HTTP_ACCEPT: */*
2015-12-17 17:42:39.422671 7f88f47f0700 20 RGWEnv::set():
HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
2015-12-17 17:42:39.422672 7f88f47f0700 20 RGWEnv::set(): HTTP_DATE:
Thu, 17 Dec 2015 22:42:39 GMT
2015-12-17 17:42:39.422673 7f88f47f0700 20 RGWEnv::set(): CONTENT_TYPE:
application/json
2015-12-17 17:42:39.422674 7f88f47f0700 20 RGWEnv::set():
HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
2015-12-17 17:42:39.422676 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_FOR: 192.168.86.254
2015-12-17 17:42:39.422677 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
2015-12-17 17:42:39.422678 7f88f47f0700 20 RGWEnv::set():
HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
2015-12-17 17:42:39.422679 7f88f47f0700 20 RGWEnv::set():
HTTP_CONNECTION: Keep-Alive
2015-12-17 17:42:39.422680 7f88f47f0700 20 RGWEnv::set():
CONTENT_LENGTH: 1531
2015-12-17 17:42:39.422681 7f88f47f0700 20 RGWEnv::set():
REQUEST_METHOD: PUT
2015-12-17 17:42:39.422682 7f88f47f0700 20 RGWEnv::set(): REQUEST_URI:
/admin/metadata/user
2015-12-17 17:42:39.422683 7f88f47f0700 20 RGWEnv::set(): QUERY_STRING:
key=-staff
2015-12-17 17:42:39.422684 7f88f47f0700 20 RGWEnv::set(): REMOTE_USER:
2015-12-17 17:42:39.422685 7f88f47f0700 20 RGWEnv::set(): SCRIPT_URI:
/admin/metadata/user
2015-12-17 17:42:39.422686 7f88f47f0700 20 RGWEnv::set(): SERVER_PORT: 7480
2015-12-17 

Re: understanding partprobe failure

2015-12-17 Thread Loic Dachary


On 17/12/2015 16:49, Ilya Dryomov wrote:
> On Thu, Dec 17, 2015 at 1:19 PM, Loic Dachary  wrote:
>> Hi Ilya,
>>
>> I'm seeing a partprobe failure right after a disk was zapped with sgdisk 
>> --clear --mbrtogpt -- /dev/vdb:
>>
>> partprobe /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been 
>> written, but we have been unable to inform the kernel of the change, 
>> probably because it/they are in use. As a result, the old partition(s) will 
>> remain in use. You should reboot now before making further changes.
>>
>> waiting 60 seconds (see the log below) and trying again succeeds. The 
>> partprobe call is guarded by udevadm settle to prevent udev actions from 
>> racing and nothing else goes on in the machine.
>>
>> Any idea how that could happen ?
>>
>> Cheers
>>
>> 2015-12-17 11:46:10,356.356 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:get_dm_uuid
>>  /dev/vdb uuid path is /sys/dev/block/253:16/dm/uuid
>> 2015-12-17 11:46:10,357.357 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Zapping
>>  partition table on /dev/vdb
>> 2015-12-17 11:46:10,358.358 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>>  command: /usr/sbin/sgdisk --zap-all -- /dev/vdb
>> 2015-12-17 11:46:10,365.365 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>>  invalid backup GPT header, but valid main header; regenerating
>> 2015-12-17 11:46:10,366.366 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:backup 
>> header from main header.
>> 2015-12-17 11:46:10,366.366 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
>> 2015-12-17 11:46:10,366.366 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>>  Main and backup partition tables differ! Use the 'c' and 'e' options
>> 2015-12-17 11:46:10,367.367 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:on the 
>> recovery & transformation menu to examine the two tables.
>> 2015-12-17 11:46:10,367.367 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
>> 2015-12-17 11:46:10,367.367 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>>  One or more CRCs don't match. You should repair the disk!
>> 2015-12-17 11:46:10,368.368 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
>> 2015-12-17 11:46:11,413.413 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
>> 2015-12-17 11:46:11,414.414 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>>  Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
>> 2015-12-17 11:46:11,414.414 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:verification
>>  and recovery are STRONGLY recommended.
>> 2015-12-17 11:46:11,414.414 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
>> 2015-12-17 11:46:11,415.415 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>>  The kernel is still using the old partition table.
>> 2015-12-17 11:46:11,415.415 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
>> new table will be used at the next reboot.
>> 2015-12-17 11:46:11,416.416 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:GPT 
>> data structures destroyed! You may now partition the disk using fdisk or
>> 2015-12-17 11:46:11,416.416 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:other 
>> utilities.
>> 2015-12-17 11:46:11,416.416 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>>  command: /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/vdb
>> 2015-12-17 11:46:12,504.504 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Creating
>>  new GPT entries.
>> 2015-12-17 11:46:12,505.505 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>>  The kernel is still using the old partition table.
>> 2015-12-17 11:46:12,505.505 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
>> new table will be used at the next reboot.
>> 2015-12-17 11:46:12,505.505 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
>> operation has completed successfully.
>> 2015-12-17 11:46:12,506.506 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Calling
>>  partprobe on zapped device /dev/vdb
>> 2015-12-17 11:46:12,507.507 
>> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>>  command: /usr/bin/udevadm settle 

Re: rgw subuser create and admin api

2015-12-17 Thread Yehuda Sadeh-Weinraub
On Thu, Dec 17, 2015 at 2:44 PM, Derek Yarnell  wrote:
> On 12/17/15 3:15 PM, Yehuda Sadeh-Weinraub wrote:
>>
>> Right. Reading the code again:
>>
>> Try:
>> GET /admin/metadata/user=cephtest
>
> Thanks, this is very helpful and works, and I was able to also get the PUT
> working.  Only question: is it expected to return a 204 No Content?

Yes, it's expected.

Yehuda
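
For reference, the exchange in the log below boils down to a metadata GET
followed by a PUT of the (possibly edited) JSON back to the same resource.
This is only a minimal python-requests sketch with AWS-v2-style signing; the
endpoint, credentials and user name are placeholders, not the actual values
from this thread:

import base64, hmac, hashlib, requests
from email.utils import formatdate

endpoint   = 'http://localhost:7480'      # radosgw, as in the log below
access_key = 'ACCESS_KEY_PLACEHOLDER'
secret_key = 'SECRET_KEY_PLACEHOLDER'

def signed_headers(method, resource, content_type=''):
    # AWS v2 string-to-sign; the ?key=... query string is not an S3
    # subresource, so only the bare path is signed.
    date = formatdate(usegmt=True)
    to_sign = '\n'.join([method, '', content_type, date, resource])
    sig = base64.b64encode(hmac.new(secret_key.encode(), to_sign.encode(),
                                    hashlib.sha1).digest()).decode()
    headers = {'Date': date, 'Authorization': 'AWS %s:%s' % (access_key, sig)}
    if content_type:
        headers['Content-Type'] = content_type
    return headers

# read one user's metadata entry
r = requests.get(endpoint + '/admin/metadata/user', params={'key': 'cephtest'},
                 headers=signed_headers('GET', '/admin/metadata/user'))
print(r.status_code, r.json())

# write it back (possibly modified); a 204 No Content reply is expected
r = requests.put(endpoint + '/admin/metadata/user', params={'key': 'cephtest'},
                 data=r.content,
                 headers=signed_headers('PUT', '/admin/metadata/user',
                                        'application/json'))
print(r.status_code)    # 204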

>
> 2015-12-17 17:42:39.422612 7f88f47f0700 20 RGWEnv::set(): HTTP_HOST:
> localhost:7480
> 2015-12-17 17:42:39.422619 7f88f47f0700 20 RGWEnv::set():
> HTTP_ACCEPT_ENCODING: gzip, deflate
> 2015-12-17 17:42:39.422621 7f88f47f0700 20 RGWEnv::set(): HTTP_ACCEPT: */*
> 2015-12-17 17:42:39.422623 7f88f47f0700 20 RGWEnv::set():
> HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
> 2015-12-17 17:42:39.422625 7f88f47f0700 20 RGWEnv::set(): HTTP_DATE:
> Thu, 17 Dec 2015 22:42:39 GMT
> 2015-12-17 17:42:39.422627 7f88f47f0700 20 RGWEnv::set(): CONTENT_TYPE:
> application/json
> 2015-12-17 17:42:39.422629 7f88f47f0700 20 RGWEnv::set():
> HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
> 2015-12-17 17:42:39.422630 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_FOR: 192.168.86.254
> 2015-12-17 17:42:39.422632 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
> 2015-12-17 17:42:39.422634 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
> 2015-12-17 17:42:39.422636 7f88f47f0700 20 RGWEnv::set():
> HTTP_CONNECTION: Keep-Alive
> 2015-12-17 17:42:39.422637 7f88f47f0700 20 RGWEnv::set():
> CONTENT_LENGTH: 1531
> 2015-12-17 17:42:39.422638 7f88f47f0700 20 RGWEnv::set():
> REQUEST_METHOD: PUT
> 2015-12-17 17:42:39.422640 7f88f47f0700 20 RGWEnv::set(): REQUEST_URI:
> /admin/metadata/user
> 2015-12-17 17:42:39.422641 7f88f47f0700 20 RGWEnv::set(): QUERY_STRING:
> key=-staff
> 2015-12-17 17:42:39.422643 7f88f47f0700 20 RGWEnv::set(): REMOTE_USER:
> 2015-12-17 17:42:39.422644 7f88f47f0700 20 RGWEnv::set(): SCRIPT_URI:
> /admin/metadata/user
> 2015-12-17 17:42:39.422651 7f88f47f0700 20 RGWEnv::set(): SERVER_PORT: 7480
> 2015-12-17 17:42:39.422652 7f88f47f0700 20 CONTENT_LENGTH=1531
> 2015-12-17 17:42:39.422654 7f88f47f0700 20 CONTENT_TYPE=application/json
> 2015-12-17 17:42:39.422655 7f88f47f0700 20 HTTP_ACCEPT=*/*
> 2015-12-17 17:42:39.422655 7f88f47f0700 20 HTTP_ACCEPT_ENCODING=gzip,
> deflate
> 2015-12-17 17:42:39.422656 7f88f47f0700 20 HTTP_AUTHORIZATION=AWS
> RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
> 2015-12-17 17:42:39.422657 7f88f47f0700 20 HTTP_CONNECTION=Keep-Alive
> 2015-12-17 17:42:39.422658 7f88f47f0700 20 HTTP_DATE=Thu, 17 Dec 2015
> 22:42:39 GMT
> 2015-12-17 17:42:39.422658 7f88f47f0700 20 HTTP_HOST=localhost:7480
> 2015-12-17 17:42:39.422659 7f88f47f0700 20
> HTTP_USER_AGENT=python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
> 2015-12-17 17:42:39.422660 7f88f47f0700 20
> HTTP_X_FORWARDED_FOR=192.168.86.254
> 2015-12-17 17:42:39.422660 7f88f47f0700 20
> HTTP_X_FORWARDED_HOST=ceph.umiacs.umd.edu
> 2015-12-17 17:42:39.422661 7f88f47f0700 20
> HTTP_X_FORWARDED_SERVER=cephproxy00.umiacs.umd.edu
> 2015-12-17 17:42:39.422662 7f88f47f0700 20 QUERY_STRING=key=-staff
> 2015-12-17 17:42:39.422662 7f88f47f0700 20 REMOTE_USER=
> 2015-12-17 17:42:39.422663 7f88f47f0700 20 REQUEST_METHOD=PUT
> 2015-12-17 17:42:39.422664 7f88f47f0700 20 REQUEST_URI=/admin/metadata/user
> 2015-12-17 17:42:39.422664 7f88f47f0700 20 SCRIPT_URI=/admin/metadata/user
> 2015-12-17 17:42:39.422665 7f88f47f0700 20 SERVER_PORT=7480
> 2015-12-17 17:42:39.422667 7f88f47f0700 20 RGWEnv::set(): HTTP_HOST:
> localhost:7480
> 2015-12-17 17:42:39.422668 7f88f47f0700 20 RGWEnv::set():
> HTTP_ACCEPT_ENCODING: gzip, deflate
> 2015-12-17 17:42:39.422670 7f88f47f0700 20 RGWEnv::set(): HTTP_ACCEPT: */*
> 2015-12-17 17:42:39.422671 7f88f47f0700 20 RGWEnv::set():
> HTTP_USER_AGENT: python-requests/2.3.0 CPython/2.7.10 Darwin/14.5.0
> 2015-12-17 17:42:39.422672 7f88f47f0700 20 RGWEnv::set(): HTTP_DATE:
> Thu, 17 Dec 2015 22:42:39 GMT
> 2015-12-17 17:42:39.422673 7f88f47f0700 20 RGWEnv::set(): CONTENT_TYPE:
> application/json
> 2015-12-17 17:42:39.422674 7f88f47f0700 20 RGWEnv::set():
> HTTP_AUTHORIZATION: AWS RTJ1TL13CH613JRU2PJD:K3xaPHDy6t3r0COfjwl9rAUsUfY=
> 2015-12-17 17:42:39.422676 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_FOR: 192.168.86.254
> 2015-12-17 17:42:39.422677 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_HOST: ceph.umiacs.umd.edu
> 2015-12-17 17:42:39.422678 7f88f47f0700 20 RGWEnv::set():
> HTTP_X_FORWARDED_SERVER: cephproxy00.umiacs.umd.edu
> 2015-12-17 17:42:39.422679 7f88f47f0700 20 RGWEnv::set():
> HTTP_CONNECTION: Keep-Alive
> 2015-12-17 17:42:39.422680 7f88f47f0700 20 RGWEnv::set():
> CONTENT_LENGTH: 1531
> 2015-12-17 17:42:39.422681 7f88f47f0700 20 RGWEnv::set():
> REQUEST_METHOD: PUT
> 2015-12-17 17:42:39.422682 7f88f47f0700 20 RGWEnv::set(): REQUEST_URI:
> /admin/metadata/user
> 2015-12-17 17:42:39.422683 7f88f47f0700 20 

Re: [ceph-users] v10.0.0 released

2015-12-17 Thread Loic Dachary
The script handles UTF-8 fine, the copy/paste is at fault here ;-)

On 24/11/2015 07:59, piotr.da...@ts.fujitsu.com wrote:
>> -Original Message-
>> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
>> ow...@vger.kernel.org] On Behalf Of Sage Weil
>> Sent: Monday, November 23, 2015 5:08 PM
>>
>> This is the first development release for the Jewel cycle.  We are off to a
>> good start, with lots of performance improvements flowing into the tree.
>> We are targeting sometime in Q1 2016 for the final Jewel.
>>
>> [..]
>> (`pr#5853 `_, Piotr Dałek)
> 
> Hopefully at that point the script that generates this list will learn how to 
> handle UTF-8 ;-)
> 
> 
> With best regards / Pozdrawiam
> Piotr Dałek
> ___
> ceph-users mailing list
> ceph-us...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature


understanding partprobe failure

2015-12-17 Thread Loic Dachary
Hi Ilya,

I'm seeing a partprobe failure right after a disk was zapped with sgdisk 
--clear --mbrtogpt -- /dev/vdb:

partprobe /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been 
written, but we have been unable to inform the kernel of the change, probably 
because it/they are in use. As a result, the old partition(s) will remain in 
use. You should reboot now before making further changes.

Waiting 60 seconds (see the log below) and trying again succeeds. The partprobe 
call is guarded by udevadm settle to prevent udev actions from racing, and 
nothing else is going on on the machine.
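
The sequence ceph-disk runs boils down to the following; this is a simplified
sketch of it, not the literal ceph-disk code, with the retry loop standing in
for the manual "wait and try again":

import subprocess, time

def run(*cmd):
    subprocess.check_call(list(cmd))

dev = '/dev/vdb'

run('sgdisk', '--zap-all', '--', dev)
run('sgdisk', '--clear', '--mbrtogpt', '--', dev)

# let udev finish processing the events generated by the zap before
# asking the kernel to reread the partition table
run('udevadm', 'settle', '--timeout=600')

# partprobe can still fail with "unable to inform the kernel of the
# change"; waiting and retrying eventually succeeds, as described above
for attempt in range(5):
    try:
        run('partprobe', dev)
        break
    except subprocess.CalledProcessError:
        time.sleep(60)
        run('udevadm', 'settle', '--timeout=600')
else:
    raise RuntimeError('partprobe %s kept failing' % dev)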

Any idea how that could happen?

Cheers

2015-12-17 11:46:10,356.356 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:get_dm_uuid
 /dev/vdb uuid path is /sys/dev/block/253:16/dm/uuid
2015-12-17 11:46:10,357.357 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Zapping
 partition table on /dev/vdb
2015-12-17 11:46:10,358.358 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
 command: /usr/sbin/sgdisk --zap-all -- /dev/vdb
2015-12-17 11:46:10,365.365 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution: 
invalid backup GPT header, but valid main header; regenerating
2015-12-17 11:46:10,366.366 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:backup 
header from main header.
2015-12-17 11:46:10,366.366 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
2015-12-17 11:46:10,366.366 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning! 
Main and backup partition tables differ! Use the 'c' and 'e' options
2015-12-17 11:46:10,367.367 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:on the 
recovery & transformation menu to examine the two tables.
2015-12-17 11:46:10,367.367 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
2015-12-17 11:46:10,367.367 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning! 
One or more CRCs don't match. You should repair the disk!
2015-12-17 11:46:10,368.368 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
2015-12-17 11:46:11,413.413 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
2015-12-17 11:46:11,414.414 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution: 
Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
2015-12-17 11:46:11,414.414 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:verification
 and recovery are STRONGLY recommended.
2015-12-17 11:46:11,414.414 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
2015-12-17 11:46:11,415.415 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning: 
The kernel is still using the old partition table.
2015-12-17 11:46:11,415.415 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
table will be used at the next reboot.
2015-12-17 11:46:11,416.416 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:GPT data 
structures destroyed! You may now partition the disk using fdisk or
2015-12-17 11:46:11,416.416 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:other 
utilities.
2015-12-17 11:46:11,416.416 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
 command: /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/vdb
2015-12-17 11:46:12,504.504 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Creating 
new GPT entries.
2015-12-17 11:46:12,505.505 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning: 
The kernel is still using the old partition table.
2015-12-17 11:46:12,505.505 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
table will be used at the next reboot.
2015-12-17 11:46:12,505.505 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
operation has completed successfully.
2015-12-17 11:46:12,506.506 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Calling
 partprobe on zapped device /dev/vdb
2015-12-17 11:46:12,507.507 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
 command: /usr/bin/udevadm settle --timeout=600
2015-12-17 11:46:15,427.427 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
 command: /usr/sbin/partprobe /dev/vdb
2015-12-17 11:46:16,860.860 
INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:partprobe
 /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been written, but we 
have 

puzzling disappearance of /dev/sdc1

2015-12-17 Thread Loic Dachary
Hi Ilya,

This is another puzzling behavior (the log of all commands is at 
http://tracker.ceph.com/issues/14094#note-4). In a nutshell, after a series of 
sgdisk -i commands to examine various devices including /dev/sdc1, the 
/dev/sdc1 file disappears (and I think it will show up again, although I don't 
have definitive proof of this).

It looks like a side effect of a previous partprobe command, the only command I 
can think of that removes / re-adds devices. I thought calling udevadm settle 
after running partprobe would be enough to ensure partprobe had completed (and 
since it can take as long as 2min30s to return, I would be shocked if it did 
not ;-).

Any idea? I am desperately trying to find consistent behavior, something reliable 
that we could use to say: "wait for the partition table to be up to date in 
the kernel and for all udev events generated by the partition table update to 
complete".
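
The closest approximation I can think of is to settle udev and then poll until 
the expected partition node is actually visible again, along the lines of the 
sketch below (just an illustration of the idea, not something ceph-disk does 
today):

import os, subprocess, time

def wait_for_partition(partition, timeout=300):
    # flush pending udev events, then poll until both the device node and
    # the kernel's view of the partition (sysfs) are back
    subprocess.check_call(['udevadm', 'settle', '--timeout=%d' % timeout])
    deadline = time.time() + timeout
    name = os.path.basename(partition)
    while time.time() < deadline:
        if os.path.exists(partition) and \
           os.path.exists('/sys/class/block/' + name):
            return True
        time.sleep(1)
    return False

if not wait_for_partition('/dev/sdc1'):
    raise RuntimeError('/dev/sdc1 did not reappear')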

Cheers
-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature


Re: Client still connect failed leader after that mon down

2015-12-17 Thread Jevon Qiao

On 17/12/15 21:27, Sage Weil wrote:

On Thu, 17 Dec 2015, Jaze Lee wrote:

Hello cephers:
 In our test, there are three monitors. We find that running a ceph
command from a client is slow when the leader mon is down. Even after a long
time, the first ceph command a client runs is still slow.
From strace, we see that the client first tries to connect to the leader, then
connects to the second monitor after 3s.
After some searching we found that the quorum does not change; the leader is
still the down monitor.
Is that normal?  Or is there something I am missing?

It's normal.  Even when the quorum does change, the client doesn't
know that.  It should be contacting a random mon on startup, though, so I
would expect the 3s delay 1/3 of the time.
That's because the client randomly picks a mon from the monmap. But what we 
observed is that when a mon is down no change is made to the monmap (neither 
the epoch nor the members). Is that the culprit for this phenomenon?


Thanks,
Jevon

A long-standing low-priority feature request is to have the client contact
2 mons in parallel so that it can still connect quickly if one is down.
It requires some non-trivial work in mon/MonClient.{cc,h} though, and I
don't think anyone has looked at it seriously.
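
The idea would look roughly like the sketch below (a toy illustration only,
in Python rather than the C++ MonClient, with made-up monitor addresses):
race a connection to two monitors and keep whichever answers first.

import asyncio

MONS = [('10.0.0.1', 6789), ('10.0.0.2', 6789)]   # made-up addresses

async def connect_first(mons, timeout=3.0):
    # start a TCP connection to every candidate mon at the same time and
    # return the first one that comes up, so one dead mon costs nothing
    attempts = [asyncio.ensure_future(asyncio.open_connection(h, p))
                for h, p in mons]
    try:
        for fut in asyncio.as_completed(attempts, timeout=timeout):
            try:
                return await fut          # (reader, writer) of the fastest mon
            except OSError:
                continue                  # that mon is down; wait for another
        raise ConnectionError('no monitor reachable')
    finally:
        for task in attempts:
            task.cancel()

reader, writer = asyncio.run(connect_first(MONS))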

sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




Re: Issue with Ceph File System and LIO

2015-12-17 Thread Yan, Zheng
On Fri, Dec 18, 2015 at 3:49 AM, Eric Eastman
 wrote:
> With cephfs.patch and cephfs1.patch applied, I am now seeing:
>
> [Thu Dec 17 14:27:59 2015] [ cut here ]
> [Thu Dec 17 14:27:59 2015] WARNING: CPU: 0 PID: 3036 at
> fs/ceph/addr.c:1171 ceph_write_begin+0xfb/0x120 [ceph]()
> [Thu Dec 17 14:27:59 2015] Modules linked in: iscsi_target_mod
> vhost_scsi tcm_qla2xxx ib_srpt tcm_fc tcm_usb_gadget tcm_loop
> target_core_file target_core_iblock target_core_pscsi target_core_user
> target_core_mod ipmi_devintf vhost qla2xxx ib_cm ib_sa ib_mad ib_core
> ib_addr libfc scsi_transport_fc libcomposite udc_core uio configfs ttm
> drm_kms_helper drm ipmi_ssif coretemp gpio_ich i2c_algo_bit kvm
> fb_sys_fops syscopyarea sysfillrect sysimgblt shpchp input_leds ceph
> irqbypass i7core_edac serio_raw hpilo edac_core ipmi_si
> ipmi_msghandler 8250_fintek lpc_ich acpi_power_meter libceph mac_hid
> libcrc32c fscache bonding lp parport mlx4_en vxlan ip6_udp_tunnel
> udp_tunnel ptp pps_core hid_generic usbhid hid mlx4_core hpsa psmouse
> bnx2 fjes scsi_transport_sas [last unloaded: target_core_mod]
> [Thu Dec 17 14:27:59 2015] CPU: 0 PID: 3036 Comm: iscsi_trx Tainted: G
>W I 4.4.0-rc4-ede2 #1
> [Thu Dec 17 14:27:59 2015] Hardware name: HP ProLiant DL360 G6, BIOS
> P64 01/22/2015
> [Thu Dec 17 14:27:59 2015]  c02b2e37 880c0289b958
> 813ad644 
> [Thu Dec 17 14:27:59 2015]  880c0289b990 81079702
> 880c0289ba50 000846c21000
> [Thu Dec 17 14:27:59 2015]  880c009ea200 1000
> ea00122ed700 880c0289b9a0
> [Thu Dec 17 14:27:59 2015] Call Trace:
> [Thu Dec 17 14:27:59 2015]  [] dump_stack+0x44/0x60
> [Thu Dec 17 14:27:59 2015]  [] 
> warn_slowpath_common+0x82/0xc0
> [Thu Dec 17 14:27:59 2015]  [] warn_slowpath_null+0x1a/0x20
> [Thu Dec 17 14:27:59 2015]  []
> ceph_write_begin+0xfb/0x120 [ceph]
> [Thu Dec 17 14:27:59 2015]  []
> generic_perform_write+0xbf/0x1a0
> [Thu Dec 17 14:27:59 2015]  []
> ceph_write_iter+0xf5c/0x1010 [ceph]
> [Thu Dec 17 14:27:59 2015]  [] ? __schedule+0x386/0x9c0
> [Thu Dec 17 14:27:59 2015]  [] ? schedule+0x35/0x80
> [Thu Dec 17 14:27:59 2015]  [] ? __slab_free+0xb5/0x290
> [Thu Dec 17 14:27:59 2015]  [] ?
> iov_iter_get_pages+0x113/0x210
> [Thu Dec 17 14:27:59 2015]  [] vfs_iter_write+0x63/0xa0
> [Thu Dec 17 14:27:59 2015]  []
> fd_do_rw.isra.5+0xc9/0x1b0 [target_core_file]
> [Thu Dec 17 14:27:59 2015]  []
> fd_execute_rw+0xc5/0x2a0 [target_core_file]
> [Thu Dec 17 14:27:59 2015]  []
> sbc_execute_rw+0x22/0x30 [target_core_mod]
> [Thu Dec 17 14:27:59 2015]  []
> __target_execute_cmd+0x1f/0x70 [target_core_mod]
> [Thu Dec 17 14:27:59 2015]  []
> target_execute_cmd+0x195/0x2a0 [target_core_mod]
> [Thu Dec 17 14:27:59 2015]  []
> iscsit_execute_cmd+0x20a/0x270 [iscsi_target_mod]
> [Thu Dec 17 14:27:59 2015]  []
> iscsit_sequence_cmd+0xda/0x190 [iscsi_target_mod]
> [Thu Dec 17 14:27:59 2015]  []
> iscsi_target_rx_thread+0x51d/0xe30 [iscsi_target_mod]
> [Thu Dec 17 14:27:59 2015]  [] ? __switch_to+0x1cd/0x570
> [Thu Dec 17 14:27:59 2015]  [] ?
> iscsi_target_tx_thread+0x1c0/0x1c0 [iscsi_target_mod]
> [Thu Dec 17 14:27:59 2015]  [] kthread+0xc9/0xe0
> [Thu Dec 17 14:27:59 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 14:27:59 2015]  [] ret_from_fork+0x3f/0x70
> [Thu Dec 17 14:27:59 2015]  [] ?
> kthread_create_on_node+0x180/0x180
> [Thu Dec 17 14:27:59 2015] ---[ end trace 8346192e3f29ed5d ]---
>

The page gets unlocked mysteriously; I still can't find any clue. Could
you please try the new patch (not an incremental patch)? Also, please
enable CONFIG_DEBUG_VM when compiling the kernel.

Thank you very much
Yan, Zheng


cephfs_new.patch
Description: Binary data