Re: [RFC 00/18] vfio: Adopt iommufd

2022-05-10 Thread Zhangfei Gao




On 2022/5/10 2:51 PM, Eric Auger wrote:

Hi Zhangfei,

On 5/10/22 05:17, Yi Liu wrote:

Hi Zhangfei,

On 2022/5/9 22:24, Zhangfei Gao wrote:

Hi, Alex

On 2022/4/27 12:35 AM, Alex Williamson wrote:

On Tue, 26 Apr 2022 12:43:35 +
Shameerali Kolothum Thodi  wrote:


-Original Message-
From: Eric Auger [mailto:eric.au...@redhat.com]
Sent: 26 April 2022 12:45
To: Shameerali Kolothum Thodi
; Yi
Liu ; alex.william...@redhat.com;
coh...@redhat.com;
qemu-devel@nongnu.org
Cc: da...@gibson.dropbear.id.au; th...@redhat.com;
far...@linux.ibm.com;
mjros...@linux.ibm.com; akrow...@linux.ibm.com; pa...@linux.ibm.com;
jjhe...@linux.ibm.com; jasow...@redhat.com; k...@vger.kernel.org;
j...@nvidia.com; nicol...@nvidia.com; eric.auger@gmail.com;
kevin.t...@intel.com; chao.p.p...@intel.com; yi.y@intel.com;
pet...@redhat.com; Zhangfei Gao 
Subject: Re: [RFC 00/18] vfio: Adopt iommufd

[...]

https://lore.kernel.org/kvm/0-v1-e79cd8d168e8+6-iommufd_...@nvidia.com/
[2] https://github.com/luxis1999/iommufd/tree/iommufd-v5.17-rc6
[3]
https://github.com/luxis1999/qemu/tree/qemu-for-5.17-rc6-vm-rfcv1

Hi,

I had a go with the above branches on our ARM64 platform trying to
pass-through a VF dev, but Qemu reports an error as below,

[    0.444728] hisi_sec2 :00:01.0: enabling device ( -> 0002)
qemu-system-aarch64-iommufd: IOMMU_IOAS_MAP failed: Bad address
qemu-system-aarch64-iommufd: vfio_container_dma_map(0xfeb40ce0, 0x80, 0x1, 0xb40ef000) = -14 (Bad address)

I think this happens for the dev BAR addr range. I haven't debugged the
kernel yet to see where it actually reports that.

Does it prevent your assigned device from working? I have such errors
too, but this is a known issue. This is due to the fact that P2P DMA is not
supported yet.

Yes, the basic tests are all good so far. I am still not very clear on how
it works if the map() fails, though. It looks like it fails in,

iommufd_ioas_map()
  iopt_map_user_pages()
    iopt_map_pages()
      ...
        pfn_reader_pin_pages()

So does it mean it just works because the page is resident?

No, it just means that you're not triggering any accesses that require
peer-to-peer DMA support.  Any sort of test where the device is only
performing DMA to guest RAM, which is by far the standard use case,
will work fine.  This also doesn't affect vCPU access to BAR space.
It's only a failure of the mappings of the BAR space into the IOAS,
which is only used when a device tries to directly target another
device's BAR space via DMA.  Thanks,

I also get this issue when trying to add a prereg listener:

+    container->prereg_listener = vfio_memory_prereg_listener;
+    memory_listener_register(&container->prereg_listener,
+                             &address_space_memory);

host kernel log:
iommufd_ioas_map 1 iova=80, iova1=80, cmd->iova=80, cmd->user_va=9c495000, cmd->length=1
iopt_alloc_area input area=859a2d00 iova=80
iopt_alloc_area area=859a2d00 iova=80
pin_user_pages_remote rc=-14

qemu log:
vfio_prereg_listener_region_add
iommufd_map iova=0x80
qemu-system-aarch64: IOMMU_IOAS_MAP failed: Bad address
qemu-system-aarch64: vfio_dma_map(0xfb96a930, 0x80, 0x1, 0x9c495000) = -14 (Bad address)
qemu-system-aarch64: (null)
double free or corruption (fasttop)
Aborted (core dumped)

With a hack of ignoring address 0x80 in map and unmap, the kernel
can boot.
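
For now, rather than hard-coding that address, I am considering having the
prereg listener skip anything that is not guest RAM. A minimal sketch of the
idea (the function name is mine; memory_region_is_ram() and
memory_region_is_ram_device() are existing QEMU helpers):

static void vfio_memory_prereg_listener_region_add(MemoryListener *listener,
                                                   MemoryRegionSection *section)
{
    /* Device BAR MMIO has no pinnable pages, so skip it for now. */
    if (!memory_region_is_ram(section->mr) ||
        memory_region_is_ram_device(section->mr)) {
        return;
    }
    /* ... IOMMU_IOAS_MAP the RAM section as before ... */
}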

Do you know if the iova 0x80 is guest RAM or MMIO? Currently, the
iommufd kernel part doesn't support mapping device BAR MMIO. This is a
known gap.

In the qemu arm virt machine this indeed matches the PCI MMIO region.


Thanks Yi and Eric,
Then I will wait for the updated iommufd kernel for the PCI MMIO region.

Another question:
how do we get the iommu_domain in the ioctl?

qemu can get container->ioas_id.

The kernel can get the ioas via the ioas_id, but how to get the domain?
Currently I am hacking with ioas->iopt.next_domain_id, which is increasing:
domain = xa_load(&ioas->iopt.domains, ioas->iopt.next_domain_id - 1);
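
A slightly less fragile variant I am also trying, which walks the xarray
instead of assuming next_domain_id - 1 is populated (still a hack; the
locking needed against concurrent attach/detach is elided here):

struct iommu_domain *domain = NULL;
unsigned long index;
void *entry;

/* take the first attached domain; xa_for_each() is the standard iterator */
xa_for_each(&ioas->iopt.domains, index, entry) {
    domain = entry;
    break;
}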

Any idea?

Thanks



Re: [RFC 00/18] vfio: Adopt iommufd

2022-05-09 Thread Zhangfei Gao

Hi, Alex

On 2022/4/27 12:35 AM, Alex Williamson wrote:

On Tue, 26 Apr 2022 12:43:35 +
Shameerali Kolothum Thodi  wrote:


-Original Message-
From: Eric Auger [mailto:eric.au...@redhat.com]
Sent: 26 April 2022 12:45
To: Shameerali Kolothum Thodi ; Yi
Liu ; alex.william...@redhat.com; coh...@redhat.com;
qemu-devel@nongnu.org
Cc: da...@gibson.dropbear.id.au; th...@redhat.com; far...@linux.ibm.com;
mjros...@linux.ibm.com; akrow...@linux.ibm.com; pa...@linux.ibm.com;
jjhe...@linux.ibm.com; jasow...@redhat.com; k...@vger.kernel.org;
j...@nvidia.com; nicol...@nvidia.com; eric.auger@gmail.com;
kevin.t...@intel.com; chao.p.p...@intel.com; yi.y@intel.com;
pet...@redhat.com; Zhangfei Gao 
Subject: Re: [RFC 00/18] vfio: Adopt iommufd

[...]
  
  

https://lore.kernel.org/kvm/0-v1-e79cd8d168e8+6-iommufd_...@nvidia.com/
[2] https://github.com/luxis1999/iommufd/tree/iommufd-v5.17-rc6
[3] https://github.com/luxis1999/qemu/tree/qemu-for-5.17-rc6-vm-rfcv1

Hi,

I had a go with the above branches on our ARM64 platform trying to
pass-through a VF dev, but Qemu reports an error as below,

[0.444728] hisi_sec2 :00:01.0: enabling device ( -> 0002)
qemu-system-aarch64-iommufd: IOMMU_IOAS_MAP failed: Bad address
qemu-system-aarch64-iommufd: vfio_container_dma_map(0xfeb40ce0, 0x80, 0x1, 0xb40ef000) = -14 (Bad address)

I think this happens for the dev BAR addr range. I haven't debugged the
kernel yet to see where it actually reports that.

Does it prevent your assigned device from working? I have such errors
too, but this is a known issue. This is due to the fact that P2P DMA is not
supported yet.

Yes, the basic tests are all good so far. I am still not very clear on how
it works if the map() fails, though. It looks like it fails in,

iommufd_ioas_map()
  iopt_map_user_pages()
    iopt_map_pages()
      ...
        pfn_reader_pin_pages()

So does it mean it just works because the page is resident?

No, it just means that you're not triggering any accesses that require
peer-to-peer DMA support.  Any sort of test where the device is only
performing DMA to guest RAM, which is by far the standard use case,
will work fine.  This also doesn't affect vCPU access to BAR space.
It's only a failure of the mappings of the BAR space into the IOAS,
which is only used when a device tries to directly target another
device's BAR space via DMA.  Thanks,


I also get this issue when trying to add a prereg listener:

+    container->prereg_listener = vfio_memory_prereg_listener;
+    memory_listener_register(&container->prereg_listener,
+                             &address_space_memory);

host kernel log:
iommufd_ioas_map 1 iova=80, iova1=80, cmd->iova=80, cmd->user_va=9c495000, cmd->length=1

iopt_alloc_area input area=859a2d00 iova=80
iopt_alloc_area area=859a2d00 iova=80
pin_user_pages_remote rc=-14

qemu log:
vfio_prereg_listener_region_add
iommufd_map iova=0x80
qemu-system-aarch64: IOMMU_IOAS_MAP failed: Bad address
qemu-system-aarch64: vfio_dma_map(0xfb96a930, 0x80, 0x1, 0x9c495000) = -14 (Bad address)

qemu-system-aarch64: (null)
double free or corruption (fasttop)
Aborted (core dumped)

With a hack of ignoring address 0x80 in map and unmap, the kernel
can boot.


Thanks




Re: [RFC v6 00/24] vSMMUv3/pSMMUv3 2 stage VFIO integration

2020-03-31 Thread Zhangfei Gao

Hi, Eric

On 2020/3/31 4:12 PM, Auger Eric wrote:

Hi Zhangfei,

On 3/31/20 8:42 AM, Zhangfei Gao wrote:

Hi, Eric

On 2020/3/21 12:58 AM, Eric Auger wrote:

Up to now vSMMUv3 has not been integrated with VFIO. VFIO
integration requires to program the physical IOMMU consistently
with the guest mappings. However, as opposed to VTD, SMMUv3 has
no "Caching Mode" which allows easy trapping of guest mappings.
This means the vSMMUV3 cannot use the same VFIO integration as VTD.

However SMMUv3 has 2 translation stages. This was devised with
virtualization use case in mind where stage 1 is "owned" by the
guest whereas the host uses stage 2 for VM isolation.

This series sets up this nested translation stage. It only works
if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
other words, it does not work if there is a physical SMMUv2).

- We force the host to use stage 2 instead of stage 1, when we
    detect a vSMMUV3 is behind a VFIO device. For a VFIO device
    without any virtual IOMMU, we still use stage 1 as many existing
    SMMUs expect this behavior.
- We use PCIPASIDOps to propagate guest stage 1 config changes on
    STE (Stream Table Entry) changes.
- We implement a specific UNMAP notifier that conveys guest
    IOTLB invalidations to the host
- We register MSI IOVA/GPA bindings to the host so that the latter
    can build a nested stage translation
- As the legacy MAP notifier is not called anymore, we must make
    sure stage 2 mappings are set. This is achieved through another
    prereg memory listener.
- Physical SMMU stage 1 related faults are reported to the guest
    via an eventfd mechanism and exposed through a dedicated VFIO-PCI
    region. Then they are reinjected into the guest.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6

Kernel Dependencies:
[1] [PATCH v10 00/11] SMMUv3 Nested Stage Setup (VFIO part)
[2] [PATCH v10 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
branch at:
https://github.com/eauger/linux/tree/will-arm-smmu-updates-2stage-v10

I really appreciate that you restarted this work.

I tested your branch; here is some update.

Guest: https://github.com/Linaro/linux-kernel-warpdrive/tree/sva-devel
Host: https://github.com/eauger/linux/tree/will-arm-smmu-updates-2stage-v10
qemu: https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6

The guest I am using contains Jean's SVA patches.
Since there are currently many patch conflicts, I use two different trees.

Result:
No-sva mode works.
In this mode, the guest directly gets physical addresses via ioctl.

OK thanks for testing

vSVA cannot work yet; there is still much work to do.
I tried SVA mode, but it fails in
iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA), so I chose no-sva instead.

Indeed I assume there are plenty of things missing to make vSVM work on
ARM (iommu, vfio, QEMU). I am currently reviewing Jacob and Yi's kernel
and QEMU series on Intel side. After that, I will come back to you to
help. Also vSMMUv3 does not support multiple contexts at the moment. I
will add this soon.


Still, the problem I have is testing. Any suggestion is welcome.



To make sure: do you mean you need an environment for testing?

How about the HiSilicon Kunpeng 920, an arm64 platform supporting SVA in the host now?
There is such a platform in the Linaro mlab that I think we can share.
Currently I am testing with uacce.
By testing a user driver (the hisi zip accelerator) in the guest, we can test
vSVA and PASID easily.


Thanks




Re: [RFC v6 00/24] vSMMUv3/pSMMUv3 2 stage VFIO integration

2020-03-31 Thread Zhangfei Gao

Hi, Eric

On 2020/3/21 12:58 AM, Eric Auger wrote:

Up to now vSMMUv3 has not been integrated with VFIO. VFIO
integration requires to program the physical IOMMU consistently
with the guest mappings. However, as opposed to VTD, SMMUv3 has
no "Caching Mode" which allows easy trapping of guest mappings.
This means the vSMMUV3 cannot use the same VFIO integration as VTD.

However SMMUv3 has 2 translation stages. This was devised with
virtualization use case in mind where stage 1 is "owned" by the
guest whereas the host uses stage 2 for VM isolation.

This series sets up this nested translation stage. It only works
if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
other words, it does not work if there is a physical SMMUv2).

- We force the host to use stage 2 instead of stage 1, when we
   detect a vSMMUV3 is behind a VFIO device. For a VFIO device
   without any virtual IOMMU, we still use stage 1 as many existing
   SMMUs expect this behavior.
- We use PCIPASIDOps to propagate guest stage 1 config changes on
   STE (Stream Table Entry) changes.
- We implement a specific UNMAP notifier that conveys guest
   IOTLB invalidations to the host
- We register MSI IOVA/GPA bindings to the host so that the latter
   can build a nested stage translation
- As the legacy MAP notifier is not called anymore, we must make
   sure stage 2 mappings are set. This is achieved through another
   prereg memory listener.
- Physical SMMU stage 1 related faults are reported to the guest
   via an eventfd mechanism and exposed through a dedicated VFIO-PCI
   region. Then they are reinjected into the guest.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6

Kernel Dependencies:
[1] [PATCH v10 00/11] SMMUv3 Nested Stage Setup (VFIO part)
[2] [PATCH v10 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
branch at: https://github.com/eauger/linux/tree/will-arm-smmu-updates-2stage-v10

I really appreciate that you restarted this work.

I tested your branch; here is some update.

Guest: https://github.com/Linaro/linux-kernel-warpdrive/tree/sva-devel
Host: https://github.com/eauger/linux/tree/will-arm-smmu-updates-2stage-v10
qemu: https://github.com/eauger/qemu/tree/v4.2.0-2stage-rfcv6



The guest I am using contains Jean's SVA patches.
Since there are currently many patch conflicts, I use two different trees.

Result:
No-sva mode works.
In this mode, the guest directly gets physical addresses via ioctl.

vSVA cannot work yet; there is still much work to do.
I tried SVA mode, but it fails in
iommu_dev_enable_feature(parent, IOMMU_DEV_FEAT_SVA), so I chose no-sva instead.

I am debugging how to enable this.
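
For reference, the rough flow I am debugging looks like the sketch below
(error handling trimmed; "dev" is the accelerator's parent device; these are
the SVA APIs from Jean's series in this kernel generation, where bind still
takes a drvdata argument):

#include <linux/iommu.h>
#include <linux/sched.h>        /* for current */

static int sva_test_bind(struct device *dev)
{
    struct iommu_sva *handle;
    int pasid;

    if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA))
        return -ENODEV;         /* <-- this is the step that fails for me */

    handle = iommu_sva_bind_device(dev, current->mm, NULL);
    if (IS_ERR(handle))
        return PTR_ERR(handle);

    pasid = iommu_sva_get_pasid(handle);
    /* ... submit work tagged with pasid, then tear down ... */
    iommu_sva_unbind_device(handle);
    return 0;
}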

Thanks




Re: [PATCH v16 00/10] VIRTIO-IOMMU device

2020-03-03 Thread Zhangfei Gao
On Tue, Mar 3, 2020 at 5:41 PM Auger Eric  wrote:
>
> Hi Zhangfei,
> On 3/3/20 4:23 AM, Zhangfei Gao wrote:
> > Hi Eric
> >
> > On Thu, Feb 27, 2020 at 9:50 PM Auger Eric  wrote:
> >>
> >> Hi Daniel,
> >>
> >> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
> >>> On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
> >>>> This series implements the QEMU virtio-iommu device.
> >>>>
> >>>> This matches the v0.12 spec (voted) and the corresponding
> >>>> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> >>>> are resolved for DT integration. The virtio-iommu can be
> >>>> instantiated in ARM virt using:
> >>>>
> >>>> "-device virtio-iommu-pci".
> >>>
> >>> Is there any more documentation besides this ?
> >>
> >> not yet in qemu.
> >>>
> >>> I'm wondering on the intended usage of this, and its relation
> >>> or pros/cons vs other iommu devices
> >>
> >> Maybe if you want to catch up on the topic, looking at the very first
> >> kernel RFC may be a good starting point. Motivation, pros & cons were
> >> discussed in that thread (hey, April 2017!)
> >> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
> >>
> >> on ARM we have SMMUv3 emulation. But the VFIO integration is not
> >> possible because SMMU does not have any "caching mode" and my nested
> >> paging kernel series is blocked. So the only solution to integrate with
> >> VFIO is looming virtio-iommu.
> >>
> >> In general the pros that were put forward are: virtio-iommu is
> >> architecture agnostic, removes the burden to accurately model complex
> >> device states, driver can support virtualization specific optimizations
> >> without being constrained by production driver maintenance. Cons is perf
> >> and mem footprint if we do not consider any optimization.
> >>
> >> You can have a look at
> >>
> >> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
> >>
> > Thanks for the patches.
> >
> > Could I ask one question?
> > To support vSVA and PASID in the guest, which direction do you recommend:
> > virtio-iommu or vSMMU (your nested paging)?
> >
> > Do we still have any obstacles?
> you can ask the question but not sure I can answer ;-)
>
> 1) SMMUv3 2stage implementation is blocked by Will at kernel level.
>
> Despite this situation I may/can respin as Marvell said they were
> interested in this effort. If you are also interested (I know you
> tested it several times and I am grateful to you for that), please reply
> to:
> [PATCH v9 00/14] SMMUv3 Nested Stage Setup (IOMMU part)
> (https://patchwork.kernel.org/cover/11039871/) and say you are
> interested in that work so that maintainers are aware there are
> potential users.
>
> At the moment I have not supported multiple CDs because it introduced
> other dependencies.
>
> 2) virtio-iommu
>
> So only virtio-iommu dt boot on machvirt is currently supported. For non
> DT, Jean respinned his kernel series
> "[PATCH v2 0/3] virtio-iommu on x86 and non-devicetree platforms" as you
> may know. However, non-DT integration is still controversial. Michael is
> pushing for putting the binding info in the PCI config space. Joerg
> yesterday challenged this solution and said he would prefer ACPI
> integration. ACPI support depends on ACPI spec update & vote anyway.
>
> To support PASID at virtio-iommu level you also need virtio-iommu API
> extensions to be proposed and written + kernel works. So that's a long
> road. I will let Jean-Philippe comment on that.
>
> I would just say that Intel is working on nested paging solution with
> their emulated intel-iommu. We can help them getting that upstream and
> partly benefit from this work.
>
> > Would you mind giving some breakdown?
> > Jean mentioned PASID is still not supported in QEMU.
> Do you mean support of multiple CDs in the emulated SMMU? That's a thing
> I could implement quite easily. What is more tricky is how to test it.

Thanks Eric

Discussed with Jean before, here are some obstacles for vSVA via nested paging.
Do you think they are still big issues?

Copy "
* PASID support in QEMU, I don't think there is anything yet
// this is not a big issue as your comments.

* Page response support in VFIO and QEMU. With Eric's series we can
inject recoverable faults into the guest, but there is no channel for
the guest to RESUME the stall after fixing it.

* We can't use DVM in nested mode unless the VMID is shared with the
CPU. For that we'll need the host SMMU driver to hook into the KVM VMID
allocator, just like we do for the ASID allocator. I haven't yet
investigated how to do that. It's possible to do vSVA without DVM
though, by sending all TLB invalidations through the SMMU command queue.
"

Thanks



Re: [PATCH v16 00/10] VIRTIO-IOMMU device

2020-03-02 Thread Zhangfei Gao
Hi Eric

On Thu, Feb 27, 2020 at 9:50 PM Auger Eric  wrote:
>
> Hi Daniel,
>
> On 2/27/20 12:17 PM, Daniel P. Berrangé wrote:
> > On Fri, Feb 14, 2020 at 02:27:35PM +0100, Eric Auger wrote:
> >> This series implements the QEMU virtio-iommu device.
> >>
> >> This matches the v0.12 spec (voted) and the corresponding
> >> virtio-iommu driver upstreamed in 5.3. All kernel dependencies
> >> are resolved for DT integration. The virtio-iommu can be
> >> instantiated in ARM virt using:
> >>
> >> "-device virtio-iommu-pci".
> >
> > Is there any more documentation besides this ?
>
> not yet in qemu.
> >
> > I'm wondering on the intended usage of this, and its relation
> > or pros/cons vs other iommu devices
>
> Maybe if you want to catch up on the topic, looking at the very first
> kernel RFC may be a good starting point. Motivation, pros & cons were
> discussed in that thread (hey, April 2017!)
> https://lists.linuxfoundation.org/pipermail/iommu/2017-April/021217.html
>
> on ARM we have SMMUv3 emulation. But the VFIO integration is not
> possible because SMMU does not have any "caching mode" and my nested
> paging kernel series is blocked. So the only solution to integrate with
> VFIO is looming virtio-iommu.
>
> In general the pros that were put forward are: virtio-iommu is
> architecture agnostic, removes the burden to accurately model complex
> device states, driver can support virtualization specific optimizations
> without being constrained by production driver maintenance. Cons is perf
> and mem footprint if we do not consider any optimization.
>
> You can have a look at
>
> http://events17.linuxfoundation.org/sites/events/files/slides/viommu_arm.pdf
>
Thanks for the patches.

Could I ask one question?
To support vSVA and PASID in the guest, which direction do you recommend:
virtio-iommu or vSMMU (your nested paging)?

Do we still have any obstacles?
Would you mind giving some breakdown?
Jean mentioned PASID is still not supported in QEMU.

Thanks



Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration

2019-07-11 Thread Zhangfei Gao
On Thu, Jul 11, 2019 at 1:55 PM Auger Eric  wrote:
>
> Hi Zhangfei,
>
> On 7/11/19 3:53 AM, Zhangfei Gao wrote:
> > On Mon, May 27, 2019 at 7:44 PM Eric Auger  wrote:
> >>
> >> Up to now vSMMUv3 has not been integrated with VFIO. VFIO
> >> integration requires to program the physical IOMMU consistently
> >> with the guest mappings. However, as opposed to VTD, SMMUv3 has
> >> no "Caching Mode" which allows easy trapping of guest mappings.
> >> This means the vSMMUV3 cannot use the same VFIO integration as VTD.
> >>
> >> However SMMUv3 has 2 translation stages. This was devised with
> >> virtualization use case in mind where stage 1 is "owned" by the
> >> guest whereas the host uses stage 2 for VM isolation.
> >>
> >> This series sets up this nested translation stage. It only works
> >> if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
> >> other words, it does not work if there is a physical SMMUv2).
> >>
> >> The series uses a new kernel user API [1], still under definition.
> >>
> >> - We force the host to use stage 2 instead of stage 1, when we
> >>   detect a vSMMUV3 is behind a VFIO device. For a VFIO device
> >>   without any virtual IOMMU, we still use stage 1 as many existing
> >>   SMMUs expect this behavior.
> >> - We introduce new IOTLB "config" notifiers, requested to notify
> >>   changes in the config of a given iommu memory region. So now
> >>   we have notifiers for IOTLB changes and config changes.
> >> - vSMMUv3 calls config notifiers when STE (Stream Table Entries)
> >>   are updated by the guest.
> >> - We implement a specific UNMAP notifier that conveys guest
> >>   IOTLB invalidations to the host
> >> - We implement a new MAP notifier only used for MSI IOVAs so
> >>   that the host can build a nested stage translation for MSI IOVAs
> >> - As the legacy MAP notifier is not called anymore, we must make
> >>   sure stage 2 mappings are set. This is achieved through another
> >>   memory listener.
> >> - Physical SMMU faults are reported to the guest via an eventfd
> >>   mechanism and reinjected into the latter.
> >>
> >> Note: The first patch is a code cleanup and was sent separately.
> >>
> >> Best Regards
> >>
> >> Eric
> >>
> >> This series can be found at:
> >> https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
> >>
> >> Compatible with kernel series:
> >> [PATCH v8 00/29] SMMUv3 Nested Stage Setup
> >> (https://lkml.org/lkml/2019/5/26/95)
> >>
> >
> > I have tested VFIO mode in qemu on an arm64 platform.
> >
> > Tested-by: Zhangfei Gao 
> > qemu: https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
> > kernel: https://github.com/eauger/linux/tree/v5.2-rc1-2stage-v8
>
> Your testing is really appreciated.
>
> Both kernel and QEMU series will be respinned. I am currently waiting
> for 5.3 kernel window as it will resolve some dependencies on the fault
> reporting APIs. My focus is to get the updated kernel series reviewed
> and tested and then refine the QEMU integration accordingly.
>
Thanks Eric, that's great.
I found that the kernel part (drivers/iommu/arm-smmu-v3.c) will
conflict with Jean's SVA patches,
especially this one: iommu/smmuv3: Dynamically allocate s1_cfg and s2_cfg

Thanks



Re: [Qemu-devel] [RFC v4 00/27] vSMMUv3/pSMMUv3 2 stage VFIO integration

2019-07-10 Thread Zhangfei Gao
On Mon, May 27, 2019 at 7:44 PM Eric Auger  wrote:
>
> Up to now vSMMUv3 has not been integrated with VFIO. VFIO
> integration requires to program the physical IOMMU consistently
> with the guest mappings. However, as opposed to VTD, SMMUv3 has
> no "Caching Mode" which allows easy trapping of guest mappings.
> This means the vSMMUV3 cannot use the same VFIO integration as VTD.
>
> However SMMUv3 has 2 translation stages. This was devised with
> virtualization use case in mind where stage 1 is "owned" by the
> guest whereas the host uses stage 2 for VM isolation.
>
> This series sets up this nested translation stage. It only works
> if there is one physical SMMUv3 used along with QEMU vSMMUv3 (in
> other words, it does not work if there is a physical SMMUv2).
>
> The series uses a new kernel user API [1], still under definition.
>
> - We force the host to use stage 2 instead of stage 1, when we
>   detect a vSMMUV3 is behind a VFIO device. For a VFIO device
>   without any virtual IOMMU, we still use stage 1 as many existing
>   SMMUs expect this behavior.
> - We introduce new IOTLB "config" notifiers, requested to notify
>   changes in the config of a given iommu memory region. So now
>   we have notifiers for IOTLB changes and config changes.
> - vSMMUv3 calls config notifiers when STE (Stream Table Entries)
>   are updated by the guest.
> - We implement a specific UNMAP notifier that conveys guest
>   IOTLB invalidations to the host
> - We implement a new MAP notifier only used for MSI IOVAs so
>   that the host can build a nested stage translation for MSI IOVAs
> - As the legacy MAP notifier is not called anymore, we must make
>   sure stage 2 mappings are set. This is achieved through another
>   memory listener.
> - Physical SMMU faults are reported to the guest via an eventfd
>   mechanism and reinjected into the latter.
>
> Note: The first patch is a code cleanup and was sent separately.
>
> Best Regards
>
> Eric
>
> This series can be found at:
> https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
>
> Compatible with kernel series:
> [PATCH v8 00/29] SMMUv3 Nested Stage Setup
> (https://lkml.org/lkml/2019/5/26/95)
>

I have tested VFIO mode in qemu on an arm64 platform.

Tested-by: Zhangfei Gao 
qemu: https://github.com/eauger/qemu/tree/v4.0.0-2stage-rfcv4
kernel: https://github.com/eauger/linux/tree/v5.2-rc1-2stage-v8



Re: [Qemu-devel] [RFC v9 00/17] VIRTIO-IOMMU device

2019-07-05 Thread Zhangfei Gao
Hi, Bharat

On Tue, Nov 27, 2018 at 3:12 PM Bharat Bhushan  wrote:

> > Testing:
> > - tested with guest using virtio-net-pci
> >   (,vhost=off,iommu_platform,disable-modern=off,disable-legacy=on)
> >   and virtio-blk-pci
> > - VFIO/VHOST integration is not part of this series
> > - When using the virtio-blk-pci, some EDK2 FW versions feature
> >   unmapped transactions and in that case the guest fails to boot.
>
> I have tested this series with virtio and VFIO both
> Tested-by: Bharat Bhushan 
>

Would you mind pasting the qemu test command?
I am a bit confused about testing VFIO, since virtio-iommu-pci has no "host" property.
Do we need to unbind the PF and bind the device to vfio-pci first?

Thanks



Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-31 Thread Zhangfei Gao
Hi, Jan

On Thu, Jul 28, 2016 at 9:29 AM, Zhangfei Gao <zhangfei@gmail.com> wrote:
> Hi, Jan
>
> On Wed, Jul 27, 2016 at 11:56 PM, Jan Kara <j...@suse.cz> wrote:
>> Hi!
>>
>> On Wed 27-07-16 15:58:55, Zhangfei Gao wrote:
>>> Hi, Michael
>>>
>>> I have met an ext4 error when using vhost_scsi on an arm64 platform, and
>>> suspect it is a vhost_scsi issue.
>>>
>>> Ext4 error when testing virtio_scsi & vhost_scsi
>>>
>>>
>>> No issue:
>>> 1. virtio_scsi, ext4
>>> 2. vhost_scsi & virtio_scsi, ext2
>>> 3. Instead of vhost, I also tried loopback, with no problem.
>>> Using loopback, the host can use the new block device, while vhost is
>>> used by the guest (qemu).
>>> http://www.linux-iscsi.org/wiki/Tcm_loop
>>> Testing directly in the host, I did not find the ext4 error.
>>>
>>>
>>>
>>> Have issue:
>>> 1. vhost_scsi & virtio_scsi, ext4
>>> a. iblock
>>> b. fileio, file located in /tmp (ram), not backed by a real device.
>>>
>>> 2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
>>> Since I need a KVM-specific patch for D02, I cannot freely switch
>>> to an older version.
>>>
>>> 3. Also tested with ext4 with the journal disabled:
>>> mkfs.ext4 -O ^has_journal /dev/sda
>>>
>>>
>>> Do you have any suggestion?
>>
>> So can you mount the filesystem with errors=remount-ro to avoid clobbering
>> the fs after the problem happens? And then run e2fsck on the problematic
>> filesystem and send the output here?
>>
>
> Tested twice, log pasted.
> Both using fileio, located in host ramfs /tmp
> Before e2fsck, umount /dev/sda
>
> 1.
> root@(none)$ mount -o errors=remount-ro /dev/sda /mnt
> [   22.812053] EXT4-fs (sda): mounted filesystem with ordered data
> mode. Opts: errors=remount-ro
> $ rm /mnt/test
> [  108.388905] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> [  108.406930] Aborting journal on device sda-8.
> [  108.414120] EXT4-fs (sda): Remounting filesystem read-only
> [  108.414847] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
> [  108.423571] EXT4-fs error (device sda) in ext4_free_blocks:4904:
> Journal has aborted
> [  108.431919] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> [  108.440269] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> [  108.448568] EXT4-fs error (device sda) in
> ext4_ext_remove_space:3058: IO failure
> [  108.456917] EXT4-fs error (device sda) in ext4_ext_truncate:4657:
> Corrupt filesystem
> [  108.465267] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> [  108.473567] EXT4-fs error (device sda) in ext4_truncate:4150: IO failure
> [  108.481917] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> root@(none)$ e2fsck /dev/sda
> e2fsck 1.42.9 (28-Dec-2013)
> /dev/sda is mounted.
> e2fsck: Cannot continue, aborting.
>
>
> root@(none)$ umount /mnt
> [  260.756250] EXT4-fs error (device sda): ext4_put_super:837:
> Couldn't clean up the journal
> root@(none)$ umount /mnt   e2fsck /dev/sda
> e2fsck 1.42.9 (28-Dec-2013)
> ext2fs_open2: Bad magic number in super-block
> e2fsck: Superblock invalid, trying backup blocks...
> Superblock needs_recovery flag is clear, but journal has data.
> Recovery flag not set in backup superblock, so running journal anyway.
> /dev/sda: recovering journal
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Free blocks count wrong for group #1 (32703, counted=8127).
> Fix? yes
> Free blocks count wrong for group #2 (32768, counted=31744).
> Fix? yes
> Free blocks count wrong (249509, counted=223909).
> Fix? yes
> Free inodes count wrong for group #0 (8181, counted=8180).
> Fix? yes
> Free inodes count wrong (65525, counted=65524).
> Fix? yes
>
> /dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
> /dev/sda: 12/65536 files (8.3% non-contiguous), 38235/262144 blocks
> root@(none)$
>
> 2.
>
>  root@(none)$ rm /mnt/test
> [   71.021484] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5362: Corrupt filesystem
> [   71.044959] Aborting journal on device sda-8.
> [   71.052152] EXT4-fs (sda): Remounting filesystem read-only
> [   71.052833] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
> [   71.061600] EXT4-fs error (device sda)

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-27 Thread Zhangfei Gao
Hi, Jan

On Wed, Jul 27, 2016 at 11:56 PM, Jan Kara <j...@suse.cz> wrote:
> Hi!
>
> On Wed 27-07-16 15:58:55, Zhangfei Gao wrote:
>> Hi, Michael
>>
>> I have met an ext4 error when using vhost_scsi on an arm64 platform, and
>> suspect it is a vhost_scsi issue.
>>
>> Ext4 error when testing virtio_scsi & vhost_scsi
>>
>>
>> No issue:
>> 1. virtio_scsi, ext4
>> 2. vhost_scsi & virtio_scsi, ext2
>> 3. Instead of vhost, I also tried loopback, with no problem.
>> Using loopback, the host can use the new block device, while vhost is
>> used by the guest (qemu).
>> http://www.linux-iscsi.org/wiki/Tcm_loop
>> Testing directly in the host, I did not find the ext4 error.
>>
>>
>>
>> Have issue:
>> 1. vhost_scsi & virtio_scsi, ext4
>> a. iblock
>> b. fileio, file located in /tmp (ram), not backed by a real device.
>>
>> 2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
>> Since I need a KVM-specific patch for D02, I cannot freely switch
>> to an older version.
>>
>> 3. Also tested with ext4 with the journal disabled:
>> mkfs.ext4 -O ^has_journal /dev/sda
>>
>>
>> Do you have any suggestion?
>
> So can you mount the filesystem with errors=remount-ro to avoid clobbering
> the fs after the problem happens? And then run e2fsck on the problematic
> filesystem and send the output here?
>

Tested twice, log pasted.
Both using fileio, located in host ramfs /tmp
Before e2fsck, umount /dev/sda

1.
root@(none)$ mount -o errors=remount-ro /dev/sda /mnt
[   22.812053] EXT4-fs (sda): mounted filesystem with ordered data
mode. Opts: errors=remount-ro
$ rm /mnt/test
[  108.388905] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[  108.406930] Aborting journal on device sda-8.
[  108.414120] EXT4-fs (sda): Remounting filesystem read-only
[  108.414847] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
[  108.423571] EXT4-fs error (device sda) in ext4_free_blocks:4904:
Journal has aborted
[  108.431919] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[  108.440269] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[  108.448568] EXT4-fs error (device sda) in
ext4_ext_remove_space:3058: IO failure
[  108.456917] EXT4-fs error (device sda) in ext4_ext_truncate:4657:
Corrupt filesystem
[  108.465267] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[  108.473567] EXT4-fs error (device sda) in ext4_truncate:4150: IO failure
[  108.481917] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
root@(none)$ e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda is mounted.
e2fsck: Cannot continue, aborting.


root@(none)$ umount /mnt
[  260.756250] EXT4-fs error (device sda): ext4_put_super:837:
Couldn't clean up the journal
root@(none)$ umount /mnt   e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.
/dev/sda: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #1 (32703, counted=8127).
Fix? yes
Free blocks count wrong for group #2 (32768, counted=31744).
Fix? yes
Free blocks count wrong (249509, counted=223909).
Fix? yes
Free inodes count wrong for group #0 (8181, counted=8180).
Fix? yes
Free inodes count wrong (65525, counted=65524).
Fix? yes

/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda: 12/65536 files (8.3% non-contiguous), 38235/262144 blocks
root@(none)$

2.

 root@(none)$ rm /mnt/test
[   71.021484] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[   71.044959] Aborting journal on device sda-8.
[   71.052152] EXT4-fs (sda): Remounting filesystem read-only
[   71.052833] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
[   71.061600] EXT4-fs error (device sda) in ext4_free_blocks:4904:
Journal has aborted
[   71.069948] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[   71.078296] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[   71.086597] EXT4-fs error (device sda) in
ext4_ext_remove_space:3058: IO failure
[   71.094946] EXT4-fs error (device sda) in ext4_ext_truncate:4657:
Corrupt filesystem
[   71.103296] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5362: Corrupt filesystem
[   71.111595] EXT4-fs error (device sda) in ext4_truncate:4150: IO failure
[   71.119946] EXT4-fs error (device sda) in
ext4_reserve_inode_writ

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-27 Thread Zhangfei Gao
Hi, Michael

I have met an ext4 error when using vhost_scsi on an arm64 platform, and
suspect it is a vhost_scsi issue.

Ext4 error when testing virtio_scsi & vhost_scsi


No issue:
1. virtio_scsi, ext4
2. vhost_scsi & virtio_scsi, ext2
3. Instead of vhost, I also tried loopback, with no problem.
Using loopback, the host can use the new block device, while vhost is
used by the guest (qemu).
http://www.linux-iscsi.org/wiki/Tcm_loop
Testing directly in the host, I did not find the ext4 error.



Have issue:
1. vhost_scsi & virtio_scsi, ext4
a. iblock
b. fileio, file located in /tmp (ram), not backed by a real device.

2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
Since I need a KVM-specific patch for D02, I cannot freely switch
to an older version.

3. Also tested with ext4 with the journal disabled:
mkfs.ext4 -O ^has_journal /dev/sda


Do you have any suggestion?

Thanks

On Tue, Jul 19, 2016 at 4:21 PM, Zhangfei Gao <zhangfei@gmail.com> wrote:
> On Tue, Jul 19, 2016 at 3:56 PM, Zhangfei Gao <zhangfei@gmail.com> wrote:
>> Dear Ted
>>
>> On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <ty...@mit.edu> wrote:
>>> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>>>> Some update:
>>>>
>>>> If I test with ext2, there is no problem with iblock.
>>>> If I test with ext4, ext4_mb_generate_buddy reports an error when
>>>> removing files after reboot.
>>>>
>>>>
>>>> root@(none)$ rm test
>>>> [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: 
>>>> group 18
>>>> , block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>>>> [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 
>>>> 0). Th
>>>> ere's a risk of filesystem corruption in case of system crash.
>>>>
>>>> Any special notes on using ext4 in qemu?
>>>
>>> Ext4 has more runtime consistency checking than ext2.  So just because
>>> ext4 complains doesn't mean that there isn't a problem with the file
>>> system; it just means that ext4 is more likely to notice before you
>>> lose user data.
>>>
>>> So if you test with ext2, try running e2fsck afterwards, to make sure
>>> the file system is consistent.
>>>
>>> Given that I'm regularly testing ext4 using kvm, and I haven't seen
>>> anything like this in a very long time, I suspect the problem is with
>>> your SCSI code, and not with ext4.
>>>
>>
>> Do you know the possible reason for this error?
>>
>> I have tried 4.7-rc2; the same issue exists.
>> It can be reproduced with both fileio and iblock as backstore.
>> It is easier to reproduce in qemu with this process:
>> qemu -> mount -> dd xx -> umount -> mount -> rm xx, then the error may
>> happen, no need to reboot.
>>
>> A ramdisk cannot trigger the error because it just does malloc and memcpy,
>> without going through the blk layer.
>>
>> I also tried creating one file in /tmp, used as fileio; this also reproduces it.
>> So no real device is involved.
>>
>> like:
>> cd /tmp
>> dd if=/dev/zero of=test bs=1M count=1024; sync;
>> targetcli
>> #targetcli
>> (targetcli) /> cd backstores/fileio
>> (targetcli) /> create name=file_backend file_or_dev=/tmp/test size=1G
>> (targetcli) /> cd /vhost
>> (targetcli) /> create wwn=naa.60014052cc816bf4
>> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
>> (targetcli) /> create /backstores/fileio/file_backend
>> (targetcli) /> cd /
>> (targetcli) /> saveconfig
>> (targetcli) /> exit
>>
>> /work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
>> -enable-kvm -nographic -kernel Image \
>> -device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
>> -m 512 -M virt -cpu host \
>> -append "earlyprintk console=ttyAMA0 mem=512M"
>>
>> in qemu:
>> mkfs.ext4 /dev/sda
>> mount /dev/sda /mnt/
>> sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;
>>
>> using dd test, then some error happen.
>> log like:
>> root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync;; date;
>> [ 1789.917963] sbc_parse_cdb cdb[0]=0x35
>> [ 1789.922000] fd_execute_sync_cache immed=0
>> Tue Jul 19 07:26:12 UTC 2016
>> [  200.712879] EXT4-fs error (device sda) [ 1790.191770] sbc_parse_cdb
>> cdb[0]=0x2a
>> in ext4_reserve_inode_write:5362[ 1790.198382]  fd_execute_rw
>> : Corrupt filesystem
>> [  200.729001] EXT4-fs error (device sda) [ 1790.207843] sbc_parse_cdb
>> cdb[0]=0x2a
>> in ext4_reserve_inode_wr

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-19 Thread Zhangfei Gao
On Tue, Jul 19, 2016 at 3:56 PM, Zhangfei Gao <zhangfei@gmail.com> wrote:
> Dear Ted
>
> On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <ty...@mit.edu> wrote:
>> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>>> Some update:
>>>
>>> If I test with ext2, there is no problem with iblock.
>>> If I test with ext4, ext4_mb_generate_buddy reports an error when
>>> removing files after reboot.
>>>
>>>
>>> root@(none)$ rm test
>>> [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: 
>>> group 18
>>> , block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>>> [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 
>>> 0). Th
>>> ere's a risk of filesystem corruption in case of system crash.
>>>
>>> Any special notes on using ext4 in qemu?
>>
>> Ext4 has more runtime consistency checking than ext2.  So just because
>> ext4 complains doesn't mean that there isn't a problem with the file
>> system; it just means that ext4 is more likely to notice before you
>> lose user data.
>>
>> So if you test with ext2, try running e2fsck afterwards, to make sure
>> the file system is consistent.
>>
>> Given that I'm regularly testing ext4 using kvm, and I haven't seen
>> anything like this in a very long time, I suspect the problem is with
>> your SCSI code, and not with ext4.
>>
>
> Do you know the possible reason for this error?
>
> I have tried 4.7-rc2; the same issue exists.
> It can be reproduced with both fileio and iblock as backstore.
> It is easier to reproduce in qemu with this process:
> qemu -> mount -> dd xx -> umount -> mount -> rm xx, then the error may
> happen, no need to reboot.
>
> A ramdisk cannot trigger the error because it just does malloc and memcpy,
> without going through the blk layer.
>
> I also tried creating one file in /tmp, used as fileio; this also reproduces it.
> So no real device is involved.
>
> like:
> cd /tmp
> dd if=/dev/zero of=test bs=1M count=1024; sync;
> targetcli
> #targetcli
> (targetcli) /> cd backstores/fileio
> (targetcli) /> create name=file_backend file_or_dev=/tmp/test size=1G
> (targetcli) /> cd /vhost
> (targetcli) /> create wwn=naa.60014052cc816bf4
> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
> (targetcli) /> create /backstores/fileio/file_backend
> (targetcli) /> cd /
> (targetcli) /> saveconfig
> (targetcli) /> exit
>
> /work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
> -enable-kvm -nographic -kernel Image \
> -device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
> -m 512 -M virt -cpu host \
> -append "earlyprintk console=ttyAMA0 mem=512M"
>
> in qemu:
> mkfs.ext4 /dev/sda
> mount /dev/sda /mnt/
> sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;
>
> using dd test, then some error happen.
> log like:
> root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync;; date;
> [ 1789.917963] sbc_parse_cdb cdb[0]=0x35
> [ 1789.922000] fd_execute_sync_cache immed=0
> Tue Jul 19 07:26:12 UTC 2016
> [  200.712879] EXT4-fs error (device sda) [ 1790.191770] sbc_parse_cdb
> cdb[0]=0x2a
> in ext4_reserve_inode_write:5362[ 1790.198382]  fd_execute_rw
> : Corrupt filesystem
> [  200.729001] EXT4-fs error (device sda) [ 1790.207843] sbc_parse_cdb
> cdb[0]=0x2a
> in ext4_reserve_inode_write:5362[ 1790.214495]  fd_execute_rw
> : Corrupt filesystem
>
> It looks like the error usually happens after SYNCHRONIZE CACHE, but it is
> not certain that it always happens after sync cache.
>
It does not always happen after SYNCHRONIZE CACHE.

Just tried in qemu: mount -> dd xx -> umount -> mount -> rm xx,
ram based (/tmp/test), no reboot.

root@(none)$ cd /mnt
root@(none)$ ls
[  301.444966]  sbc_parse_cdb cdb[0]=0x28
[  301.449003]  fd_execute_rw
lost+found  test
root@(none)$ rm test
[  304.281920]  sbc_parse_cdb cdb[0]=0x28
[  304.285955]  fd_execute_rw
[  118.002338] EXT4-fs error (device sda):[  304.290685] gzf sbc_parse_cdb cdb[0
]=0x28
 ext4_mb_generate_buddy:758: gro[  304.296737] gzf fd_execute_rw
up 3, block bitmap and bg descri[  304.304099]  sbc_parse_cdb cdb[0]=0x28
ptor inconsistent: 21504 vs 2143[  304.309322]  fd_execute_rw
9 free clusters
[  118.015903] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). The
re's a risk of filesystem corruption in case of system crash.
root@(none)$

Thanks



Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-19 Thread Zhangfei Gao
Dear Ted

On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <ty...@mit.edu> wrote:
> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>> Some update:
>>
>> If I test with ext2, there is no problem with iblock.
>> If I test with ext4, ext4_mb_generate_buddy reports an error when
>> removing files after reboot.
>>
>>
>> root@(none)$ rm test
>> [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 
>> 18
>> , block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>> [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). 
>> Th
>> ere's a risk of filesystem corruption in case of system crash.
>>
>> Any special notes on using ext4 in qemu?
>
> Ext4 has more runtime consistency checking than ext2.  So just because
> ext4 complains doesn't mean that there isn't a problem with the file
> system; it just means that ext4 is more likely to notice before you
> lose user data.
>
> So if you test with ext2, try running e2fsck afterwards, to make sure
> the file system is consistent.
>
> Given that I'm regularly testing ext4 using kvm, and I haven't seen
> anything like this in a very long time, I suspect the problem is with
> your SCSI code, and not with ext4.
>

Do you know the possible reason for this error?

I have tried 4.7-rc2; the same issue exists.
It can be reproduced with both fileio and iblock as backstore.
It is easier to reproduce in qemu with this process:
qemu -> mount -> dd xx -> umount -> mount -> rm xx, then the error may
happen, no need to reboot.

A ramdisk cannot trigger the error because it just does malloc and memcpy,
without going through the blk layer.

I also tried creating one file in /tmp, used as fileio; this also reproduces it.
So no real device is involved.

like:
cd /tmp
dd if=/dev/zero of=test bs=1M count=1024; sync;
targetcli
#targetcli
(targetcli) /> cd backstores/fileio
(targetcli) /> create name=file_backend file_or_dev=/tmp/test size=1G
(targetcli) /> cd /vhost
(targetcli) /> create wwn=naa.60014052cc816bf4
(targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
(targetcli) /> create /backstores/fileio/file_backend
(targetcli) /> cd /
(targetcli) /> saveconfig
(targetcli) /> exit

/work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
-enable-kvm -nographic -kernel Image \
-device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
-m 512 -M virt -cpu host \
-append "earlyprintk console=ttyAMA0 mem=512M"

in qemu:
mkfs.ext4 /dev/sda
mount /dev/sda /mnt/
sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;

using dd test, then some error happen.
log like:
root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync;; date;
[ 1789.917963] sbc_parse_cdb cdb[0]=0x35
[ 1789.922000] fd_execute_sync_cache immed=0
Tue Jul 19 07:26:12 UTC 2016
[  200.712879] EXT4-fs error (device sda) [ 1790.191770] sbc_parse_cdb
cdb[0]=0x2a
in ext4_reserve_inode_write:5362[ 1790.198382]  fd_execute_rw
: Corrupt filesystem
[  200.729001] EXT4-fs error (device sda) [ 1790.207843] sbc_parse_cdb
cdb[0]=0x2a
in ext4_reserve_inode_write:5362[ 1790.214495]  fd_execute_rw
: Corrupt filesystem

It looks like the error usually happens after SYNCHRONIZE CACHE, but it is
not certain that it always happens after sync cache.

Thanks



Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-15 Thread Zhangfei Gao
Dear Dave

On Wed, Jul 13, 2016 at 7:03 AM, Dave Chinner <da...@fromorbit.com> wrote:
> On Tue, Jul 12, 2016 at 12:43:24PM -0400, Theodore Ts'o wrote:
>> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>> > Some update:
>> >
>> > If I test with ext2, there is no problem with iblock.
>> > If I test with ext4, ext4_mb_generate_buddy reports an error when
>> > removing files after reboot.
>> >
>> >
>> > root@(none)$ rm test
>> > [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: 
>> > group 18
>> > , block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>> > [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 
>> > 0). Th
>> > ere's a risk of filesystem corruption in case of system crash.
>> >
>> > Any special notes on using ext4 in qemu?
>>
>> Ext4 has more runtime consistency checking than ext2.  So just because
>> ext4 complains doesn't mean that there isn't a problem with the file
>> system; it just means that ext4 is more likely to notice before you
>> lose user data.
>>
>> So if you test with ext2, try running e2fsck afterwards, to make sure
>> the file system is consistent.
>>
>> Given that I'm regularly testing ext4 using kvm, and I haven't seen
>> anything like this in a very long time, I suspect the problem is with
>> your SCSI code, and not with ext4.
>
> It's the same error I reported yesterday for ext3 on 4.7-rc6 when
> rebooting a VM after it hung.


Any link for this error?

I still cannot conclude which part causes this error.

1. No problem
a. Using virtio-scsi and testing via files with an ext4 filesystem: no problem.

 /work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
-enable-kvm -nographic -kernel Image \
-global virtio-blk-device.scsi=on -device virtio-scsi-device,id=scsi \
-drive file=ext4_oe64.img,id=coreimg,cache=none,if=none,format=raw \
-device scsi-hd,drive=coreimg \
-m 512 -M virt -cpu host \
-append "earlyprintk console=ttyAMA0 mem=512M"

ext4_oe64.img is an ext4 filesystem image.

b. Using vhost-scsi & target with a ramdisk as backstore, ext4 filesystem:
also no problem.


2. Has problem
a. Using vhost-scsi & target, iblock, with a SAS disk or a U-disk as
backstore, ext4: both have the issue.
This only proves the issue is not in the drivers (SAS & U-disk) themselves.


It looks like the issue is in vhost-scsi & target.
I am still checking how to narrow it down.
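
One idea for narrowing it down: bypass the filesystem entirely and verify the
raw block device data path with a known pattern. A sketch (warning: it
overwrites /dev/sda, so only run it on a throwaway backstore; O_DIRECT is used
to bypass the guest page cache):

#define _GNU_SOURCE             /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK  4096
#define NBLK 25600              /* 100 MB total */

int main(void)
{
    void *w, *r;
    int fd = open("/dev/sda", O_RDWR | O_DIRECT);

    if (fd < 0) { perror("open"); return 1; }
    if (posix_memalign(&w, BLK, BLK) || posix_memalign(&r, BLK, BLK))
        return 1;

    /* write a block-indexed pattern, flush, then read back and compare */
    for (long i = 0; i < NBLK; i++) {
        memset(w, i & 0xff, BLK);
        if (pwrite(fd, w, BLK, i * BLK) != BLK) { perror("pwrite"); return 1; }
    }
    fsync(fd);
    for (long i = 0; i < NBLK; i++) {
        if (pread(fd, r, BLK, i * BLK) != BLK) { perror("pread"); return 1; }
        memset(w, i & 0xff, BLK);
        if (memcmp(w, r, BLK)) {
            fprintf(stderr, "mismatch at block %ld\n", i);
            return 1;
        }
    }
    printf("raw data path OK\n");
    return 0;
}

A mismatch here would implicate the vhost-scsi data path directly; a clean run
would point more toward flush/ordering of metadata.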

Any suggestion?


Thanks



Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-13 Thread Zhangfei Gao
Dear Ted

On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <ty...@mit.edu> wrote:
> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>> Some update:
>>
>> If I test with ext2, there is no problem with iblock.
>> If I test with ext4, ext4_mb_generate_buddy reports an error when
>> removing files after reboot.
>>
>>
>> root@(none)$ rm test
>> [   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 
>> 18
>> , block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>> [   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). 
>> Th
>> ere's a risk of filesystem corruption in case of system crash.
>>
>> Any special notes on using ext4 in qemu?
>
> Ext4 has more runtime consistency checking than ext2.  So just because
> ext4 complains doesn't mean that there isn't a problem with the file
> system; it just means that ext4 is more likely to notice before you
> lose user data.
>
> So if you test with ext2, try running e2fsck afterwards, to make sure
> the file system is consistent.
>
> Given that I'm regularly testing ext4 using kvm, and I haven't seen
> anything like this in a very long time, I suspect the problem is with
> your SCSI code, and not with ext4.
>

Instead of using a SAS disk, I am trying a U-disk as backstore, via iblock.

# targetcli
/backstores/iblock> create name=block_backend dev=/dev/sdb
/backstores/iblock> cd /vhost
/vhost> create wwn=naa.60014053c5cc00ac
/vhost> ls
o- vhost  [1 Target]
  o- naa.60014053c5cc00ac .. [1 TPG]
o- tpg1 . [naa.6001405830beacfa]
  o- luns . [0 LUNs]
/vhost> cd naa.60014053c5cc00ac/tpg1/luns
/vhost/naa.60...0ac/tpg1/luns> create /backstores/iblock/block_backend

/work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
-enable-kvm -nographic -kernel Image \
-device vhost-scsi-pci,wwpn=naa.60014053c5cc00ac \
-m 512 -M virt -cpu host \
-append "earlyprintk console=ttyAMA0 mem=512M"



in qemu:

Just test with dd, got following error.

 #sync; date; dd if=/dev/zero of=test bs=1M count=100; sync;
Thu Jan  1 00:00:45 UTC 1970
[   45.150514] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.153319] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.156054] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.160806] EXT4-fs error (device sda) in ext4_ext_truncate:4661: Corrupt fil
esystem
[   45.165431] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.169177] EXT4-fs error (device sda) in ext4_orphan_del:2896: Corrupt files
ystem
[   45.172676] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.176427] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.180800] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem
[   45.183571] EXT4-fs error (device sda) in ext4_reserve_inode_write:5172: Corr
upt filesystem

[   50.122300] jbd2_journal_bmap: journal block not found at offset 26 on sda-8
[   50.123181] Aborting journal on device sda-8.
[   50.138046] EXT4-fs error (device sda): ext4_journal_check_start:56: Detected
 aborted journal
[   50.139117] EXT4-fs (sda): Remounting filesystem read-only
dd: writing 'test': Read-only file system
6+0 records in
4+1 records out
Thu Jan  1 00:00:50 UTC 1970


I also get an error like this after reboot; it does not always happen.
root@(none)$ rm test
root@(none)$ ls
lost+found
root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync; date;
Thu Jan  1 00:00:29 UTC 1970
[   29.909074] EXT4-fs error (device sda): ext4_init_inode_bitmap:79: comm dd: C
hecksum bad for group 3
[   29.910205] BUG: scheduling while atomic: dd/1091/0x0002
[   29.910928] Modules linked in:
[   29.911340] CPU: 0 PID: 1091 Comm: dd Not tainted 4.5.0-rc1+ #70
[   29.912066] Hardware name: linux,dummy-virt (DT)
[   29.912639] Call trace:
[   29.912957] [] dump_backtrace+0x0/0x180
[   29.913623] [] show_stack+0x14/0x20
[   29.914249] [] dump_stack+0x90/0xc8
[   29.914893] [] __schedule_bug+0x44/0x58
[   29.915573] [] __schedule+0x4f4/0x5a0
[   29.916200] [] schedule+0x3c/0xa8
[   29.916786] [] schedule_timeout+0x15c/0x1b0
[   29.917471] [] io_schedule_timeout+0xa0/0x110
[   29.918177] [] bit_wait_io+0x18/0x68
[   29.918827] [] __wait_on_bit_lock+0x7c/0xf0
[   29.919552] [] out_of_line_wait_on_bit_lock+0x60/0x68
[   29.920352] [] __lock_buffer+0x38/0x48
[   29.920991] [] __sync_dirty_buffer+0xf4/0xf8
[   29.921692] [] ext4_commit_super+0x18c/0x268
[   29.922389] [] __ext4_error+0x60/0xd

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-12 Thread Zhangfei Gao
Some update:

If I test with ext2, there is no problem with iblock.
If I test with ext4, ext4_mb_generate_buddy reports an error when
removing files after reboot.


root@(none)$ rm test
[   21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 18
, block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
[   21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). Th
ere's a risk of filesystem corruption in case of system crash.

Any special notes on using ext4 in qemu?

Thanks


On Mon, Jul 11, 2016 at 12:05 PM, Zhangfei Gao <zhangfei@gmail.com> wrote:
> Hi
>
> Does the qemu process need to flush data before closing?
>
> In the test of virtio_scsi & vhost_scsi, the first-time read & write
> to the mounted disk has no problem.
> But after reboot and remounting the disk, the error happens immediately when
> removing the files created the first time.
>
> For example:
> # targetcli
> /> cd backstores/iblock
> /backstores/iblock> create name=block_backend dev=/dev/sda3
> /backstores/iblock> cd /vhost
> /vhost> create wwn=naa.60014053c5cc00ac
> /vhost> ls
> o- vhost  [1 
> Target]
>   o- naa.60014053c5cc00ac .. [1 
> TPG]
> o- tpg1 . 
> [naa.6001405830beacfa]
>   o- luns . [0 
> LUNs]
> /vhost> cd naa.60014053c5cc00ac/tpg1/luns
> /vhost/naa.60...0ac/tpg1/luns> create /backstores/iblock/block_backend
>
> qemu.git/aarch64-softmmu/qemu-system-aarch64 \
> -enable-kvm -nographic -kernel Image \
> -device vhost-scsi-pci,wwpn=naa.60014053c5cc00ac \
> -m 512 -M virt -cpu host \
> -append "earlyprintk console=ttyAMA0 mem=512M"
>
> in qemu system:
> mount /dev/sda /mnt;
>
> sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;
>
> No problem for several runs.
>
> Reboot
> restore the targetcli config -> start qemu again.
> in qemu:
>
> mount /dev/sda /mnt;
>
> root@(none)$ rm test
> [   12.900540] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 3s
> [   12.908844] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 3s
> [   12.911154] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
>
> The error happens immediately when removing the files that were
> created the first time.
>
> Thanks
>
>
> On Sun, Jun 12, 2016 at 11:23 AM, Zhangfei Gao <zhangfei@gmail.com> wrote:
>> Here is a question about testing virtio-scsi & vhost-scsi.
>> I hit ext4 errors using either fileio or iblock backstores.
>> After the error, the filesystem cannot be remounted in the guest OS
>> without running mkfs.ext4 again.
>>
>> Any suggestions?
>> Thanks in advance.
>>
>>
>> Basic steps.
>> fileio:
>> mount /dev/sda3 /mnt
>> dd if=/dev/zero of=test bs=1M count=1024
>>
>>
>> #targetcli
>>
>> (targetcli) /> cd backstores/fileio
>>
>> (targetcli) /> create name=file_backend file_or_dev=/mnt/test size=1G
>>
>> (targetcli) /> cd /vhost
>>
>> (targetcli) /> create wwn=naa.60014052cc816bf4
>>
>> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
>>
>> (targetcli) /> create /backstores/fileio/file_backend
>>
>> (targetcli) /> cd /
>>
>> (targetcli) /> saveconfig
>>
>> (targetcli) /> exit
>>
>> qemu.git/aarch64-softmmu/qemu-system-aarch64 \
>>
>>-enable-kvm -nographic -kernel Image \
>>
>>-device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
>>
>>-m 512 -M virt -cpu host \
>>
>>-append "earlyprintk console=ttyAMA0 mem=512M rw"
>>
>>
>> After guest kernel is boot,
>>
>> mkfs.ext4 /dev/sda
>>
>> mount /dev/sda /mnt
>>
>>
>> sync; date; dd if=/dev/zero of=test bs=1M count=100; sync; date;
>>
>>
>> Ext4 error:
>>
>> And the filesystem cannot be mounted next time.
>>
>> [  762.387457] EXT4-fs error (device sda) in
>> ext4_reserve_inode_write:5172: Corrupt filesystem
>>
>> [  762.395622] EXT4-fs error (device sda) in
>> ext4_reserve_inode_write:5172: Corrupt filesystem
>>
>> [  762.403915] EXT4-fs error (device sda) in
>> ext4_reserve_inode_write:5172: Corrupt filesystem
>>
>> [  762.412263] EXT4-fs error (device sda) in ext4_ext_truncate:4661:
>> Corrupt filesystem
>>
>> [  762.420613] EXT4-fs error (device sda) in
>> ext4_reserve_inode_write:5172: Corrupt filesystem

Re: [Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-07-10 Thread Zhangfei Gao
Hi

Does the qemu process need to flush data before closing?

In the virtio_scsi & vhost_scsi test, the first round of reads and
writes to the mounted disk shows no problem.
But after a reboot, when the disk is remounted, errors happen
immediately when removing the files created in the first round.

For example:
# targetcli
/> cd backstores/iblock
/backstores/iblock> create name=block_backend dev=/dev/sda3
/backstores/iblock> cd /vhost
/vhost> create wwn=naa.60014053c5cc00ac
/vhost> ls
o- vhost  [1 Target]
  o- naa.60014053c5cc00ac .. [1 TPG]
o- tpg1 . [naa.6001405830beacfa]
  o- luns . [0 LUNs]
/vhost> cd naa.60014053c5cc00ac/tpg1/luns
/vhost/naa.60...0ac/tpg1/luns> create /backstores/iblock/block_backend

qemu.git/aarch64-softmmu/qemu-system-aarch64 \
-enable-kvm -nographic -kernel Image \
-device vhost-scsi-pci,wwpn=naa.60014053c5cc00ac \
-m 512 -M virt -cpu host \
-append "earlyprintk console=ttyAMA0 mem=512M"

in qemu system:
mount /dev/sda /mnt;

sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;

No problem for several runs.

Reboot
restore the targetcli config -> start qemu again.
in qemu:

mount /dev/sda /mnt;

root@(none)$ rm test
[   12.900540] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 3s
[   12.908844] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 3s
[   12.911154] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

The error happens immediately when removing the files that were
created the first time.
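
One thing worth ruling out is dirty data still sitting in a cache when
the first session ends. A minimal teardown sketch that forces
everything out before the reboot (assuming the guest disk is /dev/sda
mounted at /mnt, and the iblock backstore is host /dev/sda3):

# In the guest, before shutting down: flush and cleanly detach.
sync
umount /mnt

# On the host, after qemu exits: flush the block layer's buffers for
# the backing partition as well.
blockdev --flushbufs /dev/sda3

If the corruption goes away with this sequence, the data is being lost
on the qemu/vhost close path rather than by ext4 itself.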

Thanks


On Sun, Jun 12, 2016 at 11:23 AM, Zhangfei Gao <zhangfei@gmail.com> wrote:
> Here is a question about testing virtio-scsi & vhost-scsi.
> I hit ext4 errors using either fileio or iblock backstores.
> After the error, the filesystem cannot be remounted in the guest OS
> without running mkfs.ext4 again.
>
> Any suggestions?
> Thanks in advance.
>
>
> Basic steps.
> fileio:
> mount /dev/sda3 /mnt
> dd if=/dev/zero of=test bs=1M count=1024
>
>
> #targetcli
>
> (targetcli) /> cd backstores/fileio
>
> (targetcli) /> create name=file_backend file_or_dev=/mnt/test size=1G
>
> (targetcli) /> cd /vhost
>
> (targetcli) /> create wwn=naa.60014052cc816bf4
>
> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
>
> (targetcli) /> create /backstores/fileio/file_backend
>
> (targetcli) /> cd /
>
> (targetcli) /> saveconfig
>
> (targetcli) /> exit
>
> qemu.git/aarch64-softmmu/qemu-system-aarch64 \
>
>-enable-kvm -nographic -kernel Image \
>
>-device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
>
>-m 512 -M virt -cpu host \
>
>-append "earlyprintk console=ttyAMA0 mem=512M rw"
>
>
> After guest kernel is boot,
>
> mkfs.ext4 /dev/sda
>
> mount /dev/sda /mnt
>
>
> sync; date; dd if=/dev/zero of=test bs=1M count=100; sync; date;
>
>
> Ext4 error:
>
> And the filesystem cannot be mounted next time.
>
> [  762.387457] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.395622] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.403915] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.412263] EXT4-fs error (device sda) in ext4_ext_truncate:4661:
> Corrupt filesystem
>
> [  762.420613] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.428913] EXT4-fs error (device sda) in ext4_orphan_del:2896:
> Corrupt filesystem
>
> [  762.437262] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.445614] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.454516] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  762.462283] EXT4-fs error (device sda) in
> ext4_reserve_inode_write:5172: Corrupt filesystem
>
> [  767.370571] jbd2_journal_bmap: journal block not found at offset 13 on 
> sda-8
>
> [  767.371458] Aborting journal on device sda-8.
>
> [  767.395583] EXT4-fs error: 564 callbacks suppressed
>
> [  767.396173] EXT4-fs error (device sda) in ext4_da_write_end:2841: IO 
> failure
>
> [  767.412221] EXT4-fs error (device sda):
> ext4_journal_check_start:56: Detected aborted journal
>
> [  767.413325] EXT4-fs (sda): Remounting filesystem read-only
>
> dd: writing '/mnt/test.bin': Read-only file system
>
>
> blockio:
>
> # targetcli
>
> /> 

[Qemu-devel] ext4 error when testing virtio-scsi & vhost-scsi

2016-06-11 Thread Zhangfei Gao
Here is a question about testing virtio-scsi & vhost-scsi.
I hit ext4 errors using either fileio or iblock backstores.
After the error, the filesystem cannot be remounted in the guest OS
without running mkfs.ext4 again.

Any suggestions?
Thanks in advance.


Basic steps.
fileio:
mount /dev/sda3 /mnt
dd if=/dev/zero of=test bs=1M count=1024


#targetcli

(targetcli) /> cd backstores/fileio

(targetcli) /> create name=file_backend file_or_dev=/mnt/test size=1G

(targetcli) /> cd /vhost

(targetcli) /> create wwn=naa.60014052cc816bf4

(targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns

(targetcli) /> create /backstores/fileio/file_backend

(targetcli) /> cd /

(targetcli) /> saveconfig

(targetcli) /> exit
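
For repeatability, the same setup can also be scripted non-interactively
(a sketch assuming a targetcli version that accepts the command as its
arguments, as targetcli-fb does; the write_back=false parameter is an
assumption worth experimenting with, since it asks the fileio backstore
to write through instead of caching dirty data in host memory):

# Same backstore and vhost target as the interactive session above,
# but created from a script and with write-through semantics.
targetcli /backstores/fileio create name=file_backend file_or_dev=/mnt/test size=1G write_back=false
targetcli /vhost create wwn=naa.60014052cc816bf4
targetcli /vhost/naa.60014052cc816bf4/tpgt1/luns create /backstores/fileio/file_backend
targetcli saveconfig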

qemu.git/aarch64-softmmu/qemu-system-aarch64 \

   -enable-kvm -nographic -kernel Image \

   -device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \

   -m 512 -M virt -cpu host \

   -append "earlyprintk console=ttyAMA0 mem=512M rw"


After guest kernel is boot,

mkfs.ext4 /dev/sda

mount /dev/sda /mnt


sync; date; dd if=/dev/zero of=test bs=1M count=100; sync; date;


Ext4 error:

And the filesystem cannot be mounted next time.

[  762.387457] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.395622] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.403915] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.412263] EXT4-fs error (device sda) in ext4_ext_truncate:4661:
Corrupt filesystem

[  762.420613] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.428913] EXT4-fs error (device sda) in ext4_orphan_del:2896:
Corrupt filesystem

[  762.437262] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.445614] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.454516] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  762.462283] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[  767.370571] jbd2_journal_bmap: journal block not found at offset 13 on sda-8

[  767.371458] Aborting journal on device sda-8.

[  767.395583] EXT4-fs error: 564 callbacks suppressed

[  767.396173] EXT4-fs error (device sda) in ext4_da_write_end:2841: IO failure

[  767.412221] EXT4-fs error (device sda):
ext4_journal_check_start:56: Detected aborted journal

[  767.413325] EXT4-fs (sda): Remounting filesystem read-only

dd: writing '/mnt/test.bin': Read-only file system
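
To take the guest page cache out of the picture, the write test can
also ask dd itself to flush before reporting success (a small variation
on the test above; conv=fsync is standard GNU dd):

# Write 100 MB, then fsync the output file before dd exits, so success
# means the data reached the virtio disk rather than just the page cache.
dd if=/dev/zero of=/mnt/test bs=1M count=100 conv=fsync

If this run completes cleanly but the filesystem is still corrupt after
a reboot, the data is being dropped below the guest filesystem, in the
vhost-scsi or backstore layer.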


blockio:

# targetcli

/> cd backstores/iblock

/backstores/iblock> create name=block_backend dev=/dev/sda4

/backstores/iblock> cd /vhost

/vhost> create

/vhost> ls

o- vhost  [1 Target]

 o- naa.60014053c5cc00ac .. [1 TPG]

   o- tpg1 . [naa.6001405830beacfa]

 o- luns . [0 LUNs]

/vhost> cd naa.60014053c5cc00ac/tpg1/luns

/vhost/naa.60...0ac/tpg1/luns> create /backstores/iblock/block_backend

/vhost/naa.60...0ac/tpg1/luns> cd /

/> saveconfig

qemu.git/aarch64-softmmu/qemu-system-aarch64 \

   -enable-kvm -nographic -kernel Image \

   -device vhost-scsi-pci,wwpn=naa.60014053c5cc00ac \

   -m 512 -M virt -cpu host \

   -append "earlyprintk console=ttyAMA0 mem=512M"


mount /dev/sda /mnt

sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;


sync; date; sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100;

Thu Jan  1 00:01:16 UTC 1970

[   77.044879] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.067334] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.075623] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.083970] EXT4-fs error (device sda) in ext4_ext_truncate:4661:
Corrupt filesystem

[   77.092322] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.100619] EXT4-fs error (device sda) in ext4_orphan_del:2896:
Corrupt filesystem

[   77.108971] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.117321] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.126204] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem

[   77.133989] EXT4-fs error (device sda) in
ext4_reserve_inode_write:5172: Corrupt filesystem


[   82.025630] jbd2_journal_bmap: journal block not found at offset 10 on sda-8

[   82.026522] Aborting journal on device sda-8.

[   82.050642] EXT4-fs error: 563 callbacks suppressed

[   82.051278] EXT4-fs error (device sda) in ext4_da_write_end:2841: IO failure

[   82.067283] EXT4-fs error (device sda):
ext4_journal_check_start:56: Detected aborted journal

[   82.068372] EXT4-fs (sda): Remounting filesystem read-only
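
For completeness: once the journal has aborted like this, a full mkfs
is not always required. A recovery sketch for test images (assuming the
iblock backing device is host /dev/sda4 as above and the filesystem is
unmounted everywhere; data loss is still possible, so this is only for
throwaway data):

# Drop the broken journal, repair the filesystem proper, then
# recreate a fresh journal.
tune2fs -O ^has_journal /dev/sda4
e2fsck -fy /dev/sda4
tune2fs -j /dev/sda4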