Re: [RFC PATCH 01/10] s390/pci: base support

2012-12-18 Thread Sebastian Ott


On Thu, 13 Dec 2012, Bjorn Helgaas wrote:

> On Thu, Dec 13, 2012 at 4:51 AM, Jan Glauber  wrote:
> > On Mon, 2012-12-10 at 14:14 -0700, Bjorn Helgaas wrote:
> >> On Wed, Nov 14, 2012 at 2:41 AM, Jan Glauber  
> >> wrote:
> >> > Add PCI support for s390 (only 64-bit mode is supported by the hardware):
> >> > - PCI facility tests
> >> > - PCI instructions: pcilg, pcistg, pcistb, stpcifc, mpcifc, rpcit
> >> > - map readb/w/l/q and writeb/w/l/q to pcilg and pcistg instructions
> >> > - pci_iomap implementation
> >> > - memcpy_fromio/toio
> >> > - pci_root_ops using special pcilg/pcistg
> >> > - device, bus and domain allocation
> >> >
> >> > Signed-off-by: Jan Glauber 
> >>
> >> I think these patches are in -next already, so I just have some
> >> general comments & questions.
> >
> > Yes, since the feedback was manageable we decided to give the patches
> > some exposure in -next and if no one complains we'll just go for the
> > next merge window. BTW, Sebastian & Gerald (on CC:) will continue the
> > work on the PCI code.
> >
> >> My overall impression is that these are exceptionally well done.
> >> They're easy to read, well organized, and well documented.  It's a
> >> refreshing change from a lot of the stuff that's posted.
> >
> > Thanks Björn!
> >
> >> As with other arches that run on top of hypervisors, you have
> >> arch-specific code that enumerates PCI devices using hypervisor calls,
> >> and you hook into the PCI core at a lower level than
> >> pci_scan_root_bus().  That is, you call pci_create_root_bus(),
> >> pci_scan_single_device(), pci_bus_add_devices(), etc., directly from
> >> the arch code.  This is the typical approach, but it does make more
> >> dependencies between the arch code and the PCI core than I'd like.
> >>
> >> Did you consider hiding any of the hypervisor details behind the PCI
> >> config accessor interface?  If that could be done, the overall
> >> structure might be more similar to other architectures.
> >
> > You mean pci_root_ops? I'm not sure I understand how that can be used
> > to hide hypervisor details.
> 
> The object of doing this would be to let you use pci_scan_root_bus(),
> so you wouldn't have to call pci_scan_single_device() and
> pci_bus_add_resources() directly.  The idea is to make pci_read() and
> pci_write() smart enough that the PCI core can use them as though you
> have a normal PCI implementation.  When pci_scan_root_bus() enumerates
> devices on the root bus using pci_scan_child_bus(), it does config
> reads on the VENDOR_ID of bb:00.0, bb:01.0, ..., bb:1f.0.  Your
> pci_read() would return error or 0xffffffff data for everything except
> bb:00.0 (I guess it actually already does that).
> 
> Then some of the other init, e.g., zpci_enable_device(), could be done
> by the standard pcibios hooks such as pcibios_add_device() and
> pcibios_enable_device(), which would remove some of the PCI grunge
> from zpci_scan_devices() and the s390 hotplug driver.
> 
> > One reason why we use the lower level
> > functions is that we need to create the root bus (and its resources)
> > much earlier than the pci_dev. For instance, pci_hp_register wants a
> > pci_bus to create the PCI slot, and the slot can exist without a pci_dev.
> 
> There must be something else going on here; a bus is *always* created
> before any of the pci_devs on the bus.
> 
> One thing that looks a little strange is that zpci_list seems to be
> sort of a cross between a list of PCI host bridges and a list of PCI
> devices.  Understandable, since you usually have a one-to-one
> correspondence between them, but for your hotplug stuff, you can have
> the host bridge and slot without the pci_dev.
> 
> The hotplug slot support is really a function of the host bridge
> support, so it doesn't seem quite right to me to split it into a
> separate module (although that is the way most PCI hotplug drivers are
> currently structured).  One reason is that it makes hot-add of a host
> bridge awkward: you have to have the "if (hotplug_ops.create_slot)"
> test in zpci_create_device().
> 
> >> The current config accessors only work for dev/fn 00.0 (they fail when
> >> "devfn != ZPCI_DEVFN").  Why is that?  It looks like it precludes
> >> multi-function devices and basically prevents you from building an
> >> arbitrary PCI hierarchy.
> >
> > Our hypervisor does not support multi-function devices. In fact the
> > hypervisor will limit the reported PCI devices to a hand-picked
> > selection so we can be sure that there will be no unsupported devices.
> > The PCI hierarchy is hidden by the hypervisor. We only use the PCI
> > domain number; bus and devfn are always zero. So it looks like every
> > function is directly attached to a PCI root complex.
> >
> > That was the reason for the sanity check, but thinking about it, I could
> > remove it: although we don't support multi-function devices, I think
> > the s390 code should be more agnostic to these limitations.
> 
> The config accessor interface should be defined so it works for all
> PCI devices that exist, and fails for devices that do not exist.  The
> approach you've taken so far is to prevent the PCI 

Re: [RFC PATCH 01/10] s390/pci: base support

2012-12-13 Thread Bjorn Helgaas
On Thu, Dec 13, 2012 at 4:51 AM, Jan Glauber  wrote:
> On Mon, 2012-12-10 at 14:14 -0700, Bjorn Helgaas wrote:
>> On Wed, Nov 14, 2012 at 2:41 AM, Jan Glauber  wrote:
>> > Add PCI support for s390 (only 64-bit mode is supported by the hardware):
>> > - PCI facility tests
>> > - PCI instructions: pcilg, pcistg, pcistb, stpcifc, mpcifc, rpcit
>> > - map readb/w/l/q and writeb/w/l/q to pcilg and pcistg instructions
>> > - pci_iomap implementation
>> > - memcpy_fromio/toio
>> > - pci_root_ops using special pcilg/pcistg
>> > - device, bus and domain allocation
>> >
>> > Signed-off-by: Jan Glauber 
>>
>> I think these patches are in -next already, so I just have some
>> general comments & questions.
>
> Yes, since the feedback was manageable we decided to give the patches
> some exposure in -next and if no one complains we'll just go for the
> next merge window. BTW, Sebastian & Gerald (on CC:) will continue the
> work on the PCI code.
>
>> My overall impression is that these are exceptionally well done.
>> They're easy to read, well organized, and well documented.  It's a
>> refreshing change from a lot of the stuff that's posted.
>
> Thanks Björn!
>
>> As with other arches that run on top of hypervisors, you have
>> arch-specific code that enumerates PCI devices using hypervisor calls,
>> and you hook into the PCI core at a lower level than
>> pci_scan_root_bus().  That is, you call pci_create_root_bus(),
>> pci_scan_single_device(), pci_bus_add_devices(), etc., directly from
>> the arch code.  This is the typical approach, but it does make more
>> dependencies between the arch code and the PCI core than I'd like.
>>
>> Did you consider hiding any of the hypervisor details behind the PCI
>> config accessor interface?  If that could be done, the overall
>> structure might be more similar to other architectures.
>
> You mean pci_root_ops? I'm not sure I understand how that can be used
> to hide hypervisor details.

The object of doing this would be to let you use pci_scan_root_bus(),
so you wouldn't have to call pci_scan_single_device() and
pci_bus_add_resources() directly.  The idea is to make pci_read() and
pci_write() smart enough that the PCI core can use them as though you
have a normal PCI implementation.  When pci_scan_root_bus() enumerates
devices on the root bus using pci_scan_child_bus(), it does config
reads on the VENDOR_ID of bb:00.0, bb:01.0, ..., bb:1f.0.  Your
pci_read() would return error or 0xffffffff data for everything except
bb:00.0 (I guess it actually already does that).
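
To make that contract concrete, here is a minimal sketch of such a config
read (not the actual s390 code; zpci_dev_for_bus() and zpci_cfg_load() are
assumed stand-ins for the bus lookup and the hypervisor call):

#include <linux/pci.h>

struct zpci_dev;					/* arch-private, assumed */
struct zpci_dev *zpci_dev_for_bus(struct pci_bus *bus);	/* assumed lookup */
int zpci_cfg_load(struct zpci_dev *zdev, int where, u32 *val, int size);

static int zpci_pci_read_sketch(struct pci_bus *bus, unsigned int devfn,
				int where, int size, u32 *val)
{
	struct zpci_dev *zdev = zpci_dev_for_bus(bus);

	/* Report "no device" for everything except function 00.0, so
	 * pci_scan_child_bus() can probe the whole bus safely. */
	if (!zdev || devfn != 0) {
		*val = 0xffffffff;
		return PCIBIOS_DEVICE_NOT_FOUND;
	}
	return zpci_cfg_load(zdev, where, val, size);
}

With that in place, pci_scan_root_bus() can drive the enumeration, and the
arch code no longer needs to call pci_scan_single_device() itself.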

Then some of the other init, e.g., zpci_enable_device(), could be done
by the standard pcibios hooks such as pcibios_add_device() and
pcibios_enable_device(), which would remove some of the PCI grunge
from zpci_scan_devices() and the s390 hotplug driver.
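
A hedged sketch of what that split could look like (pcibios_add_device()
and pcibios_enable_device() are the real hook names; to_zpci() and the
zpci_* helpers are assumed here):

#include <linux/pci.h>

struct zpci_dev;				/* arch-private, assumed */
struct zpci_dev *to_zpci(struct pci_dev *pdev);	/* assumed lookup */
int zpci_setup_device(struct zpci_dev *zdev);	/* assumed */
int zpci_enable_device(struct zpci_dev *zdev);	/* named in this thread */

/* Called by the PCI core for every new pci_dev: a natural home for the
 * per-device setup currently buried in zpci_scan_devices(). */
int pcibios_add_device(struct pci_dev *pdev)
{
	return zpci_setup_device(to_zpci(pdev));
}

int pcibios_enable_device(struct pci_dev *pdev, int mask)
{
	return zpci_enable_device(to_zpci(pdev));
}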

> One reason why we use the lower level
> functions is that we need to create the root bus (and its resources)
> much earlier than the pci_dev. For instance, pci_hp_register wants a
> pci_bus to create the PCI slot, and the slot can exist without a pci_dev.

There must be something else going on here; a bus is *always* created
before any of the pci_devs on the bus.

One thing that looks a little strange is that zpci_list seems to be
sort of a cross between a list of PCI host bridges and a list of PCI
devices.  Understandable, since you usually have a one-to-one
correspondence between them, but for your hotplug stuff, you can have
the host bridge and slot without the pci_dev.
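
In other words, something closer to a dedicated host-bridge list in which
the pci_dev is optional (a sketch; the structure and all field names are
assumed):

#include <linux/list.h>
#include <linux/pci.h>

/* One entry per host bridge/slot.  The pci_dev pointer stays NULL while
 * the slot is empty, so the bridge list no longer doubles as the device
 * list. */
struct zpci_bridge_sketch {
	struct list_head entry;		/* on zpci_host_list below */
	struct pci_bus	*bus;		/* created at bridge hot-add */
	struct pci_dev	*pdev;		/* NULL until a device appears */
};
static LIST_HEAD(zpci_host_list);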

The hotplug slot support is really a function of the host bridge
support, so it doesn't seem quite right to me to split it into a
separate module (although that is the way most PCI hotplug drivers are
currently structured).  One reason is that it makes hot-add of a host
bridge awkward: you have to have the "if (hotplug_ops.create_slot)"
test in zpci_create_device().
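
One possible shape for folding slot creation into the host bridge setup
(pci_hp_register() is the real interface; the zpci structure layout and
the ops table are assumed for illustration):

#include <linux/pci.h>
#include <linux/pci_hotplug.h>

extern struct hotplug_slot_ops zpci_hotplug_slot_ops;	/* assumed */

struct zpci_sketch {					/* assumed layout */
	struct hotplug_slot slot;
	struct hotplug_slot_info slot_info;
	struct pci_bus *bus;
	char slot_name[16];
};

static int zpci_init_slot_sketch(struct zpci_sketch *zdev)
{
	zdev->slot.ops	= &zpci_hotplug_slot_ops;
	zdev->slot.info	= &zdev->slot_info;

	/* Registered unconditionally while the bridge is set up, so
	 * hot-adding a bridge needs no "if (hotplug_ops.create_slot)". */
	return pci_hp_register(&zdev->slot, zdev->bus, 0, zdev->slot_name);
}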

>> The current config accessors only work for dev/fn 00.0 (they fail when
>> "devfn != ZPCI_DEVFN").  Why is that?  It looks like it precludes
>> multi-function devices and basically prevents you from building an
>> arbitrary PCI hierarchy.
>
> Our hypervisor does not support multi-function devices. In fact the
> hypervisor will limit the reported PCI devices to a hand-picked
> selection so we can be sure that there will be no unsupported devices.
> The PCI hierarchy is hidden by the hypervisor. We only use the PCI
> domain number; bus and devfn are always zero. So it looks like every
> function is directly attached to a PCI root complex.
>
> That was the reason for the sanity check, but thinking about it, I could
> remove it: although we don't support multi-function devices, I think
> the s390 code should be more agnostic to these limitations.

The config accessor interface should be defined so it works for all
PCI devices that exist, and fails for devices that do not exist.  The
approach you've taken so far is to prevent the PCI core from even
attempting access to non-existent devices, which requires you to use
some lower-level PCI interfaces.  The alternative I'm raising 

Re: [RFC PATCH 01/10] s390/pci: base support

2012-12-13 Thread Jan Glauber
On Mon, 2012-12-10 at 14:14 -0700, Bjorn Helgaas wrote:
> On Wed, Nov 14, 2012 at 2:41 AM, Jan Glauber  wrote:
> > Add PCI support for s390 (only 64-bit mode is supported by the hardware):
> > - PCI facility tests
> > - PCI instructions: pcilg, pcistg, pcistb, stpcifc, mpcifc, rpcit
> > - map readb/w/l/q and writeb/w/l/q to pcilg and pcistg instructions
> > - pci_iomap implementation
> > - memcpy_fromio/toio
> > - pci_root_ops using special pcilg/pcistg
> > - device, bus and domain allocation
> >
> > Signed-off-by: Jan Glauber 
> 
> I think these patches are in -next already, so I just have some
> general comments & questions.

Yes, since the feedback was manageable we decided to give the patches
some exposure in -next and if no one complains we'll just go for the
next merge window. BTW, Sebastian & Gerald (on CC:) will continue the
work on the PCI code.

> My overall impression is that these are exceptionally well done.
> They're easy to read, well organized, and well documented.  It's a
> refreshing change from a lot of the stuff that's posted.

Thanks Björn!

> As with other arches that run on top of hypervisors, you have
> arch-specific code that enumerates PCI devices using hypervisor calls,
> and you hook into the PCI core at a lower level than
> pci_scan_root_bus().  That is, you call pci_create_root_bus(),
> pci_scan_single_device(), pci_bus_add_devices(), etc., directly from
> the arch code.  This is the typical approach, but it does make more
> dependencies between the arch code and the PCI core than I'd like.
> 
> Did you consider hiding any of the hypervisor details behind the PCI
> config accessor interface?  If that could be done, the overall
> structure might be more similar to other architectures.

You mean pci_root_ops? I'm not sure I understand how that can be used
to hide hypervisor details. One reason why we use the lower level
functions is that we need to create the root bus (and its resources)
much earlier than the pci_dev. For instance, pci_hp_register wants a
pci_bus to create the PCI slot, and the slot can exist without a pci_dev.
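
For illustration, the ordering being described here, using the generic
core interfaces (pci_create_root_bus(), pci_add_resource(), and
pci_hp_register() are real; the zpci structure, pci_root_ops linkage, and
slot setup details are assumed):

#include <linux/errno.h>
#include <linux/pci.h>
#include <linux/pci_hotplug.h>

extern struct pci_ops pci_root_ops;		/* named in this thread */

struct zpci_sketch {				/* assumed layout */
	struct resource mem_resource;
	struct hotplug_slot slot;		/* ops/info setup omitted */
	char slot_name[16];
};

static int zpci_bus_then_slot_sketch(struct zpci_sketch *zdev)
{
	LIST_HEAD(resources);
	struct pci_bus *bus;

	pci_add_resource(&resources, &zdev->mem_resource);
	bus = pci_create_root_bus(NULL, 0, &pci_root_ops, zdev, &resources);
	if (!bus)
		return -ENOMEM;

	/* The root bus exists now; the slot can be registered long
	 * before any pci_dev shows up on the bus. */
	return pci_hp_register(&zdev->slot, bus, 0, zdev->slot_name);
}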

> The current config accessors only work for dev/fn 00.0 (they fail when
> "devfn != ZPCI_DEVFN").  Why is that?  It looks like it precludes
> multi-function devices and basically prevents you from building an
> arbitrary PCI hierarchy.

Our hypervisor does not support multi-function devices. In fact the
hypervisor will limit the reported PCI devices to a hand-picked
selection so we can be sure that there will be no unsupported devices.
The PCI hierarchy is hidden by the hypervisor. We only use the PCI
domain number; bus and devfn are always zero. So it looks like every
function is directly attached to a PCI root complex.

That was the reason for the sanity check, but thinking about it, I could
remove it: although we don't support multi-function devices, I think
the s390 code should be more agnostic to these limitations.

> zpci_map_resources() is very unusual.  The struct pci_dev resource[]
> table normally contains CPU physical addresses, but
> zpci_map_resources() fills it with virtual addresses.  I suspect this
> has something to do with the "BAR spaces are not disjunctive on s390"
> comment.  It almost sounds like that's describing host bridges where
> the PCI bus address is not equal to the CPU physical address -- in
> that case, device A and device B may have the same values in their
> BARs, but they can still be distinct if they're under host bridges
> that apply different CPU-to-bus address translations.

Yeah, you've found it... I've had 3 or 4 tries at different
implementations, but all failed. If we use the resources as they are, we
cannot map them to the instructions (and ioremap does not help, because
there we cannot find out which device the resource belongs to). If we
change the BARs on the card, MMIO stops working. I don't know about host
bridges; if we were to use a host bridge, at which point in the
translation process would it kick in?
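
For reference: in the generic PCI core, per-host-bridge CPU-to-bus
translation kicks in when the root bus is created.
pci_add_resource_offset() records the offset, and the core applies it
whenever it converts between struct resource values (CPU addresses) and
BAR contents (bus addresses), e.g. in pcibios_resource_to_bus().  A
sketch with made-up window addresses:

#include <linux/ioport.h>
#include <linux/pci.h>

extern struct pci_ops pci_root_ops;		/* named in this thread */

static struct resource zpci_mmio_sketch = {	/* assumed CPU window */
	.start	= 0x100000000UL,
	.end	= 0x1ffffffffUL,
	.flags	= IORESOURCE_MEM,
};

static void zpci_root_bus_sketch(void *sysdata)
{
	LIST_HEAD(resources);

	/* offset = CPU address - PCI bus address: the card keeps its BAR
	 * at bus address 0x80000000 while the CPU uses 0x100000000. */
	pci_add_resource_offset(&resources, &zpci_mmio_sketch,
				0x100000000UL - 0x80000000UL);
	pci_create_root_bus(NULL, 0, &pci_root_ops, sysdata, &resources);
}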

Jan

> Bjorn
> 



Re: [RFC PATCH 01/10] s390/pci: base support

2012-12-10 Thread Bjorn Helgaas
On Wed, Nov 14, 2012 at 2:41 AM, Jan Glauber  wrote:
> Add PCI support for s390 (only 64-bit mode is supported by the hardware):
> - PCI facility tests
> - PCI instructions: pcilg, pcistg, pcistb, stpcifc, mpcifc, rpcit
> - map readb/w/l/q and writeb/w/l/q to pcilg and pcistg instructions
> - pci_iomap implementation
> - memcpy_fromio/toio
> - pci_root_ops using special pcilg/pcistg
> - device, bus and domain allocation
>
> Signed-off-by: Jan Glauber 
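
For background on the "map readb/w/l/q and writeb/w/l/q to pcilg and
pcistg" item in the quoted description, a hedged sketch of the shape of
such a mapping (zpci_load() is an assumed wrapper around the pcilg
instruction; the real code uses inline assembly and a per-device address
encoding):

#include <linux/types.h>

int zpci_load(u64 *data, const volatile void __iomem *addr, int len);	/* assumed */

static inline u32 zpci_readl_sketch(const volatile void __iomem *addr)
{
	u64 data;

	/* pcilg can load up to 8 bytes from PCI space; readl wants the
	 * 4 bytes at the given address. */
	zpci_load(&data, addr, 4);
	return (u32)data;
}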

I think these patches are in -next already, so I just have some
general comments & questions.

My overall impression is that these are exceptionally well done.
They're easy to read, well organized, and well documented.  It's a
refreshing change from a lot of the stuff that's posted.

As with other arches that run on top of hypervisors, you have
arch-specific code that enumerates PCI devices using hypervisor calls,
and you hook into the PCI core at a lower level than
pci_scan_root_bus().  That is, you call pci_create_root_bus(),
pci_scan_single_device(), pci_bus_add_devices(), etc., directly from
the arch code.  This is the typical approach, but it does make more
dependencies between the arch code and the PCI core than I'd like.

Did you consider hiding any of the hypervisor details behind the PCI
config accessor interface?  If that could be done, the overall
structure might be more similar to other architectures.

The current config accessors only work for dev/fn 00.0 (they fail when
"devfn != ZPCI_DEVFN").  Why is that?  It looks like it precludes
multi-function devices and basically prevents you from building an
arbitrary PCI hierarchy.

zpci_map_resources() is very unusual.  The struct pci_dev resource[]
table normally contains CPU physical addresses, but
zpci_map_resources() fills it with virtual addresses.  I suspect this
has something to do with the "BAR spaces are not disjunctive on s390"
comment.  It almost sounds like that's describing host bridges where
the PCI bus address is not equal to the CPU physical address -- in
that case, device A and device B may have the same values in their
BARs, but they can still be distinct if they're under host bridges
that apply different CPU-to-bus address translations.

Bjorn