Le 27 oct. 2015 5:34 PM, "Eric Snowberg" <eric.snowb...@oracle.com> a
écrit :
>
>
> > On Oct 25, 2015, at 9:20 AM, Vladimir 'φ-coder/phcoder' Serbinenko <
phco...@gmail.com> wrote:
> >
> > On 11.10.2015 18:49, Eric Snowberg wrote:
> >>
> >>> On Oct 10, 2015, at 1:35 AM, Vladimir 'phcoder' Serbinenko <
phco...@gmail.com> wrote:
> >>>
> >>>
> >>> Le 10 oct. 2015 3:31 AM, "Eric Snowberg" <eric.snowb...@oracle.com> a
écrit :
> >>>>
> >>>>
> >>>>> On Oct 9, 2015, at 12:50 AM, Andrei Borzenkov <arvidj...@gmail.com>
wrote:
> >>>>>
> >>>>> On Thu, Oct 8, 2015 at 2:20 AM, Eric Snowberg <
eric.snowb...@oracle.com> wrote:
> >>>>>>
> >>>>>>> On Oct 7, 2015, at 2:36 AM, Andrei Borzenkov <arvidj...@gmail.com>
wrote:
> >>>>>>>
> >>>>>>> On Tue, Oct 6, 2015 at 8:39 PM, Eric Snowberg <
eric.snowb...@oracle.com> wrote:
> >>>>>>>> Keep of devices open.  This can save 6 - 7 seconds per open call
and
> >>>>>>>> can decrease boot times from over 10 minutes to 29 seconds on
> >>>>>>>> larger SPARC systems.  The open/close calls with some vendors'
> >>>>>>>> SAS controllers cause the entire card to be reinitialized after
> >>>>>>>> each close.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Is it necessary to close these handles before launching kernel?
> >>>>>>
> >>>>>> On SPARC hardware it is not necessary.  I would be interested to
hear if it is necessary on other IEEE1275 platforms.
> >>>>>>
> >>>>>>> It sounds like it can accumulate quite a lot of them - are there
any
> >>>>>>> memory limits/restrictions? Also your patch is rather generic and
so
> >>>>>>> applies to any IEEE1275 platform, I think some of them may have
less
> >>>>>>> resources. Just trying to understand what run-time cost is.
> >>>>>>
> >>>>>> The most I have seen are two entries in the linked list.  So the
additional memory requirements are very small.  I have tried this on a
T2000, T4, and T5.
> >>>>>
> >>>>> I thought you were speaking about *larger* SPARC servers :)
> >>>>>
> >>>>> Anyway I'd still prefer if this would be explicitly restricted to
> >>>>> Oracle servers. Please look at
> >>>>> grub-core/kern/ieee1275/cmain.c:grub_ieee1275_find_options() if new
> >>>>> quirk can be added to detect your platform(s).
> >>>>>
> >>>>
> >>>> That makes sense, I’ll restrict it to all sun4v equipment.
> >>>>
> >>> Not good enough. Some of sun4v probably have the bug I told about in
the other e-mail
> >>
> >> I can get access to every sun4v platform.  Could you provide any
additional information on which sun4v systems had this failure.  If so,
I’ll try to submit a fix for that problem as well.  It seems like there
could be a better way to handle this than resetting the SAS or SATA HBA
before each read operation.
> >>
> >>
> > See http://permalink.gmane.org/gmane.comp.boot-loaders.grub.devel/10533
> > for holding open handle.
>
> The patch I submitted will prevent the same device from being opened
multiple times concurrently.   Just as the link above describes. The only
time this will not be the case is when grub references the same device name
differently.  For example I have seen on one system where it alternates
between the SAS WWN and the typical OpenBoot alias for the drive name.  I
intend on fixing this problem.  But so far with this system OpenBoot has
been able to handle it.
>
I tried to fix this some time ago. Unfortunately I didn't find a way to
reliably determine if 2 disks are the same. Canon doesn't add part after @
if they are not there. Comparing phandle considers devices on the same bus
as same device. How are you going to fix it?
Another question: can we have ihandle caching logic in ofdisk rather than
deep in open logic?
> The problem with the last_ihandle caching within ofdisk.c is that the
last_devpath pointer changes with each file open call.  So there really
isn’t much benefit, since the device will always be closed and reopened,
since the disk->data pointer changes, even though it contains the same
string.  So on a typical boot, there are around 33 open and close calls
taking place.  This causes the SAS controller to be restarted 33 times.
With the patch I submitted, all the last_ihandle caching, could be removed.
>
Sounds like we can just replace comparing pointers with strcmp
> > Do you readily have a scenario when you need multiple handles open?
>
> The other problem my patch solves is where on some systems, grub will
walk the entire device tree and open every single device, multiple times. I
do not understand why it does this on some systems, but when it does, it
takes 20+ minutes before the grub menu appears.  With the patch I
submitted, the menu will appear in under 29 seconds on all sun4v systems.
>
It does so to resolve UUID. It usually means that hints have failed or
disks changed after last grub-install/mkconfig
> I have made a V2 patch with the changes Andrei recommended, where the
code will only be used on sun4v platforms.  I have tested it on almost
every generation of the sun4v platform.  I still have a few platforms to
go.   Please let me know how you want me to proceed with this V2 patch if
this testing looks promising.
>
Yes, definitely.
> > Can you try following naive optimisation:
> > diff --git a/grub-core/disk/ieee1275/ofdisk.c
> > b/grub-core/disk/ieee1275/ofdisk.c
> > index 331769b..34237f9 100644
> > --- a/grub-core/disk/ieee1275/ofdisk.c
> > +++ b/grub-core/disk/ieee1275/ofdisk.c
> > @@ -448,13 +448,6 @@ grub_ofdisk_open (const char *name, grub_disk_t
disk)
> > static void
> > grub_ofdisk_close (grub_disk_t disk)
> > {
> > -  if (disk->data == last_devpath)
> > -    {
> > -      if (last_ihandle)
> > -       grub_ieee1275_close (last_ihandle);
> > -      last_ihandle = 0;
> > -      last_devpath = NULL;
> > -    }
> >   disk->data = 0;
> > }
>
> This optimization will not work.  It will cause the same device node to
be opened multiple times concurrently, which can lead to problems on some
sun4v systems.
>
>
> >
> >> There are many problems with the upstream grub2 implementation for
sun4v.  I will be submitting additional follow on patches, since it
currently will not even boot to the grub menu.  My intention is to get
grub2 working on every sun4v platform.
> >>
> > Could you start with
>
> I think something got cut off here.
>
I meant bug fixes can go in without waiting for this
> >>
> >>>>>
> >>>>>>
> >>>>>> The path name sent into the grub_ieee1275_open function is not
consistent throughout the code, even though it is referencing the same
device.  Originally I tried having a global containing the path sent into
grub_ieee1275_open, but this failed due to the various ways of referencing
the same device name.  That is why I added the linked list just to be
safe.  Caching the grub_ieee1275_ihandle_t this way saves a significant
amount of time, since OF calls are expensive.  We were seeing the same
device opened 50+ times in the process of displaying the grub menu and then
launching the kernel.
> >>>>>>
> >>>>>>>
> >>>>>>>> Signed-off-by: Eric Snowberg <eric.snowb...@oracle.com>
> >>>>>>>> ---
> >>>>>>>> grub-core/kern/ieee1275/ieee1275.c |   39
++++++++++++++++++++++++++++++++++++
> >>>>>>>> 1 files changed, 39 insertions(+), 0 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/grub-core/kern/ieee1275/ieee1275.c
b/grub-core/kern/ieee1275/ieee1275.c
> >>>>>>>> index 9821702..30f973b 100644
> >>>>>>>> --- a/grub-core/kern/ieee1275/ieee1275.c
> >>>>>>>> +++ b/grub-core/kern/ieee1275/ieee1275.c
> >>>>>>>> @@ -19,11 +19,24 @@
> >>>>>>>>
> >>>>>>>> #include <grub/ieee1275/ieee1275.h>
> >>>>>>>> #include <grub/types.h>
> >>>>>>>> +#include <grub/misc.h>
> >>>>>>>> +#include <grub/list.h>
> >>>>>>>> +#include <grub/mm.h>
> >>>>>>>>
> >>>>>>>> #define IEEE1275_PHANDLE_INVALID  ((grub_ieee1275_cell_t) -1)
> >>>>>>>> #define IEEE1275_IHANDLE_INVALID  ((grub_ieee1275_cell_t) 0)
> >>>>>>>> #define IEEE1275_CELL_INVALID     ((grub_ieee1275_cell_t) -1)
> >>>>>>>>
> >>>>>>>> +struct grub_of_opened_device
> >>>>>>>> +{
> >>>>>>>> +  struct grub_of_opened_device *next;
> >>>>>>>> +  struct grub_of_opened_device **prev;
> >>>>>>>> +  grub_ieee1275_ihandle_t ihandle;
> >>>>>>>> +  char *path;
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +static struct grub_of_opened_device *grub_of_opened_devices;
> >>>>>>>> +
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> int
> >>>>>>>> @@ -452,6 +465,18 @@ grub_ieee1275_open (const char *path,
grub_ieee1275_ihandle_t *result)
> >>>>>>>> }
> >>>>>>>> args;
> >>>>>>>>
> >>>>>>>> +  struct grub_of_opened_device *dev;
> >>>>>>>> +
> >>>>>>>> +  FOR_LIST_ELEMENTS(dev, grub_of_opened_devices)
> >>>>>>>> +    if (grub_strcmp(dev->path, path) == 0)
> >>>>>>>> +      break;
> >>>>>>>> +
> >>>>>>>> +  if (dev)
> >>>>>>>> +  {
> >>>>>>>> +    *result = dev->ihandle;
> >>>>>>>> +    return 0;
> >>>>>>>> +  }
> >>>>>>>> +
> >>>>>>>> INIT_IEEE1275_COMMON (&args.common, "open", 1, 1);
> >>>>>>>> args.path = (grub_ieee1275_cell_t) path;
> >>>>>>>>
> >>>>>>>> @@ -460,6 +485,11 @@ grub_ieee1275_open (const char *path,
grub_ieee1275_ihandle_t *result)
> >>>>>>>> *result = args.result;
> >>>>>>>> if (args.result == IEEE1275_IHANDLE_INVALID)
> >>>>>>>>   return -1;
> >>>>>>>> +
> >>>>>>>> +  dev = grub_zalloc(sizeof(struct grub_of_opened_device));
> >>>>>>>
> >>>>>>> Error check
> >>>>>>
> >>>>>> I’ll resubmit V2 with this change
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>> +  dev->path = grub_strdup(path);
> >>>>>>>
> >>>>>>> Ditto
> >>>>>>>
> >>>>>>
> >>>>>> I’ll resubmit V2 with this change
> >>>>>>
> >>>>>>
> >>>>>>>> +  dev->ihandle = args.result;
> >>>>>>>> +  grub_list_push(GRUB_AS_LIST_P (&grub_of_opened_devices),
GRUB_AS_LIST (dev));
> >>>>>>>> return 0;
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> @@ -473,6 +503,15 @@ grub_ieee1275_close
(grub_ieee1275_ihandle_t ihandle)
> >>>>>>>> }
> >>>>>>>> args;
> >>>>>>>>
> >>>>>>>> +  struct grub_of_opened_device *dev;
> >>>>>>>> +
> >>>>>>>> +  FOR_LIST_ELEMENTS(dev, grub_of_opened_devices)
> >>>>>>>> +    if (dev->ihandle == ihandle)
> >>>>>>>> +      break;
> >>>>>>>> +
> >>>>>>>> +  if (dev)
> >>>>>>>> +    return 0;
> >>>>>>>> +
> >>>>>>>
> >>>>>>> How can we come here (i.e. can we get open handle without
grub_ieee1275_open)?
> >>>>>>>
> >>>>>>
> >>>>>> I don’t see it being possible with SPARC and would assume other
platforms could not get there either. I’m not familiar with the other
IEEE1275 platforms to know if this would be appropriate, but If there isn’t
a platform that needs this close function, could the function be changed to
do nothing but return 0?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>> INIT_IEEE1275_COMMON (&args.common, "close", 1, 0);
> >>>>>>>> args.ihandle = ihandle;
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 1.7.1
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Grub-devel mailing list
> >>>>>>>> Grub-devel@gnu.org
> >>>>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Grub-devel mailing list
> >>>>>>> Grub-devel@gnu.org
> >>>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Grub-devel mailing list
> >>>>>> Grub-devel@gnu.org
> >>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>>>>
> >>>>> _______________________________________________
> >>>>> Grub-devel mailing list
> >>>>> Grub-devel@gnu.org
> >>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Grub-devel mailing list
> >>>> Grub-devel@gnu.org
> >>>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>> _______________________________________________
> >>> Grub-devel mailing list
> >>> Grub-devel@gnu.org
> >>> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>
> >>
> >> _______________________________________________
> >> Grub-devel mailing list
> >> Grub-devel@gnu.org
> >> https://lists.gnu.org/mailman/listinfo/grub-devel
> >>
> >
> >
> > _______________________________________________
> > Grub-devel mailing list
> > Grub-devel@gnu.org
> > https://lists.gnu.org/mailman/listinfo/grub-devel
>
>
> _______________________________________________
> Grub-devel mailing list
> Grub-devel@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel
_______________________________________________
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel

Reply via email to