Re: [XenPPC] Re: Automated reliability report for SMP patch on JS2x
Even though I proposed this, I would like to withdraw the request. We have identified a machine, a JS20 model 884241X, on which Xen with the SMP patch does not boot (Kawachiya-san's machine). It would be useful to know which kinds of JS20 blades have successfully booted Xen with the SMP patch. In Yorktown we have booted successfully only on JS20 model 21X.

Maria Butrico wrote:
> Jimi, the problem with this approach is that as changes are made to the Xen code, you have no idea if they make the SMP situation better or worse. If you introduce a bug that is visible only with SMP, or is more likely to happen when running MP, you don't find out until someone picks up your code and applies the SMP patch.
>
> Jimi Xenidis wrote:
>> On Oct 3, 2006, at 12:25 PM, Maria Butrico wrote:
>>> What's really interesting to me about this is that the invocation of the icache invalidation did not go in till later.
>>
>> But it did include the I/D cache flush of text. The i-cache invalidate you speak of requires running DomUs.
>>
>>> So if anything we could find this to be even more reliable once the other changes are also picked up.
>>
>> Not much has happened that would affect boot and ssh to dom0.
>>
>>> I missed this: what is transient?
>>>
>>> I would like to suggest that the SMP patch be applied to the base, and that in those cases where we know that SMP fails, like on Maples, we use the nosmp option.
>>
>> I'm still not prepared to take the SMP patch. The I-cache invalidate fix has improved the situation on Maple, but not enough to convince me that there are no more troubles waiting to pounce.
>> -JX

___
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
Re: [XenPPC] Re: Automated reliability report for SMP patch on JS2x
On Oct 3, 2006, at 12:25 PM, Maria Butrico wrote:
> What's really interesting to me about this is that the invocation of the icache invalidation did not go in till later.

But it did include the I/D cache flush of text. The i-cache invalidate you speak of requires running DomUs.

> So if anything we could find this to be even more reliable once the other changes are also picked up.

Not much has happened that would affect boot and ssh to dom0.

> I missed this: what is transient?
>
> I would like to suggest that the SMP patch be applied to the base, and that in those cases where we know that SMP fails, like on Maples, we use the nosmp option.

I'm still not prepared to take the SMP patch. The I-cache invalidate fix has improved the situation on Maple, but not enough to convince me that there are no more troubles waiting to pounce.
-JX
Re: [XenPPC] Re: Automated reliability report for SMP patch on JS2x
On Tue, Oct 03, 2006 at 12:25:33PM -0400, Maria Butrico wrote:
> What's really interesting to me about this is that the invocation of the icache invalidation did not go in till later. So if anything we could find this to be even more reliable once the other changes are also picked up.

The key commit from my perspective was the flush of the icache on secondary processor spinup. That made SMP spinup on JS2x quite solid, in my experiments. The function you are talking about is relevant when destroying a domain and loading a new one, I believe. Note that the tests reported in this email only create domains, because I knew that the invocation you are talking about had not gone in yet.

> I missed this: what is transient?

Transient is how my tool classifies things like the network going down for five minutes at 3:00 a.m., which causes TFTP to fail.
Re: [XenPPC] Re: Automated reliability report for SMP patch on JS2x
Jimi, the problem with this approach is that as changes are made to the Xen code, you have no idea if they make the SMP situation better or worse. If you introduce a bug that is visible only with SMP, or is more likely to happen when running MP, you don't find out until someone picks up your code and applies the SMP patch.

Jimi Xenidis wrote:
> On Oct 3, 2006, at 12:25 PM, Maria Butrico wrote:
>> What's really interesting to me about this is that the invocation of the icache invalidation did not go in till later.
>
> But it did include the I/D cache flush of text. The i-cache invalidate you speak of requires running DomUs.
>
>> So if anything we could find this to be even more reliable once the other changes are also picked up.
>
> Not much has happened that would affect boot and ssh to dom0.
>
>> I missed this: what is transient?
>>
>> I would like to suggest that the SMP patch be applied to the base, and that in those cases where we know that SMP fails, like on Maples, we use the nosmp option.
>
> I'm still not prepared to take the SMP patch. The I-cache invalidate fix has improved the situation on Maple, but not enough to convince me that there are no more troubles waiting to pounce.
> -JX
Re: [XenPPC] Re: Automated reliability report for SMP patch on JS2x
On Oct 3, 2006, at 3:21 PM, Maria Butrico wrote:
> Jimi, the problem with this approach is that as changes are made to the Xen code, you have no idea if they make the SMP situation better or worse. If you introduce a bug that is visible only with SMP, or is more likely to happen when running MP, you don't find out until someone picks up your code and applies the SMP patch.

This is exactly the reason why we won't make SMP the default. I'll try my best not to break it, and yes, I do use the SMP patch to test bigger commits, but it's important for all to realize that the maintainers of XenPPC do _not_ consider SMP stable. It's just a fact.
-JX

> Jimi Xenidis wrote:
>> On Oct 3, 2006, at 12:25 PM, Maria Butrico wrote:
>>> What's really interesting to me about this is that the invocation of the icache invalidation did not go in till later.
>>
>> But it did include the I/D cache flush of text. The i-cache invalidate you speak of requires running DomUs.
>>
>>> So if anything we could find this to be even more reliable once the other changes are also picked up.
>>
>> Not much has happened that would affect boot and ssh to dom0.
>>
>>> I missed this: what is transient?
>>>
>>> I would like to suggest that the SMP patch be applied to the base, and that in those cases where we know that SMP fails, like on Maples, we use the nosmp option.
>>
>> I'm still not prepared to take the SMP patch. The I-cache invalidate fix has improved the situation on Maple, but not enough to convince me that there are no more troubles waiting to pounce.
>> -JX