Re: [Xen-ia64-devel] [PATCH] Fix some IPF Xen VT-d bugs

2009-01-04 Thread Isaku Yamahata
Hi. Sorry for the delayed reply.

On Thu, Dec 25, 2008 at 10:14:09PM +0800, Cui, Dexuan wrote:
> Isaku Yamahata wrote:
> > On Wed, Dec 24, 2008 at 01:11:03PM +0800, Cui, Dexuan wrote:
> >> Isaku Yamahata wrote:
> >>>> diff -r 008b68ff6095 xen/arch/ia64/xen/domain.c
> >>>> --- a/xen/arch/ia64/xen/domain.c Tue Nov 18 10:33:55 2008 +0900
> >>>> +++ b/xen/arch/ia64/xen/domain.c Mon Dec 15 18:41:52 2008 +0800
> >>>> @@ -602,10 +602,8 @@ int arch_domain_create(struct domain *d,
> >>>>      if ((d->arch.mm.pgd = pgd_alloc(&d->arch.mm)) == NULL)
> >>>>          goto fail_nomem;
> >>>>
> >>>> -    if ( iommu_enabled && (is_hvm_domain(d) || need_iommu(d)) ){
> >>>> -        if(iommu_domain_init(d) != 0)
> >>>> -            goto fail_iommu;
> >>>> -    }
> >>>> +    if(iommu_domain_init(d) != 0)
> >>>> +        goto fail_iommu;
> >>>>
> >>>>      /*
> >>>>       * grant_table_create() can't fully initialize grant table for domain
> >>> 
> >>> Please don't drop is_hvm_domain(d) check.
> >>> At this moment ia64 doesn't support iommu for PV domain because
> >> Oh, thanks for the reminder. Here I neglected this.
> >> 
> >> Do you mean this:
> >> if ( is_hvm_domain(d) )
> >> if(iommu_domain_init(d) != 0)
> >> goto fail_iommu;
> >> This is also not ok since we must ensure iommu_domain_init() is
> >> invoked for Dom0 -- we need the function invoked to enable DMA
> >> remapping.  
> >> 
> >> So how about changing the logic to:
> >> if ( (d->domain_id == 0) || is_hvm_domain(d) )
> >> if(iommu_domain_init(d) != 0)
> >> goto fail_iommu;
> >> 
> >> If you agree this, I'll post a new patch.
> > 
> > Do you mean if ( d->domain_id == 0 ) clause in
> > the function, intel_iommu_domain_init()?
> Yes. 
> 
> > Is iommu map/unmap for dom0 necessary?
> >   intel_iommu_domain_init() maps all the pages except the ones xen uses
> >   to dom0. I suppose this is what you want.
> Yes.
> When Dom0 boots up, we assign all the devices to it, so it needs the 1:1 VT-d 
> pagetables mapping.
> 
> >   However, later pages are mapped/unmapped even for dom0 because
> I suppose you mean the balloon driver and the grant table operations. Correct?

That's right.


> >   need_iommu(dom0) returns true due to iommu_domain_init(dom0).
> >   Since dom0 is PV, iommu mapping/unmapping causes a race on ia64.
> In the cases of balloon and grant table, would the iommu mapping/unmapping 
> cause a race on IA64?
> Sorry, I know little about the lockless p2m table now. I'm trying to understand 
> more.

Yes. That is why the first ia64 VT-d patches don't enable VT-d
for PV domains: they don't call iommu_domain_init() for them.
On x86, p2m_lock/unlock() avoids the race, but ia64 doesn't have such
a lock.
At this moment, only HVM domains would be supported.
The issue is the dom0 case. I suppose it can be supported by mapping
all the pages except xen pages at boot time and then skipping iommu
mapping/unmapping, because those pages are already mapped to dom0
by intel_iommu_domain_init().
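
The policy described above can be sketched roughly as follows. This is only an illustration and not the actual Xen code: `struct domain_info`, `should_init_iommu()` and `should_update_iommu_mapping()` are hypothetical names made up for this sketch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for Xen's struct domain. */
struct domain_info {
    unsigned int domain_id;  /* 0 means dom0 */
    bool is_hvm;             /* true for HVM guests, false for PV */
};

/* IOMMU tables are set up at creation time for dom0 and HVM domains only. */
static bool should_init_iommu(bool iommu_enabled, const struct domain_info *d)
{
    if (!iommu_enabled)
        return false;
    return d->domain_id == 0 || d->is_hvm;
}

/*
 * Later map/unmap requests (balloon, grant table) touch the IOMMU tables
 * only for HVM domains. dom0 is PV and already has everything except Xen's
 * own pages mapped at boot, so skipping it sidesteps the race on ia64.
 */
static bool should_update_iommu_mapping(bool iommu_enabled,
                                        const struct domain_info *d)
{
    return iommu_enabled && d->is_hvm;
}
```

With this split, dom0 gets its 1:1 tables once at creation and is never touched again, while other PV domains get no IOMMU setup at all.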


> >   Only setting up iommu tables at the dom0 creation is necessary,
> Could you please explain more about this? I can't get the point.
> 
> >   all "if ( iommu_enabled && (is_hvm_domain(d) || need_iommu(d)) )"
> >   would be "if ( iommu_enabled && is_hvm_domain(d) && need_iommu(d) )"
> Am I missing something?
> #define need_iommu(d)    ((d)->need_iommu && !(d)->is_hvm)
> So,
> iommu_enabled && is_hvm_domain(d) && need_iommu(d)
> is undoubtedly false. :-)

Ah sorry. I missed d->is_hvm. Please forget this sentence.
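
For reference, a small self-contained check of the point made in the quoted text. The `need_iommu()` macro is copied from the quote; the reduced `struct domain` and the `is_hvm_domain()` definition are assumptions made for this sketch, and `hvm_and_need_iommu()` is a hypothetical helper.

```c
#include <stdbool.h>

/* Reduced stand-in for struct domain with only the two fields the macros read. */
struct domain {
    bool need_iommu;
    bool is_hvm;
};

/* Assumed definition for this sketch. */
#define is_hvm_domain(d) ((d)->is_hvm)
/* As quoted in the thread. */
#define need_iommu(d)    ((d)->need_iommu && !(d)->is_hvm)

/* The combined condition from the proposal; it can never be true. */
static bool hvm_and_need_iommu(const struct domain *d)
{
    return is_hvm_domain(d) && need_iommu(d);
}
```

Because `need_iommu()` already requires `!is_hvm`, the conjunction is false for all four combinations of the two flags.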


> > intel_iommu_domain_init() and dom0 memory size
> >   calc_dom0_size() in xen/arch/ia64/domain.c calculates the default dom0
> >   memory size. You should take memory for the iommu page table
> >   into account because the memory size for the iommu page table wouldn't
> >   be negligible.
> >   probably iommu_pages = (max phys addr) / PTRS_PER_PTE_4K + (some
> >   spare) where PTRS_PER_PTE_4K = (1 << (PAGE_SHIFT_4K - 3))
> Now, in intel_iommu_domain_init(), with respect to iommu mapping, Xen maps 
> all the pages for Dom0 except for the pages used by Xen itself.
> Do you mean xen should only map the pages actually owned by Dom0?  -- for 
> instance, are you saying xen should not map the iommu page tables? -- since in 
> Dom0 drivers normally don't touch iommu pagetables at all, it looks like the 
> current code is OK?

No. I meant that calc_dom0_size() should be updated.
It calculates the maximum memory size which can be passed to dom0 safely.
Without dom0_mem_size, the Xen VMM tries to give dom0 the maximum memory size,
which is a common use case.

On the other hand, it isn't uncommon for an ia64 machine to have several
hundred gigabytes of memory, so the memory size for the VT-d table would reach
tens or hundreds of megabytes, which isn't negligible compared to the xen heap
size. The memory for the VT-d table should be taken into account
in calc_dom0_size().
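
As a rough worked example of the estimate, here is a sketch under stated assumptions: "(max phys addr)" in the formula quoted earlier is read as a count of 4K pages, only the leaf-level tables are counted, and `iommu_table_pages()` / `iommu_table_bytes()` are hypothetical helper names, not existing Xen functions.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K    12
/* 512 eight-byte entries fit in one 4K table page. */
#define PTRS_PER_PTE_4K  (1ULL << (PAGE_SHIFT_4K - 3))

/* Leaf-level table pages needed to map [0, max_phys_addr), plus spare. */
static uint64_t iommu_table_pages(uint64_t max_phys_addr, uint64_t spare_pages)
{
    uint64_t pages_4k = max_phys_addr >> PAGE_SHIFT_4K;
    uint64_t leaf = (pages_4k + PTRS_PER_PTE_4K - 1) / PTRS_PER_PTE_4K;
    return leaf + spare_pages;
}

/* Same estimate expressed in bytes. */
static uint64_t iommu_table_bytes(uint64_t max_phys_addr, uint64_t spare_pages)
{
    return iommu_table_pages(max_phys_addr, spare_pages) << PAGE_SHIFT_4K;
}
```

For 256 GiB of physical address space and no spare, this gives 131072 table pages, i.e. 512 MiB, which is in line with the "hundreds of megabytes" figure above.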


> > intel_iommu_domain_init() and sparse memory.
> >   To be honest, I'm not sure how i

Re: [Xen-ia64-devel] A patch to fix mis-setting ed bit for itlb entry.

2009-01-04 Thread Isaku Yamahata
applied, thanks.

On Sun, Jan 04, 2009 at 03:22:06PM +0800, Zhang, Xiantao wrote:
> Hi, Isaku 
> When debugging a Windows BSOD issue, we found it is caused by 
> mis-setting the pte's ED bit for an itlb entry.  The hash vTLB uses a unified 
> tlb and doesn't differentiate itc and dtc in its implementation, so the 
> itlb_miss handler may reference a dtlb entry in the hash vTLB.  But this may 
> result in issues, because the dtlb's ED bit may be different from the itlb's 
> setting.  Since the case is very rare, just purge the corresponding entry in 
> the hash vTLB and let the guest OS determine how to set the ED bit for the 
> itlb mapping once it is found. 
> Xiantao
> 
> Signed-off-by : Xiantao Zhang 
> 
> diff -r e97216802360 xen/arch/ia64/vmx/vtlb.c
> --- a/xen/arch/ia64/vmx/vtlb.c  Fri Dec 12 10:43:39 2008 +0900
> +++ b/xen/arch/ia64/vmx/vtlb.c  Sun Jan 04 10:43:19 2009 +0800
> @@ -678,11 +678,20 @@ thash_data_t *vtlb_lookup(VCPU *v, u64 v
>          cch = vtlb_thash(hcb->pta, va, vrr.rrval, &tag);
>          do {
>              if (cch->etag == tag && cch->ps == ps)
> -                return cch;
> +                goto found;
>              cch = cch->next;
>          } while(cch);
>      }
>      return NULL;
> +found:
> +    if (is_data == ISIDE_TLB && !cch->ed) {
> +        /* The case is very rare, and it may lead to incorrect setting
> +           for itlb's ed bit! Purge it from hash vTLB and let guest os
> +           determine the ed bit of the itlb entry. */
> +        vtlb_purge(v, va, ps);
> +        cch = NULL;
> +    }
> +    return cch;
>  }

> ___
> Xen-ia64-devel mailing list
> Xen-ia64-devel@lists.xensource.com
> http://lists.xensource.com/xen-ia64-devel
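
The control flow of the quoted patch can be sketched in isolation as below. This is a simplified model, not the Xen implementation: `struct vtlb_entry`, `enum tlb_side` and `lookup()` are hypothetical, and "purging" is modeled by clearing a `valid` flag rather than by calling `vtlb_purge()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced hash-chain entry; the real thash_data_t has more fields. */
struct vtlb_entry {
    uint64_t etag;
    unsigned int ps;
    unsigned int ed;          /* ED bit of the cached PTE */
    int valid;                /* cleared to model a purge */
    struct vtlb_entry *next;
};

enum tlb_side { DSIDE, ISIDE };  /* stand-ins for DSIDE_TLB / ISIDE_TLB */

/*
 * Walk the chain. On an instruction-side hit whose ED bit is clear, drop
 * the entry and report a miss so that the guest installs a proper itlb
 * mapping; otherwise return the hit as before.
 */
static struct vtlb_entry *lookup(struct vtlb_entry *chain, uint64_t tag,
                                 unsigned int ps, enum tlb_side side)
{
    struct vtlb_entry *e;

    for (e = chain; e != NULL; e = e->next) {
        if (!e->valid || e->etag != tag || e->ps != ps)
            continue;
        if (side == ISIDE && !e->ed) {
            e->valid = 0;     /* modeled purge */
            return NULL;
        }
        return e;
    }
    return NULL;
}
```

Data-side lookups are unaffected; only an instruction-side hit on an entry with `ed == 0` is turned into a miss.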

-- 
yamahata



Re: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Isaku Yamahata
On Mon, Jan 05, 2009 at 01:06:23PM +0800, Zhang, Xiantao wrote:
> Oh, I hadn't noticed the check-in because of my old codebase. It introduces many 
> odd issues for us.   Okay, it is also good to remove it. :)
> As for adopting the fast eoi path, it is okay with me.  Please check them in.  

Applied, thanks.
-- 
yamahata



RE: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Zhang, Xiantao

Isaku Yamahata wrote:
> On Mon, Jan 05, 2009 at 12:29:55PM +0800, Zhang, Xiantao wrote:
>> Isaku Yamahata wrote:
>>> Hi. Good catch. Some comments.
>>> I attached two patches to fix, could you try them?
>>> 
>>> - bss.page_aligned.
>>>   Where is the section used?
>>>   grep didn't tell me. Surely x86 uses .bss.page_aligned in
>>>   linux/arch/[i386, x86_64]/kernel/head[-xen].S,
>>>   but no files under linux/arch/ia64/ use it.
>> 
>> You may need to check drivers/xen/core/evtchn.c, the code as
>> following :-) 
>> Xiantao
>> 
>> static int pirq_eoi_does_unmask;
>> static DECLARE_BITMAP(pirq_needs_eoi, ALIGN(NR_PIRQS, PAGE_SIZE * 8))
>> __attribute__ ((__section__(".bss.page_aligned"),
>> __aligned__(PAGE_SIZE))); 
>> 
> 
> Ah, that line was deleted by changeset 760:0d10be086a78

Oh, I hadn't noticed the check-in because of my old codebase. It introduces many 
odd issues for us.   Okay, it is also good to remove it. :)
As for adopting the fast eoi path, it is okay with me.  Please check them in.  
Xiantao




Re: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Isaku Yamahata
On Mon, Jan 05, 2009 at 12:29:55PM +0800, Zhang, Xiantao wrote:
> Isaku Yamahata wrote:
> > Hi. Good catch. Some comments.
> > I attached two patches to fix, could you try them?
> > 
> > - bss.page_aligned.
> >   Where is the section used?
> >   grep didn't tell me. Surely x86 uses .bss.page_aligned in
> >   linux/arch/[i386, x86_64]/kernel/head[-xen].S,
> >   but no files under linux/arch/ia64/ use it.
> 
> You may need to check drivers/xen/core/evtchn.c, the code as following :-)
> Xiantao
> 
> static int pirq_eoi_does_unmask;
> static DECLARE_BITMAP(pirq_needs_eoi, ALIGN(NR_PIRQS, PAGE_SIZE * 8))
> __attribute__ ((__section__(".bss.page_aligned"), 
> __aligned__(PAGE_SIZE)));
> 

Ah, that line was deleted by changeset 760:0d10be086a78.

-- 
yamahata



RE: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Zhang, Xiantao
Isaku Yamahata wrote:
> Hi. Good catch. Some comments.
> I attached two patches to fix, could you try them?
> 
> - bss.page_aligned.
>   Where is the section used?
>   grep didn't tell me. Surely x86 uses .bss.page_aligned in
>   linux/arch/[i386, x86_64]/kernel/head[-xen].S,
>   but no files under linux/arch/ia64/ use it.

You may need to check drivers/xen/core/evtchn.c; the code is as follows :-)
Xiantao

static int pirq_eoi_does_unmask;
static DECLARE_BITMAP(pirq_needs_eoi, ALIGN(NR_PIRQS, PAGE_SIZE * 8))
__attribute__ ((__section__(".bss.page_aligned"), 
__aligned__(PAGE_SIZE)));
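
What the declaration above achieves can be sketched in plain C. This is an illustration only: `SKETCH_PAGE_SIZE` and `SKETCH_NR_PIRQS` are assumed values, the macros are simplified stand-ins for the kernel's `ALIGN` and `DECLARE_BITMAP`, and the `.bss.page_aligned` section placement is left out so that the snippet builds outside the kernel (it relies on the GCC/Clang `aligned` attribute).

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 16384UL  /* assumption: 16K pages */
#define SKETCH_NR_PIRQS  256UL    /* assumption: illustrative value only */

/* Simplified stand-ins for the kernel's ALIGN() and BITS_TO_LONGS(). */
#define ALIGN_UP(x, a)    ((((x) + (a) - 1) / (a)) * (a))
#define BITS_PER_ULONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_ULONGS(n) (((n) + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/*
 * Rounding the bit count up to PAGE_SIZE * 8 makes the bitmap fill whole
 * pages, and the alignment makes it start on a page boundary, so nothing
 * else shares the page that is handed to the hypervisor.
 */
static unsigned long pirq_needs_eoi[
    BITS_TO_ULONGS(ALIGN_UP(SKETCH_NR_PIRQS, SKETCH_PAGE_SIZE * 8))]
    __attribute__((aligned(SKETCH_PAGE_SIZE)));

static size_t bitmap_bytes(void)
{
    return sizeof(pirq_needs_eoi);
}

static int bitmap_is_page_aligned(void)
{
    return ((uintptr_t)pirq_needs_eoi % SKETCH_PAGE_SIZE) == 0;
}
```

With the assumed values the bitmap occupies exactly one page, even though only 256 bits are meaningful.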



> - ia64_fast_eoi.
>   I suppose ia64_fast_eoi is used for optimization instead of
>   PHYSDEVOP_eoi. I'm not sure how much improvement it provides,
>   though. Anyway ia64_fast_eoi hypercall implementation should also
>   be updated which I overlooked when I added PHYSDEVOP_pirq_eoi_gmfn
> support. 
> 
> thanks,
> 

Re: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Isaku Yamahata
Hi. Good catch. Some comments.
I attached two patches to fix this; could you try them?

- bss.page_aligned.
  Where is the section used?
  grep didn't tell me. Surely x86 uses .bss.page_aligned in
  linux/arch/[i386, x86_64]/kernel/head[-xen].S,
  but no files under linux/arch/ia64/ use it.

- ia64_fast_eoi.
  I suppose ia64_fast_eoi is used as an optimization instead of
  PHYSDEVOP_eoi. I'm not sure how much improvement it provides, though.
  Anyway, the ia64_fast_eoi hypercall implementation should also be updated,
  which I overlooked when I added PHYSDEVOP_pirq_eoi_gmfn support.

thanks,

On Sun, Jan 04, 2009 at 06:05:07PM +0800, Zhang, Xiantao wrote:
> Hi, Isaku & All
> The attached patch should fix the weird issue.  In upstream, we also found 
> some other weird issues; for example, we can't boot dom0 on some platforms, 
> and dom0 may behave differently with different initrds.  After debugging, I 
> found it is likely caused by an incorrect setup of the pirq_needs_eoi page.  
> There are two main issues found during debugging: 
> 1.  The two related hypercalls are not enabled in the correct way, so dom0 
> and the hypervisor don't agree on which pirq needs EOI. 
> 2.  The page is not really linked into the bss section even though this is a 
> must, so the kernel treats it as memory cache and uses it for many purposes, 
> which finally leads to various issues. 
> Thanks 
> Xiantao
> 
> 
> 

RE: [Xen-ia64-devel] [Test Report] Xen/IPF Unstable CS#18860 Status --- Dom0 Crash

2009-01-04 Thread Zhang, Xiantao
Hi, Isaku & All
The attached patch should fix the weird issue.  In upstream, we also found 
some other weird issues; for example, we can't boot dom0 on some platforms, and 
dom0 may behave differently with different initrds.  After debugging, I found 
it is likely caused by an incorrect setup of the pirq_needs_eoi page.  There are 
two main issues found during debugging: 
1.  The two related hypercalls are not enabled in the correct way, so dom0 and 
the hypervisor don't agree on which pirq needs EOI. 
2.  The page is not really linked into the bss section even though this is a 
must, so the kernel treats it as memory cache and uses it for many purposes, 
which finally leads to various issues. 
Thanks 
Xiantao
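
To illustrate issue 1, here is a minimal sketch of a shared "needs EOI" bitmap. The helper names `set_needs_eoi()` and `pirq_needs_eoi_bit()` are hypothetical, and the real interface between dom0 and the hypervisor involves hypercalls that are not modeled here.

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Writer side (hypervisor in the real setup): mark a pirq as needing EOI. */
static void set_needs_eoi(unsigned long *bitmap, unsigned int pirq)
{
    bitmap[pirq / BITS_PER_WORD] |= 1UL << (pirq % BITS_PER_WORD);
}

/* Reader side (dom0 in the real setup): should an EOI be issued for this pirq? */
static bool pirq_needs_eoi_bit(const unsigned long *bitmap, unsigned int pirq)
{
    return (bitmap[pirq / BITS_PER_WORD] >> (pirq % BITS_PER_WORD)) & 1UL;
}
```

If the two sides do not operate on the same page, for example because the page was never registered correctly, the reader sees stale or unrelated bits and the two sides disagree on which pirq needs EOI.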



You, Yongkang wrote:
>> I tried 2048M (and other values), but I wasn't able to reproduce it.
>> Hmm, does it reproduce with "dom0_mem=2048M" on all boxes which you
>> tested?
> 
> Isaku/All,
> 
> This issue is really very hard to locate. Now I am a little
> suspicious that it is related to the build process, because if I change
> the build method, this issue is gone too.  
> 
> 1, It doesn't happen on all machines. But it can be reliably reproduced
> on our nightly test machine with the same binary. 2, When the system
> crashes, dom0_mem is set to 2048M. If another memory size is used,
> this issue disappears too. 3, It seems to have appeared between dom0
> changesets 743~753, as it works if we use an old build of the Dom0 kernel +
> new Xen. And the old nightly testing doesn't have the issue. 4, When I tried
> to do regression testing between 743~753, I found that different build
> methods might cause crashing and non-crashing.
> 
> In our default build process, as stubdomain is not supported on
> IA64, we removed install-stubdom and dist-stubdom from the "install:"
> and "dist:" lines in the main Makefile. It has been this way for more than 2
> months. The real compile command is "make -j3 >xyz_file". And the
> crashing issue is related to the build process.
> 
> When I do regression testing, sometimes I don't change the Makefile, but
> still use "make -j3". Then the crash is gone. 
> 
> I am not sure if my suspicion is plausible, as it still needs more
> testing. Compiling Dom0 is not as easy as Xen. It is costly. I will
> try to do more, but maybe not so quickly, as many other things need
> to be done at the same time. If the default compilation is okay, do you
> think it is worth investigating further?
> 
> Any suggestion will be much appreciated.
> 
> Best Regards,
> Yongkang You
> 
> On Tuesday, December 16, 2008 10:22 AM, "Isaku Yamahata" wrote:
> 
>> On Tue, Dec 09, 2008 at 05:56:25PM +0800, You, Yongkang wrote:
>>> On Monday, December 08, 2008 2:10 PM, "Isaku Yamahata" wrote:
>>> 
>>>> On Mon, Dec 08, 2008 at 01:52:38PM +0800, Zhang, Jingke wrote:
>>>>> Isaku Yamahata wrote:
>>>>>> On Mon, Dec 08, 2008 at 11:31:15AM +0800, Zhang, Jingke wrote:
>>>>>>> Hi Isaku,
>>>>>>> We re-get the detail information from serial port, please
>>>>>>> see below. Two comments add:
>>>>>> 
>>>>>> Thank you.
>>>>>> 
>>>>>> 
>>>>>>> 1. We can be sure the Cset#18832 works well on the same
>>>>>>> tiger4 machine. But we did not do regression test between 18832
>>>>>>> and this 18860. 
>>>>>>> 2. It is strange that on another Tiger4 box, dom0 will NOT
>>>>>>> crash. Do you have any idea from the serial log? Thanks!
>>>>>> 
>>>>>> I haven't hit this crash. And Kuwamura-san's test seems that
>>>>>> he haven't hit it either. Kuwamura-san, is it correct?
>>>>>> Hmm... it seems to depend on hw configuration?
>>>>>> I'm inclined to suspect masking/unmasking interruption race.
>>>>>> event channel issues? But that's just only my very vague guess.
>>>>>> 
>>>>>> The difference between 18832 and 18860 means the merging
>>>>>> xen-unstable into xen-ia64-unstable. Looking the log, I suspect
>>>>>> linux-2.6.18-xen instead of xen.
>>>>>> Could you provide the linux c/s which corresponds to 18832 and
>>>>>> 18860?
>>>>> 
>>>>> 
>>>>> Hi Isaku,
>>>>> Yes, some of our machines do not crash. I am afraid there may
>>>>> be some potential issue. By testing 18832, we use linux#742.
>>>>> While 18860 uses linux#753. Thanks!
>>>> 
>>>> Thank you. Taking rough look at them those change sets doesn't seem
>>>> culprit. I agree with you that this may indicate some potential
>>>> bugs...
>>> 
>>> Hi All,
>>> 
>>> This bug is stably reproduced, if providing "dom0_mem=2048M" in
>>> append option. And if setting dom0_mem to 1024M or 4096M, the
>>> crashing doesn't happen. 
>>> 
>>> We tried #18869 Xen + #742 Dom0, system is okay. So the problem
>>> might be in Linux tree between #742~#753
>> 
>> I tried 2048M (and other values), but I wasn't able to reproduce it.
>> Hmm, does it reproduce with "dom0_mem=2048M" on all boxes which you
>> tested? 
>> 
>> thanks,
> 



fix_pirq_eoi_page.patch
Description: fix_pirq_eoi_page.patch