Re: IA64: copying /proc/vmcore caused kernel MCA'ed

2008-09-09 Thread Bernhard Walle
* Jay Lan [EMAIL PROTECTED] [2008-09-08]: 

> Any input helping me speed up debugging is appreciated.

I would start with comparing the ELF program headers of /proc/vmcore
which you get with readelf -l /proc/vmcore in kdump environment and
the /proc/iomem which kexec uses to set up the ELF core headers.

If both contain the memory regions which should not be accessed,
then it's a bug in the resource assignment of the kernel, and
if /proc/iomem does not contain them, the kexec tool has a bug.
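
The check described above could be sketched roughly as follows. This is only an illustration: the helper names (parse_iomem, diagnose) are made up here, and the LOAD segments are assumed to be copied by hand from the PhysAddr and MemSiz columns of the readelf -l output rather than parsed automatically.

```python
import re

# One /proc/iomem entry looks like "00100000-7fffffff : System RAM";
# nested entries are indented, so leading whitespace is allowed.
IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def parse_iomem(text):
    """Return (start, end, name) tuples; end is inclusive, as in /proc/iomem."""
    ranges = []
    for line in text.splitlines():
        m = IOMEM_LINE.match(line)
        if m:
            ranges.append((int(m.group(1), 16), int(m.group(2), 16),
                           m.group(3).strip()))
    return ranges


def in_system_ram(addr, iomem_ranges):
    """True if addr lies inside an entry labelled "System RAM"."""
    return any(start <= addr <= end and name == "System RAM"
               for start, end, name in iomem_ranges)


def in_vmcore(addr, load_segments):
    """load_segments: (p_paddr, p_memsz) pairs taken from the LOAD rows."""
    return any(paddr <= addr < paddr + memsz
               for paddr, memsz in load_segments)


def diagnose(addr, load_segments, iomem_ranges):
    """Apply the two-way decision above to one address that must not be read."""
    dumped = in_vmcore(addr, load_segments)
    ram = in_system_ram(addr, iomem_ranges)
    if dumped and ram:
        return "kernel resource assignment"
    if dumped and not ram:
        return "kexec tool"
    return "address not exported by /proc/vmcore"
```

Calling diagnose() with the faulting physical address, the LOAD segments seen in the kdump environment, and the /proc/iomem text of the first kernel then points at one side or the other.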

Hope that helps a bit, I don't have time to look deeper into that now.


Bernhard

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH] Use gzip for manpage compression

2008-09-09 Thread Bernhard Walle
Use gzip instead of zip for manual page compression. That's the usual
method of compressing manual pages on Unix.


Signed-off-by: Bernhard Walle [EMAIL PROTECTED]


1 file changed, 1 insertion(+), 1 deletion(-)
Makefile |2 +-


diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -31,7 +31,7 @@
 
 makedumpfile: $(SRC) $(OBJ_ARCH)
 	$(CC) $(CFLAGS) $(OBJ_ARCH) -o $@ $< -static -ldw -lelf -lz
-	zip ./makedumpfile.8.gz ./makedumpfile.8
+	gzip -c ./makedumpfile.8 > ./makedumpfile.8.gz
 
 clean:
 	rm -f $(OBJ) $(OBJ_ARCH) makedumpfile makedumpfile.8.gz


Re: IA64: copying /proc/vmcore caused kernel MCA'ed

2008-09-09 Thread Jay Lan
Bernhard Walle wrote:
> * Jay Lan [EMAIL PROTECTED] [2008-09-08]:
>
>> Any input helping me speed up debugging is appreciated.
>
> I would start with comparing the ELF program headers of /proc/vmcore
> which you get with readelf -l /proc/vmcore in kdump environment and
> the /proc/iomem which kexec uses to set up the ELF core headers.
>
> If both contain the memory regions which should not be accessed,
> then it's a bug in the resource assignment of the kernel, and
> if /proc/iomem does not contain them, the kexec tool has a bug.

Hi Bernhard,

I talked to Jack Steiner about this problem. He said:
  The memory at 0xe0600010 is part of Altix fetchop space
  (AKA mspec). The memory supports only uncached attributes. A normal
  cached access may cause MCAs.

  The kernel should not be using this memory for anything. Only the
  fetchop driver is supposed to access this area.

  Note: /proc/iomem shows the memory as System RAM but that does NOT
  mean that it can be accessed w/o special code. See efi.c for the code
  that prints the iomem info. Maybe efi.c needs to be changed to show
  a different name for the fetchop memory so that kdump will work.

I still need to understand why it did not cause a problem before.

Thanks,
jay

 
 


Re: Trying to test my gart/iommu vmcore problem on RH

2008-09-09 Thread Bob Montgomery
On Fri, 2008-09-05 at 15:12 +, Vivek Goyal wrote:

> Nice summary Bob. Few thoughts.
>
> - So until and unless one is reserving memory for crashkernel above 4G,
> there is no need for initializing the IOMMU in second kernel (At this
> moment I am not too worried about need of isolation in second kernel). If
> that's the case, we shouldn't have initialized the calgary iommu in second
> kernel and just should have left it alone and things probably would have
> been fine?

Did you ever try booting the kdump kernel with iommu=off?  That should
have prevented the detection and initialization of the calgary iommu
from the kdump kernel, which (I ass-u-me) is what you had problems with,
if leftover IOs from the crashed kernel were still in progress when the
kdump kernel initialized it?

As for 4G, the info in 2.6.26 Documentation/x86_64/boot-options.txt
mentions 3G as the decision for using an iommu in several places.  The
test in our kernel for using the GART IOMMU is (end_pfn > MAX_DMA32_PFN)
where MAX_DMA32_PFN is:
#define MAX_DMA32_PFN ((4UL*1024*1024*1024) >> PAGE_SHIFT)
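
To put numbers on that test, here is a small userspace sketch of the arithmetic. It reads the comparison as end_pfn greater than MAX_DMA32_PFN and the macro as a right shift by PAGE_SHIFT, and it assumes 4 KiB pages (PAGE_SHIFT = 12); the function name wants_gart_iommu is made up for the illustration and is not a kernel symbol.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages

# Page frame number of the 4 GiB boundary, mirroring the macro above.
MAX_DMA32_PFN = (4 * 1024 * 1024 * 1024) >> PAGE_SHIFT


def wants_gart_iommu(end_pfn):
    """True when the highest page frame lies above the 4 GiB boundary."""
    return end_pfn > MAX_DMA32_PFN


def pfn_of(phys_addr):
    """Convert a physical address to its page frame number."""
    return phys_addr >> PAGE_SHIFT
```

With these assumptions MAX_DMA32_PFN is 0x100000, so memory ending at 3 GiB does not trigger the test while memory ending at 8 GiB does. The 3G figure from boot-options.txt is not modelled here.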

 
> The only issue is that how do you make sure that first kernel has not
> setup an IOMMU entry with same bus address which falls in crash kernel
> reserved area. I am not very familiar with the dma/iommu code and how
> bus addresses are selected. Because if there is bus address overlap in
> first kernel and second kernel, IOMMU will trap the second kernel's DMA
> attempts and redirect it somewhere else. If we don't run into this issue
> then it is fine otherwise we will be forced to use IOMMU in second kernel
> and try to find free bus addresses/entries so that we don't conflict with
> the first kernel's settings.

I finally thought about this long enough to figure out what you meant, I
think :-). If the existence of a real IOMMU in the first kernel
allows the use of completely virtualized addresses on the IO side, then
there's no reason that they would have to avoid real RAM addresses when
handing out addresses to IO cards with that IOMMU.  

In particular, our little joke about allocating the crash kernel under
the GART aperture (with CPU-side access turned off) would prevent the
kdump kernel from doing non-iommu IO to the crash kernel address range,
because the still-active GART from the old kernel would grab any of
those addresses coming in from the IO side.

The kdump kernel wouldn't be in danger of being overwritten, it just
might not be able to set up any IOs that work to its own address space
if an IOMMU is out there waiting to grab them.

For the calgary case, we'd maybe have to add the Crash Kernel range to
the list of things sent to iommu_range_reserve in
calgary_reserve_regions, to prevent those addresses from ever being
given out.
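
As a toy model of that idea (not the Calgary code): a table with one flag per bus-address page, where reserving the crash kernel range up front means the allocator can never hand those bus addresses to a device. The class, its methods and the 4 KiB page size are all assumptions made for the illustration.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB IOMMU pages


class ToyIommuTable:
    """Toy stand-in for an IOMMU table: one 'taken' flag per bus-address page."""

    def __init__(self, npages):
        self.used = [False] * npages

    def range_reserve(self, start_addr, npages):
        """Mark npages starting at start_addr as taken so alloc() skips them."""
        first = start_addr >> PAGE_SHIFT
        for i in range(first, min(first + npages, len(self.used))):
            self.used[i] = True

    def alloc(self):
        """Return the bus address of the first free page, or None if full."""
        for i, taken in enumerate(self.used):
            if not taken:
                self.used[i] = True
                return i << PAGE_SHIFT
        return None


def reserve_crash_kernel(table, crash_start, crash_end):
    """Reserve every page touched by [crash_start, crash_end], end inclusive."""
    first = crash_start >> PAGE_SHIFT
    last = crash_end >> PAGE_SHIFT
    table.range_reserve(first << PAGE_SHIFT, last - first + 1)
```

In this model, once the crash kernel range is reserved, no later alloc() returns a bus address inside it, so the kdump kernel's own DMA to that range would not collide with a mapping left over from the first kernel.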

Bob M. 


