Re: [2.6.22] negative time jump

2007-07-30 Thread Vasily Averin
john stultz wrote:
> On 7/29/07, Vasily Averin <[EMAIL PROTECTED]> wrote:
>> I've investigated why my testnode freezes. When I found that node is freezed
>> again I've started to press Sysrq keys and noticed the following negative 
>> time jump.
>>
>> Could anybody please help me to understand the reasons of this issue?
>>
>> --- VvS comment: some pre-history: node boot
>> Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 
>> 3.4.6
>> 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
>> Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
>> console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc
> 
> clocksource=tsc?
> 
> I suspect you're forcing the clocksource as its not selected by
> default. Could you provide dmesg output without that option. It might
> shed some light as to why the clocksource isn't chosen.

Default clocksource was acpi-pm. But the node have similar behavior when I've
used this clocksource. (please look at the following logs)

Originally I've investigated SATA-related issue and noticed some strange with
timers. When I've reproduced situation again I've pressed Alt+Sysrq+Q keys and
noticed that it shows incorrect time (it shows 431968 sec from booting but
according to the serial console timestamps it should be ~445954 sec). Then you
can see that time was jumped back, next timestamp is 431965 sec, 3 sec back.

> It may very well be your TSCs are not synched or are otherwise not
> reliable for timekeeping , and thus time is not consistent between
> cpus.

However IMHO it cannot explain time loops (~4400 sec) that I'm observing right 
now.

thank you,
Vasily Averin

PS. You can look at the other logs and find more details in my attachments to
http://bugzilla.kernel.org/show_bug.cgi?id=8650

Jul 12 09:00:16 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #4 SMP Thu Jul 12 07:58:27 MSD 2007
Jul 12 09:00:16 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpi=noirq
Jul 12 09:00:16 ts28 BIOS-provided physical RAM map:
Jul 12 09:00:16 ts28  BIOS-e820:  - 0009d800 (usable)
Jul 12 09:00:16 ts28  BIOS-e820: 0009d800 - 000a (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: 000f - 0010 (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: 0010 - dfee (usable)
Jul 12 09:00:16 ts28  BIOS-e820: dfee - dfee3000 (ACPI NVS)
Jul 12 09:00:16 ts28  BIOS-e820: dfee3000 - dfef (ACPI data)
Jul 12 09:00:16 ts28  BIOS-e820: dfef - dff0 (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: fec0 - 0001 (reserved)
Jul 12 09:00:16 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 12 09:00:16 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 12 09:00:16 ts28 end_pfn_map = 1048576
Jul 12 09:00:16 ts28 DMI 2.3 present.
Jul 12 09:00:16 ts28 ACPI: RSDP 000F6980, 0014 (r0 VIAK8 )
Jul 12 09:00:16 ts28 ACPI: RSDT DFEE3000, 002C (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 ACPI: FACP DFEE3040, 0074 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 ACPI: DSDT DFEE30C0, 4A4C (r1 VIAK8  AWRDACPI 1000 MSFT  10E)
Jul 12 09:00:16 ts28 ACPI: FACS DFEE, 0040
Jul 12 09:00:16 ts28 ACPI: APIC DFEE7B40, 0068 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 Scanning NUMA topology in Northbridge 24
Jul 12 09:00:16 ts28 Number of nodes 2
Jul 12 09:00:16 ts28 Node 0 MemBase  Limit dfee
Jul 12 09:00:16 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 12 09:00:16 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 12 09:00:16 ts28 Skipping disabled node 1
Jul 12 09:00:16 ts28 NUMA: Using 63 for the hash shift.
Jul 12 09:00:16 ts28 Using node hash shift of 63
Jul 12 09:00:16 ts28 Bootmem setup node 0 -dfee
Jul 12 09:00:16 ts28 Zone PFN ranges:
Jul 12 09:00:16 ts28   DMA 0 -> 4096
Jul 12 09:00:16 ts28   DMA324096 ->  1048576
Jul 12 09:00:16 ts28   Normal1048576 ->  1048576
Jul 12 09:00:16 ts28 early_node_map[2] active PFN ranges
Jul 12 09:00:16 ts28 0:0 ->  157
Jul 12 09:00:16 ts28 0:  256 ->   917216
Jul 12 09:00:16 ts28 On node 0 totalpages: 917117
Jul 12 09:00:16 ts28   DMA zone: 56 pages used for memmap
Jul 12 09:00:16 ts28 
Jul 12 09:00:16 ts28   DMA zone: 2018 pages reserved
Jul 12 09:00:16 ts28   DMA zone: 1923 pages, LIFO batch:0
Jul 12 09:00:16 ts28   DMA32 zone: 12484 pages used for memmap
Jul 12 09:00:16 ts28   DMA32 zone: 900636 pages, LIFO batch:31
Jul 12 09:00:16 ts28   Normal zone: 0 pages used for memmap
Jul 12 09:00:16 ts28 ACPI: PM-Timer IO Port: 0x4008
Jul 12 09:00:16 ts28 ACPI: Local APIC address 0xfee0
Jul 12 09:00:16 ts28 

Re: [2.6.22] negative time jump

2007-07-30 Thread john stultz
Whoops, forgot to CC lkml.

-- Forwarded message --
From: john stultz <[EMAIL PROTECTED]>
Date: Jul 30, 2007 11:13 AM
Subject: Re: [2.6.22] negative time jump
To: Vasily Averin <[EMAIL PROTECTED]>


On 7/29/07, Vasily Averin <[EMAIL PROTECTED]> wrote:
> I've investigated why my testnode freezes. When I found that node is freezed
> again I've started to press Sysrq keys and noticed the following negative 
> time jump.
>
> Could anybody please help me to understand the reasons of this issue?
>
> --- VvS comment: some pre-history: node boot
> Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 
> 3.4.6
> 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
> Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
> console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc

clocksource=tsc?

I suspect you're forcing the clocksource as its not selected by
default. Could you provide dmesg output without that option. It might
shed some light as to why the clocksource isn't chosen.

It may very well be your TSCs are not synched or are otherwise not
reliable for timekeeping , and thus time is not consistent between
cpus.

-john

PS: also, you might want to check that acpc=noirq bit as well.. i
think you mean "acpi=noirq"
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22] negative time jump

2007-07-30 Thread Vasily Averin
Groundhog Day?

Vasily Averin wrote:
> [EMAIL PROTECTED] ~]# echo 1 > /proc/timer_stats
> [EMAIL PROTECTED] ~]# cat /proc/timer_stats
> Timer Stats Version: v0.1
> Sample period: 17.992 s
> 0 total events
> [EMAIL PROTECTED] ~]# cat /proc/timer_stats
> Timer Stats Version: v0.1
> Sample period: -4366.278 s
> 0 total events
> [EMAIL PROTECTED] ~]# cat /proc/timer_stats
> Timer Stats Version: v0.1
> Sample period: -4359.840 s
> 0 total events

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -5.060 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 0.365 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 8.675 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4355.978 s
0 total events
[skip]

next loop:
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 9.968 s
0 total events
[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 11:10:17 ts28 now at 29196596844685 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 14.104 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 17.666 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 19.791 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 21.278 s
0 total events
[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 11:10:28 ts28 now at 24810199356545 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4373.161 s
0 total events

Node is still alive and I'm able to look at proc files and press sysrq keys.

Thank you,
Vasily Averin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22] negative time jump

2007-07-30 Thread Vasily Averin
Groundhog Day?

Vasily Averin wrote:
 [EMAIL PROTECTED] ~]# echo 1  /proc/timer_stats
 [EMAIL PROTECTED] ~]# cat /proc/timer_stats
 Timer Stats Version: v0.1
 Sample period: 17.992 s
 0 total events
 [EMAIL PROTECTED] ~]# cat /proc/timer_stats
 Timer Stats Version: v0.1
 Sample period: -4366.278 s
 0 total events
 [EMAIL PROTECTED] ~]# cat /proc/timer_stats
 Timer Stats Version: v0.1
 Sample period: -4359.840 s
 0 total events

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -5.060 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 0.365 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 8.675 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4355.978 s
0 total events
[skip]

next loop:
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 9.968 s
0 total events
[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 11:10:17 ts28 now at 29196596844685 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 14.104 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 17.666 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 19.791 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 21.278 s
0 total events
[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 11:10:28 ts28 now at 24810199356545 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4373.161 s
0 total events

Node is still alive and I'm able to look at proc files and press sysrq keys.

Thank you,
Vasily Averin
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22] negative time jump

2007-07-30 Thread john stultz
Whoops, forgot to CC lkml.

-- Forwarded message --
From: john stultz [EMAIL PROTECTED]
Date: Jul 30, 2007 11:13 AM
Subject: Re: [2.6.22] negative time jump
To: Vasily Averin [EMAIL PROTECTED]


On 7/29/07, Vasily Averin [EMAIL PROTECTED] wrote:
 I've investigated why my testnode freezes. When I found that node is freezed
 again I've started to press Sysrq keys and noticed the following negative 
 time jump.

 Could anybody please help me to understand the reasons of this issue?

 --- VvS comment: some pre-history: node boot
 Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 
 3.4.6
 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
 Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc

clocksource=tsc?

I suspect you're forcing the clocksource as its not selected by
default. Could you provide dmesg output without that option. It might
shed some light as to why the clocksource isn't chosen.

It may very well be your TSCs are not synched or are otherwise not
reliable for timekeeping , and thus time is not consistent between
cpus.

-john

PS: also, you might want to check that acpc=noirq bit as well.. i
think you mean acpi=noirq
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22] negative time jump

2007-07-30 Thread Vasily Averin
john stultz wrote:
 On 7/29/07, Vasily Averin [EMAIL PROTECTED] wrote:
 I've investigated why my testnode freezes. When I found that node is freezed
 again I've started to press Sysrq keys and noticed the following negative 
 time jump.

 Could anybody please help me to understand the reasons of this issue?

 --- VvS comment: some pre-history: node boot
 Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 
 3.4.6
 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
 Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc
 
 clocksource=tsc?
 
 I suspect you're forcing the clocksource as its not selected by
 default. Could you provide dmesg output without that option. It might
 shed some light as to why the clocksource isn't chosen.

Default clocksource was acpi-pm. But the node have similar behavior when I've
used this clocksource. (please look at the following logs)

Originally I've investigated SATA-related issue and noticed some strange with
timers. When I've reproduced situation again I've pressed Alt+Sysrq+Q keys and
noticed that it shows incorrect time (it shows 431968 sec from booting but
according to the serial console timestamps it should be ~445954 sec). Then you
can see that time was jumped back, next timestamp is 431965 sec, 3 sec back.

 It may very well be your TSCs are not synched or are otherwise not
 reliable for timekeeping , and thus time is not consistent between
 cpus.

However IMHO it cannot explain time loops (~4400 sec) that I'm observing right 
now.

thank you,
Vasily Averin

PS. You can look at the other logs and find more details in my attachments to
http://bugzilla.kernel.org/show_bug.cgi?id=8650

Jul 12 09:00:16 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #4 SMP Thu Jul 12 07:58:27 MSD 2007
Jul 12 09:00:16 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpi=noirq
Jul 12 09:00:16 ts28 BIOS-provided physical RAM map:
Jul 12 09:00:16 ts28  BIOS-e820:  - 0009d800 (usable)
Jul 12 09:00:16 ts28  BIOS-e820: 0009d800 - 000a (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: 000f - 0010 (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: 0010 - dfee (usable)
Jul 12 09:00:16 ts28  BIOS-e820: dfee - dfee3000 (ACPI NVS)
Jul 12 09:00:16 ts28  BIOS-e820: dfee3000 - dfef (ACPI data)
Jul 12 09:00:16 ts28  BIOS-e820: dfef - dff0 (reserved)
Jul 12 09:00:16 ts28  BIOS-e820: fec0 - 0001 (reserved)
Jul 12 09:00:16 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 12 09:00:16 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 12 09:00:16 ts28 end_pfn_map = 1048576
Jul 12 09:00:16 ts28 DMI 2.3 present.
Jul 12 09:00:16 ts28 ACPI: RSDP 000F6980, 0014 (r0 VIAK8 )
Jul 12 09:00:16 ts28 ACPI: RSDT DFEE3000, 002C (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 ACPI: FACP DFEE3040, 0074 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 ACPI: DSDT DFEE30C0, 4A4C (r1 VIAK8  AWRDACPI 1000 MSFT  10E)
Jul 12 09:00:16 ts28 ACPI: FACS DFEE, 0040
Jul 12 09:00:16 ts28 ACPI: APIC DFEE7B40, 0068 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 12 09:00:16 ts28 Scanning NUMA topology in Northbridge 24
Jul 12 09:00:16 ts28 Number of nodes 2
Jul 12 09:00:16 ts28 Node 0 MemBase  Limit dfee
Jul 12 09:00:16 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 12 09:00:16 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 12 09:00:16 ts28 Skipping disabled node 1
Jul 12 09:00:16 ts28 NUMA: Using 63 for the hash shift.
Jul 12 09:00:16 ts28 Using node hash shift of 63
Jul 12 09:00:16 ts28 Bootmem setup node 0 -dfee
Jul 12 09:00:16 ts28 Zone PFN ranges:
Jul 12 09:00:16 ts28   DMA 0 - 4096
Jul 12 09:00:16 ts28   DMA324096 -  1048576
Jul 12 09:00:16 ts28   Normal1048576 -  1048576
Jul 12 09:00:16 ts28 early_node_map[2] active PFN ranges
Jul 12 09:00:16 ts28 0:0 -  157
Jul 12 09:00:16 ts28 0:  256 -   917216
Jul 12 09:00:16 ts28 On node 0 totalpages: 917117
Jul 12 09:00:16 ts28   DMA zone: 56 pages used for memmap
Jul 12 09:00:16 ts28 
Jul 12 09:00:16 ts28   DMA zone: 2018 pages reserved
Jul 12 09:00:16 ts28   DMA zone: 1923 pages, LIFO batch:0
Jul 12 09:00:16 ts28   DMA32 zone: 12484 pages used for memmap
Jul 12 09:00:16 ts28   DMA32 zone: 900636 pages, LIFO batch:31
Jul 12 09:00:16 ts28   Normal zone: 0 pages used for memmap
Jul 12 09:00:16 ts28 ACPI: PM-Timer IO Port: 0x4008
Jul 12 09:00:16 ts28 ACPI: Local APIC address 0xfee0
Jul 12 09:00:16 ts28 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] 

[2.6.22] negative time jump

2007-07-29 Thread Vasily Averin
I've investigated why my testnode freezes. When I found that node is freezed
again I've started to press Sysrq keys and noticed the following negative time 
jump.

Could anybody please help me to understand the reasons of this issue?

--- VvS comment: some pre-history: node boot
Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6
20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc

[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 27 14:09:34 ts28 now at 685327961777 nsecs

--- VvS comment: AIM7 test hangs at friday ~Jul 27 20:27:06 2007
48  135.33   98 2.8193   2149.48167.67   Fri Jul 27 17:42:10 2007
54  127.76   99 2.3660   2561.30190.97   Fri Jul 27 18:24:51 2007
67  117.00   99 1.7463   3470.13237.20   Fri Jul 27 19:22:41 2007
72  112.89   99 1.5679   3864.93248.71   Fri Jul 27 20:27:06 2007
82


---VvS comment: what I do monday morning:
[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 06:15:37 ts28 SysRq : Show Pending Timers
Jul 30 06:15:37 ts28 Timer List Version: v0.3
Jul 30 06:15:37 ts28 HRTIMER_MAX_CLOCK_BASES: 2
Jul 30 06:15:37 ts28 now at 29111288132045 nsecs
--- VvS comment: I would note: 29111 sec is very strange value for node with 3
day uptime...

[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 06:16:21 ts28 now at 29155644785562 nsecs

[EMAIL PROTECTED] ~]# echo 1 > /proc/timer_stats
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 17.992 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4366.278 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4359.840 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4347.397 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4331.652 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4297.293 s
0 total events
[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 06:19:07 ts28 now at 24922828615221 nsecs

[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 06:22:07 ts28 now at 25103332336122 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4065.515 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -3913.800 s
0 total events
[EMAIL PROTECTED] ~]# echo q > /proc/sysrq-trigger
Jul 30 06:25:47 ts28 now at 25322820465391 nsecs

Full serial console logs are attached, The other hardware details can be found
in http://bugzilla.kernel.org/show_bug.cgi?id=8650

Thank you,
Vasily Averin
Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc
Jul 27 13:58:10 ts28 BIOS-provided physical RAM map:
Jul 27 13:58:10 ts28  BIOS-e820:  - 0009d800 (usable)
Jul 27 13:58:10 ts28  BIOS-e820: 0009d800 - 000a (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: 000f - 0010 (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: 0010 - dfee (usable)
Jul 27 13:58:10 ts28  BIOS-e820: dfee - dfee3000 (ACPI NVS)
Jul 27 13:58:10 ts28  BIOS-e820: dfee3000 - dfef (ACPI data)
Jul 27 13:58:10 ts28 
Jul 27 13:58:10 ts28  BIOS-e820: dfef - dff0 (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: fec0 - 0001 (reserved)
Jul 27 13:58:10 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 27 13:58:10 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 27 13:58:10 ts28 end_pfn_map = 1048576
Jul 27 13:58:10 ts28 DMI 2.3 present.
Jul 27 13:58:10 ts28 ACPI: RSDP 000F6980, 0014 (r0 VIAK8 )
Jul 27 13:58:10 ts28 
Jul 27 13:58:10 ts28 ACPI: RSDT DFEE3000, 002C (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 ACPI: FACP DFEE3040, 0074 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 ACPI: DSDT DFEE30C0, 4A4C (r1 VIAK8  AWRDACPI 1000 MSFT  10E)
Jul 27 13:58:10 ts28 ACPI: FACS DFEE, 0040
Jul 27 13:58:10 ts28 ACPI: APIC DFEE7B40, 0068 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 Scanning NUMA topology in Northbridge 24
Jul 27 13:58:10 ts28 Number of nodes 2
Jul 27 13:58:10 ts28 Node 0 MemBase  Limit dfee
Jul 27 13:58:10 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 27 13:58:10 ts28 Entering 

[2.6.22] negative time jump

2007-07-29 Thread Vasily Averin
I've investigated why my testnode freezes. When I found that node is freezed
again I've started to press Sysrq keys and noticed the following negative time 
jump.

Could anybody please help me to understand the reasons of this issue?

--- VvS comment: some pre-history: node boot
Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6
20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200
console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc

[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 27 14:09:34 ts28 now at 685327961777 nsecs

--- VvS comment: AIM7 test hangs at friday ~Jul 27 20:27:06 2007
48  135.33   98 2.8193   2149.48167.67   Fri Jul 27 17:42:10 2007
54  127.76   99 2.3660   2561.30190.97   Fri Jul 27 18:24:51 2007
67  117.00   99 1.7463   3470.13237.20   Fri Jul 27 19:22:41 2007
72  112.89   99 1.5679   3864.93248.71   Fri Jul 27 20:27:06 2007
82


---VvS comment: what I do monday morning:
[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 06:15:37 ts28 SysRq : Show Pending Timers
Jul 30 06:15:37 ts28 Timer List Version: v0.3
Jul 30 06:15:37 ts28 HRTIMER_MAX_CLOCK_BASES: 2
Jul 30 06:15:37 ts28 now at 29111288132045 nsecs
--- VvS comment: I would note: 29111 sec is very strange value for node with 3
day uptime...

[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 06:16:21 ts28 now at 29155644785562 nsecs

[EMAIL PROTECTED] ~]# echo 1  /proc/timer_stats
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: 17.992 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4366.278 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4359.840 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4347.397 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4331.652 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4297.293 s
0 total events
[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 06:19:07 ts28 now at 24922828615221 nsecs

[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 06:22:07 ts28 now at 25103332336122 nsecs

[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -4065.515 s
0 total events
[EMAIL PROTECTED] ~]# cat /proc/timer_stats
Timer Stats Version: v0.1
Sample period: -3913.800 s
0 total events
[EMAIL PROTECTED] ~]# echo q  /proc/sysrq-trigger
Jul 30 06:25:47 ts28 now at 25322820465391 nsecs

Full serial console logs are attached, The other hardware details can be found
in http://bugzilla.kernel.org/show_bug.cgi?id=8650

Thank you,
Vasily Averin
Jul 27 13:58:10 ts28 Linux version 2.6.22 ([EMAIL PROTECTED]) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #11 SMP Fri Jul 27 12:47:45 MSD 2007
Jul 27 13:58:10 ts28 Command line: ro root=LABEL=/1 console=ttyS0,115200 console=tty debug silencelevel=8 [EMAIL PROTECTED] acpc=noirq clocksource=tsc
Jul 27 13:58:10 ts28 BIOS-provided physical RAM map:
Jul 27 13:58:10 ts28  BIOS-e820:  - 0009d800 (usable)
Jul 27 13:58:10 ts28  BIOS-e820: 0009d800 - 000a (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: 000f - 0010 (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: 0010 - dfee (usable)
Jul 27 13:58:10 ts28  BIOS-e820: dfee - dfee3000 (ACPI NVS)
Jul 27 13:58:10 ts28  BIOS-e820: dfee3000 - dfef (ACPI data)
Jul 27 13:58:10 ts28 
Jul 27 13:58:10 ts28  BIOS-e820: dfef - dff0 (reserved)
Jul 27 13:58:10 ts28  BIOS-e820: fec0 - 0001 (reserved)
Jul 27 13:58:10 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 27 13:58:10 ts28 Entering add_active_range(0, 256, 917216) 1 entries of 3200 used
Jul 27 13:58:10 ts28 end_pfn_map = 1048576
Jul 27 13:58:10 ts28 DMI 2.3 present.
Jul 27 13:58:10 ts28 ACPI: RSDP 000F6980, 0014 (r0 VIAK8 )
Jul 27 13:58:10 ts28 
Jul 27 13:58:10 ts28 ACPI: RSDT DFEE3000, 002C (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 ACPI: FACP DFEE3040, 0074 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 ACPI: DSDT DFEE30C0, 4A4C (r1 VIAK8  AWRDACPI 1000 MSFT  10E)
Jul 27 13:58:10 ts28 ACPI: FACS DFEE, 0040
Jul 27 13:58:10 ts28 ACPI: APIC DFEE7B40, 0068 (r1 VIAK8  AWRDACPI 42302E31 AWRD0)
Jul 27 13:58:10 ts28 Scanning NUMA topology in Northbridge 24
Jul 27 13:58:10 ts28 Number of nodes 2
Jul 27 13:58:10 ts28 Node 0 MemBase  Limit dfee
Jul 27 13:58:10 ts28 Entering add_active_range(0, 0, 157) 0 entries of 3200 used
Jul 27 13:58:10 ts28 Entering add_active_range(0,