-------- Original Message --------
Subject: CR 6853286 Created, P2 opensolaris/triage-queue bogus file attribute
change events slow down gnome applications, after 6539657

*Synopsis*: bogus file attribute change events slow down gnome applications,
after 6539657

*Change Request ID*: 6853286

*Synopsis*: bogus file attribute change events slow down gnome applications,
after 6539657

  Product: solaris
  Category: opensolaris
  Subcategory: triage-queue
  Type: Defect
  Subtype:
  Status: 1-Dispatched
  Substatus:
  Priority: 2-High
  Introduced In Release:
  Introduced In Build:
  Keywords: opensolaris

=== *Description* ============================================================
Category
   solaris/kernel (Solaris Kernel)
Sub-Category
   event_ports
Description
   After I bfu'ed my system using archives compiled from current onnv-gate
sources (~ since 2009-6-17), the idle gnome desktop has started to consume
*lots* of cpu time.  The system is a SX:CE snv_111, bfu'ed with current
onnv-gate bits.  With bfu updates before ~ 2009-6-17 the idle gnome desktop
didn't have a significant amount of cpu usage.
I see several gnome processes (gnome-settings-daemon, nautilus, gnome-panel)
each trying to consume 100% cpu time; the processes perform stat and
port_associate system calls for files all over the place (/etc/xdg/...,
X11 font files, files in my home directory, ~/.local/.... ).
Looking though the change log for the last 2-3 days I noticed the putback
for bug 6539657 "touch(1) does not set the nanosecond timestamp of a file
correctly".
When I disable the stat microsecond workaround[*] that was added with
6539657, the gnome desktop returns to sane behaviour; the idle gnome
desktop does not consume lots of cpu time any more.
[*]    echo stat_force_usec_granularity/W0 | mdb -wk
-----------------
Ok, seems that stat() on zfs is returning file attribute timestamps with the
tv_nsec field rounded to microsecond granularity after the putback for
6539657.
Gnome is using these timestamps to watch for file attribute changes using
port_associate() and port_getn() system calls.  Inside the kernel the tv_nsec
fields for a file from a zfs are compared to the rounded tv_nsec timestamps
that had been returned from stat().  They are different (the sub-microsecond
part should be different), so that port_getn() returns bogus file attribute
change events.  These events seem to trigger some heavy rescanning of
gnome desktop files, start menu file entries, fonts, ... in several gnome
programs.
Setting stat_force_usec_granularity to 0 returns full nanosecond timestamps,
and port_associate() / port_getn() don't report bogus file timestamp change
events.
-----------------
...
/1: 0.0006clock_gettime(4, 0x08046C58)= 0
/1: 0.0005clock_gettime(4, 0x08046C58)= 0
/2:read(17, 0x9AC8EF54, 20)(sleeping...)
/1: 1.3090pollsys(0x082611B8, 9, 0x08046DE8, 0x00000000)= 1
/1:fd=4  ev=POLLIN rev=POLLIN
/1:fd=6  ev=POLLIN rev=0
/1:fd=10 ev=POLLIN|POLLPRI rev=0
/1:fd=13 ev=POLLIN|POLLPRI rev=0
/1:fd=14 ev=POLLIN|POLLPRI rev=0
/1:fd=15 ev=POLLIN|POLLPRI rev=0
/1:fd=16 ev=POLLIN|POLLPRI rev=0
/1:fd=19 ev=POLLIN rev=0
/1:fd=5  ev=POLLIN rev=0
/1:timeout: 0.000000000 sec
/1: 0.0007port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0002port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 64 [0]
/1: 0.0001port_getn(20, 0x08046930, 64, 1, 0x08046928)= 45 [0]
/1: 0.0028lstat64("/etc/fonts/fonts.conf", 0x08046C40)= 0
/1:    d=0x02D90002 i=12386 m=0100444 l=1  u=0     g=2     sz=14364
/1:at = Jun 20 11:34:17 CEST 2009  [ 1245490457 ]
/1:mt = Mar 13 06:43:42 CET 2009  [ 1236923022 ]
/1:ct = Apr 27 18:53:27 CEST 2009  [ 1240851207 ]
/1:    bsz=14848 blks=7     fs=zfs
/1: 0.0003port_associate(20, 7, 0x080BB810, 0x10000006, 0x080B95C8) = 0
/1: 0.0004lstat64("/etc/vfstab", 0x08046C40)= 0
/1:    d=0x02D90002 i=362618 m=0100644 l=1  u=0     g=3     sz=410
/1:at = Jun 20 11:44:42 CEST 2009  [ 1245491082 ]
/1:mt = May  9 20:26:04 CEST 2009  [ 1241893564 ]
/1:ct = Jun 20 11:08:49 CEST 2009  [ 1245488929 ]
/1:    bsz=512   blks=2     fs=zfs
/1: 0.0001port_associate(20, 7, 0x080BD1B0, 0x10000006, 0x080B9898) = 0
/1: 0.0003lstat64("/usr/openwin/lib/X11/fonts/TrueType/FreeSansOblique.ttf",
0x08046C40) = 0
/1:    d=0x02D90002 i=315877 m=0100755 l=1  u=0     g=2     sz=110740
/1:at = Mar 11 06:31:49 CET 2009  [ 1236749509 ]
/1:mt = Mar 11 06:31:49 CET 2009  [ 1236749509 ]
/1:ct = Apr 27 20:22:04 CEST 2009  [ 1240856524 ]
/1:    bsz=111104 blks=110   fs=zfs
/1: 0.0002port_associate(20, 7, 0x080C1C68, 0x10000006, 0x080BEF18) = 0
/1:
0.0002lstat64("/usr/openwin/lib/X11/fonts/TrueType/LucidaBrightDemiBold.ttf",
0x08046C40) = 0
/1:    d=0x02D90002 i=53833 m=0100444 l=1  u=0     g=2     sz=75144
/1:at = Apr 27 18:58:49 CEST 2009  [ 1240851529 ]
/1:mt = Mar 13 08:41:11 CET 2009  [ 1236930071 ]
/1:ct = Apr 27 18:58:45 CEST 2009  [ 1240851525 ]
/1:    bsz=75264 blks=95    fs=zfs
/1: 0.0002port_associate(20, 7, 0x080C1D90, 0x10000006, 0x080BEF30) = 0
/1: 0.0002lstat64("/usr/openwin/lib/X11/fonts/TrueType/gulim.ttf", 0x08046C40) 
= 0
/1:    d=0x02D90002 i=315884 m=0100755 l=1  u=0     g=2     sz=10385096
/1:at = Mar 11 06:31:49 CET 2009  [ 1236749509 ]
/1:mt = Mar 11 06:31:49 CET 2009  [ 1236749509 ]
/1:ct = Apr 27 20:22:08 CEST 2009  [ 1240856528 ]
/1:    bsz=131072 blks=8755  fs=zfs
/1: 0.0002port_associate(20, 7, 0x080C1E80, 0x10000006, 0x080BEF48) = 0
/1: 0.0003lstat64("/usr/openwin/lib/X11/fonts/TrueType/hline.ttf", 0x08046C40) 
= 0
/1:    d=0x02D90002 i=315885 m=0100755 l=1  u=0     g=2     sz=1203464
/1:at = Apr 27 21:59:14 CEST 2009  [ 1240862354 ]
/1:mt = Mar 11 06:31:49 CET 2009  [ 1236749509 ]
/1:ct = Apr 27 20:22:08 CEST 2009  [ 1240856528 ]
/1:    bsz=131072 blks=808   fs=zfs
/1: 0.0002port_associate(20, 7, 0x080C2918, 0x10000006, 0x080BEE88) = 0
...
-----------------
#!/usr/sbin/dtrace -s
BEGIN {
PORT_FOP_ACTIVE = 1;
FILE_MODIFIED = 2;
}
fbt::port_check_timestamp:entry
/(((portfop_t *)arg3)->pfop_flags & PORT_FOP_ACTIVE) &&
 (((portfop_t *)arg3)->pfop_events & FILE_MODIFIED)/
{
this->vp = (vnode_t*)arg1;
this->pfp = arg3;
this->objptr = arg4;
this->fobj = (file_obj_t*)this->objptr;
printf("at:%u.%09u mt:%u.%09u ct:%u.%09u",
this->fobj->fo_atime.tv_sec, this->fobj->fo_atime.tv_nsec,
this->fobj->fo_mtime.tv_sec, this->fobj->fo_mtime.tv_nsec,
this->fobj->fo_ctime.tv_sec, this->fobj->fo_ctime.tv_nsec);
self->d = stackdepth;
}
fbt::vhead_getattr:entry
/stackdepth == self->d + 2/
{
self->vap = (vattr_t*)arg1;
}
fbt::vhead_getattr:return
/self->vap != 0/
{
printf("at:%u.%09u mt:%u.%09u ct:%u.%09u",
self->vap->va_atime.tv_sec, self->vap->va_atime.tv_nsec,
self->vap->va_mtime.tv_sec, self->vap->va_mtime.tv_nsec,
self->vap->va_ctime.tv_sec, self->vap->va_ctime.tv_nsec);
self->vap = 0;
self->d = 0;
}
  ...
  0  50545       port_check_timestamp:entry at:1236929113.000000000
mt:1236929113.000000000 ct:1240851088.196400000
  0  10004             vhead_getattr:return at:1236929113.000000000
mt:1236929113.000000000 ct:1240851088.196400740
  0  50545       port_check_timestamp:entry at:1240862492.034252000
mt:1236929113.000000000 ct:1240851088.197096000
  0  10004             vhead_getattr:return at:1240862492.034252384
mt:1236929113.000000000 ct:1240851088.197096808
  0  50545       port_check_timestamp:entry at:1240862491.900676000
mt:1236929112.000000000 ct:1240851088.195174000
  0  10004             vhead_getattr:return at:1240862491.900676696
mt:1236929112.000000000 ct:1240851088.195174171
  ...
Frequency
   Always
Regression
   Solaris 10
Steps to Reproduce
   Start gnome desktop on a system that was bfu'ed to bits containing the 
putback
for 6539657.  Check "prstat" or "top" output for the idle desktop.
Expected Result
   more than 99% of idle cpu time.
Actual Result
   0% idle cpu time; several gnome processes consume all available cpu cycles.
Error Message(s)

Test Case

Workaround
   echo stat_force_usec_granularity/W0 | mdb -wk
Additional configuration information
   SX:CE b111, bfu'ed to current opensolaris bits compiled from onnv-gate
mercurial source repository.


=== *Public Comments* ========================================================

=== *Comments* ===============================================================

=== *Evaluation* =============================================================

=== *Suggested Fix* ==========================================================

=== *Workaround* =============================================================

=== *Justification* ==========================================================
Priority changed from [] to [2-High]
system becomes more or less unusable after gnome desktop has started


=== *Additional Details* =====================================================
        Targeted Release:
        Commit To Fix In Build:
        Fixed In Build:
        Integrated In Build:
        Verified In Build:
  See Also: 6539657
  Duplicate of:
  Program Management:
  Root Cause:
  Is a Security Vulnerability?: No
  Fix Affects Documentation: No
  Fix Affects Localization: No
  Reported by:

=== *History* ================================================================
        Date Submitted: 2009-06-20 16:48:14 GMT+00:00

        Status Changed    Date Updated                  Updated By




Reply via email to