Re: [opendx-dev] Re: dxexec: 20% perf speedup

gabra Wed, 31 May 2000 12:30:43 -0700 (PDT)

Richard,   if the X event loop (or whatever is driving your renderer) is
asynchronous with respect to DX,  then objects can get deleted out from
under the renderer regardless of locking.  Vanilla DX works because the fd
for the OpenGL renderer X display is registered with the
RegisterInputHandler interface, so DX knows about it.  The grand loop of DX
waits for input on the registered fds, plus the console (in script mode) or
the UI socket in UI mode.  When input arrives on an fd the appropriate
handler registered for that fd is called and runs to completion; thus only
one is active at a time, and network execution and apparently asynchronous
hardware rendering are exclusive.
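
The dispatch scheme described above can be sketched roughly as follows. This is a minimal illustration, not DX's actual code: register_input_handler(), dispatch_once() and count_input() are hypothetical names, and select() is assumed as the wait primitive. It only shows why handlers registered this way never overlap: each one runs to completion before the loop looks at the next fd.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define MAX_HANDLERS 16

typedef void (*input_handler)(int fd, void *arg);

struct handler_entry { int fd; input_handler proc; void *arg; };

static struct handler_entry handlers[MAX_HANDLERS];
static int n_handlers = 0;

/* Hypothetical stand-in for a RegisterInputHandler-style call. */
int register_input_handler(int fd, input_handler proc, void *arg)
{
    assert(proc != NULL);
    if (n_handlers == MAX_HANDLERS || fd < 0 || fd >= FD_SETSIZE)
        return -1;
    handlers[n_handlers].fd = fd;
    handlers[n_handlers].proc = proc;
    handlers[n_handlers].arg = arg;
    n_handlers++;
    return 0;
}

/* One pass of the loop: block until a registered fd is readable, then
 * run each ready handler to completion, one after another.
 * Returns the number of handlers run, or -1 on error. */
int dispatch_once(void)
{
    fd_set readable;
    int i, maxfd = -1, ran = 0;

    FD_ZERO(&readable);
    for (i = 0; i < n_handlers; i++) {
        FD_SET(handlers[i].fd, &readable);
        if (handlers[i].fd > maxfd)
            maxfd = handlers[i].fd;
    }
    if (maxfd < 0)
        return 0;               /* nothing registered */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    for (i = 0; i < n_handlers; i++) {
        if (FD_ISSET(handlers[i].fd, &readable)) {
            handlers[i].proc(handlers[i].fd, handlers[i].arg);
            ran++;
        }
    }
    return ran;
}

/* Example handler: consume one byte and bump the counter passed as arg. */
void count_input(int fd, void *arg)
{
    char byte;
    if (read(fd, &byte, 1) == 1)
        (*(int *)arg)++;
}
```

A renderer whose display fd goes through such a table is serialized with everything else the loop dispatches; one driven by its own thread or event loop is not.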


Which is probably a tedious way of saying that if your stuff uses the
RegisterInputHandler interface, then I really doubt that locking is
necessary, whereas if it doesn't, then all hell would be breaking loose,
which it's not, so I don't think there's a problem.

Greg

Randall Hopper <[EMAIL PROTECTED]>@opendx.watson.ibm.com on 05/31/2000
01:15:07 PM

Please respond to [email protected]

Sent by:  [EMAIL PROTECTED]


To:   [email protected]
cc:
Subject:  [opendx-dev] Re: dxexec: 20% perf speedup



richard:
 |Randall Hopper:
 |>      From profiling dxexec, I've found that 20-25% of its time in the
 |> single-process case is spent locking and unlocking mutex locks (on
 |> SGI).
 |> AFAICT these locks add unnecessary, avoidable overhead to the
 |> single-process case.
 |
 |I am wondering if this will affect our VR application.  We use pointers
 |to DX data inside draw operations that occur on multiple graphics heads
 |so memory locks may be important even in DX's single-processor mode.

Ok.  Please check me on this.  Here's what I have in mind.

From what I'm aware of, I don't think this will be a problem in the
executive.  dxexec doesn't spawn threads, and in single-process mode,
there's one process executing in the dxexec memory space (including shared
memory).  This covers inboard and runtime-loadable modules which share
address space and a thread with dxexec.  Outboard modules are a separate
process, but they don't share the same address space with DX (getting their
data from/to dxexec through sockets).

However where these locks may still be needed is in routines linked into
the DX client (libdx/*).  I'm not sure which are exposed to the client API
and therefore may be useful to the client (e.g. if the client is
multi-threaded).

Rather than trace each and every one (see list below) to a DX API and
determine whether it's used in the executive, my aim is to identify which
dxexec lock sets are the bottlenecks and selectively disable them for the
single-process case, ideally leaving the libdx locks as-is (always
enabled).  Intuition says it's the graph evaluation and state locks (3000
DX modules mount up), but the numbers will tell.

   ./src/exec/dpexec/d.c            - Dictionary locks for MarkTime*
   ./src/exec/dpexec/rq.c           - Job run queue lock
   ./src/exec/dpexec/dxmain.c       - 1) IBM PVS "profiling" lock
                                      2) On MP, lock for global child PID
                                         table
   ./src/exec/dpexec/task.c         - Task group state & job enqueue locks
   ./src/exec/dpexec/loader.c       - IBM6000 dynamic module loader lock
   ./src/exec/dpexec/queue.c        - Generic queue (NOT USED)
   ./src/exec/dpexec/vcr.c          - Sequencer state access lock
   ./src/exec/dpexec/swap.c         - Executive memory reclamation disable
   ./src/exec/dpexec/evalgraph.c    - _dxd_dphosts (Distributed processing
                                      hosts table)
   ./src/exec/dpexec/exobject.c     - Alloc locks for global variables
   ./src/exec/dpexec/exobject.h     - Executive obj ref cnt & delete locking
   ./src/exec/dpexec/graphqueue.c   - Execution graph
   ./src/exec/dpexec/packet.c       - tmpbuffer lock for writing packets to
                                      dxui/dxl client
   ./src/exec/libdx/shade.c         - Render object counts lock
   ./src/exec/libdx/task.c          - Task group state locks
   ./src/exec/libdx/cache.c         - cache access lock
   ./src/exec/libdx/render.h        - Render object counts lock
   ./src/exec/libdx/object.c        - If trace>=1, lock for obj alloc stats
                                      tbl
   ./src/exec/libdx/arrayClass.h    - Field array data realloc lock
   ./src/exec/libdx/arrayClass.c    - Field array data realloc lock
   ./src/exec/libdx/tile.c          - Use count for image patch render data
   ./src/exec/libdx/stringtable.c   - String hash table lock
   ./src/exec/libdx/qmessage.c      - String message queue lock
   ./src/exec/libdx/memory.c        - Free list access and arena pool locks
   ./src/exec/libdx/group.c         - Group member modification lock
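
One cheap way to let the numbers tell which lock sets in the list above are hot is to count lock calls per set before timing anything. The sketch below is only an illustration under that assumption: the enum, the chosen sets, and the function names are hypothetical and not part of the DX sources, and a call count is a proxy for cost, not a measurement of time spent.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tags for a few of the lock sets listed above. */
enum lock_set { LS_DICT, LS_RUNQ, LS_TASK, LS_GRAPHQ, LS_OBJECT, LS_COUNT };

static const char *lock_set_name[LS_COUNT] = {
    "d.c dictionary", "rq.c run queue", "task.c task group",
    "graphqueue.c execution graph", "exobject.h ref count"
};

static unsigned long lock_calls[LS_COUNT];

/* Call this next to each lock acquisition belonging to set s. */
void note_lock(enum lock_set s)
{
    assert((int)s >= 0 && (int)s < LS_COUNT);
    lock_calls[s]++;
}

/* Return the set with the most recorded calls (lowest index on ties). */
enum lock_set hottest_lock_set(void)
{
    int i, best = 0;
    for (i = 1; i < LS_COUNT; i++)
        if (lock_calls[i] > lock_calls[best])
            best = i;
    return (enum lock_set)best;
}

/* Dump one line per lock set, e.g. at exit. */
void report_lock_calls(FILE *out)
{
    int i;
    for (i = 0; i < LS_COUNT; i++)
        fprintf(out, "%-30s %lu\n", lock_set_name[i], lock_calls[i]);
}
```

Sets that dominate the counts would be the first candidates to cross-check against the profiler's timings.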

 |When you code it, please provide a mechanism to turn it back on if
 |needed, either a compile option or environment variable would be ok.

Will do.  That sounds like a good idea.
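
A minimal sketch of what such a switch might look like, assuming a run-time flag set once at startup. Everything here is hypothetical: the DX_EXEC_LOCKS variable name, the exec_lock_t type, and the functions are made up for illustration, and a C11 atomic_flag spin lock stands in for whatever lock primitive the platform really uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    atomic_flag held;
    unsigned long acquisitions;   /* real (non-skipped) acquisitions */
} exec_lock_t;

static int locks_enabled = 1;     /* safe default: locking on */

/* Decide whether executive locks are active.  An explicit setting wins
 * ("0" turns them off, anything else turns them on); with no setting,
 * lock only when more than one process is in use. */
void exec_locks_configure_from(const char *setting, int nprocs)
{
    if (setting != NULL)
        locks_enabled = (strcmp(setting, "0") != 0);
    else
        locks_enabled = (nprocs > 1);
}

/* Call once at startup, before any lock is taken. */
void exec_locks_configure(int nprocs)
{
    /* DX_EXEC_LOCKS is a made-up name for this sketch. */
    exec_locks_configure_from(getenv("DX_EXEC_LOCKS"), nprocs);
}

void exec_lock_init(exec_lock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
    l->acquisitions = 0;
}

void exec_lock(exec_lock_t *l)
{
    assert(l != NULL);
    if (!locks_enabled)
        return;                   /* single-process fast path */
    while (atomic_flag_test_and_set(&l->held))
        ;                         /* spin until the holder releases */
    l->acquisitions++;
}

void exec_unlock(exec_lock_t *l)
{
    assert(l != NULL);
    if (!locks_enabled)
        return;
    atomic_flag_clear(&l->held);
}
```

The flag must not change while any lock is held, which is why the sketch reads it once at startup. A compile-time option could wrap the same early returns in an #ifdef instead.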

--
Randall Hopper (mailto:[EMAIL PROTECTED])
EPA Scientific Visualization Center
US EPA MD/24 ERC-1A; RTP, NC 27711
