On 13/11/2013 12:34 PM, Kenneth Wilkerson wrote:
Actually, the algorithm performs well for read-often, write-rarely lists
because the active chain count does not change and therefore there are
relatively infrequent re-drives. The active chain count only changes on an
add or delete. So if there are infrequent adds and deletes, there will be
infrequent re-drives. And you are wrong, readers will not contend unless two
or more tasks are referencing the exact same element simultaneously. And
even then, the re-drive only involves the update to the use count.
Thanks, I get it now. Maybe IBM should have used PLO for the z/OS C++
shared_ptr SMR algorithm, which keeps both the weak/strong reference counts
and the pointer. Instead they use a lock-free algorithm based on CS:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2674.htm.
shared_ptr implements proxy-based garbage collection.
There are a lot of optimizations that I do not describe in this algorithm,
for simplicity. For example, when you do a PLO DCAS, the condition code is
set to indicate which compare failed. This can be used to optimize the
re-drive. There are many other optimizations with re-drives to make this
more efficient. And if you get away from traditional lists, there are even
more optimizations.
Like what? What do you replace linked lists with, arrays with offsets
instead of pointers?
Honestly, I provided this algorithm after reading the paper on hazard
pointers. The paper was written in 2002 and claimed there was no atomic
DCAS, even though PLO DCAS became available in 2001. So I took a much
simpler algorithm
that I had and modified it to use a use count to accommodate traditional
storage managers to prove that a PLO could be used to manage a conventional
list using a traditional storage manager and provide a SMR algorithm without
the need for DU level management structures. I don't use many conventional
lists and I have a proprietary storage manager that does not use chains.
Most of my PLO operations are much simpler.
I would love to test this algorithm against any other SMR algorithm. My
point has been and remains, that PLO can be efficiently used to serialize
any list in a lock-free manner and even if it does take more CPU this will
be offset by increased throughput.
And just because UNIX has issues with PLO doesn't mean the issue is with
PLO...
UNIX doesn't have an issue with PLO. It clearly states that
popping/pushing elements at the beginning of the queue is a good
performer. Surely your algorithm would have the same problem if multiple
producers/consumers were inserting/removing elements from the middle of
a long list.
Kenneth
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On
Behalf Of David Crayford
Sent: Tuesday, November 12, 2013 8:39 PM
To: [email protected]
Subject: Re: Serialization without Enque
Thanks for sharing your design Ken. It seems to me that PLO is best used for
data structures like double-ended queues where elements can be
inserted/removed from both ends of the queue atomically. In the case of a
read-often-write-rarely list with multiple readers that traverse the list it
doesn't seem optimal. Correct me if I'm wrong but readers will contend with
each other.
FYI, z/OS UNIX message queues can be configured to use PLO (with fallback to
a latch). The documentation states that inserting/removing from the middle
of a long list using PLO is a poor performer
http://pic.dhe.ibm.com/infocenter/zos/v1r12/topic/com.ibm.zos.r12.bpxb100/qct.htm#qct.
There isn't a one-size-fits-all solution. It depends on the usage patterns.
Hopefully HTM will solve that. Depending on how Intel's Haswell TSX
implementation performs in the wild, we could see HTM in our cell phones as
early as next year.
On 12/11/2013 11:18 PM, Kenneth Wilkerson wrote:
I use cell pools. I also use a proprietary storage manager that
doesn't use chains. These methodologies offer me capabilities well
beyond those found in traditional methods. Much of what I do is based
on these capabilities, but the algorithms could easily be adapted to
use a conventional storage manager that uses chains.
Here is an algorithm I use that I've adopted for traditional storage
management. This algorithm will work for any list (LIFO, FIFO,
ordered, or other) and for deletion from the head, middle or tail.
Setup: Allocate a word in each element. Bit 0 is one for all active
elements. Bit 1 is one for all elements pending deletion. Bit 2 is
reserved for an extension to this algorithm (such as garbage
cleanup). The remaining bits are a use count allowing for many more DUs
than are supported by MVS.
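As a rough illustration only (this is not Kenneth's code, and the names are
made up), the word layout described above might look like this in C. Bit
numbering follows the mainframe convention, where bit 0 is the high-order
bit of a 32-bit word:

```c
#include <stdint.h>

/* Hypothetical layout of the per-element word:
   bit 0 = active, bit 1 = pending delete, bit 2 = reserved,
   bits 3-31 = use count. All names are illustrative. */
#define ELEM_ACTIVE   0x80000000u
#define ELEM_PENDING  0x40000000u
#define ELEM_RESERVED 0x20000000u
#define ELEM_COUNT    0x1FFFFFFFu   /* use-count mask */

static inline uint32_t elem_count(uint32_t w)   { return w & ELEM_COUNT; }
static inline int      elem_pending(uint32_t w) { return (w & ELEM_PENDING) != 0; }

/* An add-logical (AL) style increment touches only the count bits,
   leaving the flag bits alone, as long as the count never overflows
   into bit 2. */
static inline uint32_t elem_inc(uint32_t w) { return w + 1; }
static inline uint32_t elem_dec(uint32_t w) { return w - 1; }
```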
Note: When PLO Compare and Swap (CS) or Double Compare and Swap
(DCAS) is referenced, the PLO uses the use count address as the lock
word. This will serialize all updates to the use counter for that
element. For the PLO Compare and Loads or PLO update, the lock word is the
active chain counter.
To search:
Step 1: Use the PLO Compare and Load on the active chain counter to
search the chain as before. If the PLO fails, re-drive the search.
Step 2: Before examining the element, increment the use count with a
PLO Double Compare and Swap (DCAS). Load the first register pair with
the current chain counter. The swap value will also be the current
chain counter. Essentially, we're using the active chain count to
serialize increments to the use count to avoid accessing an area that
may have been released. The second register pair will contain the
current use count with a swap value incremented by 1 using an AL to
avoid resetting the high bit. If the PLO DCAS fails, the previous PLO
Compare and Load (Step 1) should be re-driven.
Step 3: Use the PLO Compare and Load for the next element. Save the
PLO status and decrement the use count with a SL using PLO CS. We
don't need a DCAS because the use count is not 0 and this element
can't be deleted. If this PLO CS fails, re-drive it. If PLO Compare
and Load status is re-drive, then before re-driving the search, check
the use count (in the register used to update it). If Bit 1 (pending
delete) is set and the use count is 0, this task can release it.
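For the flavour of Steps 1-2, here is a sketch in portable C. An important
caveat: C has no analogue of PLO's DCAS, which compares the chain counter
and swaps the use count as one atomic unit, so this sketch re-reads the
chain counter around a single-word CAS purely to show the "fail and
re-drive if the chain changed" control flow. Unlike the PLO version it is
NOT safe against the element being freed between the loads; all names are
hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    _Atomic uint32_t chain_count;  /* bumped on every add/delete */
} chain_t;

typedef struct {
    _Atomic uint32_t word;         /* flag bits + use count */
} elem_t;

/* Returns true if a use-count reference was taken on the element;
   false means the caller must re-drive the search (Step 1). */
static bool acquire_use(chain_t *c, elem_t *e) {
    uint32_t seen = atomic_load(&c->chain_count);
    uint32_t old  = atomic_load(&e->word);
    /* Step 2: bump the use count. */
    if (!atomic_compare_exchange_strong(&e->word, &old, old + 1))
        return false;                      /* use-count CAS lost: re-drive */
    if (atomic_load(&c->chain_count) != seen) {
        atomic_fetch_sub(&e->word, 1);     /* chain changed: undo, re-drive */
        return false;
    }
    return true;
}
```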
To delete: (this assumes the deleting task has already updated the
use count in the process of finding the element to delete)
Step 1: Use a PLO update function to remove the element from the active
chain to avoid any future references.
Step 2: If the PLO update to remove the element fails, decrement the
use count but do NOT set bit 1 (pending delete) using a PLO CS. If the
PLO CS fails, re-drive it.
Step 3: If the PLO update to remove the element succeeds, decrement
the use count, SET bit 1 (pending delete) and RESET bit 0 (active
bit) using a PLO CS. If the PLO CS fails, re-drive it.
Step 4: Whether the PLO update succeeded or failed, check the use
count in the register used to update it:
    If bit 1 (pending delete) is set and the use count is not 0,
    then this task should exit.
    If bit 1 (pending delete) is set and the use count is 0,
    then this task can release it.
    Otherwise, this task must re-drive the search to find the
    element to be deleted.
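The Step 4 decision is a pure function of the use-count word the deleter
just updated, so it can be written out directly. This is a sketch with
illustrative names, using the same assumed bit layout as before (bit 0
active, bit 1 pending delete, low bits the count):

```c
#include <stdbool.h>
#include <stdint.h>

#define ELEM_ACTIVE  0x80000000u
#define ELEM_PENDING 0x40000000u   /* bit 1: pending delete */
#define ELEM_COUNT   0x1FFFFFFFu   /* use-count mask */

typedef enum { DEL_EXIT, DEL_RELEASE, DEL_REDRIVE } del_action;

/* What should the deleting task do next, given the use-count word? */
static del_action delete_decision(uint32_t word) {
    bool     pending = (word & ELEM_PENDING) != 0;
    uint32_t count   = word & ELEM_COUNT;
    if (pending && count != 0) return DEL_EXIT;     /* last reader frees it */
    if (pending && count == 0) return DEL_RELEASE;  /* safe to release now  */
    return DEL_REDRIVE;          /* element not removed: search again */
}
```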
You can work out the various scenarios yourself. But because the count
is incremented/decremented after a PLO Compare and Load or update, the
status of the PLO provides a decision point on whether an element may
have been deleted. Using the use count address as the lock word
ensures that all use count updates for a specific use count occur
serially. There are numerous extensions to this algorithm that are
more than I want to describe. Things like adding pending deletes to a
delete chain or having an asynchronous, periodic garbage collection task
handle the deletes.
Kenneth
-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]]
On Behalf Of Jon Perryman
Sent: Monday, November 11, 2013 9:38 PM
To: [email protected]
Subject: Re: Serialization without Enque
Could you provide an insight to a design that would handle the
situation where an SSI program with non-trivial workload uses a very
large list. This list is normally semi-static but there can be periods
of time where entries are heavily deleted and added. Not freeing
storage is not an option because it could be a significant amount of storage.
Thanks, Jon Perryman.
________________________________
From: Tony Harminc <[email protected]>
To: [email protected]
Sent: Monday, November 11, 2013 7:07 PM
Subject: Re: Serialization without Enque
On 11 November 2013 20:15, Jon Perryman <[email protected]> wrote:
         L     R2,QUEUE
         L     R3,NEXT_ENTRY
         CS    R2,R3,QUEUE        New queue head
While this seems bullet proof, it's not. If there is a long delay
between the L instructions then next entry could have been freed,
causing an abend.
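Rendered in C (names hypothetical), the fragment above and its window look
like this. Between reading head->next and the CS, another task may pop and
free *head, so the next-pointer load can touch freed storage; worse, if the
storage is recycled at the same address, the compare-and-swap can succeed
wrongly (the classic ABA problem). Illustration only:

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct node { struct node *next; } node;

static node *unsafe_pop(_Atomic(node *) *queue) {
    node *head = atomic_load(queue);       /* L  R2,QUEUE */
    while (head != NULL) {
        /* L  R3,NEXT_ENTRY -- *head may already have been freed here */
        node *next = head->next;
        /* CS R2,R3,QUEUE -- can succeed against a recycled address (ABA) */
        if (atomic_compare_exchange_weak(queue, &head, next))
            return head;
    }
    return NULL;
}
```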
If your code is referencing an item that may have been freed, then
your design is wrong.
While the delay is very unlikely, it is still a possibility (e.g.
swapout after the first "L"). This example is just removing the first entry.
If you try to remove an entry in the middle of the chain, you run a
larger risk.
Some of the techniques we use to minimize risk:
1. Abend recovery
2. don't overload the entry - only contains shared information.
3. Place entry on a timeout queue before placing on a free queue or
freeing.
4. Copy the entry to a workarea - reduces reference time of actual entry
5. Eyecatchers that can be verified.
6. Unique identifier for each entry that can be verified.
7. Turn off interrupts.
Most of these are indicative of incorrect design rather than of
careful attention to risk. If it's unclear whether a queue entry is
eligible to be worked on or even to be looked at unless you do it in
a hurry, then there is something quite wrong.
By all means provide abend recovery, but don't make it part of the
queue handling design.
Tony H.
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN