I would say that in addition to just be a fast data structure the data structure must be fast and accommodate synchronous and asynchronous backups, partitions, and transactions.
On Dec 13, 2010, at 5:15 PM, Patricia Shanahan wrote: > You make a good point that I should take a look around to see if I can find > anything suitable that already exists, with appropriate licensing. > > That said, I believe the requirements for an outrigger FastList are: > > 1. Highly scalable performance. In particular, iterating over the list should > be fast even when many threads are doing it. As far as I can tell, any > JavaSpace lookup turns into iterating over a FastList representing items of > acceptable classes, to see if any of the nodes match the other template > requirements. > > 2. A limited set of features: Iterate over the list starting at the head, > remove a node, add at tail. > > The two libraries each go in a different direction. > > Javolution is primarily a single thread speed play, with fast and predictable > performance. It is what I would want if I were writing real time code. It > does have some support for concurrency, but not what is needed for outrigger: > "Fast collections (or maps) can be made thread-safe by marking them > FastCollection#shared shared Having a shared collection (or map) means that > it can be safely used without synchronization. It does not mean that a change > made by a thread is automatically viewed by another thread (that would > require synchronizing)." > > The Apache Commons collections are very feature rich, but seem to be > primarily single thread, with the synchronizing decorator approach to > concurrency. > > I will do some more web searching, but my current strongest contender for an > existing class is java.util.concurrent.ConcurrentLinkedQueue. It's main > limitation is slow remove, but that might be worked around by using a pair of > queues, and periodically cleaning by copying from one queue to another. Also, > java.util.concurrent contains a reference to an important paper, > http://www.cs.rochester.edu/u/michael/PODC96.html, that gives me a starting > point for finding the latest research on these issues. > > Patricia > > > > > On 12/13/2010 4:30 PM, James Grahn wrote: >> Because there've been a few concerns about memory models and concurrent >> data structures, I thought I'd suggest a couple of resources. Library >> usage is preferable to reinventing the wheel, where possible: >> >> First: >> http://javolution.org/ >> >> I haven't used it directly; I've only used a library which used it. But >> it perhaps deserves a scan to see if it would meet our requirements (and >> testing to ensure it's acceptable if it claims to). >> >> Also, it's BSD-licensed. I believe that's acceptable? >> >> Second: >> http://commons.apache.org/collections/index.html >> >> I don't *think* it has what we need, but it's worth poking around in >> before creating a new datastructure. Has the advantage of being part of >> the Apache family. >> >> jamesG >> >> On 12/13/2010 5:24 PM, Peter Firmstone wrote: >>> Gregg Wonderly wrote: >>>> On 12/13/2010 12:34 PM, Gregg Wonderly wrote: >>>>> This does fail fairly quickly (immediately) on my windows laptop. >>>>> >>>>> I am not sure that I have time to really look over this code. I >>>>> wonder if anyone >>>>> knows if this is relatively new code that John put together as part >>>>> of the >>>>> effort to remove the use of PSE from outrigger, or is the original >>>>> "non-persistent" javaspaces implementation? >>>>> >>>>> Perhaps we need to do something different here, a segmented list for >>>>> example, >>>>> which is what PSE did with it's Vector implementation so that >>>>> segments of the >>>>> list could be locked independently, as well as allowing the segments >>>>> to be >>>>> "deleted" from disk once they were "empty". >>>> >>>> And, of course if we pull out outrigger, as an application/service, >>>> separate from the Jini part of river, we could just say that Outrigger >>>> requires JDK1.5 so that we can move to a different concurrency >>>> implementation if that is needed, using the new memory model. >>> >>> It seems like a lot of work to fix a package private implementation, >>> already based on flawed assumptions (Patricia's done a great job >>> debugging this, she's a real asset, I look forward to learning more >>> debugging tips). I suspect we'll get a much better result starting from >>> scratch, utilising Java 6. >>> >>> Since River is a Jini platform, why don't we start by creating an >>> independent implementation of outrigger utilising any latest available >>> java features. Not only will this produce a better implementation, that >>> will be easier to support, but it might improve our understanding of >>> what's required for a modular build as well. >>> >>> The existing outrigger implementation can remain as it is, but >>> deprecated, left in place for legacy support. >>> >>> I've got some concurrent utilities in pepe you may pinch / improve if >>> you like: >>> >>> org.apache.river.impl.util.* >>> >>> ConcurrentCollections >>> ConcurrentSoftMap >>> ConcurrentWeakIdentityMap >>> ConcurrentWeakMap >>> >>> ConcurrentCollections is a multi read single write lock based collection >>> wrapper. It defensively copies for Iterators but still allows performing >>> removals from the underlying collection. >>> >>> The ConcurrentMap's are based on ConcurrentHashMap, using a >>> ReferenceQueue to remove stale entries prior to every method call. >>> Cheers, >>> >>> Peter. >>> >>>> >>>> One of the things I did in griddle, was define a RemoteIterator which >>>> allows a "get" operation to have a return value before anything is in >>>> the space. A server side thread, then looks through the space for >>>> matches and adds them to the return queue for the calling "get". This >>>> allows server side contention to be managed to some degree because the >>>> number of "searching" threads could be held to an appropriate minimum >>>> (even one). The javaspaces API doesn't disallow such proxy >>>> implementation and JavaSpaces05 iterator support starts to expose this >>>> kind of thing more literally. >>>> >>>> Ultimately, I think a segmented list that looks like a >>>> List<List<Entry>> would be a way to distribute locking for "get" >>>> because iteration would be less likely to be on the same segment at >>>> the same time. >>>> >>>> Insertion at the tail, is always a contentious issue for concurrent >>>> lists. Sometimes you can just use a small ConcurrentHashMap to perform >>>> adds until it gets to a particular size, and then turn its contents >>>> into a List and add that list to the tail of the List<List<Entry>>. >>>> You can choose then to decide when to do that movement by watching for >>>> the other lists to be empty too, or when a traversal gets to the end >>>> of what is in a list already. >>>> >>>> This keeps writers quite free to insert quickly and completely away >>>> from the readers as well. Ordering presents most of the opportunities >>>> for big time contention. Unordered (map), or more segmented (map of >>>> list) construction relieves some of the contention if course, by >>>> distributing it. >>>> >>>> Gregg >>>> >>>>> Gregg >>>>> >>>>> On 12/13/2010 12:16 AM, Patricia Shanahan wrote: >>>>>> On 12/12/2010 5:48 PM, Peter Firmstone wrote: >>>>>>> Patricia Shanahan wrote: >>>>>>>> On 12/3/2010 7:15 AM, Gregg Wonderly wrote: >>>>>>>> ... >>>>>>>> > The important issue in FastList is that it was written with the >>>>>>>> JDK1.4 >>>>>>>> > memory model. After moving River to Java 1.5, we'd have the >>>>>>>> JSR166 work >>>>>>>> > and the new, consistent memory model where volatile has a true >>>>>>>> meaning. >>>>>>>> > However, this code in particular is quite complex as you have >>>>>>>> noted, so >>>>>>>> > even adjusting to the new memory model could be problematic. >>>>>>>> >>>>>>>> I've just run a modified, simplified version of my test with java >>>>>>>> version "1.4.2_19" and an unmodified copy of FastList, and I still >>>>>>>> get >>>>>>>> the NullPointerException. This changes my thinking a bit. I had been >>>>>>>> working from the assumption that the issue was to do with the >>>>>>>> changes >>>>>>>> in memory model between 1.4 and 1.5. I now have to consider the >>>>>>>> possibility of a more basic bug that is independent of the memory >>>>>>>> model. >>>>>>>> >>>>>>>> If there is anyone with a FastList or Java memory model background >>>>>>>> who >>>>>>>> would like to help, please reply. I would welcome another set of >>>>>>>> eyes >>>>>>>> on the code, and a cross check on my conclusions so far about how >>>>>>>> FastList is supposed to work. There seems to be a critical invariant >>>>>>>> that gets broken, and once that happens we are on track to either a >>>>>>>> NullPointerException or dropped items. >>>>>>>> >>>>>>>> I can supply my test as a unit test (JDK 1.6, Junit 4) and as a main >>>>>>>> program (JDK 1.4 or later0. In both forms, all it does is fire up a >>>>>>>> mixture of threads that repeatedly add items to a FastList and >>>>>>>> threads >>>>>>>> that repeatedly remove the first item they can from the FastList. >>>>>>>> Failures seem to require simultaneous adds and removes. >>>>>>>> >>>>>>>> If I don't nail this problem fairly soon, I may abandon the current, >>>>>>>> rather complicated, code and switch to writing a concurrent high >>>>>>>> performance FastList substitute for 1.5 or later. >>>>>>>> >>>>>>>> Patricia >>>>>>>> >>>>>>> I'll have a look tonight, no promises though ;) >>>>>> >>>>>> I'm attaching the simplified test application main program that can >>>>>> run, and >>>>>> fail, under JRE 1.4, with no need for Junit. >>>>>> >>>>>> Here's what I think I know. First of all, I have found some dubious >>>>>> synchronization situations. However, fixing all the things I have so >>>>>> far found >>>>>> of that type has only reduced the failure rate, not eliminated >>>>>> failures. That >>>>>> could be caused by changing timings without having any impact on the >>>>>> root cause. >>>>>> >>>>>> The key invariant relates to a thread that is doing a scan, starting >>>>>> with a call >>>>>> to head() and proceeding through a series of calls to next() to >>>>>> examine nodes. >>>>>> The head() call sets up a guard node for the thread that was the >>>>>> tail at some >>>>>> point during the head call. The invariant is that the series of >>>>>> next() calls >>>>>> will reach the guard node before finding a null next pointer, >>>>>> indicating the >>>>>> actual tail. >>>>>> >>>>>> The remove call does not really remove anything, it merely marks the >>>>>> node >>>>>> removed. Removed nodes are unlinked as a side effect of scans, >>>>>> during head and >>>>>> next calls, but only if they are not guard nodes. >>>>>> >>>>>> There are additional complications in the restart and reap methods, >>>>>> but we can >>>>>> ignore them for now - my test does not use them. >>>>>> >>>>>> Once a guard node is lost, the synchronization breaks down >>>>>> completely, because >>>>>> e.g. insertion at tail is protected by synchronization on the >>>>>> FastList instance, >>>>>> but unlinking of a removed node in the middle is protected, to the >>>>>> extent it is >>>>>> protected at all, by synchronization on the FastList.Node instance >>>>>> that is being >>>>>> removed. >>>>>> >>>>>> The commonest failure symptom is a scan reaching the null next >>>>>> pointer at the >>>>>> end of the FIFO during head(), without first finding the guard node >>>>>> it just set >>>>>> up. An alternative form of failure is loss of some entries - they >>>>>> get added, but >>>>>> the remove threads never find them. The second symptom is >>>>>> predominant in the >>>>>> JavaSpaces stress test that got me started on this. Messing up a >>>>>> next pointer >>>>>> could cause either. >>>>>> >>>>>> Incidentally, I'm curious about why it has such a fragile system in >>>>>> which the >>>>>> state of a scan is partly tracked by thread, when it seems like an >>>>>> obvious >>>>>> candidate for the Iterator pattern. Callers do need to be able to >>>>>> find out if a >>>>>> remove call succeeded or not (the node may have been removed by >>>>>> another thread), >>>>>> but that could be done in an interface extending Iterator. The >>>>>> WeakHashMap in a >>>>>> node that keeps track of the threads for which it is a guard would >>>>>> instead track >>>>>> the Iterator. There would be no need for thread local storage, the >>>>>> same data >>>>>> could be kept in the Iterator. >>>>>> >>>>>> Thanks for any time you can spend looking at this. >>>>>> >>>>>> Patricia >>>>>> >>>>>> >>>>> >>>>> >>>> >>>> >>> >>> >> > Michael McGrady Chief Architect Topia Technology, Inc. Cel 1.253.720.3365 Work 1.253.572.9712 extension 2037 mmcgr...@topiatechnology.com