Thank you.

Sent from my Samsung device.

---- Original message ----
From: Patricia Shanahan <p...@acm.org>
Sent: 05/12/2015 03:02:21 pm
To: dev@river.apache.org
Subject: Re: Trunk merge and thread pools
During the gap between releases, it seems Rat has changed from .jar to executable. It's my bed time, and I have another commitment tomorrow, but I should be able to look at it on Sunday.

On 12/4/2015 8:50 PM, Patricia Shanahan wrote:
> I'll look into it.
>
> On 12/4/2015 8:01 PM, Peter wrote:
>> Yes,
>>
>> There's a tool called Rat; a shell script included in our trunk checks that our license headers etc. are compliant for release. I don't recall the details unfortunately, but I think you download the Rat tool, set an env variable and run the script. That would be very helpful.
>>
>> Thanks,
>>
>> Peter.
>>
>> Sent from my Samsung device.
>> ---- Original message ----
>> From: Patricia Shanahan <p...@acm.org>
>> Sent: 05/12/2015 12:01:04 pm
>> To: dev@river.apache.org
>> Subject: Re: Trunk merge and thread pools
>>
>> Are there any practical things I can do to expedite the release? For example, if you rough draft documentation and/or release notes, I can do some editing.
>>
>> On 12/4/2015 5:08 PM, Peter wrote:
>>> Trunk has now been replaced. The only changes I'll make now are documentation, release notes, build scripts, license header checks and key signing.
>>>
>>> I could use some help as I am time poor.
>>>
>>> Presently I'm marking bugs on Jira as resolved; then I'll generate the release notes.
>>>
>>> This release has been more thoroughly tested than any previous River release.
>>>
>>> After 3 is released, we could really utilise your experience, Patricia, especially with JERI's ConnectionManager and Multiplexer. It uses complex shared and nested locks and supports 128 concurrent connections (shared endpoints) over one Socket or Channel. If I can get contention with 2 CPUs, it's only going to get worse in a real world situation.
>>>
>>> This contention would only affect nodes that share multiple remote objects between them. I suspect Gregg's use case will have multiple connections between node pairs and hit this contention; I also suspect that Greg is likely to only have 1 or 2 connections between nodes. Once two nodes have more than 128 connections (connections are directly proportional to the number of remote objects, server or client, shared between two nodes) another multiplexer will be created, and so on; multiplexers sync on the ConnectionManager's monitor.
>>>
>>> I have a Sun T5240 (128-way, 64GB RAM) with DilOS (an Illumos based distro with Debian package management). Soon I should have a high-speed IPv6 connection courtesy of the NBN. When I do, I'll set you up with a remote login for testing.
>>>
>>> I'll delay further discussion of security until after 3 is released. The changes I propose will have no bearing (won't be in their call stack) on those who aren't concerned about security.
>>>
>>> I'll be grateful for an opportunity to present my security code; perhaps doing so may even dispel some fears.
>>>
>>> Regards,
>>>
>>> Peter.
>>>
>>> Sent from my Samsung device.
>>> ---- Original message ----
>>> From: Patricia Shanahan <p...@acm.org>
>>> Sent: 05/12/2015 01:37:10 am
>>> To: dev@river.apache.org
>>> Subject: Re: Trunk merge and thread pools
>>>
>>> If you have a real world workload that shows contention, we could make serious progress on performance improvement - after 3.0 ships.
>>>
>>> I am not even disagreeing with changes that are only shown to make the tests more effective - after 3.0 ships.
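[Editor's note] The multiplexing pattern Peter describes above - up to 128 sessions shared over one socket, a further multiplexer created once the first is exhausted, and every connect funnelling through the manager's monitor - can be sketched roughly as below. This is a minimal illustration with hypothetical class and method names, not the actual JERI ConnectionManager/Multiplexer code.

// Editor's sketch only; names are invented, this is not River/JERI source.
import java.util.ArrayList;
import java.util.List;

final class MultiplexerSketch {
    static final int MAX_SESSIONS = 128;   // sessions shared over one socket/channel
    private int openSessions;

    // Per-multiplexer lock; refuses new sessions once the 128 limit is reached.
    synchronized boolean tryOpenSession() {
        if (openSessions >= MAX_SESSIONS) {
            return false;
        }
        openSessions++;
        return true;
    }
}

final class ConnectionManagerSketch {
    private final List<MultiplexerSketch> muxes = new ArrayList<>();

    // Every connect acquires the manager's monitor before touching any multiplexer.
    synchronized MultiplexerSketch connect() {
        for (MultiplexerSketch mux : muxes) {
            if (mux.tryOpenSession()) {
                return mux;                 // reuse an existing multiplexer
            }
        }
        MultiplexerSketch fresh = new MultiplexerSketch();
        fresh.tryOpenSession();
        muxes.add(fresh);                   // more than 128 sessions: add another multiplexer
        return fresh;
    }
}

In a structure like this, two nodes sharing many remote objects serialize all connection setup on the manager's single monitor, which is consistent with the contention Peter reports seeing with only 2 CPUs.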
>>> I am unsure about whether Peter is tilting at windmills or showing the optimum future direction for River with his security ideas. I would be happy to discuss the topic - after 3.0 ships.
>>>
>>> River 2.2.2 was released November 18, 2013, over two years ago. There is already a lot of good stuff in 3.0 that should be available to users.
>>>
>>> I have a feeling at this point that we will still be discussing what should be in 3.0 this time next year. In order to get 3.0 out, I believe we need to freeze it. That means two types of changes only until it ships - changes related to organizing the release and fixes for deal-killing regression bugs.
>>>
>>> If I had the right skills and knowledge to finish up the release I would do it. I don't. Ironically, I do know about multiprocessor performance - I was performance architect for the Sun E10000 and Sun Fire 15K. Given a suitable benchmark environment, I would love to work on contention - after 3.0 ships.
>>>
>>> Patricia
>>>
>>> On 12/4/2015 6:19 AM, Gregg Wonderly wrote:
>>>> With a handful of clients, you can ignore contention. My applications have 20s of threads per client making very frequent calls through the service, and this means that 10ms delays evolve into seconds of delay fairly quickly.
>>>>
>>>> I believe that if you can measure the contention with tooling, on your desktop, it is a viable goal to reduce it or eliminate it.
>>>>
>>>> It's like system time vs user time optimizations of old. Now we are contending for processor cores instead of the processor, locked in the kernel, unable to dispatch more network traffic where it is always convenient to bury latency.
>>>>
>>>> Gregg
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Dec 4, 2015, at 9:57 AM, Greg Trasuk <tras...@stratuscom.com> wrote:
>>>>
>>>>>> On Dec 4, 2015, at 1:16 AM, Peter <j...@zeus.net.au> wrote:
>>>>>>
>>>>>> Since ObjectInputStream is a big hotspot, for testing purposes, I merged these changes into my local version of River; my validating ObjectInputStream outperforms the standard Java OIS.
>>>>>>
>>>>>> Then TaskManager, used by the test, became a problem, with tasks in contention up to 30% of the time.
>>>>>>
>>>>>> Next I replaced TaskManager with an ExecutorService (River 3 only uses TaskManager in tests now; it's no longer used by release code), but there was still contention, although not quite as bad.
>>>>>>
>>>>>> Then I noticed that tasks in the test call Thread.yield(), which tends to thrash, so I replaced it with a short sleep of 100ms.
>>>>>>
>>>>>> Now monitor state was a maximum of 5%, much better.
>>>>>>
>>>>>> After these changes, the hotspot consuming 27% CPU was JERI's ConnectionManager.connect, followed by Class.getDeclaredMethod at 15.5%, Socket.accept at 14.4% and Class.newInstance at 10.8%.
>>>>>
>>>>>
>>>>> First - performance optimization: Unless you’re testing with real-life workloads, in real-life-like network environments, you’re wasting your time. In the real world, clients discover services pretty rarely, and real-world architects always make sure that communications time is small compared to processing time. In the real world, remote call latency is controlled by network bandwidth and the speed of light. Running in the integration test environment, you’re seeing processor loads, not network loads.
>>>>> There isn’t any need for this kind of micro-optimization. All you’re doing is delaying shipping, no matter how wonderful you keep telling us it is.
>>>>>
>>>>>
>>>>>> My validating OIS, originating from Apache Harmony, was modified to use explicit constructors during deserialization. This addressed finalizer attacks, final field immutability and input stream validation, and the OIS itself places a limit on downloaded bytes by controlling
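[Editor's note] As a rough illustration of the two test-harness changes Peter describes above - retiring TaskManager in favour of a standard java.util.concurrent ExecutorService, and replacing a Thread.yield() retry loop with a short sleep - here is a minimal, self-contained sketch. The task body and the resourceReady() poll are invented for the example; this is not River test code.

// Editor's sketch only; the polled condition is hypothetical.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class YieldVsSleepSketch {

    // Hypothetical readiness check standing in for whatever condition the test polls.
    private static final AtomicInteger polls = new AtomicInteger();

    private static boolean resourceReady() {
        return polls.incrementAndGet() >= 3;   // pretend the resource appears on the third poll
    }

    public static void main(String[] args) throws InterruptedException {
        // TaskManager's role is taken over by a standard ExecutorService.
        ExecutorService exec = Executors.newCachedThreadPool();

        exec.submit(() -> {
            while (!resourceReady()) {
                // Thread.yield() here tends to thrash under contention;
                // a short sleep releases the CPU for a known interval instead.
                try {
                    Thread.sleep(100);          // 100 ms back-off instead of yield()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            System.out.println("resource ready, task proceeding");
        });

        exec.shutdown();
        exec.awaitTermination(10, TimeUnit.SECONDS);
    }
}

The point of the sleep is simply that a sleeping thread stays off the run queue for a fixed interval, whereas a yielded thread may be rescheduled immediately and keep hammering the contended monitor.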
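[Editor's note] Peter's final paragraph is cut off in the archive, but the general shape of a class-checking, byte-limited ObjectInputStream can be sketched with standard java.io APIs alone. The following is an editor's illustration under assumed limits (a made-up allow-list and a caller-supplied byte cap); it is not Peter's Harmony-derived implementation and does not show how River actually controls the limit.

// Editor's sketch only; the allow-list and cap are invented for illustration.
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.Set;

public class LimitedValidatingOIS extends ObjectInputStream {

    // Hypothetical allow-list; a real one would cover every class the protocol needs.
    private static final Set<String> ALLOWED =
            Set.of("java.lang.String", "java.util.ArrayList", "[Ljava.lang.String;");

    public LimitedValidatingOIS(InputStream in, long maxBytes) throws IOException {
        super(new LimitInputStream(in, maxBytes));
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        // Reject anything outside the allow-list before the class is even loaded.
        if (!ALLOWED.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(), "class not permitted");
        }
        return super.resolveClass(desc);
    }

    /** Caps how many bytes deserialization may pull from the underlying stream. */
    private static final class LimitInputStream extends FilterInputStream {
        private long remaining;

        LimitInputStream(InputStream in, long maxBytes) {
            super(in);
            this.remaining = maxBytes;
        }

        @Override
        public int read() throws IOException {
            ensureBudget();
            int b = super.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            ensureBudget();
            int capped = (int) Math.min(len, Math.min(remaining, Integer.MAX_VALUE));
            int n = super.read(buf, off, capped);
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }

        private void ensureBudget() throws IOException {
            if (remaining <= 0) {
                throw new IOException("deserialization byte limit exceeded");
            }
        }
    }
}

Usage in this sketch would be along the lines of new LimitedValidatingOIS(socket.getInputStream(), 1 << 20).readObject(), with the byte cap chosen per endpoint; the explicit-constructor technique Peter mentions is a separate mechanism not shown here.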