On Dec 4, 2015, at 1:16 AM, Peter <j...@zeus.net.au> wrote:
Since ObjectInputStream is a big hotspot, for testing purposes I merged these
changes into my local version of River; my validating ObjectInputStream
outperforms the standard Java ObjectInputStream.
Then TaskManager, used by the test, became a problem, with tasks in contention
up to 30% of the time.
Next I replaced TaskManager with an ExecutorService (River 3 only uses
TaskManager in tests now; it's no longer used by release code), but there was
still contention, although not quite as bad.
Then I noticed that tasks in the test call Thread.yield(), which tends to
thrash, so I replaced it with a short sleep of 100ms.
Now the monitor state was at most 5%, which is much better.
After these changes, the hotspot consuming 27% CPU was JERI's
ConnectionManager.connect, followed by Class.getDeclaredMethod at 15.5%,
Socket.accept at 14.4% and Class.newInstance at 10.8%.
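To give a sense of the yield-versus-sleep change, here is a minimal sketch
(the actual test code differs; the loop and the done flag below are
placeholders):

    import java.util.concurrent.atomic.AtomicBoolean;

    class YieldVsSleep {
        static final AtomicBoolean done = new AtomicBoolean();

        // Before: busy-waiting with yield, which tends to thrash under contention.
        static void spinWait() {
            while (!done.get()) {
                Thread.yield();
            }
        }

        // After: a short sleep releases the CPU for a bounded period.
        static void sleepWait() throws InterruptedException {
            while (!done.get()) {
                Thread.sleep(100); // 100 ms
            }
        }
    }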
First - performance optimization: Unless you’re testing with real-life
workloads, in real-life-like network environments, you’re wasting your time. In
the real world, clients discover services pretty rarely, and real-world
architects always make sure that communications time is small compared to
processing time. In the real world, remote call latency is controlled by
network bandwidth and the speed of light. Running in the integration test
environment, you’re seeing processor loads, not network loads. There isn’t any
need for this kind of micro-optimization. All you’re doing is delaying
shipping, no matter how wonderful you keep telling us it is.
My validating OIS, originating from Apache Harmony, was modified to use
explicit constructors during deserialization. This addressed finalizer
attacks, final field immutability and input stream validation, and the OIS
itself places a limit on downloaded bytes by controlling array creation and
expecting a stream reset every so often; if it doesn't receive one, it throws
an exception. The reset ensures the receiver regains control of the stream
before any DoS can occur, even for long-running connections that transfer a
lot of data. There is no support for circular links in object graphs at this
stage.
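The idea of capping array sizes, nesting depth and bytes read between
checkpoints can be sketched, for comparison, with the ObjectInputFilter that
OpenJDK later added in JEP 290; the limits below are arbitrary examples, not
the ones my implementation uses:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.ObjectInputFilter;
    import java.io.ObjectInputStream;

    class LimitedDeserialization {
        // Rejects streams that allocate huge arrays, nest too deeply, or
        // transfer more bytes than expected before the receiver regains control.
        static ObjectInputStream limited(InputStream raw) throws IOException {
            ObjectInputStream in = new ObjectInputStream(raw);
            in.setObjectInputFilter(info -> {
                if (info.arrayLength() > 10_000)  return ObjectInputFilter.Status.REJECTED;
                if (info.depth() > 20)            return ObjectInputFilter.Status.REJECTED;
                if (info.streamBytes() > 1 << 20) return ObjectInputFilter.Status.REJECTED;
                return ObjectInputFilter.Status.UNDECIDED;
            });
            return in;
        }
    }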
The deserialization constructor accepts a parameter that's caller sensitive, so
each class in an object's inheritance hierarchy has its own get-field
namespace.
A child class can validate its parent class's invariants by creating a
superclass-only instance, calling its constructor and passing the
caller-sensitive parameter (it cannot create this itself; it's created by the
OIS). Once the class has validated all invariants, including the superclass's,
the child class knows it's safe to proceed with construction; otherwise it
throws an exception, no instance is created, and deserialization is terminated
there.
Validation is performed inside static methods, prior to an object instance
being created.
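The shape of the pattern is roughly as follows (all names here are
illustrative placeholders, not the actual API; 'ReadContext' stands in for the
caller-sensitive parameter created by the OIS):

    import java.io.InvalidObjectException;

    interface ReadContext {            // placeholder for the caller-sensitive parameter
        int getInt(String field) throws InvalidObjectException;
    }

    class Parent {
        Parent(ReadContext context) throws InvalidObjectException {
            // the parent checks its own invariants in the same static-method style
        }
    }

    class Child extends Parent {
        private final int count;

        // All invariant checks happen in a static method, before any Child
        // instance exists; failure terminates deserialization.
        private static int validate(ReadContext context) throws InvalidObjectException {
            int count = context.getInt("count");
            if (count < 0) {
                throw new InvalidObjectException("count must be non-negative");
            }
            return count;
        }

        // Deserialization constructor: validate(context) runs before the
        // superclass constructor is invoked.
        Child(ReadContext context) throws InvalidObjectException {
            this(context, validate(context));
        }

        private Child(ReadContext context, int validatedCount) throws InvalidObjectException {
            super(context);
            this.count = validatedCount;
        }
    }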
The OpenJDK team have adopted static method validation, but not the
constructor. Unfortunately, that alone wouldn't provide the level of security
required for an internet-visible service or registrar.
The constructor and validation are very simple to implement.
This would allow an internet-based registrar to safely download only the
bootstrap proxy to clients, so the client can authenticate it before
downloading Entry objects for local filtering or ServiceUI, followed by the
smart proxy. This would be performed by proxy preparation, so clients using
SDM wouldn't require modification, and the changes would be backward
compatible.
This would provide both security and delayed unmarshalling along with a
significant increase in performance, as clients only ever download what they
need and permit.
The validating OIS was designed for bootstrap proxies, the lookup service and
unicast discovery, but it can also be implemented by services, as I have
done. A new constraint determines whether validating or standard
serialization is used.
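For example, a client could require it through the usual constraint
mechanism; 'AtomicValidation' below is an invented name, not part of the
released River API:

    import net.jini.constraint.BasicMethodConstraints;
    import net.jini.core.constraint.InvocationConstraint;
    import net.jini.core.constraint.InvocationConstraints;
    import net.jini.core.constraint.MethodConstraints;

    // Hypothetical marker constraint requesting validating deserialization.
    final class AtomicValidation implements InvocationConstraint {
        static final AtomicValidation YES = new AtomicValidation();
        private AtomicValidation() {}
    }

    class ValidationConstraintExample {
        // Require validating deserialization on every remote method of a proxy.
        static MethodConstraints requireValidation() {
            return new BasicMethodConstraints(
                    new InvocationConstraints(AtomicValidation.YES, null));
        }
    }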
Since we can now take advantage of interface default methods, we have an
opportunity to alter the lookup service interface while leaving existing
methods as they are, implementing delayed unmarshalling and authenticating
smart proxies prior to unmarshalling them, should we wish.
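A rough sketch of the shape such a change could take, with invented names
throughout (only to show that existing methods can stay untouched; this is
not a proposal for the actual signatures):

    import java.rmi.RemoteException;
    import net.jini.core.lookup.ServiceMatches;
    import net.jini.core.lookup.ServiceRegistrar;
    import net.jini.core.lookup.ServiceTemplate;

    // Invented placeholder wrapper for results that haven't been unmarshalled yet.
    final class MarshalledMatches {
        final ServiceMatches eagerlyUnmarshalled;
        private MarshalledMatches(ServiceMatches m) { this.eagerlyUnmarshalled = m; }
        static MarshalledMatches wrap(ServiceMatches m) { return new MarshalledMatches(m); }
    }

    interface DelayedUnmarshallingRegistrar extends ServiceRegistrar {

        // New method: results come back still marshalled so the client can
        // authenticate the bootstrap proxy and filter Entry objects before
        // unmarshalling the smart proxy. The default implementation falls back
        // to the existing eager behaviour, so current registrars keep working.
        default MarshalledMatches lookupDelayed(ServiceTemplate tmpl, int maxMatches)
                throws RemoteException {
            return MarshalledMatches.wrap(lookup(tmpl, maxMatches));
        }
    }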
Second - River on the Internet: We’ve talked about this as a community. River
on the Internet is your pet, not the community’s interest. Please stop trying
to make River on the Internet happen. It’s not going to happen.
If you want to indulge your obsession with making River safe for untrusted
code, call it something else and go do it on Github, or go to the Incubator and
build a new community. This community has talked about it and said it wasn’t
interested.
Third - You’ve been messing around with “qa-refactor” for what, 4 years now?
You keep sending out long emails telling us how great your local copy of River
is, and how great ‘qa-refactor’ is. But so far, it’s an exercise in
self-indulgence. You complain endlessly about us old Jini types blocking
progress, but you refuse to let your work be finished. It isn’t real until it
ships and someone uses it. If you want it to be real, you need to quit
screwing around with it, and ship something.
Recently, you posted:
If I had time, I'd do the following; I may be able to assist with some tasks:
1. Move trunk to trunk-old.
2. Copy Dennis' branch of qa-refactor-rename to trunk.
3. Copy new classes in RIVER-413 from trunk-old, and change all
instances of InetAddress.getLocalHost() to point to
LocalHostLookup.getLocalHost().
4. Go through all resolved issues in JIRA and mark as resolved
(you'll need my help).
5. Generate the release notes (summary of all resolved JIRA issues).
6. Add the generated release notes to the release documentation.
7. Check that River version has been updated to 3.0 (I think this has
been done, but check again anyway). Don't confuse the River
version with the Jini standards versions on documents.
8. Run the tests one last time, change the qa tests on Hudson to
point to trunk.
9. Build the release artifacts.
10. Sign the release artifacts.
11. Post the release artifacts for voting.
12. Vote on release artifacts.
13. Release River 3.0.
That seemed like a reasonable suggestion. Why not follow your own plan?
Sorry to be harsh, but I think it’s time somebody said it.
Regards,
Greg Trasuk
Regards,
Peter.
Sent from my Samsung device.
---- Original message ----
From: Bryan Thompson <br...@systap.com>
Sent: 03/12/2015 11:48:12 am
To: <dev@river.apache.org> <dev@river.apache.org>
Subject: Re: Trunk merge and thread pools
Great!
----
Bryan Thompson
Chief Scientist & Founder
SYSTAP, LLC
4501 Tower Road
Greensboro, NC 27410
br...@systap.com
http://blazegraph.com
http://blog.blazegraph.com
On Wed, Dec 2, 2015 at 7:26 PM, Peter <j...@zeus.net.au> wrote:
Just tried wrapping Executors.newCachedThreadPool with a thread factory
that creates threads as per the original
org.apache.river.thread.NewThreadAction.
Performance is much improved; the hotspot is gone.
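Roughly what that looks like (a sketch only; NewThreadAction's handling of
thread groups and context class loaders isn't reproduced, and the daemon
status here is an assumption):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadFactory;
    import java.util.concurrent.atomic.AtomicInteger;

    // Approximation of org.apache.river.thread.NewThreadAction: named threads
    // handed to a cached pool, so idle threads are reused rather than discarded.
    class RiverThreadFactory implements ThreadFactory {
        private final AtomicInteger count = new AtomicInteger();
        private final String prefix;

        RiverThreadFactory(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, prefix + "-" + count.incrementAndGet());
            t.setDaemon(true); // assumption; the real NewThreadAction may differ
            return t;
        }
    }

    class PoolFactory {
        static ExecutorService newJeriPool() {
            return Executors.newCachedThreadPool(new RiverThreadFactory("JERI"));
        }
    }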
There are regression tests with Sun bug IDs which cause OOME. I thought this
might prevent the executor from running, but to my surprise both tests pass.
These tests failed when I didn't pool threads and just let them be GC'd.
These tests created over 11,000 threads with waiting tasks. In practice I
wouldn't expect that to happen, as an IOException should be thrown. However,
there are Sun bug IDs 6313626 and 6304782 for these regression tests; if
anyone has a record of these bugs or any information they can share, it
would be much appreciated.
It's worth noting that the JVM memory options should be tuned properly to
avoid OOME in any case.
The lesson here is that creating threads and letting them be GC'd is much
faster than thread pooling if your thread pool is not well optimised.
It's worth noting that ObjectInputStream is now the hotspot for the test;
the tested code's hotspots are DatagramSocket and SocketInputStream.
Class loading is thread confined; there's a lot of class loading going on,
but because it is uncontended, it only consumes 0.2% CPU, about the same as
our security architecture overhead (non-encrypted).
Regards,
Peter.
Sent from my Samsung device.
---- Original message ----
From: Bryan Thompson <br...@systap.com>
Sent: 02/12/2015 11:25:03 pm
To: <dev@river.apache.org> <dev@river.apache.org>
Subject: Re: Trunk merge and thread pools
Ah. I did not realize that we were discussing a River-specific ThreadPool
vs the java.util.concurrent ThreadPoolExecutor. I assume that it would
be difficult to just substitute in one of the standard executors?
Bryan
On Wed, Dec 2, 2015 at 8:18 AM, Peter <j...@zeus.net.au> wrote:
> First, it's worth considering that we have a very suboptimal thread pool. There
> are qa and jtreg tests that limit our ability to do much with ThreadPool.
>
> There are only two instances of ThreadPool, shared by various JERI
> endpoint implementations and other components.
>
> The implementation is allowed to create numerous threads, limited only by
> available memory and OOME. At least two tests cause it to create over
> 11,000 threads.
>
> Also, it previously used a LinkedList queue but now uses a
> BlockingQueue; however, the workers still use poll, not take.
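> Roughly the difference in a worker loop (a sketch, not the actual ThreadPool
> code; the 15-second idle timeout is an arbitrary example):
>
>     import java.util.concurrent.BlockingQueue;
>     import java.util.concurrent.TimeUnit;
>
>     class Worker implements Runnable {
>         private final BlockingQueue<Runnable> tasks;
>
>         Worker(BlockingQueue<Runnable> tasks) { this.tasks = tasks; }
>
>         @Override
>         public void run() {
>             try {
>                 while (true) {
>                     // poll: returns null after the idle timeout, so the worker
>                     // exits and a fresh thread must be created for later tasks.
>                     Runnable task = tasks.poll(15, TimeUnit.SECONDS);
>                     if (task == null) return;
>                     // take() would instead block indefinitely, keeping the
>                     // worker alive.
>                     task.run();
>                 }
>             } catch (InterruptedException e) {
>                 Thread.currentThread().interrupt();
>             }
>         }
>     }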
>
> The limitation seems to stem from the original developers' concern that
> there may be interdependencies between tasks. Most tasks are method
> invocations from incoming and outgoing remote calls.
>
> It probably warrants further investigation to see if there's a suitable
> replacement.
>
> Regards,
>
> Peter.
>
> Sent from my Samsung device.
> ---- Original message ----
> From: Bryan Thompson <br...@systap.com>
> Sent: 02/12/2015 09:46:13 am
> To: <dev@river.apache.org> <dev@river.apache.org>
> Subject: Re: Trunk merge and thread pools
>
> Peter,
>
> It might be worth taking this observation about the thread pool behavior to
> the java concurrency list. See what feedback you get. I would certainly
> be interested in what people there have to say about this.
>
> Bryan
>
>
>