Re: Don't let Jini Standards become an impediment to development

Peter Wed, 09 Sep 2015 02:01:09 -0700

Greg,

Thanks for putting the time & effort to look up these links.

Primitive fields are an interesting case, I guess they had to bedisallowed, because the serial form of an int is different from anInteger, at least for java Serialization.

The performance trade off I mentioned relates to the time it takes forreflection to set each field in an Entry and when defensive copying isused, instead of pass by reference.

ServiceDiscoveryManager is a good example of complexity, even after 15years, it still contains bugs because its design is too complex (evenI'm not 100% confident I've fixed them all).


Everything should be made as simple as possible, but not simpler.

Ken Arnold made Entry's simple for this reason, but in those days, thejvm still had issues with its memory model, final fields didn't have thethread safety guarantee they have now and most computers only had one cpu.


https://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#finalWrong

Allowing Entry fields to be final, doesn't significantly increasecomplexity, in fact, in can assist in reducing complexity in concurrentcode.


I'm still convinced of the benefits of allowing final fields in Entry's :)

How about a compromise?

Leave existing Entry implementations as is, but allow final fields innew Entry's?

Then in River code, we treat Entry's as immutable, regardes of whetherthey are or not?

This way, existing code won't break from final modifiers, but new codecan benefit significantly.


I'm not aware of any Entry using final fields at present.

Regards,

Peter.



On 9/09/2015 3:55 PM, Greg Trasuk wrote:

On Sep 9, 2015, at 12:30 AM, Peter<[email protected]>  wrote:

Thanks Greg,

Was it a case of; because we can't set final fields (well not without a 
Permission anyway), that they shouldn't be included in Entry serialized state, 
because then we can't deserialize them?

No, it’s just the definition of “what’s in the schema”.  A class in the normal 
sense is a template for objects that are going to be instantiated and used as 
objects.  A class that implements Entry is not really doing the same thing.  
It’s really a schema for a set of values that can be stored or used in a 
matching operation.  Nothing more, nothing less.  Dynamic operations in 
matching are a recipe for undefined behaviour (e.g. changing a value that’s 
used as a key in a hash map), so the Entry spec says “don’t do that”, by 
requiring that the matchable fields exist as fields, not calculated properties 
or anything else.  Using getter or setter methods would  provide an abstraction 
from the storage, but an Entry is fundamentally a storage structure, so it 
outlaws the abstraction. (Ken Arnold explained this nicely in 
http://www.artima.com/intv/sway2.html).

Also, a null value in a field is a “wildcard”, hence the statement that the 
fields are serialized individually, so field-by-field comparisons are possible. 
 Notice that if you look at the usage, the class that implements Entry is never 
directly serialized - both Reggie and Outrigger create a representation object 
that has an array of serialized fields.  They are treating the Entry class as a 
schema definition.  The Entry instance itself is like a record formatted 
according to that schema.

Now, even though we’re not really going to use it as a traditional class, 
certain features of classes might still be useful.  Like, it might be useful to 
have constants for frequently-used values.  So you can still have static or 
non-static final fields (I can’t think of a good reason to have non-static 
final fields in an Entry class, but I suppose there might be a use case).  But 
any matching or storage operation only applies to fields defined in the 
“schema” - by definition, the public, non-static, non-final fields.

I've done my best to fix the existing implementations, so hopefully they won't 
need further fixes, however, the fixes were very difficult and these 
implementations very difficult to reason about, because there is so much 
mutable state.  In ServiceDiscoveryManager, a thread holds a lock while waiting 
for the result of a remote call, there was no solution I could find to remove 
this lock.

To quote Keith Edwards "The Special Semantics of Attributes":

   "All the methods of the object are ignored for purposes of
   searching, as are "special" data fields: static, transient,
   non-public, or final fields.  Likewise all fields that are primitive
   types (such as ints and booleans) are ignored; only references to
   other objects within an attribute are considered for searching.”

I’ll confess, it’s always struck me as odd that primitive fields are ignored, 
and it’s certainly bit me once or twice.  But I understand that it’s designed 
so that all the “matchable” fields are objects, so can be stored and compared 
in a uniform way (typically as MarshalledObjects).  Since Java5 added 
auto-boxing/unboxing, there’s no real coding overhead for the developer.

So our choices are (for River 4.0):

  1. Break backward compatibility and increase scalability, performance
     and reduce bugs, by not ignoring final fields in Entry's, but
     instead mandating them.

No.  That’s something else.  That’s not an Entry.  Such an object might be 
useful for implementations, but that’s not an Entry.

An Entry is basically a local data structure that I can modify freely.   The 
value is crystallized when I send the Entry somewhere.  Like to a JavaSpace.   
If I’m using an Entry to store things locally where I have concurrent access to 
the entry fields, that’s my problem, not the Entry’s.

  2. Or continue full compatibility and live with lower performance,
     less scalability and harder to debug code.

If that tradeoff is true, you’re misusing or misunderstanding Entries.

Not to mention, if you speed up Lookups, so what?  They don’t happen that often 
- only on startup of a client or failure of a service, in most cases outside of 
the test framework.  JavaSpaces are a different issue, but again, without 
knowing the use case of the JavaSpace (messaging, persistence, read-mostly, 
write-mostly, who knows?), it’s hard to do any reasoning about performance.

Java’s default serialization mechanism is slow?  Well, I don’t know that for 
sure for a given use case, but OK, in both the Registrar and the JavaSpaces 
case, the client receives a proxy.  The current proxy implementations use JERI, 
but that’s an implementation detail.  The proxy is free to implement a 
different serialization or communication mechanism.  Don’t like reflection?  I 
guess you could use ASM or BCEL to create a marshaller object for each Entry 
type that a given JavaSpace implementation wants to support.  You could 
probably even generate them dynamically.  For that matter, you could precompile 
Google Protocol Buffers for the most frequent Entry types.  Is it worth the 
effort?  I don’t know, but it’s technically feasible.  Jini’s dynamic code 
approach guarantees that it’s possible.

I think there's plenty of time for implementations to prepare for River 4.0, if 
we start talking about it now.

Regards,

Peter.

How are these for code comments (from ServiceDiscoveryManager)?

“Those who enjoy laws or sausages shouldn’t see either being made” - misquoting 
Bismarck or Saxe.  Would you rather the original developer didn’t document 
areas that she thinks might eventually cause trouble?

What can I tell you?  Concurrency is hard.  Distributed programming is hard.  
Hate to say it, but in a distributed scenario, it might not be realistic to 
expect deterministic behaviour out of a system.  Jini is all about embracing 
failure and indeterminacy. (Ken Arnold again - 
http://www.artima.com/intv/distrib.html)

                // Don't like the fact that we're calling foreign code while
                // holding an object lock, however holding this lock doesn't
                // provide an opportunity for DOS as the lock only relates to a 
specific
                // ServiceRegistrar and doesn't interact with client code.
                matches = proxy.lookup(tmpl, Integer.MAX_VALUE);

   /* The cache must be created inside the listener sync block,

             * otherwise a race condition can occur. This is because the
             * creation of a cache results in event registration which
             * will ultimately result in the invocation of the serviceAdded()
             * method in the cache's listener, and the interruption of any
             * objects waiting on the cache's listener. If the notifications
             * happen to occur before commencing the wait on the listener
             * object (see below), then the wait will never be interrupted
             * because the interrupts were sent before the wait() method
             * was invoked. Synchronizing on the listener and the listener's
             * serviceAdded() method, and creating the cache only after the
             * lock has been acquired, together will prevent this situation
             * since event registration cannot occur until the cache is
             * created, and the lock that allows entry into the serviceAdded()
             * method (which is invoked once the events do arrive) is not
             * released until the wait() method is invoked .
             */

        /**
         * With respect to a given service (referenced by the parameter
         * newItem), if either an event has been received from the given lookup
         * service (referenced by the proxy parameter), or a snapshot of the
         * given lookup service's state has been retrieved, this method
         * determines whether the service's attributes have changed, or whether
         * a new version of the service has been registered. After the
         * appropriate determination has been made, this method applies the
         * filter associated with the current cache and sends the appropriate
         * local ServiceDiscoveryEvent(s).
         *
         * This method is called under the following conditions: - when a new
         * lookup service is discovered, this method will be called for each
         * previously discovered service - when a gap in the events from a
         * previously discovered lookup service is discovered, this method will
         * be called for each previously discovered service - when a MATCH_MATCH
         * event is received, this method will be called for each previously
         * discovered service - when a NOMATCH_MATCH event is received, this
         * method will be called for each previously discovered service Note
         * that this method is never called when a MATCH_NOMATCH event is
         * received; such an event is always handled by the handleMatchNoMatch
         * method.
         *
         * When this method is called, it may send one of the following events
         * or combination of events: - a service changed event - a service
         * removed event followed by a service added event - a service removed
         * event
         *
         * A service removed event is sent when the service either fails the
         * filter, or the filter produces an indefinite result; in which case,
         * the service is also discarded.
         *
         * A service changed event is sent when the service passes the filter,
         * and it is determined that the service's attributes have changed. In
         * this case, the old and new service proxies are treated as the same if
         * one of the following conditions is met: - this method was called
         * because of the receipt of a MATCH_MATCH event - the old and new
         * service proxies are byte-wise fully equal (Note that the lookup
         * service specification guarantees that the proxies are the same when a
         * MATCH_MATCH event is received.)
         *
         * A service removed event followed by a service added event is sent
         * when the service passes the filter, and the conditions for which a
         * service changed event would be considered are not met; that is, this
         * method was not called because of the receipt of a MATCH_MATCH event;
         * or the old and new service proxies are not byte-wise fully equal.
         *
         * The if-else-block contained in this method implements the logic just
         * described. The parameter matchMatchEvent reflects the pertinent event
         * state that causes this method to be called. That is, either a
         * MATCH_MATCH event was received, or it wasn't, (and if it wasn't, then
         * a full byte-wise comparison is performed to determine whether the
         * proxies are still the same).
         *
         * To understand when the 'else' part of the if-else-block is executed,
         * consider the following conditions: - there is more than one lookup
         * service with which the service registers (ex. LUS-0 and LUS-1) -
         * after the service registers with LUS-0, a NOMATCH_MATCH event is
         * received and handled (so the service is now known to the cache) -
         * before the service registers with LUS-1, the service is replaced with
         * a new version - the NOMATCH_MATCH event resulting from the service's
         * registration with LUS-1 is received BEFORE receiving the
         * MATCH_NOMATCH/NOMATCH_MATCH event sequence that will ultimately
         * result from the re-registration of that new version with LUS-0 When
         * the above conditions occur, the NOMATCH_MATCH event that resulted
         * from the service's registration with LUS-1 will cause this method to
         * be invoked and the proxies to be fully compared (because the event
         * was not a MATCH_MATCH event); and since the old service proxy and the
         * new service proxy will not be fully equal, the else part of the
         * if-else-block will be executed.
         *
         * This method applies the filter only after the above comparisons and
         * determinations have been completed.
         */

On 9/09/2015 1:40 PM, Greg Trasuk wrote:

On Sep 8, 2015, at 10:40 PM, Peter<[email protected]>   wrote:

On 8/09/2015 11:26 PM, Greg Trasuk wrote:

That’s the current state.  Changing (e.g. by enforcing a builder pattern or 
something) would add unneeded complexity for the user if you ask me.

Yes, I agree, a constructor is suitable, keep it simple.

In fact, the fact that the Entry fields are non-final is the way Jini knows it’s a field 
and not a constant.  i.e. in the AbstractEntry docs, it specifically says "The entry 
fields of an Entry are its public, non-primitive, non-static, non-transient, non-final 
fields."

There's no code that inspects the field and checks whether it's final, not 
within River that I'm aware of, feel free to show me where if I'm wrong :).

com.sun.jini.outrigger.EntryRep.  Look for ‘Modifier.FINAL’ in the 
‘usableField(…)’ method.  There are 12 other uses of ‘FINAL’ in River 2.2.2.  
Not to mention any external product (Blitz, Rio, who knows what) that are 
designed around the Entry specification.

The EntryRep you posted below is com.sun.jini.reggie.EntryRep.  You’ll notice 
that it calls com.sun.jini.reggie.ClassMapper, which also filters final fields 
out of the comparisons and mappings.

  The reason these fields are non final, is so they can be set with reflection, 
this is a detail of the Entry specification we should look at changing.

The reason they’re non-final is so that services like Reggie and Outrigger know 
what constitutes an Entry field that they might need to match.  Entry fields 
are used to form the templates for matching entries.

Entry's are not subject to the usual serialization rules.  All fields in an 
Entry in superclass to subclass order are stored in an EntryRep (appended).
At present these are set after construction using reflection, however a 
constructor that accepts an array parameter, will allow the child most class to 
pass that array up through all constructors to reconstruct the Entry without 
using reflection (magnitudes faster performance wise).

Eliminating the use of reflection during deserialization will increase 
performance, and immutability increases scalability.

Where do we use entries?  Typically in doing Service registrations and lookups. 
 Most users can understand that the Entry is going to be serialized and used 
remotely.

Clients of Javaspaces and utility classes too, like JoinManager and 
ServiceDiscoveryManager.

Add to documentation?  Sure.  Change the API?  Maybe add an optional 
builder-style object.  (e.g. create a StatusBuilder that acts as a bean and 
generates Status entries).  But even so, it seems like a small enough part of 
using Jini to not bother much with.

Due to the way the current Entry spec works, you can't add a field to an Entry 
without breaking compatibility with subclassses.

So the proposed change would be a new public constructor, that accepts an 
array, containing field object values in the order that fields occur and that 
all fields be final.

I'd also propose including a warning that once published the order and number 
of fields in an Entry should not be changed if it can be extended, otherwise if 
there's a chance that additional fields might need to be appended at a later 
date, a reccommendation that the class be made final, so it can't be subclassed.

So in other words, the following changes will break an Entry:

  1. A change in the class heirarchy.
  2. A change in the order or number or type of fields.

An Entry is best thought of an interface definition for a defined group of 
objects in serial form.

Another way to think about it is that Entries don’t mean much until you send 
them somewhere.  So you need to make sure they’re all setup and everything they 
refer to is stable when you send them.  To me, that’s a user-understanding item 
that’s difficult to enforce through code.

But because Entry fields are mutable and unsynchronized, and used in utility 
classes such as ServiceDiscoveryManager, it becomes difficult to manage when 
Entry's are passed around between threads.   To submit an Entry to a service, 
it is passed to a thread pool where it becomes a serialized method invocation.

To ensure the changes made by one thread are visible to another, the Entry must 
be published safely, otherwise the changes can't be guaranteed visible between 
threads.  There is no synchronization on Entry field access, so this makes it 
very difficult to reason about an Entry, one strategy I've adopted is defensive 
copying, however all this does, is guarantee that any modifications made by 
client code, doesn't affect a copy in ServiceDiscoveryManager (for example), 
there is a memory usage cost, because internal copies can't be shared with 
client ServiceItemFilter’s

Typical usage of a ServiceDiscoveryManager, for instance, would be if I want to 
perform a lookup I’ll create a ServiceTemplate that contains the Entries I want 
matched.  If I want to create a LookupCache, I’ll pass in a ServiceTemplate to 
the createLookupCache method.

If you have a separate thread that messes around with those entries while the 
lookup is in process, or while the LookupCache is active, that’s a client error 
- you deserve what you get.

If ServiceDiscoveryManager does it wrong, we need to fix 
ServiceDiscoveryManager.

.

Reasoning about shared state becomes much simpler if Entry's are immutable and 
the upside is scalability and performance improves, especially for clients of 
lookup services deserializing a lot of Entry objects.

Entries aren’t intended to encapsulate shared state in any widely concurrent 
way.  They’re intended to be created on a thread, and then sent over the 
network.  If there is shared state inside a given service, that’s the concern 
of the service implementation.  Does Outrigger handle its concurrency 
correctly?  I don’t know- I’ve never looked.  If it doesn’t then we need to fix 
Outrigger, not change the Entry specification.

It’s good that we’re having these discussions, so we can make sure that the 
real-world usages get taken into account before we go altering the 
specifications.

Cheers,

Greg Trasuk.

/**
* An EntryRep contains the fields of an Entry packaged up for
* transmission between client-side proxies and the registrar server.
* Instances are never visible to clients, they are private to the
* communication between the proxies and the server.
*<p>
* This class only has a bare minimum of methods, to minimize
* the amount of code downloaded into clients.
*
* @author Sun Microsystems, Inc.
*
*/
class EntryRep implements Serializable, Cloneable {

    private static final long serialVersionUID = 2L;

    /**
     * The Class of the Entry converted to EntryClass.
     *
     * @serial
     */
    public EntryClass eclass;
    /**
     * The codebase of the entry class.
     *
     * @serial
     */
    public String codebase;
    /**
     * The public fields of the Entry, each converted as necessary to
     * a MarshalledWrapper (or left as is if of known java.lang immutable
     * type).  The fields are in super- to subclass order.
     *
     * @serial
     */
    public Object[] fields;

    /**
     * Converts an Entry to an EntryRep.  Any exception that results
     * is bundled up into a MarshalException.
     */
    public EntryRep(Entry entry) throws RemoteException {
    EntryClassBase ecb = ClassMapper.toEntryClassBase(entry.getClass());
    eclass = ecb.eclass;
    codebase = ecb.codebase;
    try {
        EntryField[] efields = ClassMapper.getFields(entry.getClass());
        fields = new Object[efields.length];
        for (int i = efields.length; --i>= 0; ) {
        EntryField f = efields[i];
        Object val = f.field.get(entry);
        if (f.marshal&&   val != null)
            val = new MarshalledWrapper(val);
        fields[i] = val;
        }
    } catch (IOException e) {
        throw new MarshalException("error marshalling arguments", e);
    } catch (IllegalAccessException e) {
        throw new MarshalException("error marshalling arguments", e);
    }
    }

    /**
     * Convert back to an Entry.  If the Entry cannot be constructed,
     * null is returned.  If a field cannot be unmarshalled, it is set
     * to null.
     */
    public Entry get() {
    try {
        Class clazz = eclass.toClass(codebase);
        EntryField[] efields = ClassMapper.getFields(clazz);
        Entry entry = (Entry)clazz.newInstance();
        for (int i = efields.length; --i>= 0; ) {
        Object val = fields[i];
        EntryField f = efields[i];
        Field rf = f.field;
        try {
            if (f.marshal&&   val != null)
            val = ((MarshalledWrapper) val).get();
            rf.set(entry, val);
        } catch (Throwable e) {
            if (e instanceof IllegalArgumentException) {
            // fix 4872566: work around empty exception message
            String msg = "unable to assign " +
                ((val != null) ?
                "value of type " + val.getClass().getName() :
                "null") +
                " to field " + rf.getDeclaringClass().getName() +
                "." + rf.getName() + " of type " +
                rf.getType().getName();
            e = new ClassCastException(msg).initCause(e);
            }
            RegistrarProxy.handleException(e);
        }
        }
        return entry;
    } catch (Throwable e) {
        RegistrarProxy.handleException(e);
    }
    return null;
    }

    /**
     * We don't need this in the client or the server, but since we
     * redefine equals we provide a minimal hashCode that works.
     */
    public int hashCode() {
    return eclass.hashCode();
    }

    /**
     * EntryReps are equal if they have the same class and the fields
     * are pairwise equal.  This is really only needed in the server,
     * but it's very convenient to have here.
     */
    public boolean equals(Object obj) {
    if (obj instanceof EntryRep) {
        EntryRep entry = (EntryRep)obj;
        if (!eclass.equals(entry.eclass) ||
        fields.length != entry.fields.length)
        return false;
        for (int i = fields.length; --i>= 0; ) {
        if ((fields[i] == null&&   entry.fields[i] != null) ||
            (fields[i] != null&&   !fields[i].equals(entry.fields[i])))
            return false;
        }
        return true;
    }
    return false;
    }

    /**
     * Deep clone (which just means cloning the fields array too).
     * This is really only needed in the server, but it's very
     * convenient to have here.
     */
    public Object clone() {
    try {
        EntryRep entry = (EntryRep)super.clone();
        entry.fields = (Object[])entry.fields.clone();
        return entry;
    } catch (CloneNotSupportedException e) {
        throw new InternalError();
    }
    }

    /**
     * Converts an array of Entry to an array of EntryRep.  If needCodebase
     * is false, then the codebase of every EntryRep will be null.
     */
    public static EntryRep[] toEntryRep(Entry[] entries, boolean needCodebase)
    throws RemoteException
    {
    EntryRep[] reps = null;
    if (entries != null) {
        reps = new EntryRep[entries.length];
        for (int i = entries.length; --i>= 0; ) {
        if (entries[i] != null) {
            reps[i] = new EntryRep(entries[i]);
            if (!needCodebase)
            reps[i].codebase = null;
        }
        }
    }
    return reps;
    }

    /** Converts an array of EntryRep to an array of Entry. */
    public static Entry[] toEntry(EntryRep[] reps) {
    Entry[] entries = null;
    if (reps != null) {
        entries = new Entry[reps.length];
        for (int i = reps.length; --i>= 0; ) {
        entries[i] = reps[i].get();
        }
    }
    return entries;
    }
}

Re: Don't let Jini Standards become an impediment to development

Reply via email to