Thanks Gregg,

You're spot on.

The memory explosion caused by multiple class loaders, one for each remote location, is largely eliminated by your solution of preferring local classes. This works because all marshalled instances are unmarshalled in the one ClassLoader and utilise the same bytecode. It also improves compatibility, since otherwise compatible objects are no longer isolated from one another by Type differences between ClassLoader trees.

I think utilising the OSGi framework combined with Codebase services, which eliminates the coupling between codebase URLs and ClassLoaders, can achieve a similar reduction in downloaded code, although not quite as small. All classes from a package can use the latest compatible bundle and share the same ClassLoader, bytecode and any other packages they depend upon, significantly reducing RMIClassLoader explosion and duplicate bytecode. Once a compatible bundle has been downloaded, it can be utilised for all instances of that class. We should prefer the latest compatible bundles, I believe.

Package version metadata (specified in each bundle) can be stored in MarshalledObject instance metadata; the OSGi versioning scheme specifies compatibility across bundle upgrades.
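As a rough sketch of what "prefer the latest compatible bundle" might mean in code: the class below is a stand-alone stand-in (BundleVersion and latestCompatible are hypothetical names, not the OSGi API; a real implementation would presumably build on org.osgi.framework.Version). It assumes the usual OSGi convention that a change in the major number signals an incompatible change, and it ignores version qualifiers.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a bundle version: major.minor.micro, no qualifier.
final class BundleVersion implements Comparable<BundleVersion> {
    final int major;
    final int minor;
    final int micro;

    BundleVersion(int major, int minor, int micro) {
        this.major = major;
        this.minor = minor;
        this.micro = micro;
    }

    // Parses "1", "1.2" or "1.2.3"; missing parts default to 0.
    static BundleVersion parse(String text) {
        String[] parts = text.trim().split("\\.");
        int[] n = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            n[i] = Integer.parseInt(parts[i]);
        }
        return new BundleVersion(n[0], n[1], n[2]);
    }

    // Assumption: same major number and not older than what was required.
    boolean isCompatibleWith(BundleVersion required) {
        return major == required.major && compareTo(required) >= 0;
    }

    // Picks the newest available version that is compatible with 'required'.
    static Optional<BundleVersion> latestCompatible(BundleVersion required,
                                                    List<BundleVersion> available) {
        return available.stream()
                .filter(v -> v.isCompatibleWith(required))
                .max(Comparator.naturalOrder());
    }

    @Override
    public int compareTo(BundleVersion o) {
        if (major != o.major) return Integer.compare(major, o.major);
        if (minor != o.minor) return Integer.compare(minor, o.minor);
        return Integer.compare(micro, o.micro);
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + micro;
    }
}
```

The version a marshalled instance was exported with would be the 'required' argument, and the bundles already downloaded locally (or offered by a Codebase service) would be the 'available' list.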

In reality, MarshalledObjects are a compromise to reduce the downloading of remote code: they duplicate information held within the binary marshalled form of an object (such as implemented interfaces). Keeping duplicated information to a minimum is important; whenever we duplicate data, we risk duplication errors or increased size. At some point it becomes more efficient to just unmarshall the object rather than increase the stored metadata. This is best done at the client: internet servers just won't have the resources to perform queries, and the security implications of executing foreign code are significant.

I think the full load issues in Reggie can be fixed, so that it levels out under load instead of failing. However, I think you're right, Reggie isn't suited to the internet. Perhaps some type of global indexing service could crawl the list of available services through DNS-SD and store their marshalled proxy instances, performing the functions that Reggie currently performs. Perhaps an interface inserted into the hierarchy could implement a subset of Reggie's current methods, for some compatibility with Reggie? Another interface might implement other methods that assist in filtering the results, or return a bytestream, where proxies are unmarshalled one at a time at the client, inspected, then dropped until a suitable match is found; the remaining bytestream can be discarded. Garbage collection would clean up unwanted proxies during bytestream inspection, keeping memory usage to a minimum.
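To make the bytestream idea a little more concrete, here is a minimal sketch using plain java.io serialization rather than Jini's marshalling (an assumption made so the example is self-contained; ProxyStreamScanner, firstMatch and serializeAll are hypothetical names). One detail worth noting: ObjectInputStream keeps a handle to every object it has read, so the writer calls reset() after each object, which lets the reader drop its handles and lets rejected objects be garbage collected. Reading from an untrusted stream like this would still need the sandboxing discussed in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.function.Predicate;

final class ProxyStreamScanner {

    // Reads objects one at a time and returns the first that satisfies
    // 'wanted', or null if the stream ends first. Closing the stream
    // discards whatever remains unread.
    static Object firstMatch(InputStream in, Predicate<Object> wanted)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            while (true) {
                Object candidate;
                try {
                    candidate = ois.readObject();
                } catch (EOFException endOfStream) {
                    return null;
                }
                if (wanted.test(candidate)) {
                    return candidate;
                }
                // No reference to a rejected candidate is kept here.
            }
        }
    }

    // Server-side helper for the sketch: writes the objects back to back.
    static byte[] serializeAll(Object... objects) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                oos.writeObject(o);
                // Clears the handle table on both ends of the stream so the
                // reader does not retain objects it has already returned.
                oos.reset();
            }
        }
        return bytes.toByteArray();
    }
}
```

A client would pass a predicate such as "o instanceof SomeServiceInterface" and stop reading as soon as a match comes back.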

Looking at the service type definition below, Daniel is using DNS-SD to locate a Jini service directly. Multiple identical matches are returned with an index integer appended.

DNS SRV (RFC 2782) Service Types at http://www.dns-sd.org/ServiceTypes.html contains this service entry:

 jini            Jini Service Discovery
                 Daniel Steinberg <daniel at oreilly.com>
                 Protocol description: Convention giving a deterministic
                 programmatic mapping between Jini service interface names
                 and subtypes of the DNS-SD service meta-type "_jini._tcp".
                 For example, a client wishing to discover objects that
                 implement the "com.oreilly.ExampleService" interface would
                 browse for the DNS-SD service subtype
                 "ExampleService.oreilly.com._sub._jini._tcp".
                 (Note: Using Apple's Bonjour programming API, service
                 subtypes like this are expressed as a comma-separated list
                 following the main type, e.g.
                 "_jini._tcp,ExampleService.oreilly.com". This allows an
                 object that implements several interfaces to specify all
                 of those interfaces in a list when it registers its
                 service. When browsing for services, at most a single
                 subtype is allowed.)
                 Defined TXT keys: None
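
The mapping in that entry can be sketched in a few lines of Java. This is inferred from the single example given (reverse the dot-separated labels of the interface name), so treat it as an assumption rather than the registered convention's full rules; JiniDnsSdNames and its methods are hypothetical names, and nothing here handles nested classes or the 63-byte DNS label limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class JiniDnsSdNames {
    static final String META_TYPE = "_jini._tcp";

    // "com.oreilly.ExampleService" becomes
    // "ExampleService.oreilly.com._sub._jini._tcp" (the name to browse for).
    static String browseSubtype(String interfaceName) {
        return reversed(interfaceName) + "._sub." + META_TYPE;
    }

    // Bonjour-style registration string: the main type followed by a
    // comma-separated list of subtypes, one per implemented interface.
    static String registrationType(List<String> interfaceNames) {
        StringBuilder sb = new StringBuilder(META_TYPE);
        for (String name : interfaceNames) {
            sb.append(',').append(reversed(name));
        }
        return sb.toString();
    }

    // Reverses the order of the dot-separated labels.
    private static String reversed(String dottedName) {
        List<String> labels = new ArrayList<>(Arrays.asList(dottedName.split("\\.")));
        Collections.reverse(labels);
        return String.join(".", labels);
    }
}
```

A service registers once with every interface it implements, while a client browses for exactly one subtype at a time, which is why filtering ends up limited to qualified names.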

Some observations:

  1. Re-discovering the correct identical service would be almost
     impossible with DNS-SD.
  2. Reggie is not designed for a world wide network.
  3. Interfaces utilised by services must not change over time; they
     can be extended when change is required.
  4. Filtering is limited to qualified names.
  5. Additional Types can be registered for one service instance.
  6. We can crawl the DNS-SD list, downloading marshalled proxies,
     forming the basis of a new lookup service implementation.

Some thoughts:

  1. You would be aware of which services you would want to make
     global, perhaps a configuration option?
  2. We probably want to restrict Service proxies to simple reflective
     proxies with Secure Jeri for now; security is much simpler.
  3. Smart proxies, hmm, proxy verification, hmm needs more thought.
  4. Interfaces for services should be in separate bundles from
     implementations, to prevent Interface duplication in local JVM
     ClassLoaders when implementations change in an incompatible manner
     (versions over time), allowing a service to remain as an
     abstraction.  The OSGi framework could locally upgrade an
     interface bundle when a new service utilises a new interface or an
     interface extending existing interfaces, allowing old and new
     service implementations to be utilised as the same Type within a
     local JVM.

DNS-SD might be used when we don't care about identity or matching semantics.

What about when we do care about identity? Similar ground has been covered before in Project Neuromancer with XuidDirectory; perhaps this could be a crawler instead of utilising registration? Don't worry about leasing, just repeat the crawl, discarding the older data on a cyclic basis? Would it be safe to unmarshall a downloaded proxy instance to query for its Xuid, provided it was sandboxed with no permissions and utilised integrity checking? What do you think, Jim?

interface XuidDirectory
{
    Lease register(Xuid id, Object o, long leaseLen)
        throws UnknownXuidException, RemoteException;
    Lease[] register(Xuid[] ids, Object[] o, long[] leaseLens)
        throws UnknownXuidException, RemoteException;
    Object lookup(Xuid id, XuidDirectory[] visited)
        throws UnknownXuidException, RemoteException;
}
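
For comparison, the crawl-and-discard alternative to leasing might look something like the sketch below. It is only an illustration under assumptions: CrawlIndex is a hypothetical class, the key stands in for an Xuid and the value for a still-marshalled proxy, and "older data" is taken to mean anything not seen during the most recent crawl.

```java
import java.util.HashMap;
import java.util.Map;

// Keeps the results of the latest crawl; anything not re-seen is dropped,
// which plays the role a lease expiry would play with registration.
final class CrawlIndex<K, V> {

    private static final class Seen<T> {
        final T value;
        final long crawl;

        Seen(T value, long crawl) {
            this.value = value;
            this.crawl = crawl;
        }
    }

    private final Map<K, Seen<V>> entries = new HashMap<>();
    private long currentCrawl = 0;

    // Starts a new crawl cycle.
    synchronized void beginCrawl() {
        currentCrawl++;
    }

    // Records (or refreshes) an entry found during the current crawl.
    synchronized void record(K id, V marshalledProxy) {
        entries.put(id, new Seen<>(marshalledProxy, currentCrawl));
    }

    // Discards entries that were not seen in the current crawl and
    // returns how many were removed.
    synchronized int endCrawl() {
        int before = entries.size();
        entries.values().removeIf(seen -> seen.crawl < currentCrawl);
        return before - entries.size();
    }

    // Returns the stored value for 'id', or null if it is unknown.
    synchronized V lookup(K id) {
        Seen<V> seen = entries.get(id);
        return seen == null ? null : seen.value;
    }
}
```

Services that disappear simply stop turning up in the crawl, so no lease renewal traffic is needed; the cost is that stale entries linger until the next cycle completes.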

Not only would we need worldwide available Codebase services, but caching Codebase services also.
e.g. interfaces:

org.apache.river.global.CodebaseService
    // code I make available, which I sign
org.apache.river.global.CachingCodebaseService
    // code I've dynamically downloaded, not signed by me (extends CodebaseService)

A caching code base service is important to ensure bytecode remains available over time when other service locations go down.

Unsigned jar files would not be allowed in a Codebase Service.

For Objects passed between services where identity is important, local JVM immutability would be desired. The same object is duplicated across nodes and doesn't change so we don't need to coordinate transactions etc.

All platform services would rely on local code. Platform code could be upgraded worldwide on a periodic basis, using codebase services and the OSGi platform. I wonder how project Jigsaw will pan out; perhaps the JVM will become dynamically upgradeable also?

Your Thoughts?

Peter.

Gregg Wonderly wrote:
One of the primary issues with the current lookup server design and the ServiceRegistrar interface in particular is the fact that one can only receive unmarshalled services. My work on providing marshalled results, visible in the http://reef.dev.java.net project, allows the opportunity to find stuff without getting a JVM memory explosion. However, there is a further issue, and that is in order to "see into" the marshalled object you need to either resolve it or dive into the stream of bytes. My further work on the PreferredClassLoader mechanism for establishing "never preferred" classes helps to make it possible to do resolution of remote objects using locally defined class instances, so that you can, for example, look at Entry objects.

Also, in my reef work, I investigated adding the names of all classes that are visible in the type hierarchy of the objects so that you could ask "instanceof" kinds of questions without unmarshalling.

There are just all kinds of issues related to this that come into play. Performing a Jini lookup, on the internet today, would be like asking your web browser to open a tab for every page on the net, and then waiting for that to finish so that you could click through the tabs to find what you are looking for.

Clearly, lookup needs to be a completely different concept to exist in a large world such as is visible "on the internet."

Gregg Wonderly

Peter Firmstone wrote:
Anyone got any opinions about Lookup Service Discovery?

How could lookup service discovery be extended to encompass the internet? Could we utilise DNS to return locations of Lookup Services?

For world wide lookup services, our current lookup service might return a massive array with too many service matches. Queries present the opportunity to reduce the size of returned results, however security issues from code execution on the lookup service present problems.

If we did allow queries on a Lookup Service, could we do so with a restricted set of available Types utilising only trusted signed bytecodes? If bytecode becomes divorced from the origin of a Marshalled Object, and instead obtained from a trusted codebase service, then perhaps we could have a system of vetting source code submitted for the purpose of becoming trusted authorised query types? Any query utilising untrusted bytecode might return an UntrustedByteCodeException?

Perhaps we could make service match results available as a bytestream, clients that couldn't handle large amounts of data could inspect the bytestream, continually discarding what isn't required?

Check out this link on DNS service discovery:

http://files.dns-sd.org/draft-cheshire-dnsext-dns-sd.txt

Cheers,

Peter.




