Dan Creswell wrote:
Clarification: Separate the location and identity of a jar file or
archive containing class files:
We have integrity constraints, which rely on message digests to confirm
files have not been tampered with.
The message digest (although some digest algorithms are now considered weak)
is effectively the identity of a jar file; at least, we use it to confirm
that it's the jar we expect.
For httpmd, we use a URL string annotation (which could be an IP address
and port, or a DNS hostname and port), a path and file name, followed by the
message digest.
The IP address or DNS hostname, port and path represent the location of
the jar file, while the file name and message digest represent its
identity.
If we separate out the location (host and port), it can be discovered using
DNS-SRV records, allowing redundant codebase servers, while the identity is
limited to the file name and message digest.
Then the RMI codebase property only needs to be a domain in which a
suitable codebase can be discovered and queried.
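To make the location / identity split concrete, here is a rough sketch that pulls an httpmd URL apart into the two halves described above. The class name HttpmdParts, the example host and the example digest are all made up for illustration; this is not part of any Jini API, just one way the split could look, assuming the usual httpmd form of path;algorithm=digest with an optional trailing comment.

```java
import java.net.URI;

// Hypothetical helper (not a Jini class): splits an httpmd URL into
// its location part (host, port, directory) and its identity part
// (file name, digest algorithm, digest).
final class HttpmdParts {
    final String host;      // location
    final int port;         // location
    final String directory; // location
    final String fileName;  // identity
    final String algorithm; // identity
    final String digest;    // identity

    private HttpmdParts(String host, int port, String directory,
                        String fileName, String algorithm, String digest) {
        this.host = host;
        this.port = port;
        this.directory = directory;
        this.fileName = fileName;
        this.algorithm = algorithm;
        this.digest = digest;
    }

    static HttpmdParts parse(String url) {
        URI uri = URI.create(url);
        if (!"httpmd".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not an httpmd URL: " + url);
        }
        String path = uri.getPath(); // e.g. /lib/service-dl.jar;sha=0a1b2c
        int semi = path.lastIndexOf(';');
        if (semi < 0) {
            throw new IllegalArgumentException("no message digest in: " + url);
        }
        int eq = path.indexOf('=', semi);
        if (eq < 0) {
            throw new IllegalArgumentException("no digest value in: " + url);
        }
        int slash = path.lastIndexOf('/', semi);
        String digest = path.substring(eq + 1);
        int comma = digest.indexOf(','); // drop an optional trailing comment
        if (comma >= 0) {
            digest = digest.substring(0, comma);
        }
        return new HttpmdParts(uri.getHost(), uri.getPort(),
                path.substring(0, slash + 1),
                path.substring(slash + 1, semi),
                path.substring(semi + 1, eq),
                digest);
    }
}
```

With something like this, only fileName, algorithm and digest would need to travel in the annotation; host and port could come from whatever the DNS-SRV lookup returns for the domain.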
Mmm, but the process of resolution on codebases slows things down for Gregg.
Correct, Gregg avoids the process of resolution until it's absolutely
necessary, while Chris caches his jar files after downloading, so he
only has to download them once. DNS resolution might take 10 seconds, so this
is an important point you've made. A local caching DNS would be
advantageous.
I think we're gonna have to decide which profiles of machines we're going to
target with what solutions. I'm not suggesting we try and build for all the
different profiles, more that we'll have to choose what we do and do not
support.
But, if service proxies are sharing jar files, what does that mean? Someone
somewhere chose to package them all together like that for some reason.
What reason, what problem are they solving?
Deployment reasons.
:) Yes, deployment, but why are we choosing to deploy with this setup? What
does it imply about services when they share .jar files?
Because it becomes difficult to manage with existing tools when a
codebase contains multiple jar files for multiple services. I believe
that the Codebase service implementors managed to make deployment easier
by automating codebase annotations.
Again, I'm sitting here thinking as a deployer of services, if I want to
widget around and consolidate .jars I can do that ahead of time and then
tweak service codebases via config just prior to deploy.
But it would be faster for the client to receive unconsolidated jars,
since other services might use some of these jars also.
Not for Gregg who hates roundtrips and multiple discoveries of codebases and
such.
If we have the Entry jar files installed at the client, or first download
the jar files for the Entries we need (using the getEntryClasses method),
then during lookup we can unmarshal only those Entries we already have
codebases for. Of course, at some point we're going to have to download
something, but we can avoid downloading the codebases for services we
don't need.
What matters is that the proxy gets the correct class files, in its own
private namespace that is not shared with other proxies (except for the
service API, which may include Entries), but we can save duplicate downloads.
Can you explain your reasoning for saving duplicate downloads? I think you
mean because in some cases services could share a codebase, which they can
do under the current scheme of course. Note that having "correct class
files" isn't IMHO a sufficient constraint, it has to be a "particular
collection of specific implementations of classes and versions".
Correct, but it might reduce the size of the download when we've got to
bite the bullet and download the service jar files, if we've already got
some of the necessary jars cached locally.
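As a rough illustration of the "already cached locally" case: if jars are keyed by file name plus message digest rather than by where they came from, a second service needing the same jar never triggers a second download. Everything below is hypothetical (the class DigestKeyedJarCache, the in-memory map, the Supplier standing in for the real download), and SHA-256 is just my assumption for the digest algorithm.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: a jar cache keyed by identity (file name + digest),
// independent of the location the bytes were fetched from.
final class DigestKeyedJarCache {
    private final Map<String, byte[]> jars = new ConcurrentHashMap<>();
    int downloads = 0; // counts real downloads, for illustration only

    byte[] get(String fileName, String expectedHexDigest,
               Supplier<byte[]> downloader) {
        String key = fileName + ";" + expectedHexDigest.toLowerCase();
        byte[] cached = jars.get(key);
        if (cached != null) {
            return cached; // same identity already held, no download
        }
        byte[] bytes = downloader.get(); // stands in for the network fetch
        downloads++;
        // Integrity check: the bytes must match the digest we were promised.
        if (!hexDigest(bytes).equalsIgnoreCase(expectedHexDigest)) {
            throw new SecurityException("digest mismatch for " + fileName);
        }
        jars.put(key, bytes);
        return bytes;
    }

    static String hexDigest(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note this only saves the download; each proxy would still define the classes in its own ClassLoader, so the private namespace is unaffected.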
Maven provisioning is interesting.
This is why I'm interested to investigate separating jar file identity from
location, to simplify deployment and redundancy. I'm putting my thoughts
out on the list, to gather responses, to see if there's a better way.
Okay, so I think this also impacts on the stuff being discussed with Gregg.
Whilst there are some complementary aspects there are some costly steps in
there as well.
The trick is to delay the costly step until we've decided which service
instance we want.
The Jini lookup service's lack of AND / OR querying capability is due to
security: the avoidance of instantiating foreign objects.
Delayed unmarshalling of the service proxy allows service Entries to be
compared as objects, without requiring a codebase download for the proxy
if it's not the service we want, so it's not quite just returning a
MarshalledInstance. This should be done without compromising the good
security features of the existing lookup service.
I don't follow - I could tweak ServiceItem to hold the proxy as a
MarshalledInstance and still expose all other "service identifying
information". That MarshalledInstance mightn't even immediately carry the
proxy code; it could still be on the server and pulled down at the point the
consumer actually wants the proxy. Feels like some simple
interface/sub-classing.
This is true for cases where the service types don't matter to the client.
I think Gregg wanted to eliminate all codebase downloads until he was sure he
had the correct service. Use of a MarshalledInstance could be an acceptable
compromise, if Entries have their own codebase annotations. How would this
affect the lookup semantics if we're looking for particular service types
though?
Entries have to have their own codebase for all cases of clients that don't
know about those Entries. Any client that has built-in knowledge of those
Entries will have them available on the classpath by virtue of its need to
specify them in its lookup search.
Except when we call getEntryClasses for a ServiceTemplate and someone's
created a new Entry, then we need reflection to get the fields, but
that's probably a rare case anyway.
Can you explain more about the service types question?
Reggie stores the class types of the service (not class files), so that
Reggie can use a ServiceTemplate to retrieve marshalled service proxies
that match the service type defined in the template.
If I'm looking for a particular type, I specify a particular interface.
Exactly, so if we use a MarshalledInstance when we register the proxy,
the only interface (or class in this case) a client can specify is
Object or MarshalledInstance.
I'm guessing you mean a service with specific Entries? A client after some
specific set of Entries will already know those via classpath. We could
simply allow a client to say "I'm only interested in Services with these
Entries, return me all matches but only give me the following Entry types as
part of the ServiceItem". This feels much closer to the original intent
"stop stuff getting to a client that it's not interested in" than trying to
"control in detail all aspects of download and intimately dig around in
service implementations, including .jars, to do it".
I don't think we should dig around in jars etc to do it, just package
Entries separately so we don't have to. By providing the class files
for the Entries we want unmarshalled via the lookup call, we wouldn't
need to download their jar files.
Hence the method:
ResultStream lookup(ServiceTemplate tmpl, Class[] unmarshalledEntries,
int maxBatchSize) throws IOException;
But once we have the service we want, we'll need to download the jar files.
Over the internet, we could potentially have very large lookup services.
By allowing clients to remove unwanted services from their results before
unmarshalling, we can reduce the resources required of the client:
* Network Bandwidth, clients don't need to download unwanted codebases.
* Memory (ClassLoader and unwanted classes are not loaded into memory).
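A toy sketch of the idea, to show where the saving comes from: results are filtered on their already-unmarshalled Entries, and the expensive step (codebase download plus proxy unmarshalling) only runs for the item the client actually picks. LazyServiceItem is a hypothetical stand-in, not a net.jini class, Entries are plain strings here for brevity, and the Supplier stands in for the real download and unmarshalling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical stand-in for a lookup result whose proxy is unmarshalled lazily.
final class LazyServiceItem {
    final List<String> entries;                 // Entries the client can already read
    private final Supplier<Object> proxyLoader; // download + unmarshal, deferred
    private Object proxy;
    int proxyLoads = 0;                         // for illustration only

    LazyServiceItem(List<String> entries, Supplier<Object> proxyLoader) {
        this.entries = entries;
        this.proxyLoader = proxyLoader;
    }

    // The costly step happens here, and only on first use (not thread-safe).
    Object getProxy() {
        if (proxy == null) {
            proxy = proxyLoader.get();
            proxyLoads++;
        }
        return proxy;
    }

    // Drop unwanted services by inspecting Entries only; no proxy is touched.
    static List<LazyServiceItem> filter(List<LazyServiceItem> items,
                                        Predicate<List<String>> wanted) {
        List<LazyServiceItem> kept = new ArrayList<>();
        for (LazyServiceItem item : items) {
            if (wanted.test(item.entries)) {
                kept.add(item);
            }
        }
        return kept;
    }
}
```

Services that are filtered out never have their proxyLoader called, so neither their codebases nor their ClassLoaders ever reach the client.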
If we were to go across the internet, have we squared away use of e.g. DNS-SD?
i.e. Is it a given we'll expose a classic JINI LUS?
The semantics of DNS-SD make it well suited to discovery of a lookup
service. I think Sim highlighted earlier that it's difficult to get domain
administrators to do dynamically updated DNS-SD; they're comfortable with
DNS-SRV records, so it would appear easier to rely on DNS-SD as a lookup
locator / domain browser, not as a lookup service replacement.
Agreed. Solves my concerns of exposing unicast locators of the traditional
type and multicast across the net.
;)
Cheers,
Peter.