Re: [osgi-dev] OSGifying an existing application

Peter Mon, 20 Feb 2017 16:11:10 -0800

I don't want this conversation to be shut down either.

I want to identify and document what works, given the limitations.


I think the word is out about issues with Serialization.

The only way to secure serialization is to sanitize input, by reimplementing 
ObjectInputStream from scratch and redesigning the API.

The public API also means that users must import the API packages, creating 
visibility in a modular environment.

Otherwise, ensure it's performed over a secure connection where both ends are 
authenticated prior to any serialized objects being sent.

Regards,

Peter.

---- Original message ----
From: Peter Kriens <peter.kri...@aqute.biz>
Sent: 21/02/2017 03:49:01 am
To: OSGi Developer Mail List <osgi-dev@mail.osgi.org>
Subject: Re: [osgi-dev] OSGifying an existing application

It is wonderful to not be completely shut down this time. A few years ago, 
making the kind of comments I am making now, I would have been looked at 
condescendingly, while you saw people thinking that I was probably trying to 
hide a deficiency in OSGi. :-)

This is a fundamental computing problem. I have loved OO since 1981 and I came 
to like types, but they both have severe limitations once you go distributed 
(which includes persisting data). And very few interesting programs do not have 
one of those problems.

Kind regards,

        Peter Kriens



On 20 Feb 2017, at 18:32, Matt Sicker <boa...@gmail.com> wrote:

I was thinking about how ActiveMQ added extra security around serialization by 
forcing you to configure which packages it allows for serialization. That 
struck me as similar to import-/export-package and made me think that if you're 
using serialization like this, you should at the very least be required to mark 
your DTOs as publicly accessible. On the import side for your service, though, 
things get out of whack like Peter mentioned with different versions of the 
same bundle running on separate instances (or even the same instance).
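
A minimal sketch of the package allow-list idea described above, using only 
the JDK. AllowListObjectInputStream and AllowListDemo are hypothetical names 
made up for illustration; this is not ActiveMQ's implementation, just one way 
a stream could refuse classes whose package has not been explicitly permitted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.Set;

/** Hypothetical stream that only resolves classes from allow-listed packages. */
class AllowListObjectInputStream extends ObjectInputStream {
    private final Set<String> allowedPackages;

    AllowListObjectInputStream(InputStream in, Set<String> allowedPackages)
            throws IOException {
        super(in);
        this.allowedPackages = allowedPackages;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        String name = desc.getName();
        int lastDot = name.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : name.substring(0, lastDot);
        // Array descriptors such as "[Ljava.util.Date;" never match a plain
        // package name, so this sketch rejects arrays as well.
        if (!allowedPackages.contains(pkg)) {
            throw new InvalidClassException(name, "package not allow-listed");
        }
        return super.resolveClass(desc);
    }
}

final class AllowListDemo {
    /** Serializes value, then reports whether the filtered stream accepts it. */
    static boolean accepts(Object value, Set<String> allowedPackages) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new AllowListObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()),
                    allowedPackages)) {
                in.readObject();
                return true;
            }
        } catch (IOException | ClassNotFoundException e) {
            return false; // rejected by the allow-list, or otherwise unreadable
        }
    }
}
```

With only "java.lang" permitted, AllowListDemo.accepts would be expected to 
admit an Integer and refuse a java.util.Date. Note that the list has to cover 
serializable superclasses too (Integer also needs java.lang.Number).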

On 20 February 2017 at 08:24, Peter Kriens <peter.kri...@aqute.biz> wrote:
The problem is that in all distributed systems I’ve been involved in there were 
rolling updates. This implies that you cannot guarantee that each server runs 
the same bundle. And clusters that do not do rolling updates seem kind of 
useless because they introduce a huge failure point. So I have a hard time 
understanding how you can guarantee that invariant. Yes, it works most of the 
time, but when I learned to work with computers that tended not to be good 
enough.

Now I do understand (and sympathise) that you cannot change the app. I’ve been 
there.

You indicate you have the same bundle on both sides, so the class graph on both 
sides is equal. This implies that if you ensure that the API interface is in 
the same bundle as the implementation classes, then you’re safe as long as you 
use the interface’s class loader as the root loader for the classes you find. 
This is of course highly non-modular, but then you’re not modular under the 
covers anyway :-) You could also make the API bundle import the implementation 
bundles; that should work as well.
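
As a rough illustration of using the interface's class loader as the root 
loader, here is a sketch of an ObjectInputStream that resolves every class 
through one explicitly supplied loader. All names (GreeterService, 
GreeterImpl, LoaderRootedObjectInputStream, LoaderRootDemo) are hypothetical. 
In an OSGi framework the loader obtained from the interface would be the 
loader of the bundle that contains it; in this plain-JVM demo it is simply the 
application loader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Hypothetical API interface, assumed to live in the same bundle as the impl. */
interface GreeterService {
    String greet();
}

/** Hypothetical implementation class that travels over the wire. */
class GreeterImpl implements GreeterService, Serializable {
    private static final long serialVersionUID = 1L;
    private final String message;

    GreeterImpl(String message) {
        this.message = message;
    }

    @Override
    public String greet() {
        return message;
    }
}

/** Resolves classes only through the given root loader, not via the stack. */
class LoaderRootedObjectInputStream extends ObjectInputStream {
    private final ClassLoader root;

    LoaderRootedObjectInputStream(InputStream in, ClassLoader root)
            throws IOException {
        super(in);
        this.root = root;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws ClassNotFoundException {
        // false = do not run static initializers while resolving.
        // Serialized primitive Class objects (int.class etc.) are not handled.
        return Class.forName(desc.getName(), false, root);
    }
}

final class LoaderRootDemo {
    /** Round-trips value, resolving classes from the API interface's loader. */
    static Object roundTrip(Object value, Class<?> apiInterface) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new LoaderRootedObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()),
                    apiInterface.getClassLoader())) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

The point of the sketch is only that class resolution is pinned to one known 
loader; whether that loader can actually see every class on the wire still 
depends on the bundle wiring.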

As long as you make sure OSGi has the proper dependency information things tend 
to work out of the box.

Kind regards,

        Peter Kriens
        


On 20 Feb 2017, at 11:47, Peter <j...@zeus.net.au> wrote:

Thanks Pete, good to hear from you again, I must admit it's been too long. We 
last spoke when I was refactoring a class dependency tool to use ASM instead of 
the JDK's tools.jar. You once asked, how do you find a dependency for calls to 
Class.forName?

The reasons you've stated are also why I've chosen to support a very narrow use 
case, in which, as you may have already noted, the serialized connection is 
between two identical bundles, in separate JVMs with compatible package 
imports.

There’s no intent to support transferring any classes outside of the Service 
Now API and that includes overriding classes.  What you describe about data 
hiding reminds me of Entry's, which have public fields.


I'll be the first to admit there are significant issues with the design of 
Java Serialization's extralinguistic API. Ironically, though, the wire protocol 
is reasonably well thought out with regard to evolution.

As an exercise to fix security issues, I have reimplemented Java serialization 
with input validation using a public API. It has a backward compatible serial 
form, but only supports a subset of Java serialization; it doesn't support 
circular object graphs, for example, as this would compromise security. It 
performs input validation, sets resource limits and expects periodic stream 
resets to avoid DoS and gadget attacks. The problem is there's a lot of 
existing software that utilises Java Serialization, and that's going to need 
support for some time. Things like Serialization and Remote Method Invocation 
are damaged by attempts to implement too much functionality, when a more rigid 
subset would avoid a number of issues. But I guess no one was thinking of 
modularity and versioning when they created these frameworks either.

James Gosling once said something about why Generics weren't included in Java 
from the outset: at the time they didn't know how to do it properly, and it's 
better to leave something out until you do.

JBoss has a nice web page with some graphics that illustrate some major issues 
with implementation hiding you've mentioned with Serialization and modular 
frameworks here:  https://developer.jboss.org/wiki/ModularSerialization

It's worth noting that the Service API of the smart proxy bundle doesn't need 
to be Serializable; instead, serialization is relegated to a means of 
communication between two identical bundles in different JVMs. It's also 
important to recognise that it doesn't need to be the communication mechanism 
either. These bundles have an identical class namespace, although there may be 
variances in package import versions.

Yes, we are also looking at moving away from Java serialization.

Also, because this is a service interface, at some point down the track 
serialization can be replaced without impacting the public API. So yes, the 
underlying protocols can be stripped out to data and message passing if that's 
more satisfactory.

So yes, Java serialization is an existing part of our application and it has 
its warts.

But we've also had a number of users over the years who have requested support 
for OSGi.

This is not a greenfield project. I'm hoping that I'm not going to be told 
that no, the chasm is too wide, you can't cross over to OSGi, rewrite or start 
again; there are just too many LOC.

So you have raised some important questions. Some of our users have had a lot 
of success with Maven (recognising there are pros and cons with module-based 
versioning and transitive dependencies), where versioning on a module level 
allows codebase annotations to be utilised in remote invocation, avoiding class 
visibility issues by mapping module ClassLoaders directly to a URI-based 
identity. However, with OSGi there's a mismatch between different JVMs in how 
bundles and package imports will be resolved and end up being wired, so we 
can't rely on codebase annotations for OSGi.

JVMs using OSGi frameworks are quite likely to have different dependency 
graphs (wires) between bundles and their package imports.

While I don't expect to solve the world's problems or boil the ocean, I'm 
looking for the most workable compromise, one that doesn't promise the world 
and makes it easier to explain what users can and can't expect to do. I'm 
relatively pragmatic. To me it would seem logical that a subset with two 
identical bundles (that should have resolved similar package import versions) 
would be a good place to start.

Hence my post on this list, as I realise many of you have already spent a lot 
of time bumping into these issues.

Cheers,

Peter.

On 20/02/2017 6:38 PM, Peter Kriens wrote:
After working in this area for too many years I’ve come to the conclusion that 
objects cannot really be transferred to other systems in a reliable way; only 
self-typed data can. JPA, RMI, and many other systems promise heaven to the 
programmer: that they can use their objects locally and remotely, 
transparently. The consequence of this dream is a huge amount of complexity 
that far outweighs any gains in programmer friendliness. Few things have caused 
so much trauma in the software world as ORM. (Persistence is communication with 
a future process.)

The reason objects are so complex to use in communications is that doing so is 
in direct violation of the OO goal of hiding your data. Once you expose the 
internal data on the wire you have effectively made it public, but too many 
people think they can still have the advantages of abstract data types. OSGi is 
a bitch in this case because it tells you that you’re trying to do something 
wrong by refusing to cooperate. In this case, it balks at you because you 
create an invisible dependency between the sender and the receiver. Though this 
is a good thing, too often the receivers of this message blame the messenger.

You can handle this dependency, but you’ll find out that it is a hugely 
complex task that introduces a lot of frailty in the overall system. Having 
tried this several times, I can assure you that any gains in programmer 
friendliness are dwarfed by the complexity of creating this facade.

The best solution I found is to give up on data hiding. The fact that your 
objects are on the wire means that the wire format is public. I therefore use 
Data Transfer Objects, in my case objects with public fields. On both sides I 
have my own objects to provide behavior to this data with methods and classes, 
but this data record is at the core of my code. Since this data is public 
because it goes over the wire, it is better to wrap your code around that 
‘standardized public’ object than to try to hide your internal object data.
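
To make the DTO approach concrete, here is a minimal sketch under assumed 
names: TemperatureDTO is a hypothetical wire record with public fields only, 
and Temperature is a hypothetical local class that wraps it to add behavior. 
Neither comes from an OSGi specification.

```java
/** Hypothetical DTO: public fields, no behavior. This is the wire contract. */
class TemperatureDTO {
    public String sensorId;
    public double celsius;
}

/** Hypothetical local object that adds behavior around the public record. */
final class Temperature {
    private final TemperatureDTO dto;

    Temperature(TemperatureDTO dto) {
        this.dto = dto;
    }

    /** Behavior lives here, on each side, and never goes over the wire. */
    double fahrenheit() {
        return dto.celsius * 9.0 / 5.0 + 32.0;
    }

    /** Hands back a copy of the public record for sending. */
    TemperatureDTO toDTO() {
        TemperatureDTO copy = new TemperatureDTO();
        copy.sensorId = dto.sensorId;
        copy.celsius = dto.celsius;
        return copy;
    }
}
```

The receiver only needs to agree on the field names and types of the record; 
it is free to wrap the same data in a completely different class, or in 
another language.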

If you look at the OSGi specifications of the past 5 years then you will notice 
that all applicable APIs have been designed to be useful with Distributed OSGi. 
Calls do not pass objects; they pass DTOs back and forth. They do not rely on 
the receiver and sender having exactly the same type and version. In this 
model it is easy to replace an endpoint using another language, which is a 
really good sign.

For Java developers this is often an unpleasant message, and quite often OSGi 
gets the blame. However, the fact that OSGi gives you these problems means that 
you’re trying to do something that has hidden dependencies.

Distributed computing has 7 well known fallacies[1] but I strongly believe that 
there is an eighth: ’One can communicate objects over a network’.

Now your question. Yes, you could run a resolve and load the proper bundles but 
you introduce a huge amount of error cases and a large amount of complexity and 
you won’t solve the fundamental problem.

Kind regards,

Peter Kriens

[1]: https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing


On 20 Feb 2017, at 05:13, Peter <j...@zeus.net.au> wrote:

Hello,

I'm currently working on converting an existing application to OSGi.

This application has a network service architecture based on Java interfaces. 
I've broken the application into modules, using a Maven build, which uses bnd 
and bndtools to create bundle manifests. Some of these modules are 
ServiceLoader providers, so I've used annotations to ensure these are loaded 
into the OSGi service registry using the Service Loader Mediator.

The main issue that I face is that this is a networked application with its own 
Remote Invocation protocols (which currently utilise Java Serialization, but 
not Java RMI). As you'll appreciate, class visibility is a little different in 
OSGi. :)

The services mentioned above are remote services. Each remote service has a 
proxy which implements the service interface; these services are discovered and 
installed at the client. There are two types of proxy. One, called a smart 
proxy, requires a codebase from which to retrieve a jar file or jar files that 
are downloaded and installed at the client (traditionally during 
deserialization). The other, called a dynamic proxy (it's basically just an 
instance of java.lang.reflect.Proxy), is dynamically generated at the client.
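
For readers less familiar with java.lang.reflect.Proxy, a small sketch of how 
such a dynamic proxy can be generated at the client follows. EchoService and 
DynamicProxyDemo are hypothetical names, and the handler merely fakes the 
remote call; a real invocation handler would marshal the method and arguments 
and send them to the server.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Hypothetical service interface, standing in for a real service api. */
interface EchoService {
    String echo(String message);
}

final class DynamicProxyDemo {
    /** Builds a proxy whose handler stands in for the remote call. */
    static EchoService newProxy() {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("echo")) {
                // A real handler would marshal the call and send it remotely.
                return "remote:" + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        // The proxy class is defined in the loader passed here, so that loader
        // must be able to see every interface the proxy implements.
        return (EchoService) Proxy.newProxyInstance(
                EchoService.class.getClassLoader(),
                new Class<?>[] {EchoService.class},
                handler);
    }
}
```

No jar has to be downloaded for this kind of proxy; only the service interface 
needs to be visible at the client.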

The Service implementation is broken up into three components:

1. The service api
2. The smart proxy (resolved and provisioned into the client JVM).
3. The server

The server bundle imports packages from the smart proxy bundle, while the smart 
proxy imports packages from the service api as well as exporting its own 
packages, as required by the server bundle.

The server that provides the remote service has three bundles loaded: 
server-impl, smart-proxy & service-api.

The client only has the service api bundle installed at deployment. The smart 
proxy is resolved and provisioned before the service is made available via the 
local OSGi service registry, where the client will learn of its existence 
using ServiceTracker.

At first glance only the smart proxy bundle needs to be provisioned at the 
client. However, where a dynamic proxy is required to implement interfaces from 
different packages, class visibility issues may exist, and it may be beneficial 
in these cases to provision a proxy bundle that imports all these interfaces. 
One might do that by taking advantage of Java's multiple interface inheritance: 
create a bundle that contains one interface (annotated with @ProviderType) 
which extends all the interfaces and which the bundle doesn't export. This 
ensures that the dynamic proxy has a proper bundle manifest with all package 
imports and version ranges correctly defined.
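
The aggregate-interface idea can be sketched in plain Java as follows. Lookup, 
Lease, LookupWithLease and AggregateProxyDemo are hypothetical names; in a real 
bundle the two service interfaces would come from different imported packages 
and the aggregate interface would carry the @ProviderType annotation, which is 
omitted here to keep the example free of OSGi dependencies.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Hypothetical service interfaces, imagined to come from separate packages. */
interface Lookup {
    String find(String key);
}

interface Lease {
    long remainingMillis();
}

/** Single private interface that extends everything the proxy must implement. */
interface LookupWithLease extends Lookup, Lease {
}

final class AggregateProxyDemo {
    static LookupWithLease newProxy() {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "find":
                    return "value-of-" + args[0]; // placeholder for a remote call
                case "remainingMillis":
                    return 1000L;                 // placeholder value
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        // Only the aggregate interface and its loader are named; the loader of
        // the bundle that defines it can see all the super-interfaces.
        return (LookupWithLease) Proxy.newProxyInstance(
                LookupWithLease.class.getClassLoader(),
                new Class<?>[] {LookupWithLease.class},
                handler);
    }
}
```

Because the proxy is created against a single interface and a single loader, 
the visibility question reduces to whether the defining bundle's imports 
resolve.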

The inbuilt remote invocation protocol has server and client endpoints, the 
protocol is extensible and has a number of implementations (for example https, 
http, tls, kerberos, tcp). Each endpoint is assigned a ClassLoader when it's 
created.

Classes installed at the client are typically loaded by a URLClassLoader, 
usually with the Application loader as parent loader. In an OSGi environment, 
however, the smart proxy bundle will be installed at the client and its 
ClassLoader utilised by the client endpoint; the smart proxy bundle will also 
be installed at the server and its ClassLoader utilised by the server endpoint. 
In this case the visibility of the bundles at each endpoint will be utilised to 
resolve serializable classes. Private smart proxy serializable classes will be 
resolvable at each end, but only public classes from imported packages will be 
deserializable. Since the client interacts using the Service API, all 
serializable classes in the Service API packages will need to be public, 
exported, and imported by the client and smart proxy.

Once a bundle has been provisioned, its ClassLoader will be given to the client 
endpoint and the marshalled state of the proxy unmarshalled into it. At this 
point the service that the proxy provides would be registered with the OSGi 
service registry for the client to discover and consume. The smart proxy 
communicates with its server via an internal dynamic proxy 
(java.lang.reflect.Proxy), which is used to invoke methods on the server.

While the existing protocol uses Java serialization, it doesn't use Java 
serialization's method of resolving classes. Java Serialization walks the stack 
and finds the first non-system ClassLoader (looking for the application 
ClassLoader). The existing class resolution method isn't suitable for OSGi; 
however, the mechanism is extensible, so it can be replaced with something 
suitable.



Does anyone have any advice or experience utilising the OSGi Enterprise 
Resolver Service Specification (chapter 136) and the OSGi Enterprise Repository 
Service Specification (chapter 132) to resolve and provision a bundle for the 
smart proxy at the client?



The intent here is that the bundle manifests at each endpoint will be used to 
determine class visibility, so the resolution and provisioning process will be 
of critical importance.

For anyone curious, the application is a fork of Apache River / Jini and I'm 
experimenting with support for OSGi. I'm also a committer and PMC member of 
Apache River. This isn't the old Jini we all know and love, however; there are 
some additional features that allow provisioning to occur using a feature 
called delayed unmarshalling, so we can avoid the need for codebase annotations 
and URLClassLoaders.

The work in progress can be found here, for anyone who's curious:

https://github.com/pfirmstone/JGDMS/tree/Maven_build/modularize/JGDMS

Regards,

Peter.

_______________________________________________
OSGi Developer Mail List
osgi-dev@mail.osgi.org
https://mail.osgi.org/mailman/listinfo/osgi-dev


-- 
Matt Sicker <boa...@gmail.com>