After re-reading what I said about XmlSerializer performance, I
understand that it was me who started the confusion, by mixing things up
and not wording things correctly when it comes to the average
performance of the XmlSerializer component. For that I apologize.

Now that that's out of the way, I'd really like to get back to the real
issue: binary interprocess communication vs. a full XML-producing
service stack.

> > SoapFormatter is pretty slow also, but it's my experience that
> > if I serialize a type to disk using soapformatter or using
> > xmlserializer the soapformatter is faster big time. IF the
> > xmlserializer is even capable of producing XML because it bites the
> > dust as soon as it runs into an interface typed member variable,
> > something the soapformatter can handle nicely and which has always
> > got me wondering why on earth the xmlserializer wasn't capable of
> > producing xml with interface types or cyclic references.
> 
> Well now we get down to the crucial issue of this discussion 
> - it's all a question of type. As I'm sure you know, the only 
> thing the SoapFormatter has in common with web services is 
> that it has the letters S.O.A. and P in its name. It emits a 
> bizarre serialization based on Rpc-Encoding (Soap section 5 
> type transition rules) plus some stuff that assumes .NET on 
> the other end of the conversation. The Asmx stack doesn't use 
> the SoapFormatter - it uses the XmlSerializer directly. Web 
> Services are designed to be cross platform so we can't assume 
> a particular type system - therefore we use a portable one - 
> Xml Schema. This is the reason cyclic refs and interface 
> types can't be handled by the XmlSerializer because it is 
> schema driven and Xml Schema doesn't support those. So if you 
> want to make an argument in defence of Soap Section 5 
> encoding that's fine but for webservices we take an XSD view 
> of the world - which is not object based.

        I understand that, and if (!) you want XML that's compliant with
some XML Schema, then I fully agree with you that an XML webservice is
the thing you need; after all, the output (XML) is what you want and
it's in exactly the right format.

        However, we were talking about the interprocess communication
setup which is often based on webservices. In that setup, the format in
which the data is transported is not important, as the process
(indirectly) consuming the datastream will never see the format the data
is transported in, but will only see reincarnations of objects, rebuilt
from the data sent.

        In that light, it's essential that cyclic references and, for
example, complex types can be passed to the consumer.
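
        To make that concrete, here is a minimal sketch (hypothetical
Order/OrderLine types, .NET 1.x APIs): the BinaryFormatter used by
remoting round-trips a cyclic graph that the XmlSerializer refuses to
touch.

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Xml.Serialization;

// Hypothetical demo types, not from the original discussion.
[Serializable]
public class Order
{
        public OrderLine Line;
}

[Serializable]
public class OrderLine
{
        public Order Parent; // back-reference: the graph is cyclic
}

public class CycleDemo
{
        public static void Main()
        {
                Order order = new Order();
                order.Line = new OrderLine();
                order.Line.Parent = order;

                // BinaryFormatter tracks object identity, so the cycle
                // simply round-trips.
                MemoryStream stream = new MemoryStream();
                BinaryFormatter formatter = new BinaryFormatter();
                formatter.Serialize(stream, order);
                stream.Position = 0;
                Order copy = (Order)formatter.Deserialize(stream);
                Console.WriteLine(copy.Line.Parent == copy); // True

                // XmlSerializer walks the graph as a tree and gives up:
                // "A circular reference was detected while serializing
                // an object of type Order."
                try
                {
                        new XmlSerializer(typeof(Order))
                                .Serialize(new MemoryStream(), order);
                }
                catch (InvalidOperationException ex)
                {
                        Console.WriteLine(ex.InnerException.Message);
                }
        }
}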

> Now I know this is part of the argument about web services 
> not being suitable for some jobs and yes you're right - but 
> we need to take a view on why some restrictions are there 
> rather than say it's broken because it doesn't support x or y.

        Ok, fair enough. I'll rephrase that and say the XmlSerializer is
not up to the job of being in the stack which produces the datastream
for interprocess communication, because it has restrictions which would
severely limit that interprocess communication. I would normally call
that 'broken', as it IS used in a webservice no matter what. When I, for
example, try to develop a webservice and a client which communicate one
silly object with an interface typed member, nothing tells me this isn't
possible; I'll find out at runtime that some component somewhere (the
XmlSerializer) can't produce what I want it to produce. Perhaps the
XmlSerializer isn't the root of the problem; it's however the component
where the problem faces daylight :)

> >         Who hosts remoting services in IIS? Of course it's slower in
> > IIS. :) This is about communication between 2 applications on a
> > binary level. Why would you need a webserver/http consumer/xml
> > serialization etc. etc.? You just want to get the binary stream
> > across.
> 
> So what do you do about security (assuming you aren't going 
> to base your application architecture on unsupported sample 
> code)? I'm not saying that there aren't situations where a 
> distributed app is running in a trusted closed environment - 
> but there are a large number of situations where this is not 
> the case and for these IIS hosting gives a reasonable 
> approach - especially that once you go this route you get a 
> whole bunch of goodness with scalability that the TcpChannel 
> denies you by trying to hang on to its connections.

        True, it supplies a good set of services for the programmer.
I've never seen the sense of remoted services outside a trusted net
segment without full encryption. I mean: a stream to and from a
webservice is easily intercepted, so SSL, or a similar form of
encryption, is a must-have, even on LANs. Security is a tough issue with
distributed applications, as you need a service layer which handles
encryption and security tokens transparently, in a lot of setups where
you have to distribute data outside trusted net segments, like a
client-server setup with desktop systems and servers in a central server
area.

        I'm not sure about the scalability issue. IIS of course has a
connection scheduler built in and can dispatch connections under high
load easily, but I'm not sure that isn't also easily writable in a
remoting world; after all, the connections aren't kept alive forever
there either. It might require some more work, but it also lets you work
with only the things you need to get data across.

> But of course - once you have ASP.NET hosting why not use web 
> services - ah complex object graphs - let's move on

        No, not only object graphs. This silly class also won't be able
to make it across:
using System.Collections;

public class Foo
{
        private IList _bar;

        // XmlSerializer skips private state entirely; expose the IList
        // publicly and it refuses the interface-typed member outright.
        public IList Bar
        {
                get { return _bar; }
                set { _bar = value; }
        }

        public void DoSomethingWithBar() { /* ... */ }
}
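
        And nothing warns you at compile time; a minimal repro sketch
(a hypothetical FooRepro class reusing the Foo above) only blows up when
the serializer reflects over the type:

using System;
using System.Xml.Serialization;

public class FooRepro
{
        public static void Main()
        {
                try
                {
                        // The constructor throws at runtime, the moment
                        // it reflects over the IList-typed member.
                        new XmlSerializer(typeof(Foo));
                }
                catch (InvalidOperationException ex)
                {
                        // "Cannot serialize member Foo.Bar ... because
                        // it is an interface."
                        Console.WriteLine(ex.InnerException.Message);
                }
        }
}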

        Besides that, VS.NET redefines your complex types on the client
in the generated proxy. This can lead to casting errors, because on the
client you have to work with these regenerated types while you want to
work with the real types. Avoiding that requires some work by hand,
which degrades the ease of use of webservices dramatically.

> >         Again, Ian, I just said it was slower, I didn't mention a
> > situation nor a context. If I have to serialize a big object graph
> > to disk using the soap formatter it's 'faster' (it's pretty slow)
> > than the xmlserializer. I'm not serializing the graph to disk every
> > 20 seconds, just now and then. Why should I then conclude "the
> > xmlserializer is faster", because it isn't? In other situations it
> > is perhaps, due to the code it generates which produces hardcoded
> > xml for a given type...
> 
> Passing complex object graphs is a thorny problem for 
> distributed systems, even if we do have the same type system 
> on both ends. We need knowledge of every type in that graph 
> on both ends of the wire - even if they are interface based 
> (interfaces are still types). All concrete types must be 
> either serializable or inherit from MarshalByRefObject. We 
> better hope the former or we have network roundtrips to talk 
> to the objects. There are few situations where I have found 
> it necessary to pass massive amounts of state on method calls 
> - which is where object graphs come in useful. But assuming 
> we are doing something meaningful with the data - like 
> writing to a database, etc - then passing an XML stream is 
> going to be ameliorated by the number of database roundtrips 
> - especially if you process it using the streaming XML API 
> (XmlReader and friends) rather than load it into the DOM.

        Well, it depends on how you set up your system. If the client is
pretty thin, you can be sure you have to pump around big amounts of data
if you want to, for example, display data in grids and the like. Of
course you can fall back to stuff like datasets, however these are
cumbersome to work with, with their string-based indexers. It's often
convenient to have an object graph on the client to bind to a grid, or
to process to alter the gui state. It's thus key to get the object graph
as fast as possible to the other side of the wire. Having to translate
it into ASCII and back doesn't seem to me the 'fastest' approach. If it
turns out to be the fastest, the binary approaches aren't coded
correctly; how else can an ASCII-parsing piece of code be faster than a
piece of code which just has to set indexes to where each datablock
starts/ends?
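
        A crude, unscientific sketch of that comparison (hypothetical
Row type, .NET 1.x APIs, timing via DateTime since there is no
Stopwatch yet):

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using System.Xml.Serialization;

// Hypothetical payload type for the timing sketch.
[Serializable]
public class Row
{
        public int Id;
        public string Name;
}

public class SerializerTiming
{
        public static void Main()
        {
                Row[] rows = new Row[10000];
                for (int i = 0; i < rows.Length; i++)
                {
                        rows[i] = new Row();
                        rows[i].Id = i;
                        rows[i].Name = "row " + i;
                }

                // Binary round trip: no text parsing, just offsets and
                // type info.
                MemoryStream binStream = new MemoryStream();
                BinaryFormatter bin = new BinaryFormatter();
                DateTime start = DateTime.Now;
                bin.Serialize(binStream, rows);
                binStream.Position = 0;
                bin.Deserialize(binStream);
                Console.WriteLine("binary: " + (DateTime.Now - start));

                // Xml round trip: every value is formatted to text and
                // parsed back again on the way in.
                MemoryStream xmlStream = new MemoryStream();
                XmlSerializer xml = new XmlSerializer(typeof(Row[]));
                start = DateTime.Now;
                xml.Serialize(xmlStream, rows);
                xmlStream.Position = 0;
                xml.Deserialize(xmlStream);
                Console.WriteLine("xml:    " + (DateTime.Now - start));
        }
}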

> >         The XmlSerializer is severely broken as it can't produce Xml
> > for simple classes with interface typed members, it can't handle
> > cyclic references, it can't correctly handle classes which implement
> > IXmlSerializable (to work around these issues). I'd love to
> > implement IXmlSerializable to connect its methods to my ReadXml and
> > WriteXml methods, which do serialize cyclic references and interface
> > based types to xml in my classes, but I can't, due to the hardcoded
> > paths in XmlSerializer for that particular interface.
> 
> The IXmlSerializable issue is fixed in Whidbey IIRC

        'will be fixed', Richard :) Whidbey is a year away, and then it
will take at least 6 months before the masses move to it. I'm looking
forward to this 'fix'; however, today (and for the year to come) we
can't use it.
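
        For reference, this is the kind of wiring I'd like to be able to
rely on; a sketch with a hypothetical MyEntity class (in v1.x the
serializer reserves this interface for DataSet-like types, which is
exactly the hardcoded path complained about above):

using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical entity class implementing the v1.x interface shape.
public class MyEntity : IXmlSerializable
{
        public XmlSchema GetSchema()
        {
                // No fixed schema: the instance describes itself.
                return null;
        }

        public void ReadXml(XmlReader reader)
        {
                // ...rehydrate the object here, cyclic references and
                // interface typed members included...
        }

        public void WriteXml(XmlWriter writer)
        {
                // ...emit the xml for this object by hand...
        }
}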

> >         Aren't you agreeing with me that if you have two boxes, A
> > and B, and on both an application is running and communicating with
> > the other one on a binary level, that it is completely insane to run
> > this communication through a webserver/http client, xml
> > serializer/deserializer, soap envelopes etc.? :) Do you really see
> > the need for that full stack? I don't. In the early days we used
> > sockets on Unix and RPC. That's very low level and I fully agree a
> > bit more sophisticated layer on top of that might be very handy.
> > However a full stack of services on top of each other producing XML
> > in the end is nuts, as all the work done in that stack is not used:
> > the XML is not consumed as XML, it's just a different format the
> > data is in, but it was already in a format. That was the original
> > point of the argument started: xml is not needed in that situation,
> > so leave it out of it, no matter how much code is generated behind
> > the scenes to produce xml on the fly.
> > 
> 
> Well if you want two boxes to talk to each other and massive 
> scalability isn't an issue and neither is security then yes - 
> remoting is exactly the right solution. It depends on what 
> sort of app you write.

        Scalability as in: I want thousands of connections per second
and a fast connection dispatcher to cope with that? Perhaps. But I don't
see why a webservice is more secure. The data that comes and goes over
the wire is readable ASCII and can be intercepted, so it always has to
use encryption (SSL) anyway, and on the internet there is no concept of
'windows authentication', so you have to implement your own form of
authentication.

        In a lot of applications where webservices are now used,
scalability isn't the issue, because these servers serve an intranet or
extranet, not millions of people.

> Now back to the original point about web services and COM+. 
> Indigo is "meant" to add distributed Tx and the rest of the 
> WSA stuff so that we have a whole bunch of services available 
> to code if we want them. We then know that when we write 
> stuff it's not going to be redundant if we change our 
> implementation to another technology. Also multiple endpoints 
> can implement the same service in different technologies. No 
> one is bound to anything (other than the web service specs). 
> Now I agree that the state of web services is in its infancy 
> at the moment and I would make a bet that a binary XML wire 
> rep starts coming through in the next few years.

        Why does it still have to be XML? Isn't a simpler format more
appropriate, where the CLR can stream its objects into a binary stream
and re-instantiate them from it, no matter whether the CLR is on a Mac,
x86, 64-bit or 32-bit box? But I think that's also what you meant by
'binary xml', though I'm not sure :).

        I find Indigo the best part (and also the only part worth
discussing) of Longhorn. It's just sad that it is at least 2 years away,
which makes it at least 3 years from now before it is widely used.

> As you have argued, the issue is now and no - there isn't a 
> solution that is not based on web service technology on 
> Microsoft's agenda in the long term.

        On a 'service', not on a webservice per se. Get data across,
call a method: two things which are very old in computer land; unix
boxes (and NT boxes) have been able to do that for years. It's now more
approachable with the vs.net integration for creating webservices, but
the concept is the same. I truly hope we're not pumping ASCII blocks
around in 5 years 'because that's the only way'. If passing data in
ASCII format is faster than any other format, the other formats aren't
optimized/implemented well. After all, they don't need text parsing.

        FB
