Re: Schema Validation (WAS: [axis2] Validating Messages :: WSDL :: )

Anne Thomas Manes Wed, 12 Jul 2006 12:57:15 -0700

The schema describing the message structure is the published
interface. And a schema can include all kinds of validation
requirements (e.g., default values, fixed values, uniqueness,
referential integrity, etc.). Some of these requirements are extremely
expensive to validate. A databinding framework does a reasonably
decent job at performing structure validation, but it typically
doesn't do data/content validation. You could seriously mar system
performance if you required the SOAP server to validate all messages
going in or out. So yes, my primary objection to automatic
validation is entirely about performance. If validation came for free
(as does the structural validation when using a databinding framework)
then obviously that would be cool.
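
To illustrate the difference between structural checks and content
validation, here is a minimal sketch using the standard
javax.xml.validation API (JAXP 1.3). The order/sku schema, the class
name, and the isValid helper are hypothetical, made up for this
example; it is not how Axis or any particular databinding framework
works. The second message has the expected structure and fails only
on the pattern facet, which is the kind of content check a
databinding layer typically skips.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ContentValidationSketch {
    // Hypothetical schema: <order> holds one <sku> restricted by a pattern facet.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='sku'><xs:simpleType>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:pattern value='[A-Z]{3}-[0-9]{4}'/>"
      + "</xs:restriction>"
      + "</xs:simpleType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Compile the schema from the string above.
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true only if the message satisfies structure AND content constraints.
    static boolean isValid(String xml) {
        try {
            compileSchema().newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handling throws on any validity error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Right structure, right content.
        System.out.println(isValid("<order><sku>ABC-1234</sku></order>"));
        // Right structure, but the value violates the pattern facet.
        System.out.println(isValid("<order><sku>abc</sku></order>"));
    }
}
```

The sketch recompiles the schema on every call only to stay short; a
real service would compile it once and reuse it.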


But I do have another objection -- schema poisoning is a security
threat. Someone can use validation as a way to launch a DoS attack or
to inject malevolent content into a document.

If you anticipate that not all clients will send valid messages, then
it's a great idea to use a hardware-accelerated intermediary to
validate the messages and protect against various security attacks.
You can set the intermediary to validate all messages or just messages
that come from less than trustworthy sources. And the overhead is
minimal.
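
The "validate everything or only the less trusted sources" policy can
also be sketched in software. This is only an illustration of the
policy, not of a hardware intermediary: the ping schema, the sender
names, and the class itself are hypothetical. It uses the standard
javax.xml.validation API and switches on the JAXP secure-processing
feature, which asks the implementation to apply its own processing
limits; that is a mitigation, not a complete defense.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SelectiveValidationSketch {
    // Hypothetical message schema: <ping><seq>integer</seq></ping>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='ping'><xs:complexType><xs:sequence>"
      + "<xs:element name='seq' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    private final Schema schema;
    private final Set<String> trustedSenders;

    SelectiveValidationSketch(Set<String> trustedSenders) {
        this.trustedSenders = trustedSenders;
        this.schema = compileSchema();
    }

    // Compile once; a Schema object is immutable and safe to share across threads.
    private static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Ask the implementation to apply its secure-processing limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Accepts trusted senders unchecked; validates messages from everyone else.
    boolean accept(String sender, String xml) {
        if (trustedSenders.contains(sender)) {
            return true; // validation skipped by policy
        }
        try {
            // A Validator is not thread-safe, so create one per message.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SelectiveValidationSketch gate =
            new SelectiveValidationSketch(Collections.singleton("partner-a"));
        String bad = "<ping><seq>abc</seq></ping>";
        System.out.println(gate.accept("partner-a", bad)); // true: trusted, not checked
        System.out.println(gate.accept("unknown", bad));   // false: checked and invalid
        System.out.println(gate.accept("unknown", "<ping><seq>7</seq></ping>")); // true
    }
}
```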

Anne


On 7/12/06, Doug B <[EMAIL PROTECTED]> wrote:

Thanks, Anne.

I guess I want to draw the line at what is specifiable in the
published interface.  So, yes, regular expressions, enumerations, etc.
I'd push the decision about using those back on the Schema design
phase, not on the implementation engine.  I can see you don't agree
:-).  Where would you draw the line?  Just enforcing the tree?  With
or without enforcement of required/optional elements?

Again, is it primarily (current) performance that causes you to not
want to automatically enforce the full interface as specified?  Said
another way, if validation was instantaneous, would you/everyone do
it?  If not, shouldn't we have chosen something more limited than
Schema for specifying the interface?  What's the point of such a rich
specification if you're not going to enforce it?

(Yes, XML is extensible, but I wouldn't expect messages sent over a Web
Service interface to be.)

Also, if I controlled both the client and server sides of a service, I
don't think I'd be using Web Services.  So I'm assuming especially the
server will want to make sure that everything it receives is valid,
one way or another.

Or should I go back to very loose Schemas where everything is just a
string?  In case I never said so, my entry into Web Services was from
custom socket interfaces where we had to describe in a text document
every field and what it was allowed to contain.  I thought Schema
would take most of that work away from me, but there's not much point
in explaining all the constraints in the Schema if I still have to
manually duplicate them in code logic.

Thanks, as always, for your time and advice.  I realize my experience
is limited, so I might be missing a lot of scenarios that I should be
considering.

Doug

On 7/12/06, Anne Thomas Manes <[EMAIL PROTECTED]> wrote:
> Sorry for the silence. I took a couple of days off.
>
> True schema validation is an extremely expensive process -- especially
> if you put uniqueness or referential integrity constraints into the
> schema. Even checking for nulls can be expensive if it's a large
> document instance. If you have control over the client environment and
> know that the client will send only valid messages, then it absolutely
> doesn't make sense to validate all incoming messages.
>
> Also, in some circumstances, you may want to allow systems to add
> extraneous information or change a type or whatever for extensibility
> reasons. After all, XML is supposed to be extensible.
>
> Validation should be performed only when explicitly requested.
>
> As I said in my original response, a databinding framework will always
> perform some basic validation -- it's expecting to get a particular
> XML structure that it knows how to map to a particular object graph.
> If you feed it an unexpected XML structure, it will barf. That's not
> the same as validation, though. Is it appropriate to check for valid
> nulls during data binding? Enumerations? Regular expression formatting
> constraints? Maybe. But where do you draw the line?
>
> Anne
>
> On 7/12/06, Doug B <[EMAIL PROTECTED]> wrote:
> > Anne doesn't appear to be around right now, but I'll bug her when it
> > looks like she is.  In the meantime, surely some of the other list
> > readers have some opinions on this topic?
> >
> > On 7/9/06, Benjamin Fan <[EMAIL PROTECTED]> wrote:
> > >
> > > If there is going to be a discussion then I would very much like to
> > > > participate in it. I am in the middle of building a production system where
> > > > I do in fact need to validate against the schema. In fact the WSDL (doc
> > > > literal) will form the basis of a commercial interface specification for 3rd
> > > > parties. They will not be Java.
> > >
> > > Best Regards,
> > >
> > >
> > > Benjamin Fam
> > >
> > >
> > >
> > > On 7/10/06, Doug B <[EMAIL PROTECTED]> wrote:
> > > > <warning: long, "best practice" questions to follow>
> > > >
> > > > Interesting to hear you say that, Anne.  I've been on a multi-year
> > > > quest to get automatic, fast validation out of a Web Services engine.
> > > > Conceptually, it always seemed like the "right" approach (especially
> > > > for Document-Literal).  If you're having to parse XML anyway, and your
> > > > XML parser can validate when it does so, why not do it, actually
> > > > enforcing your restrictive WSDL/Schema at the layer that defines and
> > > > understands it?
> > > >
> > > > At the time, Axis did not do this at all, but my Bugzilla feature
> > > > request seemed to get agreement that this was a good goal:
> > > >
> > > > http://issues.apache.org/jira/browse/AXIS-222
> > > >
> > > > Also at that time, I tried out Systinet Wasp, which did do
> > > > auto-validation.  Much later, I came across the DeveloperWorks article
> > > > about combining Axis and Castor to get auto-validation:
> > > >
> > > >
> > > http://www-128.ibm.com/developerworks/webservices/library/ws-castor/
> > > >
> > > > This approach was exactly what I was seeking, so we started using it.
> > > > It's very simple, but it does seem slow.  (I'm also not getting pure
> > > > POJOs for my "schema" beans, which I'd really like, but Castor's
> > > > aren't too bad.  Haven't found an XML framework that generates clean
> > > > POJOs from a schema and don't want to write mapping files if I can
> > > > help it.)
> > > >
> > > > Along the way, we asked everyone we could whether or not
> > > > auto-validation was a good approach, and we got responses all across
> > > > the spectrum.  Clearly some people expect and want it, but others
> > > > don't.  Some engines can do it, but others can't.  If the only reason
> > > > not to do it is performance, will the newer parsers or something like
> > > > JiBX make a significant difference?  What if you have access to an XML
> > > > appliance?  Would more people do it in that case?  I suppose an engine
> > > > that let you enable and disable it at will would be nice.
> > > >
> > > > I'm not exactly sure where I stand at this point, but I'm not quite
> > > > willing to give up on the "dream".  That is, the dream of wholly
> > > > specifying my interface via WSDL/Schema, and having a WS engine
> > > > completely wrap, translate, and validate
> > > > requests/responses/exceptions, hiding from my business code the fact
> > > > that it's even being accessed as a Web Service, but ensuring that
> > > > anything that comes through the Web Service interface doesn't violate
> > > > the Web Service's specification.  I've started accepting the value of
> > > > having my business code do business validations as well (namely in
> > > > cases where I want to use it from other interfaces), but it just seems
> > > > too natural, logically, for the XML parsing layer to do it.
> > > > Otherwise, you're throwing away much of the information you've
> > > > carefully specified (in a handwritten, authoritative, contract WSDL,
> > > > at least).
> > > >
> > > > I'd be happy to take this discussion somewhere else, since it's really
> > > > not specific to Axis, if you'd like and if you have time to
> > > > participate in it.  Thanks.
> > > >
> > > > Doug
> > > >
> > > > On 7/7/06, Anne Thomas Manes < [EMAIL PROTECTED]> wrote:
> > > > > > Axis makes no attempt to validate messages. (It's a very expensive
> > > > > process that would significantly degrade performance). A databinding
> > > > > system will catch many validation issues, but it also does not do true
> > > > > validation. If you pass in elements that it doesn't expect, it will
> > > > > reject the request. But as long as the message matches what the
> > > > > databinding system can deal with, it will pass.
> > > > >
> > > > > If you want to validate the message, then use a handler or
> > > > > intermediary to do so.
> > > > >
> > > > > Anne
> > > > >
> > > > >
> > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail:
> > > [EMAIL PROTECTED]
> > > > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > > >
> > > > >
> > > >
> > > >
> > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail:
> > > [EMAIL PROTECTED]
> > > > For additional commands, e-mail: [EMAIL PROTECTED]
> > > >
> > > >
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
