On Friday 07 February 2003 08:26 pm, S Woodside wrote:
> I think this is a bit off-topic but still interesting.
>
> On Friday, February 7, 2003, at 03:51  PM, Tod Harter wrote:
> > Well, I'm not an expert on RNG by any means, so I won't get into a debate
> > about which is better. I expect it depends on the task... SOME form of
> > schema is very necessary however for many applications. When you say
> > below "let the software figure out" the way the software figures out IS
> > TO USE A SCHEMA!!! That's what a schema is for, is to let a piece of
> > software know how to tell if the document it's validating is actually a
> > valid instance or not.
>
> There's a big difference, though, between validating the structure of
> the document and validating the data. Also, using a (structural) schema
> implies a certain level of formalism, that can be very useful and
> important but isn't always necessary. Actually, I think in many cases
> writing an XSLT implies a schema even if you don't have one written up.

Well, I agree that a given XSLT assumes SOMETHING about the input it's 
processing, to the extent that if the input is drastically different from 
what the designer imagined then the output is likely to be as well (though 
you might be surprised at how robust XSLT can be).

There IS a difference between validating structure and data, but it's not as 
clear cut as you imply. That is, there are SOME rules that have nothing to do 
with structure, like say "the birth date must come before the death date". 
Those are rules best enforced in most cases by procedures or some sort of 
constraint language, though it's worth noting that XSLT could do the job in 
some cases at least. It's a lot like with an RDBMS, sometimes you just need to 
have those stored procedures...
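
To make that kind of rule concrete, here is a minimal Python sketch of a
cross-field check that a purely structural schema would not express. The
record layout (a dict with "birth" and "death" keys holding YYYY-MM-DD text)
and the name check_lifespan are assumptions made up for illustration, not
part of any real schema or library.

```python
from datetime import date

def check_lifespan(record):
    """Return a list of rule violations for one (hypothetical) person record."""
    errors = []
    # Assumed layout: ISO YYYY-MM-DD strings; "death" may be absent.
    birth = date.fromisoformat(record["birth"])
    death_text = record.get("death")
    if death_text is not None:
        death = date.fromisoformat(death_text)
        # The rule from the text: birth must come strictly before death.
        if not birth < death:
            errors.append("birth date must come before death date")
    return errors

print(check_lifespan({"birth": "1900-01-01", "death": "1980-05-05"}))
print(check_lifespan({"birth": "1980-05-05", "death": "1900-01-01"}))
```

The structure of both records is identical; only the procedure can tell that
the second one is wrong.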

>
> > Data types in XML are just a part of that, a shorthand if you will, so
> > that if you say "oh that's a date" then you don't have to spell out that
> > it's text with DD::MM::YYYY (or whatever) and deal with all the possibly
> > acceptable minor subvariations thereof on your own. XML data types when
> > used in things like XML Schema or RNG just move a lot of definition to
> > another level, and in the process ensure that because that level is
> > standardized it's more likely to be adhered to in more places.
>
> Well, not exactly. I've been looking at schemas that are too formal on
> dates (picking on that example, because it's shown up most often as a
> problem for me), e.g. instead of having <date/> they have
> <year/><month/><day/>, or else they use WXS typing expressions to try
> to define what a proper date looks like. Well, what if I want to type
> in February 5, 2001, instead of 2001/02/05 ? Or Feb 5, 01, or
> 2002/Feb/5 or whatever. I think the best way to handle this is to just
> have <date/> and let the application that needs to do something
> intelligent with the date figure it out. Push the intelligence to the
> edge of the network, because there's plenty of code out there that can
> take a string that's supposed to be a date and figure it out.

But the point is that when I stick data in my database I want a concise way of 
describing what is and what isn't acceptable. I don't want to have to rely on 
an application for that. What happens when the app is told 'iterate through 
all the <date/> elements and return the most frequent month'? If one of your 
dates is bogus, what do you do? If you never allowed that data into your 
system in the first place because you validated it when it was submitted, then 
it's not a problem. Granted, you can devise error handling in every app that 
consumes the data, but now you're just multiplying your work. Large databases 
with data coming in from diverse sources will VERY quickly turn to pea soup 
if you don't have pretty rigid rules. That doesn't preclude having a standard 
for date objects that allows a wide variety of formats, but remember that you 
WILL pay a performance penalty to validate them! I mean Date::Manip is 3 
orders of magnitude slower than less flexible packages that accept only fixed 
formats.
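
A rough Python sketch of the "validate when it is submitted" argument,
assuming a single fixed format (YYYY-MM-DD here, chosen only for the example).
The names accept_date and most_frequent_month are hypothetical. The point is
that the consumer needs no error handling, because bogus values never get
stored.

```python
from collections import Counter
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # assumed fixed format, for illustration only

def accept_date(text):
    """Validate at submission time: return a date or raise ValueError."""
    return datetime.strptime(text, DATE_FORMAT).date()

def most_frequent_month(dates):
    """Consumer side: every stored value is already known to be a real date."""
    counts = Counter(d.month for d in dates)
    return counts.most_common(1)[0][0]

submitted = ["2001-02-05", "2002-02-17", "2001-11-30", "February 5, 2001"]
stored, rejected = [], []
for text in submitted:
    try:
        stored.append(accept_date(text))
    except ValueError:
        rejected.append(text)  # turned away at the door, never stored

print(most_frequent_month(stored))  # 2
print(rejected)                     # ['February 5, 2001']
```

Accepting the free-form variant as well would mean swapping accept_date for a
more flexible parser, which is where the validation cost mentioned above comes
in.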

>
> > SQL and XML are meant to solve different problems. They can be
> > complementary in some cases or alternatives in others, but mastering
> > both the relational and tree/hedge/graph ways of representing data is
> > necessary to have a complete basis in data processing theory and
> > practice. Just as it's necessary to understand both declarative and
> > procedural/OO systems on the processing side.
>
> Right, I'm not saying SQL is a bad thing, just that I don't like it. I
> think that it's very cool that XML is flexible and self-describing.
> Part of my angst comes from my impression that it's moving towards
> strong typing and that's a bad thing. Heck, Perl is not strongly-typed,
> Objective-C is not strongly typed, and really those are my languages of
> choice right now, and both are perfectly capable of high-powered tasks.
>
There is a huge difference between the strong vs weak typing question in a 
language and in a database. In a language, weak typing can reduce maintenance 
requirements and has other possible benefits. In most databases it just opens 
you up to problems down the road. In those cases where you DO want relaxed 
typing in your database you can certainly use text fields.

Personally, what I see as the advantage of XML is that it IS possible to have 
well defined and typed data and yet construct ad-hoc collections OF such 
data. I mean, the constraints of an RDBMS have less to do with typing in my 
mind than with its rigid overall structure. If XML could be indexed and 
queried as efficiently as relational data is now, I'd give up most uses of 
RDBMS.

> simon
>
> ---
> www.simonwoodside.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

-- 
Tod Harter
Giant Electronic Brain
