On Friday 07 February 2003 08:26 pm, S Woodside wrote:
> I think this is a bit off-topic but still interesting.
>
> On Friday, February 7, 2003, at 03:51 PM, Tod Harter wrote:
> > Well, I'm not an expert on RNG by any means, so I won't get into a
> > debate about which is better. I expect it depends on the task...
> > SOME form of schema is very necessary however for many
> > applications. When you say below "let the software figure out" the
> > way the software figures out IS TO USE A SCHEMA!!! Thats what a
> > schema is for, is to let a piece of software know how to tell if
> > the document its validating is actually a valid instance or not.
>
> There's a big difference, though, between validating the structure of
> the document and validating the data. Also, using a (structural) schema
> implies a certain level of formalism, that can be very useful and
> important but isn't always necessary. Actually, I think in many cases
> writing an XSLT implies a schema even if you don't have one written up.

Well, I agree that a given XSLT assumes SOMETHING about the input it's
processing, to the extent that if the input is drastically different from
what the designer imagined then the output is likely to be as well (though
you might be surprised at how robust XSLT can be).

There IS a difference between validating structure and data, but it's not
as clear-cut as you imply. That is, there are SOME rules that have nothing
to do with structure, like say "the birth date must come before the death
date". Those are rules best enforced in most cases by procedures or some
sort of constraint language, though it's worth noting that XSLT could do
the job in some cases at least. It's a lot like with an RDBMS, sometimes
you just need to have those stored procedures...

> > Data types in XML are just a part of that, a shorthand if you will,
> > so that if you say "oh thats a date" then you don't have to spell
> > out that its text with DD::MM::YYYY (or whatever) and deal with all
> > the possibly acceptable minor subvariations thereof on your own.
> > XML data types when used in things like XML Schema or RNG just
> > moves a lot of definition to another level, and in the process
> > insures that because that level is standardized its more likely to
> > be adhered to in more places.
>
> Well, not exactly. I've been looking at schemas that are too formal on
> dates (picking on that example, because it's shown up most often as a
> problem for me), e.g. instead of having <date/> they have
> <year/><month/><day/>, or else they use WXS typing expressions to try
> to define what a proper date looks like. Well, what if I want to type
> in February 5, 2001, instead of 2001/02/05 ? Or Feb 5, 01, or
> 2002/Feb/5 or whatever. I think the best way to handle this is to just
> have <date/> and let the application that needs to do something
> intelligent with the date figure it out.
> Push the intelligence to the
> edge of the network, because there's plenty of code out there that can
> take a string that's supposed to be a date and figure it out.

But the point is that when I stick data in my database I want a concise
way of describing what is and what isn't acceptable. I don't want to have
to rely on an application for that. What happens when the app is told
'iterate through all the <date/> elements and return the most frequent
month'? If one of your dates is bogus what do you do? If you never allowed
that data into your system in the first place, because you validated it
when it was submitted, then it's not a problem. Granted you can devise
error handling in every app that consumes the data, but now you're just
multiplying your work. Large databases with data coming in from diverse
sources will VERY quickly turn to pea soup if you don't have pretty rigid
rules.

That doesn't preclude having a standard for date objects that allows a
wide variety of formats, but remember that you WILL pay a performance
penalty to validate them! I mean Date::Manip is 3 orders of magnitude
slower than less flexible packages that accept only fixed formats.

> > SQL and XML are meant to solve different problems. They can be
> > complementary in some cases or alternatives in others, but
> > mastering both the relational and tree/hedge/graph ways of
> > representing data is necessary to have a complete basis in data
> > processing theory and practice. Just as its necessary to understand
> > both declarative and procedural/OO systems on the processing side.
>
> Right, I'm not saying SQL is a bad thing, just that I don't like it. I
> think that it's very cool that XML is flexible and self-describing.
> Part of my angst comes from my impression that it's moving towards
> strong typing and that's a bad thing.
> Heck, perl is not strongly-typed,
> Objective-C is not strongly type, and really those are my languages of
> choice right now, and both are perfectly capable of high-powered tasks.
>

There is a huge difference between strong typing vs weak typing in a
language and in a database. In a language, weak typing can reduce
maintenance requirements and has other possible benefits. In most
databases it just opens you up to problems down the road. In those cases
where you DO want relaxed typing in your database you can certainly use
text fields.

Personally, what I see as the advantage of XML is that it IS possible to
have well-defined and typed data and yet construct ad-hoc collections OF
such data. I mean the constraints of RDBMS have less to do with typing in
my mind than with its rigid overall structure. If XML could be indexed and
queried as efficiently as relational data is now, I'd give up most uses of
RDBMS.

> simon
>
> ---
> www.simonwoodside.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

--
Tod Harter
Giant Electronic Brain
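
P.S. The validate-when-submitted argument above can be sketched in a few
lines of Python. This is only an illustration, not something from the
thread: the <person>, <born> and <died> element names, the fixed
YYYY/MM/DD format, and all function names are assumptions made up for the
example. Records whose dates do not parse, or whose birth date does not
come before the death date, are rejected up front, so the "most frequent
month" consumer needs no error handling of its own.

```python
from collections import Counter
from datetime import datetime
import xml.etree.ElementTree as ET

# Assumption: one fixed date format is accepted at submission time.
DATE_FORMAT = "%Y/%m/%d"


def parse_strict(text):
    """Parse a date in DATE_FORMAT; raise ValueError for anything else."""
    return datetime.strptime(text.strip(), DATE_FORMAT).date()


def validate_person(elem):
    """Return a list of problems for a hypothetical <person> element."""
    problems = []
    parsed = {}
    for tag in ("born", "died"):
        raw = elem.findtext(tag)
        if raw is None:
            problems.append(f"missing <{tag}>")
            continue
        try:
            parsed[tag] = parse_strict(raw)
        except ValueError:
            problems.append(f"<{tag}> is not in {DATE_FORMAT} format: {raw!r}")
    # A data rule that has nothing to do with document structure.
    if len(parsed) == 2 and parsed["born"] >= parsed["died"]:
        problems.append("birth date must come before death date")
    return problems


def most_frequent_month(people):
    """Consumer side: assumes every record was validated on the way in."""
    months = Counter(parse_strict(p.findtext("born")).month for p in people)
    return months.most_common(1)[0][0]


SAMPLE = """<people>
  <person><born>1901/02/05</born><died>1980/07/01</died></person>
  <person><born>1910/02/17</born><died>1990/01/01</died></person>
  <person><born>Feb 5, 01</born><died>1950/03/03</died></person>
  <person><born>1960/06/01</born><died>1950/06/01</died></person>
</people>"""

accepted, rejected = [], []
for person in ET.fromstring(SAMPLE).iter("person"):
    problems = validate_person(person)
    if problems:
        rejected.append(problems)   # never enters the data store
    else:
        accepted.append(person)

print(len(accepted), "accepted;", len(rejected), "rejected")
print("most frequent birth month:", most_frequent_month(accepted))
```

A more forgiving parser could be swapped in for parse_strict, at the
validation cost mentioned above; the point of the sketch is only that the
check happens once, at the door, rather than in every consumer.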
