On Nov 28, 2006, at 5:44 PM, Kevin S. Clarke wrote:
>> Is there a standard for specifying how textual analysis works as
>> well, so that tokenization can be standardized across these XQuery
>> engines as well?
>
> Not that I know. What I've seen so far is that tokenization is
> implementation specific. Perhaps this is something that is
> configurable so that implementations can be set up and then queried
> consistently. Any indexing engine worth its salt should be
> configurable I'd think. There is nothing I'm aware of in the fulltext
> work though that defines how things are indexed.
If you leave out all the configurability in tokenization for indexing
and querying from the XQuery standard, then there will surely be
extensions needed for concrete implementations to allow this stuff to
be specified. Interesting issue.
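
To make the concern concrete, here is a minimal Python sketch of how the same query can match or miss depending on the tokenizer. Everything in it is an assumption for illustration: the function names are hypothetical and do not come from any XQuery engine or from the full-text work, and the two tokenizers are simply stand-ins for "engine A" and "engine B".

```python
import re


def whitespace_tokenize(text):
    # Hypothetical "engine A": split on whitespace only,
    # so case and attached punctuation are preserved.
    return text.split()


def normalizing_tokenize(text):
    # Hypothetical "engine B": lowercase, then keep runs of
    # ASCII letters and digits as tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_term(text, term, tokenize):
    # Tokenize both the document and the query with the same
    # tokenizer, then require every query token to be present.
    tokens = set(tokenize(text))
    return all(t in tokens for t in tokenize(term))


doc = "Full-text search in XQuery, standardized?"

# Engine A indexes "XQuery," (with the comma), so the query misses.
print(contains_term(doc, "XQuery", whitespace_tokenize))
# Engine B indexes "xquery", so the same query matches.
print(contains_term(doc, "XQuery", normalizing_tokenize))
```

The query text is identical in both calls; only the tokenizer differs, which is the kind of setting a concrete implementation would probably have to expose through its own extensions.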
For all you Java-savvy folks out there, how about "standards" like
J2EE that are supposed to make it easy to move an application from
one vendor's app server to another? That works for the simplest of
applications, but all vendors have their own custom deployment
descriptors too.
Erik