On 30 Nov 2009, at 5:51pm, Nicolas Williams wrote:

> Consider a column that contains a person's last name.  Q: do proper
> names have a language?  A: No, since people can be from all over and
> even within a single country may have last names of various radically
> different origins.

But what is the purpose of giving a column a collation?  Why, to allow it to be 
sorted and indexed, of course.  And for it to be indexed, every value in the 
column must be comparable to every other value.  So it might be sufficient to 
simply declare the column as having a language:

ALTER TABLE myTable ADD COLUMN familyname UNICODE LANG Deutsch

Actually, we'd probably use ISO 639-3 codes:

ALTER TABLE myTable ADD COLUMN familyname UNICODE LANG deu

That would be sufficient to allow standard SQL operations like indexing and 
comparison to be implemented.  The column's language could either be absolute, 
or be used as a default for individual values that do not declare a language 
of their own.  On the other hand, it might not be necessary to declare the 
language for each column: it's likely that all columns in any one database 
would want to use the same language for collation.
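
For what it's worth, something close to a per-column language can already be 
approximated with a user-defined collation.  The following is only a rough 
sketch using Python's sqlite3 module: the collation name 'deu', the table 
'people' and the folding rules (a crude phonebook-style treatment of umlauts) 
are my own assumptions for illustration, not real German collation rules and 
not the UNICODE LANG syntax proposed above.

```python
import sqlite3

# Assumed, heavily simplified folding: umlauts sort as their two-letter
# expansions. Real German collation rules are more involved than this.
_FOLD = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_key(s):
    """Return a comparison key for s under the simplified rules above."""
    return s.lower().translate(_FOLD)


def collate_deu(a, b):
    """SQLite collation callback: negative, zero or positive, like strcmp."""
    ka, kb = german_key(a), german_key(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
# "deu" is a hypothetical collation name borrowed from the ISO 639-3 code.
conn.create_collation("deu", collate_deu)

# The column declares its collation once; the index and ORDER BY inherit it.
conn.execute("CREATE TABLE people (familyname TEXT COLLATE deu)")
conn.execute("CREATE INDEX people_by_name ON people (familyname)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Zimmermann",), ("Özil",), ("Muller",), ("Müller",)],
)

rows = [r[0] for r in conn.execute(
    "SELECT familyname FROM people ORDER BY familyname")]
binary_rows = [r[0] for r in conn.execute(
    "SELECT familyname FROM people ORDER BY familyname COLLATE BINARY")]
print(rows)
print(binary_rows)
```

The point is only that one declaration on the column makes every value 
comparable to every other, which is all the index needs.  The collation has 
to be registered on every connection that touches the index.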

> Note too that Unicode has codepoints for specifying the language that
> the subsequent text is written in.

I did not know this!  This makes things simpler.  Are you talking about

http://unicode.org/reports/tr35/ ?

That appears to be a way of specifying a language outside the text stream, 
not inside it.

> Such codepoints could be used for
> deriving a collation from some text.  But again, I don't think this will
> prove useful, both, for the reasons given above (SELECT ...  ORDER BY
> ... COLLATE lang_of(...) makes no sense) and also because users won't
> know how to ensure that such language tags are embedded in the text that
> they write.

You could provide a function in SQLite that combines two values, the text and 
its language, into the language-tagged Unicode form:

TEXTINLANG(theText, theLanguage)

and this would be sufficient.
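
As a rough sketch of what such a function might do, here it is registered as 
a user-defined function through Python's sqlite3 module.  I am assuming the 
in-stream mechanism is the Unicode tag characters (U+E0001 LANGUAGE TAG 
followed by tag copies of the ASCII language code), which as far as I can tell 
the Unicode Standard discourages for this purpose.  LANG_OF is a hypothetical 
helper named after the lang_of(...) in the quoted text; neither function 
exists in SQLite itself.

```python
import sqlite3

LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG
TAG_BASE = 0xE0000           # tag characters mirror ASCII at this offset


def text_in_lang(the_text, the_language):
    """Prefix the_text with a Unicode language tag built from the_language."""
    if the_text is None or the_language is None:
        return None
    # Tag characters exist only for printable ASCII (U+E0020..U+E007E).
    if not (the_language.isascii() and the_language.isprintable()):
        raise ValueError("language code must be printable ASCII")
    tag = "".join(chr(TAG_BASE + ord(c)) for c in the_language)
    return LANGUAGE_TAG + tag + the_text


def lang_of(text):
    """Return the language code embedded at the start of text, or None."""
    if text is None or not text.startswith(LANGUAGE_TAG):
        return None
    code = []
    for ch in text[1:]:
        cp = ord(ch)
        if not 0xE0020 <= cp <= 0xE007E:
            break  # first ordinary character ends the tag
        code.append(chr(cp - TAG_BASE))
    return "".join(code)


conn = sqlite3.connect(":memory:")
# Both SQL function names are hypothetical; they are not part of SQLite.
conn.create_function("TEXTINLANG", 2, text_in_lang)
conn.create_function("LANG_OF", 1, lang_of)

tagged = conn.execute("SELECT TEXTINLANG('Müller', 'deu')").fetchone()[0]
recovered = conn.execute("SELECT LANG_OF(?)", (tagged,)).fetchone()[0]
print(len(tagged), recovered)
```

This only gets the tag into and out of the stored text.  Using it to choose 
a collation per value is the part that would still need parser support.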

I agree that it cannot be done in an elegant manner without at least one slight 
modification to the SQLite syntax parser.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
