On Fri, 2004-10-22 at 17:28, Andy Hassall wrote:
[snip]
> I can see there being an ultimate character set torture test with the
> Encode module, and working out the intersection between NLS_LANG's character
> set and that of the database character set and again for the database
> national character set, and running each character through an insert and
> select and making sure it comes out the same as it went in (taking into
> account UTF8 flags and so on). No time to write such a thing at the moment
> though :-(
Hi Andy,

We talked about exhaustive character set testing and decided that testing representative cases and boundary conditions would be a more profitable use of our time. The tests we came up with were certainly challenging enough to flush out a LOT of issues.

If you think we have missed some cases that might be useful to test, look at t/nchar_test_lib.pl for easy ways to add them. You can use it as a pattern for defining new character sets to test, and new tests should be easy to clone from one of the existing tests in the t/2[1-4]*CHARSETNAME.t series.

Testing more combinations of client character sets might be useful, but the main goal was to get UTF8 working correctly. In fact, it was helpful for us to realize that while we could get some bizarre combinations to work in some cases, it was not universal and would not be as portable as focusing on pure UTF8 -- which is the way everything is heading anyway. I think we have come pretty darn close, especially for Oracle 9.2 and above.

We do test single-byte character sets with the high bit set in t/23wide_db_8bit.t. That test does assume a Unicode database character set, but nowadays, if one wanted to support extended characters, I would think the smart way to go would be to use a Unicode database character set.

Lincoln
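For what it's worth, the "representative cases and boundary conditions" idea can be sketched without a database at all, using just the Encode module that Andy mentions: pick a few probe strings (ASCII, a Latin-1 high-bit character, a character outside the 8-bit set) and check whether each survives a round trip through a given client character set. This is only an illustrative sketch -- the helper name `roundtrip_ok` is mine, not part of the DBD::Oracle test suite, and the real tests also push the strings through an insert/select against the database.

```perl
use strict;
use warnings;
use Encode qw(decode encode is_utf8);

# Hypothetical helper (not from t/nchar_test_lib.pl): encode a string
# into the given character set and decode it back, returning true only
# if the round trip reproduces the original string exactly.
sub roundtrip_ok {
    my ($str, $charset) = @_;
    # FB_CROAK makes encode die on unrepresentable characters,
    # which eval turns into a clean failure.
    my $bytes = eval { encode($charset, $str, Encode::FB_CROAK) };
    return 0 unless defined $bytes;
    return decode($charset, $bytes) eq $str;
}

# Representative and boundary cases rather than an exhaustive sweep:
my %cases = (
    ascii => "plain ASCII",
    eacute => "caf\x{e9}",   # high-bit single-byte character
    euro  => "\x{20ac}",     # Euro sign: outside ISO-8859-1
);

for my $name (sort keys %cases) {
    my $s = $cases{$name};
    printf "%-6s latin1=%d utf8=%d utf8_flag=%d\n", $name,
        roundtrip_ok($s, 'iso-8859-1') ? 1 : 0,
        roundtrip_ok($s, 'UTF-8')      ? 1 : 0,
        is_utf8($s)                    ? 1 : 0;
}
```

The Euro sign is the interesting boundary case: it round-trips through UTF-8 but not through ISO-8859-1, which is exactly the kind of intersection-of-character-sets issue Andy describes.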
