On Thu, Feb 27, 2014 at 9:39 AM, Don Guinn <dongu...@gmail.com> wrote:
> Although the unicode value of 'þ' is less than 256 it still must be
> represented with two bytes in UTF-8. This is where it gets confusing to
> view UTF-8 as literal. And why I sometimes think it would be nice if UTF-8
> was a type unique from literal and unicode.

You could do that.

That could easily double the size of the J interpreter and yield all
sorts of new errors. It would also take a lot of work to implement.
Also, it's not clear whether any of the existing J commands should
work on such a type.

Personally, I'd much rather see J support UTF-32.

Put differently, J represents code points; it's up to the programmer
to make sure that these code points represent meaningful characters.
If you prohibit the language from representing things which are not
meaningful to you, you are also prohibiting it from representing those
things for other people.

For example, let's say that 'þ' were represented in a UTF-8 type. What
would 3 3$'þ' do?

Here's how it works, currently:
   3 3$'þ'
þ�
�þ
þ�

What you are seeing here is that 'þ' is a sequence of two literals in
UTF-8 (bytes 195 190). So an array of those literals whose rows have an
odd length will necessarily cut a character in half somewhere.
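
For comparison, here is a rough sketch of the same thing in Python
rather than J. It is only an illustration: the variable names are made
up, and the cyclic fill is my stand-in for what $ does with a short
argument. It shows that 'þ' is two bytes in UTF-8 and that cutting
nine of those bytes into rows of three leaves every row malformed.

```python
from itertools import cycle, islice

thorn = '\u00fe'                      # 'þ', code point 254
data = thorn.encode('utf-8')          # b'\xc3\xbe': two bytes, not one

# Cyclically reuse the two bytes to fill 9 cells, then cut into 3 rows.
cells = bytes(islice(cycle(data), 9))
rows = [cells[i:i + 3] for i in range(0, 9, 3)]

# Decode each row on its own, substituting U+FFFD for malformed bytes.
shown = [row.decode('utf-8', errors='replace') for row in rows]
# shown == ['þ\ufffd', '\ufffdþ', 'þ\ufffd']
```

Each row holds one whole 'þ' plus half of another, which is the same
pattern as the display above.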

Now, J will already report the error, if that is what you want:
   7 u:"1(3 3$'þ')
|domain error
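
Again as a Python comparison and not as a description of J internals:
a strict decoder rejects such a row outright, which is roughly
analogous to the domain error above. The sketch also shows that
arranging code points, rather than bytes, gives the table you probably
wanted. The names here are hypothetical.

```python
thorn = '\u00fe'                          # 'þ'
row = (thorn.encode('utf-8') * 2)[:3]     # b'\xc3\xbe\xc3', one odd-length row

# Strict decoding refuses the truncated trailing byte.
try:
    row.decode('utf-8')
    row_is_valid = True
except UnicodeDecodeError:
    row_is_valid = False

# Working with code points avoids the problem: 9 characters, 3 per row.
chars = [thorn] * 9
grid = [''.join(chars[i:i + 3]) for i in range(0, 9, 3)]
```

Here row_is_valid ends up False, and grid is three rows of three 'þ'
characters each.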

But the real issue here is not J, it's the complexity of Unicode. No
matter how the language is implemented, you are going to have to come
to terms with that complexity if you are going to work with Unicode.
And yes, this is frustrating. And, yes, it's tempting to blame the
language for this frustration. But if you've worked with Unicode in
another programming language, you'll have experienced similar (or
worse) frustrations.

And, eventually, once you get past those frustrations, you'll have a
decent understanding of what's going on.

Personally, I think it's best if people do not limit themselves to a
single programming language. Thinking about problems in multiple
programming languages gives you useful perspectives on how to solve
problems.

Of course, it's also good to not limit your knowledge to "only
programming languages". To be useful you need to have knowledge of
other fields (engineering, or whatever else). Overspecialization keeps
you from recognizing and solving problems.

Thanks,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
