Neither of those corresponds to the verbs I posted.
   latin2utf8_v1=. (9&u: ] ]) :: (8 u: 10 u: ])
   latin2utf8_v2=. 8 u: (9&u:) :: (10&u:)
   valid_utf8=. 8 u: 10 u: ?1e6$16384
   invalid_utf8=. a.{~?1e6$256
   utf8s=. valid_utf8;invalid_utf8
   datatype@:latin2utf8_v1 &.> utf8s
┌───────┬───────┐
│literal│literal│
└───────┴───────┘
   datatype@:latin2utf8_v2 &.> utf8s
┌───────┬───────┐
│literal│literal│
└───────┴───────┘
   timex 'latin2utf8_v1 valid_utf8'
0.0111741
   timex 'latin2utf8_v2 valid_utf8'
0.0235571
   NB. performance is the same on invalid utf8
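A quick way to confirm that the two definitions agree on both inputs is to match their results box by box. This is only a sketch, not run here; it uses nothing beyond the names defined above, and no output is shown because it was not captured in the session:

```j
NB. Sketch: for each boxed input, 1 if both verbs return identical results.
NB. (f -: g) is a fork, so it compares (f y) with (g y); &> applies it per box.
(latin2utf8_v1 -: latin2utf8_v2) &> utf8s
```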
On Tue, 22 Mar 2022, Raul Miller wrote:
On Tue, Mar 22, 2022 at 7:47 PM Elijah Stone wrote:
It's not; it should give exactly the same result.
   datatype 9 u: 8 u: 10 u: 166 97 98 99 { a.
unicode4
   datatype 8 u: 9 u: 8 u: 10 u: 166 97 98 99 { a.
literal
--
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm