The way I have done it is to ensure that truncation only happens on valid UTF-8
boundaries but length calculation is done in "use bytes" mode, and then try to
guesstimate where to make the UTF-8 cut, and try that repeatedly until you
manage to go under the max size you specified. Then you should have a valid
UTF-8 string whos' byte encoded representation is less than the given number of
bytes. There might be circumstances where this truncates slightly more but has
worked well in practice for me for exactly this purpose of storing slightly too
long text values in varchar columns.
sub truncByteString
{
my ($str, $bytes) = @_;
my ($len,$blen);
return undef unless defined $str;
while(1) {
do { use bytes; $blen = length $str };
last unless $blen > $bytes;
$len = length $str;
my $scaled_diff = int (($blen-$bytes)/($blen/$len))-1;
$scaled_diff = 1 if $scaled_diff<1;
my $nlen = $len - $scaled_diff;
$str = substr $str,0,$nlen;
}
return $str;
}
/ Marcus
--
Marcus Bergner, M.Sc CSE............E-mail: [email protected]
Software Architect, Vizrt.................Mobile: +46 (0)730-808025
________________________________
From: Fennell, Brian <[email protected]>
Sent: Friday, May 28, 2021 03:11
To: Bruce Johnson <[email protected]>; dbi users <[email protected]>
Subject: RE: Strange issue with inserting varchar2()
One more
https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstackoverflow.com%2Fquestions%2F1454952%2Fdummys-guide-to-unicode&data=04%7C01%7CMarcus.Bergner%40vizrt.com%7C1767beb561424d97b19f08d92175a617%7Cc63c3ba740db460ebdf1e63a02a3ad59%7C0%7C0%7C637577611463282317%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=wqFX4kaJyubBeA2vuUo4U5bhggujKZcyukf1%2FN5%2BwEg%3D&reserved=0
The information contained in this electronic mail transmission is intended only
for the use of the individual or entity named in this transmission. If you are
not the intended recipient of this transmission, you are hereby notified that
any disclosure, copying or distribution of the contents of this transmission is
strictly prohibited and that you should delete the contents of this
transmission from your system immediately. Any comments or statements contained
in this transmission do not necessarily reflect the views or position of Radial
or its subsidiaries and/or affiliates.