Re: [Rd] Spurious warnings in coercion from double/complex/character to raw
On 10/09/2021 12:53, brodie gaslam wrote: On Friday, September 10, 2021, 03:13:54 PM EDT, Hervé Pagès wrote: Good catch, thanks! Replacing if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; with if(ISNAN(vi) || vi <= -1.0 || vi >= 256.0) { tmp = 0; warn |= WARN_RAW; } else { tmp = (int) vi; } pa[i] = (Rbyte) tmp; should address that. FWIW IntegerFromReal() has a similar risk of int overflow (src/main/coerce.c, lines 128-138): int attribute_hidden IntegerFromReal(double x, int *warn) { if (ISNAN(x)) return NA_INTEGER; else if (x >= INT_MAX+1. || x <= INT_MIN ) { *warn |= WARN_INT_NA; return NA_INTEGER; } return (int) x; } The cast to int will also be an int overflow situation if x is > INT_MAX and < INT_MAX+1 so the risk is small! I might be being dense, but it feels this isn't a problem? Quoting C99 6.3.1.4 again (emph added): When a finite value of real floating type is converted to an integer type other than _Bool, **the fractional part is discarded** (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.50) Does the "fractional part is discarded" not save us here? I think it does. Thanks for clarifying and sorry for the false positive! H. Best, B. -- Hervé Pagès Bioconductor Core Team hpages.on.git...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Spurious warnings in coercion from double/complex/character to raw
> On Friday, September 10, 2021, 03:13:54 PM EDT, Hervé Pagès > wrote: > > Good catch, thanks! > > Replacing > > if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { > tmp = 0; > > warn |= WARN_RAW; > > } > pa[i] = (Rbyte) tmp; > > with > > if(ISNAN(vi) || vi <= -1.0 || vi >= 256.0) > { > tmp = 0; > > warn |= WARN_RAW; > > } else { > tmp = (int) vi; > } > pa[i] = (Rbyte) tmp; > > should address that. > > FWIW IntegerFromReal() has a similar risk of int overflow > (src/main/coerce.c, lines 128-138): > > > int attribute_hidden > > IntegerFromReal(double x, int *warn) > > { > > if (ISNAN(x)) > > return NA_INTEGER; > > else if (x >= INT_MAX+1. || x <= INT_MIN ) { > > *warn |= WARN_INT_NA; > > return NA_INTEGER; > > } > > return (int) x; > > } > > > > The cast to int will also be an int overflow situation if x is > INT_MAX > and < INT_MAX+1 so the risk is small! I might be being dense, but it feels this isn't a problem? Quoting C99 6.3.1.4 again (emph added): > When a finite value of real floating type is converted to an integer > type other than _Bool, **the fractional part is discarded** (i.e., the > value is truncated toward zero). If the value of the integral part > cannot be represented by the integer type, the behavior is undefined.50) Does the "fractional part is discarded" not save us here? Best, B. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Spurious warnings in coercion from double/complex/character to raw
On 10/09/2021 09:12, Duncan Murdoch wrote: On 10/09/2021 11:29 a.m., Hervé Pagès wrote: Hi, The first warning below is unexpected and confusing: > as.raw(c(3e9, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw The reason we get it is that coercion from numeric to raw is currently implemented on top of coercion from numeric to int (file src/main/coerce.c, lines 700-710): case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromReal(REAL_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; The first warning comes from the call to IntegerFromReal(). The following code avoids the spurious warning and is also simpler and slightly faster: case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); double vi = REAL_ELT(v, i); if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; Doesn't that give different results in case vi is so large that "(int) vi" overflows? (I don't know what the C standard says, but some online references say that behaviour is implementation dependent.) For example, if vi = 1.0 + INT_MAX; wouldn't "(int) vi" be equal to a small integer? Good catch, thanks! Replacing if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; with if(ISNAN(vi) || vi <= -1.0 || vi >= 256.0) { tmp = 0; warn |= WARN_RAW; } else { tmp = (int) vi; } pa[i] = (Rbyte) tmp; should address that. FWIW IntegerFromReal() has a similar risk of int overflow (src/main/coerce.c, lines 128-138): int attribute_hidden IntegerFromReal(double x, int *warn) { if (ISNAN(x)) return NA_INTEGER; else if (x >= INT_MAX+1. || x <= INT_MIN ) { *warn |= WARN_INT_NA; return NA_INTEGER; } return (int) x; } The cast to int will also be an int overflow situation if x is > INT_MAX and < INT_MAX+1 so the risk is small! There are other instances of this situation in IntegerFromComplex() and IntegerFromString(). More below... Duncan Murdoch Coercion from complex to raw has the same problem: > as.raw(c(3e9+0i, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw Current implementation (file src/main/coerce.c, lines 711-721): case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromComplex(COMPLEX_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; This implementation has the following additional problem when the supplied complex has a nonzero imaginary part: > as.raw(300+4i) [1] 00 Warning messages: 1: imaginary parts discarded in coercion 2: out-of-range values treated as 0 in coercion to raw > as.raw(3e9+4i) [1] 00 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw In one case we get a warning about the discarding of the imaginary part but not the other case, which is unexpected. We should see the exact same warning (or warnings) in both cases. With the following fix we only get the warning about the discarding of the imaginary part if we are not in a "out-of-range values treated as 0 in coercion to raw" situation: case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); Rcomplex vi = COMPLEX_ELT(v, i); if(ISNAN(vi.r) || ISNAN(vi.i) || (tmp = (int) vi.r) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } else { if(vi.i != 0.0) warn |= WARN_IMAG; } pa[i] = (Rbyte) tmp; } break; Corrected version: if(ISNAN(vi.r) || ISNAN(vi.i) || vi.r <= -1.00 || vi.r >= 256.00) { tmp = 0; warn |= WARN_RAW; } else { tmp = (int) vi.r; if(vi.i != 0.0) warn |= WARN_IMAG; } pa[i] = (Rbyte) tmp; Hopefully this time I got it right. Best, H. Finally, coercion from cha
Re: [Rd] Spurious warnings in coercion from double/complex/character to raw
Hello, Integer overflow is undefined behavior by the C standard. For instance, on my computer, with GCC 5.4.0, with the optimization level 2, the following program never stops: include int main(void) { for(int i=1; i != 0; i++) { if ((i & 0xFFF) == 0) { printf("%d\n", i); } } } This is due to a compiler optimization, that assumes that the integer can never overflow, and so, can never be equal to zero, and so, the for loop should never stops. You should always be very cautious when adding two integers, to avoid any overflow. There is no problem with unsigned integers. Similarly, double-to-integer conversions are only safe if the double is in the range [INT_MIN to INT_MAX] The standard contains: When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined The easiest solution to avoid a risk when converting, is to check that the double (e.g. vi) is in range [0 to 255] BEFORE converting to an integer. -- Sincerely Andr� GILLIBERT De : R-devel de la part de Duncan Murdoch Envoy� : vendredi 10 septembre 2021 18:12:02 � : Herv� Pag�s; r-devel Objet : Re: [Rd] Spurious warnings in coercion from double/complex/character to raw ATTENTION: Cet e-mail provient d�une adresse mail ext�rieure au CHU de Rouen. Ne cliquez pas sur les liens ou n'ouvrez pas les pi�ces jointes � moins de conna�tre l'exp�diteur et de savoir que le contenu est s�r. En cas de doute, transf�rer le mail � � DSI, S�curit� � pour analyse. Merci de votre vigilance On 10/09/2021 11:29 a.m., Herv� Pag�s wrote: > Hi, > > The first warning below is unexpected and confusing: > > > as.raw(c(3e9, 5.1)) > [1] 00 05 > Warning messages: > 1: NAs introduced by coercion to integer range > 2: out-of-range values treated as 0 in coercion to raw > > The reason we get it is that coercion from numeric to raw is currently > implemented on top of coercion from numeric to int (file > src/main/coerce.c, lines 700-710): > > case REALSXP: > for (i = 0; i < n; i++) { > // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); > tmp = IntegerFromReal(REAL_ELT(v, i), &warn); > if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { > tmp = 0; > warn |= WARN_RAW; > } > pa[i] = (Rbyte) tmp; > } > break; > > The first warning comes from the call to IntegerFromReal(). > > The following code avoids the spurious warning and is also simpler and > slightly faster: > > case REALSXP: > for (i = 0; i < n; i++) { > // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); > double vi = REAL_ELT(v, i); > if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { > tmp = 0; > warn |= WARN_RAW; > } > pa[i] = (Rbyte) tmp; > } > break; Doesn't that give different results in case vi is so large that "(int) vi" overflows? (I don't know what the C standard says, but some online references say that behaviour is implementation dependent.) For example, if vi = 1.0 + INT_MAX; wouldn't "(int) vi" be equal to a small integer? Duncan Murdoch > > Coercion from complex to raw has the same problem: > > > as.raw(c(3e9+0i, 5.1)) > [1] 00 05 > Warning messages: > 1: NAs introduced by coercion to integer range > 2: out-of-range values treated as 0 in coercion to raw > > Current implementation (file src/main/coerce.c, lines 711-721): > > case CPLXSXP: > for (i = 0; i < n; i++) { > // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); > tmp = IntegerFromComplex(COMPLEX_ELT(v, i), &warn); > if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { > tmp = 0; > warn |= WARN_RAW; > } > pa[i] = (Rbyte) tmp; > } > break; > > This implementation has the following additional problem when the > supplied complex has a nonzero imaginary part: > > > as.raw(300+4i) > [1] 00 > Warning messages: > 1: imaginary parts discarded in coercion > 2: out-of-range values treated as 0 in coercion to raw > > > as.raw(3e9+4i) > [1] 00 > Warning messages: > 1: NAs introduced b
Re: [Rd] Spurious warnings in coercion from double/complex/character to raw
On 10/09/2021 11:29 a.m., Hervé Pagès wrote: Hi, The first warning below is unexpected and confusing: > as.raw(c(3e9, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw The reason we get it is that coercion from numeric to raw is currently implemented on top of coercion from numeric to int (file src/main/coerce.c, lines 700-710): case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromReal(REAL_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; The first warning comes from the call to IntegerFromReal(). The following code avoids the spurious warning and is also simpler and slightly faster: case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); double vi = REAL_ELT(v, i); if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; Doesn't that give different results in case vi is so large that "(int) vi" overflows? (I don't know what the C standard says, but some online references say that behaviour is implementation dependent.) For example, if vi = 1.0 + INT_MAX; wouldn't "(int) vi" be equal to a small integer? Duncan Murdoch Coercion from complex to raw has the same problem: > as.raw(c(3e9+0i, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw Current implementation (file src/main/coerce.c, lines 711-721): case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromComplex(COMPLEX_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; This implementation has the following additional problem when the supplied complex has a nonzero imaginary part: > as.raw(300+4i) [1] 00 Warning messages: 1: imaginary parts discarded in coercion 2: out-of-range values treated as 0 in coercion to raw > as.raw(3e9+4i) [1] 00 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw In one case we get a warning about the discarding of the imaginary part but not the other case, which is unexpected. We should see the exact same warning (or warnings) in both cases. With the following fix we only get the warning about the discarding of the imaginary part if we are not in a "out-of-range values treated as 0 in coercion to raw" situation: case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); Rcomplex vi = COMPLEX_ELT(v, i); if(ISNAN(vi.r) || ISNAN(vi.i) || (tmp = (int) vi.r) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } else { if(vi.i != 0.0) warn |= WARN_IMAG; } pa[i] = (Rbyte) tmp; } break; Finally, coercion from character to raw has the same problem and its code can be fixed in a similar manner: > as.raw(c("3e9", 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw Cheers, H. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Spurious warnings in coercion from double/complex/character to raw
Hi, The first warning below is unexpected and confusing: > as.raw(c(3e9, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw The reason we get it is that coercion from numeric to raw is currently implemented on top of coercion from numeric to int (file src/main/coerce.c, lines 700-710): case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromReal(REAL_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; The first warning comes from the call to IntegerFromReal(). The following code avoids the spurious warning and is also simpler and slightly faster: case REALSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); double vi = REAL_ELT(v, i); if(ISNAN(vi) || (tmp = (int) vi) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; Coercion from complex to raw has the same problem: > as.raw(c(3e9+0i, 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw Current implementation (file src/main/coerce.c, lines 711-721): case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); tmp = IntegerFromComplex(COMPLEX_ELT(v, i), &warn); if(tmp == NA_INTEGER || tmp < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } pa[i] = (Rbyte) tmp; } break; This implementation has the following additional problem when the supplied complex has a nonzero imaginary part: > as.raw(300+4i) [1] 00 Warning messages: 1: imaginary parts discarded in coercion 2: out-of-range values treated as 0 in coercion to raw > as.raw(3e9+4i) [1] 00 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw In one case we get a warning about the discarding of the imaginary part but not the other case, which is unexpected. We should see the exact same warning (or warnings) in both cases. With the following fix we only get the warning about the discarding of the imaginary part if we are not in a "out-of-range values treated as 0 in coercion to raw" situation: case CPLXSXP: for (i = 0; i < n; i++) { // if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt(); Rcomplex vi = COMPLEX_ELT(v, i); if(ISNAN(vi.r) || ISNAN(vi.i) || (tmp = (int) vi.r) < 0 || tmp > 255) { tmp = 0; warn |= WARN_RAW; } else { if(vi.i != 0.0) warn |= WARN_IMAG; } pa[i] = (Rbyte) tmp; } break; Finally, coercion from character to raw has the same problem and its code can be fixed in a similar manner: > as.raw(c("3e9", 5.1)) [1] 00 05 Warning messages: 1: NAs introduced by coercion to integer range 2: out-of-range values treated as 0 in coercion to raw Cheers, H. -- Hervé Pagès Bioconductor Core Team hpages.on.git...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel