Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-17 Thread Mauricio Vargas Sepulveda
The issue was finally resolved and the package is back on CRAN!

Extensive trial and error is not ideal with my advanced arthritis, but progress 
is progress.

I added a small section with a minimal example here: 
https://cpp4r.org/08-debugging.html#testing-with-docker

This is a "nautical diary" I keep with a long collection of errors I found. My 
research is in International Relations and Public Policy, so it is good to have 
a resource with all my Stack Overflow searches and questions in one place.

Best,




Mauricio "Pachá" Vargas Sepúlveda

PhD Student, Political Science
University of Toronto





Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-15 Thread Dirk Eddelbuettel


On 15 November 2024 at 12:16, Mauricio Vargas Sepulveda wrote:
| [...] and I now asked Dr. Ligges for the possibility to know more about the
| CRAN specific configuration to add it to R-Hub.

It is (and has always been) documented in a text file on the server 

   https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt

which is also referenced in a few places (I forget where; I tend to go back
to my Rocker san repos, which have it too. Google returns a few pages of hits
for the URL as a quoted string).

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-15 Thread Mauricio Vargas Sepulveda
Dear R Developers,

I implemented Dr. Krylov's fix, and now redatam is back on CRAN (Point 1).

Dr. Eddelbuettel's suggestion that my setup did not fully mimic CRAN's gcc-san 
setup was also correct (Point 2).

Point 1:
- When the big endian issue was described here, I tried the s390x image from 
R-Hub, which does not work on my laptop. Then I went to a big endian server 
that we have at UofT and that reproduced the error!

Point 2:
- I was testing with docker.io/rocker/r-devel-san:latest, and I could not 
replicate the error.
- I ended up using the R-Hub image with 
https://github.com/r-hub/containers/pull/81#issuecomment-2478009166, and that 
replicated the error line by line.

Besides that, I implemented the same fix for the command line tool and the 
Python package. I also changed the example data by aggregating Uruguay's census 
to show some numbers, since just cutting the levels and showing region/city is 
not informative. Because Redatam is a closed format with no standard or 
specification, I ended up reading the census with my package, aggregating with 
dplyr, saving to CSV, and then converting back to Redatam with a 
point-and-click tool, following an unofficial tutorial from YouTube. There is 
no easy way to aggregate and save in the same format, as we can with SPSS.

I added Dr. Krylov as ctb for this very useful fix! Thanks a lot to Dr. Krylov 
and Dr. Eddelbuettel for the suggestions. I was going in circles, unable to 
replicate the issue, and I have now asked Dr. Ligges whether it is possible to 
learn more about the CRAN-specific configuration so that it can be added to 
R-Hub.

Best,

-- 
Mauricio "Pachá" Vargas Sepúlveda
PhD Student, Political Science
University of Toronto



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-14 Thread Ivan Krylov via R-package-devel
On Thu, 14 Nov 2024 16:24:16 +
Mauricio Vargas Sepulveda wrote:

> After enabling the SAN flags, I cannot reproduce the gcc-san error
> [2].

Can you use the rocker/r-devel-san container? It reproduces the problem
for me.

When reading galapagos/cg15.dic, FuzzyEntityParser::ParseEntities()
keeps advancing over the file and failing to parse a single entity
until it eventually calls stop() because it didn't find any entities.

In a non-sanitized build, it first succeeds at 0-based offset 1095. In
a sanitized build, it fails for all offsets. I think this is due to the
ordering of the byte reads:
https://github.com/pachadotdev/open-redatam/blob/bbb65242f1af5f601def1c0b971ed601d459b4f3/src/readers/ByteArrayReader.cpp#L176-L192

In C++, an operation like the following:

static_cast<uint16_t>(ReadByte()) << 8 |
static_cast<uint16_t>(ReadByte());

...depends on the order in which the compiler will choose to evaluate
the calls to static_cast<uint16_t>(ReadByte()), and this order is not
guaranteed to be left-to-right:
https://en.cppreference.com/w/cpp/language/eval_order

I edited all four byte-reading functions and replaced the one-statement
operations with separate statements for each of the byte reads:

--- redatam.orig/src/redatamlib/readers/ByteArrayReader.cpp 2024-11-09 02:12:17.0 +
+++ redatam.new/src/redatamlib/readers/ByteArrayReader.cpp  2024-11-14 21:25:54.0 +
@@ -175,23 +175,27 @@
 }

 uint16_t ByteArrayReader::ReadInt16LE() {
-  return static_cast<uint16_t>(ReadByte()) |
-         (static_cast<uint16_t>(ReadByte()) << 8);
+  uint16_t a = static_cast<uint16_t>(ReadByte());
+  uint16_t b = static_cast<uint16_t>(ReadByte()) << 8;
+  return a | b;
 }

 uint32_t ByteArrayReader::ReadInt32LE() {
-  return static_cast<uint32_t>(ReadInt16LE()) |
-         static_cast<uint32_t>(ReadInt16LE()) << 16;
+  uint32_t a = static_cast<uint32_t>(ReadInt16LE());
+  uint32_t b = static_cast<uint32_t>(ReadInt16LE()) << 16;
+  return a | b;
 }

 uint16_t ByteArrayReader::ReadInt16BE() {
-  return (static_cast<uint16_t>(ReadByte()) << 8) |
-         static_cast<uint16_t>(ReadByte());
+  uint16_t a = static_cast<uint16_t>(ReadByte()) << 8;
+  uint16_t b = static_cast<uint16_t>(ReadByte());
+  return a | b;
 }

 uint32_t ByteArrayReader::ReadInt32BE() {
-  return (static_cast<uint32_t>(ReadInt16BE()) << 16) |
-         static_cast<uint32_t>(ReadInt16BE());
+  uint32_t b = static_cast<uint32_t>(ReadInt16BE()) << 16;
+  uint32_t a = static_cast<uint32_t>(ReadInt16BE());
+  return b | a;
 }

 }  // namespace RedatamLib

...and this seems to make the error vanish. I think I see the
misordering too. In the output of objdump -d
redatam.Rcheck/redatam/libs/redatam.so, I see:

00267010 <_ZN10RedatamLib15ByteArrayReader11ReadInt16LEEv>:

  267028:   e8 93 6f f5 ff          call   1bdfc0 <_ZN10RedatamLib15ByteArrayReader8ReadByteEv@plt>
   first_byte <- ReadByte()
  26702d:   41 89 c4                mov    %eax,%r12d
   save first byte in r12
  26703d:   41 c1 e4 08             shl    $0x8,%r12d
   left-shift the first byte!
  267041:   e8 7a 6f f5 ff          call   1bdfc0 <_ZN10RedatamLib15ByteArrayReader8ReadByteEv@plt>
   second byte in eax
  26704a:   44 09 e0                or     %r12d,%eax
   OR them together

...which is how you read big-endian numbers, not little-endian ones.
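
As an illustration of the point above, here is a minimal, self-contained
sketch (not the package's actual code). DemoReader is a hypothetical stand-in
for ByteArrayReader, and its members only mirror the names used in the patch.
Each read is a separate statement, so the order of the ReadByte() calls is
fixed by the language rather than left to the compiler:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for ByteArrayReader: every ReadByte() call has the
// side effect of advancing the read position.
class DemoReader {
 public:
  explicit DemoReader(std::vector<uint8_t> data) : data_(std::move(data)) {}

  // Bounds-checked read of the next byte.
  uint8_t ReadByte() { return data_.at(pos_++); }

  uint16_t ReadInt16LE() {
    const uint16_t lo = ReadByte();  // first byte in the stream is the low byte
    const uint16_t hi = ReadByte();  // second byte is the high byte
    return static_cast<uint16_t>(lo | (hi << 8));
  }

  uint16_t ReadInt16BE() {
    const uint16_t hi = ReadByte();  // first byte in the stream is the high byte
    const uint16_t lo = ReadByte();
    return static_cast<uint16_t>((hi << 8) | lo);
  }

  uint32_t ReadInt32LE() {
    const uint32_t lo = ReadInt16LE();  // low half comes first
    const uint32_t hi = ReadInt16LE();
    return lo | (hi << 16);
  }

  uint32_t ReadInt32BE() {
    const uint32_t hi = ReadInt16BE();  // high half comes first
    const uint32_t lo = ReadInt16BE();
    return (hi << 16) | lo;
  }

 private:
  std::vector<uint8_t> data_;
  std::size_t pos_ = 0;
};
```

When both calls are operands of a single | expression, either one may run
first, which is consistent with the disassembly above.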

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-14 Thread Dirk Eddelbuettel


On 14 November 2024 at 16:24, Mauricio Vargas Sepulveda wrote:
| After enabling the SAN flags, I cannot reproduce the gcc-san error [2].
| 
| Should I report this as a false positive?

No.

Replicating ASAN/UBSAN issues is known to be potentially tricky.

It drove me so batty a decade ago that I created a package (on CRAN at [1])
with _known triggers_ from the corresponding ASAN/UBSAN wiki. They help
ensure a given setup finds what it is supposed to find. (It also helps to
become familiar with ASAN/UBSAN because you can also assert that standard g++
/ clang++ do not necessarily find these. But the newer compilers are getting
better, so some issues may now be flagged too.)

We have known-good containers: I maintain two within Rocker, they have weekly
builds, Winston has the ones in the sumu container [2] which many of us
(myself included) use. There is also rhub2 but that failed for me when I
tried a while back, it may be better now.

Sadly, the CRAN one is not available as a container. But not reproducing the
CRAN findings does not imply CRAN is wrong. More likely it means your setup
does not (yet?) match CRAN.

Dirk

[1] https://cran.r-project.org/package=sanitizers
[2] https://github.com/wch/r-debug

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-14 Thread Mauricio Vargas Sepulveda
Dear R Developers,

I hope you are doing well.

I managed to create an image that mimics CRAN checks [1].

After enabling the SAN flags, I cannot reproduce the gcc-san error [2].

Should I report this as a false positive?

Links

1. https://github.com/r-hub/containers/pull/81/files
2. 
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.4_20241114_071006/specialChecks/gcc-san/outputs.txt

Best,

-- 
Mauricio "Pachá" Vargas Sepúlveda
PhD Student, Political Science
University of Toronto





Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-12 Thread Ivan Krylov via R-package-devel
On Tue, 12 Nov 2024 17:37:40 +
Mauricio Vargas Sepulveda wrote:

> I re-sent my package and I still see:
> 
>   1.
> https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241112_165458/specialChecks/clang-san/outputs.txt

I understand this doesn't feel like progress, but this time clang-san
is giving your package a clean bill of health. There are no issues to
fix here.

>   2.
> https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241112_165458/specialChecks/gcc-san/outputs.txt

This one, unfortunately, does demonstrate a new error in the package.
The problem with software testing is that it can only demonstrate the
presence of bugs, not their absence. The new error is different from
all the errors we've seen previously. The code at
redatamlib/readers/FuzzyVariableParser.cpp:33:38 performs an integer
division by zero, which crashes the process:

  size_t chunkSize = entities.size() / numThreads;

It's not in the outputs.txt, but in the summary.txt:
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241112_165458/specialChecks/gcc-san/summary.txt

https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency
says that std::thread::hardware_concurrency() could plausibly return 0.
Or is this due to empty entities vector?

> I am also using "-Wall -O0 -pedantic" and "-UDEBUG -g" to check
> locally, and I am also using all the available R-Hub images (27
> different configurations, as in
> https://github.com/pachadotdev/open-redatam/actions/runs/11801021816)

Unfortunately, none of these use gcc with UBSanitizer enabled. I think
that the rocker/r-devel-san container [*] image comes closest to the
configuration that fails your package during the CRAN checks. By
running the following commands, I can reproduce the crash:

podman run -it docker.io/rocker/r-devel-san Rdevel
f <- 'https://cran.r-project.org/incoming/archive/redatam_2.0.3.tar.gz'
download.file(f, basename(f))
install.packages(c('data.table','janitor', 'stringi', 'knitr',
'rmarkdown','testthat'))
tools::Rcmd('check redatam_2.0.3.tar.gz')

By installing gdb inside the container, running Rdevel -d gdb and
setting a breakpoint on __ubsan_on_report, I can catch the moment
before the crash:

(gdb) frame 5
#5  0x7f95bf293a16 in
RedatamLib::FuzzyVariableParser::ParseAllVariables
(this=this@entry=0x7fff80974bd0, entities=std::vector of length 0,
capacity 0) at redatamlib/readers/FuzzyVariableParser.cpp:33
33   size_t chunkSize = entities.size() / numThreads;
(gdb) p entities.size()
$1 = 0
(gdb) p numThreads
$2 = 0
(gdb) p maxThreads 
$3 = 4

The entities vector is empty, and the 0/0 integer division invokes
undefined behaviour.
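
A minimal sketch of one way to guard this division, assuming the thread count
is derived from the number of entities and a fixed cap. The names
PickThreadCount and SafeChunkSize are hypothetical and not taken from the
package. The sketch also covers std::thread::hardware_concurrency()
returning 0:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: choose a worker count that is never zero.
std::size_t PickThreadCount(std::size_t numItems, std::size_t maxThreads) {
  // hardware_concurrency() is only a hint and may return 0.
  const std::size_t hw = std::thread::hardware_concurrency();
  const std::size_t n = std::min({hw, maxThreads, numItems});
  // An empty input, a zero cap, or an unknown core count all fall back to 1.
  return std::max<std::size_t>(n, 1);
}

// Hypothetical helper: the divisor is at least 1, so zero items yield a
// chunk size of zero instead of a 0/0 division.
std::size_t SafeChunkSize(std::size_t numItems, std::size_t maxThreads) {
  return numItems / PickThreadCount(numItems, maxThreads);
}
```

Passing 2 as maxThreads would also keep tests and examples within the
two-thread limit suggested for CRAN elsewhere in this thread.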

Would you find it acceptable to also test your package using the
docker.io/rocker/r-devel-san container before the next submission?

-- 
Best regards,
Ivan

[*] https://rocker-project.org/images/base/r-devel.html



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-12 Thread Mauricio Vargas Sepulveda
Dear R Developers,

I hope you are doing well.

I re-sent my package and I still see:

  1.
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241112_165458/specialChecks/clang-san/outputs.txt
  2.
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241112_165458/specialChecks/gcc-san/outputs.txt

As discussed in:

1. "Possible false negative for compiled C++ code in CRAN checks" (this R 
developers mailing list)
2. "AddressSanitizer Error (alloc-dealloc-mismatch, operator new vs free) in R 
Package with C++ Code Using libc++ and CLANG-ASAN/UBSAN" (Stack Overflow, 
https://stackoverflow.com/questions/79171799/addresssanitizer-error-alloc-dealloc-mismatch-operator-new-vs-free-in-r-packa?noredirect=1#comment139621299_79171799)
3. "clang-asan: use clang 19 as CRAN does" (GitHub, 
https://github.com/r-hub/containers/pull/79#issuecomment-2470390769)

The package was tested with Clang 19 on Fedora 36 and Ubuntu 22, and as the 
Stack Overflow thread reads, "it is likely a false positive in the container 
used by R-Hub as discussed in this [#598] issue at its repo but not a false 
positive but an actual error as shown in the setup at CRAN (different 
container)."

The R-Hub container was fixed with a pull request I sent, and now it passes the 
checks. The CRAN errors were addressed as mentioned in the mailing list.

I am also using "-Wall -O0 -pedantic" and "-UDEBUG -g" to check locally, and I 
am also using all the available R-Hub images (27 different configurations, as 
in https://github.com/pachadotdev/open-redatam/actions/runs/11801021816)

I would really appreciate some guidance here before re-submitting.

Best wishes,




Mauricio "Pachá" Vargas Sepúlveda

PhD Student, Political Science
University of Toronto





Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-12 Thread Mauricio Vargas Sepulveda
Dear Prof. Dr. Krylov,

Thanks a lot.

Apologies if the email has an odd format, I have a modified Outlook UI that 
makes everything "curious".

After taking each of the suggestions, now I get this with Clang-ASAN on R-Hub:

> 0 errors | 0 warnings | 2 note
> Package was archived on CRAN
> CRAN repository db overrides: X-CRAN-Comment: Archived on 2024-11-06 for 
> repeated policy violation. Repeatedy spamming a team member's personal email 
> address in HTML.
> checking compilation flags used ... NOTE Compilation used the following 
> non-portable flag(s): ‘-Wp,-D_FORTIFY_SOURCE=3’

This is great. Thanks a lot. I would like to add you to AUTHORS.md or as "ctb" 
to list your contribution. Hopefully the package can be re-accepted.

The first note was because I was asking about this same issue when CRAN 
notified me. Now I make sure to use non-HTML email when sending R-related 
emails.

So far, this is the only useful communication I received about this. On Stack 
Overflow I only got "-1".

I added the changes here: 
https://github.com/pachadotdev/open-redatam/commit/d8955195d4e13fd7794de56d24b4238b7a7ec26f

Best wishes,

-- 
Mauricio "Pachá" Vargas Sepúlveda
PhD Student, Political Science
University of Toronto



From: Ivan Krylov 
Sent: November 12, 2024 5:56 AM
To: Mauricio Vargas Sepulveda 
Cc: r-package-devel@r-project.org 
Subject: Re: [R-pkg-devel] Possible false negative for compiled C++ code in 
CRAN checks

On Tue, 12 Nov 2024 04:02:17 +
Mauricio Vargas Sepulveda wrote:

> For v2.0.3 I implemented a full refactor, where I simplified the code
> considerably while trying to solve what looks to be a false positive.
>
> Now I implemented a "patch" based on the suggestion:

I see the change on GitHub [1], made back on October 22nd, but I am not
seeing these checks in your latest submission to CRAN [2] from November
7th. Before you have implemented [1], have you been able to reproduce
the errors that your package received on CRAN, that is, the
container-overflow and the null object pointers?

I've just noticed that FuzzyVariableParser::ParseAllVariables is using
std::thread::hardware_concurrency(). CRAN uses a shared computer with
many cores to test many packages at the same time, so the code most
likely needs to limit itself to two threads during tests and examples
[3].

>SUMMARY: AddressSanitizer: alloc-dealloc-mismatch
> (/opt/R/devel-asan/lib/R/bin/exec/R+0xc6306) (BuildId:
> 178e357df79b1589a38c1949da5e5f022d4bb535) in free

This is an error you're getting when testing with an R-hub image. R-hub
is not CRAN. You've performed good investigative work to reduce the
issue you're seeing on R-hub for the StackOverflow question [4], but it
looks like this particular problem is already reported to R-hub
developers [5] and it's not what got your latest CRAN submission
archived.

When your latest submission, redatam_2.0.3.tar.gz, was archived, you
should have received an e-mail from CRAN. Was it about the package
failing automatic checks? Were there any other comments? Did the e-mail
highlight any issues in particular?

I do see the problem in [6], but it's hard to diagnose by itself. I was
able to reproduce it (but only after compiling everything with
-shared-san and providing the necessary LD_LIBRARY_PATH, otherwise
packages using C++ failed to load; I also had to manually set
UBSAN_OPTIONS=print_stacktrace=1 to find out where the overflow
originates):

> read_redatam(paste(dout, "cg15.dic", sep = "/"))
Opening dictionary file...
/usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9: runtime error: addition of unsigned offset to 0x7f38f12bcc30 overflowed to 0x7f38f12bcc2f
#0 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::operator[](unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9
#1 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::back() /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1353:9
#2 0x7f38e859a15f in RedatamLib::ListExporter::ListExporter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/exporters/RListExporter.cpp:16:21
#3 0x7f38e85a7d60 in RedatamLib::RedatamDatabase::ExportRLists() const /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/entities/RedatamDatabase.cpp:23:16
#4 0x7f38e85b2224 in export_redatam_to_list_(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>) /root/redatam.Rcheck/00_pkg_src/redatam/src/main.cpp:13:15
#5 0x7f38e85b3068 in _redatam_export_redatam_to_list_ /root/redatam.Rcheck/00_pkg_src/redatam/src/cpp11.cpp:12:27
#6 0x55b5da67bf6a in R_doDotCall /root/R/src/main/../../../R-svn/src/main/dotcode.c:754:11
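
The trace suggests that back() was called on an empty std::string inside the
ListExporter constructor: the reported address wraps by exactly one byte, as
it would for index size() - 1 when size() is 0. Calling back() on an empty
string is undefined behaviour. A minimal sketch of a guard follows; the helper
name EndsWithSlash is hypothetical and not taken from the package, and what
the constructor actually does with the last character is an assumption:

```cpp
#include <string>

// Hypothetical helper: check the last character only when one exists, so
// back() is never called on an empty string.
bool EndsWithSlash(const std::string& path) {
  return !path.empty() && path.back() == '/';
}
```

The empty() test runs first, and && short-circuits, so back() is only
reached for non-empty input.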

Th

Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-12 Thread Ivan Krylov via R-package-devel
On Tue, 12 Nov 2024 04:02:17 +
Mauricio Vargas Sepulveda wrote:

> For v2.0.3 I implemented a full refactor, where I simplified the code
> considerably while trying to solve what looks to be a false positive.
> 
> Now I implemented a "patch" based on the suggestion:

I see the change on GitHub [1], made back on October 22nd, but I am not
seeing these checks in your latest submission to CRAN [2] from November
7th. Before you implemented [1], were you able to reproduce the errors
that your package received on CRAN, that is, the container-overflow and
the null object pointers?

I've just noticed that FuzzyVariableParser::ParseAllVariables is using
std::thread::hardware_concurrency(). CRAN uses a shared computer with
many cores to test many packages at the same time, so the code most
likely needs to limit itself to two threads during tests and examples
[3].
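
A minimal sketch of such a cap, assuming the caller passes the desired
upper bound (for example 2 during tests and examples). The function name
choose_thread_count and its parameter are hypothetical and do not come
from the redatam sources:

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper (not part of redatam): pick a worker count that
// never exceeds the caller's cap or the hardware's reported concurrency.
unsigned int choose_thread_count(unsigned int requested) {
  // hardware_concurrency() is only a hint and may return 0 when the
  // value cannot be determined, so fall back to a single thread.
  unsigned int hw = std::thread::hardware_concurrency();
  if (hw == 0) {
    hw = 1;
  }
  // A cap of 0 makes no sense for a worker pool; treat it as 1.
  if (requested == 0) {
    requested = 1;
  }
  return std::min(requested, hw);
}
```

The thread pool would then be sized with choose_thread_count(2) instead
of calling std::thread::hardware_concurrency() directly.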

>SUMMARY: AddressSanitizer: alloc-dealloc-mismatch
> (/opt/R/devel-asan/lib/R/bin/exec/R+0xc6306) (BuildId:
> 178e357df79b1589a38c1949da5e5f022d4bb535) in free

This is an error you're getting when testing with an R-hub image. R-hub
is not CRAN. You've performed good investigative work to reduce the
issue you're seeing on R-hub for the StackOverflow question [4], but it
looks like this particular problem is already reported to R-hub
developers [5] and it's not what got your latest CRAN submission
archived.

When your latest submission, redatam_2.0.3.tar.gz, was archived, you
should have received an e-mail from CRAN. Was it about the package
failing automatic checks? Were there any other comments? Did the e-mail
highlight any issues in particular?

I do see the problem in [6], but it's hard to diagnose by itself. I was
able to reproduce it (but only after compiling everything with
-shared-san and providing the necessary LD_LIBRARY_PATH, otherwise
packages using C++ failed to load; I also had to manually set
UBSAN_OPTIONS=print_stacktrace=1 to find out where the overflow
originates):

> read_redatam(paste(dout, "cg15.dic", sep = "/"))
Opening dictionary file...
/usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9: runtime error: addition of unsigned offset to 0x7f38f12bcc30 overflowed to 0x7f38f12bcc2f
    #0 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::operator[](unsigned long) /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1272:9
    #1 0x7f38e859a15f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::back() /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/basic_string.h:1353:9
    #2 0x7f38e859a15f in RedatamLib::ListExporter::ListExporter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/exporters/RListExporter.cpp:16:21
    #3 0x7f38e85a7d60 in RedatamLib::RedatamDatabase::ExportRLists() const /root/redatam.Rcheck/00_pkg_src/redatam/src/redatamlib/entities/RedatamDatabase.cpp:23:16
    #4 0x7f38e85b2224 in export_redatam_to_list_(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>) /root/redatam.Rcheck/00_pkg_src/redatam/src/main.cpp:13:15
    #5 0x7f38e85b3068 in _redatam_export_redatam_to_list_ /root/redatam.Rcheck/00_pkg_src/redatam/src/cpp11.cpp:12:27
    #6 0x55b5da67bf6a in R_doDotCall /root/R/src/main/../../../R-svn/src/main/dotcode.c:754:11

The corresponding lines of the source code are:

ListExporter::ListExporter(const std::string &outputDirectory)
    : m_path(outputDirectory) {
  if ('/' != m_path.back()) { // <-- here
    m_path.append("/");
  }
}

Looks like the code is calling .back() on an empty string. This also
needs to be prevented somehow.
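
One possible guard, as a sketch only: check for emptiness before looking
at the last character. The free function with_trailing_slash is a
hypothetical stand-in for the constructor logic, and leaving an empty
path empty is an assumption; the package may prefer to reject it or to
substitute "./" instead.

```cpp
#include <string>

// Hypothetical helper (not part of redatam): append '/' to a non-empty
// path that lacks one. std::string::back() on an empty string is
// undefined behaviour, so emptiness is tested first.
std::string with_trailing_slash(std::string path) {
  if (!path.empty() && path.back() != '/') {
    path.push_back('/');
  }
  return path;
}
```

With this, with_trailing_slash("out") yields "out/", while an empty
input is returned unchanged and never reaches back().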

To summarise, I recommend doing the following for the next submission:

 - make sure the fix for the UBSan issues from [7] is included and note
   it in the submission comments
 - avoid using more than two threads during R CMD check
 - fix the empty string issue from [6] and note it in the submission
   comments
 - fix any other issues noted in the CRAN correspondence and note them in
   the submission comments
 - do not mention the alloc-dealloc-mismatch problem in the submission
   comments

Good luck! I hope this helps and there won't be any more issues hiding
in the code.

-- 
Best regards,
Ivan

[1] 
https://github.com/pachadotdev/open-redatam/commit/5399332f26fde8029cc7f78e0f889dfd983504c2

[2]
https://cran.r-project.org/incoming/archive/redatam_2.0.3.tar.gz

[3]
https://contributor.r-project.org/cran-cookbook/code_issues.html#using-more-than-2-cores

[4]
https://stackoverflow.com/q/79171799

[5]
https://github.com/r-hub/rhub/issues/598
There's no actual alloc-dealloc-mismatch; it's a false positive due to
the way libc++ (correctly) uses the delete operator:
https://github.com/llvm/llvm-project/issues/59432

[6]
https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241107_154750/specialChecks/clang-san/summary.txt

[7]
https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-ASAN/redatam/00check.log

Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-11 Thread Mauricio Vargas Sepulveda
The issue persists, and I really appreciate the help. The last comment on Stack 
Overflow 
(https://stackoverflow.com/questions/79171799/addresssanitizer-error-alloc-dealloc-mismatch-operator-new-vs-free-in-r-packa?noredirect=1#comment139621299_79171799)
suggests a false positive. If that is the case, should I add that as a note 
when resubmitting?

For v2.0.3 I implemented a full refactor, where I simplified the code 
considerably while trying to solve what looks to be a false positive.

Now I implemented a "patch" based on the suggestion:

  vector> ret;

  // ADDED //
  if (entities.empty() || entities.size() == 0) {
    return ret;
  }

  for (size_t i = 0; i < entities.size() - 1; ++i) {
    ret.push_back(
        {entities[i].GetBounds().second, entities[i + 1].GetBounds().first});
  }

After testing with the R-Hub image "Clang Asan" with this code:

  # Define the package directory and Docker image
  PACKAGE_DIR=$(pwd)
  DOCKER_IMAGE="ghcr.io/r-hub/containers/clang-asan:latest"

  # Pull the Docker image
  docker pull $DOCKER_IMAGE

  # Run the R CMD check inside the Docker container
  docker run --rm -v "$PACKAGE_DIR":/workspace -w /workspace $DOCKER_IMAGE bash -c "
   Rscript -e 'if (!requireNamespace(\"pak\", quietly = TRUE)) { install.packages(\"pak\", repos = \"https://r-lib.github.io/p/pak/dev/\") }'
   Rscript -e 'pak::pkg_install(\"decor\", dependencies = TRUE)'
   Rscript -e 'pak::pkg_install(\"rcmdcheck\", dependencies = TRUE)'
   Rscript -e 'pak::pkg_install(\"pkgbuild\", dependencies = TRUE)'
   Rscript -e 'pak::pkg_install(\".\", dependencies = TRUE)'
   Rscript -e 'pkgbuild::check_build_tools(debug = TRUE)'
   Rscript -e 'rcmdcheck::rcmdcheck(args = c(\"--no-manual\", \"--as-cran\"), build_args = \"--no-manual\", error_on = \"error\")'
  "

I get the following output:

    #2 0x7bc709b3e119 in RedatamLib::ByteArrayReader::MovePos(int) readers/ByteArrayReader.cpp:99:44
    #3 0x7bc709b3e119 in RedatamLib::ByteArrayReader::ReadByte() readers/ByteArrayReader.cpp:172:3
    #4 0x7bc709b3d2a6 in RedatamLib::ByteArrayReader::ReadInt16LE() readers/ByteArrayReader.cpp:178:32
    #5 0x7bc709b3d2a6 in RedatamLib::ByteArrayReader::TryReadStr(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*, bool) readers/ByteArrayReader.cpp:110:20
    #6 0x7bc709b4778a in RedatamLib::FuzzyEntityParser::TryGetEntity() readers/FuzzyEntityParser.cpp:44:17
    #7 0x7bc709b4601b in RedatamLib::FuzzyEntityParser::ParseEntities() readers/FuzzyEntityParser.cpp:18:14
    #8 0x7bc709b7 in RedatamLib::RedatamDatabase::OpenDictionary(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) entities/RedatamDatabase.cpp:31:25
    #9 0x7bc709baa876 in RedatamLib::RedatamDatabase::RedatamDatabase(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) entities/RedatamDatabase.cpp:18:3
    #10 0x7bc709bb76f8 in export_redatam_to_list_(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) /tmp/RtmpEHQaSu/file136708ed8/redatam.Rcheck/00_pkg_src/redatam/src/main.cpp:13:33
    #11 0x7bc709bb8438 in _redatam_export_redatam_to_list_ src/cpp11.cpp:12:27
    #12 0x7bc757f3a700 in R_doDotCall /tmp/R-devel/src/main/dotcode.c:754:11

   SUMMARY: AddressSanitizer: alloc-dealloc-mismatch (/opt/R/devel-asan/lib/R/bin/exec/R+0xc6306) (BuildId: 178e357df79b1589a38c1949da5e5f022d4bb535) in free
   ==4218==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
   ==4218==ABORTING

  2 errors ✖ | 0 warnings ✔ | 2 notes ✖
  Error: R CMD check found ERRORs


——
Mauricio "Pachá" Vargas Sepúlveda
PhD Student, Political Science
University of Toronto



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-11 Thread Mauricio Vargas Sepulveda
Dear Prof. Dr. Ivan Krylov,

I hope you are doing well.

Thanks for your prompt reply. I shall try to implement a fix and resubmit based 
on this. It is a valuable suggestion.
——
Mauricio "Pachá" Vargas Sepúlveda
PhD Student, Political Science
University of Toronto



Re: [R-pkg-devel] Possible false negative for compiled C++ code in CRAN checks

2024-11-11 Thread Ivan Krylov via R-package-devel
Dear Mauricio Vargas Sepulveda,

Welcome to R-package-devel!

On Sat, 9 Nov 2024 17:34:07 +
Mauricio Vargas Sepulveda wrote:

> CRAN reported memory leaks for:
> 
> CLANG/ASAN:
> https://www.stats.ox.ac.uk/pub/bdr/memtests/clang-ASAN/redatam/00check.log
> CLANG/UBSAN:
> https://www.stats.ox.ac.uk/pub/bdr/memtests/gcc-UBSAN/redatam/00check.log

These include container-overflow at
src/redatamlib/ByteArrayReader.cpp:170:23 and method calls with `this`
being a null pointer at src/redatamlib/FuzzyVariableParser.cpp:47 and
src/redatamlib/Entity.cpp:52.

A container-overflow in ByteArrayReader::ReadByte is dangerous because
the code is reading past the end of the vector populated with known
data. While the code doesn't crash, the remainder of the vector could
contain anything. This could invalidate the conclusions of a scientific
work if they end up being based on random contents of uninitialised
memory. You need to find out why m_currPos exceeds the number of bytes
in the array and prevent it from happening.
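
As an illustration of preventing the out-of-bounds read, here is a
sketch of a reader that refuses to advance past the end of its buffer.
BoundedReader is a hypothetical class written for this example; only the
names ReadByte and m_currPos are borrowed from the discussion above, and
throwing std::out_of_range is an assumed error-handling choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a byte reader (not the redatam class).
class BoundedReader {
public:
  explicit BoundedReader(std::vector<std::uint8_t> data)
      : m_data(std::move(data)), m_currPos(0) {}

  // Return the next byte, or throw instead of reading past the end.
  std::uint8_t ReadByte() {
    if (m_currPos >= m_data.size()) {
      throw std::out_of_range("ReadByte: position past end of buffer");
    }
    return m_data[m_currPos++];
  }

private:
  std::vector<std::uint8_t> m_data;
  std::size_t m_currPos;
};
```

A failing check like this also points at the caller that requested one
byte too many, which may help with finding out why the position runs
past the data in the first place.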

Calling methods on a null pointer is also an error in the code. This is
quite confusing, but I have a guess for what might be causing this:

  for (size_t i = 0; i < entities.size() - 1; ++i) {
    ret.push_back(
        {entities[i].GetBounds().second, entities[i + 1].GetBounds().first});
  }

If entities.size() is 0, entities.size() - 1 will wrap around to a very
large positive number, because std::vector::size_type is unsigned. The
loop will then keep accessing out-of-bounds elements of 'entities' until
something causes it to crash. The guess could be completely wrong,
unfortunately.
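
If the guess is right, one way to avoid the wrap-around is to move the
arithmetic to the other side of the comparison. The sketch below assumes
the bounds are pairs of std::size_t; the alias Bounds and the function
gaps_between are hypothetical and only mirror the shape of the loop
above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types for illustration: each element is (first, second).
using Bounds = std::pair<std::size_t, std::size_t>;

// Collect the gap between each element's end and the next one's start.
std::vector<Bounds> gaps_between(const std::vector<Bounds>& bounds) {
  std::vector<Bounds> ret;
  // "i + 1 < size()" is false for an empty or single-element vector,
  // whereas "i < size() - 1" wraps around when size() is 0.
  for (std::size_t i = 0; i + 1 < bounds.size(); ++i) {
    ret.push_back({bounds[i].second, bounds[i + 1].first});
  }
  return ret;
}
```

Written this way, no separate early return for the empty case is needed,
although keeping one does no harm.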

Have you fixed these problems since v2.0.0?

> After asking on Stack Overflow
> (https://stackoverflow.com/q/79171799/3720258), it was suggested that
> I set 'CXXFLAGS="-stdlib=libc++"' in 'configure'. The question is
> very long and provides all the details that I skip here.

I am not seeing alloc-dealloc-mismatch in the CRAN checks, either in
the links you provided or in the latest pretest results at
<https://win-builder.r-project.org/incoming_pretest/redatam_2.0.3_20241107_154750/>.
I think that this indicates a problem with the way your GitHub Actions
are set up. Have you received a CRAN check report with
alloc-dealloc-mismatch in it?

I've tried to compare the versions visible in the CRAN archive but
couldn't get far because there were a lot of formatting changes. What
exactly are the problems for which your latest submission of v2.0.3 has
been archived? Do you need help reproducing the container-overflow
and/or null object pointer errors?

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel