Hi Brian,

Thanks! Should the cygwin-debuginfo package be installed on the computer that compiles the code or on the computer that runs it? I currently have it installed only on the computer that runs the code, not on the one that compiles it.

Best Regards,

Kennon

> On 02/25/2026 10:28 AM PST Brian Inglis via Cygwin <[email protected]> wrote:
>
>
> Hi Kennon,
>
> To make debugging easier and more informative with source and symbols install
> the cygwin-debuginfo package, similarly with any other library dependencies, or
> other package binaries.
>
> On 2026-02-25 11:18, KENNON J CONRAD via Cygwin wrote:
> > new_score_rank is a local variable.  GDB only prints local variables.  No
> > other thread can access this local variable.
> >
> > Best Regards,
> >
> > Kennon
> >
> >> On 02/25/2026 2:32 AM PST Duncan Roe via Cygwin <[email protected]> wrote:
> >>
> >>
> >> Hi Kennon,
> >>
> >> On Tue, Feb 24, 2026 at 10:38:01AM -0800, cygwin wrote:
> >>> Hello,
> >>>
> >>> I am having a problem that is apparently related to memmove and am
> >>> looking for some advice on how to investigate further.  This winter I
> >>> have been working to simplify GLZA source code and make it more readable.
> >>> GLZA is an advanced open source code straight line grammar compressor
> >>> first released in 2015.  Among these changes was replacing some rather
> >>> bloated code with memmove and memset in various locations.  The program
> >>> started crashing occasionally and after extensively reviewing the
> >>> changes, I was unable to find a cause for these crashes.  So I installed
> >>> gdb to try to find out what was going on and was apparently able to find
> >>> the cause of the problem.  As a new gdb user, I am not very comfortable
> >>> with trusting the results of what gdb is showing, but it is pointing
> >>> directly to one of the code changes I made.  I backed out of this code
> >>> change and the program has not crashed after 3 days of nearly continuous
> >>> testing.
> >>>
> >>> So here is what gdb reports when backtrace is run immediately after
> >>> reporting a "SIGTRAP":
> >>>
> >>> (gdb) bt full
> >>> #0  0x00007ff9dd8aa98b in KERNELBASE!DebugBreak () from
> >>> /cygdrive/c/Windows/system32/KERNELBASE.dll
> >>> No symbol table info available.
> >>> #1  0x00007ff9ca3b6417 in cygwin1!.assert () from
> >>> /cygdrive/c/Windows/cygwin1.dll
> >>> No symbol table info available.
> >>> #2  0x00007ff9ca3cfb18 in secure_getenv () from
> >>> /cygdrive/c/Windows/cygwin1.dll
> >>> No symbol table info available.
> >>> #3  0x00007ff9e03dd82d in ntdll!.chkstk () from
> >>> /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> >>> No symbol table info available.
> >>> #4  0x00007ff9e038916b in ntdll!RtlRaiseException () from
> >>> /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> >>> No symbol table info available.
> >>> #5  0x00007ff9e03dc9ee in ntdll!KiUserExceptionDispatcher () from
> >>> /cygdrive/c/Windows/SYSTEM32/ntdll.dll
> >>> No symbol table info available.
> >>> #6  0x00007ff9ca3b12a9 in memmove () from /cygdrive/c/Windows/cygwin1.dll
> >>> No symbol table info available.
> >>> #7  0x0000000100409a7c in rank_scores_thread (arg=0x6ffece890010) at
> >>> GLZAcompress.c:904
> >>>         new_score_rank = 2633
> >>>         new_score_lmi2 = 183964750
> >>>         new_score_pmi2 = 183964725
> >>>         rank = 4380
> >>>         max_rank = 2633
> >>>         num_symbols = 25
> >>>         new_score_lmi = 92079851
> >>>         new_score_pmi = 92079826
> >>>         thread_data_ptr = 0x6ffece890010
> >>>         max_scores = 4883
> >>>         candidates_index = 0xa00034470
> >>>         score_index = 4380
> >>>         node_score_num_symbols = 7
> >>>         num_candidates = 4381
> >>>         node_ptrs_num = 12224
> >>>         local_write_index = 12225
> >>>         rank_scores_buffer = 0x6ffece890020
> >>>         candidates = 0x6ffece990020
> >>>         score = 47.6283531
> >>> #8  0x00007ff9ca412eec in cygwin1!.getreent () from
> >>> /cygdrive/c/Windows/cygwin1.dll
> >>> No symbol table info available.
> >>> #9  0x00007ff9ca3b47d3 in cygwin1!.assert () from
> >>> /cygdrive/c/Windows/cygwin1.dll
> >>> No symbol table info available.
> >>> #10 0x0000000000000000 in ?? ()
> >>> No symbol table info available.
> >>>
> >>> GLZAcompress.c line 904 is as follows and is in code that runs as a
> >>> separate thread created in main:
> >>>   memmove(&candidates_index[new_score_rank+1],
> >>> &candidates_index[new_score_rank], 2 * (rank - new_score_rank));
> >>> This does point directly to where a code change was made.
> >>>
> >>> candidates_index is allocated in main and not ever intentionally changed
> >>> until deallocated at the end of program execution:
> >>>   if (0 == (candidates_index = (uint16_t *)malloc(max_scores *
> >>> sizeof(uint16_t))))
> >>>     fprintf(stderr, "ERROR - memory allocation failed\n");
> >>> This value is passed to the thread in a structure pointed to by the
> >>> thread arg.  The value 0xa00034470 for candidates_index is similar to
> >>> what is reported on subsequent runs with added code to print this value
> >>> so I don't think it's corrupted, but would need to duplicate the crash
> >>> after checking the initial value to be 100% certain.  With gdb reporting
> >>> that rank = 4380 and new_score_rank = 2633 at the time of the SIGTRAP,
> >>> this should be a backward move of 1747 uint16_t values by 2 bytes with a
> >>> 2 byte difference between the source and destination addresses.
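The arithmetic in the paragraph above can be checked in isolation. What follows is only a minimal sketch under stated assumptions, not the GLZA code: the helper names shift_memmove, shift_loop and compare_shifts are hypothetical, and the array contents are arbitrary filler. It applies the same memmove expression as line 904 and a plain element-by-element loop to two identical arrays, then reports whether the results differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: shift elements [lo, hi) up one slot with memmove,
   using the same expression shape as GLZAcompress.c line 904. */
static void shift_memmove(uint16_t *a, size_t lo, size_t hi)
{
    assert(lo <= hi);
    memmove(&a[lo + 1], &a[lo], sizeof(uint16_t) * (hi - lo));
}

/* Hypothetical helper: the same shift as a simple loop that works
   downward from a[hi], like the hand-written code it replaced. */
static void shift_loop(uint16_t *a, size_t lo, size_t hi)
{
    assert(lo <= hi);
    for (size_t i = hi; i > lo; i--)
        a[i] = a[i - 1];
}

/* Returns 0 if both shifts give identical arrays, 1 if they differ,
   and -1 on out-of-range arguments or allocation failure. */
static int compare_shifts(size_t lo, size_t hi, size_t n)
{
    if (lo > hi || hi >= n)          /* a[hi] is the last element written */
        return -1;

    uint16_t *x = malloc(n * sizeof *x);
    uint16_t *y = malloc(n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1;
    }

    /* Fill both arrays with the same arbitrary pattern. */
    for (size_t i = 0; i < n; i++)
        x[i] = y[i] = (uint16_t)(i * 7u + 1u);

    shift_memmove(x, lo, hi);
    shift_loop(y, lo, hi);

    int differs = memcmp(x, y, n * sizeof *x) != 0;
    free(x);
    free(y);
    return differs;
}
```

Calling compare_shifts(2633, 4380, 4883) uses the new_score_rank, rank and max_scores values from the backtrace, and should return 0 if memmove handles the overlap correctly. A single-threaded check like this says nothing about behaviour with 8 threads running, so a clean result would only rule out the simplest explanation.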
> >>>
> >>> Prior to this code change and for the last 3 days I have been using this
> >>> code instead and not seen any crashes:
> >>>   uint16_t * score_ptr = &candidates_index[new_score_rank];
> >>>   uint16_t * candidate_ptr = &candidates_index[rank];
> >>>   while (candidate_ptr >= score_ptr + 8) {
> >>>     *candidate_ptr = *(candidate_ptr - 1);
> >>>     *(candidate_ptr - 1) = *(candidate_ptr - 2);
> >>>     *(candidate_ptr - 2) = *(candidate_ptr - 3);
> >>>     *(candidate_ptr - 3) = *(candidate_ptr - 4);
> >>>     *(candidate_ptr - 4) = *(candidate_ptr - 5);
> >>>     *(candidate_ptr - 5) = *(candidate_ptr - 6);
> >>>     *(candidate_ptr - 6) = *(candidate_ptr - 7);
> >>>     *(candidate_ptr - 7) = *(candidate_ptr - 8);
> >>>     candidate_ptr -= 8;
> >>>   }
> >>>   while (candidate_ptr > score_ptr) {
> >>>     *candidate_ptr = *(candidate_ptr - 1);
> >>>     candidate_ptr--;
> >>>   }
> >>> Yes, it's bloated code that should do the same thing as the memmove, but
> >>> most importantly the code has never caused any problems.  Interestingly,
> >>> even this code shows memmove in the assembly code (gcc -S), but only for
> >>> the second while loop.  The looping code for the first while loop looks
> >>> like this and moves 8 uint16_t's in just 5 instructions so it is perhaps
> >>> not as inefficient as the source code looks:
> >>> .L25:
> >>>   movdqu -16(%rax), %xmm1
> >>>   subq $16, %rax
> >>>   movups %xmm1, 2(%rax)
> >>>   cmpq %rdx, %rax
> >>>   jnb .L25
> >>>
> >>> It may or may not matter, but the code this is happening on is very CPU
> >>> intensive - there can be up to 8 threads running at the same time when
> >>> this problem occurs.  The problem doesn't occur consistently; it seems to
> >>> be rather random.
> >>> The program runs about 500 iterations of ranking up to
> >>> the top 30,000 new grammar rule candidates over nearly 4 hours on my test
> >>> case and has crashed on different iterations each time it has crashed,
> >>> even though the thread that seems to be crashing should be seeing exactly
> >>> the same data each time the program is run.  The malloc'ed array address
> >>> could be changing, I haven't checked that out.
> >>>
> >>> I find it really hard to believe there is a bug in memmove but that seems
> >>> to be what gdb and my testing are indicating.  So I am looking for advice
> >>> on how to better understand what is causing the program to crash.  I
> >>> would like to review the code memset is using, but have not been able to
> >>> figure out how to track that down.  Any help in understanding what code
> >>> the compiler is using for memmove would be helpful.  Are there other
> >>> things I could possibly be overlooking?  Are there any other things I
> >>> should review or report that would be helpful?  I could try to write a
> >>> simplified test case if that would be useful.
> >>>
> >>> Best Regards,
> >>>
> >>> Kennon Conrad
> >>>
> >>> --
> >>> Problem reports: https://cygwin.com/problems.html
> >>> FAQ: https://cygwin.com/faq/
> >>> Documentation: https://cygwin.com/docs.html
> >>> Unsubscribe info: https://cygwin.com/ml/#unsubscribe-simple
> >>
> >> The memmove() call accesses new_score_rank 3 times while the old code only
> >> accessed it once. Is it possible that another CPU alters new_score_rank
> >> between these accesses?
> >>
> >> You could eliminate that possibility by making a local copy of
> >> new_score_rank and using that in the memmove() call. Worth a try?
> >>
> >> Cheers ... Duncan.
> >
> >
>
> --
> Take care.
> Thanks, Brian Inglis              Calgary, Alberta, Canada
>
> La perfection est atteinte                   Perfection is achieved
> non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
> mais lorsqu'il n'y a plus rien à retrancher  but when there is no more to cut
>                                 -- Antoine de Saint-Exupéry

