Re: GC seems to crash my C-code function
On Sunday, 19 September 2021 at 02:18:20 UTC, Steven Schveighoffer wrote: ```d // option 1 output[p++] = (ulong(w - e) + 3) % 40; // option 2 output[p++] = cast(char)(((w - e) + 3) % 40); ``` Remember also, `char` is C's only way to say "byte". So this may just be data, and not unicode data. You may want to consider using `ubyte` in your translation instead of `char`. But wait until your code is compiling and working as it did in C. -Steve Thanks again, good to know how D sees things. That data is indeed meant to be raw bytes. Option 2 seems to be always the most reliable one and that kind of casts are the only one I did in first place ;-)
Re: GC seems to crash my C-code function
On 9/18/21 5:16 PM, frame wrote: On Saturday, 18 September 2021 at 18:48:07 UTC, Steven Schveighoffer wrote: Did you mean "long to char" cast? In that case, yes, you have to cast it. Note, `out` is a keyword, it can't be used as a variable, but you probably already figured that out. But if `out` here is a `char *`, then yes, you need a cast there. Yes, of course. `out(put)` is a `char[6]` here. So here is why you need a cast: First, the code is doing math on 2 pointers, which returns a `ptrdiff_t`, a.k.a. `long` in 64-bits. Second, you are using mod 40, which is great, because D will recognize that the range of values must be within 40! However, since it's signed, that's -40 to 40. Which doesn't fit in a `char` (which is unsigned). D does not allow an implicit narrowing conversion. Since -40 to -1 won't fit into a char (dubious anyway for char to be an integer IMO), you need a cast. So my recommendation is shoehorn it into ulong to take away the sign before the mod, or cast to char at the end. Is there any chance that `w` is less than `e`? if so, do NOT use the first option. ```d // option 1 output[p++] = (ulong(w - e) + 3) % 40; // option 2 output[p++] = cast(char)(((w - e) + 3) % 40); ``` Remember also, `char` is C's only way to say "byte". So this may just be data, and not unicode data. You may want to consider using `ubyte` in your translation instead of `char`. But wait until your code is compiling and working as it did in C. -Steve
Re: GC seems to crash my C-code function
On 9/18/21 5:20 PM, frame wrote: On Saturday, 18 September 2021 at 21:16:13 UTC, frame wrote: On Saturday, 18 September 2021 at 18:48:07 UTC, Steven Schveighoffer wrote: Are you defining the prototype for strchr yourself instead of importing it from core.stdc.string? Not really :D but without cast it complains: ``` Error: cannot implicitly convert expression strchr(e, cast(int)c) of type const(char)* to char* ``` But I guess it's because it tries to assign to a `char*` therefore inout() doesn't work. Well, the variable you are assigning it to should be a `const char *` if the source variable is a `const char *`. Possibly, the C code didn't mark it as `const char *`, because I'm pretty sure C has `strchr` being: `char * strchr(const char *, int)` Because, you know, const doesn't matter in C ;) If that doesn't work, then you *may* need to start allocating. But I'd try that first. -Steve
Re: GC seems to crash my C-code function
On Saturday, 18 September 2021 at 21:16:13 UTC, frame wrote: On Saturday, 18 September 2021 at 18:48:07 UTC, Steven Schveighoffer wrote: Are you defining the prototype for strchr yourself instead of importing it from core.stdc.string? Not really :D but without cast it complains: ``` Error: cannot implicitly convert expression strchr(e, cast(int)c) of type const(char)* to char* ``` But I guess it's because it tries to assign to a `char*` therefore inout() doesn't work.
Re: GC seems to crash my C-code function
On Saturday, 18 September 2021 at 18:48:07 UTC, Steven Schveighoffer wrote: Are you defining the prototype for strchr yourself instead of importing it from core.stdc.string? Not really :D but without cast it complains: ``` Error: cannot implicitly convert expression strchr(e, cast(int)c) of type const(char)* to char* ``` Did you mean "long to char" cast? In that case, yes, you have to cast it. Note, `out` is a keyword, it can't be used as a variable, but you probably already figured that out. But if `out` here is a `char *`, then yes, you need a cast there. -Steve Yes, of course. `out(put)` is a `char[6]` here.
Re: GC seems to crash my C-code function
On 9/18/21 12:52 PM, frame wrote: There were also parts where the pointer is used in calculations - which is accepted by the compiler - it just complains about implicitly `long` to `char*` cast: ``` // const char *e // char *w out[p++] = ((w - e) + 3) % 40; ``` Did you mean "long to char" cast? In that case, yes, you have to cast it. Note, `out` is a keyword, it can't be used as a variable, but you probably already figured that out. But if `out` here is a `char *`, then yes, you need a cast there. -Steve
Re: GC seems to crash my C-code function
On 9/18/21 12:52 PM, frame wrote: On Saturday, 18 September 2021 at 11:47:52 UTC, Steven Schveighoffer wrote: Have you tried: ```d const(char)* s2 = "..."; ``` This will work because string literals are zero terminated and implicitly castable to `immutable(char)*`, which will also implicitly cast to `const(char)*`. That should allow s2 to be reassigned but not modify the data. IIRC, string literal data even in C is put into a read-only section by many compilers, so the code shouldn't be changing it. ... The first rule of porting -- just translate, don't change anything. I would try to do exactly what C does without using the GC at all. Continue to use malloc/free. If you have issues with type representation, you may need to adjust for that. This is what I try to achieve - not to change much. But I see there is a mistake by me, s2 __is__ const in the original C-code too. Unfortunately, with `const(char)*`, `strchr()` did complain and then I would have to cast it to `char*` - so I didn't used it in first place because I really thought `.dup` wouldn't allocate here. `dup` definitely allocates. But when I look at the [dlang docs for strchr](https://dlang.org/phobos/core_stdc_string.html#.strchr), it properly adds the `inout` type modifier which should correct that problem. Are you defining the prototype for `strchr` yourself instead of importing it from `core.stdc.string`? There were also parts where the pointer is used in calculations - which is accepted by the compiler - it just complains about implicitly `long` to `char*` cast: ``` // const char *e // char *w out[p++] = ((w - e) + 3) % 40; ``` Why doesn't the compiler complain about `char* - const(char)*`? `long` doesn't implicitly cast to `char *`. But of course, subtracting two pointers works for the same base type. Note that `long` in C is *not the same* as `long` in D. You should take a look at https://dlang.org/spec/interfaceToC.html#data_type_compat -Steve
Re: GC seems to crash my C-code function
On Saturday, 18 September 2021 at 11:47:52 UTC, Steven Schveighoffer wrote: Have you tried: ```d const(char)* s2 = "..."; ``` This will work because string literals are zero terminated and implicitly castable to `immutable(char)*`, which will also implicitly cast to `const(char)*`. That should allow s2 to be reassigned but not modify the data. IIRC, string literal data even in C is put into a read-only section by many compilers, so the code shouldn't be changing it. ... The first rule of porting -- just translate, don't change anything. I would try to do exactly what C does without using the GC at all. Continue to use malloc/free. If you have issues with type representation, you may need to adjust for that. This is what I try to achieve - not to change much. But I see there is a mistake by me, s2 __is__ const in the original C-code too. Unfortunately, with `const(char)*`, `strchr()` did complain and then I would have to cast it to `char*` - so I didn't used it in first place because I really thought `.dup` wouldn't allocate here. There were also parts where the pointer is used in calculations - which is accepted by the compiler - it just complains about implicitly `long` to `char*` cast: ``` // const char *e // char *w out[p++] = ((w - e) + 3) % 40; ``` Why doesn't the compiler complain about `char* - const(char)*`?
Re: GC seems to crash my C-code function
On 9/18/21 5:40 AM, frame wrote: On Friday, 17 September 2021 at 14:29:23 UTC, Steven Schveighoffer wrote: Looking at that signature, it does not appear that it uses zero-termination at all, as it takes a length. So using `dup` and therefore the gc is totally unnecessary. I'm assuming that string is the barcode argument? No, the string appears inside the C-function. I'm calling the function with .ptr and the .length property as it expects. Oh wow, I totally misunderstood what you are doing. I thought you were *calling* that function, but you are *reimplementing* that code in D. Have you tried: ```d const(char)* s2 = "..."; ``` This will work because string literals are zero terminated and implicitly castable to `immutable(char)*`, which will also implicitly cast to `const(char)*`. That should allow s2 to be reassigned but not modify the data. IIRC, string literal data even in C is put into a read-only section by many compilers, so the code shouldn't be changing it. Now that I understand what you are doing, it becomes clear that this isn't a situation of C code being called. Or are you calling other parts of the C library with that translated function? The first rule of porting -- just translate, don't change anything. I would try to do exactly what C does without using the GC at all. Continue to use malloc/free. If you have issues with type representation, you may need to adjust for that. -Steve
Re: GC seems to crash my C-code function
On Friday, 17 September 2021 at 14:29:23 UTC, Steven Schveighoffer wrote: Looking at that signature, it does not appear that it uses zero-termination at all, as it takes a length. So using `dup` and therefore the gc is totally unnecessary. I'm assuming that string is the barcode argument? No, the string appears inside the C-function. I'm calling the function with .ptr and the .length property as it expects. The `-vgc` switch appears to only identify allocations that the compiler invokes via hooks, not ones that other functions invoke (or ones that are direct calls into the GC). ... The docs for `-vgc` should really be updated to clarify. It currently just says "List all gc allocations including hidden ones". -Steve Thanks for clarification!
Re: GC seems to crash my C-code function
On 9/17/21 2:27 AM, frame wrote: On Thursday, 16 September 2021 at 18:02:44 UTC, Steven Schveighoffer wrote: Are you sure? Be very pedantic about what C functions do with the data you send it. Sometimes they store it somewhere to use later. Sometimes they expect it to be allocated by the C heap, etc. Without seeing how you use it, I can't tell you if it's wrong or not. If you want to have a look the original C-library is here https://github.com/rdoeffinger/iec16022 I'm only using the encoder function iec16022ecc200f. Looking at that signature, it does not appear that it uses zero-termination at all, as it takes a length. So using `dup` and therefore the gc is totally unnecessary. I'm assuming that string is the barcode argument? What does your call look like? If it's a literal, you don't need to toStringz (which also allocates). All string literals are zero-terminated (and actually implicitly castable to `immutable char *`). Thanks, I'm just careful with casting. Does it really allocate from a literal if it's used on the stack only? Is `-vgc` switch reliable? The `-vgc` switch appears to only identify allocations that the compiler invokes via hooks, not ones that other functions invoke (or ones that are direct calls into the GC). In other words, it helps you find your direct allocations using the compiler, not ones that are buried in already-compiled code. Try this with `-vgc`, and it reports nothing: ```d import core.memory; void main() { auto x = GC.malloc(10); } ``` This makes sense, as it may not have the function code to analyze, and it also cannot infer GC allocation from a lack of @nogc (a non-@nogc function *may* allocate, but does not *necessarily* allocate). It also would be quite useless to see some function buried inside druntime that allocates, not knowing what the call stack was. The docs for `-vgc` should really be updated to clarify. It currently just says "List all gc allocations including hidden ones". -Steve
Re: GC seems to crash my C-code function
On Friday, 17 September 2021 at 06:58:01 UTC, jfondren wrote: On Friday, 17 September 2021 at 06:27:40 UTC, frame wrote: Thanks, I'm just careful with casting. Does it really allocate from a literal if it's used on the stack only? Is `-vgc` switch reliable? looks to me like it calls ... Interesting analysis, thanks for your effort!
Re: GC seems to crash my C-code function
On Friday, 17 September 2021 at 06:27:40 UTC, frame wrote: Thanks, I'm just careful with casting. Does it really allocate from a literal if it's used on the stack only? Is `-vgc` switch reliable? looks to me like it calls ```d // object private U[] _dup(T, U)(scope T[] a) pure nothrow @trusted if (__traits(isPOD, T)) { if (__ctfe) return _dupCtfe!(T, U)(a); import core.stdc.string : memcpy; auto arr = _d_newarrayU(typeid(T[]), a.length); memcpy(arr.ptr, cast(const(void)*) a.ptr, T.sizeof * a.length); return *cast(U[]*) } ``` -> ```d // rt.lifetime extern (C) void[] _d_newarrayU(const scope TypeInfo ti, size_t length) pure nothrow @weak { ... auto info = __arrayAlloc(size, ti, tinext); ... } ``` -> ```d // rt.lifetime BlkInfo __arrayAlloc(size_t arrsize, const scope TypeInfo ti, const TypeInfo tinext) nothrow pure { ... auto bi = GC.qalloc(padded_size, attr, tinext); ... } ``` -> ```d // gc.impl.conservative.gc BlkInfo qalloc( size_t size, uint bits, const TypeInfo ti) nothrow { if (!size) { return BlkInfo.init; } BlkInfo retval; retval.base = runLocked!(mallocNoSync, mallocTime, numMallocs)(size, bits, retval.size, ti); if (!(bits & BlkAttr.NO_SCAN)) { memset(retval.base + size, 0, retval.size - size); } retval.attr = bits; return retval; } ``` which you can also follow in an objdump. Conclusion: -vgc is missing this GC allocation. To stack-allocate a mutable copy of a string literal go with ```d @safe @nogc nothrow unittest { enum S = "hello world"; char[S.length+1] s1 = S; char* s2 = [0]; // Normally you'd expect s1.ptr[s1.length] here, // but the '\0' byte is explicitly part of s1 due to length+1 above. assert(s1[s1.length-1] == '\0'); (() @trusted { import core.stdc.stdio : puts; import std.string : fromStringz; puts(s2); assert(s2.fromStringz == S); })(); } ```
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 18:02:44 UTC, Steven Schveighoffer wrote: Are you sure? Be very pedantic about what C functions do with the data you send it. Sometimes they store it somewhere to use later. Sometimes they expect it to be allocated by the C heap, etc. Without seeing how you use it, I can't tell you if it's wrong or not. If you want to have a look the original C-library is here https://github.com/rdoeffinger/iec16022 I'm only using the encoder function iec16022ecc200f. If it's a literal, you don't need to toStringz (which also allocates). All string literals are zero-terminated (and actually implicitly castable to `immutable char *`). -Steve Thanks, I'm just careful with casting. Does it really allocate from a literal if it's used on the stack only? Is `-vgc` switch reliable?
Re: GC seems to crash my C-code function
On 9/16/21 1:08 PM, frame wrote: On Thursday, 16 September 2021 at 15:34:25 UTC, Steven Schveighoffer wrote: `dup` is a GC allocation. Are you using that in your C code? the GC might be collecting that string. The compiler doesn't show that lines with -vgc. Maybe it knows that it is only stack allocated? Technically, the GC could collect that data if it wants - it's not longer used after the function returns. At least I control that the GC cannot collect it till my data is processed, so there should be no problem. Are you sure? Be very pedantic about what C functions do with the data you send it. Sometimes they store it somewhere to use later. Sometimes they expect it to be allocated by the C heap, etc. Without seeing how you use it, I can't tell you if it's wrong or not. I guess the C-methods did corrupt the memory which the compiler has reserved for that function data statically and then the GC collect marks it as free or some other UB. You are better off to cast away the immutable (as long as you are 100% sure the C code isn't writing to it), as the string literal will not be collected. Yes, I changed it to stringz and a cast and the problem is gone so far. All char* are read only in the function or passed to stdlib functions that should not modify it. If it's a literal, you don't need to toStringz (which also allocates). All string literals are zero-terminated (and actually implicitly castable to `immutable char *`). -Steve
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 15:34:25 UTC, Steven Schveighoffer wrote: `dup` is a GC allocation. Are you using that in your C code? the GC might be collecting that string. The compiler doesn't show that lines with -vgc. Maybe it knows that it is only stack allocated? Technically, the GC could collect that data if it wants - it's not longer used after the function returns. At least I control that the GC cannot collect it till my data is processed, so there should be no problem. I guess the C-methods did corrupt the memory which the compiler has reserved for that function data statically and then the GC collect marks it as free or some other UB. You are better off to cast away the immutable (as long as you are 100% sure the C code isn't writing to it), as the string literal will not be collected. -Steve Yes, I changed it to stringz and a cast and the problem is gone so far. All char* are read only in the function or passed to stdlib functions that should not modify it.
Re: GC seems to crash my C-code function
On 9/16/21 6:28 AM, frame wrote: I have C-code translated in D that acts sometimes incorrect if the GC has made some collect. I would like to know why. - Code runs correct if the GC collections are off - There are no allocations within the C-translated-code except `throw new` (but they are not called) ... I didn't want to change the code much so I have some piece like that: ```d // const char *s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_"; char* s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_".dup.ptr; ``` Could this cause the issue? But the pointer is not used outside the function where it's created. `dup` is a GC allocation. Are you using that in your C code? the GC might be collecting that string. You are better off to cast away the immutable (as long as you are 100% sure the C code isn't writing to it), as the string literal will not be collected. -Steve
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 11:35:27 UTC, frame wrote: On Thursday, 16 September 2021 at 11:11:56 UTC, bauss wrote: On Thursday, 16 September 2021 at 11:06:04 UTC, frame wrote: On Thursday, 16 September 2021 at 10:48:19 UTC, bauss wrote: Use toStringz and not .ptr. Or append \0 to your string. Stupid me should really know that already, thanks =) Of course I have dup'ed the \0 from the string away... But still I don't know why it works if the GC is off? Did you see my second response? Yes, but as I mentioned this pointer data isn't hold outside the function and the GC collect runs after data is processed, not in between. This pointer should be fully stack allocated and also newly generated each time the function is called, so I see no impact by the GC? But GC collecting works now without issues. My guess is something is undefined behavior in the C code, idk really
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 11:11:56 UTC, bauss wrote: On Thursday, 16 September 2021 at 11:06:04 UTC, frame wrote: On Thursday, 16 September 2021 at 10:48:19 UTC, bauss wrote: Use toStringz and not .ptr. Or append \0 to your string. Stupid me should really know that already, thanks =) Of course I have dup'ed the \0 from the string away... But still I don't know why it works if the GC is off? Did you see my second response? Yes, but as I mentioned this pointer data isn't hold outside the function and the GC collect runs after data is processed, not in between. This pointer should be fully stack allocated and also newly generated each time the function is called, so I see no impact by the GC? But GC collecting works now without issues.
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 11:06:04 UTC, frame wrote: On Thursday, 16 September 2021 at 10:48:19 UTC, bauss wrote: Use toStringz and not .ptr. Or append \0 to your string. Stupid me should really know that already, thanks =) Of course I have dup'ed the \0 from the string away... But still I don't know why it works if the GC is off? Did you see my second response?
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 10:48:19 UTC, bauss wrote: Use toStringz and not .ptr. Or append \0 to your string. Stupid me should really know that already, thanks =) Of course I have dup'ed the \0 from the string away... But still I don't know why it works if the GC is off?
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 10:48:19 UTC, bauss wrote: On Thursday, 16 September 2021 at 10:28:37 UTC, frame wrote: I have C-code translated in D that acts sometimes incorrect if the GC has made some collect. I would like to know why. - Code runs correct if the GC collections are off - There are no allocations within the C-translated-code except `throw new` (but they are not called) - All allocations made in C-translated-code are still calloc/malloc `ed - Even if I disable the GC before calling the function and just enable it after there will be an incorrect result - Data passed to the function belongs to a struct and the function is called in a member function and is always correct The main public function accepts a char* and returns a char*. Signature is like this: ```d char* fun(ref int, ref int, size_t, const char*, out int, out int, out int, uint); ``` Input paramter gets the pointer from char[] `.ptr` property (and length must be supplied too). I didn't want to change the code much so I have some piece like that: ```d // const char *s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_"; char* s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_".dup.ptr; ``` Could this cause the issue? But the pointer is not used outside the function where it's created. Use toStringz and not .ptr. Or append \0 to your string. Also see the documentation for "toStringz" which has this: ``` Important Note: When passing a char* to a C function, and the C function keeps it around for any reason, make sure that you keep a reference to it in your D code. Otherwise, it may become invalid during a garbage collection cycle and cause a nasty bug when the C code tries to use it. ``` It probably should tell that somewhere else too.
Re: GC seems to crash my C-code function
On Thursday, 16 September 2021 at 10:28:37 UTC, frame wrote: I have C-code translated in D that acts sometimes incorrect if the GC has made some collect. I would like to know why. - Code runs correct if the GC collections are off - There are no allocations within the C-translated-code except `throw new` (but they are not called) - All allocations made in C-translated-code are still calloc/malloc `ed - Even if I disable the GC before calling the function and just enable it after there will be an incorrect result - Data passed to the function belongs to a struct and the function is called in a member function and is always correct The main public function accepts a char* and returns a char*. Signature is like this: ```d char* fun(ref int, ref int, size_t, const char*, out int, out int, out int, uint); ``` Input paramter gets the pointer from char[] `.ptr` property (and length must be supplied too). I didn't want to change the code much so I have some piece like that: ```d // const char *s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_"; char* s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_".dup.ptr; ``` Could this cause the issue? But the pointer is not used outside the function where it's created. Use toStringz and not .ptr. Or append \0 to your string.
GC seems to crash my C-code function
I have C-code translated in D that acts sometimes incorrect if the GC has made some collect. I would like to know why. - Code runs correct if the GC collections are off - There are no allocations within the C-translated-code except `throw new` (but they are not called) - All allocations made in C-translated-code are still calloc/malloc `ed - Even if I disable the GC before calling the function and just enable it after there will be an incorrect result - Data passed to the function belongs to a struct and the function is called in a member function and is always correct The main public function accepts a char* and returns a char*. Signature is like this: ```d char* fun(ref int, ref int, size_t, const char*, out int, out int, out int, uint); ``` Input paramter gets the pointer from char[] `.ptr` property (and length must be supplied too). I didn't want to change the code much so I have some piece like that: ```d // const char *s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_"; char* s2 = "!\"#$%&'()*+,-./:;<=>?@[\\]^_".dup.ptr; ``` Could this cause the issue? But the pointer is not used outside the function where it's created.