Re: [Chicken-users] c-string return question
Sorry John & Felix, I must have been overworked. Some sleep made me aware that the memory in question is indeed clobbered. The c-pointer section in the FFI manual is just not clear about that. (And somehow I must have convinced myself that C_mpointer would already copy out the memory, which is obviously not the case.)

Given these facts, the c-pointer type is much less interesting now. I'll avoid it from now on.

/Jörg

_______________________________________________
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] c-string return question
> My point is, that this code is valid wrt. the manual and apparently
> valid given my understanding of what the C code tries to do:
> make it possible to return on stack strings.

Can you point out the relevant manual section?

> For me it ends up like this:
>
> ---
> #define return(x) C_cblock C_r = (C_mpointer(&C_a,(void*)(x))); goto C_ret; C_cblockend
> static C_word C_fcall stub26(C_word C_buf,C_word C_a0) C_regparm;
> C_regparm static C_word C_fcall stub26(C_word C_buf,C_word C_a0){
> C_word C_r=C_SCHEME_UNDEFINED,*C_a=(C_word*)C_buf;
> unsigned int ch=(unsigned int )C_num_to_unsigned_int(C_a0);
> static unsigned char off[6]={0xFC,0xF8,0xF0,0xE0,0xC0,0x00};
> int size=5; C_char buf[7];
> buf[6]='\0';
> if (ch < 0x80) {
>    buf[5]=ch;
> } else {
>    buf[size--]=(ch&0x3F)|0x80; ch=ch>>6;
>    while (ch) { buf[size--]=(ch&0x3F)|0x80; ch=ch>>6; }
>    /* Write the size information into the first byte */
>    ++size;
>    buf[size]=off[size]|buf[size];
> }
> return(buf+size);
>
> C_ret:
> #undef return
>
> return C_r;}
> ---
>
> to be called like this:
>
> -
> t3=C_a_i_bytevector(&a,1,C_fix(3));
> t4=C_i_foreign_unsigned_integer_argumentp(t2);
> t5=stub26(t3,t4);
> C_trace("##sys#peek-c-string");
> t6=*((C_word*)lf[5]+1);
> ((C_proc4)(void*)(*((C_word*)t6+1)))(4,t6,t1,t5,C_fix(0));}
> -
>
> But somehow the "C_proc4" receives clobbered memory.

"stub26" is called as a normal C function and the returned buffer will be clobbered if it points to non-static (stack) data.

cheers,
felix
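To illustrate the point about the stub, here is a minimal plain-C sketch of the same pattern without any Chicken headers. The names RETURN, result, shared_buf and stub_like are hypothetical stand-ins for the generated return(x) macro, C_r, buf and stub26. A static buffer is used so that the demonstration itself stays well defined; it only shows that such a macro stores the pointer and does not copy the characters behind it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the generated "#define return(x) ...":
   it stores the POINTER in `result` and jumps to the exit label. */
#define RETURN(x) do { result = (const char *)(x); goto done; } while (0)

/* Static storage, so inspecting it after the call is well defined.
   With an automatic (stack) array the same pointer would dangle. */
static char shared_buf[8];

const char *stub_like(void)
{
    const char *result = NULL;
    strcpy(shared_buf, "abc");
    RETURN(shared_buf);
done:
    return result;
}

/* Returns 1 if a later write to the buffer is visible through the
   returned pointer, i.e. if no copy of the characters was made. */
int result_aliases_buffer(void)
{
    const char *p = stub_like();
    assert(p == shared_buf);   /* same address: nothing was copied */
    shared_buf[0] = 'x';       /* simulate the storage being reused */
    return p[0] == 'x';
}
```

With an automatic array in place of shared_buf, the write that "reuses" the storage would be whatever the next function call puts on the stack, which matches the clobbering described above.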
Re: [Chicken-users] c-string return question
On Oct 13 2011, Alan Post wrote:

> On Thu, Oct 13, 2011 at 02:46:30PM -0400, John Cowan wrote:
> > Alan Post scripsit:
> >
> > > It does make the routine non-reentrant.  Does that matter here?
> >
> > I don't see how.  This routine is called from Chicken, and the string
> > gets copied into a Chicken string right away.
> >
> > I suppose you might want to shut off interrupts.
>
> Right!  I was laboring under the illusion of posix threads.

Hm, I'm working in the presence of POSIX threads; it is just that until now there has been only one Chicken thread for me. Which might change, as I said.

But shutting off interrupts is totally irrelevant here. We are talking about the generated code as seen within the C function. Interrupts are checked at the beginning of the generated functions.
Re: [Chicken-users] c-string return question
On Oct 13 2011, John Cowan wrote:

> Alan Post scripsit:
>
> > It does make the routine non-reentrant.  Does that matter here?
>
> I don't see how.  This routine is called from Chicken, and the string
> gets copied into a Chicken string right away.
>
> I suppose you might want to shut off interrupts.

Come on. When I consider such low-level things, I'm not caught in the cage of the application at hand. It might very well be that one day I want to run two Chicken threads in one process. So far there is no promise that this will work. But the declarations in Chicken core already look as if one could try to do that.

I don't want to accidentally create a stupid test case for the fact that there is no provision (I can't even imagine any) for code inside foreign-lambda* to always be thread-local...

I'd rather keep a test case for the temporarily dysfunctional but good API for returning on-stack strings.

If it were not for QA wrt. Chicken, my simplest solution would be to just use the equivalent definition as it went into Chicken. But that's *not* the point, you see.
Re: [Chicken-users] c-string return question
On Oct 13 2011, John Cowan wrote:

> Jörg F. Wittenberger scripsit:
>
> > So I'll stick with the test case and remove the "static" keyword from
> > the buffer definition once I have an updated gcc in my production
> > environment.
>
> "Program testing can be used to show the presence of bugs, but never to
> show their absence!" --Edsger Dijkstra
>
> And this is especially true for Heisenbugs like this.  Keep the 'static'
> permanently: it's safe and it costs essentially nothing.

Except for the reentrance/thread safety issue that is!
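The trade-off named here can be sketched in a few lines of plain C. This is only an illustration with hypothetical names (one_char, one_char_r and the two check functions are not from the thread), and it handles a single ASCII character to stay short: the static variant hands out the same storage on every call, while the variant with a caller-supplied buffer keeps results independent.

```c
#include <assert.h>
#include <string.h>

/* Non-reentrant: every call shares one static buffer. */
const char *one_char(char c)
{
    static char buf[2];
    buf[0] = c;
    buf[1] = '\0';
    return buf;
}

/* Reentrant: the caller owns the storage (room for 2 bytes needed). */
char *one_char_r(char c, char out[2])
{
    assert(out != NULL);
    out[0] = c;
    out[1] = '\0';
    return out;
}

/* Returns 1 if two results of the static variant share storage. */
int static_results_alias(void)
{
    const char *a = one_char('a');
    const char *b = one_char('b');   /* overwrites what `a` points to */
    return a == b && strcmp(a, "b") == 0;
}

/* Returns 1 if two results of the caller-buffer variant stay intact. */
int reentrant_results_independent(void)
{
    char x[2], y[2];
    one_char_r('a', x);
    one_char_r('b', y);
    return strcmp(x, "a") == 0 && strcmp(y, "b") == 0;
}
```

As long as the result is copied into a Scheme string before the function can be entered again, the aliasing is harmless; it becomes a problem once two threads or a reentrant call can reach the function concurrently.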
Re: [Chicken-users] c-string return question
John, it's not my intention to argue about the merits of the way the foreign-lambda* I posted was written. (If I had to, I would argue that using a dynamic "buf" would be better style: less sensitive to being [re]used in a multi-threaded or reentrant environment.)

My point is that this code is valid wrt. the manual and apparently valid given my understanding of what the C code tries to do: make it possible to return on-stack strings.

The compiler output could be much simpler if there were a restriction that c-string must always point to heap or static memory. (Maybe a c-string-static type would be an idea to distinguish the complex case from the simple one? Not sure.)

The situation is that more recent gcc versions will do fine with this one (and break elsewhere), while those in some common Linux distributions fail to work.

On Oct 13 2011, John Cowan wrote:

> I looked at all instances of 'define return' and at most they seem to
> copy pointers:

That's what they do. They arrange things for ##sys#peek-c-string to find the C string.

For me it ends up like this:

---
#define return(x) C_cblock C_r = (C_mpointer(&C_a,(void*)(x))); goto C_ret; C_cblockend
static C_word C_fcall stub26(C_word C_buf,C_word C_a0) C_regparm;
C_regparm static C_word C_fcall stub26(C_word C_buf,C_word C_a0){
C_word C_r=C_SCHEME_UNDEFINED,*C_a=(C_word*)C_buf;
unsigned int ch=(unsigned int )C_num_to_unsigned_int(C_a0);
static unsigned char off[6]={0xFC,0xF8,0xF0,0xE0,0xC0,0x00};
int size=5; C_char buf[7];
buf[6]='\0';
if (ch < 0x80) {
   buf[5]=ch;
} else {
   buf[size--]=(ch&0x3F)|0x80; ch=ch>>6;
   while (ch) { buf[size--]=(ch&0x3F)|0x80; ch=ch>>6; }
   /* Write the size information into the first byte */
   ++size;
   buf[size]=off[size]|buf[size];
}
return(buf+size);

C_ret:
#undef return

return C_r;}
---

to be called like this:

-
t3=C_a_i_bytevector(&a,1,C_fix(3));
t4=C_i_foreign_unsigned_integer_argumentp(t2);
t5=stub26(t3,t4);
C_trace("##sys#peek-c-string");
t6=*((C_word*)lf[5]+1);
((C_proc4)(void*)(*((C_word*)t6+1)))(4,t6,t1,t5,C_fix(0));}
-

But somehow the "C_proc4" receives clobbered memory. I don't see why.

/Jörg
Re: [Chicken-users] c-string return question
On Thu, Oct 13, 2011 at 02:46:30PM -0400, John Cowan wrote:
> Alan Post scripsit:
>
> > It does make the routine non-reentrant.  Does that matter here?
>
> I don't see how.  This routine is called from Chicken, and the string
> gets copied into a Chicken string right away.
>
> I suppose you might want to shut off interrupts.
>

Right!  I was laboring under the illusion of posix threads.

-Alan
--
.i ma'a lo bradi cu penmi gi'e du
Re: [Chicken-users] c-string return question
Alan Post scripsit:

> It does make the routine non-reentrant.  Does that matter here?

I don't see how.  This routine is called from Chicken, and the string gets copied into a Chicken string right away.

I suppose you might want to shut off interrupts.

--
John Cowan    http://ccil.org/~cowan    co...@ccil.org
SAXParserFactory [is] a hideous, evil monstrosity of a class that should
be hung, shot, beheaded, drawn and quartered, burned at the stake,
buried in unconsecrated ground, dug up, cremated, and the ashes tossed
in the Tiber while the complete cast of Wicked sings "Ding dong, the
witch is dead."  --Elliotte Rusty Harold on xml-dev
Re: [Chicken-users] c-string return question
On Thu, Oct 13, 2011 at 02:07:04PM -0400, John Cowan wrote:
> Jörg F. Wittenberger scripsit:
>
> > So I'll stick with the test case and remove the "static" keyword from
> > the buffer definition once I have an updated gcc in my production
> > environment.
>
> "Program testing can be used to show the presence of bugs, but never to
> show their absence!" --Edsger Dijkstra
>
> And this is especially true for Heisenbugs like this.  Keep the 'static'
> permanently: it's safe and it costs essentially nothing.
>

It does make the routine non-reentrant.  Does that matter here?

-Alan
--
.i ma'a lo bradi cu penmi gi'e du
Re: [Chicken-users] c-string return question
Jörg F. Wittenberger scripsit:

> So I'll stick with the test case and remove the "static" keyword from
> the buffer definition once I have an updated gcc in my production
> environment.

"Program testing can be used to show the presence of bugs, but never to show their absence!" --Edsger Dijkstra

And this is especially true for Heisenbugs like this.  Keep the 'static' permanently: it's safe and it costs essentially nothing.

--
My confusion is rapidly waxing          John Cowan
For XML Schema's too taxing:            co...@ccil.org
I'd use DTDs                            http://www.ccil.org/~cowan
If they had local trees --
I think I best switch to RELAX NG.
Re: [Chicken-users] c-string return question
Jörg F. Wittenberger scripsit:

> (Watch out for C_cblock and C_cblockend #defines in chicken.h , which
> depend on the C compiler in use.)

Normally, they are ({ and }) respectively, the GNU C extension for statement expressions (see http://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html ).  In C++ mode, they compile as "do{" and "}while(0)" instead.  In neither case do they do anything to the stack.

> It does a local #define return(x) to insert a block wherein it saves
> the to-be-returned string *before* the actual return statement is seen
> by the C compiler.

I looked at all instances of 'define return' and at most they seem to copy pointers: they don't copy the chars that are pointed to.  That is what matters here: one way or another, this code returns a pointer to garbage outside the current stack.

> the trick as deployed in the Chicken source does not work under
> certain C compilers.

Since it's still not valid C despite the trick, that's no surprise.

An alternative approach to using a static string, overkill in this case, is to malloc() the result string and declare the result type to be c-string* rather than c-string.

--
Only do what only you can do.               John Cowan
  --Edsger W. Dijkstra's advice to a student in search of a thesis
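The malloc() alternative can be sketched in plain C as follows. This is an illustration under assumptions, not code from the thread: utf8_malloc and utf8_malloc_equals are hypothetical names, and the encoder is a straightforward range-based one rather than the posted back-to-front loop. In a foreign-lambda* body the result type would be c-string*, which tells Chicken to free() the pointer after copying the string; in the plain-C sketch the caller frees it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Encode one code point (up to U+10FFFF) as a NUL-terminated UTF-8
   string in freshly malloc()ed memory.  Returns NULL if cp is out of
   range or allocation fails.  Surrogates are not rejected here.
   The caller must free() the result. */
char *utf8_malloc(unsigned int cp)
{
    unsigned char tmp[4];
    size_t n;

    if (cp < 0x80) {
        tmp[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800) {
        tmp[0] = (unsigned char)(0xC0 | (cp >> 6));
        tmp[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        tmp[0] = (unsigned char)(0xE0 | (cp >> 12));
        tmp[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        tmp[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFF) {
        tmp[0] = (unsigned char)(0xF0 | (cp >> 18));
        tmp[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        tmp[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        tmp[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return NULL;
    }

    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, tmp, n);
    out[n] = '\0';
    return out;               /* heap memory outlives this stack frame */
}

/* Helper for checking: encodes, compares, and frees. */
int utf8_malloc_equals(unsigned int cp, const char *expected)
{
    char *s = utf8_malloc(cp);
    assert(s != NULL);
    int same = strcmp(s, expected) == 0;
    free(s);
    return same;
}
```

The price is one allocation and one free per call, which is why it is called overkill above for a buffer of at most a few bytes.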
Re: [Chicken-users] c-string return question
On Oct 13 2011, Jim Ursetto wrote:

> On Oct 13, 2011, at 11:02 AM, Jörg F. Wittenberger wrote:
>
> > ages ago I wrote these simple lines:
>
> Out of curiosity, would this suit your purposes instead:
>
> (##sys#char->utf8-string (integer->char x))

Looks good. I did not notice that this had made it into the Chicken core since I wrote my code.

NB: The code I posted is actually a good test case for the c-string return value in the Chicken FFI. It was converted to C from an equivalent Scheme implementation (good enough by all counts for the actual purpose at hand) in order to learn about Chicken's c-string return handling.

So I'll stick with the test case and remove the "static" keyword from the buffer definition once I have an updated gcc in my production environment.

Have Fun

/Jörg
Re: [Chicken-users] c-string return question
On Thu, Oct 13, 2011 at 07:24:12PM +0200, Jörg F. Wittenberger wrote:
> IMHO the moral of the story: Never trust your C compiler too much.
>

I've had to get more familiar with gcc's -f flags as the years have gone by.  '-fno-strict-aliasing' is one that I've personally needed (and Chicken requires too, I believe) for some time now, and I've variously had to turn such flags on and off when writing C that was a little too comfortable with the underlying machine architecture.  A favorite trick of mine, for instance:

  struct string {
    size_t string_size;
    char string_buffer[1]; /* note the single character string */
  };

Where I then malloc 'sizeof(struct string)+strlen(str)' all as one block of memory and write the string past the end of the struct.[1]

You might find a wonderful playground of debugging potential if you try this code while fiddling with your -f options: start with the ones that get enabled by -O3, particularly those that aren't enabled by -O2.

-Alan

1: this stores both the size of the string and an extra character for the null terminator, which I do on purpose.

--
.i ma'a lo bradi cu penmi gi'e du
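For comparison, here is a hedged sketch of the same single-allocation layout using a C99 flexible array member, which is swapped in here for the one-element array because it makes the write past the named fields well defined. The field names size and buf and the functions string_new and string_roundtrips are hypothetical; note that with a flexible array member the terminator has to be added to the allocation size explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string {
    size_t size;   /* length without the terminating NUL */
    char   buf[];  /* C99 flexible array member, sized at allocation */
};

/* Allocate header and characters as one block; caller must free(). */
struct string *string_new(const char *src)
{
    size_t len = strlen(src);
    struct string *s = malloc(sizeof *s + len + 1);  /* +1 for the NUL */
    if (s == NULL)
        return NULL;
    s->size = len;
    memcpy(s->buf, src, len + 1);
    return s;
}

/* Returns 1 if the stored size and contents match the input. */
int string_roundtrips(const char *src)
{
    struct string *s = string_new(src);
    assert(s != NULL);
    int ok = s->size == strlen(src) && strcmp(s->buf, src) == 0;
    free(s);
    return ok;
}
```

Whether the one-element-array form survives aggressive optimization depends on the compiler and flags in use, which is the kind of experiment suggested above.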
Re: [Chicken-users] c-string return question
On Oct 13 2011, Jörg F. Wittenberger wrote:

> Recently this code began to return garbage under gcc 4.4.5 on amd64 and
> ARM, though more reliably on ARM.

I forgot a marginal thing you might want to know, just in case:

With gcc 4.4.5 (as in current Debian stable) you really, really don't want to compile C code as produced by Chicken with gcc -O3!! This has worked for me for small test programs so far. But a 50k LoC program runs into all sorts of errors. Just deleting the .o files and recompiling the same C code gives me a working executable.

This triggers memories of my recent observation:
http://lists.nongnu.org/archive/html/chicken-users/2011-10/msg00067.html
That one came up under gcc 4.5.2 (as in current Ubuntu).

IMHO the moral of the story: Never trust your C compiler too much.

Since the latter would be a case of the newer compiler producing, from perfect C source, code that valgrind will complain about. (Which does not exclude the chance that valgrind is wrong. I just don't believe that.)

I'm afraid these facts are almost off-topic here. They are unrelated except for the collateral damage: Chicken compiles to C, which is not exactly an executable format on most machines. ;-)

/Jörg
Re: [Chicken-users] c-string return question
On Oct 13, 2011, at 11:02 AM, Jörg F. Wittenberger wrote:

> ages ago I wrote these simple lines:

Out of curiosity, would this suit your purposes instead:

(##sys#char->utf8-string (integer->char x))
Re: [Chicken-users] c-string return question
On Oct 13 2011, John Cowan wrote:

> Jörg F. Wittenberger scripsit:
>
> > (define integer->utf8string
> >   (foreign-lambda*
> >    c-string ((unsigned-integer ch))
> >    "static C_uchar off[6]={0xFC,0xF8,0xF0,0xE0,0xC0,0x00};
> >     int size=5; C_uchar buf[7];
> >     buf[6]='\\0';
> >     if (ch < 0x80) {
> >       buf[5]=ch;
> >     } else {
> >       buf[size--]=(ch&0x3F)|0x80; ch=ch>>6;
> >       while (ch) { buf[size--]=(ch&0x3F)|0x80; ch=ch>>6; }
> >       /* Write the size information into the first byte */
> >       ++size;
> >       buf[size]=off[size]|buf[size];
> >     }
> >     return(buf+size);
> > "))
>
> This code is not good C, because it returns a pointer into a stack
> frame which has already been exited.  It may just so happen that there
> is still a correct value there, but there are no guarantees.  I'd guess
> that the corruption happens when there is a minor GC.
> See http://c-faq.com/~scs/cclass/int/sx5.html .

Wait! The Chicken manual does not mention this restriction. For a reason. When you read the expanded C code that Chicken produces, you will find that it goes through some magic to make sure this restriction does not apply. (Watch out for the C_cblock and C_cblockend #defines in chicken.h, which depend on the C compiler in use.)

It does a local #define return(x) to insert a block wherein it saves the to-be-returned string *before* the actual return statement is seen by the C compiler.

> > 1.) static C_uchar buf[7];
> >     ^^^^^^
> >     does the trick.
>
> That's absolutely the Right Thing.  You are now returning a pointer to
> the static data region, which will always be available.

Not exactly. While your explanation of my reasoning about how to circumvent the non-working situation is correct, this means that the trick as deployed in the Chicken source does not work under certain C compilers.

Best Regards

/Jörg
Re: [Chicken-users] c-string return question
Jörg F. Wittenberger scripsit:

> (define integer->utf8string
>   (foreign-lambda*
>    c-string ((unsigned-integer ch))
>    "static C_uchar off[6]={0xFC,0xF8,0xF0,0xE0,0xC0,0x00};
>     int size=5; C_uchar buf[7];
>     buf[6]='\\0';
>     if (ch < 0x80) {
>       buf[5]=ch;
>     } else {
>       buf[size--]=(ch&0x3F)|0x80; ch=ch>>6;
>       while (ch) { buf[size--]=(ch&0x3F)|0x80; ch=ch>>6; }
>       /* Write the size information into the first byte */
>       ++size;
>       buf[size]=off[size]|buf[size];
>     }
>     return(buf+size);
> "))

This code is not good C, because it returns a pointer into a stack frame which has already been exited.  It may just so happen that there is still a correct value there, but there are no guarantees.  I'd guess that the corruption happens when there is a minor GC.  See http://c-faq.com/~scs/cclass/int/sx5.html .

> 1.) static C_uchar buf[7];
>     ^^^^^^
>     does the trick.

That's absolutely the Right Thing.  You are now returning a pointer to the static data region, which will always be available.

--
John Cowan    co...@ccil.org    http://ccil.org/~cowan
Consider the matter of Analytic Philosophy.  Dennett and Bennett are
well-known.  Dennett rarely or never cites Bennett, so Bennett rarely or
never cites Dennett.  There is also one Dummett.  By their works shall
ye know them.  However, just as no trinities have fourth persons (Zeppo
Marx notwithstanding), Bummett is hardly known by his works.  Indeed,
Bummett does not exist.  It is part of the function of this and other
e-mail messages, therefore, to do what they can to create him.
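The static-buffer fix can be sketched as a standalone C function. This is an assumption-laden illustration, not the poster's code: the name utf8_static is hypothetical, plain unsigned char replaces C_uchar so that chicken.h is not needed, and the lead byte is chosen from the number of payload bits still left, which is my own variation on the posted table lookup. What it shares with the posted code is the back-to-front fill and, with the fix applied, the static storage behind the returned pointer.

```c
#include <assert.h>
#include <string.h>

/* Encode one code point (0..0x10FFFF) as UTF-8, filling the buffer from
   the end.  The buffer is static, so the returned pointer stays valid
   after the function returns, until the next call overwrites it. */
const char *utf8_static(unsigned int cp)
{
    static unsigned char buf[5];      /* at most 4 bytes plus NUL */
    unsigned char *p = buf + 4;
    unsigned int lead_mark = 0xC0;    /* 110xxxxx: lead of a 2-byte form */
    unsigned int lead_max  = 0x1F;    /* largest payload the lead can hold */

    assert(cp <= 0x10FFFF);           /* precondition: fits in 4 bytes */
    *p = '\0';
    if (cp < 0x80) {                  /* ASCII: a single byte */
        *--p = (unsigned char)cp;
        return (const char *)p;
    }
    /* The first continuation byte is always needed. */
    *--p = (unsigned char)(0x80 | (cp & 0x3F));
    cp >>= 6;
    /* Add continuation bytes until the rest fits into the lead byte. */
    while (cp > lead_max) {
        *--p = (unsigned char)(0x80 | (cp & 0x3F));
        cp >>= 6;
        lead_mark = (lead_mark >> 1) | 0x80;   /* 0xC0 -> 0xE0 -> 0xF0 */
        lead_max >>= 1;                        /* 0x1F -> 0x0F -> 0x07 */
    }
    *--p = (unsigned char)(lead_mark | cp);
    return (const char *)p;
}
```

Removing the static keyword from buf turns this into the problematic variant: the function would still compile, but the pointer it returns would refer to a stack frame that no longer exists.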
[Chicken-users] c-string return question
Hi,

ages ago I wrote these simple lines:

(define integer->utf8string
  (foreign-lambda*
   c-string ((unsigned-integer ch))
   "static C_uchar off[6]={0xFC,0xF8,0xF0,0xE0,0xC0,0x00};
    int size=5; C_uchar buf[7];
    buf[6]='\\0';
    if (ch < 0x80) {
      buf[5]=ch;
    } else {
      buf[size--]=(ch&0x3F)|0x80; ch=ch>>6;
      while (ch) { buf[size--]=(ch&0x3F)|0x80; ch=ch>>6; }
      /* Write the size information into the first byte */
      ++size;
      buf[size]=off[size]|buf[size];
    }
    return(buf+size);
"))

This happened to work, at least on i386, amd64 and ARM, every day for years.

Recently this code began to return garbage under gcc 4.4.5 on amd64 and ARM, though more reliably on ARM.

However, no clear test case is available. When I write the above definition plus some test code

(define xx (integer->utf8string 160))
(display (char->integer (string-ref xx 0)))

into its own file, I have so far been unable to make it return garbage.

It does return garbage when I compile this code as the only foreign function together with the ssax parser in its own module (and link it into a larger program).

Maybe it's helpful to know how I escaped:

1.) static C_uchar buf[7];
    ^^^^^^
    does the trick.

2.) AND so does adding a for-loop right before the return, which prints a hex output of "buf" to stderr! (Instead of the static declaration. So it's rather obviously a gcc issue.)

As far as I understand the C code into which Chicken expands this function, I'd say that one is correct.

Plus: so far I have gcc 4.5.2 on my dev machine. There I have never been able to reproduce this case.

Maybe it's helpful to know what can go wrong.

Best Regards

/Jörg