Bugs item #3532983, was opened at 2012-06-07 14:07
Message generated for change (Comment added) made by kerneis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=742140&aid=3532983&group_id=138953

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: CIL core (parser, visitor, etc)
Group: Bug
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Ed Schwartz (edmcman)
Assigned to: Nobody/Anonymous (nobody)
Summary: Handling of array initialization is incorrect

Initial Comment:
Consider an array, int x[50000] = {0, };

In 6.7.8, paragraph 21 of the C99 standard:

If there are fewer initializers in a brace-enclosed list than there are 
elements or members
of an aggregate, or fewer characters in a string literal used to initialize an 
array of known
size than there are elements in the array, the remainder of the aggregate shall 
be
initialized implicitly the same as objects that have static storage duration.

In other words, when in doubt, initialize to zero.

CIL does this, but only for up to 16 elements.  If it has to put an implied 
zero for more than 16 elements, it does not put any implied zeros.  This is 
wrong, and has lead to incorrect behavior on real code.  Specifically, in 
card_of_complement from tr.c, in coreutils:

  bool in_set[N_CHARS] = { 0, };

where N_CHARS = 256.

There seem to be a few ways of solving this problem:
* Remove the 16 element limit.  This could add a lot of statements.
* Allow local variables to use initializers.  Global variables are allowed 
initializers, but not local ones... is there any reason for this?
* Add a loop to initialize the array

I don't mind implementing one of these, but I want to make sure it's something 
worth including in the CIL codebase.  I prefer the third option.

----------------------------------------------------------------------

>Comment By: Gabriel Kerneis (kerneis)
Date: 2012-06-11 22:24

Message:
In fact, your patch is incorrect is several places:

> The loop is currently only added when there are more than 16 implied
initializers

no it's not (and that's one of the reasons why it breaks the test suite),
But anyway I'll remove these 16-initializers trick.

> BinOp(Ge, Lval ctrlval, Const(CInt64(ni, IUInt, None)), uintType)

Binop(Ge, …) requires intType, not uintType.

> castTo iet (typeOfLval lv) (Const(CInt64(0L, IUInt, None)))

This one is the real problem.  IIUC, you copy/pasted it from the SingleInit
case, but it does not work.  You could have an array of any type (bt), the
cast is non-sensical and you have no guarantee that a direct init is
possibe (it could be an array of struct containing an array, etc.). I think
the solution is to call assignInit recursively with makeZeroInit bt, but I
have to test it to be sure.

----------------------------------------------------------------------

Comment By: Ed Schwartz (edmcman)
Date: 2012-06-11 06:53

Message:
Hi Gabriel,

The loop is currently only added when there are more than 16 implied
initializers.  I also don't like magic numbers, but it could simplify
program analysis to have the unrolled initializers outside of the loop when
there are only a few of them, so I see a case for both sides.

I do not feel strongly either way.  Feel free to amend the patch in
whatever way you think is best.

Thanks,
Ed

----------------------------------------------------------------------

Comment By: Gabriel Kerneis (kerneis)
Date: 2012-06-11 00:22

Message:
Looks good to me.  However, is there any reason to keep the first 16
initializers in collectInitializers since we add the loop in every case? I
think we should get rid of it but I might be missing something. Or use a
loop only for longer arrays, but I prefer avoiding "magic numbers". If you
agree, I'll amend the patch, test it and apply it.

----------------------------------------------------------------------

Comment By: Ed Schwartz (edmcman)
Date: 2012-06-10 18:06

Message:
I attached a patch for this.  Let me know if you have any comments..

----------------------------------------------------------------------

Comment By: Ed Schwartz (edmcman)
Date: 2012-06-08 07:48

Message:
Calling memset is a good idea.

VC++ does have intrinsics:
http://msdn.microsoft.com/en-us/library/26td21ds(v=VS.80).aspx

But, there are a few problems that I can see:

* They can only be enabled at the file level; so if someone was using
memset and expected it to be a function call, that could be a problem if we
needed to use the intrinsic.
* To use the intrinsics, one needs to include a header file which seems to
cause problems:
https://www.google.com/webhp?sourceid=chrome-instant&ie=UTF-8&ion=1#hl=en&sclient=psy-ab&q=intrin.h%20windows.h%20conflict&oq=&aq=&aqi=&aql=&gs_l=&pbx=1&fp=48739054e748cc81&ion=1&bav=on.2,or.r_gc.r_pw.r_cp.r_qf.,cf.osb&biw=1198&bih=1495
* In my tests, the memset intrinsic just calls memset in CRT anyway!  It
doesn't do any kind of inlining.

So, I guess I'll work on the third option.

----------------------------------------------------------------------

Comment By: Gabriel Kerneis (kerneis)
Date: 2012-06-08 00:01

Message:
I would gladly accept a patch. I prefer the second option, but it is not
realistic to implement it given the current state of CIL (the reason why it
was done that way is to be able to move every local variable to function
scope).

So if you feel like coding the third option, please go ahead. A fourth and
shorter solution would be to use __builtin_memset, but it wouldn't work in
MSVC mode (unless there is some intrisic memset available there too).

----------------------------------------------------------------------

Comment By: Ed Schwartz (edmcman)
Date: 2012-06-07 14:10

Message:
And here is how it handles _Bool x[4096] = { 0 };


int f(void) 
{ 
  _Bool x[4096] ;

  {
#line 2
  x[0] = (_Bool)0;
#line 4
  return ((int )x[1]);
}
}


----------------------------------------------------------------------

Comment By: Ed Schwartz (edmcman)
Date: 2012-06-07 14:09

Message:
By the way, here is an example of how CIL hanhdles _Bool x[4] = { 0 }:


int f(void) 
{ 
  _Bool x[4] ;

  {
#line 2
  x[0] = (_Bool)0;
#line 2
  x[1] = (_Bool)0;
#line 2
  x[2] = (_Bool)0;
#line 2
  x[3] = (_Bool)0;
#line 4
  return ((int )x[1]);
}
}


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=742140&aid=3532983&group_id=138953

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
CIL-users mailing list
CIL-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cil-users

Reply via email to