On Tue, Oct 30, 2012 at 12:10 PM, Jason Merrill <ja...@redhat.com> wrote:
> On 10/27/2012 09:16 PM, Sriraman Tallam wrote:
>>
>> +         /* See if there's a match.   For functions that are
>> multi-versioned,
>> +            all the versions match.  */
>>           if (same_type_p (target_fn_type, static_fn_type (fn)))
>> -           matches = tree_cons (fn, NULL_TREE, matches);
>> +           {
>> +             matches = tree_cons (fn, NULL_TREE, matches);
>> +             /*If versioned, push all possible versions into a vector.
>> */
>> +             if (DECL_FUNCTION_VERSIONED (fn))
>> +               {
>> +                 if (fn_ver_vec == NULL)
>> +                  fn_ver_vec = VEC_alloc (tree, heap, 2);
>> +                 VEC_safe_push (tree, heap, fn_ver_vec, fn);
>> +               }
>> +           }
>
>
> Why do we need to keep both a list and vector of the matches?

Right, but we later call the target hook
get_function_versions_dispatcher which takes a vector. I could change
that to accept a list instead if that is preferable?

>
>> +        Call decls_match to make sure they are different because they are
>> +        versioned.  */
>> +      if (DECL_FUNCTION_VERSIONED (fn))
>> +       {
>> +          for (match = TREE_CHAIN (matches); match; match = TREE_CHAIN
>> (match))
>> +           if (decls_match (fn, TREE_PURPOSE (match)))
>> +             break;
>> +       }
>
>
> What if you have multiple matches that aren't all versions of the same
> function?

Right, I should really check if there are versions by comparing params
too. I fixed this in joust but missed out here. I will make the change
so that any matches with functions that do not belong to the
semantically identical group of function versions will be caught and
the ambiguity will be flagged.

>
> Why would it be a problem to have two separate declarations of the same
> function?

AFAIU, this should not be a problem. For duplicate declarations,
duplicate_decls should merge them and they should never be seen here.
Did I miss something?

>
>> +      dispatcher_decl = targetm.get_function_versions_dispatcher
>> (fn_ver_vec);
>
>
> Is the idea here that if you have some versions declared, then a call, then
> more versions declared, then another call, you will call two different
> dispatchers,

No, I thought about this but I did not want to handle this case in
this iteration. The dispatcher is created only once and if more
functions are declared later, they will not be dispatched atleast in
this iteration.

> where the first one will only dispatch to the versions declared
> before the first call?  If not, why do we care about the set of declarations
> at this point?

I am taking the address of a multi-versioned function here. The
front-end is returning the address of the dispatcher decl instead.
Since, I am building the dispatcher here, why not construct the cgraph
datastructures for these versions too?  That is why I aggregate all
the declarations here.

>
>> +      /* Mark this functio to be output.  */
>> +      node->local.finalized = 1;
>
>
> Missing 'n' in "function".
>
>> @@ -14227,7 +14260,11 @@ cxx_comdat_group (tree decl)
>>           else
>>             break;
>>         }
>> -      name = DECL_ASSEMBLER_NAME (decl);
>> +      if (TREE_CODE (decl) == FUNCTION_DECL
>> +            && DECL_FUNCTION_VERSIONED (decl))
>> +          name = DECL_NAME (decl);
>
>
> This would mean that f in the global namespace and f in namespace foo would
> end up in the same comdat group.  Why do we need special handling here at
> all?


Right, we do not need special handling. It is ok for each function
version to be in its own comdat group, I will remove this.

>
>>  dump_function_name (tree t, int flags)
>>  {
>> -  tree name = DECL_NAME (t);
>> +  tree name;
>>
>> +  /* For function versions, use the assembler name as the decl name is
>> +     the same for all versions.  */
>> +  if (TREE_CODE (t) == FUNCTION_DECL
>> +      && DECL_FUNCTION_VERSIONED (t))
>> +    name = DECL_ASSEMBLER_NAME (t);
>
>
> This shouldn't be necessary; we should print the target attribute when
> printing the function declaration.

Ok.

>
>> +        Also, mark this function as needed if it is marked inline but
>> +        is a multi-versioned function.  */
>> +      if (((flag_keep_inline_functions
>> +           || DECL_FUNCTION_VERSIONED (fn))
>
>
> This should be marked as needed by the code that builds the dispatcher.

I had some trouble previously figuring out where to mark this as
needed. I will fix it.

>
>> +  /* For calls to a multi-versioned function, overload resolution
>> +     returns the function with the highest target priority, that is,
>> +     the version that will checked for dispatching first.  If this
>> +     version is inlinable, a direct call to this version can be made
>> +     otherwise the call should go through the dispatcher.  */
>
>
> I'm a bit confused why people would want both dispatched calls and
> non-dispatched inlining; I would expect that if a function can be compiled
> differently enough on newer hardware to make versioning worthwhile, that
> would be a larger difference than the call overhead.

Simple example:

int
foo ()
{
  return 1;
}
int __attribute__ ((target ("popcnt")))
foo ()
{
  return 0;
}

int __attribute__ ((target ("popcnt")))
bar ()
{
  return foo ();
}

Here, the call to foo () from bar () will be turned into a direct call
to the popcnt version.

Here, if bar is executed, then popcnt is supported and the call to foo
from bar will be dispatched to the popcnt version even if it goes
through the dispatcher and this is known at compile time. So, why not
make a direct call?  I am only making direct calls to versions when I
am sure the dispatcher would do the same.

>
>> +  if (DECL_FUNCTION_VERSIONED (fn)
>> +      && !targetm.target_option.can_inline_p (current_function_decl, fn))
>> +    {
>> +      struct cgraph_node *dispatcher_node = NULL;
>> +      fn = get_function_version_dispatcher (fn);
>> +      if (fn == NULL)
>> +       return NULL;
>> +      dispatcher_node = cgraph_get_create_node (fn);
>> +      gcc_assert (dispatcher_node != NULL);
>> +      /* Mark this function to be output.  */
>> +      dispatcher_node->local.finalized = 1;
>> +    }
>
>
> Why do you need to mark this here?  If you generate a call to the
> dispatcher, cgraph should mark it to be output automatically.

dispatcher_node does not have a body  until it is generated in
cgraphunit.c, so cgraph does not mark this field before this is
processed in cgraph_analyze_function.

>
>> +  /* For candidates of a multi-versioned function,  make the version with
>> +     the highest priority win.  This version will be checked for
>> dispatching
>> +     first.  If this version can be inlined into the caller, the
>> front-end
>> +     will simply make a direct call to this function.  */
>
>
> This is still too high in joust.  I believe I said before that this code
> should come just above
>
>    /* If the two function declarations represent the same function (this can
>       happen with declarations in multiple scopes and arg-dependent lookup),
>       arbitrarily choose one.  But first make sure the default args we're
>       using match.  */

Yes, I missed this the last time around. Will fix it this time.

>
>> +  /* For multiversioned functions, aggregate all the versions here for
>> +     generating the dispatcher body later if necessary.  Check to see if
>> +     the dispatcher is already generated to avoid doing this more than
>> +     once.  */
>
>
> This caching seems to assume that you'll always be considering the same
> group of declarations, which goes back to my earlier question.


Yes, for now I want to be only considering the same group of
declarations.  I am assuming that all declarations/definitions of all
versions of foo are seen before the first call to foo. I do not want
multiple dispatcher support complexity in this iteration. Is it ok to
delay this to the next patch iteration?

Your earlier question on this was:

 "This seems to assume that all the functions in the list of
candidates are versioned, but there might be unrelated functions from
different namespaces.  Also, doing this every time someone calls a
versioned function seems like the wrong place; I would think it would
be better to build up a list of versions as you seed declarations, and
then use that list to define the dispatcher at EOF if it's needed."

I have fixed the problem of unrelated functions by always checking the
type (same_type_p) and params (comp_params) in
get_function_version_dispatcher. You talked about doing the dispatcher
building later, but I did it here since I am doing it only once.


Thanks,
-Sri.


>
> Jason
>

Reply via email to