Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-13 Thread Gary Funck
On 10/13/11 06:15:31, Laurynas Biveinis wrote:
 [...] In your case (correct me if I misunderstood something)
 you have one hash table, marking of which will mark more objects
 which are required for the correct marking of the second hash table.
 GC might be simply walking the second one first.

Yes, I think that this accurately summarizes the situation, and the result.

Any suggestions on how to fix this?  It seems that one fix might
be to use a non garbage-collected hash table for the hash map.

- Gary


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-13 Thread Laurynas Biveinis
2011/10/13 Gary Funck g...@intrepid.com:
 On 10/13/11 06:15:31, Laurynas Biveinis wrote:
 [...] In your case (correct me if I misunderstood something)
 you have one hash table, marking of which will mark more objects
 which are required for the correct marking of the second hash table.
 GC might be simply walking the second one first.

 Yes, I think that this accurately summarizes the situation, and the result.

 Any suggestions on how to fix this?  It seems that one fix might
 be to use a non garbage-collected hash table for the hash map.

Is it feasible to write an if_marked function for the second hash
table that would duplicate the work of the first hash table function
and then some? I.e. it would determine if an entry needs to be marked
based on information outside of both hash tables and independently of
the first one (even if duplicating its logic).


-- 
Laurynas


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Gary Funck
On 10/11/11 11:05:18, Eric Botcazou wrote:
  One easy way to address the current issue is to call
  tree_int_cst_equal() if the integer constant tree pointers
  do not match:
 
  if ((c1 != c2)  !tree_int_cst_equal (c1, c2))
/* integer constants aren't equal.  */
 
 You have two objects C1 and C2 for the same constant and you're comparing
 them. One was created first, say C1.  If C1 was still live when C2 was
 created, why was C2 created in the first class?  If C1 wasn't live
 anymore when C2 was created, why are you still using C1 here?

Eric, this note provides some more detail in addition to my
earlier reply to Richard.

The problem is that the references to object C1 and C2 live in
a hash table, and that although the referenced nodes will be retained
by the garbage collector, their mapping in int_cst_hash_table is
deleted by the GC.

Thus, if we follow the diagram:

 tree (type) - [ upc_block_factor_for_type ]
 - tree (integer constant)
 tree (integer constant) - [ int_cst_hash_table ] {unique map}
 - tree (integer constant)

Given two tree nodes, P (prototype) and F (function) they declare
a parameter that is pointer to a UPC shared object and this pointer
is declared with a UPC blocking factor of 1000.  Without garbage
collection, the mappings look like this:

 P.param - C1, F.param - C1 where C1 is an integer constant of
   the form (sizetype, 1000).

but when GC kicks in, it decides that the hash table entry
in int_cst_hash_table can be deleted because it doesn't think
that C1 is live.  Therefore the next attempt to map (sizetype, 1000)
will yield a new integral constant tree node, C2.  Then the mapping
changes to: P.param - C1, F.param - C2, and we can no longer use

  TYPE_UPC_BLOCKING_FACTOR (P.param) == TYPE_UPC_BLOCKING_FACTOR (F.param)

to check that the blocking factors of P.param and F.param are equal.

For the GC to know that int_cst_hash_table
entry is needed, perhaps the upc_block_factor_for_type needs
to be traversed, and then each integer constant needs to be
marked, or the constant has to be hashed into int_cst_hash_table
and the actual hash table entry has to be marked.

I am not familiar with the details of garbage collection and
pretty much just try use existing code as a model.  Apparently,
this sequence of statements is insufficient to tell the GC that it should
mark the integer constants referenced in this hash table as in use.

  static GTY ((if_marked (tree_map_marked_p),
 param_is (struct tree_map)))
   htab_t upc_block_factor_for_type;
[...]
 upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash
  tree_map_eq, 0);

Reading the GC code:

static int
ggc_htab_delete (void **slot, void *info)
{
  const struct ggc_cache_tab *r = (const struct ggc_cache_tab *) info;

  if (! (*r-marked_p) (*slot))
htab_clear_slot (*r-base, slot);
  else
(*r-cb) (*slot);

  return 1;
}

It appears that the int_cst_hash_table entry for C1 needs to be
marked or it will be cleared.  I don't know how to set things
up so that so that the garbage collecting mechanisms are in
place to do that, and was hoping that tree_map_hash table
would provide the required mechanisms.  Apparently, this is
not the case.

I had hoped that this declaration would be sufficient to convince
the GC to consider all mapped integer constant nodes to be live.
If not, then perhaps I need a GC hook associated with
upc_block_factor_for_type that does something like the following:

  for t (each used upc_block_factor_for_type entry):
c = t.to  # the mapped integer constant
if is_integer_constant (c):
  h = int_cst_hash_table.hash(c)
  gc_mark_htab (int_cst_hash_table, h)

or perhaps this is sufficient?

  for t (each used upc_block_factor_for_type entry):
c = t.to
gc_mark_tree_node (c)

However, I thought that this would already have been done
automatically by the GC hash tree implementation.

If either of those methods are required, I would appreciate
suggestions/pointers/code that would help me make sure that
this approach is implemented correctly.

thanks,
- Gary


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Eric Botcazou
 The problem is that the references to object C1 and C2 live in
 a hash table, and that although the referenced nodes will be retained
 by the garbage collector, their mapping in int_cst_hash_table is
 deleted by the GC.

This isn't a simple hash table, is it?

 I am not familiar with the details of garbage collection and
 pretty much just try use existing code as a model.  Apparently,
 this sequence of statements is insufficient to tell the GC that it should
 mark the integer constants referenced in this hash table as in use.

   static GTY ((if_marked (tree_map_marked_p),
param_is (struct tree_map)))
htab_t upc_block_factor_for_type;
 [...]
  upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash
   tree_map_eq, 0);

This is a map so it is garbage-collected as a map: if the key isn't marked, 
then the value isn't either.  Hence 2 questions:
 - why using a map and not a simple hash table?
 - what is the key and why isn't it marked?

-- 
Eric Botcazou


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Eric Botcazou
 It maps a type node into a corresponding integer node that is
 the blocking factor associated with the type node.  Before
 the advent of this hash table the blocking factor was stored
 in a dedicated field in the tree type node.  The suggestion was
 made to move this into a hash table to save space.  I chose
 the tree map hash table because I thought it could do the job.

So this isn't a simple hash table since this is a map.  A simple hash table 
doesn't store the key in the slot, only the value; a map does.

 The keys are valid.  In the example discussed in this thread,
 there is a pointer to type node that used in a parameter declaration
 of a function prototype and also in the similarly named parameter
 of the function definition.  Both tree pointers are used as keys,
 and they are live at the point that the GC runs.

But somehow they aren't marked by the GC.  You need to find out why, since the 
value will be kept only if the key is already marked by the GC.

By the time a GC pass is run, all trees to be kept must be linked to a GC root.
You said that the pass was run between the point that the function prototype
tree node was created and the point at which the function declaration was 
processed.  To which GC root are the keys linked between these points?

-- 
Eric Botcazou


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Richard Guenther
On Wed, Oct 12, 2011 at 10:29 AM, Eric Botcazou ebotca...@adacore.com wrote:
 It maps a type node into a corresponding integer node that is
 the blocking factor associated with the type node.  Before
 the advent of this hash table the blocking factor was stored
 in a dedicated field in the tree type node.  The suggestion was
 made to move this into a hash table to save space.  I chose
 the tree map hash table because I thought it could do the job.

 So this isn't a simple hash table since this is a map.  A simple hash table
 doesn't store the key in the slot, only the value; a map does.

 The keys are valid.  In the example discussed in this thread,
 there is a pointer to type node that used in a parameter declaration
 of a function prototype and also in the similarly named parameter
 of the function definition.  Both tree pointers are used as keys,
 and they are live at the point that the GC runs.

 But somehow they aren't marked by the GC.  You need to find out why, since the
 value will be kept only if the key is already marked by the GC.

 By the time a GC pass is run, all trees to be kept must be linked to a GC 
 root.
 You said that the pass was run between the point that the function prototype
 tree node was created and the point at which the function declaration was
 processed.  To which GC root are the keys linked between these points?

I think there is an issue when two cache htabs refer to each other
with respect to GC, you might search the list to find out more.

Richard.

 --
 Eric Botcazou



Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Gary Funck
On 10/12/11 14:00:54, Richard Guenther wrote:
 I think there is an issue when two cache htabs refer to each other
 with respect to GC, you might search the list to find out more.

Richard, thanks.  I thought that might be the case, but
I don't understand the GC well enough to make this determination.

- Gary


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Eric Botcazou
 I think there is an issue when two cache htabs refer to each other
 with respect to GC, you might search the list to find out more.

I'm not sure this is the case here, there seems to be a clear hierarchy.

-- 
Eric Botcazou


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-12 Thread Laurynas Biveinis
2011/10/11 Gary Funck g...@intrepid.com:
 static GTY ((if_marked (tree_map_marked_p),
           param_is (struct tree_map)))
     htab_t upc_block_factor_for_type;
 [...]
  upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
                                               tree_map_eq, 0);

 I had hoped that this would be sufficient to ensure that all
 integer constant references recorded in this hash table would
 be considered live by the GC.  Reading the code in tree_map_marked_p(),
 however, I see the following:

 #define tree_map_marked_p tree_map_base_marked_p
 [...]
 /* Return true if this tree map structure is marked for garbage collection
   purposes.  We simply return true if the from tree is marked, so that this
   structure goes away when the from tree goes away.  */

 int
 tree_map_base_marked_p (const void *p)
 {
  return ggc_marked_p (((const struct tree_map_base *) p)-from);
 }

 This takes care of recycling an entry when the '-from' reference
 goes away, but it doesn't make sure that the '-to' reference is
 considered live.  I don't understand the GC well enough to
 know when/where the '-to' entry should be marked as live.

If -from is marked, then GC will mark the whole hash table entry,
including the tree_map::to field.

Currently I believe that your issue is one hash table pointing to
another one.  Basically all the objects that need to be live must be
known before the hash table marking starts, either by being already
marked in that GC run, either by having mark bits set somewhere where
if_marked option will check them correctly. In your case (correct me
if I misunderstood something) you have one hash table, marking of
which will mark more objects which are required for the correct
marking of the second hash table. GC might be simply walking the
second one first.

-- 
Laurynas


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-11 Thread Richard Guenther
On Mon, Oct 10, 2011 at 7:02 PM, Gary Funck g...@intrepid.com wrote:
 Recently, a few UPC test programs failed to compile
 due to mis-matches of parameters in a prototype and its
 corresponding function definition.  The mis-match was
 based upon the apparent inequality of UPC layout qualifiers
 (blocking factors).

 UPC blocking factors are integer constants.  They are
 recorded in a hash table indexed by the type tree node
 that they correspond to.

 Currently, the test for equality of blocking factors
 tests only the pointer to the tree node defining the constant.
 All blocking factors are recorded as sizetype type'd nodes.
 Given that integer constants are hashed by type/value, it seemed
 safe to assume that a given blocking factor would map to
 a single tree node due to the underlying hash method that is used
 when integral constants are created.

 Is it valid to assume that pointer equality is sufficient
 to ensure that two integer constants are equal as long
 as their type and values are equal?

 The bug that we ran into occurred because a garbage collection
 pass was run between the point that the function prototype
 tree node was created and the point at which the function declaration
 was processed.  The garbage collector decided that the integer
 constant representing the blocking factor was no longer in use,
 because it had not been marked.

 In fact, the integer constant was needed because it appeared
 in the blocking factor hash table, but not via a direct pointer.
 Rather it was referenced by nature of the fact that
 the blocking factor hash table referenced the integer constant
 that is mapped in the integer constant hash table.

 Here's a rough diagram:

   tree (type) - [ blocking factor hash ] - tree (integer constant)
   tree (integer constant) - [ integer constant hash ] {unique map}
                              - tree (integer constant)

 When the garbage collector deleted the entry from the
 integer constant hash, it forced a new integer constant tree
 node to be created for the same (type, value) integral constant
 blocking factor.

 One easy way to address the current issue is to call
 tree_int_cst_equal() if the integer constant tree pointers
 do not match:

    if ((c1 != c2)  !tree_int_cst_equal (c1, c2))
      /* integer constants aren't equal.  */

 This may be necessary if 'int_cst_hash_table' is viewed as
 a cache rather than a persistent, unique mapping.

 Another approach, would be to somehow mark the node
 in int_cst_hash_table as in use when the blocking factor
 hash table is traversed by the garbage collector, or
 to add logic the hash table delete function associated
 with int_cst_hash_table; to dis-allow the delete if the
 integer constant is present in the UPC blocking factor
 hash table.

 To effect this change in a modular way probably the hash table
 delete function associated with 'int_cst_hash_table' would have
 to be daisy-chained, where the UPC blocking factor check is made
 first.  The difficulty with implementing the daisy chaining is that
 int_cst_hash_table needs to exist before the UPC-related initialization
 code is run.  One way to handle this might be yet another language
 hook, called from the code that creates 'int_cst_hash_table'.
 That seems overly complex.

 For reference, the current blocking factor mapping table
 is created as follows:

  upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
                                               tree_map_eq, 0);

 Summary:

 1. Is it valid to assume that pointer equality is sufficient
 to compare two integer constants for equality as long as they
 have identical type and value?

Yes, if both constants are live

 2. Should 'int_cst_hash_table' be viewed as a cache, where
 the mapping of a given (type, value) integer constant may
 vary over time?

Yes, if a constant is unused it may get collected and re-allocated
later.  Cannot be observed from any valid use of 1.

 3. If the answer to 1. is yes and the answer to 2. is no
 then what is the recommended way to ensure that nodes in
 'int_cst_hash_table' are not removed if the integer constant
 is being referenced via the 'upc_block_factor_for_type'
 hash table?

You need to ensure the constants are marked properly.

Richard.

 thanks,
 - Gary



Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-11 Thread Eric Botcazou
 In fact, the integer constant was needed because it appeared
 in the blocking factor hash table, but not via a direct pointer.
 Rather it was referenced by nature of the fact that
 the blocking factor hash table referenced the integer constant
 that is mapped in the integer constant hash table.

You'd need to elaborate here: what does by nature of the fact that mean?

 When the garbage collector deleted the entry from the
 integer constant hash, it forced a new integer constant tree
 node to be created for the same (type, value) integral constant
 blocking factor.

 One easy way to address the current issue is to call
 tree_int_cst_equal() if the integer constant tree pointers
 do not match:

 if ((c1 != c2)  !tree_int_cst_equal (c1, c2))
   /* integer constants aren't equal.  */

You have two objects C1 and C2 for the same constant and you're comparing them.
One was created first, say C1.  If C1 was still live when C2 was created, why 
was C2 created in the first class?  If C1 wasn't live anymore when C2 was 
created, why are you still using C1 here?

-- 
Eric Botcazou


Re: int_cst_hash_table mapping persistence and the garbage collector

2011-10-11 Thread Gary Funck
On 10/11/11 10:24:52, Richard Guenther wrote:
 GF: 1. Is it valid to assume that pointer equality is sufficient
 GF: to compare two integer constants for equality as long as they
 GF: have identical type and value?
 
 Yes, if both constants are live

The upc blocking factor hash table is declared as follows:

static GTY ((if_marked (tree_map_marked_p),
   param_is (struct tree_map)))
 htab_t upc_block_factor_for_type;
[...]
  upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash,
   tree_map_eq, 0);

I had hoped that this would be sufficient to ensure that all
integer constant references recorded in this hash table would
be considered live by the GC.  Reading the code in tree_map_marked_p(),
however, I see the following:

#define tree_map_marked_p tree_map_base_marked_p
[...]
/* Return true if this tree map structure is marked for garbage collection
   purposes.  We simply return true if the from tree is marked, so that this
   structure goes away when the from tree goes away.  */

int
tree_map_base_marked_p (const void *p)
{ 
  return ggc_marked_p (((const struct tree_map_base *) p)-from);
}

This takes care of recycling an entry when the '-from' reference
goes away, but it doesn't make sure that the '-to' reference is
considered live.  I don't understand the GC well enough to
know when/where the '-to' entry should be marked as live.

(note: in the cited test case, the -from pointers in question
are known to be live and did survive garbage collection.)

Given that the declaration above tells the GC that the nodes
in the blocking factor hash table are of type 'struct tree_map',

struct GTY(()) tree_map_base {
  tree from;
};


/* Map from a tree to another tree.  */

struct GTY(()) tree_map {
  struct tree_map_base base;
  unsigned int hash;
  tree to;
};

I thought that the GC would mark the -to nodes as
live automatically?  (note: probably the only direct
reference to the integer constant that is the focus of
this discussion is in the upc_block_factor_for_type hash table.
Therefore, if it isn't seen as live there, it won't be seen
as live anywhere else.)

I suppose that I could declare a linear tree list of mapped integer
constants and let the GC walk that, but that is more of a hack than a solution.

- Gary