Re: int_cst_hash_table mapping persistence and the garbage collector
On 10/13/11 06:15:31, Laurynas Biveinis wrote: [...] In your case (correct me if I misunderstood something) you have one hash table, marking of which will mark more objects which are required for the correct marking of the second hash table. GC might be simply walking the second one first. Yes, I think that this accurately summarizes the situation, and the result. Any suggestions on how to fix this? It seems that one fix might be to use a non garbage-collected hash table for the hash map. - Gary
Re: int_cst_hash_table mapping persistence and the garbage collector
2011/10/13 Gary Funck g...@intrepid.com: On 10/13/11 06:15:31, Laurynas Biveinis wrote: [...] In your case (correct me if I misunderstood something) you have one hash table, marking of which will mark more objects which are required for the correct marking of the second hash table. GC might be simply walking the second one first. Yes, I think that this accurately summarizes the situation, and the result. Any suggestions on how to fix this? It seems that one fix might be to use a non garbage-collected hash table for the hash map. Is it feasible to write an if_marked function for the second hash table that would duplicate the work of the first hash table function and then some? I.e. it would determine if an entry needs to be marked based on information outside of both hash tables and independently of the first one (even if duplicating its logic). -- Laurynas
Re: int_cst_hash_table mapping persistence and the garbage collector
On 10/11/11 11:05:18, Eric Botcazou wrote: One easy way to address the current issue is to call tree_int_cst_equal() if the integer constant tree pointers do not match: if ((c1 != c2) !tree_int_cst_equal (c1, c2)) /* integer constants aren't equal. */ You have two objects C1 and C2 for the same constant and you're comparing them. One was created first, say C1. If C1 was still live when C2 was created, why was C2 created in the first class? If C1 wasn't live anymore when C2 was created, why are you still using C1 here? Eric, this note provides some more detail in addition to my earlier reply to Richard. The problem is that the references to object C1 and C2 live in a hash table, and that although the referenced nodes will be retained by the garbage collector, their mapping in int_cst_hash_table is deleted by the GC. Thus, if we follow the diagram: tree (type) - [ upc_block_factor_for_type ] - tree (integer constant) tree (integer constant) - [ int_cst_hash_table ] {unique map} - tree (integer constant) Given two tree nodes, P (prototype) and F (function) they declare a parameter that is pointer to a UPC shared object and this pointer is declared with a UPC blocking factor of 1000. Without garbage collection, the mappings look like this: P.param - C1, F.param - C1 where C1 is an integer constant of the form (sizetype, 1000). but when GC kicks in, it decides that the hash table entry in int_cst_hash_table can be deleted because it doesn't think that C1 is live. Therefore the next attempt to map (sizetype, 1000) will yield a new integral constant tree node, C2. Then the mapping changes to: P.param - C1, F.param - C2, and we can no longer use TYPE_UPC_BLOCKING_FACTOR (P.param) == TYPE_UPC_BLOCKING_FACTOR (F.param) to check that the blocking factors of P.param and F.param are equal. For the GC to know that int_cst_hash_table entry is needed, perhaps the upc_block_factor_for_type needs to be traversed, and then each integer constant needs to be marked, or the constant has to be hashed into int_cst_hash_table and the actual hash table entry has to be marked. I am not familiar with the details of garbage collection and pretty much just try use existing code as a model. Apparently, this sequence of statements is insufficient to tell the GC that it should mark the integer constants referenced in this hash table as in use. static GTY ((if_marked (tree_map_marked_p), param_is (struct tree_map))) htab_t upc_block_factor_for_type; [...] upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash tree_map_eq, 0); Reading the GC code: static int ggc_htab_delete (void **slot, void *info) { const struct ggc_cache_tab *r = (const struct ggc_cache_tab *) info; if (! (*r-marked_p) (*slot)) htab_clear_slot (*r-base, slot); else (*r-cb) (*slot); return 1; } It appears that the int_cst_hash_table entry for C1 needs to be marked or it will be cleared. I don't know how to set things up so that so that the garbage collecting mechanisms are in place to do that, and was hoping that tree_map_hash table would provide the required mechanisms. Apparently, this is not the case. I had hoped that this declaration would be sufficient to convince the GC to consider all mapped integer constant nodes to be live. If not, then perhaps I need a GC hook associated with upc_block_factor_for_type that does something like the following: for t (each used upc_block_factor_for_type entry): c = t.to # the mapped integer constant if is_integer_constant (c): h = int_cst_hash_table.hash(c) gc_mark_htab (int_cst_hash_table, h) or perhaps this is sufficient? for t (each used upc_block_factor_for_type entry): c = t.to gc_mark_tree_node (c) However, I thought that this would already have been done automatically by the GC hash tree implementation. If either of those methods are required, I would appreciate suggestions/pointers/code that would help me make sure that this approach is implemented correctly. thanks, - Gary
Re: int_cst_hash_table mapping persistence and the garbage collector
The problem is that the references to object C1 and C2 live in a hash table, and that although the referenced nodes will be retained by the garbage collector, their mapping in int_cst_hash_table is deleted by the GC. This isn't a simple hash table, is it? I am not familiar with the details of garbage collection and pretty much just try use existing code as a model. Apparently, this sequence of statements is insufficient to tell the GC that it should mark the integer constants referenced in this hash table as in use. static GTY ((if_marked (tree_map_marked_p), param_is (struct tree_map))) htab_t upc_block_factor_for_type; [...] upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash tree_map_eq, 0); This is a map so it is garbage-collected as a map: if the key isn't marked, then the value isn't either. Hence 2 questions: - why using a map and not a simple hash table? - what is the key and why isn't it marked? -- Eric Botcazou
Re: int_cst_hash_table mapping persistence and the garbage collector
It maps a type node into a corresponding integer node that is the blocking factor associated with the type node. Before the advent of this hash table the blocking factor was stored in a dedicated field in the tree type node. The suggestion was made to move this into a hash table to save space. I chose the tree map hash table because I thought it could do the job. So this isn't a simple hash table since this is a map. A simple hash table doesn't store the key in the slot, only the value; a map does. The keys are valid. In the example discussed in this thread, there is a pointer to type node that used in a parameter declaration of a function prototype and also in the similarly named parameter of the function definition. Both tree pointers are used as keys, and they are live at the point that the GC runs. But somehow they aren't marked by the GC. You need to find out why, since the value will be kept only if the key is already marked by the GC. By the time a GC pass is run, all trees to be kept must be linked to a GC root. You said that the pass was run between the point that the function prototype tree node was created and the point at which the function declaration was processed. To which GC root are the keys linked between these points? -- Eric Botcazou
Re: int_cst_hash_table mapping persistence and the garbage collector
On Wed, Oct 12, 2011 at 10:29 AM, Eric Botcazou ebotca...@adacore.com wrote: It maps a type node into a corresponding integer node that is the blocking factor associated with the type node. Before the advent of this hash table the blocking factor was stored in a dedicated field in the tree type node. The suggestion was made to move this into a hash table to save space. I chose the tree map hash table because I thought it could do the job. So this isn't a simple hash table since this is a map. A simple hash table doesn't store the key in the slot, only the value; a map does. The keys are valid. In the example discussed in this thread, there is a pointer to type node that used in a parameter declaration of a function prototype and also in the similarly named parameter of the function definition. Both tree pointers are used as keys, and they are live at the point that the GC runs. But somehow they aren't marked by the GC. You need to find out why, since the value will be kept only if the key is already marked by the GC. By the time a GC pass is run, all trees to be kept must be linked to a GC root. You said that the pass was run between the point that the function prototype tree node was created and the point at which the function declaration was processed. To which GC root are the keys linked between these points? I think there is an issue when two cache htabs refer to each other with respect to GC, you might search the list to find out more. Richard. -- Eric Botcazou
Re: int_cst_hash_table mapping persistence and the garbage collector
On 10/12/11 14:00:54, Richard Guenther wrote: I think there is an issue when two cache htabs refer to each other with respect to GC, you might search the list to find out more. Richard, thanks. I thought that might be the case, but I don't understand the GC well enough to make this determination. - Gary
Re: int_cst_hash_table mapping persistence and the garbage collector
I think there is an issue when two cache htabs refer to each other with respect to GC, you might search the list to find out more. I'm not sure this is the case here, there seems to be a clear hierarchy. -- Eric Botcazou
Re: int_cst_hash_table mapping persistence and the garbage collector
2011/10/11 Gary Funck g...@intrepid.com: static GTY ((if_marked (tree_map_marked_p), param_is (struct tree_map))) htab_t upc_block_factor_for_type; [...] upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash, tree_map_eq, 0); I had hoped that this would be sufficient to ensure that all integer constant references recorded in this hash table would be considered live by the GC. Reading the code in tree_map_marked_p(), however, I see the following: #define tree_map_marked_p tree_map_base_marked_p [...] /* Return true if this tree map structure is marked for garbage collection purposes. We simply return true if the from tree is marked, so that this structure goes away when the from tree goes away. */ int tree_map_base_marked_p (const void *p) { return ggc_marked_p (((const struct tree_map_base *) p)-from); } This takes care of recycling an entry when the '-from' reference goes away, but it doesn't make sure that the '-to' reference is considered live. I don't understand the GC well enough to know when/where the '-to' entry should be marked as live. If -from is marked, then GC will mark the whole hash table entry, including the tree_map::to field. Currently I believe that your issue is one hash table pointing to another one. Basically all the objects that need to be live must be known before the hash table marking starts, either by being already marked in that GC run, either by having mark bits set somewhere where if_marked option will check them correctly. In your case (correct me if I misunderstood something) you have one hash table, marking of which will mark more objects which are required for the correct marking of the second hash table. GC might be simply walking the second one first. -- Laurynas
Re: int_cst_hash_table mapping persistence and the garbage collector
On Mon, Oct 10, 2011 at 7:02 PM, Gary Funck g...@intrepid.com wrote: Recently, a few UPC test programs failed to compile due to mis-matches of parameters in a prototype and its corresponding function definition. The mis-match was based upon the apparent inequality of UPC layout qualifiers (blocking factors). UPC blocking factors are integer constants. They are recorded in a hash table indexed by the type tree node that they correspond to. Currently, the test for equality of blocking factors tests only the pointer to the tree node defining the constant. All blocking factors are recorded as sizetype type'd nodes. Given that integer constants are hashed by type/value, it seemed safe to assume that a given blocking factor would map to a single tree node due to the underlying hash method that is used when integral constants are created. Is it valid to assume that pointer equality is sufficient to ensure that two integer constants are equal as long as their type and values are equal? The bug that we ran into occurred because a garbage collection pass was run between the point that the function prototype tree node was created and the point at which the function declaration was processed. The garbage collector decided that the integer constant representing the blocking factor was no longer in use, because it had not been marked. In fact, the integer constant was needed because it appeared in the blocking factor hash table, but not via a direct pointer. Rather it was referenced by nature of the fact that the blocking factor hash table referenced the integer constant that is mapped in the integer constant hash table. Here's a rough diagram: tree (type) - [ blocking factor hash ] - tree (integer constant) tree (integer constant) - [ integer constant hash ] {unique map} - tree (integer constant) When the garbage collector deleted the entry from the integer constant hash, it forced a new integer constant tree node to be created for the same (type, value) integral constant blocking factor. One easy way to address the current issue is to call tree_int_cst_equal() if the integer constant tree pointers do not match: if ((c1 != c2) !tree_int_cst_equal (c1, c2)) /* integer constants aren't equal. */ This may be necessary if 'int_cst_hash_table' is viewed as a cache rather than a persistent, unique mapping. Another approach, would be to somehow mark the node in int_cst_hash_table as in use when the blocking factor hash table is traversed by the garbage collector, or to add logic the hash table delete function associated with int_cst_hash_table; to dis-allow the delete if the integer constant is present in the UPC blocking factor hash table. To effect this change in a modular way probably the hash table delete function associated with 'int_cst_hash_table' would have to be daisy-chained, where the UPC blocking factor check is made first. The difficulty with implementing the daisy chaining is that int_cst_hash_table needs to exist before the UPC-related initialization code is run. One way to handle this might be yet another language hook, called from the code that creates 'int_cst_hash_table'. That seems overly complex. For reference, the current blocking factor mapping table is created as follows: upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash, tree_map_eq, 0); Summary: 1. Is it valid to assume that pointer equality is sufficient to compare two integer constants for equality as long as they have identical type and value? Yes, if both constants are live 2. Should 'int_cst_hash_table' be viewed as a cache, where the mapping of a given (type, value) integer constant may vary over time? Yes, if a constant is unused it may get collected and re-allocated later. Cannot be observed from any valid use of 1. 3. If the answer to 1. is yes and the answer to 2. is no then what is the recommended way to ensure that nodes in 'int_cst_hash_table' are not removed if the integer constant is being referenced via the 'upc_block_factor_for_type' hash table? You need to ensure the constants are marked properly. Richard. thanks, - Gary
Re: int_cst_hash_table mapping persistence and the garbage collector
In fact, the integer constant was needed because it appeared in the blocking factor hash table, but not via a direct pointer. Rather it was referenced by nature of the fact that the blocking factor hash table referenced the integer constant that is mapped in the integer constant hash table. You'd need to elaborate here: what does by nature of the fact that mean? When the garbage collector deleted the entry from the integer constant hash, it forced a new integer constant tree node to be created for the same (type, value) integral constant blocking factor. One easy way to address the current issue is to call tree_int_cst_equal() if the integer constant tree pointers do not match: if ((c1 != c2) !tree_int_cst_equal (c1, c2)) /* integer constants aren't equal. */ You have two objects C1 and C2 for the same constant and you're comparing them. One was created first, say C1. If C1 was still live when C2 was created, why was C2 created in the first class? If C1 wasn't live anymore when C2 was created, why are you still using C1 here? -- Eric Botcazou
Re: int_cst_hash_table mapping persistence and the garbage collector
On 10/11/11 10:24:52, Richard Guenther wrote: GF: 1. Is it valid to assume that pointer equality is sufficient GF: to compare two integer constants for equality as long as they GF: have identical type and value? Yes, if both constants are live The upc blocking factor hash table is declared as follows: static GTY ((if_marked (tree_map_marked_p), param_is (struct tree_map))) htab_t upc_block_factor_for_type; [...] upc_block_factor_for_type = htab_create_ggc (512, tree_map_hash, tree_map_eq, 0); I had hoped that this would be sufficient to ensure that all integer constant references recorded in this hash table would be considered live by the GC. Reading the code in tree_map_marked_p(), however, I see the following: #define tree_map_marked_p tree_map_base_marked_p [...] /* Return true if this tree map structure is marked for garbage collection purposes. We simply return true if the from tree is marked, so that this structure goes away when the from tree goes away. */ int tree_map_base_marked_p (const void *p) { return ggc_marked_p (((const struct tree_map_base *) p)-from); } This takes care of recycling an entry when the '-from' reference goes away, but it doesn't make sure that the '-to' reference is considered live. I don't understand the GC well enough to know when/where the '-to' entry should be marked as live. (note: in the cited test case, the -from pointers in question are known to be live and did survive garbage collection.) Given that the declaration above tells the GC that the nodes in the blocking factor hash table are of type 'struct tree_map', struct GTY(()) tree_map_base { tree from; }; /* Map from a tree to another tree. */ struct GTY(()) tree_map { struct tree_map_base base; unsigned int hash; tree to; }; I thought that the GC would mark the -to nodes as live automatically? (note: probably the only direct reference to the integer constant that is the focus of this discussion is in the upc_block_factor_for_type hash table. Therefore, if it isn't seen as live there, it won't be seen as live anywhere else.) I suppose that I could declare a linear tree list of mapped integer constants and let the GC walk that, but that is more of a hack than a solution. - Gary