On Thu, Sep 13, 2012 at 8:02 PM, Richard Guenther
<richard.guent...@gmail.com> wrote:
> On Wed, Sep 12, 2012 at 6:39 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Wed, Sep 12, 2012 at 2:13 AM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>>> On Wed, Sep 12, 2012 at 7:06 AM, Dehao Chen <de...@google.com> wrote:
>>>> Now I think we are facing a more complex problem. The data structure
>>>> we use to store the location_adhoc_data are file-static in linemap.c
>>>> in libcpp. These data structures are not guarded by GTY(()).
>>>> Meanwhile, as we have removed the block data structure from
>>>> gimple.gsbase as well as tree.exp (encoding them into an location_t).
>>>> This could cause block being GCed and the LOCATION_BLOCK becoming
>>>> dangling pointers.
>>>
>>> Uh.  Note that it is quite important that we are able to garbage-collect 
>>> unused
>>> BLOCKs, this is the whole point of removing unused BLOCK scopes in
>>> remove_unused_locals.  So this indeed becomes much more complicated ...
>>> What would be desired is that the garbage collector can NULL an entry in
>>> the mapping table when it is not referenced in any other way (that other
>>> reference would be the BLOCK tree as stored in a FUNCTION_DECLs 
>>> DECL_INITIAL).
>>
>> It would be nice to GC those unused BLOCKS. I wonder how many BLOCKS
>> are created for a large C++ program. This patch saves memory by
>> shrinking tree size, is it a net win or loss without GC those BLOCKS?
>
> Memory usage issues pop up with C++ code using expression templates
> (try BOOST MPL or tramp3d or some larger spirit testcases).  Inlining

I compared the memory consumption for tramp3d, the patched version has
a peak of 504065kB, while non-patched version has a peak of 491853kB.

> creates tons of "empty" BLOCK trees that just wrap others.  It is important
> to be able to GC those.  Now, it might be that no expression / location
> which references the BLOCK survives, and if the line-table is not scanned

Those non-used blocks will still be GCed in this patch.

> by GC then we will just end up with never re-usable entries (the BLOCK address
> may get re-used - can we get false sharing here?)

That is true (memory wasted). However, in the tramp3d case, only
409600 entries are allocated in location_adhoc_data (4.9MB, 1% of the
peak mem consumption). Thus the wasted entry should not be
significant.

Concerning re-used BLOCK, if the block address and the location are
the same, the previously allocated entry will be reused. But it'll not
affect the correctness.

Thanks,
Dehao

>
> Richard.
>
>> thanks,
>>
>> David
>>
>>
>>>
>>>> I tried to manipulate GTY to make it recognize the LOCATION_BLOCK from
>>>> gimple.gsbase.location. However, neigher nested_ptr nor mark_hook can
>>>> help me.
>>>>
>>>> Another approach would be guard the location_adhoc_data and related
>>>> data structures in GTY(()). However, this is non-trivial because tree
>>>> is not visible in libcpp. At the same time, my implementation heavily
>>>> relies on hashtable to make the code efficient, thus it's quite tricky
>>>> to make "param_is" and "use_params" work.
>>>>
>>>> The final approach, which I'll try tomorrow, would be move all my
>>>> implementation from libcpp to gcc, and guard them with GTY(()). I
>>>> still haven't thought of any potential problem of this approach. Any
>>>> comments?
>>>
>>> I think moving the mapping to GC in a lazy manner as I described above
>>> would be the way to go.  For hashtables GC already supports if_marked,
>>> not sure if similar support is available for arrays/vecs.
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Dehao
>>>>
>>>> On Tue, Sep 11, 2012 at 9:00 AM, Dehao Chen <de...@google.com> wrote:
>>>>> I saw comments in tree-streamer-out.c:
>>>>>
>>>>>   /* Do not stream BLOCK_SOURCE_LOCATION.  We cannot handle debug 
>>>>> information
>>>>>      for early inlining so drop it on the floor instead of ICEing in
>>>>>      dwarf2out.c.  */
>>>>>   streamer_write_chain (ob, BLOCK_VARS (expr), ref_p);
>>>>>
>>>>> However, what the code is doing seemed contradictory with the comment.
>>>>> Or am I missing something?
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz <m...@suse.de> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Tue, 11 Sep 2012, Dehao Chen wrote:
>>>>>>
>>>>>>> Looks like we have two choices:
>>>>>>>
>>>>>>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t)
>>>>>>
>>>>>> This will actually not work correctly in some cases.  The problem is, if
>>>>>> the prevailing decl is already part of another chain (say in another
>>>>>> block_var list) you would break the current chain.  Hence block vars need
>>>>>> special handling in the lto streamer (another reason why tree_chain is 
>>>>>> not
>>>>>> the most clever think to use for this chain).  This problem area needs to
>>>>>> be solved somehow if block info is to be preserved correctly.
>>>>>>
>>>>>>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL
>>>>>>> (TREE_CHAIN (t)).
>>>>>>
>>>>>> That's also a large hammer as it basically will mean no debug info after
>>>>>> LTO :-/ Sigh, at this point I have no good solution that doesn't involve
>>>>>> quite some work, perhaps your hack is good enough for the time being,
>>>>>> though I hate it :)
>>>>>
>>>>> I got it. Then I'll keep the patch as it is (remove the
>>>>> LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and
>>>>> then we should be good to check in?
>>>>>
>>>>> Thanks,
>>>>> Dehao
>>>>>
>>>>>>
>>>>>>
>>>>>> Ciao,
>>>>>> Michael.

Reply via email to