Thanks for your suggestions and engaging response.
Based on the feedback, I think the scope of this project comprises the
following three indicative actions:
1. Creating a separate driver, i.e. a separate dump tool that uses the LTO
object API for reading the LTO file.
2. Extending LTO dump infrastructure:
GCC already seems to have dump infrastructure for pretty-printing tree
nodes, gimple statements etc. However I suppose we’d need to extend that
for dumping pass summaries? For instance, should we add a new hook, say
“dump”, to ipa_opt_pass_d that’d dump the pass summary?
3. Refactoring streaming API - Could you please elaborate more on what
improvements could be made to the streaming API? Would it be a good idea
to make it more “C++ style”, similar to the iostream interface? Also, while
going through ipa-cp/ipa-prop I noticed the following in
ipa_prop_read_functions(), which looks like some kind of “preamble” for
setting up the header to read the summary. Perhaps this could be abstracted
into the streaming API too?
  const struct lto_function_header *header =
    (const struct lto_function_header *) data;
  const int cfg_offset = sizeof (struct lto_function_header);
  const int main_offset = cfg_offset + header->cfg_size;
  const int string_offset = main_offset + header->main_size;
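To make the "abstract the preamble" idea concrete, here is a minimal, self-contained sketch of what such a helper might look like. It is not GCC code: `toy_function_header`, `section_layout` and `compute_layout` are hypothetical names, and the header is a stand-in that models only the two size fields used in the snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for lto_function_header: only the two size
// fields used by the preamble are modelled here.
struct toy_function_header
{
  int32_t cfg_size;
  int32_t main_size;
};

// Offsets of the three sub-streams that follow the header.
struct section_layout
{
  size_t cfg_offset;
  size_t main_offset;
  size_t string_offset;
};

// Compute the offsets once so that each reader does not repeat the
// arithmetic.  DATA points at LEN bytes that start with the header.
static section_layout
compute_layout (const char *data, size_t len)
{
  assert (len >= sizeof (toy_function_header));
  toy_function_header header;
  std::memcpy (&header, data, sizeof header);  // copy to avoid alignment issues
  assert (header.cfg_size >= 0 && header.main_size >= 0);

  section_layout l;
  l.cfg_offset = sizeof (toy_function_header);
  l.main_offset = l.cfg_offset + static_cast<size_t> (header.cfg_size);
  l.string_offset = l.main_offset + static_cast<size_t> (header.main_size);
  assert (l.string_offset <= len);  // string table must start inside the buffer
  return l;
}
```

A reader would then call `compute_layout (data, len)` once and use the three offsets, instead of open-coding the additions in every `*_read_section` style function.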
I would be grateful for suggestions on how to proceed further, especially
with modifying makefiles for creating the new driver. Unfortunately I have
some school exams next week and won’t be able to work much on GCC during
that time.
On Wed, Feb 28, 2018 at 4:05 PM, Martin Liška <mli...@suse.cz> wrote:
> On 02/25/2018 10:46 AM, Martin Jambor wrote:
> > Hello Hrishikesh,
> > I apologize for replying to you this late, this has been a busy week
> > and now I am traveling.
> > On Mon, Feb 19 2018, Hrishikesh Kulkarni wrote:
> >> Hi,
> >> I am Hrishikesh Kulkarni currently studying as an undergrad student in
> >> Computer Engineering at Pune University, India. I find compilers quite
> >> interesting as a subject, and would like to apply to GSoC to gain some
> >> understanding of how real-world compilers work. So far, I have managed to
> >> build gcc and perform some simple tweaks to the codebase. In particular, I
> >> would like to apply to the Textual LTO dump tool project.
> > I must say I am impressed by the research you have already done.
> > Nevertheless, please note that Ray Kim has also expressed interest in
> > the project. Martin Liska will be the mentor, so I will let him drive
> > the selection process. On the other hand, Ray also liked another
> > project, so maybe he will pick that and everyone will be happy.
> I'm really happy that there are multiple volunteers who want to work on the
> LTO dump tool project. From what I've looked at, I would like to have
> working on the project. He's got experience with C, C++ and also with the
> Python language, which can be well used for prototyping. Apart from that,
> he's spent quite some time investigating LTO internals in GCC.
> That said, may I please ask the other candidates to look for a different
> GSoC topic we offered? I believe the other topics are also interesting and
> important for the project.
> >> As far as I understand, the motivation for LTO framework was to enable
> >> cross file interprocedural optimizations, and for this purpose an IPA
> >> pass is divided into the following three stages:
> >> 1. LGEN: The pass does a local analysis of the function and generates a
> >> “summary”, i.e. the information relevant to the pass, and writes it to
> >> the object file.
> > A pass might do that, but the output of the whole stage is not just the
> > pass summaries, it also writes the function IL (the function gimple
> > statements, above all) to the object file.
> >> 2. WPA: The LTO object files are given as input to the linker, which
> >> invokes the lto1 frontend to perform global IPA analysis over the
> >> call-graph and write optimized summaries to LTO object files
> >> (partitioning). The global IPA analysis is done over summaries and not
> >> actual function bodies.
> > Well... note that partitioning actually means dividing the whole
> > compiled program/library into chunks that are then compiled
> > independently in the LTRANS stage. But you are basically right that WPA
> > does also do whole-program analysis based on summaries and then writes
> > its decisions to optimization summaries, yes.
> >> 3. LTRANS: The partitions are read back, and the function bodies are
> >> reconstructed from summary and are then compiled to produce real
> >> object files.
> > Function bodies and the summaries are distinct things. The body
> > consists of gimple statements and all the associated stuff (such as
> > types, so a lot of stuff), whereas when we refer to summaries, we mean
> > small chunks of data that interprocedural optimizations such as inlining
> > or IPA-CP squirrel away because they cannot feasibly work on bodies of the
> > entire program.
> > But apart from this terminology issue, you are basically correct, at the
> > LTRANS stage, IPA passes apply transformations to the bodies according
> > to the optimization summary generated by the WPA phase. And then, all
> > normal, intra-procedural passes and code generation runs.
> >> If I understand correctly, the motivation for the textual LTO dump tool is
> >> to easily analyze the contents of an LTO object file, similar to readelf or
> >> objdump?
> Yes. Richi defined in a previous email how that could be done.
> > That is how I understand it too, but Martin may have some further uses
> > in mind.
> >> Assume that the LTO object file contains in the pureconst section: 0b0110
> >> (0b binary prefix) corresponding to values of fs->pure_const_state and
> >> fs->state_previously_known.
> >> If I understand correctly, the output of dump tool should then be:
> >> pure_const pass:
> >>   pure_const_state = IPA_PURE (enum value of pure_const_state_e
> >>   corresponding to 0b01)
> >>   state_previously_known = IPA_NEITHER (enum value of pure_const_state_e
> >>   corresponding to 0b10)
> >> Is this the expected output of the dump tool ?
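The decoding described in the example can be sketched as a small standalone program. This is only an illustration under stated assumptions: the enumerator values follow the example (IPA_PURE is 1, IPA_NEITHER is 2), the first field is assumed to sit in the upper two bits as the example reads it, and `state_name` and `dump_pure_const` are hypothetical helpers. The bit order GCC's streamer actually uses may differ.

```cpp
#include <cassert>
#include <string>

// Enumerator order assumed from the example: IPA_PURE == 1, IPA_NEITHER == 2.
enum pure_const_state_e { IPA_CONST, IPA_PURE, IPA_NEITHER };

// Map a 2-bit value to the enumerator name (hypothetical helper).
static const char *
state_name (unsigned v)
{
  switch (v)
    {
    case IPA_CONST: return "IPA_CONST";
    case IPA_PURE: return "IPA_PURE";
    case IPA_NEITHER: return "IPA_NEITHER";
    default: return "<unknown>";
    }
}

// Assumption: pure_const_state occupies the upper two bits and
// state_previously_known the lower two, matching the example's reading.
static std::string
dump_pure_const (unsigned bits)
{
  unsigned pure_const_state = (bits >> 2) & 0x3;
  unsigned state_previously_known = bits & 0x3;

  std::string out = "pure_const pass:\n";
  out += "  pure_const_state = ";
  out += state_name (pure_const_state);
  out += "\n  state_previously_known = ";
  out += state_name (state_previously_known);
  out += "\n";
  return out;
}
```

Calling `dump_pure_const (0x6)` (that is, 0b0110) yields the two lines from the example, under the bit-order assumption above.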
> > I think the tool would have to do a bit more than just dumping summaries of
> > IPA passes. I tend to think that the task should also include dumping
> > gimple bodies (but we already do that in GCC and so it should be mostly
> > easy) and also of types (that are merged as one of the first steps of
> > WPA and interesting things happen when merging). And perhaps quite a bit
> > more. Martin?
> Yes, as we transitioned to early-debug info in LTO mode, printing trees
> that reside in the LTO stream would help us to reduce the stream in the
> future.
> >> I am reasonably familiar working with C, C++ and python. My prior
> >> experience includes opportunities to work in areas of NLP. Some of my
> >> accomplishments in the area include presenting project VicharDhara- A
> >> thought Mapper that was selected among top five ideas in Accenture
> >> Innovation Challenge among 7000 nationwide entries. My paper on this
> >> won the best paper award in IEEE Conference ICCUBEA-2017. My previous
> >> work was focused on simple parsers, student psychology, thought process
> >> detection for team selection.
> > Interesting, congratulations.
> >> In the interim, I have been through a few docs on GCC and LTO and
> >> am trying to write a toy ipa pass to better understand LTO/IPA
> >> infrastructure.
> > Great, I believe that's exactly what my advice would be.
> >> I would be grateful for feedback on the textual LTO dump
> >> tool.
> > I hope that Martin will shed a bit more light on what output he
> > envisions the tool to have. I will talk to him about it too when I get
> > back to the office (so maybe on Tuesday but probably on Wednesday).
> As mentioned above, this was outlined by Richard. The first step would be a
> write-only mode, where lto-dump will only provide verbose information
> for debugging.
> Another topic is the current LTO dumping infrastructure. I know Honza does
> like the interface. Maybe it can be improved with respect to bitpack_d and
> some generalization can be done. Honza?
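As one possible reading of "generalization" around bit packing, the sketch below shows a toy packer with typed helpers. It is explicitly not GCC's bitpack_d: `toy_bitpack` and its members are hypothetical, it holds a single 64-bit word, and it appends values starting at the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Toy bit packer, NOT GCC's bitpack_d: values are appended starting at
// the least significant bit of a single 64-bit word.
class toy_bitpack
{
public:
  // Append the low NBITS bits of VAL.
  void pack (uint64_t val, unsigned nbits)
  {
    assert (nbits > 0 && nbits < 64 && m_wpos + nbits <= 64);
    uint64_t mask = (uint64_t (1) << nbits) - 1;
    m_word |= (val & mask) << m_wpos;
    m_wpos += nbits;
  }

  // Read back the next NBITS bits, in the order they were packed.
  uint64_t unpack (unsigned nbits)
  {
    assert (nbits > 0 && nbits < 64 && m_rpos + nbits <= m_wpos);
    uint64_t mask = (uint64_t (1) << nbits) - 1;
    uint64_t val = (m_word >> m_rpos) & mask;
    m_rpos += nbits;
    return val;
  }

  // Typed helpers so that callers do not repeat the casts.
  template <typename E> void pack_enum (E val, unsigned nbits)
  { pack (static_cast<uint64_t> (val), nbits); }

  template <typename E> E unpack_enum (unsigned nbits)
  { return static_cast<E> (unpack (nbits)); }

private:
  uint64_t m_word = 0;   // packed bits
  unsigned m_wpos = 0;   // next bit position to write
  unsigned m_rpos = 0;   // next bit position to read
};
```

The point of the typed helpers is that a writer and a dumper could share one description of each field's width and type, rather than each repeating the width and the cast.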
> > Thanks,
> > Martin
> >>  http://www.ucw.cz/~hubicka/slides/labs2013.pdf
> >>  https://gcc.gnu.org/wiki/LinkTimeOptimization
> >>  https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html
> >> My two recent publications are listed below:
> >> [A] Hrishikesh Kulkarni, "Contextual Data Representation Using Prime
> >> Route Mapping Method and Ontology" IEEE Conference, ICCUBEA, 2017
> >> [B] Hrishikesh Kulkarni, “Multi-Graph based Intent Hierarchy Generation
> >> to Determine Action Sequence”, Springer Conference, ICDECT, December 2017
> >> Thanks,
> >> Hrishikesh Kulkarni