Re: Understanding tree_swap_operands_p wrt SSA name versions
On Tue, Jul 17, 2018 at 5:20 AM Michael Ploujnikov wrote: > > On 2018-07-16 04:30 AM, Richard Biener wrote: > > On Mon, Jul 16, 2018 at 12:19 AM Michael Ploujnikov > > wrote: > >> > >> On 2018-07-04 04:52 AM, Richard Biener wrote: > >>> On Tue, Jul 3, 2018 at 9:09 PM Jeff Law wrote: > > On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: > > On 2018-07-03 12:46 PM, Richard Biener wrote: > >> On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov > >> wrote: > >>> On 2018-06-20 04:23 AM, Richard Biener wrote: > On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > > > > On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > >> Hi everyone, > >> > >> (I hope this is the right place to ask, if not my apologies; please > >> point me in the right direction) > >> > >> I'm trying to get a better understanding of the following part in > >> tree_swap_operands_p(): > >> > >> /* It is preferable to swap two SSA_NAME to ensure a canonical > >>> form > >> for commutative and comparison operators. Ensuring a > >>> canonical > >> form allows the optimizers to find additional redundancies > >>> without > >> having to explicitly check for both orderings. */ > >> if (TREE_CODE (arg0) == SSA_NAME > >> && TREE_CODE (arg1) == SSA_NAME > >> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > >> return 1; > >> > >> My questions in no particular order: It looks like this was added > >>> in > >> 2004. I couldn't find any info other than what's in the > >>> corresponding > >> commit (cc0bdf913) so I'm wondering if the canonical form/order > >>> still > >> relevant/needed today? Does the ordering have to be done based on > >>> the > >> name versions specifically? Or can it be based on something more > >> intrinsic to the input source code rather than a GCC internal > >>> value, eg: > >> would alphabetic sort order of the variable names be a reasonable > >> replacement? > > Canonicalization is still important and useful. > > Indeed. > > > However, canonicalizing on SSA_NAMEs is problematical due to the way > >>> we > > recycle nodes and re-pack them. > > In the past we made sure to not disrupt order - hopefully that didn't > >>> change > so the re-packing shoudln't invaidate previous canonicalization: > > static void > release_free_names_and_compact_live_names (function *fun) > { > ... > /* And compact the SSA number space. We make sure to not change > >>> the > relative order of SSA versions. */ > for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) > { > > > > I think defining additional rules for canonicalization prior to > >>> using > > SSA_NAME_VERSION as the fallback would be looked upon favorably. > > I don't see a good reason to do that, it will be harder to spot > >>> canonicalization > issues and it will take extra compile-time. > > > Note however, that many of the _DECL nodes referenced by SSA_NAMEs > >>> are > > temporaries generated by the compiler and do not correspond to any > > declared/defined object in the original source. So you'll still > >>> need > > the SSA_NAME_VERSION (or something as stable or better) > >>> canonicalization > > to handle those cases. > > And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE > >>> names). > > Richard. > > > Jeff > >>> > >>> After a bit more digging I found that insert_phi_nodes inserts PHIs in > >>> the order of UIDs, which indirectly affects the order of vars in > >>> old_ssa_names, which in turn affects the order in which > >>> make_ssa_name_fn > >>> is called to pick SSA versions from FREE_SSANAMES so in the end > >>> ordering by SSA_NAME_VERSION's is more or less equivalent to ordering > >>> by > >>> UIDs. I'm trying to figure out if there's a way to avoid depending on > >>> UIDs being ordered in a certain way. So if tree_swap_operands_p stays > >>> the same I'm wondering if there's some other info available at the > >>> point > >>> of insert_phi_nodes that would be a good replacement for UID. From my > >>> very limited experience with a very small source input, and if I > >>> understand things correctly, all of the var_infos have a var which is > >>> DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the > >>> general case? I don't fully understand what are all the things that >
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 2018-07-16 04:30 AM, Richard Biener wrote: > On Mon, Jul 16, 2018 at 12:19 AM Michael Ploujnikov > wrote: >> >> On 2018-07-04 04:52 AM, Richard Biener wrote: >>> On Tue, Jul 3, 2018 at 9:09 PM Jeff Law wrote: On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: > On 2018-07-03 12:46 PM, Richard Biener wrote: >> On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov >> wrote: >>> On 2018-06-20 04:23 AM, Richard Biener wrote: On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > > On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: >> Hi everyone, >> >> (I hope this is the right place to ask, if not my apologies; please >> point me in the right direction) >> >> I'm trying to get a better understanding of the following part in >> tree_swap_operands_p(): >> >> /* It is preferable to swap two SSA_NAME to ensure a canonical >>> form >> for commutative and comparison operators. Ensuring a >>> canonical >> form allows the optimizers to find additional redundancies >>> without >> having to explicitly check for both orderings. */ >> if (TREE_CODE (arg0) == SSA_NAME >> && TREE_CODE (arg1) == SSA_NAME >> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) >> return 1; >> >> My questions in no particular order: It looks like this was added >>> in >> 2004. I couldn't find any info other than what's in the >>> corresponding >> commit (cc0bdf913) so I'm wondering if the canonical form/order >>> still >> relevant/needed today? Does the ordering have to be done based on >>> the >> name versions specifically? Or can it be based on something more >> intrinsic to the input source code rather than a GCC internal >>> value, eg: >> would alphabetic sort order of the variable names be a reasonable >> replacement? > Canonicalization is still important and useful. Indeed. > However, canonicalizing on SSA_NAMEs is problematical due to the way >>> we > recycle nodes and re-pack them. In the past we made sure to not disrupt order - hopefully that didn't >>> change so the re-packing shoudln't invaidate previous canonicalization: static void release_free_names_and_compact_live_names (function *fun) { ... /* And compact the SSA number space. We make sure to not change >>> the relative order of SSA versions. */ for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) { > I think defining additional rules for canonicalization prior to >>> using > SSA_NAME_VERSION as the fallback would be looked upon favorably. I don't see a good reason to do that, it will be harder to spot >>> canonicalization issues and it will take extra compile-time. > Note however, that many of the _DECL nodes referenced by SSA_NAMEs >>> are > temporaries generated by the compiler and do not correspond to any > declared/defined object in the original source. So you'll still >>> need > the SSA_NAME_VERSION (or something as stable or better) >>> canonicalization > to handle those cases. And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE >>> names). Richard. > Jeff >>> >>> After a bit more digging I found that insert_phi_nodes inserts PHIs in >>> the order of UIDs, which indirectly affects the order of vars in >>> old_ssa_names, which in turn affects the order in which >>> make_ssa_name_fn >>> is called to pick SSA versions from FREE_SSANAMES so in the end >>> ordering by SSA_NAME_VERSION's is more or less equivalent to ordering >>> by >>> UIDs. I'm trying to figure out if there's a way to avoid depending on >>> UIDs being ordered in a certain way. So if tree_swap_operands_p stays >>> the same I'm wondering if there's some other info available at the >>> point >>> of insert_phi_nodes that would be a good replacement for UID. From my >>> very limited experience with a very small source input, and if I >>> understand things correctly, all of the var_infos have a var which is >>> DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the >>> general case? I don't fully understand what are all the things that >>> insert_phi_nodes iterates over. >> >> Why do you want to remove the dependence on UID ordering? It's pervasive >> throughout the whole compilation... >> >> Richard. >> >>> - Michael >> > > > Well, I'm working on a reduction of the
Re: Understanding tree_swap_operands_p wrt SSA name versions
On Mon, Jul 16, 2018 at 12:19 AM Michael Ploujnikov wrote: > > On 2018-07-04 04:52 AM, Richard Biener wrote: > > On Tue, Jul 3, 2018 at 9:09 PM Jeff Law wrote: > >> > >> On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: > >>> On 2018-07-03 12:46 PM, Richard Biener wrote: > On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov > wrote: > > On 2018-06-20 04:23 AM, Richard Biener wrote: > >> On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > >>> > >>> On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > Hi everyone, > > (I hope this is the right place to ask, if not my apologies; please > point me in the right direction) > > I'm trying to get a better understanding of the following part in > tree_swap_operands_p(): > > /* It is preferable to swap two SSA_NAME to ensure a canonical > > form > for commutative and comparison operators. Ensuring a > > canonical > form allows the optimizers to find additional redundancies > > without > having to explicitly check for both orderings. */ > if (TREE_CODE (arg0) == SSA_NAME > && TREE_CODE (arg1) == SSA_NAME > && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > return 1; > > My questions in no particular order: It looks like this was added > > in > 2004. I couldn't find any info other than what's in the > > corresponding > commit (cc0bdf913) so I'm wondering if the canonical form/order > > still > relevant/needed today? Does the ordering have to be done based on > > the > name versions specifically? Or can it be based on something more > intrinsic to the input source code rather than a GCC internal > > value, eg: > would alphabetic sort order of the variable names be a reasonable > replacement? > >>> Canonicalization is still important and useful. > >> > >> Indeed. > >> > >>> However, canonicalizing on SSA_NAMEs is problematical due to the way > > we > >>> recycle nodes and re-pack them. > >> > >> In the past we made sure to not disrupt order - hopefully that didn't > > change > >> so the re-packing shoudln't invaidate previous canonicalization: > >> > >> static void > >> release_free_names_and_compact_live_names (function *fun) > >> { > >> ... > >> /* And compact the SSA number space. We make sure to not change > > the > >> relative order of SSA versions. */ > >> for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) > >> { > >> > >> > >>> I think defining additional rules for canonicalization prior to > > using > >>> SSA_NAME_VERSION as the fallback would be looked upon favorably. > >> > >> I don't see a good reason to do that, it will be harder to spot > > canonicalization > >> issues and it will take extra compile-time. > >> > >>> Note however, that many of the _DECL nodes referenced by SSA_NAMEs > > are > >>> temporaries generated by the compiler and do not correspond to any > >>> declared/defined object in the original source. So you'll still > > need > >>> the SSA_NAME_VERSION (or something as stable or better) > > canonicalization > >>> to handle those cases. > >> > >> And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE > > names). > >> > >> Richard. > >> > >>> Jeff > > > > After a bit more digging I found that insert_phi_nodes inserts PHIs in > > the order of UIDs, which indirectly affects the order of vars in > > old_ssa_names, which in turn affects the order in which > > make_ssa_name_fn > > is called to pick SSA versions from FREE_SSANAMES so in the end > > ordering by SSA_NAME_VERSION's is more or less equivalent to ordering > > by > > UIDs. I'm trying to figure out if there's a way to avoid depending on > > UIDs being ordered in a certain way. So if tree_swap_operands_p stays > > the same I'm wondering if there's some other info available at the > > point > > of insert_phi_nodes that would be a good replacement for UID. From my > > very limited experience with a very small source input, and if I > > understand things correctly, all of the var_infos have a var which is > > DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the > > general case? I don't fully understand what are all the things that > > insert_phi_nodes iterates over. > > Why do you want to remove the dependence on UID ordering? It's pervasive > throughout the whole compilation... > > Richard. > > > - Michael > > >>> > >>> > >>> Well, I'm working on a reduction of the number of changes seen with > >>> binary diffing (a
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 2018-07-04 04:52 AM, Richard Biener wrote: > On Tue, Jul 3, 2018 at 9:09 PM Jeff Law wrote: >> >> On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: >>> On 2018-07-03 12:46 PM, Richard Biener wrote: On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov wrote: > On 2018-06-20 04:23 AM, Richard Biener wrote: >> On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: >>> >>> On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: Hi everyone, (I hope this is the right place to ask, if not my apologies; please point me in the right direction) I'm trying to get a better understanding of the following part in tree_swap_operands_p(): /* It is preferable to swap two SSA_NAME to ensure a canonical > form for commutative and comparison operators. Ensuring a > canonical form allows the optimizers to find additional redundancies > without having to explicitly check for both orderings. */ if (TREE_CODE (arg0) == SSA_NAME && TREE_CODE (arg1) == SSA_NAME && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) return 1; My questions in no particular order: It looks like this was added > in 2004. I couldn't find any info other than what's in the > corresponding commit (cc0bdf913) so I'm wondering if the canonical form/order > still relevant/needed today? Does the ordering have to be done based on > the name versions specifically? Or can it be based on something more intrinsic to the input source code rather than a GCC internal > value, eg: would alphabetic sort order of the variable names be a reasonable replacement? >>> Canonicalization is still important and useful. >> >> Indeed. >> >>> However, canonicalizing on SSA_NAMEs is problematical due to the way > we >>> recycle nodes and re-pack them. >> >> In the past we made sure to not disrupt order - hopefully that didn't > change >> so the re-packing shoudln't invaidate previous canonicalization: >> >> static void >> release_free_names_and_compact_live_names (function *fun) >> { >> ... >> /* And compact the SSA number space. We make sure to not change > the >> relative order of SSA versions. */ >> for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) >> { >> >> >>> I think defining additional rules for canonicalization prior to > using >>> SSA_NAME_VERSION as the fallback would be looked upon favorably. >> >> I don't see a good reason to do that, it will be harder to spot > canonicalization >> issues and it will take extra compile-time. >> >>> Note however, that many of the _DECL nodes referenced by SSA_NAMEs > are >>> temporaries generated by the compiler and do not correspond to any >>> declared/defined object in the original source. So you'll still > need >>> the SSA_NAME_VERSION (or something as stable or better) > canonicalization >>> to handle those cases. >> >> And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE > names). >> >> Richard. >> >>> Jeff > > After a bit more digging I found that insert_phi_nodes inserts PHIs in > the order of UIDs, which indirectly affects the order of vars in > old_ssa_names, which in turn affects the order in which > make_ssa_name_fn > is called to pick SSA versions from FREE_SSANAMES so in the end > ordering by SSA_NAME_VERSION's is more or less equivalent to ordering > by > UIDs. I'm trying to figure out if there's a way to avoid depending on > UIDs being ordered in a certain way. So if tree_swap_operands_p stays > the same I'm wondering if there's some other info available at the > point > of insert_phi_nodes that would be a good replacement for UID. From my > very limited experience with a very small source input, and if I > understand things correctly, all of the var_infos have a var which is > DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the > general case? I don't fully understand what are all the things that > insert_phi_nodes iterates over. Why do you want to remove the dependence on UID ordering? It's pervasive throughout the whole compilation... Richard. > - Michael >>> >>> >>> Well, I'm working on a reduction of the number of changes seen with >>> binary diffing (a la https://wiki.debian.org/ReproducibleBuilds) and >>> since current UID assignment is essentially tied to the order of things >>> in the input source code one function's changes can cascade to others >>> (even when they're unchanged). As you said UID dependence is quiet >>> pervasive, and although
Re: Understanding tree_swap_operands_p wrt SSA name versions
On Tue, Jul 3, 2018 at 9:09 PM Jeff Law wrote: > > On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: > > On 2018-07-03 12:46 PM, Richard Biener wrote: > >> On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov > >> wrote: > >>> On 2018-06-20 04:23 AM, Richard Biener wrote: > On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > > > > On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > >> Hi everyone, > >> > >> (I hope this is the right place to ask, if not my apologies; please > >> point me in the right direction) > >> > >> I'm trying to get a better understanding of the following part in > >> tree_swap_operands_p(): > >> > >> /* It is preferable to swap two SSA_NAME to ensure a canonical > >>> form > >> for commutative and comparison operators. Ensuring a > >>> canonical > >> form allows the optimizers to find additional redundancies > >>> without > >> having to explicitly check for both orderings. */ > >> if (TREE_CODE (arg0) == SSA_NAME > >> && TREE_CODE (arg1) == SSA_NAME > >> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > >> return 1; > >> > >> My questions in no particular order: It looks like this was added > >>> in > >> 2004. I couldn't find any info other than what's in the > >>> corresponding > >> commit (cc0bdf913) so I'm wondering if the canonical form/order > >>> still > >> relevant/needed today? Does the ordering have to be done based on > >>> the > >> name versions specifically? Or can it be based on something more > >> intrinsic to the input source code rather than a GCC internal > >>> value, eg: > >> would alphabetic sort order of the variable names be a reasonable > >> replacement? > > Canonicalization is still important and useful. > > Indeed. > > > However, canonicalizing on SSA_NAMEs is problematical due to the way > >>> we > > recycle nodes and re-pack them. > > In the past we made sure to not disrupt order - hopefully that didn't > >>> change > so the re-packing shoudln't invaidate previous canonicalization: > > static void > release_free_names_and_compact_live_names (function *fun) > { > ... > /* And compact the SSA number space. We make sure to not change > >>> the > relative order of SSA versions. */ > for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) > { > > > > I think defining additional rules for canonicalization prior to > >>> using > > SSA_NAME_VERSION as the fallback would be looked upon favorably. > > I don't see a good reason to do that, it will be harder to spot > >>> canonicalization > issues and it will take extra compile-time. > > > Note however, that many of the _DECL nodes referenced by SSA_NAMEs > >>> are > > temporaries generated by the compiler and do not correspond to any > > declared/defined object in the original source. So you'll still > >>> need > > the SSA_NAME_VERSION (or something as stable or better) > >>> canonicalization > > to handle those cases. > > And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE > >>> names). > > Richard. > > > Jeff > >>> > >>> After a bit more digging I found that insert_phi_nodes inserts PHIs in > >>> the order of UIDs, which indirectly affects the order of vars in > >>> old_ssa_names, which in turn affects the order in which > >>> make_ssa_name_fn > >>> is called to pick SSA versions from FREE_SSANAMES so in the end > >>> ordering by SSA_NAME_VERSION's is more or less equivalent to ordering > >>> by > >>> UIDs. I'm trying to figure out if there's a way to avoid depending on > >>> UIDs being ordered in a certain way. So if tree_swap_operands_p stays > >>> the same I'm wondering if there's some other info available at the > >>> point > >>> of insert_phi_nodes that would be a good replacement for UID. From my > >>> very limited experience with a very small source input, and if I > >>> understand things correctly, all of the var_infos have a var which is > >>> DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the > >>> general case? I don't fully understand what are all the things that > >>> insert_phi_nodes iterates over. > >> > >> Why do you want to remove the dependence on UID ordering? It's pervasive > >> throughout the whole compilation... > >> > >> Richard. > >> > >>> - Michael > >> > > > > > > Well, I'm working on a reduction of the number of changes seen with > > binary diffing (a la https://wiki.debian.org/ReproducibleBuilds) and > > since current UID assignment is essentially tied to the order of things > > in the input source code one function's changes can cascade to others > > (even when they're unchanged). As you said UID dependence is quiet > > pervasive, and although finding and improving individual cases (such as > >
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 07/03/2018 11:55 AM, Michael Ploujnikov wrote: > On 2018-07-03 12:46 PM, Richard Biener wrote: >> On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov >> wrote: >>> On 2018-06-20 04:23 AM, Richard Biener wrote: On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > > On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: >> Hi everyone, >> >> (I hope this is the right place to ask, if not my apologies; please >> point me in the right direction) >> >> I'm trying to get a better understanding of the following part in >> tree_swap_operands_p(): >> >> /* It is preferable to swap two SSA_NAME to ensure a canonical >>> form >> for commutative and comparison operators. Ensuring a >>> canonical >> form allows the optimizers to find additional redundancies >>> without >> having to explicitly check for both orderings. */ >> if (TREE_CODE (arg0) == SSA_NAME >> && TREE_CODE (arg1) == SSA_NAME >> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) >> return 1; >> >> My questions in no particular order: It looks like this was added >>> in >> 2004. I couldn't find any info other than what's in the >>> corresponding >> commit (cc0bdf913) so I'm wondering if the canonical form/order >>> still >> relevant/needed today? Does the ordering have to be done based on >>> the >> name versions specifically? Or can it be based on something more >> intrinsic to the input source code rather than a GCC internal >>> value, eg: >> would alphabetic sort order of the variable names be a reasonable >> replacement? > Canonicalization is still important and useful. Indeed. > However, canonicalizing on SSA_NAMEs is problematical due to the way >>> we > recycle nodes and re-pack them. In the past we made sure to not disrupt order - hopefully that didn't >>> change so the re-packing shoudln't invaidate previous canonicalization: static void release_free_names_and_compact_live_names (function *fun) { ... /* And compact the SSA number space. We make sure to not change >>> the relative order of SSA versions. */ for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) { > I think defining additional rules for canonicalization prior to >>> using > SSA_NAME_VERSION as the fallback would be looked upon favorably. I don't see a good reason to do that, it will be harder to spot >>> canonicalization issues and it will take extra compile-time. > Note however, that many of the _DECL nodes referenced by SSA_NAMEs >>> are > temporaries generated by the compiler and do not correspond to any > declared/defined object in the original source. So you'll still >>> need > the SSA_NAME_VERSION (or something as stable or better) >>> canonicalization > to handle those cases. And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE >>> names). Richard. > Jeff >>> >>> After a bit more digging I found that insert_phi_nodes inserts PHIs in >>> the order of UIDs, which indirectly affects the order of vars in >>> old_ssa_names, which in turn affects the order in which >>> make_ssa_name_fn >>> is called to pick SSA versions from FREE_SSANAMES so in the end >>> ordering by SSA_NAME_VERSION's is more or less equivalent to ordering >>> by >>> UIDs. I'm trying to figure out if there's a way to avoid depending on >>> UIDs being ordered in a certain way. So if tree_swap_operands_p stays >>> the same I'm wondering if there's some other info available at the >>> point >>> of insert_phi_nodes that would be a good replacement for UID. From my >>> very limited experience with a very small source input, and if I >>> understand things correctly, all of the var_infos have a var which is >>> DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the >>> general case? I don't fully understand what are all the things that >>> insert_phi_nodes iterates over. >> >> Why do you want to remove the dependence on UID ordering? It's pervasive >> throughout the whole compilation... >> >> Richard. >> >>> - Michael >> > > > Well, I'm working on a reduction of the number of changes seen with > binary diffing (a la https://wiki.debian.org/ReproducibleBuilds) and > since current UID assignment is essentially tied to the order of things > in the input source code one function's changes can cascade to others > (even when they're unchanged). As you said UID dependence is quiet > pervasive, and although finding and improving individual cases (such as > tree_swap_operands_p) won't make it perfect, I think it will be a step > in the positive direction. > > Also, I have some ideas for a UID assignment scheme that might improve > things overall, that I'll try to share after I get back from vacation. I'm still not sure what the point is.
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 2018-07-03 12:46 PM, Richard Biener wrote: > On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov > wrote: >> On 2018-06-20 04:23 AM, Richard Biener wrote: >>> On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > Hi everyone, > > (I hope this is the right place to ask, if not my apologies; please > point me in the right direction) > > I'm trying to get a better understanding of the following part in > tree_swap_operands_p(): > > /* It is preferable to swap two SSA_NAME to ensure a canonical >> form > for commutative and comparison operators. Ensuring a >> canonical > form allows the optimizers to find additional redundancies >> without > having to explicitly check for both orderings. */ > if (TREE_CODE (arg0) == SSA_NAME > && TREE_CODE (arg1) == SSA_NAME > && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > return 1; > > My questions in no particular order: It looks like this was added >> in > 2004. I couldn't find any info other than what's in the >> corresponding > commit (cc0bdf913) so I'm wondering if the canonical form/order >> still > relevant/needed today? Does the ordering have to be done based on >> the > name versions specifically? Or can it be based on something more > intrinsic to the input source code rather than a GCC internal >> value, eg: > would alphabetic sort order of the variable names be a reasonable > replacement? Canonicalization is still important and useful. >>> >>> Indeed. >>> However, canonicalizing on SSA_NAMEs is problematical due to the way >> we recycle nodes and re-pack them. >>> >>> In the past we made sure to not disrupt order - hopefully that didn't >> change >>> so the re-packing shoudln't invaidate previous canonicalization: >>> >>> static void >>> release_free_names_and_compact_live_names (function *fun) >>> { >>> ... >>> /* And compact the SSA number space. We make sure to not change >> the >>> relative order of SSA versions. */ >>> for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) >>> { >>> >>> I think defining additional rules for canonicalization prior to >> using SSA_NAME_VERSION as the fallback would be looked upon favorably. >>> >>> I don't see a good reason to do that, it will be harder to spot >> canonicalization >>> issues and it will take extra compile-time. >>> Note however, that many of the _DECL nodes referenced by SSA_NAMEs >> are temporaries generated by the compiler and do not correspond to any declared/defined object in the original source. So you'll still >> need the SSA_NAME_VERSION (or something as stable or better) >> canonicalization to handle those cases. >>> >>> And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE >> names). >>> >>> Richard. >>> Jeff >> >> After a bit more digging I found that insert_phi_nodes inserts PHIs in >> the order of UIDs, which indirectly affects the order of vars in >> old_ssa_names, which in turn affects the order in which >> make_ssa_name_fn >> is called to pick SSA versions from FREE_SSANAMES so in the end >> ordering by SSA_NAME_VERSION's is more or less equivalent to ordering >> by >> UIDs. I'm trying to figure out if there's a way to avoid depending on >> UIDs being ordered in a certain way. So if tree_swap_operands_p stays >> the same I'm wondering if there's some other info available at the >> point >> of insert_phi_nodes that would be a good replacement for UID. From my >> very limited experience with a very small source input, and if I >> understand things correctly, all of the var_infos have a var which is >> DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the >> general case? I don't fully understand what are all the things that >> insert_phi_nodes iterates over. > > Why do you want to remove the dependence on UID ordering? It's pervasive > throughout the whole compilation... > > Richard. > >> - Michael > Well, I'm working on a reduction of the number of changes seen with binary diffing (a la https://wiki.debian.org/ReproducibleBuilds) and since current UID assignment is essentially tied to the order of things in the input source code one function's changes can cascade to others (even when they're unchanged). As you said UID dependence is quiet pervasive, and although finding and improving individual cases (such as tree_swap_operands_p) won't make it perfect, I think it will be a step in the positive direction. Also, I have some ideas for a UID assignment scheme that might improve things overall, that I'll try to share after I get back from vacation. - Michael signature.asc Description: OpenPGP digital signature
Re: Understanding tree_swap_operands_p wrt SSA name versions
On July 3, 2018 4:56:57 PM GMT+02:00, Michael Ploujnikov wrote: >On 2018-06-20 04:23 AM, Richard Biener wrote: >> On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: >>> >>> On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: Hi everyone, (I hope this is the right place to ask, if not my apologies; please point me in the right direction) I'm trying to get a better understanding of the following part in tree_swap_operands_p(): /* It is preferable to swap two SSA_NAME to ensure a canonical >form for commutative and comparison operators. Ensuring a >canonical form allows the optimizers to find additional redundancies >without having to explicitly check for both orderings. */ if (TREE_CODE (arg0) == SSA_NAME && TREE_CODE (arg1) == SSA_NAME && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) return 1; My questions in no particular order: It looks like this was added >in 2004. I couldn't find any info other than what's in the >corresponding commit (cc0bdf913) so I'm wondering if the canonical form/order >still relevant/needed today? Does the ordering have to be done based on >the name versions specifically? Or can it be based on something more intrinsic to the input source code rather than a GCC internal >value, eg: would alphabetic sort order of the variable names be a reasonable replacement? >>> Canonicalization is still important and useful. >> >> Indeed. >> >>> However, canonicalizing on SSA_NAMEs is problematical due to the way >we >>> recycle nodes and re-pack them. >> >> In the past we made sure to not disrupt order - hopefully that didn't >change >> so the re-packing shoudln't invaidate previous canonicalization: >> >> static void >> release_free_names_and_compact_live_names (function *fun) >> { >> ... >> /* And compact the SSA number space. We make sure to not change >the >> relative order of SSA versions. */ >> for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) >> { >> >> >>> I think defining additional rules for canonicalization prior to >using >>> SSA_NAME_VERSION as the fallback would be looked upon favorably. >> >> I don't see a good reason to do that, it will be harder to spot >canonicalization >> issues and it will take extra compile-time. >> >>> Note however, that many of the _DECL nodes referenced by SSA_NAMEs >are >>> temporaries generated by the compiler and do not correspond to any >>> declared/defined object in the original source. So you'll still >need >>> the SSA_NAME_VERSION (or something as stable or better) >canonicalization >>> to handle those cases. >> >> And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE >names). >> >> Richard. >> >>> Jeff > >After a bit more digging I found that insert_phi_nodes inserts PHIs in >the order of UIDs, which indirectly affects the order of vars in >old_ssa_names, which in turn affects the order in which >make_ssa_name_fn >is called to pick SSA versions from FREE_SSANAMES so in the end >ordering by SSA_NAME_VERSION's is more or less equivalent to ordering >by >UIDs. I'm trying to figure out if there's a way to avoid depending on >UIDs being ordered in a certain way. So if tree_swap_operands_p stays >the same I'm wondering if there's some other info available at the >point >of insert_phi_nodes that would be a good replacement for UID. From my >very limited experience with a very small source input, and if I >understand things correctly, all of the var_infos have a var which is >DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the >general case? I don't fully understand what are all the things that >insert_phi_nodes iterates over. Why do you want to remove the dependence on UID ordering? It's pervasive throughout the whole compilation... Richard. >- Michael
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 2018-06-20 04:23 AM, Richard Biener wrote: > On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: >> >> On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: >>> Hi everyone, >>> >>> (I hope this is the right place to ask, if not my apologies; please >>> point me in the right direction) >>> >>> I'm trying to get a better understanding of the following part in >>> tree_swap_operands_p(): >>> >>> /* It is preferable to swap two SSA_NAME to ensure a canonical form >>> for commutative and comparison operators. Ensuring a canonical >>> form allows the optimizers to find additional redundancies without >>> having to explicitly check for both orderings. */ >>> if (TREE_CODE (arg0) == SSA_NAME >>> && TREE_CODE (arg1) == SSA_NAME >>> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) >>> return 1; >>> >>> My questions in no particular order: It looks like this was added in >>> 2004. I couldn't find any info other than what's in the corresponding >>> commit (cc0bdf913) so I'm wondering if the canonical form/order still >>> relevant/needed today? Does the ordering have to be done based on the >>> name versions specifically? Or can it be based on something more >>> intrinsic to the input source code rather than a GCC internal value, eg: >>> would alphabetic sort order of the variable names be a reasonable >>> replacement? >> Canonicalization is still important and useful. > > Indeed. > >> However, canonicalizing on SSA_NAMEs is problematical due to the way we >> recycle nodes and re-pack them. > > In the past we made sure to not disrupt order - hopefully that didn't change > so the re-packing shoudln't invaidate previous canonicalization: > > static void > release_free_names_and_compact_live_names (function *fun) > { > ... > /* And compact the SSA number space. We make sure to not change the > relative order of SSA versions. */ > for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) > { > > >> I think defining additional rules for canonicalization prior to using >> SSA_NAME_VERSION as the fallback would be looked upon favorably. > > I don't see a good reason to do that, it will be harder to spot > canonicalization > issues and it will take extra compile-time. > >> Note however, that many of the _DECL nodes referenced by SSA_NAMEs are >> temporaries generated by the compiler and do not correspond to any >> declared/defined object in the original source. So you'll still need >> the SSA_NAME_VERSION (or something as stable or better) canonicalization >> to handle those cases. > > And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE names). > > Richard. > >> Jeff After a bit more digging I found that insert_phi_nodes inserts PHIs in the order of UIDs, which indirectly affects the order of vars in old_ssa_names, which in turn affects the order in which make_ssa_name_fn is called to pick SSA versions from FREE_SSANAMES so in the end ordering by SSA_NAME_VERSION's is more or less equivalent to ordering by UIDs. I'm trying to figure out if there's a way to avoid depending on UIDs being ordered in a certain way. So if tree_swap_operands_p stays the same I'm wondering if there's some other info available at the point of insert_phi_nodes that would be a good replacement for UID. From my very limited experience with a very small source input, and if I understand things correctly, all of the var_infos have a var which is DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the general case? I don't fully understand what are all the things that insert_phi_nodes iterates over. - Michael signature.asc Description: OpenPGP digital signature
Re: Understanding tree_swap_operands_p wrt SSA name versions
On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: > > On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > > Hi everyone, > > > > (I hope this is the right place to ask, if not my apologies; please > > point me in the right direction) > > > > I'm trying to get a better understanding of the following part in > > tree_swap_operands_p(): > > > > /* It is preferable to swap two SSA_NAME to ensure a canonical form > > for commutative and comparison operators. Ensuring a canonical > > form allows the optimizers to find additional redundancies without > > having to explicitly check for both orderings. */ > > if (TREE_CODE (arg0) == SSA_NAME > > && TREE_CODE (arg1) == SSA_NAME > > && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > > return 1; > > > > My questions in no particular order: It looks like this was added in > > 2004. I couldn't find any info other than what's in the corresponding > > commit (cc0bdf913) so I'm wondering if the canonical form/order still > > relevant/needed today? Does the ordering have to be done based on the > > name versions specifically? Or can it be based on something more > > intrinsic to the input source code rather than a GCC internal value, eg: > > would alphabetic sort order of the variable names be a reasonable > > replacement? > Canonicalization is still important and useful. Indeed. > However, canonicalizing on SSA_NAMEs is problematical due to the way we > recycle nodes and re-pack them. In the past we made sure to not disrupt order - hopefully that didn't change so the re-packing shoudln't invaidate previous canonicalization: static void release_free_names_and_compact_live_names (function *fun) { ... /* And compact the SSA number space. We make sure to not change the relative order of SSA versions. */ for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) { > I think defining additional rules for canonicalization prior to using > SSA_NAME_VERSION as the fallback would be looked upon favorably. I don't see a good reason to do that, it will be harder to spot canonicalization issues and it will take extra compile-time. > Note however, that many of the _DECL nodes referenced by SSA_NAMEs are > temporaries generated by the compiler and do not correspond to any > declared/defined object in the original source. So you'll still need > the SSA_NAME_VERSION (or something as stable or better) canonicalization > to handle those cases. And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE names). Richard. > Jeff
Re: Understanding tree_swap_operands_p wrt SSA name versions
On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: > Hi everyone, > > (I hope this is the right place to ask, if not my apologies; please > point me in the right direction) > > I'm trying to get a better understanding of the following part in > tree_swap_operands_p(): > > /* It is preferable to swap two SSA_NAME to ensure a canonical form > for commutative and comparison operators. Ensuring a canonical > form allows the optimizers to find additional redundancies without > having to explicitly check for both orderings. */ > if (TREE_CODE (arg0) == SSA_NAME > && TREE_CODE (arg1) == SSA_NAME > && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) > return 1; > > My questions in no particular order: It looks like this was added in > 2004. I couldn't find any info other than what's in the corresponding > commit (cc0bdf913) so I'm wondering if the canonical form/order still > relevant/needed today? Does the ordering have to be done based on the > name versions specifically? Or can it be based on something more > intrinsic to the input source code rather than a GCC internal value, eg: > would alphabetic sort order of the variable names be a reasonable > replacement? Canonicalization is still important and useful. However, canonicalizing on SSA_NAMEs is problematical due to the way we recycle nodes and re-pack them. I think defining additional rules for canonicalization prior to using SSA_NAME_VERSION as the fallback would be looked upon favorably. Note however, that many of the _DECL nodes referenced by SSA_NAMEs are temporaries generated by the compiler and do not correspond to any declared/defined object in the original source. So you'll still need the SSA_NAME_VERSION (or something as stable or better) canonicalization to handle those cases. Jeff