Re: [PATCH 0/3] Convert grep to recurse in-process
On Wed, Jul 12, 2017 at 11:24:47AM -0700, Jonathan Nieder wrote: > Jeff King wrote: > > On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote: > > >> Each 'struct repository' does have its own config so we could > >> potentially want a config in a submodule to override some config in the > >> superproject. Though for right now it may be simpler to not worry about > >> doing this overriding, mostly because you would only want to allow > >> overriding of some configuration and not all configuration. One example > >> would be the number of threads allowed in grep, > [...] > > I think that's probably one of the more complicated cases, and I don't > > think it really needs to be done on day one. > > That's fair. Could you give an example of a simpler case? I think core.quotepath that I gave is one example that I'd expect to work differently than it does in 'next'. Though I really think that's a fairly uninteresting one, as it's extremely unlikely to be set on a per-repo basis. Something like color.grep.* is in the same boat (I think it ought to respect submodule config, but I'd be surprised if anybody cares). I know those are kind of lame examples, and if they remain broken forever, I don't know if anybody would care. I'm more concerned about us missing a case that causes a regression (and unlike some regressions, where you say "oops, a bug" and fix it, I'd worry that we'll have made a big jump to an in-process model, and the only fixes are "implement a complete split-config solution" or "revert back to multiple processes"). I do agree that not having concrete examples makes it harder to reason about the desired semantics. -Peff
Re: [PATCH 0/3] Convert grep to recurse in-process
On Wed, Jul 12, 2017 at 11:09:23AM -0700, Jonathan Nieder wrote: > > I didn't follow the rest of the "struct repository" series closely, but > > I don't feel like we ever reached a resolution on how config would be > > handled. I notice that the in-process "ls-files" behaves differently > > than the old one when config differs between the submodule and the > > parent repository. As we convert more commands (that use more config) > > this will become more likely to be noticed by somebody. > > > > Do we have a plan for dealing with this? Is our solution just "recursed > > operations always respect the parent config, deal with it"? > > For settings like branch..remote, I don't think anyone would > disagree that the right thing to do is to use the per-repository > config of the submodule. The repository object is already able to > handle per-repository config, so this just involves callers being > careful not to cache values locally in a way that conflates > repositories. It should be pretty straightforward (for commands like > "git fetch --recurse-submodules", for example). I agree that's the right approach. What I'm worried about is that I see in-process work proceeding without the "callers being careful" part being audited, which can lead to regressions (e.g., ls-files with core.quotepath is "broken" in next right now). Though at least the regression would be limited to people using submodules. > For settings like grep.patternType, on the other hand, it would be > very strange for the behavior to change when grep crosses the > submodule boundary. So I think using the parent project config is the > right thing to do and the old behavior was simply wrong. In other > words, I don't think this is so much a case of "deal with it" as > "sorry we got the behavior so wrong before --- we've finally fixed it > now". I think that's not actually about the parent project's config, as much as it is about the parameters of the current operation. I.e., the argument is that this particular grep operation is using a particular pattern type, no matter how we arrived at that decision, and it should be used for all of the recursive bits. So whether the superproject has its own grep.patternType set, or whether the user used "-P" on the command line, the result is the same: we need to tell the submodule to ignore any config and use the parameters we feed it. In a multi-process model, that should happen by converting all of the bits in "struct grep_opt" back into command-line parameters and feeding them to the recursive processes (which would then give them precedence over any config). But I'm pretty sure we don't do that. In the in-process model, that would hopefully be a bit simpler, as we'd just pass in a pre-made grep_opt. -Peff
Re: [PATCH 0/3] Convert grep to recurse in-process
Jeff King wrote: > On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote: >> Each 'struct repository' does have its own config so we could >> potentially want a config in a submodule to override some config in the >> superproject. Though for right now it may be simpler to not worry about >> doing this overriding, mostly because you would only want to allow >> overriding of some configuration and not all configuration. One example >> would be the number of threads allowed in grep, [...] > I think that's probably one of the more complicated cases, and I don't > think it really needs to be done on day one. That's fair. Could you give an example of a simpler case? Thanks, Jonathan
Re: [PATCH 0/3] Convert grep to recurse in-process
On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote: > > I didn't follow the rest of the "struct repository" series closely, but > > I don't feel like we ever reached a resolution on how config would be > > handled. I notice that the in-process "ls-files" behaves differently > > than the old one when config differs between the submodule and the > > parent repository. As we convert more commands (that use more config) > > this will become more likely to be noticed by somebody. > > > > Do we have a plan for dealing with this? Is our solution just "recursed > > operations always respect the parent config, deal with it"? > > Each 'struct repository' does have its own config so we could > potentially want a config in a submodule to override some config in the > superproject. Though for right now it may be simpler to not worry about > doing this overriding, mostly because you would only want to allow > overriding of some configuration and not all configuration. One example > would be the number of threads allowed in grep, it doesn't make much > sense to let a submodule's configuration of this to trump the > superproject's since the command was invoked in the context of the > superproject. I'm not sure I agree 100% with that example. What makes threads special, I think, is not the config but the total count spread across all of the recursive processes. So it's not that we don't want to respect submodule config so much as we want to take the submodule config into account, but throttle it based on what other threads are running. So if your superproject says "1" and the submodule says "8", I'd expect "8" threads to run in the submodule. If you're already running another 3 threads on behalf of another submodule, I think it would be reasonable to do some job control and only give the submodule 5 slots. But I don't think that should happen at the config layer. It should probably happen when the submodule decides to spawn threads, and it should ask of the superproject "how many slots am I allowed?". I think that's probably one of the more complicated cases, and I don't think it really needs to be done on day one. Setting differing thread counts is even more unlikely than the rest of the config. I suspect doing job management in general would come up first, because people don't want to fork-bomb themselves. Anyway, that got pretty far afield. What I was trying to say is that I think you can treat the config uniformly, without making special exceptions for things like grep.threads. -Peff
Re: [PATCH 0/3] Convert grep to recurse in-process
On Wed, Jul 12, 2017 at 11:09 AM, Jonathan Niederwrote: > Hi, > > Jeff King wrote: > >> I didn't follow the rest of the "struct repository" series closely, but >> I don't feel like we ever reached a resolution on how config would be >> handled. I notice that the in-process "ls-files" behaves differently >> than the old one when config differs between the submodule and the >> parent repository. As we convert more commands (that use more config) >> this will become more likely to be noticed by somebody. >> >> Do we have a plan for dealing with this? Is our solution just "recursed >> operations always respect the parent config, deal with it"? > > For settings like branch..remote, I don't think anyone would > disagree that the right thing to do is to use the per-repository > config of the submodule. The repository object is already able to > handle per-repository config, so this just involves callers being > careful not to cache values locally in a way that conflates > repositories. It should be pretty straightforward (for commands like > "git fetch --recurse-submodules", for example). > > For settings like grep.patternType, on the other hand, it would be > very strange for the behavior to change when grep crosses the > submodule boundary. That is because this option relates to the input given. Other options relate to the items to be processed, such as grep.color.* which I would not find strange if they were respected in the submodule. > So I think using the parent project config is the > right thing to do and the old behavior was simply wrong. In other > words, I don't think this is so much a case of "deal with it" as > "sorry we got the behavior so wrong before --- we've finally fixed it > now". In an ideal world we should pay partial attention to the config in the submodule, and for each config key we'd have to determine if it rather relates to the general runtime (grep.threads), the command line options (grep.patternType) or to the specific content inside the repo (coloring, whether we show the line number). For now and in reality we can ignore the content specific settings and claim that any setting here is related to runtime and command line options, both of which a user may expect the superproject to win in case of differing configurations. Thanks, Stefan
Re: [PATCH 0/3] Convert grep to recurse in-process
Hi, Jeff King wrote: > I didn't follow the rest of the "struct repository" series closely, but > I don't feel like we ever reached a resolution on how config would be > handled. I notice that the in-process "ls-files" behaves differently > than the old one when config differs between the submodule and the > parent repository. As we convert more commands (that use more config) > this will become more likely to be noticed by somebody. > > Do we have a plan for dealing with this? Is our solution just "recursed > operations always respect the parent config, deal with it"? For settings like branch..remote, I don't think anyone would disagree that the right thing to do is to use the per-repository config of the submodule. The repository object is already able to handle per-repository config, so this just involves callers being careful not to cache values locally in a way that conflates repositories. It should be pretty straightforward (for commands like "git fetch --recurse-submodules", for example). For settings like grep.patternType, on the other hand, it would be very strange for the behavior to change when grep crosses the submodule boundary. So I think using the parent project config is the right thing to do and the old behavior was simply wrong. In other words, I don't think this is so much a case of "deal with it" as "sorry we got the behavior so wrong before --- we've finally fixed it now". But this is subtle. Maybe some notes in the config documentation for relevant settings would help. That would make the intended behavior clearer and make debugging easier for users. Thanks, Jonathan
Re: [PATCH 0/3] Convert grep to recurse in-process
On 07/12, Jeff King wrote: > On Tue, Jul 11, 2017 at 03:04:05PM -0700, Brandon Williams wrote: > > > This series utilizes the new 'struct repository' in order to convert grep > > to be > > able to recurse into submodules in-process much like how ls-files was > > converted > > to recuse in-process. The result is a much smaller code footprint due to > > not > > needing to compile an argv array of options to be used when launched a > > process > > for operating on a submodule. > > I didn't follow the rest of the "struct repository" series closely, but > I don't feel like we ever reached a resolution on how config would be > handled. I notice that the in-process "ls-files" behaves differently > than the old one when config differs between the submodule and the > parent repository. As we convert more commands (that use more config) > this will become more likely to be noticed by somebody. > > Do we have a plan for dealing with this? Is our solution just "recursed > operations always respect the parent config, deal with it"? Each 'struct repository' does have its own config so we could potentially want a config in a submodule to override some config in the superproject. Though for right now it may be simpler to not worry about doing this overriding, mostly because you would only want to allow overriding of some configuration and not all configuration. One example would be the number of threads allowed in grep, it doesn't make much sense to let a submodule's configuration of this to trump the superproject's since the command was invoked in the context of the superproject. -- Brandon Williams
Re: [PATCH 0/3] Convert grep to recurse in-process
On Tue, Jul 11, 2017 at 03:04:05PM -0700, Brandon Williams wrote: > This series utilizes the new 'struct repository' in order to convert grep to > be > able to recurse into submodules in-process much like how ls-files was > converted > to recuse in-process. The result is a much smaller code footprint due to not > needing to compile an argv array of options to be used when launched a process > for operating on a submodule. I didn't follow the rest of the "struct repository" series closely, but I don't feel like we ever reached a resolution on how config would be handled. I notice that the in-process "ls-files" behaves differently than the old one when config differs between the submodule and the parent repository. As we convert more commands (that use more config) this will become more likely to be noticed by somebody. Do we have a plan for dealing with this? Is our solution just "recursed operations always respect the parent config, deal with it"? -Peff