Hi Colin,

I agree that the --alter flag is not useful. I just wanted to gauge how close to the original command line we want to stay.
I also think that positional args are not the way to go; parsing can be smarter than that. Using --add-config as the lead for setting dynamic config is fine. I think disambiguating primary arguments from the arguments of a specific config will be simple to understand. The '--' is for primary arguments to the kafka-storage command. Multiple sub-arguments are possible for a single --add-config, each of which does not have a '--'. So, like the following:

    --add-config 'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]' entity-type users entity-name alice

The following are also acceptable, from your example. I changed the ordering because it doesn't matter:

    --add-config entity-type brokers entity-name 0 foo=bar
    --add-config default-entity entity-type broker baaz=quux

The --add-config sub-argument parser knows about the possible sub-fields entity-type, entity-name, and default-entity, which indicate what the key=value config is applied to.

Thanks,
--Proven
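P.S. To make the grouping concrete, here is a rough sketch of how argv could be split into primary arguments and their trailing sub-arguments. This is untested, and the class and method names are just placeholders of mine, not anything from the KIP:

    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    public class SubArgGrouper {
        // Split argv into groups: each token starting with "--" opens a new
        // primary argument, and every following token without a "--" prefix
        // is a sub-argument of that primary argument. A primary argument may
        // repeat (e.g. several --add-config groups), so we keep a list of
        // groups per flag.
        public static Map<String, List<List<String>>> group(String[] argv) {
            Map<String, List<List<String>>> groups = new LinkedHashMap<>();
            List<String> current = null;
            for (String token : argv) {
                if (token.startsWith("--")) {
                    current = new ArrayList<>();
                    groups.computeIfAbsent(token, k -> new ArrayList<>()).add(current);
                } else if (current != null) {
                    current.add(token);  // sub-argument of the preceding primary arg
                } else {
                    throw new IllegalArgumentException("unexpected token: " + token);
                }
            }
            return groups;
        }
    }

Each group under --add-config would then be scanned for the known sub-fields (entity-type and entity-name take one value, default-entity takes none), and any remaining key=value tokens are the configs themselves.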
On Fri, Jan 27, 2023 at 6:20 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Jan 27, 2023, at 11:25, Proven Provenzano wrote:
> > Hi Colin,
> >
> > So I think we are converging: we will put the adding of records in
> > kafka-storage, allow salt and iterations to be optional, and now we are
> > getting to how to parse the individual arguments.
> >
> > My previous approach was to keep '--' as a delimiter for primary
> > kafka-storage arguments but to allow multiple space-separated sub-args,
> > so each 'scram-config' can contain multiple space-separated key=value
> > pairs:
> >
> >     --scram-config user=alice 'SCRAM-SHA-256=[iterations=8192,password=alicepass]'
> >
> > It could also be like the following, with sub-parsing of arguments,
> > where name requires an additional arg and config requires one too. This
> > looks more like kafka-configs:
> >
> >     --scram-config entity-name alice config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]'
> >
> > If we really want to keep parsing similar to kafka-configs, we could
> > use '--alter' as a primary argument, like --config and --cluster-id,
> > and the other args become secondary arguments to --alter. The following
> > now looks almost exactly like kafka-configs. We would just ignore
> > --entity-type, as it must be of type user:
> >
> >     --alter --add-config 'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]' --entity-type users --entity-name alice
> >
> > Thoughts?
> > --Proven
>
> Hi Proven,
>
> I don't think it makes sense to have an --alter flag, because we are not
> altering anything. We are creating a new bootstrap file.
>
> In general, we need to think about what else we will add to the
> formatting tool in the future. One thing I can think of is that we will
> probably need to add the ability to set dynamic configurations. So we
> will probably have something like "--add-config key=value". Then, of
> course, the question becomes how to supply the entity-type and
> entity-name. We cannot really do it exactly like kafka-configs.sh does,
> because that tool only accepts one alteration at a time, whereas we may
> have to do many.
>
> If we are willing to enforce ordering, we could do something like this:
>
>     --entity-type brokers --entity-name 0 --add-config foo=bar
>     --entity-type brokers --default-entity --add-config baaz=quux
>
> If we go that route, then maybe it makes sense to do SCRAM with a
> similar paradigm. So something like:
>
>     --entity-type users --entity-name alice --add-config 'SCRAM-SHA-512=[password=alice-secret]'
>
> I do think this is doable with argparse4j. What do you think?
>
> best,
> Colin
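[For what it's worth, a minimal argparse4j sketch of that ordered form. This is untested, and the lock-step walk over the lists at the end is my assumption about how the ordering would be enforced:

    import net.sourceforge.argparse4j.ArgumentParsers;
    import net.sourceforge.argparse4j.impl.Arguments;
    import net.sourceforge.argparse4j.inf.ArgumentParser;
    import net.sourceforge.argparse4j.inf.ArgumentParserException;
    import net.sourceforge.argparse4j.inf.Namespace;
    import java.util.List;

    public class OrderedFlagsSketch {
        public static void main(String[] argv) throws ArgumentParserException {
            ArgumentParser parser = ArgumentParsers.newFor("kafka-storage").build();
            // append() collects every occurrence of a repeated flag into a list
            parser.addArgument("--entity-type").action(Arguments.append());
            parser.addArgument("--entity-name").action(Arguments.append());
            parser.addArgument("--add-config").action(Arguments.append());
            Namespace ns = parser.parseArgs(argv);
            // argparse4j maps "--entity-type" to the dest name "entity_type"
            List<String> types = ns.getList("entity_type");
            List<String> names = ns.getList("entity_name");
            List<String> configs = ns.getList("add_config");
            // Assuming each flag appears the same number of times, the i-th
            // entries of the lists belong together once ordering is enforced.
            for (int i = 0; i < configs.size(); i++) {
                System.out.printf("%s/%s -> %s%n", types.get(i), names.get(i), configs.get(i));
            }
        }
    }

One wrinkle: append() keeps a separate list per flag rather than the interleaved order, and --default-entity takes no value, so the lists can end up with different lengths; that case would probably need a custom action. --Proven]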
> > On Tue, Jan 24, 2023 at 12:22 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> >> On Fri, Jan 20, 2023, at 13:02, Proven Provenzano wrote:
> >> > Hi Colin,
> >> > Thanks for the response.
> >> >
> >> > I chose raw records, thinking it might be useful for future additions
> >> > of records that customers might want to add before the first start of
> >> > the cluster. I do see that it is at best an engineer-friendly interface.
> >> >
> >> > I do think kafka-storage is the correct place to put the logic for
> >> > adding records to the bootstrap.checkpoint file. I think keeping the
> >> > logic for managing the bootstrap separate from the logic of configuring
> >> > an existing cluster that is already running is a good division of
> >> > functionality, and I think this separation will reduce the parsing
> >> > logic significantly.
> >>
> >> Hi Proven,
> >>
> >> I suspect having the separation will increase the parsing logic you have
> >> to write rather than reduce it, since it will be more difficult to reuse
> >> the logic that is already in kafka-configs.sh.
> >>
> >> I do agree that having multiple commands modifying the bootstrap may be
> >> confusing, though. So maybe it's best to keep it all in the formatting
> >> tool, as you say.
> >>
> >> > The API suggestion you made for kafka-storage is okay. I would prefer
> >> > to have one option for an entire SCRAM config including the user, such
> >> > as the following:
> >>
> >> Hmm... I don't think having a single giant string for all the SCRAM
> >> configurations makes sense. At minimum, each SCRAM user should get its
> >> own string. Another reason for doing it this way is that this is how
> >> kafka-configs.sh works, and we don't want two different incompatible
> >> ways of specifying SCRAM users on the command line.
> >>
> >> > I think adding the Argparse4j support for reading the arguments from a
> >> > file is a must.
> >>
> >> Yeah, agreed.
> >>
> >> best,
> >> Colin
> >>
> >> >
> >> > --Proven
> >> >
> >> > On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe <cmcc...@apache.org> wrote:
> >> >
> >> >> Hi Proven,
> >> >>
> >> >> Thanks for putting this together.
> >> >>
> >> >> We always intended to have a way to bootstrap into using an all-SCRAM
> >> >> cluster, from scratch.
> >> >>
> >> >> I have two big comments here. First, I think we need a better interface
> >> >> than raw records. And second, I'm not sure that kafka-storage.sh is the
> >> >> right place to put this.
> >> >>
> >> >> I think raw records are going to be tough for people to use, because
> >> >> there are a lot of fields, and the values to set them to are not
> >> >> intuitive. For example, to set SHA512, the user needs to set "mechanism"
> >> >> equal to 2. That is going to be impossible to remember or figure out
> >> >> without looking at the source code. The other thing, of course, is that
> >> >> we may add more fields over time, including mandatory ones. So any
> >> >> documentation could quickly get out of date.
> >> >>
> >> >> I think people are going to want to specify SCRAM users here the same
> >> >> way they do when using the kafka-configs.sh tool. As a reminder, using
> >> >> kafka-configs.sh, they specify users like this:
> >> >>
> >> >> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
> >> >>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
> >> >>   --entity-type users \
> >> >>   --entity-name alice
> >> >>
> >> >> Of course, in this example, we're not specifying a salt. So we'd have
> >> >> to evaluate whether that's what we want for our use-case as well. On
> >> >> the plus side, specifying a salt could ensure that the bootstrap files
> >> >> end up identical on every node. On the minus side, it is another random
> >> >> number that users would need to generate and explicitly pass in.
> >> >>
> >> >> I would lean towards auto-generating the salt. I don't think the salt
> >> >> needs to be the same on all nodes. Only one controller will become
> >> >> active and write the bootstrap records to the log; no other controllers
> >> >> will do that. Brokers don't need to read the SCRAM records out of the
> >> >> bootstrap file.
> >> >>
> >> >> If we put all the functionality into kafka-storage.sh, it might look
> >> >> something like this:
> >> >>
> >> >> ./bin/kafka-storage.sh format \
> >> >>   --config [my-config-path] \
> >> >>   --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
> >> >>   --release-version 3.5-IV0 \
> >> >>   --scram-user alice \
> >> >>   --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
> >> >>   --scram-user bob \
> >> >>   --scram-config 'SCRAM-SHA-256=[password=bobpass]'
> >> >>
> >> >> (Here I am assuming that each --scram-user must be followed by exactly
> >> >> one --scram-config line.)
> >> >>
> >> >> Perhaps it's worth considering whether it would be better to add a mode
> >> >> to kafka-configs.sh where it appends to a bootstrap file.
> >> >>
> >> >> If we do put everything into kafka-storage.sh, we should consider the
> >> >> plight of people with low limits on the maximum length of their command
> >> >> lines. One fix for these people could be allowing them to read their
> >> >> arguments from a file, like this:
> >> >>
> >> >> $ ./bin/kafka-storage.sh @myfile
> >> >> $ cat myfile
> >> >> ./bin/kafka-storage.sh format \
> >> >>   --config [my-config-path] \
> >> >>   ...
> >> >> [etc, etc.]
> >> >>
> >> >> Argparse4j supports this natively with fromFilePrefix. See
> >> >> https://argparse4j.github.io/usage.html#fromfileprefix
> >> >>
> >> >> best,
> >> >> Colin
> >> >>
> >> >> On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
> >> >> > I have written a KIP describing the API additions needed to
> >> >> > kafka-storage to store SCRAM credentials at bootstrap time. Please
> >> >> > take a look at
> >> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
> >> >> >
> >> >> > --
> >> >> > --Proven
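[One more inline note on the salt question above: agreed that per-node salts are fine, given that only the active controller's bootstrap records ever reach the metadata log. A trivial sketch of auto-generation with the plain JDK (my own illustration, not what the KIP specifies):

    import java.security.SecureRandom;
    import java.util.Base64;

    public class SaltSketch {
        public static void main(String[] args) {
            // Each node can generate its own random salt; the salts do not
            // need to match across nodes because only the active controller
            // writes the bootstrap records to the log.
            byte[] salt = new byte[64];
            new SecureRandom().nextBytes(salt);
            System.out.println("salt=" + Base64.getEncoder().encodeToString(salt));
        }
    }

And +1 on fromFilePrefix; per the argparse4j usage page, wiring it up should be a one-liner along the lines of ArgumentParsers.newFor("kafka-storage").fromFilePrefix("@").build() (again, my sketch of the builder call, not tested). --Proven]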