On Fri, Jan 20, 2023, at 13:02, Proven Provenzano wrote:
> Hi Colin,
> Thanks for the response.
>
> I chose raw records, thinking it might be useful for future additions of
> records that customers might want to add before the first start of the
> cluster. I do see that it is at best an engineer friendly interface.
>
> I do think kafka-storage is the correct place to put the logic for adding
> records to the bootstrap.checkpoint file. I think keeping the logic for
> managing the bootstrap separate from the logic of configuring an existing
> cluster that is already running is a good division of functionality and I
> think this separation will reduce the parsing logic significantly.

Hi Proven,

I suspect having the separation will increase the parsing logic you have to 
write rather than reducing it, since it will be more difficult to reuse the 
logic that is already in kafka-configs.sh.

I do agree that having multiple commands modifying the bootstrap may be 
confusing, though. So maybe it's best to keep it all in the formatting tool, as 
you say.

>
> The API suggestion you made for kafka-storage is okay. I would prefer to
> have one option for an entire SCRAM config including the user, such as the
> following:
>

Hmm... I don't think having a single giant string for all the SCRAM 
configurations makes sense. At minimum each SCRAM user should get its own 
string. Another reason for doing it this way is that this is how 
kafka-configs.sh works, and we don't want two different incompatible ways of 
specifying SCRAM users on the command line.

> I think adding the Argparse4j support for reading the arguments from a file
> is a must.

Yeah, agreed.

best,
Colin

>
> --Proven
>
>
> On Thu, Jan 19, 2023 at 7:07 PM Colin McCabe <cmcc...@apache.org> wrote:
>
>> Hi Proven,
>>
>> Thanks for putting this together.
>>
>> We always intended to have a way to bootstrap into using an all-SCRAM
>> cluster, from scratch.
>>
>> I have two big comments here. First, I think we need a better interface
>> than raw records. And second, I'm not sure that kafka-storage.sh is the
>> right place to put this.
>>
>> I think raw records are going to be tough for people to use, because there
>> are a lot of fields, and the values to set them to are not intuitive. For
>> example, to set SHA512, the user needs to set "mechanism" equal to 2. That
>> is going to be impossible to remember or figure out without looking at the
>> source code. The other thing of course is that we may add more fields over
>> time, including mandatory ones. So any documentation could quickly get out
>> of date.
>>
>> I think people are going to want to specify SCRAM users here the same way
>> they do when using the kafka-configs.sh tool. As a reminder, using
>> kafka-configs.sh, they specify users like this:
>>
>> ./bin/kafka-configs --bootstrap-server localhost:9092 --alter \
>>   --add-config 'SCRAM-SHA-256=[iterations=8192,password=pass]' \
>>   --entity-type users \
>>   --entity-name alice
>>
>> Of course, in this example, we're not specifying a salt. So we'd have to
>> evaluate whether that's what we want for our use-case as well. On the plus
>> side, specifying a salt could ensure that the bootstrap files end up
>> identical on every node. On the minus side, it is another random number
>> that users would need to generate and explicitly pass in.
>>
>> I would lean towards auto-generating the salt. I don't think the salt
>> needs to be the same on all nodes. Only one controller will become active
>> and write the bootstrap records to the log; no other controllers will do
>> that. Brokers don't need to read the SCRAM records out of the bootstrap
>> file.
>>
>> If we put all the functionality into kafka-storage.sh, it might look
>> something like this:
>>
>> ./bin/kafka-storage.sh format \
>>   --config [my-config-path] \
>>   --cluster-id mb0Zz1YPTUeVzpedHHPT-Q \
>>   --release-version 3.5-IV0 \
>>   --scram-user alice \
>>   --scram-config 'SCRAM-SHA-256=[iterations=8192,password=alicepass]' \
>>   --scram-user bob \
>>   --scram-config 'SCRAM-SHA-256=[password=bobpass]'
>>
>> (Here I am assuming that each --scram-user must be followed by exactly on
>> --scram-config line)
>>
>> Perhaps it's worth considering whether it woudl be better to add a mode to
>> kafka-configs.sh where it appends to a bootstrap file.
>>
>> If we do put everything into kafka-storage.sh, we should consider the
>> plight of people with low limits on the maximum length of their command
>> lines. One fix for these people could be allowing them to read their
>> arguments from a file like this:
>>
>> $ ./bin/kafka-storage.sh @myfile
>> $ cat myfile:
>>   ./bin/kafka-storage.sh format \
>>     --config [my-config-path] \
>>   ...
>> [etc, etc.]
>>
>> Argparse4j supports this natively with fromFilePrefix. See
>> https://argparse4j.github.io/usage.html#fromfileprefix
>>
>> best,
>> Colin
>>
>>
>> On Thu, Jan 19, 2023, at 11:08, Proven Provenzano wrote:
>> > I have written a KIP describing the API additions needed to
>> > kafka-storage
>> > to store SCRAM
>> > credentials at bootstrap time. Please take a look at
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-900%3A+KRaft+kafka-storage.sh+API+additions+to+support+SCRAM+for+Kafka+Brokers
>> >
>> > --
>> > --Proven
>>

Reply via email to