Thanks @Takanobu for your reply. I will make it configurable and disabled by
default.
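
A minimal sketch of what "configurable and disabled by default" could look like. This is only an illustration, not the code in the PR: the class name `AuditAddressFormatter`, the `logRemotePort` flag, and the `ip:port` layout are assumptions made for this example.

```java
// Hypothetical helper: appends the client port to the audit log's IP field
// only when the feature is switched on. Not taken from the Hadoop code base.
class AuditAddressFormatter {
    // Assumed to be loaded from a boolean setting whose default is false,
    // so existing audit log parsers see no change unless operators opt in.
    private final boolean logRemotePort;

    AuditAddressFormatter(boolean logRemotePort) {
        this.logRemotePort = logRemotePort;
    }

    // Returns the value for the IP field: the address alone when the flag is
    // off (the existing format), or "address:port" when it is on.
    String format(String ip, int port) {
        if (!logRemotePort) {
            return ip;
        }
        return ip + ":" + port;
    }

    public static void main(String[] args) {
        AuditAddressFormatter disabled = new AuditAddressFormatter(false);
        AuditAddressFormatter enabled = new AuditAddressFormatter(true);
        System.out.println("ip=" + disabled.format("/10.0.0.1", 45678));
        System.out.println("ip=" + enabled.format("/10.0.0.1", 45678));
    }
}
```

With the flag off, the IP field is byte-for-byte what it is today, which is the point of keeping the feature opt-in.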

Takanobu Asanuma <tasan...@apache.org> wrote on Wed, Oct 13, 2021 at 3:34 PM:

> I think many users parse audit logs in their own way, and they will be
> affected if the format is changed. So I agree with Masatake's suggestion.
> - Takanobu
>
> tom lee <tomlees...@gmail.com> wrote on Mon, Oct 11, 2021 at 18:19:
>
>> Thanks @Masatake Iwasaki <iwasak...@oss.nttdata.co.jp> for your
>> suggestion. This is a good idea.
>>
>> Masatake Iwasaki <iwasak...@oss.nttdata.co.jp> wrote on Mon, Oct 11, 2021 at 3:26 PM:
>>
>> > > I am not sure whether we can directly go and change this. Any changes
>> > > to Audit Log format are considered incompatible.
>> > >
>> > > https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>> >
>> > Adding a field for caller context seemed to be accepted since it is an
>> > optional feature, disabled by default.
>> >
>> > https://github.com/apache/hadoop/blob/rel/release-3.3.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L8480-L8498
>> >
>> > If we need to add fields, making it optional might be an option.
>> >
>> > Masatake Iwasaki
>> >
>> > On 2021/10/11 16:09, tom lee wrote:
>> > > However, adding the port only modifies the content of the IP field,
>> > > which has little impact on the overall layout.
>> > >
>> > > In our cluster, we parse the audit log with Vector and send the data to
>> > > Kafka, and that pipeline is unaffected.
>> > >
>> > > tom lee <tomlees...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:
>> > >
>> > >> Thanks, Ayush, for reminding me. I have similar concerns, which is why I
>> > >> started this discussion: I hope to make the community aware of this
>> > >> matter and to get suggestions.
>> > >>
>> > >> Ayush Saxena <ayush...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
>> > >>
>> > >>> Hey
>> > >>> I am not sure whether we can directly go and change this. Any changes
>> > >>> to Audit Log format are considered incompatible.
>> > >>>
>> > >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>> > >>>
>> > >>> -Ayush
>> > >>>
>> > >>> On 10-Oct-2021, at 7:57 PM, tom lee <tomlees...@gmail.com> wrote:
>> > >>>
>> > >>> Hi all,
>> > >>>
>> > >>> In our production environment, we occasionally encounter a problem where
>> > >>> a user submits an abnormal computation task that triggers a sudden flood
>> > >>> of requests. The queueTime and processingTime of the NameNode then rise
>> > >>> sharply, leaving a large backlog of tasks.
>> > >>>
>> > >>> We usually locate and kill the specific Spark, Flink, or MapReduce tasks
>> > >>> based on metrics and audit logs. Currently, the IP and UGI are recorded in
>> > >>> audit logs, but there is no port information, so it is sometimes difficult
>> > >>> to locate the specific process. Therefore, I propose that we add the port
>> > >>> information to the audit log, so that we can easily track the upstream
>> > >>> process.
>> > >>>
>> > >>> Some other projects, such as HBase and Alluxio, already include port
>> > >>> information in their audit logs. I think it is also necessary to add port
>> > >>> information to the HDFS audit logs.
>> > >>>
>> > >>> I submitted a PR (https://github.com/apache/hadoop/pull/3538), which has
>> > >>> been tested in our test environment and takes effect for both RPC and
>> > >>> HTTP. I look forward to your discussion of possible problems and
>> > >>> suggestions for modification. I will actively update the PR.
>> > >>>
>> > >>> Best Regards,
>> > >>> Tom
>> > >>>
>> > >>>
>> > >
>> >
>>
>
