Thanks, Zihao, for driving the discussion and bumping the email.

LGTM +1 from my side.

Best,
Yuepeng Pan

zihao chen <[email protected]> wrote on Wed, Jun 10, 2026 at 14:07:

> Bumping this thread. Thanks!
>
> Best regards,
> Zihao
>
> zihao chen <[email protected]> wrote on Sat, May 23, 2026 at 18:39:
>
> > Hi all,
> >
> > Thanks everyone for the valuable feedback and discussions on this FLIP.
> >
> > Based on the discussion so far, the proposal has received generally
> > positive feedback, and several important points have been clarified,
> > including:
> >
> >    - ArchiveStorage API design considerations
> >    - RocksDB deployment model and isolation between HistoryServer
> >    instances
> >    - Cleanup and retention strategy compatibility with existing
> >    mechanisms
> >
> >
> > Besides, the earlier related discussion can be found here:
> > https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
> >
> > If there are no further major concerns, I’m planning to start the VOTE
> > thread next Tuesday.
> >
> > Please feel free to share any additional feedback before then.
> >
> > Best regards,
> > Zihao
> >
> > zihao chen <[email protected]> wrote on Tue, May 19, 2026 at 21:05:
> >
> >> Hi Zuo,
> >>
> >> Thanks for your feedback and for aligning in this direction.
> >>
> >> Here are the clarifications regarding your questions:
> >>
> >>    - *RocksDB Deployment*:
> >>
> >> Each RocksDB instance is coupled with a single HistoryServer instance
> >> (each HistoryServer has its own independent local RocksDB). There is no
> >> shared access between multiple HistoryServer instances.
> >>
> >>
> >>    - *Cleanup Strategy*:
> >>
> >> The core cleanup still relies on the original ArchiveRetainedStrategy
> >> (max job counts, TTL, etc.). We have also implemented a
> >> disk-capacity-based cleanup strategy in our internal practice to prevent
> >> disk exhaustion, but that feature is relatively independent, so I would
> >> like to decouple it for now and discuss it further in a follow-up FLIP.
> >>
> >>
> >> Let me know if this looks good to you!
> >>
> >>
> >> Best regards,
> >>
> >> Zihao
> >>
> >>
> >> 魏祚 <[email protected]> wrote on Tue, May 19, 2026 at 17:33:
> >>
> >>>
> >>>
> >>> Hi Zihao,
> >>>
> >>>
> >>> Thanks for your proposal. The excessive small files problem of
> >>> HistoryServer is indeed a real pain point in large-scale production
> >>> environments, and introducing RocksDB is a great idea.
> >>> There are a few details I'd like to clarify:
> >>> 1. What is the deployment strategy for RocksDB? Is there a scenario where
> >>> multiple HistoryServer instances share and access the same RocksDB
> >>> instance? If so, are there any potential compatibility or concurrency
> >>> risks?
> >>> 2. After introducing RocksDB, what is the strategy for cleaning up
> >>> historical garbage files and expired job archives?
> >>>
> >>>
> >>> Best regards,
> >>> Zuo Wei
> >>>
> >>>
> >>> ----- Original Message -----
> >>> From: "zihao chen" <[email protected]>
> >>> To: [email protected]
> >>> Sent: Sat, 9 May 2026 11:37:08 +0800
> >>> Subject: [DISCUSS] FLIP-XXX: Support Pluggable Storage Backend for
> >>> HistoryServer
> >>>
> >>> Hi all,
> >>>
> >>> I’d like to start a discussion on FLIP-XXX:
> >>>
> >>> *Support Pluggable Storage Backend for HistoryServer*.
> >>>
> >>> This FLIP proposes improving the HistoryServer
> >>> to address excessive *small files* when handling
> >>> large numbers of archived jobs.
> >>>
> >>> [Proposal]
> >>> Optional *RocksDB-based storage* to reduce
> >>> small files
> >>>
> >>> [Compatibility]
> >>> Full backward compatibility (FILE as default)
> >>>
> >>> The detailed design is described in the
> >>> FLIP document:
> >>>
> >>>
> >>>
> >>> https://docs.google.com/document/d/1idHu5bq0GOsUuUAEIJSJ2UuekcDjbW0tHLNbsQfugDg/edit?usp=sharing
> >>>
> >>> This FLIP is split from the earlier discussion [1].
> >>>
> >>> Looking forward to your feedback.
> >>>
> >>> [1] https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
> >>>
> >>>
> >>> Best regards,
> >>>
> >>> Zihao Chen
> >>>
> >>
>
