Hi Zihao,

Thanks for driving this — the pluggable ArchiveStorage abstraction looks like a 
clean fit for the HistoryServer. I have two questions on the ArchiveStorage 
interface that I'd like to understand better before the vote:

Asymmetric value type between read and write. get / getByPrefix return the 
generic type T, but put hard-codes the value to String:

```
T get(String key);
void put(String key, String archiveContent);
```

Could you share the rationale? Making the write side symmetric (either also T, 
or unifying both sides on byte[] / InputStream) would feel more consistent and 
avoid forcing every backend to materialize the archive as a String. Is there a 
specific reason String was chosen for put?
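
To make the symmetric option concrete, here is a rough sketch of what I have in mind. The names SymmetricArchiveStorage and InMemoryByteArchiveStorage are hypothetical and only for illustration; they are not taken from the FLIP document, and the in-memory map is just a stand-in for a real backend.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical variant of the interface: read and write share the same type T.
interface SymmetricArchiveStorage<T> {
    T get(String key);

    void put(String key, T value);
}

// Illustrative backend that stores raw bytes, so no String is ever materialized.
class InMemoryByteArchiveStorage implements SymmetricArchiveStorage<byte[]> {
    private final Map<String, byte[]> entries = new ConcurrentHashMap<>();

    @Override
    public byte[] get(String key) {
        byte[] value = entries.get(key);
        // Return a defensive copy so callers cannot mutate stored content.
        return value == null ? null : value.clone();
    }

    @Override
    public void put(String key, byte[] value) {
        entries.put(key, value.clone());
    }
}
```

A caller that really wants a String can still decode at the edge, e.g. new String(storage.get(key), StandardCharsets.UTF_8), while the backends stay free to keep bytes as they are.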

OOM risk of getByPrefix returning List<T>. In production a single prefix (e.g. 
all entries under one job, or under /jobs/) can easily expand to thousands of 
entries with non-trivial JSON payloads. Returning a fully materialized List<T> 
means the whole result set is loaded into heap at once, which I'm worried could 
cause OOM on busy HistoryServers.

Have you considered exposing it as Iterator<T> / CloseableIterator<T> (or a 
Stream<T>) instead? It maps very naturally to RocksDB's prefix iterator, and 
FileArchiveStorage can implement it lazily as well. If there's a concrete call 
site that really needs the full list, it can always do Lists.newArrayList(iter) 
locally.
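
As a rough sketch of the lazy variant: the CloseableIterator interface and the SortedMapArchiveStorage class below are hypothetical names I made up for illustration, and a TreeMap stands in for RocksDB's ordered key space. It is not the FLIP's API, only an indication of the shape I am suggesting.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical interface, declared here so the sketch is self-contained.
interface CloseableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close();
}

// Illustrative backend: a sorted map plays the role of an ordered key-value store.
class SortedMapArchiveStorage {
    private final NavigableMap<String, String> entries = new TreeMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    CloseableIterator<String> getByPrefix(String prefix) {
        // Seek to the first key >= prefix, then walk forward one entry at a time.
        Iterator<Map.Entry<String, String>> cursor =
                entries.tailMap(prefix, true).entrySet().iterator();
        return new CloseableIterator<String>() {
            private Map.Entry<String, String> pending;
            private boolean done;

            @Override
            public boolean hasNext() {
                if (pending == null && !done) {
                    if (cursor.hasNext()) {
                        Map.Entry<String, String> entry = cursor.next();
                        // Keys sharing a prefix are contiguous, so stop at the first mismatch.
                        if (entry.getKey().startsWith(prefix)) {
                            pending = entry;
                        } else {
                            done = true;
                        }
                    } else {
                        done = true;
                    }
                }
                return pending != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String value = pending.getValue();
                pending = null;
                return value;
            }

            @Override
            public void close() {
                // A real backend would release its native iterator or file handles here.
                done = true;
                pending = null;
            }
        };
    }
}
```

Callers would consume it with try-with-resources, so only one entry needs to be held at a time and the underlying resources are released even if the handler fails midway.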

Other than these two points, +1 from me on the overall direction.

Best, Verne

On 2026/05/09 03:37:08 zihao chen wrote:
> Hi all,
> 
> I’d like to start a discussion on FLIP-XXX:
> 
> *Support Pluggable Storage Backend for HistoryServer*.
> 
> This FLIP proposes improving the HistoryServer
> to address excessive *small files* when handling
> large numbers of archived jobs.
> 
> [Proposal]
> Optional *RocksDB-based storage* to reduce
> small files
> 
> [Compatibility]
> Full backward compatibility (FILE as default)
> 
> The detailed design is described in the
> FLIP document:
> 
> https://docs.google.com/document/d/1idHu5bq0GOsUuUAEIJSJ2UuekcDjbW0tHLNbsQfugDg/edit?usp=sharing
> 
> This FLIP is split from the earlier discussion [1].
> 
> Looking forward to your feedback.
> 
> [1] https://lists.apache.org/thread/6thlq9c5twyvzmcw7q24nm4q0rcbz5qp
> 
> 
> Best regards,
> 
> Zihao Chen
> 
