What exactly are you looking for? To my knowledge, neither Capacitor nor Artus has been described in enough detail outside Google to allow external benchmarking, so the details would probably only be relevant to Google.
Both formats have more complicated encodings and embedded data structures, making them closer to Parquet (which is loosely based on a precursor to Capacitor) and ORC than to Arrow. There are interesting ideas in the Procella paper, which covers Artus, that might be worth thinking about in the context of these formats (or a new one). Arrow has not spent much focus on optimizing storage size.

Cheers,
Micah

On Wednesday, December 22, 2021, Benson Muite <benson_mu...@emailplus.org> wrote:

> On 12/23/21 7:14 AM, Hayden Livingston wrote:
>
>> Has anyone been able to benchmark the Artus file format vs Arrow?
>>
>> It seems that the Artus file format is gaining traction inside Google,
>> replacing their current columnar format Capacitor.
>>
> Hayden,
> Do you have a link to a specification or implementation of Artus?
> Performance may also be related to disk type, network etc.