be good for a benchmark to be targetable at cloud storage; local stores, especially those with SSDs, hide a lot of the costs of data lakes.

On Tue, 28 May 2024 at 07:17, Micah Kornfield <emkornfi...@gmail.com> wrote:

> As a follow-up to the "V3" Discussions [1][2] I wanted to start a thread on
> improvements to encodings.
>
> There are several areas to pursue here:
> 1. Curating a standard set of benchmarks and criteria for determining if a
>    new encoding is worth adding.
> 2. Developing new encodings
> 3. Better implementations to select existing encodings.
> 4. Better support for encodings with point/indexed lookups.
> 5. Benchmarking frameworks that allow assessing trade-off of encodings on
>    storage systems with different latency/throughput.
>
> Realistically, given my current commitments, I don't think I have bandwidth
> to help with this track in the near term. If someone else would like to
> help drive this and make concrete proposals in these areas it would be
> greatly appreciated.
>
> Thanks,
> Micah
>
> [1] https://lists.apache.org/thread/5jyhzkwyrjk9z52g0b49g31ygnz73gxo
> [2] https://docs.google.com/document/d/19hQLYcU5_r5nJB7GtnjfODLlSDiNS24GXAtKg9b0_ls/edit