Re: Status of Arrow conbench data + conbench OSS project

Rok Mihevc Mon, 22 Jun 2026 14:50:53 -0700

> > I see buildkite scheduling is removed
>
> I think this is mostly accidental, and can be easily restored


Removing buildkite is likely a good development in this case.
How is scheduling done then?


> Can you show me where the Arrow conbench suite lives so I can test /
> harden a single CI benchmark run / insert? That sounds like the next
> step, to validate things end to end and identify exactly what changes
> would be needed in the infrastructure.

Benchmarks are here https://github.com/arctosalliance/benchmarks
and here https://github.com/arctosalliance/arrowbench.


> I think the migration path would look like:
>
> 1) Start the new server in read-only mode in parallel against RDS, get
> comfortable with new UI and fill any accidental feature gaps
> 2) If step 1 seems okay and benchmark insertion runs work seamlessly
> against the new API endpoint, switch over
> 3) Explore alternative storage architectures, e.g. incrementally
> migrate historical data to Parquet while keeping recent data (e.g. the
> last 30 days worth of benchmark runs) in Postgres

I'd propose something like:
1. Splice the new Go CLI tool into the current runner so data goes to both
RDS and S3.
2. Stand up the new application with alerting and get comfortable with it.
3. Replace the runner with the Go CLI one and decommission Postgres and the
old app.
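
To make step 1 concrete, here is a minimal sketch of a dual-write, assuming
a hypothetical `Sink` interface with one implementation per destination. The
`Result` type, `memSink`, and `DualWrite` are illustrative names only, not
part of conbench or the new Go CLI; the in-memory sinks stand in for RDS and
S3 so the sketch runs without AWS access.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a hypothetical, simplified benchmark result record.
type Result struct {
	RunID string
	Name  string
	Value float64
}

// Sink is a hypothetical destination for results (for example RDS or S3).
type Sink interface {
	Write(r Result) error
}

// memSink is an in-memory stand-in for a real destination.
type memSink struct {
	name    string
	fail    bool
	written []Result
}

// Write stores the result, or returns an error if the sink is set to fail.
func (m *memSink) Write(r Result) error {
	if m.fail {
		return fmt.Errorf("%s: write failed", m.name)
	}
	m.written = append(m.written, r)
	return nil
}

// DualWrite sends r to every sink and returns all failures joined together,
// so a failure in one destination does not prevent writes to the others.
func DualWrite(r Result, sinks ...Sink) error {
	var errs []error
	for _, s := range sinks {
		if err := s.Write(r); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...) // nil when every write succeeded
}

func main() {
	rds := &memSink{name: "rds"}
	s3 := &memSink{name: "s3", fail: true} // simulate the new path failing
	err := DualWrite(Result{RunID: "run-1", Name: "example", Value: 1.5}, rds, s3)
	fmt.Println(len(rds.written), len(s3.written), err)
}
```

Writing to every sink before reporting errors means the existing RDS path
keeps receiving data while the S3 path is still being hardened.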


> > On RDS we are seeing a performance bottleneck due to large tables
> > and suboptimal queries, so local postgres performance might not indicate
> > the actual RDS postgres performance.
>
> On this, since many of the Postgres queries were rewritten by gpt-5.5
> from first principles during the reimplementation process, it's
> possible that some things got faster by accident. Go vs. Python may
> also help. If there are particular workflows that are currently slow
> that I can focus on I can take a closer look.

What a time to be alive. See https://github.com/conbench/conbench/pull/1614
for the query in question.


> Note that the Go server binary can be easily wrapped in a Python wheel
> and deployed on PyPI also (that's how `pip install agentsview` works —
> wrapped Go binary)

Could be useful, but do we need it if the Go CLI tool is self-sufficient?


> I have a number of projects whose performance I need to continuously
> monitor, so I'd be willing to develop and maintain this new project
> with AI assistance — building this new implementation required
> probably 8 hours worth of my attention and the rest of the work was
> conducted through spec-driven development (with roborev for quality
> assurance), about $2700 worth of tokens (which all came out of a Codex
> subscription).

Great to hear about the project maintenance! Does that include deployment
maintenance? I'd be happy to step back from maintaining the current
deployment in that case.

Thinking out loud: would it make sense to start benchmarking on one of your
new projects first and work out the issues there before porting Arrow? That
would give maximum speed and flexibility. Arrow's benchmark suite is also
quite cumbersome and wouldn't be fun to prototype with; we currently burn
~1500 USD/month on AWS runners.


Rok

On Mon, Jun 22, 2026 at 10:40 PM Wes McKinney <[email protected]> wrote:

> > I see buildkite scheduling is removed
>
> I think this is mostly accidental, and can be easily restored
>
> Can you show me where the Arrow conbench suite lives so I can test /
> harden a single CI benchmark run / insert? That sounds like the next
> step, to validate things end to end and identify exactly what changes
> would be needed in the infrastructure.
>
> I think the migration path would look like:
>
> 1) Start the new server in read-only mode in parallel against RDS, get
> comfortable with new UI and fill any accidental feature gaps
> 2) If step 1 seems okay and benchmark insertion runs work seamlessly
> against the new API endpoint, switch over
> 3) Explore alternative storage architectures, e.g. incrementally
> migrate historical data to Parquet while keeping recent data (e.g. the
> last 30 days worth of benchmark runs) in Postgres
>
> > On RDS we are seeing a performance bottleneck due to large tables
> > and suboptimal queries, so local postgres performance might not indicate
> > the actual RDS postgres performance.
>
> On this, since many of the Postgres queries were rewritten by gpt-5.5
> from first principles during the reimplementation process, it's
> possible that some things got faster by accident. Go vs. Python may
> also help. If there are particular workflows that are currently slow
> that I can focus on I can take a closer look.
>
> Note that the Go server binary can be easily wrapped in a Python wheel
> and deployed on PyPI also (that's how `pip install agentsview` works —
> wrapped Go binary)
>
> I have a number of projects whose performance I need to continuously
> monitor, so I'd be willing to develop and maintain this new project
> with AI assistance — building this new implementation required
> probably 8 hours worth of my attention and the rest of the work was
> conducted through spec-driven development (with roborev for quality
> assurance), about $2700 worth of tokens (which all came out of a Codex
> subscription).
>
> Thanks,
> Wes
>
> On Mon, Jun 22, 2026 at 3:22 PM Rok Mihevc <[email protected]> wrote:
> >
> > Hi Wes,
> >
> > > The general idea is that there is a new backend in Go and a single-page
> > > Typescript frontend application.
> >
> > Single tool approach seems preferable over the current implementation for
> > simplicity etc. I see buildkite scheduling is removed; does the core app
> > now handle scheduling? I assume the DB schema and core algorithms are
> > kept. This moves in the right direction.
> >
> > > My general sense is that at least for result data storage, using e.g.
> > > DuckDB + Parquet files on object storage might be a viable alternative to
> > > Postgres to tinker with, but the performance of this application against
> > > a local Postgres database seems acceptable.
> >
> > I'm curious about postgres performance - are you using current queries or
> > new ones? On RDS we are seeing a performance bottleneck due to large
> > tables and suboptimal queries, so local postgres performance might not
> > indicate the actual RDS postgres performance.
> > That said I believe parquet files on object storage would be preferable
> > anyway. Ideally, the conbench result viewer should (IMO) only be a view
> > layer over precomputed data generated at benchmark or compaction time.
> >
> > > I am not expecting this PR to be accepted necessarily but mostly using
> > > it to see if there is interest in going down this path! I am also okay to
> > > create a new git repository and do green field development, or let it
> > > live in a branch that can get developed and experimented with for a
> > > period of time (e.g. it seems like part of this project would be proving
> > > out getting the Arrow benchmark runners to upload their results using the
> > > new HTTP API or Go CLI).
> >
> > My question is mainly - Does this proposal move Arrow's conbench instance
> > to a more maintainable state?
> >
> > Having migrated and maintained Arrow's current conbench deployment, I
> > would love to see it simplified for easier maintenance and I'm happy to
> > help move our deployment to a better state.
> > That said do you intend to maintain conbench going forward? Who do you
> > see maintaining Arrow's conbench deployment?
> >
> > Rok
> >
> > On Sun, Jun 21, 2026 at 9:28 PM Wes McKinney <[email protected]> wrote:
> >
> > > I had an opportunity to grind on this for a couple weeks with spare
> > > coding agent cycles and produced this pull request, which I've stood
> > > up successfully against the clone of the prod Postgres database that
> > > Rok shared with me. The general idea is that there is a new backend in
> > > Go and a single-page Typescript frontend application.
> > >
> > > My general sense is that at least for result data storage, using e.g.
> > > DuckDB + Parquet files on object storage might be a viable alternative
> > > to Postgres to tinker with, but the performance of this application
> > > against a local Postgres database seems acceptable.
> > >
> > > https://github.com/conbench/conbench/pull/1619
> > >
> > > I stood up a documentation site here:
> > >
> > > https://wesm.github.io/conbench-tmp/
> > >
> > > Example screenshots:
> > >
> > > https://wesm.github.io/conbench-tmp/dashboard-screenshots/
> > >
> > > I am not expecting this PR to be accepted necessarily but mostly using
> > > it to see if there is interest in going down this path! I am also okay
> > > to create a new git repository and do green field development, or let
> > > it live in a branch that can get developed and experimented with for a
> > > period of time (e.g. it seems like part of this project would be
> > > proving out getting the Arrow benchmark runners to upload their
> > > results using the new HTTP API or Go CLI).
> > >
> > > Thanks,
> > > Wes
> > >
> > > On Thu, Jun 4, 2026 at 9:26 PM Jonathan Keane <[email protected]>
> wrote:
> > > >
> > > > Thanks for this. I also don’t know of anyone else using conbench
> > > currently, but I am happy to hear that you are interested in (having
> your
> > > agents) work on it.
> > > >
> > > > -Jon
> > > >
> > > > > On Jun 4, 2026, at 16:45, Rok Mihevc <[email protected]> wrote:
> > > > >
> > > > > Will do! Looking forward to hearing what you're looking to achieve.
> > > > >
> > > > > Perhaps we can use the attention here (if we got any) to discuss
> what
> > > Arrow
> > > > > needs/wants in terms of benchmarking functionality.
> > > > >
> > > > >
> > > > > On Thu, Jun 4, 2026 at 9:55 PM Wes McKinney <[email protected]>
> > > wrote:
> > > > >
> > > > >> Okay sounds good! I’ll start a v2 branch and do some work on it.
> Can
> > > you
> > > > >> contact me offline and let me know how I can get a copy of the
> prod
> > > > >> database to help with development and test migrations?
> > > > >>
> > > > >> Thanks
> > > > >> Wes
> > > > >>
> > > > >> On Thu, Jun 4, 2026 at 12:57 Rok Mihevc <[email protected]>
> wrote:
> > > > >>
> > > > >>> As per my discussion with the Velox team in March, they weren't
> using
> > > > >>> Conbench at the time and had no plans to migrate back.
> > > > >>>
> > > > >>> I never reported on the Conbench/Voltron migration to the mailing
> > > list,
> > > > >> so
> > > > >>> this is a good opportunity to do so:
> > > > >>> - In November, it became clear Voltron Data would shut down its
> CI
> > > and
> > > > >>> benchmarking infrastructure
> > > > >>> - Using the donated AWS credits [1] and with the help of Mike
> Wendt I
> > > > >> moved
> > > > >>> conbench infra to the new AWS account and backfilled benchmarks
> to
> > > cover
> > > > >>> the missing commits
> > > > >>> - Raul and I moved CUDA runners to the new AWS account
> > > > >>> - Moved doc-preview, Parquet benchmarking/testing datasets, the
> > > nightly
> > > > >>> dashboard, etc. to S3 with help from Nic, Jacob, and Raul
> > > > >>> - Conbench gradually stabilized, but benchmarks occasionally fail
> > > due to
> > > > >>> dependency issues. We don't really track these, but Nic and I
> > > typically
> > > > >> fix
> > > > >>> them.
> > > > >>> - The database design still makes the web views slow, and we
> > > discussed
> > > > > >>> dropping some history to speed them up
> > > > >>> - AWS benchmark runner-minutes consume nearly the entire donated
> AWS
> > > > >> quota.
> > > > >>> Discussion on how to optimize this and what exactly we need from
> > > > >> benchmarks
> > > > >>> as a project occasionally comes up on Zulip, and I expect that
> once
> > > it
> > > > >>> crystallizes we'll bring it to the ML as well.
> > > > >>> - Things are now stable enough that we can discuss redesigning
> > > toward a
> > > > >>> more stable and sustainable setup
> > > > >>>
> > > > >>> Best,
> > > > >>> Rok
> > > > >>>
> > > > >>> [1]
> https://lists.apache.org/thread/q33oofy2v3zpg9s9l8o0w68rmjr3ocsv
> > > > >>>
> > > > >>> On Thu, Jun 4, 2026 at 5:10 PM Jacob Wujciak <
> [email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> +1 improving conbench and the orchestration would free up aws
> > > credits
> > > > >> and
> > > > > >>>> more importantly the time taken to fix outages!
> > > > >>>>
> > > > >>>> Velox used to use conbench but through a VD provided instance. I
> > > > >> haven't
> > > > >>>> seen any movement to set up an independent instance.
> > > > >>>>
> > > > >>>>
> > > > >>>> Rok Mihevc <[email protected]> schrieb am Do., 4. Juni 2026,
> > > 16:38:
> > > > >>>>
> > > > >>>>> Hi Wes,
> > > > >>>>>
> > > > >>>>> Conbench is now at conbench.arrow-dev.org. I forked all repos
> to
> > > > >>>>> arctosalliance github org in case VD deleted them on the way
> out.
> > > > >>>>> Historical RDS db is preserved in the AWS account we migrated
> > > Arrow's
> > > > >>> CI
> > > > >>>>> infra to. It's supported by donated AWS credits.
> > > > >>>>> For orchestration buildkite.com/apache-arrow is still used
> and I
> > > > >> even
> > > > >>>> got
> > > > >>>>> the old M1 mac mini from Mike. It's now in my rack running
> > > benchmarks
> > > > >>>>> again.
> > > > >>>>>
> > > > >>>>> That said current design of the orchestration layer, api
> server and
> > > > >> db
> > > > >>> is
> > > > >>>>> somewhat costly and brittle. Downtime and adhoc fixes are
> common
> > > and
> > > > >> I
> > > > >>>> was
> > > > >>>>> thinking about refactoring it somewhat to make it more
> > > maintainable.
> > > > >>>>>
> > > > >>>>> 1) If things become more maintainable and long term stable
> that'd
> > > be
> > > > >>>> great.
> > > > >>>>> I'm happy to collaborate on this.
> > > > >>>>> 2) I'm only aware of arrow and arrow-go currently using it.
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>> Rok
> > > > >>>>>
> > > > >>>>> On Thu, Jun 4, 2026 at 4:20 PM Wes McKinney <
> [email protected]>
> > > > >>> wrote:
> > > > >>>>>
> > > > >>>>>> hi all,
> > > > >>>>>>
> > > > >>>>>> I saw that conbench.ursa.dev has been down and I had a need
> to
> > > set
> > > > >>> up
> > > > >>>>>> some continuous project benchmarks, and was interested in
> doing
> > > > >>>>>> development on Conbench (well, having my agents do
> development on
> > > > >>>>>> Conbench), and was interested in the following:
> > > > >>>>>>
> > > > >>>>>> 1) is there interest in migrating the historical Arrow
> conbench
> > > > >> data
> > > > >>>>>> to a new server, has that been preserved somewhere? I'll
> probably
> > > > >>>>>> rewrite the conbench backend in Go and give it a client CLI
> for
> > > > >>>>>> submitting new data or querying old data.
> > > > >>>>>>
> > > > >>>>>> 2) are there other users of conbench (conbench/conbench) that
> > > > >> anyone
> > > > >>>>>> is aware of? I'd be done doing in-situ development in that
> > > > >> repository
> > > > >>>>>> or setting up a conbench-v2 project.
> > > > >>>>>>
> > > > >>>>>> No particular urgency but if anyone has opinions let me know!
> > > > >>>>>>
> > > > >>>>>> thanks,
> > > > >>>>>> Wes
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
>
