Hiya Ziang,
This is a neat project, thanks for sharing!
One comment about the expression DSL you have created, for example the
"SELECT, FILTER, LIMIT" expression below:
Example SDF Specific Payload:

    {
      "id": "dacp://10.0.0.1/weather_db/sensors",
      "actions": [
        ["filter", {"expression": "temperature > 25.0"}],
        ["select", {"columns": ["location", "temperature"]}],
        ["limit", {"n": 100}]
      ]
    }

Are you aware of the "Substrait" project, which attempts to standardize
Relational Algebra expressions for sending over the wire between compute
engines?
- substrait-io/substrait: A cross platform way to express data
  transformation, relational algebra, standardized record expression and
  plans. <https://github.com/substrait-io/substrait>
- Home - Substrait: Cross-Language Serialization for Relational Algebra
  <https://substrait.io/>
It seems like DACP shares some of the same goals, but with the addition of
access policies for data rather than just data expressions + compute?
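
To make the overlap a bit more concrete, here is a rough Python sketch of how
the action list above could be folded into a nested relational plan. The
function name `to_plan` and the dict layout are hypothetical and exist only
for illustration. The relation names (read, filter, project, fetch) are loosely
modeled on Substrait's ReadRel, FilterRel, ProjectRel and FetchRel, but this is
not Substrait's actual serialization, and it is not how DACP itself processes
the payload.

```python
import json

# The example payload from the DACP proposal, unchanged.
payload = json.loads("""
{
  "id": "dacp://10.0.0.1/weather_db/sensors",
  "actions": [
    ["filter", {"expression": "temperature > 25.0"}],
    ["select", {"columns": ["location", "temperature"]}],
    ["limit", {"n": 100}]
  ]
}
""")


def to_plan(payload):
    """Fold an ordered action list into a nested plan (hypothetical layout).

    Each action wraps the plan built so far, so the last action ends up
    as the root of the tree and the data source is the innermost leaf.
    """
    plan = {"rel": "read", "source": payload["id"]}
    for name, args in payload["actions"]:
        if name == "filter":
            # The predicate stays an opaque string here; a real plan format
            # would represent it as a structured expression tree.
            plan = {"rel": "filter", "condition": args["expression"], "input": plan}
        elif name == "select":
            plan = {"rel": "project", "columns": args["columns"], "input": plan}
        elif name == "limit":
            plan = {"rel": "fetch", "offset": 0, "count": args["n"], "input": plan}
        else:
            raise ValueError(f"unsupported action: {name}")
    return plan


plan = to_plan(payload)
print(json.dumps(plan, indent=2))
```

The main gap this highlights is the filter predicate: in the DSL it is a
string that each server has to parse, whereas a standardized plan format would
carry it as a typed expression that any engine can consume directly.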
On Mon, Nov 24, 2025 at 5:42 AM 周子昂 <[email protected]> wrote:
> Hi Apache Arrow Community,
>
>
> I'm Ziang Zhou from CNIC, Chinese Academy of Sciences. On behalf of my
> team, I'd like to share a proposal for DACP (Data Access and Collaboration
> Protocol), a protocol built on Apache Arrow Flight, and to discuss
> potential integration with the Arrow ecosystem.
>
>
> ### 1. Background of DACP
> DACP is designed for cross-node, cross-process data access in scientific
> and distributed computing environments. It addresses pain points like
> fragmented data sharing, lack of collaboration support, and inefficient
> streaming in existing solutions.
>
>
> ### 2. Relationship with Apache Arrow
> DACP is tightly integrated with Apache Arrow Flight:
> - Uses Arrow Flight as the underlying RPC layer for zero-copy, columnar
> data transfer;
> - Reuses Arrow's in-memory format for SDF (Streaming DataFrame), ensuring
> interoperability with other Arrow-enabled systems;
> - Extends Flight with high-level features like dataset catalog management,
> end-to-end provenance tracking, and secure collaboration.
>
>
> ### 3. Current Status
> - Project repo: https://github.com/rdcn-link/dftp-dacp
> - IETF draft: https://datatracker.ietf.org/doc/draft-shenzhihong-dacp/
> - Has been tested for multi-node data sharing on scientific computing
> clusters at the Institute of Atmospheric Physics, CAS
>
>
> ### 4. Collaboration Request
> We hope to:
> 1. Get technical feedback from the Arrow community on DACP's design
> (especially compatibility with Arrow Flight);
> 2. Discuss the possibility of listing DACP as an official Arrow ecosystem
> extension;
> 3. Explore potential collaboration on protocol optimization (e.g.,
> aligning SDF with Arrow's data model).
>
>
> We've already submitted a PR to add DACP to the "Powered By Apache Arrow"
> list (PR link: https://github.com/apache/arrow-site/pull/728), and look
> forward to your valuable comments.
>
>
> Thank you for your time!
>
>
> Best regards,
> Ziang Zhou
> CNIC, Chinese Academy of Sciences
> Email: [email protected]
> Project Repo: https://github.com/rdcn-link/dftp-dacp
>