Hi folks,

I have updated our Roadmap wiki page with new items that could add real
value to the whole project. With them, ManifoldCF could become an
AI-ready data ingestion hub:
https://cwiki.apache.org/confluence/display/CONNECTORS/Roadmap

The discussion is open: please share your feedback, and feel free to
raise your hand if you would like to take ownership of any of the tasks
summarized below:

Core Performance & Modernization (The Java 21 Leap)
The transition to OpenJDK 21 is the foundation for a more scalable and
responsive architecture:

- Virtual Threads Integration (Project Loom)
- REST API v2 & OpenAPI Specification
- Enable "Configuration-as-Code" to support modern DevOps workflows
- Observability with OpenTelemetry
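
To make the Virtual Threads item a bit more concrete, here is a minimal
sketch of the idea using the standard Java 21 API. It is not ManifoldCF
code: the class name and the fetch() method are hypothetical stand-ins for
a blocking repository call, and the real worker threads may need a very
different integration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not part of the ManifoldCF code base.
class VirtualThreadFetchSketch {

    // Stand-in for a blocking repository call (assumption for illustration).
    static String fetch(String documentId) throws InterruptedException {
        Thread.sleep(10); // simulates blocking I/O
        return "content-of-" + documentId;
    }

    // Starts one virtual thread per document and returns results in input order.
    static List<String> fetchAll(List<String> documentIds) throws Exception {
        // The executor is AutoCloseable; close() waits for submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = new ArrayList<>();
            for (String id : documentIds) {
                futures.add(executor.submit(() -> fetch(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        }
    }
}
```

The point of the sketch is that blocking calls stay written in plain
sequential style, while the JVM parks the virtual thread instead of holding
a platform thread during I/O.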

AI & Vector Ecosystem Integration (RAG-Readiness)
Positioning ManifoldCF as the primary "ingestion engine" for
Retrieval-Augmented Generation (RAG) and LLM applications:

- Universal Embedding Transformation Connector (High Priority):
enable in-flight embedding generation (converting text to vectors)
directly within the MCF pipeline, with no API usage fees, by running local
open-source models (e.g., BGE-M3, Nomic).
- Native Vector Store Output Connectors: Solr Dense Vector, Milvus, Qdrant,
and Weaviate
- Develop a specialized pgvector connector for users leveraging PostgreSQL
as a unified metadata and vector store (High Priority)

Advanced Metadata & ACL Mapping for AI
- Ensure that security permissions (ACLs) are seamlessly passed to vector
stores as "payload" filters to maintain document security in AI search
interfaces.
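
As a rough illustration of the ACL idea, the sketch below stores allow and
deny tokens next to each vector and applies them as a filter at query time.
Everything in it is an assumption made for the example: the VectorPoint
record, the "allow_tokens" / "deny_tokens" payload keys, and the in-memory
check, which stands in for a filter that a real vector store would evaluate
server-side.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not an existing ManifoldCF or vector store API.
class AclPayloadSketch {

    // A vector plus its payload; the payload carries the access tokens.
    record VectorPoint(String id, float[] vector, Map<String, Set<String>> payload) {}

    // Attaches the document's allow and deny tokens to the point's payload.
    static VectorPoint withAcl(String id, float[] vector,
                               Set<String> allowTokens, Set<String> denyTokens) {
        return new VectorPoint(id, vector, Map.of(
                "allow_tokens", Set.copyOf(allowTokens),
                "deny_tokens", Set.copyOf(denyTokens)));
    }

    // Visible only if no deny token matches and at least one allow token matches.
    static boolean isVisible(VectorPoint point, Set<String> userTokens) {
        Set<String> allow = point.payload().get("allow_tokens");
        Set<String> deny = point.payload().get("deny_tokens");
        return Collections.disjoint(deny, userTokens)
                && !Collections.disjoint(allow, userTokens);
    }
}
```

In this sketch a deny match always wins over an allow match; whether that
is the right rule for every repository connector is something to discuss.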

Cloud-Native & Ecosystem Synergy
Expanding the reach of ManifoldCF through deeper integration with the
Apache ecosystem and containerized environments:

- Apache Solr Dense Vector Output Connector (High Priority)
- Apache Airflow & NiFi Integration
- Kubernetes Operator
- Next-Gen Administrative UI


-- 
Piergiorgio
