Sure, I have added you.

Thanks,
Danny

On Mon, Feb 24, 2025 at 12:21 PM Aditya <adiworkprof...@gmail.com> wrote:

> Thanks for the reply.
>
> Can I ask something?
> Can I join the Beam Slack communication channel?
>
> On Mon, 24 Feb, 2025, 22:44 Danny McCormick, <dannymccorm...@google.com>
> wrote:
>
>> Hey Aditya, glad to hear that you are interested in this project. I've
>> tried to answer your questions below:
>>
>> > What are the key technical challenges in integrating Beam with
>> > Pinecone and Tecton?
>>
>> The main challenges will be around understanding how those systems (and
>> other similar systems) work, how their client libraries are set up, how
>> Beam handles sources/sinks to enable efficient execution, and being able
>> to stitch all of those pieces together into a working connector. This
>> will require an understanding of the Beam model and will require
>> reasoning through some distributed systems principles.
>>
>> > Should the connectors support both batch and streaming modes?
>>
>> Yes, we will need to support both.
>>
>> > Are there any existing patterns or reference implementations to follow?
>>
>> Yes, here is an example enrichment handler with the Feast feature store -
>> https://github.com/apache/beam/blob/42bbc1ed432bf912f895271b3d3954cb70e69cf8/sdks/python/apache_beam/transforms/enrichment_handlers/feast_feature_store.py#L83
>> and here is an example sink for writing TFRecords -
>> https://github.com/apache/beam/blob/42bbc1ed432bf912f895271b3d3954cb70e69cf8/sdks/python/apache_beam/io/tfrecordio.py#L299
>>
>> We'd need similar concepts for writing to and enriching from various
>> feature stores and vector DBs.
>>
>> Thanks,
>> Danny
>>
>> On Sat, Feb 22, 2025 at 11:02 AM Aditya <adiworkprof...@gmail.com> wrote:
>>
>>> *Hi Danny and Beam Dev Team,*
>>>
>>> I hope you're doing well. I am interested in contributing to the
>>> *"Beam ML Vector DB/Feature Store Integrations"* project as part of
>>> GSoC and would love to get more insights into the project’s scope and
>>> expectations.
>>>
>>> About Me
>>>
>>> I am a software engineer passionate about distributed systems and
>>> machine learning infrastructure. I have been actively contributing to
>>> Apache projects and open-source communities. Below is a summary of my
>>> contributions:
>>>
>>> *Previous Contributions:*
>>>
>>>    - *Apache Airflow*
>>>       - 10+ contributions via PRs and issues
>>>       - 5+ merged PRs
>>>       - Active daily participation in the Slack community
>>>       - Currently working on HTTP operator improvements
>>>    - *Shell_sage*
>>>       - Implemented a logging flag feature
>>>       - Created SQLite database integration for log storage
>>>       - Successfully merged PR
>>>    - *Other Apache Projects*
>>>       - Contributions to Apache ZooKeeper
>>>       - Documentation improvements for Apache Maven
>>>       - Active participation in MSS and SugarLabs
>>>
>>> *My Profiles:*
>>>
>>>    - *GitHub:* https://github.com/aditya0yadav
>>>    - *LinkedIn:* https://www.linkedin.com/in/2580aditya/
>>>
>>> I would love to understand more about this project, specifically:
>>>
>>>    1. What are the key technical challenges in integrating Beam with
>>>       Pinecone and Tecton?
>>>    2. Should the connectors support both batch and streaming modes?
>>>    3. Are there any existing patterns or reference implementations to
>>>       follow?
>>>
>>> Looking forward to your guidance and hoping to contribute meaningfully
>>> to the project.
>>>
>>> *Best regards,*
>>> Aditya Yadav
>>> adiworkprof...@gmail.com