> Sir, as you have mentioned in the mail, Python is must for this project, I just wanted to ask, what about Java and Golang SDK applications, I mean I know it’s an AI/ML pipeline based project, but if you could tell me it would add to my clarity.
I would expect this project to pretty much exclusively be in Python. The only exception is if some vector DB or feature store only offers a Go or Java client (but this seems unlikely).

> Sir, I wanted to also ask, as Retrieval Augmented Generation(RAG) has a close relation with this project, don’t you think RAG is still limited to capturing historical data, or it has capability of capturing latest/modern data’s too?

I'm not sure I understand the question, but I can try to give an overview of how I think Beam and RAG work together. Basically, I think Beam can be used to:

1. Ingest data -> generate embeddings -> write to a vector DB. This can include very recent data; it just depends on how you configure your source (e.g. you could ingest data continuously with Pub/Sub or Kafka).
2. Ingest incoming query -> enrich with embedding data from a vector DB -> perform inference with the additional relevant context -> write result somewhere.

So I think this can handle reasonably tight data freshness requirements.

On Tue, Feb 18, 2025 at 11:01 AM SIDDHARTH SALIAN <siddharthsalia...@gmail.com> wrote:

> Respected Sir,
>
> 1. Thank you for the email. With the reference to the previous mail,
> I have understood all the points and I shall also go through the I/O page
> in the documentation page as well as vector DB’s, features.
>
> 2. Sir, as you have mentioned in the mail, Python is must for this
> project, I just wanted to ask, what about Java and Golang SDK applications,
> I mean I know it’s an AI/ML pipeline based project, but if you could tell
> me it would add to my clarity.
>
> 3. Sir, I wanted to also ask, as Retrieval Augmented Generation(RAG)
> has a close relation with this project, don’t you think RAG is still
> limited to capturing historical data, or it has capability of capturing
> latest/modern data’s too?
>
> Best regards,
>
> Thanking you,
>
> Siddharth Salian
>
> *From: *Danny McCormick via user <user@beam.apache.org>
> *Date: *Tuesday, 18 February 2025 at 8:36 PM
> *To: *user@beam.apache.org <user@beam.apache.org>
> *Cc: *Danny McCormick <dannymccorm...@google.com>
> *Subject: *Re: Regarding the GSOC 2025 Project
>
> Hey Siddharth, thanks for reaching out. I'm glad you're interested in the
> project. In general, I would expect there to be more details about projects
> once we know which ones have been accepted.
>
> > Sir, if you could tell me the pre-required knowledge (such as major
> > programming languages used, etc., ) for this project, it would bring more
> > clarity to me sir.
>
> I would expect it to be primarily done in Python, though it depends what
> connectors are available for each vector DB/feature store. Other than that,
> the main things you'd want to learn about are Beam itself, especially about
> how to write a sink (IO standards
> <https://beam.apache.org/documentation/io/io-standards> can help here),
> and also high level how vector DBs and feature stores work.
>
> Thanks,
>
> Danny
>
> On Thu, Feb 13, 2025 at 10:55 PM SIDDHARTH SALIAN <siddharthsalia...@gmail.com> wrote:
>
> Hello Sir,
>
> 1. My intention of writing this email is with reference to the GSOC
> 2025 mail -
> https://lists.apache.org/thread/o3mwncq0k4c58c630n49l7bvhq74o2wj
>
> 2. I’m Siddharth Salian and I’m an undergraduate student and I’m part
> of Apache Beam and I have just joined the community. After going through
> the GSOC 2025 idea list and going through the project description, I
> founded https://issues.apache.org/jira/browse/GSOC-279 this project to
> be interesting for me sir. So sir, I would like to contribute to this
> project in GSOC 2025, since AI/ML is area of my interest. Since you are the
> mentor, I’m letting you know sir.
>
> 3. Sir, if you could tell me the pre-required knowledge (such as major
> programming languages used, etc., ) for this project, it would bring more
> clarity to me sir.
>
> 4. Sir also wanted to ask is there any other project that you are
> thinking about for GSOC 2025, I would like to contribute in it sir.
>
> Best Regards,
>
> Thanking You
>
> Siddharth Salian
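
For what it's worth, the two flows described in the reply at the top of this thread (ingest -> embed -> write, and query -> enrich -> infer) can be sketched in plain Python. This is only an illustration under loud assumptions: it does not use Beam or a real vector DB, the names `embed`, `ingest`, `retrieve` and `build_prompt` are hypothetical, the "embedding" is just a bag-of-words count, and the store is a Python list. In a real pipeline each function would roughly correspond to a transform, with the source, model and sink swapped for real ones.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count, standing in for a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def ingest(docs, store):
    """Flow 1: ingest data -> generate embeddings -> write to the store."""
    for doc in docs:
        store.append((doc, embed(doc)))


def retrieve(query, store, k=1):
    """Flow 2, lookup step: return the k stored docs most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, store, k=1):
    """Flow 2, enrichment step: attach retrieved context before inference."""
    context = "\n".join(retrieve(query, store, k))
    return "Context:\n" + context + "\n\nQuestion: " + query


# The in-memory list stands in for a vector DB (assumption).
store = []
ingest(["Beam pipelines can read from Kafka",
        "Feature stores serve model inputs"], store)

# A document that arrives later is retrievable as soon as it is written,
# which is the data-freshness point made in the reply.
ingest(["Pub/Sub delivers streaming events to Beam"], store)

print(build_prompt("which service delivers streaming events", store))
```

The point of the sketch is that retrieval only ever sees what has been written to the store, so how fresh the answers are depends on how quickly the ingest side runs, not on RAG itself.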