Hi guys, here is the summarized Marvin mission:
The Apache Marvin-AI platform aims to offer a practical and standardized solution that helps its users perform data exploration, model development, and application lifecycle management, with a focus on scalability, language agnosticism, and a standardized pipeline.

Thanks for the help,
Lucas Cardoso

On Wed, Jul 29, 2020 at 17:05, Lucas Cardoso Silva <cardosolucas61....@gmail.com> wrote:

> Hi guys!
> Great Lucas, I will wait a couple of days to see if anyone has other
> things to add, and then we can close this phase!
>
> Wei, we can discuss how to make the data pipelines easier for the users in
> another step of the evaluation. With the experience of the users and
> developers with this topic, we can track their needs better and build
> use-case scenarios. I agree with you that data preparation is messy and can
> take a lot of time, and it would be great if Marvin could help with that.
>
> Best regards,
> Lucas
>
> On Wed, Jul 29, 2020 at 11:59, Wei Chen <weic...@apache.org> wrote:
>
>> Hello Lucas,
>>
>> I am thinking of processing JSON or XML files with a dynamic hierarchical
>> structure, or building a pipeline to crop images using object detection
>> metadata.
>> Data preparation can be very messy.
>> I wonder if we can have a stage that handles both batch and streaming
>> processing well.
>>
>> I simply think we don't need to focus on this part, since we can utilize a
>> wide variety of tools for our specific needs.
>>
>> Best Regards,
>> Wei
>>
>> On Wed, Jul 29, 2020 at 8:48 PM Lucas Bonatto Miguel <lucasb...@apache.org> wrote:
>>
>> > Hi folks,
>> >
>> > In regards to the mission, you're correct. If I could summarize it, it
>> > would be like: *to help its users to perform data exploration, model
>> > development and application lifecycle management*.
>> >
>> > I'm all in for having a better integration with Kubernetes.
>> > I think that the first step is to create a new thread in order to design
>> > something following their operator pattern:
>> > https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
>> >
>> > Wei, currently one can already perform merges and joins in the
>> > transformation step. Could you comment a bit more on what you think we
>> > could improve there? Maybe something for a new thread as well?
>> >
>> > Best!
>> > Lucas
>> >
>> > On Wed, Jul 29, 2020 at 1:24 AM Wei Chen <weic...@apache.org> wrote:
>> >
>> > > I think deploying to K8S does expand our capabilities for inference
>> > > scaling and management.
>> > > I am not familiar with Luigi, but it makes sense since we are going to
>> > > set up data pipelines.
>> > >
>> > > Best Regards,
>> > > Wei
>> > >
>> > > On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva <cardosolucas61....@gmail.com> wrote:
>> > >
>> > > > Great Wei! I find the suggestions really interesting. I think we can
>> > > > work with the deployment on K8s. The idea in Marvin would be that,
>> > > > after development, the user gives some parameters and a script
>> > > > facilitates a deployment to a Kubernetes cluster, right? Regarding
>> > > > data acquisition, I think it would be great if we were able to
>> > > > integrate some third-party library like Luigi. Thanks!
>> > > >
>> > > > On Wed, Jul 22, 2020 at 14:27, Wei Chen <weic...@apache.org> wrote:
>> > > >
>> > > > > Hello Lucas,
>> > > > >
>> > > > > I have some ideas:
>> > > > >
>> > > > > 1. Should we consider using K8S or similar tools for inference
>> > > > > container scaling and management?
>> > > > > Marvin's current container management is not as powerful as some
>> > > > > container-focused projects.
>> > > > > K8S can also be deployed into most environments now.
>> > > > >
>> > > > > 2. Is our current data cleaning stage flexible enough for multiple
>> > > > > data sources with table joins?
>> > > > > Or should we cut the data preparation stage out so that users can
>> > > > > build their own data pipeline on their data storage?
>> > > > > I figured that preprocessing might be too complex to be generalized
>> > > > > for different ML projects.
>> > > > >
>> > > > > Best Regards
>> > > > > Wei
>> > > > >
>> > > > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva <cardosolucas61....@gmail.com> wrote:
>> > > > >
>> > > > > > Hi guys.
>> > > > > > I would like to know if anyone else has any ideas about this
>> > > > > > evaluation phase. The opinions of both those who have been in the
>> > > > > > community for a long time and those who are still getting to know
>> > > > > > Marvin are important for this step, so your suggestions or
>> > > > > > validation of the initial text are always welcome!
>> > > > > >
>> > > > > > Best regards,
>> > > > > > Lucas Cardoso
>> > > > > >
>> > > > > > On Fri, Jul 10, 2020 at 13:48, Lucas Cardoso Silva <cardosolucas61....@gmail.com> wrote:
>> > > > > >
>> > > > > > > Hello guys. The time has come for us to take the first step in
>> > > > > > > the architectural assessment: the definition of the mission.
>> > > > > > > Basically, we have to decide here what is important in Marvin
>> > > > > > > and what is outside the scope of the project. This is important
>> > > > > > > because, during this analysis and the development process as a
>> > > > > > > whole, we will be able to separate out what is really important
>> > > > > > > and make things simpler and more functional. Also, if it looks
>> > > > > > > cool, we can include that on the Marvin-AI homepage.
>> > > > > > >
>> > > > > > > As stated earlier, I will post an initial draft and would like
>> > > > > > > to receive your feedback to complete a few points:
>> > > > > > >
>> > > > > > > The Apache Marvin-AI platform aims to offer:
>> > > > > > >
>> > > > > > > - a practical and standardized solution,
>> > > > > > > - for the development and deployment of machine learning
>> > > > > > >   applications.
>> > > > > > >
>> > > > > > > Aiming to offer the user:
>> > > > > > >
>> > > > > > > - scalability,
>> > > > > > > - language agnosticism,
>> > > > > > > - standardized pipeline (DASFE),
>> > > > > > > - possibility of remote versioning of artifacts.
>> > > > > > >
>> > > > > > > Does anyone have any suggestions for more important features,
>> > > > > > > resources or design decisions in Marvin?
>> > > > > > >
>> > > > > > > Thank you very much,
>> > > > > > >
>> > > > > > > Lucas Cardoso