Re: Marvin’s mission discussion

Lucas Bonatto Miguel Sat, 15 Aug 2020 08:58:08 -0700

It's good. The only thing I would change is to mention what sort of
applications it targets. Although we have AI in the name, one may mistakenly
think Marvin is intended to serve any type of application.


Best

On Fri, Aug 14, 2020 at 11:37 AM Lucas Cardoso Silva <
[email protected]> wrote:

> Hi guys,
>
> Here comes the summarized Marvin mission:
>
> The Apache Marvin-AI platform aims to offer a practical and standardized
> solution to help its users to perform data exploration, model development
> and application lifecycle management, aiming to offer: scalability,
> language agnosticism and a standardized pipeline.
>
> Thanks for the help,
> Lucas Cardoso
>
> On Wed, Jul 29, 2020 at 5:05 PM Lucas Cardoso Silva <
> [email protected]> wrote:
>
> > Hi guys!
> > Great Lucas, I will wait a couple of days to see if anyone has other
> > things to add, and then we can close this phase!
> >
> > Wei, we can discuss how to make the data pipelines easier for users in
> > another step of the evaluation. With the experience of users and
> > developers on this topic, we can track their needs better and build
> > use-case scenarios. I agree with you that data preparation is messy and
> > can take a lot of time, and it would be great if Marvin could help with
> > that.
> >
> > Best regards,
> > Lucas
> >
> >
> > On Wed, Jul 29, 2020 at 11:59 AM Wei Chen <[email protected]>
> > wrote:
> >
> >> Hello Lucas,
> >>
> >> I am thinking of processing JSON or XML files with a dynamic,
> >> hierarchical structure,
> >> or building a pipeline to crop images using object detection metadata.
> >> Data preparation can be very messy.
> >> I wonder if we can have a stage that handles both batch and streaming
> >> processing well.
> >>
> >> I simply think we don't need to focus on this part since we can utilize
> >> a wide variety of tools for our specific needs.
> >>
> >> Best Regards,
> >> Wei
> >>
> >>
> >>
> >> On Wed, Jul 29, 2020 at 8:48 PM Lucas Bonatto Miguel <
> >> [email protected]>
> >> wrote:
> >>
> >> > Hi folks,
> >> >
> >> > With regard to the mission, you're correct. If I could summarize it, it
> >> > would be like: *to help its users to perform data exploration, model
> >> > development and application lifecycle management*.
> >> >
> >> > I'm all in for better integration with Kubernetes. I think the first
> >> > step is to create a new thread to design something following their
> >> > operator pattern:
> >> > https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
> >> >
> >> > Wei, one can already perform merges and joins in the transformation
> >> > step. Could you comment a bit more on what you think we could improve
> >> > there? Maybe something for a new thread as well?
> >> >
> >> > Best!
> >> > Lucas
> >> >
> >> > On Wed, Jul 29, 2020 at 1:24 AM Wei Chen <[email protected]> wrote:
> >> >
> >> > > I think deploying to K8S does expand our capabilities for inference
> >> > > scaling and management.
> >> > > I am not familiar with Luigi, but it makes sense since we are going to
> >> > > set up data pipelines.
> >> > >
> >> > > Best Regards,
> >> > > Wei
> >> > >
> >> > > On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva <
> >> > > [email protected]> wrote:
> >> > >
> >> > > > Great, Wei! I find the suggestions really interesting. I think we
> >> > > > can work on the deployment to K8s. The idea in Marvin would be that,
> >> > > > after development, the user provides some parameters and a script
> >> > > > facilitates deployment to a Kubernetes cluster, right? Regarding
> >> > > > data acquisition, I think it would be great if we were able to
> >> > > > integrate a third-party library like Luigi. Thanks!
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Wed, Jul 22, 2020 at 2:27 PM Wei Chen <[email protected]>
> >> > > > wrote:
> >> > > >
> >> > > > > Hello Lucas,
> >> > > > >
> >> > > > > I have some ideas:
> >> > > > >
> >> > > > > 1. Should we consider using K8S or similar tools for inference
> >> > > > > container scaling and management?
> >> > > > > Marvin's current container management is not as powerful as that
> >> > > > > of some container-focused projects.
> >> > > > > K8S can also be deployed into most environments now.
> >> > > > >
> >> > > > > 2. Is our current data cleaning stage flexible enough for multiple
> >> > > > > data sources with table joins?
> >> > > > > Or should we cut the data preparation stage out so that users can
> >> > > > > build their own data pipelines on their data storage?
> >> > > > > I figured that preprocessing might be too complex to be
> >> > > > > generalized for different ML projects.
> >> > > > >
> >> > > > > Best Regards
> >> > > > > Wei
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva <
> >> > > > > [email protected]> wrote:
> >> > > > >
> >> > > > > > Hi guys.
> >> > > > > > I would like to know if anyone else has any ideas about this
> >> > > > > > evaluation phase. The opinions of both those who have been in
> >> > > > > > the community for a long time and those who are still getting
> >> > > > > > to know Marvin are important for this step, so your
> >> > > > > > suggestions or validation of the initial text are always
> >> > > > > > welcome!
> >> > > > > >
> >> > > > > > Best regards,
> >> > > > > > Lucas Cardoso
> >> > > > > >
> >> > > > > > On Fri, Jul 10, 2020 at 1:48 PM Lucas Cardoso Silva <
> >> > > > > > [email protected]> wrote:
> >> > > > > >
> >> > > > > > > Hello guys. The time has come for us to take the first step
> >> > > > > > > in the architectural assessment: the definition of the
> >> > > > > > > mission. Basically, we have to decide here what is important
> >> > > > > > > in Marvin and what is outside the scope of the project. This
> >> > > > > > > matters because, during this analysis and the development
> >> > > > > > > process as a whole, we will be able to single out what is
> >> > > > > > > really important and make things simpler and more
> >> > > > > > > functional. Also, if it looks cool, we can include it on the
> >> > > > > > > Marvin-AI homepage.
> >> > > > > > >
> >> > > > > > > As stated earlier, I will post an initial draft and would
> >> > > > > > > like to receive your feedback to complete a few points:
> >> > > > > > >
> >> > > > > > > The Apache Marvin-AI platform aims to offer:
> >> > > > > > >
> >> > > > > > >    - a practical and standardized solution,
> >> > > > > > >    - for the development and deployment of machine learning
> >> > > > > > >      applications.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > Aiming to offer the user:
> >> > > > > > >
> >> > > > > > >    - scalability,
> >> > > > > > >    - language agnosticism,
> >> > > > > > >    - standardized pipeline (DASFE),
> >> > > > > > >    - possibility of remote versioning of artifacts.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > Does anyone have any suggestions for more important
> >> > > > > > > features, resources or design decisions in Marvin?
> >> > > > > > >
> >> > > > > > > Thank you very much,
> >> > > > > > >
> >> > > > > > > Lucas Cardoso
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
