Re: Marvin’s mission discussion

2020-08-15 Thread Lucas Cardoso Silva
Good! I agree.

The Apache Marvin-AI platform aims to offer a practical and standardized
solution to help its users to perform data exploration, model development
and application lifecycle management for artificial intelligence tasks,
aiming to offer: scalability, language agnosticism and a standardized
pipeline.

Something like this?



Re: Marvin’s mission discussion

2020-08-15 Thread Daniel Takabayashi
+1


Re: Marvin’s mission discussion

2020-08-15 Thread Lucas Bonatto Miguel
It's good; the only thing I would change is to mention what sort of
applications. Although we have AI in the name, one may mistakenly think
Marvin is intended to serve any type of application.

Best


Re: Marvin’s mission discussion

2020-08-14 Thread Lucas Cardoso Silva
Hi guys,

Here comes the summarized Marvin mission:

The Apache Marvin-AI platform aims to offer a practical and standardized
solution to help its users to perform data exploration, model development
and application lifecycle management, aiming to offer: scalability,
language agnosticism and a standardized pipeline.

Thanks for the help,
Lucas Cardoso


Re: Marvin’s mission discussion

2020-07-29 Thread Lucas Cardoso Silva
Hi guys!
Great, Lucas! I will wait a couple of days to see if anyone has other things
to add, and then we can close this phase!

Wei, we can discuss how to make the data pipelines easier for users in
another step of the evaluation. With the experience of the users and
developers on this topic, we can track their needs better and build
use-case scenarios. I agree with you that data preparation is messy and can
take a lot of time, and it would be great if Marvin could help with that.

Best regards,
Lucas



Re: Marvin’s mission discussion

2020-07-29 Thread Wei Chen
Hello Lucas,

I am thinking of processing JSON or XML files with a hierarchical, dynamic
structure,
or building a pipeline to crop images using object detection metadata.
Data preparation can be very messy;
I wonder if we can have a stage that handles both batch and streaming
processing well.

I simply think we don't need to focus on this part since we can utilize a
wide variety of tools for our specific needs.

Best Regards,
Wei



> > > > > > On Fri, Jul 10, 2020 at 1:48 PM Lucas Cardoso Silva <
> > > > > > cardosolucas61@gmail.com> wrote:
> > > > > >
> > > > > > > Hello guys. The time has come for us to take the first step in
> > > > > > > architectural assessment: the definition of the mission. Basically we
> > > > > > > have to decide here what is important in Marvin and what is outside the
> > > > > > > scope of the project. This is important because, during this analysis
> > > > > > > and the development process as a whole, we will be able to segment what
> > > > > > > is really important and make things more simple and functional. Also,
> > > > > > > if it looks cool, we can include that on the Marvin-AI homepage.
> > > > > > >
> > > > > > > As stated earlier, I will post an initial draft and would like to
> > > > > > > receive your feedback to complete a few points:
> > > > > > >
> > > > > > > The Apache Marvin-AI platform aims to offer:
> > > > > > >
> > > > > > > - a practical and standardized solution,
> > > > > > > - for the development and deployment of machine learning applications.
> > > > > > >
> > > > > > > Aiming to offer the user:
> > > > > > >
> > > > > > > - scalability,
> > > > > > > - language agnosticism,
> > > > > > > - standardized pipeline (DASFE),
> > > > > > > - possibility of

Re: Marvin’s mission discussion

2020-07-29 Thread Lucas Bonatto Miguel
Hi folks,

With regard to the mission, you're correct. If I could summarize it, it
would be like: *to help its users to perform data exploration, model
development and application lifecycle management*.

I'm all in for having a better integration with Kubernetes. I think that
the first step is to create a new thread in order to design something
following their operator pattern:
https://kubernetes.io/docs/concepts/extend-kubernetes/operator/

Wei, currently one can already perform merges and joins in the
transformation step. Could you comment a bit more on what you think we
could improve there? Maybe something for a new thread as well?

Best!
Lucas

On Wed, Jul 29, 2020 at 1:24 AM Wei Chen wrote:

> I think deploying to K8S does expand our capabilities for inference scaling
> and management.
> I am not familiar with Luigi, but it makes sense since we are going to
> set up data pipelines.
>
> Best Regards,
> Wei
>
> On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva <
> cardosolucas61@gmail.com> wrote:
>
> > Great, Wei! I find the suggestions really interesting. I think we can work
> > with the deployment on K8s. The idea of it in Marvin would be that, after
> > development, the user would give some parameters and a script would
> > facilitate a deployment in a Kubernetes cluster, right? Regarding data
> > acquisition, I think it would be great if we were able to integrate some
> > third-party library like Luigi. Thanks!
> >
> >
> >
> > On Wed, Jul 22, 2020 at 2:27 PM Wei Chen wrote:
> >
> > > Hello Lucas,
> > >
> > > I have some ideas:
> > >
> > > 1. Should we consider using K8S or similar tools for inference container
> > > scaling and management?
> > > Marvin's current container management is not as powerful as some
> > > container-focused projects.
> > > K8S can also be deployed into most environments now.
> > >
> > > 2. Is our current data cleaning stage flexible enough for multiple data
> > > sources with table joins?
> > > Or should we cut the data preparation stage out so that users can make
> > > their own data pipelines on their own data storage?
> > > I figure that preprocessing might be too complex to be generalized for
> > > different ML projects.
> > >
> > > Best Regards
> > > Wei
> > >
> > >
> > >
> > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva <
> > > cardosolucas61@gmail.com> wrote:
> > >
> > > > Hi guys.
> > > > I would like to know if anyone else has any ideas about this
> evaluation
> > > > phase. Both the opinion of those who have been in the community for a
> > > long
> > > > time and those who are still getting to know Marvin is now important
> > for
> > > > this step, so your suggestion or validation of the initial text is
> > always
> > > > welcome!
> > > >
> > > > Best regards,
> > > > Lucas Cardoso
> > > >
> > > > Em sex., 10 de jul. de 2020 às 13:48, Lucas Cardoso Silva <
> > > > cardosolucas61@gmail.com> escreveu:
> > > >
> > > > > Hello guys. The time has come for us to take the first step in
> > > > > architectural assessment: the definition of the mission. Basically
> we
> > > > have
> > > > > to decide here what is important in Marvin and what is outside the
> > > scope
> > > > of
> > > > > the project. This is important because, during this analysis and
> the
> > > > > development process as a whole, we will be able to segment what is
> > > really
> > > > > important and make things more simple and functional. Also, if it
> > looks
> > > > > cool, we can include that on the Marvin-AI homepage.
> > > > >
> > > > > As stated earlier, I will post an initial draft and would like to
> > > receive
> > > > > your feedback to complete a few points:
> > > > >
> > > > > The Apache Marvin-AI platform aims to offer:
> > > > >
> > > > >-
> > > > >
> > > > >a practical and standardized solution,
> > > > >-
> > > > >
> > > > >for the development and deployment of machine learning
> > applications.
> > > > >
> > > > >
> > > > > Aiming to offer the user:
> > > > >
> > > > >-
> > > > >
> > > > >scalability,
> > > > >-
> > > > >
> > > > >language agnosticism,
> > > > >-
> > > > >
> > > > >standardized pipeline (DASFE),
> > > > >-
> > > > >
> > > > >possibility of remote versioning of artifacts.
> > > > >
> > > > >
> > > > > Does anyone have any suggestions for more important features,
> > resources
> > > > or
> > > > > design decisions in Marvin?
> > > > >
> > > > > Thank you very much,
> > > > >
> > > > > Lucas Cardoso
> > > > >
> > > >
> > >
> >
>


Re: Marvin’s mission discussion

2020-07-28 Thread Wei Chen
I think deploying to K8S does expand our capabilities for inference scaling
and management.
I am not familiar with Luigi, but it makes sense since we are going to
set up data pipelines.

Best Regards,
Wei


Re: Marvin’s mission discussion

2020-07-28 Thread Lucas Cardoso Silva
Great, Wei! I find the suggestions really interesting. I think we can work
on deployment to K8s. The idea in Marvin would be that, after development,
the user supplies some parameters and a script handles the deployment to a
Kubernetes cluster, right? Regarding data acquisition, I think it would be
great if we could integrate a third-party library such as Luigi. Thanks!
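
As a rough sketch of the parameter-driven deployment idea above: the script
could turn a few user parameters into a standard Kubernetes `apps/v1`
Deployment manifest. The function name `build_deployment`, the image name and
the default port are assumptions for illustration only; nothing like this
exists in Marvin yet.

```python
import json
from typing import Dict


def build_deployment(name: str, image: str, replicas: int = 1, port: int = 8000) -> Dict:
    """Build a Kubernetes apps/v1 Deployment manifest from a few user parameters."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The selector must match the pod template labels, so reuse one dict.
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical engine name and image, used only to show the output shape.
    manifest = build_deployment("iris-engine", "example.org/iris-engine:0.1", replicas=2)
    print(json.dumps(manifest, indent=2))
```

The printed JSON could then be piped to `kubectl apply -f -`, since kubectl
accepts JSON manifests as well as YAML.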



Re: Marvin’s mission discussion

2020-07-22 Thread Wei Chen
Hello Lucas,

I have some ideas:

1. Should we consider using K8S or a similar tool for inference container
scaling and management?
Marvin's current container management is not as powerful as that of
container-focused projects.
K8S can also be deployed in most environments now.

2. Is our current data cleaning stage flexible enough to handle multiple
data sources with table joins?
Or should we cut the data preparation stage out and let users build their
own data pipelines on their own data storage?
I suspect that preprocessing might be too complex to generalize across
different ML projects.

Best Regards
Wei



Re: Marvin’s mission discussion

2020-07-22 Thread Lucas Cardoso Silva
Hi guys.
I would like to know if anyone else has any ideas about this evaluation
phase. The opinions of both those who have been in the community for a long
time and those who are still getting to know Marvin are important for this
step, so suggestions on, or validation of, the initial text are always
welcome!

Best regards,
Lucas Cardoso

On Fri, Jul 10, 2020 at 13:48, Lucas Cardoso Silva <
cardosolucas61@gmail.com> wrote:

> Hello guys. The time has come for us to take the first step in the
> architectural assessment: defining the mission. Basically, we have to
> decide here what is important in Marvin and what is outside the scope of
> the project. This matters because, during this analysis and the
> development process as a whole, it will let us single out what is really
> important and make things simpler and more functional. Also, if it looks
> cool, we can include it on the Marvin-AI homepage.
>
> As stated earlier, I will post an initial draft and would like to receive
> your feedback to complete a few points:
>
> The Apache Marvin-AI platform aims to offer:
>
>    - a practical and standardized solution,
>    - for the development and deployment of machine learning applications.
>
> Aiming to offer the user:
>
>    - scalability,
>    - language agnosticism,
>    - standardized pipeline (DASFE),
>    - possibility of remote versioning of artifacts.
>
> Does anyone have any suggestions for more important features, resources or
> design decisions in Marvin?
>
> Thank you very much,
>
> Lucas Cardoso
>