Hello Lucas,

I am thinking of cases such as processing JSON or XML files with a dynamic,
hierarchical structure, or building a pipeline that crops images using
object detection metadata.
Data preparation can be very messy, and I wonder whether a single stage can
handle both batch and streaming processing well.
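
As a rough illustration of the hierarchical JSON case, here is a minimal
Python sketch that flattens a nested record into one flat row. The `flatten`
helper and the sample record are hypothetical, made up for this message; they
are not part of Marvin.

```python
import json
from typing import Any, Dict


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into one dict keyed by dotted paths."""
    flat: Dict[str, Any] = {}
    if isinstance(node, dict):
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the single trailing dot from the accumulated path.
        flat[prefix[:-1]] = node
    return flat


# Hypothetical record: object detection metadata for one image.
record = json.loads(
    '{"id": 1, "objects": [{"label": "cat", "box": [10, 20, 30, 40]}]}'
)
row = flatten(record)
```

Even this small case has to decide how to name list elements and what to do
with empty containers (they are silently dropped here), which is part of why
I think this is hard to generalize.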

I simply think we don't need to focus on this part since we can utilize a
wide variety of tools for our specific needs.

Best Regards,
Wei



On Wed, Jul 29, 2020 at 8:48 PM Lucas Bonatto Miguel <lucasb...@apache.org>
wrote:

> Hi folks,
>
> Regarding the mission, you're correct. If I could summarize it, it
> would be like: *to help its users to perform data exploration, model
> development and application lifecycle management*.
>
> I'm all in for having a better integration with Kubernetes. I think that
> the first step is to create a new thread in order to design something
> following their operator pattern:
> https://kubernetes.io/docs/concepts/extend-kubernetes/operator/
>
> Wei, one can already perform merges and joins in the
> transformation step. Could you comment a bit more on what you think we
> could improve there? Maybe something for a new thread as well?
>
> Best!
> Lucas
>
> On Wed, Jul 29, 2020 at 1:24 AM Wei Chen <weic...@apache.org> wrote:
>
> > I think deploying to K8S does expand our capabilities for inference
> > scaling and management.
> > I am not familiar with Luigi, but it makes sense since we are going to
> > set up data pipelines.
> >
> > Best Regards,
> > Wei
> >
> > On Wed, Jul 29, 2020 at 5:32 AM Lucas Cardoso Silva <
> > cardosolucas61....@gmail.com> wrote:
> >
> > > Great, Wei! I find the suggestions really interesting. I think we can
> > > work on the deployment on K8s. The idea in Marvin would be that, after
> > > development, the user provides some parameters and a script
> > > facilitates a deployment to a Kubernetes cluster, right? Regarding data
> > > acquisition, I think it would be great if we were able to integrate
> > > some third-party library like Luigi. Thanks!
> > >
> > >
> > >
> > > On Wed, Jul 22, 2020 at 2:27 PM Wei Chen <weic...@apache.org> wrote:
> > >
> > > > Hello Lucas,
> > > >
> > > > I have some ideas:
> > > >
> > > > 1. Should we consider using K8S or similar tools for inference
> > > > container scaling and management?
> > > > Marvin's current container management is not as powerful as some
> > > > container-focused projects.
> > > > K8S can also be deployed into most environments now.
> > > >
> > > > 2. Is our current data cleaning stage flexible enough for multiple
> > > > data sources with table joins?
> > > > Or should we cut the data preparation stage out so that users can
> > > > build their own data pipelines on their own data storage?
> > > > I figure that preprocessing might be too complex to generalize
> > > > across different ML projects.
> > > >
> > > > Best Regards
> > > > Wei
> > > >
> > > >
> > > >
> > > > On Thu, Jul 23, 2020 at 12:26 AM Lucas Cardoso Silva <
> > > > cardosolucas61....@gmail.com> wrote:
> > > >
> > > > > Hi guys.
> > > > > I would like to know if anyone else has any ideas about this
> > > > > evaluation phase. The opinions of both those who have been in the
> > > > > community for a long time and those who are still getting to know
> > > > > Marvin are important for this step, so your suggestions or
> > > > > validation of the initial text are always welcome!
> > > > >
> > > > > Best regards,
> > > > > Lucas Cardoso
> > > > >
> > > > > On Fri, Jul 10, 2020 at 1:48 PM Lucas Cardoso Silva <
> > > > > cardosolucas61....@gmail.com> wrote:
> > > > >
> > > > > > Hello guys. The time has come for us to take the first step in
> > > > > > the architectural assessment: the definition of the mission.
> > > > > > Basically, we have to decide here what is important in Marvin
> > > > > > and what is outside the scope of the project. This is important
> > > > > > because, during this analysis and the development process as a
> > > > > > whole, we will be able to separate out what is really important
> > > > > > and make things simpler and more functional. Also, if it looks
> > > > > > cool, we can include it on the Marvin-AI homepage.
> > > > > >
> > > > > > As stated earlier, I will post an initial draft and would like to
> > > > > > receive your feedback to complete a few points:
> > > > > >
> > > > > > The Apache Marvin-AI platform aims to offer:
> > > > > >
> > > > > >    - a practical and standardized solution,
> > > > > >    - for the development and deployment of machine learning
> > > > > >      applications.
> > > > > >
> > > > > > Aiming to offer the user:
> > > > > >
> > > > > >    - scalability,
> > > > > >    - language agnosticism,
> > > > > >    - standardized pipeline (DASFE),
> > > > > >    - possibility of remote versioning of artifacts.
> > > > > >
> > > > > > Does anyone have any suggestions for more important features,
> > > > > > resources or design decisions in Marvin?
> > > > > >
> > > > > > Thank you very much,
> > > > > >
> > > > > > Lucas Cardoso
> > > > > >
> > > > >
> > > >
> > >
> >
>
