Re: [scikit-learn] is Sci_kiet-Learn the right choice for my project

Brown J.B. via scikit-learn Sat, 08 Oct 2022 04:37:35 -0700

Dear Mike,

Just my two cents on your inquiry, speaking strictly as a user of
scikit-learn for many years.


- From your description of the application context, I would say that
scikit-learn is perfectly fine. However, be aware that a single monolithic
model incorporating all data (the image that TV wrongly projects) is not a
valid strategy. Stratifying the data into contextually correct subgroups
and then running scikit-learn on each, for example to estimate the extent
of predictability during development, will be helpful.
- Duplicate checking should be easy to do with standard Python objects
(set or list counting), once the context determines how the objects are
vectorized/featurized. I don't see a need to force scikit-learn into that
task.
- Missing data could be handled by context-specific object classes that
you design, which could contain something like a __bool__() method that
tells you whether the object has all of the required data populated and
configured.
- Detection of configuration errors could either be driven explicitly by
logic (again context-dependent, something that returns a bool indicating
whether an object is configured correctly), or potentially be derived
statistically as outliers from the given background data distribution, in
which case scikit-learn could be of help. If there are too many variates
(thousands or tens of thousands) in your data to allow explicit logic,
then scikit-learn's Random Forest algorithms might be perfectly fine and
allow verification through visualization of the Decision Tree rules.
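
As a rough sketch of the duplicate and missing-data points above, using only
the standard library: the Device class, its fields, and the choice of which
fields make up key() are all hypothetical and would have to come from your
own context. The idea is just that __bool__() reports completeness and that
a Counter over a context-defined key exposes duplicates.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    """Hypothetical record for one network object (fields are assumptions)."""
    name: str
    device_type: str
    location: Optional[str] = None
    ip_address: Optional[str] = None

    def __bool__(self) -> bool:
        # Truthy only when every required field is populated.
        required = (self.name, self.device_type, self.location, self.ip_address)
        return all(value not in (None, "") for value in required)

    def key(self) -> tuple:
        # Which fields define "the same object" is a context decision;
        # device type plus IP address is assumed here for illustration.
        return (self.device_type, self.ip_address)


def find_duplicates(devices):
    """Return {key: count} for every key that occurs more than once."""
    counts = Counter(device.key() for device in devices)
    return {key: n for key, n in counts.items() if n > 1}


devices = [
    Device("router-a", "router", "HQ", "10.0.0.1"),
    Device("router-a-copy", "router", "HQ", "10.0.0.1"),
    Device("ap-1", "wifi", None, "10.0.0.7"),  # location missing
]

incomplete = [device.name for device in devices if not device]
duplicates = find_duplicates(devices)

print("incomplete:", incomplete)
print("duplicates:", duplicates)
```

Here the third device is reported as incomplete and the two router entries
share one key, so no model training is involved in either check.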

Hope this helps,
J.B. Brown

On Sat, Oct 8, 2022 at 10:59, Mike Oliver <m...@globalsaassol.com> wrote:

> Dear Sirs,
>
>
>
> I am evaluating scikit-learn for a new project.  I am hoping to find an
> AI/machine learning package that can take a large dataset of objects of
> various types and attributes.  These objects are typically related to
> other objects, such as a server to a Wi-Fi device, or two network routers
> to each other, etc.  When these objects are set up, data is gathered
> about where they are located, what their settings are, the device type,
> etc.
>
>
>
> In large organizations there can be thousands of these objects and tens
> of thousands of relationships, descriptions, settings, etc.  My hope is
> that with machine learning we can detect when an object is missing,
> misconfigured, or duplicated.
>
>
>
> The question is, will scikit-learn help with this problem? I understand
> that we will have to train it to identify what to look for, and then act
> on what was found using the predicted solution algorithm or instructions.
>
>
>
> Thanks for your help,
>
>
>
> Great-looking product. I already have the tutorial up and running and
> have installed it in my Django platform.
>
>
>
> Mike
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

