Introducing NimbusML - experimental Python bindings for ML.NET

Gani Nazirov via Python-announce-list Mon, 05 Nov 2018 05:05:16 -0800

We are excited to announce that yesterday we released and open sourced 
NimbusML <https://github.com/Microsoft/NimbusML>! This project provides 
experimental Python bindings for ML.NET 
<https://github.com/dotnet/machinelearning>, an open source and 
cross-platform machine learning framework for .NET.



NimbusML allows you to build ML.NET pipelines in Python and also integrate them 
into Scikit-Learn pipelines.



Highlights



·         Cross-platform: NimbusML is supported on Mac, Linux, and Windows.

·         Efficient interop with Scikit-learn/Pandas: NimbusML can accept 
Pandas dataframes as input and its components can also be used within 
Scikit-learn pipelines.

·         Majority of ML.NET components are available: Most ML.NET components 
can be used through NimbusML.

·         Performance parity with ML.NET: When using only NimbusML components 
(loaders, transforms, scorers, and evaluators), NimbusML performance matches 
ML.NET performance.

·         Familiar APIs for Scikit-learn users: NimbusML adheres to existing 
Scikit-learn conventions but also introduces some new concepts such as how to 
work with multiple columns in the pipelines.

·         Open-source: NimbusML will be built in the open, and we encourage 
you to file any non-confidential issues and questions on GitHub. Please let us 
know if you are interested in contributing.

·         Interop with ML.NET models: Models trained in NimbusML can be 
deployed in .NET applications using ML.NET (see 
<https://docs.microsoft.com/en-us/NimbusML/loadsavemodels> for an example).



NimbusML repo: <https://github.com/Microsoft/NimbusML>

NimbusML samples: <https://github.com/Microsoft/NimbusML-samples>

NimbusML docs: <https://docs.microsoft.com/en-us/NimbusML/overview>



Installation



NimbusML can be installed using pip:

pip install nimbusml




You can run a quick test with:

python -m nimbusml.examples.FastLinearClassifier




NimbusML has been tested on Windows 10, macOS 10.13, Ubuntu 14.04, Ubuntu 
16.04, Ubuntu 18.04, CentOS 7, and RHEL 7.



NimbusML requires a 64-bit build of Python 2.7, 3.5, or 3.6. Python 3.7 is not 
supported yet.
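
Before running pip, it can help to confirm that the interpreter matches the 
requirements above. The following is a minimal sketch using only the standard 
library; the names is_supported and current_interpreter are made up for this 
illustration and are not part of NimbusML.

```python
import struct
import sys

# Versions listed in this announcement; Python 3.7 is not supported yet.
SUPPORTED_VERSIONS = {(2, 7), (3, 5), (3, 6)}


def is_supported(version, pointer_bits):
    """Return True if (major, minor) and the interpreter bitness match the stated requirements."""
    return tuple(version[:2]) in SUPPORTED_VERSIONS and pointer_bits == 64


def current_interpreter():
    """Return ((major, minor), pointer_bits) for the running interpreter."""
    # struct.calcsize("P") is the size of a C pointer in bytes: 8 on a 64-bit build.
    return tuple(sys.version_info[:2]), struct.calcsize("P") * 8


if __name__ == "__main__":
    version, bits = current_interpreter()
    status = "supported" if is_supported(version, bits) else "not supported"
    print("Python {0}.{1} ({2}-bit): {3}".format(version[0], version[1], bits, status))
```

The check is written without f-strings so that the same file also runs on 
Python 2.7.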



Getting Started



Documentation can be found at 
<https://docs.microsoft.com/en-us/NimbusML/overview>. Sample notebooks can be 
found at <https://github.com/Microsoft/NimbusML-Samples>. A few examples:



·         Twitter Sentiment Analysis 
<https://docs.microsoft.com/en-us/NimbusML/tutorials/b_b-sentiment-analysis-2-data-streaming-with-filedatastream>

·         Ranking with LightGBM 
<https://docs.microsoft.com/en-us/NimbusML/tutorials/b_e-learning-to-rank-with-microsoft-bing-data>

·         Image clustering using a TensorFlow model 
<https://docs.microsoft.com/en-us/NimbusML/tutorials/b_f-image-processing-clustering>

·         Binary classification with Logistic Regression 
<https://docs.microsoft.com/en-us/python/api/nimbusml/nimbusml.linear_model.logisticregressionbinaryclassifier?view=nimbusml-py-latest>

·         Save and load models (and use NimbusML models in ML.NET) 
<https://docs.microsoft.com/en-us/nimbusml/loadsavemodels?view=nimbusml-py-latest>



Sentiment analysis example with NimbusML components:



from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.ensemble import FastTreesBinaryClassifier
from nimbusml.feature_extraction.text import NGramFeaturizer



train_file = get_dataset('gen_twittertrain').as_filepath()
test_file = get_dataset('gen_twittertest').as_filepath()

train_data = FileDataStream.read_csv(train_file, sep='\t')
test_data = FileDataStream.read_csv(test_file, sep='\t')



pipeline = Pipeline([ # nimbusml pipeline
    NGramFeaturizer(columns={'Features': ['Text']}),
    FastTreesBinaryClassifier(feature=['Features'], label='Label')
])



# fit and predict
pipeline.fit(train_data)
results = pipeline.predict(test_data)




A complete notebook for this example can be found at 
<https://github.com/Microsoft/NimbusML-Samples/blob/master/samples/2.2%20%5BText%5D%20Sentiment%20Analysis%202%20-%20Data%20Streaming%20with%20FileDataStream.ipynb>.

Sentiment analysis example with NimbusML + Scikit-Learn components:

from nimbusml.datasets import get_dataset
from nimbusml.ensemble import FastTreesBinaryClassifier

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd



train_file = get_dataset('gen_twittertrain').as_filepath()
test_file = get_dataset('gen_twittertest').as_filepath()

train_data = pd.read_csv(train_file, sep='\t')
test_data = pd.read_csv(test_file, sep='\t')



pipeline = Pipeline([ # sklearn pipeline
    ('tfidf', TfidfVectorizer()), # sklearn transform
    ('clf', FastTreesBinaryClassifier()) # nimbusml learner
])



# fit and predict
pipeline.fit(train_data["Text"], train_data["Label"])
results = pipeline.predict(test_data["Text"])




A complete notebook for this example can be found at 
<https://github.com/Microsoft/NimbusML-Samples/blob/master/samples/2.3%20%5BText%5D%20Sentiment%20Analysis%203%20-%20Combining%20NimbusML%20and%20Scikit-learn.ipynb>.
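
To get a quick sense of how the predictions compare with the true labels, a 
plain accuracy score is often enough. The sketch below is a hypothetical 
helper, not a NimbusML or Scikit-learn API. It assumes that results can be 
read as a flat sequence of predicted labels and that the test file has a 
"Label" column like the training file; toy lists stand in for both here.

```python
def accuracy(expected, predicted):
    """Fraction of positions where the predicted label equals the expected label."""
    expected = list(expected)
    predicted = list(predicted)
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot score an empty set of predictions")
    matches = sum(1 for e, p in zip(expected, predicted) if e == p)
    return matches / float(len(expected))


# Toy labels standing in for test_data["Label"] and results (assumed shapes).
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of the 4 labels match
```

Under those assumptions, the call for the example above would be 
accuracy(test_data["Label"], results).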



Thank you!



-ML.NET Team

-- 
https://mail.python.org/mailman/listinfo/python-announce-list

        Support the Python Software Foundation:
        http://www.python.org/psf/donations/
