ifeherva commented on a change in pull request #11651: Add logistic regression 
tutorial
URL: https://github.com/apache/incubator-mxnet/pull/11651#discussion_r202189850
 
 

 ##########
 File path: docs/tutorials/gluon/logistic_regression_explained.md
 ##########
 @@ -0,0 +1,215 @@
+
+# Logistic regression using Gluon API explained
+
+Logistic Regression is one of the first models that newcomers to Deep Learning implement. In this tutorial, we are going to focus on how to do logistic regression using the Gluon API and provide some high-level tips.
+
+Before anything else, let's import the packages required for this tutorial.
+
+
+```python
+import numpy as np
+import mxnet as mx
+from mxnet import nd, autograd, gluon
+from mxnet.gluon import nn, Trainer
+from mxnet.gluon.data import DataLoader, ArrayDataset
+
+mx.random.seed(12345)  # Added for reproducibility
+```
+
+In this tutorial we will use a synthetic dataset, which contains 10 features drawn from a normal distribution with mean equal to 0 and standard deviation equal to 1, and a class label, which can be either 0 or 1. The length of the dataset is an arbitrary value. The function below helps us generate the dataset.
+
+
+```python
+def get_random_data(size, ctx):
+    x = nd.normal(0, 1, shape=(size, 10), ctx=ctx)
+    # The class label is generated via non-random logic, so the network has a pattern to look for
+    # The threshold of 3 is chosen so that positive examples are fewer than negative ones, but not too few
+    y = x.sum(axis=1) > 3
+    return x, y
+```
+
+Also, let's define a set of hyperparameters that we are going to use later. Since our model is simple and the dataset is small, we are going to use the CPU for calculations. Feel free to change it to GPU for a more advanced scenario.
+
+
+```python
+ctx = mx.cpu()
+train_data_size = 1000
+val_data_size = 100
+batch_size = 10
+```
+
+## Working with data
+
+To work with data, Apache MXNet provides Dataset and DataLoader classes. The former is used to provide indexed access to the data, while the latter is used to shuffle and batchify the data.
+
+This separation is done because the source of a Dataset can vary from a simple array of numbers to complex data structures like text and images. DataLoader doesn't need to be aware of the source of the data as long as Dataset provides a way to get the number of records and to load a record by index. As an outcome, Dataset doesn't need to hold all of the data in memory at once. Needless to say, one can implement their own versions of Dataset and DataLoader, but we are going to use the existing implementations.
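+
+For illustration, here is a minimal sketch of what a custom Dataset could look like. The class name `InMemoryDataset` is hypothetical and not part of this tutorial; the point is that DataLoader only requires `__getitem__` and `__len__`:
+
+
+```python
+from mxnet.gluon.data import Dataset
+
+class InMemoryDataset(Dataset):
+    """Hypothetical custom Dataset wrapping two pre-loaded NDArrays."""
+    def __init__(self, features, labels):
+        self._features = features
+        self._labels = labels
+
+    def __getitem__(self, idx):
+        # Indexed access: return a single (record, label) pair
+        return self._features[idx], self._labels[idx]
+
+    def __len__(self):
+        # Number of records available
+        return len(self._labels)
+```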
+
+Below we define 2 datasets: a training dataset and a validation dataset. It is good practice to measure the performance of a trained model on data that the network hasn't seen before. That is why we are going to use the training set for training the model and the validation set to calculate the model's accuracy.
+
+
+```python
+train_x, train_ground_truth_class = get_random_data(train_data_size, ctx)
+train_dataset = ArrayDataset(train_x, train_ground_truth_class)
+train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
+
+val_x, val_ground_truth_class = get_random_data(val_data_size, ctx)
+val_dataset = ArrayDataset(val_x, val_ground_truth_class)
+val_dataloader = DataLoader(val_dataset, batch_size=batch_size, shuffle=True)
+```
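+
+To see shuffling and batching in action, we can peek at a single batch. This is just an illustrative check (the expected shapes assume the hyperparameters defined above):
+
+
+```python
+for X_batch, y_batch in train_dataloader:
+    # Each batch holds `batch_size` records with 10 features each
+    print(X_batch.shape, y_batch.shape)  # expected: (10, 10) (10,)
+    break
+```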
+
+## Defining and training the model
+
+In a real application, the model can be arbitrarily complex. The only requirement for logistic regression is that the last layer of the network must be a single neuron. Apache MXNet allows us to do so by using a `Dense` layer and setting the number of units to 1.
+
+Below, we define a model which has an input layer of 10 neurons, a couple of inner layers of 10 neurons each, and an output layer of 1 neuron, as required by logistic regression. We stack the layers using a `HybridSequential` block and initialize the parameters of the network using `Xavier` initialization.
+
+
+```python
+net = nn.HybridSequential()
+
+with net.name_scope():
+    net.add(nn.Dense(units=10, activation='relu'))  # input layer
+    net.add(nn.Dense(units=10, activation='relu'))  # inner layer 1
+    net.add(nn.Dense(units=10, activation='relu'))  # inner layer 2
+    net.add(nn.Dense(units=1))  # output layer: notice, it must have only 1 neuron
+
+net.initialize(mx.init.Xavier(magnitude=2.34))
+```
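+
+Since the layers are stacked in a `HybridSequential` block, we could additionally call `hybridize()` to let MXNet compile the network into a faster symbolic graph. This step is optional and everything in this tutorial works without it:
+
+
+```python
+net.hybridize()  # optional: caches a symbolic graph for faster execution
+```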
+
+After defining the model, we need to define a few more things: our loss function, our trainer and our metric.
+
+The loss function is used to calculate how different the output of the network is from the ground truth. In the case of logistic regression, the ground truth are class labels, which can be either 0 or 1. Because of that, we are using `SigmoidBinaryCrossEntropyLoss`, which suits that scenario well.
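+
+As a minimal sketch, the loss could be created as below (the variable name `loss_fn` is our choice, not fixed by the API). Note that this loss applies the sigmoid to the raw network output internally, which is why the last `Dense` layer has no activation:
+
+
+```python
+loss_fn = gluon.loss.SigmoidBinaryCrossEntropyLoss()
+# For a raw output x and a label y it computes:
+# -(y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x)))
+```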
+
+The Trainer object allows us to specify the method of training to be used. There are various methods available, and for our tutorial we use the widely accepted Stochastic Gradient Descent method. We also need to parametrize it with learning rate value, which defines how fast training happens, and a weight decay, which is used for regularization.
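+
+A sketch of creating such a trainer follows; the learning rate and weight decay values here are illustrative placeholders, not the tutorial's final choices:
+
+
+```python
+trainer = Trainer(params=net.collect_params(), optimizer='sgd',
+                  optimizer_params={'learning_rate': 0.1, 'wd': 0.001})
+```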
 
Review comment:
   "We also need to parametrize it with learning rate value, which defines how fast training happens" The learning rate of SGD defines the size of the weight updates, not necessarily how fast training happens. I propose rewording it a bit to avoid the confusion that a large LR will lead to fast training.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
