[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181877584
 
 

 ##
 File path: docs/tutorials/python/data_augmentation.md
 ##
 @@ -49,13 +47,21 @@ One of the most convenient ways to augment your image data is via arguments of [
 We show a simple example of this below, after creating an `images.lst` file used by the [`ImageIter`](https://mxnet.incubator.apache.org/api/python/image/image.html?highlight=imageiter#mxnet.image.ImageIter). Use [`tools/im2rec.py`](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) to create the `images.lst` if you don't already have this for your data.
 
 ```python
-!echo -e "0\t0.00\timages/0.jpg" > ./data/images.lst
+path_to_image = os.path.join("images","0.jpg")
+index = 0
+label = "0.00"
+list_file_content = "{}\t{}\t{}".format(index, label, path_to_image)
 
 Review comment:
   I find it slightly less visual that way, but I will update it.
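
For reference, the Python replacement for the `echo` one-liner could be sketched as follows. This is a minimal sketch, not the code from the PR: the helper name `write_image_list` and the temporary directory standing in for `./data` are assumptions for illustration; only the tab-separated index, label, path layout comes from the diff above.

```python
import os
import tempfile


def write_image_list(list_path, entries):
    """Write (index, label, image_path) tuples as tab-separated lines.

    Hypothetical helper that mirrors the three-field format in the diff.
    """
    with open(list_path, "w") as list_file:
        for index, label, image_path in entries:
            list_file.write("{}\t{}\t{}\n".format(index, label, image_path))


# A single entry, as in the original `echo` command.
data_dir = tempfile.mkdtemp()  # stand-in for ./data (assumption)
list_path = os.path.join(data_dir, "images.lst")
write_image_list(list_path, [(0, "0.00", os.path.join("images", "0.jpg"))])

# Show the raw content, including the tab separators.
with open(list_path) as list_file:
    print(repr(list_file.read()))
```

Using `os.path.join` for the image path keeps the entry portable across platforms, which a hard-coded `images/0.jpg` in a shell command does not.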


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181875579
 
 

 ##
 File path: docs/tutorials/gluon/datasets.md
 ##
 @@ -175,58 +180,51 @@ for epoch in range(epochs):
 print("Epoch {}, training loss: {:.2f}, validation loss: {:.2f}".format(epoch, train_loss, valid_loss))
 ```
 
-Epoch 0, training loss: 0.54, validation loss: 0.45
-Epoch 1, training loss: 0.40, validation loss: 0.39
-Epoch 2, training loss: 0.36, validation loss: 0.39
-Epoch 3, training loss: 0.33, validation loss: 0.34
-Epoch 4, training loss: 0.32, validation loss: 0.33
+`Epoch 0, training loss: 0.54, validation loss: 0.45`
+
+`...`
+
+`Epoch 4, training loss: 0.32, validation loss: 0.33`
 
 
 # Using own data with included `Dataset`s
 
 Gluon has a number of different [`Dataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=dataset#mxnet.gluon.data.Dataset) classes for working with your own image data straight out-of-the-box. You can get started quickly using the [`mxnet.gluon.data.vision.datasets.ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=imagefolderdataset#mxnet.gluon.data.vision.datasets.ImageFolderDataset) which loads images directly from a user-defined folder, and infers the label (i.e. class) from the folders.
 
 We will run through an example for image classification, but a similar process applies for other vision tasks. If you already have your own collection of images to work with you should partition your data into training and test sets, and place all objects of the same class into separate folders. Similar to:
-
+```
 ./images/train/car/abc.jpg
 ./images/train/car/efg.jpg
 ./images/train/bus/hij.jpg
 ./images/train/bus/klm.jpg
 ./images/test/car/xyz.jpg
 ./images/test/bus/uvw.jpg
+```
 
 You can download the Caltech 101 dataset if you don't already have images to work with for this example, but please note the download is 126MB.
 
 ```python
-!wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz
-!tar -xzf 101_ObjectCategories.tar.gz
+
+data_folder = "data"
+dataset_name = "101_ObjectCategories"
+archive_file = "{}.tar.gz".format(dataset_name)
+archive_path = os.path.join(data_folder, archive_file)
+data_url = "https://s3.us-east-2.amazonaws.com/mxnet-public/"
+
+if not os.path.isfile(archive_path):
+    mx.test_utils.download("{}{}".format(data_url, archive_file), dirname = data_folder)
+    print('Extracting {} in {}...'.format(archive_file, data_folder))
+    tar = tarfile.open(archive_path, "r:gz")
+    tar.extractall(data_folder)
+    tar.close()
+    print('Data extracted.')
 ```
 
-After downloading and extracting the data archive, we separate the data into training and test sets (50:50 split), and place images of the same class into the same folders, as required for using [`ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=imagefolderdataset#mxnet.gluon.data.vision.datasets.ImageFolderDataset).
+After downloading and extracting the data archive, we have two folders: `data/101_ObjectCategories` and `data/101_ObjectCategories_test`. We load the data into a training and testing dataset [`ImageFolderDataset`](https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=imagefolderdataset#mxnet.gluon.data.vision.datasets.ImageFolderDataset).
 
 Review comment:
   will update
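
The extract-only-once pattern in the diff above can be sketched with the standard library alone. This is a hedged illustration, not the tutorial's code: the helper name `extract_if_needed` and its `marker_name` parameter are hypothetical, the download step is left out, and the archive is a tiny placeholder built locally so the example runs without network access.

```python
import os
import tarfile
import tempfile


def extract_if_needed(archive_path, target_dir, marker_name):
    """Extract a .tar.gz archive unless `marker_name` already exists in target_dir.

    Returns True if extraction happened, False if it was skipped.
    Hypothetical helper; not part of MXNet.
    """
    if os.path.exists(os.path.join(target_dir, marker_name)):
        return False
    # The context manager closes the archive even if extraction fails.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)
    return True


# Build a tiny placeholder archive with one class folder and one file.
work_dir = tempfile.mkdtemp()
source_dir = os.path.join(work_dir, "101_ObjectCategories")
os.makedirs(os.path.join(source_dir, "car"))
with open(os.path.join(source_dir, "car", "abc.jpg"), "wb") as image_file:
    image_file.write(b"placeholder")

archive_path = os.path.join(work_dir, "101_ObjectCategories.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(source_dir, arcname="101_ObjectCategories")

data_folder = os.path.join(work_dir, "data")
os.makedirs(data_folder)
first_run = extract_if_needed(archive_path, data_folder, "101_ObjectCategories")
second_run = extract_if_needed(archive_path, data_folder, "101_ObjectCategories")
print(first_run, second_run)
```

Checking for the extracted folder rather than the archive file means a rerun skips extraction even when the archive is already on disk.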




[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181875787
 
 

 ##
 File path: docs/tutorials/python/types_of_data_augmentation.md
 ##
 @@ -335,7 +332,7 @@ And lastly, you can use 
[`mxnet.image.RandomOrderAug`](https://mxnet.incubator.a
 ```python
 example_image_copy = example_image.copy()
 aug_list = [
-mx.image.RandomCropAug(size=(50, 50)),
+mx.image.RandomCropAug(size=(250, 250)),
 
 Review comment:
   The previously generated image was really random; this one always shows at 
least part of the giraffe, which makes more sense IMO. The image can remain 
since it is a sensible one.




[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181875817
 
 

 ##
 File path: docs/tutorials/vision/large_scale_classification.md
 ##
 @@ -11,6 +11,11 @@ Training a neural network with a large number of images 
presents several challen
 $ pip install opencv-python
 ```
 
+```python
+import mxnet as mx
+print(mx.__version__)
 
 Review comment:
   will update




[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181875420
 
 

 ##
 File path: docs/tutorials/gluon/datasets.md
 ##
 @@ -245,7 +243,7 @@ As with the Fashion MNIST dataset the labels will be 
integer encoded. You can us
 
 
 ```python
-sample_idx = 888
+sample_idx = 539
 
 Review comment:
   No, I found the same by doing a bisect.




[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-16 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181875085
 
 

 ##
 File path: docs/tutorials/speech_recognition/ctc.md
 ##
 @@ -1,5 +1,11 @@
 # Connectionist Temporal Classification
 
+```python
+
+import mxnet as mx
+print(mx.__version__)
 
 Review comment:
   It's there to enable the generation of the .ipynb, which needs at least one 
code statement. I think this one shows users which version they are using. I 
will add 1.1.0 below; you are right that it should at least inform the user 
that this worked with 1.1.0.
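
A check for the "at least one code statement" requirement could look like the sketch below. It is only an illustration under stated assumptions: the helper names are hypothetical, and how the actual notebook converter detects code cells is not shown in this thread.

```python
FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting literal fences


def python_code_lines(markdown_text):
    """Return the lines found inside python fenced blocks of a Markdown string."""
    lines = []
    in_python_block = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not in_python_block and stripped == FENCE + "python":
            in_python_block = True
        elif in_python_block and stripped == FENCE:
            in_python_block = False
        elif in_python_block:
            lines.append(line)
    return lines


def has_code_statement(markdown_text):
    """True if some python block holds a non-blank, non-comment line."""
    for line in python_code_lines(markdown_text):
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return True
    return False


with_code = "\n".join([
    "# Connectionist Temporal Classification",
    "",
    FENCE + "python",
    "import mxnet as mx",
    "print(mx.__version__)",
    FENCE,
])
without_code = "# Connectionist Temporal Classification\n\nProse only.\n"

print(has_code_statement(with_code))
print(has_code_statement(without_code))
```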




[GitHub] ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md tutorials to .ipynb for CI integration

2018-04-13 Thread GitBox
ThomasDelteil commented on a change in pull request #10537: [MX-307] Add .md 
tutorials to .ipynb for CI integration
URL: https://github.com/apache/incubator-mxnet/pull/10537#discussion_r181295076
 
 

 ##
 File path: tests/nightly/test_tutorial_config.txt
 ##
 @@ -1,20 +1,31 @@
 basic/ndarray
+basic/ndarray_indexing
 basic/symbol
 basic/module
 basic/data
-python/linear-regression
-python/mnist
-python/predict_image
-onnx/super_resolution
-onnx/fine_tuning_gluon
-onnx/inference_on_onnx_model
-basic/ndarray_indexing
-python/matrix_factorization
+gluon/customop
+gluon/data_augmentation
+gluon/datasets
 gluon/ndarray
 gluon/mnist
 gluon/autograd
 gluon/gluon
 gluon/hybrid
+nlp/cnn
+onnx/super_resolution
+onnx/fine_tuning_gluon
+onnx/inference_on_onnx_model
+python/matrix_factorization
+python/linear-regression
+python/mnist
+python/predict_image
+python/data_augmentation
+python/data_augmentation_with_masks
+python/kvstore
+python/types_of_data_augmentation
 sparse/row_sparse
 sparse/csr
-sparse/train
 
 Review comment:
   Indeed, going forward, there will be one individual test per tutorial, to 
allow the use of annotations like `@highCpu`, `@highMemory`, `@gpu`. There will 
also be an integration test that checks that each notebook has been added to 
the test suite. 
   
   This will be part of my next PR, as part of the work of integrating 
tutorials into the CI.
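
The integration check described above, that every tutorial is registered in the test suite, might be sketched like this. It is an assumption-laden illustration rather than the planned implementation: the helper names are hypothetical, tutorials are assumed to be `.md` files under `category/name.md`, and the config is assumed to list one `category/name` entry per line, as in `test_tutorial_config.txt`.

```python
import os
import tempfile


def list_tutorials(tutorials_dir):
    """Return 'category/name' for every .md file found under tutorials_dir."""
    found = set()
    for root, _dirs, files in os.walk(tutorials_dir):
        for file_name in files:
            if file_name.endswith(".md"):
                full_path = os.path.join(root, file_name)
                relative = os.path.relpath(full_path, tutorials_dir)
                found.add(os.path.splitext(relative)[0].replace(os.sep, "/"))
    return found


def read_config(config_path):
    """Return the set of non-empty entries listed in the config file."""
    with open(config_path) as config_file:
        return {line.strip() for line in config_file if line.strip()}


def missing_from_suite(tutorials_dir, config_path):
    """Return the sorted tutorials that exist on disk but are not in the config."""
    return sorted(list_tutorials(tutorials_dir) - read_config(config_path))


# Demo with a throwaway directory tree (placeholder tutorials).
root_dir = tempfile.mkdtemp()
tutorials_dir = os.path.join(root_dir, "tutorials")
for tutorial in ("basic/ndarray", "gluon/datasets"):
    category, name = tutorial.split("/")
    os.makedirs(os.path.join(tutorials_dir, category), exist_ok=True)
    with open(os.path.join(tutorials_dir, category, name + ".md"), "w") as md_file:
        md_file.write("# placeholder\n")

config_path = os.path.join(root_dir, "test_tutorial_config.txt")
with open(config_path, "w") as config_file:
    config_file.write("basic/ndarray\n")

missing = missing_from_suite(tutorials_dir, config_path)
print(missing)
```

A test built on this would simply assert that the returned list is empty, so adding a tutorial without registering it fails the build.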



