Hi,

First off, I would like to apologize for submitting this report late. I forgot the deadline was 10h, and I take full responsibility for this mistake. I promise future reports will be submitted on time.
Here is the progress on the project: Alternative Smart Executors
https://github.com/gablabc/hpxML/tree/submodules

0- The hpx repository within hpxML has been built with clang 6.0.0 and boost 1.67 on Rostam. Some modifications had to be made to account for missing headers. The loop-convert executable, used to extract features from given loops, has also been built. The path to the HPX headers currently has to be set manually when calling the executable, but this will be changed later in the project.

1- A Python machine learning repository has been added to hpxML. It contains a Python script that runs scikit-learn's algorithms on data files. The algorithms are Support Vector Regression, neural-network regression, and k-nearest-neighbors regression. The data files currently used to train the algorithms were generated previously by Zahra, but soon I should be able to train the algorithms on my own data files. To compare the different algorithms, k-fold cross-validation is used; the error chosen to compare the regressions on the test set is the mean absolute error, and the algorithm with the lowest error will be chosen. Also, since the target values for chunk_size and prefetching distance are currently on different scales, there is an option to fit log(Y) instead of Y, which ensures that the target values are on the same scale.

2- A training data repository has been added. The goal of this repository is to provide a framework that anyone can use to generate data automatically. The algorithm folder contains various functions that apply a for_each loop to a given lambda function with different chunk_size values and output the execution time for each chunk_size candidate. The number of functions will continue to grow as the project moves on. To generate data, the user edits the training.txt file, which contains a list of the functions and the number of iterations to run.
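To make the model-selection procedure from item 1 concrete, here is a minimal, dependency-free sketch of k-fold cross-validation with mean absolute error, including the log(Y) option. It is not the repository's script: the real code uses scikit-learn's SVR, neural-network, and k-NN regressors, while this toy stands in a hand-rolled k-nearest-neighbors regressor and a mean-value baseline, and all data below is made up.

```python
# Illustrative sketch: compare two toy regressors with k-fold
# cross-validation scored by mean absolute error (MAE), as described
# in item 1. The real project uses scikit-learn models instead.
import math
import random

def knn_predict(train_x, train_y, x, k=3):
    """k-nearest-neighbors regression on 1-D inputs: average the
    targets of the k training points closest to x."""
    neighbors = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in neighbors) / k

def kfold_mae(xs, ys, predictor, n_folds=5, log_targets=False):
    """Average MAE over n_folds folds. If log_targets is set, the model
    is fit on log(y) -- so targets on very different scales, like
    chunk_size vs. prefetching distance, become comparable -- and
    predictions are mapped back with exp before scoring."""
    targets = [math.log(y) for y in ys] if log_targets else ys
    fold_size = len(xs) // n_folds
    errors = []
    for f in range(n_folds):
        lo, hi = f * fold_size, (f + 1) * fold_size
        test_x, test_y = xs[lo:hi], ys[lo:hi]
        train_x = xs[:lo] + xs[hi:]
        train_y = targets[:lo] + targets[hi:]
        for x, y in zip(test_x, test_y):
            pred = predictor(train_x, train_y, x)
            if log_targets:
                pred = math.exp(pred)
            errors.append(abs(pred - y))
    return sum(errors) / len(errors)

if __name__ == "__main__":
    random.seed(0)
    # Toy data: "execution time" roughly quadratic in chunk_size.
    xs = [random.uniform(1, 64) for _ in range(100)]
    ys = [(x - 32) ** 2 + random.uniform(0, 5) + 1 for x in xs]

    mean_baseline = lambda tx, ty, x: sum(ty) / len(ty)
    for name, model in [("mean baseline", mean_baseline),
                        ("3-NN regression", knn_predict)]:
        print(f"{name}: MAE = {kfold_mae(xs, ys, model):.2f}")
```

The model with the lowest cross-validated MAE wins, exactly as in the report; with scikit-learn, the same comparison is a loop over fitted estimators scored on each held-out fold.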
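For item 2, a hypothetical sketch of the data-generation driver may help: read a training.txt-style file listing a function name and an iteration count per line, time each function for several chunk_size candidates, and append the results to a data file. The file format, the benchmark names (light_work, heavy_work), the chunk_size candidates, and the output layout here are all assumptions for illustration; the real framework times HPX for_each loops and is driven through sbatch.

```python
# Hypothetical driver sketch: time each listed benchmark for every
# chunk_size candidate and write "name chunk_size seconds" rows.
# Names, candidates, and file formats are assumed, not the real ones.
import time

# Stand-ins for the C++ benchmark lambdas (assumed names).
def light_work(n):
    return sum(i * i for i in range(n))

def heavy_work(n):
    return sum(i ** 0.5 for i in range(n))

BENCHMARKS = {"light_work": light_work, "heavy_work": heavy_work}
CHUNK_SIZES = [1, 8, 64, 512]  # candidate chunk_size values (assumed)

def run_training(spec_lines, out_path):
    """For each 'name iterations' line, time every chunk_size candidate
    and write one 'name chunk_size seconds' row per measurement."""
    with open(out_path, "w") as out:
        for line in spec_lines:
            name, iters = line.split()
            func = BENCHMARKS[name]
            for chunk in CHUNK_SIZES:
                start = time.perf_counter()
                # The real benchmarks pass chunk to an HPX executor
                # parameter; the toy workload here ignores it.
                func(int(iters))
                elapsed = time.perf_counter() - start
                out.write(f"{name} {chunk} {elapsed:.6f}\n")

if __name__ == "__main__":
    run_training(["light_work 100", "heavy_work 50"], "train_data.txt")
    print(open("train_data.txt").read())
```

Once loop-convert feature extraction is wired in (see future work below), each row would also carry the extracted loop features, giving the Python scripts complete training samples.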
Then, by running sbatch train.sbatch training.txt, the user can automatically run all the functions, and the results will be written to a data file.

What is left to be done in the near future:

1- Currently, the data generated in the training repository does not include features extracted with loop-convert. This will have to be added to ensure that the data files contain all the information necessary to train the Python machine learning algorithms.

2- Once an optimal regression has been found using k-fold cross-validation, the algorithm will be fully implemented in Python as a way for me to get familiar with it.

Thank you very much. Once again, I would like to apologize for missing the deadline.

Gabriel Laberge
_______________________________________________
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users