Patrick and Ray,

Just for clarity's sake, I will point out that Zahra's work did include information gathered at compile time; she used a custom Clang plugin to gather this information.


That being said, we may want to utilize only runtime information in the context of this project. However, I will defer that decision to the mentors assigned to the project.

Adrian


On 2/18/2018 4:10 PM, Patrick Diehl wrote:
Hi Ray Kim,

But the original GSoC project description and the paper [1] both mention
*compiler provided* *static data*.
To my knowledge this cannot be acquired from HPX performance counters
(excuse me if it actually can).
If this project should be independent of the compiler (or library), then
I suppose I need not be concerned?
Yes, this was mentioned in the project description. But since we wrote
the description, I have looked into the blaze library and how they decide
whether to use single-core or OpenMP execution, depending on parameters
like the input size. They measure the threshold at which a parallel
execution becomes faster than the single-core execution. These thresholds
are provided for the different algorithms as constants in a header file.
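In Python (just for illustration), that kind of fixed-threshold dispatch looks roughly like this; the constant, its value, and the function name are all made up here, not Blaze's actual ones:

```python
# Illustrative sketch of Blaze-style dispatch: a fixed, per-algorithm
# constant (measured offline) decides between serial and parallel
# execution. PARALLEL_THRESHOLD is a placeholder value, not Blaze's.
PARALLEL_THRESHOLD = 38_000  # hypothetical minimum size for parallelism to pay off

def choose_execution(input_size: int) -> str:
    """Return 'parallel' if the input is large enough to amortize threading overhead."""
    return "parallel" if input_size >= PARALLEL_THRESHOLD else "serial"
```

The limitation, of course, is that a single constant only depends on the input size; the learned function proposed below can take many more parameters into account.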

I was thinking that something similar could be done with machine
learning for the different algorithms. Instead of a fixed threshold we
would have a function learned by the machine learning algorithm.

This is just my thinking on how to do this; for sure, there are many
other approaches out there. It is your task to propose any solution you
think could solve this problem.

In case of not doing any machine learning during run time, what form of
implementation are you expecting?
I was thinking of having one function per parallel algorithm

f_i(length of input, amount of cpus, ...) -> (chunk_size, used amount of
cpus)

So during run time, your function gets input parameters, like the amount
of cpus and the length of the input, and maybe many more, and it returns
the optimal chunk size and maybe the used amount of cpus, where the used
amount is the number of cpus actually used by HPX. For example, one
wants to run HPX with 3 cpus, but the input size is very small and it
would be faster to use just one cpu; your function would predict this.
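A minimal sketch of what one such f_i could look like, with an entirely made-up placeholder policy standing in for the learned model (the function name, the 10 000 threshold, and the chunking rule are all assumptions for illustration):

```python
# Hypothetical predictor f_i for one parallel algorithm: maps runtime
# parameters to (chunk_size, used_cpus). In the real project this policy
# would be replaced by a function learned from measured data.
def predict_for_transform(input_length: int, available_cpus: int):
    # Tiny inputs: parallel overhead dominates, so use one cpu and one chunk.
    if input_length < 10_000:
        return input_length, 1
    # Otherwise use up to one cpu per 10_000 elements (placeholder policy)
    # and split the input into roughly equal chunks per used cpu.
    used_cpus = min(available_cpus, max(1, input_length // 10_000))
    chunk_size = -(-input_length // used_cpus)  # ceiling division
    return chunk_size, used_cpus
```

For instance, `predict_for_transform(100, 3)` would fall back to serial execution on one cpu, matching the small-input example above.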

I was thinking that one could use performance counters to obtain, e.g.,
the /threads/idle-rate for different input parameters. One would then
have a d-dimensional input space, which could be used to learn a
function approximating these points.

A separate program that does prediction and sets all the parameters?
1) A shell script, which runs the hpx applications and saves all data to
a csv file.

2) A python script, which uses this data to learn the function f_i.

3) These learned functions should be implemented, and a possible smart
executor could use the function for each parallel algorithm to predict
the best chunk size and whether to execute in parallel or serial,
depending on the input parameters.
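Step 2 could be sketched like this; the csv columns, the sample data, and the nearest-neighbour stand-in for a learned model are all assumptions for illustration (a real script would likely use a proper regression library):

```python
# Hypothetical sketch of step 2: learn f_i from the csv produced in step 1.
import csv
import io

# Stand-in for a csv written by the shell script in step 1; the column
# names and values are made up for illustration.
SAMPLE_CSV = """input_length,cpus,best_chunk_size
1000,4,1000
100000,4,25000
1000000,4,250000
"""

def load_samples(text):
    """Parse the measurement csv into (input_length, cpus, best_chunk_size) tuples."""
    rows = csv.DictReader(io.StringIO(text))
    return [(int(r["input_length"]), int(r["cpus"]), int(r["best_chunk_size"]))
            for r in rows]

def predict_chunk_size(samples, input_length, cpus):
    """1-nearest-neighbour lookup: return the measured best chunk size of the
    closest training point -- a minimal stand-in for a learned f_i."""
    def dist(s):
        # Weight the cpu-count dimension heavily so it dominates ties.
        return abs(s[0] - input_length) + 1000 * abs(s[1] - cpus)
    return min(samples, key=dist)[2]
```

The learned model would then be exported (e.g. as coefficients in a generated header) so the smart executor in step 3 can evaluate it cheaply at run time.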

Or a meta program that does the job in compile time?
If you think this is necessary, please provide an explanation of why and
how you want to do this.

Or is this up to my proposition?
Yes, you should look into how the task could be solved and propose to
the community any solution you think could do this. We will then discuss
your proposed solution with you and improve it.

I am happy to discuss this with you next week on IRC. I think it will be
easier to clarify things there.

Best,

Patrick

On 02/18/2018 02:15 PM, 김규래 wrote:
Hi again Patrick,
Thanks for keeping track of me.

But the original GSoC project description and the paper [1] both mention
*compiler provided* *static data*.
To my knowledge this cannot be acquired from HPX performance counters
(excuse me if it actually can).
If this project should be independent of the compiler (or library), then
I suppose I need not be concerned?

> If you collect data and try to generate a model and use the trained
> model without any machine learning during run time,

In case of not doing any machine learning during run time, what form of
implementation are you expecting?
A separate program that does prediction and sets all the parameters?
Or a meta program that does the job in compile time?
Or is this up to my proposition?

Thanks for all
Ray Kim

[1] http://stellar.cct.lsu.edu/pubs/khatami_espm2_2017.pdf, p4 fig2



_______________________________________________
hpx-users mailing list
hpx-users@stellar.cct.lsu.edu
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users

--
Adrian Serio
Scientific Program Coordinator
2118 Digital Media Center
225.578.8506

