[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-05 Thread Matthias Boehm (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569601#comment-16569601
 ] 

Matthias Boehm commented on SYSTEMML-2458:
--

Thanks - the adagrad results are in the repo; currently adam and sgd are 
running. One observation is that ASP-batch is much slower than BSP-batch. It's 
understandable because for BSP-batch we simply accumulate gradients and perform 
one update for all workers, but this effect should not be that pronounced.
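
To illustrate the difference in update counts mentioned above, here is a 
minimal Python sketch. It is not SystemML's paramserv code: the helper names 
(sgd_update, bsp_round, asp_round), the plain SGD rule, and the toy gradients 
are all assumptions made for illustration. With k workers, the BSP round 
accumulates k gradients and applies one update, while the ASP round applies k 
separate updates.

```python
# Toy illustration only; function names and the update rule are assumptions,
# not SystemML's paramserv implementation.

def sgd_update(model, grad, lr):
    """Apply one plain SGD step and return the new model."""
    return [w - lr * g for w, g in zip(model, grad)]

def bsp_round(model, worker_grads, lr):
    """BSP: wait for all workers, accumulate their gradients, update once."""
    acc = [0.0] * len(model)
    for grad in worker_grads:
        acc = [a + g for a, g in zip(acc, grad)]
    return sgd_update(model, acc, lr), 1  # one model update per round

def asp_round(model, worker_grads, lr):
    """ASP: apply each worker's gradient as soon as it arrives."""
    updates = 0
    for grad in worker_grads:
        model = sgd_update(model, grad, lr)
        updates += 1
    return model, updates  # one model update per worker

if __name__ == "__main__":
    grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # gradients from 3 workers
    print(bsp_round([0.0, 0.0], grads, lr=0.1))
    print(asp_round([0.0, 0.0], grads, lr=0.1))
```

The sketch only counts updates; it does not model gradient staleness, 
network transfer, or locking on the server, any of which might contribute to 
the slowdown observed in the actual experiments.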

> Add experiment on spark paramserv
> -
>
> Key: SYSTEMML-2458
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2458
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: LI Guobao
>Assignee: LI Guobao
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-05 Thread LI Guobao (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569424#comment-16569424
 ] 

LI Guobao commented on SYSTEMML-2458:
-

[~mboehm7], yes, I added the baseline experiment w/o paramserv and fixed the 
location of the SystemML-config.xml file. Additionally, I've double-checked the 
native BLAS configuration for the remote workers; it is correctly transferred 
to and set on them.



[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-04 Thread Matthias Boehm (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569372#comment-16569372
 ] 

Matthias Boehm commented on SYSTEMML-2458:
--

OK, I just kicked off a run for the LOCAL experiments with MKL. However, note 
that the SystemML-config.xml file needs to be in each of the subdirectories; 
otherwise it's not picked up correctly. Also, Intel MKL's direct conv2d 
still runs into segmentation faults on this new architecture whenever the 
batch size is larger than 64, and hence I limited it to max 64. 
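
One way to place the config file in every subdirectory is sketched below in 
Python. This is only an illustration under assumptions: the function name 
copy_config_to_subdirs and the directory layout (one subdirectory per 
experiment under a common root) are hypothetical and are not taken from the 
experiment scripts in the repo.

```python
import shutil
from pathlib import Path

def copy_config_to_subdirs(config_file, experiments_root):
    """Copy config_file into every immediate subdirectory of experiments_root.

    The layout (one subdirectory per experiment) is an assumption.
    Returns the list of paths written, in sorted directory order.
    """
    config_file = Path(config_file)
    written = []
    for subdir in sorted(Path(experiments_root).iterdir()):
        if subdir.is_dir():
            target = subdir / config_file.name
            shutil.copyfile(config_file, target)
            written.append(target)
    return written
```

For example, copy_config_to_subdirs("SystemML-config.xml", "experiments") 
would write one copy per experiment directory, where "experiments" is a 
placeholder path.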

Tomorrow, I will kick off baseline runs (e.g., without parameter server, varying 
number of workers, and with our Java backend operations). The distributed 
experiments will follow subsequently. 



[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-04 Thread Matthias Boehm (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569318#comment-16569318
 ] 

Matthias Boehm commented on SYSTEMML-2458:
--

Sure, I'm happy to kick off additional rounds of local and distributed 
experiments. For the presentation, it would also be important to have baseline 
comparisons. Could you please add the baseline without paramserv to the 
experiments? Furthermore, I'll run these experiments with MKL, so please double 
check that the native BLAS configuration is correctly set for distributed Spark 
workers as well (see the remote parfor worker setup).



[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-08-04 Thread LI Guobao (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569312#comment-16569312
 ] 

LI Guobao commented on SYSTEMML-2458:
-

[~mboehm7], since we hope to have some experiment results for the 
presentation, I have pushed the latest polished scripts and the newly packaged 
jar with the recent patches. Maybe we could continue launching the experiments?



[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-07-25 Thread Matthias Boehm (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556472#comment-16556472
 ] 

Matthias Boehm commented on SYSTEMML-2458:
--

Thanks - I just gave it a try and the script failed due to invalid name 
bindings on function invocations. I just pushed the fix. Subsequently, it ran 
into SYSTEMML-2466 - maybe you could have a look [~Guobao]?



[jira] [Commented] (SYSTEMML-2458) Add experiment on spark paramserv

2018-07-24 Thread LI Guobao (JIRA)


[ 
https://issues.apache.org/jira/browse/SYSTEMML-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554752#comment-16554752
 ] 

LI Guobao commented on SYSTEMML-2458:
-

[~mboehm7], I've pushed the scripts for the distributed spark experiments. 
Could you please take a look at them?
