GPU hardware support for SystemML bleeding edge.

2018-03-28 Thread Janardhan Pulivarthi
Greetings,

What hardware (hw) configurations does SystemML (an open-source
project) need to support?

i.e., do we have support for all of these?
 - Kepler
 - Maxwell
 - Pascal
 - Volta

Let's say,
1. we have tuned a kernel only for a specific hw configuration (e.g. Maxwell)
with `asm()` code; can we integrate it into our project?
2. And can we skip this hw-specific optimized kernel on other,
incompatible hardware and simply use the present kernels?
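If it helps, the dispatch/fallback scheme in (1) and (2) can be sketched as a lookup on the device's compute capability; the kernel names and the capability table below are purely illustrative, not actual SystemML code:

```python
# NVIDIA compute capability major versions for each architecture
ARCH_BY_CC_MAJOR = {3: "Kepler", 5: "Maxwell", 6: "Pascal", 7: "Volta"}

# Architectures for which a hand-tuned (e.g. asm()-based) kernel exists;
# hypothetical kernel names for illustration only
TUNED_KERNELS = {"Maxwell": "matmul_maxwell_asm"}

def select_kernel(cc_major, op="matmul"):
    """Return the kernel to launch for the given compute capability."""
    arch = ARCH_BY_CC_MAJOR.get(cc_major)
    # Use the tuned kernel only on matching hardware; generic otherwise.
    return TUNED_KERNELS.get(arch, op + "_generic")

print(select_kernel(5))  # Maxwell -> tuned kernel
print(select_kernel(7))  # Volta   -> generic fallback
```

With a scheme like this, incompatible hardware simply never sees the tuned kernel, which is exactly the behaviour asked about in (2).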

@Niketan - If the answer to 1 & 2 is yes, then we first tune for
Maxwell and then support the others case by case, as they (OpenAI) have done.

Thanks,
Janardhan


Factorization Machines addition to our library.

2018-03-15 Thread Janardhan Pulivarthi
Hi all,

we recently added factorization machines, a core layer, to our `nn` library,
along with regression and classification scripts.

What is great about it?
--
1. Factorization machines handle the cases where SVMs fail, i.e., where
the information is very sparse.
2. In this sparse situation, the algorithm factorizes all the inputs,
according to how deeply we want to correlate input matrix entries.
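For intuition, here is a minimal sketch of the 2-way factorization machine model (Rendle-style) using the O(n*k) reformulation of the pairwise term; the variable names are illustrative, not the actual `fm.dml` API:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """2-way FM: w0 + <w, x> + sum_{i<j} <v_i, v_j> x_i x_j.
    x: (n,) features, w0: bias, w: (n,) linear weights, V: (n, k) factors."""
    # O(n*k) trick for the pairwise interactions:
    #   sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
    return w0 + x @ w + 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))

rng = np.random.default_rng(0)
x, w, V = rng.random(6), rng.random(6), rng.random((6, 3))
w0 = 0.1
y = fm_predict(x, w0, w, V)
```

The shared factors `V` are what let the model estimate interactions between feature pairs that never co-occur in the training data, which is why it copes with sparsity where an SVM cannot.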

Factorization machines are highly scalable and are already being used at
Google. If anyone would like to scale them, I would be very happy to be involved.

Thanks,
Janardhan


[TESTS] What are the best practices for dml script testing. Thanks.

2018-01-11 Thread Janardhan Pulivarthi
Hi prithvirajsen,

If a script contains gradients, then we can compare them with
numerical gradients, as we have done in the `grad_check.dml` file.
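In Python terms, that numerical-gradient comparison could look roughly like this (a sketch of the idea, not the actual `grad_check.dml` code):

```python
# Compare an analytic gradient against a central-difference numerical
# gradient and report the worst relative error.
def num_grad(f, x, eps=1e-5):
    return [(f(x[:i] + [xi + eps] + x[i+1:]) -
             f(x[:i] + [xi - eps] + x[i+1:])) / (2 * eps)
            for i, xi in enumerate(x)]

# Example: f(x) = x0^2 + 3*x1, analytic gradient (2*x0, 3)
f = lambda x: x[0] ** 2 + 3 * x[1]
x = [1.5, -2.0]
analytic = [2 * x[0], 3.0]
numeric = num_grad(f, x)
rel_err = max(abs(a - n) / max(abs(a), abs(n), 1e-12)
              for a, n in zip(analytic, numeric))
```

A small relative error (say below 1e-6) is the usual pass criterion for such a check.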

How can we place a test script alongside a main script (dml), so that we can be
confident that the script works? (just like a JUnit test for Java files)

Thanks,
Janardhan.


Can BITWISE_XOR be added. Thanks.

2018-01-01 Thread Janardhan Pulivarthi
Hi all,

I am using the following line of code:
`
x_tmp = bitwise_xor( v_index, as.scalar(V[v_index,]) )
`
Can this `bitwise_xor` be added? I will try to take up the task if it
fits into dml.
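Until a builtin exists, XOR can in principle be emulated with only arithmetic and modulo, i.e. operations already available at script level; a hedged Python sketch of that idea:

```python
def bitwise_xor(a, b):
    """XOR of two non-negative integers using arithmetic only."""
    result, place = 0, 1
    while a > 0 or b > 0:
        # XOR of the lowest bits is their sum modulo 2
        result += ((a % 2 + b % 2) % 2) * place
        a, b, place = a // 2, b // 2, place * 2
    return result

print(bitwise_xor(12, 10))  # 6, same as 12 ^ 10
```

A builtin would of course be far cheaper than this bit-by-bit loop, which is the motivation for the request.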

Thanks,
Janardhan


Re: [DISCUSS] Roadmap SystemML 1.1 and beyond

2017-12-09 Thread Janardhan Pulivarthi
Hi all, my $0.02 - I am working on these one by one.

Please add the following to the roadmap list:

0. Algorithms
* Factorization machines, with regression & classification capabilities, with
the help of nn layers. [SYSTEMML-1437]
* A test suite for the nn optimizers, with well-known optimization test
functions. [SYSTEMML-1974]

1. Deep Learning
* I am working on model selection + hyperparameter optimization; a basic
implementation will be possible by January. [SYSTEMML-1973] - some components
of it are in the testing phase now.
* I think distributed DL is a great idea, & it may be necessary now.

2. GPU backends
* Support for sparse operations - [SYSTEMML-2041]. Implementation of a
block-sparse kernel would enable us to model an LSTM with 10,000 hidden
units, instead of the current state-of-the-art 1,000 hidden units.

6. Misc. compiler
* Support for single-output UDFs in expressions.
* SPOOF compiler improvements
* Rewrites

8. Builtin functions
* Well-known distribution functions - Weibull, Gamma, etc.
* Generalization of operations, such as xor, and, and other logical operations.

9. Documentation improvement.

Thanks,
Janardhan

On Sat, Dec 9, 2017 at 8:11 AM, Matthias Boehm  wrote:

> Hi all,
>
> with our SystemML 1.0 release around the corner, I think we should start
> the discussion on the roadmap for SystemML 1.1 and beyond. Below is an
> initial list as a starting point, but please help to add relevant items,
> especially for algorithms and APIs, which are barely covered so far.
>
> 1) Deep Learning
>  * Full compiler integration GPU backend
>  * Extended sparse operations on CPU/GPU
>  * Extended single-precision support CPU
>  * Distributed DL operations?
>
> 2) GPU Backend
>  * Full support for sparse operations
>  * Automatic decisions on CPU vs GPU operations
>  * Graduate GPU backends (enable by default)
>
> 3) Code generation
>  * Graduate code generation (enable by default)
>  * Support for deep learning operations
>  * Code generation for the heterogeneous HW, incl GPUs
>
> 4) Compressed Linear Algebra
>  * Support for matrix-matrix multiplications
>  * Support for deep learning operations
>  * Improvements for ultra-sparse datasets
>
> 5) Misc Runtime
>  * Large dense matrix blocks > 16GB
>  * NUMA-awareness (thread pools, matrix partitioning)
>  * Unified memory management (ops, bufferpool, RDDs/broadcasts)
>  * Support feather format for matrices and frames
>  * Parfor support for broadcasts
>  * Extended support for multi-threaded operations
>  * Boolean matrices
>
> 6) Misc Compiler
>  * Support single-output UDFs in expressions
>  * Consolidate replicated compilation chain (e.g., diff APIs)
>  * Holistic sum-product optimization and operator fusion
>  * Extended sparsity estimators
>  * Rewrites and compiler improvements for mini-batching
>  * Parfor optimizer support for shared reads
>
> 7) APIs
>  * Python Binding for JMLC API
>  * Consistency Python/Java APIs
>
>
> Regards,
> Matthias
>


add. of matrices of different dim. (the bias term)

2017-12-06 Thread Janardhan Pulivarthi
Hi Matthias,

When adding the biases... For example:

/*
 * Inputs:
 *  - X: Inputs, of shape (N, D).
 *  - W: Weights, of shape (D, M).
 *  - b: Biases, of shape (1, M).
 *
 * Outputs:
 *  - out: Outputs, of shape (N, M).
 */

out = (X %*% W) + b; # this script works
out = b + (X %*% W); # throws a dimension-incompatibility error

This can be confusing for dml authors (it tripped me up for a month).
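For comparison, NumPy broadcasting makes the same addition commutative, which is the behaviour the snippet above would ideally match:

```python
import numpy as np

# (N, M) + (1, M) broadcasts the bias row over all N rows, in either order.
N, D, M = 4, 3, 2
X = np.ones((N, D)); W = np.ones((D, M)); b = np.ones((1, M))
out1 = (X @ W) + b   # (N, M) + (1, M)
out2 = b + (X @ W)   # (1, M) + (N, M) -- also fine in NumPy
assert (out1 == out2).all() and out1.shape == (N, M)
```
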

Thanks,
Janardhan


My life made easier, now!

2017-10-30 Thread Janardhan Pulivarthi
Hi all,

After 5 months of struggling to install Apache SystemML (after more
than 150 failed installations...), I finally got the message
"Welcome to Apache SystemML!".


Now, I am not going to bother our Jenkins for testing my patches.

Thanks everyone,
Janardhan


CI test environment configuration

2017-10-23 Thread Janardhan Pulivarthi
Hi all,

Does anybody know how to set up the Travis CI configuration to test:

   1. the Spark backend, for executing a script like this (spark-submit
   SystemML.jar -f nn/test/run_tests.dml)
   2. Hadoop


I have had some success with the Maven build script here
(https://github.com/j143/systemml/blob/test/.travis.yml).
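For what it's worth, here is a hedged sketch of what such a stage might look like; the Spark version, download URL, and paths below are assumptions, not a verified configuration:

```yaml
language: java
jdk: openjdk8
install:
  - mvn -q package -DskipTests
  # Fetch a Spark binary so spark-submit is available (assumed version/mirror)
  - wget -q https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
  - tar -xzf spark-2.1.0-bin-hadoop2.7.tgz
script:
  # Run the nn test suite through spark-submit, as described above
  - ./spark-2.1.0-bin-hadoop2.7/bin/spark-submit target/SystemML.jar -f nn/test/run_tests.dml
```
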


Thank you very much in advance,

- Janardhan


Re: Regarding `enable remote hyperparameter tuning`[BLOCKER issue]. Thanks.

2017-10-12 Thread Janardhan Pulivarthi
Hi all,

*Agenda: *First of all, I would just like to have decision support, because
Bayesian optimization is a huge algorithm with a lot of functions, & there
may be many ways to exploit parallelism.

Bayes:
1. An intro to Bayesian optimization, and how we are going to use it.
(understanding the definition)
2. There are around 20 functions that need to be implemented in total. (there
are four functions I did not understand.)
3. There are around 5 distributions, some of which are already supported by
SystemML as DML builtin functions.
4. Inputs - observations, functions; outputs - an optimized selection of
hyperparameters. Discussion of input & output behaviour.

Sobol:
1. An intro to the sampling robustness of Sobol sequence generation.
2. Implementation approach, depending on the primitive polynomials used.
3. There are 5 steps in the implementation; a little discussion on how to
implement them.
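As a concrete (hedged) illustration of the generation step, the first dimension of a Sobol sequence can be produced with a Gray-code XOR recurrence; this sketch uses the trivial direction numbers of dimension one, while higher dimensions would need direction numbers derived from primitive polynomials:

```python
BITS = 30
# Direction integers for Sobol dimension one (m_k = 1 for all k)
V = [1 << (BITS - k) for k in range(1, BITS + 1)]

def sobol_1d(n):
    points, x = [], 0
    for i in range(n):
        points.append(x / 2.0 ** BITS)
        # c = index (from 0) of the lowest zero bit of i
        c, j = 0, i
        while j & 1:
            j >>= 1
            c += 1
        x ^= V[c]  # the XOR recurrence: next point from the previous one
    return points

print(sobol_1d(4))  # [0.0, 0.5, 0.75, 0.25]
```

The XOR in the recurrence is exactly why a `bitwise_xor` builtin matters for this work.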

Surrogate slice sampling:
1. The slice sampling algorithm; a brief look at how it actually works.
2. Discussion of some of the functions & how they actually need to be
implemented.
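A plain (non-surrogate) univariate slice sampler, which the surrogate method builds on, can be sketched as follows; this is an illustrative Python version of the stepping-out/shrinkage procedure, not SystemML code:

```python
import math, random

def slice_sample(log_f, x0, n, w=1.0, seed=42):
    """Draw n samples from the unnormalized log-density log_f, starting at x0."""
    rng = random.Random(seed)
    xs, x = [], x0
    for _ in range(n):
        # 1. draw a slice height uniformly under the density at x
        log_y = log_f(x) + math.log(rng.random())
        # 2. step out an interval [l, r] that contains the slice
        l = x - w * rng.random()
        r = l + w
        while log_f(l) > log_y:
            l -= w
        while log_f(r) > log_y:
            r += w
        # 3. shrink the interval until a point inside the slice is found
        while True:
            x1 = l + (r - l) * rng.random()
            if log_f(x1) > log_y:
                x = x1
                break
            if x1 < x:
                l = x1
            else:
                r = x1
        xs.append(x)
    return xs

# Sample from a standard normal; the sample mean should be near 0
samples = slice_sample(lambda z: -0.5 * z * z, 0.0, 2000)
```

The surrogate variant replaces `log_f` with a function of the GP's latent variables, but the stepping-out/shrinkage core is the same.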

Let's move this forward.

Thank you very much,
Janardhan


On Mon, Oct 9, 2017 at 11:40 PM, Niketan Pansare <npan...@us.ibm.com> wrote:

> Hi Janardhan,
>
> I am available anytime on thursday or friday this week works for me. I
> would recommend sending an agenda before scheduling the meeting.
>
> Thanks,
>
> Niketan.
>
> - Original message -
> From: Janardhan Pulivarthi <janardhan.pulivar...@gmail.com>
> To: Mike Dusenberry <dusenberr...@gmail.com>, dev@systemml.apache.org,
> Niketan Pansare <npan...@us.ibm.com>, Alexandre V Evfimievski <
> evf...@us.ibm.com>
> Cc:
> Subject: Regarding `enable remote hyperparameter tuning`[BLOCKER issue].
> Thanks.
> Date: Mon, Oct 9, 2017 8:21 AM
>
> @niketan - I don't have a time preference, please give me any time (or)
> date for meeting at your convenience. Thanks.
>
> Hi Mike,
>
> This issue [https://issues.apache.org/jira/browse/SYSTEMML-1159]
> has been marked as a blocker. I've gone through the reference paper you
> have attached there.
>
> To my knowledge, this paper stresses the point that
> 1. `random sampling` is better - it is equivalent to expert tuning, or
> better than `grid search`.
> 2. They proved this with some `mnist` variants.
>
> So, this random sampling can be simulated by `Sobol sequence generation`,
> a method that we are trying to implement for the Bayesian optimization case.
>
> Conclusion: Niketan, Sasha and I are trying to schedule a conversation;
> can you please join us?
>
> Thanks,
> Janardhan
>


Regarding `enable remote hyperparameter tuning`[BLOCKER issue]. Thanks.

2017-10-09 Thread Janardhan Pulivarthi
@niketan - I don't have a time preference, please give me any time (or)
date for meeting at your convenience. Thanks.

Hi Mike,

This issue [https://issues.apache.org/jira/browse/SYSTEMML-1159] has been
marked as a blocker. I've gone through the reference paper you have
attached there.

To my knowledge, this paper stresses the point that
1. `random sampling` is better - it is equivalent to expert tuning, or better
than `grid search`.
2. They proved this with some `mnist` variants.

So, this random sampling can be simulated by `Sobol sequence generation`, a
method that we are trying to implement for the Bayesian optimization case.

Conclusion: Niketan, Sasha and I are trying to schedule a conversation; can
you please join us?

Thanks,
Janardhan


Minor script changes for SVM with `MLContext`, `spark_submit` etc.

2017-10-03 Thread Janardhan Pulivarthi
Hi Matthias,

Based on your comment here
[https://github.com/apache/systemml/pull/529#issuecomment-316253791], I've
updated the SVM scripts (I believe).

Can you please have a look at whether the changeset
[https://github.com/apache/systemml/pull/673/files] needs to be updated? :)

Thanks,
Janardhan


[Question] SVM & GLM test with MLContext

2017-10-03 Thread Janardhan Pulivarthi
Hi Deron,

I would like to resolve the testing of algorithms through `MLContext` in 2
phases.

1st phase - For the time being, I am trying to test the input/output behaviour
of `Algorithms` through `MLContext`.

2nd phase - Here, with the help of a domain expert, I will try to validate the
output matrices or strings (depending on the problem, as in
`UnivariateStatsTest`).

Completed checks:
1. LinearRegression scripts

Ongoing:
1. GLM scripts
2. SVM scripts - @deron, the integration test is failing; is it because of
my stale branch? - thanks. :)

Thank you very much,
Janardhan


Re: [QUESTION] XOR operations in SystemML. Thanks.

2017-09-22 Thread Janardhan Pulivarthi
Hi Matthias,

In your previous mail - you wrote,

> Down the road, we can think about a generalization of our existing
cumulative operations such as cumsum, cumprod, cummax, to arbitrary cell
computations and aggregation functions, which would be useful for quite a
number of applications.

> would be to generalize our AND, OR, NOT, (and XOR) operations to
matrices, because
these are currently only supported over scalars.

Has a JIRA been created for this? (Would you mind creating a JIRA for me,
with a little description of the task?) Is there anything that you would
like to add to the above tasks, so we can handle everything in one nice PR at once?
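For reference, the intended matrix generalization could mirror R's elementwise `xor`, where any nonzero cell is treated as TRUE; a NumPy sketch of that assumed behaviour:

```python
import numpy as np

# Elementwise logical XOR over matrices, R-style: nonzero counts as TRUE,
# and the result is returned as a 0/1 matrix.
A = np.array([[0, 1], [2, 0]])
B = np.array([[0, 3], [0, 4]])
out = np.logical_xor(A != 0, B != 0).astype(float)
print(out.tolist())  # [[0.0, 0.0], [1.0, 1.0]]
```

The same nonzero-is-TRUE convention would presumably apply to matrix AND, OR, and NOT as well.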

Thanks,
Janardhan


On Thu, Sep 7, 2017 at 12:18 PM, Matthias Boehm <mboe...@googlemail.com>
wrote:

> thanks Janardhan, in that case I would recommend to go with R syntax
> because (1) it's actually one of our selling points that users don't have
> to learn a new language, (2) it simplifies the porting of R scripts to DML
> (and vice versa), and (3) I would think it's rather uncommon to have long
> chains of xor operations.
>
> Btw, a nice follow-up task - in case you're interested - would be to
> generalize our AND, OR, NOT, (and XOR) operations to matrices, because
> these are currently only supported over scalars. It would get you in touch
> with the parser validation, compilation, and runtime of rather simple
> operations.
>
> Regards,
> Matthias
>
> On Mon, Sep 4, 2017 at 8:57 PM, Janardhan Pulivarthi <
> janardhan.pulivar...@gmail.com> wrote:
>
> > Hi,
> >
> > yes, no big reason to deviate. But, simplicity of `*a (+) b*` or `*a ><
> > b*`like
> > ` *a ^ 2*` (compared to `*xor(a, b)`*, which of the type` *pow(a, 2) `*),
> > to be consistent with the other symbols of dml.
> >
> > In this simple case:
> > 1. ` *a (+) b (+) c (+) d(+)...* `
> > 2. ` *xor(xor(a, b), c)..)  ` (*sorry, if I written this syntax wrongly)
> >
> > Your word will be final.
> >
> > Thanks,
> > Janardhan
> >
> >
> > On Mon, Sep 4, 2017 at 6:46 PM, Matthias Boehm <mboe...@googlemail.com>
> > wrote:
> >
> > > Could we please stick to R syntax (i.e., "xor(a, b)") here, unless
> there
> > is
> > > a good reason to deviate? Thanks.
> > >
> > > Regards,
> > > Matthias
> > >
> > > On Mon, Sep 4, 2017 at 7:55 AM, Janardhan Pulivarthi <
> > > janardhan.pulivar...@gmail.com> wrote:
> > >
> > > > Hi all, [XOR symbol]
> > > >
> > > > Now, I gave a sample try for the XOR operator, with caret ` ^ `
> symbol.
> > > > But, this have been reserved for exponentiation. So, another
> > alternative
> > > > would be
> > > >
> > > > 1. ` (+) `
> > > > 2. ` >< `
> > > > 3. ` >-< `
> > > >
> > > > Thanks,
> > > > Janardhan
> > > >
> > > > On Thu, Aug 31, 2017 at 7:38 PM, Matthias Boehm <
> > mboe...@googlemail.com>
> > > > wrote:
> > > >
> > > >> From a scalar operation perspective, you could of course emulate XOR
> > via
> > > >> AND, OR, and negation. However, you might want to write anyway a
> > > java-based
> > > >> UDF to efficiently implement this recursive operator.
> > > >>
> > > >> Down the road, we can think about a generalization of our existing
> > > >> cumulative operations such as cumsum, cumprod, cummax, to arbitrary
> > cell
> > > >> computations and aggregation functions, which would be useful for
> > quite
> > > a
> > > >> number of applications.
> > > >>
> > > >> Regards,
> > > >> Matthias
> > > >>
> > > >> On Thu, Aug 31, 2017 at 5:59 AM, Janardhan Pulivarthi <
> > > >> janardhan.pulivar...@gmail.com> wrote:
> > > >>
> > > >>> Hi,
> > > >>>
> > > >>> The following is an equation (2.4) from the algorithm for the
> > > generation
> > > >>> of sobol sequences. The authors of the paper have utilized the
> > bitwise
> > > >>> operations of C++ to calculate this efficiently.
> > > >>>
> > > >>> *Now, the question is:* Can we do this at script level (in dml) or
> we
> > > >>> should do it in the `java` itself as a builtin, function to
> generate
> > > the
> > > >>> numbers?.
> > > >>>
> > > >>>
> > > >>> Thanks,
> > > >>> Janardhan
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
>


Re: [QUESTION] about initialization of outputs , depends on if-else execution. Thanks.

2017-09-08 Thread Janardhan Pulivarthi
Hi Niketan,

1. Please correct me here ->
https://github.com/apache/systemml/pull/651/files#diff-ef5e4f92ced39a334cdce976800ace26R139

2. Please refer to this comment ->
https://github.com/apache/systemml/pull/651#issuecomment-326902771

Thank you,
Janardhan


Re: Bayesian optimizer support for SystemML.

2017-09-04 Thread Janardhan Pulivarthi
Hi Sasha, Niketan, and Mike, (sorry if I missed anyone)

So far we have encountered some problems and situations that need some
more thinking. But until then, let us start a preliminary script for
checking different scenarios with our existing top-level algorithms and
deep learning algorithms.

Along with the previously proposed ones, we can try:
1. Constraints (both constrained & unconstrained settings)

2. A convergence rate check, maybe for settling on a prior (and our
convergence criteria, based upon "Convergence Rates for Efficient Global
Optimization Algorithms": https://arxiv.org/pdf/1101.3501v3.pdf )

Maybe we could implement several priors, instead of one in particular.
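As one illustrative piece of such a preliminary script, the candidate-selection step could use an expected-improvement acquisition; this is a hedged Python sketch (for minimization), with `mu`/`sigma` standing in for a GP posterior that is not shown:

```python
import math

def expected_improvement(mu, sigma, best, xi=0.01):
    """EI at a candidate: mu/sigma = GP posterior mean/std, best = incumbent min."""
    if sigma <= 0.0:
        return 0.0
    z = (best - mu - xi) / sigma  # predicted improvement in std units
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))      # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return (best - mu - xi) * cdf + sigma * pdf

# A candidate predicted well below the incumbent has higher EI
assert expected_improvement(0.2, 0.1, 1.0) > expected_improvement(0.9, 0.1, 1.0)
```

The `xi` exploration parameter and the choice of acquisition (EI vs. UCB, etc.) would be among the decisions to settle in the design document.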


I am planning to keep my schedule free for a month to focus only on this
implementation. Owing to its importance for neural networks, where we
need less memory consumption, especially to fit into GPUs, it would be
great if we could ship this with the `1.0` release.

*Design document: *http://bit.do/systemml-bayesian

Thank you very much,
Janardhan



On Wed, Aug 23, 2017 at 4:04 PM, Alexandre V Evfimievski <evf...@us.ibm.com>
wrote:

> Hi Janardhan,
>
> The number of parameters could be rather large, that's certainly an issue
> for Bayesian Optimization.  A perfect implementation would, perhaps, pick a
> sample of parameters and a sample of the dataset for every iteration.  It
> seems that Sobol sequences require generating primitive polynomials of
> large degree.  What is better: a higher-dimensional B.O., or a
> lower-dimensional one combined with parameter sampling?  Probably the
> latter.  By the way, in cases where parameters feed into heuristics, there
> may be considerable independence across the set of parameters, especially
> when conditioned by a specific dataset record.  Each heuristic targets
> certain situations that arise in some records.  Not sure how to take
> advantage of this.
>
> Thanks,
> Sasha
>
>
>
> From:Janardhan Pulivarthi <janardhan.pulivar...@gmail.com>
> To:Alexandre V Evfimievski <evf...@us.ibm.com>, npan...@us.ibm.com,
> dev@systemml.apache.org
> Date:08/10/2017 09:39 AM
>
> Subject:Re: Bayesian optimizer support for SystemML.
> --
>
>
>
> Hi Sasha,
>
> And one more thing, I would like to ask, what are you thinking about
> `sobol` function. What is the dimension requirement and pattern of
> sampling?. Please help me understand, what are the tasks exactly that we
> are going to optimize, in SystemML.
>
> Surrogate slice sampling - What are your thoughts about it.
>
> Thank you very much,
> Janardhan
>
> On Wed, Jul 26, 2017 at 12:25 AM, Alexandre V Evfimievski <
> *evf...@us.ibm.com* <evf...@us.ibm.com>> wrote:
> Hi, Janardhan,
>
> We are still studying Bayesian Optimization (B.O.), you are ahead of us!
> Just one comment:  The "black box" loss function that is being optimized is
> not always totally black.  Sometimes it is a sum of many small black-box
> functions.  Suppose we want to train a complex system with many parameters
> over a large dataset.  The system involves many heuristics, and the
> parameters feed into these heuristics.  We want to minimize a loss
> function, which is a sum of individual losses per each data record.  We
> want to use B.O. to find an optimal vector of parameters.  The parameters
> affect the system's behavior in complex ways and do not allow for the
> computation of a gradient.  However, because the loss is a sum of many
> losses, when running B.O., we have a choice: either to run each test over
> the entire dataset, or to run over a small sample of the dataset (but try
> more parameter vectors per hour, say).  The smaller the sample, the higher
> the variance of the loss.  Not sure which implementation of B.O. is the
> best to handle such a case.
>
> Thanks,
> Alexandre (Sasha)
>
>
>
> From:Janardhan Pulivarthi <*janardhan.pulivar...@gmail.com*
> <janardhan.pulivar...@gmail.com>>
> To:*dev@systemml.apache.org* <dev@systemml.apache.org>
> Date:07/25/2017 10:33 AM
> Subject:Re: Bayesian optimizer support for SystemML.
> --
>
>
>
> Hi Niketan and Mike,
>
> As we are trying to implement this Bayesian Optimization, should we take
> input from more committers as well as this optimizer approach seems to have
> a couple of ways to implement. We may need to find out which suits us the
> best.
>
> Thanks,
> Janardhan
>
> On Sat, Jul 22, 2017 at 3:41 PM, Janardhan Pulivarthi <
> *janardhan.pulivar...@gmail.com* <janardhan.pulivar...@gmail.com>> wrote:
>
> > Dear committers,
> >
> > We will be plannin

Re: [QUESTION] XOR operations in SystemML. Thanks.

2017-09-04 Thread Janardhan Pulivarthi
Hi all, [XOR symbol]

Now, I gave a sample try for the XOR operator with the caret ` ^ ` symbol.
But this has been reserved for exponentiation. So, other alternatives
would be

1. ` (+) `
2. ` >< `
3. ` >-< `

Thanks,
Janardhan

On Thu, Aug 31, 2017 at 7:38 PM, Matthias Boehm <mboe...@googlemail.com>
wrote:

> From a scalar operation perspective, you could of course emulate XOR via
> AND, OR, and negation. However, you might want to write anyway a java-based
> UDF to efficiently implement this recursive operator.
>
> Down the road, we can think about a generalization of our existing
> cumulative operations such as cumsum, cumprod, cummax, to arbitrary cell
> computations and aggregation functions, which would be useful for quite a
> number of applications.
>
> Regards,
> Matthias
>
> On Thu, Aug 31, 2017 at 5:59 AM, Janardhan Pulivarthi <
> janardhan.pulivar...@gmail.com> wrote:
>
>> Hi,
>>
>> The following is an equation (2.4) from the algorithm for the generation
>> of sobol sequences. The authors of the paper have utilized the bitwise
>> operations of C++ to calculate this efficiently.
>>
>> *Now, the question is:* Can we do this at script level (in dml) or we
>> should do it in the `java` itself as a builtin, function to generate the
>> numbers?.
>>
>>
>> Thanks,
>> Janardhan
>>
>
>


Re: Memory estimates equal to zero

2017-09-04 Thread Janardhan Pulivarthi
Hi Nantia,

Yes, even negative estimates have happened sometimes, as follows (although
this was fixed):
```

DEBUG opt.Optimizer: --- RULEBASED OPTIMIZER ---
DEBUG opt.Optimizer: RULEBASED OPT: Optimize w/
max_mem=12743MB/525MB/525MB, max_k=16/144/144).
WARN opt.CostEstimator: Cannot get memory estimate for hop
(op=BIAS_ADD, name=4_out, memest=-1.238822E7).
WARN opt.CostEstimator: Cannot get memory estimate for hop
(op=DIRECT_CONV2D, name=4_out, memest=-7612708.0).

WARN opt.CostEstimator: Cannot get memory estimate for hop (op=BIAS_ADD,
name=5_out, memest=-3096956.0).
```
As far as I can tell, some possible causes might be:
1. an unspecified degree of parallelism and/or execution type (in your case it
is CP)
2. the offsetting of the worst-case estimates and optimized estimates.

If you need additional help, you might want to provide more details:
1. the dml function used
2. and its configuration


Thanks,
Janardhan



On Mon, Sep 4, 2017 at 5:24 AM, Nantia Makrynioti 
wrote:

> Hello,
>
> I generated a HOP plan and memory estimates for the input data of HOP
> operators are all 0MB. The code runs locally and the execution type is CP.
> What could be the case where memory is estimated to 0MB?
>
> Thank you very much,
> Nantia
>


[QUESTION] XOR operations in SystemML. Thanks.

2017-08-31 Thread Janardhan Pulivarthi
Hi,

The following is an equation (2.4) from the algorithm for the generation of
sobol sequences. The authors of the paper have utilized the bitwise
operations of C++ to calculate this efficiently.

*Now, the question is:* Can we do this at the script level (in dml), or
should we do it in `java` itself as a builtin function to generate the
numbers?


Thanks,
Janardhan


Gentle ping for help on all my PRs. Thanks.

2017-08-23 Thread Janardhan Pulivarthi
Dear committers,

I feel that my contributions are not well organized, so I am listing them
here, along with the help required from volunteers.

1. [SYSTEMML-1437] - factorization machines
 [*in progress*]
 - Till now, I have implemented the `*fm.dml*` core module.
 - *Help: *I am unclear as to how the `i/o` will look for the example
implementations, such as regression. A sample script for this might help
me complete all the examples: *regression*, *classification*, & *ranking*.


2. [SYSTEMML-1645] - Verify whether all scripts work with MLContext &
automate  [*in progress*]
- This PR tries to write test scripts for all top-level algorithms
for the new MLContext.
- I am working with *Jerome* on this. Once he verifies all the scripts,
I will add the tests for them.
- *Help: *Can anybody help review this PR and suggest what is missing
in it? I am getting script execution failures.

3. [SYSTEMML-1444] - UDFs w/ single output in expressions
 [*in progress*]
   - The objective is to make UDFs callable from expressions. I've gone
through all the Hop and Lop implementations, compiler, parser, and API to
have a clear picture.
   - I am still making my way through this.
   - *Help: *Hi Matthias, I tried implementing a lop *FunctionCallCPSingle.java*

4. [SYSTEMML-1216] - implement local svd() function
 [*done*]
   - With the previously implemented local svd(), I've added small
improvements and tests.
   - *Help: *This is ready to be merged (I believe).

5. [SYSTEMML-1214] Implement scalable version of singular value
decomposition  [*issue with
the testing?*]
- This PR depends on the above PR; it implements distributed svd
based on the already implemented distributed qr(), and then calculates the
local svd() of the obtained R matrix.
- *Help: *I have implemented it preliminarily, but how should I test
for scalability? Can I do that on *Jenkins CI*, or do I need to run it on
a cluster?

6. [SYSTEMML-979] Add support for Bayesian Optimization.
 [*a lot to be done*]
- There's a lot of work in progress, but I implemented a skeleton with
bad syntax. (I'll improve this soon.)
- Many improvements have to be made; better operations need to be kept
and loops need to be completely eliminated.

Thanks all for the support,
Janardhan


Re: Numerical accuracy of DML.

2017-08-20 Thread Janardhan Pulivarthi
Thanks for the detailed explanation. I found the paper "*Scalable
and Numerically Stable Descriptive Statistics in SystemML*" helpful for
understanding more about the implementation.

Yes, doing something like this would definitely cause a little trouble for
the algorithm developers who work at the script level.
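For reference, the compensated (Kahan) summation scheme behind `KahanPlus` can be sketched in a few lines; this is an illustration of the idea, not the SystemML implementation:

```python
def kahan_sum(values):
    total, comp = 0.0, 0.0
    for v in values:
        y = v - comp            # re-inject previously lost low-order bits
        t = total + y           # big + small: low-order bits of y may drop here
        comp = (t - total) - y  # algebraically zero; recovers what was dropped
        total = t
    return total

# 0.1 is not exactly representable, so plain summation drifts with n,
# while the compensated sum stays accurate to the last bit or so.
vals = [0.1] * 1000
err_kahan = abs(kahan_sum(vals) - 100.0)
err_plain = abs(sum(vals) - 100.0)
```

The key point is that the per-step error bound is independent of the number of summands, unlike plain accumulation.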

Thank you,
Janardhan

On Sat, Aug 19, 2017 at 3:14 AM, Matthias Boehm <mboe...@googlemail.com>
wrote:

> Good question - let me separate the somewhat orthogonal aspects to it.
>
> First, for descriptive statistics such as sum, mean, skewness, or kurtosis,
> we already use numerically stable implementations based on Kahan Plus (see
> org.apache.sysml.runtime.functionobjects.KahanPlus if your interested).
> However, for performance reasons, operations like matrix multiplication
> rely on the basic multiply and adds (except for block aggregations of
> distributed operations which also use KahanPlus).
>
> Second, for comparisons, we do simply rely on Java's builtin operators.
> Once we extend the rather limited NaN support, we should change that to
> Double.compare accordingly. However, both of these alternatives check for
> exact matches. Hence, for comparisons of equivalence on script level, it's
> usually a better idea to compare with a tolerance as follows:
> abs(1-val)<10e-4 instead of val==1. Doing something like this inside the
> builtin operations would probably create more problems and confusion than
> it helps.
>
> Regards,
> Matthias
>
>
> On Fri, Aug 18, 2017 at 11:39 PM, Janardhan Pulivarthi <
> janardhan.pulivar...@gmail.com> wrote:
>
> > Dear committers,
> >
> > May I know the numerical accuracy of dml at present, and are you planning
> > to increase it. It seems for comparison operators we have depended upon
> > java numerical floating point accuracy.
> >
> > Thank you very much,
> > Janardhan
> >
>


Numerical accuracy of DML.

2017-08-19 Thread Janardhan Pulivarthi
Dear committers,

May I know the numerical accuracy of dml at present, and are you planning
to increase it? It seems that for comparison operators we have depended upon
Java's floating-point accuracy.

Thank you very much,
Janardhan


Re: svd( ) implementation

2017-07-28 Thread Janardhan Pulivarthi
Thanks, Imran. As per the paper, we first perform a QR decomposition of the
input matrix (A), from which we obtain R. Then we compute svd(R) using the
builtin local function (I think). I'll try this.

Tall-skinny matrix: so, do we have a problem with square matrices? Or do we
have to partition the matrix into tall-skinny matrices if we have a square
one?
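A small NumPy sketch of the QR-based route for a tall-skinny A, which also shows why square matrices lose the benefit (R is then as large as A):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 10))  # tall and skinny: m >> n

Q, R = np.linalg.qr(A)               # in SystemML the qr() would be distributed
U_r, s, Vt = np.linalg.svd(R)        # cheap local SVD of the small 10x10 R
U = Q @ U_r                          # left singular vectors of A

# Reconstruction check: U * diag(s) * Vt should equal A up to round-off
err = np.abs(U @ np.diag(s) @ Vt - A).max()
```

The singular values of A equal those of R (Q has orthonormal columns), which is what makes the split into a distributed qr() plus a local svd() valid.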

Thanks,

Janardhan

On Fri, Jul 28, 2017 at 11:52 PM, Imran Younus <imranyou...@gmail.com>
wrote:

> Just to clarify one thing. For QR based, method, you can assume that R
> matrix is small enough to fit on driver memory and them perform SVD on the
> driver. That means your actual matrix has to tall-skinny matrix.
>
> imran
>
> On Fri, Jul 28, 2017 at 11:15 AM, Imran Younus <imranyou...@gmail.com>
> wrote:
>
> > Janardhan,
> >
> > The papers you're referring may not be relevant. The first paper, as far
> > as I can tell, is about updating an existing svd decomposition as new
> data
> > comes in. The 3rd paper in this list is the one I used, but that method
> is
> > not good.
> >
> > There is also a method that uses QR decomposition and then calculates SVD
> > from R matrix. Please have a look at equation 1.3 in this paper:
> >
> > http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.127.115=1
> >
> > I think this is worth trying out. The distributed QR is already
> > implemented in SystemlML, so it may quick to try out.
> >
> > imran
> >
> >
> >
> > On Fri, Jul 28, 2017 at 10:10 AM, Janardhan Pulivarthi <
> > janardhan.pulivar...@gmail.com> wrote:
> >
> >> Hi Nakul & all the committers,
> >>
> >> Till now I am half way through the literature. But, for now a couple of
> >> things to mention, in SVD there are three stages
> >>   1. Bidiagonal reduction step
> >>   2. Computation of the singular values
> >>   3. Computation of the singular vectors
> >>
> >> of these three, The* Bidiagonal reduction* step is very expensive, so is
> >> our focus on this( when considering GPU, at times where handling with
> CPU
> >> is infeasible).
> >>
> >> About literature:
> >>
> >>- I took some time to go through " A Stable and Fast Algorithm for
> >>Updating the Singular Value Decomposition" by "Gu & Stanley", to
> >> understand
> >>the numerical stability and round-off errors when we are partitioning
> >> the
> >>matrix in this distributed algorithm. The author has assured that
> each
> >>component computed will be of high absolute accuracy. And also, the
> >>properties that the resultant matrix support do not have any
> conflicts
> >> with
> >>parent matrix. [pdf
> >><http://www.cs.yale.edu/publications/techreports/tr966.pdf>]
> >>
> >>
> >>- "High performance bidiagonal reduction using the tile algorithms on
> >>homogeneous multicore clusters ", by "Ltaief et. al", this paper has
> >>focused on the first stage mainly and has discussed a good about tile
> >>algorithms and their runtime implementations.(although off-topic, I
> >> read
> >>this just to understand.) [pdf
> >><http://www.netlib.org/lapack/lawnspdf/lawn247.pdf>]
> >>
> >>
> >>-  "A distributed and incremental svd algorithm for agglomerative
> data
> >>analysis on large networks", by "Iwen & Ong", *Please go through* the
> >>(a). TABLE 1, TABLE 2 . (b). APPENDIX A. RAW DATA FROM NUMERICAL
> >>EXPERIMENTS. [pdf <https://arxiv.org/pdf/1601.07010.pdf>]
> >>
> >> Thanks,
> >>
> >> Janardhan
> >>
> >> On Wed, Jul 26, 2017 at 12:29 AM, Nakul Jindal <naku...@gmail.com>
> wrote:
> >>
> >> > Hi Janardhan,
> >> >
> >> > The images you've used as attachments haven't reached my inbox.
> >> > Could you please send them to me directly, rather than through the dev
> >> > mailing list.
> >> > (Or upload it to a image hosting site like imgur and paste the links
> in
> >> the
> >> > email)
> >> >
> >> > I would like to point out that my knowledge of machine learning is
> >> limited.
> >> > Still, how would you want to test the algorithm?
> >> >
> >> >
> >> > Sparse matrices in SystemML (in Spark Execution Mode)
> >> > Sparse matrix support in SystemML is im

Bayesian optimizer support for SystemML.

2017-07-22 Thread Janardhan Pulivarthi
Dear committers,

We are planning to add Bayesian optimizer support for both ML and deep
learning tasks in SystemML. Relevant JIRA link:
https://issues.apache.org/jira/browse/SYSTEMML-979

The following is a simple outline of how we are going to implement it.
Please feel free to make any kind of changes in this Google Docs link:
http://bit.do/systemml-bayesian

Description:

Bayesian optimization is a sequential design strategy for global
optimization of black-box functions; it does not require derivatives.

Process:

   1. First, select the point that is the best so far, over the iterations
   that have happened.
   2. Select candidate points by sampling the space from a Sobol quasirandom
   sequence generator.
   3. Sample the Gaussian process hyperparameters with the surrogate slice
   sampling method.

Components:

   1. Selecting the next point to evaluate.

[image: nextpoint.PNG]

We specify a uniform prior for the mean m, and width-2 top-hat priors for
each of the D length-scale parameters. As we expect the observation noise
generally to be close to or exactly zero, the noise nu is given a horseshoe
prior. The covariance amplitude theta0 is given a zero-mean, unit-variance
lognormal prior, theta0 ~ ln N(0, 1).
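To make "selecting the next point" concrete: under the GP posterior, a common choice is to maximize the expected-improvement acquisition over candidate points. This is a minimal sketch for minimization; `mu`, `sigma`, and `best_f` are assumed to come from an already-fitted GP, and the function name is mine, not a SystemML API:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f):
    """EI acquisition (minimization): expected amount by which a candidate
    improves on the best value seen, under the GP posterior N(mu, sigma^2)."""
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive variance
    z = (best_f - mu) / sigma
    return (best_f - mu) * norm.cdf(z) + sigma * norm.pdf(z)

# A candidate with lower posterior mean scores higher, all else equal:
ei = expected_improvement(np.array([0.0, 1.0]), np.array([1.0, 1.0]), 0.5)
assert ei[0] > ei[1] and (ei >= 0).all()
```

The next point to evaluate is then the candidate (drawn from the Sobol sequence below) with the largest EI.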



   2. Generation of a quasirandom Sobol sequence.

Which kind of Sobol patterns do we need?

[image: sobol patterns.PNG]

How many dimensions do we need? This paper argues that its generation
target dimension is 21201. [pdf link]
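For experimentation outside SystemML, SciPy already ships a Sobol generator whose direction numbers (from Joe & Kuo) support dimensions up to 21201, matching the target dimension discussed above. A small sketch:

```python
from scipy.stats import qmc

# Scrambled Sobol sampler in 4 dimensions; scrambling improves the
# statistical properties of the low-discrepancy point set.
sampler = qmc.Sobol(d=4, scramble=True, seed=0)
points = sampler.random_base2(m=5)  # 2**5 = 32 points in [0, 1)^4
```

Drawing `2**m` points at once (via `random_base2`) preserves the balance properties of the sequence, which is why the sample count is specified as a power of two.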



   3. Surrogate slice sampling.

[image: surrogate data sampling.PNG]
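For intuition, the surrogate-data sampler in the referenced Murray & Adams paper is built around univariate slice sampling. Below is a sketch of plain slice sampling with the stepping-out procedure (Neal, 2003) — the inner loop such a hyperparameter sampler would use, not the full surrogate-data scheme:

```python
import numpy as np

def slice_sample(logp, x0, w=1.0, n_samples=1000, rng=None):
    """Univariate slice sampling with stepping-out (Neal, 2003)."""
    if rng is None:
        rng = np.random.default_rng(0)
    samples, x = [], x0
    for _ in range(n_samples):
        logy = logp(x) + np.log(rng.uniform())  # vertical level under logp(x)
        left = x - w * rng.uniform()            # randomly positioned interval
        right = left + w
        while logp(left) > logy:                # step out until both ends
            left -= w                           # fall outside the slice
        while logp(right) > logy:
            right += w
        while True:                             # shrink on rejection
            x_new = rng.uniform(left, right)
            if logp(x_new) > logy:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        samples.append(x)
    return np.array(samples)
```

Unlike Metropolis-Hastings, this needs no step-size tuning beyond the rough width `w`, which is why it is popular for GP hyperparameters.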


References:

1. For the next point to evaluate:

https://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf

 http://www.dmi.usherb.ca/~larocheh/publications/gpopt_nips_appendix.pdf


2. QuasiRandom Sobol Sequence Generator:

https://researchcommons.waikato.ac.nz/bitstream/handle/10289/967/Joe%20constructing.pdf


3. Surrogate Slice Sampling:

http://homepages.inf.ed.ac.uk/imurray2/pub/10hypers/hypers.pdf



Thank you so much,

Janardhan


About performance statistics of PCA.dml

2017-07-21 Thread Janardhan Pulivarthi
Hi Mike,

I'd like to know how expensive this critical piece of code is

 C = (t(A) %*% A)/(N-1) - (N/(N-1))*t(mu) %*% mu;

(at
https://github.com/apache/systemml/blob/master/scripts/algorithms/PCA.dml#L81)
in the SPARK setting given

   1. a 60K x 700 input matrix A
   2. a 28 MB dataset with 100 continuous variables and 1 numeric label
   column

with reference to this comment.(
https://issues.apache.org/jira/browse/SYSTEMML-831?focusedCommentId=15525147=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15525147
)
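For context, that line computes the D x D sample covariance matrix of A; the dominant cost is the t(A) %*% A product, on the order of N·D² multiply-adds (roughly 3·10^10 for the 60K x 700 case). A NumPy sketch of the same expression, checking it against the library covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))               # N x D input matrix
N = A.shape[0]
mu = A.mean(axis=0, keepdims=True)              # 1 x D row vector of column means
# Same formula as the PCA.dml line: (t(A) %*% A)/(N-1) - (N/(N-1)) * t(mu) %*% mu
C = (A.T @ A) / (N - 1) - (N / (N - 1)) * (mu.T @ mu)
assert np.allclose(C, np.cov(A, rowvar=False))  # it is the sample covariance
```

In the Spark setting the t(A) %*% A product is the part that gets distributed, since the result is only D x D while A is N x D.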

Thank you,
Janardhan


Re: On the need for Parameter Server. ( A Model Parallel Construct )

2017-06-20 Thread Janardhan Pulivarthi
Waiting for a response from Niketan Pansare and the other committers.
[SYSTEMML-739]

 Parameter Server: a model parallel construct
<https://docs.google.com/document/d/1AOW53numMJSF_msGvo1lekpyv7_3VF51i6xAjNCEC9I/edit?usp=drive_web>

Thanks,
Janardhan


> On Sun, Jun 18, 2017 at 10:16 PM, Janardhan Pulivarthi <
> janardhan.pulivar...@gmail.com> wrote:
>
 > Dear committers,
 >
 > This proposal concerns the implementation/integration of an existing
 > parameter server for the execution of algorithms in a distributed
 > fashion, for both machine learning and deep learning.
 >
 > The following document covers a bit about whether we need one or not.
 >
 > I am currently working on [SYSTEMML-1437], an implementation of
 > factorization machines, which are to be sparse-safe and scalable; to
 > stick to this philosophy we may need a model-parallel construct. I know
 > very little about how SystemML works internally. If you can find *7
 > minutes*, please have a look at this doc.
 >
 >  Parameter Server: a model parallel construct
 > <
https://docs.google.com/document/d/1AOW53numMJSF_msGvo1lekpyv7_3VF51i6xAjNCEC9I/edit
>
>
> Thanks,
> Janardhan


On the need for Parameter Server. ( A Model Parallel Construct )

2017-06-18 Thread Janardhan Pulivarthi
Dear committers,

This proposal concerns the implementation/integration of an existing
parameter server for the execution of algorithms in a distributed fashion,
for both machine learning and deep learning.

The following document covers a bit about whether we need one or not.

My name is Janardhan, and I am currently working on [SYSTEMML-1437], an
implementation of factorization machines, which are to be sparse-safe and
scalable; to stick to this philosophy we may need a model-parallel
construct. I know very little about how SystemML works internally. If you
can find *7 minutes*, please have a look at this doc.
 Parameter Server: a model parallel construct
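For readers unfamiliar with the construct, the essence of a parameter server is a pull/push interface: workers pull the current model, compute gradients locally, and push them back to be applied (possibly asynchronously). A minimal in-process sketch with hypothetical names — not SystemML's API and not the distributed design in the linked doc:

```python
import threading

class ParameterServer:
    """Minimal in-process parameter server sketch: workers pull the current
    parameters and push gradients; the server applies SGD updates under a
    lock (a real server would shard state across machines)."""

    def __init__(self, params, lr=0.1):
        self._params = dict(params)
        self._lr = lr
        self._lock = threading.Lock()

    def pull(self):
        # Return a snapshot of the current model parameters.
        with self._lock:
            return dict(self._params)

    def push(self, grads):
        # Apply a gradient update: p <- p - lr * g for each parameter.
        with self._lock:
            for name, g in grads.items():
                self._params[name] -= self._lr * g

ps = ParameterServer({"w": 1.0}, lr=0.5)
ps.push({"w": 1.0})           # one step: w <- 1.0 - 0.5 * 1.0
assert ps.pull()["w"] == 0.5
```

The model-parallel angle is that parameters can be sharded across several such servers, so no single node ever holds the whole model.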