[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-madlib/pull/162


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-10 Thread orhankislal
Github user orhankislal commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132541348
  
--- Diff: src/ports/postgres/modules/convex/test/mlp.sql_in ---
@@ -241,9 +247,8 @@ SELECT mlp_predict(
 
 select * from mlp_prediction;
 SELECT assert(
--- Accuracy greater than 90%
-COUNT(*)/150.0 > 0.95,
-'MLP: Accuracy is too low (< 95%). Wrong result.'
+COUNT(*)/150.0 > 0.99,
--- End diff --

Tested on PG 9.4 and the latest commit of GPDB 5 (1 node, 3 segments).
Install check (IC) passes on Postgres but fails on GPDB 5. I tested the code manually and it
works fine with 148/150 accuracy. I would suggest dropping the assertion
threshold to 95% so that such minor variations do not fail the whole test.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-10 Thread orhankislal
Github user orhankislal commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132541507
  
--- Diff: src/ports/postgres/modules/convex/test/mlp.sql_in ---
@@ -241,9 +247,8 @@ SELECT mlp_predict(
 
 select * from mlp_prediction;
--- End diff --

select * from -> SELECT * FROM




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-09 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132340750
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -197,3 +207,44 @@ \subsubsection{The $\mathit{Gradient}$ Function}
 \State \Return $\delta$
 \end{algorithmic}
 \end{algorithm}
+
+\begin{algorithm}[mlp-train-iteration$(X, Y, \eta)$] 
\label{alg:mlp-train-iteration}
+\alginput{
+start vectors $X_{i...m} \in \mathbb{R}^{n_0}$,\\
+end vectors $Y_{i...m} \in \mathbb{R}^{n_N}$,\\
+learning rate $\eta$,\\}
+\algoutput{Coefficients $u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 
0,...,n_{k-1}, \: j = 1,...,n_k\}$}
+\begin{algorithmic}[1]
+\State \texttt{Randomnly initialize u}
+\For{$i = 1,...,m$}
+\State $\nabla f(u) \set \texttt{mlp-gradient}(u,X_i,Y_i)$
+\State $u \set u - (\eta \nabla f(u) u + \lambda u)$
+\EndFor
+\State \Return $u$
+\end{algorithmic}
+\end{algorithm}
+
+\begin{algorithm}[mlp-train-parallel$(X, Y, \eta, s, t)$] 
\label{alg:mlp-train-parallel}
+\alginput{
+start vectors $X_{i...m} \in \mathbb{R}^{n_0}$,\\
+end vectors $Y_{i...m} \in \mathbb{R}^{n_N}$,\\
+learning rate $\eta$,\\
+segments $s$,\\
+iterations $t$,\\}
+\algoutput{Coefficients $u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 
0,...,n_{k-1}, \: j = 1,...,n_k\}$}
+\begin{algorithmic}[1]
+\State \texttt{Randomnly initialize u}
+\For{$j = 1,...,s$}
+\State $X_j \set \texttt{subset-of-X}$
+\State $Y_j \set \texttt{subset-of-Y}$
+\EndFor
+\For{$i = 1,...,t$}
+\For{$j = 1,...,s$}
+\State $u_j \set copy(u)$
+\State $u_j \set \texttt{mlp-train-iteration}(X_j, Y_j, \eta)$
+\EndFor
+\State $u \set \texttt{weighted-avg}(u_{1...s})$
+\EndFor
+\State \Return $u$
--- End diff --

Just the `return u` shows up on a new page in the PDF; see if you can avoid that.
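
(Aside for readers of the quoted pseudocode: a minimal Python sketch of the train-then-average scheme that mlp-train-parallel describes. The flat coefficient vector, the `mlp_gradient` callable, and the size-weighted average are assumptions for illustration, not MADlib's actual implementation.)

```
import numpy as np

def mlp_train_iteration(u, X, Y, eta, lmbda, mlp_gradient):
    # One sweep of incremental gradient descent over the local rows,
    # mirroring mlp-train-iteration above; u is a flat coefficient vector
    # and mlp_gradient(u, x, y) returns a vector of the same shape.
    for x_i, y_i in zip(X, Y):
        u = u - (eta * mlp_gradient(u, x_i, y_i) + lmbda * u)
    return u

def mlp_train_parallel(X, Y, eta, lmbda, s, t, n_coeffs, mlp_gradient, seed=0):
    # Split the rows across s "segments", train an independent copy of the
    # model on each, average the copies, and repeat for t iterations.
    rng = np.random.default_rng(seed)
    u = rng.uniform(-0.1, 0.1, size=n_coeffs)          # random initialization
    X_parts = np.array_split(np.asarray(X), s)
    Y_parts = np.array_split(np.asarray(Y), s)
    sizes = [len(p) for p in X_parts]                  # weights for the average
    for _ in range(t):
        models = [mlp_train_iteration(u.copy(), Xj, Yj, eta, lmbda, mlp_gradient)
                  for Xj, Yj in zip(X_parts, Y_parts)]
        u = np.average(models, axis=0, weights=sizes)  # weighted-avg step
    return u
```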




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-09 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132340321
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -46,41 +47,49 @@ \subsection{Formal Description}
 In the remaining part of this section, we will give a formal description 
of the derivation of objective function and its gradient.
 
 \paragraph{Objective function.}
-We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function is given as
-\[f_{(y, z)}(u) = \frac{1}{2} \|h(u, y) - z\|_2^2,\]
-where $y \in \mathbb{R}^{n_0}$ is the input vector, $z \in 
\mathbb{R}^{n_N}$ is the output vector,
+We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function for regression is given as
+\[f_{(x, y)}(u) = \frac{1}{2} \|h(u, x) - y\|_2^2,\]
+and for classification the objective function is given as
+\[f_{(x, y)}(u) = \sum_i (\log(h_i(u, x)) * z_i + (1-\log(h_i(u, x))) *( 
1- z_i) ,\]
+where $x \in \mathbb{R}^{n_0}$ is the input vector, $y \in 
\mathbb{R}^{n_N}$ is the output vector (one hot encoded for classification),
 \footnote{Of course, the objective function can be defined over a set of 
input-output vector pairs, which is simply given as the addition of the above 
$f$.}
 and the coefficients are given as
-\[u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 0,...,n_{k-1}, \: j = 
1,...,n_k\}\]
+\[u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 0,...,n_{k-1}, \: j = 
1,...,n_k\},\]
+And are initialized from a uniform distribution as follows:
+\[u_{k}^{sj} = uniform(-r,r),\]
+where r is defined as follows:
+\[r = \sqrt{\frac{6}{n_k+n_{k+1}}}\]
+With regularization, an additional term enters the objective function, 
given as
+\[\sum_{u_k^{sj}} \frac{1}{2} \lambda u_k^{sj2} \]
 This still leaves $h : \mathbb{R}^{n_0} \to \mathbb{R}^{n_N}$ as an open 
item.
-Let $x_k \in \mathbb{R}^{n_k}, k = 1,...,N$ be the output vector of the 
$k$th layer. Then we define $h(u, y) = x_N$, based on setting $x_0 = y$ and the 
$j$th component of $x_k$ is given in an iterative fashion as
-\footnote{$x_k^0 \equiv 1$ is used to simplified the notations, and 
$x_k^0$ is not a component of $x_k$, for any $k = 0,...,N$.}
+Let $o_k \in \mathbb{R}^{n_k}, k = 1,...,N$ be the output vector of the 
$k$th layer. Then we define $h(u, x) = o_N$, based on setting $o_0 = x$ and the 
$j$th component of $o_k$ is given in an iterative fashion as
+\footnote{$o_k^0 \equiv 1$ is used to simplified the notations, and 
$o_k^0$ is not a component of $o_k$, for any $k = 0,...,N$.}
 \[\begin{alignedat}{5}
-x_k^j = \phi \left( \sum_{s=0}^{n_{k-1}} x_{k-1}^s u_{k-1}^{sj} 
\right), &\quad k = 1,...,N, \; j = 1,...,n_k
+o_k^j = \phi \left( \sum_{s=0}^{n_{k-1}} o_{k-1}^s u_{k-1}^{sj} 
\right), &\quad k = 1,...,N, \; j = 1,...,n_k
 \end{alignedat}\]
 
 \paragraph{Gradient of the End Layer.}
 Let's first handle $u_{N-1}^{st}, s = 0,...,n_{N-1}, t = 1,...,n_N$.
-Let $z^t$ denote the $t$th component of $z \in \mathbb{R}^{n_N}$, and 
$h^t$ the $t$th component of output of $h$.
+Let $y^t$ denote the $t$th component of $y \in \mathbb{R}^{n_N}$, and 
$h^t$ the $t$th component of output of $h$.
 \[\begin{aligned}
 \frac{\partial f}{\partial u_{N-1}^{st}}
-&= \left( h^t(u, y) - z^t \right) \cdot \frac{\partial h^t(u, 
y)}{\partial u_{N-1}^{st}} \\
-&= \left( x_N^t - z^t \right) \cdot \frac{\partial x_N^t}{\partial 
u_{N-1}^{st}} \\
-&= \left( x_N^t - z^t \right) \cdot \frac{\partial \phi \left( 
\sum_{s=0}^{n_{N-1}} x_{N-1}^s u_{N-1}^{st} \right)}{\partial u_{N-1}^{st}} \\
-&= \left( x_N^t - z^t \right) \cdot \phi' \left( \sum_{s=0}^{n_{N-1}} 
x_{N-1}^s u_{N-1}^{st} \right) \cdot x_{N-1}^s \\
+&= \left( h^t(u, x) - y^t \right) \cdot \frac{\partial h^t(u, 
x)}{\partial u_{N-1}^{st}} \\
+&= \left( o_N^t - y^t \right) \cdot \frac{\partial o_N^t}{\partial 
u_{N-1}^{st}} \\
+&= \left( o_N^t - y^t \right) \cdot \frac{\partial \phi \left( 
\sum_{s=0}^{n_{N-1}} o_{N-1}^s u_{N-1}^{st} \right)}{\partial u_{N-1}^{st}} \\
+&= \left( o_N^t - y^t \right) \cdot \phi' \left( \sum_{s=0}^{n_{N-1}} 
o_{N-1}^s u_{N-1}^{st} \right) \cdot o_{N-1}^s \\
 \end{aligned}\]
 To ease the notation, let the input vector of the $j$th activation unit of 
the $(k+1)$th layer be
--- End diff --

`$(k+1)$th` -> `$(k+1)$^{th}`.
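
(Side note on the quoted derivation: a rough numpy illustration of the forward recursion for $o_k$ and the end-layer gradient $(o_N^t - y^t)\,\phi'(\cdot)\,o_{N-1}^s$. The tanh activation, the bias handling, and the function and variable names are assumptions for illustration only, not the MADlib code.)

```
import numpy as np

def forward(u, x, phi=np.tanh):
    # u[k] is the (n_k + 1) x n_{k+1} matrix connecting layer k to layer k+1;
    # row 0 plays the role of the bias term o_k^0 == 1 from the text.
    o, nets = np.asarray(x, dtype=float), []
    for W in u:
        net = np.concatenate(([1.0], o)) @ W
        nets.append(net)
        o = phi(net)
    return o, nets

def end_layer_gradient(u, x, y, phi=np.tanh,
                       dphi=lambda z: 1.0 - np.tanh(z) ** 2):
    # df/du_{N-1}^{st} = (o_N^t - y^t) * phi'(sum_s o_{N-1}^s u_{N-1}^{st}) * o_{N-1}^s
    o_N, nets = forward(u, x, phi)
    o_prev, _ = forward(u[:-1], x, phi)                # output of layer N-1
    delta = (o_N - np.asarray(y, dtype=float)) * dphi(nets[-1])
    return np.outer(np.concatenate(([1.0], o_prev)), delta)
```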



[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-09 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132339837
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -46,41 +47,49 @@ \subsection{Formal Description}
 In the remaining part of this section, we will give a formal description 
of the derivation of objective function and its gradient.
 
 \paragraph{Objective function.}
-We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function is given as
-\[f_{(y, z)}(u) = \frac{1}{2} \|h(u, y) - z\|_2^2,\]
-where $y \in \mathbb{R}^{n_0}$ is the input vector, $z \in 
\mathbb{R}^{n_N}$ is the output vector,
+We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function for regression is given as
+\[f_{(x, y)}(u) = \frac{1}{2} \|h(u, x) - y\|_2^2,\]
+and for classification the objective function is given as
+\[f_{(x, y)}(u) = \sum_i (\log(h_i(u, x)) * z_i + (1-\log(h_i(u, x))) *( 
1- z_i) ,\]
+where $x \in \mathbb{R}^{n_0}$ is the input vector, $y \in 
\mathbb{R}^{n_N}$ is the output vector (one hot encoded for classification),
 \footnote{Of course, the objective function can be defined over a set of 
input-output vector pairs, which is simply given as the addition of the above 
$f$.}
 and the coefficients are given as
-\[u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 0,...,n_{k-1}, \: j = 
1,...,n_k\}\]
+\[u = \{ u_{k-1}^{sj} \; | \; k = 1,...,N, \: s = 0,...,n_{k-1}, \: j = 
1,...,n_k\},\]
+And are initialized from a uniform distribution as follows:
+\[u_{k}^{sj} = uniform(-r,r),\]
+where r is defined as follows:
+\[r = \sqrt{\frac{6}{n_k+n_{k+1}}}\]
+With regularization, an additional term enters the objective function, 
given as
+\[\sum_{u_k^{sj}} \frac{1}{2} \lambda u_k^{sj2} \]
 This still leaves $h : \mathbb{R}^{n_0} \to \mathbb{R}^{n_N}$ as an open 
item.
-Let $x_k \in \mathbb{R}^{n_k}, k = 1,...,N$ be the output vector of the 
$k$th layer. Then we define $h(u, y) = x_N$, based on setting $x_0 = y$ and the 
$j$th component of $x_k$ is given in an iterative fashion as
-\footnote{$x_k^0 \equiv 1$ is used to simplified the notations, and 
$x_k^0$ is not a component of $x_k$, for any $k = 0,...,N$.}
+Let $o_k \in \mathbb{R}^{n_k}, k = 1,...,N$ be the output vector of the 
$k$th layer. Then we define $h(u, x) = o_N$, based on setting $o_0 = x$ and the 
$j$th component of $o_k$ is given in an iterative fashion as
+\footnote{$o_k^0 \equiv 1$ is used to simplified the notations, and 
$o_k^0$ is not a component of $o_k$, for any $k = 0,...,N$.}
--- End diff --

`$j$th` -> `$j$^{th}`.
Similar changes in several other places.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-09 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132339557
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -46,41 +47,49 @@ \subsection{Formal Description}
 In the remaining part of this section, we will give a formal description 
of the derivation of objective function and its gradient.
 
 \paragraph{Objective function.}
-We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function is given as
-\[f_{(y, z)}(u) = \frac{1}{2} \|h(u, y) - z\|_2^2,\]
-where $y \in \mathbb{R}^{n_0}$ is the input vector, $z \in 
\mathbb{R}^{n_N}$ is the output vector,
+We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function for regression is given as
+\[f_{(x, y)}(u) = \frac{1}{2} \|h(u, x) - y\|_2^2,\]
--- End diff --

`$k$th` -> `$k$^{th}`.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-09 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132340137
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -46,41 +47,49 @@ \subsection{Formal Description}
 In the remaining part of this section, we will give a formal description 
of the derivation of objective function and its gradient.
 
 \paragraph{Objective function.}
-We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function is given as
-\[f_{(y, z)}(u) = \frac{1}{2} \|h(u, y) - z\|_2^2,\]
-where $y \in \mathbb{R}^{n_0}$ is the input vector, $z \in 
\mathbb{R}^{n_N}$ is the output vector,
+We mostly follow the notations in example 1.5.3 from Bertsekas 
\cite{bertsekas1999nonlinear}, for a multilayer perceptron that has $N$ layers 
(stages), and the $k$th stage has $n_k$ activation units ($\phi : \mathbb{R} 
\to \mathbb{R}$), the objective function for regression is given as
+\[f_{(x, y)}(u) = \frac{1}{2} \|h(u, x) - y\|_2^2,\]
+and for classification the objective function is given as
+\[f_{(x, y)}(u) = \sum_i (\log(h_i(u, x)) * z_i + (1-\log(h_i(u, x))) *( 
1- z_i) ,\]
+where $x \in \mathbb{R}^{n_0}$ is the input vector, $y \in 
\mathbb{R}^{n_N}$ is the output vector (one hot encoded for classification),
 \footnote{Of course, the objective function can be defined over a set of 
input-output vector pairs, which is simply given as the addition of the above 
$f$.}
 and the coefficients are given as
--- End diff --

Change ```classification),\n\footnote{``` to ```classification),~\footnote{```.

Similar comment for the second footnote too.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132059184
  
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -59,60 +63,115 @@ def mlp(schema_madlib, source_table, output_table, 
independent_varname,
 Returns:
 None
 """
-with MinWarning('warning'):
-optimizer_params = _get_optimizer_params(optimizer_param_str or "")
-summary_table = add_postfix(output_table, "_summary")
-_validate_args(source_table, output_table, summary_table, 
independent_varname,
-   dependent_varname, hidden_layer_sizes,
-   optimizer_params, is_classification)
-
-current_iteration = 1
-prev_state = None
-tolerance = optimizer_params["tolerance"]
-n_iterations = optimizer_params["n_iterations"]
-step_size = optimizer_params["step_size"]
-n_tries = optimizer_params["n_tries"]
-activation_name = _get_activation_function_name(activation)
-activation_index = _get_activation_index(activation_name)
-num_input_nodes = array_col_dimension(
-source_table, independent_varname)
-num_output_nodes = 0
-classes = []
-dependent_type = get_expr_type(dependent_varname, source_table)
-original_dependent_varname = dependent_varname
-
-if is_classification:
-dependent_variable_sql = """
-SELECT DISTINCT {dependent_varname}
-FROM {source_table}
-""".format(dependent_varname=dependent_varname,
-   source_table=source_table)
-labels = plpy.execute(dependent_variable_sql)
-one_hot_dependent_varname = 'ARRAY['
-num_output_nodes = len(labels)
-for label_obj in labels:
-label = _format_label(label_obj[dependent_varname])
-classes.append(label)
-one_hot_dependent_varname += dependent_varname + \
-"=" + str(label) + ","
-# Remove the last comma
-one_hot_dependent_varname = one_hot_dependent_varname[:-1]
-one_hot_dependent_varname += ']::integer[]'
-dependent_varname = one_hot_dependent_varname
-else:
-if "[]" not in dependent_type:
-dependent_varname = "ARRAY[" + dependent_varname + "]"
-num_output_nodes = array_col_dimension(
-source_table, dependent_varname)
-layer_sizes = [num_input_nodes] + \
-hidden_layer_sizes + [num_output_nodes]
+warm_start = bool(warm_start)
+optimizer_params = _get_optimizer_params(optimizer_param_str or "")
+summary_table = add_postfix(output_table, "_summary")
+weights = '1' if not weights or not weights.strip() else 
weights.strip()
+hidden_layer_sizes = hidden_layer_sizes or []
+activation = _get_activation_function_name(activation)
+learning_rate_policy = _get_learning_rate_policy_name(
+optimizer_params["learning_rate_policy"])
+activation_index = _get_activation_index(activation)
+
+_validate_args(source_table, output_table, summary_table, 
independent_varname,
+   dependent_varname, hidden_layer_sizes,
+   optimizer_params, is_classification, weights,
+   warm_start, activation)
+
+current_iteration = 1
+prev_state = None
+tolerance = optimizer_params["tolerance"]
+n_iterations = optimizer_params["n_iterations"]
+step_size_init = optimizer_params["learning_rate_init"]
+iterations_per_step = optimizer_params["iterations_per_step"]
+power = optimizer_params["power"]
+gamma = optimizer_params["gamma"]
+step_size = step_size_init
+n_tries = optimizer_params["n_tries"]
+# lambda is a reserved word in python
+lmbda = optimizer_params["lambda"]
+iterations_per_step = optimizer_params["iterations_per_step"]
+num_input_nodes = array_col_dimension(source_table,
+  independent_varname)
+num_output_nodes = 0
+classes = []
+dependent_type = get_expr_type(dependent_varname, source_table)
+original_dependent_varname = dependent_varname
+dimension, n_tuples = _tbl_dimension_rownum(
+schema_madlib, source_table, independent_varname)
+x_scales = __utils_ind_var_scales(
+source_table, independent_varname, dimension, schema_madlib)
+x_means = py_list_to_sql_string(
+x_scales["mean"], array_type="DOUBLE PRECISION")
+

[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132057309
  
--- Diff: src/ports/postgres/modules/utilities/utilities.py_in ---
@@ -54,6 +54,17 @@ def is_orca():
 # 
--
 
 
+def _assert_equal(o1, o2, msg):
+"""
+@brief if the given condition is false, then raise an error with the 
message
+@param conditionthe condition to be asserted
--- End diff --

This docstring is misleading.
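
(For what it's worth, a docstring that matches the function name might read as below. The body is only a guess at the intent, not the actual MADlib code; `plpy` is the PL/Python interface available to database-side Python.)

```
import plpy  # provided by the PL/Python runtime; MADlib modules import it the same way

def _assert_equal(o1, o2, msg):
    """
    @brief Raise an error with the given message if the two objects differ
    @param o1   first object to compare
    @param o2   second object to compare
    @param msg  error message reported when o1 != o2
    """
    if o1 != o2:
        plpy.error(msg)
```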




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132058905
  
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -59,60 +63,115 @@ def mlp(schema_madlib, source_table, output_table, 
independent_varname,
 Returns:
 None
 """
-with MinWarning('warning'):
-optimizer_params = _get_optimizer_params(optimizer_param_str or "")
-summary_table = add_postfix(output_table, "_summary")
-_validate_args(source_table, output_table, summary_table, 
independent_varname,
-   dependent_varname, hidden_layer_sizes,
-   optimizer_params, is_classification)
-
-current_iteration = 1
-prev_state = None
-tolerance = optimizer_params["tolerance"]
-n_iterations = optimizer_params["n_iterations"]
-step_size = optimizer_params["step_size"]
-n_tries = optimizer_params["n_tries"]
-activation_name = _get_activation_function_name(activation)
-activation_index = _get_activation_index(activation_name)
-num_input_nodes = array_col_dimension(
-source_table, independent_varname)
-num_output_nodes = 0
-classes = []
-dependent_type = get_expr_type(dependent_varname, source_table)
-original_dependent_varname = dependent_varname
-
-if is_classification:
-dependent_variable_sql = """
-SELECT DISTINCT {dependent_varname}
-FROM {source_table}
-""".format(dependent_varname=dependent_varname,
-   source_table=source_table)
-labels = plpy.execute(dependent_variable_sql)
-one_hot_dependent_varname = 'ARRAY['
-num_output_nodes = len(labels)
-for label_obj in labels:
-label = _format_label(label_obj[dependent_varname])
-classes.append(label)
-one_hot_dependent_varname += dependent_varname + \
-"=" + str(label) + ","
-# Remove the last comma
-one_hot_dependent_varname = one_hot_dependent_varname[:-1]
-one_hot_dependent_varname += ']::integer[]'
-dependent_varname = one_hot_dependent_varname
-else:
-if "[]" not in dependent_type:
-dependent_varname = "ARRAY[" + dependent_varname + "]"
-num_output_nodes = array_col_dimension(
-source_table, dependent_varname)
-layer_sizes = [num_input_nodes] + \
-hidden_layer_sizes + [num_output_nodes]
+warm_start = bool(warm_start)
+optimizer_params = _get_optimizer_params(optimizer_param_str or "")
+summary_table = add_postfix(output_table, "_summary")
+weights = '1' if not weights or not weights.strip() else 
weights.strip()
+hidden_layer_sizes = hidden_layer_sizes or []
+activation = _get_activation_function_name(activation)
+learning_rate_policy = _get_learning_rate_policy_name(
+optimizer_params["learning_rate_policy"])
+activation_index = _get_activation_index(activation)
+
+_validate_args(source_table, output_table, summary_table, 
independent_varname,
+   dependent_varname, hidden_layer_sizes,
+   optimizer_params, is_classification, weights,
+   warm_start, activation)
+
+current_iteration = 1
+prev_state = None
+tolerance = optimizer_params["tolerance"]
+n_iterations = optimizer_params["n_iterations"]
+step_size_init = optimizer_params["learning_rate_init"]
+iterations_per_step = optimizer_params["iterations_per_step"]
+power = optimizer_params["power"]
+gamma = optimizer_params["gamma"]
+step_size = step_size_init
+n_tries = optimizer_params["n_tries"]
+# lambda is a reserved word in python
+lmbda = optimizer_params["lambda"]
+iterations_per_step = optimizer_params["iterations_per_step"]
+num_input_nodes = array_col_dimension(source_table,
+  independent_varname)
+num_output_nodes = 0
+classes = []
+dependent_type = get_expr_type(dependent_varname, source_table)
+original_dependent_varname = dependent_varname
+dimension, n_tuples = _tbl_dimension_rownum(
+schema_madlib, source_table, independent_varname)
+x_scales = __utils_ind_var_scales(
+source_table, independent_varname, dimension, schema_madlib)
+x_means = py_list_to_sql_string(
+x_scales["mean"], array_type="DOUBLE PRECISION")
+

[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132060205
  
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -122,206 +181,349 @@ def mlp(schema_madlib, source_table, output_table, 
independent_varname,
 {layer_sizes},
 ({step_size})::FLOAT8,
 {activation},
-{is_classification}) as curr_state
-FROM {source_table} AS _src
-""".format(schema_madlib=schema_madlib,
-   independent_varname=independent_varname,
-   dependent_varname=dependent_varname,
-   prev_state=prev_state_str,
-   # C++ uses double internally
-   layer_sizes=py_list_to_sql_string(layer_sizes,
- 
array_type="double precision"),
-   step_size=step_size,
-   source_table=source_table,
-   activation=activation_index,
-   is_classification=int(is_classification))
+{is_classification},
+({weights})::DOUBLE PRECISION,
+{warm_start},
+({warm_start_coeff})::DOUBLE PRECISION[],
+{n_tuples},
+{lmbda},
+{x_means},
+{x_stds}
+) as curr_state
+FROM {source_table} as _src
+""".format(
+schema_madlib=schema_madlib,
+independent_varname=independent_varname,
+dependent_varname=dependent_varname,
+prev_state=prev_state_str,
+# c++ uses double internally
+layer_sizes=py_list_to_sql_string(
+layer_sizes, array_type="DOUBLE PRECISION"),
+step_size=step_size,
+source_table=source_table,
+activation=activation_index,
+is_classification=int(is_classification),
+weights=weights,
+warm_start=warm_start,
+warm_start_coeff=py_list_to_sql_string(
+coeff, array_type="DOUBLE PRECISION"),
+n_tuples=n_tuples,
+lmbda=lmbda,
+x_means=x_means,
+x_stds=x_stds)
 curr_state = plpy.execute(train_sql)[0]["curr_state"]
 dist_sql = """
-SELECT {schema_madlib}.internal_mlp_igd_distance(
-{prev_state},
-{curr_state}) as state_dist
-""".format(schema_madlib=schema_madlib,
-   prev_state=prev_state_str,
-   curr_state=py_list_to_sql_string(curr_state, 
"double precision"))
+SELECT {schema_madlib}.internal_mlp_igd_distance(
+{prev_state},
+{curr_state}) as state_dist
+""".format(
+schema_madlib=schema_madlib,
+prev_state=prev_state_str,
+curr_state=py_list_to_sql_string(curr_state,
+ "DOUBLE PRECISION"))
 state_dist = plpy.execute(dist_sql)[0]["state_dist"]
-if ((state_dist and state_dist < tolerance) or
-current_iteration > n_iterations):
+if verbose and 1

[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132058113
  
--- Diff: src/ports/postgres/modules/convex/mlp.sql_in ---
@@ -226,26 +268,52 @@ the parameter is ignored.
 
 
 
-  'step_size = <value>,
+  'learning_rate_init = <value>,
  n_iterations = <value>,
  n_tries = <value>,
  tolerance = <value>'
 
 \b Optimizer Parameters
 
 
-step_size
-Default: [0.001].
+learning_rate_init
+Default: 0.001.
 Also known as the learning rate. A small value is usually desirable to
 ensure convergence, while a large value provides more room for progress 
during
 training. Since the best value depends on the condition number of the 
data, in
 practice one often tunes this parameter.
 
 
+learning_rate_policy
+Default: constant.
+One of 'constant', 'exp', 'inv' or 'step' or any prefix of these.
+'constant': learning_rate = learning_rate_init
+'exp': learning_rate = learning_rate_init * gamma^(iter)
+'inv': learning_rate = learning_rate_init * (iter+1)^(-power)
+'step': learning_rate = learning_rate_init * 
gamma^(floor(iter/iterations_per_step))
+Where iter is the current iteration of SGD.
+
+
+gamma
+Default: 0.1.
+Decay rate for learning rate when learning_rate_policy is 'exp' or 'step'.
+
+
+power
+Default: 0.5.
+Exponent for learning_rate_policy = 'inv'
+
+
+iterations_per_step
+Default: 100.
+Number of iterations to run before decreasing the learning rate by
+a factor of gamma.  Valid for learning rate policy = 'step'
+
--- End diff --

Documentation for `lambda` missing here?
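
(Side note on the quoted optimizer docs: the four learning_rate_policy schedules written out as a small Python helper for reference. This is a sketch using the documented defaults; the function and parameter names are illustrative, not MADlib's implementation.)

```
import math

def learning_rate(policy, learning_rate_init, iteration,
                  gamma=0.1, power=0.5, iterations_per_step=100):
    # Schedules as described in the docs above; iteration is the current SGD iteration.
    if policy == 'constant':
        return learning_rate_init
    if policy == 'exp':
        return learning_rate_init * gamma ** iteration
    if policy == 'inv':
        return learning_rate_init * (iteration + 1) ** (-power)
    if policy == 'step':
        return learning_rate_init * gamma ** math.floor(iteration / iterations_per_step)
    raise ValueError("unknown learning_rate_policy: %s" % policy)
```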




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132057110
  
--- Diff: src/ports/postgres/modules/convex/test/mlp.sql_in ---
@@ -190,22 +190,28 @@ INSERT INTO iris_data VALUES
 (150,ARRAY[5.9,3.0,5.1,1.8],'Iris-virginica',3);
 
 
-SELECT mlp_classification(
-'iris_data',  -- Source table
+SELECT madlib.mlp_classification(
--- End diff --

This will fail when MADlib is not installed in the default `madlib` schema.
Using `SELECT mlp_classification(...)` should be good enough.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r132060382
  
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -122,206 +181,349 @@ def mlp(schema_madlib, source_table, output_table, 
independent_varname,
 {layer_sizes},
 ({step_size})::FLOAT8,
 {activation},
-{is_classification}) as curr_state
-FROM {source_table} AS _src
-""".format(schema_madlib=schema_madlib,
-   independent_varname=independent_varname,
-   dependent_varname=dependent_varname,
-   prev_state=prev_state_str,
-   # C++ uses double internally
-   layer_sizes=py_list_to_sql_string(layer_sizes,
- 
array_type="double precision"),
-   step_size=step_size,
-   source_table=source_table,
-   activation=activation_index,
-   is_classification=int(is_classification))
+{is_classification},
+({weights})::DOUBLE PRECISION,
+{warm_start},
+({warm_start_coeff})::DOUBLE PRECISION[],
+{n_tuples},
+{lmbda},
+{x_means},
+{x_stds}
+) as curr_state
+FROM {source_table} as _src
+""".format(
+schema_madlib=schema_madlib,
+independent_varname=independent_varname,
+dependent_varname=dependent_varname,
+prev_state=prev_state_str,
+# c++ uses double internally
+layer_sizes=py_list_to_sql_string(
+layer_sizes, array_type="DOUBLE PRECISION"),
+step_size=step_size,
+source_table=source_table,
+activation=activation_index,
+is_classification=int(is_classification),
+weights=weights,
+warm_start=warm_start,
+warm_start_coeff=py_list_to_sql_string(
+coeff, array_type="DOUBLE PRECISION"),
+n_tuples=n_tuples,
+lmbda=lmbda,
+x_means=x_means,
+x_stds=x_stds)
 curr_state = plpy.execute(train_sql)[0]["curr_state"]
 dist_sql = """
-SELECT {schema_madlib}.internal_mlp_igd_distance(
-{prev_state},
-{curr_state}) as state_dist
-""".format(schema_madlib=schema_madlib,
-   prev_state=prev_state_str,
-   curr_state=py_list_to_sql_string(curr_state, 
"double precision"))
+SELECT {schema_madlib}.internal_mlp_igd_distance(
+{prev_state},
+{curr_state}) as state_dist
+""".format(
+schema_madlib=schema_madlib,
+prev_state=prev_state_str,
+curr_state=py_list_to_sql_string(curr_state,
+ "DOUBLE PRECISION"))
 state_dist = plpy.execute(dist_sql)[0]["state_dist"]
-if ((state_dist and state_dist < tolerance) or
-current_iteration > n_iterations):
+if verbose and 1 n_iterations):
 break
 prev_state = curr_state
 current_iteration += 1
-_build_model_table(schema_madlib, output_table,
-   curr_state, n_iterations)
-layer_sizes_str = py_list_to_sql_string(
-layer_sizes, array_type="integer")
-classes_str = py_list_to_sql_string(
-[strip_end_quotes(cl, "'") for cl in classes],
-array_type=dependent_type)
-summary_table_creation_query = """
-CREATE TABLE {summary_table}(
-source_table TEXT,
-independent_varname TEXT,
-dependent_varname TEXT,
-tolerance FLOAT,
-step_size FLOAT,
-n_iterations INTEGER,
-n_tries INTEGER,
-layer_sizes INTEGER[],
-activation_function TEXT,
-is_classification BOOLEAN,
-classes {dependent_type}[]
-)""".format(summary_table=summary_table,
-dependent_type=dependent_type)
-
-summary_table_update_query = 

[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-07 Thread cooper-sloan
Github user cooper-sloan commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r131789689
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -22,15 +22,16 @@
 \chapter{Neural Network}
 
 \begin{moduleinfo}
-\item[Authors] {Xixuan Feng}
+\item[Authors] {Xixuan Feng, Cooper Sloan}
 \end{moduleinfo}
 
 % Abstract. What is the problem we want to solve?
 This module implements artificial neural network \cite{ann_wiki}.
 
 \section{Multilayer Perceptron}
 Multilayer perceptron is arguably the most popular model among many neural 
network models \cite{mlp_wiki}.
-Here, we learn the coefficients by minimizing a least square objective 
function (\cite{bertsekas1999nonlinear}, example 1.5.3).
+Here, we learn the coefficients by minimizing a least square objective 
function, or cross entropy (\cite{bertsekas1999nonlinear}, example 1.5.3).
+The parallel architecture is based on the paper by Zhihen Huang 
\cite{mlp_parallel}.
--- End diff --

Good catch.




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-07 Thread haying
Github user haying commented on a diff in the pull request:

https://github.com/apache/incubator-madlib/pull/162#discussion_r131783608
  
--- Diff: doc/design/modules/neural-network.tex ---
@@ -22,15 +22,16 @@
 \chapter{Neural Network}
 
 \begin{moduleinfo}
-\item[Authors] {Xixuan Feng}
+\item[Authors] {Xixuan Feng, Cooper Sloan}
 \end{moduleinfo}
 
 % Abstract. What is the problem we want to solve?
 This module implements artificial neural network \cite{ann_wiki}.
 
 \section{Multilayer Perceptron}
 Multilayer perceptron is arguably the most popular model among many neural 
network models \cite{mlp_wiki}.
-Here, we learn the coefficients by minimizing a least square objective 
function (\cite{bertsekas1999nonlinear}, example 1.5.3).
+Here, we learn the coefficients by minimizing a least square objective 
function, or cross entropy (\cite{bertsekas1999nonlinear}, example 1.5.3).
+The parallel architecture is based on the paper by Zhihen Huang 
\cite{mlp_parallel}.
--- End diff --

"Zhiheng" instead




[GitHub] incubator-madlib pull request #162: MLP: Multilayer Perceptron Phase 2

2017-08-07 Thread cooper-sloan
GitHub user cooper-sloan opened a pull request:

https://github.com/apache/incubator-madlib/pull/162

MLP: Multilayer Perceptron Phase 2

JIRA: MADLIB-1134

Weights, warm start, n_tries,
regularization, learning rate policy,
standardization and tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cooper-sloan/incubator-madlib mlp_phase2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-madlib/pull/162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #162


commit 0d008d9995b1ffb5b35271318b878b396375456a
Author: Cooper Sloan 
Date:   2017-06-17T00:41:07Z

MLP: Multilayer Perceptron Phase 2

JIRA: MADLIB-1134

Weights, warm start, n_tries,
regularization, learning rate policy,
standardization and tests.



