http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/hypothesis__tests_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/hypothesis__tests_8sql__in.html b/docs/latest/hypothesis__tests_8sql__in.html index d147634..8333354 100644 --- a/docs/latest/hypothesis__tests_8sql__in.html +++ b/docs/latest/hypothesis__tests_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -219,24 +219,24 @@ Functions</h2></td></tr> </tr> </table> </div><div class="memdoc"> -<p>Let <img class="formulaInl" alt="$ n_1, \dots, n_k $" src="form_445.png"/> be a realization of a (vector) random variable <img class="formulaInl" alt="$ N = (N_1, \dots, N_k) $" src="form_446.png"/> that follows the multinomial distribution with parameters <img class="formulaInl" alt="$ k $" src="form_97.png"/> and <img class="formulaInl" alt="$ p = (p_1, \dots, p_k) $" src="form_447.png"/>. Test the null hypothesis <img class="formulaInl" alt="$ H_0 : p = p^0 $" src="form_448.png"/>.</p> +<p>Let <img class="formulaInl" alt="$ n_1, \dots, n_k $" src="form_446.png"/> be a realization of a (vector) random variable <img class="formulaInl" alt="$ N = (N_1, \dots, N_k) $" src="form_447.png"/> that follows the multinomial distribution with parameters <img class="formulaInl" alt="$ k $" src="form_97.png"/> and <img class="formulaInl" alt="$ p = (p_1, \dots, p_k) $" src="form_448.png"/>. 
Test the null hypothesis <img class="formulaInl" alt="$ H_0 : p = p^0 $" src="form_449.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> - <tr><td class="paramname">observed</td><td>Number <img class="formulaInl" alt="$ n_i $" src="form_449.png"/> of observations of the current event/row </td></tr> - <tr><td class="paramname">expected</td><td>Expected number of observations of current event/row. This number is not required to be normalized. That is, <img class="formulaInl" alt="$ p^0_i $" src="form_450.png"/> will be taken as <code>expected</code> divided by <code>sum(expected)</code>. Hence, if this parameter is not specified, chi2_test() will by default use <img class="formulaInl" alt="$ p^0 = (\frac 1k, \dots, \frac 1k) $" src="form_451.png"/>, i.e., test that <img class="formulaInl" alt="$ p $" src="form_110.png"/> is a discrete uniform distribution. </td></tr> - <tr><td class="paramname">df</td><td>Degrees of freedom. This is the number of events reduced by the degree of freedom lost by using the observed numbers for defining the expected number of observations. If this parameter is 0, the degree of freedom is taken as <img class="formulaInl" alt="$ (k - 1) $" src="form_452.png"/>.</td></tr> + <tr><td class="paramname">observed</td><td>Number <img class="formulaInl" alt="$ n_i $" src="form_450.png"/> of observations of the current event/row </td></tr> + <tr><td class="paramname">expected</td><td>Expected number of observations of current event/row. This number is not required to be normalized. That is, <img class="formulaInl" alt="$ p^0_i $" src="form_451.png"/> will be taken as <code>expected</code> divided by <code>sum(expected)</code>. Hence, if this parameter is not specified, chi2_test() will by default use <img class="formulaInl" alt="$ p^0 = (\frac 1k, \dots, \frac 1k) $" src="form_452.png"/>, i.e., test that <img class="formulaInl" alt="$ p $" src="form_110.png"/> is a discrete uniform distribution. 
</td></tr> + <tr><td class="paramname">df</td><td>Degrees of freedom. This is the number of events reduced by the degree of freedom lost by using the observed numbers for defining the expected number of observations. If this parameter is 0, the degree of freedom is taken as <img class="formulaInl" alt="$ (k - 1) $" src="form_453.png"/>.</td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. Let <img class="formulaInl" alt="$ n = \sum_{i=1}^n n_i $" src="form_453.png"/>.<ul> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. Let <img class="formulaInl" alt="$ n = \sum_{i=1}^n n_i $" src="form_454.png"/>.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ \chi^2 = \sum_{i=1}^k \frac{(n_i - np_i)^2}{np_i} \]" src="form_454.png"/> +<img class="formulaDsp" alt="\[ \chi^2 = \sum_{i=1}^k \frac{(n_i - np_i)^2}{np_i} \]" src="form_455.png"/> </p> The corresponding random variable is approximately chi-squared distributed with <code>df</code> degrees of freedom.</li> <li><code>df BIGINT</code> - Degrees of freedom</li> -<li><code>p_value FLOAT8</code> - Approximate p-value, i.e., <img class="formulaInl" alt="$ \Pr[X^2 \geq \chi^2 \mid p = p^0] $" src="form_455.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a230513b6b549d5b445cbacbdbab42c15">chi_squared_cdf</a>(statistic))</code>.</li> -<li><code>phi FLOAT8</code> - Phi coefficient, i.e., <img class="formulaInl" alt="$ \phi = \sqrt{\frac{\chi^2}{n}} $" src="form_456.png"/></li> -<li><code>contingency_coef FLOAT8</code> - Contingency coefficient, i.e., <img class="formulaInl" alt="$ \sqrt{\frac{\chi^2}{n + \chi^2}} $" src="form_457.png"/></li> +<li><code>p_value FLOAT8</code> - Approximate p-value, i.e., <img class="formulaInl" alt="$ \Pr[X^2 \geq \chi^2 \mid p = p^0] $" src="form_456.png"/>. 
Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a230513b6b549d5b445cbacbdbab42c15">chi_squared_cdf</a>(statistic))</code>.</li> +<li><code>phi FLOAT8</code> - Phi coefficient, i.e., <img class="formulaInl" alt="$ \phi = \sqrt{\frac{\chi^2}{n}} $" src="form_457.png"/></li> +<li><code>contingency_coef FLOAT8</code> - Contingency coefficient, i.e., <img class="formulaInl" alt="$ \sqrt{\frac{\chi^2}{n + \chi^2}} $" src="form_458.png"/></li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -461,23 +461,23 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_433.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_n $" src="form_434.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_m \sim N(\mu_X, \sigma^2) $" src="form_435.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_n \sim N(\mu_Y, \sigma^2) $" src="form_436.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, $" src="form_416.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_304.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \sigma_X < \sigma_Y $" src="form_437.png"/> and <img class="formulaInl" alt="$ H_0 : \sigma_X = \sigma_Y $" src="form_438.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_434.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_n $" src="form_435.png"/> of i.i.d. 
random variables <img class="formulaInl" alt="$ X_1, \dots, X_m \sim N(\mu_X, \sigma^2) $" src="form_436.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_n \sim N(\mu_Y, \sigma^2) $" src="form_437.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, $" src="form_417.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_305.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \sigma_X < \sigma_Y $" src="form_438.png"/> and <img class="formulaInl" alt="$ H_0 : \sigma_X = \sigma_Y $" src="form_439.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> - <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_433.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_n $" src="form_434.png"/> (if <code>FALSE</code>) </td></tr> + <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_434.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_n $" src="form_435.png"/> (if <code>FALSE</code>) </td></tr> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> or <img class="formulaInl" alt="$ y_i $" src="form_60.png"/></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_419.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_420.png"/> the sample variances.<ul> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. 
We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_420.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_421.png"/> the sample variances.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ f = \frac{s_Y^2}{s_X^2} \]" src="form_439.png"/> +<img class="formulaDsp" alt="\[ f = \frac{s_Y^2}{s_X^2} \]" src="form_440.png"/> </p> - The corresponding random variable is F-distributed with <img class="formulaInl" alt="$ (n - 1) $" src="form_408.png"/> degrees of freedom in the numerator and <img class="formulaInl" alt="$ (m - 1) $" src="form_440.png"/> degrees of freedom in the denominator.</li> -<li><code>df1 BIGINT</code> - Degrees of freedom in the numerator <img class="formulaInl" alt="$ (n - 1) $" src="form_408.png"/></li> -<li><code>df2 BIGINT</code> - Degrees of freedom in the denominator <img class="formulaInl" alt="$ (m - 1) $" src="form_440.png"/></li> -<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is <img class="formulaInl" alt="$ \Pr[F \geq f \mid \sigma_X = \sigma_Y] $" src="form_441.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[F \geq f \mid \sigma_X \leq \sigma_Y] $" src="form_442.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c5b3e35531e44098f9d0cbef14cb8a6">fisher_f_cdf</a>(statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ 2 \cdot \min \{ p, 1 - p \} $" src="form_443.png"/> where <img class="formulaInl" alt="$ p = \Pr[ F \geq f \mid \sigma_X = \sigma_Y] $" src="form_444.png"/>. Computed as <code>(min(p_value_one_sided, 1. 
- p_value_one_sided))</code>.</li> + The corresponding random variable is F-distributed with <img class="formulaInl" alt="$ (n - 1) $" src="form_409.png"/> degrees of freedom in the numerator and <img class="formulaInl" alt="$ (m - 1) $" src="form_441.png"/> degrees of freedom in the denominator.</li> +<li><code>df1 BIGINT</code> - Degrees of freedom in the numerator <img class="formulaInl" alt="$ (n - 1) $" src="form_409.png"/></li> +<li><code>df2 BIGINT</code> - Degrees of freedom in the denominator <img class="formulaInl" alt="$ (m - 1) $" src="form_441.png"/></li> +<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is <img class="formulaInl" alt="$ \Pr[F \geq f \mid \sigma_X = \sigma_Y] $" src="form_442.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[F \geq f \mid \sigma_X \leq \sigma_Y] $" src="form_443.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c5b3e35531e44098f9d0cbef14cb8a6">fisher_f_cdf</a>(statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ 2 \cdot \min \{ p, 1 - p \} $" src="form_444.png"/> where <img class="formulaInl" alt="$ p = \Pr[ F \geq f \mid \sigma_X = \sigma_Y] $" src="form_445.png"/>. Computed as <code>(min(p_value_one_sided, 1. - p_value_one_sided))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -608,23 +608,23 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_433.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_m $" src="form_458.png"/> and i.i.d. 
<img class="formulaInl" alt="$ Y_1, \dots, Y_n $" src="form_459.png"/>, respectively, test the null hypothesis that the underlying distributions function <img class="formulaInl" alt="$ F_X, F_Y $" src="form_460.png"/> are identical, i.e., <img class="formulaInl" alt="$ H_0 : F_X = F_Y $" src="form_461.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_434.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_m $" src="form_459.png"/> and i.i.d. <img class="formulaInl" alt="$ Y_1, \dots, Y_n $" src="form_460.png"/>, respectively, test the null hypothesis that the underlying distributions function <img class="formulaInl" alt="$ F_X, F_Y $" src="form_461.png"/> are identical, i.e., <img class="formulaInl" alt="$ H_0 : F_X = F_Y $" src="form_462.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> <tr><td class="paramname">first</td><td>Determines whether the value belongs to the first (if <code>TRUE</code>) or the second sample (if <code>FALSE</code>) </td></tr> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> or <img class="formulaInl" alt="$ y_i $" src="form_60.png"/> </td></tr> - <tr><td class="paramname">m</td><td>Size <img class="formulaInl" alt="$ m $" src="form_291.png"/> of the first sample. See usage instructions below. </td></tr> + <tr><td class="paramname">m</td><td>Size <img class="formulaInl" alt="$ m $" src="form_292.png"/> of the first sample. See usage instructions below. </td></tr> <tr><td class="paramname">n</td><td>Size of the second sample. 
See usage instructions below.</td></tr> </table> </dd> </dl> <dl class="section return"><dt>Returns</dt><dd>A composite value.<ul> <li><code>statistic FLOAT8</code> - Kolmogorov–Smirnov statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ d = \max_{t \in \mathbb R} |F_x(t) - F_y(t)| \]" src="form_462.png"/> +<img class="formulaDsp" alt="\[ d = \max_{t \in \mathbb R} |F_x(t) - F_y(t)| \]" src="form_463.png"/> </p> - where <img class="formulaInl" alt="$ F_x(t) := \frac 1m |\{ i \mid x_i \leq t \}| $" src="form_463.png"/> and <img class="formulaInl" alt="$ F_y $" src="form_464.png"/> (defined likewise) are the empirical distribution functions.</li> -<li><code>k_statistic FLOAT8</code> - Kolmogorov statistic <img class="formulaInl" alt="$ k = (r + 0.12 + \frac{0.11}{r}) \cdot d $" src="form_465.png"/> where <img class="formulaInl" alt="$ r = \sqrt{\frac{m n}{m+n}}. $" src="form_466.png"/> and <img class="formulaInl" alt="$ d $" src="form_467.png"/> is the statistic. Then <img class="formulaInl" alt="$ k $" src="form_97.png"/> is approximately Kolmogorov distributed.</li> -<li><code>p_value FLOAT8</code> - Approximate p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[D \geq d \mid F_X = F_Y] $" src="form_468.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#aeef43f74f583bdff17bd074d9c0d9607">kolmogorov_cdf</a>(k_statistic))</code>.</li> + where <img class="formulaInl" alt="$ F_x(t) := \frac 1m |\{ i \mid x_i \leq t \}| $" src="form_464.png"/> and <img class="formulaInl" alt="$ F_y $" src="form_465.png"/> (defined likewise) are the empirical distribution functions.</li> +<li><code>k_statistic FLOAT8</code> - Kolmogorov statistic <img class="formulaInl" alt="$ k = (r + 0.12 + \frac{0.11}{r}) \cdot d $" src="form_466.png"/> where <img class="formulaInl" alt="$ r = \sqrt{\frac{m n}{m+n}}. $" src="form_467.png"/> and <img class="formulaInl" alt="$ d $" src="form_468.png"/> is the statistic. 
Then <img class="formulaInl" alt="$ k $" src="form_97.png"/> is approximately Kolmogorov distributed.</li> +<li><code>p_value FLOAT8</code> - Approximate p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[D \geq d \mid F_X = F_Y] $" src="form_469.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#aeef43f74f583bdff17bd074d9c0d9607">kolmogorov_cdf</a>(k_statistic))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -662,26 +662,26 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_{1,1}, \dots, x_{1, n_1}, x_{2,1}, \dots, x_{2,n_2}, \dots, x_{k,n_k} $" src="form_493.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_{i,j} \sim N(\mu_i, \sigma^2) $" src="form_494.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_1, \dots, \mu_k $" src="form_495.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_304.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_1 = \dots = \mu_k $" src="form_496.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_{1,1}, \dots, x_{1, n_1}, x_{2,1}, \dots, x_{2,n_2}, \dots, x_{k,n_k} $" src="form_494.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_{i,j} \sim N(\mu_i, \sigma^2) $" src="form_495.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_1, \dots, \mu_k $" src="form_496.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_305.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_1 = \dots = \mu_k $" src="form_497.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> <tr><td class="paramname">group</td><td>Group which <code>value</code> is from. Note that <code>group</code> can assume arbitary value not limited to a continguous range of integers. 
</td></tr> - <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_{i,j} $" src="form_497.png"/></td></tr> + <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_{i,j} $" src="form_498.png"/></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. Let <img class="formulaInl" alt="$ n := \sum_{i=1}^k n_i $" src="form_498.png"/> be the total size of all samples. Denote by <img class="formulaInl" alt="$ \bar x $" src="form_405.png"/> the grand mean, by <img class="formulaInl" alt="$ \overline{x_i} $" src="form_499.png"/> the group sample means, and by <img class="formulaInl" alt="$ s_i^2 $" src="form_500.png"/> the group sample variances.<ul> -<li><code>sum_squares_between DOUBLE PRECISION</code> - sum of squares between the group means, i.e., <img class="formulaInl" alt="$ \mathit{SS}_b = \sum_{i=1}^k n_i (\overline{x_i} - \bar x)^2. $" src="form_501.png"/></li> -<li><code>sum_squares_within DOUBLE PRECISION</code> - sum of squares within the groups, i.e., <img class="formulaInl" alt="$ \mathit{SS}_w = \sum_{i=1}^k (n_i - 1) s_i^2. $" src="form_502.png"/></li> -<li><code>df_between BIGINT</code> - degree of freedom for between-group variation <img class="formulaInl" alt="$ (k-1) $" src="form_503.png"/></li> -<li><code>df_within BIGINT</code> - degree of freedom for within-group variation <img class="formulaInl" alt="$ (n-k) $" src="form_504.png"/></li> -<li><code>mean_squares_between DOUBLE PRECISION</code> - mean square between groups, i.e., <img class="formulaInl" alt="$ s_b^2 := \frac{\mathit{SS}_b}{k-1} $" src="form_505.png"/></li> -<li><code>mean_squares_within DOUBLE PRECISION</code> - mean square within groups, i.e., <img class="formulaInl" alt="$ s_w^2 := \frac{\mathit{SS}_w}{n-k} $" src="form_506.png"/></li> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. 
Let <img class="formulaInl" alt="$ n := \sum_{i=1}^k n_i $" src="form_499.png"/> be the total size of all samples. Denote by <img class="formulaInl" alt="$ \bar x $" src="form_406.png"/> the grand mean, by <img class="formulaInl" alt="$ \overline{x_i} $" src="form_500.png"/> the group sample means, and by <img class="formulaInl" alt="$ s_i^2 $" src="form_501.png"/> the group sample variances.<ul> +<li><code>sum_squares_between DOUBLE PRECISION</code> - sum of squares between the group means, i.e., <img class="formulaInl" alt="$ \mathit{SS}_b = \sum_{i=1}^k n_i (\overline{x_i} - \bar x)^2. $" src="form_502.png"/></li> +<li><code>sum_squares_within DOUBLE PRECISION</code> - sum of squares within the groups, i.e., <img class="formulaInl" alt="$ \mathit{SS}_w = \sum_{i=1}^k (n_i - 1) s_i^2. $" src="form_503.png"/></li> +<li><code>df_between BIGINT</code> - degree of freedom for between-group variation <img class="formulaInl" alt="$ (k-1) $" src="form_504.png"/></li> +<li><code>df_within BIGINT</code> - degree of freedom for within-group variation <img class="formulaInl" alt="$ (n-k) $" src="form_505.png"/></li> +<li><code>mean_squares_between DOUBLE PRECISION</code> - mean square between groups, i.e., <img class="formulaInl" alt="$ s_b^2 := \frac{\mathit{SS}_b}{k-1} $" src="form_506.png"/></li> +<li><code>mean_squares_within DOUBLE PRECISION</code> - mean square within groups, i.e., <img class="formulaInl" alt="$ s_w^2 := \frac{\mathit{SS}_w}{n-k} $" src="form_507.png"/></li> <li><code>statistic DOUBLE PRECISION</code> - Statistic computed as <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ f = \frac{s_b^2}{s_w^2}. \]" src="form_507.png"/> +<img class="formulaDsp" alt="\[ f = \frac{s_b^2}{s_w^2}. 
\]" src="form_508.png"/> </p> - This statistic is Fisher F-distributed with <img class="formulaInl" alt="$ (k-1) $" src="form_503.png"/> degrees of freedom in the numerator and <img class="formulaInl" alt="$ (n-k) $" src="form_504.png"/> degrees of freedom in the denominator.</li> -<li><code>p_value DOUBLE PRECISION</code> - p-value, i.e., <img class="formulaInl" alt="$ \Pr[ F \geq f \mid H_0] $" src="form_508.png"/>.</li> + This statistic is Fisher F-distributed with <img class="formulaInl" alt="$ (k-1) $" src="form_504.png"/> degrees of freedom in the numerator and <img class="formulaInl" alt="$ (n-k) $" src="form_505.png"/> degrees of freedom in the denominator.</li> +<li><code>p_value DOUBLE PRECISION</code> - p-value, i.e., <img class="formulaInl" alt="$ \Pr[ F \geq f \mid H_0] $" src="form_509.png"/>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -762,37 +762,37 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n $" src="form_478.png"/> with unknown mean <img class="formulaInl" alt="$ \mu $" src="form_286.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu \leq 0 $" src="form_403.png"/> and <img class="formulaInl" alt="$ H_0 : \mu = 0 $" src="form_404.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> of i.i.d. 
random variables <img class="formulaInl" alt="$ X_1, \dots, X_n $" src="form_479.png"/> with unknown mean <img class="formulaInl" alt="$ \mu $" src="form_287.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu \leq 0 $" src="form_404.png"/> and <img class="formulaInl" alt="$ H_0 : \mu = 0 $" src="form_405.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> or <img class="formulaInl" alt="$ y_i $" src="form_60.png"/>. Values of 0 are ignored (i.e., they do not count towards <img class="formulaInl" alt="$ n $" src="form_10.png"/>). </td></tr> - <tr><td class="paramname">precision</td><td>The precision <img class="formulaInl" alt="$ \epsilon_i $" src="form_479.png"/> with which value is known. The precision determines the handling of ties. The current value <img class="formulaInl" alt="$ v_i $" src="form_480.png"/> is regarded a tie with the previous value <img class="formulaInl" alt="$ v_{i-1} $" src="form_481.png"/> if <img class="formulaInl" alt="$ v_i - \epsilon_i \leq \max_{j=1, \dots, i-1} v_j + \epsilon_j $" src="form_482.png"/>. If <code>precision</code> is negative, then it will be treated as <code>value * 2^(-52)</code>. (Note that <img class="formulaInl" alt="$ 2^{-52} $" src="form_365.png"/> is the machine epsilon for type <code>DOUBLE PRECISION</code>.)</td></tr> + <tr><td class="paramname">precision</td><td>The precision <img class="formulaInl" alt="$ \epsilon_i $" src="form_480.png"/> with which value is known. The precision determines the handling of ties. The current value <img class="formulaInl" alt="$ v_i $" src="form_481.png"/> is regarded a tie with the previous value <img class="formulaInl" alt="$ v_{i-1} $" src="form_482.png"/> if <img class="formulaInl" alt="$ v_i - \epsilon_i \leq \max_{j=1, \dots, i-1} v_j + \epsilon_j $" src="form_483.png"/>. 
If <code>precision</code> is negative, then it will be treated as <code>value * 2^(-52)</code>. (Note that <img class="formulaInl" alt="$ 2^{-52} $" src="form_366.png"/> is the machine epsilon for type <code>DOUBLE PRECISION</code>.)</td></tr> </table> </dd> </dl> <dl class="section return"><dt>Returns</dt><dd>A composite value:<ul> -<li><code>statistic FLOAT8</code> - statistic computed as follows. Let <img class="formulaInl" alt="$ w^+ = \sum_{i \mid x_i > 0} r_i $" src="form_483.png"/> and <img class="formulaInl" alt="$ w^- = \sum_{i \mid x_i < 0} r_i $" src="form_484.png"/> be the <em>signed rank sums</em> where <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ r_i = \{ j \mid |x_j| < |x_i| \} + \frac{\{ j \mid |x_j| = |x_i| \} + 1}{2}. \]" src="form_485.png"/> +<li><code>statistic FLOAT8</code> - statistic computed as follows. Let <img class="formulaInl" alt="$ w^+ = \sum_{i \mid x_i > 0} r_i $" src="form_484.png"/> and <img class="formulaInl" alt="$ w^- = \sum_{i \mid x_i < 0} r_i $" src="form_485.png"/> be the <em>signed rank sums</em> where <p class="formulaDsp"> +<img class="formulaDsp" alt="\[ r_i = \{ j \mid |x_j| < |x_i| \} + \frac{\{ j \mid |x_j| = |x_i| \} + 1}{2}. 
\]" src="form_486.png"/> </p> - The Wilcoxon signed-rank statistic is <img class="formulaInl" alt="$ w = \min \{ w^+, w^- \} $" src="form_486.png"/>.</li> -<li><code>rank_sum_pos FLOAT8</code> - rank sum of all positive values, i.e., <img class="formulaInl" alt="$ w^+ $" src="form_487.png"/></li> -<li><code>rank_sum_neg FLOAT8</code> - rank sum of all negative values, i.e., <img class="formulaInl" alt="$ w^- $" src="form_488.png"/></li> + The Wilcoxon signed-rank statistic is <img class="formulaInl" alt="$ w = \min \{ w^+, w^- \} $" src="form_487.png"/>.</li> +<li><code>rank_sum_pos FLOAT8</code> - rank sum of all positive values, i.e., <img class="formulaInl" alt="$ w^+ $" src="form_488.png"/></li> +<li><code>rank_sum_neg FLOAT8</code> - rank sum of all negative values, i.e., <img class="formulaInl" alt="$ w^- $" src="form_489.png"/></li> <li><code>num BIGINT</code> - number <img class="formulaInl" alt="$ n $" src="form_10.png"/> of non-zero values</li> <li><code>z_statistic FLOAT8</code> - z-statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ z = \frac{w^+ - \frac{n(n+1)}{4}} {\sqrt{\frac{n(n+1)(2n+1)}{24} - \sum_{i=1}^n \frac{t_i^2 - 1}{48}}} \]" src="form_489.png"/> +<img class="formulaDsp" alt="\[ z = \frac{w^+ - \frac{n(n+1)}{4}} {\sqrt{\frac{n(n+1)(2n+1)}{24} - \sum_{i=1}^n \frac{t_i^2 - 1}{48}}} \]" src="form_490.png"/> </p> - where <img class="formulaInl" alt="$ t_i $" src="form_388.png"/> is the number of values with absolute value equal to <img class="formulaInl" alt="$ |x_i| $" src="form_490.png"/>. The corresponding random variable is approximately standard normally distributed.</li> -<li><code>p_value_one_sided FLOAT8</code> - One-sided p-value i.e., <img class="formulaInl" alt="$ \Pr[Z \geq z \mid \mu \leq 0] $" src="form_491.png"/>. 
Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(z_statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |Z| \geq |z| \mid \mu = 0] $" src="form_492.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(-abs(z_statistic)))</code>.</li> + where <img class="formulaInl" alt="$ t_i $" src="form_389.png"/> is the number of values with absolute value equal to <img class="formulaInl" alt="$ |x_i| $" src="form_491.png"/>. The corresponding random variable is approximately standard normally distributed.</li> +<li><code>p_value_one_sided FLOAT8</code> - One-sided p-value i.e., <img class="formulaInl" alt="$ \Pr[Z \geq z \mid \mu \leq 0] $" src="form_492.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(z_statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |Z| \geq |z| \mid \mu = 0] $" src="form_493.png"/>. 
Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(-abs(z_statistic)))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> -<li>One-sample test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_412.png"/>: <pre>SELECT (wsr_test(<em>value</em> - <em>mu_0</em> ORDER BY abs(<em>value</em>))).* FROM <em>source</em></pre></li> -<li>Dependent paired test: Test null hypothesis that the mean difference between the first and second value in a pair is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_412.png"/>: <pre>SELECT (wsr_test(<em>first</em> - <em>second</em> - <em>mu_0</em> ORDER BY abs(<em>first</em> - <em>second</em>))).* FROM <em>source</em></pre> If correctly determining ties is important (e.g., you may want to do so when comparing to software products that take <code>first</code>, <code>second</code>, and <code>mu_0</code> as individual parameters), supply the precision parameter. 
This can be done as follows: <pre>SELECT (wsr_test( +<li>One-sample test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_413.png"/>: <pre>SELECT (wsr_test(<em>value</em> - <em>mu_0</em> ORDER BY abs(<em>value</em>))).* FROM <em>source</em></pre></li> +<li>Dependent paired test: Test null hypothesis that the mean difference between the first and second value in a pair is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_413.png"/>: <pre>SELECT (wsr_test(<em>first</em> - <em>second</em> - <em>mu_0</em> ORDER BY abs(<em>first</em> - <em>second</em>))).* FROM <em>source</em></pre> If correctly determining ties is important (e.g., you may want to do so when comparing to software products that take <code>first</code>, <code>second</code>, and <code>mu_0</code> as individual parameters), supply the precision parameter. This can be done as follows: <pre>SELECT (wsr_test( <em>first</em> - <em>second</em> - <em>mu_0</em>, 3 * 2^(-52) * greatest(first, second, mu_0) ORDER BY abs(<em>first</em> - <em>second</em>) -)).* FROM <em>source</em></pre> Here <img class="formulaInl" alt="$ 2^{-52} $" src="form_365.png"/> is the machine epsilon, which we scale to the magnitude of the input data and multiply with 3 because we have a sum with three terms.</li> +)).* FROM <em>source</em></pre> Here <img class="formulaInl" alt="$ 2^{-52} $" src="form_366.png"/> is the machine epsilon, which we scale to the magnitude of the input data and multiply with 3 because we have a sum with three terms.</li> </ul> </dd></dl> <dl class="section note"><dt>Note</dt><dd>This aggregate must be used as an ordered aggregate (<code>ORDER BY abs(<em>value</code></em>)) and will raise an exception if the absolute values are not ordered. 
</dd></dl> @@ -838,26 +838,26 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu, \sigma^2) $" src="form_402.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu $" src="form_286.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_304.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu \leq 0 $" src="form_403.png"/> and <img class="formulaInl" alt="$ H_0 : \mu = 0 $" src="form_404.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu, \sigma^2) $" src="form_403.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu $" src="form_287.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_305.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu \leq 0 $" src="form_404.png"/> and <img class="formulaInl" alt="$ H_0 : \mu = 0 $" src="form_405.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by <img class="formulaInl" alt="$ \bar x $" src="form_405.png"/> the sample mean and by <img class="formulaInl" alt="$ s^2 $" src="form_406.png"/> the sample variance.<ul> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. 
We denote by <img class="formulaInl" alt="$ \bar x $" src="form_406.png"/> the sample mean and by <img class="formulaInl" alt="$ s^2 $" src="form_407.png"/> the sample variance.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ t = \frac{\sqrt n \cdot \bar x}{s} \]" src="form_407.png"/> +<img class="formulaDsp" alt="\[ t = \frac{\sqrt n \cdot \bar x}{s} \]" src="form_408.png"/> </p> - The corresponding random variable is Student-t distributed with <img class="formulaInl" alt="$ (n - 1) $" src="form_408.png"/> degrees of freedom.</li> -<li><code>df FLOAT8</code> - Degrees of freedom <img class="formulaInl" alt="$ (n - 1) $" src="form_408.png"/></li> -<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X \geq \bar x \mid \mu = 0] $" src="form_409.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X \geq \bar x \mid \mu \leq 0] $" src="form_410.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X| \geq |\bar x| \mid \mu = 0] $" src="form_411.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> + The corresponding random variable is Student-t distributed with <img class="formulaInl" alt="$ (n - 1) $" src="form_409.png"/> degrees of freedom.</li> +<li><code>df FLOAT8</code> - Degrees of freedom <img class="formulaInl" alt="$ (n - 1) $" src="form_409.png"/></li> +<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. 
In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X \geq \bar x \mid \mu = 0] $" src="form_410.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X \geq \bar x \mid \mu \leq 0] $" src="form_411.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X| \geq |\bar x| \mid \mu = 0] $" src="form_412.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> -<li>One-sample t-test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_412.png"/>: <pre>SELECT (t_test_one(<em>value</em> - <em>mu_0</em>)).* FROM <em>source</em></pre></li> -<li>Dependent paired t-test: Test null hypothesis that the mean difference between the first and second value in each pair is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_412.png"/>: <pre>SELECT (t_test_one(<em>first</em> - <em>second</em> - <em>mu_0</em>)).* +<li>One-sample t-test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_413.png"/>: <pre>SELECT (t_test_one(<em>value</em> - <em>mu_0</em>)).* FROM <em>source</em></pre></li> +<li>Dependent paired t-test: Test null hypothesis that the mean difference between the first and second value in each pair is at most (or equal to, respectively) <img class="formulaInl" alt="$ \mu_0 $" src="form_413.png"/>: <pre>SELECT (t_test_one(<em>first</em> - <em>second</em> - <em>mu_0</em>)).* FROM <em>source</em></pre> </li> </ul> </dd></dl> @@ -929,25 +929,25 @@ FROM ( </tr> </table> </div><div 
class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu_X, \sigma^2) $" src="form_414.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_m \sim N(\mu_Y, \sigma^2) $" src="form_415.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, $" src="form_416.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_304.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_X \leq \mu_Y $" src="form_417.png"/> and <img class="formulaInl" alt="$ H_0 : \mu_X = \mu_Y $" src="form_418.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu_X, \sigma^2) $" src="form_415.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_m \sim N(\mu_Y, \sigma^2) $" src="form_416.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, $" src="form_417.png"/> and <img class="formulaInl" alt="$ \sigma^2 $" src="form_305.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_X \leq \mu_Y $" src="form_418.png"/> and <img class="formulaInl" alt="$ H_0 : \mu_X = \mu_Y $" src="form_419.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> - <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> (if <code>FALSE</code>) </td></tr> + <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img 
class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> (if <code>FALSE</code>) </td></tr> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> or <img class="formulaInl" alt="$ y_i $" src="form_60.png"/></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_419.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_420.png"/> the sample variances.<ul> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_420.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_421.png"/> the sample variances.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ t = \frac{\bar x - \bar y}{s_p \sqrt{1/n + 1/m}} \]" src="form_421.png"/> +<img class="formulaDsp" alt="\[ t = \frac{\bar x - \bar y}{s_p \sqrt{1/n + 1/m}} \]" src="form_422.png"/> </p> where <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ s_p^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2 + \sum_{i=1}^m (y_i - \bar y)^2} {n + m - 2} \]" src="form_422.png"/> +<img class="formulaDsp" alt="\[ s_p^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2 + \sum_{i=1}^m (y_i - \bar y)^2} {n + m - 2} \]" src="form_423.png"/> </p> - is the <em>pooled variance</em>. The corresponding random variable is Student-t distributed with <img class="formulaInl" alt="$ (n + m - 2) $" src="form_423.png"/> degrees of freedom.</li> -<li><code>df FLOAT8</code> - Degrees of freedom <img class="formulaInl" alt="$ (n + m - 2) $" src="form_423.png"/></li> -<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. 
In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] $" src="form_424.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] $" src="form_425.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] $" src="form_426.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> + is the <em>pooled variance</em>. The corresponding random variable is Student-t distributed with <img class="formulaInl" alt="$ (n + m - 2) $" src="form_424.png"/> degrees of freedom.</li> +<li><code>df FLOAT8</code> - Degrees of freedom <img class="formulaInl" alt="$ (n + m - 2) $" src="form_424.png"/></li> +<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] $" src="form_425.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] $" src="form_426.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] $" src="form_427.png"/>. 
Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -1028,25 +1028,25 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu_X, \sigma_X^2) $" src="form_427.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_m \sim N(\mu_Y, \sigma_Y^2) $" src="form_428.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, \sigma_X^2, $" src="form_429.png"/> and <img class="formulaInl" alt="$ \sigma_Y^2 $" src="form_430.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_X \leq \mu_Y $" src="form_417.png"/> and <img class="formulaInl" alt="$ H_0 : \mu_X = \mu_Y $" src="form_418.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> of i.i.d. 
random variables <img class="formulaInl" alt="$ X_1, \dots, X_n \sim N(\mu_X, \sigma_X^2) $" src="form_428.png"/> and <img class="formulaInl" alt="$ Y_1, \dots, Y_m \sim N(\mu_Y, \sigma_Y^2) $" src="form_429.png"/> with unknown parameters <img class="formulaInl" alt="$ \mu_X, \mu_Y, \sigma_X^2, $" src="form_430.png"/> and <img class="formulaInl" alt="$ \sigma_Y^2 $" src="form_431.png"/>, test the null hypotheses <img class="formulaInl" alt="$ H_0 : \mu_X \leq \mu_Y $" src="form_418.png"/> and <img class="formulaInl" alt="$ H_0 : \mu_X = \mu_Y $" src="form_419.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> - <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> (if <code>FALSE</code>) </td></tr> + <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample <img class="formulaInl" alt="$ x_1, \dots, x_n $" src="form_183.png"/> (if <code>TRUE</code>) or from second sample <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> (if <code>FALSE</code>) </td></tr> <tr><td class="paramname">value</td><td>Value of random variate <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> or <img class="formulaInl" alt="$ y_i $" src="form_60.png"/></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_419.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_420.png"/> the sample variances.<ul> +<dl class="section return"><dt>Returns</dt><dd>A composite value as follows. 
We denote by <img class="formulaInl" alt="$ \bar x, \bar y $" src="form_420.png"/> the sample means and by <img class="formulaInl" alt="$ s_X^2, s_Y^2 $" src="form_421.png"/> the sample variances.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ t = \frac{\bar x - \bar y}{\sqrt{s_X^2/n + s_Y^2/m}} \]" src="form_431.png"/> +<img class="formulaDsp" alt="\[ t = \frac{\bar x - \bar y}{\sqrt{s_X^2/n + s_Y^2/m}} \]" src="form_432.png"/> </p> The corresponding random variable is approximately Student-t distributed with <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ \frac{(s_X^2 / n + s_Y^2 / m)^2}{(s_X^2 / n)^2/(n-1) + (s_Y^2 / m)^2/(m-1)} \]" src="form_432.png"/> +<img class="formulaDsp" alt="\[ \frac{(s_X^2 / n + s_Y^2 / m)^2}{(s_X^2 / n)^2/(n-1) + (s_Y^2 / m)^2/(m-1)} \]" src="form_433.png"/> </p> degrees of freedom (Welch&ndash;Satterthwaite formula).</li> <li><code>df FLOAT8</code> - Degrees of freedom (as above)</li> -<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] $" src="form_424.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] $" src="form_425.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] $" src="form_426.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> +<li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value.
In detail, the result is <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] $" src="form_425.png"/>, which is a lower bound on <img class="formulaInl" alt="$ \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] $" src="form_426.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., <img class="formulaInl" alt="$ \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] $" src="form_427.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -1117,7 +1117,7 @@ FROM ( </tr> </table> </div><div class="memdoc"> -<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_433.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_413.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_m $" src="form_458.png"/> and i.i.d. <img class="formulaInl" alt="$ Y_1, \dots, Y_n $" src="form_459.png"/>, respectively, test the null hypothesis that the underlying distributions are equal, i.e., <img class="formulaInl" alt="$ H_0 : \forall i,j: \Pr[X_i > Y_j] + \frac{\Pr[X_i = Y_j]}{2} = \frac 12 $" src="form_469.png"/>.</p> +<p>Given realizations <img class="formulaInl" alt="$ x_1, \dots, x_m $" src="form_434.png"/> and <img class="formulaInl" alt="$ y_1, \dots, y_m $" src="form_414.png"/> of i.i.d. random variables <img class="formulaInl" alt="$ X_1, \dots, X_m $" src="form_459.png"/> and i.i.d. 
<img class="formulaInl" alt="$ Y_1, \dots, Y_n $" src="form_460.png"/>, respectively, test the null hypothesis that the underlying distributions are equal, i.e., <img class="formulaInl" alt="$ H_0 : \forall i,j: \Pr[X_i > Y_j] + \frac{\Pr[X_i = Y_j]}{2} = \frac 12 $" src="form_470.png"/>.</p> <dl class="params"><dt>Parameters</dt><dd> <table class="params"> <tr><td class="paramname">first</td><td>Determines whether the value belongs to the first (if <code>TRUE</code>) or the second sample (if <code>FALSE</code>) </td></tr> @@ -1127,18 +1127,18 @@ FROM ( </dl> <dl class="section return"><dt>Returns</dt><dd>A composite value.<ul> <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ z = \frac{u - \bar x}{\sqrt{\frac{mn(m+n+1)}{12}}} \]" src="form_470.png"/> +<img class="formulaDsp" alt="\[ z = \frac{u - \bar x}{\sqrt{\frac{mn(m+n+1)}{12}}} \]" src="form_471.png"/> </p> - where <img class="formulaInl" alt="$ u $" src="form_471.png"/> is the u-statistic computed as follows. The z-statistic is approximately standard normally distributed.</li> -<li><code>u_statistic FLOAT8</code> - Statistic <img class="formulaInl" alt="$ u = \min \{ u_x, u_y \} $" src="form_472.png"/> where <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ u_x = mn + \binom{m+1}{2} - \sum_{i=1}^m r_{x,i} \]" src="form_473.png"/> + where <img class="formulaInl" alt="$ u $" src="form_472.png"/> is the u-statistic computed as follows. 
The z-statistic is approximately standard normally distributed.</li> +<li><code>u_statistic FLOAT8</code> - Statistic <img class="formulaInl" alt="$ u = \min \{ u_x, u_y \} $" src="form_473.png"/> where <p class="formulaDsp"> +<img class="formulaDsp" alt="\[ u_x = mn + \binom{m+1}{2} - \sum_{i=1}^m r_{x,i} \]" src="form_474.png"/> </p> where <p class="formulaDsp"> -<img class="formulaDsp" alt="\[ r_{x,i} = \{ j \mid x_j < x_i \} + \{ j \mid y_j < x_i \} + \frac{\{ j \mid x_j = x_i \} + \{ j \mid y_j = x_i \} + 1}{2} \]" src="form_474.png"/> +<img class="formulaDsp" alt="\[ r_{x,i} = \{ j \mid x_j < x_i \} + \{ j \mid y_j < x_i \} + \frac{\{ j \mid x_j = x_i \} + \{ j \mid y_j = x_i \} + 1}{2} \]" src="form_475.png"/> </p> - is defined as the rank of <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> in the combined list of all <img class="formulaInl" alt="$ m+n $" src="form_475.png"/> observations. For ties, the average rank of all equal values is used.</li> -<li><code>p_value_one_sided FLOAT8</code> - Approximate one-sided p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[Z \geq z \mid H_0] $" src="form_476.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(z_statistic))</code>.</li> -<li><code>p_value_two_sided FLOAT8</code> - Approximate two-sided p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[|Z| \geq |z| \mid H_0] $" src="form_477.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(-abs(z_statistic)))</code>.</li> + is defined as the rank of <img class="formulaInl" alt="$ x_i $" src="form_62.png"/> in the combined list of all <img class="formulaInl" alt="$ m+n $" src="form_476.png"/> observations. 
For ties, the average rank of all equal values is used.</li> +<li><code>p_value_one_sided FLOAT8</code> - Approximate one-sided p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[Z \geq z \mid H_0] $" src="form_477.png"/>. Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(z_statistic))</code>.</li> +<li><code>p_value_two_sided FLOAT8</code> - Approximate two-sided p-value, i.e., an approximate value for <img class="formulaInl" alt="$ \Pr[|Z| \geq |z| \mid H_0] $" src="form_478.png"/>. Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a6c0a499faa80db26c0178f1e69cf7a50">normal_cdf</a>(-abs(z_statistic)))</code>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -1181,7 +1181,7 @@ FROM ( <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_505cd743a8a717435eca324f49291a46.html">stats</a></li><li class="navelem"><a class="el" href="hypothesis__tests_8sql__in.html">hypothesis_tests.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul>
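The t statistics and degrees of freedom documented in the hunks above (one-sample, pooled two-sample, and unpooled two-sample with the Welch-Satterthwaite formula) can be sketched outside the database as follows. This is only an illustrative sketch using the Python standard library. The function names `t_one_sample`, `t_two_pooled`, and `t_two_unpooled` are hypothetical and are not MADlib identifiers, and the p-values are left out because the standard library has no Student-t CDF.

```python
import math
from statistics import mean, variance  # variance() uses the (n - 1) denominator


def t_one_sample(xs):
    """Return (t, df) with t = sqrt(n) * x_bar / s and df = n - 1."""
    n = len(xs)
    s = math.sqrt(variance(xs))
    return math.sqrt(n) * mean(xs) / s, n - 1


def t_two_pooled(xs, ys):
    """Return (t, df) using the pooled variance s_p^2 and df = n + m - 2."""
    n, m = len(xs), len(ys)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of each sample around its own mean.
    ss = sum((x - x_bar) ** 2 for x in xs) + sum((y - y_bar) ** 2 for y in ys)
    s_p = math.sqrt(ss / (n + m - 2))
    return (x_bar - y_bar) / (s_p * math.sqrt(1 / n + 1 / m)), n + m - 2


def t_two_unpooled(xs, ys):
    """Return (t, df) with df from the Welch-Satterthwaite formula."""
    n, m = len(xs), len(ys)
    vx, vy = variance(xs) / n, variance(ys) / m  # s_X^2 / n and s_Y^2 / m
    t = (mean(xs) - mean(ys)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df
```

For two samples of equal size and equal sample variance, the pooled and unpooled variants return the same statistic and the same degrees of freedom, which makes a convenient sanity check when comparing against the SQL aggregates.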
http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/index.html ---------------------------------------------------------------------- diff --git a/docs/latest/index.html b/docs/latest/index.html index 645fa47..5aad377 100644 --- a/docs/latest/index.html +++ b/docs/latest/index.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -123,7 +123,7 @@ $(document).ready(function(){initNavTree('index.html','');}); <li> <a href="https://mail-archives.apache.org/mod_mbox/incubator-madlib-dev/">Dev mailing list</a> </li> <li> -User documentation for earlier releases: <a href="../v1.8/index.html">v1.8</a>, <a href="../v1.7.1/index.html">v1.7.1</a>, <a href="../v1.7/index.html">v1.7</a>, <a href="../v1.6/index.html">v1.6</a>, <a href="../v1.5/index.html">v1.5</a>, <a href="../v1.4/index.html">v1.4</a>, <a href="../v1.3/index.html">v1.3</a>, <a href="../v1.2/index.html">v1.2</a> </li> +User documentation for earlier releases: <a href="../v1.9/index.html">v1.9</a>, <a href="../v1.8/index.html">v1.8</a>, <a href="../v1.7.1/index.html">v1.7.1</a>, <a href="../v1.7/index.html">v1.7</a>, <a href="../v1.6/index.html">v1.6</a>, <a href="../v1.5/index.html">v1.5</a>, <a href="../v1.4/index.html">v1.4</a>, <a href="../v1.3/index.html">v1.3</a>, <a href="../v1.2/index.html">v1.2</a> </li> </ul> <p>Please refer to the <a href="https://github.com/apache/incubator-madlib/blob/master/ReadMe.txt">Read-Me</a> file for information about incorporated third-party material. 
License information regarding MADlib and included third-party libraries can be found inside the <a href="https://github.com/apache/incubator-madlib/blob/master/LICENSE">License</a> directory. </p> </div></div><!-- contents --> @@ -131,7 +131,7 @@ User documentation for earlier releases: <a href="../v1.8/index.html">v1.8</a>, <!-- start footer part --> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> <ul> - <li class="footer">Generated on Thu Apr 7 2016 14:24:11 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/kmeans_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/kmeans_8sql__in.html b/docs/latest/kmeans_8sql__in.html index 9b215f2..99d180f 100644 --- a/docs/latest/kmeans_8sql__in.html +++ b/docs/latest/kmeans_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -243,7 +243,7 @@ Functions</h2></td></tr> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! 
--> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_73ccba3aa44ce35463f879b4ebbd3f46.html">kmeans</a></li><li class="navelem"><a class="el" href="kmeans_8sql__in.html">kmeans.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/lda_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/lda_8sql__in.html b/docs/latest/lda_8sql__in.html index da80cce..6c8bd6c 100644 --- a/docs/latest/lda_8sql__in.html +++ b/docs/latest/lda_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -1311,7 +1311,7 @@ Functions</h2></td></tr> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! 
--> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_6ff79b0655deb26abf8f86290b84a97c.html">lda</a></li><li class="navelem"><a class="el" href="lda_8sql__in.html">lda.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/linalg_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/linalg_8sql__in.html b/docs/latest/linalg_8sql__in.html index 64cabeb..8e5ea8a 100644 --- a/docs/latest/linalg_8sql__in.html +++ b/docs/latest/linalg_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -1251,7 +1251,7 @@ Functions</h2></td></tr> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! 
--> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_9e42ee0a0235722f482630aa6cc99334.html">linalg</a></li><li class="navelem"><a class="el" href="linalg_8sql__in.html">linalg.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/linear_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/linear_8sql__in.html b/docs/latest/linear_8sql__in.html index 8936a22..53de899 100644 --- a/docs/latest/linear_8sql__in.html +++ b/docs/latest/linear_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -323,11 +323,11 @@ Functions</h2></td></tr> <dl class="section user"><dt></dt><dd>To include an intercept in the model, set one coordinate in the <code>independentVariables</code> array to 1.</dd></dl> <dl class="section return"><dt>Returns</dt><dd>A composite value:<ul> <li><code>coef FLOAT8[]</code> - Array of coefficients, <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/></li> 
-<li><code>r2 FLOAT8</code> - Coefficient of determination, <img class="formulaInl" alt="$ R^2 $" src="form_336.png"/></li> -<li><code>std_err FLOAT8[]</code> - Array of standard errors, <img class="formulaInl" alt="$ \mathit{se}(c_1), \dots, \mathit{se}(c_k) $" src="form_349.png"/></li> -<li><code>t_stats FLOAT8[]</code> - Array of t-statistics, <img class="formulaInl" alt="$ \boldsymbol t $" src="form_350.png"/></li> -<li><code>p_values FLOAT8[]</code> - Array of p-values, <img class="formulaInl" alt="$ \boldsymbol p $" src="form_351.png"/></li> -<li><code>condition_no FLOAT8</code> - The condition number of matrix <img class="formulaInl" alt="$ X^T X $" src="form_352.png"/>.</li> +<li><code>r2 FLOAT8</code> - Coefficient of determination, <img class="formulaInl" alt="$ R^2 $" src="form_337.png"/></li> +<li><code>std_err FLOAT8[]</code> - Array of standard errors, <img class="formulaInl" alt="$ \mathit{se}(c_1), \dots, \mathit{se}(c_k) $" src="form_350.png"/></li> +<li><code>t_stats FLOAT8[]</code> - Array of t-statistics, <img class="formulaInl" alt="$ \boldsymbol t $" src="form_351.png"/></li> +<li><code>p_values FLOAT8[]</code> - Array of p-values, <img class="formulaInl" alt="$ \boldsymbol p $" src="form_352.png"/></li> +<li><code>condition_no FLOAT8</code> - The condition number of matrix <img class="formulaInl" alt="$ X^T X $" src="form_353.png"/>.</li> </ul> </dd></dl> <dl class="section user"><dt>Usage</dt><dd><ul> @@ -339,7 +339,7 @@ FROM <em>sourceName</em>;</pre></li> <pre>SELECT (linregr(<em>dependentVariable</em>, <em>independentVariables</em>)).coef FROM <em>sourceName</em>;</pre></li> -<li>Get a subset of the output columns, e.g., only the array of coefficients <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/>, the coefficient of determination <img class="formulaInl" alt="$ R^2 $" src="form_336.png"/>, and the array of p-values <img class="formulaInl" alt="$ \boldsymbol p $" src="form_351.png"/>: <pre>SELECT (lr).coef, (lr).r2, 
(lr).p_values +<li>Get a subset of the output columns, e.g., only the array of coefficients <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/>, the coefficient of determination <img class="formulaInl" alt="$ R^2 $" src="form_337.png"/>, and the array of p-values <img class="formulaInl" alt="$ \boldsymbol p $" src="form_352.png"/>: <pre>SELECT (lr).coef, (lr).r2, (lr).p_values FROM ( SELECT linregr( <em>dependentVariable</em>, <em>independentVariables</em>) AS lr @@ -659,7 +659,7 @@ FROM ( <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_ac52a4b89b7b1b1591f2952b5cbd041a.html">regress</a></li><li class="navelem"><a class="el" href="linear_8sql__in.html">linear.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/lmf_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/lmf_8sql__in.html b/docs/latest/lmf_8sql__in.html index f86e47d..c639f6b 100644 --- a/docs/latest/lmf_8sql__in.html +++ b/docs/latest/lmf_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div 
id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -845,7 +845,7 @@ Functions</h2></td></tr> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_93c42bb4df0f3e1302223b6dfd48c81e.html">convex</a></li><li class="navelem"><a class="el" href="lmf_8sql__in.html">lmf.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul> http://git-wip-us.apache.org/repos/asf/incubator-madlib-site/blob/bed9253d/docs/latest/logistic_8sql__in.html ---------------------------------------------------------------------- diff --git a/docs/latest/logistic_8sql__in.html b/docs/latest/logistic_8sql__in.html index bcf2ec5..99ee367 100644 --- a/docs/latest/logistic_8sql__in.html +++ b/docs/latest/logistic_8sql__in.html @@ -47,7 +47,7 @@ <td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> <td style="padding-left: 0.5em;"> <div id="projectname"> - <span id="projectnumber">1.9</span> + <span id="projectnumber">1.9.1</span> </div> <div id="projectbrief">User Documentation for MADlib</div> </td> @@ -673,9 +673,9 @@ Functions</h2></td></tr> </table> </dd> </dl> -<dl class="section return"><dt>Returns</dt><dd><img 
class="formulaInl" alt="$ \frac{1}{1 + \exp(-x)} $" src="form_360.png"/></dd></dl> +<dl class="section return"><dt>Returns</dt><dd><img class="formulaInl" alt="$ \frac{1}{1 + \exp(-x)} $" src="form_361.png"/></dd></dl> <p>Evaluating this expression directly can lead to under- or overflows. This function performs the evaluation in a safe manner, making use of the following observations:</p> -<p>In order for the outcome of <img class="formulaInl" alt="$ \exp(x) $" src="form_361.png"/> to be within the range of the minimum positive double-precision number (i.e., <img class="formulaInl" alt="$ 2^{-1074} $" src="form_362.png"/>) and the maximum positive double-precision number (i.e., <img class="formulaInl" alt="$ (1 + (1 - 2^{52})) * 2^{1023}) $" src="form_363.png"/>, <img class="formulaInl" alt="$ x $" src="form_178.png"/> has to be within the natural logarithm of these numbers, so roughly in between -744 and 709. However, <img class="formulaInl" alt="$ 1 + \exp(x) $" src="form_364.png"/> will just evaluate to 1 if <img class="formulaInl" alt="$ \exp(x) $" src="form_361.png"/> is less than the machine epsilon (i.e., <img class="formulaInl" alt="$ 2^{-52} $" src="form_365.png"/>) or, equivalently, if <img class="formulaInl" alt="$ x $" src="form_178.png"/> is less than the natural logarithm of that; i.e., in any case if <img class="formulaInl" alt="$ x $" src="form_178.png"/> is less than -37. Note that taking the reciprocal of the largest double-precision number will not cause an underflow. Hence, no further checks are necessary. 
</p> +<p>In order for the outcome of <img class="formulaInl" alt="$ \exp(x) $" src="form_362.png"/> to be within the range of the minimum positive double-precision number (i.e., <img class="formulaInl" alt="$ 2^{-1074} $" src="form_363.png"/>) and the maximum positive double-precision number (i.e., <img class="formulaInl" alt="$ (1 + (1 - 2^{52})) * 2^{1023}) $" src="form_364.png"/>, <img class="formulaInl" alt="$ x $" src="form_178.png"/> has to be within the natural logarithm of these numbers, so roughly in between -744 and 709. However, <img class="formulaInl" alt="$ 1 + \exp(x) $" src="form_365.png"/> will just evaluate to 1 if <img class="formulaInl" alt="$ \exp(x) $" src="form_362.png"/> is less than the machine epsilon (i.e., <img class="formulaInl" alt="$ 2^{-52} $" src="form_366.png"/>) or, equivalently, if <img class="formulaInl" alt="$ x $" src="form_178.png"/> is less than the natural logarithm of that; i.e., in any case if <img class="formulaInl" alt="$ x $" src="form_178.png"/> is less than -37. Note that taking the reciprocal of the largest double-precision number will not cause an underflow. Hence, no further checks are necessary. 
</p> </div> </div> @@ -884,11 +884,11 @@ Functions</h2></td></tr> - <tt>coef FLOAT8[]</tt> - Array of coefficients, \form#78 - <tt>log_likelihood FLOAT8</tt> - Log-likelihood \form#79 - <tt>std_err FLOAT8[]</tt> - Array of standard errors, -</pre> <img class="formulaInl" alt="$ \mathit{se}(c_1), \dots, \mathit{se}(c_k) $" src="form_349.png"/><ul> -<li><code>z_stats FLOAT8[]</code> - Array of Wald z-statistics, <img class="formulaInl" alt="$ \boldsymbol z $" src="form_357.png"/></li> -<li><code>p_values FLOAT8[]</code> - Array of Wald p-values, <img class="formulaInl" alt="$ \boldsymbol p $" src="form_351.png"/></li> -<li><code>odds_ratios FLOAT8[]</code>: Array of odds ratios, <img class="formulaInl" alt="$ \mathit{odds}(c_1), \dots, \mathit{odds}(c_k) $" src="form_358.png"/></li> -<li><code>condition_no FLOAT8</code> - The condition number of matrix <img class="formulaInl" alt="$ X^T A X $" src="form_359.png"/> during the iteration immediately <em>preceding</em> convergence (i.e., <img class="formulaInl" alt="$ A $" src="form_13.png"/> is computed using the coefficients of the previous iteration) </li> +</pre> <img class="formulaInl" alt="$ \mathit{se}(c_1), \dots, \mathit{se}(c_k) $" src="form_350.png"/><ul> +<li><code>z_stats FLOAT8[]</code> - Array of Wald z-statistics, <img class="formulaInl" alt="$ \boldsymbol z $" src="form_358.png"/></li> +<li><code>p_values FLOAT8[]</code> - Array of Wald p-values, <img class="formulaInl" alt="$ \boldsymbol p $" src="form_352.png"/></li> +<li><code>odds_ratios FLOAT8[]</code>: Array of odds ratios, <img class="formulaInl" alt="$ \mathit{odds}(c_1), \dots, \mathit{odds}(c_k) $" src="form_359.png"/></li> +<li><code>condition_no FLOAT8</code> - The condition number of matrix <img class="formulaInl" alt="$ X^T A X $" src="form_360.png"/> during the iteration immediately <em>preceding</em> convergence (i.e., <img class="formulaInl" alt="$ A $" src="form_13.png"/> is computed using the coefficients of the previous iteration) 
</li> </ul> </td></tr> <tr><td class="paramname">dependent_varname</td><td>Name of the dependent column (of type BOOLEAN) </td></tr> @@ -909,7 +909,7 @@ Functions</h2></td></tr> </pre></li> <li>Get vector of coefficients <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/>:<br /> <pre>SELECT coef from outName;</pre></li> -<li>Get a subset of the output columns, e.g., only the array of coefficients <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/>, the log-likelihood of determination <img class="formulaInl" alt="$ l(\boldsymbol c) $" src="form_79.png"/>, and the array of p-values <img class="formulaInl" alt="$ \boldsymbol p $" src="form_351.png"/>: <pre>SELECT coef, log_likelihood, p_values FROM outName;</pre></li> +<li>Get a subset of the output columns, e.g., only the array of coefficients <img class="formulaInl" alt="$ \boldsymbol c $" src="form_78.png"/>, the log-likelihood of determination <img class="formulaInl" alt="$ l(\boldsymbol c) $" src="form_79.png"/>, and the array of p-values <img class="formulaInl" alt="$ \boldsymbol p $" src="form_352.png"/>: <pre>SELECT coef, log_likelihood, p_values FROM outName;</pre></li> </ul> </dd></dl> <dl class="section note"><dt>Note</dt><dd>This function starts an iterative algorithm. It is not an aggregate function. Source, output, and column names have to be passed as strings (due to limitations of the SQL syntax). </dd></dl> @@ -1203,7 +1203,7 @@ Functions</h2></td></tr> <div id="nav-path" class="navpath"><!-- id is needed for treeview function! 
--> <ul> <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_ac52a4b89b7b1b1591f2952b5cbd041a.html">regress</a></li><li class="navelem"><a class="el" href="logistic_8sql__in.html">logistic.sql_in</a></li> - <li class="footer">Generated on Thu Apr 7 2016 14:24:10 for MADlib by + <li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by <a href="http://www.doxygen.org/index.html"> <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li> </ul>
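Note on the logistic.sql_in hunk above: the documented observations about evaluating 1 / (1 + exp(-x)) safely can be illustrated with a small sketch. This is not MADlib's implementation. The function name `safe_sigmoid` and the constant `EPS_LOG_LIMIT` are hypothetical, and the cutoff of 37 is simply the rounded bound quoted in the documentation text (the natural logarithm of 2^-52 is about -36.04). The sketch assumes IEEE 754 double precision, which is what Python floats use on common platforms.

```python
import math

# Hypothetical cutoff, taken from the rounded bound in the documentation:
# exp(t) is below the machine epsilon 2**-52 whenever t < -37, so
# 1 + exp(t) rounds to exactly 1.0 in double precision.
EPS_LOG_LIMIT = 37.0


def safe_sigmoid(x: float) -> float:
    """Evaluate 1 / (1 + exp(-x)) without overflowing in exp()."""
    if x > EPS_LOG_LIMIT:
        # exp(-x) is below machine epsilon, so the denominator is exactly 1.
        return 1.0
    if x < -EPS_LOG_LIMIT:
        # Rewrite as exp(x) / (1 + exp(x)); the denominator rounds to 1, so
        # the result is exp(x). This avoids exp(-x), which would overflow
        # for x below roughly -709.
        return math.exp(x)
    # In the middle range exp(-x) is far from both overflow and underflow.
    return 1.0 / (1.0 + math.exp(-x))
```

A direct `1.0 / (1.0 + math.exp(-x))` raises `OverflowError` in Python for x below roughly -709, which is the case the second branch avoids; for very negative x the result underflows gradually to 0.0 instead.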