http://git-wip-us.apache.org/repos/asf/spark-website/blob/26b52712/site/docs/2.2.1/api/python/pyspark.mllib.html ---------------------------------------------------------------------- diff --git a/site/docs/2.2.1/api/python/pyspark.mllib.html b/site/docs/2.2.1/api/python/pyspark.mllib.html index cd27d38..baf0804 100644 --- a/site/docs/2.2.1/api/python/pyspark.mllib.html +++ b/site/docs/2.2.1/api/python/pyspark.mllib.html @@ -5,14 +5,14 @@ <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> - <title>pyspark.mllib package — PySpark documentation</title> + <title>pyspark.mllib package — PySpark 2.2.1 documentation</title> <link rel="stylesheet" href="_static/nature.css" type="text/css" /> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="_static/pyspark.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: './', - VERSION: '', + VERSION: '2.2.1', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true, @@ -35,7 +35,7 @@ <a href="pyspark.ml.html" title="pyspark.ml package" accesskey="P">previous</a></li> - <li class="nav-item nav-item-0"><a href="index.html">PySpark documentation</a> »</li> + <li class="nav-item nav-item-0"><a href="index.html">PySpark 2.2.1 documentation</a> »</li> <li class="nav-item nav-item-1"><a href="pyspark.html" accesskey="U">pyspark package</a> »</li> </ul> @@ -2633,7 +2633,7 @@ Compositionality.</p> <p>Querying for synonyms of a word will not return that word:</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="s2">"a"</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span 
class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'b', u'c']</span> +<span class="go">['b', 'c']</span> </pre></div> </div> <p>But querying for synonyms of a vector may return the word whose @@ -2641,7 +2641,7 @@ representation is that vector:</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">vec</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="s2">"a"</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="n">vec</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'a', u'b']</span> +<span class="go">['a', 'b']</span> </pre></div> </div> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">os</span><span class="o">,</span> <span class="nn">tempfile</span> @@ -2652,7 +2652,7 @@ representation is that vector:</p> <span class="go">True</span> <span class="gp">>>> </span><span class="n">syms</span> <span class="o">=</span> <span class="n">sameModel</span><span class="o">.</span><span class="n">findSynonyms</span><span class="p">(</span><span class="s2">"a"</span><span class="p">,</span> <span 
class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="p">[</span><span class="n">s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">s</span> <span class="ow">in</span> <span class="n">syms</span><span class="p">]</span> -<span class="go">[u'b', u'c']</span> +<span class="go">['b', 'c']</span> <span class="gp">>>> </span><span class="kn">from</span> <span class="nn">shutil</span> <span class="k">import</span> <span class="n">rmtree</span> <span class="gp">>>> </span><span class="k">try</span><span class="p">:</span> <span class="gp">... </span> <span class="n">rmtree</span><span class="p">(</span><span class="n">path</span><span class="p">)</span> @@ -3073,7 +3073,7 @@ using the Parallel FP-Growth algorithm.</p> <span class="gp">>>> </span><span class="n">rdd</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">model</span> <span class="o">=</span> <span class="n">FPGrowth</span><span class="o">.</span><span class="n">train</span><span class="p">(</span><span class="n">rdd</span><span class="p">,</span> <span class="mf">0.6</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">freqItemsets</span><span class="p">()</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[FreqItemset(items=[u'a'], freq=4), FreqItemset(items=[u'c'], freq=3), ...</span> +<span class="go">[FreqItemset(items=['a'], freq=4), FreqItemset(items=['c'], freq=3), ...</span> <span class="gp">>>> </span><span class="n">model_path</span> <span 
class="o">=</span> <span class="n">temp_path</span> <span class="o">+</span> <span class="s2">"/fpm"</span> <span class="gp">>>> </span><span class="n">model</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">sc</span><span class="p">,</span> <span class="n">model_path</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">sameModel</span> <span class="o">=</span> <span class="n">FPGrowthModel</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">sc</span><span class="p">,</span> <span class="n">model_path</span><span class="p">)</span> @@ -3171,7 +3171,7 @@ another iteration of distributed prefix growth is run. <span class="gp">>>> </span><span class="n">rdd</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">2</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">model</span> <span class="o">=</span> <span class="n">PrefixSpan</span><span class="o">.</span><span class="n">train</span><span class="p">(</span><span class="n">rdd</span><span class="p">)</span> <span class="gp">>>> </span><span class="nb">sorted</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">freqSequences</span><span class="p">()</span><span class="o">.</span><span class="n">collect</span><span class="p">())</span> -<span class="go">[FreqSequence(sequence=[[u'a']], freq=3), FreqSequence(sequence=[[u'a'], [u'a']], freq=1), ...</span> +<span class="go">[FreqSequence(sequence=[['a']], freq=3), FreqSequence(sequence=[['a'], ['a']], freq=1), ...</span> </pre></div> </div> <div class="versionadded"> @@ -5178,7 +5178,7 @@ distribution with the input mean.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.exponentialVectorRDD"> -<em class="property">static 
</em><code class="descname">exponentialVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.exponentialVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.exponentialVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">exponentialVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.exponentialVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.exponentialVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the Exponential distribution with the input mean.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5264,7 +5264,7 @@ distribution with the input shape and scale.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.gammaVectorRDD"> -<em class="property">static </em><code class="descname">gammaVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.gammaVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.gammaVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">gammaVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>shape</em>, <em>scale</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.gammaVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.gammaVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the Gamma distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5354,7 +5354,7 @@ distribution with the input mean and standard distribution.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.logNormalVectorRDD"> -<em class="property">static </em><code class="descname">logNormalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.logNormalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.logNormalVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">logNormalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>std</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.logNormalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.logNormalVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the log normal distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5440,7 +5440,7 @@ to some other normal N(mean, sigma^2), use <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.normalVectorRDD"> -<em class="property">static </em><code class="descname">normalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.normalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.normalVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">normalVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.normalVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.normalVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the standard normal distribution.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5518,7 +5518,7 @@ distribution with the input mean.</p> <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.poissonVectorRDD"> -<em class="property">static </em><code class="descname">poissonVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.poissonVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.poissonVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">poissonVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>mean</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.poissonVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.poissonVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the Poisson distribution with the input mean.</p> <table class="docutils field-list" frame="void" rules="none"> @@ -5602,7 +5602,7 @@ to U(a, b), use <dl class="staticmethod"> <dt id="pyspark.mllib.random.RandomRDDs.uniformVectorRDD"> -<em class="property">static </em><code class="descname">uniformVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>*a</em>, <em>**kw</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.uniformVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.uniformVectorRDD" title="Permalink to this definition">¶</a></dt> +<em class="property">static </em><code class="descname">uniformVectorRDD</code><span class="sig-paren">(</span><em>sc</em>, <em>numRows</em>, <em>numCols</em>, <em>numPartitions=None</em>, <em>seed=None</em><span class="sig-paren">)</span><a class="reference internal" href="_modules/pyspark/mllib/random.html#RandomRDDs.uniformVectorRDD"><span class="viewcode-link">[source]</span></a><a class="headerlink" href="#pyspark.mllib.random.RandomRDDs.uniformVectorRDD" title="Permalink to this definition">¶</a></dt> <dd><p>Generates an RDD comprised of vectors containing i.i.d. 
samples drawn from the uniform distribution U(0.0, 1.0).</p> <table class="docutils field-list" frame="void" rules="none"> @@ -6873,9 +6873,9 @@ of freedom, p-value, the method used, and the null hypothesis.</p> <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">pearson</span><span class="o">.</span><span class="n">pValue</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span> <span class="go">0.8187</span> <span class="gp">>>> </span><span class="n">pearson</span><span class="o">.</span><span class="n">method</span> -<span class="go">u'pearson'</span> +<span class="go">'pearson'</span> <span class="gp">>>> </span><span class="n">pearson</span><span class="o">.</span><span class="n">nullHypothesis</span> -<span class="go">u'observed follows the same distribution as expected.'</span> +<span class="go">'observed follows the same distribution as expected.'</span> </pre></div> </div> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">observed</span> <span class="o">=</span> <span class="n">Vectors</span><span class="o">.</span><span class="n">dense</span><span class="p">([</span><span class="mi">21</span><span class="p">,</span> <span class="mi">38</span><span class="p">,</span> <span class="mi">43</span><span class="p">,</span> <span class="mi">80</span><span class="p">])</span> @@ -7055,7 +7055,7 @@ the method used, and the null hypothesis.</p> <span class="gp">>>> </span><span class="nb">print</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">ksmodel</span><span class="o">.</span><span class="n">statistic</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span> <span class="go">0.175</span> <span class="gp">>>> </span><span class="n">ksmodel</span><span class="o">.</span><span 
class="n">nullHypothesis</span> -<span class="go">u'Sample follows theoretical distribution'</span> +<span class="go">'Sample follows theoretical distribution'</span> </pre></div> </div> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">data</span> <span class="o">=</span> <span class="n">sc</span><span class="o">.</span><span class="n">parallelize</span><span class="p">([</span><span class="mf">2.0</span><span class="p">,</span> <span class="mf">3.0</span><span class="p">,</span> <span class="mf">4.0</span><span class="p">])</span> @@ -8453,7 +8453,7 @@ this method throws an exception.</li> <a href="pyspark.ml.html" title="pyspark.ml package" >previous</a></li> - <li class="nav-item nav-item-0"><a href="index.html">PySpark documentation</a> »</li> + <li class="nav-item nav-item-0"><a href="index.html">PySpark 2.2.1 documentation</a> »</li> <li class="nav-item nav-item-1"><a href="pyspark.html" >pyspark package</a> »</li> </ul>