GitHub user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/6093#issuecomment-101429818
[Test build #32528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32528/consoleFull) for PR 6093 at commit [`bc6058c`](https://github.com/apache/spark/commit/bc6058c42d66a66072597b3dfa12bca35540ccfc).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class LabeledSentence(label: Double, sentence: String)`
* `[Tokenization](http://en.wikipedia.org/wiki/Lexical_analysis#Tokenization) is the process of taking text (such as a sentence) and breaking it into individual terms (usually words). A simple [Tokenizer](api/scala/index.html#org.apache.spark.ml.feature.Tokenizer) class provides this functionality. The example below shows how to split sentences into sequences of words.`
* `public class JavaHashingTFSuite`
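
The quoted doc snippet above mentions an example of splitting sentences into words. For reference, a minimal Scala sketch of `Tokenizer` usage might look like the following; the `SparkSession` setup, object name, and sample sentences are illustrative assumptions, not part of this patch (at the time of this PR the DataFrame entry point was `SQLContext`; `SparkSession` is its later replacement):

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.SparkSession

object TokenizerSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the configuration is an assumption.
    val spark = SparkSession.builder()
      .appName("TokenizerSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data shaped like the LabeledSentence case class above.
    val sentences = Seq(
      (0.0, "Hi I heard about Spark"),
      (1.0, "Logistic regression models are neat")
    ).toDF("label", "sentence")

    // Tokenizer lowercases each sentence and splits it on whitespace.
    val tokenizer = new Tokenizer()
      .setInputCol("sentence")
      .setOutputCol("words")

    tokenizer.transform(sentences)
      .select("sentence", "words")
      .show(truncate = false)

    spark.stop()
  }
}
```

Running this prints each sentence alongside its resulting array of word tokens in the new `words` column.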