GitHub user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/6093#issuecomment-101429818
[Test build #32528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32528/consoleFull) for PR 6093 at commit [`bc6058c`](https://github.com/apache/spark/commit/bc6058c42d66a66072597b3dfa12bca35540ccfc).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
* `case class LabeledSentence(label: Double, sentence: String)`
* `[Tokenization](http://en.wikipedia.org/wiki/Lexical_analysis#Tokenization) is the process of taking text (such as a sentence) and breaking it into individual terms (usually words). A simple [Tokenizer](api/scala/index.html#org.apache.spark.ml.feature.Tokenizer) class provides this functionality. The example below shows how to split sentences into sequences of words.`
* `public class JavaHashingTFSuite`
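
The quoted doc snippet above mentions an example of splitting sentences into words. For reference, a minimal Scala sketch of `Tokenizer` usage might look like the following; the `SparkSession` setup, object name, and sample sentences are illustrative assumptions, not part of this patch (at the time of this PR the DataFrame entry point was `SQLContext`; `SparkSession` is its later replacement):

```scala
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.sql.SparkSession

object TokenizerSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; the configuration is an assumption.
    val spark = SparkSession.builder()
      .appName("TokenizerSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data shaped like the LabeledSentence case class above.
    val sentences = Seq(
      (0.0, "Hi I heard about Spark"),
      (1.0, "Logistic regression models are neat")
    ).toDF("label", "sentence")

    // Tokenizer lowercases each sentence and splits it on whitespace.
    val tokenizer = new Tokenizer()
      .setInputCol("sentence")
      .setOutputCol("words")

    tokenizer.transform(sentences)
      .select("sentence", "words")
      .show(truncate = false)

    spark.stop()
  }
}
```

Running this prints each sentence alongside its resulting array of word tokens in the new `words` column.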