[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905084#comment-15905084
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3485


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904857#comment-15904857
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3485
  
Looks good, thanks! Merging this...


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904280#comment-15904280
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on the issue:

https://github.com/apache/flink/pull/3485
  
@StephanEwen  
FlatMapFunction "Tokenizer" has move to  
org.apache.flink.test.testfunctions.  
Help me to review this. Thanks.  
And collect other reusable functions to org.apache.flink.test.testfunctions 
later.  


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904258#comment-15904258
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on the issue:

https://github.com/apache/flink/pull/3485
  
OK, get it. So how about just do it now in flink-tests like this pull 
request?


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902736#comment-15902736
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3485
  
I don't think it is necessary to unify all example functions. These are 
just minimal non critical pieces defined by some tests. One we drop Java 7, 
most of them will anyways disappear and become Lambdas. So I would not overdo 
this right now...


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902310#comment-15902310
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on the issue:

https://github.com/apache/flink/pull/3485
  
@StephanEwen It's a good idea. We can remove many duplicate code in 
flink-tests in this way.
But I found it also duplicate in other modules. How to solve it in other 
modules.

![image](https://cloud.githubusercontent.com/assets/24708126/23732447/1f3cdd08-04ae-11e7-866e-a5a0e6d5e242.png)





> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901227#comment-15901227
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3485
  
How about we create a package `org.apache.flink.test.testfunctions` where 
we collect all shared reusable function implementations?


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901228#comment-15901228
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3485
  
Would be nice to follow up here and collect also functions like "identity 
mapper" or so in that package.


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900542#comment-15900542
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3485#discussion_r104826690
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java ---
@@ -0,0 +1,17 @@
+package org.apache.flink.test.util;
--- End diff --

@zentol fixed.


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899328#comment-15899328
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/3485#discussion_r104650335
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/util/Tokenizer.java ---
@@ -0,0 +1,17 @@
+package org.apache.flink.test.util;
--- End diff --

this file is missing the apache license.


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899276#comment-15899276
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on the issue:

https://github.com/apache/flink/pull/3485
  
@uce 
1. This pull request has squash in one commit.
2. JIRA has set affects version to 1.2.0

Help me to review this code, Thanks.


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899270#comment-15899270
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

GitHub user liuyuzhong7 opened a pull request:

https://github.com/apache/flink/pull/3485

FLINK-5976 [tests] Deduplicate Tokenizer in tests

There are some duplicate code like this in flink-test, I think refactor 
this will be better. 

```
public final class Tokenizer implements FlatMapFunction> {

@Override
public void flatMap(String value, Collector> 
out) {
// normalize and split the line
String[] tokens = value.toLowerCase().split("\\W+");

// emit the pairs
for (String token : tokens) {
if (token.length() > 0) {
out.collect(new Tuple2(token, 
1));
}
}
}
}
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liuyuzhong7/flink FLINK-5976

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3485


commit db4192110b1b610846f50527a5fc67450d3d1cca
Author: liuyuzhong7 
Date:   2017-03-07T11:15:40Z

FLINK-5976 [tests] Deduplicate Tokenizer in tests




> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898993#comment-15898993
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 closed the pull request at:

https://github.com/apache/flink/pull/3482


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898946#comment-15898946
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user liuyuzhong7 commented on the issue:

https://github.com/apache/flink/pull/3482
  
@uce Thanks for review. OK, I'll do it later.


> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.2.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 1.2.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5976) Refactoring duplicate Tokenizer in flink-test

2017-03-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898937#comment-15898937
 ] 

ASF GitHub Bot commented on FLINK-5976:
---

Github user uce commented on the issue:

https://github.com/apache/flink/pull/3482
  
Hey @liuyuzhong7! Thanks for the contribution. A few notes:
- Could you rebase your branch on top of the master branch. We don't do 
merge commits. After you've rebased the branch, I will review it (you can force 
push to your PR branch)
- Could you rename the commits and this PR to:
  ```
  [FLINK-5976] [tests] Deduplicate Tokenizer in tests
  ```
- Could you unset the `fix version` in the JIRA and set `affects version` 
to `1.2.0`?



> Refactoring duplicate Tokenizer in flink-test
> -
>
> Key: FLINK-5976
> URL: https://issues.apache.org/jira/browse/FLINK-5976
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 2.0.0
>Reporter: liuyuzhong7
>Priority: Minor
>  Labels: test
> Fix For: 2.0.0
>
>
> There are some duplicate code like this in flink-test, I think refactor this 
> will be better. 
> ```
> public final class Tokenizer implements FlatMapFunction Tuple2> {
>   @Override
>   public void flatMap(String value, Collector> 
> out) {
>   // normalize and split the line
>   String[] tokens = value.toLowerCase().split("\\W+");
>   // emit the pairs
>   for (String token : tokens) {
>   if (token.length() > 0) {
>   out.collect(new Tuple2(token, 
> 1));
>   }
>   }
>   }
> }
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)