[jira] [Commented] (FLINK-7423) Always reuse an instance to get elements from the inputFormat

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122926#comment-16122926
 ] 

ASF GitHub Bot commented on FLINK-7423:
---

GitHub user XuPingyong opened a pull request:

https://github.com/apache/flink/pull/4525

[FLINK-7423] Always reuse an instance to get elements from the inputFormat

## What is the purpose of the change

This pull request fixes a bug in how elements are fetched from the inputFormat 
in InputFormatSourceFunction.java


## Verifying this change

This change added tests and can be verified as follows:

  - *Added test in InputFormatSourceFunctionTest*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuPingyong/flink FLINK-7423

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4525


commit ddca0f9d258eed4bc69bf931aebc1bbde385c799
Author: pingyong.xpy 
Date:   2017-08-11T06:09:12Z

[FLINK-7423] Always reuse an instance to get elements from the inputFormat




> Always reuse an instance  to get elements from the inputFormat 
> ---
>
> Key: FLINK-7423
> URL: https://issues.apache.org/jira/browse/FLINK-7423
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>
> In InputFormatSourceFunction.java:
> {code:java}
> OUT nextElement = serializer.createInstance();
> while (isRunning) {
>     format.open(splitIterator.next());
>     // for each element we also check if cancel
>     // was called by checking the isRunning flag
>     while (isRunning && !format.reachedEnd()) {
>         nextElement = format.nextRecord(nextElement);
>         if (nextElement != null) {
>             ctx.collect(nextElement);
>         } else {
>             break;
>         }
>     }
>     format.close();
>     completedSplitsCounter.inc();
>     if (isRunning) {
>         isRunning = splitIterator.hasNext();
>     }
> }
> {code}
> the format may return a different element or null from nextRecord(), which 
> may cause an exception.
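
For context, one possible shape of the fix (a sketch only; the actual change in 
PR #4525 may differ) is to keep a dedicated reuse instance and always pass that 
same instance to nextRecord(), instead of feeding back whatever the format 
returned last:

{code:java}
// Sketch, not the actual patch: always hand the SAME reuse instance to the
// format, and collect whatever nextRecord() returns (which may be a
// different object, or null at the end of the split).
OUT reuse = serializer.createInstance();
while (isRunning) {
    format.open(splitIterator.next());
    while (isRunning && !format.reachedEnd()) {
        OUT returned = format.nextRecord(reuse);
        if (returned != null) {
            ctx.collect(returned);
        } else {
            break;
        }
    }
    format.close();
    completedSplitsCounter.inc();
    if (isRunning) {
        isRunning = splitIterator.hasNext();
    }
}
{code}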



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (FLINK-7423) Always reuse an instance to get elements from the inputFormat

2017-08-10 Thread Xu Pingyong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Pingyong updated FLINK-7423:
---
Summary: Always reuse an instance  to get elements from the inputFormat   
(was: Always reuse an instance  to get elements from an inputFormat )

> Always reuse an instance  to get elements from the inputFormat 
> ---
>
> Key: FLINK-7423
> URL: https://issues.apache.org/jira/browse/FLINK-7423
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>
> In InputFormatSourceFunction.java:
> {code:java}
> OUT nextElement = serializer.createInstance();
> while (isRunning) {
>     format.open(splitIterator.next());
>     // for each element we also check if cancel
>     // was called by checking the isRunning flag
>     while (isRunning && !format.reachedEnd()) {
>         nextElement = format.nextRecord(nextElement);
>         if (nextElement != null) {
>             ctx.collect(nextElement);
>         } else {
>             break;
>         }
>     }
>     format.close();
>     completedSplitsCounter.inc();
>     if (isRunning) {
>         isRunning = splitIterator.hasNext();
>     }
> }
> {code}
> the format may return a different element or null from nextRecord(), which 
> may cause an exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7417) Create flink-shaded-jackson

2017-08-10 Thread Fang Yong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-7417:


Assignee: (was: Fang Yong)

> Create flink-shaded-jackson
> ---
>
> Key: FLINK-7417
> URL: https://issues.apache.org/jira/browse/FLINK-7417
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> The {{com.fasterxml:jackson}} library is another culprit of frequent 
> conflicts that we need to shade away.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7417) Create flink-shaded-jackson

2017-08-10 Thread Fang Yong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-7417:


Assignee: Fang Yong

> Create flink-shaded-jackson
> ---
>
> Key: FLINK-7417
> URL: https://issues.apache.org/jira/browse/FLINK-7417
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Fang Yong
> Fix For: 1.4.0
>
>
> The {{com.fasterxml:jackson}} library is another culprit of frequent 
> conflicts that we need to shade away.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7419) Shade jackson dependency in flink-avro

2017-08-10 Thread Fang Yong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang Yong reassigned FLINK-7419:


Assignee: Fang Yong

> Shade jackson dependency in flink-avro
> --
>
> Key: FLINK-7419
> URL: https://issues.apache.org/jira/browse/FLINK-7419
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Fang Yong
> Fix For: 1.4.0
>
>
> Avro uses {{org.codehaus.jackson}}, which also exists in multiple 
> incompatible versions. We should shade it to 
> {{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7419) Shade jackson dependency in flink-avro

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122847#comment-16122847
 ] 

ASF GitHub Bot commented on FLINK-7419:
---

GitHub user zjureel opened a pull request:

https://github.com/apache/flink/pull/4524

[FLINK-7419] Shade jackson dependency in flink-avro

## What is the purpose of the change

Shade jackson dependency in flink-avro to avoid incompatible versions

## Brief change log

  - *Shade jackson in pom.xml of flink-avro project*


## Verifying this change

No test case

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable / docs / 
JavaDocs / not documented)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zjureel/flink FLINK-7419

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4524


commit da6ceca91b53e9fac120dfc00219de35927215de
Author: zjureel 
Date:   2017-08-11T03:35:02Z

[FLINK-7419] Shade jackson dependency in flink-avro




> Shade jackson dependency in flink-avro
> --
>
> Key: FLINK-7419
> URL: https://issues.apache.org/jira/browse/FLINK-7419
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> Avro uses {{org.codehaus.jackson}}, which also exists in multiple 
> incompatible versions. We should shade it to 
> {{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (FLINK-7396) Don't put multiple directories in HADOOP_CONF_DIR in config.sh

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122845#comment-16122845
 ] 

ASF GitHub Bot commented on FLINK-7396:
---

Github user zjureel commented on the issue:

https://github.com/apache/flink/pull/4511
  
@aljoscha Sorry it's my fault, I have fixed it, thanks :)


> Don't put multiple directories in HADOOP_CONF_DIR in config.sh
> --
>
> Key: FLINK-7396
> URL: https://issues.apache.org/jira/browse/FLINK-7396
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Aljoscha Krettek
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> In config.sh we do this:
> {code}
> # Check if deprecated HADOOP_HOME is set.
> if [ -n "$HADOOP_HOME" ]; then
>     # HADOOP_HOME is set. Check if it's a Hadoop 1.x or 2.x HADOOP_HOME path.
>     if [ -d "$HADOOP_HOME/conf" ]; then
>         # It's Hadoop 1.x
>         HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HADOOP_HOME/conf"
>     fi
>     if [ -d "$HADOOP_HOME/etc/hadoop" ]; then
>         # It's Hadoop 2.2+
>         HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HADOOP_HOME/etc/hadoop"
>     fi
> fi
> {code}
> while our {{HadoopFileSystem}} actually treats this value as a single path, 
> not as a colon-separated list of paths: 
> https://github.com/apache/flink/blob/854b05376a459a6197e41e141bb28a9befe481ad/flink-runtime/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopFileSystem.java#L236
> I also think that other tools don't expect multiple paths there, and at 
> least one user ran into this problem on their setup.
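
As an illustration of the single-path assumption (this is not the actual 
{{HadoopFileSystem}} code; the paths are made up):

{code:java}
import java.io.File;

public class HadoopConfDirDemo {
    public static void main(String[] args) {
        // Hypothetical value produced by the config.sh snippet above.
        String hadoopConfDir = "/opt/hadoop/conf:/opt/hadoop/etc/hadoop";
        // A consumer that treats the value as ONE directory ends up probing
        // "/opt/hadoop/conf:/opt/hadoop/etc/hadoop/core-site.xml",
        // which does not exist.
        File coreSite = new File(hadoopConfDir, "core-site.xml");
        System.out.println(coreSite.exists()); // false
    }
}
{code}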



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Created] (FLINK-7425) Support Map in TypeExtractor

2017-08-10 Thread Timo Walther (JIRA)
Timo Walther created FLINK-7425:
---

 Summary: Support Map in TypeExtractor
 Key: FLINK-7425
 URL: https://issues.apache.org/jira/browse/FLINK-7425
 Project: Flink
  Issue Type: Improvement
  Components: Type Serialization System
Reporter: Timo Walther


If you print {{new TypeHint<Map<String, Integer>>(){}.getTypeInfo()}} you will 
see that even this is GenericTypeInfo. This is confusing, but maps were not 
supported initially. It would be good to officially support all major Java 
collections. I don't know if we can change this behavior because of backwards 
compatibility.

See discussion here: 
https://stackoverflow.com/questions/45621542/flink-sql-api-with-map-types-java/45622438#45622438
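
A minimal sketch of the reported behavior (the concrete map type parameters 
are assumed for illustration):

{code:java}
import java.util.Map;

import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;

public class MapTypeHintDemo {
    public static void main(String[] args) {
        TypeInformation<Map<String, Integer>> info =
                TypeInformation.of(new TypeHint<Map<String, Integer>>() {});
        // Prints a GenericTypeInfo (the Kryo fallback) rather than a dedicated
        // map type information, which is the confusing behavior described above.
        System.out.println(info);
    }
}
{code}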



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7424) `CEP` component makes `KeyedStream` choose the wrong channel

2017-08-10 Thread Benedict Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Jin updated FLINK-7424:

Description: 
`CEP` component makes `KeyedStream` choose the wrong channel.

The original KeySelector is perfectly correct:
{code:java}
public static KeySelector<HBaseServerLog, Integer> buildKeySelector() {
    return (KeySelector<HBaseServerLog, Integer>) log -> {
        if (log == null) return 0;
        Integer flumeId;
        if ((flumeId = log.getFlumeId()) == null) return 1;
        return flumeId;
    };
}
{code}

After some changes, it throws a "Key group index out of range of key group 
range [16, 32)" exception:
{code:java}
public static KeySelector<HBaseServerLog, Integer> buildKeySelector(final int parallelism) {
    return new KeySelector<HBaseServerLog, Integer>() {
        private Random r = new Random(System.nanoTime());

        @Override
        public Integer getKey(HBaseServerLog log) throws Exception {
            if (log == null) return 0;
            Integer flumeId;
            if ((flumeId = log.getFlumeId()) == null) return 1;
            return Math.max(flumeId + (r.nextBoolean() ? 0 : -1 * r.nextInt(parallelism)), 0);
        }
    };
}
{code}

But after the {{MathUtils.murmurHash(keyHash) % maxParallelism}} processing, it 
shouldn't go wrong. Actually, when we add some `CEP` component 
(IterativeCondition/PatternFlatSelectFunction) code after it, it makes the 
{{KeySelector}} choose the wrong channel and throw an IllegalArgumentException.



> `CEP` component makes `KeyedStream` choose the wrong channel
> ---
>
> Key: FLINK-7424
> URL: https://issues.apache.org/jira/browse/FLINK-7424
> Project: Flink
>  Issue Type: Bug
>  Components: CEP, Streaming
>Reporter: Benedict Jin
>Assignee: Benedict Jin
>
> `CEP` component makes `KeyedStream` choose the wrong channel.
> The original KeySelector is perfectly correct:
> {code:java}
> public static KeySelector<HBaseServerLog, Integer> buildKeySelector() {
>     return (KeySelector<HBaseServerLog, Integer>) log -> {
>         if (log == null) return 0;
>         Integer flumeId;
>         if ((flumeId = log.getFlumeId()) == null) return 1;
>         return flumeId;
>     };
> }
> {code}
> After some changes, it throws a "Key group index out of range of key group 
> range [16, 32)" exception:
> {code:java}
> public static KeySelector<HBaseServerLog, Integer> buildKeySelector(final int parallelism) {
>     return new KeySelector<HBaseServerLog, Integer>() {
>         private Random r = new Random(System.nanoTime());
>
>         @Override
>         public Integer getKey(HBaseServerLog log) throws Exception {
>             if (log == null) return 0;
>             Integer flumeId;
>             if ((flumeId = log.getFlumeId()) == null) return 1;
>             return Math.max(flumeId + (r.nextBoolean() ? 0 : -1 * r.nextInt(parallelism)), 0);
>         }
>     };
> }
> {code}
> But after the {{MathUtils.murmurHash(keyHash) % maxParallelism}} processing, 
> it shouldn't go wrong. Actually, when we add some `CEP` component 
> (IterativeCondition/PatternFlatSelectFunction) code after it, it makes the 
> KeySelector choose the wrong channel and throw an IllegalArgumentException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Created] (FLINK-7424) `CEP` component makes `KeyedStream` choose the wrong channel

2017-08-10 Thread Benedict Jin (JIRA)
Benedict Jin created FLINK-7424:
---

 Summary: `CEP` component makes `KeyedStream` choose the wrong channel
 Key: FLINK-7424
 URL: https://issues.apache.org/jira/browse/FLINK-7424
 Project: Flink
  Issue Type: Bug
  Components: CEP, Streaming
Reporter: Benedict Jin
Assignee: Benedict Jin


`CEP` component makes `KeyedStream` choose the wrong channel.

The original KeySelector is perfectly correct:
{code:java}
public static KeySelector<HBaseServerLog, Integer> buildKeySelector() {
    return (KeySelector<HBaseServerLog, Integer>) log -> {
        if (log == null) return 0;
        Integer flumeId;
        if ((flumeId = log.getFlumeId()) == null) return 1;
        return flumeId;
    };
}
{code}

After some changes, it throws a "Key group index out of range of key group 
range [16, 32)" exception:
{code:java}
public static KeySelector<HBaseServerLog, Integer> buildKeySelector(final int parallelism) {
    return new KeySelector<HBaseServerLog, Integer>() {
        private Random r = new Random(System.nanoTime());

        @Override
        public Integer getKey(HBaseServerLog log) throws Exception {
            if (log == null) return 0;
            Integer flumeId;
            if ((flumeId = log.getFlumeId()) == null) return 1;
            return Math.max(flumeId + (r.nextBoolean() ? 0 : -1 * r.nextInt(parallelism)), 0);
        }
    };
}
{code}

But after the {{MathUtils.murmurHash(keyHash) % maxParallelism}} processing, it 
shouldn't go wrong. Actually, when we add some `CEP` component 
(IterativeCondition/PatternFlatSelectFunction) code after it, it makes the 
{{KeySelector}} choose the wrong channel and throw an IllegalArgumentException.
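
To see why a non-deterministic KeySelector breaks routing, here is a simplified 
sketch of the key-group contract (the real logic lives in Flink's 
KeyGroupRangeAssignment; the key values below are illustrative):

{code:java}
import org.apache.flink.util.MathUtils;

public class KeyGroupDemo {
    // Simplified: the sender picks the target channel from the extracted key,
    // and the receiving operator re-derives the key group from the SAME key to
    // locate its state. If getKey() returns different values for the same
    // record (e.g. because it uses Random), sender and receiver disagree, and
    // the receiver can see a key group outside its assigned range such as
    // [16, 32) -> IllegalArgumentException.
    static int keyGroupFor(Object key, int maxParallelism) {
        return MathUtils.murmurHash(key.hashCode()) % maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        Integer keyOnSend = 7;      // what getKey() returned when routing
        Integer keyOnReceive = 5;   // what getKey() returns the second time
        System.out.println(keyGroupFor(keyOnSend, maxParallelism));
        System.out.println(keyGroupFor(keyOnReceive, maxParallelism));
        // Different key groups: the record was routed to a channel whose
        // key-group range does not contain the recomputed key group.
    }
}
{code}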



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7423) Always reuse an instance to get elements from an inputFormat

2017-08-10 Thread Xu Pingyong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Pingyong updated FLINK-7423:
---
Component/s: Streaming

> Always reuse an instance  to get elements from an inputFormat 
> --
>
> Key: FLINK-7423
> URL: https://issues.apache.org/jira/browse/FLINK-7423
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>
> In InputFormatSourceFunction.java:
> {code:java}
> OUT nextElement = serializer.createInstance();
> while (isRunning) {
>     format.open(splitIterator.next());
>     // for each element we also check if cancel
>     // was called by checking the isRunning flag
>     while (isRunning && !format.reachedEnd()) {
>         nextElement = format.nextRecord(nextElement);
>         if (nextElement != null) {
>             ctx.collect(nextElement);
>         } else {
>             break;
>         }
>     }
>     format.close();
>     completedSplitsCounter.inc();
>     if (isRunning) {
>         isRunning = splitIterator.hasNext();
>     }
> }
> {code}
> the format may return a different element or null from nextRecord(), which 
> may cause an exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7423) Always reuse an instance to get elements from an inputFormat

2017-08-10 Thread Xu Pingyong (JIRA)
Xu Pingyong created FLINK-7423:
--

 Summary: Always reuse an instance  to get elements from an 
inputFormat 
 Key: FLINK-7423
 URL: https://issues.apache.org/jira/browse/FLINK-7423
 Project: Flink
  Issue Type: Bug
Reporter: Xu Pingyong
Assignee: Xu Pingyong


In InputFormatSourceFunction.java:


{code:java}
OUT nextElement = serializer.createInstance();
while (isRunning) {
    format.open(splitIterator.next());

    // for each element we also check if cancel
    // was called by checking the isRunning flag

    while (isRunning && !format.reachedEnd()) {
        nextElement = format.nextRecord(nextElement);
        if (nextElement != null) {
            ctx.collect(nextElement);
        } else {
            break;
        }
    }
    format.close();
    completedSplitsCounter.inc();

    if (isRunning) {
        isRunning = splitIterator.hasNext();
    }
}
{code}

the format may return a different element or null from nextRecord(), which may 
cause an exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7123) Support timesOrMore in CEP

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122781#comment-16122781
 ] 

ASF GitHub Bot commented on FLINK-7123:
---

GitHub user dianfu opened a pull request:

https://github.com/apache/flink/pull/4523

[FLINK-7123] [cep] Support timesOrMore in CEP


## What is the purpose of the change

*This pull request adds timesOrMore to the CEP pattern API*


## Verifying this change

This change added tests and can be verified as follows:

  - *Added test in TimesOrMoreITCase*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (yes)
  - If yes, how is the feature documented? (docs)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dianfu/flink timesOrMore

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4523


commit 6ccbb0f53f1a860ae05b3d24c17408cf22a5aab0
Author: Dian Fu 
Date:   2017-08-11T03:29:01Z

[FLINK-7123] [cep] Support timesOrMore in CEP




> Support timesOrMore in CEP
> --
>
> Key: FLINK-7123
> URL: https://issues.apache.org/jira/browse/FLINK-7123
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> The CEP API should provide an API such as timesOrMore(#ofTimes) for the 
> quantifier {n,}.
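
For illustration, user code with the proposed API could look like this (the 
Event type, its getName() accessor, and the pattern name are assumptions, not 
part of the PR):

{code:java}
// Illustrative usage of the proposed timesOrMore(n), i.e. the {3,} quantifier:
// match three or more "a" events.
Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event event) {
            return "a".equals(event.getName());
        }
    })
    .timesOrMore(3);
{code}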



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Comment Edited] (FLINK-6805) Flink Cassandra connector dependency on Netty disagrees with Flink

2017-08-10 Thread Michael Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120885#comment-16120885
 ] 

Michael Fong edited comment on FLINK-6805 at 8/11/17 2:50 AM:
--

Thanks for your comment, [~Zentol], 

One question above all: since Flink's shaded netty renames the classes' FQNs to 
a specified class path, and would thus pin the netty implementation to 4.0.27 
as required by Flink, would there be any need to consolidate the other netty 
versions used in other libraries? In this case, cassandra-driver-core uses 
netty 4.0.33.

Could you please share the example you mentioned for guava, and explain how it 
would prevent potential conflicts in future upgrades. Thanks.

Regards,

Michael Fong


was (Author: mcfongtw):
Thanks for your comment, [~Zentol], 

I see the point of Flink's shaded netty and relocating the original netty to 
the Flink shaded netty dependency, but I don't quite get point 2). Could you 
please share the example you mentioned for guava, and explain how it would 
prevent potential conflicts in future upgrades if the shading approach is not 
used. Thanks.

Regards,

Michael Fong

> Flink Cassandra connector dependency on Netty disagrees with Flink
> --
>
> Key: FLINK-6805
> URL: https://issues.apache.org/jira/browse/FLINK-6805
> Project: Flink
>  Issue Type: Bug
>  Components: Cassandra Connector
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Shannon Carey
> Fix For: 1.4.0
>
>
> The Flink Cassandra connector has a dependency on Netty libraries (via 
> promotion of transitive dependencies by the Maven shade plugin) at version 
> 4.0.33.Final, which disagrees with the version included in Flink of 
> 4.0.27.Final which is included & managed by the parent POM via dependency on 
> netty-all.
> Due to use of netty-all, the dependency management doesn't take effect on the 
> individual libraries such as netty-handler, netty-codec, etc.
> I suggest that dependency management of Netty should be added for all Netty 
> libraries individually (netty-handler, etc.) so that all Flink modules use 
> the same version, and similarly I suggest that exclusions be added to the 
> quickstart example POM for the individual Netty libraries so that fat JARs 
> don't include conflicting versions of Netty.
> It seems like this problem started when FLINK-6084 was implemented: 
> transitive dependencies of the flink-connector-cassandra were previously 
> omitted, and now that they are included we must make sure that they agree 
> with the Flink distribution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (FLINK-7123) Support timesOrMore in CEP

2017-08-10 Thread Dian Fu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu reopened FLINK-7123:


Reopening this JIRA as I think it's useful. Although we can achieve this via 
{{times(n-1)}} followed by {{oneOrMore}}, we would have to give each pattern a 
unique pattern name. For this case, these two patterns should actually share 
the same pattern name. A sketch of the workaround follows below.
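
For reference, a sketch of that workaround (the Event type and the condition 
are illustrative): expressing {3,} without timesOrMore forces two distinct 
pattern names for what is logically one pattern.

{code:java}
// {3,} without timesOrMore: exactly two "a"s, then one or more further "a"s.
// The two steps must carry distinct names ("start" and "more").
Pattern<Event, ?> workaround = Pattern.<Event>begin("start")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event event) {
            return "a".equals(event.getName());
        }
    })
    .times(2)
    .next("more")
    .where(new SimpleCondition<Event>() {
        @Override
        public boolean filter(Event event) {
            return "a".equals(event.getName());
        }
    })
    .oneOrMore();
{code}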

> Support timesOrMore in CEP
> --
>
> Key: FLINK-7123
> URL: https://issues.apache.org/jira/browse/FLINK-7123
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> The CEP API should provide an API such as timesOrMore(#ofTimes) for the 
> quantifier {n,}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7062) Support the basic functionality of MATCH_RECOGNIZE

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122668#comment-16122668
 ] 

ASF GitHub Bot commented on FLINK-7062:
---

Github user dianfu commented on the issue:

https://github.com/apache/flink/pull/4502
  
@fhueske Great, thanks.


> Support the basic functionality of MATCH_RECOGNIZE
> --
>
> Key: FLINK-7062
> URL: https://issues.apache.org/jira/browse/FLINK-7062
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, Table API & SQL
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> In this JIRA, we will support the basic functionality of {{MATCH_RECOGNIZE}} 
> in the Flink SQL API, including support for the {{MEASURES}}, {{PATTERN}} 
> and {{DEFINE}} syntax. This would allow users to write basic CEP use cases 
> in SQL, like the following example:
> {code}
> SELECT T.aid, T.bid, T.cid
> FROM MyTable
> MATCH_RECOGNIZE (
>   MEASURES
> A.id AS aid,
> B.id AS bid,
> C.id AS cid
>   PATTERN (A B C)
>   DEFINE
> A AS A.name = 'a',
> B AS B.name = 'b',
> C AS C.name = 'c'
> ) AS T
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Updated] (FLINK-7367) Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, MaxConnections, RequestTimeout, etc)

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7367:

Description: 
Right now, FlinkKinesisProducer exposes only two configs for the underlying 
KinesisProducer:

- AGGREGATION_MAX_COUNT
- COLLECTION_MAX_COUNT

However, according to the [AWS 
doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] and 
[their sample on 
github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
 developers can set more of them to make the most of KinesisProducer and to 
make it fault tolerant (e.g., by increasing timeouts).

I selected a few more configs that we need when using Flink with Kinesis:

- MAX_CONNECTIONS
- RATE_LIMIT
- RECORD_MAX_BUFFERED_TIME
- RECORD_TIME_TO_LIVE
- REQUEST_TIMEOUT

Flink currently uses KPL's default values. They make Flink write too fast to 
Kinesis, which fails Flink jobs too frequently. We need to parameterize 
FlinkKinesisProducer to pass in the above params, in order to slow down 
Flink's write rate to Kinesis.

  was:
Right now, FlinkKinesisProducer exposes only two configs for the underlying 
KinesisProducer:

- AGGREGATION_MAX_COUNT
- COLLECTION_MAX_COUNT

However, according to the [AWS 
doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] and 
[their sample on 
github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
 developers can set more of them to make the most of KinesisProducer and to 
make it fault tolerant (e.g., by increasing timeouts).

I selected a few more configs that we need when using Flink with Kinesis:

- MAX_CONNECTIONS
- RATE_LIMIT
- RECORD_MAX_BUFFERED_TIME
- RECORD_TIME_TO_LIVE
- REQUEST_TIMEOUT

We need to parameterize FlinkKinesisProducer to pass in the above params, in 
order to cater to our needs.


> Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, 
> MaxConnections, RequestTimeout, etc)
> ---
>
> Key: FLINK-7367
> URL: https://issues.apache.org/jira/browse/FLINK-7367
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.3.0
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> Right now, FlinkKinesisProducer exposes only two configs for the underlying 
> KinesisProducer:
> - AGGREGATION_MAX_COUNT
> - COLLECTION_MAX_COUNT
> However, according to the [AWS 
> doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] 
> and [their sample on 
> github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
>  developers can set more of them to make the most of KinesisProducer and to 
> make it fault tolerant (e.g., by increasing timeouts).
> I selected a few more configs that we need when using Flink with Kinesis:
> - MAX_CONNECTIONS
> - RATE_LIMIT
> - RECORD_MAX_BUFFERED_TIME
> - RECORD_TIME_TO_LIVE
> - REQUEST_TIMEOUT
> Flink currently uses KPL's default values. They make Flink write too fast to 
> Kinesis, which fails Flink jobs too frequently. We need to parameterize 
> FlinkKinesisProducer to pass in the above params, in order to slow down 
> Flink's write rate to Kinesis.
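
A sketch of what the parameterized configuration could look like from the 
user's side (the pass-through property keys below are hypothetical; the final 
naming depends on how the new options are exposed):

{code:java}
// Hypothetical sketch: pass extra KPL settings through the producer's
// Properties, alongside the usual region/credentials settings.
Properties config = new Properties();
config.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");
config.setProperty("RecordMaxBufferedTime", "500"); // ms
config.setProperty("MaxConnections", "12");
config.setProperty("RequestTimeout", "6000");       // ms
config.setProperty("RecordTtl", "30000");           // ms
config.setProperty("RateLimit", "80");              // % of shard limit

FlinkKinesisProducer<String> producer =
        new FlinkKinesisProducer<>(new SimpleStringSchema(), config);
{code}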



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7367) Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, MaxConnections, RequestTimeout, etc)

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7367:

Issue Type: Bug  (was: Improvement)

> Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, 
> MaxConnections, RequestTimeout, etc)
> ---
>
> Key: FLINK-7367
> URL: https://issues.apache.org/jira/browse/FLINK-7367
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.3.0
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> Right now, FlinkKinesisProducer exposes only two configs for the underlying 
> KinesisProducer:
> - AGGREGATION_MAX_COUNT
> - COLLECTION_MAX_COUNT
> However, according to the [AWS 
> doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] 
> and [their sample on 
> github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
>  developers can set more of them to make the most of KinesisProducer and to 
> make it fault tolerant (e.g., by increasing timeouts).
> I selected a few more configs that we need when using Flink with Kinesis:
> - MAX_CONNECTIONS
> - RATE_LIMIT
> - RECORD_MAX_BUFFERED_TIME
> - RECORD_TIME_TO_LIVE
> - REQUEST_TIMEOUT
> We need to parameterize FlinkKinesisProducer to pass in the above params, in 
> order to cater to our needs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7367) Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, MaxConnections, RequestTimeout, etc)

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7367:

Fix Version/s: 1.4.0

> Parameterize more configs for FlinkKinesisProducer (RecordMaxBufferedTime, 
> MaxConnections, RequestTimeout, etc)
> ---
>
> Key: FLINK-7367
> URL: https://issues.apache.org/jira/browse/FLINK-7367
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.3.0
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> Right now, FlinkKinesisProducer exposes only two configs for the underlying 
> KinesisProducer:
> - AGGREGATION_MAX_COUNT
> - COLLECTION_MAX_COUNT
> However, according to the [AWS 
> doc|http://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-config.html] 
> and [their sample on 
> github|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer-sample/default_config.properties],
>  developers can set more of them to make the most of KinesisProducer and to 
> make it fault tolerant (e.g., by increasing timeouts).
> I selected a few more configs that we need when using Flink with Kinesis:
> - MAX_CONNECTIONS
> - RATE_LIMIT
> - RECORD_MAX_BUFFERED_TIME
> - RECORD_TIME_TO_LIVE
> - REQUEST_TIMEOUT
> We need to parameterize FlinkKinesisProducer to pass in the above params, in 
> order to cater to our needs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 
We need to upgrade KPL and KCL to pick up the enhanced performance and 
stability for Flink to work better with Kinesis. Upgrading KPL is especially 
necessary, because the KPL version Flink uses is old and doesn't have good 
retry and error handling logic.

*KPL:*
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*" 0.12.5, the version we are upgrading to, was 
released in May 2017 and should have the enhanced retry logic.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171

  was:

We need to upgrade KPL and KCL to pick up the enhanced performance and 
stability for Flink to work better with Kinesis.

*KPL:*
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*" 0.12.5, the version we are upgrading to, was 
released in May 2017 and should have the enhanced retry logic.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171


> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> We need to upgrade KPL and KCL to pick up the enhanced performance and 
> stability for Flink to work better with Kinesis. Upgrading KPL is especially 
> necessary, because the KPL version Flink uses is old and doesn't have good 
> retry and error handling logic.
> *KPL:*
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*" 0.12.5, the version we are upgrading to, 
> was released in May 2017 and should have the enhanced retry logic.
> *KCL:*
> Upgrade KCL from 1.6.2 to 1.8.1
> *AWS SDK*
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Issue Type: Bug  (was: Improvement)

> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> We need to upgrade KPL and KCL to pick up the enhanced performance and 
> stability for Flink to work better with Kinesis.
> *KPL:*
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*" 0.12.5, the version we are upgrading to, 
> was released in May 2017 and should have the enhanced retry logic.
> *KCL:*
> Upgrade KCL from 1.6.2 to 1.8.1
> *AWS SDK*
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 

We need to upgrade KPL and KCL to pick up the enhanced performance and 
stability for Flink to work better with Kinesis.

*KPL:*
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*" 0.12.5, the version we are upgrading to, was 
released in May 2017 and should have the enhanced retry logic.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171

  was:
*KPL:*
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171


> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> We need to upgrade KPL and KCL to pick up the enhanced performance and 
> stability for Flink to work better with Kinesis.
> *KPL:*
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*" 0.12.5, the version we are upgrading to, 
> was released in May 2017 and should have the enhanced retry logic.
> *KCL:*
> Upgrade KCL from 1.6.2 to 1.8.1
> *AWS SDK*
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 
*KPL:*
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171

  was:
*KPL: *
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171


> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> *KPL:*
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.
> *KCL:*
> Upgrade KCL from 1.6.2 to 1.8.1
> *AWS SDK*
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 
*KPL: *
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

*KCL:*
Upgrade KCL from 1.6.2 to 1.8.1

*AWS SDK*
from 1.10.71 to 1.11.171

  was:
# KPL: 

flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

# KCL:

Upgrade KCL from 1.6.2 to 1.8.1

# AWS SDK

from 1.10.71 to 1.11.171


> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> *KPL: *
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.
> *KCL:*
> Upgrade KCL from 1.6.2 to 1.8.1
> *AWS SDK*
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 
# KPL: 

flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

# KCL:

Upgrade KCL from 1.6.2 to 1.8.1

# AWS SDK

from 1.10.71 to 1.11.171

  was:
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.


> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> # KPL: 
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.
> # KCL:
> Upgrade KCL from 1.6.2 to 1.8.1
> # AWS SDK
> from 1.10.71 to 1.11.171



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7422) Upgrade Kinesis Client Library (KCL) in flink-connector-kinesis from 1.6.2 to 1.8.1

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-7422.
---
Resolution: Fixed

> Upgrade Kinesis Client Library (KCL) in flink-connector-kinesis from 1.6.2 to 
> 1.8.1
> ---
>
> Key: FLINK-7422
> URL: https://issues.apache.org/jira/browse/FLINK-7422
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> Upgrade KCL from 1.6.2 to 1.8.1 
> (https://mvnrepository.com/artifact/com.amazonaws/amazon-kinesis-client)
> Given FLINK-7366, we may need to bump the AWS SDK version as well in this 
> ticket.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis producer library, kinesis client library, and AWS SDK in flink-connector-kinesis

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Summary: Upgrade kinesis producer library, kinesis client library, and AWS 
SDK in flink-connector-kinesis  (was: Upgrade kinesis-producer-library in 
flink-connector-kinesis from 0.10.2 to 0.12.5)

> Upgrade kinesis producer library, kinesis client library, and AWS SDK in 
> flink-connector-kinesis
> 
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Affects Version/s: 1.3.2

> Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
> 0.12.5
> -
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Fix Version/s: 1.3.3

> Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
> 0.12.5
> -
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5

2017-08-10 Thread Bowen Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7366:

Description: 
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
(Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
should offer additional retries.*"

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.

  was:
flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
thus problematic. It doesn't even have good retry logic, so Flink fails really 
frequently (about every 10 mins, as we observed) when it writes too fast to 
Kinesis and receives a RateLimitExceededException.

Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
features and improvements.


> Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
> 0.12.5
> -
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.3.2
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0, 1.3.3
>
>
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> A quote from https://github.com/awslabs/amazon-kinesis-producer/issues/56 
> (Oct 2016): "*With the newer version of the KPL it uses the AWS C++ SDK which 
> should offer additional retries.*"
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7422) Upgrade Kinesis Client Library (KCL) in flink-connector-kinesis from 1.6.2 to 1.8.1

2017-08-10 Thread Bowen Li (JIRA)
Bowen Li created FLINK-7422:
---

 Summary: Upgrade Kinesis Client Library (KCL) in 
flink-connector-kinesis from 1.6.2 to 1.8.1
 Key: FLINK-7422
 URL: https://issues.apache.org/jira/browse/FLINK-7422
 Project: Flink
  Issue Type: Improvement
  Components: Kinesis Connector
Affects Versions: 1.3.2
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.4.0, 1.3.3


Upgrade KCL from 1.6.2 to 1.8.1 
(https://mvnrepository.com/artifact/com.amazonaws/amazon-kinesis-client)

Given FLINK-7366, we may need to bump the AWS SDK version as well in this 
ticket.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7366) Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 0.12.5

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122447#comment-16122447
 ] 

ASF GitHub Bot commented on FLINK-7366:
---

GitHub user bowenli86 opened a pull request:

https://github.com/apache/flink/pull/4522

[FLINK-7366][kinesis connector] Upgrade kinesis-producer-library in 
flink-connector-kinesis from 0.10.2 to 0.12.5

## What is the purpose of the change

Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
0.12.5. 

0.10.2 was released in Nov 2015, which is a bit old.


## Verifying this change

This change is already covered by existing tests

## Does this pull request potentially affect one of the following parts:

Nothing outside flink-connector-kinesis

## Documentation

  - Does this pull request introduce a new feature? (no)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bowenli86/flink FLINK-7366

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4522.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4522


commit 7486878631e7238283eabfdd575387d32b210f91
Author: Bowen Li 
Date:   2017-08-10T22:09:56Z

FLINK-7366 Upgrade kinesis-producer-library in flink-connector-kinesis from 
0.10.2 to 0.12.5




> Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
> 0.12.5
> -
>
> Key: FLINK-7366
> URL: https://issues.apache.org/jira/browse/FLINK-7366
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Reporter: Bowen Li
>Assignee: Bowen Li
> Fix For: 1.4.0
>
>
> flink-connector-kinesis currently uses kinesis-producer-library 0.10.2, which 
> was released in Nov 2015 by AWS. It's old: it's only the fourth release, and 
> thus problematic. It doesn't even have good retry logic, so Flink fails 
> really frequently (about every 10 mins, as we observed) when it writes too 
> fast to Kinesis and receives a RateLimitExceededException.
> Upgrade it to 0.12.5, released in May 2017, to take advantage of all the new 
> features and improvements.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4522: [FLINK-7366][kinesis connector] Upgrade kinesis-pr...

2017-08-10 Thread bowenli86
GitHub user bowenli86 opened a pull request:

https://github.com/apache/flink/pull/4522

[FLINK-7366][kinesis connector] Upgrade kinesis-producer-library in 
flink-connector-kinesis from 0.10.2 to 0.12.5

## What is the purpose of the change

Upgrade kinesis-producer-library in flink-connector-kinesis from 0.10.2 to 
0.12.5. 

0.10.2 was released in Nov 2015, which is a bit old.


## Verifying this change

This change is already covered by existing tests

## Does this pull request potentially affect one of the following parts:

Nothing outside flink-connector-kinesis

## Documentation

  - Does this pull request introduce a new feature? (no)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bowenli86/flink FLINK-7366

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4522.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4522


commit 7486878631e7238283eabfdd575387d32b210f91
Author: Bowen Li 
Date:   2017-08-10T22:09:56Z

FLINK-7366 Upgrade kinesis-producer-library in flink-connector-kinesis from 
0.10.2 to 0.12.5




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7421) AvroRow(De)serializationSchema not serializable

2017-08-10 Thread Timo Walther (JIRA)
Timo Walther created FLINK-7421:
---

 Summary: AvroRow(De)serializationSchema not serializable
 Key: FLINK-7421
 URL: https://issues.apache.org/jira/browse/FLINK-7421
 Project: Flink
  Issue Type: Bug
  Components: Streaming Connectors, Table API & SQL
Reporter: Timo Walther
Assignee: Timo Walther


Both {{AvroRowDeserializationSchema}} and {{AvroRowSerializationSchema}} 
contain fields that are not serializable. Those fields should be made transient 
and both schemas need to be tested in practice.
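
For illustration, a minimal sketch of the transient-field pattern, assuming the 
schemas hold a non-serializable Avro {{Schema}} and a datum reader; the class 
layout and field names below are illustrative, not the actual code:

{code:java}
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class AvroRowDeserializationSchemaSketch implements Serializable {

    // Serializable representation of the schema.
    private final String schemaString;

    // Non-serializable Avro objects, rebuilt after deserialization.
    private transient Schema schema;
    private transient DatumReader<GenericRecord> reader;

    public AvroRowDeserializationSchemaSketch(String schemaString) {
        this.schemaString = schemaString;
        initTransients();
    }

    // Called by Java serialization when the object is read back;
    // re-initializes the transient fields from the schema string.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        initTransients();
    }

    private void initTransients() {
        this.schema = new Schema.Parser().parse(schemaString);
        this.reader = new GenericDatumReader<>(schema);
    }
}
{code}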



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7420) Move all Avro code to flink-avro

2017-08-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reassigned FLINK-7420:
---

Assignee: Timo Walther

> Move all Avro code to flink-avro
> 
>
> Key: FLINK-7420
> URL: https://issues.apache.org/jira/browse/FLINK-7420
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.4.0
>
>
> *Problem*
> Currently, the {{flink-avro}} project is a shell with some tests and mostly 
> duplicate and dead code. The classes that use Avro are distributed quite 
> wildly through the code base, and introduce multiple direct dependencies on 
> Avro in a messy way.
> That way, we cannot create a proper fat Avro dependency in which we shade 
> Jackson away.
> Also, we expose Avro as a direct and hard dependency on many Flink modules, 
> while it should be a dependency that users who use Avro types selectively 
> add.
> *Suggested Changes*
> We should move all Avro related classes to {{flink-avro}}, and give 
> {{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.
>   - {{AvroTypeInfo}}
>   - {{AvroSerializer}}
>   - {{AvroRowSerializationSchema}}
>   - {{AvroRowDeserializationSchema}}
> To be able to move the Avro serialization code from {{flink-core}} to 
> {{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
> similar to how we load the {{WritableTypeInfo}} for Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (FLINK-6168) Make flink-core independent of Avro

2017-08-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reopened FLINK-6168:
-

> Make flink-core independent of Avro
> ---
>
> Key: FLINK-6168
> URL: https://issues.apache.org/jira/browse/FLINK-6168
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Timo Walther
>
> Right now, flink-core has Avro dependencies. We should move AvroTypeInfo to 
> flink-avro and make the TypeExtractor Avro independent (e.g. reflection-based, 
> similar to Hadoop Writables, or with another approach).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6168) Make flink-core independent of Avro

2017-08-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-6168:

Fix Version/s: (was: 2.0.0)

> Make flink-core independent of Avro
> ---
>
> Key: FLINK-6168
> URL: https://issues.apache.org/jira/browse/FLINK-6168
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Timo Walther
>
> Right now, flink-core has Avro dependencies. We should move AvroTypeInfo to 
> flink-avro and make the TypeExtractor Avro independent (e.g. reflection-based, 
> similar to Hadoop Writables, or with another approach).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6168) Make flink-core independent of Avro

2017-08-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-6168.
---
Resolution: Duplicate

> Make flink-core independent of Avro
> ---
>
> Key: FLINK-6168
> URL: https://issues.apache.org/jira/browse/FLINK-6168
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Timo Walther
>
> Right now, flink-core has Avro dependencies. We should move AvroTypeInfo to 
> flink-avro and make the TypeExtractor Avro independent (e.g. reflection-based, 
> similar to Hadoop Writables, or with another approach).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6168) Make flink-core independent of Avro

2017-08-10 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-6168.
---
Resolution: Fixed

Will be covered by FLINK-7420.

> Make flink-core independent of Avro
> ---
>
> Key: FLINK-6168
> URL: https://issues.apache.org/jira/browse/FLINK-6168
> Project: Flink
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Timo Walther
> Fix For: 2.0.0
>
>
> Right now, flink-core has Avro dependencies. We should move AvroTypeInfo to 
> flink-avro and make the TypeExtractor Avro independent (e.g. reflection-based, 
> similar to Hadoop Writables, or with another approach).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7420) Move all Avro code to flink-avro

2017-08-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122100#comment-16122100
 ] 

Timo Walther edited comment on FLINK-7420 at 8/10/17 6:58 PM:
--

This is a duplicate of FLINK-6168. I will close the other one.


was (Author: twalthr):
This is a duplicate of FLINK-6168.

> Move all Avro code to flink-avro
> 
>
> Key: FLINK-7420
> URL: https://issues.apache.org/jira/browse/FLINK-7420
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> *Problem*
> Currently, the {{flink-avro}} project is a shell with some tests and mostly 
> duplicate and dead code. The classes that use Avro are distributed quite 
> wildly through the code base, and introduce multiple direct dependencies on 
> Avro in a messy way.
> That way, we cannot create a proper fat Avro dependency in which we shade 
> Jackson away.
> Also, we expose Avro as a direct and hard dependency on many Flink modules, 
> while it should be a dependency that users who use Avro types selectively 
> add.
> *Suggested Changes*
> We should move all Avro related classes to {{flink-avro}}, and give 
> {{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.
>   - {{AvroTypeInfo}}
>   - {{AvroSerializer}}
>   - {{AvroRowSerializationSchema}}
>   - {{AvroRowDeserializationSchema}}
> To be able to move the Avro serialization code from {{flink-core}} to 
> {{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
> similar to how we load the {{WritableTypeInfo}} for Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7420) Move all Avro code to flink-avro

2017-08-10 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122100#comment-16122100
 ] 

Timo Walther commented on FLINK-7420:
-

This is a duplicate of FLINK-6168.

> Move all Avro code to flink-avro
> 
>
> Key: FLINK-7420
> URL: https://issues.apache.org/jira/browse/FLINK-7420
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> *Problem*
> Currently, the {{flink-avro}} project is a shell with some tests and mostly 
> duplicate and dead code. The classes that use Avro are distributed quite 
> wildly through the code base, and introduce multiple direct dependencies on 
> Avro in a messy way.
> That way, we cannot create a proper fat Avro dependency in which we shade 
> Jackson away.
> Also, we expose Avro as a direct and hard dependency on many Flink modules, 
> while it should be a dependency that users who use Avro types selectively 
> add.
> *Suggested Changes*
> We should move all Avro related classes to {{flink-avro}}, and give 
> {{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.
>   - {{AvroTypeInfo}}
>   - {{AvroSerializer}}
>   - {{AvroRowSerializationSchema}}
>   - {{AvroRowDeserializationSchema}}
> To be able to move the Avro serialization code from {{flink-core}} to 
> {{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
> similar to how we load the {{WritableTypeInfo}} for Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122077#comment-16122077
 ] 

ASF GitHub Bot commented on FLINK-6988:
---

Github user rangadi commented on the issue:

https://github.com/apache/flink/pull/4239
  
Yep, that makes sense.


> Add Apache Kafka 0.11 connector
> ---
>
> Key: FLINK-6988
> URL: https://issues.apache.org/jira/browse/FLINK-6988
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> Kafka 0.11 (it will be released very soon) adds support for transactions. 
> Thanks to that, Flink might be able to implement a Kafka sink supporting 
> "exactly-once" semantics. The API changes and the whole transaction support 
> are described in 
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The 
> new FlinkKafkaProducer011 would (see the sketch after this list):
> * upon creation, begin a transaction, store the transaction identifiers into 
> the state, and write all incoming data to an output Kafka topic using that 
> transaction
> * on a `snapshotState` call, flush the data and record in state that the 
> current transaction is pending commit
> * on `notifyCheckpointComplete`, commit this pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`, 
> either abort this pending transaction (if not every participant successfully 
> saved the snapshot) or restore and commit it.
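
For illustration, a minimal sketch of that lifecycle, assuming a hypothetical 
{{KafkaTransaction}} handle for a KIP-98 transaction; this is pseudocode for 
the described protocol, not the actual FlinkKafkaProducer011:

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a KIP-98 transactional producer handle.
interface KafkaTransaction {
    static KafkaTransaction begin() {
        throw new UnsupportedOperationException("sketch only");
    }
    void write(Object record);
    void flush();
    void commit();
}

class ExactlyOnceKafkaSinkSketch<T> {

    // Transaction that receives all records since the last checkpoint.
    private KafkaTransaction current = KafkaTransaction.begin();

    // Several checkpoints can be pending completion at the same time,
    // so each pending transaction is tracked by its checkpoint id.
    private final Map<Long, KafkaTransaction> pending = new HashMap<>();

    void invoke(T record) {
        current.write(record);                  // all data goes through the open txn
    }

    void snapshotState(long checkpointId) {
        current.flush();                        // make buffered records durable
        pending.put(checkpointId, current);     // mark txn as pending commit
        current = KafkaTransaction.begin();     // new txn for the next period
    }

    void notifyCheckpointComplete(long checkpointId) {
        pending.remove(checkpointId).commit();  // second phase: commit
    }
}
{code}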



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4239: [FLINK-6988] flink-connector-kafka-0.11 with exactly-once...

2017-08-10 Thread rangadi
Github user rangadi commented on the issue:

https://github.com/apache/flink/pull/4239
  
Yep, that makes sense.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-08-10 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122034#comment-16122034
 ] 

Chesnay Schepler commented on FLINK-6692:
-

The unshaded io.netty dependency is no longer included in flink-dist, and a 
check was added to prevent this from happening again in the future.

However, the jboss.netty dependency is still included.

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is this expected behavior?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7420) Move all Avro code to flink-avro

2017-08-10 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-7420:

Description: 
*Problem*

Currently, the {{flink-avro}} project is a shell with some tests and mostly 
duplicate and dead code. The classes that use Avro are distributed quite wildly 
through the code base, and introduce multiple direct dependencies on Avro in a 
messy way.

That way, we cannot create a proper fat Avro dependency in which we shade 
Jackson away.

Also, we expose Avro as a direct and hard dependency on many Flink modules, 
while it should be a dependency that users who use Avro types selectively add.

*Suggested Changes*

We should move all Avro related classes to {{flink-avro}}, and give 
{{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.

  - {{AvroTypeInfo}}
  - {{AvroSerializer}}
  - {{AvroRowSerializationSchema}}
  - {{AvroRowDeserializationSchema}}

To be able to move the Avro serialization code from {{flink-core}} to 
{{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
similar to how we load the {{WritableTypeInfo}} for Hadoop.

  was:
Currently, the {{flink-avro}} project is a shell with some tests and mostly 
duplicate and dead code. The classes that use Avro are distributed quite wildly 
through the code base, and introduce multiple direct dependencies on Avro in a 
messy way.

We should move all Avro related classes to {{flink-avro}}, and give 
{{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.

  - {{AvroTypeInfo}}
  - {{AvroSerializer}}
  - {{AvroRowSerializationSchema}}
  - {{AvroRowDeserializationSchema}}

To be able to move the Avro serialization code from {{flink-core}} to 
{{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
similar to how we load the {{WritableTypeInfo}} for Hadoop.


> Move all Avro code to flink-avro
> 
>
> Key: FLINK-7420
> URL: https://issues.apache.org/jira/browse/FLINK-7420
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
> Fix For: 1.4.0
>
>
> *Problem*
> Currently, the {{flink-avro}} project is a shell with some tests and mostly 
> duplicate and dead code. The classes that use Avro are distributed quite 
> wildly through the code base, and introduce multiple direct dependencies on 
> Avro in a messy way.
> That way, we cannot create a proper fat Avro dependency in which we shade 
> Jackson away.
> Also, we expose Avro as a direct and hard dependency on many Flink modules, 
> while it should be a dependency that users who use Avro types selectively 
> add.
> *Suggested Changes*
> We should move all Avro related classes to {{flink-avro}}, and give 
> {{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.
>   - {{AvroTypeInfo}}
>   - {{AvroSerializer}}
>   - {{AvroRowSerializationSchema}}
>   - {{AvroRowDeserializationSchema}}
> To be able to move the Avro serialization code from {{flink-core}} to 
> {{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
> similar to how we load the {{WritableTypeInfo}} for Hadoop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122007#comment-16122007
 ] 

ASF GitHub Bot commented on FLINK-6988:
---

Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/4239
  
This solution (basically a pool with a fixed size of 2) would work only if 
there were at most one pending commit transaction, which is not always true 
in Flink: there can be multiple triggered checkpoints pending completion.


> Add Apache Kafka 0.11 connector
> ---
>
> Key: FLINK-6988
> URL: https://issues.apache.org/jira/browse/FLINK-6988
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> Kafka 0.11 (it will be released very soon) adds support for transactions. 
> Thanks to that, Flink might be able to implement a Kafka sink supporting 
> "exactly-once" semantics. The API changes and the whole transaction support 
> are described in 
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The 
> new FlinkKafkaProducer011 would:
> * upon creation, begin a transaction, store the transaction identifiers into 
> the state, and write all incoming data to an output Kafka topic using that 
> transaction
> * on a `snapshotState` call, flush the data and record in state that the 
> current transaction is pending commit
> * on `notifyCheckpointComplete`, commit this pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`, 
> either abort this pending transaction (if not every participant successfully 
> saved the snapshot) or restore and commit it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4239: [FLINK-6988] flink-connector-kafka-0.11 with exactly-once...

2017-08-10 Thread pnowojski
Github user pnowojski commented on the issue:

https://github.com/apache/flink/pull/4239
  
This solution (basically a pool with a fixed size of 2) would work only if 
there were at most one pending commit transaction, which is not always true 
in Flink: there can be multiple triggered checkpoints pending completion.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7420) Move all Avro code to flink-avro

2017-08-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7420:
---

 Summary: Move all Avro code to flink-avro
 Key: FLINK-7420
 URL: https://issues.apache.org/jira/browse/FLINK-7420
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Stephan Ewen
 Fix For: 1.4.0


Currently, the {{flink-avro}} project is a shell with some tests and mostly 
duplicate and dead code. The classes that use Avro are distributed quite wildly 
through the code base, and introduce multiple direct dependencies on Avro in a 
messy way.

We should move all Avro related classes to {{flink-avro}}, and give 
{{flink-avro}} a dependency on {{flink-core}} and {{flink-streaming-java}}.

  - {{AvroTypeInfo}}
  - {{AvroSerializer}}
  - {{AvroRowSerializationSchema}}
  - {{AvroRowDeserializationSchema}}

To be able to move the Avro serialization code from {{flink-core}} to 
{{flink-avro}}, we need to load the {{AvroTypeInfo}} reflectively, 
similar to how we load the {{WritableTypeInfo}} for Hadoop.
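
For illustration, a minimal sketch of the reflective loading, modeled on how 
the {{WritableTypeInfo}} is loaded for Hadoop; the target class name, package, 
and constructor below are assumptions, not the final design:

{code:java}
import org.apache.flink.api.common.typeinfo.TypeInformation;

public final class AvroTypeInfoLoader {

    // Assumed fully qualified name after the move; illustrative only.
    private static final String AVRO_TYPE_INFO_CLASS =
            "org.apache.flink.formats.avro.typeutils.AvroTypeInfo";

    @SuppressWarnings("unchecked")
    public static <T> TypeInformation<T> createAvroTypeInfo(Class<T> type) {
        try {
            Class<?> typeInfoClass = Class.forName(
                    AVRO_TYPE_INFO_CLASS, false, AvroTypeInfoLoader.class.getClassLoader());
            // AvroTypeInfo is assumed to expose a (Class) constructor.
            return (TypeInformation<T>) typeInfoClass
                    .getConstructor(Class.class)
                    .newInstance(type);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(
                    "Avro type detected, but flink-avro is not on the classpath.", e);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Could not create AvroTypeInfo reflectively.", e);
        }
    }
}
{code}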



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7357) HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY HOP window

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121999#comment-16121999
 ] 

ASF GitHub Bot commented on FLINK-7357:
---

GitHub user walterddr opened a pull request:

https://github.com/apache/flink/pull/4521

[FLINK-7357] [table] Created extended rules for 
WindowStartEndPropertiesRule on Having clause



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/walterddr/flink FLINK-7357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4521


commit 12e6737e3612de94885cc8c16d341b7e2c607370
Author: Rong Rong 
Date:   2017-08-10T17:46:25Z

make WindowStartEndPropertiesRule abstract and create extended rules for 
Having clause




> HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY 
> HOP window
> -
>
> Key: FLINK-7357
> URL: https://issues.apache.org/jira/browse/FLINK-7357
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Rong Rong
>Assignee: Rong Rong
>
> The following SQL does not compile:
> {code:title=invalid_having_hop_start_sql}
> SELECT 
>   c AS k, 
>   COUNT(a) AS v, 
>   HOP_START(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS 
> windowStart, 
>   HOP_END(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS windowEnd 
> FROM 
>   T1 
> GROUP BY 
>   HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE), 
>   c 
> HAVING 
>   SUM(b) > 1
> {code}
> Keeping either the HAVING clause or the HOP_START field individually, the 
> query compiles and runs without issue.
> more details: 
> https://github.com/apache/flink/compare/master...walterddr:having_does_not_work_with_hop_start_end



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4521: [FLINK-7357] [table] Created extended rules for Wi...

2017-08-10 Thread walterddr
GitHub user walterddr opened a pull request:

https://github.com/apache/flink/pull/4521

[FLINK-7357] [table] Created extended rules for 
WindowStartEndPropertiesRule on Having clause



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/walterddr/flink FLINK-7357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4521


commit 12e6737e3612de94885cc8c16d341b7e2c607370
Author: Rong Rong 
Date:   2017-08-10T17:46:25Z

make WindowStartEndPropertiesRule abstract and create extended rules for 
Having clause




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7357) HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY HOP window

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121997#comment-16121997
 ] 

ASF GitHub Bot commented on FLINK-7357:
---

Github user walterddr closed the pull request at:

https://github.com/apache/flink/pull/4520


> HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY 
> HOP window
> -
>
> Key: FLINK-7357
> URL: https://issues.apache.org/jira/browse/FLINK-7357
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Rong Rong
>Assignee: Rong Rong
>
> The following SQL does not compile:
> {code:title=invalid_having_hop_start_sql}
> SELECT 
>   c AS k, 
>   COUNT(a) AS v, 
>   HOP_START(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS 
> windowStart, 
>   HOP_END(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS windowEnd 
> FROM 
>   T1 
> GROUP BY 
>   HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE), 
>   c 
> HAVING 
>   SUM(b) > 1
> {code}
> Keeping either the HAVING clause or the HOP_START field individually, the 
> query compiles and runs without issue.
> more details: 
> https://github.com/apache/flink/compare/master...walterddr:having_does_not_work_with_hop_start_end



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4520: [FLINK-7357] [table] Created extended rules for Wi...

2017-08-10 Thread walterddr
Github user walterddr closed the pull request at:

https://github.com/apache/flink/pull/4520


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7419) Shade jackson dependency in flink-avro

2017-08-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7419:
---

 Summary: Shade jackson dependency in flink-avro
 Key: FLINK-7419
 URL: https://issues.apache.org/jira/browse/FLINK-7419
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Stephan Ewen
 Fix For: 1.4.0


Avro uses {{org.codehaus.jackson}}, which also exists in multiple incompatible 
versions. We should shade it to 
{{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4520: Created extended rules for WindowStartEndPropertie...

2017-08-10 Thread walterddr
GitHub user walterddr opened a pull request:

https://github.com/apache/flink/pull/4520

Created extended rules for WindowStartEndPropertiesRule on Having clause



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/walterddr/flink FLINK-7357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4520


commit 12e6737e3612de94885cc8c16d341b7e2c607370
Author: Rong Rong 
Date:   2017-08-10T17:46:25Z

make WindowStartEndPropertiesRule abstract and create extended rules for 
Having clause




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7418) Replace all uses of jackson with flink-shaded-jackson

2017-08-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7418:
---

 Summary: Replace all uses of jackson with flink-shaded-jackson
 Key: FLINK-7418
 URL: https://issues.apache.org/jira/browse/FLINK-7418
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Stephan Ewen
 Fix For: 1.4.0


Jackson is currently used to create JSON responses in the web UI, in the future 
possibly for the client REST communication.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7417) Create flink-shaded-jackson

2017-08-10 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7417:
---

 Summary: Create flink-shaded-jackson
 Key: FLINK-7417
 URL: https://issues.apache.org/jira/browse/FLINK-7417
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Stephan Ewen
 Fix For: 1.4.0


The {{com.fasterxml:jackson}} library is another culprit of frequent conflicts 
that we need to shade away.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7413) Release Hadoop 2.8.x convenience binaries for 1.3.x

2017-08-10 Thread Eron Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121969#comment-16121969
 ] 

Eron Wright  commented on FLINK-7413:
-

Note there was some discussion about backporting to 1.3 in the PR of FLINK-6466.

> Release Hadoop 2.8.x convenience binaries for 1.3.x 
> 
>
> Key: FLINK-7413
> URL: https://issues.apache.org/jira/browse/FLINK-7413
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.3.2
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.3.3
>
>
> At least one user on the mailing lists had an issue because Hadoop 2.8.x 
> binaries are not available: 
> https://lists.apache.org/thread.html/c8badc66778144d9d6c3ee5cb23dd732a66cb6690c6867f47f4bd456@%3Cuser.flink.apache.org%3E
> It should be as easy as adding Hadoop 2.8.x to the list of created binaries 
> in the release files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-08-10 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121962#comment-16121962
 ] 

Stephan Ewen commented on FLINK-6692:
-

Is this solved now with FLINK-7027 being closed?

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is it an expected behavior?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4503: [FLINK-6982] [guava] Integrate flink-shaded-guava-18

2017-08-10 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4503
  
Looks pretty good, all in all!

+1 from my side


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6982) Replace guava dependencies

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121957#comment-16121957
 ] 

ASF GitHub Bot commented on FLINK-6982:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4503
  
Looks pretty good, all in all!

+1 from my side


> Replace guava dependencies
> --
>
> Key: FLINK-6982
> URL: https://issues.apache.org/jira/browse/FLINK-6982
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Affects Versions: 1.4.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6738) HBaseConnectorITCase is flaky

2017-08-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-6738:
--
Description: 
I ran integration tests for flink 1.3 RC2 and got the following failure:

{code}
Failed tests:
  
HBaseConnectorITCase>HBaseTestingClusterAutostarter.tearDown:240->HBaseTestingClusterAutostarter.deleteTables:127
 Exception found deleting the table expected null, but 
was:<java.util.concurrent.TimeoutException: The procedure 5 is still running>
{code}

  was:
I ran integration tests for flink 1.3 RC2 and got the following failure:
{code}
Failed tests:
  
HBaseConnectorITCase>HBaseTestingClusterAutostarter.tearDown:240->HBaseTestingClusterAutostarter.deleteTables:127
 Exception found deleting the table expected null, but 
was:<java.util.concurrent.TimeoutException: The procedure 5 is still running>
{code}


> HBaseConnectorITCase is flaky
> -
>
> Key: FLINK-6738
> URL: https://issues.apache.org/jira/browse/FLINK-6738
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Ted Yu
>Priority: Critical
>  Labels: hbase, test-stability
>
> I ran integration tests for flink 1.3 RC2 and got the following failure:
> {code}
> Failed tests:
>   
> HBaseConnectorITCase>HBaseTestingClusterAutostarter.tearDown:240->HBaseTestingClusterAutostarter.deleteTables:127
>  Exception found deleting the table expected null, but 
> was:<java.util.concurrent.TimeoutException: The procedure 5 is still running>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7260) Match not exhaustive in WindowJoinUtil.scala

2017-08-10 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-7260:
--
Description: 
Here is the warning:
{code}
[WARNING] 
/flink/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/WindowJoinUtil.scala:296:
 warning: match may not be exhaustive.
[WARNING] It would fail on the following inputs: BETWEEN, CREATE_VIEW, DEFAULT, 
DESCRIBE_SCHEMA, DOT, EXCEPT, EXTRACT, GREATEST, LAST_VALUE, 
MAP_QUERY_CONSTRUCTOR, NOT, NOT_EQUALS, NULLS_LAST, PATTERN_ALTER, PREV, 
SIMILAR, SUM0, TIMESTAMP_DIFF, UNION, VAR_SAMP
[WARNING]   timePred.pred.getKind match {
{code}

  was:
Here is the warning:

{code}
[WARNING] 
/flink/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/WindowJoinUtil.scala:296:
 warning: match may not be exhaustive.
[WARNING] It would fail on the following inputs: BETWEEN, CREATE_VIEW, DEFAULT, 
DESCRIBE_SCHEMA, DOT, EXCEPT, EXTRACT, GREATEST, LAST_VALUE, 
MAP_QUERY_CONSTRUCTOR, NOT, NOT_EQUALS, NULLS_LAST, PATTERN_ALTER, PREV, 
SIMILAR, SUM0, TIMESTAMP_DIFF, UNION, VAR_SAMP
[WARNING]   timePred.pred.getKind match {
{code}


> Match not exhaustive in WindowJoinUtil.scala
> 
>
> Key: FLINK-7260
> URL: https://issues.apache.org/jira/browse/FLINK-7260
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Ted Yu
>Priority: Minor
>
> Here is the warning:
> {code}
> [WARNING] 
> /flink/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/WindowJoinUtil.scala:296:
>  warning: match may not be exhaustive.
> [WARNING] It would fail on the following inputs: BETWEEN, CREATE_VIEW, 
> DEFAULT, DESCRIBE_SCHEMA, DOT, EXCEPT, EXTRACT, GREATEST, LAST_VALUE, 
> MAP_QUERY_CONSTRUCTOR, NOT, NOT_EQUALS, NULLS_LAST, PATTERN_ALTER, PREV, 
> SIMILAR, SUM0, TIMESTAMP_DIFF, UNION, VAR_SAMP
> [WARNING]   timePred.pred.getKind match {
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7049) TestingApplicationMaster keeps running after integration tests finish

2017-08-10 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069086#comment-16069086
 ] 

Ted Yu edited comment on FLINK-7049 at 8/10/17 5:20 PM:


Stack trace for TestingApplicationMaster .


was (Author: yuzhih...@gmail.com):
Stack trace for TestingApplicationMaster.

> TestingApplicationMaster keeps running after integration tests finish
> -
>
> Key: FLINK-7049
> URL: https://issues.apache.org/jira/browse/FLINK-7049
> Project: Flink
>  Issue Type: Test
>  Components: Tests, YARN
>Reporter: Ted Yu
>Priority: Minor
> Attachments: testingApplicationMaster.stack
>
>
> After integration tests finish, TestingApplicationMaster is still running.
> Toward the end of 
> flink-yarn-tests/target/flink-yarn-tests-ha/flink-yarn-tests-ha-logDir-nm-1_0/application_1498768839874_0001/container_1498768839874_0001_03_01/jobmanager.log
>  :
> {code}
> 2017-06-29 22:09:49,681 INFO  org.apache.zookeeper.ClientCnxn 
>   - Opening socket connection to server 127.0.0.1/127.0.0.1:46165
> 2017-06-29 22:09:49,681 ERROR 
> org.apache.flink.shaded.org.apache.curator.ConnectionState- 
> Authentication failed
> 2017-06-29 22:09:49,682 WARN  org.apache.zookeeper.ClientCnxn 
>   - Session 0x0 for server null, unexpected error, closing socket 
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 2017-06-29 22:09:50,782 WARN  org.apache.zookeeper.ClientCnxn 
>   - SASL configuration failed: 
> javax.security.auth.login.LoginException: No JAAS configuration section named 
> 'Client' was found in specified JAAS configuration file: 
> '/tmp/jaas-3597644653611245612.conf'. Will continue connection to Zookeeper 
> server without SASL authentication, if Zookeeper server allows it.
> 2017-06-29 22:09:50,782 INFO  org.apache.zookeeper.ClientCnxn 
>   - Opening socket connection to server 127.0.0.1/127.0.0.1:46165
> 2017-06-29 22:09:50,782 ERROR 
> org.apache.flink.shaded.org.apache.curator.ConnectionState- 
> Authentication failed
> 2017-06-29 22:09:50,783 WARN  org.apache.zookeeper.ClientCnxn 
>   - Session 0x0 for server null, unexpected error, closing socket 
> connection and attempting reconnect
> java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7245) Enhance the operators to support holding back watermarks

2017-08-10 Thread Xingcan Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121855#comment-16121855
 ] 

Xingcan Cui commented on FLINK-7245:


Hi [~fhueske], I've got some new ideas about the rowtime/watermark.

Currently, in an operator with two inputs, the watermarks from different streams 
are merged in advance and only the lower ones are preserved. For example, given 
two streams {{S1}} and {{S2}}, if {{S1}} is generated in real time while {{S2}} 
has a two-hour delay, watermarks from {{S1}} will be totally 
discarded since they are always higher than those of {{S2}}. Maybe we could 
make watermarks from different streams distinguishable and apply/emit them all. 

Specifically, I think we could add an extra field to the {{Watermark}} class 
that indicates the corresponding rowtime field of a row. For a single 
operator, there could be at most {{n}} (where {{n}} is the number of 
inputs) rowtime fields activated (which may be deduced from the query 
conditions), and only watermarks corresponding to those fields will be *held 
back/applied* in the operator. All the watermarks should be emitted to 
downstream operators. As a consequence, all the timestamps (which are stored in 
rows) and watermarks are preserved, and users can operate on different rowtime 
fields at different levels of a nested query.

As a running example, consider the following query. 

{code:sql}
SELECT COUNT(S1.a)
OVER (PARTITION BY S1.key ORDER BY S2.rowtime RANGE BETWEEN 2 PRECEDING AND 
CURRENT ROW)
FROM
(SELECT * FROM S1, S2 
WHERE S1.key = S2.key AND S1.rowtime >= S2.rowtime - 10 SECONDS AND 
S1.rowtime < ...)
{code}

> Enhance the operators to support holding back watermarks
> 
>
> Key: FLINK-7245
> URL: https://issues.apache.org/jira/browse/FLINK-7245
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Xingcan Cui
>Assignee: Xingcan Cui
>
> Currently the watermarks are applied and emitted by the 
> {{AbstractStreamOperator}} instantly. 
> {code:java}
> public void processWatermark(Watermark mark) throws Exception {
>   if (timeServiceManager != null) {
>   timeServiceManager.advanceWatermark(mark);
>   }
>   output.emitWatermark(mark);
> }
> {code}
> Some calculation results (with timestamp fields) triggered by these 
> watermarks (e.g., join or aggregate results) may be regarded as delayed by 
> the downstream operators since their timestamps must be less than or equal to 
> the corresponding triggers. 
> This issue aims to add another "working mode", which supports holding back 
> watermarks, to current operators. These watermarks should be blocked and 
> stored by the operators until all the corresponding new generated results are 
> emitted.
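
For illustration, a minimal sketch of what such a holding-back mode could look 
like; the flag, the field visibility, and the release hook below are 
assumptions, not existing Flink API:

{code:java}
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.watermark.Watermark;

abstract class HoldBackWatermarkOperator<OUT> extends AbstractStreamOperator<OUT> {

    private boolean holdBackWatermarks = true; // hypothetical per-operator switch
    private Watermark heldBack;                // last watermark not yet forwarded

    @Override
    public void processWatermark(Watermark mark) throws Exception {
        if (timeServiceManager != null) {
            timeServiceManager.advanceWatermark(mark); // may trigger and emit results
        }
        if (holdBackWatermarks) {
            heldBack = mark; // block and store instead of forwarding immediately
        } else {
            output.emitWatermark(mark);
        }
    }

    // Called once all results triggered by the stored watermark have been emitted.
    protected void releaseHeldBackWatermark() {
        if (heldBack != null) {
            output.emitWatermark(heldBack);
            heldBack = null;
        }
    }
}
{code}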



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7415) Add comments for cassandra test class `BatchExample`

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121832#comment-16121832
 ] 

ASF GitHub Bot commented on FLINK-7415:
---

Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4519
  
I ran the test class and got an exception; I finally found the cause.


> Add comments for cassandra test class `BatchExample`
> 
>
> Key: FLINK-7415
> URL: https://issues.apache.org/jira/browse/FLINK-7415
> Project: Flink
>  Issue Type: Test
>  Components: Cassandra Connector
>Reporter: Hai Zhou
>
> In 
> `org.apache.flink.batch.connectors.cassandra.example.BatchExample` description:
> {code:java}
>  * The example assumes that a table exists in a local cassandra database, 
> according to the following query:
>  * CREATE TABLE test.batches (number int, strings text, PRIMARY KEY(number, 
> strings));
>  */
> {code}
>  The keyspace 'test' does not exist; we should create it first.
> I will improve its description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4519: [FLINK-7415][Cassandra Connector] Add comments for `Batch...

2017-08-10 Thread yew1eb
Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4519
  
I ran the test class and got an exception; I finally found the cause.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7415) Add comments for cassandra test class `BatchExample`

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121830#comment-16121830
 ] 

ASF GitHub Bot commented on FLINK-7415:
---

GitHub user yew1eb opened a pull request:

https://github.com/apache/flink/pull/4519

[FLINK-7415][Cassandra Connector] Add comments for `BatchExample.java`

## What is the purpose of the change
Help beginners follow it with less confusion. 

## Brief change log
- Add comments for cassandra batch-test class `BatchExample.java`

## Verifying this change



## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yew1eb/flink FLINK-7415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4519


commit 62fbc338f5049bebd5b770f66938b9394677dbe9
Author: yew1eb 
Date:   2017-08-10T15:28:00Z

add comments for BatchExample.java




> Add comments for cassandra test class `BatchExample`
> 
>
> Key: FLINK-7415
> URL: https://issues.apache.org/jira/browse/FLINK-7415
> Project: Flink
>  Issue Type: Test
>  Components: Cassandra Connector
>Reporter: Hai Zhou
>
> In 
> `org.apache.flink.batch.connectors.cassandra.example.BatchExample` description:
> {code:java}
>  * The example assumes that a table exists in a local cassandra database, 
> according to the following query:
>  * CREATE TABLE test.batches (number int, strings text, PRIMARY KEY(number, 
> strings));
>  */
> {code}
>  The keyspace 'test' does not exist; we should create it first.
> I will improve its description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4519: [FLINK-7415][Cassandra Connector] Add comments for...

2017-08-10 Thread yew1eb
GitHub user yew1eb opened a pull request:

https://github.com/apache/flink/pull/4519

[FLINK-7415][Cassandra Connector] Add comments for `BatchExample.java`

## What is the purpose of the change
Help beginners follow it with less confusion. 

## Brief change log
- Add comments for cassandra batch-test class `BatchExample.java`

## Verifying this change



## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yew1eb/flink FLINK-7415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4519


commit 62fbc338f5049bebd5b770f66938b9394677dbe9
Author: yew1eb 
Date:   2017-08-10T15:28:00Z

add comments for BatchExample.java




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4518: [FLINK-7412][network] optimise NettyMessage.TaskEv...

2017-08-10 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4518

[FLINK-7412][network] optimise NettyMessage.TaskEventRequest#readFrom() to 
read from netty buffers directly

## What is the purpose of the change

`NettyMessage.TaskEventRequest#readFrom()` had an outstanding TODO to read 
from the netty buffer directly, instead of copying to a `ByteBuffer` first and 
then deserializing the event.

## Brief change log

- use `ByteBuf#nioBuffer()` to deserialize from a `ByteBuffer` instance 
wrapping netty's buffer contents 

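For illustration, a hedged sketch of the before/after pattern (runnable against 
netty's `ByteBuf` API; not the exact Flink code):

{code:java}
import java.nio.ByteBuffer;
import io.netty.buffer.ByteBuf;

class ReadFromSketch {
    // Before: copy netty's readable bytes into a fresh ByteBuffer, then deserialize.
    static ByteBuffer copyOut(ByteBuf buffer) {
        ByteBuffer copy = ByteBuffer.allocate(buffer.readableBytes());
        buffer.readBytes(copy);
        copy.flip();
        return copy; // one extra allocation and copy per event
    }

    // After: wrap the readable region directly, without an intermediate copy.
    static ByteBuffer wrap(ByteBuf buffer) {
        ByteBuffer wrapped = buffer.nioBuffer();
        buffer.skipBytes(wrapped.remaining()); // nioBuffer() does not move the reader index
        return wrapped;
    }
}
{code}
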
## Verifying this change

This change is already covered by existing tests, such as 
`NettyMessageSerializationTest` as well as many other tests, especially IT 
cases, involving network.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (yes)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes, if you consider 
network communication part of this)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-7412

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4518


commit b33e036b43d00ddf564f105280894f6287dd3e92
Author: Nico Kruber 
Date:   2017-08-07T15:38:36Z

[FLINK-7411][network] minor (performance) improvements in NettyMessage

* use a switch rather than multiple if conditions
* use static `readFrom` methods to create instances of the message sub types

commit bce73af1b01793574d748e6536052ec6530360b0
Author: Nico Kruber 
Date:   2017-08-07T16:12:28Z

[FLINK-7412][network] optimise NettyMessage.TaskEventRequest#readFrom() to 
read from netty buffers directly




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7412) optimise NettyMessage.TaskEventRequest#readFrom() to read from netty buffers directly

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121818#comment-16121818
 ] 

ASF GitHub Bot commented on FLINK-7412:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4518

[FLINK-7412][network] optimise NettyMessage.TaskEventRequest#readFrom() to 
read from netty buffers directly

## What is the purpose of the change

`NettyMessage.TaskEventRequest#readFrom()` had an outstanding TODO to read 
from the netty buffer directly, instead of copying to a `ByteBuffer` first and 
then deserializing the event.

## Brief change log

- use `ByteBuf#nioBuffer()` to deserialize from a `ByteBuffer` instance 
wrapping netty's buffer contents 

## Verifying this change

This change is already covered by existing tests, such as 
`NettyMessageSerializationTest` as well as many other tests, especially IT 
cases, involving network.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (yes)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes, if you consider 
network communication part of this)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-7412

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4518


commit b33e036b43d00ddf564f105280894f6287dd3e92
Author: Nico Kruber 
Date:   2017-08-07T15:38:36Z

[FLINK-7411][network] minor (performance) improvements in NettyMessage

* use a switch rather than multiple if conditions
* use static `readFrom` methods to create instances of the message sub types

commit bce73af1b01793574d748e6536052ec6530360b0
Author: Nico Kruber 
Date:   2017-08-07T16:12:28Z

[FLINK-7412][network] optimise NettyMessage.TaskEventRequest#readFrom() to 
read from netty buffers directly




> optimise NettyMessage.TaskEventRequest#readFrom() to read from netty buffers 
> directly
> -
>
> Key: FLINK-7412
> URL: https://issues.apache.org/jira/browse/FLINK-7412
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> {{NettyMessage.TaskEventRequest#readFrom()}} allocates a new {{ByteBuffer}} 
> for event deserialization although, here, we are easily able to read from 
> netty's buffer directly



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7411) minor performance improvements in NettyMessage

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121812#comment-16121812
 ] 

ASF GitHub Bot commented on FLINK-7411:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4517

[FLINK-7411][network] minor performance improvements in NettyMessage

## What is the purpose of the change

This PR adds some (minor) performance improvements to `NettyMessage` which 
I came across: using a `switch` rather than multiple `if...else if...else if...` 
branches, and avoiding unneeded virtual method calls.

## Brief change log

- use a switch rather than multiple if conditions
- use static `readFrom` methods to create instances of the message sub types

## Verifying this change

This change is already covered by existing tests, such as 
`NettyMessageSerializationTest`.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (yes)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes, if you consider 
network communication part of this)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-7411

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4517


commit b33e036b43d00ddf564f105280894f6287dd3e92
Author: Nico Kruber 
Date:   2017-08-07T15:38:36Z

[FLINK-7411][network] minor (performance) improvements in NettyMessage

* use a switch rather than multiple if conditions
* use static `readFrom` methods to create instances of the message sub types




> minor performance improvements in NettyMessage
> --
>
> Key: FLINK-7411
> URL: https://issues.apache.org/jira/browse/FLINK-7411
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> {{NettyMessage}} may be improved slightly performance-wise in these regards:
> - in {{NettyMessage.NettyMessageDecoder#decode()}}: instead of having 
> multiple if-elseif-... use a switch to cycle through the message ID
> - use a static {{NettyMessage}} subtype {{readFrom(ByteBuf buffer)}} - we do 
> not really need to have a virtual function here
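
A self-contained sketch of the dispatch pattern (the IDs and names below are 
placeholders, not the real NettyMessage subtypes):

{code:java}
class DecoderSketch {
    static final byte BUFFER_RESPONSE = 0;
    static final byte PARTITION_REQUEST = 2;
    static final byte TASK_EVENT_REQUEST = 3;

    static String decode(byte msgId) {
        switch (msgId) {                   // instead of if / else-if chains
            case BUFFER_RESPONSE:
                return "BufferResponse";   // real code: BufferResponse.readFrom(buffer)
            case PARTITION_REQUEST:
                return "PartitionRequest"; // real code: PartitionRequest.readFrom(buffer)
            case TASK_EVENT_REQUEST:
                return "TaskEventRequest"; // real code: TaskEventRequest.readFrom(buffer)
            default:
                throw new IllegalStateException("Received unknown message id: " + msgId);
        }
    }
}
{code}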



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4492: [FLINK-7381] [web] Decouple WebRuntimeMonitor from...

2017-08-10 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/4492#discussion_r132490977
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java ---
@@ -149,6 +149,13 @@
.defaultValue(50)

.withDeprecatedKeys("jobmanager.web.backpressure.delay-between-samples");
 
+   /**
+* Timeout for asynchronous operations by the WebRuntimeMonitor
+*/
+   public static final ConfigOption<Long> TIMEOUT = ConfigOptions
--- End diff --

True. Will add it.
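
For context, a hedged sketch of the ConfigOptions pattern the diff follows (the 
key, default, and deprecated key here are assumptions):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.configuration.Configuration;

class WebOptionsSketch {
    /** Timeout for asynchronous operations by the WebRuntimeMonitor. */
    static final ConfigOption<Long> TIMEOUT = ConfigOptions
        .key("web.timeout")
        .defaultValue(10_000L)
        .withDeprecatedKeys("jobmanager.web.timeout"); // hypothetical old key

    static long timeoutFrom(Configuration config) {
        return config.getLong(TIMEOUT);
    }
}
{code}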


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #4517: [FLINK-7411][network] minor performance improvemen...

2017-08-10 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4517

[FLINK-7411][network] minor performance improvements in NettyMessage

## What is the purpose of the change

This PR adds some (minor) performance improvements to `NettyMessage` which 
I came across: using a `switch` rather than multiple `if...else if...else if...` 
branches, and avoiding unneeded virtual method calls.

## Brief change log

- use a switch rather than multiple if conditions
- use static `readFrom` methods to create instances of the message sub types

## Verifying this change

This change is already covered by existing tests, such as 
`NettyMessageSerializationTest`.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (yes)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes, if you consider 
network communication part of this)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-7411

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4517


commit b33e036b43d00ddf564f105280894f6287dd3e92
Author: Nico Kruber 
Date:   2017-08-07T15:38:36Z

[FLINK-7411][network] minor (performance) improvements in NettyMessage

* use a switch rather than multiple if conditions
* use static `readFrom` methods to create instances of the message sub types




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-7381) Decouple WebRuntimeMonitor from ActorGateway

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121810#comment-16121810
 ] 

ASF GitHub Bot commented on FLINK-7381:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/4492#discussion_r132490977
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/WebOptions.java ---
@@ -149,6 +149,13 @@
.defaultValue(50)

.withDeprecatedKeys("jobmanager.web.backpressure.delay-between-samples");
 
+   /**
+* Timeout for asynchronous operations by the WebRuntimeMonitor
+*/
+   public static final ConfigOption<Long> TIMEOUT = ConfigOptions
--- End diff --

True. Will add it.


> Decouple WebRuntimeMonitor from ActorGateway
> 
>
> Key: FLINK-7381
> URL: https://issues.apache.org/jira/browse/FLINK-7381
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 1.4.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
>
> The {{WebRuntimeMonitor}} has a hard wired dependency on the {{ActorGateway}} 
> in order to communicate with the {{JobManager}}. In order to make it work 
> with the {{JobMaster}} (Flip-6), we have to abstract this dependency away. I 
> propose to add a {{JobManagerGateway}} interface which can be implemented 
> using Akka for the old {{JobManager}} code. The Flip-6 {{JobMasterGateway}} 
> can then directly inherit from this interface.
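
A hedged sketch of the proposed abstraction (the method shown is an example 
only, not the actual interface):

{code:java}
import java.util.concurrent.CompletableFuture;

// Common gateway the web monitor can talk to, instead of a raw ActorGateway.
interface JobManagerGateway {
    CompletableFuture<String> requestClusterOverview(); // example method
}

// Old code path: an Akka-based implementation wraps the ActorGateway.
// Flip-6 path: JobMasterGateway extends JobManagerGateway directly.
{code}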



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7408) Extract WebRuntimeMonitor options from JobManagerOptions

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121803#comment-16121803
 ] 

ASF GitHub Bot commented on FLINK-7408:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/4512
  
Thanks for the review @zentol. You're right that we should remove the 
`key()` method calls.


> Extract WebRuntimeMonitor options from JobManagerOptions
> 
>
> Key: FLINK-7408
> URL: https://issues.apache.org/jira/browse/FLINK-7408
> Project: Flink
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 1.4.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Trivial
>
> With the Flip-6 code changes, the {{WebRuntimeMonitor}} won't run exclusively 
> next to the {{JobManager}}. Therefore, it makes sense to refactor the web 
> monitor options by moving them from {{JobManagerOptions}} to {{WebOptions}} 
> and removing the {{jobmanager}} prefix.
> This is done as requested by 
> https://github.com/apache/flink/pull/4492#issuecomment-321271819.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4512: [FLINK-7408] [conf] Create WebOptions for WebRuntimeMonit...

2017-08-10 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/4512
  
Thanks for the review @zentol. You're right that we should remove the 
`key()` method calls.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Reopened] (FLINK-7300) End-to-end tests are instable on Travis

2017-08-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reopened FLINK-7300:
--

I fear the problem has not been fixed: 
https://travis-ci.org/apache/flink/jobs/263016472

> End-to-end tests are instable on Travis
> ---
>
> Key: FLINK-7300
> URL: https://issues.apache.org/jira/browse/FLINK-7300
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.4.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.4.0
>
>
> It seems like the end-to-end tests are unstable, causing the {{misc}} build 
> profile to fail sporadically.
> Incorrect matched output:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/258569408/log.txt?X-Amz-Expires=30&X-Amz-Date=20170731T060526Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJRYRXRSVGNKPKO5A/20170731/us-east-1/s3/aws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=4ef9ff5e60fe06db53a84be8d73775a46cb595a8caeb806b05dbbf824d3b69e8
> Another failure example of a different cause then the above, also on the 
> end-to-end tests:
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/258841693/log.txt?X-Amz-Expires=30&X-Amz-Date=20170731T060007Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAJRYRXRSVGNKPKO5A/20170731/us-east-1/s3/aws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=4a106b3990228b7628c250cc15407bc2c131c8332e1a94ad68d649fe8d32d726



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7416) Implement Netty receiver outgoing pipeline for credit-based

2017-08-10 Thread zhijiang (JIRA)
zhijiang created FLINK-7416:
---

 Summary: Implement Netty receiver outgoing pipeline for 
credit-based
 Key: FLINK-7416
 URL: https://issues.apache.org/jira/browse/FLINK-7416
 Project: Flink
  Issue Type: Sub-task
  Components: Network
Reporter: zhijiang
Assignee: zhijiang
 Fix For: 1.4.0


This is a part of work for credit-based network flow control.

The related work items are:

* The {{InputChannel}} announces its initial credit, which equals the number of 
exclusive buffers per channel, via the {{PartitionRequest}} message.
* We define another message, called {{AddCredit}}, to announce incremental 
credit during data shuffling.
* Whenever an {{InputChannel}}’s unannounced credit goes up from zero, the 
channel is enqueued in the pipeline.
* Whenever the channel becomes writable, it takes the next {{InputChannel}} and 
sends its unannounced credit. The credit is reset to zero after each send.
* That way, messages are sent as often as the network has capacity and contain 
as much credit as is available for the channel at that point in time. Otherwise, 
they would only add latency to the announcements and not increase throughput.
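
For illustration, a minimal self-contained sketch of the enqueue-and-reset rule 
(all names are assumptions, not the actual Flink classes):

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

class CreditSketch {
    private final Queue<CreditSketch> channelsWithCredit;
    private int unannouncedCredit;

    CreditSketch(Queue<CreditSketch> channelsWithCredit) {
        this.channelsWithCredit = channelsWithCredit;
    }

    // Called when buffers are recycled to this input channel.
    void addCredit(int credit) {
        boolean wasZero = (unannouncedCredit == 0);
        unannouncedCredit += credit;
        if (wasZero) {
            channelsWithCredit.add(this); // enqueue only on the 0 -> positive transition
        }
    }

    // Called whenever the channel becomes writable: announce and reset to zero.
    int getAndResetUnannouncedCredit() {
        int credit = unannouncedCredit;
        unannouncedCredit = 0;
        return credit; // sent to the sender side in an AddCredit message
    }

    public static void main(String[] args) {
        Queue<CreditSketch> queue = new ArrayDeque<>();
        CreditSketch channel = new CreditSketch(queue);
        channel.addCredit(2); // enqueues the channel
        channel.addCredit(3); // already enqueued, only accumulates
        System.out.println(queue.poll().getAndResetUnannouncedCredit()); // prints 5
    }
}
{code}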



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6180) Remove TestingSerialRpcService

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121796#comment-16121796
 ] 

ASF GitHub Bot commented on FLINK-6180:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4516

[FLINK-6180] [rpc] Remove TestingSerialRpcService

## What is the purpose of the change

The TestingSerialRpcService produces thread interleavings which do not happen 
when tests are executed with a proper RpcService implementation. Because of 
this, test cases can wrongly fail or succeed. To avoid this problem, this 
commit removes the TestingSerialRpcService and adapts all existing tests that 
used it before.

This PR is based on #4498.

## Brief change log

- adapt all affected test cases
- remove `TestingSerialRpcService`

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeTestingSerialRPC

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4516


commit c76c4f13e95eeb60f64e0ef4604036ff853da3fd
Author: Till Rohrmann 
Date:   2017-08-08T12:43:47Z

[FLINK-7387] [rpc] Require RpcEndpoints to directly implement RpcGateways

This commit changes the relation between RpcEndpoints and RpcGateways. From 
now on, the RpcEndpoints have to implement the RpcGateways they want to 
support, instead of coupling them loosely via a type parameter. In order to 
obtain a self gateway, a new method RpcEndpoint#getSelfGateway(Class) has been 
introduced. This method can be used to obtain the RpcGateway type at run time 
to talk to the RpcEndpoint asynchronously.

All existing RpcEndpoints have been adapted to the new model. This basically 
means that they now return a CompletableFuture<X> instead of X.

Add RpcEndpointTest

commit 03d513b456b32af15eedbe2d39f44fca742d627d
Author: Till Rohrmann 
Date:   2017-08-09T11:24:33Z

Fix Failing TaskExecutorITCase

commit 18267c941b66ec4229285371a4e209b7a8b25eaa
Author: Till Rohrmann 
Date:   2017-08-10T12:07:09Z

[FLINK-6180] [rpc] Remove TestingSerialRpcService

The TestingSerialRpcService produces thread interleavings which do not happen 
when tests are executed with a proper RpcService implementation. Because of 
this, test cases can wrongly fail or succeed. To avoid this problem, this 
commit removes the TestingSerialRpcService and adapts all existing tests that 
used it before.

Remove TestingSerialRpcService from MesosResourceManagerTest

Remove TestingSerialRpcService from ResourceManagerJobMasterTest

Remove TestingSerialRpcService from ResourceManagerTaskExecutorTest

Remove TestingSerialRpcService from ResourceManagerTest

Remove SerialTestingRpcService from JobMasterTest

Remove TestingSerialRpcService from TaskExecutorITCase

Remove TestingSerialRpcService from TaskExecutorTest

Remove TestingSerialRpcService from SlotPoolTest

Delete TestingSerialRpcService




> Remove TestingSerialRpcService
> --
>
> Key: FLINK-6180
> URL: https://issues.apache.org/jira/browse/FLINK-6180
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination
>Affects Versions: 1.3.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> The {{TestingSerialRpcService}} is problematic because it allows execution 
> interleavings which are not possible when using the {{AkkaRpcService}}, 
> because main thread calls can be executed while another main thread call is 
> still being executed. Therefore, we might test things which are not possible 
> and might not test certain interleavings which occur when using the 
> {{AkkaRpcService}}.

[GitHub] flink pull request #4516: [FLINK-6180] [rpc] Remove TestingSerialRpcService

2017-08-10 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/4516

[FLINK-6180] [rpc] Remove TestingSerialRpcService

## What is the purpose of the change

The TestingSerialRpcService produces thread interleavings which do not happen 
when tests are executed with a proper RpcService implementation. Because of 
this, test cases can wrongly fail or succeed. To avoid this problem, this 
commit removes the TestingSerialRpcService and adapts all existing tests that 
used it before.

This PR is based on #4498.

## Brief change log

- adapt all affected test cases
- remove `TestingSerialRpcService`

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink removeTestingSerialRPC

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4516.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4516


commit c76c4f13e95eeb60f64e0ef4604036ff853da3fd
Author: Till Rohrmann 
Date:   2017-08-08T12:43:47Z

[FLINK-7387] [rpc] Require RpcEndpoints to directly implement RpcGateways

This commit changes the relation between RpcEndpoints and RpcGateways. From 
now on, the RpcEndpoints have to implement the RpcGateways they want to 
support, instead of coupling them loosely via a type parameter. In order to 
obtain a self gateway, a new method RpcEndpoint#getSelfGateway(Class) has been 
introduced. This method can be used to obtain the RpcGateway type at run time 
to talk to the RpcEndpoint asynchronously.

All existing RpcEndpoints have been adapted to the new model. This basically 
means that they now return a CompletableFuture<X> instead of X.

Add RpcEndpointTest

commit 03d513b456b32af15eedbe2d39f44fca742d627d
Author: Till Rohrmann 
Date:   2017-08-09T11:24:33Z

Fix Failing TaskExecutorITCase

commit 18267c941b66ec4229285371a4e209b7a8b25eaa
Author: Till Rohrmann 
Date:   2017-08-10T12:07:09Z

[FLINK-6180] [rpc] Remove TestingSerialRpcService

The TestingSerialRpcService produces thread interleavings which do not happen 
when tests are executed with a proper RpcService implementation. Because of 
this, test cases can wrongly fail or succeed. To avoid this problem, this 
commit removes the TestingSerialRpcService and adapts all existing tests that 
used it before.

Remove TestingSerialRpcService from MesosResourceManagerTest

Remove TestingSerialRpcService from ResourceManagerJobMasterTest

Remove TestingSerialRpcService from ResourceManagerTaskExecutorTest

Remove TestingSerialRpcService from ResourceManagerTest

Remove SerialTestingRpcService from JobMasterTest

Remove TestingSerialRpcService from TaskExecutorITCase

Remove TestingSerialRpcService from TaskExecutorTest

Remove TestingSerialRpcService from SlotPoolTest

Delete TestingSerialRpcService




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-6180) Remove TestingSerialRpcService

2017-08-10 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-6180:


Assignee: Till Rohrmann

> Remove TestingSerialRpcService
> --
>
> Key: FLINK-6180
> URL: https://issues.apache.org/jira/browse/FLINK-6180
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination
>Affects Versions: 1.3.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>  Labels: flip-6
> Fix For: 1.4.0
>
>
> The {{TestingSerialRpcService}} is problematic because it allows execution 
> interleavings which are not possible when using the {{AkkaRpcService}}, 
> because main thread calls can be executed while another main thread call is 
> still being executed. Therefore, we might test things which are not possible 
> and might not test certain interleavings which occur when using the 
> {{AkkaRpcService}}.
> Therefore, I propose to remove the {{TestingSerialRpcService}} and to 
> refactor the existing tests to use the {{AkkaRpcService}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4515: Logging improvements in NetworkFailuresProxy

2017-08-10 Thread pnowojski
GitHub user pnowojski opened a pull request:

https://github.com/apache/flink/pull/4515

Logging improvements in NetworkFailuresProxy

This PR consists of only one minor bug fix and one improvement to the logging 
of the `NetworkFailuresProxy` class used in tests. It's highly unlikely that it 
will break anything. The change is covered by the existing 
`NetworkFailuresProxyTest`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pnowojski/flink misc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4515.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4515


commit 3ebe2038caf0d0e4d3522f53395bb41081d978ad
Author: Piotr Nowojski 
Date:   2017-08-09T14:26:08Z

[hotfix][misc] Fix logging of local port in NetworkFailuresProxy

Previously, if localPort was set to 0, the actually obtained/bound port was not
logged anywhere. Now we print the local port after binding.
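
A hedged sketch of the fix (netty API; the surrounding proxy fields are 
assumptions):

{code:java}
import java.net.InetSocketAddress;
import io.netty.channel.Channel;

class LocalPortSketch {
    // After bind(), ask the channel which port it actually got: with localPort = 0
    // the OS picks a free port, so logging the requested value says nothing.
    static int actualLocalPort(Channel serverChannel) {
        return ((InetSocketAddress) serverChannel.localAddress()).getPort();
    }
}
{code}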

commit aed0f5c9d79b82897ee227093e8fde8ebd10b593
Author: Piotr Nowojski 
Date:   2017-08-10T08:29:17Z

[hotfix][misc] More verbose logging in NetworkFailuresProxy




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6988) Add Apache Kafka 0.11 connector

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121769#comment-16121769
 ] 

ASF GitHub Bot commented on FLINK-6988:
---

Github user rangadi commented on the issue:

https://github.com/apache/flink/pull/4239
  
I guess you could store the transactional.id for the _next_ transaction in 
committed state. That way, the new task starts the new transaction with the name 
stored in state, which automatically aborts the open transaction.
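
A hedged sketch of that recovery path, with assumed wiring around it; 
`initTransactions()` is the actual Kafka 0.11 producer API:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

class TransactionalIdRecoverySketch {
    // 'base' is assumed to carry bootstrap.servers and serializer settings.
    static KafkaProducer<byte[], byte[]> recover(String nextTransactionalId, Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("transactional.id", nextTransactionalId); // the id restored from state
        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);
        producer.initTransactions(); // fences/aborts any transaction left open for this id
        return producer;
    }
}
{code}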


> Add Apache Kafka 0.11 connector
> ---
>
> Key: FLINK-6988
> URL: https://issues.apache.org/jira/browse/FLINK-6988
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.3.1
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>
> Kafka 0.11 (it will be released very soon) adds support for transactions. 
> Thanks to that, Flink might be able to implement a Kafka sink supporting 
> "exactly-once" semantics. The API changes and the transaction support are 
> described in 
> [KIP-98|https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging].
> The goal is to mimic the implementation of the existing BucketingSink. The new 
> FlinkKafkaProducer011 would 
> * upon creation begin a transaction, store the transaction identifiers into 
> state, and write all incoming data to an output Kafka topic using that 
> transaction
> * on a `snapshotState` call, flush the data and record in state that the 
> current transaction is pending to be committed
> * on `notifyCheckpointComplete`, commit this pending transaction
> * in case of a crash between `snapshotState` and `notifyCheckpointComplete`, 
> either abort this pending transaction (if not every participant successfully 
> saved the snapshot) or restore and commit it. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4239: [FLINK-6988] flink-connector-kafka-0.11 with exactly-once...

2017-08-10 Thread rangadi
Github user rangadi commented on the issue:

https://github.com/apache/flink/pull/4239
  
I guess you could store the transactional.id for the _next_ transaction in 
committed state. That way, the new task starts the new transaction with the name 
stored in state, which automatically aborts the open transaction.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-7415) Add comments for cassandra test class `BatchExample`

2017-08-10 Thread Hai Zhou (JIRA)
Hai Zhou created FLINK-7415:
---

 Summary: Add comments for cassandra test class `BatchExample`
 Key: FLINK-7415
 URL: https://issues.apache.org/jira/browse/FLINK-7415
 Project: Flink
  Issue Type: Test
  Components: Cassandra Connector
Reporter: Hai Zhou


In 
`org.apache.flink.batch.connectors.cassandra.example.BatchExample` description:
{code:java}
 * The example assumes that a table exists in a local cassandra database, 
according to the following query:
 * CREATE TABLE test.batches (number int, strings text, PRIMARY KEY(number, 
strings));
 */
{code}

 The keyspace 'test' does not exist; we should create it first.

I will improve its description.
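
For reference, a hedged sketch of that setup step using the DataStax driver 
(contact point and replication settings are assumptions for a local test 
cluster):

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CreateTestKeyspace {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            session.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1};");
            session.execute("CREATE TABLE IF NOT EXISTS test.batches "
                + "(number int, strings text, PRIMARY KEY(number, strings));");
        }
    }
}
{code}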




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7398) Table API operators/UDFs must not store Logger

2017-08-10 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121753#comment-16121753
 ] 

Fabian Hueske commented on FLINK-7398:
--

+1 [~jark], if we decide to port the runtime code, we can do it one operator at 
a time and migrate step by step.

> Table API operators/UDFs must not store Logger
> --
>
> Key: FLINK-7398
> URL: https://issues.apache.org/jira/browse/FLINK-7398
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Aljoscha Krettek
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> Table API operators and UDFs store a reference to the (slf4j) {{Logger}} in 
> an instance field (c.f. 
> https://github.com/apache/flink/blob/f37988c19adc30d324cde83c54f2fa5d36efb9e7/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/FlatMapRunner.scala#L39).
>  This means that the {{Logger}} will be serialised with the UDF and sent to 
> the cluster. This in itself does not sound right and leads to problems when 
> the slf4j configuration on the Client is different from the cluster 
> environment.
> This is an example of a user running into that problem: 
> https://lists.apache.org/thread.html/01dd44007c0122d60c3fd2b2bb04fd2d6d2114bcff1e34d1d2079522@%3Cuser.flink.apache.org%3E.
>  Here, they have Logback on the client but the Logback classes are not 
> available on the cluster and so deserialisation of the UDFs fails with a 
> {{ClassNotFoundException}}.
> This is a rough list of the involved classes:
> {code}
> src/main/scala/org/apache/flink/table/catalog/ExternalCatalogSchema.scala:43:  private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/catalog/ExternalTableSourceUtil.scala:45:  private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/codegen/calls/BuiltInMethods.scala:28:  val LOG10 = Types.lookupMethod(classOf[Math], "log10", classOf[Double])
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupAggregate.scala:62:  private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupWindowAggregate.scala:59:  private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamOverAggregate.scala:51:  private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/FlinkConventions.scala:28:  val LOGICAL: Convention = new Convention.Impl("LOGICAL", classOf[FlinkLogicalRel])
> src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala:38:  val LOGICAL_OPT_RULES: RuleSet = RuleSets.ofList(
> src/main/scala/org/apache/flink/table/runtime/aggregate/AggregateAggFunction.scala:36:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetAggFunction.scala:43:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetFinalAggFunction.scala:44:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetPreAggFunction.scala:44:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggregatePreProcessor.scala:55:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggReduceGroupFunction.scala:66:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideWindowAggReduceGroupFunction.scala:56:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideTimeWindowAggReduceGroupFunction.scala:64:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleCountWindowAggReduceGroupFunction.scala:46:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetWindowAggMapFunction.scala:52:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleTimeWindowAggReduceGroupFunction.scala:55:  val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/GroupAggProcessFunction.scala:48:  val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/ProcTimeBoundedRowsOver.scala:66:  val LOG
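
For reference, the usual remedy (a sketch, not necessarily the fix this issue 
will choose; the class name is a hypothetical stand-in for the runners/UDFs 
listed above) is to keep the Logger out of the serialized object and re-create 
it after deserialization:

{code:scala}
import org.slf4j.{Logger, LoggerFactory}

class FlatMapRunnerLike extends Serializable {

  // @transient keeps the Logger out of the serialized UDF;
  // lazy re-creates it on first use after deserialization on the cluster
  @transient private lazy val LOG: Logger = LoggerFactory.getLogger(this.getClass)

  def flatMap(in: String): Unit = LOG.info("processing {}", in)
}
{code}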

[jira] [Commented] (FLINK-7410) Add getName method to UserDefinedFunction

2017-08-10 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121750#comment-16121750
 ] 

Fabian Hueske commented on FLINK-7410:
--

I think that would be a good feature!

> Add getName method to UserDefinedFunction
> -
>
> Key: FLINK-7410
> URL: https://issues.apache.org/jira/browse/FLINK-7410
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>
> *Motivation*
> Operator names set in the Table API are used for visualization and logging, so 
> it is important to make these names simple and readable. Currently, a 
> UserDefinedFunction's name contains the class's canonical name and an MD5 
> value, making the name too long and unfriendly to users. 
> As shown in the following example, 
> {quote}
> select: (a, b, c, 
> org$apache$flink$table$expressions$utils$RichFunc1$281f7e61ec5d8da894f5783e2e17a4f5(a)
>  AS _c3, 
> org$apache$flink$table$expressions$utils$RichFunc2$fb99077e565685ebc5f48b27edc14d98(c)
>  AS _c4)
> {quote}
> *Changes:*
>   
> Provide a getName method for UserDefinedFunction. The method will return the 
> class name by default. Users can also override the method to return whatever 
> they want (see the sketch below).
> What do you think [~fhueske] ?
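
A sketch of how the proposed method could look from the user's side (getName 
does not exist yet; the method body and the returned name are illustrative 
only):

{code:scala}
import org.apache.flink.table.functions.ScalarFunction

class RichFunc1 extends ScalarFunction {
  def eval(i: Int): Int = i + 1

  // proposed hook: the default implementation would return the simple
  // class name; users override it to get a short, readable operator name
  def getName: String = "add_one"
}
{code}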



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7062) Support the basic functionality of MATCH_RECOGNIZE

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121745#comment-16121745
 ] 

ASF GitHub Bot commented on FLINK-7062:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4502
  
I'd like to have a look at this PR as well. Thank you!


> Support the basic functionality of MATCH_RECOGNIZE
> --
>
> Key: FLINK-7062
> URL: https://issues.apache.org/jira/browse/FLINK-7062
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, Table API & SQL
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> In this JIRA, we will support the basic functionality of {{MATCH_RECOGNIZE}} 
> in the Flink SQL API, which includes support for the {{MEASURES}}, 
> {{PATTERN}} and {{DEFINE}} syntax. This would allow users to write basic CEP 
> use cases in SQL, like the following example:
> {code}
> SELECT T.aid, T.bid, T.cid
> FROM MyTable
> MATCH_RECOGNIZE (
>   MEASURES
> A.id AS aid,
> B.id AS bid,
> C.id AS cid
>   PATTERN (A B C)
>   DEFINE
> A AS A.name = 'a',
> B AS B.name = 'b',
> C AS C.name = 'c'
> ) AS T
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4502: [FLINK-7062] [table, cep] Support the basic functionality...

2017-08-10 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4502
  
I'd like to have a look at this PR as well. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6094) Implement stream-stream proctime non-window inner join

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121733#comment-16121733
 ] 

ASF GitHub Bot commented on FLINK-6094:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4471#discussion_r132303211
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/stream/table/validation/JoinValidationTest.scala
 ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.stream.table.validation
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.{TableEnvironment, TableException, ValidationException}
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.runtime.utils.StreamTestData
+import org.apache.flink.table.utils.TableTestBase
+import org.apache.flink.types.Row
+import org.junit.Test
+
+class JoinValidationTest extends TableTestBase {
+
+  private val util = streamTestUtil()
+  private val ds1 = util.addTable[(Int, Long, String)]("Table3",'a, 'b, 'c)
+  private val ds2 = util.addTable[(Int, Long, Int, String, Long)]("Table5", 'd, 'e, 'f, 'g, 'h)
+
+  @Test(expected = classOf[ValidationException])
+  def testJoinNonExistingKey(): Unit = {
+    ds1.join(ds2)
+      // must fail. Field 'foo does not exist
+      .where('foo === 'e)
+      .select('c, 'g)
+  }
+
+  @Test(expected = classOf[ValidationException])
+  def testJoinWithNonMatchingKeyTypes(): Unit = {
+    ds1.join(ds2)
+      // must fail. Field 'a is Int, and 'g is String
+      .where('a === 'g)
+      .select('c, 'g)
+  }
+
+
+  @Test(expected = classOf[ValidationException])
+  def testJoinWithAmbiguousFields(): Unit = {
+    ds1.join(ds2.select('d, 'e, 'f, 'g, 'h as 'c))
+      // must fail. Both inputs share the same field 'c
+      .where('a === 'd)
+      .select('c, 'g)
+  }
+
+  @Test(expected = classOf[TableException])
+  def testNoEqualityJoinPredicate1(): Unit = {
+    ds1.join(ds2)
+      // must fail. No equality join predicate
+      .where('d === 'f)
+      .select('c, 'g)
+      .toDataSet[Row]
--- End diff --

Replace the `toDataSet` call with `toRetractStream[Row]`.
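
In the test above, that would look like this sketch (reusing `ds1`/`ds2` from 
the surrounding test class):

{code:scala}
// inside testNoEqualityJoinPredicate1
ds1.join(ds2)
  // must fail. No equality join predicate
  .where('d === 'f)
  .select('c, 'g)
  .toRetractStream[Row]  // instead of toDataSet[Row]
{code}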


> Implement stream-stream proctime non-window inner join
> ---
>
> Key: FLINK-6094
> URL: https://issues.apache.org/jira/browse/FLINK-6094
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Hequn Cheng
>
> This includes:
> 1. Implement stream-stream proctime non-window inner join
> 2. Implement the retract process logic for the join



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6094) Implement stream-stream proctime non-window inner join

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121731#comment-16121731
 ] 

ASF GitHub Bot commented on FLINK-6094:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4471#discussion_r132284969
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeNonWindowInnerJoin.scala
 ---
@@ -0,0 +1,275 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.flink.table.runtime.join
+
+import org.apache.flink.api.common.functions.RichFlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.typeutils.{ResultTypeQueryable, 
TupleTypeInfo}
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.api.{StreamQueryConfig, Types}
+import org.apache.flink.table.runtime.CRowWrappingMultiOuputCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2}
+
+/**
+  * Connects the data of the left and the right stream. Only used for inner join.
+  *
+  * @param joiner      join function
+  * @param leftType    the input type of the left stream
+  * @param rightType   the input type of the right stream
+  * @param resultType  the output type of the join
+  * @param queryConfig the query configuration
+  */
+class ProcTimeNonWindowInnerJoin(
+    joiner: RichFlatJoinFunction[Row, Row, Row],
+    leftType: TypeInformation[Row],
+    rightType: TypeInformation[Row],
+    resultType: TypeInformation[CRow],
+    queryConfig: StreamQueryConfig) extends
+  CoProcessFunction[CRow, CRow, CRow] with ResultTypeQueryable[CRow] {
+
+
+  // state to hold the left stream elements
+  private var leftState: MapState[Row, JTuple2[Int, Long]] = null
+  // state to hold the right stream elements
+  private var rightState: MapState[Row, JTuple2[Int, Long]] = null
+  private var cRowWrapper: CRowWrappingMultiOuputCollector = null
+
+  private val minRetentionTime: Long = queryConfig.getMinIdleStateRetentionTime
+  private val maxRetentionTime: Long = queryConfig.getMaxIdleStateRetentionTime
+  private val stateCleaningEnabled: Boolean = minRetentionTime > 1
+
+  // state to record the last timer of the left stream, 0 means no timer
+  private var timerState1: ValueState[Long] = _
+  // state to record the last timer of the right stream, 0 means no timer
+  private var timerState2: ValueState[Long] = _
+
+
+  override def open(parameters: Configuration): Unit = {
+    // initialize left and right state
+    val tupleTypeInfo = new TupleTypeInfo[JTuple2[Int, Long]](Types.INT, Types.LONG)
--- End diff --

Add a comment about the structure of the state. What do we need the 
`Tuple2` for?
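
Something along these lines could answer that (a sketch; the meaning of the 
Tuple2 fields is an assumption on my side, not confirmed by the PR):

{code:scala}
import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2}
import org.apache.flink.api.java.typeutils.{RowTypeInfo, TupleTypeInfo}
import org.apache.flink.table.api.Types
import org.apache.flink.types.Row

val rowType = new RowTypeInfo(Types.INT, Types.STRING)  // placeholder row type

// per-side join state:
//   key = a buffered input row
//   f0  = how many identical rows are currently buffered (inputs may repeat)
//   f1  = the last update time, used for idle-state retention/cleanup
val tupleTypeInfo = new TupleTypeInfo[JTuple2[Int, Long]](Types.INT, Types.LONG)
val leftStateDescriptor =
  new MapStateDescriptor[Row, JTuple2[Int, Long]]("left", rowType, tupleTypeInfo)
{code}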


> Implement stream-stream proctime non-window inner join
> ---
>
> Key: FLINK-6094
> URL: https://issues.apache.org/jira/browse/FLINK-6094
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Hequn Cheng
>
> This includes:
> 1. Implement stream-stream proctime non-window inner join
> 2. Implement the retract process logic for the join



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

