[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-13 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375296#comment-15375296
 ] 

Till Rohrmann commented on FLINK-4152:
--

[~mxm] The restarted registration attempts are an observable symptom of a 
different problem.

The actual problem is that the {{YarnFlinkResourceManager}} forgets about the 
registered task managers if the job manager loses its leadership. Each task 
manager has a resource ID with which it registers at the resource manager. The 
{{YarnFlinkResourceManager}} has two states for allocated resources: 
{{containersInLaunch}} and {{registeredWorkers}}. A container can only go from 
{{containersInLaunch}} to {{registeredWorkers}}. This also works for the 
initial registration. However, when the job manager loses its leadership and 
the {{registeredWorkers}} list is cleared, there is no longer a container in 
launch associated with the respective resource ID. Consequently, when the old 
task manager is re-registered by the new leader, the registration is rejected.

This rejection is then sent to the task manager. Upon receiving a rejection, 
the task manager schedules another registration attempt after waiting for some 
time. The problem here is that the old registration attempts are not 
cancelled. Consequently, multiple registration attempts take place 
concurrently. That is why you observe so many registration attempt messages in 
the log.

I think the symptom can be fixed by cancelling all currently active 
registration attempts before restarting the registration.
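
For illustration, a minimal sketch of this fix (hypothetical names, not the 
actual TaskManager code): keep a handle to the pending attempt and cancel it 
before scheduling a new one.

{code}
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class RegistrationRetrier {

    private final ScheduledExecutorService scheduler;
    private Future<?> pendingAttempt; // the currently scheduled attempt, if any

    RegistrationRetrier(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // Cancel whatever attempt is still pending before scheduling a new one,
    // so that at most one registration attempt is in flight at any time.
    synchronized void restartRegistration(Runnable attempt, long delayMillis) {
        if (pendingAttempt != null) {
            pendingAttempt.cancel(false); // don't interrupt a running attempt
        }
        pendingAttempt = scheduler.schedule(attempt, delayMillis, TimeUnit.MILLISECONDS);
    }
}
{code}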

It is a bit unclear to me what the expected behaviour of the 
{{YarnFlinkResourceManager}} should be. In the {{jobManagerLostLeadership}} 
method, where the {{registeredWorkers}} list is cleared, a comment says "all 
currently registered TaskManagers are put under 'awaiting registration'". But 
there is no such state. Furthermore, I'm not sure whether registered 
TaskManagers have to re-register at all if only the job manager has failed.

Thus, I see two solutions: either we do not clear {{registeredWorkers}}, or we 
introduce a new state "awaiting registration" which keeps all formerly 
registered task managers so that they can be re-registered.
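
A minimal sketch of the second option, assuming a simple map-based worker 
store (names invented for illustration, not the actual 
{{YarnFlinkResourceManager}} code):

{code}
import java.util.HashMap;
import java.util.Map;

class WorkerStore<ID, W> {

    private final Map<ID, W> registeredWorkers = new HashMap<>();
    private final Map<ID, W> awaitingRegistration = new HashMap<>();

    // On leadership loss, park the registered workers instead of forgetting them.
    void jobManagerLostLeadership() {
        awaitingRegistration.putAll(registeredWorkers);
        registeredWorkers.clear();
    }

    // A formerly registered worker may simply re-register under the new leader.
    boolean tryReRegister(ID resourceId) {
        W worker = awaitingRegistration.remove(resourceId);
        if (worker == null) {
            return false; // unknown: fall back to the containersInLaunch path
        }
        registeredWorkers.put(resourceId, worker);
        return true;
    }
}
{code}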

Maybe [~mxm] can give some input.

> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think 
> that this is the expected behavior.





[jira] [Comment Edited] (FLINK-4152) TaskManager registration exponential backoff doesn't work

2016-07-13 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375296#comment-15375296
 ] 

Till Rohrmann edited comment on FLINK-4152 at 7/13/16 4:24 PM:
---

The restarted registration attempts are an observable symptom of a different 
problem.

The actual problem is that the {{YarnFlinkResourceManager}} forgets about the 
registered task managers if the job manager loses its leadership. Each task 
manager has a resource ID with which it registers at the resource manager. The 
{{YarnFlinkResourceManager}} has two states for allocated resources: 
{{containersInLaunch}} and {{registeredWorkers}}. A container can only go from 
{{containersInLaunch}} to {{registeredWorkers}}. This also works for the 
initial registration. However, when the job manager loses its leadership and 
the {{registeredWorkers}} list is cleared, there is no longer a container in 
launch associated with the respective resource ID. Consequently, when the old 
task manager is re-registered by the new leader, the registration is rejected.

This rejection is then sent to the task manager. Upon receiving a rejection, 
the task manager schedules another registration attempt after waiting for some 
time. The problem here is that the old registration attempts are not 
cancelled. Consequently, multiple registration attempts take place 
concurrently. That is why you observe so many registration attempt messages in 
the log.

I think the symptom can be fixed by cancelling all currently active 
registration attempts before restarting the registration.

It is a bit unclear to me what the expected behaviour of the 
{{YarnFlinkResourceManager}} should be. In the {{jobManagerLostLeadership}} 
method, where the {{registeredWorkers}} list is cleared, a comment says "all 
currently registered TaskManagers are put under 'awaiting registration'". But 
there is no such state. Furthermore, I'm not sure whether registered 
TaskManagers have to re-register at all if only the job manager has failed.

Thus, I see two solutions: either we do not clear {{registeredWorkers}}, or we 
introduce a new state "awaiting registration" which keeps all formerly 
registered task managers so that they can be re-registered.

Maybe [~mxm] can give some input.


was (Author: till.rohrmann):
[~mxm] The restarted registration attempts are an observable symptom of a 
different problem.

The actual problem is that the {{YarnFlinkResourceManager}} forgets about the 
registered task managers if the job manager loses its leadership. Each task 
manager has a resource ID with which it registers at the resource manager. The 
{{YarnFlinkResourceManager}} has two states for allocated resources: 
{{containersInLaunch}} and {{registeredWorkers}}. A container can only go from 
{{containersInLaunch}} to {{registeredWorkers}}. This also works for the 
initial registration. However, when the job manager loses its leadership and 
the {{registeredWorkers}} list is cleared, there is no longer a container in 
launch associated with the respective resource ID. Consequently, when the old 
task manager is re-registered by the new leader, the registration is rejected.

This rejection is then sent to the task manager. Upon receiving a rejection, 
the task manager schedules another registration attempt after waiting for some 
time. The problem here is that the old registration attempts are not 
cancelled. Consequently, multiple registration attempts take place 
concurrently. That is why you observe so many registration attempt messages in 
the log.

I think the symptom can be fixed by cancelling all currently active 
registration attempts before restarting the registration.

It is a bit unclear to me what the expected behaviour of the 
{{YarnFlinkResourceManager}} should be. In the {{jobManagerLostLeadership}} 
method, where the {{registeredWorkers}} list is cleared, a comment says "all 
currently registered TaskManagers are put under 'awaiting registration'". But 
there is no such state. Furthermore, I'm not sure whether registered 
TaskManagers have to re-register at all if only the job manager has failed.

Thus, I see two solutions: either we do not clear {{registeredWorkers}}, or we 
introduce a new state "awaiting registration" which keeps all formerly 
registered task managers so that they can be re-registered.

Maybe [~mxm] can give some input.

> TaskManager registration exponential backoff doesn't work
> -
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, TaskManager, YARN Client
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
> 

[jira] [Commented] (FLINK-4197) Allow Kinesis Endpoint to be Overridden via Config

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375360#comment-15375360
 ] 

ASF GitHub Bot commented on FLINK-4197:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2227


> Allow Kinesis Endpoint to be Overridden via Config
> --
>
> Key: FLINK-4197
> URL: https://issues.apache.org/jira/browse/FLINK-4197
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.0.3
>Reporter: Scott Kidder
>Priority: Minor
>  Labels: easyfix
> Fix For: 1.0.4
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I perform local testing of my application stack with Flink configured as a 
> consumer on a Kinesis stream provided by Kinesalite, an implementation of 
> Kinesis built on LevelDB. This requires me to override the AWS endpoint to 
> refer to my local Kinesalite server rather than reference the real AWS 
> endpoint. I'd like to add a configuration property to the Kinesis streaming 
> connector that allows the AWS endpoint to be specified explicitly.
> This should be a fairly small change and provide a lot of flexibility to 
> people looking to integrate Flink with Kinesis in a non-production setup.
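
For illustration, a sketch of what such an override could look like from the 
user's side (the property keys are invented here; the final names are defined 
by the pull request):

{code}
import java.util.Properties;

public class KinesaliteEndpointExample {
    public static void main(String[] args) {
        Properties config = new Properties();
        config.put("aws.region", "us-east-1");
        // Hypothetical key: point the connector at a local Kinesalite
        // server instead of the real AWS endpoint.
        config.put("aws.endpoint", "http://localhost:4567");
        // The Properties would then be passed to the FlinkKinesisConsumer.
    }
}
{code}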





[GitHub] flink pull request #2227: [FLINK-4197] Allow Kinesis endpoint to be overridd...

2016-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2227




[GitHub] flink pull request #2234: [hotfix][kinesis-connector] Remove duplicate info ...

2016-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2234




[jira] [Commented] (FLINK-4206) Metric names should allow special characters

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375377#comment-15375377
 ] 

ASF GitHub Bot commented on FLINK-4206:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2237
  
I stumbled across this and I found the limitation quite annoying, so if 
there are no good reasons for the check, I agree to remove it.
Since @StephanEwen wrote this, it might be good if he could confirm.


> Metric names should allow special characters
> 
>
> Key: FLINK-4206
> URL: https://issues.apache.org/jira/browse/FLINK-4206
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.1.0
>
>
> Currently, the name of the metric is restricted to alphanumeric characters. 
> This restriction was originally put in place to circumvent issues due to 
> systems not supporting certain characters.
> However, this restriction does not make a lot of sense since for group names 
> we don't enforce such a restriction.
> This also affects the integration of the Kafka metrics, so I suggest removing 
> the restriction.
> From now on it will be the responsibility of the reporter to make sure that 
> the metric identifier is supported by the external system.
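
As an illustration of that responsibility, a reporter could sanitize 
identifiers itself (a hypothetical sketch, not existing reporter code; the 
allowed character set is an assumption and depends on the external system):

{code}
public final class MetricNameSanitizer {

    // Replace every character the external system cannot store with '_'.
    public static String sanitize(String metricIdentifier) {
        return metricIdentifier.replaceAll("[^A-Za-z0-9_.-]", "_");
    }
}
{code}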





[jira] [Resolved] (FLINK-4197) Allow Kinesis Endpoint to be Overridden via Config

2016-07-13 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4197.
---
   Resolution: Fixed
 Assignee: Scott Kidder
Fix Version/s: (was: 1.0.4)
   1.1.0

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/bc3a96f5

Thank you for the contribution [~skidder].

> Allow Kinesis Endpoint to be Overridden via Config
> --
>
> Key: FLINK-4197
> URL: https://issues.apache.org/jira/browse/FLINK-4197
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.0.3
>Reporter: Scott Kidder
>Assignee: Scott Kidder
>Priority: Minor
>  Labels: easyfix
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I perform local testing of my application stack with Flink configured as a 
> consumer on a Kinesis stream provided by Kinesalite, an implementation of 
> Kinesis built on LevelDB. This requires me to override the AWS endpoint to 
> refer to my local Kinesalite server rather than reference the real AWS 
> endpoint. I'd like to add a configuration property to the Kinesis streaming 
> connector that allows the AWS endpoint to be specified explicitly.
> This should be a fairly small change and provide a lot of flexibility to 
> people looking to integrate Flink with Kinesis in a non-production setup.





[jira] [Commented] (FLINK-3466) Job might get stuck in restoreState() from HDFS due to interrupt

2016-07-13 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375397#comment-15375397
 ] 

Stephan Ewen commented on FLINK-3466:
-

Here is a unit test that minimally reproduces getting stuck in 
interrupt-sensitive state handles (like those reading from HDFS):

{code}
public class InterruptSensitiveRestoreTest {

    private static final OneShotLatch IN_RESTORE_LATCH = new OneShotLatch();

    @Test
    public void testRestoreWithInterrupt() throws Exception {

        Configuration taskConfig = new Configuration();
        StreamConfig cfg = new StreamConfig(taskConfig);
        cfg.setTimeCharacteristic(TimeCharacteristic.ProcessingTime);

        TaskDeploymentDescriptor tdd =
                createTaskDeploymentDescriptor(taskConfig, new InterruptLockingStateHandle());
        Task task = createTask(tdd);

        // start the task and wait until it is in "restore"
        task.startTaskThread();
        IN_RESTORE_LATCH.await();

        // trigger cancellation and signal to continue
        task.cancelExecution();

        task.getExecutingThread().join(3);

        if (task.getExecutionState() == ExecutionState.CANCELING) {
            fail("Task is stuck and not canceling");
        }

        assertEquals(ExecutionState.CANCELED, task.getExecutionState());
        assertNull(task.getFailureCause());
    }

    // ------------------------------------------------------------------------
    //  Utilities
    // ------------------------------------------------------------------------

    private static TaskDeploymentDescriptor createTaskDeploymentDescriptor(
            Configuration taskConfig,
            StateHandle state) throws IOException {

        return new TaskDeploymentDescriptor(
                new JobID(),
                "test job name",
                new JobVertexID(),
                new ExecutionAttemptID(),
                new SerializedValue<>(new ExecutionConfig()),
                "test task name",
                0, 1, 0,
                new Configuration(),
                taskConfig,
                SourceStreamTask.class.getName(),
                Collections.emptyList(),
                Collections.emptyList(),
                Collections.emptyList(),
                Collections.emptyList(),
                0,
                new SerializedValue(state));
    }

    private static Task createTask(TaskDeploymentDescriptor tdd) throws IOException {
        return new Task(
                tdd,
                mock(MemoryManager.class),
                mock(IOManager.class),
                mock(NetworkEnvironment.class),
                mock(BroadcastVariableManager.class),
                mock(ActorGateway.class),
                mock(ActorGateway.class),
                new FiniteDuration(10, TimeUnit.SECONDS),
                new FallbackLibraryCacheManager(),
                new FileCache(new Configuration()),
                new TaskManagerRuntimeInfo(
                        "localhost", new Configuration(),
                        EnvironmentInformation.getTemporaryFileDirectory()),
                mock(TaskMetricGroup.class));
    }

    @SuppressWarnings("serial")
    private static class InterruptLockingStateHandle extends StreamTaskStateList {

        public InterruptLockingStateHandle() throws Exception {
            super(new StreamTaskState[0]);
        }

        @Override
        public StreamTaskState[] getState(ClassLoader userCodeClassLoader) {
            IN_RESTORE_LATCH.trigger();

            // this mimics what happens in the HDFS client code:
            // an interrupt on a waiting object leads to an infinite loop
            try {
                synchronized (this) {
                    wait();
                }
            }
            catch (InterruptedException e) {
                while (true) {
                    try {
                        synchronized (this) {

[GitHub] flink issue #2234: [hotfix][kinesis-connector] Remove duplicate info in Kine...

2016-07-13 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2234
  
Merging ...




[GitHub] flink issue #2239: [FLINK-4208] Support Running Flink processes in foregroun...

2016-07-13 Thread iemejia
Github user iemejia commented on the issue:

https://github.com/apache/flink/pull/2239
  
Hi, yes, this is exactly the situation. In a previous pull request I was 
optimizing the Flink Docker image; however, I found that the image used 
supervisord to catch and keep those daemons alive, so I wanted to remove this 
dependency (because it adds around 40MB to the image, plus Python and some 
extra stuff).

Can you give me some hints on the best way to address this? Or how can I 
improve my current approach? (Notice that I took the start-foreground idea 
from ZooKeeper.)




[jira] [Created] (FLINK-4212) Lock on pid file when starting daemons

2016-07-13 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4212:
-

 Summary: Lock on pid file when starting daemons
 Key: FLINK-4212
 URL: https://issues.apache.org/jira/browse/FLINK-4212
 Project: Flink
  Issue Type: Improvement
  Components: Startup Shell Scripts
Affects Versions: 1.1.0
Reporter: Greg Hogan
Assignee: Greg Hogan


As noted on the mailing list (0), when multiple TaskManagers are started in 
parallel (using pdsh) there is a race condition on updating the pid file: 1) the pid 
file is first read to parse the process' index, 2) the process is started, and 
3) on success the daemon pid is appended to the pid file.

We could use a tool such as {{flock}} to lock on the pid file while starting 
the Flink daemon.

0: 
http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E





[GitHub] flink issue #2237: [FLINK-4206][metrics] Remove alphanumeric name restrictio...

2016-07-13 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2237
  
I stumbled across this and I found the limitation quite annoying, so if 
there are no good reasons for the check, I agree to remove it.
Since @StephanEwen wrote this, it might be good if he could confirm.




[GitHub] flink issue #2228: [FLINK-4170][kinesis-connector] Simplify Kinesis connecte...

2016-07-13 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2228
  
Rebased again on the new endpoint config merged in 
https://github.com/apache/flink/pull/2227.
This PR should be ready for a final review now.




[GitHub] flink pull request #2243: [FLINK-4196] [runtime] Remove recovery timestamp f...

2016-07-13 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/2243

[FLINK-4196] [runtime] Remove recovery timestamp from checkpoint restores

The 'recoveryTimestamp' was an unsafe wall clock timestamp attached by the 
master upon recovery. Because this timestamp cannot be relied upon in 
distributed setups, it is removed here.

If we need something like this in the future, we should try to get a global 
progress counter or logical timestamp instead.

No code in the core Flink repository is affected by this change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
remove_recovery_timestamp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2243


commit ae65ef4b8ce35aadd089be46b52e7fddb5a3ef85
Author: Stephan Ewen 
Date:   2016-07-05T08:18:38Z

[hotfix] [kafka connector] Minor code cleanups in the Kafka Producer

commit c738fcd9f6031becc405c17cc479b9c2340c2414
Author: Stephan Ewen 
Date:   2016-07-11T18:36:44Z

[hotfix] [runtime] Minor code cleanups.

commit 116321241923194e9fa6db556681b333197fceed
Author: Stephan Ewen 
Date:   2016-07-13T15:31:35Z

[FLINK-4196] [runtime] Remove the 'recoveryTimestamp' from checkpoint 
restores.

The 'recoveryTimestamp' was an unsafe wall clock timestamp attached by the 
master upon recovery. Since this timestamp cannot be relied upon in 
distributed setups, it is removed.






[jira] [Commented] (FLINK-4196) Remove "recoveryTimestamp"

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375630#comment-15375630
 ] 

ASF GitHub Bot commented on FLINK-4196:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/2243

[FLINK-4196] [runtime] Remove recovery timestamp from checkpoint restores

The 'recoveryTimestamp' was an unsafe wall clock timestamp attached by the 
master upon recovery. Because this timestamp cannot be relied upon in 
distributed setups, it is removed here.

If we need something like this in the future, we should try to get a global 
progress counter or logical timestamp instead.

No code in the core Flink repository is affected by this change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
remove_recovery_timestamp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2243


commit ae65ef4b8ce35aadd089be46b52e7fddb5a3ef85
Author: Stephan Ewen 
Date:   2016-07-05T08:18:38Z

[hotfix] [kafka connector] Minor code cleanups in the Kafka Producer

commit c738fcd9f6031becc405c17cc479b9c2340c2414
Author: Stephan Ewen 
Date:   2016-07-11T18:36:44Z

[hotfix] [runtime] Minor code cleanups.

commit 116321241923194e9fa6db556681b333197fceed
Author: Stephan Ewen 
Date:   2016-07-13T15:31:35Z

[FLINK-4196] [runtime] Remove the 'recoveryTimestamp' from checkpoint 
restores.

The 'recoveryTimestamp' was an unsafe wall clock timestamp attached by the 
master upon recovery. Since this timestamp cannot be relied upon in 
distributed setups, it is removed.




> Remove "recoveryTimestamp"
> --
>
> Key: FLINK-4196
> URL: https://issues.apache.org/jira/browse/FLINK-4196
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.0.3
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>
> I think we should remove the {{recoveryTimestamp}} that is attached on state 
> restore calls.
> Given that this is a wall clock timestamp from a master node, which may 
> change when clocks are adjusted, and between different master nodes during 
> leader change, this is an unsafe concept.





[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375530#comment-15375530
 ] 

ASF GitHub Bot commented on FLINK-2125:
---

Github user delding commented on the issue:

https://github.com/apache/flink/pull/2233
  
Hi @zentol , thanks for your comments. I have updated this PR. Please let 
me know if there are further improvements that need to be done.


> String delimiter for SocketTextStream
> -
>
> Key: FLINK-2125
> URL: https://issues.apache.org/jira/browse/FLINK-2125
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Márton Balassi
>Priority: Minor
>  Labels: starter
>
> The SocketTextStreamFunction uses a character delimiter, despite other parts 
> of the API using String delimiter.





[GitHub] flink issue #2233: [FLINK-2125][streaming] Delimiter change from char to str...

2016-07-13 Thread delding
Github user delding commented on the issue:

https://github.com/apache/flink/pull/2233
  
Hi @zentol , thanks for your comments. I have updated this PR. Please let 
me know if there are further improvements that need to be done.




[GitHub] flink pull request #2242: ExceptionHandler keep count of exceptions

2016-07-13 Thread sumitchawla
GitHub user sumitchawla opened a pull request:

https://github.com/apache/flink/pull/2242

ExceptionHandler keep count of exceptions

This is just a bug identified while going through the exception handler. There 
is no JIRA ticket for this.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sumitchawla/flink master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2242.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2242


commit 35a75cd672db8c40d591d2121f246fc1e642ae7c
Author: Sumit Chawla 
Date:   2016-07-13T19:16:01Z

increment the exceptions counter

commit 2afb0b32f750c5d77ed7da8f86e7ec78da27f3d5
Author: Sumit Chawla 
Date:   2016-07-13T19:18:12Z

Merge pull request #1 from sumitchawla/sumitchawla-exceptions-handler-fix

increment the exceptions counter






[jira] [Commented] (FLINK-4170) Remove `CONFIG_` prefix from KinesisConfigConstants variables

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375464#comment-15375464
 ] 

ASF GitHub Bot commented on FLINK-4170:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2228
  
Rebased again on the new endpoint config merged in 
https://github.com/apache/flink/pull/2227.
This PR should be ready for a final review now.


> Remove `CONFIG_` prefix from KinesisConfigConstants variables
> -
>
> Key: FLINK-4170
> URL: https://issues.apache.org/jira/browse/FLINK-4170
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Reporter: Ufuk Celebi
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.1.0
>
>
> I find the static variable names verbose. I think it's clear from context 
> that they refer to the Kinesis configuration since they are all gathered in 
> that class.
> Therefore I would like to remove the {{CONFIG_}} prefix before the release, so 
> that we have
> {code}
> conf.put(KinesisConfigConstants.AWS_REGION, "")
> {code}
> instead of 
> {code}
> conf.put(KinesisConfigConstants.CONFIG_AWS_REGION, "")
> {code}
> For longer variables it becomes even longer otherwise.
> ---
> Some basic variable names that might be accessed frequently are also very 
> long:
> {code}
> CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY
> CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID
> {code}
> It might suffice to just have:
> {code}
> AWS_SECRET_KEY
> AWS_ACCESS_KEY
> {code}





[jira] [Commented] (FLINK-4206) Metric names should allow special characters

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375505#comment-15375505
 ] 

ASF GitHub Bot commented on FLINK-4206:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2237
  
Actually, I just wrote a more efficient version of a check that was there 
before.


> Metric names should allow special characters
> 
>
> Key: FLINK-4206
> URL: https://issues.apache.org/jira/browse/FLINK-4206
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.1.0
>
>
> Currently, the name of the metric is restricted to alphanumeric characters. 
> This restriction was originally put in place to circumvent issues due to 
> systems not supporting certain characters.
> However, this restriction does not make a lot of sense since for group names 
> we don't enforce such a restriction.
> This also affects the integration of the Kafka metrics, so I suggest removing 
> the restriction.
> From now on it will be the responsibility of the reporter to make sure that 
> the metric identifier is supported by the external system.





[GitHub] flink issue #2237: [FLINK-4206][metrics] Remove alphanumeric name restrictio...

2016-07-13 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2237
  
Actually, I just wrote a more efficient version of a check that was there 
before.




[jira] [Commented] (FLINK-3477) Add hash-based combine strategy for ReduceFunction

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376126#comment-15376126
 ] 

ASF GitHub Bot commented on FLINK-3477:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1517


> Add hash-based combine strategy for ReduceFunction
> --
>
> Key: FLINK-3477
> URL: https://issues.apache.org/jira/browse/FLINK-3477
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Assignee: Gabor Gevay
>
> This issue is about adding a hash-based combine strategy for ReduceFunctions.
> The interface of the {{reduce()}} method is as follows:
> {code}
> public T reduce(T v1, T v2)
> {code}
> Input type and output type are identical and the function returns only a 
> single value. A Reduce function is incrementally applied to compute a final 
> aggregated value. This allows holding the pre-aggregated value in a hash table 
> and updating it with each function call. 
> The hash-based strategy requires a special implementation of an in-memory hash 
> table. The hash table should support in-place updates of elements (if the 
> updated value has the same size as the old value) but also appending updates 
> with invalidation of the old value (if the binary length of the new value 
> differs). The hash table needs to be able to evict and emit all elements if 
> it runs out of memory.
> We should also add {{HASH}} and {{SORT}} compiler hints to 
> {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the 
> execution strategy.
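
For illustration, the core of the hash-combine idea in plain Java (a sketch 
only; the actual implementation must work on Flink's managed memory and evict 
elements under memory pressure):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class HashCombineSketch {

    // Keep one pre-aggregated value per key and update it with each
    // application of the reduce function.
    public static <K, T> Map<K, T> combine(
            Iterable<T> input, Function<T, K> keyOf, BinaryOperator<T> reduce) {
        Map<K, T> table = new HashMap<>();
        for (T value : input) {
            table.merge(keyOf.apply(value), value, reduce);
        }
        // A real implementation must evict and emit elements when it runs
        // out of memory instead of growing the table without bound.
        return table;
    }
}
{code}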





[GitHub] flink pull request #1517: [FLINK-3477] [runtime] Add hash-based combine stra...

2016-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1517




[jira] [Closed] (FLINK-3477) Add hash-based combine strategy for ReduceFunction

2016-07-13 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-3477.
-
Resolution: Implemented

Implemented in 52e191a5067322e82192314c16e70ae9e937ae2c

> Add hash-based combine strategy for ReduceFunction
> --
>
> Key: FLINK-3477
> URL: https://issues.apache.org/jira/browse/FLINK-3477
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Assignee: Gabor Gevay
>
> This issue is about adding a hash-based combine strategy for ReduceFunctions.
> The interface of the {{reduce()}} method is as follows:
> {code}
> public T reduce(T v1, T v2)
> {code}
> Input type and output type are identical and the function returns only a 
> single value. A Reduce function is incrementally applied to compute a final 
> aggregated value. This allows holding the pre-aggregated value in a hash table 
> and updating it with each function call. 
> The hash-based strategy requires a special implementation of an in-memory hash 
> table. The hash table should support in-place updates of elements (if the 
> updated value has the same size as the old value) but also appending updates 
> with invalidation of the old value (if the binary length of the new value 
> differs). The hash table needs to be able to evict and emit all elements if 
> it runs out of memory.
> We should also add {{HASH}} and {{SORT}} compiler hints to 
> {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the 
> execution strategy.





[jira] [Closed] (FLINK-2246) Add chained combine driver strategy for ReduceFunction

2016-07-13 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-2246.
-
Resolution: Implemented

Implemented in 0db804b936efd8631f1a08db37753dad7f1f71ea

> Add chained combine driver strategy for ReduceFunction
> --
>
> Key: FLINK-2246
> URL: https://issues.apache.org/jira/browse/FLINK-2246
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 0.9, 0.10.0
>Reporter: Ufuk Celebi
>Assignee: Gabor Gevay
>Priority: Minor
>
> Running the WordCount example with text file input/output and a manual reduce 
> function (instead of the sum(1)) results in a combiner which is not chained.
> Replace sum(1) with the following to reproduce and use a text file as input:
> {code}
> fileOutput = true;
> textPath = "...";
> outputPath = "...";
> {code}
> {code}
> .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
>     @Override
>     public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, 
>             Tuple2<String, Integer> value2) throws Exception {
>         return new Tuple2<String, Integer>(value1.f0, value1.f1 + value2.f1);
>     }
> });
> {code}





[jira] [Created] (FLINK-4213) Provide CombineHint in Gelly algorithms

2016-07-13 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4213:
-

 Summary: Provide CombineHint in Gelly algorithms
 Key: FLINK-4213
 URL: https://issues.apache.org/jira/browse/FLINK-4213
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 1.1.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Many graph algorithms will see better {{reduce}} performance with the 
hash-combine compared with the still-default sort-combine, e.g. HITS and 
LocalClusteringCoefficient.
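
For illustration, a sketch of supplying the hint on a grouped reduce, based on 
the API added by FLINK-3477 (the exact location of the {{CombineHint}} enum is 
an assumption here):

{code}
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.operators.base.ReduceOperatorBase.CombineHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class CombineHintExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Double>> scores = env.fromElements(
                new Tuple2<>(1L, 0.5), new Tuple2<>(1L, 0.25), new Tuple2<>(2L, 1.0));
        scores.groupBy(0)
                .reduce(new ReduceFunction<Tuple2<Long, Double>>() {
                    @Override
                    public Tuple2<Long, Double> reduce(
                            Tuple2<Long, Double> a, Tuple2<Long, Double> b) {
                        return new Tuple2<>(a.f0, a.f1 + b.f1);
                    }
                })
                .setCombineHint(CombineHint.HASH) // prefer hash- over sort-combine
                .print();
    }
}
{code}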





[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-07-13 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374568#comment-15374568
 ] 

Aljoscha Krettek commented on FLINK-4190:
-

What you could also do is create a new package and put the new sink and related 
classes in there. This way you wouldn't have to rename {{Bucketed}} to 
{{BucketFunction}} and stuff would be nicely "isolated".
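
For illustration, the generalized bucket selection discussed here could look 
roughly like this (hypothetical interface and names; the PR may choose 
different ones):

{code}
import java.io.Serializable;
import org.apache.hadoop.fs.Path;

// The bucket is derived from the element itself rather than from the
// current system time.
public interface BucketFunction<T> extends Serializable {
    Path getBucketPath(Path basePath, T element);
}
{code}

A user could then bucket, e.g., by an attribute of the record instead of by 
wall clock time.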

> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.





[jira] [Created] (FLINK-4205) Implement stratified sampling for DataSet

2016-07-13 Thread Do Le Quoc (JIRA)
Do Le Quoc created FLINK-4205:
-

 Summary: Implement stratified sampling for DataSet
 Key: FLINK-4205
 URL: https://issues.apache.org/jira/browse/FLINK-4205
 Project: Flink
  Issue Type: New Feature
Reporter: Do Le Quoc


A DataSet might consist of data from disparate sources, so every data source 
should be considered fairly in order to obtain a representative sample. For 
this, stratified sampling is needed to ensure that data from every source 
(stratum) is selected and none of the minorities are excluded.
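
For illustration, a minimal per-stratum Bernoulli sampler in plain Java (a 
sketch only, not a proposed DataSet API; all names are invented):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

public class StratifiedSampler {

    // Sample each stratum with its own fraction so that small sources are
    // not drowned out by large ones.
    public static <T> List<T> sample(
            Iterable<T> data,
            Function<T, String> stratumOf,
            Map<String, Double> fractionPerStratum,
            long seed) {
        Random rnd = new Random(seed);
        List<T> sample = new ArrayList<>();
        for (T element : data) {
            double fraction =
                    fractionPerStratum.getOrDefault(stratumOf.apply(element), 0.0);
            if (rnd.nextDouble() < fraction) {
                sample.add(element);
            }
        }
        return sample;
    }
}
{code}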





[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374538#comment-15374538
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/421
  
@sumitchawla: These charts have been removed from the TaskManager view due 
to licensing issues.


> Integrate metrics library and report basic metrics to JobManager web interface
> --
>
> Key: FLINK-1501
> URL: https://issues.apache.org/jira/browse/FLINK-1501
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 0.9
>
>
> As per mailing list, the library: https://github.com/dropwizard/metrics
> The goal of this task is to get the basic infrastructure in place.
> Subsequent issues will integrate more features into the system.





[GitHub] flink issue #421: [FLINK-1501] Add metrics library for monitoring TaskManage...

2016-07-13 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/421
  
@sumitchawla: These charts have been removed from the TaskManager view due 
to licensing issues.




[GitHub] flink pull request #2234: [hotfix][kinesis-connector] Remove duplicate info ...

2016-07-13 Thread tzulitai
GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/2234

[hotfix][kinesis-connector] Remove duplicate info in 
KinesisDeserializationSchema

The duplicate was added in https://github.com/apache/flink/pull/2225.
`byte[] recordKey` and `String partitionKey` are actually the same, only in 
different form, so we now have duplicate info in the deserialization schema. I 
think we can keep the `String` one, since it's the type we get from the AWS 
API for the key.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink hf-kinesis-deser

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2234.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2234


commit f23a699f990ee5f86a12b2b5249a055c27ff1caf
Author: Gordon Tai 
Date:   2016-07-13T07:36:56Z

[hotfix][kinesis-connector] Remove duplicate info in 
KinesisDeserializationSchema






[jira] [Commented] (FLINK-4190) Generalise RollingSink to work with arbitrary buckets

2016-07-13 Thread Josh Forman-Gornall (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374589#comment-15374589
 ] 

Josh Forman-Gornall commented on FLINK-4190:


OK, I'll do that. Can you see any issues with the code itself? The main thing I 
wasn't sure about was whether the inactive-buckets check should occur in a 
separate thread spawned by a timer/scheduled executor, or whether it should 
occur in the {{invoke}} method when new elements arrive. I decided on the 
latter since I thought it might be better to run it on the main operator 
thread, but it has the disadvantage that if no new elements arrive, inactive 
buckets are never closed (although I guess this is unlikely to happen in 
practice).
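
For illustration, a sketch of the "check on invoke" variant (a hypothetical 
helper, not the actual sink code): each incoming element first triggers a scan 
that closes buckets which have been inactive for too long.

{code}
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;

public class BucketTracker<B> {

    private final Map<B, Long> lastWrite = new HashMap<>();
    private final long inactivityThresholdMillis;

    public BucketTracker(long inactivityThresholdMillis) {
        this.inactivityThresholdMillis = inactivityThresholdMillis;
    }

    public void onInvoke(B bucket, Consumer<B> closeBucket) {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<B, Long>> it = lastWrite.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<B, Long> entry = it.next();
            if (now - entry.getValue() > inactivityThresholdMillis) {
                closeBucket.accept(entry.getKey()); // flush and close the file
                it.remove();
            }
        }
        lastWrite.put(bucket, now); // record the write to the current bucket
    }
}
{code}

As noted above, this only runs when elements arrive, so an idle sink never 
closes its remaining buckets.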

> Generalise RollingSink to work with arbitrary buckets
> -
>
> Key: FLINK-4190
> URL: https://issues.apache.org/jira/browse/FLINK-4190
> Project: Flink
>  Issue Type: Improvement
>  Components: filesystem-connector, Streaming Connectors
>Reporter: Josh Forman-Gornall
>Assignee: Josh Forman-Gornall
>Priority: Minor
>
> The current RollingSink implementation appears to be intended for writing to 
> directories that are bucketed by system time (e.g. minutely) and to only be 
> writing to one file within one bucket at any point in time. When the system 
> time determines that the current bucket should be changed, the current bucket 
> and file are closed and a new bucket and file are created. The sink cannot be 
> used for the more general problem of writing to arbitrary buckets, perhaps 
> determined by an attribute on the element/tuple being processed.
> There are three limitations which prevent the existing sink from being used 
> for more general problems:
> - Only bucketing by the current system time is supported, and not by e.g. an 
> attribute of the element being processed by the sink.
> - Whenever the sink sees a change in the bucket being written to, it flushes 
> the file and moves on to the new bucket. Therefore the sink cannot have more 
> than one bucket/file open at a time. Additionally the checkpointing mechanics 
> only support saving the state of one active bucket and file.
> - The sink determines that it should 'close' an active bucket and file when 
> the bucket path changes. We need another way to determine when a bucket has 
> become inactive and needs to be closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4194) KinesisDeserializationSchema.isEndOfStream() is never called

2016-07-13 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374453#comment-15374453
 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-4194 at 7/13/16 6:34 AM:
-

[~rmetzger] Hi Robert, I can take care of this if you're not on it already.

I'd like to ask how we intend {{isEndOfStream()}} to behave. From what I 
understand from the Kafka connector code on how this method is respected, 
{{FlinkKafkaConsumer08}} and {{FlinkKafkaConsumer09}} seem to behave 
differently, so I'm a bit confused. {{FlinkKafkaConsumer08}} seems to stop 
consuming only the partition that the ending record was fetched from and 
continues to read the other partitions. For {{FlinkKafkaConsumer09}}, on the 
other hand, the whole fetch loop for all subscribed partitions is stopped once 
an ending record is found.

I'm not sure how to proceed with this. I think, by intuition, the behaviour of 
{{FlinkKafkaConsumer09}} seems more reasonable.


was (Author: tzulitai):
[~rmetzger] Hi Robert, I can take care of this if you're not on it already.

I'd like to ask how we intend {{isEndOfStream()}} to behave. From what I 
understand from the Kafka connector code on how this method is respected, 
`FlinkKafkaConsumer08` and `FlinkKafkaConsumer09` seem to behave differently, 
so I'm a bit confused. `FlinkKafkaConsumer08` seems to stop consuming only the 
partition that the ending record was fetched from and continues to read the 
other partitions. For `FlinkKafkaConsumer09`, on the other hand, the whole 
fetch loop for all subscribed partitions is stopped once an ending record is 
found.

I'm not sure how to proceed with this. I think, by intuition, the behaviour of 
`FlinkKafkaConsumer09` seems more reasonable.

> KinesisDeserializationSchema.isEndOfStream() is never called
> 
>
> Key: FLINK-4194
> URL: https://issues.apache.org/jira/browse/FLINK-4194
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>
> The Kinesis connector does not respect the 
> {{KinesisDeserializationSchema.isEndOfStream()}} method.
> The purpose of this method is to stop consuming from a source, based on input 
> data.





[GitHub] flink pull request #2219: [FLINK-4143][metrics] Configurable delimiter

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70584088
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

well...sure it would work, but it will only be a matter of time until 
someone forgets to pass a delimiter and gets the weirdest metric names 
imaginable. I would simply *prefer* a solution that can catch this error at 
compile-time.




[GitHub] flink issue #2225: [FLINK-4191] Expose shard information in kinesis deserial...

2016-07-13 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2225
  
Just realized that the `byte[] recordKey` and `String partitionKey` are 
actually the same, only in different form, so we now have duplicate info in 
the deserialization schema. I think we can keep the `String` one, since it's 
the type we get from the AWS API for the key. Opening a hotfix for this.




[jira] [Commented] (FLINK-4191) Expose shard information in KinesisDeserializationSchema

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374528#comment-15374528
 ] 

ASF GitHub Bot commented on FLINK-4191:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2225
  
Just realized that the `byte[] recordKey` and `String partitionKey` are 
actually the same, only in different form, so we now have duplicate info in 
the deserialization schema. I think we can keep the `String` one, since it's 
the type we get from the AWS API for the key. Opening a hotfix for this.


> Expose shard information in KinesisDeserializationSchema
> 
>
> Key: FLINK-4191
> URL: https://issues.apache.org/jira/browse/FLINK-4191
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector, Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.1.0
>
>
> Currently, we are not exposing the Shard ID and other shard-related 
> information in the deserialization schema.





[GitHub] flink issue #421: [FLINK-1501] Add metrics library for monitoring TaskManage...

2016-07-13 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/421
  
Yes, by default metrics are published via JMX. But you can also
configure a Ganglia, Graphite, or StatsD reporter.

On Tue, Jul 12, 2016 at 12:02 AM, Sumit Chawla 
wrote:

> @rmetzger how can I view these metrics? Do
> I need JMX to be enabled for viewing these metrics? As of now I see only a
> few numbers in the metrics tab for the TaskManager.





[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374533#comment-15374533
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/421
  
Yes, by default metrics are published via JMX. But you can also
configure a Ganglia, Graphite, or StatsD reporter.

On Tue, Jul 12, 2016 at 12:02 AM, Sumit Chawla 
wrote:

> @rmetzger how can I view these metrics? Do
> I need JMX to be enabled for viewing these metrics? As of now I see only a
> few numbers in the metrics tab for the TaskManager.



> Integrate metrics library and report basic metrics to JobManager web interface
> --
>
> Key: FLINK-1501
> URL: https://issues.apache.org/jira/browse/FLINK-1501
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 0.9
>
>
> As per mailing list, the library: https://github.com/dropwizard/metrics
> The goal of this task is to get the basic infrastructure in place.
> Subsequent issues will integrate more features into the system.





[jira] [Commented] (FLINK-4194) KinesisDeserializationSchema.isEndOfStream() is never called

2016-07-13 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374453#comment-15374453
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-4194:


[~rmetzger] Hi Robert, I can take care of this if you're not on it already.

I'd like to ask how we intend {{isEndOfStream()}} to behave. From what I 
understand from the Kafka connector code on how this method is respected, 
`FlinkKafkaConsumer08` and `FlinkKafkaConsumer09` seem to behave differently, 
so I'm a bit confused. `FlinkKafkaConsumer08` seems to stop consuming only the 
partition that the ending record was fetched from and continues to read the 
other partitions. For `FlinkKafkaConsumer09`, on the other hand, the whole 
fetch loop for all subscribed partitions is stopped once an ending record is 
found.

I'm not sure how to proceed with this. I think, by intuition, the behaviour of 
`FlinkKafkaConsumer09` seems more reasonable.
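
For illustration, the `FlinkKafkaConsumer09`-style semantics described above 
as a sketch (a hypothetical fetch loop, not the actual connector code): the 
first end-of-stream record stops consumption of all subscribed partitions.

{code}
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class EndOfStreamLoop {

    public static <R, T> void run(
            Supplier<List<R>> fetchBatch,     // fetches the next batch of records
            Function<R, T> deserialize,       // the deserialization schema
            Predicate<T> isEndOfStream,       // isEndOfStream() of the schema
            Consumer<T> emit) {
        boolean running = true;
        while (running) {
            for (R record : fetchBatch.get()) {
                T value = deserialize.apply(record);
                if (isEndOfStream.test(value)) {
                    running = false; // stop the whole fetch loop
                    break;
                }
                emit.accept(value);
            }
        }
    }
}
{code}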

> KinesisDeserializationSchema.isEndOfStream() is never called
> 
>
> Key: FLINK-4194
> URL: https://issues.apache.org/jira/browse/FLINK-4194
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>
> The Kinesis connector does not respect the 
> {{KinesisDeserializationSchema.isEndOfStream()}} method.
> The purpose of this method is to stop consuming from a source, based on input 
> data.





[GitHub] flink pull request #2219: [FLINK-4143][metrics] Configurable delimiter

2016-07-13 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70577752
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

Sorry, that was a hiccup on GitHub's side: it didn't show me that I had posted
the comment, so I thought I had clicked cancel.

Why not change `ScopeFormat.concat` so that it always takes a delimiter string
as its first parameter? Passing `""` as the first argument yields the former
variant where no delimiter is specified.
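
A quick sketch of the shape I have in mind (illustrative names only, not the
actual `ScopeFormat` code):

{code}
// Delimiter-first concatenation; passing "" reproduces the no-delimiter variant.
public static String concat(String delimiter, String... components) {
	if (components.length == 0) {
		return "";
	}
	StringBuilder sb = new StringBuilder(components[0]);
	for (int i = 1; i < components.length; i++) {
		sb.append(delimiter).append(components[i]);
	}
	return sb.toString();
}

// concat(".", "taskmanager", "host", "metric") -> "taskmanager.host.metric"
// concat("", "task", "metric")                 -> "taskmetric"
{code}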


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4143) Configurable delimiter for metric identifier

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374541#comment-15374541
 ] 

ASF GitHub Bot commented on FLINK-4143:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70577752
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

Sorry, that was a hiccup on GitHub's side: it didn't show me that I had posted
the comment, so I thought I had clicked cancel.

Why not change `ScopeFormat.concat` so that it always takes a delimiter string
as its first parameter? Passing `""` as the first argument yields the former
variant where no delimiter is specified.


> Configurable delimiter for metric identifier
> 
>
> Key: FLINK-4143
> URL: https://issues.apache.org/jira/browse/FLINK-4143
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The metric identifier is currently hard-coded to separate components with a 
> dot.
> We should make this configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4196) Remove "recoveryTimestamp"

2016-07-13 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374566#comment-15374566
 ] 

Aljoscha Krettek commented on FLINK-4196:
-

+1, I think this was introduced for the DB backend.

> Remove "recoveryTimestamp"
> --
>
> Key: FLINK-4196
> URL: https://issues.apache.org/jira/browse/FLINK-4196
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.0.3
>Reporter: Stephan Ewen
>
> I think we should remove the {{recoveryTimestamp}} that is attached on state 
> restore calls.
> Given that this is a wall clock timestamp from a master node, which may 
> change when clocks are adjusted, and between different master nodes during 
> leader change, this is an unsafe concept.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2235: [hotfix] removed duplicated code

2016-07-13 Thread StefanRRichter
GitHub user StefanRRichter opened a pull request:

https://github.com/apache/flink/pull/2235

[hotfix] removed duplicated code





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StefanRRichter/flink 
hotfix-duplicated-code-environmentinfo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2235.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2235


commit 1a7336c2842fd3d01ec1ddf6c249c921e91341fc
Author: Stefan Richter 
Date:   2016-07-13T08:23:17Z

[hotfix] removed duplicated code




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4143) Configurable delimiter for metric identifier

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374594#comment-15374594
 ] 

ASF GitHub Bot commented on FLINK-4143:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70584088
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

Well... sure, it would work, but it would only be a matter of time until
someone forgets to pass a delimiter and gets the weirdest metric names
imaginable. I would simply *prefer* a solution that catches this error at
compile time.


> Configurable delimiter for metric identifier
> 
>
> Key: FLINK-4143
> URL: https://issues.apache.org/jira/browse/FLINK-4143
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The metric identifier is currently hard-coded to separate components with a 
> dot.
> We should make this configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4170) Remove `CONFIG_` prefix from KinesisConfigConstants variables

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374595#comment-15374595
 ] 

ASF GitHub Bot commented on FLINK-4170:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2228
  
I think there was an issue with GitHub and Travis: 
https://www.traviscistatus.com/incidents/t4xn7bmq7fww


> Remove `CONFIG_` prefix from KinesisConfigConstants variables
> -
>
> Key: FLINK-4170
> URL: https://issues.apache.org/jira/browse/FLINK-4170
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Reporter: Ufuk Celebi
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.1.0
>
>
> I find the static variable names verbose. I think it's clear from context 
> that they refer to the Kinesis configuration since they are all gathered in 
> that class.
> Therefore would like to remove the {{CONFIG_}} prefix before the release, so 
> that we have
> {code}
> conf.put(KinesisConfigConstants.AWS_REGION, "")
> {code}
> instead of 
> {code}
> conf.put(KinesisConfigConstants.CONFIG_AWS_REGION, "")
> {code}
> For longer variables it becomes even longer otherwise.
> ---
> Some basic variable names that might be accessed frequently are also very 
> long:
> {code}
> CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_SECRETKEY
> CONFIG_AWS_CREDENTIALS_PROVIDER_BASIC_ACCESSKEYID
> {code}
> It might suffice to just have:
> {code}
> AWS_SECRET_KEY
> AWS_ACCESS_KEY
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2228: [FLINK-4170][kinesis-connector] Simplify Kinesis connecte...

2016-07-13 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2228
  
I think there was an issue with GitHub and Travis: 
https://www.traviscistatus.com/incidents/t4xn7bmq7fww


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3477) Add hash-based combine strategy for ReduceFunction

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375658#comment-15375658
 ] 

ASF GitHub Bot commented on FLINK-3477:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/1517
  
CI tests are passing. I've been testing Gelly algorithms with this without 
error. I will merge this ...


> Add hash-based combine strategy for ReduceFunction
> --
>
> Key: FLINK-3477
> URL: https://issues.apache.org/jira/browse/FLINK-3477
> Project: Flink
>  Issue Type: Sub-task
>  Components: Local Runtime
>Reporter: Fabian Hueske
>Assignee: Gabor Gevay
>
> This issue is about adding a hash-based combine strategy for ReduceFunctions.
> The interface of the {{reduce()}} method is as follows:
> {code}
> public T reduce(T v1, T v2)
> {code}
> Input type and output type are identical and the function returns only a 
> single value. A Reduce function is incrementally applied to compute a final 
> aggregated value. This allows holding the preaggregated value in a hash table
> and updating it with each function call.
> The hash-based strategy requires special implementation of an in-memory hash 
> table. The hash table should support in place updates of elements (if the 
> updated value has the same size as the new value) but also appending updates 
> with invalidation of the old value (if the binary length of the new value 
> differs). The hash table needs to be able to evict and emit all elements if 
> it runs out-of-memory.
> We should also add {{HASH}} and {{SORT}} compiler hints to 
> {{DataSet.reduce()}} and {{Grouping.reduce()}} to allow users to pick the 
> execution strategy.
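
As a toy illustration of the combine idea (not the actual runtime code, which
must work on Flink's managed memory and handle eviction/spilling):

{code}
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a hash-based combiner for a reduce(T, T) function:
// one pre-aggregated value per key, updated on every incoming record.
public class HashCombiner<K, T> {

	public interface KeyExtractor<K, T> { K keyOf(T value); }
	public interface Reducer<T> { T reduce(T v1, T v2) throws Exception; }

	private final Map<K, T> table = new HashMap<>();
	private final KeyExtractor<K, T> keys;
	private final Reducer<T> reducer;

	public HashCombiner(KeyExtractor<K, T> keys, Reducer<T> reducer) {
		this.keys = keys;
		this.reducer = reducer;
	}

	public void collect(T value) throws Exception {
		K key = keys.keyOf(value);
		T previous = table.get(key);
		// Incrementally fold the new value into the pre-aggregate.
		table.put(key, previous == null ? value : reducer.reduce(previous, value));
	}

	// The real operator would evict and emit all entries on memory pressure.
	public Iterable<T> emitAll() {
		return table.values();
	}
}
{code}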



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2244: [FLINK-3874] Add a Kafka TableSink with JSON serializatio...

2016-07-13 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2244
  
I tried to inherit the TableSink trait in Java code, but it seems impossible
to implement traits with vars in Java, therefore I had to change the class
structure there somewhat.
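
For context, a rough illustration of why this is painful (all names below are
hypothetical stand-ins, not the actual Table API types): a Scala `var` in a
trait compiles to a getter/setter pair that a Java implementor has to provide,
and traits with concrete members cannot be implemented from Java at all.

{code}
// Roughly what `trait TableSink { var fieldNames: Array[String] }` demands
// from a Java implementor (illustrative names only):
interface TableSinkLike {
	String[] fieldNames();            // compiler-generated var getter
	void fieldNames_$eq(String[] v);  // compiler-generated var setter
}

class MyTableSink implements TableSinkLike {
	private String[] fieldNames;

	public String[] fieldNames() { return fieldNames; }
	public void fieldNames_$eq(String[] v) { this.fieldNames = v; }
}
{code}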


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3874) Add a Kafka TableSink with JSON serialization

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375822#comment-15375822
 ] 

ASF GitHub Bot commented on FLINK-3874:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2244
  
I tried to inherit the TableSink trait in Java code, but it seems impossible
to implement traits with vars in Java, therefore I had to change the class
structure there somewhat.


> Add a Kafka TableSink with JSON serialization
> -
>
> Key: FLINK-3874
> URL: https://issues.apache.org/jira/browse/FLINK-3874
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Priority: Minor
>
> Add a TableSink that writes JSON serialized data to Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #1517: [FLINK-3477] [runtime] Add hash-based combine strategy fo...

2016-07-13 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/1517
  
CI tests are passing. I've been testing Gelly algorithms with this without 
error. I will merge this ...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2244: Kafka json

2016-07-13 Thread mushketyk
GitHub user mushketyk opened a pull request:

https://github.com/apache/flink/pull/2244

Kafka json

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mushketyk/flink kafka-json

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2244


commit 9ff33698e44afc005360d8acb10fdbf2ccba814b
Author: Ivan Mushketyk 
Date:   2016-07-05T21:00:18Z

[FLINK-3874] Implement KafkaJsonTableSink

commit 3eeb1dcd0f4febe37f92725bc94f3d3b13e3368f
Author: Ivan Mushketyk 
Date:   2016-07-13T21:43:13Z

[FLINK-3874] Implement tests for CsvTableSink




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4214) JobExceptionsHandler will return all exceptions

2016-07-13 Thread Sumit Chawla (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sumit Chawla updated FLINK-4214:

Priority: Minor  (was: Major)

> JobExceptionsHandler will return all exceptions
> ---
>
> Key: FLINK-4214
> URL: https://issues.apache.org/jira/browse/FLINK-4214
> Project: Flink
>  Issue Type: Bug
>Reporter: Sumit Chawla
>Priority: Minor
>
> JobExceptionsHandler returns all exceptions because it does not increment the
> integer that tracks the number of exceptions being serialized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #1813: [FLINK-3034] Redis Sink Connector

2016-07-13 Thread subhankarb
Github user subhankarb commented on the issue:

https://github.com/apache/flink/pull/1813
  
Hi @kl0u, IMO that is the expected behavior. The sink cannot know whether
Redis is down until it tries to send the next record to it. Only when a new
message reaches the sink and it tries to use the connection pool can it throw
an exception saying it cannot send the data to Redis.
You could build a heartbeat mechanism that periodically checks whether the
Redis server is up, and cancel the job if it is down.
@mjsax please correct me if my understanding is wrong.
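
A minimal sketch of such a heartbeat, assuming a Jedis connection pool (the
wiring to actually fail or cancel the job is left out):

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

// Periodically PING Redis; invoke a caller-supplied action when it is down.
public class RedisHeartbeat {
	private final ScheduledExecutorService timer =
		Executors.newSingleThreadScheduledExecutor();

	public void start(final JedisPool pool, final Runnable onDown, long periodSeconds) {
		timer.scheduleAtFixedRate(new Runnable() {
			@Override
			public void run() {
				try (Jedis jedis = pool.getResource()) {
					jedis.ping(); // throws if the server is unreachable
				} catch (Exception e) {
					onDown.run(); // e.g. cancel the job
				}
			}
		}, periodSeconds, periodSeconds, TimeUnit.SECONDS);
	}
}
{code}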


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4214) JobExceptionsHandler will return all exceptions

2016-07-13 Thread Sumit Chawla (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376269#comment-15376269
 ] 

Sumit Chawla commented on FLINK-4214:
-

Change added in https://github.com/apache/flink/pull/2242

> JobExceptionsHandler will return all exceptions
> ---
>
> Key: FLINK-4214
> URL: https://issues.apache.org/jira/browse/FLINK-4214
> Project: Flink
>  Issue Type: Bug
>Reporter: Sumit Chawla
>
> JobExceptionsHandler returns all exceptions because it does not increment the
> integer that tracks the number of exceptions being serialized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3034) Redis SInk Connector

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376320#comment-15376320
 ] 

ASF GitHub Bot commented on FLINK-3034:
---

Github user subhankarb commented on the issue:

https://github.com/apache/flink/pull/1813
  
Hi @kl0u, IMO that is the expected behavior. The sink cannot know whether
Redis is down until it tries to send the next record to it. Only when a new
message reaches the sink and it tries to use the connection pool can it throw
an exception saying it cannot send the data to Redis.
You could build a heartbeat mechanism that periodically checks whether the
Redis server is up, and cancel the job if it is down.
@mjsax please correct me if my understanding is wrong.


> Redis SInk Connector
> 
>
> Key: FLINK-3034
> URL: https://issues.apache.org/jira/browse/FLINK-3034
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming Connectors
>Reporter: Matthias J. Sax
>Assignee: Subhankar Biswas
>Priority: Minor
> Fix For: 1.1.0
>
>
> Flink does not provide a sink connector for Redis.
> See FLINK-3033



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4214) JobExceptionsHandler will return all exceptions

2016-07-13 Thread Sumit Chawla (JIRA)
Sumit Chawla created FLINK-4214:
---

 Summary: JobExceptionsHandler will return all exceptions
 Key: FLINK-4214
 URL: https://issues.apache.org/jira/browse/FLINK-4214
 Project: Flink
  Issue Type: Bug
Reporter: Sumit Chawla


JobExceptionsHandler returns all exceptions because it does not increment the
integer that tracks the number of exceptions being serialized.
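
A hypothetical sketch of the fix pattern (illustrative names, not the actual
handler code): the cap on reported exceptions only takes effect if the counter
is actually incremented inside the loop.

{code}
import java.util.List;

public final class ExceptionReporting {
	private static final int MAX_NUMBER_EXCEPTIONS_TO_REPORT = 20; // assumed cap

	public static int report(List<Throwable> exceptions, StringBuilder out) {
		int numReported = 0;
		for (Throwable t : exceptions) {
			if (numReported >= MAX_NUMBER_EXCEPTIONS_TO_REPORT) {
				break; // never reached if the increment below is forgotten
			}
			out.append(t.getMessage()).append('\n');
			numReported++; // the missing increment
		}
		return numReported;
	}
}
{code}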



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4127) Clean up configuration and check breaking API changes

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374609#comment-15374609
 ] 

ASF GitHub Bot commented on FLINK-4127:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2177


> Clean up configuration and check breaking API changes
> -
>
> Key: FLINK-4127
> URL: https://issues.apache.org/jira/browse/FLINK-4127
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.1.0
>
> Attachments: flink-core.html, flink-java.html, flink-scala.html, 
> flink-streaming-java.html, flink-streaming-scala.html
>
>
> For the upcoming 1.1 release, I'll check if there are any breaking API
> changes and if the documentation is up to date with the configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2236: [FLINK-4186] Use Flink metrics to report Kafka metrics

2016-07-13 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2236
  
This will simplify this PR: https://issues.apache.org/jira/browse/FLINK-4206


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374618#comment-15374618
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/2236

[FLINK-4186] Use Flink metrics to report Kafka metrics

We were using Flink's accumulators in the past to report the Kafka metrics.
With this change, we'll use Flink's new metric system.

I'm now also exposing the current offset of each topic partition through 
the metrics.
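
A minimal sketch of how a per-partition offset can be exposed as a gauge (the
metric and group names here are illustrative, not necessarily what this PR
uses):

{code}
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.MetricGroup;

public class OffsetGaugeExample {
	// Registers a gauge that reads the latest offset on every report.
	public static void register(MetricGroup consumerGroup,
			final long[] currentOffsets, final int partition) {
		consumerGroup.gauge("currentOffsets-partition-" + partition, new Gauge<Long>() {
			@Override
			public Long getValue() {
				return currentOffsets[partition];
			}
		});
	}
}
{code}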

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink4186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2236


commit 234c549b6046d33ff8545661435ddb97b453bb2e
Author: Robert Metzger 
Date:   2016-07-12T11:55:29Z

[FLINK-4186] Use Flink metrics to report Kafka metrics

This commit also adds monitoring for the current offset




> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70589875
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -161,6 +167,12 @@ public void runFetchLoop() throws Exception {
periodicCommitter.start();
}
 
+   // register offset metrics
+   if(useMetrics) {
--- End diff --

missing space after if


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591165
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -161,6 +167,12 @@ public void runFetchLoop() throws Exception {
periodicCommitter.start();
}
 
+   // register offset metrics
+   if(useMetrics) {
+   final MetricGroup kafkaMetricGroup = 
metricGroup.addGroup("Kafka08Consumer");
--- End diff --

I don't think it is necessary to include the version in the group name.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591328
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/AbstractFetcher.java
 ---
@@ -389,8 +396,36 @@ private void updateMinPunctuatedWatermark(Watermark 
nextWatermark) {
throw new RuntimeException();
}
}
-   
-   // 

+
--- End diff --

double empty line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374677#comment-15374677
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591165
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -161,6 +167,12 @@ public void runFetchLoop() throws Exception {
periodicCommitter.start();
}
 
+   // register offset metrics
+   if(useMetrics) {
+   final MetricGroup kafkaMetricGroup = 
metricGroup.addGroup("Kafka08Consumer");
--- End diff --

I don't think it is necessary to include the version in the group name.


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591698
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/testutils/MockRuntimeContext.java
 ---
@@ -161,6 +163,11 @@ public Histogram getHistogram(String name) {
}
 
@Override
+   public MetricGroup getMetricGroup() {
+   return new UnregisteredTaskMetricsGroup.DummyIOMetricGroup();
--- End diff --

You don't really have to override this method; using the TaskMetricGroup is
fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374684#comment-15374684
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591835
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java
 ---
@@ -1235,15 +1235,104 @@ public void flatMap(Tuple2 
value, Collector out) throws
 
JobExecutionResult result = tryExecute(env1, "Consume " + 
ELEMENT_COUNT + " elements from Kafka");
 
-   Map<String, Object> accuResults = 
result.getAllAccumulatorResults();
-   // kafka 0.9 consumer: 39 results
-   if (kafkaServer.getVersion().equals("0.9")) {
-   assertTrue("Not enough accumulators from Kafka 
Consumer: " + accuResults.size(), accuResults.size() > 38);
+   deleteTestTopic(topic);
+   }
+
+   /**
+* Test metrics reporting for consumer
--- End diff --

Can we have a test for the producer as well?


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591835
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaConsumerTestBase.java
 ---
@@ -1235,15 +1235,104 @@ public void flatMap(Tuple2 
value, Collector out) throws
 
JobExecutionResult result = tryExecute(env1, "Consume " + 
ELEMENT_COUNT + " elements from Kafka");
 
-   Map<String, Object> accuResults = 
result.getAllAccumulatorResults();
-   // kafka 0.9 consumer: 39 results
-   if (kafkaServer.getVersion().equals("0.9")) {
-   assertTrue("Not enough accumulators from Kafka 
Consumer: " + accuResults.size(), accuResults.size() > 38);
+   deleteTestTopic(topic);
+   }
+
+   /**
+* Test metrics reporting for consumer
--- End diff --

Can we have a test for the producer as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374683#comment-15374683
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70591698
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/testutils/MockRuntimeContext.java
 ---
@@ -161,6 +163,11 @@ public Histogram getHistogram(String name) {
}
 
@Override
+   public MetricGroup getMetricGroup() {
+   return new UnregisteredTaskMetricsGroup.DummyIOMetricGroup();
--- End diff --

You don't really have to override this method; using the TaskMetricGroup is
fine.


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4143) Configurable delimiter for metric identifier

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374700#comment-15374700
 ] 

ASF GitHub Bot commented on FLINK-4143:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70592983
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

I guess we can still change it in case a user requests this feature.


> Configurable delimiter for metric identifier
> 
>
> Key: FLINK-4143
> URL: https://issues.apache.org/jira/browse/FLINK-4143
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The metric identifier is currently hard-coded to separate components with a 
> dot.
> We should make this configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2219: [FLINK-4143][metrics] Configurable delimiter

2016-07-13 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2219#discussion_r70592983
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/MetricRegistry.java ---
@@ -64,6 +66,15 @@ public MetricRegistry(Configuration config) {
}
this.scopeFormats = scopeFormats;
 
+   char delim;
+   try {
+   delim = 
config.getString(ConfigConstants.METRICS_SCOPE_DELIMITER, ".").charAt(0);
--- End diff --

I guess we can still change it in case a user requests this feature.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2210: [FLINK-4167] [metrics] Close IOMetricGroup in Task...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2210#discussion_r70597751
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/groups/ProxyMetricGroup.java 
---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.metrics.groups;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.metrics.Counter;
+import org.apache.flink.metrics.Gauge;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.util.Preconditions;
+
+/**
+ * Metric group which forwards all registration calls to its parent metric 
group.
+ *
+ * @param <P> Type of the parent metric group
+ */
+@Internal
+public class ProxyMetricGroup<P extends MetricGroup> implements 
MetricGroup {
+   private final P parentMetricGroup;
+
+   public ProxyMetricGroup(P parentMetricGroup) {
+   this.parentMetricGroup = 
Preconditions.checkNotNull(parentMetricGroup);
+   }
+
+   @Override
+   public final void close() {
+   parentMetricGroup.close();
--- End diff --

It is safer to not do anything here. `close()` should only be called by the
Task on the `TaskMetricGroup`; forwarding it would open up the possibility of
other components closing the TaskMG as well.

There's also the looming StackOverflowError when someone puts
`ioMetrics.close()` into `TaskMetricGroup#close()`.

Now that I think about it, I believe `close()` (and by extension `isClosed()`)
has no business being in the MetricGroup interface in the first place, as
users actually don't need to call it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4184) Ganglia and GraphiteReporter report metric names with invalid characters

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374763#comment-15374763
 ] 

ASF GitHub Bot commented on FLINK-4184:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2220
  
True, it conflicts with your proposed changes for the configurable metric
group delimiter. I will rebase and adapt this PR wrt #2219.


> Ganglia and GraphiteReporter report metric names with invalid characters
> 
>
> Key: FLINK-4184
> URL: https://issues.apache.org/jira/browse/FLINK-4184
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.1.0
>
>
> Flink's {{GangliaReporter}} and {{GraphiteReporter}} report metrics with 
> names which contain invalid characters. For example, quotes are not filtered 
> out which can be problematic for Ganglia. Moreover, dots are not replaced 
> which causes Graphite to think that an IP address is actually a scoped metric 
> name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4167) TaskMetricGroup does not close IOMetricGroup

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374764#comment-15374764
 ] 

ASF GitHub Bot commented on FLINK-4167:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2210#discussion_r70597751
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/metrics/groups/ProxyMetricGroup.java 
---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.metrics.groups;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.metrics.Counter;
+import org.apache.flink.metrics.Gauge;
+import org.apache.flink.metrics.Histogram;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.util.Preconditions;
+
+/**
+ * Metric group which forwards all registration calls to its parent metric 
group.
+ *
+ * @param <P> Type of the parent metric group
+ */
+@Internal
+public class ProxyMetricGroup<P extends MetricGroup> implements 
MetricGroup {
+   private final P parentMetricGroup;
+
+   public ProxyMetricGroup(P parentMetricGroup) {
+   this.parentMetricGroup = 
Preconditions.checkNotNull(parentMetricGroup);
+   }
+
+   @Override
+   public final void close() {
+   parentMetricGroup.close();
--- End diff --

It is safer to not do anything here. `close()` should only be called by the
Task on the `TaskMetricGroup`; forwarding it would open up the possibility of
other components closing the TaskMG as well.

There's also the looming StackOverflowError when someone puts
`ioMetrics.close()` into `TaskMetricGroup#close()`.

Now that I think about it, I believe `close()` (and by extension `isClosed()`)
has no business being in the MetricGroup interface in the first place, as
users actually don't need to call it.


> TaskMetricGroup does not close IOMetricGroup
> 
>
> Key: FLINK-4167
> URL: https://issues.apache.org/jira/browse/FLINK-4167
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.1.0
>
>
> The {{TaskMetricGroup}} does not close the {{ioMetrics}} metric group. As a
> result, metrics registered under {{ioMetrics}} are not deregistered after
> the termination of a job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4167) TaskMetricGroup does not close IOMetricGroup

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374776#comment-15374776
 ] 

ASF GitHub Bot commented on FLINK-4167:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2210
  
Only had a small comment, otherwise +1.


> TaskMetricGroup does not close IOMetricGroup
> 
>
> Key: FLINK-4167
> URL: https://issues.apache.org/jira/browse/FLINK-4167
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.1.0
>
>
> The {{TaskMetricGroup}} does not close the {{ioMetrics}} metric group. As a
> result, metrics registered under {{ioMetrics}} are not deregistered after
> the termination of a job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2210: [FLINK-4167] [metrics] Close IOMetricGroup in TaskMetricG...

2016-07-13 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2210
  
Only had a small comment, otherwise +1.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4207) WindowOperator becomes very slow with allowed lateness

2016-07-13 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-4207:

Priority: Blocker  (was: Major)

> WindowOperator becomes very slow with allowed lateness
> --
>
> Key: FLINK-4207
> URL: https://issues.apache.org/jira/browse/FLINK-4207
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Priority: Blocker
>
> In this simple example the throughput (as measured by the count the window 
> emits) becomes very low when an allowed lateness is set:
> {code}
> public class WindowWordCount {
>   public static void main(String[] args) throws Exception {
>     final StreamExecutionEnvironment env =
>         StreamExecutionEnvironment.getExecutionEnvironment();
>     env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
>     env.setParallelism(1);
>     env.addSource(new InfiniteTupleSource(100_000))
>       .keyBy(0)
>       .timeWindow(Time.seconds(3))
>       .allowedLateness(Time.seconds(1))
>       .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
>         @Override
>         public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1,
>             Tuple2<String, Integer> value2) throws Exception {
>           return Tuple2.of(value1.f0, value1.f1 + value2.f1);
>         }
>       })
>       .filter(new FilterFunction<Tuple2<String, Integer>>() {
>         private static final long serialVersionUID = 1L;
>         @Override
>         public boolean filter(Tuple2<String, Integer> value) throws Exception {
>           return value.f0.startsWith("Tuple 0");
>         }
>       })
>       .print();
>     // execute program
>     env.execute("WindowWordCount");
>   }
>   public static class InfiniteTupleSource implements
>       ParallelSourceFunction<Tuple2<String, Integer>> {
>     private static final long serialVersionUID = 1L;
>     private int numGroups;
>     public InfiniteTupleSource(int numGroups) {
>       this.numGroups = numGroups;
>     }
>     @Override
>     public void run(SourceContext<Tuple2<String, Integer>> out) throws Exception {
>       long index = 0;
>       while (true) {
>         Tuple2<String, Integer> tuple = new Tuple2<>("Tuple " + (index % numGroups), 1);
>         out.collect(tuple);
>         index++;
>       }
>     }
>     @Override
>     public void cancel() {
>     }
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374642#comment-15374642
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2236
  
You can rebase on top of #2237.


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2237: [FLINK-4206][metrics] Remove alphanumeric name res...

2016-07-13 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2237

[FLINK-4206][metrics] Remove alphanumeric name restriction



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4206_me_so_special

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2237


commit 9f72ac66d748fe1aab1ee382c33d55ab5dc717fe
Author: zentol 
Date:   2016-07-13T09:03:08Z

[FLINK-4206][metrics] Remove alphanumeric name restriction




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2236: [FLINK-4186] Use Flink metrics to report Kafka metrics

2016-07-13 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2236
  
You can rebase on top of #2237.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4206) Metric names should allow special characters

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374641#comment-15374641
 ] 

ASF GitHub Bot commented on FLINK-4206:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/2237

[FLINK-4206][metrics] Remove alphanumeric name restriction



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 4206_me_so_special

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2237.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2237


commit 9f72ac66d748fe1aab1ee382c33d55ab5dc717fe
Author: zentol 
Date:   2016-07-13T09:03:08Z

[FLINK-4206][metrics] Remove alphanumeric name restriction




> Metric names should allow special characters
> 
>
> Key: FLINK-4206
> URL: https://issues.apache.org/jira/browse/FLINK-4206
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.1.0
>
>
> Currently, the name of the metric is restricted to alphanumeric characters. 
> This restriction was originally put in place to circumvent issues due to 
> systems not supporting certain characters.
> However, this restriction does not make a lot of sense, since we don't
> enforce such a restriction for group names.
> This also affects the integration of the Kafka metrics, so I suggest removing
> the restriction.
> From now on it will be the responsibility of the reporter to make sure that 
> the metric identifier is supported by the external system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374654#comment-15374654
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70589875
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -161,6 +167,12 @@ public void runFetchLoop() throws Exception {
periodicCommitter.start();
}
 
+   // register offset metrics
+   if(useMetrics) {
--- End diff --

missing space after if


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2219: [FLINK-4143][metrics] Configurable delimiter

2016-07-13 Thread zentol
Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/2219


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-3729) Several SQL tests fail on Windows OS

2016-07-13 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-3729:
---

Assignee: Chesnay Schepler

> Several SQL tests fail on Windows OS
> 
>
> Key: FLINK-3729
> URL: https://issues.apache.org/jira/browse/FLINK-3729
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.0.1
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> The Table API SqlExplain(Test/ITCase) tests fail categorically on Windows due
> to different line endings. These tests generate a string representation of an
> abstract syntax tree; the problem is a difference in line endings:
> the expected strings contain LF, the actual ones CRLF.
> The tests should be changed to either
> * include CRLF line endings in the expected string when run on Windows,
> * always use LF line endings regardless of OS, or
> * use a compare method that is aware of this issue (see the sketch below).
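
A minimal sketch of the last option from the list above: normalize line
endings before comparing.

{code}
public final class LineEndingAssert {
	// Compare two strings while treating CRLF and LF as equivalent.
	public static void assertEqualsIgnoreLineEndings(String expected, String actual) {
		String normalizedExpected = expected.replace("\r\n", "\n");
		String normalizedActual = actual.replace("\r\n", "\n");
		if (!normalizedExpected.equals(normalizedActual)) {
			throw new AssertionError("Strings differ beyond line endings");
		}
	}
}
{code}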



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4167) TaskMetricGroup does not close IOMetricGroup

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374597#comment-15374597
 ] 

ASF GitHub Bot commented on FLINK-4167:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2210
  
Ah ok, I see the underlying problem now. I will adapt my PR wrt your 
feedback.


> TaskMetricGroup does not close IOMetricGroup
> 
>
> Key: FLINK-4167
> URL: https://issues.apache.org/jira/browse/FLINK-4167
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.1.0
>
>
> The {{TaskMetricGroup}} does not close the {{ioMetrics}} metric group. As a
> result, metrics registered under {{ioMetrics}} are not deregistered after
> the termination of a job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2235: [hotfix] removed duplicated code

2016-07-13 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2235
  
+1



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2177: [FLINK-4127] Check API compatbility for 1.1 in fli...

2016-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2177


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-4127) Clean up configuration and check breaking API changes

2016-07-13 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4127.
---
Resolution: Fixed

Merged in http://git-wip-us.apache.org/repos/asf/flink/commit/6b7bb761

> Clean up configuration and check breaking API changes
> -
>
> Key: FLINK-4127
> URL: https://issues.apache.org/jira/browse/FLINK-4127
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.1.0
>
> Attachments: flink-core.html, flink-java.html, flink-scala.html, 
> flink-streaming-java.html, flink-streaming-scala.html
>
>
> For the upcoming 1.1 release, I'll check if there are any breaking API
> changes and if the documentation is up to date with the configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4206) Metric names should allow special characters

2016-07-13 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4206:
---

 Summary: Metric names should allow special characters
 Key: FLINK-4206
 URL: https://issues.apache.org/jira/browse/FLINK-4206
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.1.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.1.0


Currently, the name of the metric is restricted to alphanumeric characters. 
This restriction was originally put in place to circumvent issues due to 
systems not supporting certain characters.

However, this restriction does not make a lot of sense, since we don't enforce
such a restriction for group names.

This also affects the integration of the Kafka metrics, so I suggest removing
the restriction.

>From now on it will be the responsibility of the reporter to make sure that 
>the metric identifier is supported by the external system.
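
To illustrate what reporter-side responsibility could look like, a hedged
sketch (the class name and the allowed character set are assumptions, not an
existing reporter):

{code}
import java.util.regex.Pattern;

// Hypothetical reporter-side filter: the metric system accepts any name,
// and each reporter maps it onto its backend's allowed character set.
class ReporterNameFilter {
    // assumption: the backend only supports alphanumerics, '-', '_' and '.'
    private static final Pattern UNSUPPORTED = Pattern.compile("[^A-Za-z0-9._-]");

    static String filter(String metricIdentifier) {
        return UNSUPPORTED.matcher(metricIdentifier).replaceAll("_");
    }

    public static void main(String[] args) {
        // e.g. a Kafka metric name containing characters a backend may reject
        System.out.println(filter("records-consumed-rate{topic=my topic}"));
        // prints: records-consumed-rate_topic_my_topic_
    }
}
{code}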





[jira] [Created] (FLINK-4207) WindowOperator becomes very slow with allowed lateness

2016-07-13 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-4207:
---

 Summary: WindowOperator becomes very slow with allowed lateness
 Key: FLINK-4207
 URL: https://issues.apache.org/jira/browse/FLINK-4207
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.1.0
Reporter: Aljoscha Krettek


In this simple example, the throughput (as measured by the count that the 
window emits) becomes very low when an allowed lateness is set:

{code}
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowWordCount {

    public static void main(String[] args) throws Exception {

        final StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
        env.setParallelism(1);

        env.addSource(new InfiniteTupleSource(100_000))
                .keyBy(0)
                .timeWindow(Time.seconds(3))
                .allowedLateness(Time.seconds(1))
                .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                    @Override
                    public Tuple2<String, Integer> reduce(
                            Tuple2<String, Integer> value1,
                            Tuple2<String, Integer> value2) throws Exception {
                        return Tuple2.of(value1.f0, value1.f1 + value2.f1);
                    }
                })
                .filter(new FilterFunction<Tuple2<String, Integer>>() {
                    private static final long serialVersionUID = 1L;

                    @Override
                    public boolean filter(Tuple2<String, Integer> value) throws Exception {
                        return value.f0.startsWith("Tuple 0");
                    }
                })
                .print();

        // execute program
        env.execute("WindowWordCount");
    }

    public static class InfiniteTupleSource
            implements ParallelSourceFunction<Tuple2<String, Integer>> {
        private static final long serialVersionUID = 1L;

        private int numGroups;

        public InfiniteTupleSource(int numGroups) {
            this.numGroups = numGroups;
        }

        @Override
        public void run(SourceContext<Tuple2<String, Integer>> out) throws Exception {
            long index = 0;
            while (true) {
                Tuple2<String, Integer> tuple =
                        new Tuple2<>("Tuple " + (index % numGroups), 1);
                out.collect(tuple);
                index++;
            }
        }

        @Override
        public void cancel() {
        }
    }
}
{code}





[jira] [Commented] (FLINK-4159) Quickstart poms exclude unused dependencies

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374721#comment-15374721
 ] 

ASF GitHub Bot commented on FLINK-4159:
---

Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/2217


> Quickstart poms exclude unused dependencies
> ---
>
> Key: FLINK-4159
> URL: https://issues.apache.org/jira/browse/FLINK-4159
> Project: Flink
>  Issue Type: Bug
>  Components: Quickstarts
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
> Fix For: 1.1.0
>
>
> The Quickstart poms exclude several dependencies from being packaged into the 
> fat-jar, even though they aren't used by Flink according to `mvn 
> dependency:tree`.
> com.amazonaws:aws-java-sdk
> com.twitter:chill-avro_*
> com.twitter:chill-bijection_*
> com.twitter:bijection-core_*
> com.twitter:bijection-avro_*
> de.javakaffee:kryo-serializers
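
For context, the exclusions in question have roughly this shape in the
quickstart pom's maven-shade-plugin configuration (an illustrative excerpt,
not the exact pom):

{code}
<!-- illustrative excerpt of the quickstart pom's shade-plugin section -->
<artifactSet>
    <excludes>
        <exclude>com.amazonaws:aws-java-sdk</exclude>
        <exclude>com.twitter:chill-avro_*</exclude>
        <exclude>de.javakaffee:kryo-serializers</exclude>
        <!-- ... further exclusions listed above ... -->
    </excludes>
</artifactSet>
{code}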





[GitHub] flink issue #2219: [FLINK-4143][metrics] Configurable delimiter

2016-07-13 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2219
  
merging




[jira] [Closed] (FLINK-4159) Quickstart poms exclude unused dependencies

2016-07-13 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-4159.
---
Resolution: Fixed

Fixed in 90658c8383efef1e3330cf8bd2ca327d1b55baf4

> Quickstart poms exclude unused dependencies
> ---
>
> Key: FLINK-4159
> URL: https://issues.apache.org/jira/browse/FLINK-4159
> Project: Flink
>  Issue Type: Bug
>  Components: Quickstarts
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
> Fix For: 1.1.0
>
>
> The Quickstart poms exclude several dependencies from being packaged into the 
> fat-jar, even though they aren't used by Flink according to `mvn 
> dependency:tree`.
> com.amazonaws:aws-java-sdk
> com.twitter:chill-avro_*
> com.twitter:chill-bijection_*
> com.twitter:bijection-core_*
> com.twitter:bijection-avro_*
> de.javakaffee:kryo-serializers





[jira] [Commented] (FLINK-4143) Configurable delimiter for metric identifier

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374733#comment-15374733
 ] 

ASF GitHub Bot commented on FLINK-4143:
---

Github user zentol closed the pull request at:

https://github.com/apache/flink/pull/2219


> Configurable delimiter for metric identifier
> 
>
> Key: FLINK-4143
> URL: https://issues.apache.org/jira/browse/FLINK-4143
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The metric identifier is currently hard-coded to separate components with a 
> dot.
> We should make this configurable.
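
For illustration, once configurable this would presumably be set in
{{flink-conf.yaml}}; the key name below is an assumption based on what this
change introduces:

{code}
# flink-conf.yaml -- assuming the key introduced for this issue
metrics.scope.delimiter: _
{code}

With that, an identifier like {{host.taskmanager.job.operator.metric}} would be
reported as {{host_taskmanager_job_operator_metric}}.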





[jira] [Closed] (FLINK-4143) Configurable delimiter for metric identifier

2016-07-13 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-4143.
---
Resolution: Fixed

Implemented in 790a654c5e08e0e54f3e02499be4dd8c4006227a

> Configurable delimiter for metric identifier
> 
>
> Key: FLINK-4143
> URL: https://issues.apache.org/jira/browse/FLINK-4143
> Project: Flink
>  Issue Type: Improvement
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Minor
>
> The metric identifier is currently hard-coded to separate components with a 
> dot.
> We should make this configurable.





[jira] [Commented] (FLINK-3704) JobManagerHAProcessFailureBatchRecoveryITCase.testJobManagerProcessFailure unstable

2016-07-13 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374601#comment-15374601
 ] 

Robert Metzger commented on FLINK-3704:
---

Once again: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/144211723/log.txt

> JobManagerHAProcessFailureBatchRecoveryITCase.testJobManagerProcessFailure 
> unstable
> ---
>
> Key: FLINK-3704
> URL: https://issues.apache.org/jira/browse/FLINK-3704
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Reporter: Robert Metzger
>  Labels: test-stability
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/120882840/log.txt
> {code}
> testJobManagerProcessFailure[1](org.apache.flink.test.recovery.JobManagerHAProcessFailureBatchRecoveryITCase)
>   Time elapsed: 9.302 sec  <<< ERROR!
> java.io.IOException: Actor at 
> akka.tcp://flink@127.0.0.1:55591/user/jobmanager not reachable. Please make 
> sure that the actor is running and its port is reachable.
>   at 
> org.apache.flink.runtime.akka.AkkaUtils$.getActorRef(AkkaUtils.scala:384)
>   at org.apache.flink.runtime.akka.AkkaUtils.getActorRef(AkkaUtils.scala)
>   at 
> org.apache.flink.test.recovery.JobManagerHAProcessFailureBatchRecoveryITCase.testJobManagerProcessFailure(JobManagerHAProcessFailureBatchRecoveryITCase.java:290)
> Caused by: akka.actor.ActorNotFound: Actor not found for: 
> ActorSelection[Anchor(akka.tcp://flink@127.0.0.1:55591/), 
> Path(/user/jobmanager)]
>   at 
> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
>   at 
> akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
>   at 
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
>   at 
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
>   at 
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>   at 
> akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
>   at 
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>   at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
>   at 
> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
>   at 
> akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
>   at 
> akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
>   at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:267)
>   at akka.actor.EmptyLocalActorRef.specialHandle(ActorRef.scala:508)
>   at akka.actor.DeadLetterActorRef.specialHandle(ActorRef.scala:541)
>   at akka.actor.DeadLetterActorRef.$bang(ActorRef.scala:531)
>   at 
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef.$bang(RemoteActorRefProvider.scala:87)
>   at akka.remote.EndpointWriter.postStop(Endpoint.scala:561)
>   at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
>   at akka.remote.EndpointActor.aroundPostStop(Endpoint.scala:415)
>   at 
> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
>   at 
> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
>   at akka.actor.ActorCell.terminate(ActorCell.scala:369)
>   at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
>   at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
>   at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}





[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/2236

[FLINK-4186] Use Flink metrics to report Kafka metrics

In the past, we were using Flink's accumulators to report the Kafka metrics.
With this change, we'll use Flink's new metric system instead.

I'm now also exposing the current offset of each topic partition through 
the metrics.
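
The basic idea is bridging Kafka's metric objects into Flink gauges; a hedged
sketch of that bridge (the wrapper class name below is made up, while
`MetricGroup#gauge` and Kafka's `Metric#value` are the APIs being connected):

{code}
import org.apache.flink.metrics.Gauge;
import org.apache.kafka.common.Metric;

// Illustrative bridge from a Kafka metric to a Flink gauge; name is made up.
class KafkaMetricGaugeSketch implements Gauge<Double> {
    private final Metric kafkaMetric;

    KafkaMetricGaugeSketch(Metric kafkaMetric) {
        this.kafkaMetric = kafkaMetric;
    }

    @Override
    public Double getValue() {
        // Kafka's Metric exposes a single double value
        return kafkaMetric.value();
    }
}

// registration, e.g. once the consumer has been created:
// for (Map.Entry<MetricName, ? extends Metric> e : consumer.metrics().entrySet()) {
//     metricGroup.gauge(e.getKey().name(), new KafkaMetricGaugeSketch(e.getValue()));
// }
{code}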

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink4186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2236.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2236


commit 234c549b6046d33ff8545661435ddb97b453bb2e
Author: Robert Metzger 
Date:   2016-07-12T11:55:29Z

[FLINK-4186] Use Flink metrics to report Kafka metrics

This commit also adds monitoring for the current offset






[GitHub] flink pull request #2236: [FLINK-4186] Use Flink metrics to report Kafka met...

2016-07-13 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70590934
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -78,7 +79,10 @@
private final long invalidOffsetBehavior;
 
/** The interval in which to automatically commit (-1 if deactivated) */
-   private final long autoCommitInterval; 
+   private final long autoCommitInterval;
+
+   /** The metric group of this operator */
+   private final MetricGroup metricGroup;
--- End diff --

The 0.8 fetcher has a `MetricGroup` field, while the 0.9 fetcher only 
creates a local variable inside `run()`. Let's make this more consistent.




[jira] [Commented] (FLINK-4186) Expose Kafka metrics through Flink metrics

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374673#comment-15374673
 ] 

ASF GitHub Bot commented on FLINK-4186:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2236#discussion_r70590934
  
--- Diff: 
flink-streaming-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/Kafka08Fetcher.java
 ---
@@ -78,7 +79,10 @@
private final long invalidOffsetBehavior;
 
/** The interval in which to automatically commit (-1 if deactivated) */
-   private final long autoCommitInterval; 
+   private final long autoCommitInterval;
+
+   /** The metric group of this operator */
+   private final MetricGroup metricGroup;
--- End diff --

The 0.8 fetcher has a `MetricGroup` field, while the 0.9 fetcher only 
creates a local variable inside `run()`. Let's make this more consistent.


> Expose Kafka metrics through Flink metrics
> --
>
> Key: FLINK-4186
> URL: https://issues.apache.org/jira/browse/FLINK-4186
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> Currently, we expose the Kafka metrics through Flink's accumulators.
> We can now use the metrics system in Flink to report Kafka metrics.





[jira] [Commented] (FLINK-4167) TaskMetricGroup does not close IOMetricGroup

2016-07-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374689#comment-15374689
 ] 

ASF GitHub Bot commented on FLINK-4167:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2210
  
I've addressed your comments @zentol and introduced a `ProxyMetricGroup` 
which simply forwards all calls as you've suggested.
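
A minimal sketch of the forwarding pattern (only two methods shown; the actual
class would forward the complete {{MetricGroup}} interface):

{code}
import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.MetricGroup;

// Illustrative forwarding proxy; the real class covers the full interface.
class ProxyMetricGroupSketch {
    private final MetricGroup parent;

    ProxyMetricGroupSketch(MetricGroup parent) {
        this.parent = parent;
    }

    public Counter counter(String name) {
        // every call is delegated unchanged to the parent group
        return parent.counter(name);
    }

    public MetricGroup addGroup(String name) {
        return parent.addGroup(name);
    }
}
{code}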


> TaskMetricGroup does not close IOMetricGroup
> 
>
> Key: FLINK-4167
> URL: https://issues.apache.org/jira/browse/FLINK-4167
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
> Fix For: 1.1.0
>
>
> The {{TaskMetricGroup}} does not close the {{ioMetrics}} metric group. As a 
> result, metrics registered under {{ioMetrics}} are not deregistered after 
> the job terminates.





[GitHub] flink pull request #2230: [FLINK-4200] [Kafka Connector] Kafka consumers log...

2016-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2230




[GitHub] flink issue #2210: [FLINK-4167] [metrics] Close IOMetricGroup in TaskMetricG...

2016-07-13 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2210
  
I've addressed your comments @zentol and introduced a `ProxyMetricGroup` 
which simply forwards all calls as you've suggested.



