[jira] [Resolved] (FLINK-4879) class KafkaTableSource should be public just like KafkaTableSink

2016-10-24 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4879.
---
   Resolution: Fixed
Fix Version/s: (was: 1.1.4)
   1.2.0

Resolved for 1.2 in http://git-wip-us.apache.org/repos/asf/flink/commit/e3324372

> class KafkaTableSource should be public just like KafkaTableSink
> 
>
> Key: FLINK-4879
> URL: https://issues.apache.org/jira/browse/FLINK-4879
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector, Table API & SQL
>Affects Versions: 1.1.1, 1.1.3
>Reporter: yuemeng
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: 0001-class-KafkaTableSource-should-be-public.patch
>
>
> *class KafkaTableSource should be public just like KafkaTableSink; by 
> default its modifier is package-private, and we can't access it from outside 
> its package*, for example:
>  {code}
> def createKafkaTableSource(
>   topic: String,
>   properties: Properties,
>   deserializationSchema: DeserializationSchema[Row],
>   fieldsNames: Array[String],
>   typeInfo: Array[TypeInformation[_]]): KafkaTableSource = {
> if (deserializationSchema != null) {
>   new Kafka09TableSource(topic, properties, deserializationSchema, 
> fieldsNames, typeInfo)
> } else {
>   new Kafka09JsonTableSource(topic, properties, fieldsNames, typeInfo)
> }
>   }
> {code}
> Because the modifier of class KafkaTableSource is package-private, we can't 
> declare this function's result type as KafkaTableSource; we must give the 
> specific type.
> If some other Kafka source extends KafkaTableSource and we aren't sure which 
> subclass of KafkaTableSource should be used, how can we specify the type?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4876) Allow web interface to be bound to a specific ip/interface/inetHost

2016-10-24 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4876:
--
Assignee: Bram Vogelaar

> Allow web interface to be bound to a specific ip/interface/inetHost
> ---
>
> Key: FLINK-4876
> URL: https://issues.apache.org/jira/browse/FLINK-4876
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.2.0, 1.1.2, 1.1.3
>Reporter: Bram Vogelaar
>Assignee: Bram Vogelaar
>Priority: Minor
>
> Currently the web interface automatically binds to all interfaces on 0.0.0.0. 
> IMHO there are some use cases for binding only to a specific IP address (e.g. 
> access through an authenticated proxy, or not binding on the management or 
> backup interface).
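
For illustration, a minimal configuration sketch of what such an option could 
look like in flink-conf.yaml. The key {{jobmanager.web.address}} is an assumed 
name, not a confirmed option at the time of this issue; {{jobmanager.web.port}} 
is the existing port setting.

{code}
# flink-conf.yaml -- hypothetical bind-address option (key name assumed)
jobmanager.web.address: 10.0.0.5   # bind the web interface only to this address
jobmanager.web.port: 8081          # existing port setting
{code}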



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4876) Allow web interface to be bound to a specific ip/interface/inetHost

2016-10-24 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602064#comment-15602064
 ] 

Robert Metzger commented on FLINK-4876:
---

Thank you for working on this. I gave you "Contributor" permissions in our JIRA 
so that you can assign issues to yourself. (I've already assigned this one to you.)

> Allow web interface to be bound to a specific ip/interface/inetHost
> ---
>
> Key: FLINK-4876
> URL: https://issues.apache.org/jira/browse/FLINK-4876
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.2.0, 1.1.2, 1.1.3
>Reporter: Bram Vogelaar
>Priority: Minor
>
> Currently the web interface automatically binds to all interfaces on 0.0.0.0. 
> IMHO there are some use cases for binding only to a specific IP address (e.g. 
> access through an authenticated proxy, or not binding on the management or 
> backup interface).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4905) Kafka test instability IllegalStateException: Client is not started

2016-10-25 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4905:
-

 Summary: Kafka test instability IllegalStateException: Client is 
not started
 Key: FLINK-4905
 URL: https://issues.apache.org/jira/browse/FLINK-4905
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Reporter: Robert Metzger


The following travis build 
(https://s3.amazonaws.com/archive.travis-ci.org/jobs/170365439/log.txt)  failed 
because of this error

{code}
08:17:11,239 INFO  org.apache.flink.runtime.jobmanager.JobManager   
 - Status of job 33ebdc0e7c91be186d80658ce3d17069 (Read some records to commit 
offsets to Kafka) changed to FAILING.
java.lang.RuntimeException: Error while confirming checkpoint
at org.apache.flink.runtime.taskmanager.Task$4.run(Task.java:1040)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Client is not started
at 
org.apache.flink.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:173)
at 
org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
at 
org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at 
org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
at 
org.apache.flink.streaming.connectors.kafka.internals.ZookeeperOffsetHandler.setOffsetInZooKeeper(ZookeeperOffsetHandler.java:133)
at 
org.apache.flink.streaming.connectors.kafka.internals.ZookeeperOffsetHandler.prepareAndCommitOffsets(ZookeeperOffsetHandler.java:93)
at 
org.apache.flink.streaming.connectors.kafka.internals.Kafka08Fetcher.commitInternalOffsetsToKafka(Kafka08Fetcher.java:341)
at 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.notifyCheckpointComplete(FlinkKafkaConsumerBase.java:421)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyOfCompletedCheckpoint(AbstractUdfStreamOperator.java:229)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:571)
at org.apache.flink.runtime.taskmanager.Task$4.run(Task.java:1035)
... 5 more
08:17:11,241 INFO  org.apache.flink.runtime.taskmanager.Task
 - Attempting to cancel task Source: Custom Source -> Map -> Map -> Sink: 
Unnamed (1/3)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4050) FlinkKafkaProducer API Refactor

2016-10-25 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-4050:
-

Assignee: Robert Metzger

> FlinkKafkaProducer API Refactor
> ---
>
> Key: FLINK-4050
> URL: https://issues.apache.org/jira/browse/FLINK-4050
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>Assignee: Robert Metzger
>
> The FlinkKafkaProducer API seems more difficult to use than it should be.  
> The API requires you pass it a SerializationSchema or a 
> KeyedSerializationSchema, but the Kafka producer already has a serialization 
> API.  Requiring a serializer in the Flink API precludes the use of the Kafka 
> serializers.  For instance, they preclude the use of the Confluent 
> KafkaAvroSerializer class that makes use of the Confluent Schema Registry.  
> Ideally, the serializer would be optional, so as to allow the Kafka producer 
> serializers to handle the task.
> In addition, the KeyedSerializationSchema conflates message key extraction 
> with key serialization.  If the serializer were optional, to allow the Kafka 
> producer serializers to take over, you'd still need to extract a key from the 
> message.
> And given that the key may not be part of the message you want to write to 
> Kafka, an upstream step may have to package the key with the message to make 
> both available to the sink, for instance in a tuple. That means you also need 
> to define a method to extract the message to write to Kafka from the element 
> passed into the sink by Flink.  
> In summary, there should be separation of extraction of the key and message 
> from the element passed into the sink from serialization, and the 
> serialization step should be optional.
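
For illustration, a minimal sketch of the separation proposed above; the 
interface name and methods are hypothetical, not Flink's actual API.

{code}
// Hypothetical sketch: key/message extraction is separated from
// serialization, and serialization itself is left to Kafka's own
// configured serializers (e.g. Confluent's KafkaAvroSerializer).
import java.io.Serializable;

public interface KafkaRecordExtractor<T, K, V> extends Serializable {
    K extractKey(T element);    // may return null for unkeyed records
    V extractValue(T element);  // payload handed to the Kafka producer,
                                // which applies its configured serializer
}
{code}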



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4905) Kafka test instability IllegalStateException: Client is not started

2016-10-26 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608006#comment-15608006
 ] 

Robert Metzger commented on FLINK-4905:
---

I'm not 100% sure if this can happen, because {{close()}} and 
{{notifyCheckpointComplete}} are already synchronized.

The lock for close is here: 
https://github.com/apache/flink/blob/770f2f83a81b2810aff171b2f56390ef686f725a/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L279
 

The lock for the notify is here: 
https://github.com/apache/flink/blob/770f2f83a81b2810aff171b2f56390ef686f725a/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L565
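
For illustration, a simplified sketch of the locking pattern behind those two 
links; the names are illustrative, not the actual StreamTask code.

{code}
// Shutdown and checkpoint confirmation take the same lock, so the
// Curator/ZooKeeper client should not be closed while an offset commit
// from notifyCheckpointComplete() is still running.
public class CheckpointedTaskSketch {
    private final Object lock = new Object();
    private boolean closed;

    public void close() {
        synchronized (lock) {
            closed = true;
            // shut down the offset handler and its ZooKeeper client
        }
    }

    public void notifyCheckpointComplete(long checkpointId) {
        synchronized (lock) {
            if (closed) {
                return; // ignore late confirmations
            }
            // commit offsets to ZooKeeper via the still-open client
        }
    }
}
{code}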


> Kafka test instability IllegalStateException: Client is not started
> ---
>
> Key: FLINK-4905
> URL: https://issues.apache.org/jira/browse/FLINK-4905
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Reporter: Robert Metzger
>  Labels: test-stability
>
> The following travis build 
> (https://s3.amazonaws.com/archive.travis-ci.org/jobs/170365439/log.txt)  
> failed because of this error
> {code}
> 08:17:11,239 INFO  org.apache.flink.runtime.jobmanager.JobManager 
>- Status of job 33ebdc0e7c91be186d80658ce3d17069 (Read some records to 
> commit offsets to Kafka) changed to FAILING.
> java.lang.RuntimeException: Error while confirming checkpoint
>   at org.apache.flink.runtime.taskmanager.Task$4.run(Task.java:1040)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Client is not started
>   at 
> org.apache.flink.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:173)
>   at 
> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
>   at 
> org.apache.curator.utils.EnsurePath$InitialHelper$1.call(EnsurePath.java:148)
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>   at 
> org.apache.curator.utils.EnsurePath$InitialHelper.ensure(EnsurePath.java:141)
>   at org.apache.curator.utils.EnsurePath.ensure(EnsurePath.java:99)
>   at 
> org.apache.flink.streaming.connectors.kafka.internals.ZookeeperOffsetHandler.setOffsetInZooKeeper(ZookeeperOffsetHandler.java:133)
>   at 
> org.apache.flink.streaming.connectors.kafka.internals.ZookeeperOffsetHandler.prepareAndCommitOffsets(ZookeeperOffsetHandler.java:93)
>   at 
> org.apache.flink.streaming.connectors.kafka.internals.Kafka08Fetcher.commitInternalOffsetsToKafka(Kafka08Fetcher.java:341)
>   at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.notifyCheckpointComplete(FlinkKafkaConsumerBase.java:421)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.notifyOfCompletedCheckpoint(AbstractUdfStreamOperator.java:229)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:571)
>   at org.apache.flink.runtime.taskmanager.Task$4.run(Task.java:1035)
>   ... 5 more
> 08:17:11,241 INFO  org.apache.flink.runtime.taskmanager.Task  
>- Attempting to cancel task Source: Custom Source -> Map -> Map -> Sink: 
> Unnamed (1/3)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4941) Show ship strategy in web interface

2016-10-27 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4941:
-

 Summary: Show ship strategy in web interface
 Key: FLINK-4941
 URL: https://issues.apache.org/jira/browse/FLINK-4941
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Reporter: Robert Metzger
Assignee: Robert Metzger


Currently, only Flink's Plan visualizer shows the ship strategy of a streaming 
job; the web interface does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4941) Show ship strategy in web interface

2016-10-27 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4941:
--
Attachment: ship_strategy_bad.png
ship_strategy_good.png

I attached two screenshots.

> Show ship strategy in web interface
> ---
>
> Key: FLINK-4941
> URL: https://issues.apache.org/jira/browse/FLINK-4941
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Attachments: ship_strategy_bad.png, ship_strategy_good.png
>
>
> Currently, only Flink's Plan visualizer shows the ship strategy of a 
> streaming job; the web interface does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2016-10-31 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-4973:
-

Assignee: Robert Metzger

> Flakey Yarn tests due to recently added latency marker
> --
>
> Key: FLINK-4973
> URL: https://issues.apache.org/jira/browse/FLINK-4973
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency markers on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The resulting exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>   ... 9 more
> {code}
> This exception is clearly related to the shutdown of a stream operator and 
> does not indicate wrong behaviour. Since the YARN tests simply scan the log 
> for some keywords (including "exception"), such a case can make them fail.
> It would be best to make sure that the {{LatencyMarksEmitter}} only emits 
> latency markers while the {{Output}} is still active. But we could also 
> simply not log exceptions which occur after the stream operator has been 
> stopped.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/171578846/log.txt
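
For illustration, a sketch of the first option suggested in the description; 
the names are assumptions, not the actual Flink code.

{code}
// Drop latency markers once the operator has been stopped, instead of
// writing into a destroyed buffer pool.
public class LatencyMarksEmitterSketch {
    private volatile boolean running = true;

    public void emit() {
        if (!running) {
            return; // operator is shutting down; skip this marker
        }
        // output.emitLatencyMarker(...) would go here
    }

    public void close() {
        running = false;
    }
}
{code}

Note that a small race window remains between the flag check and the emission, 
so the second option (not logging exceptions after shutdown) may still be 
needed as a safety net.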



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2016-10-31 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622478#comment-15622478
 ] 

Robert Metzger commented on FLINK-4973:
---

Thank you for reporting this issue. I'll look into it.

> Flakey Yarn tests due to recently added latency marker
> --
>
> Key: FLINK-4973
> URL: https://issues.apache.org/jira/browse/FLINK-4973
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Robert Metzger
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency markers on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The resulting exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>   ... 9 more
> {code}
> This exception is clearly related to the shutdown of a stream operator and 
> does not indicate wrong behaviour. Since the YARN tests simply scan the log 
> for some keywords (including "exception"), such a case can make them fail.
> It would be best to make sure that the {{LatencyMarksEmitter}} only emits 
> latency markers while the {{Output}} is still active. But we could also 
> simply not log exceptions which occur after the stream operator has been 
> stopped.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/171578846/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4974) RescalingITCase.testSavepointRescalingInPartitionedOperatorState unstable

2016-10-31 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4974:
-

 Summary: 
RescalingITCase.testSavepointRescalingInPartitionedOperatorState unstable
 Key: FLINK-4974
 URL: https://issues.apache.org/jira/browse/FLINK-4974
 Project: Flink
  Issue Type: Bug
Reporter: Robert Metzger


{code}
testSavepointRescalingInPartitionedOperatorState(org.apache.flink.test.checkpointing.RescalingITCase)
  Time elapsed: 2.761 sec  <<< FAILURE!
java.lang.AssertionError: expected:<24> but was:<72>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingPartitionedOperatorState(RescalingITCase.java:560)
at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingInPartitionedOperatorState(RescalingITCase.java:445)
{code}

in https://s3.amazonaws.com/archive.travis-ci.org/jobs/171445547/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4945) KafkaConsumer logs wrong warning about confirmation for unknown checkpoint

2016-11-02 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4945:
--
Component/s: Kafka Connector

> KafkaConsumer logs wrong warning about confirmation for unknown checkpoint
> --
>
> Key: FLINK-4945
> URL: https://issues.apache.org/jira/browse/FLINK-4945
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Minor
> Fix For: 1.2.0
>
>
> Checkpoints are currently not registered in all cases. While the code still 
> behaves correctly this leads to misleading warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4945) KafkaConsumer logs wrong warning about confirmation for unknown checkpoint

2016-11-02 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4945:
--
Fix Version/s: 1.2.0

> KafkaConsumer logs wrong warning about confirmation for unknown checkpoint
> --
>
> Key: FLINK-4945
> URL: https://issues.apache.org/jira/browse/FLINK-4945
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Minor
> Fix For: 1.2.0
>
>
> Checkpoints are currently not registered in all cases. While the code still 
> behaves correctly this leads to misleading warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4945) KafkaConsumer logs wrong warning about confirmation for unknown checkpoint

2016-11-02 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4945.
---
Resolution: Fixed

Thank you for fixing the issue.

Merged to master in http://git-wip-us.apache.org/repos/asf/flink/commit/223b0aa0

> KafkaConsumer logs wrong warning about confirmation for unknown checkpoint
> --
>
> Key: FLINK-4945
> URL: https://issues.apache.org/jira/browse/FLINK-4945
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Minor
> Fix For: 1.2.0
>
>
> Checkpoints are currently not registered in all cases. While the code still 
> behaves correctly this leads to misleading warnings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5001) Ensure that the Kafka 0.9+ connector is compatible with kafka-consumer-groups.sh

2016-11-02 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5001:
-

 Summary: Ensure that the Kafka 0.9+ connector is compatible with 
kafka-consumer-groups.sh
 Key: FLINK-5001
 URL: https://issues.apache.org/jira/browse/FLINK-5001
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Affects Versions: 1.1.0, 1.2.0
Reporter: Robert Metzger
Priority: Critical


Similarly to FLINK-4822, the offsets committed by Flink's Kafka 0.9+ consumer 
are not available through the {{kafka-consumer-groups.sh}} tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5001) Ensure that the Kafka 0.9+ connector is compatible with kafka-consumer-groups.sh

2016-11-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632274#comment-15632274
 ] 

Robert Metzger commented on FLINK-5001:
---

The {{kafka.admin.ConsumerGroupCommand}} object contains the implementation of 
the {{kafka-consumer-groups.sh}} utility.
I ran it with a debugger attached, and it seems that the broker is not 
returning all consumer groups to the client.

> Ensure that the Kafka 0.9+ connector is compatible with 
> kafka-consumer-groups.sh
> 
>
> Key: FLINK-5001
> URL: https://issues.apache.org/jira/browse/FLINK-5001
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Robert Metzger
>Priority: Blocker
>
> Similarly to FLINK-4822, the offsets committed by Flink's Kafka 0.9+ consumer 
> are not available through the {{kafka-consumer-groups.sh}} tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5001) Ensure that the Kafka 0.9+ connector is compatible with kafka-consumer-groups.sh

2016-11-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632307#comment-15632307
 ] 

Robert Metzger commented on FLINK-5001:
---

Okay, this is a limitation of the {{KafkaConsumer}}. Since we are manually 
assigning the partitions to consume, we are not participating in the 
group-balancing mechanism.
http://grokbase.com/t/kafka/users/163rrq9ne8/new-consumer-group-not-showing-up
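
For illustration, a minimal sketch of the difference using the Kafka 0.9 
consumer API; the broker address, group id, and topic are placeholders.

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignVsSubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "my-group");
        props.setProperty("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.setProperty("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Group-managed subscription: the consumer registers with the group
        // coordinator, so the group shows up in kafka-consumer-groups.sh.
        try (KafkaConsumer<byte[], byte[]> subscribing = new KafkaConsumer<>(props)) {
            subscribing.subscribe(Collections.singletonList("topic"));
        }

        // Manual assignment (what Flink's consumer does): offsets can still
        // be committed, but the group never registers with the coordinator
        // and stays invisible to the group-listing tooling.
        try (KafkaConsumer<byte[], byte[]> assigning = new KafkaConsumer<>(props)) {
            assigning.assign(Collections.singletonList(new TopicPartition("topic", 0)));
        }
    }
}
{code}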


> Ensure that the Kafka 0.9+ connector is compatible with 
> kafka-consumer-groups.sh
> 
>
> Key: FLINK-5001
> URL: https://issues.apache.org/jira/browse/FLINK-5001
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Robert Metzger
>Priority: Blocker
>
> Similarly to FLINK-4822, the offsets committed by Flink's Kafka 0.9+ consumer 
> are not available through the {{kafka-consumer-groups.sh}} tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-5001) Ensure that the Kafka 0.9+ connector is compatible with kafka-consumer-groups.sh

2016-11-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-5001:
-

Assignee: Robert Metzger

> Ensure that the Kafka 0.9+ connector is compatible with 
> kafka-consumer-groups.sh
> 
>
> Key: FLINK-5001
> URL: https://issues.apache.org/jira/browse/FLINK-5001
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Blocker
>
> Similarly to FLINK-4822, the offsets committed by Flink's Kafka 0.9+ consumer 
> are not available through the {{kafka-consumer-groups.sh}} tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5013) Flink Kinesis connector doesn't work on old EMR versions

2016-11-04 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5013:
-

 Summary: Flink Kinesis connector doesn't work on old EMR versions
 Key: FLINK-5013
 URL: https://issues.apache.org/jira/browse/FLINK-5013
 Project: Flink
  Issue Type: Bug
  Components: Kinesis Connector
Reporter: Robert Metzger


A user reported on the mailing list that our Kinesis connector doesn't work 
with EMR 4.4.0: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kinesis-Connector-Dependency-Problems-td9790.html

The problem seems to be that Flink is loading older libraries from the "YARN 
container classpath", which on EMR contains the default Amazon libraries.

We should try to shade Kinesis and its Amazon dependencies into a different 
namespace.
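
For illustration, a sketch of what such a relocation could look like with the 
maven-shade-plugin; the shaded package prefix is an assumption, not the actual 
Flink build configuration.

{code}
<!-- Relocate the AWS SDK classes bundled in the Kinesis connector so they
     cannot clash with the older SDK on the EMR/YARN container classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.amazonaws</pattern>
            <shadedPattern>org.apache.flink.kinesis.shaded.com.amazonaws</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
{code}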



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4221) Show metrics in WebFrontend

2016-11-04 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4221.
---
   Resolution: Fixed
Fix Version/s: 1.2.0

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/3a4fc537

> Show metrics in WebFrontend
> ---
>
> Key: FLINK-4221
> URL: https://issues.apache.org/jira/browse/FLINK-4221
> Project: Flink
>  Issue Type: Sub-task
>  Components: Metrics, Webfrontend
>Affects Versions: 1.0.0
>Reporter: Chesnay Schepler
>Assignee: Robert Metzger
> Fix For: 1.2.0, pre-apache
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5731) Split up CI builds

2017-02-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860971#comment-15860971
 ] 

Robert Metzger commented on FLINK-5731:
---

We dropped Hadoop 1 support just recently with the 1.2 release, so we have a 
tradition of supporting older Hadoop versions for a long time.
The reasoning is that we don't want to exclude any new users because of their 
Hadoop version. As long as supporting old Hadoop versions doesn't add too much 
pain, we'll probably keep them.
If you really want to change this, you would probably need to start a discussion 
on the dev@ mailing list.

> Split up CI builds
> --
>
> Key: FLINK-5731
> URL: https://issues.apache.org/jira/browse/FLINK-5731
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Tests
>Reporter: Ufuk Celebi
>Assignee: Robert Metzger
>Priority: Critical
>
> Test builds regularly time out because we are hitting the Travis 50 min 
> limit. Previously, we worked around this by splitting up the tests into 
> groups. I think we have to split them further.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5316) Make the GenericWriteAheadSink backwards compatible.

2017-02-12 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862757#comment-15862757
 ] 

Robert Metzger commented on FLINK-5316:
---

I would suggest not implementing this unless a user asks for it.

> Make the GenericWriteAheadSink backwards compatible.
> 
>
> Key: FLINK-5316
> URL: https://issues.apache.org/jira/browse/FLINK-5316
> Project: Flink
>  Issue Type: Improvement
>  Components: Cassandra Connector
>Affects Versions: 1.2.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-3985) A metric with the name * was already registered

2017-02-12 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-3985.
---
Resolution: Fixed

The issue is fixed in the YARN tests.

> A metric with the name * was already registered
> ---
>
> Key: FLINK-3985
> URL: https://issues.apache.org/jira/browse/FLINK-3985
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Stephan Ewen
>  Labels: test-stability
>
> The YARN tests detected the following failure while running WordCount.
> {code}
> 2016-05-27 21:50:48,230 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Received task CHAIN DataSource (at main(WordCount.java:70) 
> (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at 
> main(WordCount.java:80)) -> Combine(SUM(1), at main(WordCount.java:83) (1/2)
> 2016-05-27 21:50:48,231 ERROR org.apache.flink.metrics.reporter.JMXReporter   
>   - A metric with the name 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-6e03e1e8-3385-linux-1,key1=taskmanager,key2=ee7c10183f32c9a96f8e7cfd873863d1,key3=WordCount_Example,key4=CHAIN_DataSource_(at_main(WordCount.java-70)_(org.apache.flink.api.java.io.TextInputFormat))_->_FlatMap_(FlatMap_at_main(WordCount.java-80))_->_Combine(SUM(1)-_at_main(WordCount.java-83),name=numBytesIn
>  was already registered.
> javax.management.InstanceAlreadyExistsException: 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-6e03e1e8-3385-linux-1,key1=taskmanager,key2=ee7c10183f32c9a96f8e7cfd873863d1,key3=WordCount_Example,key4=CHAIN_DataSource_(at_main(WordCount.java-70)_(org.apache.flink.api.java.io.TextInputFormat))_->_FlatMap_(FlatMap_at_main(WordCount.java-80))_->_Combine(SUM(1)-_at_main(WordCount.java-83),name=numBytesIn
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at 
> org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
>   at 
> org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
>   at 
> org.apache.flink.metrics.groups.IOMetricGroup.(IOMetricGroup.java:40)
>   at 
> org.apache.flink.metrics.groups.TaskMetricGroup.(TaskMetricGroup.java:74)
>   at 
> org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
>   at 
> org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1093)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:442)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:284)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundRece
> {code}

[jira] [Commented] (FLINK-5690) protobuf is not shaded properly

2017-02-12 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15862821#comment-15862821
 ] 

Robert Metzger commented on FLINK-5690:
---

You are right. My project contained the protobuf dependency.

Are you okay with closing this JIRA as "won't fix"? We currently cannot shade 
protobuf away, due to our dependency on Akka.

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Robert Metzger
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from 
> this class
> * deploy the job to a flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.
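
For illustration, a sketch of the reproduction described above; whether 
escapeText is package-private depends on the protobuf version.

{code}
// Placed in the protobuf package on purpose: this class compiles against
// the user's protobuf jar, but at runtime TextFormat comes from the Flink
// distribution via the parent classloader, while TestClass is loaded by
// FlinkUserCodeClassLoader. Package-private access requires the same
// runtime package, i.e. the same classloader, so the call fails with an
// IllegalAccessError.
package com.google.protobuf;

public class TestClass {
    public static String escape(String s) {
        return TextFormat.escapeText(s); // package-private in this scenario
    }
}
{code}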



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5690) protobuf is not shaded properly

2017-02-13 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864266#comment-15864266
 ] 

Robert Metzger commented on FLINK-5690:
---

1) I'll open a pull request.

Regarding 2) This is something we could look into. 
However, so far, almost all complaints from users were about the dependencies 
we inherit from Hadoop. That's why I've tried relocating all of Hadoop's 
dependencies: 
https://issues.apache.org/jira/browse/FLINK-5297?focusedCommentId=15815050&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15815050
 (with mixed results).

Also, I'm planning to provide a "hadoop free" version of Flink. As more and 
more of our users actually run without anything from Hadoop (for example on AWS 
with S3 and Docker), these users don't need all the deps we inherit from Hadoop.
The libraries you've mentioned are pretty good at maintaining API 
compatibility, so they can be mixed more safely in one classpath.

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Robert Metzger
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from 
> this class
> * deploy the job to a flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5690) protobuf is not shaded properly

2017-02-13 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864312#comment-15864312
 ] 

Robert Metzger commented on FLINK-5690:
---

Feel free to review my pull request.

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Robert Metzger
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from 
> this class
> * deploy the job to a flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5736) Fix Javadocs / Scaladocs build (Feb 2017)

2017-02-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865758#comment-15865758
 ] 

Robert Metzger commented on FLINK-5736:
---

I've fixed the javadocs for the 1.2 build: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/api/java/
Tonight, it'll also be rebuilt for the other versions.

> Fix Javadocs / Scaladocs build (Feb 2017)
> -
>
> Key: FLINK-5736
> URL: https://issues.apache.org/jira/browse/FLINK-5736
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Documentation
>Reporter: Robert Metzger
>
> It looks like the scaladocs are not building properly.
> Expected URL: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/api/java/
> Buildbot output: 
> https://ci.apache.org/builders/flink-docs-release-1.2/builds/15/steps/Java%20%26%20Scala%20docs/logs/stdio
> Command to reproduce issue on master: mvn clean javadoc:aggregate 
> -Paggregate-scaladoc -DadditionalJOption="-Xdoclint:none" -Dheader="<a 
> href="http://flink.apache.org/" target="_top">Back to Flink 
> Website</a>"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-5736) Fix Javadocs / Scaladocs build (Feb 2017)

2017-02-14 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-5736:
-

Assignee: Robert Metzger

> Fix Javadocs / Scaladocs build (Feb 2017)
> -
>
> Key: FLINK-5736
> URL: https://issues.apache.org/jira/browse/FLINK-5736
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Documentation
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> It looks like the scaladocs are not building properly.
> Expected URL: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/api/java/
> Buildbot output: 
> https://ci.apache.org/builders/flink-docs-release-1.2/builds/15/steps/Java%20%26%20Scala%20docs/logs/stdio
> Command to reproduce issue on master: mvn clean javadoc:aggregate 
> -Paggregate-scaladoc -DadditionalJOption="-Xdoclint:none" -Dheader="<a 
> href="http://flink.apache.org/" target="_top">Back to Flink 
> Website</a>"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5736) Fix Javadocs / Scaladocs build (Feb 2017)

2017-02-14 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5736.
---
Resolution: Fixed

> Fix Javadocs / Scaladocs build (Feb 2017)
> -
>
> Key: FLINK-5736
> URL: https://issues.apache.org/jira/browse/FLINK-5736
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Documentation
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> It looks like the scaladocs are not building properly.
> Expected URL: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/api/java/
> Buildbot output: 
> https://ci.apache.org/builders/flink-docs-release-1.2/builds/15/steps/Java%20%26%20Scala%20docs/logs/stdio
> Command to reproduce issue on master: mvn clean javadoc:aggregate 
> -Paggregate-scaladoc -DadditionalJOption="-Xdoclint:none" -Dheader="<a 
> href="http://flink.apache.org/" target="_top">Back to Flink 
> Website</a>"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5736) Fix Javadocs / Scaladocs build (Feb 2017)

2017-02-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865766#comment-15865766
 ] 

Robert Metzger commented on FLINK-5736:
---

The problem was the following:
On the buildbot server (configuration is located here: 
https://svn.apache.org/repos/infra/infrastructure/buildbot/aegis/buildmaster/master1/projects)
 there is a script that generates the javadocs.
The script checks whether the build server (the servers are apparently not 
homogeneous) has Java 8 or not. If Java 8 is present, 
{{-DadditionalJOption="-Xdoclint:none"}} is passed to the maven invocation.

The problem was that the Java version check was not working, hence the flag was 
not passed.
It seems that this doclint option does not only affect HTML errors (or wrong 
links etc.); it also affects other issues during the javadoc compilation. With 
{{-Xdoclint:none}}, basically all javadoc errors turn into warnings.

As a short-term fix, I'm now always passing the doclint option.

> Fix Javadocs / Scaladocs build (Feb 2017)
> -
>
> Key: FLINK-5736
> URL: https://issues.apache.org/jira/browse/FLINK-5736
> Project: Flink
>  Issue Type: Task
>  Components: Build System, Documentation
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>
> It looks like the scaladocs are not building properly.
> Expected URL: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/api/java/
> Buildbot output: 
> https://ci.apache.org/builders/flink-docs-release-1.2/builds/15/steps/Java%20%26%20Scala%20docs/logs/stdio
> Command to reproduce issue on master: mvn clean javadoc:aggregate 
> -Paggregate-scaladoc -DadditionalJOption="-Xdoclint:none" -Dheader="<a 
> href="http://flink.apache.org/" target="_top">Back to Flink 
> Website</a>"



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5690) protobuf is not shaded properly

2017-02-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865813#comment-15865813
 ] 

Robert Metzger commented on FLINK-5690:
---

Merged documentation update to master in 
http://git-wip-us.apache.org/repos/asf/flink/commit/d3228144

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Robert Metzger
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from 
> this class
> * deploy the job to a flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5796) broken links in the docs

2017-02-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865894#comment-15865894
 ] 

Robert Metzger commented on FLINK-5796:
---

Sorry, I don't have time for this right now.

I think we can integrate this check into the build docs script; that way, we 
don't need to work on buildbot. Ideally this is put into a separate script 
that is called from both the build docs script and buildbot. (I can do the 
buildbot part.)

> broken links in the docs
> 
>
> Key: FLINK-5796
> URL: https://issues.apache.org/jira/browse/FLINK-5796
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.3.0
>Reporter: Nico Kruber
>
> running a link checker on the locally-served flink docs yields the following 
> broken links (same for online docs):
> {code}
> 15:21:55  Error:  "Not Found " (404) at link 0.0.0.0:4000/dev/state (from 
> 0.0.0.0:4000/dev/datastream_api.html)
> 15:21:55  Error:  "Not Found " (404) at link 
> 0.0.0.0:4000/monitoring/best_practices.html (from 0.0.0.0:4000/dev/batch/)
> 15:21:55  Error:  "Not Found " (404) at link 
> 0.0.0.0:4000/api/java/org/apache/flink/table/api/Table.html (from 
> 0.0.0.0:4000/dev/table_api.html)
> 15:21:55  Error:  "Not Found " (404) at link 0.0.0.0:4000/dev/state.html 
> (from 0.0.0.0:4000/ops/state_backends.html)
> 15:21:55  Error:  "Not Found " (404) at link 
> 0.0.0.0:4000/internals/state_backends.html (from 
> 0.0.0.0:4000/internals/stream_checkpointing.html)
> {code}
> FYI: command to replay:
> {{httrack http://0.0.0.0:4000/  -O "$PWD/flink-docs" --testlinks  -%v 
> --depth= --ext-depth=0}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5831) Sort metrics in metric selector and add search box

2017-02-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5831:
-

 Summary: Sort metrics in metric selector and add search box
 Key: FLINK-5831
 URL: https://issues.apache.org/jira/browse/FLINK-5831
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Reporter: Robert Metzger


The JobManager UI makes it hard to select metrics using the drop down menu.

First of all, it would be nice to sort all entries. Also, a search box on top of 
the drop down would make it much easier to find the metrics.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-5831) Sort metrics in metric selector and add search box

2017-02-17 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5831:
--
Attachment: dropDown.png

I've attached the drop down I'm talking about as a screenshot.

> Sort metrics in metric selector and add search box
> --
>
> Key: FLINK-5831
> URL: https://issues.apache.org/jira/browse/FLINK-5831
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Reporter: Robert Metzger
> Attachments: dropDown.png
>
>
> The JobManager UI makes it hard to select metrics using the drop down menu.
> First of all, it would be nice to sort all entries. Also, a search box on top 
> of the drop down would make it much easier to find the metrics.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-5487) Proper at-least-once support for ElasticsearchSink

2017-02-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-5487:
-

Assignee: Tzu-Li (Gordon) Tai

> Proper at-least-once support for ElasticsearchSink
> --
>
> Key: FLINK-5487
> URL: https://issues.apache.org/jira/browse/FLINK-5487
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Critical
>
> Discussion in ML: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fault-tolerance-guarantees-of-Elasticsearch-sink-in-flink-elasticsearch2-td10982.html
> Currently, the Elasticsearch Sink actually doesn't offer any guarantees for 
> message delivery.
> For proper support of at-least-once, the sink will need to participate in 
> Flink's checkpointing: when snapshotting is triggered at the 
> {{ElasticsearchSink}}, we need to synchronize on the pending ES requests by 
> flushing the internal bulk processor. For temporary ES failures (see 
> FLINK-5122) that may happen on the flush, we should retry them before 
> returning from snapshotting and acking the checkpoint. If there are 
> non-temporary ES failures on the flush, the current snapshot should fail.
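
For illustration, a simplified sketch of the proposed snapshot behavior, 
assuming Flink's CheckpointedFunction interface and Elasticsearch's 
BulkProcessor; the failure bookkeeping is reduced to a single field.

{code}
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.elasticsearch.action.bulk.BulkProcessor;

public abstract class FlushingSinkSketch implements CheckpointedFunction {
    private transient BulkProcessor bulkProcessor;
    // set by the BulkProcessor listener when a request fails permanently
    private transient volatile Throwable failure;

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        bulkProcessor.flush(); // synchronize on all pending ES requests
        if (failure != null) {
            // a non-temporary failure must fail the snapshot
            throw new RuntimeException("bulk request failed", failure);
        }
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // no managed state to restore in this sketch
    }
}
{code}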



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5874) Reject arrays as keys in DataStream API to avoid inconsistent hashing

2017-02-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5874:
-

 Summary: Reject arrays as keys in DataStream API to avoid 
inconsistent hashing
 Key: FLINK-5874
 URL: https://issues.apache.org/jira/browse/FLINK-5874
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.1.4, 1.2.0
Reporter: Robert Metzger
Priority: Blocker


This issue has been reported on the mailing list twice:
- 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Previously-working-job-fails-on-Flink-1-2-0-td11741.html
- 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Arrays-values-in-keyBy-td7530.html

The problem is the following: we use just the key array's hashCode() to compute 
the hash when shuffling data. Java's default hashCode() implementation doesn't 
take the array's contents into account, only its identity (memory address).
This leads to different hash codes on the sender and the receiver side.
In Flink 1.1 this means that the data is shuffled randomly instead of keyed; in 
Flink 1.2 the key-groups code detects the violation of the hashing.

The proper fix of the problem would be to rely on Flink's {{TypeComparator}} 
class, which has a type-specific hashing function. But introducing this change 
would break compatibility with existing code.
I'll file a JIRA for the 2.0 changes for that fix.

For 1.2.1 and 1.3.0 we should at least reject arrays as keys.
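
For illustration, a small self-contained demonstration of the Java behavior 
described above:

{code}
import java.util.Arrays;

public class ArrayKeyDemo {
    public static void main(String[] args) {
        int[] sender = {1, 2, 3};
        int[] receiver = {1, 2, 3};
        // identity-based hash: differs for equal contents
        System.out.println(sender.hashCode() == receiver.hashCode()); // false
        // content-based hash: stable across processes
        System.out.println(Arrays.hashCode(sender) == Arrays.hashCode(receiver)); // true
    }
}
{code}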



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-5875) Use TypeComparator.hash() instead of Object.hashCode() for keying in DataStream API

2017-02-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5875:
--
Summary: Use TypeComparator.hash() instead of Object.hashCode() for keying 
in DataStream API  (was: Use TypeComparator.hash instead of Object.hashCode() 
for keying in DataStream API)

> Use TypeComparator.hash() instead of Object.hashCode() for keying in 
> DataStream API
> ---
>
> Key: FLINK-5875
> URL: https://issues.apache.org/jira/browse/FLINK-5875
> Project: Flink
>  Issue Type: Sub-task
>  Components: DataStream API
>Reporter: Robert Metzger
> Fix For: 2.0.0
>
>
> See FLINK-5874 for details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5875) Use TypeComparator.hash instead of Object.hashCode() for keying in DataStream API

2017-02-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5875:
-

 Summary: Use TypeComparator.hash instead of Object.hashCode() for 
keying in DataStream API
 Key: FLINK-5875
 URL: https://issues.apache.org/jira/browse/FLINK-5875
 Project: Flink
  Issue Type: Sub-task
  Components: DataStream API
Reporter: Robert Metzger
 Fix For: 2.0.0


See FLINK-5874 for details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5731) Split up CI builds

2017-02-22 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5731.
---
Resolution: Fixed

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/e7a914d4

> Split up CI builds
> --
>
> Key: FLINK-5731
> URL: https://issues.apache.org/jira/browse/FLINK-5731
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Tests
>Reporter: Ufuk Celebi
>Assignee: Robert Metzger
>Priority: Critical
>
> Test builds regularly time out because we are hitting the Travis 50 min 
> limit. Previously, we worked around this by splitting up the tests into 
> groups. I think we have to split them further.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5898) Race-Condition with Amazon Kinesis KPL

2017-02-24 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882319#comment-15882319
 ] 

Robert Metzger commented on FLINK-5898:
---

Thank you, Scott, for looking into this!
Fixing it in the KPL is probably the easiest option.

If that doesn't work, we could consider temporarily changing the 
"java.io.tmpdir" system property to include a random UUID.

> Race-Condition with Amazon Kinesis KPL
> --
>
> Key: FLINK-5898
> URL: https://issues.apache.org/jira/browse/FLINK-5898
> Project: Flink
>  Issue Type: Bug
>  Components: Kinesis Connector
>Affects Versions: 1.2.0
>Reporter: Scott Kidder
>
> The Flink Kinesis streaming-connector uses the Amazon Kinesis Producer 
> Library (KPL) to send messages to Kinesis streams. The KPL relies on a native 
> binary client to send messages to achieve better performance.
> When a Kinesis Producer is instantiated, the KPL will extract the native 
> binary to a sub-directory of `/tmp` (or whatever the platform-specific 
> temporary directory happens to be).
> The KPL tries to prevent multiple processes from extracting the binary at the 
> same time by wrapping the operation in a mutex. Unfortunately, this does not 
> prevent multiple Flink cores from trying to perform this operation at the 
> same time. If two or more processes attempt to do this at the same time, then 
> the native binary in /tmp will be corrupted.
> The authors of the KPL are aware of this possibility and suggest that users 
> of the KPL not do that ... (sigh):
> https://github.com/awslabs/amazon-kinesis-producer/issues/55#issuecomment-251408897
> I encountered this in my production environment when bringing up a new Flink 
> task-manager with multiple cores and restoring from an earlier savepoint, 
> resulting in the instantiation of a KPL client on each core at roughly the 
> same time.
> A stack-trace follows:
> {noformat}
> java.lang.RuntimeException: Could not copy native binaries to temp directory 
> /tmp/amazon-kinesis-producer-native-binaries
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:849)
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.(KinesisProducer.java:243)
>   at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.open(FlinkKinesisProducer.java:198)
>   at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>   at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:112)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.SecurityException: The contents of the binary 
> /tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_e9a87c761db92a73eb74519a4468ee71def87eb2
>  is not what it's expected to be.
>   at 
> com.amazonaws.services.kinesis.producer.KinesisProducer.extractBinaries(KinesisProducer.java:822)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5668) Reduce dependency on HDFS at job startup time

2017-02-27 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885456#comment-15885456
 ] 

Robert Metzger commented on FLINK-5668:
---

Sorry that I did not look at this JIRA earlier.

[~bill.liu8904] and [~wheat9] if I understand you correctly, you want Flink on 
YARN not to use Hadoop's {{fs.defaultFS}} configuration for choosing the 
filesystem used to distribute jars and configuration files during deployment?

This basically means that we need to provide a custom configuration key in 
Flink (something like {{yarn.deploy.fs}}) that controls where these files are 
put. This would allow you to use S3 for deploying Flink and HDFS for RocksDB 
or other state backups.

I'm not able to completely understand FLINK-5631: How can I register an 
additional jar from a different file system as a required resource?

> Reduce dependency on HDFS at job startup time
> -
>
> Key: FLINK-5668
> URL: https://issues.apache.org/jira/browse/FLINK-5668
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Bill Liu
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When creating a Flink cluster on YARN, the JobManager depends on HDFS to 
> share taskmanager-conf.yaml with the TaskManagers.
> It would be better to share the taskmanager-conf.yaml via the JobManager web 
> server instead of HDFS, which would reduce the HDFS dependency at job startup.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5668) passing taskmanager configuration through taskManagerEnv instead of file

2017-02-28 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887703#comment-15887703
 ] 

Robert Metzger commented on FLINK-5668:
---

[~wheat9] I still did not understand how you can "inject" a custom path for 
some of the resources deployed by Flink on YARN.
All the paths are programmatically generated and there are no configuration 
parameters for passing custom paths (correct me if I'm wrong).

Are you planning to basically fork Flink and create a custom YARN client / 
Application Master implementation that allows using custom paths?

> regarding this issue, do you have an idea why the current implementation 
> writes the configuration into a file on default.FS?

I think we didn't have your use case in mind when implementing the code. We 
assumed that one file system would be used for distributing all required files. 
Also, this approach works nicely with all the Hadoop vendors' versions.

> What do you think about passing the configuration through the ``taskManagerEnv``?

I can only assess this approach once I've understood what you are trying to 
achieve.

> passing taskmanager configuration through taskManagerEnv instead of file
> 
>
> Key: FLINK-5668
> URL: https://issues.apache.org/jira/browse/FLINK-5668
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Bill Liu
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When creating a Flink cluster on YARN, the JobManager depends on HDFS to 
> share taskmanager-conf.yaml with the TaskManagers.
> It would be better to share the taskmanager-conf.yaml via the JobManager web 
> server instead of HDFS, which would reduce the HDFS dependency at job startup.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-5598) Return jar name when jar is uploaded

2017-03-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-5598:
-

Assignee: Fabian Wollert

> Return jar name when jar is uploaded
> 
>
> Key: FLINK-5598
> URL: https://issues.apache.org/jira/browse/FLINK-5598
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client
>Reporter: Sendoh
>Assignee: Fabian Wollert
>
> As a Jenkins user who wants to upload a jar through an HTTP call, I want the 
> jar file name to be returned after the jar is uploaded.
> Currently it returns nothing, as the code shows:
> File newFile = new File(jarDir, UUID.randomUUID() + "_" + filename);
> if (tempFile.renameTo(newFile)) {
>     // all went well
>     return "{}";
> }
> Ref: 
> https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarUploadHandler.java#L58
> My proposal would be 
> return {"fileName": newFile.getName()}
> Any suggestion is welcome.
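
A hedged sketch of what the proposed change could look like (the handler is
heavily simplified here; only the JSON response shape comes from the proposal
above):

{code}
import java.io.File;
import java.util.UUID;

// Sketch: return the generated file name instead of an empty JSON object.
public class JarUploadSketch {
    static String handleUpload(File jarDir, File tempFile, String filename) {
        File newFile = new File(jarDir, UUID.randomUUID() + "_" + filename);
        if (tempFile.renameTo(newFile)) {
            // all went well: tell the caller which name was assigned
            return "{\"fileName\": \"" + newFile.getName() + "\"}";
        }
        return "{\"error\": \"Failed to store the uploaded jar.\"}";
    }
}
{code}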



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5598) Return jar name when jar is uploaded

2017-03-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894932#comment-15894932
 ] 

Robert Metzger commented on FLINK-5598:
---

Thanks a lot for fixing this issue. I assigned the JIRA to you (you can now 
assign JIRAs yourself, since you have "Contributor" permissions). Note that I 
gave your @zalando.com account the permissions.

> Return jar name when jar is uploaded
> 
>
> Key: FLINK-5598
> URL: https://issues.apache.org/jira/browse/FLINK-5598
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client
>Reporter: Sendoh
>Assignee: Fabian Wollert
>
> As a Jenkins user who wants to upload a jar through an HTTP call, I want the 
> jar file name to be returned after the jar is uploaded.
> Currently it returns nothing, as the code shows:
> File newFile = new File(jarDir, UUID.randomUUID() + "_" + filename);
> if (tempFile.renameTo(newFile)) {
>     // all went well
>     return "{}";
> }
> Ref: 
> https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JarUploadHandler.java#L58
> My proposal would be 
> return {"fileName": newFile.getName()}
> Any suggestion is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4286) Have Kafka examples that use the Kafka 0.9 connector

2017-03-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894965#comment-15894965
 ] 

Robert Metzger commented on FLINK-4286:
---

I would vote not to have two examples that basically differ only in the Kafka 
version.
We can just update the Kafka 0.8-based example to Kafka 0.9.

> Have Kafka examples that use the Kafka 0.9 connector
> 
>
> Key: FLINK-4286
> URL: https://issues.apache.org/jira/browse/FLINK-4286
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Dmitrii Kniazev
>Priority: Minor
>  Labels: starter
>
> The {{ReadFromKafka}} and {{WriteIntoKafka}} examples use the 0.8 connector, 
> and the built example jar is named {{Kafka.jar}} under 
> {{examples/streaming/}} in the distributed package.
> Since we have different connectors for different Kafka versions, it would be 
> good to have examples for different versions, and package them as 
> {{Kafka08.jar}} and {{Kafka09.jar}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5067) Make Flink compile with 1.8 Java compiler

2017-03-04 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5067.
---
   Resolution: Fixed
 Assignee: Andrey Melentyev
Fix Version/s: 1.3.0

Resolved in master for 1.3 in 
http://git-wip-us.apache.org/repos/asf/flink/commit/dc00fb0c

> Make Flink compile with 1.8 Java compiler
> -
>
> Key: FLINK-5067
> URL: https://issues.apache.org/jira/browse/FLINK-5067
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.2.0
> Environment: macOS Sierra 10.12.1, java version "1.8.0_112", Apache 
> Maven 3.3.9
>Reporter: Andrey Melentyev
>Assignee: Andrey Melentyev
>Priority: Minor
> Fix For: 1.3.0
>
>
> Flink fails to compile when using 1.8 as source and target in Maven. There 
> are two types of issues, both related to the new type inference rules:
> * The call to the TypeSerializer.copy method in TupleSerializer.java:112 now 
> resolves to a different overload than before, causing a compilation error: [ERROR] 
> /Users/andrey.melentyev/Dev/github.com/apache/flink/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/TupleSerializer.java:[112,63]
>  incompatible types: void cannot be converted to java.lang.Object
> * A number of unit tests using assertEquals fail to compile:
> [ERROR] 
> /Users/andrey.melentyev/Dev/github.com/apache/flink/flink-java/src/test/java/org/apache/flink/api/common/operators/CollectionExecutionAccumulatorsTest.java:[50,25]
>  reference to assertEquals is ambiguous
> [ERROR] both method assertEquals(long,long) in org.junit.Assert and method 
> assertEquals(java.lang.Object,java.lang.Object) in org.junit.Assert match
> In both of the above scenarios, explicitly casting one of the arguments helps 
> the compiler resolve the overloaded method call correctly.
> It is possible to maintain Flink's code base in a state where it can be built 
> by both Java 1.7 and 1.8. For this purpose we need minor code fixes and an 
> automated Travis build to preserve this state.
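
An illustrative reproduction of the second scenario (assumed shape, not the
actual Flink test code): under Java 8 type inference neither overload is more
specific once boxing is involved, and casting one argument to Object forces
the assertEquals(Object, Object) overload:

{code}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class AssertAmbiguityDemo {
    @Test
    public void disambiguated() {
        Integer boxed = 42;
        // assertEquals(42, boxed) is ambiguous under javac 1.8: both
        // assertEquals(long, long) and assertEquals(Object, Object) match.
        // Casting one argument selects the Object overload explicitly.
        assertEquals((Object) 42, boxed);
    }
}
{code}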



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-2268) Provide Flink binary release without Hadoop

2017-03-08 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-2268:
--
Description: 
Currently, all Flink releases ship with Hadoop 2.3.0 binaries.
The big Hadoop distributions are usually not relying on vanilla Hadoop 
releases, but on custom patched versions.
To provide the best user experience, we should offer a Flink binary that uses 
the Hadoop jars provided by the user (=hadoop distribution)

  was:
Currently, all Flink releases ship with Hadoop 2.2.0 binaries.
The big Hadoop distributions are usually not relying on vanilla Hadoop 
releases, but on custom patched versions.
To provide the best user experience, we should offer a Flink binary that uses 
the Hadoop jars provided by the user (=hadoop distribution)


> Provide Flink binary release without Hadoop
> ---
>
> Key: FLINK-2268
> URL: https://issues.apache.org/jira/browse/FLINK-2268
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Robert Metzger
>
> Currently, all Flink releases ship with Hadoop 2.3.0 binaries.
> The big Hadoop distributions are usually not relying on vanilla Hadoop 
> releases, but on custom patched versions.
> To provide the best user experience, we should offer a Flink binary that uses 
> the Hadoop jars provided by the user (=hadoop distribution)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5998) Un-fat Hadoop from Flink fat jar

2017-03-08 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5998:
-

 Summary: Un-fat Hadoop from Flink fat jar
 Key: FLINK-5998
 URL: https://issues.apache.org/jira/browse/FLINK-5998
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Robert Metzger


As a first step towards FLINK-2268, I would suggest putting all Hadoop 
dependencies into a jar separate from Flink's fat jar.

This would allow users to put a custom Hadoop jar in there, or even to deploy 
Flink without a Hadoop fat jar at all in environments where Hadoop is provided 
(e.g. EMR).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5379) Flink CliFrontend does not return when not logged in with kerberos

2016-12-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5379:
-

 Summary: Flink CliFrontend does not return when not logged in with 
kerberos
 Key: FLINK-5379
 URL: https://issues.apache.org/jira/browse/FLINK-5379
 Project: Flink
  Issue Type: Bug
  Components: Client
Affects Versions: 1.2.0
Reporter: Robert Metzger


In pre-1.2 versions, Flink fails immediately when trying to deploy on YARN 
while the current user is not Kerberos-authenticated:

{code}
Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:384)
at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:591)
at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:465)
Caused by: 
org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: In 
secure mode. Please provide Kerberos credentials in order to authenticate. You 
may use kinit to authenticate and request a TGT from the Kerberos server.
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:371)
... 2 more
{code}

In 1.2, the following happens:
{code}
2016-12-21 13:51:29,925 INFO  org.apache.hadoop.yarn.client.RMProxy 
- Connecting to ResourceManager at 
my-cluster-2wv1.c.sorter-757.internal/10.240.0.24:8032
2016-12-21 13:51:30,153 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:51:30,154 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:51:30,154 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:52:00,171 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:52:00,172 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:52:00,172 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:52:30,188 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:52:30,189 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:52:30,189 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:53:00,203 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:53:00,204 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the 

[jira] [Updated] (FLINK-5379) Flink CliFrontend does not return when not logged in with kerberos

2016-12-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5379:
--
Description: 
In pre-1.2 versions, Flink fails immediately when trying to deploy on YARN 
while the current user is not Kerberos-authenticated:

{code}
Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:384)
at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:591)
at 
org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:465)
Caused by: 
org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: In 
secure mode. Please provide Kerberos credentials in order to authenticate. You 
may use kinit to authenticate and request a TGT from the Kerberos server.
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:371)
... 2 more
{code}

In 1.2, the following happens (the CLI frontend does not return; it seems to 
be stuck in a loop):
{code}
2016-12-21 13:51:29,925 INFO  org.apache.hadoop.yarn.client.RMProxy 
- Connecting to ResourceManager at 
my-cluster-2wv1.c.sorter-757.internal/10.240.0.24:8032
2016-12-21 13:51:30,153 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:51:30,154 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:51:30,154 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:52:00,171 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:52:00,172 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:52:00,172 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:52:30,188 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:52:30,189 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
2016-12-21 13:52:30,189 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
2016-12-21 13:53:00,203 WARN  org.apache.hadoop.security.UserGroupInformation   
- PriviledgedActionException as:longrunning (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2016-12-21 13:53:00,204 WARN  org.apache.hadoop.ipc.Client  
- Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism lev

[jira] [Created] (FLINK-5380) Number of outgoing records not reported in web interface

2016-12-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5380:
-

 Summary: Number of outgoing records not reported in web interface
 Key: FLINK-5380
 URL: https://issues.apache.org/jira/browse/FLINK-5380
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 1.2.0
Reporter: Robert Metzger


The web frontend does not report any outgoing records.
The amount of data in MB is reported correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5380) Number of outgoing records not reported in web interface

2016-12-21 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5380:
--
Attachment: outRecordsNotreported.png

> Number of outgoing records not reported in web interface
> 
>
> Key: FLINK-5380
> URL: https://issues.apache.org/jira/browse/FLINK-5380
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Attachments: outRecordsNotreported.png
>
>
> The web frontend does not report any outgoing records.
> The amount of data in MB is reported correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5381) Scrolling in some web interface pages doesn't work (taskmanager details, jobmanager config)

2016-12-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5381:
-

 Summary: Scrolling in some web interface pages doesn't work 
(taskmanager details, jobmanager config)
 Key: FLINK-5381
 URL: https://issues.apache.org/jira/browse/FLINK-5381
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Robert Metzger


It seems that scrolling in the web interface doesn't work anymore on some pages 
in the 1.2 release branch.

Example pages: 
- When you click the "JobManager" tab
- The TaskManager logs page





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5382) Taskmanager log download button causes 404

2016-12-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5382:
-

 Summary: Taskmanager log download button causes 404
 Key: FLINK-5382
 URL: https://issues.apache.org/jira/browse/FLINK-5382
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 1.2.0
Reporter: Robert Metzger


The "download logs" button when viewing the TaskManager logs in the web UI 
leads to a 404 page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5383) TaskManager fails with SIGBUS when loading RocksDB

2016-12-21 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5383:
-

 Summary: TaskManager fails with SIGBUS when loading RocksDB
 Key: FLINK-5383
 URL: https://issues.apache.org/jira/browse/FLINK-5383
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Robert Metzger


While trying out Flink 1.2, my TaskManager died with the following error while 
deploying a job:

{code}
2016-12-21 15:57:50,080 INFO  org.apache.flink.runtime.taskmanager.Task 
- Map -> Sink
: Unnamed (15/16) (50f527e4445479fb1fc9f34394d86d2f) switched from DEPLOYING to 
RUNNING.
2016-12-21 15:57:50,081 INFO  org.apache.flink.runtime.taskmanager.Task 
- Map -> Sink
: Unnamed (16/16) (b4b3d3340de587d729fe83d65eac3e10) switched from DEPLOYING to 
RUNNING.
2016-12-21 15:57:50,081 INFO  
org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
defined state backend: RocksDB State Backend {isInitialized=false, 
configuredDbBasePaths=null, initialize
dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
hdfs://nameservice1/shared/checkpoint-dir
-rocks}.
2016-12-21 15:57:50,081 INFO  
org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
defined state backend: RocksDB State Backend {isInitialized=false, 
configuredDbBasePaths=null, initialize
dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
hdfs://nameservice1/shared/checkpoint-dir
-rocks}.
2016-12-21 15:57:50,223 INFO  
org.apache.flink.contrib.streaming.state.RocksDBStateBackend  - Attempting 
to load RocksDB native library and store it at 
'/yarn/nm/usercache/longrunning/appcache/application_14821
56101125_0016'

LogType:taskmanager.out
Log Upload Time:Wed Dec 21 16:00:35 + 2016
LogLength:959
Log Contents:
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x7fe745fd596a, pid=7414, tid=140630801725184
#
# JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  [ld-linux-x86-64.so.2+0x1a96a]  realloc+0x2bfa
#
{code}

the error report file contained the following frames:

{code}
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
j  java.lang.ClassLoader.loadLibrary1(Ljava/lang/Class;Ljava/io/File;)Z+302
j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+2
j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+48
j  java.lang.Runtime.load0(Ljava/lang/Class;Ljava/lang/String;)V+57
j  java.lang.System.load(Ljava/lang/String;)V+7
j  org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(Ljava/lang/String;)V+14
j  org.rocksdb.NativeLibraryLoader.loadLibrary(Ljava/lang/String;)V+22
j  
org.apache.flink.contrib.streaming.state.RocksDBStateBackend.ensureRocksDBIsLoaded(Ljava/lang/String;)V+62
j  
org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(Lorg/apache/flink/runtime/execution/Environment;Lorg/apache/flink/api/common/JobID;Ljava/lang/String;Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;Lorg/apache/flink/runtime/query/TaskKvStateRegistry;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+16
j  
org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+137
{code}

I have seen this error only once so far. I'll report again if it happens more 
frequently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5382) Taskmanager log download button causes 404

2016-12-26 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15778843#comment-15778843
 ] 

Robert Metzger commented on FLINK-5382:
---

Did you use Flink on YARN to reproduce the issue?

> Taskmanager log download button causes 404
> --
>
> Key: FLINK-5382
> URL: https://issues.apache.org/jira/browse/FLINK-5382
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Sachin Goel
>
> The "download logs" button when viewing the TaskManager logs in the web UI 
> leads to a 404 page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4861) Package optional project artifacts

2017-01-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4861.
---
Resolution: Fixed

Resolved in 
https://github.com/apache/flink/commit/5c76baa1734303a01472afd17cfaf3442eb06c43 

> Package optional project artifacts
> --
>
> Key: FLINK-4861
> URL: https://issues.apache.org/jira/browse/FLINK-4861
> Project: Flink
>  Issue Type: New Feature
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.2.0
>
>
> Per the mailing list 
> [discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Additional-project-downloads-td13223.html],
>  package the Flink libraries and connectors into subdirectories of a new 
> {{opt}} directory in the release/snapshot tarballs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4391) Provide support for asynchronous operations over streams

2017-01-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4391:
--
Fix Version/s: 1.2.0

> Provide support for asynchronous operations over streams
> 
>
> Key: FLINK-4391
> URL: https://issues.apache.org/jira/browse/FLINK-4391
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Jamie Grier
>Assignee: david.wang
> Fix For: 1.2.0
>
>
> Many Flink users need to do asynchronous processing driven by data from a 
> DataStream.  The classic example would be joining against an external 
> database in order to enrich a stream with extra information.
> It would be nice to add general support for this type of operation in the 
> Flink API.  Ideally this could simply take the form of a new operator that 
> manages async operations, keeps so many of them in flight, and then emits 
> results to downstream operators as the async operations complete.
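
A minimal usage sketch of such an operator, assuming the AsyncFunction /
AsyncDataStream API that was eventually merged for this issue (the exact class
names and signatures here are an assumption and may differ between versions):

{code}
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.AsyncFunction;
import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;

// Sketch: enrich each element via an asynchronous lookup while keeping up
// to 100 requests in flight; the order of results is not preserved.
public class AsyncEnrichmentSketch {
    public static DataStream<String> enrich(DataStream<String> input) {
        AsyncFunction<String, String> lookup = new AsyncFunction<String, String>() {
            @Override
            public void asyncInvoke(String key, AsyncCollector<String> collector) {
                // Hypothetical non-blocking client call.
                CompletableFuture
                        .supplyAsync(() -> key + "-enriched")
                        .thenAccept(v -> collector.collect(Collections.singletonList(v)));
            }
        };
        return AsyncDataStream.unorderedWait(input, lookup, 1000, TimeUnit.MILLISECONDS, 100);
    }
}
{code}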



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4391) Provide support for asynchronous operations over streams

2017-01-03 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794719#comment-15794719
 ] 

Robert Metzger commented on FLINK-4391:
---

I could not find any documentation in f52830763d8f95a955c10265e2c3543a5890e719. 
What's the plan for documenting the feature?

> Provide support for asynchronous operations over streams
> 
>
> Key: FLINK-4391
> URL: https://issues.apache.org/jira/browse/FLINK-4391
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Jamie Grier
>Assignee: david.wang
> Fix For: 1.2.0
>
>
> Many Flink users need to do asynchronous processing driven by data from a 
> DataStream.  The classic example would be joining against an external 
> database in order to enrich a stream with extra information.
> It would be nice to add general support for this type of operation in the 
> Flink API.  Ideally this could simply take the form of a new operator that 
> manages async operations, keeps so many of them in flight, and then emits 
> results to downstream operators as the async operations complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5264) Improve error message when assigning uid to intermediate operator

2017-01-03 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5264:
--
Component/s: Java API

> Improve error message when assigning uid to intermediate operator
> -
>
> Key: FLINK-5264
> URL: https://issues.apache.org/jira/browse/FLINK-5264
> Project: Flink
>  Issue Type: Bug
>  Components: Java API
>Affects Versions: 1.1.3
>Reporter: Kostas Kloudas
>Assignee: Ufuk Celebi
> Fix For: 1.2.0
>
>
> Currently, when trying to assign a uid to an intermediate operator in a chain, 
> the error message just says that it is not the right place, without any 
> further information.
> We could improve the message by telling the user explicitly on which operator 
> to assign the uid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3710) ScalaDocs for org.apache.flink.streaming.scala are missing from the web site

2017-01-04 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15798948#comment-15798948
 ] 

Robert Metzger commented on FLINK-3710:
---

I agree that the current situation is not acceptable. I did a quick search on 
the web, but I also could not find a good way of generating aggregated 
scaladocs. Apache Spark uses SBT: 
https://issues.apache.org/jira/browse/SPARK-1439.

I'll remove the links from our website for now.

The only solution I see is having separate scaladocs links for the batch and 
streaming APIs. But I guess they also share some code, which would then not 
be cross-referenced.

> ScalaDocs for org.apache.flink.streaming.scala are missing from the web site
> 
>
> Key: FLINK-3710
> URL: https://issues.apache.org/jira/browse/FLINK-3710
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.1
>Reporter: Elias Levy
> Fix For: 1.0.4
>
>
> The ScalaDocs only include docs for org.apache.flink.scala and sub-packages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3710) ScalaDocs for org.apache.flink.streaming.scala are missing from the web site

2017-01-04 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15798959#comment-15798959
 ] 

Robert Metzger commented on FLINK-3710:
---

I removed the links in this commit: 
http://git-wip-us.apache.org/repos/asf/flink-web/commit/4d8a7e26

> ScalaDocs for org.apache.flink.streaming.scala are missing from the web site
> 
>
> Key: FLINK-3710
> URL: https://issues.apache.org/jira/browse/FLINK-3710
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.1
>Reporter: Elias Levy
> Fix For: 1.0.4
>
>
> The ScalaDocs only include docs for org.apache.flink.scala and sub-packages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-5382) Taskmanager log download button causes 404

2017-01-04 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5382.
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   1.2.0

Resolved for 1.3 (master) in 
http://git-wip-us.apache.org/repos/asf/flink/commit/29eec70d and 1.2 in 
http://git-wip-us.apache.org/repos/asf/flink/commit/a6a5

> Taskmanager log download button causes 404
> --
>
> Key: FLINK-5382
> URL: https://issues.apache.org/jira/browse/FLINK-5382
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Sachin Goel
> Fix For: 1.2.0, 1.3.0
>
>
> The "download logs" button when viewing the TaskManager logs in the web UI 
> leads to a 404 page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-5383) TaskManager fails with SIGBUS when loading RocksDB

2017-01-09 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5383.
---
   Resolution: Fixed
 Assignee: Stephan Ewen
Fix Version/s: 1.3.0

> TaskManager fails with SIGBUS when loading RocksDB
> --
>
> Key: FLINK-5383
> URL: https://issues.apache.org/jira/browse/FLINK-5383
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> While trying out Flink 1.2, my TaskManager died with the following error 
> while deploying a job:
> {code}
> 2016-12-21 15:57:50,080 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Map -> Sink
> : Unnamed (15/16) (50f527e4445479fb1fc9f34394d86d2f) switched from DEPLOYING 
> to RUNNING.
> 2016-12-21 15:57:50,081 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Map -> Sink
> : Unnamed (16/16) (b4b3d3340de587d729fe83d65eac3e10) switched from DEPLOYING 
> to RUNNING.
> 2016-12-21 15:57:50,081 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
> defined state backend: RocksDB State Backend {isInitialized=false, 
> configuredDbBasePaths=null, initialize
> dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
> hdfs://nameservice1/shared/checkpoint-dir
> -rocks}.
> 2016-12-21 15:57:50,081 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
> defined state backend: RocksDB State Backend {isInitialized=false, 
> configuredDbBasePaths=null, initialize
> dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
> hdfs://nameservice1/shared/checkpoint-dir
> -rocks}.
> 2016-12-21 15:57:50,223 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend  - Attempting 
> to load RocksDB native library and store it at 
> '/yarn/nm/usercache/longrunning/appcache/application_14821
> 56101125_0016'
> LogType:taskmanager.out
> Log Upload Time:Wed Dec 21 16:00:35 + 2016
> LogLength:959
> Log Contents:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGBUS (0x7) at pc=0x7fe745fd596a, pid=7414, tid=140630801725184
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 
> 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [ld-linux-x86-64.so.2+0x1a96a]  realloc+0x2bfa
> #
> {code}
> the error report file contained the following frames:
> {code}
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
> j  java.lang.ClassLoader.loadLibrary1(Ljava/lang/Class;Ljava/io/File;)Z+302
> j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+2
> j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+48
> j  java.lang.Runtime.load0(Ljava/lang/Class;Ljava/lang/String;)V+57
> j  java.lang.System.load(Ljava/lang/String;)V+7
> j  org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(Ljava/lang/String;)V+14
> j  org.rocksdb.NativeLibraryLoader.loadLibrary(Ljava/lang/String;)V+22
> j  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.ensureRocksDBIsLoaded(Ljava/lang/String;)V+62
> j  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(Lorg/apache/flink/runtime/execution/Environment;Lorg/apache/flink/api/common/JobID;Ljava/lang/String;Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;Lorg/apache/flink/runtime/query/TaskKvStateRegistry;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+16
> j  
> org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+137
> {code}
> I have seen this error only once so far. I'll report again if it happens 
> more frequently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5383) TaskManager fails with SIGBUS when loading RocksDB

2017-01-09 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811424#comment-15811424
 ] 

Robert Metzger commented on FLINK-5383:
---

I can confirm that the mentioned commit fixes the issue! I'm closing the 
JIRA.

> TaskManager fails with SIGBUS when loading RocksDB
> --
>
> Key: FLINK-5383
> URL: https://issues.apache.org/jira/browse/FLINK-5383
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Fix For: 1.3.0
>
>
> While trying out Flink 1.2, my TaskManager died with the following error 
> while deploying a job:
> {code}
> 2016-12-21 15:57:50,080 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Map -> Sink
> : Unnamed (15/16) (50f527e4445479fb1fc9f34394d86d2f) switched from DEPLOYING 
> to RUNNING.
> 2016-12-21 15:57:50,081 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Map -> Sink
> : Unnamed (16/16) (b4b3d3340de587d729fe83d65eac3e10) switched from DEPLOYING 
> to RUNNING.
> 2016-12-21 15:57:50,081 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
> defined state backend: RocksDB State Backend {isInitialized=false, 
> configuredDbBasePaths=null, initialize
> dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
> hdfs://nameservice1/shared/checkpoint-dir
> -rocks}.
> 2016-12-21 15:57:50,081 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Using user-
> defined state backend: RocksDB State Backend {isInitialized=false, 
> configuredDbBasePaths=null, initialize
> dDbBasePaths=null, checkpointStreamBackend=File State Backend @ 
> hdfs://nameservice1/shared/checkpoint-dir
> -rocks}.
> 2016-12-21 15:57:50,223 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend  - Attempting 
> to load RocksDB native library and store it at 
> '/yarn/nm/usercache/longrunning/appcache/application_14821
> 56101125_0016'
> LogType:taskmanager.out
> Log Upload Time:Wed Dec 21 16:00:35 + 2016
> LogLength:959
> Log Contents:
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGBUS (0x7) at pc=0x7fe745fd596a, pid=7414, tid=140630801725184
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 
> 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [ld-linux-x86-64.so.2+0x1a96a]  realloc+0x2bfa
> #
> {code}
> the error report file contained the following frames:
> {code}
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;)V+0
> j  java.lang.ClassLoader.loadLibrary1(Ljava/lang/Class;Ljava/io/File;)Z+302
> j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+2
> j  java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+48
> j  java.lang.Runtime.load0(Ljava/lang/Class;Ljava/lang/String;)V+57
> j  java.lang.System.load(Ljava/lang/String;)V+7
> j  org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(Ljava/lang/String;)V+14
> j  org.rocksdb.NativeLibraryLoader.loadLibrary(Ljava/lang/String;)V+22
> j  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.ensureRocksDBIsLoaded(Ljava/lang/String;)V+62
> j  
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(Lorg/apache/flink/runtime/execution/Environment;Lorg/apache/flink/api/common/JobID;Ljava/lang/String;Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;Lorg/apache/flink/runtime/query/TaskKvStateRegistry;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+16
> j  
> org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(Lorg/apache/flink/api/common/typeutils/TypeSerializer;ILorg/apache/flink/runtime/state/KeyGroupRange;)Lorg/apache/flink/runtime/state/AbstractKeyedStateBackend;+137
> {code}
> I have seen this error only once so far. I'll report again if it happens 
> more frequently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2017-01-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814982#comment-15814982
 ] 

Robert Metzger commented on FLINK-4973:
---

Any objections to logging only the exception message in the 
{{LatencyMarksEmitter}}? Currently we log the full stack trace, which 
pollutes the logs.
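
A small sketch of the suggested change (slf4j-style, illustrative; not the
actual emitter code):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LatencyMarkerLogging {
    private static final Logger LOG =
            LoggerFactory.getLogger(LatencyMarkerLogging.class);

    void onEmitFailure(Throwable t) {
        // Before: LOG.warn("Error while emitting latency marker", t)
        // prints the full stack trace for every failed emission.
        // After: log only the message, which keeps the logs readable.
        LOG.warn("Error while emitting latency marker: {}", t.getMessage());
    }
}
{code}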

> Flakey Yarn tests due to recently added latency marker
> --
>
> Key: FLINK-4973
> URL: https://issues.apache.org/jira/browse/FLINK-4973
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency marker on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The occurring exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>   ... 9 more
> {code}
> This exception is clearly related to the shutdown of a stream operator and 
> does not indicate a wrong behaviour. Since the yarn tests simply scan the log 
> for some keywords (including exception) such a case can make them fail.
> It would be best if we could make sure that the {{LatencyMarksEmitter}} only 
> emits latency markers if the {{Output}} is still active. But we could also 
> simply not log exceptions which occur after the stream operator has been 
> stopped.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/171578846/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5297) Relocate Flink's Hadoop dependency and its transitive dependencies

2017-01-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815050#comment-15815050
 ] 

Robert Metzger commented on FLINK-5297:
---

I've created a branch that shades ALL hadoop dependencies (I've only validated 
this with Hadoop 2.3.0 and 2.7.3).
https://github.com/rmetzger/flink/tree/flink5297

The YARN tests are NOT working, because some of YARN's web servers do not 
start properly. The problem is that jetty dynamically loads some servlet 
classes from a configuration file. Since the config file is not rewritten 
during shading, jetty cannot find the relocated classes.
I've manually tested the branch on a YARN cluster and it works.

> Relocate Flink's Hadoop dependency and its transitive dependencies
> --
>
> Key: FLINK-5297
> URL: https://issues.apache.org/jira/browse/FLINK-5297
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
> Fix For: 1.2.0
>
>
> A user reported that they have a dependency conflict with one of Hadoop's 
> dependencies. More concretely it is the {{aws-java-sdk-*}} dependency which 
> is not backward compatible. The user is dependent on a newer {{aws-java-sdk}} 
> version which cannot be used by Hadoop version 2.7.
> A solution for future dependency conflicts could be to relocate Hadoop's 
> dependencies or even all of the Hadoop dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2017-01-10 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815053#comment-15815053
 ] 

Robert Metzger commented on FLINK-4973:
---

The stack trace is only logged in the LatencyMarksEmitter, but my job is 
failing frequently. In a 24-hour run, I got over 8,000 of these stack traces 
in my log.

> Flakey Yarn tests due to recently added latency marker
> --
>
> Key: FLINK-4973
> URL: https://issues.apache.org/jira/browse/FLINK-4973
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency marker on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The occurring exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>   ... 9 more
> {code}
> This exception is clearly related to the shutdown of a stream operator and 
> does not indicate wrong behaviour. Since the YARN tests simply scan the log 
> for certain keywords (including "exception"), such a case can make them fail.
> It would be best if we could make sure that the {{LatencyMarksEmitter}} only 
> emits latency markers while the {{Output}} is still active. But we could also 
> simply not log exceptions that occur after the stream operator has been 
> stopped.
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/171578846/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5450) WindowOperator logs about "re-registering state from an older Flink version" even though it's not a restored window

2017-01-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5450:
-

 Summary: WindowOperator logs about "re-registering state from an 
older Flink version" even though it's not a restored window
 Key: FLINK-5450
 URL: https://issues.apache.org/jira/browse/FLINK-5450
 Project: Flink
  Issue Type: Bug
  Components: Windowing Operators
Affects Versions: 1.2.0
Reporter: Robert Metzger


While testing the RC0 of Flink 1.2, I stumbled across this log message

{code}
15:42:02,855 INFO  
org.apache.flink.streaming.api.operators.AbstractStreamOperator  - 
WindowOperator (taskIdx=WindowOperator) re-registering state from an older 
Flink version.
{code}

My WindowOperator is not restored, so I find this log message a bit misleading.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2017-01-11 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819056#comment-15819056
 ] 

Robert Metzger commented on FLINK-4973:
---

{code}
2017-01-10 04:59:22,578 INFO  org.apache.flink.runtime.taskmanager.Task 
- Task Source: sale event generator (4/20) is already in state 
CANCELING
2017-01-10 04:59:22,578 INFO  org.apache.flink.runtime.taskmanager.Task 
- Triggering cancellation of task code Sink: control events sink 
(4/20) (85881281bb76d124a85b50156f59b0fd).
2017-01-10 04:59:22,579 WARN  
org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error while 
emitting latency marker.
java.lang.RuntimeException: Buffer pool is destroyed.
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.emitLatencyMarker(OperatorChain.java:445)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:743)
at 
org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.onProcessingTime(StreamSource.java:142)
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$RepeatedTriggerTask.run(SystemProcessingTimeService.java:256)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:149)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:126)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:102)
at 
org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
... 11 more
2017-01-10 04:59:22,581 WARN  
org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error while 
emitting latency marker.
java.lang.RuntimeException: Buffer pool is destroyed.
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$BroadcastingOutputCollector.emitLatencyMarker(OperatorChain.java:445)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:743)
at 
org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.onProcessingTime(StreamSource.java:142)
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$RepeatedTriggerTask.run(SystemProcessingTimeService.java:256)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:149)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:138)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:126)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:102)
at 
org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordW

[jira] [Created] (FLINK-5462) Flink job fails due to java.util.concurrent.CancellationException while snapshotting

2017-01-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5462:
-

 Summary: Flink job fails due to 
java.util.concurrent.CancellationException while snapshotting
 Key: FLINK-5462
 URL: https://issues.apache.org/jira/browse/FLINK-5462
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.2.0
Reporter: Robert Metzger


I'm using Flink 699f4b0.
My restored, rescaled Flink job failed while creating a checkpoint with the 
following exception:

{code}
2017-01-11 18:46:49,853 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 3 @ 1484160409846
2017-01-11 18:49:50,111 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- 
TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream
.apply(AllWindowedStream.java:440)) (1/1) (2accc6ca2727c4f7ec963318fbd237e9) 
switched from RUNNING to FAILED.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
for operator TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.ap
ply(AllWindowedStream.java:440)) (1/1).}
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not materialize checkpoint 3 for operator 
TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.apply(AllWind
owedStream.java:440)) (1/1).
... 6 more
Caused by: java.util.concurrent.CancellationException
at java.util.concurrent.FutureTask.report(FutureTask.java:121)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at 
org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
... 5 more
2017-01-11 18:49:50,113 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Generate 
Event Window stream (90859d392c1da472e07695f434b332ef) switched from state 
RUNNING to FAILING.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
for operator TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.ap
ply(AllWindowedStream.java:440)) (1/1).}
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not materialize checkpoint 3 for operator 
TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).
... 6 more
Caused by: java.util.concurrent.CancellationException
at java.util.concurrent.FutureTask.report(FutureTask.java:121)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at 
org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
... 5 more
2017-01-11 18:49:50,122 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Source: Custom 
Source -> Timestamps/Watermarks (1/2) (e52c1211b5693552f5908b0082c80882) 
switched from RUNNING to CANCELING.
{code}

There are no other log entries around that time.
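The {{CancellationException}} itself is plain JDK behavior: 
{{FutureUtil.runIfNotDoneAndGet}} ends in {{FutureTask.get()}}, and {{get()}} throws 
once the task has been cancelled. A minimal sketch:

{code}
import java.util.concurrent.FutureTask;

public class CancelledSnapshot {
    public static void main(String[] args) throws Exception {
        FutureTask<Void> materialization = new FutureTask<>(() -> null);
        materialization.cancel(true); // e.g. the snapshot was aborted concurrently
        materialization.run();        // a no-op after cancel()
        materialization.get();        // -> java.util.concurrent.CancellationException
    }
}
{code}

So the interesting question is what cancelled the materialization future, not the 
exception type itself.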



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5463) RocksDB.disposeInternal does not react to interrupts, blocks task cancellation

2017-01-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5463:
-

 Summary: RocksDB.disposeInternal does not react to interrupts, 
blocks task cancellation
 Key: FLINK-5463
 URL: https://issues.apache.org/jira/browse/FLINK-5463
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.2.0
Reporter: Robert Metzger


I'm using Flink 699f4b0.
My Flink job is slow while cancelling because RocksDB seems to be busy 
disposing its state:

{code}
2017-01-11 18:48:23,315 INFO  org.apache.flink.runtime.taskmanager.Task 
- Triggering cancellation of task code 
TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071
}, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1) 
(2accc6ca2727c4f7ec963318fbd237e9).
2017-01-11 18:48:53,318 WARN  org.apache.flink.runtime.taskmanager.Task 
- Task 'TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), Windowed
Stream.apply(AllWindowedStream.java:440)) (1/1)' did not react to cancelling 
signal, but is stuck in method:
 org.rocksdb.RocksDB.disposeInternal(Native Method)
org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
java.lang.Thread.run(Thread.java:745)

2017-01-11 18:48:53,319 WARN  org.apache.flink.runtime.taskmanager.Task 
- Task 'TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)' 
did not react to cancelling signal, but is stuck in method:
 org.rocksdb.RocksDB.disposeInternal(Native Method)
org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
java.lang.Thread.run(Thread.java:745)

2017-01-11 18:49:23,319 WARN  org.apache.flink.runtime.taskmanager.Task 
- Task 'TriggerWindow(TumblingEventTimeWindows(4), 
ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
 EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)' 
did not react to cancelling signal, but is stuck in method:
 org.rocksdb.RocksDB.disposeInternal(Native Method)
org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
java.lang.Thread.run(Thread.java:745)

2017-01-11 18:49:50,080 INFO  org.apache.flink.runtime.taskmanager.Task 
- Freeing task resources for 
TriggerWindow(Tumbli

[jira] [Created] (FLINK-5464) MetricQueryService throws NullPointerException on JobManager

2017-01-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5464:
-

 Summary: MetricQueryService throws NullPointerException on 
JobManager
 Key: FLINK-5464
 URL: https://issues.apache.org/jira/browse/FLINK-5464
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 1.2.0
Reporter: Robert Metzger


I'm using Flink 699f4b0.

My JobManager log contains many of these log entries:

{code}
2017-01-11 19:42:05,778 WARN  
org.apache.flink.runtime.webmonitor.metrics.MetricFetcher - Fetching 
metrics failed.
akka.pattern.AskTimeoutException: Ask timed out on 
[Actor[akka://flink/user/MetricQueryService#-970662317]] after [1 ms]
at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:334)
at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
at 
scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)
at 
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)
at 
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:474)
at 
akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:425)
at 
akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:429)
at 
akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:381)
at java.lang.Thread.run(Thread.java:745)
2017-01-11 19:42:07,765 WARN  
org.apache.flink.runtime.metrics.dump.MetricQueryService  - An exception 
occurred while processing a message.
java.lang.NullPointerException
at 
org.apache.flink.runtime.metrics.dump.MetricDumpSerialization.serializeGauge(MetricDumpSerialization.java:162)
at 
org.apache.flink.runtime.metrics.dump.MetricDumpSerialization.access$300(MetricDumpSerialization.java:47)
at 
org.apache.flink.runtime.metrics.dump.MetricDumpSerialization$MetricDumpSerializer.serialize(MetricDumpSerialization.java:90)
at 
org.apache.flink.runtime.metrics.dump.MetricQueryService.onReceive(MetricQueryService.java:109)
at 
akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{code}
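A guess at the trigger, hedged: the NPE comes from serializing a gauge, which would 
fit a user-defined {{Gauge}} whose {{getValue()}} returns null. A minimal sketch of 
such a gauge:

{code}
import org.apache.flink.metrics.Gauge;

// Hypothetical gauge that can return null, e.g. before its first measurement.
public class MaybeNullGauge implements Gauge<String> {
    private volatile String lastValue; // stays null until the first update

    @Override
    public String getValue() {
        return lastValue; // a null here would NPE in MetricDumpSerialization.serializeGauge()
    }
}
{code}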



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5465) RocksDB fails with segfault while calling AbstractRocksDBState.clear()

2017-01-11 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5465:
-

 Summary: RocksDB fails with segfault while calling 
AbstractRocksDBState.clear()
 Key: FLINK-5465
 URL: https://issues.apache.org/jira/browse/FLINK-5465
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.2.0
Reporter: Robert Metzger


I'm using Flink 699f4b0.

{code}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f91a0d49b78, pid=26662, tid=140263356024576
#
# JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 1.7.0_67-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 
compressed oops)
# Problematic frame:
# C  [librocksdbjni-linux64.so+0x1aeb78]  
rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
#
# Failed to write core dump. Core dumps have been disabled. To enable core 
dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# 
/yarn/nm/usercache/robert/appcache/application_1484132267957_0007/container_1484132267957_0007_01_10/hs_err_pid26662.log
Compiled method (nm) 1869778  903 n   org.rocksdb.RocksDB::remove 
(native)
 total in heap  [0x7f91b40b9dd0,0x7f91b40ba150] = 896
 relocation [0x7f91b40b9ef0,0x7f91b40b9f48] = 88
 main code  [0x7f91b40b9f60,0x7f91b40ba150] = 496
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5465) RocksDB fails with segfault while calling AbstractRocksDBState.clear()

2017-01-11 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5465:
--
Attachment: hs-err-pid26662.log

I've attached the error report file containing the stack trace:

{code}
Stack: [0x7f919b72c000,0x7f919b82d000],  sp=0x7f919b82b368,  free 
space=1020k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni-linux64.so+0x1aeb78]  
rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
C  [librocksdbjni-linux64.so+0x2009e1]  
rocksdb::DB::Delete(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, 
rocksdb::Slice const&)+0x41
C  [librocksdbjni-linux64.so+0x200a41]  
rocksdb::DBImpl::Delete(rocksdb::WriteOptions const&, 
rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&)+0x11
C  [librocksdbjni-linux64.so+0x18e993]  rocksdb_remove_helper(JNIEnv_*, 
rocksdb::DB*, rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, 
_jbyteArray*, int)+0x63
C  [librocksdbjni-linux64.so+0x18eee6]  
Java_org_rocksdb_RocksDB_remove__JJ_3BIJ+0x26
J 903  org.rocksdb.RocksDB.remove(JJ[BIJ)V (0 bytes) @ 0x7f91b40ba02d 
[0x7f91b40b9f60+0xcd]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 903  org.rocksdb.RocksDB.remove(JJ[BIJ)V (0 bytes) @ 0x7f91b40b9fb3 
[0x7f91b40b9f60+0x53]
J 900 C2 org.apache.flink.contrib.streaming.state.AbstractRocksDBState.clear()V 
(47 bytes) @ 0x7f91b4143f18 [0x7f91b4143c60+0x2b8]
J 1087 C2 
org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.cleanup(Lorg/apache/flink/streaming/api/windowing/windows/Window;Lorg/apache/flink/api/common/state/AppendingState;Lorg/apache/flink/streaming/runtime/operators/windowing/MergingWindowSet;)V
 (27 bytes) @ 0x7f91b414dc64 [0x7f91b414dc20+0x44]
J 925 C2 
org.apache.flink.streaming.api.operators.HeapInternalTimerService.onProcessingTime(J)V
 (107 bytes) @ 0x7f91b4194794 [0x7f91b4194560+0x234]
j  
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run()V+15
j  java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4
j  java.util.concurrent.FutureTask.run()V+42
j  
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Ljava/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask;)V+1
j  
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+30
j  
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub
{code}
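The stack suggests a processing-time timer calling {{clear()}} after the RocksDB 
instance was disposed. A standalone RocksJava sketch of that use-after-close pattern 
(illustrative only; {{remove()}} in the trace corresponds to {{delete()}} in newer 
RocksJava versions):

{code}
import org.rocksdb.Options;
import org.rocksdb.RocksDB;

public class UseAfterDispose {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true)) {
            RocksDB db = RocksDB.open(options, "/tmp/rocksdb-use-after-dispose");
            db.close();                  // native handle freed, as in the backend's dispose()
            db.delete("key".getBytes()); // native call on a freed handle: this can crash
                                         // the JVM (SIGSEGV) instead of throwing
        }
    }
}
{code}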

> RocksDB fails with segfault while calling AbstractRocksDBState.clear()
> --
>
> Key: FLINK-5465
> URL: https://issues.apache.org/jira/browse/FLINK-5465
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Attachments: hs-err-pid26662.log
>
>
> I'm using Flink 699f4b0.
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f91a0d49b78, pid=26662, tid=140263356024576
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 
> 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni-linux64.so+0x1aeb78]  
> rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # 
> /yarn/nm/usercache/robert/appcache/application_1484132267957_0007/container_1484132267957_0007_01_10/hs_err_pid26662.log
> Compiled method (nm) 1869778  903 n   org.rocksdb.RocksDB::remove 
> (native)
>  total in heap  [0x7f91b40b9dd0,0x7f91b40ba150] = 896
>  relocation [0x7f91b40b9ef0,0x7f91b40b9f48] = 88
>  main code  [0x7f91b40b9f60,0x7f91b40ba150] = 496
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5462) Flink job fails due to java.util.concurrent.CancellationException while snapshotting

2017-01-12 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5462:
--
Attachment: application-1484132267957-0005

I've attached the full log.

> Flink job fails due to java.util.concurrent.CancellationException while 
> snapshotting
> 
>
> Key: FLINK-5462
> URL: https://issues.apache.org/jira/browse/FLINK-5462
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Attachments: application-1484132267957-0005
>
>
> I'm using Flink 699f4b0.
> My restored, rescaled Flink job failed while creating a checkpoint with the 
> following exception:
> {code}
> 2017-01-11 18:46:49,853 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
> checkpoint 3 @ 1484160409846
> 2017-01-11 18:49:50,111 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- 
> TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream
> .apply(AllWindowedStream.java:440)) (1/1) (2accc6ca2727c4f7ec963318fbd237e9) 
> switched from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
> for operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.ap
> ply(AllWindowedStream.java:440)) (1/1).}
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for 
> operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWind
> owedStream.java:440)) (1/1).
> ... 6 more
> Caused by: java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
> ... 5 more
> 2017-01-11 18:49:50,113 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Generate 
> Event Window stream (90859d392c1da472e07695f434b332ef) switched from state 
> RUNNING to FAILING.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
> for operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.ap
> ply(AllWindowedStream.java:440)) (1/1).}
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for 
> operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).
> ... 6 more
> Caused by: java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
> ... 5 more
> 2017-01-11 18:49:50,122 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   

[jira] [Created] (FLINK-5468) Restoring from a semi async rocksdb statebackend (1.1) to 1.2 fails with ClassNotFoundException

2017-01-12 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5468:
-

 Summary: Restoring from a semi async rocksdb statebackend (1.1) to 
1.2 fails with ClassNotFoundException
 Key: FLINK-5468
 URL: https://issues.apache.org/jira/browse/FLINK-5468
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.2.0
Reporter: Robert Metzger


I think we should catch this exception and explain what's going on and how 
users can resolve the issue.
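A hedged sketch of such a wrapper (hypothetical helper, not existing code; the 
migration hint assumes the cause is the removed 1.1 semi-async RocksDB snapshot mode 
named in the stack trace):

{code}
import java.io.IOException;
import java.util.concurrent.Callable;

public class SavepointLoadHint {
    // Hypothetical wrapper around the savepoint deserialization call.
    static <T> T loadWithHint(Callable<T> loader) throws Exception {
        try {
            return loader.call();
        } catch (Exception e) {
            if (e.getCause() instanceof ClassNotFoundException
                    && e.getCause().getMessage().contains("FinalSemiAsyncSnapshot")) {
                throw new IOException(
                    "Could not restore this 1.1 savepoint: it was taken with the semi-async "
                        + "RocksDB snapshot mode, which no longer exists in 1.2. Please take "
                        + "a new savepoint on 1.1 using the fully-async mode first.", e);
            }
            throw e;
        }
    }
}
{code}

For reference, here's the current failure: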
{code}
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Job execution failed
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
at 
org.apache.flink.yarn.YarnClusterClient.submitJob(YarnClusterClient.java:210)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
at com.dataartisans.eventwindow.Generator.main(Generator.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:339)
at 
org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:831)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
at 
org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1073)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1117)
at 
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1116)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed
at 
org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:328)
at 
org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:382)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
... 22 more
Caused by: java.io.IOException: java.lang.ClassNotFoundException: 
org.apache.flink.contrib.streaming.state.RocksDBStateBackend$FinalSemiAsyncSnapshot
at 
org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.deserialize(SavepointV0Serializer.java:162)
at 
org.apache.flink.migration.runtime.checkpoint.savepoint.SavepointV0Serializer.deserialize(SavepointV0Serializer.java:70)
at 
org.apache.flink.runtime.checkpoint.savepoint.SavepointStore.loadSavepoint(SavepointStore.java:138)
at 
org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:64)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1348)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1330)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1330)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: 
org.apache.flink.contrib.streamin

[jira] [Created] (FLINK-5473) setMaxParallelism() higher than 1 is possible on non-parallel operators

2017-01-12 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5473:
-

 Summary: setMaxParallelism() higher than 1 is possible on 
non-parallel operators
 Key: FLINK-5473
 URL: https://issues.apache.org/jira/browse/FLINK-5473
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.2.0
Reporter: Robert Metzger


While trying out Flink 1.2, I found out that you can set a maxParallelism 
higher than 1 on a non-parallel operator.
I think we should have the same semantics as the setParallelism() method.

Also, when setting a global maxParallelism in the execution environment, it 
will be set as a default value for the non-parallel operator.
When restoring a savepoint from 1.1, you have to set the maxParallelism to the 
parallelism of the 1.1 job. Non-parallel operators will then also get the 
maxPar set to this value, leading to an error on restore.

So currently, users restoring from 1.1 to 1.2 have to manually set the 
maxParallelism to 1 for all non-parallel operators.
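A minimal sketch of the asymmetry against the 1.2 DataStream API ({{fromElements}} 
creates a non-parallel source):

{code}
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MaxParallelismAsymmetry {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // fromElements() creates a non-parallel source; its parallelism is fixed to 1.
        DataStreamSource<Integer> src = env.fromElements(1, 2, 3);

        // src.setParallelism(2);   // rejected with an IllegalArgumentException
        src.setMaxParallelism(128); // currently accepted, but should arguably fail the same way
    }
}
{code}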



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5473) setMaxParallelism() higher than 1 is possible on non-parallel operators

2017-01-12 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821262#comment-15821262
 ] 

Robert Metzger commented on FLINK-5473:
---

That would be the best user experience, yes.

> setMaxParallelism() higher than 1 is possible on non-parallel operators
> ---
>
> Key: FLINK-5473
> URL: https://issues.apache.org/jira/browse/FLINK-5473
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>
> While trying out Flink 1.2, I found out that you can set a maxParallelism 
> higher than 1 on a non-parallel operator.
> I think we should have the same semantics as the setParallelism() method.
> Also, when setting a global maxParallelism in the execution environment, it 
> will be set as a default value for the non-parallel operator.
> When restoring a savepoint from 1.1, you have to set the maxParallelism to 
> the parallelism of the 1.1 job. Non-parallel operators will then also get the 
> maxPar set to this value, leading to an error on restore.
> So currently, users restoring from 1.1 to 1.2 have to manually set the 
> maxParallelism to 1 for all non-parallel operators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-5465) RocksDB fails with segfault while calling AbstractRocksDBState.clear()

2017-01-13 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5465.
---
Resolution: Won't Fix

Thank you guys for looking into it. I'm closing the issue as won't fix.

> RocksDB fails with segfault while calling AbstractRocksDBState.clear()
> --
>
> Key: FLINK-5465
> URL: https://issues.apache.org/jira/browse/FLINK-5465
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Attachments: hs-err-pid26662.log
>
>
> I'm using Flink 699f4b0.
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f91a0d49b78, pid=26662, tid=140263356024576
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_67-b01) (build 
> 1.7.0_67-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.65-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni-linux64.so+0x1aeb78]  
> rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # 
> /yarn/nm/usercache/robert/appcache/application_1484132267957_0007/container_1484132267957_0007_01_10/hs_err_pid26662.log
> Compiled method (nm) 1869778  903 n   org.rocksdb.RocksDB::remove 
> (native)
>  total in heap  [0x7f91b40b9dd0,0x7f91b40ba150] = 896
>  relocation [0x7f91b40b9ef0,0x7f91b40b9f48] = 88
>  main code  [0x7f91b40b9f60,0x7f91b40ba150] = 496
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3427) Add watermark monitoring to JobManager web frontend

2017-01-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15822779#comment-15822779
 ] 

Robert Metzger commented on FLINK-3427:
---

Hi Ivan,
Sadly, the person who wanted to work on this decided not to do it. So if you 
are still interested, feel free to assign it to yourself again.

To answer your question:
I was thinking of adding a new tab for "Event time".
It would show the same plan as the "Plan" view, but with the low watermark of 
each operator (we'd compute the low watermark for a task in the web interface 
from the watermarks reported by its subtasks).
There should also be a view showing the subtasks' individual watermarks and 
the maximum watermark across the subtasks.
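In code terms, the task-level aggregation would be something like this (illustrative 
values):

{code}
import java.util.Arrays;

public class TaskLowWatermark {
    public static void main(String[] args) {
        // Watermarks reported by the subtasks of one task (illustrative values).
        long[] subtaskWatermarks = {42L, 17L, 99L};

        // The task's low watermark is the minimum across its subtasks.
        long lowWatermark = Arrays.stream(subtaskWatermarks).min().getAsLong();
        System.out.println(lowWatermark); // prints 17
    }
}
{code}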


> Add watermark monitoring to JobManager web frontend
> ---
>
> Key: FLINK-3427
> URL: https://issues.apache.org/jira/browse/FLINK-3427
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, Webfrontend
>Reporter: Robert Metzger
>
> Currently, its quite hard to figure out issues with the watermarks.
> I think we can improve the situation by reporting the following information 
> through the metrics system:
> - Report the current low watermark for each operator (this way, you can see 
> if one operator is preventing the watermarks to rise)
> - Report the number of events arrived after the low watermark (users can see 
> how accurate the watermarks are)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-5489) maven release:prepare fails due to invalid JDOM comments in pom.xml

2017-01-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-5489.
---
   Resolution: Fixed
Fix Version/s: 1.3.0

Thank you for fixing this!

Resolved in master with commit 
http://git-wip-us.apache.org/repos/asf/flink/commit/e2ba042c

> maven release:prepare fails due to invalid JDOM comments in pom.xml
> ---
>
> Key: FLINK-5489
> URL: https://issues.apache.org/jira/browse/FLINK-5489
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Minor
>  Labels: newbie
> Fix For: 1.3.0
>
>
> When I was trying to publish Flink to our internal artifactory, I found out 
> that {{maven release:prepare}} fails because the plugin complains that some 
> of the comments in pom.xml do not conform to the JDOM format:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-release-plugin:2.4.2:prepare (default-cli) on 
> project flink-parent: Execution default-cli of goal 
> org.apache.maven.plugins:maven-release-plugin:2.4.2:prepare failed: The data 
> "-
> [ERROR] This module is used a dependency in the root pom. It activates 
> shading for all sub modules
> [ERROR] through an include rule in the shading configuration. This assures 
> that Maven always generates
> [ERROR] an effective pom for all modules, i.e. get rid of Maven properties. 
> In particular, this is needed
> [ERROR] to define the Scala version property in the root pom but not let the 
> root pom depend on Scala
> [ERROR] and thus be suffixed along with all other modules.
> [ERROR] " is not legal for a JDOM comment: Comment data cannot start with a 
> hyphen.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5489) maven release:prepare fails due to invalid JDOM comments in pom.xml

2017-01-15 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5489:
--
Component/s: Build System

> maven release:prepare fails due to invalid JDOM comments in pom.xml
> ---
>
> Key: FLINK-5489
> URL: https://issues.apache.org/jira/browse/FLINK-5489
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Minor
>  Labels: newbie
> Fix For: 1.3.0
>
>
> When I was trying to publish Flink to our internal artifactory, I found out 
> that {{maven release:prepare}} fails because the plugin complains that some 
> of the comments in pom.xml do not conform to the JDOM format:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-release-plugin:2.4.2:prepare (default-cli) on 
> project flink-parent: Execution default-cli of goal 
> org.apache.maven.plugins:maven-release-plugin:2.4.2:prepare failed: The data 
> "-
> [ERROR] This module is used a dependency in the root pom. It activates 
> shading for all sub modules
> [ERROR] through an include rule in the shading configuration. This assures 
> that Maven always generates
> [ERROR] an effective pom for all modules, i.e. get rid of Maven properties. 
> In particular, this is needed
> [ERROR] to define the Scala version property in the root pom but not let the 
> root pom depend on Scala
> [ERROR] and thus be suffixed along with all other modules.
> [ERROR] " is not legal for a JDOM comment: Comment data cannot start with a 
> hyphen.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4959) Write Documentation for ProcessFunction

2017-01-16 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4959:
--
Priority: Critical  (was: Blocker)

> Write Documentation for ProcessFunction
> ---
>
> Key: FLINK-4959
> URL: https://issues.apache.org/jira/browse/FLINK-4959
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Affects Versions: 1.2.0
>Reporter: Aljoscha Krettek
>Priority: Critical
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4959) Write Documentation for ProcessFunction

2017-01-16 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-4959:
--
Issue Type: Sub-task  (was: Improvement)
Parent: FLINK-5430

> Write Documentation for ProcessFunction
> ---
>
> Key: FLINK-4959
> URL: https://issues.apache.org/jira/browse/FLINK-4959
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming
>Affects Versions: 1.2.0
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5473) setMaxParallelism() higher than 1 is possible on non-parallel operators

2017-01-16 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5473:
--
Assignee: Stefan Richter

> setMaxParallelism() higher than 1 is possible on non-parallel operators
> ---
>
> Key: FLINK-5473
> URL: https://issues.apache.org/jira/browse/FLINK-5473
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Stefan Richter
>
> While trying out Flink 1.2, I found out that you can set a maxParallelism 
> higher than 1 on a non-parallel operator.
> I think we should have the same semantics as the setParallelism() method.
> Also, when setting a global maxParallelism in the execution environment, it 
> will be set as a default value for the non-parallel operator.
> When restoring a savepoint from 1.1, you have to set the maxParallelism to 
> the parallelism of the 1.1 job. Non-parallel operators will then also get the 
> maxPar set to this value, leading to an error on restore.
> So currently, users restoring from 1.1 to 1.2 have to manually set the 
> maxParallelism to 1 for all non-parallel operators.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5463) RocksDB.disposeInternal does not react to interrupts, blocks task cancellation

2017-01-16 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823972#comment-15823972
 ] 

Robert Metzger commented on FLINK-5463:
---

After talking with [~srichter] about this, we decided to wait and see how 
frequently this error occurs.
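For context, a small sketch of why interrupts don't help here: {{Thread.interrupt()}} 
only sets a flag, and code that never checks it, like native JNI code, just keeps 
running ({{busyNativeStandIn}} is a stand-in for a call such as 
{{RocksDB.disposeInternal}}):

{code}
public class UninterruptibleDispose {
    // Stand-in for a long-running native call; like JNI code, it never checks
    // the thread's interrupt flag.
    static void busyNativeStandIn() {
        long x = 0;
        while (x < Long.MAX_VALUE) {
            x++;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(UninterruptibleDispose::busyNativeStandIn);
        t.start();
        t.interrupt();  // sets the flag, but nothing in the loop ever observes it
        t.join(30_000); // all a cancellation watchdog can do is log the stuck stack trace
    }
}
{code}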

> RocksDB.disposeInternal does not react to interrupts, blocks task cancellation
> --
>
> Key: FLINK-5463
> URL: https://issues.apache.org/jira/browse/FLINK-5463
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>
> I'm using Flink 699f4b0.
> My Flink job is slow while cancelling because RockDB seems to be busy with 
> disposing its state:
> {code}
> 2017-01-11 18:48:23,315 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Triggering cancellation of task code 
> TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071
> }, EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) 
> (1/1) (2accc6ca2727c4f7ec963318fbd237e9).
> 2017-01-11 18:48:53,318 WARN  org.apache.flink.runtime.taskmanager.Task   
>   - Task 'TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), Windowed
> Stream.apply(AllWindowedStream.java:440)) (1/1)' did not react to cancelling 
> signal, but is stuck in method:
>  org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
> java.lang.Thread.run(Thread.java:745)
> 2017-01-11 18:48:53,319 WARN  org.apache.flink.runtime.taskmanager.Task   
>   - Task 'TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)' 
> did not react to cancelling signal, but is stuck in method:
>  org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.dispose(WindowOperator.java:273)
> org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:439)
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:340)
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:654)
> java.lang.Thread.run(Thread.java:745)
> 2017-01-11 18:49:23,319 WARN  org.apache.flink.runtime.taskmanager.Task   
>   - Task 'TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1)' 
> did not react to cancelling signal, but is stuck in method:
>  org.rocksdb.RocksDB.disposeInternal(Native Method)
> org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
> org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:56)
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:250)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:331)
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:169)
> org.apache.flink.streaming.runtime.operators.windowing.

[jira] [Commented] (FLINK-5345) IOManager failed to properly clean up temp file directory

2017-01-16 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823974#comment-15823974
 ] 

Robert Metzger commented on FLINK-5345:
---

[~tonycox] What's the progress on fixing this issue?
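Regarding the suspicion in the description below that two threads race on the 
cleanup, a minimal repro sketch (hypothetical path; it relies on commons-io listing 
children before deleting them):

{code}
import java.io.File;
import org.apache.commons.io.FileUtils;

public class ConcurrentCleanup {
    public static void main(String[] args) throws Exception {
        File tmpDir = new File("/tmp/flink-io-test"); // hypothetical temp directory
        FileUtils.forceMkdir(new File(tmpDir, "state/dummy_state"));

        Runnable cleanup = () -> {
            try {
                // deleteDirectory() lists children first, then deletes them one by
                // one. If the other thread removes a child in between, the nested
                // cleanDirectory() throws IllegalArgumentException: "... does not
                // exist" -- matching the log in the description.
                FileUtils.deleteDirectory(tmpDir);
            } catch (Exception e) {
                e.printStackTrace();
            }
        };
        new Thread(cleanup).start();
        new Thread(cleanup).start();
    }
}
{code}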

> IOManager failed to properly clean up temp file directory
> -
>
> Key: FLINK-5345
> URL: https://issues.apache.org/jira/browse/FLINK-5345
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.3
>Reporter: Robert Metzger
>Assignee: Anton Solovev
>  Labels: simplex, starter
> Fix For: 1.2.0, 1.3.0
>
>
> While testing 1.1.3 RC3, I have the following message in my log:
> {code}
> 2016-12-15 14:46:05,450 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Timer service 
> is shutting down.
> 2016-12-15 14:46:05,452 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: control events generator (29/40) 
> (73915a232ba09e642f9dff92f8c8773a) switched from CANCELING to CANCELED.
> 2016-12-15 14:46:05,452 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: control events generator 
> (29/40) (73915a232ba09e642f9dff92f8c8773a).
> 2016-12-15 14:46:05,454 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> CANCELED to JobManager for task Source: control events genera
> tor (73915a232ba09e642f9dff92f8c8773a)
> 2016-12-15 14:46:40,609 INFO  org.apache.flink.yarn.YarnTaskManagerRunner 
>   - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2016-12-15 14:46:40,611 INFO  org.apache.flink.runtime.blob.BlobCache 
>   - Shutting down BlobCache
> 2016-12-15 14:46:40,724 WARN  akka.remote.ReliableDeliverySupervisor  
>   - Association with remote system 
> [akka.tcp://flink@10.240.0.34:33635] has failed, address is now gated for 
> [5000] ms.
>  Reason is: [Disassociated].
> 2016-12-15 14:46:40,808 ERROR 
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - IOManager 
> failed to properly clean up temp file directory: 
> /yarn/nm/usercache/robert/appcache/application_148129128
> 9979_0024/flink-io-f0ff3f66-b9e2-4560-881f-2ab43bc448b5
> java.lang.IllegalArgumentException: 
> /yarn/nm/usercache/robert/appcache/application_1481291289979_0024/flink-io-f0ff3f66-b9e2-4560-881f-2ab43bc448b5/62e14e1891fe1e334c921dfd19a32a84/StreamMap_11_24/dummy_state
>  does not exist
> at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1637)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
> at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
> at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270)
> at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653)
> at 
> org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
> at 
> org.apache.flink.runtime.io.disk.iomanager.IOManager.shutdown(IOManager.java:109)
> at 
> org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync.shutdown(IOManagerAsync.java:185)
> at 
> org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$1.run(IOManagerAsync.java:105)
> {code}
> This was the last message logged from that machine. I suspect two threads are 
> trying to clean up the directories during shutdown?
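
To illustrate the suspected race, here is a minimal, hypothetical Java sketch 
(class and method names are mine, not Flink's): two shutdown paths delete the 
same temp directory, and commons-io's {{cleanDirectory}} raises exactly the 
{{IllegalArgumentException}} above when the other thread removes a 
subdirectory first. An existence-tolerant delete would absorb the race:

{code}
import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;

public class TempDirCleanupSketch {

    // Tolerant delete: if another shutdown path removed the directory (or a
    // child of it) first, commons-io's cleanDirectory throws
    // IllegalArgumentException ("... does not exist"), which we can ignore.
    static void deleteQuietlyIfExists(File dir) {
        try {
            if (dir.exists()) {
                FileUtils.deleteDirectory(dir);
            }
        } catch (IllegalArgumentException | IOException e) {
            // concurrently deleted -> already cleaned up, nothing to do
        }
    }

    public static void main(String[] args) {
        File tempDir = new File(System.getProperty("java.io.tmpdir"), "flink-io-sketch");
        tempDir.mkdirs();

        // Two threads racing to clean the same directory on shutdown.
        Runnable cleanup = () -> deleteQuietlyIfExists(tempDir);
        new Thread(cleanup).start();
        new Thread(cleanup).start();
    }
}
{code}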



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4941) Show ship strategy in web interface

2017-01-16 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4941.
---
Resolution: Fixed

This won't be backported to 1.1.

> Show ship strategy in web interface
> ---
>
> Key: FLINK-4941
> URL: https://issues.apache.org/jira/browse/FLINK-4941
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.2.0
>
> Attachments: ship_strategy_bad.png, ship_strategy_good.png
>
>
> Currently, only Flink's Plan visualizer shows the ship strategy of a 
> streaming job, but not the web interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5549) TypeExtractor fails with RuntimeException, but should use GGenericTypeInfoeneric

2017-01-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5549:
-

 Summary: TypeExtractor fails with RuntimeException, but should use 
GGenericTypeInfoeneric
 Key: FLINK-5549
 URL: https://issues.apache.org/jira/browse/FLINK-5549
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 1.2.0
Reporter: Robert Metzger


This issue has been reported by a user on StackOverflow: 
http://stackoverflow.com/questions/41700568/runtimeexception-when-using-flinks-mapfunction-with-cassandra-insert-but-not

{code}
Exception in thread "main" java.lang.RuntimeException: The field private 
java.util.List com.datastax.driver.core.querybuilder.BuiltStatement.values is 
already contained in the hierarchy of the class 
com.datastax.driver.core.querybuilder.BuiltStatement.Please use unique field 
names through your classes hierarchy
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getAllDeclaredFields(TypeExtractor.java:1762)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo(TypeExtractor.java:1683)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1580)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1479)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:737)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:565)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:366)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:305)
at 
org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:120)
at 
org.apache.flink.streaming.api.datastream.DataStream.map(DataStream.java:506)
at 
se.hiq.bjornper.testenv.cassandra.SOCassandraQueryTest.main(SOCassandraQueryTest.java:51)
{code}

When Flink tries to analyze the POJO, the {{getAllDeclaredFields}} method 
fails with a RuntimeException. When the user uses a different class that 
contains the POJO, Flink just uses the GenericTypeInfo, which is able to 
serialize the type.
I think we need to throw an InvalidTypesException in the 
{{getAllDeclaredFields}} method to fix the issue.
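
A minimal sketch of the proposed fix (the real {{getAllDeclaredFields}} is 
more involved; this is illustrative only and assumes a flink-core dependency 
for {{InvalidTypesException}}): throwing {{InvalidTypesException}} instead of 
a plain RuntimeException lets the extractor catch it and fall back to 
{{GenericTypeInfo}}:

{code}
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.functions.InvalidTypesException;

public class DeclaredFieldsSketch {

    // Walk up the class hierarchy and collect declared fields. On a duplicate
    // field name, throw InvalidTypesException instead of RuntimeException so
    // that the extractor can catch it and fall back to GenericTypeInfo.
    static List<Field> getAllDeclaredFields(Class<?> clazz) {
        List<Field> result = new ArrayList<>();
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                for (Field existing : result) {
                    if (existing.getName().equals(field.getName())) {
                        throw new InvalidTypesException("The field " + field
                                + " is already contained in the hierarchy of " + clazz + ".");
                    }
                }
                result.add(field);
            }
        }
        return result;
    }
}
{code}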




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5549) TypeExtractor fails with RuntimeException, but should use GenericTypeInfoeneric

2017-01-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5549:
--
Summary: TypeExtractor fails with RuntimeException, but should use 
GenericTypeInfoeneric  (was: TypeExtractor fails with RuntimeException, but 
should use GGenericTypeInfoeneric)

> TypeExtractor fails with RuntimeException, but should use 
> GenericTypeInfoeneric
> ---
>
> Key: FLINK-5549
> URL: https://issues.apache.org/jira/browse/FLINK-5549
> Project: Flink
>  Issue Type: Bug
>  Components: Type Serialization System
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>
> This issue has been reported by a user on StackOverflow: 
> http://stackoverflow.com/questions/41700568/runtimeexception-when-using-flinks-mapfunction-with-cassandra-insert-but-not
> {code}
> Exception in thread "main" java.lang.RuntimeException: The field private 
> java.util.List com.datastax.driver.core.querybuilder.BuiltStatement.values is 
> already contained in the hierarchy of the class 
> com.datastax.driver.core.querybuilder.BuiltStatement.Please use unique field 
> names through your classes hierarchy
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getAllDeclaredFields(TypeExtractor.java:1762)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo(TypeExtractor.java:1683)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1580)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1479)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:737)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:565)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:366)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:305)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:120)
> at 
> org.apache.flink.streaming.api.datastream.DataStream.map(DataStream.java:506)
> at 
> se.hiq.bjornper.testenv.cassandra.SOCassandraQueryTest.main(SOCassandraQueryTest.java:51)
> {code}
> When Flink tries to analyze the POJO, the {{getAllDeclaredFields}} method 
> fails with a RuntimeException. When the user uses a different class that 
> contains the POJO, Flink just uses the GenericTypeInfo, which is able to 
> serialize the type.
> I think we need to throw an InvalidTypesException in the 
> {{getAllDeclaredFields}} method to fix the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5549) TypeExtractor fails with RuntimeException, but should use GenericTypeInfo

2017-01-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5549:
--
Summary: TypeExtractor fails with RuntimeException, but should use 
GenericTypeInfo  (was: TypeExtractor fails with RuntimeException, but should 
use GenericTypeInfoeneric)

> TypeExtractor fails with RuntimeException, but should use GenericTypeInfo
> -
>
> Key: FLINK-5549
> URL: https://issues.apache.org/jira/browse/FLINK-5549
> Project: Flink
>  Issue Type: Bug
>  Components: Type Serialization System
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>
> This issue has been reported by a user on StackOverflow: 
> http://stackoverflow.com/questions/41700568/runtimeexception-when-using-flinks-mapfunction-with-cassandra-insert-but-not
> {code}
> Exception in thread "main" java.lang.RuntimeException: The field private 
> java.util.List com.datastax.driver.core.querybuilder.BuiltStatement.values is 
> already contained in the hierarchy of the class 
> com.datastax.driver.core.querybuilder.BuiltStatement.Please use unique field 
> names through your classes hierarchy
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getAllDeclaredFields(TypeExtractor.java:1762)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo(TypeExtractor.java:1683)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1580)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1479)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:737)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:565)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:366)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:305)
> at 
> org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:120)
> at 
> org.apache.flink.streaming.api.datastream.DataStream.map(DataStream.java:506)
> at 
> se.hiq.bjornper.testenv.cassandra.SOCassandraQueryTest.main(SOCassandraQueryTest.java:51)
> {code}
> When Flink tries to analyze the POJO, the {{getAllDeclaredFields}} method 
> fails with a RuntimeException. When the user uses a different class that 
> contains the POJO, Flink just uses the GenericTypeInfo, which is able to 
> serialize the type.
> I think we need to throw an InvalidTypesException in the 
> {{getAllDeclaredFields}} method to fix the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-3150) Make YARN container invocation configurable

2017-01-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-3150.
---
   Resolution: Fixed
Fix Version/s: 1.3.0

Thanks a lot for implementing this.

Merged in master: http://git-wip-us.apache.org/repos/asf/flink/commit/8f4139a4

> Make YARN container invocation configurable
> ---
>
> Key: FLINK-3150
> URL: https://issues.apache.org/jira/browse/FLINK-3150
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>  Labels: qa
> Fix For: 1.3.0
>
>
> Currently, the JVM invocation command for YARN containers is hardcoded.
> With this change, I would like to make this command configurable, using a 
> string such as
> "%java% %memopts% %jvmopts% ..."
> Also, we should respect {{java.env.home}} if it is set.
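
A hypothetical sketch of the templating idea (the placeholder names come from 
the example above; the configuration key and substitution code do not exist in 
Flink yet):

{code}
import java.util.HashMap;
import java.util.Map;

public class LaunchCommandSketch {

    public static void main(String[] args) {
        // Configurable invocation pattern; each %name% placeholder is
        // substituted when the container start command is assembled.
        String template = "%java% %memopts% %jvmopts% %class% %args%";

        Map<String, String> values = new HashMap<>();
        values.put("java", "$JAVA_HOME/bin/java");
        values.put("memopts", "-Xms424m -Xmx424m");
        values.put("jvmopts", "-Dlog.file=taskmanager.log");
        values.put("class", "org.apache.flink.yarn.YarnTaskManager");
        values.put("args", "--configDir .");

        String command = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            command = command.replace("%" + entry.getKey() + "%", entry.getValue());
        }
        System.out.println(command); // the assembled container invocation
    }
}
{code}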



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5553) Job fails during deployment with IllegalStateException from subpartition request

2017-01-18 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5553:
--
Attachment: application-1484132267957-0076

> Job fails during deployment with IllegalStateException from subpartition 
> request
> 
>
> Key: FLINK-5553
> URL: https://issues.apache.org/jira/browse/FLINK-5553
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
> Attachments: application-1484132267957-0076
>
>
> While running a test job with Flink 1.3-SNAPSHOT 
> (6fb6967b9f9a31f034bd09fcf76aaf147bc8e9a0) the job failed with this exception:
> {code}
> 2017-01-18 14:56:27,043 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Sink: Unnamed 
> (9/10) (befc06d0e792c2ce39dde74b365dd3cf) switched from DEPLOYING to RUNNING.
> 2017-01-18 14:56:27,059 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Flat Map 
> (9/10) (e94a01ec283e5dce7f79b02cf51654c4) switched from DEPLOYING to RUNNING.
> 2017-01-18 14:56:27,817 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Flat Map 
> (10/10) (cbb61c9a2f72c282877eb383e111f7cd) switched from RUNNING to FAILED.
> java.lang.IllegalStateException: There has been an error in the channel.
> at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
> at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.addInputChannel(PartitionRequestClientHandler.java:77)
> at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClient.requestSubpartition(PartitionRequestClient.java:104)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:115)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:419)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:441)
> at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:153)
> at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:192)
> at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:270)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:666)
> at java.lang.Thread.run(Thread.java:745)
> 2017-01-18 14:56:27,819 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job 
> Misbehaved Job (b1d985d11984df57400fdff2bb656c59) switched from state RUNNING 
> to FAILING.
> java.lang.IllegalStateException: There has been an error in the channel.
> at 
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
> at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.addInputChannel(PartitionRequestClientHandler.java:77)
> at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClient.requestSubpartition(PartitionRequestClient.java:104)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:115)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:419)
> at 
> org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:441)
> at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:153)
> at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:192)
> at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:270)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:666)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> This is the first exception that is reported to the jobmanager.
> I think this is related to missing network buffers. You can see this in the 
> next deployment attempt after the restart, which fails with the 
> insufficient-number-of-buffers exception.
> I'll add logs to the JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5553) Job fails during deployment with IllegalStateException from subpartition request

2017-01-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5553:
-

 Summary: Job fails during deployment with IllegalStateException 
from subpartition request
 Key: FLINK-5553
 URL: https://issues.apache.org/jira/browse/FLINK-5553
 Project: Flink
  Issue Type: Bug
  Components: Network
Affects Versions: 1.3.0
Reporter: Robert Metzger


While running a test job with Flink 1.3-SNAPSHOT 
(6fb6967b9f9a31f034bd09fcf76aaf147bc8e9a0) the job failed with this exception:

{code}
2017-01-18 14:56:27,043 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Sink: Unnamed 
(9/10) (befc06d0e792c2ce39dde74b365dd3cf) switched from DEPLOYING to RUNNING.
2017-01-18 14:56:27,059 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Flat Map (9/10) 
(e94a01ec283e5dce7f79b02cf51654c4) switched from DEPLOYING to RUNNING.
2017-01-18 14:56:27,817 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Flat Map 
(10/10) (cbb61c9a2f72c282877eb383e111f7cd) switched from RUNNING to FAILED.
java.lang.IllegalStateException: There has been an error in the channel.
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.addInputChannel(PartitionRequestClientHandler.java:77)
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClient.requestSubpartition(PartitionRequestClient.java:104)
at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:115)
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:419)
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:441)
at 
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:153)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:192)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:270)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:666)
at java.lang.Thread.run(Thread.java:745)
2017-01-18 14:56:27,819 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Misbehaved 
Job (b1d985d11984df57400fdff2bb656c59) switched from state RUNNING to FAILING.
java.lang.IllegalStateException: There has been an error in the channel.
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.addInputChannel(PartitionRequestClientHandler.java:77)
at 
org.apache.flink.runtime.io.network.netty.PartitionRequestClient.requestSubpartition(PartitionRequestClient.java:104)
at 
org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:115)
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:419)
at 
org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:441)
at 
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:153)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:192)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:63)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:270)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:666)
at java.lang.Thread.run(Thread.java:745)
{code}

This is the first exception that is reported to the jobmanager.

I think this is related to missing network buffers. You can see this in the 
next deployment attempt after the restart, which fails with the 
insufficient-number-of-buffers exception.
I'll add logs to the JIRA.
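
A hypothetical Java sketch of the failure mode (names are illustrative, not 
the actual netty handler): once a shared channel records an error, every later 
input-channel registration trips the state check and produces exactly the 
{{IllegalStateException}} above:

{code}
import java.util.concurrent.atomic.AtomicReference;

public class ChannelErrorSketch {

    private final AtomicReference<Throwable> channelError = new AtomicReference<>();

    // Invoked when the connection fails, e.g. for lack of network buffers.
    void notifyChannelError(Throwable cause) {
        channelError.compareAndSet(null, cause);
    }

    // Every later subpartition request on the same shared channel trips the
    // state check - the IllegalStateException quoted above.
    void addInputChannel(String channelId) {
        if (channelError.get() != null) {
            throw new IllegalStateException("There has been an error in the channel.");
        }
        // ... register the channel and send the subpartition request ...
    }

    public static void main(String[] args) {
        ChannelErrorSketch handler = new ChannelErrorSketch();
        handler.notifyChannelError(new RuntimeException("connection reset"));
        handler.addInputChannel("channel-1"); // throws IllegalStateException
    }
}
{code}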



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5462) Flink job fails due to java.util.concurrent.CancellationException while snapshotting

2017-01-18 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828276#comment-15828276
 ] 

Robert Metzger commented on FLINK-5462:
---

I've looked through this and other logs with the same error, and I think this 
error "only" occurs in the presence of other failures, not as a root cause in 
itself.

I don't think that we need to urgently fix this issue for the 1.2 release.
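
A minimal Java sketch of why the exception is a symptom rather than a cause: 
when the task fails for another reason, the pending checkpoint future is 
cancelled, and waiting on its result then reports {{CancellationException}}, 
as {{FutureUtil.runIfNotDoneAndGet}} does in the stack trace below:

{code}
import java.util.concurrent.FutureTask;

public class CancelledCheckpointSketch {

    public static void main(String[] args) throws Exception {
        // Stand-in for the asynchronous part of checkpoint materialization.
        FutureTask<Void> materialization = new FutureTask<>(() -> null);

        // A failure elsewhere cancels the pending checkpoint ...
        materialization.cancel(false);

        // ... and waiting for its result then reports CancellationException,
        // just like FutureUtil.runIfNotDoneAndGet in the trace below.
        materialization.get();
    }
}
{code}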

> Flink job fails due to java.util.concurrent.CancellationException while 
> snapshotting
> 
>
> Key: FLINK-5462
> URL: https://issues.apache.org/jira/browse/FLINK-5462
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
> Attachments: application-1484132267957-0005
>
>
> I'm using Flink 699f4b0.
> My restored, rescaled Flink job failed while creating a checkpoint with the 
> following exception:
> {code}
> 2017-01-11 18:46:49,853 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
> checkpoint 3 @ 1484160409846
> 2017-01-11 18:49:50,111 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- 
> TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream
> .apply(AllWindowedStream.java:440)) (1/1) (2accc6ca2727c4f7ec963318fbd237e9) 
> switched from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
> for operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.ap
> ply(AllWindowedStream.java:440)) (1/1).}
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for 
> operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWind
> owedStream.java:440)) (1/1).
> ... 6 more
> Caused by: java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:899)
> ... 5 more
> 2017-01-11 18:49:50,113 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph- Job Generate 
> Event Window stream (90859d392c1da472e07695f434b332ef) switched from state 
> RUNNING to FAILING.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 3 
> for operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.ap
> ply(AllWindowedStream.java:440)) (1/1).}
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 3 for 
> operator TriggerWindow(TumblingEventTimeWindows(4), 
> ListStateDescriptor{serializer=org.apache.flink.api.java.typeutils.runtime.TupleSerializer@2edcd071},
>  EventTimeTrigger(), WindowedStream.apply(AllWindowedStream.java:440)) (1/1).
> ... 6 more
> Caused by: java.util.concurrent.CancellationException
> at java.util.concurrent.FutureTask.report(FutureTask.java:121)
> at java.util.concurrent.FutureTask.get(FutureTask.java:188)
> at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUt

[jira] [Created] (FLINK-5555) Add documentation about debugging watermarks

2017-01-18 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-5555:
-

 Summary: Add documentation about debugging watermarks
 Key: FLINK-5555
 URL: https://issues.apache.org/jira/browse/FLINK-5555
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.2.0
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 1.2.0


This was a frequent question on the mailing list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-5005) Publish Scala 2.12 artifacts

2017-01-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-5005:
--
Component/s: Scala API

> Publish Scala 2.12 artifacts
> 
>
> Key: FLINK-5005
> URL: https://issues.apache.org/jira/browse/FLINK-5005
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Andrew Roberts
>
> Scala 2.12 was [released|http://www.scala-lang.org/news/2.12.0] today, and 
> offers many compile-time and runtime speed improvements. It would be great to 
> get artifacts up on maven central to allow Flink users to migrate to Scala 
> 2.12.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5005) Publish Scala 2.12 artifacts

2017-01-19 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829993#comment-15829993
 ] 

Robert Metzger commented on FLINK-5005:
---

For the {{flakka-*}} artifacts, I can probably build and deploy 2.12 versions. 
Flakka is an akka fork with features backported from newer akka versions.

> Publish Scala 2.12 artifacts
> 
>
> Key: FLINK-5005
> URL: https://issues.apache.org/jira/browse/FLINK-5005
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Andrew Roberts
>
> Scala 2.12 was [released|http://www.scala-lang.org/news/2.12.0] today, and 
> offers many compile-time and runtime speed improvements. It would be great to 
> get artifacts up on maven central to allow Flink users to migrate to Scala 
> 2.12.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-5005) Publish Scala 2.12 artifacts

2017-01-19 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829993#comment-15829993
 ] 

Robert Metzger edited comment on FLINK-5005 at 1/19/17 2:11 PM:


For the {{flakka-*}} artifacts, I can probably build and deploy 2.12 versions. 
Flakka is an akka fork with features backported from newer akka versions.

Flakka is built from here: https://github.com/dataArtisans/flakka 


was (Author: rmetzger):
For the {{flakka-*}} artifacts, I can probably build and deploy 2.12 versions. 
Flakka is an akka fork with features backported from newer akka versions.

> Publish Scala 2.12 artifacts
> 
>
> Key: FLINK-5005
> URL: https://issues.apache.org/jira/browse/FLINK-5005
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Andrew Roberts
>
> Scala 2.12 was [released|http://www.scala-lang.org/news/2.12.0] today, and 
> offers many compile-time and runtime speed improvements. It would be great to 
> get artifacts up on maven central to allow Flink users to migrate to Scala 
> 2.12.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

