[jira] [Commented] (FLINK-2394) HadoopOutFormat OutputCommitter is default to FileOutputCommiter

2015-07-24 Thread Stefano Bortoli (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640007#comment-14640007 ]

Stefano Bortoli commented on FLINK-2394:


I see. In fact, I was a little disappointed with mongo-hadoop for having a 
getOutputCommitter() method that is not standard in the OutputFormat. 
However, my first implementation used the mapred Hadoop API, so no big 
surprises.

I would say a good idea could be either to have two HadoopOutputFormatBase 
classes (one per Hadoop API), or to handle the Hadoop version and obtain the 
OutputCommitter accordingly.

Meanwhile, I have implemented my own MongoHadoopOutputFormat, extending 
HadoopOutputFormat and overriding the open() and close() methods to replace 
the FileOutputCommitter with the MongoOutputCommitter.
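
Roughly, such an override could look like the following sketch (not my actual 
implementation; the base class keeps its FileOutputCommitter in a private 
field, so the subclass tracks its own committer, and the commit call in 
close() is schematic):

{code}
import java.io.IOException;

import org.apache.flink.api.java.hadoop.mapred.HadoopOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapred.OutputFormat;

public class MongoHadoopOutputFormat<K, V> extends HadoopOutputFormat<K, V> {

    // The committer to use instead, e.g. a MongoOutputCommitter instance.
    private final OutputCommitter committer;

    public MongoHadoopOutputFormat(OutputFormat<K, V> mapredOutputFormat,
                                   JobConf job, OutputCommitter committer) {
        super(mapredOutputFormat, job);
        this.committer = committer;
    }

    @Override
    public void open(int taskNumber, int numTasks) throws IOException {
        // The base class sets up the RecordWriter and installs its
        // FileOutputCommitter here; we keep the writer and ignore the latter.
        super.open(taskNumber, numTasks);
    }

    @Override
    public void close() throws IOException {
        super.close();
        // Commit through the Mongo committer instead, e.g.
        // committer.commitTask(taskContext) -- context construction omitted.
    }
}
{code}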

 HadoopOutFormat OutputCommitter is default to FileOutputCommiter
 

 Key: FLINK-2394
 URL: https://issues.apache.org/jira/browse/FLINK-2394
 Project: Flink
  Issue Type: Bug
  Components: Hadoop Compatibility
Affects Versions: 0.9.0
Reporter: Stefano Bortoli

 MongoOutputFormat does not write back to the collection because the 
 HadoopOutputFormat wrapper does not allow setting the MongoOutputCommitter 
 and defaults to FileOutputCommitter. Therefore, on close() and 
 finalizeGlobal() execution the commit does not happen and the Mongo 
 collection stays untouched. 
 A simple solution (sketched below) would be to:
 1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat 
 that takes the OutputCommitter as a parameter
 2 - change the outputCommitter field of HadoopOutputFormatBase to a generic 
 OutputCommitter
 3 - remove the default assignment of the outputCommitter to 
 FileOutputCommitter() in open() and finalizeGlobal(), or keep it only as a 
 fallback when no specific committer is assigned.
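
 In code, proposals 1-3 could look roughly like this (a sketch against the 
 mapred API; names follow the description above, not the actual Flink source):
 {code}
import org.apache.hadoop.mapred.FileOutputCommitter;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputCommitter;

public abstract class HadoopOutputFormatBase<K, V> {

    protected JobConf jobConf;
    protected OutputCommitter outputCommitter;  // (2) generic committer type

    // (1) constructor that takes the committer as a parameter
    protected HadoopOutputFormatBase(JobConf job, OutputCommitter committer) {
        this.jobConf = job;
        this.outputCommitter = committer;
    }

    public void open(int taskNumber, int numTasks) {
        // (3) FileOutputCommitter stays only as the fallback default
        if (this.outputCommitter == null) {
            this.outputCommitter = new FileOutputCommitter();
        }
        // ... RecordWriter setup as before ...
    }
}
 {code}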



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2404) LongCounters should have an addValue() method for primitive longs

2015-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2404:
---

 Summary: LongCounters should have an addValue() method for 
primitive longs
 Key: FLINK-2404
 URL: https://issues.apache.org/jira/browse/FLINK-2404
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
 Fix For: 0.10


Since the LongCounter is used heavily for internal statistics reporting, it 
must have very low overhead.

The current addValue() method always boxes and unboxes the values, which is 
unnecessary overhead.
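
A minimal sketch of what a primitive overload could look like (not the actual 
Flink class; method names follow the issue title):

{code}
public class LongCounter {

    private long localValue;

    // Existing variant: every call boxes its argument to a Long and
    // unboxes it again inside the method.
    public void addValue(Long value) {
        this.localValue += value;
    }

    // Proposed variant: stays on primitive longs, so the hot statistics
    // path does no boxing at all.
    public void addValue(long value) {
        this.localValue += value;
    }

    public long getLocalValue() {
        return this.localValue;
    }
}
{code}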



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2399) Fail when actor versions don't match

2015-07-24 Thread Sachin Goel (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640543#comment-14640543 ]

Sachin Goel commented on FLINK-2399:


Should this work when the TaskManagers and the JobManager are from different 
releases, or just between an old and a new JobManager?

 Fail when actor versions don't match
 

 Key: FLINK-2399
 URL: https://issues.apache.org/jira/browse/FLINK-2399
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.9, master
Reporter: Ufuk Celebi
Priority: Minor
 Fix For: 0.10


 Problem: there can be subtle errors when actors from different Flink versions 
 communicate with each other, for example when an old client (e.g. Flink 0.9) 
 communicates with a new JobManager (e.g. Flink 0.10-SNAPSHOT).
 We can check that the versions match on first communication between the 
 actors and fail if they don't match.
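
 A minimal sketch of such a check (hypothetical names; this is not Flink's 
 actual handshake code), assuming the first message between two actors 
 carries the sender's version string:
 {code}
public final class VersionHandshake {

    // Called when the first message from a remote actor arrives.
    public static void checkVersion(String localVersion, String remoteVersion) {
        if (!localVersion.equals(remoteVersion)) {
            throw new IllegalStateException(
                "Flink version mismatch: local actor runs " + localVersion
                    + ", remote actor runs " + remoteVersion
                    + ". Failing on first communication.");
        }
    }
}
 {code}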



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2166) Add fromCsvFile() to TableEnvironment

2015-07-24 Thread Fabian Hueske (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641074#comment-14641074 ]

Fabian Hueske commented on FLINK-2166:
--

Hi James, thanks for reaching out and your interest in implementing this 
feature!

You are right, the {{CsvInputFormat}} has quite a few parameters. For the Java 
DataSet API, we have the {{CsvReader}} class, which serves a similar purpose 
as Scala's named parameters. It is basically a builder for a 
{{CsvInputFormat}} with default parameters that can be overwritten. It would 
be nice if the {{CsvReader}} could be reused for this feature as well.
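
For reference, a usage sketch of the {{CsvReader}} builder (method names as I 
recall them from the 0.9 Java API; treat the exact signatures as illustrative):

{code}
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class CsvReaderExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> input = env
                .readCsvFile("file:///path/to/input.csv") // returns a CsvReader
                .lineDelimiter("\n")                      // override defaults...
                .fieldDelimiter('|')                      // ...one at a time
                .includeFields("101")                     // keep 1st and 3rd field
                .types(String.class, Integer.class);      // builds the CsvInputFormat

        input.print();
    }
}
{code}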

Please let me know if you have further questions.

 Add fromCsvFile() to TableEnvironment
 -

 Key: FLINK-2166
 URL: https://issues.apache.org/jira/browse/FLINK-2166
 Project: Flink
  Issue Type: New Feature
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
  Labels: starter

 Add a {{fromCsvFile()}} method to the {{TableEnvironment}} to read a 
 {{Table}} from a CSV file.
 The implementation should reuse Flink's CsvInputFormat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2231) Create a Serializer for Scala Enumerations

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640736#comment-14640736 ]

ASF GitHub Bot commented on FLINK-2231:
---

GitHub user aalexandrov opened a pull request:

https://github.com/apache/flink/pull/935

[FLINK-2231] Create a Serializer for Scala Enumerations.

This closes FLINK-2231.

The code should work for all objects which follow [the Enumeration idiom 
outlined in the 
ScalaDoc](http://www.scala-lang.org/api/2.11.5/index.html#scala.Enumeration).

The second commit removes the boilerplate code from the 
`EnumValueComparator` by delegating to an `IntComparator`; you can either 
discard or squash it while merging, depending on your preference.

Bear in mind the FIXME at line 368 in `TypeAnalyzer.scala`. The commented 
code is better, but unfortunately doesn't work with Scala 2.10, so I used the 
FQN workaround.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aalexandrov/flink FLINK-2231

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/935.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #935


commit fd69bda383f6771e87ded1b4b595a395519efd6e
Author: Alexander Alexandrov alexander.s.alexand...@gmail.com
Date:   2015-07-24T10:36:17Z

[FLINK-2231] Create a Serializer for Scala Enumerations.

commit dca03720d090c383f88a57af6808fdbfd2c4ec29
Author: Alexander Alexandrov alexander.s.alexand...@gmail.com
Date:   2015-07-24T16:43:14Z

Delegating EnumValueComparator.




 Create a Serializer for Scala Enumerations
 --

 Key: FLINK-2231
 URL: https://issues.apache.org/jira/browse/FLINK-2231
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Stephan Ewen
Assignee: Alexander Alexandrov

 Scala Enumerations are currently serialized with Kryo, but should be 
 efficiently serialized by just writing the value's {{id}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2231] Create a Serializer for Scala Enu...

2015-07-24 Thread aalexandrov
GitHub user aalexandrov opened a pull request:

https://github.com/apache/flink/pull/935

[FLINK-2231] Create a Serializer for Scala Enumerations.

This closes FLINK-2231.

The code should work for all objects which follow [the Enumeration idiom 
outlined in the 
ScalaDoc](http://www.scala-lang.org/api/2.11.5/index.html#scala.Enumeration).

The second commit removes the boilerplate code from the 
`EnumValueComparator` by delegating to an `IntComparator`; you can either 
discard or squash it while merging, depending on your preference.

Bear in mind the FIXME at line 368 in `TypeAnalyzer.scala`. The commented 
code is better, but unfortunately doesn't work with Scala 2.10, so I used the 
FQN workaround.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aalexandrov/flink FLINK-2231

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/935.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #935


commit fd69bda383f6771e87ded1b4b595a395519efd6e
Author: Alexander Alexandrov alexander.s.alexand...@gmail.com
Date:   2015-07-24T10:36:17Z

[FLINK-2231] Create a Serializer for Scala Enumerations.

commit dca03720d090c383f88a57af6808fdbfd2c4ec29
Author: Alexander Alexandrov alexander.s.alexand...@gmail.com
Date:   2015-07-24T16:43:14Z

Delegating EnumValueComparator.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2231] Create a Serializer for Scala Enu...

2015-07-24 Thread aalexandrov
Github user aalexandrov commented on the pull request:

https://github.com/apache/flink/pull/935#issuecomment-124582468
  
PS. The third commit fixes a compilation error in IntelliJ when the 
'scala_2.11' profile is active.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2231) Create a Serializer for Scala Enumerations

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640752#comment-14640752 ]

ASF GitHub Bot commented on FLINK-2231:
---

Github user aalexandrov commented on the pull request:

https://github.com/apache/flink/pull/935#issuecomment-124582468
  
PS. The third commit fixes a compilation error in IntelliJ when the 
'scala_2.11' profile is active.


 Create a Serializer for Scala Enumerations
 --

 Key: FLINK-2231
 URL: https://issues.apache.org/jira/browse/FLINK-2231
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Stephan Ewen
Assignee: Alexander Alexandrov

 Scala Enumerations are currently serialized with Kryo, but should be 
 efficiently serialized by just writing the value's {{id}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2166) Add fromCsvFile() to TableEnvironment

2015-07-24 Thread James Cao (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640466#comment-14640466 ]

James Cao commented on FLINK-2166:
--

Hi Fabian, I'd like to contribute to this issue. I've played with some 
prototypes of the solution and have a few questions to ask.

CsvInputFormat takes a lot of options like line delimiter, field delimiter, 
include mask, etc. Should we allow users to set these options in 
fromCsvFile()? In the Scala API, we can have defaults for most of them and 
the user can override only a few parameters using named parameters, but for 
the Java API, the user would have to pass a long list of parameters to make 
the call.

 Add fromCsvFile() to TableEnvironment
 -

 Key: FLINK-2166
 URL: https://issues.apache.org/jira/browse/FLINK-2166
 Project: Flink
  Issue Type: New Feature
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
  Labels: starter

 Add a {{fromCsvFile()}} method to the {{TableEnvironment}} to read a 
 {{Table}} from a CSV file.
 The implementation should reuse Flink's CsvInputFormat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2394) HadoopOutFormat OutputCommitter is default to FileOutputCommiter

2015-07-24 Thread Fabian Hueske (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640999#comment-14640999 ]

Fabian Hueske commented on FLINK-2394:
--

Hi [~stefano.bortoli], we already have two HadoopOutputFormatBase classes, 
one for each Hadoop API, so treating both APIs differently is not a problem. 
The issue is that one API (mapreduce) supports different OutputCommitters 
out-of-the-box, while the other (mapred) requires that the OutputCommitter 
is explicitly set, unless I overlooked something.
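
To make the difference concrete, a small sketch contrasting how the committer 
is obtained in the two Hadoop APIs (signatures as I recall them; treat as 
illustrative):

{code}
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

public class CommitterLookup {

    // mapreduce: every OutputFormat exposes its own committer, so
    // MongoOutputFormat can hand out a MongoOutputCommitter itself.
    static OutputCommitter fromMapreduce(OutputFormat<?, ?> format,
                                         TaskAttemptContext context)
            throws java.io.IOException, InterruptedException {
        return format.getOutputCommitter(context);
    }

    // mapred: the committer comes from the job configuration and defaults
    // to FileOutputCommitter unless it was explicitly set.
    static org.apache.hadoop.mapred.OutputCommitter fromMapred(JobConf job) {
        return job.getOutputCommitter();
    }
}
{code}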

 HadoopOutFormat OutputCommitter is default to FileOutputCommiter
 

 Key: FLINK-2394
 URL: https://issues.apache.org/jira/browse/FLINK-2394
 Project: Flink
  Issue Type: Bug
  Components: Hadoop Compatibility
Affects Versions: 0.9.0
Reporter: Stefano Bortoli

 MongoOutputFormat does not write back to the collection because the 
 HadoopOutputFormat wrapper does not allow setting the MongoOutputCommitter 
 and defaults to FileOutputCommitter. Therefore, on close() and 
 finalizeGlobal() execution the commit does not happen and the Mongo 
 collection stays untouched. 
 A simple solution would be to:
 1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat 
 that takes the OutputCommitter as a parameter
 2 - change the outputCommitter field of HadoopOutputFormatBase to a generic 
 OutputCommitter
 3 - remove the default assignment of the outputCommitter to 
 FileOutputCommitter() in open() and finalizeGlobal(), or keep it only as a 
 fallback when no specific committer is assigned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640455#comment-14640455 ]

ASF GitHub Bot commented on FLINK-2200:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-124521433
  
Could you post the command you use to compile Flink with Scala 2.11? The 
current setting works well in my environment. Maven module definitions use 
the directory name, not the artifact id, so we should keep the current 
setting.

I'm adding the suffix to the poms, except for quickstart.


 Flink API with Scala 2.11 - Maven Repository
 

 Key: FLINK-2200
 URL: https://issues.apache.org/jira/browse/FLINK-2200
 Project: Flink
  Issue Type: Wish
  Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
  Labels: maven

 It would be nice if you could upload a pre-built version of the Flink API 
 with Scala 2.11 to the maven repository.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...

2015-07-24 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/885#issuecomment-124521433
  
Could you post the command you use to compile Flink with Scala 2.11? The 
current setting works well in my environment. Maven module definitions use 
the directory name, not the artifact id, so we should keep the current 
setting.

I'm adding the suffix to the poms, except for quickstart.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...

2015-07-24 Thread kl0u
Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/887


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-2381) Possible class not found Exception on failed partition producer

2015-07-24 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-2381.

Resolution: Fixed

Fixed via 9b1343d (0.10) and 198406f (0.9.1)

 Possible class not found Exception on failed partition producer
 ---

 Key: FLINK-2381
 URL: https://issues.apache.org/jira/browse/FLINK-2381
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Affects Versions: 0.9, master
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
 Fix For: 0.10, 0.9.1


 Failing the production of a result partition marks the respective partition 
 as failed with a ProducerFailedException.
 The cause of this exception can be a user-defined class, which can only be 
 loaded by the user code class loader. The network stack fails the shuffle 
 with a RemoteTransportException, which has the user exception as its cause. 
 When the consuming task receives this exception, a ClassNotFoundException 
 results, because the network stack tries to load the class with the 
 system class loader.
 {code}
 +----------+
 | FAILING  |
 | PRODUCER |
 +----------+
      ||
      \/
  ProducerFailedException(CAUSE) via network
      ||
      \/
 +----------+
 | RECEIVER |
 +----------+
 {code}
 CAUSE is only loadable by the user code class loader.
 When trying to deserialize this, RECEIVER fails with a 
 LocalTransportException, which is super confusing, because the error is not 
 local, but remote.
 Thanks to [~rmetzger] for reporting and debugging the issue with the 
 following stack trace:
 {code}
 Flat Map (26/120)
 14:03:00,343 ERROR 
 org.apache.flink.streaming.runtime.tasks.OneInputStreamTask   - Flat Map 
 (26/120) failed
 java.lang.RuntimeException: Could not read next record.
 at 
 org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.readNext(OneInputStreamTask.java:71)
 at 
 org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.invoke(OneInputStreamTask.java:101)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: 
 org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: 
 java.lang.ClassNotFoundException: 
 kafka.common.ConsumerRebalanceFailedException
 at 
 org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:151)
 at 
 io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:275)
 at 
 io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:253)
 at 
 io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
 at 
 io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:275)
 at 
 io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:809)
 at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:341)
 at 
 io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at 
 io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
 at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 at 
 io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
 at 
 io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
 at 
 io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
 at 
 io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
 at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
 at 
 io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
 ... 1 more
 Caused by: io.netty.handler.codec.DecoderException: 
 java.lang.ClassNotFoundException: 
 kafka.common.ConsumerRebalanceFailedException
 at 
 io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
 at 
 io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
 ... 12 more
 Caused by: java.lang.ClassNotFoundException: 
 kafka.common.ConsumerRebalanceFailedException
 at 

[jira] [Resolved] (FLINK-2341) Deadlock in SpilledSubpartitionViewAsyncIO

2015-07-24 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-2341.

Resolution: Fixed

Fixed via efca79c (0.10) and 208c0a1 (0.9.1)

 Deadlock in SpilledSubpartitionViewAsyncIO
 --

 Key: FLINK-2341
 URL: https://issues.apache.org/jira/browse/FLINK-2341
 Project: Flink
  Issue Type: Bug
  Components: Distributed Runtime
Affects Versions: 0.9, 0.10
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
Priority: Critical
 Fix For: 0.9, 0.10


 It may be that the deadlock is because of the way the 
 {{SpilledSubpartitionViewTest}} is written
 {code}
 Found one Java-level deadlock:
 =
 pool-25-thread-2:
   waiting to lock monitor 0x7f66f4932468 (object 0xfa1478f0, a 
 java.lang.Object),
   which is held by IOManager reader thread #1
 IOManager reader thread #1:
   waiting to lock monitor 0x7f66f4931160 (object 0xfa029768, a 
 java.lang.Object),
   which is held by pool-25-thread-2
 Java stack information for the threads listed above:
 ===
 pool-25-thread-2:
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.notifyError(SpilledSubpartitionViewAsyncIO.java:304)
   - waiting to lock 0xfa1478f0 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.onAvailableBuffer(SpilledSubpartitionViewAsyncIO.java:256)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$300(SpilledSubpartitionViewAsyncIO.java:42)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:367)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:353)
   at 
 org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:135)
   - locked 0xfa029768 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:119)
   - locked 0xfa3a1a20 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:95)
   at 
 org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:39)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:701)
 IOManager reader thread #1:
   at 
 org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:127)
   - waiting to lock 0xfa029768 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:119)
   - locked 0xfa3a1ea0 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.returnBufferFromIOThread(SpilledSubpartitionViewAsyncIO.java:270)
   - locked 0xfa1478f0 (a java.lang.Object)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$100(SpilledSubpartitionViewAsyncIO.java:42)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:338)
   at 
 org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:328)
   at 
 org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.handleProcessedBuffer(AsynchronousFileIOChannel.java:199)
   at 
 org.apache.flink.runtime.io.disk.iomanager.BufferReadRequest.requestDone(AsynchronousFileIOChannel.java:431)
   at 
 org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$ReaderThread.run(IOManagerAsync.java:377)
 {code}
 The full log with the deadlock stack traces can be found here:
 https://s3.amazonaws.com/archive.travis-ci.org/jobs/70232347/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Framesize fix

2015-07-24 Thread kl0u
GitHub user kl0u opened a pull request:

https://github.com/apache/flink/pull/934

Framesize fix

In Apache Flink, the results of a collect() call were returned to the client 
through akka. This led to an inherent limitation on the size of the output of 
a job, as it could not exceed the akka.framesize limit; otherwise, akka would 
drop the message.

To alleviate this, without dropping the benefits brought by akka and its 
out-of-the-box efficiency for small results, we decided to keep forwarding 
the non-oversized (i.e. smaller than akka.framesize) results through akka, 
and to use the BlobCache module for forwarding the oversized (large) ones.

Now the JobManager receives and merges the small accumulators (as before), 
and simply forwards to the Client the keys of the blobs storing the oversized 
ones. It is then the responsibility of the Client to do the final merging of 
oversized and non-oversized accumulators.
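
A hypothetical sketch of that dispatch (all names here are illustrative 
stand-ins, not Flink's actual classes):

```java
public class AccumulatorResultShipper {

    interface Gateway   { void tell(Object message); }
    interface BlobCache { String put(byte[] bytes); } // returns the blob key

    void shipToClient(byte[] serializedAccumulators, long akkaFramesize,
                      Gateway client, BlobCache blobCache) {
        if (serializedAccumulators.length <= akkaFramesize) {
            // Small result: forward it directly through akka, as before.
            client.tell(serializedAccumulators);
        } else {
            // Oversized result: store the bytes in the BlobCache and forward
            // only the blob key; the Client fetches the blob and performs the
            // final merge of oversized and non-oversized accumulators.
            client.tell(blobCache.put(serializedAccumulators));
        }
    }
}
```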

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink framesize_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #934


commit fb1fbd6bdcc81acd20d422842789fce0c0872580
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-22T17:53:11Z

Solved the #887 issue: removing the akka.framesize size limitation for the 
result of a job.

commit 34d3e433eb0ce976539de166288550c9c7612eb4
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-22T17:53:11Z

Solved the #887 issue: removing the akka.framesize size limitation for the 
result of a job.

commit 55aa50c3f3e5c4c3a253b8da68b5ddde9acb307f
Author: Kostas Kloudas kklou...@gmail.com
Date:   2015-07-24T12:02:10Z

Merge branch 'framesize_fix' of https://github.com/kl0u/flink into 
framesize_fix




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2319) Collect size limitation due to akka.

2015-07-24 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640365#comment-14640365 ]

ASF GitHub Bot commented on FLINK-2319:
---

Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/934#issuecomment-124500486
  
This pull request targets FLINK-2319.


 Collect size limitation due to akka.
 

 Key: FLINK-2319
 URL: https://issues.apache.org/jira/browse/FLINK-2319
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Kostas

 Each TaskManager keeps the results of a local task in a set of Accumulators. 
 Upon termination of the task, the Accumulators are sent back to the 
 JobManager, which merges them before sending them back to the Client. 
 To exchange the Accumulators and their content, akka is used. This limits 
 the size of the output of a task to no more than akka.framesize bytes; 
 otherwise, akka would drop the message.
 This ticket proposes the removal of this limitation so that results can be 
 of arbitrary size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Framesize fix

2015-07-24 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/934#issuecomment-124500486
  
This pull request targets FLINK-2319.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: Framesize fix

2015-07-24 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/934#issuecomment-124500737
  
Hello guys,

This is a new pull request for a previous ticket, aligned with recent changes 
in the master branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---