[jira] [Commented] (FLINK-2394) HadoopOutFormat OutputCommitter is default to FileOutputCommiter
[ https://issues.apache.org/jira/browse/FLINK-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640007#comment-14640007 ] Stefano Bortoli commented on FLINK-2394:

I see. I was in fact a little disappointed with mongo-hadoop for having a getOutputCommitter() method that is not standard in the OutputFormat. However, my first implementation was using the mapred Hadoop API, so no big surprise. A good idea could be to either have two HadoopOutputFormatBase classes (one per Hadoop version), or check the Hadoop version and get the OutputCommitter accordingly. Meanwhile, I have implemented my own MongoHadoopOutputFormat extending HadoopOutputFormat and overriding the open and close methods to replace the FileOutputCommitter with the MongoOutputCommitter.

HadoopOutFormat OutputCommitter is default to FileOutputCommiter
Key: FLINK-2394
URL: https://issues.apache.org/jira/browse/FLINK-2394
Project: Flink
Issue Type: Bug
Components: Hadoop Compatibility
Affects Versions: 0.9.0
Reporter: Stefano Bortoli

MongoOutputFormat does not write back to the collection because the HadoopOutputFormat wrapper does not allow setting the MongoOutputCommitter and defaults to FileOutputCommitter. Therefore, on close and finalizeGlobal execution the commit does not happen and the Mongo collection stays untouched. A simple solution would be to:
1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat that takes the OutputCommitter as a parameter
2 - change the outputCommitter field of HadoopOutputFormatBase to be a generic OutputCommitter
3 - remove the default assignment of the outputCommitter to FileOutputCommitter in open() and finalizeGlobal(), or keep it as a default in case of no specific assignment.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
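The workaround Stefano describes (subclass the wrapper and swap the committer in open()) can be sketched with simplified stand-in types. None of these are the real Hadoop or Flink classes; the names, the committed flag, and the commitTask() method are all illustrative:

```java
// Simplified stand-ins for the Hadoop/Flink types discussed above (illustrative only).
interface OutputCommitter {
    void commitTask();
}

class FileOutputCommitter implements OutputCommitter {
    public void commitTask() { /* commits by moving files -- a no-op for MongoDB */ }
}

class MongoOutputCommitter implements OutputCommitter {
    boolean committed = false;
    public void commitTask() { committed = true; } // would flush records to the collection
}

// Base wrapper that hard-codes FileOutputCommitter, mirroring the reported bug.
class HadoopOutputFormatBase {
    protected OutputCommitter outputCommitter;

    public void open() {
        // default assignment the ticket proposes to relax
        outputCommitter = new FileOutputCommitter();
    }

    public void close() {
        outputCommitter.commitTask();
    }
}

// The workaround: override open() to install the Mongo committer instead.
class MongoHadoopOutputFormat extends HadoopOutputFormatBase {
    @Override
    public void open() {
        outputCommitter = new MongoOutputCommitter();
    }
}
```

With this shape, ticket point 1 (a constructor taking the OutputCommitter) would make the subclass unnecessary, since callers could inject any committer directly.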
[jira] [Created] (FLINK-2404) LongCounters should have an addValue() method for primitive longs
Stephan Ewen created FLINK-2404: --- Summary: LongCounters should have an addValue() method for primitive longs Key: FLINK-2404 URL: https://issues.apache.org/jira/browse/FLINK-2404 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Fix For: 0.10 Since the LongCounter is used heavily for internal statistics reporting, it must have very low overhead. The current addValue() method always boxes and unboxes the values, which is unnecessary overhead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
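The boxing overhead the ticket refers to, and the primitive overload it asks for, can be sketched like this (illustrative names, not Flink's actual LongCounter API):

```java
// Sketch of the overhead in question: an addValue(Long) that boxes versus an
// add(long) overload that stays on primitives. Names are illustrative.
class LongCounter {
    private long localValue;

    // Boxed path: each call goes through a java.lang.Long wrapper object.
    public void addValue(Long value) {
        localValue += value; // auto-unboxing happens here
    }

    // Primitive overload proposed by the ticket: no boxing at all.
    public void add(long value) {
        localValue += value;
    }

    public long getLocalValue() {
        return localValue;
    }
}
```

For a counter updated once per record, avoiding the wrapper allocation on the hot path is the whole point of the change.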
[jira] [Commented] (FLINK-2399) Fail when actor versions don't match
[ https://issues.apache.org/jira/browse/FLINK-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640543#comment-14640543 ] Sachin Goel commented on FLINK-2399: Should this work when the task managers and the job manager are from different releases? Or just for the old and new Job Manager? Fail when actor versions don't match Key: FLINK-2399 URL: https://issues.apache.org/jira/browse/FLINK-2399 Project: Flink Issue Type: Improvement Components: JobManager, TaskManager Affects Versions: 0.9, master Reporter: Ufuk Celebi Priority: Minor Fix For: 0.10 Problem: there can be subtle errors when actors from different Flink versions communicate with each other, for example when an old client (e.g. Flink 0.9) communicates with a new JobManager (e.g. Flink 0.10-SNAPSHOT). We can check that the versions match on first communication between the actors and fail if they don't match. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
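The proposed check can be sketched as a minimal handshake: on first contact, both sides exchange a version string and refuse to proceed on mismatch. VersionHandshake below is an illustrative stand-in, not Flink's actual actor protocol:

```java
// Minimal sketch of a version check on first communication (illustrative only).
class VersionHandshake {
    private final String localVersion;

    VersionHandshake(String localVersion) {
        this.localVersion = localVersion;
    }

    /** Throws if the remote actor reports a different version. */
    void verify(String remoteVersion) {
        if (!localVersion.equals(remoteVersion)) {
            throw new IllegalStateException(
                "Version mismatch: local " + localVersion + " vs remote " + remoteVersion);
        }
    }
}
```

Sachin's question maps to the equality test: a strict equals() rejects any differing release, while a policy comparing only, say, major/minor components would tolerate patch-level skew.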
[jira] [Commented] (FLINK-2166) Add fromCsvFile() to TableEnvironment
[ https://issues.apache.org/jira/browse/FLINK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641074#comment-14641074 ] Fabian Hueske commented on FLINK-2166: -- Hi James, thanks for reaching out and your interest in implementing this feature! You are right, the {{CSVInputFormat}} has quite a few parameters. For the Java DataSet API, we have the {{CSVReader}} class which serves a similar purpose as Scala's named parameters. It's basically a builder for a {{CSVInputFormat}} with default parameters that can be overwritten. It would be nice if the {{CSVReader}} could be reused for this feature as well. Please let me know if you have further questions. Add fromCsvFile() to TableEnvironment - Key: FLINK-2166 URL: https://issues.apache.org/jira/browse/FLINK-2166 Project: Flink Issue Type: New Feature Components: Table API Affects Versions: 0.9 Reporter: Fabian Hueske Priority: Minor Labels: starter Add a {{fromCsvFile()}} method to the {{TableEnvironment}} to read a {{Table}} from a CSV file. The implementation should reuse Flink's CsvInputFormat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
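The builder idea Fabian mentions can be sketched as follows; CsvReaderBuilder and its setters are illustrative stand-ins for the real CSVReader API, not its actual signatures:

```java
// Hedged sketch of the builder pattern: defaults that individual setters can
// override, which is how a CSVReader-style builder avoids long parameter lists
// in Java. All names and parameters here are illustrative.
class CsvReaderBuilder {
    private String lineDelimiter = "\n";
    private String fieldDelimiter = ",";

    CsvReaderBuilder lineDelimiter(String d) { this.lineDelimiter = d; return this; }
    CsvReaderBuilder fieldDelimiter(String d) { this.fieldDelimiter = d; return this; }

    /** Stands in for building the configured input format. */
    String describe() {
        return "line=" + escape(lineDelimiter) + ", field=" + fieldDelimiter;
    }

    private static String escape(String s) { return s.replace("\n", "\\n"); }
}
```

Because each setter returns this, callers override only the options they care about and chain the rest, which is the Java-side answer to Scala's named/default parameters.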
[jira] [Commented] (FLINK-2231) Create a Serializer for Scala Enumerations
[ https://issues.apache.org/jira/browse/FLINK-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640736#comment-14640736 ] ASF GitHub Bot commented on FLINK-2231: --- GitHub user aalexandrov opened a pull request: https://github.com/apache/flink/pull/935 [FLINK-2231] Create a Serializer for Scala Enumerations. This closes FLINK-2231. The code should work for all objects which follow [the Enumeration idiom outlined in the ScalaDoc](http://www.scala-lang.org/api/2.11.5/index.html#scala.Enumeration). The second commit removes the boilerplate code from the `EnumValueComparator` by delegating to an `IntComparator`, you can either discard or squash it while merging depending on your preference. Bear in mind the FIXME at line 368 in `TypeAnalyzer.scala`. The commented code is better, but unfortunately doesn't work with Scala 2.10, so I used the FQN workaround. You can merge this pull request into a Git repository by running: $ git pull https://github.com/aalexandrov/flink FLINK-2231 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/935.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #935 commit fd69bda383f6771e87ded1b4b595a395519efd6e Author: Alexander Alexandrov alexander.s.alexand...@gmail.com Date: 2015-07-24T10:36:17Z [FLINK-2231] Create a Serializer for Scala Enumerations. commit dca03720d090c383f88a57af6808fdbfd2c4ec29 Author: Alexander Alexandrov alexander.s.alexand...@gmail.com Date: 2015-07-24T16:43:14Z Delegating EnumValueComparator. 
Create a Serializer for Scala Enumerations -- Key: FLINK-2231 URL: https://issues.apache.org/jira/browse/FLINK-2231 Project: Flink Issue Type: Improvement Components: Scala API Reporter: Stephan Ewen Assignee: Alexander Alexandrov Scala Enumerations are currently serialized with Kryo, but should be efficiently serialized by just writing the {{initial}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
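The proposed approach (write only the value's small integer identifier instead of letting Kryo serialize the whole object) can be illustrated in Java terms with an enum's ordinal. Color and EnumByIdSerializer below are made-up names, not the actual Flink/Scala implementation:

```java
import java.nio.ByteBuffer;

// Sketch of the idea: serialize an enumeration value as just its integer id,
// and look the value up again on read. Illustrative only.
enum Color { RED, GREEN, BLUE }

class EnumByIdSerializer {
    static byte[] serialize(Color c) {
        return ByteBuffer.allocate(4).putInt(c.ordinal()).array(); // 4 bytes, no Kryo
    }

    static Color deserialize(byte[] bytes) {
        return Color.values()[ByteBuffer.wrap(bytes).getInt()];
    }
}
```

Note the usual caveat with id-based schemes: reordering or removing enumeration values changes the ids, so serialized data is only stable while the declaration order is.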
[GitHub] flink pull request: [FLINK-2231] Create a Serializer for Scala Enu...
GitHub user aalexandrov opened a pull request: https://github.com/apache/flink/pull/935 [FLINK-2231] Create a Serializer for Scala Enumerations. This closes FLINK-2231. The code should work for all objects which follow [the Enumeration idiom outlined in the ScalaDoc](http://www.scala-lang.org/api/2.11.5/index.html#scala.Enumeration). The second commit removes the boilerplate code from the `EnumValueComparator` by delegating to an `IntComparator`, you can either discard or squash it while merging depending on your preference. Bear in mind the FIXME at line 368 in `TypeAnalyzer.scala`. The commented code is better, but unfortunately doesn't work with Scala 2.10, so I used the FQN workaround. You can merge this pull request into a Git repository by running: $ git pull https://github.com/aalexandrov/flink FLINK-2231 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/935.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #935 commit fd69bda383f6771e87ded1b4b595a395519efd6e Author: Alexander Alexandrov alexander.s.alexand...@gmail.com Date: 2015-07-24T10:36:17Z [FLINK-2231] Create a Serializer for Scala Enumerations. commit dca03720d090c383f88a57af6808fdbfd2c4ec29 Author: Alexander Alexandrov alexander.s.alexand...@gmail.com Date: 2015-07-24T16:43:14Z Delegating EnumValueComparator. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2231] Create a Serializer for Scala Enu...
Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/935#issuecomment-124582468 PS. The third commit fixes a compilation error in IntelliJ when the 'scala_2.11' profile is active.
[jira] [Commented] (FLINK-2231) Create a Serializer for Scala Enumerations
[ https://issues.apache.org/jira/browse/FLINK-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640752#comment-14640752 ] ASF GitHub Bot commented on FLINK-2231: --- Github user aalexandrov commented on the pull request: https://github.com/apache/flink/pull/935#issuecomment-124582468 PS. The third commit fixes a compilation error in IntelliJ when the 'scala_2.11' profile is active. Create a Serializer for Scala Enumerations -- Key: FLINK-2231 URL: https://issues.apache.org/jira/browse/FLINK-2231 Project: Flink Issue Type: Improvement Components: Scala API Reporter: Stephan Ewen Assignee: Alexander Alexandrov Scala Enumerations are currently serialized with Kryo, but should be efficiently serialized by just writing the {{initial}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2166) Add fromCsvFile() to TableEnvironment
[ https://issues.apache.org/jira/browse/FLINK-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640466#comment-14640466 ] James Cao commented on FLINK-2166:

Hi Fabian, I'd like to contribute to this issue. I've played with some prototypes of the solution and have a few questions. CsvInputFormat takes a lot of options, like line delimiter, field delimiter, included mask, etc. Should we allow users to set these options in fromCsvFile()? In the Scala API, we can have defaults for most of them and users can override only a few parameters using named parameters, but for the Java API, users would have to pass a long list of parameters to make the call.

Add fromCsvFile() to TableEnvironment
Key: FLINK-2166
URL: https://issues.apache.org/jira/browse/FLINK-2166
Project: Flink
Issue Type: New Feature
Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
Labels: starter

Add a {{fromCsvFile()}} method to the {{TableEnvironment}} to read a {{Table}} from a CSV file. The implementation should reuse Flink's CsvInputFormat.
[jira] [Commented] (FLINK-2394) HadoopOutFormat OutputCommitter is default to FileOutputCommiter
[ https://issues.apache.org/jira/browse/FLINK-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640999#comment-14640999 ] Fabian Hueske commented on FLINK-2394:

Hi [~stefano.bortoli], we already have two HadoopOutputFormatBase classes, one for each Hadoop API, so treating both APIs differently is not a problem. The issue is that one API supports different OutputCommitters out of the box (mapreduce) while the other requires that the OutputCommitter is explicitly set (mapred), unless I overlooked something.

HadoopOutFormat OutputCommitter is default to FileOutputCommiter
Key: FLINK-2394
URL: https://issues.apache.org/jira/browse/FLINK-2394
Project: Flink
Issue Type: Bug
Components: Hadoop Compatibility
Affects Versions: 0.9.0
Reporter: Stefano Bortoli

MongoOutputFormat does not write back to the collection because the HadoopOutputFormat wrapper does not allow setting the MongoOutputCommitter and defaults to FileOutputCommitter. Therefore, on close and finalizeGlobal execution the commit does not happen and the Mongo collection stays untouched. A simple solution would be to:
1 - create a constructor of HadoopOutputFormatBase and HadoopOutputFormat that takes the OutputCommitter as a parameter
2 - change the outputCommitter field of HadoopOutputFormatBase to be a generic OutputCommitter
3 - remove the default assignment of the outputCommitter to FileOutputCommitter in open() and finalizeGlobal(), or keep it as a default in case of no specific assignment.
[jira] [Commented] (FLINK-2200) Flink API with Scala 2.11 - Maven Repository
[ https://issues.apache.org/jira/browse/FLINK-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640455#comment-14640455 ] ASF GitHub Bot commented on FLINK-2200:

Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/885#issuecomment-124521433

Could you post your command to compile Flink with Scala 2.11? The current setting works well in my environment. Maven module definitions are directory names, not artifact ids, so we should keep the current setting. I'm adding the suffix into the poms, except for quickstart.

Flink API with Scala 2.11 - Maven Repository
Key: FLINK-2200
URL: https://issues.apache.org/jira/browse/FLINK-2200
Project: Flink
Issue Type: Wish
Components: Build System, Scala API
Reporter: Philipp Götze
Assignee: Chiwan Park
Priority: Trivial
Labels: maven

It would be nice if you could upload a pre-built version of the Flink API with Scala 2.11 to the maven repository.
[GitHub] flink pull request: [FLINK-2200] Add Flink with Scala 2.11 in Mave...
Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/885#issuecomment-124521433

Could you post your command to compile Flink with Scala 2.11? The current setting works well in my environment. Maven module definitions are directory names, not artifact ids, so we should keep the current setting. I'm adding the suffix into the poms, except for quickstart.
[GitHub] flink pull request: Collect(): Fixing the akka.framesize size limi...
Github user kl0u closed the pull request at: https://github.com/apache/flink/pull/887
[jira] [Resolved] (FLINK-2381) Possible class not found Exception on failed partition producer
[ https://issues.apache.org/jira/browse/FLINK-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi resolved FLINK-2381.
Resolution: Fixed

Fixed via 9b1343d (0.10) and 198406f (0.9.1)

Possible class not found Exception on failed partition producer
Key: FLINK-2381
URL: https://issues.apache.org/jira/browse/FLINK-2381
Project: Flink
Issue Type: Improvement
Components: Distributed Runtime
Affects Versions: 0.9, master
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
Fix For: 0.10, 0.9.1

Failing the production of a result partition marks the respective partition as failed with a ProducerFailedException. The cause of this exception can be a user defined class, which can only be loaded by the user code class loader. The network stack fails the shuffle with a RemoteTransportException, which has the user exception as a cause. When the consuming task receives this exception, this leads to a class not found exception, because the network stack tries to load the class with the system class loader.

{code}
+----------+
| FAILING  |
| PRODUCER |
+----------+
    ||
    \/ ProducerFailedException(CAUSE) via network
    ||
    \/
+----------+
| RECEIVER |
+----------+
{code}

CAUSE is only loadable by the user code class loader. When trying to deserialize this, RECEIVER fails with a LocalTransportException, which is super confusing, because the error is not local, but remote.

Thanks to [~rmetzger] for reporting and debugging the issue with the following stack trace:

{code}
Flat Map (26/120)
14:03:00,343 ERROR org.apache.flink.streaming.runtime.tasks.OneInputStreamTask - Flat Map (26/120) failed
java.lang.RuntimeException: Could not read next record.
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.readNext(OneInputStreamTask.java:71)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.invoke(OneInputStreamTask.java:101)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:567)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: java.lang.ClassNotFoundException: kafka.common.ConsumerRebalanceFailedException
    at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:151)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:275)
    at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:253)
    at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:275)
    at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:809)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:341)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    ... 1 more
Caused by: io.netty.handler.codec.DecoderException: java.lang.ClassNotFoundException: kafka.common.ConsumerRebalanceFailedException
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
    ... 12 more
Caused by: java.lang.ClassNotFoundException: kafka.common.ConsumerRebalanceFailedException
    at
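The fix direction the ticket implies (resolve the remote cause against the user code class loader rather than the system one) can be sketched in plain Java. The resolveClass override below is generic JDK plumbing, not Flink's actual network-stack code:

```java
import java.io.*;

// Sketch: an ObjectInputStream that tries a supplied class loader first when
// deserializing, so exception classes from user code can be resolved.
class ClassLoaderObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    ClassLoaderObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Try the supplied (e.g. user code) class loader first.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default resolution.
            return super.resolveClass(desc);
        }
    }
}
```

With this pattern, a receiver that knows the task's user code class loader could deserialize the ProducerFailedException cause instead of failing with a misleading LocalTransportException.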
[jira] [Resolved] (FLINK-2341) Deadlock in SpilledSubpartitionViewAsyncIO
[ https://issues.apache.org/jira/browse/FLINK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi resolved FLINK-2341.
Resolution: Fixed

Fixed via efca79c (0.10) and 208c0a1 (0.9.1)

Deadlock in SpilledSubpartitionViewAsyncIO
Key: FLINK-2341
URL: https://issues.apache.org/jira/browse/FLINK-2341
Project: Flink
Issue Type: Bug
Components: Distributed Runtime
Affects Versions: 0.9, 0.10
Reporter: Stephan Ewen
Assignee: Ufuk Celebi
Priority: Critical
Fix For: 0.9, 0.10

It may be that the deadlock is because of the way the {{SpilledSubpartitionViewTest}} is written.

{code}
Found one Java-level deadlock:
==============================
pool-25-thread-2:
  waiting to lock monitor 0x7f66f4932468 (object 0xfa1478f0, a java.lang.Object),
  which is held by IOManager reader thread #1
IOManager reader thread #1:
  waiting to lock monitor 0x7f66f4931160 (object 0xfa029768, a java.lang.Object),
  which is held by pool-25-thread-2

Java stack information for the threads listed above:
====================================================
pool-25-thread-2:
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.notifyError(SpilledSubpartitionViewAsyncIO.java:304)
    - waiting to lock 0xfa1478f0 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.onAvailableBuffer(SpilledSubpartitionViewAsyncIO.java:256)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$300(SpilledSubpartitionViewAsyncIO.java:42)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:367)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:353)
    at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:135)
    - locked 0xfa029768 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:119)
    - locked 0xfa3a1a20 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:95)
    at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:39)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)

IOManager reader thread #1:
    at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:127)
    - waiting to lock 0xfa029768 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:119)
    - locked 0xfa3a1ea0 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.returnBufferFromIOThread(SpilledSubpartitionViewAsyncIO.java:270)
    - locked 0xfa1478f0 (a java.lang.Object)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$100(SpilledSubpartitionViewAsyncIO.java:42)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:338)
    at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:328)
    at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.handleProcessedBuffer(AsynchronousFileIOChannel.java:199)
    at org.apache.flink.runtime.io.disk.iomanager.BufferReadRequest.requestDone(AsynchronousFileIOChannel.java:431)
    at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$ReaderThread.run(IOManagerAsync.java:377)
{code}

The full log with the deadlock stack traces can be found here: https://s3.amazonaws.com/archive.travis-ci.org/jobs/70232347/log.txt
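The dump shows the classic opposite-order acquisition: each thread holds one of the two monitors (the buffer-pool lock and the view lock) and waits for the other. A generic remedy for this hazard, shown below purely as an illustration and not as the actual fix applied in FLINK-2341, is to impose a single global acquisition order on both code paths:

```java
// Illustrative sketch: both code paths take poolLock before viewLock, so the
// circular wait seen in the thread dump cannot form. Names are made up.
class OrderedLocking {
    private final Object poolLock = new Object();
    private final Object viewLock = new Object();

    private int recycled = 0;
    private int notified = 0;

    // Path 1 (analogous to the recycler thread): poolLock, then viewLock.
    void recycleBuffer() {
        synchronized (poolLock) {
            synchronized (viewLock) {
                recycled++;
            }
        }
    }

    // Path 2 (analogous to the IO thread): same order, never the reverse.
    void notifyError() {
        synchronized (poolLock) {
            synchronized (viewLock) {
                notified++;
            }
        }
    }

    int total() { return recycled + notified; }
}
```

The alternative remedy, often preferable in callback-heavy code like the buffer recycler here, is to avoid invoking callbacks while holding any lock at all.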
[GitHub] flink pull request: Framesize fix
GitHub user kl0u opened a pull request: https://github.com/apache/flink/pull/934

Framesize fix

In Apache Flink the results of the collect() call were returned through akka to the client. This led to an inherent limitation on the size of the output of a job, as it could not exceed the akka.framesize; otherwise, akka would drop the message. To alleviate this, without dropping the benefits brought by akka and its out-of-the-box efficiency for small-sized results, we decided to keep forwarding the non-oversized (i.e. smaller than the akka.framesize) results through akka, and use the BlobCache module for forwarding the oversized (large) ones. Now the JobManager receives and merges the small accumulators (as before), and simply forwards to the Client the keys to the blobs storing the oversized ones. It is then the responsibility of the Client to do the final merging between oversized and non-oversized accumulators.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kl0u/flink framesize_fix
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/934.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #934

commit fb1fbd6bdcc81acd20d422842789fce0c0872580
Author: Kostas Kloudas kklou...@gmail.com
Date: 2015-07-22T17:53:11Z
Solved the #887 issue: removing the akka.framesize size limitation for the result of a job.

commit 34d3e433eb0ce976539de166288550c9c7612eb4
Author: Kostas Kloudas kklou...@gmail.com
Date: 2015-07-22T17:53:11Z
Solved the #887 issue: removing the akka.framesize size limitation for the result of a job.
commit 55aa50c3f3e5c4c3a253b8da68b5ddde9acb307f
Author: Kostas Kloudas kklou...@gmail.com
Date: 2015-07-24T12:02:10Z
Merge branch 'framesize_fix' of https://github.com/kl0u/flink into framesize_fix
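The routing rule the PR describes (inline via akka below the frame size, blob key above it) can be sketched with toy stand-ins. ResultTransport, FRAME_SIZE, and the Map-backed blob store below are all illustrative, not Flink's BlobCache API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the size-based routing: small results travel inline, oversized
// ones are stored in a blob store and only their key is sent. Illustrative only.
class ResultTransport {
    static final int FRAME_SIZE = 1024; // stand-in for akka.framesize

    final Map<String, byte[]> blobStore = new HashMap<>(); // stand-in for the BlobCache

    /** Returns either the payload itself (inline) or a blob key the client must resolve. */
    Object send(byte[] result) {
        if (result.length <= FRAME_SIZE) {
            return result; // small enough for an akka message
        }
        String key = "blob-" + blobStore.size();
        blobStore.put(key, result);
        return key; // client fetches the oversized result by key
    }

    byte[] fetch(String key) {
        return blobStore.get(key);
    }
}
```

The final merge then happens on the client side, which combines the inline accumulators with the ones it fetched by key, matching the division of labor the PR text describes.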
[jira] [Commented] (FLINK-2319) Collect size limitation due to akka.
[ https://issues.apache.org/jira/browse/FLINK-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640365#comment-14640365 ] ASF GitHub Bot commented on FLINK-2319:

Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-124500486 FLINK-2319 This pull request targets this ticket.

Collect size limitation due to akka.
Key: FLINK-2319
URL: https://issues.apache.org/jira/browse/FLINK-2319
Project: Flink
Issue Type: Bug
Components: JobManager, TaskManager
Affects Versions: 0.10
Reporter: Kostas

Each TaskManager keeps the results of a local task in a set of Accumulators. Upon termination of the task, the Accumulators are sent back to the JobManager, who merges them before sending them back to the Client. To exchange the Accumulators and their content, akka is used. This limits the size of the output of a task to no more than akka.framesize bytes; otherwise, akka would drop the message. This ticket proposes the removal of this limitation so that results can be of arbitrary size.
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-124500486 FLINK-2319 This pull request targets this ticket.
[GitHub] flink pull request: Framesize fix
Github user kl0u commented on the pull request: https://github.com/apache/flink/pull/934#issuecomment-124500737 Hello guys. This is a new pull request for a previous ticket, aligned with recent changes in the master branch.