breaking the pipeline
Hello all, I have a very long pipeline (the implementation of an incremental algorithm). It takes the Flink execution planner a very long time to create the plan, so I split the pipeline into several independent pipelines by writing out the intermediate results and reading them back in. Is there any more efficient way to do this? Best, Alieh
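For reference, the splitting approach described above can be sketched with the DataSet API as below. This is a minimal sketch, not Alieh's actual code; the file paths and tuple types are made-up placeholders. Each env.execute() call submits a separate job, so the planner only ever sees one small plan at a time.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class SplitPipeline {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Job 1: run the first part of the pipeline and materialize the result.
        DataSet<Tuple2<String, Integer>> firstPart = env
                .readCsvFile("/tmp/input.csv")             // hypothetical input path
                .types(String.class, Integer.class);
        firstPart.writeAsCsv("/tmp/intermediate.csv");      // hypothetical intermediate path
        env.execute("pipeline part 1");

        // Job 2: read the intermediate result back and continue the pipeline.
        DataSet<Tuple2<String, Integer>> secondPart = env
                .readCsvFile("/tmp/intermediate.csv")
                .types(String.class, Integer.class);
        secondPart.writeAsCsv("/tmp/output.csv");
        env.execute("pipeline part 2");
    }
}
```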
Re: runtime.resourcemanager
Hello, this is the task manager log, but it does not change after I run the program. I think the Flink planner has a problem with my program. It cannot even start the job. Best, Alieh

2018-12-10 12:20:20,386 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner -
2018-12-10 12:20:20,387 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Starting TaskManager (Version: 1.6.0, Rev:ff472b4, Date:07.08.2018 @ 13:31:13 UTC)
2018-12-10 12:20:20,387 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - OS current user: alieh
2018-12-10 12:20:20,609 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-12-10 12:20:20,768 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Current Hadoop/Kerberos user: alieh
2018-12-10 12:20:20,769 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b12
2018-12-10 12:20:20,769 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum heap size: 922 MiBytes
2018-12-10 12:20:20,769 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JAVA_HOME: /usr/lib/jvm/java-8-oracle
2018-12-10 12:20:20,774 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Hadoop version: 2.4.1
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM Options:
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -XX:+UseG1GC
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xms922M
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Xmx922M
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -XX:MaxDirectMemorySize=8388607T
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlog.file=/home/alieh/flink-1.6.0/log/flink-alieh-taskexecutor-0-alieh-P67A-D3-B3.log
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlog4j.configuration=file:/home/alieh/flink-1.6.0/conf/log4j.properties
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - -Dlogback.configurationFile=file:/home/alieh/flink-1.6.0/conf/logback.xml
2018-12-10 12:20:20,775 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Program Arguments:
2018-12-10 12:20:20,776 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - --configDir
2018-12-10 12:20:20,776 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - /home/alieh/flink-1.6.0/conf
2018-12-10 12:20:20,776 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Classpath: /home/alieh/flink-1.6.0/lib/flink-python_2.11-1.6.0.jar:/home/alieh/flink-1.6.0/lib/flink-shaded-hadoop2-uber-1.6.0.jar:/home/alieh/flink-1.6.0/lib/log4j-1.2.17.jar:/home/alieh/flink-1.6.0/lib/slf4j-log4j12-1.7.7.jar:/home/alieh/flink-1.6.0/lib/flink-dist_2.11-1.6.0.jar:::
2018-12-10 12:20:20,776 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner -
2018-12-10 12:20:20,777 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Registered UNIX signal handlers for [TERM, HUP, INT]
2018-12-10 12:20:20,785 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum number of open file descriptors is 1048576.
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.size, 1024m
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2018-12-10 12:20:20,803 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2018-12-10 12:20:20,804 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 8081
Re: runtime.resourcemanager
Hello Piotrek, thank you for your answer. I installed Flink on a local cluster and used the GUI to monitor the task managers. It seems the program *does not start at all*. The whole time just the job manager is struggling... For very small toy examples, after a long time (during which I see the job manager logs I mentioned before), the job is started and can be executed in 2 seconds. Best, Alieh

On 12/07/2018 10:43 AM, Piotr Nowojski wrote:
Hi, please investigate the logs/standard output/error from the task manager that has failed (the logs that you showed are from the job manager). Probably there is some obvious error/exception explaining why it has failed. Most common reasons:
- out of memory
- long GC pause
- seg fault or other error from some native library
- task manager killed, for example via SIGKILL
Piotrek

On 6 Dec 2018, at 17:34, Alieh wrote:
Hello all, I have an algorithm x() which contains several joins and three uses of Gelly ConnectedComponents. The problem is that if I call x() inside a script more than three times, I receive the messages listed below in the log and the program somehow stops. It happens even if I run it with a toy example of a graph with fewer than 10 vertices. Do you have any clue what the problem is? Cheers, Alieh

129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
129150 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.taskexecutor.TaskExecutor - Received heartbeat request from e80ec35f3d0a04a68000ecbdc555f98b.
129150 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received heartbeat from 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received new slot report from TaskManager 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Received slot report from instance 4c3e3654c11b09fbbf8e993a08a4c2da.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Release TaskExecutor 4c3e3654c11b09fbbf8e993a08a4c2da because it exceeded the idle timeout.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Worker 78cdd7a4-0c00-4912-992f-a2990a5d46db could not be stopped.
runtime.resourcemanager
Hello all, I have an algorithm x() which contains several joins and three uses of Gelly ConnectedComponents. The problem is that if I call x() inside a script more than three times, I receive the messages listed below in the log and the program somehow stops. It happens even if I run it with a toy example of a graph with fewer than 10 vertices. Do you have any clue what the problem is? Cheers, Alieh

129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
129149 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Trigger heartbeat request.
129150 [flink-akka.actor.default-dispatcher-20] DEBUG org.apache.flink.runtime.taskexecutor.TaskExecutor - Received heartbeat request from e80ec35f3d0a04a68000ecbdc555f98b.
129150 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received heartbeat from 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Received new slot report from TaskManager 78cdd7a4-0c00-4912-992f-a2990a5d46db.
129151 [flink-akka.actor.default-dispatcher-22] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Received slot report from instance 4c3e3654c11b09fbbf8e993a08a4c2da.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Release TaskExecutor 4c3e3654c11b09fbbf8e993a08a4c2da because it exceeded the idle timeout.
129200 [flink-akka.actor.default-dispatcher-15] DEBUG org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Worker 78cdd7a4-0c00-4912-992f-a2990a5d46db could not be stopped.
Gelly: akka.ask.timeout
Hey all, I have an iterative algorithm implemented in Gelly. Since I upgraded everything from 1.1.2 to flink-1.3.1, the runtime has increased, and in some cases task managers are killed. The error message is "akka.ask.timeout". I increased akka.ask.timeout, but the problem still exists. Cheers, Alieh
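For reference, the timeout mentioned above is set in flink-conf.yaml; the value below is only an illustrative choice, not a recommendation (and, as the email notes, raising it did not solve the underlying problem here):

```yaml
# flink-conf.yaml: raise the Akka ask timeout for slow/large jobs
akka.ask.timeout: 60 s
```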
bulk iteration
Hello all, even though I asked this question before, I lost my emails, so I have to ask again: using bulk iteration, is there any way to know the number of the current iteration? Cheers, Alieh
Bulk Iteration
Hello all, using Bulk iteration, is there any way to know the number of iterations? Cheers, Alieh
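One answer to the question above: rich functions used inside a bulk iteration can read the current (1-based) superstep from the iteration runtime context. A minimal sketch, with a made-up dataset and loop bound:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.IterativeDataSet;

public class BulkIterationSuperstep {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // A bulk iteration with at most 5 supersteps over a toy dataset.
        IterativeDataSet<Long> loop = env.generateSequence(1, 10).iterate(5);

        DataSet<Long> body = loop.map(new RichMapFunction<Long, Long>() {
            @Override
            public Long map(Long value) {
                // 1-based number of the iteration currently executing
                int superstep = getIterationRuntimeContext().getSuperstepNumber();
                return value + superstep;
            }
        });

        loop.closeWith(body).print();
    }
}
```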
count + aggregation
Hello all,
1st question: Is there any way to know the count or the content of a Flink DataSet without using count() or collect()? The problem is that I have a loop whose number of iterations depends on the count of a DataSet. Using count() may force the whole pipeline to be executed again. I would rather not use delta or bulk iteration.
2nd question: Using Aggregations.MAX on the second field of a DataSet of Tuple2<String, Integer>, I observed that the second field is the real maximum of the whole dataset, while the first field is not the one corresponding to it! Best, Alieh
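On the second question: the observed behavior is how field-wise aggregation works in the DataSet API; aggregate(Aggregations.MAX, ...) only guarantees the aggregated field, the other fields may come from any input tuple. To keep the whole winning tuple, maxBy can be used instead. A sketch with made-up data:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.aggregation.Aggregations;
import org.apache.flink.api.java.tuple.Tuple2;

public class MaxVsMaxBy {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> data = env.fromElements(
                Tuple2.of("a", 1), Tuple2.of("b", 7), Tuple2.of("c", 3));

        // Field-wise aggregation: only field 1 is guaranteed to be the maximum;
        // field 0 may be taken from a different input tuple.
        data.aggregate(Aggregations.MAX, 1).print();

        // maxBy keeps the entire tuple holding the maximum of field 1,
        // so field 0 stays consistent with field 1 -> ("b", 7).
        data.maxBy(1).print();
    }
}
```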
delta iteration
Hello all, I need the iteration number in a delta iteration (or any kind of counter). Is there any way to implement or extract it? Cheers, Alieh
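As with bulk iterations, rich functions inside a delta iteration can read the superstep number from the iteration runtime context. A minimal sketch; the data, key field, and loop bound are made up:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DeltaIteration;
import org.apache.flink.api.java.tuple.Tuple2;

public class DeltaIterationSuperstep {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Long, Long>> initial = env.fromElements(
                Tuple2.of(1L, 0L), Tuple2.of(2L, 0L));

        // Key on field 0; run at most 10 supersteps.
        DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
                initial.iterateDelta(initial, 10, 0);

        DataSet<Tuple2<Long, Long>> delta = iteration.getWorkset()
                .map(new RichMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>>() {
                    @Override
                    public Tuple2<Long, Long> map(Tuple2<Long, Long> v) {
                        // 1-based superstep number of the delta iteration
                        long step = getIterationRuntimeContext().getSuperstepNumber();
                        return Tuple2.of(v.f0, step);
                    }
                });

        // The delta serves both as solution-set update and as next workset here.
        iteration.closeWith(delta, delta).print();
    }
}
```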
gelly scatter/gather
Hi all, I have an iterative algorithm implemented using Gelly scatter/gather. Using 8 workers of a cluster, I encounter the error "akka.pattern.AskTimeoutException", which I suspect is caused by the heap size. Surprisingly, using 4 workers of the same cluster, my program runs! It seems that I have some I/O with 4 workers, but why can I not run the job with 8 workers? Best, Alieh
Flink groupBy
Hi all, is there any way in Flink to send a process to a particular reducer? If I do "test.groupBy(1).reduceGroup", is each group processed on one reducer? And if the number of groups is larger than the number of task slots we have, does Flink distribute the work evenly? I mean, if we have for example groups of size 10, 5, and 5 and two task slots, is the work distributed in this way? task slot 1: the group of size 10; task slot 2: the two groups of size 5. Best, Alieh
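For context, in the DataSet API each group does go entirely to one parallel instance, but the instance is chosen by hashing the key, not by balancing group sizes, so several groups can land on the same slot regardless of load. A small sketch (the data and group sizes are made up):

```java
import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class GroupByDistribution {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(2);   // two parallel reduce instances ("task slots")

        DataSet<Tuple2<Integer, String>> test = env.fromElements(
                Tuple2.of(1, "a"), Tuple2.of(1, "b"),
                Tuple2.of(2, "c"), Tuple2.of(3, "d"));

        // All records with the same key (field 0) are hash-partitioned to
        // exactly one parallel instance; one instance may hold several groups.
        test.groupBy(0)
            .reduceGroup(new GroupReduceFunction<Tuple2<Integer, String>, String>() {
                @Override
                public void reduce(Iterable<Tuple2<Integer, String>> group,
                                   Collector<String> out) {
                    int n = 0;
                    Integer key = null;
                    for (Tuple2<Integer, String> t : group) {
                        key = t.f0;
                        n++;
                    }
                    out.collect("key " + key + " -> group size " + n);
                }
            })
            .print();
    }
}
```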
Re: The two inputs have different execution contexts.
Hi, I was joining two datasets which were from two different ExecutionEnvironments. It was my mistake. Thanks anyway. Best, Alieh

On Monday, 11 July 2016, 11:33, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
Hi Alieh, could you share your code so that we can have a look? From the information you provide we cannot help. Thanks, Kostas

On Jul 10, 2016, at 3:13 PM, Alieh Saeedi <a1_sae...@yahoo.com> wrote:
I cannot join or coGroup two Tuple2 datasets of the same type. The error is java.lang.IllegalArgumentException: The two inputs have different execution contexts. :-(
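The fix described above amounts to creating both inputs from the same ExecutionEnvironment; a minimal sketch with made-up data:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class SameEnvironmentJoin {
    public static void main(String[] args) throws Exception {
        // One environment for the whole job: DataSets created from two
        // different environments cannot be joined or coGrouped.
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> left =
                env.fromElements(Tuple2.of(1, "a"), Tuple2.of(2, "b"));
        DataSet<Tuple2<Integer, String>> right =
                env.fromElements(Tuple2.of(1, "x"), Tuple2.of(3, "y"));

        // Join on field 0 of both inputs.
        left.join(right).where(0).equalTo(0).print();
    }
}
```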
The two inputs have different execution contexts.
I cannot join or coGroup two Tuple2 datasets of the same type. The error is java.lang.IllegalArgumentException: The two inputs have different execution contexts. :-(
LongMaxAggregator in gelly scatter/gather
Hi everybody, is it possible to use org.apache.giraph.aggregators.LongMaxAggregator as an aggregator of Gelly scatter/gather? Thanks in advance
LongMaxAggregator()
Hi everybody, why does LongSumAggregator() work while LongMaxAggregator() is not known any more? Thanks in advance
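For context: Flink's own iteration aggregators (org.apache.flink.api.common.aggregators) ship a LongSumAggregator, but as far as I can tell there is no built-in LongMaxAggregator, and the Giraph class is not usable in Flink. One can be written against Flink's Aggregator interface; a hedged sketch:

```java
import org.apache.flink.api.common.aggregators.Aggregator;
import org.apache.flink.types.LongValue;

// A max-aggregator in the style of Flink's LongSumAggregator,
// usable wherever an Aggregator<LongValue> is accepted
// (e.g. registered on an iteration configuration).
public class LongMaxAggregator implements Aggregator<LongValue> {

    private long max = Long.MIN_VALUE;

    @Override
    public LongValue getAggregate() {
        return new LongValue(max);
    }

    @Override
    public void aggregate(LongValue element) {
        max = Math.max(max, element.getValue());
    }

    // Convenience overload for primitive longs.
    public void aggregate(long value) {
        max = Math.max(max, value);
    }

    @Override
    public void reset() {
        max = Long.MIN_VALUE;
    }
}
```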
Gelly Scatter/Gather - Vertex update
Hi everybody, in Gelly scatter/gather, when no message is sent to a vertex in one iteration, it will not enter the scatter function in the next iteration. Why? I need all vertices to enter the scatter function in every iteration, while only those that receive a message get updated. Thanks in advance
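This deactivation is how the scatter/gather model works: only vertices that received a message in the previous superstep are active. A common workaround is to have every vertex send itself a dummy message each superstep so it stays active. A hedged sketch; the Long/Double type parameters and the surrounding iteration setup are assumed, not taken from the original program:

```java
import org.apache.flink.graph.Vertex;
import org.apache.flink.graph.spargel.ScatterFunction;

// Workaround sketch: keep every vertex active by having each vertex
// message itself in every superstep, in addition to its real messages.
public class AlwaysActiveScatter extends ScatterFunction<Long, Double, Double, Double> {
    @Override
    public void sendMessages(Vertex<Long, Double> vertex) {
        // The algorithm's real messages to neighbors would go here, e.g.:
        // sendMessageToAllNeighbors(vertex.getValue());

        // Dummy self-message so this vertex is active in the next superstep.
        sendMessageTo(vertex.getId(), vertex.getValue());
    }
}
```

Note that this keeps all vertices running every superstep, so the iteration no longer terminates early on convergence; a maximum iteration count then bounds the run.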
Gelly scatter/gather
Hi, is it possible to access the iteration number in Gelly scatter/gather? Thanks in advance
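Both the scatter and gather functions inherit a getSuperstepNumber() accessor, so the current iteration is available inside them. A small sketch with placeholder types (Long vertex ids, Double values), not tied to any concrete program:

```java
import org.apache.flink.graph.Vertex;
import org.apache.flink.graph.spargel.GatherFunction;
import org.apache.flink.graph.spargel.MessageIterator;

// Sketch: read the current superstep inside a gather function.
public class SuperstepAwareGather extends GatherFunction<Long, Double, Double> {
    @Override
    public void updateVertex(Vertex<Long, Double> vertex,
                             MessageIterator<Double> messages) {
        int superstep = getSuperstepNumber();   // 1-based iteration number
        double sum = 0;
        for (double m : messages) {
            sum += m;
        }
        // Example update that depends on the iteration number.
        setNewVertexValue(sum + superstep);
    }
}
```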