Hi again
Thanks for the reply!
I am not sure I understood the part about the active ResourceManager. But if I
understood you correctly, running Flink in standalone mode makes it possible to
set specific values for the different TaskManagers in the cluster.
Following the documentation linked by Chesnay, I have now registered a custom
resource (called 'close') on one of the task managers in my standalone cluster
(added in flink-conf.yaml only for that particular task manager). It seems the
task manager has been assigned that resource:
  "id": "10.0.0.5:36501-f787a2",
  "path": "akka.tcp://[email protected]:36501/user/rpc/taskmanager_0",
  "dataPort": 45583,
  "jmxPort": -1,
  "timeSinceLastHeartbeat": 1636710281811,
  "slotsNumber": 1,
  "freeSlots": 0,
  "totalResource": {
    "cpuCores": 1,
    "taskHeapMemory": 383,
    "taskOffHeapMemory": 0,
    "managedMemory": 512,
    "networkMemory": 128,
    "extendedResources": {
      "close": 1
    }
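For reference, the flink-conf.yaml entries for such a resource would presumably look roughly like the sketch below. The keys are the ones described in Flink's external resource documentation; the resource name and amount are taken from the output above, and the exact file used on this task manager is an assumption.

```yaml
# flink-conf.yaml, only on the TaskManager that should offer the resource.
# Assumed values: name "close" and amount 1, mirroring "extendedResources" above.
external-resources: close
external-resource.close.amount: 1
```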
On the application side, I registered a slot sharing group for this specific
resource:
.setExternalResource("close", 1.0)
and then added one of the operators to the group:
.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source")
.slotSharingGroup(ssg)
However, it seems the parallelism is still 4:
"id": "cbc357ccb763df2852fee8c4fc7d55f2",
"name": "Source: Kafka Source -> Flat Map",
"parallelism": 4,
and I can see the task running on all four nodes in the cluster. It seems to me
that reactive mode scaling does not respect the resource demand of the
operator?
Regards
Morten
On 11 Nov 2021, at 17:15, Yangze Guo
<[email protected]> wrote:
Hi, Morten,
Sorry for the belated reply. With the doc provided by Chesnay, you can
start the TaskManager with GPU. However, currently with the active
ResourceManager, the resource profiles of all TaskManagers are the
same, which means all of your TaskManagers will have at least one GPU.
If you only want some of the TaskManagers to have GPU, you may set up
a standalone cluster and then manually start a TaskManager with GPU
and let it register to the cluster.
Best,
Yangze Guo
On Thu, Nov 11, 2021 at 9:57 PM Chesnay Schepler
<[email protected]> wrote:
The external resource documentation should contain instructions to do so.
On 10/11/2021 09:18, Morten Gunnar Bjørner Lindeberg wrote:
Hi :)
I am trying the fine-grained resource management feature in Flink 1.14, hoping
it can enable assigning certain operators to certain TaskManagers.
The sample code in
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/finegrained_resource/
shows how to define the group and its resources (e.g. using the
setExternalResource method), but I do not see any option to "assign" a
TaskManager worker instance the capabilities of this "external resource".
Following the GPU-based example in the documentation, how can I ensure that
Flink "knows" which task manager actually has the required GPU? Is there some
configuration option I am missing / missing in the documentation?
Have a nice day!
Morten Lindeberg
PostDoc Informatics, University of Oslo
[email protected]