Repository: flink Updated Branches: refs/heads/master 21742b2d7 -> 392b2e9a0
[FLINK-5454] [docs] Add documentation how to tune streaming applications for large state Project: http://git-wip-us.apache.org/repos/asf/flink/repo Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/392b2e9a Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/392b2e9a Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/392b2e9a Branch: refs/heads/master Commit: 392b2e9a0595a947a24e1504cbfd5f9cabc1c86a Parents: 21742b2 Author: Stephan Ewen <[email protected]> Authored: Sun Jan 22 20:51:45 2017 +0100 Committer: Stephan Ewen <[email protected]> Committed: Sun Jan 22 21:05:16 2017 +0100 ---------------------------------------------------------------------- docs/fig/checkpoint_tuning.svg | 507 +++++++++++++++++++++++++++++ docs/monitoring/large_state_tuning.md | 187 ++++++++++- 2 files changed, 678 insertions(+), 16 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/flink/blob/392b2e9a/docs/fig/checkpoint_tuning.svg ---------------------------------------------------------------------- diff --git a/docs/fig/checkpoint_tuning.svg b/docs/fig/checkpoint_tuning.svg new file mode 100644 index 0000000..2639f6a --- /dev/null +++ b/docs/fig/checkpoint_tuning.svg @@ -0,0 +1,507 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + version="1.1" + width="1116.3304" + height="896.67767" + id="svg2"> + <defs + id="defs4" /> + <metadata + id="metadata7"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + <dc:title></dc:title> + </cc:Work> + </rdf:RDF> + </metadata> + <g + transform="translate(255.30809,4.5480758)" + id="layer1"> + <g + transform="translate(-274.66659,45.499311)" + id="g2989"> + <path + d="m 106.07643,701.83735 1003.26247,0 0,1.87546 -1003.26247,0 0,-1.87546 z m 1002.02467,-2.81321 7.5018,3.75094 -7.5018,3.75093 0,-7.50187 z" + id="path2991" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="1094.0844" + y="738.86139" + id="text2993" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">time</text> + <path + d="m 106.07643,664.17796 0,28.91971 210.05233,0 0,-28.91971 -210.05233,0 z" + id="path2995" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 316.12876,664.17796 0,28.91971 106.11393,0 0,-28.91971 -106.11393,0 z" + id="path2997" + 
style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 422.24269,664.17796 0,28.91971 210.05233,0 0,-28.91971 -210.05233,0 z" + id="path2999" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 632.29502,664.17796 0,28.91971 106.11394,0 0,-28.91971 -106.11394,0 z" + id="path3001" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 738.40896,664.17796 0,28.91971 210.05233,0 0,-28.91971 -210.05233,0 z" + id="path3003" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 948.46129,664.17796 0,28.91971 105.96391,0 0,-28.91971 -105.96391,0 z" + id="path3005" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 106.07643,388.78436 1003.26247,0 0,1.87546 -1003.26247,0 0,-1.87546 z m 1002.02467,-2.8132 7.5018,3.75093 -7.5018,3.75093 0,-7.50186 z" + id="path3007" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="1094.0844" + y="425.83395" + id="text3009" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">time</text> + <path + d="m 106.05767,351.10622 0,28.9197 210.05233,0 0,-28.9197 -210.05233,0 z" + id="path3011" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 316.11,351.10622 0,28.9197 4.51988,0 0,-28.9197 -4.51988,0 z" + id="path3013" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 319.22328,351.10622 0,28.9197 210.05233,0 0,-28.9197 -210.05233,0 z" + id="path3015" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 106.07643,145.61128 1003.26247,0 0,1.87546 -1003.26247,0 0,-1.87546 z m 1002.02467,-2.81321 7.5018,3.75094 -7.5018,3.75093 0,-7.50187 z" + 
id="path3017" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="1094.0844" + y="182.71945" + id="text3019" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">time</text> + <path + d="m 106.04829,108.08318 0,28.91032 28.91033,0 0,-28.91032 -28.91033,0 z" + id="path3021" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 134.95862,108.08318 0,28.91032 106.12331,0 0,-28.91032 -106.12331,0 z" + id="path3023" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 106.04829,161.53399 0,22.73066" + id="path3025" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 239.98479,161.53399 0,22.73066" + id="path3027" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 241.08193,172.93683 -135.03364,0" + id="path3029" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="452.19522" + y="-31.805758" + id="text3031" + xml:space="preserve" + style="font-size:25.05624199px;font-style:normal;font-weight:bold;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Regular Operation</text> + <text + x="407.03397" + y="-4.1988807" + id="text3033" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">(some </text> + <text + x="475.7511" + y="-4.1988807" + id="text3035" + 
xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">checkpointing</text> + <text + x="612.88525" + y="-4.1988807" + id="text3037" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">-</text> + <text + x="620.38715" + y="-4.1988807" + id="text3039" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">free time)</text> + <path + d="m 241.08193,108.08318 0,28.91032 28.91033,0 0,-28.91032 -28.91033,0 z" + id="path3041" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 269.99226,108.08318 0,28.9197 106.13269,0 0,-28.9197 -106.13269,0 z" + id="path3043" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 376.12495,108.08318 0,28.9197 28.90095,0 0,-28.9197 -28.90095,0 z" + id="path3045" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 405.0259,108.08318 0,28.9197 106.13269,0 0,-28.9197 -106.13269,0 z" + id="path3047" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 106.05767,406.75133 0,22.73066" + id="path3049" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 239.98479,406.75133 0,22.73066" + id="path3051" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 241.09131,418.15417 -135.03364,0" + id="path3053" + 
style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="88.081291" + y="213.86913" + id="text3055" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Checkpoint Interval</text> + <path + d="m 241.08193,161.53399 0,22.73066" + id="path3057" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 375.01843,161.53399 0,22.73066" + id="path3059" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 376.12495,172.93683 -135.03364,0" + id="path3061" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 373.93066,161.53399 0,22.73066" + id="path3063" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 507.87652,161.53399 0,22.73066" + id="path3065" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 508.9643,172.93683 -135.03364,0" + id="path3067" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 508.9643,161.53399 0,22.73066" + id="path3069" + 
style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 642.91017,161.55275 0,22.73066" + id="path3071" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 643.99794,172.95559 -135.03364,0" + id="path3073" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 507.87652,108.08318 0,28.9197 28.90095,0 0,-28.9197 -28.90095,0 z" + id="path3075" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 536.79623,108.10193 0,28.91971 106.11394,0 0,-28.91971 -106.11394,0 z" + id="path3077" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <text + x="17.819006" + y="47.55584" + id="text3079" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Duration of Checkpoint</text> + <text + x="266.45242" + y="59.093952" + id="text3081" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Processing without</text> + <text + x="266.45242" + y="83.099937" + id="text3083" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">a checkpoint in progress </text> + <path + d="m 106.07643,719.16666 0,22.76817" + id="path3085" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 239.98479,719.16666 0,22.76817" + id="path3087" + 
style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 241.11007,730.60701 -135.03364,0" + id="path3089" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 318.60437,646.96117 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z" + id="path3091" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="88.37674" + y="456.98352" + id="text3093" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Checkpoint Interval</text> + <text + x="285.33713" + y="475.22153" + id="text3095" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Checkpoint completes late,</text> + <text + x="285.33713" + y="499.22751" + id="text3097" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Immediately triggers next</text> + <path + d="m 321.60512,453.07537 -4.36984,-21.17402 2.75694,-0.56264 4.36984,21.15527 -2.75694,0.58139 z m -6.8267,-19.22354 2.41935,-9.13352 5.8327,7.42685 -8.25205,1.70667 z" + id="path3099" + 
style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.01875467px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="102.75316" + y="767.70654" + id="text3101" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Base Checkpoint</text> + <text + x="145.66385" + y="791.71252" + id="text3103" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Interval</text> + <path + d="m 107.38925,55.288774 12.26556,37.20927 -2.67254,0.881469 -12.26556,-37.218647 2.67254,-0.872092 z m 14.49737,34.986841 -1.3691,9.339827 -6.64853,-6.695418 8.01763,-2.644409 z" + id="path3105" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.00937734px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 255.01666,76.303384 -50.33754,20.948969 1.08777,2.597522 50.32816,-20.948969 -1.07839,-2.597522 z m -50.12187,17.816939 -6.17028,7.136157 9.41484,0.65641 -3.24456,-7.792567 z" + id="path3107" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.00937734px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 642.91017,161.55275 0,22.73066" + id="path3109" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 776.85603,161.55275 0,22.73066" + id="path3111" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 777.94381,172.95559 -135.03364,0" + id="path3113" + 
style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 641.82239,108.10193 0,28.91971 29.06975,0 0,-28.91971 -29.06975,0 z" + id="path3115" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 670.89214,108.10193 0,28.91971 105.96389,0 0,-28.91971 -105.96389,0 z" + id="path3117" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 777.94381,161.55275 0,22.73066" + id="path3119" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 911.88967,161.55275 0,22.73066" + id="path3121" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 912.97745,172.95559 -135.03364,0" + id="path3123" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 776.85603,108.10193 0,28.91971 29.06975,0 0,-28.91971 -29.06975,0 z" + id="path3125" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 805.92578,108.10193 0,28.91971 105.96389,0 0,-28.91971 -105.96389,0 z" + id="path3127" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 911.88967,161.55275 0,22.73066" + id="path3129" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 1045.9856,161.55275 0,22.73066" + id="path3131" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + 
<path + d="m 1046.9233,172.95559 -135.03363,0" + id="path3133" + style="fill:none;stroke:#000000;stroke-width:2.81320095px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 910.95194,108.10193 0,28.91971 28.91971,0 0,-28.91971 -28.91971,0 z" + id="path3135" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 939.87165,108.10193 0,28.91971 106.11395,0 0,-28.91971 -106.11395,0 z" + id="path3137" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 527.40014,351.10622 0,28.9197 4.53863,0 0,-28.9197 -4.53863,0 z" + id="path3139" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 530.68221,351.12497 0,28.91971 210.05233,0 0,-28.91971 -210.05233,0 z" + id="path3141" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 738.85907,351.12497 0,28.91971 4.38859,0 0,-28.91971 -4.38859,0 z" + id="path3143" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 742.00985,351.12497 0,28.91971 210.05233,0 0,-28.91971 -210.05233,0 z" + id="path3145" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 950.18672,351.12497 0,28.91971 4.53863,0 0,-28.91971 -4.53863,0 z" + id="path3147" + style="fill:#70ad47;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 953.29999,351.12497 0,28.91971 92.68561,0 0,-28.91971 -92.68561,0 z" + id="path3149" + style="fill:#ffc000;fill-opacity:1;fill-rule:evenodd;stroke:none" /> + <path + d="m 318.60437,329.70714 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,7.14553 -2.8132,0 0,-7.14553 2.8132,0 z" + id="path3151" + 
style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.01875467px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 530.0633,329.70714 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,7.14553 -2.8132,0 0,-7.14553 2.8132,0 z" + id="path3153" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.01875467px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 741.37219,329.70714 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,7.16429 -2.8132,0 0,-7.16429 2.8132,0 z" + id="path3155" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <path + d="m 952.66233,329.70714 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,11.25281 -2.8132,0 0,-11.25281 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.69241 0,11.2528 -2.8132,0 0,-11.2528 2.8132,0 z m 0,19.6924 0,7.16429 -2.8132,0 0,-7.16429 2.8132,0 z" + id="path3157" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="148.77711" + y="842.66461" + id="text3159" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Checkpoint completes late</text> + <path + d="m 287.35909,819.16657 26.78167,-53.8259 -2.51313,-1.27532 
-26.78167,53.82591 2.51313,1.27531 z m 28.69465,-51.31278 -0.0375,-9.45235 -7.53938,5.70142 7.57689,3.75093 z" + id="path3161" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="395.0498" + y="800.99463" + id="text3163" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Min. </text> + <text + x="438.56064" + y="800.99463" + id="text3165" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">t</text> + <text + x="444.11203" + y="800.99463" + id="text3167" + xml:space="preserve" + style="font-size:19.95497131px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">ime between checkpoints</text> + <path + d="m 387.20896,784.77051 -9.48986,-10.09002 2.06301,-1.95048 9.48987,10.12752 -2.06302,1.91298 z m -10.57763,-7.16429 -2.70067,-9.03975 8.8522,3.26331 -6.15153,5.77644 z" + id="path3169" + style="fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:#000000;stroke-width:0.03750934px;stroke-linecap:butt;stroke-linejoin:round;stroke-opacity:1;stroke-dasharray:none" /> + <text + x="441.10461" + y="276.30859" + id="text3171" + xml:space="preserve" + style="font-size:25.05624199px;font-style:normal;font-weight:bold;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Degraded Operation</text> + <text + x="431.65225" + y="303.91547" + id="text3173" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">(constantly </text> + <text + x="546.73096" + y="303.91547" + id="text3175" + xml:space="preserve" + 
style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">checkpointing</text> + <text + x="683.86511" + y="303.91547" + id="text3177" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">)</text> + <text + x="423.17554" + y="599.56708" + id="text3179" + xml:space="preserve" + style="font-size:25.05624199px;font-style:normal;font-weight:bold;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">Guaranteeing Progress</text> + <text + x="375.76373" + y="627.17395" + id="text3181" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">(ensuring some checkpoint</text> + <text + x="644.33063" + y="627.17395" + id="text3183" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">-</text> + <text + x="651.83246" + y="627.17395" + id="text3185" + xml:space="preserve" + style="font-size:22.5056076px;font-style:normal;font-weight:normal;text-align:start;text-anchor:start;fill:#000000;font-family:Arial">free time)</text> + <path + d="m 422.24269,722.16741 c 0,9.18979 -11.36533,16.61664 -25.35632,16.61664 l 0,0 c -13.99098,0 -25.35631,7.46436 -25.35631,16.65415 0,-9.18979 -11.32782,-16.65415 -25.31881,-16.65415 l -4.76369,0 c -13.99098,0 -25.3188,-7.42685 -25.3188,-16.61664" + id="path3187" + style="fill:none;stroke:#000000;stroke-width:1.87546718px;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none" /> + </g> + </g> +</svg> http://git-wip-us.apache.org/repos/asf/flink/blob/392b2e9a/docs/monitoring/large_state_tuning.md ---------------------------------------------------------------------- diff --git 
a/docs/monitoring/large_state_tuning.md b/docs/monitoring/large_state_tuning.md index c49c106..053efd6 100644 --- a/docs/monitoring/large_state_tuning.md +++ b/docs/monitoring/large_state_tuning.md @@ -22,41 +22,196 @@ specific language governing permissions and limitations under the License. --> -This page gives a guide how to improve and tune applications that use large state. +This page is a guide on how to configure and tune applications that use large state. * ToC {:toc} +## Overview + +For Flink applications to run reliably at large scale, two conditions must be fulfilled: + + - The application needs to be able to take checkpoints reliably + + - The resources need to be sufficient to catch up with the input data streams after a failure + +The first sections discuss how to get well-performing checkpoints at scale. +The last section explains some best practices for planning how many resources to use. + + ## Monitoring State and Checkpoints - - Checkpoint statistics overview - - Interpret time until checkpoints - - Synchronous vs. asynchronous checkpoint time +The easiest way to monitor checkpoint behavior is via the UI's checkpoint section. The documentation +for [checkpoint monitoring](checkpoint_monitoring.html) shows how to access the available checkpoint +metrics. + +The two numbers that are of particular interest when scaling up checkpoints are: + + - The time until operators start their checkpoint: This time is currently not exposed directly, but corresponds + to: + + `checkpoint_start_delay = end_to_end_duration - synchronous_duration - asynchronous_duration` + + When the time to trigger the checkpoint is constantly very high, it means that the *checkpoint barriers* need a long + time to travel from the source to the operators. That typically indicates that the system is operating under + constant backpressure. + + - The amount of data buffered during alignments. 
For exactly-once semantics, Flink *aligns* the streams at + operators that receive multiple input streams, buffering some data for that alignment. + The buffered data volume is ideally low - higher amounts mean that checkpoint barriers are received at + very different times from the different input streams. + +Note that the numbers indicated here can occasionally be high in the presence of transient backpressure, data skew, +or network issues. However, if the numbers are constantly very high, it means that Flink puts many resources into checkpointing. + ## Tuning Checkpointing - - Checkpoint interval - - Getting work done between checkpoints (min time between checkpoints) +Checkpoints are triggered at regular intervals that applications can configure. When a checkpoint takes longer +to complete than the checkpoint interval, the next checkpoint is not triggered before the in-progress checkpoint +completes. By default the next checkpoint will then be triggered immediately once the ongoing checkpoint completes. + +When checkpoints end up frequently taking longer than the base interval (for example because state +grew larger than planned, or the storage where checkpoints are stored is temporarily slow), +the system is constantly taking checkpoints (new ones are started immediately once ongoing ones finish). +That can mean that too many resources are constantly tied up in checkpointing and that the operators make too +little progress. This behavior has less impact on streaming applications that use asynchronously checkpointed state, +but may still have an impact on overall application performance. + +To prevent such a situation, applications can define a *minimum duration between checkpoints*: + +`StreamExecutionEnvironment.getCheckpointConfig().setMinPauseBetweenCheckpoints(milliseconds)` + +This duration is the minimum time interval that must pass between the end of the latest checkpoint and the beginning +of the next. 
The figure below illustrates how this impacts checkpointing. + +<img src="../fig/checkpoint_tuning.svg" class="center" width="80%" alt="Illustration of how the minimum-time-between-checkpoints parameter affects checkpointing behavior."/> + +*Note:* Applications can be configured (via the `CheckpointConfig`) to allow multiple checkpoints to be in progress at +the same time. For applications with large state in Flink, this often ties up too many resources in checkpointing. +When a savepoint is manually triggered, it may be in progress concurrently with an ongoing checkpoint. + ## Tuning Network Buffers - - getting a good number of buffers to use - - monitoring if too many buffers cause too much inflight data +The number of network buffers is a parameter that can currently have an effect on checkpointing at large scale. +The Flink community is working on eliminating that parameter in the next versions of Flink. + +The number of network buffers defines how much data a TaskManager can hold in-flight before back-pressure kicks in. +A very high number of network buffers means that a lot of data may be in the stream network channels when a checkpoint +is started. Because the checkpoint barriers travel with that data (see [description of how checkpointing works](../internals/stream_checkpointing.html)), +a lot of in-flight data means that the barriers have to wait for that data to be transported/processed before arriving +at the target operator. + +Having a lot of data in-flight also does not speed up the data processing as a whole. It only means that data is picked up faster +from the data source (log, files, message queue) and buffered longer in Flink. Having fewer network buffers means that +data is picked up from the source more immediately before it is actually being processed, which is generally desirable. +The number of network buffers should hence not be set arbitrarily large, but to a low multiple (such as 2x) of the +minimum number of required buffers. 
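To make the buffer sizing advice above concrete, the following is a minimal sketch of the arithmetic, not part of the committed documentation. The class and method names are hypothetical, the buffer size of 32 KiB is an assumption about the configured memory segment size, and the minimum of 512 required buffers is an invented figure that has to be determined per job.

```java
// Hypothetical helper, for illustration only; it does not use any Flink API.
final class NetworkBufferEstimate {

    /** Upper bound on the bytes one TaskManager can hold in its network buffers. */
    public static long maxInFlightBytes(int numberOfBuffers, int bufferSizeBytes) {
        return (long) numberOfBuffers * bufferSizeBytes;
    }

    /** A low multiple (for example 2x) of the minimum number of required buffers. */
    public static int suggestedBuffers(int minRequiredBuffers, int multiple) {
        return Math.multiplyExact(minRequiredBuffers, multiple);
    }

    public static void main(String[] args) {
        int bufferSizeBytes = 32 * 1024;  // assumption: 32 KiB per network buffer
        int minRequiredBuffers = 512;     // assumption: invented minimum for this example

        int suggested = suggestedBuffers(minRequiredBuffers, 2);
        long maxBytes = maxInFlightBytes(suggested, bufferSizeBytes);

        System.out.println(suggested + " buffers hold at most "
                + (maxBytes / (1024 * 1024)) + " MiB in flight");
    }
}
```

Under these assumptions, doubling the minimum bounds the in-flight data per TaskManager at 32 MiB. A much larger buffer count raises that bound proportionally, and with it the amount of data a checkpoint barrier may have to wait behind.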
+ + +## Make state checkpointing Asynchronous where possible -## Make checkpointing asynchronous where possible +When state is *asynchronously* snapshotted, the checkpoints scale better than when the state is *synchronously* snapshotted. +Especially in more complex streaming applications with multiple joins, Co-functions, or windows, this may have a profound +impact. - - large state should be on keyed state, not operator state, because keyed state is managed, operator state not (subject to change in future versions) +To get state to be snapshotted asynchronously, applications have to do two things: + + 1. Use state that is [managed by Flink](../dev/stream/state.html): Managed state means that Flink provides the data + structure in which the state is stored. Currently, this is true for *keyed state*, which is abstracted behind the + interfaces like `ValueState`, `ListState`, `ReducingState`, ... + + 2. Use a state backend that supports asynchronous snapshots. In Flink 1.2, only the RocksDB state backend uses + fully asynchronous snapshots. + +The above two points imply that (in Flink 1.2) large state should generally be kept as keyed state, not as operator state. +This is subject to change with the planned introduction of *managed operator state*. - - asynchronous snapshots preferable. long synchronous snapshot times can cause problems on large state and complex topologies. move to RocksDB for that ## Tuning RocksDB - - Predefined options - - Custom Options +The state storage workhorse of many large scale Flink streaming applications is the *RocksDB State Backend*. +The backend scales well beyond main memory and reliably stores large [keyed state](../dev/stream/state.html). + +Unfortunately, RocksDB's performance can vary with configuration, and there is little documentation on how to tune +RocksDB properly. For example, the default configuration is tailored towards SSDs and performs suboptimally +on spinning disks.
+ +**Passing Options to RocksDB** + +{% highlight java %} +RocksDBStateBackend.setOptions(new MyOptions()); + +public class MyOptions implements OptionsFactory { + + @Override + public DBOptions createDBOptions() { + return new DBOptions() + .setIncreaseParallelism(4) + .setUseFsync(false) + .setDisableDataSync(true); + } + + @Override + public ColumnFamilyOptions createColumnOptions() { + + return new ColumnFamilyOptions() + .setTableFormatConfig( + new BlockBasedTableConfig() + .setBlockCacheSize(256 * 1024 * 1024) // 256 MB + .setBlockSize(128 * 1024)); // 128 KB + } +} +{% endhighlight %} + +**Predefined Options** + +Flink provides some predefined collections of options for RocksDB for different settings, which can be set for example via +`RocksDBStateBackend.setPredefinedOptions(PredefinedOptions.SPINNING_DISK_OPTIMIZED_HIGH_MEM)`. + +We expect to accumulate more such profiles over time. Feel free to contribute such predefined option profiles when you +find a set of options that works well and seems representative for certain workloads. + +**Important:** RocksDB is a native library that allocates memory not from the JVM, but directly from the process' +native memory. Any memory you assign to RocksDB will have to be accounted for, typically by decreasing the JVM heap size +of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for +allocating more memory than configured. + + +## Capacity Planning + +This section discusses how to decide how many resources should be used for a Flink job to run reliably. +The basic rules of thumb for capacity planning are: + + - Normal operation should have enough capacity to not operate under constant *back pressure*. + See [back pressure monitoring](back_pressure.html) for details on how to check whether the application runs under back pressure.
+ + - Provision some extra resources on top of the resources needed to run the program back-pressure-free during failure-free time. + These resources are needed to "catch up" with the input data that accumulated during the time the application + was recovering. + How much that should be depends on how long recovery operations usually take (which depends on the size of the state + that needs to be loaded into the new TaskManagers on a failover) and how fast the scenario requires failures to recover. + + *Important*: The baseline should be established with checkpointing activated, because checkpointing ties up + some amount of resources (such as network bandwidth). + + - Temporary back pressure is usually okay, and an essential part of execution flow control during load spikes, + during catch-up phases, or when external systems (that are written to in a sink) exhibit temporary slowdown. + + - Certain operations (like large windows) result in a spiky load for their downstream operators: + In the case of windows, the downstream operators may have little to do while the window is being built, + and have a lot to do when the windows are emitted. + The planning for the downstream parallelism needs to take into account how much the windows emit and how + fast such a spike needs to be processed. + +**Important:** In order to allow for adding resources later, make sure to set the *maximum parallelism* of the +data stream program to a reasonable number. The maximum parallelism defines how high you can set the program's +parallelism when re-scaling the program (via a savepoint). -## Capacity planning +Flink's internal bookkeeping tracks parallel state in the granularity of max-parallelism-many *key groups*. +Flink's design strives to make it efficient to have a very high value for the maximum parallelism, even if +executing the program with a low parallelism.
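The role of key groups in re-scaling can be illustrated with a simplified model. The class `KeyGroupSketch` and its methods are hypothetical. The model assumes that a key is mapped to a key group via its plain `hashCode()` and that key groups are handed to parallel instances as contiguous ranges. Flink's internal hashing and assignment may differ in detail.

```java
// Simplified, hypothetical model of key groups; not Flink's internal implementation.
final class KeyGroupSketch {

    /** Maps a key to one of maxParallelism key groups (assumption: plain hashCode). */
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Maps a key group to the parallel instance owning it (assumption: contiguous ranges). */
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return (int) ((long) keyGroup * parallelism / maxParallelism);
    }

    public static void main(String[] args) {
        int maxParallelism = 128;   // fixed for the lifetime of the program's state
        String key = "user-42";     // made-up key
        int keyGroup = keyGroupFor(key, maxParallelism);

        // Re-scaling changes which instance owns the key group,
        // but never which key group a key belongs to.
        for (int parallelism : new int[] {4, 8}) {
            System.out.printf("parallelism=%d: key group %d is owned by instance %d%n",
                    parallelism, keyGroup,
                    operatorIndexFor(keyGroup, maxParallelism, parallelism));
        }
    }
}
```

Because a key's key group depends only on the maximum parallelism, state can be redistributed in whole key groups when the parallelism changes. The same model shows why the parallelism cannot exceed the maximum parallelism: there would be instances without any key group.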
- - Normal operation should not be constantly back pressured (link to back pressure monitor) - - Allow for some excess capacity to support catch-up in case of failures and checkpoint alignment skew (due to data skew or bad nodes)
