hzxiongyinke opened a new pull request #1178: URL: https://github.com/apache/incubator-kyuubi/pull/1178
<!DOCTYPE html><p cid="n0" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain md-expand" style="box-sizing: border-box;">What is the purpose of the pull request</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">PR for KYUUBI #939: Add Z-Order extensions to optimize tables with Z-order. Z-order is a technique that maps multidimensional data to a single dimension. 
We ran a performance test.</span></p><p cid="n3" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">For this test, we used the Aliyun Databricks Delta test case:</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="url" class="md-link md-pair-s" spellcheck="false" style="box-sizing: border-box; word-break: break-all;"><a href="https://help.aliyun.com/document_detail/168137.html?spm=a2c4g.11186623.6.563.10d758ccclYtVb" style="box-sizing: border-box; cursor: pointer; color: rgb(65, 131, 196); -webkit-user-drag: none;">https://help.aliyun.com/document_detail/168137.html?spm=a2c4g.11186623.6.563.10d758ccclYtVb</a></span></p><p cid="n4" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" 
style="box-sizing: border-box;">Prepare data for the three scenarios:</span></p><ol class="ol-list" start="" cid="n5" mdtype="list" style="box-sizing: border-box; margin: 0.8em 0px; padding-left: 30px; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><li class="md-list-item" cid="n6" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative;"><p cid="n7" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; white-space: pre-wrap; position: relative;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">10 billion rows in 200 Parquet files: large files (1 GB)</span></p></li><li class="md-list-item" cid="n8" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative;"><p cid="n9" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; white-space: pre-wrap; position: relative;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">10 billion rows in 1,000 Parquet files: medium files (200 MB)</span></p></li><li class="md-list-item" cid="n10" mdtype="list_item" style="box-sizing: border-box; margin: 0px; position: relative; "><p cid="n11" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0px 0px 0.5rem; white-space: pre-wrap; position: relative;"><span 
md-inline="plain" class="md-plain" style="box-sizing: border-box;">1 billion rows in 10 hundred Parquet files: small files (200 KB)</span></p></li></ol><p cid="n12" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Test environment:</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">spark-3.1.2</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">hadoop-2.7.2</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">kyuubi-1.4.0</span></p><p cid="n13" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; 
background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">test step:</span></p><p cid="n420" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin : 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Step1: create hive tables</span></p><pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="scala" cid="n424" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-size: inherit; bac kground-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; background-position: inherit; 
background-repeat: inherit;"><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-bo x; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"drop database if exists $dbName cascade"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"create database if not exists $dbName"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"use $dbName"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"create table $connRandomParquet (src_ip string, src_port int, dst_ip string, dst_port int) stored 
as parquet"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: bord er-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"create table $connZorderOnlyIp (src_ip string, src_port int, dst_ip string, dst_port int) stored as parquet"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"create table $connZorder (src_ip string, src_port int, dst_ip string, dst_port int) stored as parquet"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-variable" st yle="box-sizing: border-box; color: rgb(0, 0, 0);">spark</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">sql</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">s</span><span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"show tables"</span>).<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">show</span>(<span class="cm-atom" style="box-sizing: border-box; color: rgb(34, 17, 153);">false</span>)</span></pre><p cid="n14" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 
51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-wei ght: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Step2: prepare data for parquet table with three scenarios</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">we use the following code</span></p><pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="scala" cid="n15" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; background-position: inherit; background-repeat: inherit;"><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-keyword" style="box-sizing: 
border-box; color: rgb(119, 0, 136);">def</span> <span class="cm-def" style="box-sizing: border-box; color: rgb(0, 0, 255);">randomIPv4</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>: <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">Random</span>) <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable-3" style="box-sizing: border-box; color: rgb(0, 136, 85);">Seq</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">fill</span>(<span class="cm-number" sty le="box-sizing: border-box; color: rgb(17, 102, 68);">4</span>)(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">nextInt</span>(<span class="cm-number" style="box-sizing: border-box; color: rgb(17, 102, 68);">256</span>)).<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">mkString</span>(<span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"."</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-keyword" style="box-sizing: border-box; color: rgb(119, 0, 136);">def</span> <span class="cm-def" style="box-sizing: border-box; color: rgb(0, 0, 255);">randomPort</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>: <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">Random</span>) <span cla ss="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>.<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">nextInt</span>(<span class="cm-number" style="box-sizing: border-box; color: rgb(17, 102, 
68);">65536</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span cm-text="" cm-zwsp="" style="box-sizing: border-box;"></span></span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span class="cm-keyword" style="box-sizing: border-box; color: rgb(119, 0, 136);">def</span> <span class="cm-def" style="box-sizing: border-box; color: rgb(0, 0, 255);">randomConnRecord</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>: <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">Random</span>) <span clas s="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">ConnRecord</span>(</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">src_ip</span> <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">randomIPv4</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>), <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">src_port</span> <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">randomPort</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>),</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">dst_ip</span> <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: 
border-box; color: rgb(0, 0, 0);">randomIPv4</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>), <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">dst_port</span> <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> <span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">randomPort</span>(<span class="cm-variable" style="box-sizing: border-box; color: rgb(0, 0, 0);">r</span>))</span></pre><p cid="n16" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Step3: optimize with Z-order on the IP columns only. Sort columns: src_ip, dst_ip; the number of shuffle partitions equals the number of files.</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="tab" class="md-tab" style="box-sizing: border-box; display: inline-block; white-space: pre;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Execute 'OPTIMIZE conn_zorder_only_ip ZORDER BY src_ip, dst_ip;' through Kyuubi.</span></p><p cid="n17" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", 
"Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Step4: optimize with Z-order on all four columns. Sort columns: src_ip, src_port, dst_ip, dst_port; the number of shuffle partitions equals the number of files.</span><span md-inline="softbreak" class="md-softbreak" style="box-sizing: border-box;"> </span><span md-inline="tab" class="md-tab" style="box-sizing: border-box; display: inline-block; white-space: pre;"> </span><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Execute 'OPTIMIZE conn_zorder ZORDER BY src_ip, src_port, dst_ip, dst_port;' through Kyuubi.</span></p><p cid="n18" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">Querying the tables before and after optimization gives the following results:</span></p><p cid="n21" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", 
Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">10 billion data and 200 files</span></p><figure class="md-table-fig" cid="n27" mdtype="table" style="box-sizing: border-box; margin: 1.2em 0px; overflow-x: auto; max-width: calc(100% + 16px); padding: 0px; cursor: default; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><table>
<thead><tr><th>Table</th><th>Number of files</th><th>Data size (rows)</th><th>Average file size</th><th>Average query time (10 runs)</th><th>Query resources</th><th>Rows scanned</th><th>Skipping ratio</th></tr></thead>
<tbody>
<tr><td>conn_random_parquet</td><td>200</td><td>10 billion</td><td>1.2 GB</td><td>27,554 ms</td><td>200 cores, 600 GB memory</td><td>10 billion</td><td>100%</td></tr>
<tr><td>conn_zorder_only_ip</td><td>200</td><td>10 billion</td><td>890 MB</td><td>2,459 ms</td><td>200 cores, 600 GB memory</td><td>43,170,600</td><td>99.568%</td></tr>
<tr><td>conn_zorder</td><td>200</td><td>10 billion</td><td>890 MB</td><td>3,185 ms</td><td>200 cores, 600 GB memory</td><td>54,841,302</td><td>99.451%</td></tr>
<tr><td>conn_random_parquet</td><td>1,000</td><td>10 billion</td><td>234.8 MB</td><td>27,031 ms</td><td>200 cores, 600 GB memory</td><td>10 billion</td><td>100%</td></tr>
<tr><td>conn_zorder_only_ip</td><td>1,000</td><td>10 billion</td><td>173.9 MB</td><td>2,668 ms</td><td>200 cores, 600 GB memory</td><td>43,170,600</td><td>99.568%</td></tr>
<tr><td>conn_zorder</td><td>1,000</td><td>10 billion</td><td>174.0 MB</td><td>3,207 ms</td><td>200 cores, 600 GB memory</td><td>54,841,302</td><td>99.451%</td></tr>
<tr><td>conn_random_parquet</td><td>10,000</td><td>1 billion</td><td>2.7 MB</td><td>76,772 ms</td><td>10 cores, 40 GB memory</td><td>1 billion</td><td>100%</td></tr>
<tr><td>conn_zorder_only_ip</td><td>10,000</td><td>1 billion</td><td>2.1 MB</td><td>3,963 ms</td><td>10 cores, 40 GB memory</td><td>406,572</td><td>99.959%</td></tr>
<tr><td>conn_zorder</td><td>10,000</td><td>1 billion</td><td>2.2 MB</td><td>3,621 ms</td><td>10 cores, 40 GB memory</td><td>387,942</td><td>99.961%</td></tr>
</tbody>
</table></figure><p cid="n60" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: "Open Sans", "Clear Sans", "Helvetica Neue", Helvetica, Arial, "Segoe UI Emoji", sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><span md-inline="plain" class="md-plain" style="box-sizing: border-box;">The complete code is as follows:</span></p><pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="shell" cid="n430" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; 
-webkit-text-stroke-width: 0px; text-decoration: none; background-position: inherit; background-repeat: inherit;"><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">./spark-shell</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">import org.apache.spark.SparkConf</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">import org.apache.spark.sql.SparkSession</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span cm-text="" cm-zwsp="" style="box-sizing: border-box;"></span></span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">case class ConnRecord(src_ip: String, src_port: Int, dst_ip: String, dst_port: Int)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span cm-text="" cm-zwsp="" style="box-sizing: border-box;"></span></span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val conf <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> new SparkConf().setAppName(<span class="cm-string" style="box-sizing: border-box; color: rgb(170, 17, 17);">"zorder_test"</span>)</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val spark <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> SparkSession.builder().config(conf).enableHiveSupport().getOrCreate()</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">import spark.implicits._</span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"><span cm-text="" cm-zwsp="" style="box-sizing: border-box;"></span></span><br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val sc <span class="cm-operator" style="box-sizing: border-box; color: rgb(152, 26, 26);">=</span> 
spark.sparkContext</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">sc.setLogLevel("WARN")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">// ten billion rows and two hundred files</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val numRecords = 10*1000*1000*1000L</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val numFiles = 200</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val dbName = s"zorder_test_$numFiles"</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val baseLocation = s"hdfs://localhost:9000/zorder_test/$dbName/"</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val connRandomParquet = "conn_random_parquet"</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val connZorderOnlyIp = "conn_zorder_only_ip"</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val connZorder = "conn_zorder"</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.conf.set("spark.sql.shuffle.partitions", numFiles)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.conf.get("spark.sql.shuffle.partitions")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.conf.set("spark.sql.hive.convertMetastoreParquet", false)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"drop database if exists $dbName cascade")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"create database if not exists $dbName")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"use $dbName")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"create table $connRandomParquet (src_ip string, src_port int, dst_ip string, dst_port int) stored as parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"create table $connZorderOnlyIp (src_ip string, src_port int, dst_ip string, dst_port int) stored as parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"create table $connZorder (src_ip string, src_port int, dst_ip string, dst_port int) stored as parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.sql(s"show tables").show(false)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">import scala.util.Random</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">// Functions for preparing the Zorder_Test data</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">def randomIPv4(r: Random) = Seq.fill(4)(r.nextInt(256)).mkString(".")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">def randomPort(r: Random) = r.nextInt(65536)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">def randomConnRecord(r: Random) = ConnRecord(</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  src_ip = randomIPv4(r), src_port = randomPort(r),</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  dst_ip = randomIPv4(r), dst_port = randomPort(r))</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">// One input row per partition; each partition generates numRecords / numFiles records</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">val df = spark.range(0, numFiles, 1, numFiles).mapPartitions { it =&gt;</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  val partitionID = it.toStream.head</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  val r = new Random(seed = partitionID)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  Iterator.fill((numRecords / numFiles).toInt)(randomConnRecord(r))</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">}</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">df.write</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .mode("overwrite")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .format("parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .insertInto(connRandomParquet)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.read.table(connRandomParquet)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .write</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .mode("overwrite")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .format("parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .insertInto(connZorderOnlyIp)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.read.table(connRandomParquet)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .write</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .mode("overwrite")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .format("parquet")</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">  .insertInto(connZorder)</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">spark.stop()</span></pre>
<p cid="n432" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: 'Open Sans', 'Clear Sans', 'Helvetica Neue', Helvetica, Arial, 'Segoe UI Emoji', sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;">Optimize SQL:</p>
<p cid="n456" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: 'Open Sans', 'Clear Sans', 'Helvetica Neue', Helvetica, Arial, 'Segoe UI Emoji', sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: 
normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><strong>Add conf:</strong><br>spark.sql.extensions=org.apache.kyuubi.sql.KyuubiSparkSQLExtension<br>spark.sql.hive.convertMetastoreParquet=false</p>
<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sql" cid="n450" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; background-position: inherit; background-repeat: inherit;"><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">OPTIMIZE conn_zorder_only_ip ZORDER BY src_ip, dst_ip;</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">OPTIMIZE zorder_test.conn_zorder ZORDER BY src_ip, src_port, dst_ip, dst_port;</span></pre>
<p cid="n437" mdtype="paragraph" 
class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: 'Open Sans', 'Clear Sans', 'Helvetica Neue', Helvetica, Arial, 'Segoe UI Emoji', sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;">Test SQL:</p>
<p cid="n463" mdtype="paragraph" class="md-end-block md-p" style="box-sizing: border-box; line-height: inherit; orphans: 4; margin: 0.8em 0px; white-space: pre-wrap; position: relative; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-family: 'Open Sans', 'Clear Sans', 'Helvetica Neue', Helvetica, Arial, 'Segoe UI Emoji', sans-serif; font-size: 16px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration: none;"><strong>Add conf:</strong><br>spark.sql.hive.convertMetastoreParquet=true</p>
<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sql" cid="n444" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: var(--monospace); font-size: 0.9em; display: block; break-inside: avoid; text-align: left; white-space: normal; background-image: inherit; background-size: inherit; background-attachment: inherit; background-origin: inherit; background-clip: inherit; background-color: rgb(248, 248, 248); position: relative !important; border: 1px solid rgb(231, 234, 237); border-top-left-radius: 3px; border-top-right-radius: 3px; border-bottom-right-radius: 3px; border-bottom-left-radius: 3px; padding: 8px 4px 6px; margin-bottom: 15px; margin-top: 15px; width: inherit; caret-color: rgb(51, 51, 51); color: rgb(51, 51, 51); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; background-position: inherit; background-repeat: inherit;"><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">select count(*) from conn_random_parquet where src_ip like '157%' and dst_ip like '216.%';</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">select count(*) from conn_zorder_only_ip where src_ip like '157%' and dst_ip like '216.%';</span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;"></span>
<br><span role="presentation" style="box-sizing: border-box; padding-right: 0.1px;">select count(*) from conn_zorder where src_ip like '157%' and dst_ip like '216.%';</span></pre>
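
For intuition about what `ZORDER BY src_ip, dst_ip` is trying to achieve, here is a minimal sketch of a textbook Z-order (Morton) encoding of two IPv4 addresses. It is an illustration only and is not taken from this PR: the names `ZOrderSketch`, `ipv4ToLong` and `interleave` are hypothetical, and how the Kyuubi extension actually encodes string columns is not described in this message. The idea shown is that interleaving the bits of two values yields one sort key in which rows that are close in both dimensions tend to land near each other, so a filter on either column can skip more files.

```scala
// Illustrative sketch only: a textbook Morton (Z-order) encoding.
// Not the Kyuubi implementation; all names here are hypothetical.
object ZOrderSketch {

  /** Parse dotted-quad IPv4 text into an unsigned 32-bit value held in a Long. */
  def ipv4ToLong(ip: String): Long =
    ip.split('.').foldLeft(0L)((acc, octet) => (acc << 8) | (octet.toLong & 0xFFL))

  /**
   * Interleave the low 32 bits of x and y into one 64-bit key:
   * bit i of x goes to bit 2i+1, bit i of y goes to bit 2i.
   * Bit 63 may be set, so compare keys with java.lang.Long.compareUnsigned.
   */
  def interleave(x: Long, y: Long): Long = {
    var z = 0L
    var i = 0
    while (i < 32) {
      z |= ((x >> i) & 1L) << (2 * i + 1)
      z |= ((y >> i) & 1L) << (2 * i)
      i += 1
    }
    z
  }

  def main(args: Array[String]): Unit = {
    // Build a single sort key from a (src_ip, dst_ip) pair.
    val key = interleave(ipv4ToLong("157.0.0.1"), ipv4ToLong("216.0.0.1"))
    println(java.lang.Long.toBinaryString(key))
  }
}
```

Sorting rows by such a key before writing the Parquet files is what lets per-file min/max statistics become selective for both columns at once, which is the effect the three `count(*)` queries above are meant to compare.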
