[GitHub] [flink] flinkbot edited a comment on pull request #14570: [FLINK-20798][k8s] Use namespaced kubernetes client when creating FlinkKubeClient

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14570:
URL: https://github.com/apache/flink/pull/14570#issuecomment-755308058


   
   ## CI report:
   
   * fa9f13d34946414ac57a1429548879d83e0ab6e4 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11696)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14568: Update the Row.toString method

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14568:
URL: https://github.com/apache/flink/pull/14568#issuecomment-755224268


   
   ## CI report:
   
   * 7ca6a4f3b85eff5e92c9b12d2fca2091abc6de67 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11695)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] blublinsky commented on pull request #14512: FLINK-20359 Added Owner Reference to Job Manager in native kubernetes

2021-01-06 Thread GitBox


blublinsky commented on pull request #14512:
URL: https://github.com/apache/flink/pull/14512#issuecomment-755541392


   Sorry guys, how do I force-push to the same PR? Unfortunately I'm not a Git guru.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on pull request #14529: FLINK-20359 Added Owner Reference to Job Manager in native kubernetes

2021-01-06 Thread GitBox


rmetzger commented on pull request #14529:
URL: https://github.com/apache/flink/pull/14529#issuecomment-755516798


   Closing as this is a duplicate of https://github.com/apache/flink/pull/14512



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger commented on pull request #14512: FLINK-20359 Added Owner Reference to Job Manager in native kubernetes

2021-01-06 Thread GitBox


rmetzger commented on pull request #14512:
URL: https://github.com/apache/flink/pull/14512#issuecomment-755516473


   The benefit of sticking to the same PR (by (force-)pushing changes) is that 
we keep track of the history of changes.
   
   I'm closing the other PR. Please push your changes here!
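
   For reference, a minimal sketch of the usual workflow (the `upstream` remote 
and the branch names are placeholders):
   
   ```
   # switch to the branch this PR was opened from
   git checkout my-pr-branch
   # rework the commits as needed (rebase, amend, squash, ...)
   git rebase upstream/master
   # overwrite the remote branch; the open PR picks up the new commits
   git push --force-with-lease origin my-pr-branch
   ```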



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14502: [Flink-20766][table-planner-blink] Separate implementations of sort ExecNode and PhysicalNode.

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14502:
URL: https://github.com/apache/flink/pull/14502#issuecomment-751570047


   
   ## CI report:
   
   * 1a28221fa5f38577e134877d1ca28c87477d9aac Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11702)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14312: [FLINK-20491] Support Broadcast State in BATCH execution mode

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14312:
URL: https://github.com/apache/flink/pull/14312#issuecomment-738876739


   
   ## CI report:
   
   * ca72d305e20fbb2ab11a36b93a2a1e969b6b7883 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11703)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14567: [FLINK-20453][runtime][checkpoint] Move checkpointing classes to an a…

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14567:
URL: https://github.com/apache/flink/pull/14567#issuecomment-755063547


   
   ## CI report:
   
   * 27095134441ec13971f36c873bca2a03d904cf36 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11700)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-20854) Introduce BytesMultiMap to support buffering records

2021-01-06 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu closed FLINK-20854.
---
Resolution: Fixed

Fixed in master: facd51fcf5c78c5f733c203c38e3a093ba8a5218

> Introduce BytesMultiMap to support buffering records
> 
>
> Key: FLINK-20854
> URL: https://issues.apache.org/jira/browse/FLINK-20854
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


flinkbot commented on pull request #14572:
URL: https://github.com/apache/flink/pull/14572#issuecomment-755905594


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 3f1d5189abe80623eba977235da6969663b535dc (Thu Jan 07 
06:02:39 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-20798) Using PVC as high-availability.storageDir could not work

2021-01-06 Thread oceanxie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260269#comment-17260269
 ] 

oceanxie edited comment on FLINK-20798 at 1/7/21, 6:54 AM:
---

The HA ConfigMaps wouldn't be deleted when I delete the Flink cluster (sts and 
configmap). I don't think this is appropriate. They were generated 
automatically, not created by me.


was (Author: oceanxie):
The HA ConfigMaps wouldn't be deleted when I delete the Flink cluster (sts and 
configmap). I don't think this is appropriate. They were generated and not 
created by me.

> Using PVC as high-availability.storageDir could not work
> 
>
> Key: FLINK-20798
> URL: https://issues.apache.org/jira/browse/FLINK-20798
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.12.0
> Environment: FLINK 1.12.0
>Reporter: hayden zhou
>Priority: Major
>  Labels: pull-request-available
> Attachments: flink.log
>
>
> I deployed Flink on K8s using a PVC as the high-availability storageDir; the 
> JobManager logs show that the leader election succeeded, but the web UI keeps 
> showing the election as in progress.
> When deploying standalone Flink on Kubernetes and configuring the 
> {{high-availability.storageDir}} to a mounted PVC directory, the Flink web UI 
> cannot be visited normally. It shows "Service temporarily unavailable 
> due to an ongoing leader election. Please refresh".
>  
> The following are the related logs from the JobManager.
> {code}
> 2020-12-29T06:45:54.177850394Z 2020-12-29 14:45:54,177 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Leader election started
>  2020-12-29T06:45:54.177855303Z 2020-12-29 14:45:54,177 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Attempting to acquire leader lease 'ConfigMapLock: default - 
> mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
>  2020-12-29T06:45:54.178668055Z 2020-12-29 14:45:54,178 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.178895963Z 2020-12-29 14:45:54,178 INFO 
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - 
> Starting DefaultLeaderRetrievalService with 
> KubernetesLeaderRetrievalDriver\{configMapName='mta-flink-resourcemanager-leader'}.
>  2020-12-29T06:45:54.179327491Z 2020-12-29 14:45:54,179 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> Connecting websocket ... 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@6d303498
>  2020-12-29T06:45:54.230081993Z 2020-12-29 14:45:54,229 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.230202329Z 2020-12-29 14:45:54,230 INFO 
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - 
> Starting DefaultLeaderRetrievalService with 
> KubernetesLeaderRetrievalDriver\{configMapName='mta-flink-dispatcher-leader'}.
>  2020-12-29T06:45:54.230219281Z 2020-12-29 14:45:54,229 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.230353912Z 2020-12-29 14:45:54,230 INFO 
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - 
> Starting DefaultLeaderElectionService with 
> KubernetesLeaderElectionDriver\{configMapName='mta-flink-resourcemanager-leader'}.
>  2020-12-29T06:45:54.237004177Z 2020-12-29 14:45:54,236 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Leader changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
>  2020-12-29T06:45:54.237024655Z 2020-12-29 14:45:54,236 INFO 
> org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] - 
> New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for 
> mta-flink-restserver-leader.
>  2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Successfully Acquired leader lease 'ConfigMapLock: default - 
> mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
>  2020-12-29T06:45:54.237297376Z 2020-12-29 14:45:54,237 DEBUG 
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - 
> Grant leadership to contender 
> [http://mta-flink-jobmanager:8081|http://mta-flink-jobmanager:8081/] with 
> session ID 9587e13f-322f-4cd5-9fff-b4941462be0f.
>  2020-12-29T06:45:54.237353551Z 2020-12-29 14:45:54,237 INFO 
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] - 
> [http://mta-flink-jobmanager:8081|http://mta-flink-jobmanager:8081/] was 
> granted leadership with leaderSessionID=9587e13f-322f-4cd5-9fff-b4941462be0f
>  

[GitHub] [flink] pnowojski opened a new pull request #14573: [FLINK-20868][task][metrics] Pause the idle/back pressure timers during processing mailbox actions

2021-01-06 Thread GitBox


pnowojski opened a new pull request #14573:
URL: https://github.com/apache/flink/pull/14573


   FLINK-20717 introduced a bug where any time spent on processing mails while 
the task is idle or back pressured is accounted to the idle or back-pressured 
time instead of the busy time.
   
   The fix is to assign the idle or back-pressure timer to the suspension marker 
and pause this timer while the MailboxProcessor is running mails.
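
   A rough illustration of the idea (a hypothetical class, not the actual Flink 
code):
   
   ```
   /** Hypothetical sketch: a timer that can be paused while mails are processed. */
   final class PausableTimer {
       private long accumulatedNanos;   // time measured so far
       private long currentStart = -1;  // -1 means the timer is paused
   
       void resume() { currentStart = System.nanoTime(); }
   
       void pause() {
           if (currentStart >= 0) {
               accumulatedNanos += System.nanoTime() - currentStart;
               currentStart = -1;
           }
       }
   
       long getAccumulatedNanos() { return accumulatedNanos; }
   }
   ```
   
   The suspension marker would hold such a timer; the MailboxProcessor calls 
`pause()` before executing a mail and `resume()` afterwards, so mail-processing 
time counts as busy time.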
   
   ## Verifying this change
   
   This PR expands existing tests to cover the fixed bug.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20868) Pause the idle/back pressure timers during processing mailbox actions

2021-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-20868:
---
Labels: pull-request-available  (was: )

> Pause the idle/back pressure timers during processing mailbox actions
> -
>
> Key: FLINK-20868
> URL: https://issues.apache.org/jira/browse/FLINK-20868
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Affects Versions: 1.13.0
>Reporter: Piotr Nowojski
>Assignee: Piotr Nowojski
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> FLINK-20717 introduced a bug, where any time spent on processing mails, when 
> task is idle or back pressured, will be accounted to idle or back pressured 
> time instead of the busy time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] fsk119 commented on a change in pull request #14539: [FLINK-20321][formats] Fix NPE when using AvroDeserializationSchema to deserialize null input

2021-01-06 Thread GitBox


fsk119 commented on a change in pull request #14539:
URL: https://github.com/apache/flink/pull/14539#discussion_r553144882



##
File path: 
flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java
##
@@ -128,6 +128,9 @@ Decoder getDecoder() {
 
 @Override
 public T deserialize(byte[] message) throws IOException {

Review comment:
   Please add the annotation `@Nullable` before the parameter `message`.
   
   You can also take a look at the constructor of the `KafkaDynamicSource`, 
which allows its input to be null.
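
   A minimal sketch of the suggested change (assuming `javax.annotation.Nullable`):
   
   ```
   import javax.annotation.Nullable;
   
   @Override
   public T deserialize(@Nullable byte[] message) throws IOException {
       if (message == null) {
           // return null for null input instead of failing with an NPE
           return null;
       }
       // ... existing decoding logic ...
   }
   ```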





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on a change in pull request #14569: [FLINK-20856][table-planner-blink] Separate the implementation of stream window aggregate nodes

2021-01-06 Thread GitBox


wuchong commented on a change in pull request #14569:
URL: https://github.com/apache/flink/pull/14569#discussion_r553145500



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/stream/StreamExecGroupWindowAggregate.java
##
@@ -0,0 +1,373 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.stream;
+
+import org.apache.flink.api.dag.Transformation;
+import org.apache.flink.streaming.api.transformations.OneInputTransformation;
+import org.apache.flink.table.api.TableConfig;
+import org.apache.flink.table.api.TableException;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.expressions.FieldReferenceExpression;
+import org.apache.flink.table.expressions.ValueLiteralExpression;
+import 
org.apache.flink.table.planner.calcite.FlinkRelBuilder.PlannerNamedWindowProperty;
+import org.apache.flink.table.planner.codegen.CodeGeneratorContext;
+import org.apache.flink.table.planner.codegen.EqualiserCodeGenerator;
+import org.apache.flink.table.planner.codegen.agg.AggsHandlerCodeGenerator;
+import org.apache.flink.table.planner.delegation.PlannerBase;
+import org.apache.flink.table.planner.expressions.PlannerWindowProperty;
+import org.apache.flink.table.planner.plan.logical.LogicalWindow;
+import org.apache.flink.table.planner.plan.logical.SessionGroupWindow;
+import org.apache.flink.table.planner.plan.logical.SlidingGroupWindow;
+import org.apache.flink.table.planner.plan.logical.TumblingGroupWindow;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecEdge;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNode;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase;
+import org.apache.flink.table.planner.plan.utils.AggregateInfoList;
+import org.apache.flink.table.planner.plan.utils.KeySelectorUtil;
+import org.apache.flink.table.planner.plan.utils.WindowEmitStrategy;
+import org.apache.flink.table.planner.utils.JavaScalaConversionUtil;
+import org.apache.flink.table.runtime.generated.GeneratedClass;
+import 
org.apache.flink.table.runtime.generated.GeneratedNamespaceAggsHandleFunction;
+import 
org.apache.flink.table.runtime.generated.GeneratedNamespaceTableAggsHandleFunction;
+import org.apache.flink.table.runtime.generated.GeneratedRecordEqualiser;
+import org.apache.flink.table.runtime.keyselector.RowDataKeySelector;
+import org.apache.flink.table.runtime.operators.window.CountWindow;
+import org.apache.flink.table.runtime.operators.window.TimeWindow;
+import org.apache.flink.table.runtime.operators.window.WindowOperator;
+import org.apache.flink.table.runtime.operators.window.WindowOperatorBuilder;
+import org.apache.flink.table.runtime.types.LogicalTypeDataTypeConverter;
+import org.apache.flink.table.runtime.typeutils.InternalTypeInfo;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.RowType;
+
+import org.apache.calcite.rel.core.AggregateCall;
+import org.apache.calcite.tools.RelBuilder;
+import org.apache.commons.lang3.ArrayUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.hasRowIntervalType;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.hasTimeIntervalType;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isProctimeAttribute;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isRowtimeAttribute;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isTableAggregate;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.toDuration;
+import static org.apache.flink.table.planner.plan.utils.AggregateUtil.toLong;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.transformToStreamAggregateInfoList;
+
+/** Stream {@link ExecNode} for either group window aggregate or group window 
table aggregate. */
+public class 

[GitHub] [flink] flinkbot edited a comment on pull request #14516: [FLINK-20783][table-planner-blink] Separate implementation of ExecNode and PhysicalNode in batch join

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14516:
URL: https://github.com/apache/flink/pull/14516#issuecomment-752024061


   
   ## CI report:
   
   * 5601ca0f26b9fdb130b0a030e7326d8b1287c76c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11718)
 
   * b87220ccdee330380bc015427b9f4b445173e324 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11721)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14508: [FLINK-20773][json] Support to parse unescaped control chars in string node

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14508:
URL: https://github.com/apache/flink/pull/14508#issuecomment-751734163


   
   ## CI report:
   
   * f1332b021d33a6e4681b0a08ad1c5b58f153c417 UNKNOWN
   * f4d02e921d2641fc5692617a4dd50ba2fda1128c UNKNOWN
   * fd8cbf90a807292b0db7b85bda26f1e717b87767 UNKNOWN
   * 3c03189754755222ce29f4d17485c91532da4a8b UNKNOWN
   * 5663475ec56efe4b84e6ae2e6cabd6d58db34bf2 UNKNOWN
   * 06a37e185c98ee1abefea3ce1b9278b12776d8a5 UNKNOWN
   * b3ffad9d58e5b8149a2e4da9e0ead2044170efb3 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11697)
 
   * 1b96157c0fc638ecdaf4a5c04c51e52275e926f0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653)
 
   * 2172813d98fb963cd7426c80e9de863caf46a87e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on pull request #14557: [FLINK-20781] Avoid NPE after SourceOperator is closed.

2021-01-06 Thread GitBox


xintongsong commented on pull request #14557:
URL: https://github.com/apache/flink/pull/14557#issuecomment-755937641


   It's weird that the flinkbot command is not working.
   I've rebased onto the latest master and pushed to trigger the CI tests.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] KarmaGYZ commented on a change in pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

2021-01-06 Thread GitBox


KarmaGYZ commented on a change in pull request #14561:
URL: https://github.com/apache/flink/pull/14561#discussion_r553153813



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java
##
@@ -464,7 +464,11 @@ private void stopResourceManagerServices() throws 
Exception {
 taskExecutors.get(taskManagerResourceId);
 
 if 
(workerTypeWorkerRegistration.getInstanceID().equals(taskManagerRegistrationId))
 {
-if (slotManager.registerTaskManager(workerTypeWorkerRegistration, 
slotReport)) {
+if (slotManager.registerTaskManager(
+workerTypeWorkerRegistration,
+slotReport,
+workerTypeWorkerRegistration.getTotalResourceProfile(),
+
workerTypeWorkerRegistration.getDefaultSlotResourceProfile())) {

Review comment:
   After an offline discussion, we found it may need a refactor with a larger 
scope:
   - The current inheritance relationship between `WorkerRegistration` and 
`TaskExecutorConnection` is not reasonable.
   - The `TaskManagerRegistration` also needs to be renamed.
   
   We decided to have some more in-depth discussions and move this out of the 
scope of this PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14571:
URL: https://github.com/apache/flink/pull/14571#issuecomment-755845385


   
   ## CI report:
   
   * 34d88a71b61f62f2775a41eb291c017c6d76b8ab Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11710)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14516: [FLINK-20783][table-planner-blink] Separate implementation of ExecNode and PhysicalNode in batch join

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14516:
URL: https://github.com/apache/flink/pull/14516#issuecomment-752024061


   
   ## CI report:
   
   * 09d3419bc7ef88969ad62f2d19a66e586610bdcf Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11442)
 
   * 5601ca0f26b9fdb130b0a030e7326d8b1287c76c UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14538: [FLINK-20681][yarn] Support remote path for shipping archives and files

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14538:
URL: https://github.com/apache/flink/pull/14538#issuecomment-752991523


   
   ## CI report:
   
   * af57879d33f5d7e1717518a475fd601916c5c7c1 UNKNOWN
   * f1c012ecf5d3d91f13bb9bfa24ea7c31c775c9c7 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11698)
 
   * d3230f5b6087c8c3c3453c9d69a68e69305010b4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11713)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20655) Add E2E tests to the new KafkaSource based on FLIP-27.

2021-01-06 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260257#comment-17260257
 ] 

Qingsheng Ren commented on FLINK-20655:
---

I used Flink's built-in state machine example with a parallelism of 4 for 
testing, and a topic with 32 partitions on the Kafka cluster. These scenarios 
were tested: 
 # Unbounded mode running for ~1 hour given a fixed topic
 # Use a topic pattern to consume from multiple topics
 # Use a topic pattern and create a new topic to test new-topic discovery
 # Scale out a topic to test new-partition discovery
 # Unbounded mode with a stopping offset
 # Bounded mode
 # Kill a TM, then restart it
 # Stop the job with a snapshot and resume

All these cases look good to me.
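
For reference, a minimal sketch of such a FLIP-27 KafkaSource pipeline (the 
bootstrap servers and topic pattern are placeholders; the builder method names 
follow the new KafkaSource API and may differ slightly between Flink versions):

```
import java.util.regex.Pattern;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceE2ESketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4);

        KafkaSource<String> source =
                KafkaSource.<String>builder()
                        .setBootstrapServers("kafka:9092")                 // placeholder
                        .setGroupId("kafka-source-e2e")
                        .setTopicPattern(Pattern.compile("test-topic-.*")) // topic-pattern case
                        .setStartingOffsets(OffsetsInitializer.earliest())
                        // periodic topic/partition discovery for the unbounded cases
                        .setProperty("partition.discovery.interval.ms", "10000")
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source").print();
        env.execute("KafkaSource E2E sketch");
    }
}
```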

> Add E2E tests to the new KafkaSource based on FLIP-27.
> --
>
> Key: FLINK-20655
> URL: https://issues.apache.org/jira/browse/FLINK-20655
> Project: Flink
>  Issue Type: Test
>  Components: Connectors / Kafka
>Affects Versions: 1.12.0
>Reporter: Jiangjie Qin
>Assignee: Qingsheng Ren
>Priority: Major
> Fix For: 1.13.0, 1.12.1
>
>
> Add the following e2e tests for KafkaSource based on FLIP-27.
>  # A basic read test which reads from a Kafka topic.
>  # Stop the job with savepoint and resume.
>  # Kill a TM and verify the failover works fine.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20798) Using PVC as high-availability.storageDir could not work

2021-01-06 Thread oceanxie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260269#comment-17260269
 ] 

oceanxie commented on FLINK-20798:
--

The HA ConfigMaps wouldn't be deleted when I delete the Flink cluster (sts and 
configmap). I don't think this is appropriate. They were generated and not 
created by me.
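
For anyone cleaning this up manually, a sketch of the label-based deletion, 
assuming the labels Flink attaches to its HA ConfigMaps (verify with 
`--show-labels` first; `<cluster-id>` is a placeholder):

```
# inspect the ConfigMaps and their labels first
kubectl get configmaps --show-labels
# delete the HA ConfigMaps of one cluster by label selector
kubectl delete configmaps -l app=<cluster-id>,configmap-type=high-availability
```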

> Using PVC as high-availability.storageDir could not work
> 
>
> Key: FLINK-20798
> URL: https://issues.apache.org/jira/browse/FLINK-20798
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / Kubernetes
>Affects Versions: 1.12.0
> Environment: FLINK 1.12.0
>Reporter: hayden zhou
>Priority: Major
>  Labels: pull-request-available
> Attachments: flink.log
>
>
> I deployed Flink on K8s using a PVC as the high-availability storageDir; the 
> JobManager logs show that the leader election succeeded, but the web UI keeps 
> showing the election as in progress.
> When deploying standalone Flink on Kubernetes and configuring the 
> {{high-availability.storageDir}} to a mounted PVC directory, the Flink web UI 
> cannot be visited normally. It shows "Service temporarily unavailable 
> due to an ongoing leader election. Please refresh".
>  
> The following are the related logs from the JobManager.
> {code}
> 2020-12-29T06:45:54.177850394Z 2020-12-29 14:45:54,177 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Leader election started
>  2020-12-29T06:45:54.177855303Z 2020-12-29 14:45:54,177 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Attempting to acquire leader lease 'ConfigMapLock: default - 
> mta-flink-resourcemanager-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'...
>  2020-12-29T06:45:54.178668055Z 2020-12-29 14:45:54,178 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.178895963Z 2020-12-29 14:45:54,178 INFO 
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - 
> Starting DefaultLeaderRetrievalService with 
> KubernetesLeaderRetrievalDriver\{configMapName='mta-flink-resourcemanager-leader'}.
>  2020-12-29T06:45:54.179327491Z 2020-12-29 14:45:54,179 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> Connecting websocket ... 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@6d303498
>  2020-12-29T06:45:54.230081993Z 2020-12-29 14:45:54,229 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.230202329Z 2020-12-29 14:45:54,230 INFO 
> org.apache.flink.runtime.leaderretrieval.DefaultLeaderRetrievalService [] - 
> Starting DefaultLeaderRetrievalService with 
> KubernetesLeaderRetrievalDriver\{configMapName='mta-flink-dispatcher-leader'}.
>  2020-12-29T06:45:54.230219281Z 2020-12-29 14:45:54,229 DEBUG 
> io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - 
> WebSocket successfully opened
>  2020-12-29T06:45:54.230353912Z 2020-12-29 14:45:54,230 INFO 
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - 
> Starting DefaultLeaderElectionService with 
> KubernetesLeaderElectionDriver\{configMapName='mta-flink-resourcemanager-leader'}.
>  2020-12-29T06:45:54.237004177Z 2020-12-29 14:45:54,236 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Leader changed from null to 6f6479c6-86cc-4d62-84f9-37ff968bd0e5
>  2020-12-29T06:45:54.237024655Z 2020-12-29 14:45:54,236 INFO 
> org.apache.flink.kubernetes.kubeclient.resources.KubernetesLeaderElector [] - 
> New leader elected 6f6479c6-86cc-4d62-84f9-37ff968bd0e5 for 
> mta-flink-restserver-leader.
>  2020-12-29T06:45:54.237027811Z 2020-12-29 14:45:54,236 DEBUG 
> io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector [] - 
> Successfully Acquired leader lease 'ConfigMapLock: default - 
> mta-flink-restserver-leader (6f6479c6-86cc-4d62-84f9-37ff968bd0e5)'
>  2020-12-29T06:45:54.237297376Z 2020-12-29 14:45:54,237 DEBUG 
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - 
> Grant leadership to contender 
> [http://mta-flink-jobmanager:8081|http://mta-flink-jobmanager:8081/] with 
> session ID 9587e13f-322f-4cd5-9fff-b4941462be0f.
>  2020-12-29T06:45:54.237353551Z 2020-12-29 14:45:54,237 INFO 
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint [] - 
> [http://mta-flink-jobmanager:8081|http://mta-flink-jobmanager:8081/] was 
> granted leadership with leaderSessionID=9587e13f-322f-4cd5-9fff-b4941462be0f
>  2020-12-29T06:45:54.237440354Z 2020-12-29 14:45:54,237 DEBUG 
> org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - 
> Confirm leader session ID 9587e13f-322f-4cd5-9fff-b4941462be0f for leader 
> 

[GitHub] [flink] V1ncentzzZ commented on pull request #14508: [FLINK-20773][json] Support to parse unescaped control chars in string node

2021-01-06 Thread GitBox


V1ncentzzZ commented on pull request #14508:
URL: https://github.com/apache/flink/pull/14508#issuecomment-755936508


   Thanks @wuchong, sounds good to me.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653)
 
   * 2172813d98fb963cd7426c80e9de863caf46a87e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11722)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14554: [FLINK-20798] add kubernetes config in HA doc

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14554:
URL: https://github.com/apache/flink/pull/14554#issuecomment-754363035


   
   ## CI report:
   
   * c8a962e989edd60ad2da562729a8e61decbee41b Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11631)
 
   * d15c185e55c8d812cdebd2d00430e215b23ad2c1 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14557: [FLINK-20781] Avoid NPE after SourceOperator is closed.

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14557:
URL: https://github.com/apache/flink/pull/14557#issuecomment-754510928


   
   ## CI report:
   
   * 4d66e1dd7d08d10ffb01a48912657d365e37fed8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11642)
 
   * bd51013d93c5b1b116a246a29de800714f576ce6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14572:
URL: https://github.com/apache/flink/pull/14572#issuecomment-755910110


   
   ## CI report:
   
   * 3f1d5189abe80623eba977235da6969663b535dc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11719)
 
   * 5f2c9f1f5ae08d469519ceea65208fd30d113866 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11723)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #14573: [FLINK-20868][task][metrics] Pause the idle/back pressure timers during processing mailbox actions

2021-01-06 Thread GitBox


flinkbot commented on pull request #14573:
URL: https://github.com/apache/flink/pull/14573#issuecomment-755944752


   
   ## CI report:
   
   * 55a5642324516b32ce04521ac5881ce2b5985940 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink-web] rmetzger commented on a change in pull request #408: Add security page for Flink

2021-01-06 Thread GitBox


rmetzger commented on a change in pull request #408:
URL: https://github.com/apache/flink-web/pull/408#discussion_r553158138



##
File path: _includes/navbar.html
##
@@ -177,7 +177,9 @@
 
   https://www.apache.org/licenses/; 
target="_blank">License 
 
-  https://www.apache.org/security/; 
target="_blank">Security 
+  Flink Security

Review comment:
   I vaguely remember that the plan visualizer is kinda popular (people 
complained when we removed it) ... that's why I want to keep it.
   
   How about this? 
   
![image](https://user-images.githubusercontent.com/89049/103865534-a0694e00-50c4-11eb-8eca-78c11288d155.png)
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] rmetzger closed pull request #14529: FLINK-20359 Added Owner Reference to Job Manager in native kubernetes

2021-01-06 Thread GitBox


rmetzger closed pull request #14529:
URL: https://github.com/apache/flink/pull/14529


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xintongsong commented on a change in pull request #14560: [FLINK-20837] Refactor dynamic SlotID

2021-01-06 Thread GitBox


xintongsong commented on a change in pull request #14560:
URL: https://github.com/apache/flink/pull/14560#discussion_r553155630



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/slot/TaskSlotTableImpl.java
##
@@ -288,22 +288,13 @@ public boolean allocateSlot(
 
 TaskSlot taskSlot = allocatedSlots.get(allocationId);
 if (taskSlot != null) {
-LOG.info("Allocation ID {} is already allocated in {}.", 
allocationId, taskSlot);
-return false;
-}
-
-if (taskSlots.containsKey(index)) {
-TaskSlot duplicatedTaskSlot = taskSlots.get(index);
+return isDuplicatedSlot(taskSlot, jobId, resourceProfile, index);
+} else if (!isIndexAlreadyTaken(index)) {

Review comment:
   `!` should be removed.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/slot/TaskSlotTableImpl.java
##
@@ -277,27 +280,31 @@ public boolean allocateSlot(
 
 @Override
 public boolean allocateSlot(
-int index,
+int requestedIndex,
 JobID jobId,
 AllocationID allocationId,
 ResourceProfile resourceProfile,
 Time slotTimeout) {
 checkRunning();
 
-Preconditions.checkArgument(index < numberSlots);
+Preconditions.checkArgument(requestedIndex < numberSlots);
+
+// The negative requestIndex indicate that the SlotManger allocate a 
dynamic slot, we
+// transfer the index to an increasing number not less than the 
numberSlots.
+int index = requestedIndex < 0 ? nextDynamicSlotIndex() : 
requestedIndex;
 
 TaskSlot taskSlot = allocatedSlots.get(allocationId);
 if (taskSlot != null) {
 return isDuplicatedSlot(taskSlot, jobId, resourceProfile, index);
-} else if (!isIndexAlreadyTaken(index)) {
+} else if (isIndexAlreadyTaken(index)) {
 LOG.info(
 "The static slot with index {} is already assigned to 
another allocation with id {}.",
 index,
 taskSlots.get(index).getAllocationId());
 return false;
 }
 
-resourceProfile = index >= 0 ? defaultSlotResourceProfile : 
resourceProfile;
+resourceProfile = index < numberSlots ? defaultSlotResourceProfile : 
resourceProfile;

Review comment:
   There are 3 occurrences of `index < numberSlots` and 1 occurrence of 
`index >= numberSlots` in this file.
   Let's deduplicate it with a util method.
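
   A minimal sketch of such a util method (the name is a suggestion, not the 
actual code):
   
   ```
   /** Dynamic slots get indexes >= numberSlots; static slots live below it. */
   private boolean isDynamicIndex(int index) {
       return index >= numberSlots;
   }
   ```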

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/slot/TaskSlotTableImpl.java
##
@@ -288,22 +288,13 @@ public boolean allocateSlot(
 
 TaskSlot taskSlot = allocatedSlots.get(allocationId);
 if (taskSlot != null) {
-LOG.info("Allocation ID {} is already allocated in {}.", 
allocationId, taskSlot);
-return false;
-}
-
-if (taskSlots.containsKey(index)) {
-TaskSlot duplicatedTaskSlot = taskSlots.get(index);
+return isDuplicatedSlot(taskSlot, jobId, resourceProfile, index);
+} else if (!isIndexAlreadyTaken(index)) {
 LOG.info(
-"Slot with index {} already exist, with resource profile 
{}, job id {} and allocation id {}.",
+"The static slot with index {} is already assigned to 
another allocation with id {}.",

Review comment:
   Not sure about exposing the concept of a *static* slot here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14508: [FLINK-20773][json] Support to parse unescaped control chars in string node

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14508:
URL: https://github.com/apache/flink/pull/14508#issuecomment-751734163


   
   ## CI report:
   
   * f1332b021d33a6e4681b0a08ad1c5b58f153c417 UNKNOWN
   * f4d02e921d2641fc5692617a4dd50ba2fda1128c UNKNOWN
   * fd8cbf90a807292b0db7b85bda26f1e717b87767 UNKNOWN
   * 3c03189754755222ce29f4d17485c91532da4a8b UNKNOWN
   * 5663475ec56efe4b84e6ae2e6cabd6d58db34bf2 UNKNOWN
   * 06a37e185c98ee1abefea3ce1b9278b12776d8a5 UNKNOWN
   * b3ffad9d58e5b8149a2e4da9e0ead2044170efb3 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11697)
 
   * 1b96157c0fc638ecdaf4a5c04c51e52275e926f0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11724)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14376:
URL: https://github.com/apache/flink/pull/14376#issuecomment-744252765


   
   ## CI report:
   
   * a9a50e78d0ba6ff84d02fbadee1484970fac2c79 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11709)
 
   * bbfc0d96eb419932d49a54bf95f008c3155fbc81 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14376:
URL: https://github.com/apache/flink/pull/14376#issuecomment-744252765


   
   ## CI report:
   
   * bbfc0d96eb419932d49a54bf95f008c3155fbc81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11717)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * 1e34e16b05f86ea121ecefb4d7ff49f2b42b23a7 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11712)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20773) Support to parse unescaped control chars in string node

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260273#comment-17260273
 ] 

Jark Wu commented on FLINK-20773:
-

After an offline discussion with [~xiaozilong], we both think that we don't 
need to introduce an option for this feature. We can allow unescaped control 
chars by default, because it doesn't have side effects. 
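
For context, a minimal sketch of enabling this in Jackson (which the JSON 
format uses under the hood), using the Jackson 2.10+ `JsonReadFeature` API:

```
import com.fasterxml.jackson.core.json.JsonReadFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class UnescapedControlCharsSketch {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // accept raw control characters (e.g. a literal tab) inside JSON string values
        mapper.enable(JsonReadFeature.ALLOW_UNESCAPED_CONTROL_CHARS.mappedFeature());
        // "a\tb" with a literal tab would otherwise throw a JsonParseException
        System.out.println(mapper.readTree("{\"k\":\"a\tb\"}"));
    }
}
```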

> Support to parse unescaped control chars in string node 
> 
>
> Key: FLINK-20773
> URL: https://issues.apache.org/jira/browse/FLINK-20773
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: xiaozilong
>Assignee: xiaozilong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
> Attachments: image-2020-12-25-20-21-50-637.png
>
>
> Can we add an option `allow-unescaped-control-chars` for the json format, 
> because it will throw an exception when illegal unquoted characters exist in the data?
> !image-2020-12-25-20-21-50-637.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14572:
URL: https://github.com/apache/flink/pull/14572#issuecomment-755910110


   
   ## CI report:
   
   * 3f1d5189abe80623eba977235da6969663b535dc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11719)
 
   * 5f2c9f1f5ae08d469519ceea65208fd30d113866 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] fsk119 commented on a change in pull request #14539: [FLINK-20321][formats] Fix NPE when using AvroDeserializationSchema to deserialize null input

2021-01-06 Thread GitBox


fsk119 commented on a change in pull request #14539:
URL: https://github.com/apache/flink/pull/14539#discussion_r553116137



##
File path: 
flink-formats/flink-json/src/main/java/org/apache/flink/formats/json/JsonRowDataDeserializationSchema.java
##
@@ -94,6 +94,9 @@ public JsonRowDataDeserializationSchema(
 
 @Override
 public RowData deserialize(byte[] message) throws IOException {

Review comment:
   ditto

##
File path: 
flink-formats/flink-avro/src/test/java/org/apache/flink/formats/avro/AvroRowDataDeSerializationSchemaTest.java
##
@@ -68,10 +68,24 @@
 import static org.apache.flink.table.api.DataTypes.TIMESTAMP;
 import static org.apache.flink.table.api.DataTypes.TINYINT;
 import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertNull;
 
 /** Test for the Avro serialization and deserialization schema. */
 public class AvroRowDataDeSerializationSchemaTest {
 
+@Test
+public void testDeserializeNullRow() throws Exception {
+final DataType dataType = ROW(FIELD("bool", BOOLEAN())).nullable();

Review comment:
   The code is duplicated here. Could you extract a common method that creates 
a new schema, so all tests can reuse the same code? E.g.:
   ```
   private AvroRowDataDeserializationSchema createSchema(DataType dataType)
           throws Exception {
       final RowType rowType = (RowType) dataType.getLogicalType();
       final TypeInformation<RowData> typeInfo = InternalTypeInfo.of(rowType);

       AvroRowDataDeserializationSchema deserializationSchema =
               new AvroRowDataDeserializationSchema(rowType, typeInfo);
       deserializationSchema.open(null);
       return deserializationSchema;
   }
   ```

##
File path: 
flink-formats/flink-csv/src/test/java/org/apache/flink/formats/csv/CsvRowDataSerDeSchemaTest.java
##
@@ -206,6 +206,12 @@ public void testDeserializeUnsupportedNull() throws 
Exception {
 Row.of("Test", null, "Test"), testDeserialization(true, false, 
"Test,null,Test"));
 }
 
+@Test
+public void testDeserializeNullRow() throws Exception {
+// return null for null input
+assertNull(testDeserialization(true, false, null));

Review comment:
Maybe it's better to set the parameter `allowParsingErrors` to false.

##
File path: 
flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/RegistryAvroDeserializationSchema.java
##
@@ -65,6 +65,9 @@ public RegistryAvroDeserializationSchema(
 
 @Override
 public T deserialize(byte[] message) throws IOException {

Review comment:
   ditto

##
File path: 
flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvRowDataDeserializationSchema.java
##
@@ -141,6 +141,9 @@ public CsvRowDataDeserializationSchema build() {
 
 @Override
 public RowData deserialize(byte[] message) throws IOException {

Review comment:
   ditto

##
File path: 
flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroDeserializationSchema.java
##
@@ -128,6 +128,9 @@ Decoder getDecoder() {
 
 @Override
 public T deserialize(byte[] message) throws IOException {

Review comment:
   Please add the annotation `@Nullable` before the parameter `message`.
   
   You can also take a look at the constructor of KafkaDynamicSource, which 
allows its input to be null.
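   
   For illustration, the suggested guard is a null check at the top of each 
`deserialize` method (a minimal sketch; `decodeRecord` is a hypothetical 
placeholder for the existing decoding logic, not an actual method name):
   
   ```
   @Override
   public T deserialize(@Nullable byte[] message) throws IOException {
       if (message == null) {
           // e.g. a Kafka tombstone record: null input deserializes to null
           return null;
       }
       return decodeRecord(message); // the original decoding path, unchanged
   }
   ```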

##
File path: 
flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroRowDataDeserializationSchema.java
##
@@ -94,6 +94,9 @@ public void open(InitializationContext context) throws 
Exception {
 
 @Override
 public RowData deserialize(byte[] message) throws IOException {

Review comment:
   ditto





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xiaoHoly commented on pull request #14531: [FLINK-20777][Connector][Kafka] Property "partition.discovery.interval.ms" shoule be enabled by default for unbounded mode, and disable for

2021-01-06 Thread GitBox


xiaoHoly commented on pull request #14531:
URL: https://github.com/apache/flink/pull/14531#issuecomment-755945542


   Hi @PatrickRen, @becketqin, do you have time to review?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maosuhan commented on pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


maosuhan commented on pull request #14376:
URL: https://github.com/apache/flink/pull/14376#issuecomment-755900070


   @wuchong , I have updated the code formatting in my PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-20877) Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread Jark Wu (Jira)
Jark Wu created FLINK-20877:
---

 Summary: Refactor BytesHashMap and BytesMultiMap to support window 
key
 Key: FLINK-20877
 URL: https://issues.apache.org/jira/browse/FLINK-20877
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Runtime
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.13.0


Currently, the {{BytesHashMap}} and {{BytesMultiMap}} only support 
{{BinaryRowData}} as the key. However, for window operators, the key should be 
{{groupKey + window}}. We should decouple them from the {{BinaryRowData}} key 
type and support a window key. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20872) Job resume from history savepoint when failover if checkpoint is disabled

2021-01-06 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260248#comment-17260248
 ] 

Yun Tang commented on FLINK-20872:
--

[~Jiangang] If I understand correctly, you did not enable checkpointing (no 
checkpoint interval is set) but resumed the job from a savepoint, and you want 
the job not to resume from that savepoint when a failover occurs? Why not 
enable checkpointing, so that the job can resume from the latest 
checkpoint?
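
For reference, periodic checkpointing is enabled with a single call on the 
execution environment (a minimal sketch):
{code:java}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// take a checkpoint every 60 seconds, so failover restores from the latest checkpoint
env.enableCheckpointing(60_000);
{code}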

> Job resume from history savepoint when failover if checkpoint is disabled
> -
>
> Key: FLINK-20872
> URL: https://issues.apache.org/jira/browse/FLINK-20872
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.10.0, 1.12.0
>Reporter: Liu
>Priority: Minor
>
> I have a long-running job. Its checkpointing is disabled and a 
> restartStrategy is set. At one point I upgraded the job through a savepoint. 
> One day later, the job failed and restarted automatically, but it resumed 
> from the previous savepoint, so the job lagged heavily.
>  
> I have checked the code and found that the job will first try to resume from 
> the checkpoint and then the savepoint.
> {code:java}
> if (checkpointCoordinator != null) {
> // check whether we find a valid checkpoint
> if (!checkpointCoordinator.restoreInitialCheckpointIfPresent(
> new HashSet<>(newExecutionGraph.getAllVertices().values()))) {
> // check whether we can restore from a savepoint
> tryRestoreExecutionGraphFromSavepoint(
> newExecutionGraph, jobGraph.getSavepointRestoreSettings());
> }
> }
> {code}
> For a job whose checkpointing is disabled, internal failover should not 
> resume from a previous savepoint, especially one taken long ago. In this 
> situation, state loss is acceptable but lag is not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14516: [FLINK-20783][table-planner-blink] Separate implementation of ExecNode and PhysicalNode in batch join

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14516:
URL: https://github.com/apache/flink/pull/14516#issuecomment-752024061


   
   ## CI report:
   
   * 5601ca0f26b9fdb130b0a030e7326d8b1287c76c Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11718)
 
   * b87220ccdee330380bc015427b9f4b445173e324 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


flinkbot commented on pull request #14572:
URL: https://github.com/apache/flink/pull/14572#issuecomment-755910110


   
   ## CI report:
   
   * 3f1d5189abe80623eba977235da6969663b535dc UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14560: [FLINK-20837] Refactor dynamic SlotID

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14560:
URL: https://github.com/apache/flink/pull/14560#issuecomment-754563038


   
   ## CI report:
   
   * a4504bda6bf4e1b38aefb248bc8355bc7864f496 UNKNOWN
   * b19b9ae12b807b3e012b0952a70b824a91df0ce5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11651)
 
   * b8a38b363e2552a8ad2c174c7e0bc945e42398ff UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14572:
URL: https://github.com/apache/flink/pull/14572#issuecomment-755910110


   
   ## CI report:
   
   * 3f1d5189abe80623eba977235da6969663b535dc Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11719)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wuchong commented on pull request #14508: [FLINK-20773][format] Support allow-unescaped-control-chars option for JSON format.

2021-01-06 Thread GitBox


wuchong commented on pull request #14508:
URL: https://github.com/apache/flink/pull/14508#issuecomment-755925759


   Hi @V1ncentzzZ, I helped to rebase the branch to trigger the build again. 
   I also moved the test into the `TestSpec` list, because it is a general case now. 
   If you don't have any other objections, I will merge this PR once Azure 
passes. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20863) Exclude network memory from ResourceProfile

2021-01-06 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-20863:
-
Component/s: Runtime / Network

> Exclude network memory from ResourceProfile
> ---
>
> Key: FLINK-20863
> URL: https://issues.apache.org/jira/browse/FLINK-20863
> Project: Flink
>  Issue Type: Task
>  Components: Runtime / Network
>Reporter: Yangze Guo
>Priority: Major
> Fix For: 1.13.0
>
>
> Network memory is included in the current ResourceProfile implementation, 
> expecting the fine-grained resource management to not deploy too many tasks 
> onto a TM that require more network memory than the TM contains.
> However, how much network memory each task needs highly depends on the 
> shuffle service implementation, and may vary when switching to another 
> shuffle service. Therefore, neither user nor the Flink runtime can easily 
> specify network memory requirements for a task/slot at the moment.
> The concrete solution for network memory controlling is beyond the scope of 
> this FLIP. However, we are aware of a few potential directions for solving 
> this problem.
> - Make shuffle services adaptively control the amount of memory assigned to 
> each task/slot, with respect to the given memory pool size. In this way, 
> there should be no need to rely on fine-grained resource management to 
> control the network memory consumption.
> - Make shuffle services expose interfaces for calculating network memory 
> requirements for given SSGs. In this way, the Flink runtime can specify the 
> calculated network memory requirements for slots, without having to 
> understand the internal details of different shuffle service implementations.
> As for now, we propose to exclude network memory from ResourceProfile for the 
> moment, to unblock the fine-grained resource management feature from the 
> network memory controlling issue. If needed, it can be added back in future, 
> as long as there’s a good way to specify the requirement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] KarmaGYZ commented on pull request #14560: [FLINK-20837] Refactor dynamic SlotID

2021-01-06 Thread GitBox


KarmaGYZ commented on pull request #14560:
URL: https://github.com/apache/flink/pull/14560#issuecomment-755933152


   Thanks for the review @xintongsong . PR updated.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] haydenzhourepo commented on a change in pull request #14554: [FLINK-20798] add kubernates config in ha doc

2021-01-06 Thread GitBox


haydenzhourepo commented on a change in pull request #14554:
URL: https://github.com/apache/flink/pull/14554#discussion_r553148565



##
File path: docs/deployment/ha/kubernetes_ha.md
##
@@ -52,6 +52,10 @@ In order to identify the Flink cluster, you have to specify 
a `kubernetes.cluste
 
   kubernetes.cluster-id: cluster1337
 
+- [kubernetes.namespace]({% link deployment/config.md %}#kubernetes-namespace) 
(required):
+The namespace that will be used for running the jobmanager and taskmanager 
pods.
+  kubernetes.namespace: cluster-namespace

Review comment:
   This is not a required configuration, but it is an important one. Referring 
to the [FLINK-20798] problem: if the user deploys the Flink cluster in a 
namespace other than the default one, the deployment will fail because the 
wrong namespace is specified. The document should therefore remind the user of 
this configuration. I will remove the `required` tag and then submit.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14516: [FLINK-20783][table-planner-blink] Separate implementation of ExecNode and PhysicalNode in batch join

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14516:
URL: https://github.com/apache/flink/pull/14516#issuecomment-752024061


   
   ## CI report:
   
   * b87220ccdee330380bc015427b9f4b445173e324 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11721)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20877) Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-20877:
---
Labels: pull-request-available  (was: )

> Refactor BytesHashMap and BytesMultiMap to support window key
> -
>
> Key: FLINK-20877
> URL: https://issues.apache.org/jira/browse/FLINK-20877
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> Currently, the {{BytesHashMap}} and {{BytesMultiMap}} only support 
> {{BinaryRowData}} as the key. However, for window operators, the key should 
> be {{groupKey + window}}. We should decouple them from the {{BinaryRowData}} 
> key type and support a window key. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wuchong opened a new pull request #14572: [FLINK-20877][table-runtime-blink] Refactor BytesHashMap and BytesMultiMap to support window key

2021-01-06 Thread GitBox


wuchong opened a new pull request #14572:
URL: https://github.com/apache/flink/pull/14572


   
   
   ## What is the purpose of the change
   
   Currently, the BytesHashMap and BytesMultiMap only support BinaryRowData as 
the key. However, for window operators, the key should be groupKey + window. We 
should decouple them from the BinaryRowData key type and support a window key.
   
   ## Brief change log
   
   1. Move `BytesMap` classes to the util package, because in the near future 
the structures will not only be used in aggregations, but also in topN and 
others. 
   2. Extract pages operations from `AbstractRowDataSerializer` into 
`PagedTypeSerializer`
   3. Make `BytesMap` decouple from the `BinaryRowData` key type
   4. Introduce the `WindowKey` structure and `WindowKeySerializer`
   5. Introduce `BytesWindowHashMap` and `BytesWindowMultiMap` to support the 
window key
   We still keep the original name of `BytesHashMap` unchanged to avoid 
changing too much code.
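   
   For illustration only, a tiny sketch of what the `WindowKey` pairing could 
look like (field and method names here are assumptions, not the actual 
implementation):
   
   ```
   /** A composite key pairing a window identifier with the group key. */
   public final class WindowKey {
       private final long window;   // e.g. the window end timestamp
       private final RowData key;   // the grouping key
   
       public WindowKey(long window, RowData key) {
           this.window = window;
           this.key = key;
       }
   
       public long getWindow() {
           return window;
       }
   
       public RowData getKey() {
           return key;
       }
   }
   ```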
   
   
   ## Verifying this change
   
   Refactor `BytesHashMapTest` and `BytesMultiMapTest` tests to also cover the 
new introduced `BytesWindowHashMap` and `BytesWindowMultiMap`.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (**yes** / no / don't know)
 - The runtime per-record code paths (performance sensitive): (**yes** / no 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14560: [FLINK-20837] Refactor dynamic SlotID

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14560:
URL: https://github.com/apache/flink/pull/14560#issuecomment-754563038


   
   ## CI report:
   
   * a4504bda6bf4e1b38aefb248bc8355bc7864f496 UNKNOWN
   * b19b9ae12b807b3e012b0952a70b824a91df0ce5 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11651)
 
   * b8a38b363e2552a8ad2c174c7e0bc945e42398ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11720)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14538: [FLINK-20681][yarn] Support remote path for shipping archives and files

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14538:
URL: https://github.com/apache/flink/pull/14538#issuecomment-752991523


   
   ## CI report:
   
   * af57879d33f5d7e1717518a475fd601916c5c7c1 UNKNOWN
   * d3230f5b6087c8c3c3453c9d69a68e69305010b4 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11713)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20773) Support to parse unescaped control chars in string node

2021-01-06 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-20773:

Summary: Support to parse unescaped control chars in string node   (was: 
Support allow-unescaped-control-chars option for JSON format)

> Support to parse unescaped control chars in string node 
> 
>
> Key: FLINK-20773
> URL: https://issues.apache.org/jira/browse/FLINK-20773
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Affects Versions: 1.12.0
>Reporter: xiaozilong
>Assignee: xiaozilong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
> Attachments: image-2020-12-25-20-21-50-637.png
>
>
> Can we add an option `allow-unescaped-control-chars` for the JSON format? 
> Currently it throws an exception when illegal unquoted characters exist in the data.
> !image-2020-12-25-20-21-50-637.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20420) ES6 ElasticsearchSinkITCase failed due to no output for 900 seconds

2021-01-06 Thread Matthias (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias updated FLINK-20420:
-
Labels: test-stability  (was: testability)

> ES6 ElasticsearchSinkITCase failed due to no output for 900 seconds
> ---
>
> Key: FLINK-20420
> URL: https://issues.apache.org/jira/browse/FLINK-20420
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.12.0
>Reporter: Yun Tang
>Priority: Major
>  Labels: test-stability
>
> Instance:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10249=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20=18821
> {code:java}
> Process produced no output for 900 seconds.
> ==
> ==
> The following Java processes are running (JPS)
> ==
> Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
> 2274 Launcher
> 18260 Jps
> 15916 surefirebooter3434370240444055571.jar
> ==
> "main" #1 prio=5 os_prio=0 tid=0x7feec000b800 nid=0x3e2d runnable 
> [0x7feec8541000]
>java.lang.Thread.State: RUNNABLE
>   at org.testcontainers.shaded.okio.Buffer.indexOf(Buffer.java:1463)
>   at 
> org.testcontainers.shaded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:352)
>   at 
> org.testcontainers.shaded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230)
>   at 
> org.testcontainers.shaded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:224)
>   at 
> org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.readChunkSize(Http1ExchangeCodec.java:489)
>   at 
> org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.read(Http1ExchangeCodec.java:471)
>   at 
> org.testcontainers.shaded.okhttp3.internal.Util.skipAll(Util.java:204)
>   at 
> org.testcontainers.shaded.okhttp3.internal.Util.discard(Util.java:186)
>   at 
> org.testcontainers.shaded.okhttp3.internal.http1.Http1ExchangeCodec$ChunkedSource.close(Http1ExchangeCodec.java:511)
>   at 
> org.testcontainers.shaded.okio.ForwardingSource.close(ForwardingSource.java:43)
>   at 
> org.testcontainers.shaded.okhttp3.internal.connection.Exchange$ResponseBodySource.close(Exchange.java:313)
>   at 
> org.testcontainers.shaded.okio.RealBufferedSource.close(RealBufferedSource.java:476)
>   at 
> org.testcontainers.shaded.okhttp3.internal.Util.closeQuietly(Util.java:139)
>   at 
> org.testcontainers.shaded.okhttp3.ResponseBody.close(ResponseBody.java:192)
>   at org.testcontainers.shaded.okhttp3.Response.close(Response.java:290)
>   at 
> org.testcontainers.shaded.com.github.dockerjava.okhttp.OkDockerHttpClient$OkResponse.close(OkDockerHttpClient.java:280)
>   at 
> org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.lambda$null$0(DefaultInvocationBuilder.java:272)
>   at 
> org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder$$Lambda$87/1112018050.close(Unknown
>  Source)
>   at 
> com.github.dockerjava.api.async.ResultCallbackTemplate.close(ResultCallbackTemplate.java:77)
>   at 
> org.testcontainers.utility.ResourceReaper.start(ResourceReaper.java:177)
>   at 
> org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:203)
>   - locked <0x88fcbbf0> (a [Ljava.lang.Object;)
>   at 
> org.testcontainers.LazyDockerClient.getDockerClient(LazyDockerClient.java:14)
>   at 
> org.testcontainers.LazyDockerClient.listImagesCmd(LazyDockerClient.java:12)
>   at 
> org.testcontainers.images.LocalImagesCache.maybeInitCache(LocalImagesCache.java:68)
>   - locked <0x88fcb940> (a 
> org.testcontainers.images.LocalImagesCache)
>   at 
> org.testcontainers.images.LocalImagesCache.get(LocalImagesCache.java:32)
>   at 
> org.testcontainers.images.AbstractImagePullPolicy.shouldPull(AbstractImagePullPolicy.java:18)
>   at 
> org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:66)
>   at 
> org.testcontainers.images.RemoteDockerImage.resolve(RemoteDockerImage.java:27)
>   at 
> org.testcontainers.utility.LazyFuture.getResolvedValue(LazyFuture.java:17)
>   - locked <0x890763d0> (a 
> java.util.concurrent.atomic.AtomicReference)
>   at org.testcontainers.utility.LazyFuture.get(LazyFuture.java:39)
>   at 
> 

[GitHub] [flink] flinkbot commented on pull request #14573: [FLINK-20868][task][metrics] Pause the idle/back pressure timers during processing mailbox actions

2021-01-06 Thread GitBox


flinkbot commented on pull request #14573:
URL: https://github.com/apache/flink/pull/14573#issuecomment-755932613


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 55a5642324516b32ce04521ac5881ce2b5985940 (Thu Jan 07 
07:12:19 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20871) Make DataStream#executeAndCollectWithClient() public

2021-01-06 Thread Steven Zhen Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Zhen Wu updated FLINK-20871:
---
Summary: Make DataStream#executeAndCollectWithClient() public  (was: Make 
DataStream#executeAndCollectWithClient public)

> Make DataStream#executeAndCollectWithClient() public
> 
>
> Key: FLINK-20871
> URL: https://issues.apache.org/jira/browse/FLINK-20871
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.12.0
>Reporter: Steven Zhen Wu
>Priority: Major
>
> Right now, `DataStreamUtils#collectWithClient` is marked as deprecated in 
> favor of `DataStream#executeAndCollect()`. However, some integration 
> tests (e.g.  
> [FileSourceTextLinesITCase](https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files/src/test/java/org/apache/flink/connector/file/src/FileSourceTextLinesITCase.java#L187))
>  need the `DataStream#executeAndCollectWithClient` API to get a JobClient to 
> cancel the job after collecting the required output for an unbounded source test.
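
A sketch of the intended usage once the method is public (assuming it keeps the 
current {{ClientAndIterator}} return type; {{expectedCount}} and the job name 
are illustrative):
{code:java}
ClientAndIterator<String> clientAndIter =
        stream.executeAndCollectWithClient("collect-job");
List<String> result = new ArrayList<>();
while (result.size() < expectedCount && clientAndIter.iterator.hasNext()) {
    result.add(clientAndIter.iterator.next());
}
// cancel the (unbounded) job once enough output has been collected
clientAndIter.client.cancel().get();
{code}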



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * d4d6f9aa45758753d7dfd55dd253c1c540626646 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11701)
 
   * aa9776ffae2b203d7c6980f4f7a32b70c26c6477 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * d4d6f9aa45758753d7dfd55dd253c1c540626646 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11701)
 
   * aa9776ffae2b203d7c6980f4f7a32b70c26c6477 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11708)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14376:
URL: https://github.com/apache/flink/pull/14376#issuecomment-744252765


   
   ## CI report:
   
   * 6b2d6e05fdbd4b579c68007e5001bd1cabe1dd7b Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=10860)
 
   * a9a50e78d0ba6ff84d02fbadee1484970fac2c79 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maosuhan commented on a change in pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


maosuhan commented on a change in pull request #14376:
URL: https://github.com/apache/flink/pull/14376#discussion_r553093799



##
File path: 
flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/PbSchemaValidator.java
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.formats.protobuf;
+
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.types.logical.ArrayType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+import org.apache.flink.table.types.logical.MapType;
+import org.apache.flink.table.types.logical.RowType;
+
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.Descriptors.FieldDescriptor;
+import com.google.protobuf.Descriptors.FieldDescriptor.JavaType;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class PbSchemaValidator {
+   private Descriptors.Descriptor descriptor;
+   private RowType rowType;
+   private Map<JavaType, List<LogicalTypeRoot>> typeMatchMap = new HashMap();
+
+   public PbSchemaValidator(Descriptors.Descriptor descriptor, RowType 
rowType) {
+   this.descriptor = descriptor;
+   this.rowType = rowType;
+   typeMatchMap.put(JavaType.BOOLEAN, 
Collections.singletonList(LogicalTypeRoot.BOOLEAN));
+   typeMatchMap.put(
+   JavaType.BYTE_STRING,
+   Arrays.asList(LogicalTypeRoot.BINARY, 
LogicalTypeRoot.VARBINARY));
+   typeMatchMap.put(JavaType.DOUBLE, 
Collections.singletonList(LogicalTypeRoot.DOUBLE));
+   typeMatchMap.put(JavaType.FLOAT, 
Collections.singletonList(LogicalTypeRoot.FLOAT));
+   typeMatchMap.put(
+   JavaType.ENUM,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(
+   JavaType.STRING,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(JavaType.INT, 
Collections.singletonList(LogicalTypeRoot.INTEGER));
+   typeMatchMap.put(JavaType.LONG, 
Collections.singletonList(LogicalTypeRoot.BIGINT));
+   }
+
+   public Descriptors.Descriptor getDescriptor() {
+   return descriptor;
+   }
+
+   public void setDescriptor(Descriptors.Descriptor descriptor) {
+   this.descriptor = descriptor;
+   }
+
+   public RowType getRowType() {
+   return rowType;
+   }
+
+   public void setRowType(RowType rowType) {
+   this.rowType = rowType;
+   }
+
+   public void validate() {
+   validateTypeMatch(descriptor, rowType);
+   if (!descriptor.getFile().getOptions().getJavaPackage()

Review comment:
   This is a good question. If we omit java_package or set it to a different 
value from package, it is a little complicated to find out the real Java type 
of a nested message.
   For example,
```
package foo.bar;
   
option java_package = "com.example.protos";
   
message A {
   
  message B {}
   
}
   ```
   When we get the Descriptor of B, we cannot get the real Java class name 
directly by using com.google.protobuf.Descriptors.Descriptor.getName(); the 
name comes from package instead of java_package.
   So I just keep it simple by forcing package to be the same as java_package.
   
   I tried to make it flexible to accept different values, but it made the code 
a little complicated and not very compatible.
   
   Do you have any good ideas?
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14571:
URL: https://github.com/apache/flink/pull/14571#issuecomment-755845385


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 34d88a71b61f62f2775a41eb291c017c6d76b8ab UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] godfreyhe closed pull request #14502: [Flink-20766][table-planner-blink] Separate implementations of sort ExecNode and PhysicalNode.

2021-01-06 Thread GitBox


godfreyhe closed pull request #14502:
URL: https://github.com/apache/flink/pull/14502


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-20872) Job resume from history savepoint when failover if checkpoint is disabled

2021-01-06 Thread Liu (Jira)
Liu created FLINK-20872:
---

 Summary: Job resume from history savepoint when failover if 
checkpoint is disabled
 Key: FLINK-20872
 URL: https://issues.apache.org/jira/browse/FLINK-20872
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.12.0, 1.11.0
Reporter: Liu


I have a long running job. Its checkpoint is disabled and restartStrategy is 
set.  One time I upgrade the job through savepoint. One day later, the job is 
failed and restart automatically. But it is resumed from the previous savepoint 
so that the job is heavily lagged.

 

I have checked the code and find that the job will first try to resume from 
checkpoint and then savepoint.
{code:java}
if (checkpointCoordinator != null) {
// check whether we find a valid checkpoint
if (!checkpointCoordinator.restoreInitialCheckpointIfPresent(
new HashSet<>(newExecutionGraph.getAllVertices().values( {

// check whether we can restore from a savepoint
tryRestoreExecutionGraphFromSavepoint(
newExecutionGraph, jobGraph.getSavepointRestoreSettings());
}
}
{code}
For job which checkpoint is disabled, internal failover should not resume from 
previous savepoint, especially the savepoint is done long long ago. In this 
situation, state loss is acceptable but lag is not acceptable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] stevenzwu opened a new pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


stevenzwu opened a new pull request #14571:
URL: https://github.com/apache/flink/pull/14571


   …ient() public
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-20873) Upgrade Calcite version to 1.27

2021-01-06 Thread Jark Wu (Jira)
Jark Wu created FLINK-20873:
---

 Summary: Upgrade Calcite version to 1.27
 Key: FLINK-20873
 URL: https://issues.apache.org/jira/browse/FLINK-20873
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Jark Wu






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


flinkbot commented on pull request #14571:
URL: https://github.com/apache/flink/pull/14571#issuecomment-755845385


   
   ## CI report:
   
   * 54088a5e88bc0f7ab29c21b754a6733c45d1a8bc UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] blublinsky commented on pull request #14512: FLINK-20359 Added Owner Reference to Job Manager in native kubernetes

2021-01-06 Thread GitBox


blublinsky commented on pull request #14512:
URL: https://github.com/apache/flink/pull/14512#issuecomment-755848208


   Oh, my problem is that I am working from a local fork that I updated to 
get the latest version. So my previous fork is gone; that's why it created a new 
commit. Now, from the new fork, I can't push to the same PR, because it 
recognizes that my fork has changed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maosuhan commented on a change in pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


maosuhan commented on a change in pull request #14376:
URL: https://github.com/apache/flink/pull/14376#discussion_r553102794



##
File path: 
flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/PbSchemaValidator.java
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.formats.protobuf;
+
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.types.logical.ArrayType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+import org.apache.flink.table.types.logical.MapType;
+import org.apache.flink.table.types.logical.RowType;
+
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.Descriptors.FieldDescriptor;
+import com.google.protobuf.Descriptors.FieldDescriptor.JavaType;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class PbSchemaValidator {
+   private Descriptors.Descriptor descriptor;
+   private RowType rowType;
+   private Map<JavaType, List<LogicalTypeRoot>> typeMatchMap = new HashMap();
+
+   public PbSchemaValidator(Descriptors.Descriptor descriptor, RowType 
rowType) {
+   this.descriptor = descriptor;
+   this.rowType = rowType;
+   typeMatchMap.put(JavaType.BOOLEAN, 
Collections.singletonList(LogicalTypeRoot.BOOLEAN));
+   typeMatchMap.put(
+   JavaType.BYTE_STRING,
+   Arrays.asList(LogicalTypeRoot.BINARY, 
LogicalTypeRoot.VARBINARY));
+   typeMatchMap.put(JavaType.DOUBLE, 
Collections.singletonList(LogicalTypeRoot.DOUBLE));
+   typeMatchMap.put(JavaType.FLOAT, 
Collections.singletonList(LogicalTypeRoot.FLOAT));
+   typeMatchMap.put(
+   JavaType.ENUM,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(
+   JavaType.STRING,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(JavaType.INT, 
Collections.singletonList(LogicalTypeRoot.INTEGER));
+   typeMatchMap.put(JavaType.LONG, 
Collections.singletonList(LogicalTypeRoot.BIGINT));
+   }
+
+   public Descriptors.Descriptor getDescriptor() {
+   return descriptor;
+   }
+
+   public void setDescriptor(Descriptors.Descriptor descriptor) {
+   this.descriptor = descriptor;
+   }
+
+   public RowType getRowType() {
+   return rowType;
+   }
+
+   public void setRowType(RowType rowType) {
+   this.rowType = rowType;
+   }
+
+   public void validate() {
+   validateTypeMatch(descriptor, rowType);
+   if (!descriptor.getFile().getOptions().getJavaPackage()
+   .equals(descriptor.getFile().getPackage())) {
+   throw new IllegalArgumentException(
+   "java_package and package must be the same in 
proto definition");
+   }
+   if (!descriptor.getFile().getOptions().getJavaMultipleFiles()) {

Review comment:
   The complicated code is below, you can see that "OuterClass" is hard 
coded.
   ```
   public static String getJavaFullName(Descriptors.Descriptor descriptor) {
   String javaPackageName = 
descriptor.getFile().getOptions().getJavaPackage();
   if (descriptor.getFile().getOptions().getJavaMultipleFiles()) {
   //multiple_files=true
   if (null != descriptor.getContainingType()) {
   //nested type
   String parentJavaFullName = 
getJavaFullName(descriptor.getContainingType());
   return parentJavaFullName + "." + descriptor.getName();
   } else {
   //top level message
   return javaPackageName + "." + descriptor.getName();
   }
   } else {
   //multiple_files=false
   if (null != descriptor.getContainingType()) {

[GitHub] [flink] godfreyhe commented on a change in pull request #14562: [FLINK-20738][table-planner-blink] Separate the implementation of batch group aggregate nodes

2021-01-06 Thread GitBox


godfreyhe commented on a change in pull request #14562:
URL: https://github.com/apache/flink/pull/14562#discussion_r552627854



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/utils/ExecNodeUtil.java
##
@@ -35,6 +38,11 @@
 /** An Utility class that helps translating {@link ExecNode} to {@link 
Transformation}. */
 public class ExecNodeUtil {
 
+/** Return bytes size for given option in {@link TableConfig}. */
+public static long getMemorySize(TableConfig tableConfig, ConfigOption<String> option) {
+    return MemorySize.parse(tableConfig.getConfiguration().getString(option)).getBytes();

Review comment:
   It will break the current API, because a compile error will occur when the 
user's existing code is `configuration.set(
 ExecutionConfigOptions.TABLE_EXEC_RESOURCE_HASH_AGG_MEMORY, "128 mb")`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-20870) FlinkKafkaSink

2021-01-06 Thread xx chai (Jira)
xx chai created FLINK-20870:
---

 Summary: FlinkKafkaSink
 Key: FLINK-20870
 URL: https://issues.apache.org/jira/browse/FLINK-20870
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.12.0
 Environment: flink :1.12.0
kafka 2.2.1
Reporter: xx chai


I consume from Kafka and sink to Kafka. Then I split each message into ten 
pieces, and I expect the ten messages to be in one transaction. When the fifth 
message is written to Kafka, I throw an exception, but the first four are 
already in Kafka.
I set some parameters:
properties.setProperty("transactional.id", "cxx");
properties.setProperty("ack", "all");
properties.put("enable.idempotence", true);
properties.put("max.in.flight.requests.per.connection", 5);
properties.put("retries", 2);
properties.setProperty("client.id", "producer-syn-2");
properties.put("isolation.level", "read_committed");



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-18027) ROW value constructor cannot deal with complex expressions

2021-01-06 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-18027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260156#comment-17260156
 ] 

Danny Chen commented on FLINK-18027:


Fixed in https://issues.apache.org/jira/browse/CALCITE-4456; the bug will be 
fixed once we upgrade to Calcite release 1.27.0 or a higher version.

> ROW value constructor cannot deal with complex expressions
> --
>
> Key: FLINK-18027
> URL: https://issues.apache.org/jira/browse/FLINK-18027
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Reporter: Benchao Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.13.0
>
>
> {code:java}
> create table my_source (
> my_row row
> ) with (...);
> create table my_sink (
> my_row row
> ) with (...);
> insert into my_sink
> select ROW(my_row.a, my_row.b) 
> from my_source;{code}
> will throw excepions:
> {code:java}
> Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL 
> parse failed. Encountered "." at line 1, column 18.Exception in thread "main" 
> org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered 
> "." at line 1, column 18.Was expecting one of:    ")" ...    "," ...     at 
> org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:56)
>  at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:64)
>  at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:627)
>  at com.bytedance.demo.KafkaTableSource.main(KafkaTableSource.java:76)Caused 
> by: org.apache.calcite.sql.parser.SqlParseException: Encountered "." at line 
> 1, column 18.Was expecting one of:    ")" ...    "," ...     at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:416)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:201)
>  at 
> org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:148) 
> at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:163) at 
> org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:188) at 
> org.apache.flink.table.planner.calcite.CalciteParser.parse(CalciteParser.java:54)
>  ... 3 moreCaused by: org.apache.flink.sql.parser.impl.ParseException: 
> Encountered "." at line 1, column 18.Was expecting one of:    ")" ...    "," 
> ...     at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:36161)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:35975)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.ParenthesizedSimpleIdentifierList(FlinkSqlParserImpl.java:21432)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression3(FlinkSqlParserImpl.java:17164)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2b(FlinkSqlParserImpl.java:16820)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression2(FlinkSqlParserImpl.java:16861)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.Expression(FlinkSqlParserImpl.java:16792)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectExpression(FlinkSqlParserImpl.java:11091)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectItem(FlinkSqlParserImpl.java:10293)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SelectList(FlinkSqlParserImpl.java:10267)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlSelect(FlinkSqlParserImpl.java:6943)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQuery(FlinkSqlParserImpl.java:658)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.LeafQueryOrExpr(FlinkSqlParserImpl.java:16775)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.QueryOrExpr(FlinkSqlParserImpl.java:16238)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.OrderedQueryOrExpr(FlinkSqlParserImpl.java:532)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmt(FlinkSqlParserImpl.java:3761)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtEof(FlinkSqlParserImpl.java:3800)
>  at 
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtEof(FlinkSqlParserImpl.java:248)
>  at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:161) 
> ... 5 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14557: [FLINK-20781] Avoid NPE after SourceOperator is closed.

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14557:
URL: https://github.com/apache/flink/pull/14557#issuecomment-754510928


   
   ## CI report:
   
   * 4d66e1dd7d08d10ffb01a48912657d365e37fed8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11642)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-20861) Provide an option for serializing DECIMALs in JSON as plain number instead of scientific notation

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260168#comment-17260168
 ] 

Jark Wu commented on FLINK-20861:
-

Feel free to open a pull request. 

> Provide an option for serializing DECIMALs in JSON as plain number instead of 
> scientific notation
> -
>
> Key: FLINK-20861
> URL: https://issues.apache.org/jira/browse/FLINK-20861
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Reporter: Q Kang
>Assignee: Q Kang
>Priority: Minor
>
> When using DECIMAL types in Flink SQL along with JSON format, it is quite 
> common to see that some large values are written out as scientific notation. 
> For example:
> Definition: `orderId DECIMAL(20, 0)`
> Input: `{"orderId":454621864049246170}`
> Output (without transformations): `{"orderId":4.5462186404924617E+17}`
> However, values in plain numbers are easier to understand and more convenient 
> for the case shown above. So we can provide a boolean option (say 
> `json.use-plain-decimals`?) to make this behavior tunable.
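
For reference, the behavior such an option would toggle can be sketched with 
Jackson, the JSON library the Flink format uses internally; this is a minimal 
standalone sketch, not the proposed Flink option itself:

{code:java}
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.math.BigDecimal;

public class PlainDecimalSketch {
    public static void main(String[] args) throws Exception {
        // a decimal that carries an exponent in its internal representation
        BigDecimal orderId = new BigDecimal("4.5462186404924617E+17");

        // default: serialized via BigDecimal.toString(), keeping the notation
        System.out.println(new ObjectMapper().writeValueAsString(orderId));
        // -> 4.5462186404924617E+17

        // with WRITE_BIGDECIMAL_AS_PLAIN: serialized via toPlainString()
        ObjectMapper plain = new ObjectMapper()
                .enable(JsonGenerator.Feature.WRITE_BIGDECIMAL_AS_PLAIN);
        System.out.println(plain.writeValueAsString(orderId));
        // -> 454621864049246170
    }
}
{code}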



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20861) Provide an option for serializing DECIMALs in JSON as plain number instead of scientific notation

2021-01-06 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-20861:

Fix Version/s: 1.13.0

> Provide an option for serializing DECIMALs in JSON as plain number instead of 
> scientific notation
> -
>
> Key: FLINK-20861
> URL: https://issues.apache.org/jira/browse/FLINK-20861
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Reporter: Q Kang
>Assignee: Q Kang
>Priority: Minor
> Fix For: 1.13.0
>
>
> When using DECIMAL types in Flink SQL along with JSON format, it is quite 
> common to see that some large values are written out as scientific notation. 
> For example:
> Definition: `orderId DECIMAL(20, 0)`
> Input: `{"orderId":454621864049246170}`
> Output (without transformations): `{"orderId":4.5462186404924617E+17}`
> However, values in plain numbers are easier to understand and more convenient 
> for the case shown above. So we can provide a boolean option (say 
> `json.use-plain-decimals`?) to make this behavior tunable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-20832) flink.apache.org site Typography exception

2021-01-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-20832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

谢波 updated FLINK-20832:
---
Component/s: API / Core

> flink.apache.org site Typography exception
> --
>
> Key: FLINK-20832
> URL: https://issues.apache.org/jira/browse/FLINK-20832
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core, Documentation
>Reporter: 谢波
>Priority: Blocker
>
> The typesetting of the official website is broken for users in China, because 
> they cannot load the 
> https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js CDN resource.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14376:
URL: https://github.com/apache/flink/pull/14376#issuecomment-744252765


   
   ## CI report:
   
   * a9a50e78d0ba6ff84d02fbadee1484970fac2c79 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11709)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * aa9776ffae2b203d7c6980f4f7a32b70c26c6477 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11708)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] maosuhan commented on a change in pull request #14376: [FLINK-18202][PB format] New Format of protobuf

2021-01-06 Thread GitBox


maosuhan commented on a change in pull request #14376:
URL: https://github.com/apache/flink/pull/14376#discussion_r553094997



##
File path: 
flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/PbSchemaValidator.java
##
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.formats.protobuf;
+
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.types.logical.ArrayType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+import org.apache.flink.table.types.logical.MapType;
+import org.apache.flink.table.types.logical.RowType;
+
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.Descriptors.FieldDescriptor;
+import com.google.protobuf.Descriptors.FieldDescriptor.JavaType;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+public class PbSchemaValidator {
+   private Descriptors.Descriptor descriptor;
+   private RowType rowType;
+   private Map<JavaType, List<LogicalTypeRoot>> typeMatchMap = new 
HashMap<>();
+
+   public PbSchemaValidator(Descriptors.Descriptor descriptor, RowType 
rowType) {
+   this.descriptor = descriptor;
+   this.rowType = rowType;
+   typeMatchMap.put(JavaType.BOOLEAN, 
Collections.singletonList(LogicalTypeRoot.BOOLEAN));
+   typeMatchMap.put(
+   JavaType.BYTE_STRING,
+   Arrays.asList(LogicalTypeRoot.BINARY, 
LogicalTypeRoot.VARBINARY));
+   typeMatchMap.put(JavaType.DOUBLE, 
Collections.singletonList(LogicalTypeRoot.DOUBLE));
+   typeMatchMap.put(JavaType.FLOAT, 
Collections.singletonList(LogicalTypeRoot.FLOAT));
+   typeMatchMap.put(
+   JavaType.ENUM,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(
+   JavaType.STRING,
+   Arrays.asList(LogicalTypeRoot.VARCHAR, 
LogicalTypeRoot.CHAR));
+   typeMatchMap.put(JavaType.INT, 
Collections.singletonList(LogicalTypeRoot.INTEGER));
+   typeMatchMap.put(JavaType.LONG, 
Collections.singletonList(LogicalTypeRoot.BIGINT));
+   }
+
+   public Descriptors.Descriptor getDescriptor() {
+   return descriptor;
+   }
+
+   public void setDescriptor(Descriptors.Descriptor descriptor) {
+   this.descriptor = descriptor;
+   }
+
+   public RowType getRowType() {
+   return rowType;
+   }
+
+   public void setRowType(RowType rowType) {
+   this.rowType = rowType;
+   }
+
+   public void validate() {
+   validateTypeMatch(descriptor, rowType);
+   if (!descriptor.getFile().getOptions().getJavaPackage()
+   .equals(descriptor.getFile().getPackage())) {
+   throw new IllegalArgumentException(
+   "java_package and package must be the same in 
proto definition");
+   }
+   if (!descriptor.getFile().getOptions().getJavaMultipleFiles()) {

Review comment:
   It is the same issue as java_package/package: if java_multiple_files=false, 
the outer class name will also be XXXOuter, and the user can set 
java_outer_classname to override the outer class name. Mixed with 
java_package/package, it is complicated to get the real Java type of a nested 
message type.
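
   For context, a rough sketch of why deriving the generated Java class name 
is fiddly; `deriveOuterClassName` is a hypothetical placeholder, since 
protoc's default outer-class naming has more corner cases than shown here:

```java
import com.google.protobuf.DescriptorProtos.FileOptions;
import com.google.protobuf.Descriptors;

public final class PbJavaTypeSketch {

    /** Best-effort guess of the generated Java class of a top-level message. */
    static String javaTypeOf(Descriptors.Descriptor descriptor) {
        Descriptors.FileDescriptor file = descriptor.getFile();
        FileOptions options = file.getOptions();
        String javaPackage =
                options.hasJavaPackage() ? options.getJavaPackage() : file.getPackage();
        if (options.getJavaMultipleFiles()) {
            // every message is generated as its own top-level class
            return javaPackage + "." + descriptor.getName();
        }
        // otherwise messages nest inside an outer class, named either by
        // java_outer_classname or derived from the .proto file name by protoc
        String outerClass =
                options.hasJavaOuterClassname()
                        ? options.getJavaOuterClassname()
                        : deriveOuterClassName(file.getName());
        return javaPackage + "." + outerClass + "$" + descriptor.getName();
    }

    private static String deriveOuterClassName(String protoFileName) {
        // placeholder: protoc camel-cases the file name and appends a suffix on
        // name collisions; a faithful implementation must replicate those rules
        throw new UnsupportedOperationException("illustration only");
    }
}
```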





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Closed] (FLINK-20766) Separate the implementation of sort nodes

2021-01-06 Thread godfrey he (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he closed FLINK-20766.
--
Resolution: Fixed

Fixed in 1.13.0: 6c75072f..7668e4e3

> Separate the implementation of sort nodes
> -
>
> Key: FLINK-20766
> URL: https://issues.apache.org/jira/browse/FLINK-20766
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>Priority: Major
> Fix For: 1.13.0
>
>
> includes:
> StreamExecSort
> StreamExecSortLimit
> StreamExecTemporalSort
> BatchExecSort
> BatchExecSortLimit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14571:
URL: https://github.com/apache/flink/pull/14571#issuecomment-755845385


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 34d88a71b61f62f2775a41eb291c017c6d76b8ab Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11710)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * aa9776ffae2b203d7c6980f4f7a32b70c26c6477 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11708)
 
   * 1e34e16b05f86ea121ecefb4d7ff49f2b42b23a7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11712)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #14538: [FLINK-20681][yarn] Support remote path for shipping archives and files

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14538:
URL: https://github.com/apache/flink/pull/14538#issuecomment-752991523


   
   ## CI report:
   
   * af57879d33f5d7e1717518a475fd601916c5c7c1 UNKNOWN
   * f1c012ecf5d3d91f13bb9bfa24ea7c31c775c9c7 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11698)
 
   * d3230f5b6087c8c3c3453c9d69a68e69305010b4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-20870) FlinkKafkaSink

2021-01-06 Thread xx chai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xx chai updated FLINK-20870:

Issue Type: Bug  (was: Improvement)

> FlinkKafkaSink
> --
>
> Key: FLINK-20870
> URL: https://issues.apache.org/jira/browse/FLINK-20870
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Affects Versions: 1.12.0
> Environment: flink :1.12.0
> kafka 2.2.1
>Reporter: xx chai
>Priority: Major
>
> I consume from Kafka and sink to Kafka, then I split the message into ten 
> pieces. I expect the ten messages to be in one transaction. When the fifth 
> message is sunk to Kafka, I throw an exception, but the first four are already 
> in Kafka.
> I set some parameters :
>properties.setProperty("transactional.id", "cxx");
> properties.setProperty("ack", "all");
> properties.put("enable.idempotence",true);
> properties.put("max.in.flight.requests.per.connection",5);
> properties.put("retries", 2);
> properties.setProperty("client.id", "producer-syn-2");
> properties.put("isolation.level","read_committed");



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-20871) Make DataStream#executeAndCollectWithClient public

2021-01-06 Thread Steven Zhen Wu (Jira)
Steven Zhen Wu created FLINK-20871:
--

 Summary: Make DataStream#executeAndCollectWithClient public
 Key: FLINK-20871
 URL: https://issues.apache.org/jira/browse/FLINK-20871
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.12.0
Reporter: Steven Zhen Wu


Right now, `DataStreamUtils#collectWithClient` is marked as deprecated in favor 
of `DataStream#executeAndCollect()`. However, some integration tests (e.g. 
[FileSourceTextLinesITCase](https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files/src/test/java/org/apache/flink/connector/file/src/FileSourceTextLinesITCase.java#L187))
 need the `DataStream#executeAndCollectWithClient` API to obtain a JobClient and 
cancel the job after collecting the required output from an unbounded source.
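
For illustration, the test pattern described above, sketched against the 
deprecated utility (the job name and expected record count are placeholders):

{code:java}
import org.apache.flink.streaming.api.datastream.ClientAndIterator;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;

import java.util.ArrayList;
import java.util.List;

public class CollectAndCancelSketch {

    static List<String> collectThenCancel(DataStream<String> stream, int expected)
            throws Exception {
        // the deprecated utility this ticket wants a public
        // DataStream#executeAndCollectWithClient replacement for
        ClientAndIterator<String> jobAndResults =
                DataStreamUtils.collectWithClient(stream, "unbounded-source-test");

        List<String> results = new ArrayList<>();
        while (results.size() < expected && jobAndResults.iterator.hasNext()) {
            results.add(jobAndResults.iterator.next());
        }

        // the JobClient is needed exactly here: the source is unbounded, so the
        // job must be cancelled once enough output has been collected
        jobAndResults.client.cancel().get();
        return results;
    }
}
{code}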



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on pull request #14571: [FLINK-20871][API/DataStream] Make DataStream#executeAndCollectWithCl…

2021-01-06 Thread GitBox


flinkbot commented on pull request #14571:
URL: https://github.com/apache/flink/pull/14571#issuecomment-755840587


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 54088a5e88bc0f7ab29c21b754a6733c45d1a8bc (Thu Jan 07 
02:22:19 UTC 2021)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-20871).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-20795) add a parameter to decide whether print dirty record when `ignore-parse-errors` is true

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260178#comment-17260178
 ] 

Jark Wu edited comment on FLINK-20795 at 1/7/21, 2:38 AM:
--

If we want to refactor this configuration, I would suggest investigating how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

See 
https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/DataFrameReader.html
 and 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ParseMode.scala


was (Author: jark):
If we want to refactor this configuration. I would suggest to investigate how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

See 
https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/DataFrameReader.html
 and 

> add a parameter to decide whether print dirty record when 
> `ignore-parse-errors` is true
> ---
>
> Key: FLINK-20795
> URL: https://issues.apache.org/jira/browse/FLINK-20795
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.13.0
>Reporter: zoucao
>Priority: Major
>
> add a parameter to decide whether to print dirty data when 
> `ignore-parse-errors`=true; some users want to keep their task stable and 
> still see the dirty records so they can fix the upstream, too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-20795) add a parameter to decide whether print dirty record when `ignore-parse-errors` is true

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260178#comment-17260178
 ] 

Jark Wu edited comment on FLINK-20795 at 1/7/21, 2:38 AM:
--

If we want to refactor this configuration, I would suggest first investigating 
how other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

See 
https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/DataFrameReader.html
 and 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ParseMode.scala


was (Author: jark):
If we want to refactor this configuration. I would suggest to investigate how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

See 
https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/DataFrameReader.html
 and 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ParseMode.scala

> add a parameter to decide whether print dirty record when 
> `ignore-parse-errors` is true
> ---
>
> Key: FLINK-20795
> URL: https://issues.apache.org/jira/browse/FLINK-20795
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.13.0
>Reporter: zoucao
>Priority: Major
>
> add a parameter to decide whether to print dirty data when 
> `ignore-parse-errors`=true; some users want to keep their task stable and 
> still see the dirty records so they can fix the upstream, too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-20795) add a parameter to decide whether print dirty record when `ignore-parse-errors` is true

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260178#comment-17260178
 ] 

Jark Wu edited comment on FLINK-20795 at 1/7/21, 2:37 AM:
--

If we want to refactor this configuration, I would suggest investigating how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

See 
https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/DataFrameReader.html
 and 


was (Author: jark):
If we want to refactor this configuration. I would suggest to investigate how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.

> add a parameter to decide whether print dirty record when 
> `ignore-parse-errors` is true
> ---
>
> Key: FLINK-20795
> URL: https://issues.apache.org/jira/browse/FLINK-20795
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.13.0
>Reporter: zoucao
>Priority: Major
>
> add a parameter to decide whether to print dirty data when 
> `ignore-parse-errors`=true; some users want to keep their task stable and 
> still see the dirty records so they can fix the upstream, too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-20795) add a parameter to decide whether print dirty record when `ignore-parse-errors` is true

2021-01-06 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260178#comment-17260178
 ] 

Jark Wu commented on FLINK-20795:
-

If we want to refactor this configuration, I would suggest investigating how 
other projects handle this, e.g. Spark, Hive, Presto, Kafka. 

For example, Spark provides a ParseMode for dealing with corrupt records during 
parsing, it allows the following modes:

- PERMISSIVE : sets other fields to null when it meets a corrupted record, and 
puts the malformed string into a new field configured by 
columnNameOfCorruptRecord. When a schema is set by user, it sets null for extra 
fields.
- DROPMALFORMED : ignores the whole corrupted records.
- FAILFAST : throws an exception when it meets corrupted records.
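
To make the modes concrete, a rough sketch of how they could look in a Flink 
deserializer; the enum, method, and helper below are illustrative only, not an 
agreed-on API:

{code:java}
import java.io.IOException;

public class ParseModeSketch {

    enum ParseMode { PERMISSIVE, DROPMALFORMED, FAILFAST }

    /** Returns a replacement record, or null when the record should be dropped. */
    static Object handleCorruptRecord(byte[] message, Exception cause, ParseMode mode)
            throws IOException {
        switch (mode) {
            case PERMISSIVE:
                // null fields plus the raw record kept in a dedicated column,
                // following Spark's columnNameOfCorruptRecord
                return rowWithNullsAndRawRecord(message);
            case DROPMALFORMED:
                return null; // caller silently skips dropped records
            case FAILFAST:
                throw new IOException("Corrupt record encountered", cause);
            default:
                throw new IllegalStateException("Unknown mode: " + mode);
        }
    }

    private static Object rowWithNullsAndRawRecord(byte[] message) {
        // hypothetical: build a row of nulls that carries the malformed bytes
        return new Object[] {null, message};
    }
}
{code}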

> add a parameter to decide whether print dirty record when 
> `ignore-parse-errors` is true
> ---
>
> Key: FLINK-20795
> URL: https://issues.apache.org/jira/browse/FLINK-20795
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
> SQL / Ecosystem
>Affects Versions: 1.13.0
>Reporter: zoucao
>Priority: Major
>
> add a parameter to decide whether print dirty data when 
> `ignore-parse-errors`=true, some users want to make his task stability and 
> know the dirty record to fix the upstream, too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] xintongsong commented on a change in pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

2021-01-06 Thread GitBox


xintongsong commented on a change in pull request #14561:
URL: https://github.com/apache/flink/pull/14561#discussion_r553079447



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java
##
@@ -275,7 +275,10 @@ public void 
processResourceRequirements(ResourceRequirements resourceRequirement
  */
 @Override
 public boolean registerTaskManager(
-final TaskExecutorConnection taskExecutorConnection, SlotReport 
initialSlotReport) {
+final TaskExecutorConnection taskExecutorConnection,
+SlotReport initialSlotReport,
+ResourceProfile totalResourceProfile,
+ResourceProfile defaultSlotResourceProfile) {

Review comment:
   JavaDoc should be updated accordingly.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImpl.java
##
@@ -459,7 +459,10 @@ public boolean unregisterSlotRequest(AllocationID 
allocationId) {
  */
 @Override
 public boolean registerTaskManager(
-final TaskExecutorConnection taskExecutorConnection, SlotReport 
initialSlotReport) {
+final TaskExecutorConnection taskExecutorConnection,
+SlotReport initialSlotReport,
+ResourceProfile totalResourceProfile,
+ResourceProfile defaultSlotResourceProfile) {

Review comment:
   JavaDoc should be updated accordingly.

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java
##
@@ -464,7 +464,11 @@ private void stopResourceManagerServices() throws 
Exception {
 taskExecutors.get(taskManagerResourceId);
 
 if 
(workerTypeWorkerRegistration.getInstanceID().equals(taskManagerRegistrationId))
 {
-if (slotManager.registerTaskManager(workerTypeWorkerRegistration, 
slotReport)) {
+if (slotManager.registerTaskManager(
+workerTypeWorkerRegistration,
+slotReport,
+workerTypeWorkerRegistration.getTotalResourceProfile(),
+
workerTypeWorkerRegistration.getDefaultSlotResourceProfile())) {

Review comment:
   It's a bit weird that we have to pass in 
`workerTypeWorkerRegistration.getTotalResourceProfile()` and 
`workerTypeWorkerRegistration.getDefaultSlotResourceProfile()` when we have 
already passed in `workerTypeWorkerRegistration`.
   
   Despite the names, I think the boundary between `TaskExecutorConnection` and 
`WorkerRegistration` is that, the former contains information needed in 
`SlotManager` while the latter contains additional information needed in 
`ResourceManager`. (The name `TaskExecutorConnection` is probably because 
previously we need nothing more than the IDs and the RPC gateway in 
`SlotManager`.)
   
   Since the total and default slot resource profiles are only used in 
`SlotManager`, we probably should move them into `TaskExecutorConnection`. We 
may also rename the two classes as follows to explicitly suggest their scope of 
usage.
   * `WorkerRegistration` -> `ResourceManagerWorkerRegistration`
   * `TaskExecutorConnection` -> `SlotManagerWorkerRegistration`
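
   For instance, a sketch of what the suggestion could look like; the names 
follow the proposed renames and this is not the actual Flink class:

```java
import org.apache.flink.runtime.clusterframework.types.ResourceID;
import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
import org.apache.flink.runtime.instance.InstanceID;
import org.apache.flink.runtime.taskexecutor.TaskExecutorGateway;

// bundles everything the SlotManager needs for registration, so that
// registerTaskManager(...) no longer takes the profiles as extra arguments
public class SlotManagerWorkerRegistration {

    private final InstanceID instanceId;
    private final ResourceID resourceId;
    private final TaskExecutorGateway taskExecutorGateway;
    private final ResourceProfile totalResourceProfile;
    private final ResourceProfile defaultSlotResourceProfile;

    public SlotManagerWorkerRegistration(
            InstanceID instanceId,
            ResourceID resourceId,
            TaskExecutorGateway taskExecutorGateway,
            ResourceProfile totalResourceProfile,
            ResourceProfile defaultSlotResourceProfile) {
        this.instanceId = instanceId;
        this.resourceId = resourceId;
        this.taskExecutorGateway = taskExecutorGateway;
        this.totalResourceProfile = totalResourceProfile;
        this.defaultSlotResourceProfile = defaultSlotResourceProfile;
    }

    public ResourceProfile getTotalResourceProfile() {
        return totalResourceProfile;
    }

    public ResourceProfile getDefaultSlotResourceProfile() {
        return defaultSlotResourceProfile;
    }
}
```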





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wenlong88 commented on a change in pull request #14569: [FLINK-20856][table-planner-blink] Separate the implementation of stream window aggregate nodes

2021-01-06 Thread GitBox


wenlong88 commented on a change in pull request #14569:
URL: https://github.com/apache/flink/pull/14569#discussion_r553085276



##
File path: 
flink-table/flink-table-planner-blink/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/stream/StreamExecGroupWindowAggregate.java
##
@@ -0,0 +1,373 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.stream;
+
+import org.apache.flink.api.dag.Transformation;
+import org.apache.flink.streaming.api.transformations.OneInputTransformation;
+import org.apache.flink.table.api.TableConfig;
+import org.apache.flink.table.api.TableException;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.expressions.FieldReferenceExpression;
+import org.apache.flink.table.expressions.ValueLiteralExpression;
+import 
org.apache.flink.table.planner.calcite.FlinkRelBuilder.PlannerNamedWindowProperty;
+import org.apache.flink.table.planner.codegen.CodeGeneratorContext;
+import org.apache.flink.table.planner.codegen.EqualiserCodeGenerator;
+import org.apache.flink.table.planner.codegen.agg.AggsHandlerCodeGenerator;
+import org.apache.flink.table.planner.delegation.PlannerBase;
+import org.apache.flink.table.planner.expressions.PlannerWindowProperty;
+import org.apache.flink.table.planner.plan.logical.LogicalWindow;
+import org.apache.flink.table.planner.plan.logical.SessionGroupWindow;
+import org.apache.flink.table.planner.plan.logical.SlidingGroupWindow;
+import org.apache.flink.table.planner.plan.logical.TumblingGroupWindow;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecEdge;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNode;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase;
+import org.apache.flink.table.planner.plan.utils.AggregateInfoList;
+import org.apache.flink.table.planner.plan.utils.KeySelectorUtil;
+import org.apache.flink.table.planner.plan.utils.WindowEmitStrategy;
+import org.apache.flink.table.planner.utils.JavaScalaConversionUtil;
+import org.apache.flink.table.runtime.generated.GeneratedClass;
+import 
org.apache.flink.table.runtime.generated.GeneratedNamespaceAggsHandleFunction;
+import 
org.apache.flink.table.runtime.generated.GeneratedNamespaceTableAggsHandleFunction;
+import org.apache.flink.table.runtime.generated.GeneratedRecordEqualiser;
+import org.apache.flink.table.runtime.keyselector.RowDataKeySelector;
+import org.apache.flink.table.runtime.operators.window.CountWindow;
+import org.apache.flink.table.runtime.operators.window.TimeWindow;
+import org.apache.flink.table.runtime.operators.window.WindowOperator;
+import org.apache.flink.table.runtime.operators.window.WindowOperatorBuilder;
+import org.apache.flink.table.runtime.types.LogicalTypeDataTypeConverter;
+import org.apache.flink.table.runtime.typeutils.InternalTypeInfo;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.RowType;
+
+import org.apache.calcite.rel.core.AggregateCall;
+import org.apache.calcite.tools.RelBuilder;
+import org.apache.commons.lang3.ArrayUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.hasRowIntervalType;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.hasTimeIntervalType;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isProctimeAttribute;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isRowtimeAttribute;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.isTableAggregate;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.toDuration;
+import static org.apache.flink.table.planner.plan.utils.AggregateUtil.toLong;
+import static 
org.apache.flink.table.planner.plan.utils.AggregateUtil.transformToStreamAggregateInfoList;
+
+/** Stream {@link ExecNode} for either group window aggregate or group window 
table aggregate. */
+public class 

[GitHub] [flink] flinkbot edited a comment on pull request #14477: [FLINK-16005][yarn] Support yarn and hadoop config override

2021-01-06 Thread GitBox


flinkbot edited a comment on pull request #14477:
URL: https://github.com/apache/flink/pull/14477#issuecomment-750231028


   
   ## CI report:
   
   * aa9776ffae2b203d7c6980f4f7a32b70c26c6477 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11708)
 
   * 1e34e16b05f86ea121ecefb4d7ff49f2b42b23a7 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



