[GitHub] [flink] lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326479552
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -88,17 +86,23 @@
private transient InputFormat mapredInputFormat;
private transient HiveTablePartition hiveTablePartition;
 
+   // indices of fields to be returned, with projection applied (if any)
+   // TODO: push projection into underlying input format that supports it
 
 Review comment:
   > can RecordReader support project push down?
   
   I guess it depends on the implementation. For example, Hive has its own [ORC 
input 
format](https://github.com/apache/hive/blob/rel/release-2.3.4/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java).
 And it seems Hive pushes down the projection via 
[configurations](https://github.com/apache/hive/blob/rel/release-2.3.4/serde/src/java/org/apache/hadoop/hive/serde2/ColumnProjectionUtils.java#L52).
   
   > Can we support it now?
   
  I'd rather leave it to a separate task because it needs further 
investigation. Even though this PR doesn't reduce the amount of data to read, 
it avoids the cost of inspecting unused columns, which is good for queries 
selecting only a few columns from tables with lots of columns. And for input 
formats that don't support projection push down (e.g. text), this is probably 
the best we can do.
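To make the trade-off concrete, here is a minimal, hypothetical sketch in plain Java (no Hive or Flink dependencies; `project` and its indices are illustrative names, not the actual HiveTableInputFormat API): the full row is still read, but only the projected field indices are inspected and returned.

```java
import java.util.Arrays;

public class ProjectionSketch {

    // Hypothetical helper mirroring the reviewed field "indices of fields to
    // be returned": every column is still read from storage, but only the
    // selected field indices are inspected and handed back to the caller.
    public static Object[] project(Object[] fullRow, int[] selectedFields) {
        Object[] projected = new Object[selectedFields.length];
        for (int i = 0; i < selectedFields.length; i++) {
            projected[i] = fullRow[selectedFields[i]];
        }
        return projected;
    }

    public static void main(String[] args) {
        Object[] row = {1, "name", 3.14, true};
        // keep only columns 0 and 2
        System.out.println(Arrays.toString(project(row, new int[]{0, 2}))); // [1, 3.14]
    }
}
```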


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zhuzhurk commented on a change in pull request #9723: [FLINK-14114] [client] Shift down ClusterClient#timeout to RestClusterClient

2019-09-19 Thread GitBox
zhuzhurk commented on a change in pull request #9723: [FLINK-14114] [client] 
Shift down ClusterClient#timeout to RestClusterClient
URL: https://github.com/apache/flink/pull/9723#discussion_r326478660
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/program/rest/RestClusterClient.java
 ##
 @@ -125,13 +126,18 @@
 import java.util.function.Supplier;
 import java.util.stream.Collectors;
 
+import scala.concurrent.duration.FiniteDuration;
+
 /**
  * A {@link ClusterClient} implementation that communicates via HTTP REST 
requests.
  */
 public class RestClusterClient<T> extends ClusterClient<T> implements 
NewClusterClient {
 
private final RestClusterClientConfiguration 
restClusterClientConfiguration;
 
+   /** Timeout for futures. */
+   private final FiniteDuration timeout;
 
 Review comment:
   @TisonKun Maybe it's better to focus on the shifting work in this PR. 
   So I'd like to keep the type FiniteDuration here and change it to 
java.time.Duration in the PR for FLINK-14070.
   WDYT?
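For context, a FiniteDuration is essentially a (length, TimeUnit) pair, so the later migration to java.time.Duration amounts to a small conversion. A hedged sketch, assuming a nanosecond round-trip is acceptable (illustrative only, not necessarily what FLINK-14070 implemented):

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

public class TimeoutSketch {

    // Hypothetical conversion: a Scala FiniteDuration carries a length and a
    // TimeUnit, so an equivalent java.time.Duration can be built without any
    // Scala dependency.
    public static Duration toJavaDuration(long length, TimeUnit unit) {
        return Duration.ofNanos(unit.toNanos(length));
    }

    public static void main(String[] args) {
        Duration timeout = toJavaDuration(60, TimeUnit.SECONDS);
        System.out.println(timeout.toMillis()); // 60000
    }
}
```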






[GitHub] [flink] flinkbot commented on issue #9723: [FLINK-14114] [client] Shift down ClusterClient#timeout to RestClusterClient

2019-09-19 Thread GitBox
flinkbot commented on issue #9723: [FLINK-14114] [client] Shift down 
ClusterClient#timeout to RestClusterClient
URL: https://github.com/apache/flink/pull/9723#issuecomment-533414803
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 7c3eef29f2f40406bdd24f9c78889f5a68853308 (Fri Sep 20 
05:44:02 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Updated] (FLINK-14114) Shift down ClusterClient#timeout to RestClusterClient

2019-09-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14114:
---
Labels: pull-request-available  (was: )

> Shift down ClusterClient#timeout to RestClusterClient
> -
>
> Key: FLINK-14114
> URL: https://issues.apache.org/jira/browse/FLINK-14114
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: tison
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>
> {{ClusterClient#timeout}} is only used in {{RestClusterClient}}; even without 
> this prerequisite, we can always shift the {{timeout}} field down to subclasses 
> of {{ClusterClient}}. It is a step towards an interface-ized {{ClusterClient}}. 
> As a side effect, we could reduce the dependency on parsing durations with 
> Scala's Duration on the fly.
> CC [~till.rohrmann] [~zhuzh]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhuzhurk opened a new pull request #9723: [FLINK-14114] [client] Shift down ClusterClient#timeout to RestClusterClient

2019-09-19 Thread GitBox
zhuzhurk opened a new pull request #9723: [FLINK-14114] [client] Shift down 
ClusterClient#timeout to RestClusterClient
URL: https://github.com/apache/flink/pull/9723
 
 
   
   ## What is the purpose of the change
   
   *ClusterClient#timeout is only used in RestClusterClient. Shifting it down 
to RestClusterClient is a step towards an interface-ized ClusterClient.*
   
   ## Brief change log
   
 - *timeout field is moved from ClusterClient to RestClusterClient*
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453915)
   * 2f413575dd750406c2dfe2f04b93283ca629fbc9 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128456492)
   




[GitHub] [flink] flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper 
and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#issuecomment-516849331
 
 
   
   ## CI report:
   
   * 9b5b5ff5df053498e491d43a04f44d5ba452579c : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/121415160)
   * eb6cf333f4331ed20d7e22a056cbd3c9b61f31f8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/124204150)
   * 6440190df24a227f96e2b917acecccee04ab981b : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128458494)
   




[GitHub] [flink] flinkbot edited a comment on issue #9184: [FLINK-13339][ml] Add an implementation of pipeline's api

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9184: [FLINK-13339][ml] Add an 
implementation of pipeline's api
URL: https://github.com/apache/flink/pull/9184#issuecomment-513425405
 
 
   
   ## CI report:
   
   * bd7ade5e0b57dc8577d7f864afcbbb24c2513e56 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/119869757)
   * 6a187929b931a4bd8cd7dbd0ec3d2c5a7a98278d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121411219)
   * 4f2afd322f96aeaba6d9c0b67a82a051eff22df0 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121723032)
   * c4fc6905d3adf3ad9ff6f58c5d4f472fdfa7d52b : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121724039)
   * b1855d5dfff586e41f152a6861ae04f30042cfde : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/127193220)
   * 54210a098757726e439e71797c44a6c22d48bc27 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128328713)
   * b089ac03c90e327467a903fcba5d61fbfdf2583e : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128458501)
   




[GitHub] [flink] flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper 
and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#issuecomment-516849331
 
 
   
   ## CI report:
   
   * 9b5b5ff5df053498e491d43a04f44d5ba452579c : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/121415160)
   * eb6cf333f4331ed20d7e22a056cbd3c9b61f31f8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/124204150)
   * 6440190df24a227f96e2b917acecccee04ab981b : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128458494)
   




[GitHub] [flink] flinkbot edited a comment on issue #9184: [FLINK-13339][ml] Add an implementation of pipeline's api

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9184: [FLINK-13339][ml] Add an 
implementation of pipeline's api
URL: https://github.com/apache/flink/pull/9184#issuecomment-513425405
 
 
   
   ## CI report:
   
   * bd7ade5e0b57dc8577d7f864afcbbb24c2513e56 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/119869757)
   * 6a187929b931a4bd8cd7dbd0ec3d2c5a7a98278d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121411219)
   * 4f2afd322f96aeaba6d9c0b67a82a051eff22df0 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121723032)
   * c4fc6905d3adf3ad9ff6f58c5d4f472fdfa7d52b : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121724039)
   * b1855d5dfff586e41f152a6861ae04f30042cfde : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/127193220)
   * 54210a098757726e439e71797c44a6c22d48bc27 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128328713)
   * b089ac03c90e327467a903fcba5d61fbfdf2583e : UNKNOWN
   




[GitHub] [flink] xuyang1706 commented on issue #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on issue #9300: [FLINK-13513][ml] Add the FlatMapper and 
related classes for later al…
URL: https://github.com/apache/flink/pull/9300#issuecomment-533404154
 
 
   > Thanks for the contribution @xuyang1706 . I took a quick look and I think 
there are some confusions I have with the Javadocs. Can you please take a look 
and let me know if any of my questions make sense?
   
   @walterddr , thanks for your comments and suggestions. I have refined the 
JavaDocs, added more descriptions, removed the unnecessary method, and renamed 
the interface.




[GitHub] [flink] flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9300: [FLINK-13513][ml] Add the FlatMapper 
and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#issuecomment-516849331
 
 
   
   ## CI report:
   
   * 9b5b5ff5df053498e491d43a04f44d5ba452579c : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/121415160)
   * eb6cf333f4331ed20d7e22a056cbd3c9b61f31f8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/124204150)
   * 6440190df24a227f96e2b917acecccee04ab981b : UNKNOWN
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add 
the FlatMapper and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#discussion_r326469340
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/mapper/MapOpInterface.java
 ##
 @@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.mapper;
+
+import org.apache.flink.types.Row;
+
+/**
+ * Interface for the map operation of Row type data.
+ */
+public interface MapOpInterface {
 
 Review comment:
   Thanks for your advice; I renamed this interface to `MapOperable`.




[GitHub] [flink] xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add 
the FlatMapper and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#discussion_r326469205
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/mapper/FlatModelMapper.java
 ##
 @@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.mapper;
+
+import org.apache.flink.ml.api.misc.param.Params;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.types.Row;
+
+import java.util.List;
+
+/**
+ * Abstract class for flatMappers with model.
+ * FlatModelMapper transforms one Row type input into zero, one, or more Row 
type result records.
 
 Review comment:
   Thanks, removed.




[GitHub] [flink] xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add 
the FlatMapper and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#discussion_r326469171
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/mapper/FlatModelMapper.java
 ##
 @@ -0,0 +1,129 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.mapper;
+
+import org.apache.flink.ml.api.misc.param.Params;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.types.Row;
+
+import java.util.List;
+
+/**
+ * Abstract class for flatMappers with model.
+ * FlatModelMapper transforms one Row type input into zero, one, or more Row 
type result records.
+ * Operations that produce strictly one Row type result per input Row
+ * can also use the {@link ModelMapper}.
+ */
+public abstract class FlatModelMapper extends FlatMapper {
+
+   /**
+* schema of the model with Table type.
+*/
+   protected TableSchema modelSchema;
+
+   public FlatModelMapper(TableSchema modelSchema, TableSchema dataSchema, 
Params params) {
+   super(dataSchema, params);
+   this.modelSchema = modelSchema;
+   }
+
+   /**
+* Load model from the list of Row type data.
+*
+* @param modelRows the list of Row type data
+*/
+   public abstract void loadModel(List<Row> modelRows);
+
+   /**
+* Generate a new instance of the given FlatModelMapper class without model 
data.
+* The instance cannot process real data, but it can be used to get the 
output result schema.
+*
+* @param flatModelMapperClassName Name of the FlatModelMapper class
+* @param modelSchema  The schema of the input Table type model.
 
 Review comment:
   Thanks, the model is the machine learning model that use the Table as its 
representation (serialized to Table from the memory or deserialized from Table 
to memory). Thus, the model need `modelSchema` and the predict data need 
`dataSchema`.




[GitHub] [flink] xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add 
the FlatMapper and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#discussion_r326468259
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/mapper/FlatMapper.java
 ##
 @@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.mapper;
+
+import org.apache.flink.ml.api.misc.param.Params;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Collector;
+
+import java.io.Serializable;
+
+/**
+ * Abstract class for flatMappers.
+ * FlatMapper transforms one Row type input into zero, one, or more Row type 
result records.
+ * Operations that produce strictly one Row type result per input Row
+ * can also use the {@link Mapper}.
+ */
+public abstract class FlatMapper implements Serializable {
+
+/**
+ * schema of the input.
+ */
+   protected TableSchema dataSchema;
+
+/**
+ * params used for FlatMapper.
+ * User can set the params before the FlatMapper is executed.
+ */
+   protected Params params;
+
+   public FlatMapper(TableSchema dataSchema, Params params) {
+   this.dataSchema = dataSchema;
+   this.params = (null == params) ? new Params() : params.clone();
+   }
+
+   /**
+* The core method of the FlatMapper. Takes an element from the input 
data set and transforms
+* it into zero, one, or more elements.
+*
+* @param row    The input row.
+* @param output The collector for returning result values.
+* @throws Exception This method may throw exceptions. Throwing an exception 
will cause the operation
+* to fail.
+*/
+   public abstract void flatMap(Row row, Collector<Row> output) throws 
Exception;
+
+/**
+* Wrapper method for the iterable input.
+* @param rows the input rows.
+* @param output the output collector.
+* @throws Exception if {@link #flatMap(Row, Collector)} throws an exception.
+*/
+   public void flatMap(Iterable<Row> rows, Collector<Row> output) throws 
Exception {
 
 Review comment:
   Thanks, removed.
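To illustrate the contract under review, here is a dependency-free sketch of the FlatMapper pattern (the `Collector` interface is a stand-in for org.apache.flink.util.Collector, and `SplitMapper` is a hypothetical example, not part of flink-ml-lib):

```java
import java.util.ArrayList;
import java.util.List;

public class FlatMapperSketch {

    // Stand-in for org.apache.flink.util.Collector<T>: receives emitted records.
    interface Collector<T> {
        void collect(T record);
    }

    // The pattern under review: one input record is transformed into
    // zero, one, or more output records via a collector.
    public abstract static class FlatMapper {
        public abstract void flatMap(String row, Collector<String> output);
    }

    // Toy example: split a comma-separated "row" into one output per field.
    public static class SplitMapper extends FlatMapper {
        @Override
        public void flatMap(String row, Collector<String> output) {
            for (String field : row.split(",")) {
                output.collect(field);
            }
        }
    }

    // Drive a mapper over a single row and gather its emitted records.
    public static List<String> run(FlatMapper mapper, String row) {
        List<String> results = new ArrayList<>();
        mapper.flatMap(row, results::add);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(new SplitMapper(), "a,b,c")); // [a, b, c]
    }
}
```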




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453915)
   * 2f413575dd750406c2dfe2f04b93283ca629fbc9 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128456492)
   




[GitHub] [flink] xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add the FlatMapper and related classes for later al…

2019-09-19 Thread GitBox
xuyang1706 commented on a change in pull request #9300: [FLINK-13513][ml] Add 
the FlatMapper and related classes for later al…
URL: https://github.com/apache/flink/pull/9300#discussion_r326468211
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/mapper/FlatMapper.java
 ##
 @@ -0,0 +1,122 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.mapper;
+
+import org.apache.flink.ml.api.misc.param.Params;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.Collector;
+
+import java.io.Serializable;
+
+/**
+ * Abstract class for flatMappers.
+ * FlatMapper transform one Row type data into zero, one, or more Row type 
result data.
+ * Operations that produce multiple strictly one Row type result data per Row 
type data
 
 Review comment:
   Thanks. FlatMapper cannot take multiple rows as input, and we have removed the `Iterable` from this interface.
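
The contract discussed here — exactly one input Row per call, with zero, one, or more output Rows emitted through a collector, and no `Iterable` input — can be sketched as below. The `Row` and `Collector` types are simplified stand-ins for the real Flink classes, and `FlatMapperSketch` is a hypothetical name, not the actual `FlatMapper` API:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for org.apache.flink.types.Row and
// org.apache.flink.util.Collector, only to illustrate the contract.
class Row {
    final Object[] fields;
    Row(Object... fields) { this.fields = fields; }
}

interface Collector<T> {
    void collect(T record);
}

// One input Row per call; zero, one, or more output Rows via the collector.
// Note there is no Iterable input -- multiple input rows are not supported.
abstract class FlatMapperSketch {
    public abstract void flatMap(Row input, Collector<Row> output);
}

public class FlatMapperDemo {
    public static void main(String[] args) {
        // Example: split a comma-separated string column into one Row per token.
        FlatMapperSketch splitter = new FlatMapperSketch() {
            @Override
            public void flatMap(Row input, Collector<Row> output) {
                for (String token : ((String) input.fields[0]).split(",")) {
                    output.collect(new Row(token));
                }
            }
        };
        List<Row> results = new ArrayList<>();
        splitter.flatMap(new Row("a,b,c"), results::add);
        System.out.println(results.size()); // 3
    }
}
```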




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453915)
   * 2f413575dd750406c2dfe2f04b93283ca629fbc9 : UNKNOWN
   




[GitHub] [flink] lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326465346
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java
 ##
 @@ -66,6 +68,7 @@
	private Map<Map<String, String>, HiveTablePartition> partitionSpec2HiveTablePartition = new HashMap<>();
private boolean initAllPartitions;
private boolean partitionPruned;
+   private int[] projectedFields;
 
public HiveTableSource(JobConf jobConf, ObjectPath tablePath, 
CatalogTable catalogTable) {
this.jobConf = Preconditions.checkNotNull(jobConf);
 
 Review comment:
   I'd like to keep `projectedFields` as null to indicate that no projection has been pushed down to this table source, so that `explainSource()` doesn't have to display all the column indices when no projection is ever applied. What do you think?
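
A rough sketch of that idea — null meaning "no projection pushed down", so the explain string stays short — with illustrative class and method names rather than the real `HiveTableSource` API:

```java
import java.util.Arrays;

// Illustrative sketch (not the real HiveTableSource): a null projectedFields
// array means "no projection pushed down", so the explain string stays concise.
class ProjectionExplainSketch {
    private final int[] projectedFields; // null => no projection applied

    ProjectionExplainSketch(int[] projectedFields) {
        this.projectedFields = projectedFields;
    }

    String explainSource() {
        return projectedFields == null
                ? "HiveTableSource"
                : "HiveTableSource(fields=" + Arrays.toString(projectedFields) + ")";
    }
}

public class ExplainDemo {
    public static void main(String[] args) {
        System.out.println(new ProjectionExplainSketch(null).explainSource());
        // HiveTableSource
        System.out.println(new ProjectionExplainSketch(new int[]{0, 2}).explainSource());
        // HiveTableSource(fields=[0, 2])
    }
}
```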




[jira] [Created] (FLINK-14138) Show Pending Slots in Job Detail

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14138:
--

 Summary: Show Pending Slots in Job Detail
 Key: FLINK-14138
 URL: https://issues.apache.org/jira/browse/FLINK-14138
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: 屏幕快照 2019-09-20 下午12.04.00.png, 屏幕快照 2019-09-20 
下午12.04.05.png

It is hard to troubleshoot when all subtasks remain in the SCHEDULED status (as in the screenshot below) after users submit a job.

!屏幕快照 2019-09-20 下午12.04.00.png|width=494,height=258!

The most common cause of this problem is that a vertex has requested more resources than the cluster has available. A pending-slots tab could help users check which vertex or subtask is blocked.

!屏幕快照 2019-09-20 下午12.04.05.png|width=576,height=163!

 

REST API needed:

add /jobs/:jobid/pending-slots API to get pending slots data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the description of customized restart strategy configuration

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the 
description of customized restart strategy configuration
URL: https://github.com/apache/flink/pull/9722#issuecomment-533180619
 
 
   
   ## CI report:
   
   * 29da7100a0af920b19723543fe4517dd0d95681e : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128368730)
   * 5361c2a1b463c02724b90ba9453fc4746dcbba67 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451937)
   




[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128375005)
   * 9bc5949275f7997eadd03e2ec1fe8937ee2e689f : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453921)
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453915)
   




[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128375005)
   * 9bc5949275f7997eadd03e2ec1fe8937ee2e689f : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453921)
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128453915)
   




[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326460965
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -88,17 +86,23 @@
private transient InputFormat mapredInputFormat;
private transient HiveTablePartition hiveTablePartition;
 
+   // indices of fields to be returned, with projection applied (if any)
+   // TODO: push projection into underlying input format that supports it
 
 Review comment:
   I think ORC/Parquet is another story; we should use our own implementation instead of Hadoop's `RecordReader`.
   What I want to ask is: can `RecordReader` support projection push down?
   If there is an easy way, can we support it now?
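
For context, one known mechanism used by Hive-style readers is to pass the projected column ids through the job configuration; a minimal sketch of that idea, with `java.util.Properties` standing in for a Hadoop `JobConf` and property keys mirroring Hive's `ColumnProjectionUtils` constants — treat both as illustrative assumptions rather than a verified API contract:

```java
import java.util.Properties;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sketch of configuration-based projection push down for Hive-style readers.
public class ProjectionConfSketch {
    static final String READ_COLUMN_IDS = "hive.io.file.readcolumn.ids";
    static final String READ_ALL_COLUMNS = "hive.io.file.read.all.columns";

    static void applyProjection(Properties conf, int[] projectedFields) {
        if (projectedFields == null) {
            // No projection pushed down: read every column.
            conf.setProperty(READ_ALL_COLUMNS, "true");
            return;
        }
        String ids = IntStream.of(projectedFields)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(","));
        conf.setProperty(READ_ALL_COLUMNS, "false");
        conf.setProperty(READ_COLUMN_IDS, ids);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        applyProjection(conf, new int[]{0, 3, 5});
        System.out.println(conf.getProperty(READ_COLUMN_IDS)); // 0,3,5
    }
}
```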




[jira] [Created] (FLINK-14137) Show Attempt History in Vertex SubTask

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14137:
--

 Summary: Show Attempt History in Vertex SubTask
 Key: FLINK-14137
 URL: https://issues.apache.org/jira/browse/FLINK-14137
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: 屏幕快照 2019-09-20 上午11.32.54.png, 屏幕快照 2019-09-20 
上午11.32.59.png

According to the 
[docs|https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jobs-jobid-vertices-vertexid-subtasks-subtaskindex],
 a subtask may have more than one attempt, but there is no way to get the attempt history via the REST API, so users cannot tell whether a subtask has failed before.

!屏幕快照 2019-09-20 上午11.32.54.png|width=499,height=205!

We can add an Attempt History tab under the Subtasks drawer on the job vertex page; here is a demo below.

!屏幕快照 2019-09-20 上午11.32.59.png|width=518,height=203!

REST API needed:

add /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex/attempts API to get 
attempt history.





[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326459821
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -207,19 +211,22 @@ public Row nextRecord(Row ignore) throws IOException {
if (reachedEnd()) {
return null;
}
-   Row row = new Row(rowArity);
+   Row row = new Row(fields.length);
 
 Review comment:
   Why not reuse the row passed to `nextRecord(Row ignore)`? If you reuse that row, you never need to reassign partition values after initialization.
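
The reuse idea suggested here could be sketched roughly as follows: allocate the output record once, assign the partition columns a single time, and only overwrite the data-column slots on each call. Plain `Object[]` stands in for Flink's `Row`, and all names are hypothetical:

```java
// Hedged sketch of record reuse: partition columns are filled once at
// construction time and never reassigned; only data columns change per call.
class ReusableRowReader {
    private final Object[] reused;          // the single reused record
    private final boolean[] isPartitionCol; // true => filled once at init

    ReusableRowReader(Object[] partitionValues, boolean[] isPartitionCol) {
        this.isPartitionCol = isPartitionCol;
        this.reused = new Object[isPartitionCol.length];
        for (int i = 0; i < reused.length; i++) {
            if (isPartitionCol[i]) {
                reused[i] = partitionValues[i]; // assigned once, never again
            }
        }
    }

    Object[] nextRecord(Object[] deserialized) {
        int j = 0;
        for (int i = 0; i < reused.length; i++) {
            if (!isPartitionCol[i]) {
                reused[i] = deserialized[j++]; // only data columns change
            }
        }
        return reused;
    }
}

public class RowReuseDemo {
    public static void main(String[] args) {
        boolean[] isPart = {false, true, false};
        Object[] partVals = {null, "2019-09-19", null};
        ReusableRowReader reader = new ReusableRowReader(partVals, isPart);
        Object[] r1 = reader.nextRecord(new Object[]{"a", 1});
        Object[] r2 = reader.nextRecord(new Object[]{"b", 2});
        System.out.println(java.util.Arrays.toString(r2)); // [b, 2019-09-19, 2]
        System.out.println(r1 == r2); // true: same reused record
    }
}
```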




[GitHub] [flink] liyafan82 closed pull request #8934: [FLINK-12628][Runtime / Coordination] Check test failure if partitionhas no consumers in Execution.getPartitionMaxParallelism

2019-09-19 Thread GitBox
liyafan82 closed pull request #8934: [FLINK-12628][Runtime / Coordination] 
Check test failure if partitionhas no consumers in 
Execution.getPartitionMaxParallelism
URL: https://github.com/apache/flink/pull/8934
 
 
   




[GitHub] [flink] liyafan82 commented on issue #8934: [FLINK-12628][Runtime / Coordination] Check test failure if partitionhas no consumers in Execution.getPartitionMaxParallelism

2019-09-19 Thread GitBox
liyafan82 commented on issue #8934: [FLINK-12628][Runtime / Coordination] Check 
test failure if partitionhas no consumers in 
Execution.getPartitionMaxParallelism
URL: https://github.com/apache/flink/pull/8934#issuecomment-533390260
 
 
   > @liyafan82
   > I rebased the PR on the latest master and added the precondition check, I 
mentioned in the comment.
   > Do you have time to work on this?
   > We moved recently to other in house Travis CI. It looks like it is 
triggered only for new PRs. We could close this PR and open a new one to see 
Travis results.
   
   Sure. I will start a new PR for this. Thanks.




[GitHub] [flink] lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326459475
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -88,17 +86,23 @@
private transient InputFormat mapredInputFormat;
private transient HiveTablePartition hiveTablePartition;
 
+   // indices of fields to be returned, with projection applied (if any)
+   // TODO: push projection into underlying input format that supports it
 
 Review comment:
   I haven't checked the code but I think columnar storage formats should be 
able to support projection pushdown. For example, these slides imply projection 
pushdown is supported in ORC and Parquet:
   
https://www.slideshare.net/oom65/orc-files?qid=a032419f-e6a3-4776-8c9c-03dd752f17fd==_search=2
   
https://www.slideshare.net/julienledem/parquet-stratany-hadoopworld2013?qid=94dc3a84-39f0-456b-8766-f623e679a7ca==_search=4




[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128375005)
   * 9bc5949275f7997eadd03e2ec1fe8937ee2e689f : UNKNOWN
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   * 33e4d2ef7f0db97a5fad81b5ef4542a299f805e6 : UNKNOWN
   




[GitHub] [flink] lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
lirui-apache commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326457315
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -207,19 +211,22 @@ public Row nextRecord(Row ignore) throws IOException {
if (reachedEnd()) {
return null;
}
-   Row row = new Row(rowArity);
+   Row row = new Row(fields.length);
try {
//Use HiveDeserializer to deserialize an object out of 
a Writable blob
Object hiveRowStruct = deserializer.deserialize(value);
-   int index = 0;
-   for (; index < structFields.size(); index++) {
-   StructField structField = 
structFields.get(index);
-   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
-   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
-   row.setField(index, object);
-   }
-   for (String partition : partitionColNames){
-   row.setField(index++, 
hiveTablePartition.getPartitionSpec().get(partition));
+   for (int i = 0; i < fields.length; i++) {
+   // non-partition column
+   if (fields[i] < structFields.size()) {
+   StructField structField = 
structFields.get(fields[i]);
+   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
+   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
+   row.setField(i, object);
+   } else {
+   // partition column
+   String partition = 
partitionColNames.get(fields[i] - structFields.size());
 
 Review comment:
   Yeah, will pre-compute the partition values.
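
Pre-computing those values could look roughly like this — a hypothetical helper, not the actual patch, following the index convention in the diff above (field indices at or beyond `structFields.size()` denote partition columns):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper: resolve partition values once, instead of doing a
// map lookup per record inside nextRecord().
class PartitionValueCache {
    private Object[] cachedValues; // one slot per selected field; null for data columns

    void open(int[] fields, int numStructFields,
              List<String> partitionColNames, Map<String, Object> partitionSpec) {
        cachedValues = new Object[fields.length];
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] >= numStructFields) { // partition column
                String col = partitionColNames.get(fields[i] - numStructFields);
                cachedValues[i] = partitionSpec.get(col);
            }
        }
    }

    Object partitionValue(int i) {
        return cachedValues[i]; // constant-time, no per-record lookup
    }
}

public class PartitionCacheDemo {
    public static void main(String[] args) {
        PartitionValueCache cache = new PartitionValueCache();
        // Two struct fields selected (0, 1) plus one partition column (index 2).
        cache.open(new int[]{0, 1, 2}, 2,
                Arrays.asList("ds"), Collections.singletonMap("ds", "2019-09-19"));
        System.out.println(cache.partitionValue(2)); // 2019-09-19
    }
}
```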




[GitHub] [flink] flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the description of customized restart strategy configuration

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the 
description of customized restart strategy configuration
URL: https://github.com/apache/flink/pull/9722#issuecomment-533180619
 
 
   
   ## CI report:
   
   * 29da7100a0af920b19723543fe4517dd0d95681e : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128368730)
   * 5361c2a1b463c02724b90ba9453fc4746dcbba67 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451937)
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128451919)
   




[jira] [Comment Edited] (FLINK-11374) See more failover and can filter by time range

2019-09-19 Thread Yadong Xie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933969#comment-16933969
 ] 

Yadong Xie edited comment on FLINK-11374 at 9/20/19 2:58 AM:
-

Hi [~trohrmann]

According to the current Flink 
[code|https://github.com/apache/flink/blob/34b5399f4effb679baabd8bca312cbf92ec34165/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobExceptionsHandler.java#L54],
 MAX_NUMBER_EXCEPTION_TO_REPORT defaults to 20 and is not configurable. The problem is that the exception most useful to users usually appears first, but if many exceptions have occurred, users can only get the latest 20 exception messages, and the first exception is gone.

 

Making MAX_NUMBER_EXCEPTION_TO_REPORT configurable in flink-conf.yaml would solve this problem, and we could add filters (text or time range) in the Web UI to help users find useful exceptions.

 

What do you think about it?

 


was (Author: vthinkxie):
Hi [~trohrmann], I just check the flink 
codes(https://github.com/apache/flink/blob/34b5399f4effb679baabd8bca312cbf92ec34165/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobExceptionsHandler.java#L54)
 and found the MAX_NUMBER_EXCEPTION_TO_REPORT is not configurable, I think we 
can make it configurable in flink-conf.yaml and add a frontend time-range 
filter, what do you think about it?

 

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST, Runtime / Web Frontend
>Reporter: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Now the failover view only shows a limited number of the latest task
> failovers. If a task has failed many times, we cannot see the earlier
> failovers. Can we add a filter by time to see the failovers that contain the
> task attempt failure message?





[GitHub] [flink] flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the description of customized restart strategy configuration

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the 
description of customized restart strategy configuration
URL: https://github.com/apache/flink/pull/9722#issuecomment-533180619
 
 
   
   ## CI report:
   
   * 29da7100a0af920b19723543fe4517dd0d95681e : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128368730)
   * 5361c2a1b463c02724b90ba9453fc4746dcbba67 : UNKNOWN
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   * ed7ead402b73f1d5958f13c050dcda4b22dbe40a : UNKNOWN
   




[jira] [Commented] (FLINK-11374) See more failover and can filter by time range

2019-09-19 Thread Yadong Xie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933969#comment-16933969
 ] 

Yadong Xie commented on FLINK-11374:


Hi [~trohrmann], I just checked the Flink 
code (https://github.com/apache/flink/blob/34b5399f4effb679baabd8bca312cbf92ec34165/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/JobExceptionsHandler.java#L54)
 and found that MAX_NUMBER_EXCEPTION_TO_REPORT is not configurable. I think we can make it configurable in flink-conf.yaml and add a frontend time-range filter. What do you think?

 

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST, Runtime / Web Frontend
>Reporter: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Now the failover view only shows a limited number of the latest task
> failovers. If a task has failed many times, we cannot see the earlier
> failovers. Can we add a filter by time to see the failovers that contain the
> task attempt failure message?





[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   




[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326450474
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -88,17 +86,23 @@
private transient InputFormat mapredInputFormat;
private transient HiveTablePartition hiveTablePartition;
 
+   // indices of fields to be returned, with projection applied (if any)
+   // TODO: push projection into underlying input format that supports it
+   private int[] fields;
+
public HiveTableInputFormat(
JobConf jobConf,
CatalogTable catalogTable,
-			List<HiveTablePartition> partitions) {
+			List<HiveTablePartition> partitions,
+			int[] projectedFields) {
super(jobConf.getCredentials());
checkNotNull(catalogTable, "catalogTable can not be null.");
this.partitions = checkNotNull(partitions, "partitions can not 
be null.");
 
this.jobConf = new JobConf(jobConf);
this.partitionColNames = catalogTable.getPartitionKeys();
-   rowArity = catalogTable.getSchema().getFieldCount();
+   int rowArity = catalogTable.getSchema().getFieldCount();
+   fields = projectedFields != null ? projectedFields : 
IntStream.range(0, rowArity).toArray();
 
 Review comment:
   Could we make `projectedFields` never null?




[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326449917
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -88,17 +86,23 @@
private transient InputFormat mapredInputFormat;
private transient HiveTablePartition hiveTablePartition;
 
+   // indices of fields to be returned, with projection applied (if any)
+   // TODO: push projection into underlying input format that supports it
 
 Review comment:
   I don't think the underlying input format can support projection push down. Do you have any information, or should we delete this comment?




[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326448890
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java
 ##
 @@ -106,7 +113,17 @@ public TableSchema getTableSchema() {
 
@Override
public DataType getProducedDataType() {
-   return getTableSchema().toRowDataType();
+   TableSchema originSchema = getTableSchema();
+   if (projectedFields == null) {
+   return originSchema.toRowDataType();
+   }
+   String[] names = new String[projectedFields.length];
+   DataType[] types = new DataType[projectedFields.length];
+   for (int i = 0; i < projectedFields.length; i++) {
+   names[i] = 
originSchema.getFieldNames()[projectedFields[i]];
 
 Review comment:
   Use `getFieldDataType(int fieldIndex)` and `getFieldName(int fieldIndex)`.
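   A minimal sketch of the per-index projection suggested here, with plain string arrays standing in for `TableSchema` field names (the same pattern applies to the data types):

```java
import java.util.Arrays;

public class ProjectSchema {
    // per-index lookups, mirroring getFieldName(int fieldIndex) instead of
    // re-fetching the whole name array on every loop iteration
    static String[] projectNames(String[] originNames, int[] projectedFields) {
        String[] names = new String[projectedFields.length];
        for (int i = 0; i < projectedFields.length; i++) {
            names[i] = originNames[projectedFields[i]];
        }
        return names;
    }

    public static void main(String[] args) {
        String[] origin = {"id", "name", "price", "dt"};
        System.out.println(Arrays.toString(projectNames(origin, new int[]{2, 0}))); // [price, id]
    }
}
```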




[GitHub] [flink] JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
JingsongLi commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326450435
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java
 ##
 @@ -66,6 +68,7 @@
private Map<Map<String, String>, HiveTablePartition> partitionSpec2HiveTablePartition = new HashMap<>();
private boolean initAllPartitions;
private boolean partitionPruned;
+   private int[] projectedFields;
 
public HiveTableSource(JobConf jobConf, ObjectPath tablePath, 
CatalogTable catalogTable) {
this.jobConf = Preconditions.checkNotNull(jobConf);
 
 Review comment:
   Invoke `this()` to initialize variables. For `projectedFields`, you can initialize it as `IntStream.range(0, rowArity).toArray()`.




[jira] [Created] (FLINK-14136) Operator Topology and Metrics Inside Vertex

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14136:
--

 Summary: Operator Topology and Metrics Inside Vertex
 Key: FLINK-14136
 URL: https://issues.apache.org/jira/browse/FLINK-14136
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: Kapture 2019-09-17 at 14.31.46.gif, screenshot.png, 
screenshot2.png

In the screenshot below, users can get vertex topology data on the job detail page, but the operator topology and metrics inside each vertex are missing from the graph.

!screenshot.png|width=477,height=206!

There are actually two operators in the first vertex, named Source: Custom Source and Timestamps/Watermarks, but users can only see Source: Custom Source -> Timestamps/Watermarks at the vertex level.

We can already get some metrics at the operator level, such as records-in and records-out, from the metrics REST API (in the screenshot below).

!screenshot2.png|width=475,height=210!

If we can get the operators' topology data inside a vertex, users can see the whole operator topology with records-received and records-sent information at a glance. We think this would be quite useful for troubleshooting a job's problems while it is running. Here is a demo in the gif below.

!Kapture 2019-09-17 at 14.31.46.gif|width=563,height=286!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128442800)
   




[jira] [Created] (FLINK-14135) Introduce vectorized orc InputFormat for blink runtime

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14135:


 Summary: Introduce vectorized orc InputFormat for blink runtime
 Key: FLINK-14135
 URL: https://issues.apache.org/jira/browse/FLINK-14135
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / ORC
Reporter: Jingsong Lee


VectorizedOrcInputFormat is introduced to read orc data in batches.

When returning each row of data, instead of actually retrieving each field, we 
use BaseRow's abstraction to return a Columnar Row-like view.

This will greatly improve downstream filtering scenarios, since there is no need to access redundant fields of filtered-out data.
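The "Columnar Row-like view" idea can be illustrated with a toy sketch (class and field names here are invented for illustration, not the blink runtime's actual API): rows are cheap views over column vectors, so filtered-out rows never pay the cost of materializing unused fields.

```java
public class ColumnarBatch {
    final int[] ids;        // column vector for field 0
    final String[] names;   // column vector for field 1

    ColumnarBatch(int[] ids, String[] names) { this.ids = ids; this.names = names; }

    // row r is a view into the vectors; no per-row object is copied
    Object field(int r, int col) { return col == 0 ? ids[r] : names[r]; }

    public static void main(String[] args) {
        ColumnarBatch batch = new ColumnarBatch(new int[]{1, 2, 3}, new String[]{"a", "b", "c"});
        // filter on ids only: names of filtered-out rows are never touched
        for (int r = 0; r < 3; r++) {
            if ((int) batch.field(r, 0) > 1) {
                System.out.println(batch.field(r, 1)); // prints b, then c
            }
        }
    }
}
```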





[GitHub] [flink] zzsmdfj commented on issue #9705: [hotfix][comment]remove useless comment

2019-09-19 Thread GitBox
zzsmdfj commented on issue #9705: [hotfix][comment]remove useless comment
URL: https://github.com/apache/flink/pull/9705#issuecomment-533374806
 
 
   > subjective
   
   Thanks for the reply. Yes, it is subjective whether a comment is useful or useless. That comment, "row interval is not valid for session windows", appears four times in the GroupWindowValidationTest class. It confused me when I was reading the code, so I just wanted to see if I could make it cleaner.
   




[jira] [Updated] (FLINK-11899) Introduce vectorized parquet InputFormat for blink runtime

2019-09-19 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-11899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-11899:
-
Parent: FLINK-14133
Issue Type: Sub-task  (was: New Feature)

> Introduce vectorized parquet InputFormat for blink runtime
> --
>
> Key: FLINK-11899
> URL: https://issues.apache.org/jira/browse/FLINK-11899
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Runtime
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>
> VectorizedParquetInputFormat is introduced to read parquet data in batches.
> When returning each row of data, instead of actually retrieving each field, 
> we use BaseRow's abstraction to return a Columnar Row-like view.
> This will greatly improve the downstream filtered scenarios, so that there is 
> no need to access redundant fields on the filtered data.





[GitHub] [flink] liyafan82 closed pull request #8934: [FLINK-12628][Runtime / Coordination] Check test failure if partition has no consumers in Execution.getPartitionMaxParallelism

2019-09-19 Thread GitBox
liyafan82 closed pull request #8934: [FLINK-12628][Runtime / Coordination] 
Check test failure if partition has no consumers in 
Execution.getPartitionMaxParallelism
URL: https://github.com/apache/flink/pull/8934
 
 
   




[GitHub] [flink] liyafan82 opened a new pull request #8934: [FLINK-12628][Runtime / Coordination] Check test failure if partition has no consumers in Execution.getPartitionMaxParallelism

2019-09-19 Thread GitBox
liyafan82 opened a new pull request #8934: [FLINK-12628][Runtime / 
Coordination] Check test failure if partition has no consumers in 
Execution.getPartitionMaxParallelism
URL: https://github.com/apache/flink/pull/8934
 
 
   
   
   ## What is the purpose of the change
   
   Resolve issue FLINK-12628 Check test failure if partition has no consumers 
in Execution.getPartitionMaxParallelism:
   
   Currently, we work around this case in Execution.getPartitionMaxParallelism 
because of tests:
   
   // TODO consumers.isEmpty() only exists for test, currently there has to be 
exactly one consumer in real jobs!
   
   though partition is supposed to have always at least one consumer atm.
   We should check which test fails and consider fixing it.
   
   According to my investigation, there is no test failure when we ignore the case where consumers.isEmpty() is true.
   
   ## Brief change log
   
 - Change the implementation of Execution.getPartitionMaxParallelism to ignore the case where consumers.isEmpty() is true.
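   A hedged sketch of the simplified logic described above, with a plain list of max-parallelism values standing in for Flink's internal consumer structures:

```java
import java.util.List;

public class PartitionParallelism {
    // Instead of special-casing an empty consumer list, assert that exactly
    // one consumer exists and return its max parallelism.
    static int getPartitionMaxParallelism(List<Integer> consumerMaxParallelisms) {
        if (consumerMaxParallelisms.size() != 1) {
            throw new IllegalStateException("Currently there has to be exactly one consumer in real jobs!");
        }
        return consumerMaxParallelisms.get(0);
    }

    public static void main(String[] args) {
        System.out.println(getPartitionMaxParallelism(List.of(128))); // 128
    }
}
```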
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as ExecutionTest.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   




[jira] [Commented] (FLINK-14110) Deleting state.backend.rocksdb.localdir causes silent failure

2019-09-19 Thread Congxian Qiu(klion26) (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933940#comment-16933940
 ] 

Congxian Qiu(klion26) commented on FLINK-14110:
---

[~aaronlevin] thanks for your reply. I think we should first find out the reason. Could you check whether this is a problem of RocksDB or a problem of Flink's state?

To figure out the reason, maybe you could write a custom test using RocksDB directly and check the behavior after deleting the directory.

To check Flink's state, you can refer to {{RocksIncrementalSnapshotStrategy#doSnapshot}} for incremental snapshots and {{RocksFullSnapshotStrategy#doSnapshot}} for full snapshots.

And please let me know if you have any problems, thanks.

> Deleting state.backend.rocksdb.localdir causes silent failure
> -
>
> Key: FLINK-14110
> URL: https://issues.apache.org/jira/browse/FLINK-14110
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.8.1, 1.9.0
> Environment: Flink {{1.8.1}} and {{1.9.0}}.
> JVM 8
>Reporter: Aaron Levin
>Priority: Minor
>
> Suppose {{state.backend.rocksdb.localdir}} is configured as:
> {noformat}
> state.backend.rocksdb.localdir: /flink/tmp
> {noformat}
> If I then run \{{rm -rf /flink/tmp/job_*}} on a host while a Flink 
> application is running, I will observe the following:
>  * throughput of my operators running on that host will drop to zero
>  * the application will not fail or restart
>  * the task manager will not fail or restart
>  * in most cases there is nothing in the logs to indicate a failure (I've run 
> this several times and only once seen an exception - I believe I was lucky 
> and deleted those directories during a checkpoint or something)
> The desired behaviour here would be to throw an exception and crash, instead 
> of silently dropping throughput to zero. Restarting the Task Manager will 
> resolve the issues.
> I only tried this on Flink {{1.8.1}} and {{1.9.0}}.





[jira] [Created] (FLINK-14134) Introduce LimitableTableSource to optimize limit

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14134:


 Summary: Introduce LimitableTableSource to optimize limit
 Key: FLINK-14134
 URL: https://issues.apache.org/jira/browse/FLINK-14134
 Project: Flink
  Issue Type: Sub-task
Reporter: Jingsong Lee


SQL: select * from t1 limit 1

Currently the source scans the full table. If we introduce LimitableTableSource to let the source know the limit, the source only needs to read one row.
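A minimal sketch of how such a limit push-down contract could look; the interface and method names are guesses for illustration, not the final Flink API:

```java
import java.util.List;
import java.util.stream.Collectors;

public class LimitDemo {
    interface LimitableSource<T> {
        LimitableSource<T> applyLimit(long limit);   // planner pushes "limit 1" down
        List<T> scan();
    }

    static class ListSource implements LimitableSource<String> {
        final List<String> rows;
        final long limit;
        ListSource(List<String> rows, long limit) { this.rows = rows; this.limit = limit; }
        public ListSource applyLimit(long l) { return new ListSource(rows, l); }
        // the scan stops early instead of reading the full table
        public List<String> scan() { return rows.stream().limit(limit).collect(Collectors.toList()); }
    }

    public static void main(String[] args) {
        ListSource src = new ListSource(List.of("r1", "r2", "r3"), Long.MAX_VALUE);
        System.out.println(src.applyLimit(1).scan()); // [r1]
    }
}
```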





[GitHub] [flink] zhangjun888 closed pull request #9645: [FLINK-12584][connectors] Add Bucket File System Table Sink

2019-09-19 Thread GitBox
zhangjun888 closed pull request #9645: [FLINK-12584][connectors] Add Bucket File 
System Table Sink
URL: https://github.com/apache/flink/pull/9645
 
 
   




[GitHub] [flink] flinkbot edited a comment on issue #9712: [FLINK-13656] [sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 1.21.0

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9712: [FLINK-13656] 
[sql-parser][table-planner][table-planner-blink] Bump Calcite dependency to 
1.21.0
URL: https://github.com/apache/flink/pull/9712#issuecomment-532984287
 
 
   
   ## CI report:
   
   * f2b7710f65f478342de389c8e099799287ddf3f9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128285030)
   * 19872fd562f606e9738f5b5a77f7a1d3e93a39b9 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128302786)
   * 211f984c5380d61cf78f94e8ddcb4576f5c3f751 : UNKNOWN
   




[jira] [Updated] (FLINK-14129) HiveTableSource should implement ProjectableTableSource

2019-09-19 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-14129:
-
Parent: FLINK-14133
Issue Type: Sub-task  (was: Improvement)

> HiveTableSource should implement ProjectableTableSource
> ---
>
> Key: FLINK-14129
> URL: https://issues.apache.org/jira/browse/FLINK-14129
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Rui Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With projection push down, we can avoid inspecting unused columns in 
> HiveTableInputFormat which can be time consuming.





[jira] [Commented] (FLINK-13025) Elasticsearch 7.x support

2019-09-19 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933931#comment-16933931
 ] 

vinoyang commented on FLINK-13025:
--

Ping [~aljoscha] and [~trohrmann] for helping. I want to provide a 
connector-elasticsearch7 jar for scala 2.12 for [~lilyevsky].

I ran the command in the root dir of the Flink project (on my feature branch, 
rebased on Flink's master branch):


{code:java}
mvn package -DskipTests -Dcheckstyle.skip=true -P scala-2.12 -X
{code}

However, I got an error:


{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-versions) on project flink-runtime_2.12: Some Enforcer rules have 
failed. Look above for specific messages explaining why the rule failed. -> 
[Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-versions) on project flink-runtime_2.12: Some Enforcer rules have 
failed. Look above for specific messages explaining why the rule failed.
{code}

The relevant debug messages about error:


{code:java}
[DEBUG] Adding failure due to exception
org.apache.maven.enforcer.rule.api.EnforcerRuleException: Found Banned 
Dependency: com.typesafe:ssl-config-core_2.12:jar:0.3.7
Found Banned Dependency: com.typesafe.akka:akka-actor_2.12:jar:2.5.21
Found Banned Dependency: com.typesafe.akka:akka-slf4j_2.12:jar:2.5.21
Found Banned Dependency: com.typesafe.akka:akka-stream_2.12:jar:2.5.21
Found Banned Dependency: com.typesafe.akka:akka-protobuf_2.12:jar:2.5.21
Found Banned Dependency: 
org.scala-lang.modules:scala-parser-combinators_2.12:jar:1.1.1
Found Banned Dependency: 
org.scala-lang.modules:scala-java8-compat_2.12:jar:0.8.0
Found Banned Dependency: org.clapper:grizzled-slf4j_2.12:jar:1.3.2
Found Banned Dependency: com.github.scopt:scopt_2.12:jar:3.5.0
Found Banned Dependency: com.typesafe.akka:akka-testkit_2.12:jar:2.5.21
Found Banned Dependency: com.twitter:chill_2.12:jar:0.7.6
Found Banned Dependency: org.scalatest:scalatest_2.12:jar:3.0.0
Found Banned Dependency: org.scalactic:scalactic_2.12:jar:3.0.0
Found Banned Dependency: com.typesafe.akka:akka-remote_2.12:jar:2.5.21
Found Banned Dependency: org.scala-lang.modules:scala-xml_2.12:jar:1.0.5
Use 'mvn dependency:tree' to locate the source of the banned dependencies.
{code}

Actually, I am not familiar with the integration between flink and scala 2.12. 
Did I miss anything? If I am wrong somewhere, please let me know. Thanks.

> Elasticsearch 7.x support
> -
>
> Key: FLINK-13025
> URL: https://issues.apache.org/jira/browse/FLINK-13025
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.0
>Reporter: Keegan Standifer
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Elasticsearch 7.0.0 was released in April of 2019: 
> [https://www.elastic.co/blog/elasticsearch-7-0-0-released]
> The latest elasticsearch connector is 
> [flink-connector-elasticsearch6|https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-elasticsearch6]





[jira] [Created] (FLINK-14133) Improve batch sql and hive integrate performance

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14133:


 Summary: Improve batch sql and hive integrate performance
 Key: FLINK-14133
 URL: https://issues.apache.org/jira/browse/FLINK-14133
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / Planner
Reporter: Jingsong Lee


We have now largely merged the batch functionality of the blink planner and largely integrated Hive. But there are still many performance problems, and we need to ensure baseline performance.





[jira] [Commented] (FLINK-13025) Elasticsearch 7.x support

2019-09-19 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933924#comment-16933924
 ] 

vinoyang commented on FLINK-13025:
--

[~lilyevsky] I will try to build a jar of connector-elasticsearch7 for Scala 2.12 locally. Then I will provide it for you to test.

> Elasticsearch 7.x support
> -
>
> Key: FLINK-13025
> URL: https://issues.apache.org/jira/browse/FLINK-13025
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.0
>Reporter: Keegan Standifer
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Elasticsearch 7.0.0 was released in April of 2019: 
> [https://www.elastic.co/blog/elasticsearch-7-0-0-released]
> The latest elasticsearch connector is 
> [flink-connector-elasticsearch6|https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-elasticsearch6]





[jira] [Commented] (FLINK-8545) Implement upsert stream table source

2019-09-19 Thread Hequn Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933921#comment-16933921
 ] 

Hequn Cheng commented on FLINK-8545:


Hi [~enricoc], yes, this feature is still being considered. Some related discussions can also be found 
[here|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Ground-Source-and-Sink-Concepts-in-Flink-SQL-td29126.html].
 

> Implement upsert stream table source 
> -
>
> Key: FLINK-8545
> URL: https://issues.apache.org/jira/browse/FLINK-8545
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>
> As more and more users are eager for ingesting data with upsert mode in flink 
> sql/table-api, it is valuable to enable table source with upsert mode. I will 
> provide a design doc later and we can have more discussions. Any suggestions 
> are warmly welcomed !
>  





[jira] [Issue Comment Deleted] (FLINK-4233) Simplify leader election / leader session ID assignment

2019-09-19 Thread TisonKun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun updated FLINK-4233:

Comment: was deleted

(was: Further, if it is assumed that there are never too many contenders, in 
FLINK-10333 we can adopt the unoptimized version of leader election, i.e., 
create the universe leader node, and if fails, wait for its deletion. Then we 
trade off performance(no impact) for simplicity.)

> Simplify leader election / leader session ID assignment
> ---
>
> Key: FLINK-4233
> URL: https://issues.apache.org/jira/browse/FLINK-4233
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.0.3
>Reporter: Stephan Ewen
>Priority: Major
>
> Currently, there are two separate actions and znodes involved in leader 
> election and communication of the leader session ID and leader URL.
> This leads to some quite elaborate code that tries to make sure that the 
> leader session ID and leader URL always eventually converge to those of the 
> leader.
> It is simpler to just encode both the ID and the URL into an id-string that 
> is attached to the leader latch znode. One would have to create a new leader 
> latch each time a contender re-applies for leadership.





[jira] [Commented] (FLINK-4233) Simplify leader election / leader session ID assignment

2019-09-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933899#comment-16933899
 ] 

TisonKun commented on FLINK-4233:
-

Further, if it is assumed that there are never too many contenders, in 
FLINK-10333 we can adopt the unoptimized version of leader election, i.e., 
create the universal leader node and, if that fails, wait for its deletion. Then 
we trade off performance (no impact) for simplicity.

> Simplify leader election / leader session ID assignment
> ---
>
> Key: FLINK-4233
> URL: https://issues.apache.org/jira/browse/FLINK-4233
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.0.3
>Reporter: Stephan Ewen
>Priority: Major
>
> Currently, there are two separate actions and znodes involved in leader 
> election and communication of the leader session ID and leader URL.
> This leads to some quite elaborate code that tries to make sure that the 
> leader session ID and leader URL always eventually converge to those of the 
> leader.
> It is simpler to just encode both the ID and the URL into an id-string that 
> is attached to the leader latch znode. One would have to create a new leader 
> latch each time a contender re-applies for leadership.





[jira] [Commented] (FLINK-4233) Simplify leader election / leader session ID assignment

2019-09-19 Thread TisonKun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933879#comment-16933879
 ] 

TisonKun commented on FLINK-4233:
-

Thanks for picking up this topic! I've been thinking of something similar recently.

Basically it requires a new ZKLeaderRetrievalService that works like 
{{PathChildrenCache}} in Curator (or we could just reuse it) to monitor the leader 
latch registry. In this way, the retriever will automatically treat the latch with 
the smallest sequence number as the leader info node and retrieve the 
information. As a side effect, the leader no longer needs to "publish" its leader 
information.

The downside would be the overhead that {{PathChildrenCache}} costs. However, 
given that there are typically only 1 or 2 latches at the same time and leader 
changes (after we tolerate SUSPENDED, a.k.a. CONNECTIONLOSS) are rare events, 
hopefully it doesn't impact too much.
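The issue's core idea of encoding both the session ID and the URL into an id-string attached to the leader latch znode can be sketched as follows; the `#` separator and the format are assumptions for illustration, not a committed design:

```java
import java.util.UUID;

public class LeaderInfo {
    // serialize session ID + address into one id-string on election
    static String encode(UUID sessionId, String address) {
        return sessionId + "#" + address;
    }

    // parse the id-string back on the retrieval side; split at the first '#'
    static String[] decode(String idString) {
        int sep = idString.indexOf('#');
        return new String[]{idString.substring(0, sep), idString.substring(sep + 1)};
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String encoded = encode(id, "akka.tcp://flink@host:6123/user/jobmanager");
        String[] parts = decode(encoded);
        System.out.println(parts[0].equals(id.toString())); // true
    }
}
```

With this, a retriever that watches the latch registry recovers both pieces of leader information from the latch node itself, so no separate publish step is needed.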

> Simplify leader election / leader session ID assignment
> ---
>
> Key: FLINK-4233
> URL: https://issues.apache.org/jira/browse/FLINK-4233
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.0.3
>Reporter: Stephan Ewen
>Priority: Major
>
> Currently, there are two separate actions and znodes involved in leader 
> election and communication of the leader session ID and leader URL.
> This leads to some quite elaborate code that tries to make sure that the 
> leader session ID and leader URL always eventually converge to those of the 
> leader.
> It is simpler to just encode both the ID and the URL into an id-string that 
> is attached to the leader latch znode. One would have to create a new leader 
> latch each time a contender re-applies for leadership.





[GitHub] [flink] bowenli86 commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
bowenli86 commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326422275
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -207,19 +211,22 @@ public Row nextRecord(Row ignore) throws IOException {
if (reachedEnd()) {
return null;
}
-   Row row = new Row(rowArity);
+   Row row = new Row(fields.length);
try {
//Use HiveDeserializer to deserialize an object out of 
a Writable blob
Object hiveRowStruct = deserializer.deserialize(value);
-   int index = 0;
-   for (; index < structFields.size(); index++) {
-   StructField structField = 
structFields.get(index);
-   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
-   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
-   row.setField(index, object);
-   }
-   for (String partition : partitionColNames){
-   row.setField(index++, 
hiveTablePartition.getPartitionSpec().get(partition));
+   for (int i = 0; i < fields.length; i++) {
+   // non-partition column
+   if (fields[i] < structFields.size()) {
+   StructField structField = 
structFields.get(fields[i]);
+   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
+   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
+   row.setField(i, object);
+   } else {
+   // partition column
+   String partition = 
partitionColNames.get(fields[i] - structFields.size());
 
 Review comment:
   minor: as the partition column values stay the same for rows in the same partition, it seems they could be precomputed based on the projection?
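   The precomputation suggested here could look roughly like this (names are illustrative stand-ins for the fields in `HiveTableInputFormat`): resolve each projected partition column to its constant value once per split, instead of once per row.

```java
import java.util.List;
import java.util.Map;

public class PartitionPrecompute {
    // For every projected field index that points past the data columns,
    // look the partition value up once; data columns keep null placeholders
    // to be filled per row from the deserialized struct.
    static Object[] precompute(int[] fields, int structFieldCount,
                               List<String> partitionColNames, Map<String, String> partitionSpec) {
        Object[] values = new Object[fields.length];
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] >= structFieldCount) {
                values[i] = partitionSpec.get(partitionColNames.get(fields[i] - structFieldCount));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Object[] v = precompute(new int[]{0, 2}, 2, List.of("dt"), Map.of("dt", "2019-09-19"));
        System.out.println(java.util.Arrays.toString(v)); // [null, 2019-09-19]
    }
}
```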




[GitHub] [flink] bowenli86 commented on a change in pull request #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
bowenli86 commented on a change in pull request #9721: [FLINK-14129][hive] 
HiveTableSource should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#discussion_r326422275
 
 

 ##
 File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableInputFormat.java
 ##
 @@ -207,19 +211,22 @@ public Row nextRecord(Row ignore) throws IOException {
if (reachedEnd()) {
return null;
}
-   Row row = new Row(rowArity);
+   Row row = new Row(fields.length);
try {
//Use HiveDeserializer to deserialize an object out of 
a Writable blob
Object hiveRowStruct = deserializer.deserialize(value);
-   int index = 0;
-   for (; index < structFields.size(); index++) {
-   StructField structField = 
structFields.get(index);
-   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
-   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
-   row.setField(index, object);
-   }
-   for (String partition : partitionColNames){
-   row.setField(index++, 
hiveTablePartition.getPartitionSpec().get(partition));
+   for (int i = 0; i < fields.length; i++) {
+   // non-partition column
+   if (fields[i] < structFields.size()) {
+   StructField structField = 
structFields.get(fields[i]);
+   Object object = 
HiveInspectors.toFlinkObject(structField.getFieldObjectInspector(),
+   
structObjectInspector.getStructFieldData(hiveRowStruct, structField));
+   row.setField(i, object);
+   } else {
+   // partition column
+   String partition = 
partitionColNames.get(fields[i] - structFields.size());
 
 Review comment:
  minor: it seems the partition column names could be precomputed, as they stay the 
same for all rows in the same partition?




[jira] [Commented] (FLINK-8545) Implement upsert stream table source

2019-09-19 Thread Enrico Canzonieri (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933807#comment-16933807
 ] 

Enrico Canzonieri commented on FLINK-8545:
--

Is this ticket still being considered? Being able to directly ingest a retract 
DataStream in the Table API seems like a valuable feature. In our deployment we have 
many use cases where we stream CDC from MySQL or Cassandra to Kafka. Supporting 
retract DataStreams in the Table API would allow us to use Flink SQL to 
manipulate these streams in real time.
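For readers unfamiliar with the upsert semantics discussed here, a minimal sketch (illustrative only, hypothetical names, not the Flink API) of how an upsert changelog materializes into table state: each record either replaces the value for its key or, when the value is absent, deletes the key.

```java
import java.util.HashMap;
import java.util.Map;

class UpsertSketch {
    // Materialize an upsert changelog into current table state:
    // a non-null value replaces the entry for its key, a null value deletes it.
    static <K, V> Map<K, V> materialize(Iterable<Map.Entry<K, V>> changelog) {
        Map<K, V> state = new HashMap<>();
        for (Map.Entry<K, V> change : changelog) {
            if (change.getValue() == null) {
                state.remove(change.getKey());
            } else {
                state.put(change.getKey(), change.getValue());
            }
        }
        return state;
    }
}
```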

> Implement upsert stream table source 
> -
>
> Key: FLINK-8545
> URL: https://issues.apache.org/jira/browse/FLINK-8545
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>
> As more and more users are eager to ingest data in upsert mode via Flink 
> SQL / the Table API, it is valuable to support table sources with upsert mode. I will 
> provide a design doc later and we can have more discussions. Any suggestions 
> are warmly welcomed!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14090) Rework FunctionCatalog

2019-09-19 Thread Bowen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-14090:
-
Summary: Rework FunctionCatalog  (was: FLIP-57 Rework FunctionCatalog)

> Rework FunctionCatalog
> --
>
> Key: FLINK-14090
> URL: https://issues.apache.org/jira/browse/FLINK-14090
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API, Table SQL / Client, Table SQL / Planner
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog





[jira] [Created] (FLINK-14132) Extend core table system with modular plugins

2019-09-19 Thread Bowen Li (Jira)
Bowen Li created FLINK-14132:


 Summary: Extend core table system with modular plugins
 Key: FLINK-14132
 URL: https://issues.apache.org/jira/browse/FLINK-14132
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0


please see FLIP-68 
(https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Modular+Plugins)





[jira] [Commented] (FLINK-13025) Elasticsearch 7.x support

2019-09-19 Thread Leonid Ilyevsky (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933794#comment-16933794
 ] 

Leonid Ilyevsky commented on FLINK-13025:
-

[~yanghua] I saw you wrote the code and created a pull request. While we are 
waiting for the merge and an official build, maybe you could compile that 
connector for Scala 2.12 for me, so I can test it.

> Elasticsearch 7.x support
> -
>
> Key: FLINK-13025
> URL: https://issues.apache.org/jira/browse/FLINK-13025
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.0
>Reporter: Keegan Standifer
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Elasticsearch 7.0.0 was released in April of 2019: 
> [https://www.elastic.co/blog/elasticsearch-7-0-0-released]
> The latest elasticsearch connector is 
> [flink-connector-elasticsearch6|https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-elasticsearch6]





[GitHub] [flink] flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing synchronization in AsyncWaitOperator

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing 
synchronization in AsyncWaitOperator
URL: https://github.com/apache/flink/pull/9717#issuecomment-533068819
 
 
   
   ## CI report:
   
   * 9cfda801891969ac460f45ea639d636b519f22db : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128317194)
   * af7f6be848f0bd90a049d5ee9f7a38b1c3e2b972 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128336908)
   * 43157eb165d9409fbdf4a2f773ef7d52dd74e759 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128365390)
   * ff400d149ed36c63a90d2f3fcd517bde738e5af1 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/128402339)
   * 02dbf61e50e6c913610df7f586d7eb0f20529c13 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128406923)
   




[GitHub] [flink] flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing synchronization in AsyncWaitOperator

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing 
synchronization in AsyncWaitOperator
URL: https://github.com/apache/flink/pull/9717#issuecomment-533068819
 
 
   
   ## CI report:
   
   * 9cfda801891969ac460f45ea639d636b519f22db : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128317194)
   * af7f6be848f0bd90a049d5ee9f7a38b1c3e2b972 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128336908)
   * 43157eb165d9409fbdf4a2f773ef7d52dd74e759 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128365390)
   * ff400d149ed36c63a90d2f3fcd517bde738e5af1 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/128402339)
   * 02dbf61e50e6c913610df7f586d7eb0f20529c13 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128406923)
   




[GitHub] [flink] pnowojski commented on issue #9710: [FLINK-11859][runtime]Improve SpanningRecordSerializer performance by serializing record length to data buffer directly.

2019-09-19 Thread GitBox
pnowojski commented on issue #9710: [FLINK-11859][runtime]Improve 
SpanningRecordSerializer performance by serializing record length to data 
buffer directly.
URL: https://github.com/apache/flink/pull/9710#issuecomment-533282433
 
 
   Nice change. Already visible in the codespeed:
   
http://codespeed.dak8s.net:8000/timeline/#/?exe=1,3&ben=networkThroughput.100,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
   
http://codespeed.dak8s.net:8000/timeline/#/?exe=1,3&ben=twoInputMapSink&env=2&revs=200&equid=off&quarts=on&extr=on
   
http://codespeed.dak8s.net:8000/timeline/#/?exe=1,3&ben=mapRebalanceMapSink&env=2&revs=200&equid=off&quarts=on&extr=on
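The idea behind the change, sketched here with hypothetical names (the actual implementation lives in Flink's SpanningRecordSerializer), is to write the 4-byte record length directly into the target data buffer instead of serializing it into a separate intermediate length buffer and copying both pieces over afterwards:

```java
import java.nio.ByteBuffer;

class LengthPrefixSketch {
    // Write a length-prefixed record directly into the target buffer,
    // avoiding a separate staging buffer for the length field.
    static void writeRecord(ByteBuffer target, byte[] serialized) {
        target.putInt(serialized.length); // length prefix written in place
        target.put(serialized);
    }

    // Read one length-prefixed record back out of the buffer.
    static byte[] readRecord(ByteBuffer source) {
        int length = source.getInt();
        byte[] record = new byte[length];
        source.get(record);
        return record;
    }
}
```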




[GitHub] [flink] flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing synchronization in AsyncWaitOperator

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing 
synchronization in AsyncWaitOperator
URL: https://github.com/apache/flink/pull/9717#issuecomment-533068819
 
 
   
   ## CI report:
   
   * 9cfda801891969ac460f45ea639d636b519f22db : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128317194)
   * af7f6be848f0bd90a049d5ee9f7a38b1c3e2b972 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128336908)
   * 43157eb165d9409fbdf4a2f773ef7d52dd74e759 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128365390)
   * ff400d149ed36c63a90d2f3fcd517bde738e5af1 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128402339)
   * 02dbf61e50e6c913610df7f586d7eb0f20529c13 : UNKNOWN
   




[GitHub] [flink] flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing synchronization in AsyncWaitOperator

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing 
synchronization in AsyncWaitOperator
URL: https://github.com/apache/flink/pull/9717#issuecomment-533068819
 
 
   
   ## CI report:
   
   * 9cfda801891969ac460f45ea639d636b519f22db : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128317194)
   * af7f6be848f0bd90a049d5ee9f7a38b1c3e2b972 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128336908)
   * 43157eb165d9409fbdf4a2f773ef7d52dd74e759 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128365390)
   * ff400d149ed36c63a90d2f3fcd517bde738e5af1 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128402339)
   




[GitHub] [flink] flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing synchronization in AsyncWaitOperator

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9717: [FLINK-14044] [runtime] Reducing 
synchronization in AsyncWaitOperator
URL: https://github.com/apache/flink/pull/9717#issuecomment-533068819
 
 
   
   ## CI report:
   
   * 9cfda801891969ac460f45ea639d636b519f22db : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128317194)
   * af7f6be848f0bd90a049d5ee9f7a38b1c3e2b972 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128336908)
   * 43157eb165d9409fbdf4a2f773ef7d52dd74e759 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128365390)
   * ff400d149ed36c63a90d2f3fcd517bde738e5af1 : UNKNOWN
   




[jira] [Updated] (FLINK-14107) Kinesis consumer record emitter deadlock under event time alignment

2019-09-19 Thread Thomas Weise (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated FLINK-14107:
-
Fix Version/s: 1.8.3
   1.9.1

> Kinesis consumer record emitter deadlock under event time alignment
> ---
>
> Key: FLINK-14107
> URL: https://issues.apache.org/jira/browse/FLINK-14107
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.8.2, 1.9.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.1, 1.8.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When the emitter reaches the max timestamp for the current queue, it stops 
> emitting and waits for the max timestamp to advance. Since it simultaneously 
> selects the next queue as the new "minimum" queue, it may deadlock if the 
> previous min queue represents the new global lower bound after the max 
> timestamp advanced. This occurs very infrequently, but we were finally able to 
> reproduce it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128375005)
   




[GitHub] [flink] yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: 
Modify the StreamingFileSink Builder interface to allow for easier subclassing 
of StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533209039
 
 
   > Hi @yxu-valleytider , I still have the following concerns that make me 
feel uncomfortable about merging this PR:
   > 
   > 1. The user should know that he has to write his own `forRowFormat` / 
`forBulkFormat` in order for them to return the correct builder, and this can 
only be "implicit" knowledge.
   > 2. The user should override the `build` method of the builder he provides.
   > 3. It may break user-code, if the user was explicitly stating in his 
program the return type of the `forRowFormat` or `forBulkFormat`. This is 
because we add a new type parameter.
   
   @kl0u I can see 3) is more or less a concern. This PR does introduce quite 
a few changes to the interface of the StreamingFileSink -- the original hope 
is that there are not many users of `StreamingFileSink` yet, hence the impact is 
limited. 
   
   > Also some other issues that can be tackled easily though are:
   > 
   > 1. The constructor should not take a `BucketsBuilder` but we have to have 
2 constructors, one for row and one for bulk. If not, then the user may create 
an invalid `StreamingFileSink`.
   > 2. The user cannot change the type of the `BucketID` (although this may 
not be that important).
   
   @aljoscha In terms of taking a step back, it would be something extending [this 
branch](https://github.com/apache/flink/compare/master...kailashhd:FLINK-12539), 
if you are OK with adding a bunch of getter functions to retrieve the 
`private` member fields of the class.  
   
   That would imply a smaller amount of interface changes upstream. The drawback 
is that it still suffers from the issue described in reference 
[[2](https://stackoverflow.com/questions/17164375/subclassing-a-java-builder-class)].
 There exists a (not-so-convenient) workaround: users can override **ALL** the 
`withXXX` functions (that ever need to be used) such that they return the 
corresponding builder of the subclass.
   
   In both approaches, users have to do 1) and 2) above. That is, override the 
`forRowFormat` / `forBulkFormat` and `build` methods.  

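The subclassing problem from reference [2] above is commonly worked around with a self-typed builder: the base builder is parameterized on the concrete builder type, so every inherited `withXXX` already returns the subclass. A minimal sketch with hypothetical names (not the actual StreamingFileSink API):

```java
// Base builder parameterized on the concrete builder type ("curiously
// recurring" generics), so chained withXXX calls keep the subclass type.
class BaseBuilder<T extends BaseBuilder<T>> {
    long rolloverInterval = 60_000L; // hypothetical setting

    @SuppressWarnings("unchecked")
    T self() {
        return (T) this;
    }

    T withRolloverInterval(long intervalMs) {
        this.rolloverInterval = intervalMs;
        return self();
    }
}

// A subclass only adds its own withXXX methods; inherited setters already
// return CustomBuilder, so there is no need to override every one of them.
class CustomBuilder extends BaseBuilder<CustomBuilder> {
    boolean metricsEnabled;

    CustomBuilder withMetrics(boolean enabled) {
        this.metricsEnabled = enabled;
        return self();
    }
}
```

With this shape, `new CustomBuilder().withRolloverInterval(1000).withMetrics(true)` compiles, which is exactly what breaks when base setters return the base type; the cost is the extra type parameter, i.e. concern 3) discussed in the thread.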



[GitHub] [flink] yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: 
Modify the StreamingFileSink Builder interface to allow for easier subclassing 
of StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533209039
 
 
   > Hi @yxu-valleytider , I still have the following concerns that make me 
feel uncomfortable about merging this PR:
   > 
   > 1. The user should know that he has to write his own `forRowFormat` / 
`forBulkFormat` in order for them to return the correct builder, and this can 
only be "implicit" knowledge.
   > 2. The user should override the `build` method of the builder he provides.
   > 3. It may break user-code, if the user was explicitly stating in his 
program the return type of the `forRowFormat` or `forBulkFormat`. This is 
because we add a new type parameter.
   
   @kl0u I can see 3) is more or less a concern. This PR does introduce quite 
a few changes to the interface of the StreamingFileSink -- the original hope 
is that there are not many users of `StreamingFileSink` yet, hence the impact is 
limited. 
   
   > Also some other issues that can be tackled easily though are:
   > 
   > 1. The constructor should not take a `BucketsBuilder` but we have to have 
2 constructors, one for row and one for bulk. If not, then the user may create 
an invalid `StreamingFileSink`.
   > 2. The user cannot change the type of the `BucketID` (although this may 
not be that important).
   
   @aljoscha In terms of taking a step back, it would be something extending [this 
branch](https://github.com/apache/flink/compare/master...kailashhd:FLINK-12539), 
if you are OK with adding a bunch of getter functions to retrieve the 
`private` member fields of the class.  
   
   That would imply a smaller amount of interface changes upstream. The drawback 
is that it still suffers from the issue described in reference 
[[2](https://stackoverflow.com/questions/17164375/subclassing-a-java-builder-class)].
 The workaround is for users to override **ALL** the `withXXX` functions (that ever 
need to be used) such that they return the corresponding builder of the 
subclass. 
   
   In both approaches, users have to do 1) and 2) above. That is, override the 
`forRowFormat` / `forBulkFormat` and `build` methods.  




[GitHub] [flink] flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge NewClusterClient into ClusterClient

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge 
NewClusterClient into ClusterClient
URL: https://github.com/apache/flink/pull/9701#issuecomment-532200151
 
 
   
   ## CI report:
   
   * af3e828fe1f134c22c7512565b1a3ba8281c1523 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/127998075)
   * 4a834198a3c8fc5b4d1a66501047918822f708bb : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128000763)
   * 687341a53c89c94b46e7221394a591acd4769e30 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128371824)
   




[GitHub] [flink] yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: 
Modify the StreamingFileSink Builder interface to allow for easier subclassing 
of StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533209039
 
 
   > Hi @yxu-valleytider , I still have the following concerns that make me 
feel uncomfortable about merging this PR:
   > 
   > 1. The user should know that he has to write his own `forRowFormat` / 
`forBulkFormat` in order for them to return the correct builder, and this can 
only be "implicit" knowledge.
   > 2. The user should override the `build` method of the builder he provides.
   > 3. It may break user-code, if the user was explicitly stating in his 
program the return type of the `forRowFormat` or `forBulkFormat`. This is 
because we add a new type parameter.
   
   @kl0u I can see 3) is more or less a concern. This PR does introduce quite 
a few changes to the interface of the StreamingFileSink -- the original hope 
is that there are not many users of `StreamingFileSink` yet, hence the impact is 
limited. 
   
   > Also some other issues that can be tackled easily though are:
   > 
   > 1. The constructor should not take a `BucketsBuilder` but we have to have 
2 constructors, one for row and one for bulk. If not, then the user may create 
an invalid `StreamingFileSink`.
   > 2. The user cannot change the type of the `BucketID` (although this may 
not be that important).
   
   @aljoscha In terms of taking a step back, it would be something extending [this 
branch](https://github.com/apache/flink/compare/master...kailashhd:FLINK-12539), 
if you are OK with adding a bunch of getter functions to retrieve the 
`private` member fields of the class.  
   
   That would imply a smaller amount of interface changes upstream. The drawback 
is that it still suffers from the issue described in reference 
[[2](https://stackoverflow.com/questions/17164375/subclassing-a-java-builder-class)].
 The workaround is for users to override **ALL** the `withXXX` functions (that ever 
need to be used) such that they return the corresponding builder of the 
subclass. 




[jira] [Commented] (FLINK-14124) potential memory leak in netty server

2019-09-19 Thread zhijiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933568#comment-16933568
 ] 

zhijiang commented on FLINK-14124:
--

I think it would be better for you to upgrade the Flink version. 1.4.2 is quite old, 
and there have been many improvements to the network stack since 1.5. Especially 
on the netty server side, we now avoid copying data from Flink buffers to netty 
ByteBuffers, which saves some of the direct memory used by the netty server.

So my suggestion is to upgrade to the latest Flink version if possible, to verify 
whether this issue still exists.

> potential memory leak in netty server
> -
>
> Key: FLINK-14124
> URL: https://issues.apache.org/jira/browse/FLINK-14124
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.6.3
>Reporter: YufeiLiu
>Priority: Critical
> Attachments: image-2019-09-19-15-53-32-294.png, screenshot-1.png, 
> screenshot-2.png
>
>
> I have a job running on Flink 1.4.2; the end of the pipeline uses the Phoenix 
> JDBC driver to write records into Apache Phoenix.
> _mqStream
> .keyBy(0)
> .window(TumblingProcessingTimeWindows.of(Time.of(300, 
> TimeUnit.SECONDS)))
> .process(new MyProcessWindowFunction())
> .addSink(new PhoenixSinkFunction());_
> But the off-heap memory of the TaskManager running the sink subtask keeps 
> increasing; more precisely, it might be caused by DirectByteBuffer.
> I analyzed a heap dump and found hundreds of DirectByteBuffer objects, each 
> referencing over 3 MB of memory, and all of them link to a Flink Netty 
> Server thread.
>  !image-2019-09-19-15-53-32-294.png! 
> It only happens in the sink task; other nodes work fine. I thought it was a 
> problem with Phoenix at first, but the heap dump shows the memory is consumed by 
> netty. I don't know much about Flink's network stack, so I would appreciate it 
> if someone could tell me the likely cause or how to dig further.
>  !screenshot-1.png! 
> yarn.heap-cutoff-ratio: 0.2
> taskmanager.memory.fraction: 0.6
> taskmanager.network.numberOfBuffers: 32240
>  !screenshot-2.png! 
> I have Zookeeper, Kafka, Phoenix (HBase), and Flume dependencies in the package; 
> they all might use direct memory, but when direct memory gets freed, is there 
> something blocking the Cleaner's progress?





[GitHub] [flink] yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider edited a comment on issue #9581: [FLINK-13864][streaming]: 
Modify the StreamingFileSink Builder interface to allow for easier subclassing 
of StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533209039
 
 
   > Hi @yxu-valleytider , I still have the following concerns that make me 
feel uncomfortable about merging this PR:
   > 
   > 1. The user should know that he has to write his own `forRowFormat` / 
`forBulkFormat` in order for them to return the correct builder, and this can 
only be "implicit" knowledge.
   > 2. The user should override the `build` method of the builder he provides.
   > 3. It may break user-code, if the user was explicitly stating in his 
program the return type of the `forRowFormat` or `forBulkFormat`. This is 
because we add a new type parameter.
   
   @kl0u I can see 3) is more or less a concern. This PR does introduce quite 
a few changes to the interface of the StreamingFileSink -- the original hope 
is that there are not many users of `StreamingFileSink` yet, hence the impact is 
limited. 
   
   > Also some other issues that can be tackled easily though are:
   > 
   > 1. The constructor should not take a `BucketsBuilder` but we have to have 
2 constructors, one for row and one for bulk. If not, then the user may create 
an invalid `StreamingFileSink`.
   > 2. The user cannot change the type of the `BucketID` (although this may 
not be that important).
   
   @aljoscha In terms of taking a step back, it would be something extending [this 
branch](https://github.com/apache/flink/compare/master...kailashhd:FLINK-12539), 
if you are OK with adding a bunch of getter functions to retrieve the 
`private` member fields of the class.  
   
   That would imply a smaller amount of interface changes upstream. The drawback 
is that it still suffers from the issue described in reference 
[[2](https://stackoverflow.com/questions/17164375/subclassing-a-java-builder-class)].
 The workaround is for users to override *ALL* the `withXXX` functions (that ever 
need to be used) such that they return the corresponding builder of the 
subclass. 




[GitHub] [flink] flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the description of customized restart strategy configuration

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9722: [FLINK-14128] [runtime] Remove the 
description of customized restart strategy configuration
URL: https://github.com/apache/flink/pull/9722#issuecomment-533180619
 
 
   
   ## CI report:
   
   * 29da7100a0af920b19723543fe4517dd0d95681e : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128368730)
   




[GitHub] [flink] yxu-valleytider commented on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider commented on issue #9581: [FLINK-13864][streaming]: Modify the 
StreamingFileSink Builder interface to allow for easier subclassing of 
StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533209039
 
 
   > Hi @yxu-valleytider , I still have the following concerns that make me 
feel uncomfortable about merging this PR:
   > 
   > 1. The user should know that he has to write his own `forRowFormat` / 
`forBulkFormat` in order for them to return the correct builder, and this can 
only be "implicit" knowledge.
   > 2. The user should override the `build` method of the builder he provides.
   > 3. It may break user-code, if the user was explicitly stating in his 
program the return type of the `forRowFormat` or `forBulkFormat`. This is 
because we add a new type parameter.
   
   @kl0u I can see 3) is more or less a concern. This PR does introduce quite 
a few changes to the interface of the StreamingFileSink -- the original hope 
is that there are not many users of `StreamingFileSink` yet, hence the impact is 
limited. 
   
   > Also some other issues that can be tackled easily though are:
   > 
   > 1. The constructor should not take a `BucketsBuilder` but we have to have 
2 constructors, one for row and one for bulk. If not, then the user may create 
an invalid `StreamingFileSink`.
   > 2. The user cannot change the type of the `BucketID` (although this may 
not be that important).
   
   @aljoscha In terms of taking a step back, it would be something extending [this 
branch](https://github.com/apache/flink/compare/master...kailashhd:FLINK-12539), 
if you are OK with adding a bunch of getter functions to retrieve the 
`private` member fields of the class. 




[jira] [Commented] (FLINK-14117) Translate changes on documentation index page to Chinese

2019-09-19 Thread gaofeilong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933551#comment-16933551
 ] 

gaofeilong commented on FLINK-14117:


understood, I'm new, thanks for explaination

> Translate changes on documentation index page to Chinese
> 
>
> Key: FLINK-14117
> URL: https://issues.apache.org/jira/browse/FLINK-14117
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Affects Versions: 1.10.0
>Reporter: Fabian Hueske
>Priority: Major
>
> The changes of commit 
> [ee0d6fdf0604d74bd1cf9a6eb9cf5338ac1aa4f9|https://github.com/apache/flink/commit/ee0d6fdf0604d74bd1cf9a6eb9cf5338ac1aa4f9#diff-1a523bd9fa0dbf998008b37579210e12]
>  on the documentation index page should be translated to Chinese.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-14117) Translate changes on documentation index page to Chinese

2019-09-19 Thread gaofeilong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933551#comment-16933551
 ] 

gaofeilong edited comment on FLINK-14117 at 9/19/19 4:19 PM:
-

understood, I'm new, thanks for the explanation


was (Author: gaofeilong):
understood, I'm new, thanks for explaination

> Translate changes on documentation index page to Chinese
> 
>
> Key: FLINK-14117
> URL: https://issues.apache.org/jira/browse/FLINK-14117
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Affects Versions: 1.10.0
>Reporter: Fabian Hueske
>Priority: Major
>
> The changes of commit 
> [ee0d6fdf0604d74bd1cf9a6eb9cf5338ac1aa4f9|https://github.com/apache/flink/commit/ee0d6fdf0604d74bd1cf9a6eb9cf5338ac1aa4f9#diff-1a523bd9fa0dbf998008b37579210e12]
>  on the documentation index page should be translated to Chinese.





[GitHub] [flink] yxu-valleytider commented on issue #9581: [FLINK-13864][streaming]: Modify the StreamingFileSink Builder interface to allow for easier subclassing of StreamingFileSink

2019-09-19 Thread GitBox
yxu-valleytider commented on issue #9581: [FLINK-13864][streaming]: Modify the 
StreamingFileSink Builder interface to allow for easier subclassing of 
StreamingFileSink 
URL: https://github.com/apache/flink/pull/9581#issuecomment-533203569
 
 
   @aljoscha The metrics collection needs to be done inside 
`notifyCheckpointComplete()` and `snapshotState()`. In addition, we may add 
customized logic into these two functions as well. Some of the logic is 
specific to our use cases and is probably better kept inside our own 
customized `StreamingFileSink`. I'm following the cited articles' suggestion 
to make the builder pattern extensible to subclasses.  
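As a hedged sketch of that extensibility idea (hypothetical names, not the real `StreamingFileSink` builders), the usual approach is a self-typed generic builder whose chaining methods return the subclass type, with `protected` getters exposing otherwise-`private` base state to subclasses:

```java
// Hypothetical sketch of a subclass-friendly builder; names are illustrative,
// not the actual StreamingFileSink.RowFormatBuilder API.
class BaseBuilder<T extends BaseBuilder<T>> {

    private long rolloverIntervalMillis = 60_000L;

    @SuppressWarnings("unchecked")
    protected T self() {
        return (T) this; // safe by convention: T is the concrete subclass
    }

    public T withRolloverInterval(long millis) {
        this.rolloverIntervalMillis = millis;
        return self(); // returns the subclass type, so chaining keeps working
    }

    // getter so subclasses can read otherwise-private base state
    protected long getRolloverInterval() {
        return rolloverIntervalMillis;
    }
}

class MetricsBuilder extends BaseBuilder<MetricsBuilder> {

    private String metricsGroup = "default";

    public MetricsBuilder withMetricsGroup(String group) {
        this.metricsGroup = group;
        return this;
    }

    public String build() {
        // subclass combines its own state with the base builder's state
        return metricsGroup + ":" + getRolloverInterval();
    }
}
```

Without the `self()` trick, calling a base-class setter first would return `BaseBuilder` and break the chain before the subclass-specific methods can be called.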
   
   
   
   





[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128375005)
   




[jira] [Assigned] (FLINK-14096) Merge NewClusterClient into ClusterClient

2019-09-19 Thread TisonKun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun reassigned FLINK-14096:


Assignee: TisonKun

> Merge NewClusterClient into ClusterClient
> -
>
> Key: FLINK-14096
> URL: https://issues.apache.org/jira/browse/FLINK-14096
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: TisonKun
>Assignee: TisonKun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the effort under FLINK-10392 we don't need the bridge class 
> {{NewClusterClient}} any more. We can just merge {{NewClusterClient}} into 
> {{ClusterClient}} towards an interface-ized {{ClusterClient}}.





[jira] [Assigned] (FLINK-14113) Remove class JobWithJars

2019-09-19 Thread TisonKun (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TisonKun reassigned FLINK-14113:


Assignee: TisonKun

> Remove class JobWithJars
> 
>
> Key: FLINK-14113
> URL: https://issues.apache.org/jira/browse/FLINK-14113
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: TisonKun
>Assignee: TisonKun
>Priority: Major
> Fix For: 1.10.0
>
>
> {{JobWithJars}} is a batch-only concept: a POJO consisting of a {{Plan}} and 
> the {{URL}}s of its libs. We can
> 1. inline the usage of {{Plan}} and {{URL}}s as we do in the streaming case;
> 2. extract the static methods into a utility class, say {{ClientUtils}}.
> The main purpose here is to remove a batch-specific concept that doesn't 
> bring much good.





[GitHub] [flink] flinkbot edited a comment on issue #9721: [FLINK-14129][hive] HiveTableSource should implement ProjectableTable…

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9721: [FLINK-14129][hive] HiveTableSource 
should implement ProjectableTable…
URL: https://github.com/apache/flink/pull/9721#issuecomment-533163569
 
 
   
   ## CI report:
   
   * ffcb98bf85b81ec59ce7ea0fa021076afdcd40d4 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/128356643)
   




[GitHub] [flink] flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x support

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9720: [FLINK-13025] Elasticsearch 7.x 
support
URL: https://github.com/apache/flink/pull/9720#issuecomment-533110106
 
 
   
   ## CI report:
   
   * d9c1dd529ef235649909d067cc78099179656e62 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128334151)
   * 603db694488096f1491b5ccb068d9e783636a8c8 : UNKNOWN
   




[GitHub] [flink] flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge NewClusterClient into ClusterClient

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge 
NewClusterClient into ClusterClient
URL: https://github.com/apache/flink/pull/9701#issuecomment-532200151
 
 
   
   ## CI report:
   
   * af3e828fe1f134c22c7512565b1a3ba8281c1523 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/127998075)
   * 4a834198a3c8fc5b4d1a66501047918822f708bb : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/128000763)
   * 687341a53c89c94b46e7221394a591acd4769e30 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128371824)
   




[GitHub] [flink] zhijiangW commented on issue #9478: [FLINK-13766][task] Refactor the implementation of StreamInputProcessor based on StreamTaskInput#emitNext

2019-09-19 Thread GitBox
zhijiangW commented on issue #9478: [FLINK-13766][task] Refactor the 
implementation of StreamInputProcessor based on StreamTaskInput#emitNext
URL: https://github.com/apache/flink/pull/9478#issuecomment-533194969
 
 
   Yes, it is better to merge this PR soon if there are no other concerns, 
because two additional PRs, #9483 and #9646, rely on it, and it actually 
wastes some effort to keep rebasing them on top of this PR every time.




[GitHub] [flink] zhuzhurk commented on a change in pull request #9663: [WIP][FLINK-12433][runtime] Implement DefaultScheduler stub

2019-09-19 Thread GitBox
zhuzhurk commented on a change in pull request #9663: 
[WIP][FLINK-12433][runtime] Implement DefaultScheduler stub
URL: https://github.com/apache/flink/pull/9663#discussion_r326241629
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultSchedulerFactory.java
 ##
 @@ -66,14 +77,30 @@ public SchedulerNG createInstance(
jobMasterConfiguration,
slotProvider,
futureExecutor,
+   new ScheduledExecutorServiceAdapter(futureExecutor),
userCodeLoader,
checkpointRecoveryFactory,
rpcTimeout,
blobWriter,
jobManagerJobMetricGroup,
slotRequestTimeout,
shuffleMaster,
-   partitionTracker);
+   partitionTracker,
+   schedulingStrategyFactory,
+   new RestartPipelinedRegionStrategy.Factory(),
 
 Review comment:
   I opened a ticket FLINK-14131 for this. We can do it after this PR is done.
   
   And we need to consider whether to respect the existing 
"jobmanager.execution.failover-strategy" config then.




[GitHub] [flink] flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge NewClusterClient into ClusterClient

2019-09-19 Thread GitBox
flinkbot edited a comment on issue #9701: [FLINK-14096][client] Merge 
NewClusterClient into ClusterClient
URL: https://github.com/apache/flink/pull/9701#issuecomment-532200151
 
 
   
   ## CI report:
   
   * af3e828fe1f134c22c7512565b1a3ba8281c1523 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/127998075)
   * 4a834198a3c8fc5b4d1a66501047918822f708bb : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/128000763)
   * 687341a53c89c94b46e7221394a591acd4769e30 : UNKNOWN
   



