[GitHub] Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model

2019-01-21 Thread GitBox
Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with 
CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456022735
 
 
   Thanks for your reply, I will take a look at that PR in the next couple of days.




[GitHub] XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267
 
 
   




[GitHub] XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593
 
 
   no need




[GitHub] clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive GenericUDF

2019-01-21 Thread GitBox
clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive 
GenericUDF
URL: https://github.com/apache/flink/pull/3473
 
 
   




[jira] [Updated] (FLINK-7015) Separate OperatorConfig from StreamConfig

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7015:
--
Labels: pull-request-available  (was: )

> Separate OperatorConfig from StreamConfig
> -
>
> Key: FLINK-7015
> URL: https://issues.apache.org/jira/browse/FLINK-7015
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
>  Motivation:
> A Task contains one or more operators joined by chaining; however, the
> configs of the operators and the task are all put in StreamConfig. For example, when an
> operator sets up with the StreamConfig, it can see interfaces about
> physicalEdges or chained.task.configs that are confusing. Similarly, a
> streamTask should not see the interface about chain.index.
>  So we need to separate OperatorConfig from StreamConfig. A
> streamTask builds its execution environment with the streamConfig, extracts
> operatorConfigs from it, and then builds a streamOperator from each
> operatorConfig.
>
>    OperatorConfig: for the streamOperator to set up with; it contains
> only information that belongs to the streamOperator:
>    1) operator information: name, id
>    2) serialized StreamOperator
>    3) input serializer
>    4) output edges and serializers
>    5) chain.index
>    6) state.key.serializer
>  StreamConfig: for the streamTask to use:
>    1) in.physical.edges
>    2) out.physical.edges
>    3) chained OperatorConfigs
>    4) execution environment: checkpoint, state.backend and so on...
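
For illustration, a minimal sketch of the proposed split; the classes below are hypothetical and simplified (field types are placeholders, not Flink's real ones):

{code:java}
import java.util.List;

// Hypothetical sketch of the proposed OperatorConfig/StreamConfig split;
// names follow the issue text, types are simplified placeholders.
class OperatorConfig {
    String operatorName;                    // 1) operator information: name, id
    int operatorId;
    byte[] serializedStreamOperator;        // 2) serialized StreamOperator
    byte[] inputSerializer;                 // 3) input serializer
    List<byte[]> outputEdgesAndSerializers; // 4) output edges and serializers
    int chainIndex;                         // 5) chain.index
    byte[] stateKeySerializer;              // 6) state.key.serializer
}

class StreamConfig {
    List<Integer> inPhysicalEdges;               // 1) in.physical.edges
    List<Integer> outPhysicalEdges;              // 2) out.physical.edges
    List<OperatorConfig> chainedOperatorConfigs; // 3) chained OperatorConfigs
    // 4) execution environment: checkpoint settings, state backend, ...
}
{code}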





[GitHub] XuPingyong closed pull request #4273: [FLINK-7065] Separate the flink-streaming-java module from flink-clients

2019-01-21 Thread GitBox
XuPingyong closed pull request #4273: [FLINK-7065] Separate the 
flink-streaming-java module from flink-clients
URL: https://github.com/apache/flink/pull/4273
 
 
   




[GitHub] XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate OperatorConfig from StreamConfig

2019-01-21 Thread GitBox
XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate 
OperatorConfig from StreamConfig
URL: https://github.com/apache/flink/pull/4241
 
 
   




[GitHub] XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593
 
 
   no need




[GitHub] XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267
 
 
   




[jira] [Updated] (FLINK-5833) Support for Hive GenericUDF

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5833:
--
Labels: pull-request-available  (was: )

> Support for Hive GenericUDF
> ---
>
> Key: FLINK-5833
> URL: https://issues.apache.org/jira/browse/FLINK-5833
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>Priority: Major
>  Labels: pull-request-available
>
> The second step of FLINK-5802 is to support Hive's GenericUDF.





[jira] [Updated] (FLINK-7065) separate the flink-streaming-java module from flink-clients

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7065:
--
Labels: pull-request-available  (was: )

> separate the flink-streaming-java module from flink-clients 
> 
>
> Key: FLINK-7065
> URL: https://issues.apache.org/jira/browse/FLINK-7065
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
> Motivation:
>  It is not good that the "flink-streaming-java" module depends on
> "flink-clients"; rather, "flink-clients" should see what it needs in "flink-streaming-java".
> Related Changes:
>   1. LocalStreamEnvironment and RemoteStreamEnvironment can also execute
> a job via the executors (LocalExecutor and RemoteExecutor). Introduce
> StreamGraphExecutor, which executes a streamGraph just as PlanExecutor executes
> the plan. StreamGraphExecutor and PlanExecutor both extend Executor.
>   2. Introduce StreamExecutionEnvironmentFactory, which works similarly
> to ContextEnvironmentFactory in flink-clients.
>   When an object of ContextEnvironmentFactory,
> OptimizerPlanEnvironmentFactory or PreviewPlanEnvironmentFactory is set into
> ExecutionEnvironment (by calling initializeContextEnvironment), the relevant
> StreamEnvFactory is also set into StreamExecutionEnvironment. The same applies
> when calling unsetContext.
>
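
A minimal sketch of the described executor hierarchy; the interface shapes are assumptions based on the issue text, not Flink's actual API:

{code:java}
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.Plan;
import org.apache.flink.streaming.api.graph.StreamGraph;

// Sketch only: StreamGraphExecutor executes a StreamGraph just as
// PlanExecutor executes a Plan; both extend a common Executor.
interface Executor {
    void start() throws Exception;
    void stop() throws Exception;
}

interface PlanExecutor extends Executor {
    JobExecutionResult executePlan(Plan plan) throws Exception;
}

interface StreamGraphExecutor extends Executor {
    JobExecutionResult executeStreamGraph(StreamGraph streamGraph) throws Exception;
}
{code}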





[GitHub] godfreyhe closed pull request #4572: [Flink-7243] [connectors] Add ParquetInputFormat

2019-01-21 Thread GitBox
godfreyhe closed pull request #4572: [Flink-7243] [connectors] Add 
ParquetInputFormat 
URL: https://github.com/apache/flink/pull/4572
 
 
   




[GitHub] godfreyhe closed pull request #3818: [FLINK-5514] [table] Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS

2019-01-21 Thread GitBox
godfreyhe closed pull request #3818: [FLINK-5514] [table] Implement an 
efficient physical execution for CUBE/ROLLUP/GROUPING SETS
URL: https://github.com/apache/flink/pull/3818
 
 
   




[jira] [Updated] (FLINK-5514) Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5514:
--
Labels: pull-request-available  (was: )

> Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS
> ---
>
> Key: FLINK-5514
> URL: https://issues.apache.org/jira/browse/FLINK-5514
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> A first support for GROUPING SETS has been added in FLINK-5303. However, the 
> current runtime implementation is not very efficient, as it basically only 
> translates logical operators to physical operators, i.e., grouping sets are 
> currently only translated into multiple groupings that are unioned together. 
> A rough design document for this has been created in FLINK-2980.
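
To make the current translation concrete, here is a hedged example with a made-up table {{t}} and columns {{a}}, {{b}}, {{c}}: the grouping-sets query is effectively executed as the UNION ALL below it.

{code:java}
// Made-up table and columns, for illustration only.
public class GroupingSetsRewriteExample {
    public static void main(String[] args) {
        String groupingSets =
            "SELECT a, b, SUM(c) FROM t GROUP BY GROUPING SETS ((a), (b))";

        // What the current implementation effectively executes:
        String unionRewrite =
            "SELECT a, NULL AS b, SUM(c) FROM t GROUP BY a " +
            "UNION ALL " +
            "SELECT NULL AS a, b, SUM(c) FROM t GROUP BY b";

        System.out.println(groupingSets);
        System.out.println(unionRewrite);
    }
}
{code}

Each branch of the union scans and aggregates the input independently, which is why the current plan is inefficient once the number of grouping sets grows.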






[GitHub] godfreyhe closed pull request #4667: [FLINK-5859] [table] Add PartitionableTableSource for partition pruning

2019-01-21 Thread GitBox
godfreyhe closed pull request #4667: [FLINK-5859] [table] Add 
PartitionableTableSource for partition pruning
URL: https://github.com/apache/flink/pull/4667
 
 
   




[jira] [Updated] (FLINK-7620) Supports user defined optimization phase

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7620:
--
Labels: pull-request-available  (was: )

> Supports user defined optimization phase
> 
>
> Key: FLINK-7620
> URL: https://issues.apache.org/jira/browse/FLINK-7620
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the optimization phases are hardcoded in {{BatchTableEnvironment}} 
> and {{StreamTableEnvironment}}. It would be better if users could define the 
> optimization phases, and the rules in each phase, as needed.
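
A hypothetical sketch of what user-defined phases could look like; neither interface exists in Flink, and the shapes are assumptions based on the issue text:

{code:java}
import java.util.List;

import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.rel.RelNode;

// Hypothetical: users assemble named phases, each with its own rule set,
// instead of relying on phases hardcoded in the TableEnvironments.
interface OptimizationPhase {
    String name();
    List<RelOptRule> rules();        // rules applied in this phase
    RelNode optimize(RelNode input); // runs the phase on a plan
}

interface OptimizationProgram {
    List<OptimizationPhase> phases(); // executed in order, user-configurable
}
{code}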





[jira] [Updated] (FLINK-7018) Refactor streamGraph to make interfaces clear

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7018:
--
Labels: pull-request-available  (was: )

> Refactor streamGraph to make interfaces clear
> -
>
> Key: FLINK-7018
> URL: https://issues.apache.org/jira/browse/FLINK-7018
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
> Motivation:
>    1. StreamGraph is a graph consisting of streamNodes, so virtual 
> nodes (such as select, sideOutput, partition) should be moved out of it. 
> The main interfaces of StreamGraph should be as follows:
> addSource(StreamNode sourceNode)
> addSink(StreamNode sinkNode)
> addOperator(StreamNode streamNode)
> addEdge(Integer upStreamVertexID, Integer downStreamVertexID, 
> StreamEdge.InputOrder inputOrder, StreamPartitioner partitioner, 
> List<String> outputNames, OutputTag outputTag)
> getJobGraph()
>  2. StreamExecutionEnvironment should not be in StreamGraph; I created 
> StreamGraphProperties, which extracts from StreamExecutionEnvironment all the 
> env information the streamGraph needs. It contains:
> 1) executionConfig
> 2) checkpointConfig
> 3) timeCharacteristic
> 4) stateBackend
> 5) chainingEnabled
> 6) cachedFiles
> 7) jobName
> 
> Related Changes:
>    I moved the handling of virtual nodes to 
> StreamGraphGenerator, and StreamGraph now gets its properties from 
> StreamGraphProperties instead of StreamExecutionEnvironment.
>  
>   It is only an internal code abstraction.
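
A minimal sketch of the StreamGraphProperties holder named above, assuming Flink's existing config types; the class itself is hypothetical and the cachedFiles type is simplified:

{code:java}
import java.util.List;

import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.runtime.state.StateBackend;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.CheckpointConfig;

// Hypothetical holder for the env information listed above.
class StreamGraphProperties {
    ExecutionConfig executionConfig;       // 1) executionConfig
    CheckpointConfig checkpointConfig;     // 2) checkpointConfig
    TimeCharacteristic timeCharacteristic; // 3) timeCharacteristic
    StateBackend stateBackend;             // 4) stateBackend
    boolean chainingEnabled;               // 5) chainingEnabled
    List<String> cachedFiles;              // 6) cachedFiles (simplified type)
    String jobName;                        // 7) jobName
}
{code}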





[jira] [Updated] (FLINK-6516) using real row count instead of dummy row count when optimizing plan

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6516:
--
Labels: pull-request-available  (was: )

> using real row count instead of dummy row count when optimizing plan
> 
>
> Key: FLINK-6516
> URL: https://issues.apache.org/jira/browse/FLINK-6516
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the statistic of {{TableSourceTable}} is mostly {{UNKNOWN}}, and 
> the statistic from {{ExternalCatalog}} may also be null. Actually, only 
> each {{TableSource}} knows its statistics exactly, especially 
> {{FilterableTableSource}} and {{PartitionableTableSource}}. So we can add a 
> {{getTableStats}} method to {{TableSource}} and use it in TableSourceScan's 
> estimateRowCount method to get the real row count.
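
A hedged sketch of the proposed addition; the signatures below are illustrative, not the actual Flink interfaces:

{code:java}
// Illustrative only: TableSource reports its own statistics so that
// TableSourceScan#estimateRowCount can use a real row count.
interface TableStats {
    long rowCount();
}

interface TableSource<T> {
    // ... existing methods elided ...

    /** Returns the table's statistics, or null if unknown. */
    TableStats getTableStats();
}
{code}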





[jira] [Commented] (FLINK-11046) ElasticSearch6Connector cause thread blocked when index failed with retry

2019-01-21 Thread Dawid Wysakowicz (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747810#comment-16747810
 ] 

Dawid Wysakowicz commented on FLINK-11046:
--

Hi [~xueyu], did you manage to make any progress on this issue?

> ElasticSearch6Connector cause thread blocked when index failed with retry
> -
>
> Key: FLINK-11046
> URL: https://issues.apache.org/jira/browse/FLINK-11046
> Project: Flink
>  Issue Type: Bug
>  Components: ElasticSearch Connector
>Affects Versions: 1.6.2
>Reporter: luoguohao
>Assignee: xueyu
>Priority: Major
>
> When I'm using the es6 sink to index into ES and the bulk process catches an 
> exception, I try to reindex the document by calling 
> `indexer.add(action)` in the `ActionRequestFailureHandler.onFailure()` 
> method, but things go wrong: the calling thread gets stuck there, and in the 
> thread dump I saw that the `bulkprocessor` object was locked by another thread. 
> {code:java}
> public interface ActionRequestFailureHandler extends Serializable {
>  void onFailure(ActionRequest action, Throwable failure, int restStatusCode, 
> RequestIndexer indexer) throws Throwable;
> }
> {code}
> After reading the code behind `indexer.add(action)`, I found that 
> each add operation is `synchronized`.
> {code:java}
> private synchronized void internalAdd(DocWriteRequest request, @Nullable 
> Object payload) {
>   ensureOpen();
>   bulkRequest.add(request, payload);
>   executeIfNeeded();
> }
> {code}
> I also noticed that the `bulkprocessor` object is locked in the 
> bulk-process thread as well. 
> The bulk process operation is in the following code:
> {code:java}
> public void execute(BulkRequest bulkRequest, long executionId) {
> Runnable toRelease = () -> {};
> boolean bulkRequestSetupSuccessful = false;
> try {
> listener.beforeBulk(executionId, bulkRequest);
> semaphore.acquire();
> toRelease = semaphore::release;
> CountDownLatch latch = new CountDownLatch(1);
> retry.withBackoff(consumer, bulkRequest, new ActionListener<BulkResponse>() {
> @Override
> public void onResponse(BulkResponse response) {
> try {
> listener.afterBulk(executionId, bulkRequest, response);
> } finally {
> semaphore.release();
> latch.countDown();
> }
> }
> @Override
> public void onFailure(Exception e) {
> try {
> listener.afterBulk(executionId, bulkRequest, e);
> } finally {
> semaphore.release();
> latch.countDown();
> }
> }
> }, Settings.EMPTY);
> bulkRequestSetupSuccessful = true;
>if (concurrentRequests == 0) {
>latch.await();
> }
> } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> logger.info(() -> new ParameterizedMessage("Bulk request {} has been 
> cancelled.", executionId), e);
> listener.afterBulk(executionId, bulkRequest, e);
> } catch (Exception e) {
> logger.warn(() -> new ParameterizedMessage("Failed to execute bulk 
> request {}.", executionId), e);
> listener.afterBulk(executionId, bulkRequest, e);
> } finally {
> if (bulkRequestSetupSuccessful == false) {  // if we fail on 
> client.bulk() release the semaphore
> toRelease.run();
> }
> }
> }
> {code}
> As the line I marked above shows, I think that's the reason why the retry 
> operation thread was blocked: the bulk process thread never releases 
> the lock on `bulkprocessor`. I also tried to figure out why the field 
> `concurrentRequests` was set to zero, and I saw the initialization of the 
> bulkprocessor in class `ElasticsearchSinkBase`:
> {code:java}
> protected BulkProcessor buildBulkProcessor(BulkProcessor.Listener listener) {
>  ...
>  BulkProcessor.Builder bulkProcessorBuilder =  
> callBridge.createBulkProcessorBuilder(client, listener);
>  // This makes flush() blocking
>  bulkProcessorBuilder.setConcurrentRequests(0);
>  
>  ...
>  return bulkProcessorBuilder.build();
> }
> {code}
>  This field value is set to zero explicitly. So everything seems to make 
> sense, but I still wondered why the retry operation is not in the same thread 
> as the bulk process execution; after reading the code, the `bulkAsync` method 
> might be the last piece of the puzzle.
> {code:java}
> @Override
> public BulkProcessor.Builder createBulkProcessorBuilder(RestHighLevelClient 
> client, BulkProcessor.Listener listener) {
>  return BulkProcessor.builder(client::bulkAsync, listener);
> }
> 
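
To make the locking interaction concrete, here is a minimal, self-contained illustration of the described deadlock pattern; this is not Flink or Elasticsearch code, just the same shape: a synchronized flush that waits while holding the object's monitor, and a failure callback that needs that monitor to re-add the request.

{code:java}
import java.util.concurrent.CountDownLatch;

// Minimal illustration (NOT Flink/ES code) of the deadlock described above.
class MiniBulkProcessor {
    private final CountDownLatch inFlight = new CountDownLatch(1);

    synchronized void add(Object request) {
        // The failure handler calls this from the callback thread; it blocks
        // forever because flush() below still holds this object's monitor.
    }

    synchronized void flush() throws InterruptedException {
        // concurrentRequests == 0 means: wait for the bulk callback while the
        // monitor is held; if that callback calls add(), neither can proceed.
        inFlight.await();
    }
}
{code}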

[jira] [Commented] (FLINK-11046) ElasticSearch6Connector cause thread blocked when index failed with retry

2019-01-21 Thread xueyu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747831#comment-16747831
 ] 

xueyu commented on FLINK-11046:
---

Hi, [~dawidwys], sorry, I have only written the code and haven't written any unit 
or end-to-end tests yet... The code is on [my 
branch|https://github.com/xueyumusic/flink/tree/es6-onfailure].

Given my current workload, I estimate I still need about two weeks 
on my side... If this issue is urgent, please feel free to take it over.
Thank you, and sorry for the delay, [~dawidwys]


[GitHub] zhangxinyu1 commented on issue #6770: [FLINK-10002] [Webfrontend] WebUI shows jm/tm logs more friendly.

2019-01-21 Thread GitBox
zhangxinyu1 commented on issue #6770:  [FLINK-10002] [Webfrontend] WebUI shows 
jm/tm logs more friendly.
URL: https://github.com/apache/flink/pull/6770#issuecomment-456045406
 
 
   Three classes (`JobManagerLogFileHandlerTest.java`, 
`JobManagerLogListHandlerTest.java`, `TaskManagerLogListHandler.java`) and a 
method `testRequestUploadFile` in TaskExecutor.java have been added for unit testing.
   




[GitHub] Fokko commented on issue #7497: [FLINK-11338] Bump maven-enforcer-plugin from 3.0.0-M1 to 3.0.0-M2

2019-01-21 Thread GitBox
Fokko commented on issue #7497: [FLINK-11338] Bump maven-enforcer-plugin from 
3.0.0-M1 to 3.0.0-M2
URL: https://github.com/apache/flink/pull/7497#issuecomment-456021166
 
 
   Mostly getting ready for Java 8+.




[GitHub] dawidwys commented on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model

2019-01-21 Thread GitBox
dawidwys commented on issue #7543: [FLINK-10996]Fix the possible state leak 
with CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456020369
 
 
   Hi @Aitozi,
   First of all, I wouldn't say this fixes a bug; rather, it introduces a new 
feature that should be discussed beforehand. 
   
   I think we should aim for a better approach that would further unify 
processing and event time, which is already covered by 
[FLINK-7384](https://github.com/apache/flink/pull/4514/files)




[GitHub] Fokko commented on issue #7499: [FLINK-11340] Bump commons-configuration from 1.7 to 1.10

2019-01-21 Thread GitBox
Fokko commented on issue #7499: [FLINK-11340] Bump commons-configuration from 
1.7 to 1.10
URL: https://github.com/apache/flink/pull/7499#issuecomment-456022861
 
 
   
https://commons.apache.org/proper/commons-configuration/changes-report.html#a1.10




[GitHub] Fokko commented on issue #7498: [FLINK-11339] Bump exec-maven-plugin from 1.5.0 to 1.6.0

2019-01-21 Thread GitBox
Fokko commented on issue #7498: [FLINK-11339] Bump exec-maven-plugin from 1.5.0 
to 1.6.0
URL: https://github.com/apache/flink/pull/7498#issuecomment-456022154
 
 
   
https://github.com/mojohaus/exec-maven-plugin/compare/exec-maven-plugin-1.5.0...exec-maven-plugin-1.6.0
   
   Dropped support for 

[GitHub] Fokko edited a comment on issue #7497: [FLINK-11338] Bump maven-enforcer-plugin from 3.0.0-M1 to 3.0.0-M2

2019-01-21 Thread GitBox
Fokko edited a comment on issue #7497: [FLINK-11338] Bump maven-enforcer-plugin 
from 3.0.0-M1 to 3.0.0-M2
URL: https://github.com/apache/flink/pull/7497#issuecomment-456021166
 
 
   Mostly getting ready for Java 8+.
   
   Failed build is a flaky test: 
https://travis-ci.org/Fokko/flink/builds/480128791




[GitHub] KurtYoung closed pull request #4940: [FLINK-7959] [table] Split CodeGenerator into CodeGeneratorContext and ExprCodeGenerator.

2019-01-21 Thread GitBox
KurtYoung closed pull request #4940: [FLINK-7959] [table] Split CodeGenerator 
into CodeGeneratorContext and ExprCodeGenerator.
URL: https://github.com/apache/flink/pull/4940
 
 
   




[jira] [Updated] (FLINK-7959) Split CodeGenerator into CodeGeneratorContext and ExprCodeGenerator

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7959:
--
Labels: pull-request-available  (was: )

> Split CodeGenerator into CodeGeneratorContext and ExprCodeGenerator
> ---
>
> Key: FLINK-7959
> URL: https://issues.apache.org/jira/browse/FLINK-7959
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: Kurt Young
>Priority: Major
>  Labels: pull-request-available
>
> Right now {{CodeGenerator}} actually plays two roles: one is generating 
> code from RexNode, and the other is keeping lots of reusable 
> statements. It makes more sense to split this logic into two dedicated 
> classes. 
> The new {{CodeGeneratorContext}} will keep all the reusable statements, while 
> the new {{ExprCodeGenerator}} will only generate code from RexNode.
> And classes like {{AggregationCodeGenerator}} or 
> {{FunctionCodeGenerator}} should, I think, not be subclasses of 
> {{CodeGenerator}} but standalone classes. They can create an 
> {{ExprCodeGenerator}} when they need to generate code from RexNode, and 
> they can also generate code by themselves. The {{CodeGeneratorContext}} 
> can be passed around to collect all reusable statements and list them in the 
> final generated class.
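
A hedged sketch of that collaboration; the class shapes below are assumptions for illustration, not the PR's actual code:

{code:java}
import java.util.LinkedHashSet;
import java.util.Set;

import org.apache.calcite.rex.RexNode;

// Illustrative split: the context collects reusable statements, while the
// expression generator only translates RexNode into code.
class CodeGeneratorContext {
    private final Set<String> reusableMemberStatements = new LinkedHashSet<>();

    void addReusableMember(String statement) {
        reusableMemberStatements.add(statement);
    }

    Set<String> reusableMembers() {
        return reusableMemberStatements;
    }
}

class ExprCodeGenerator {
    private final CodeGeneratorContext ctx;

    ExprCodeGenerator(CodeGeneratorContext ctx) {
        this.ctx = ctx;
    }

    String generateExpression(RexNode expr) {
        // A real implementation would translate expr into code here, registering
        // any reusable statements in the shared context; placeholder for the sketch.
        ctx.addReusableMember("// reusable member for " + expr);
        return expr.toString();
    }
}
{code}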






[jira] [Updated] (FLINK-5859) support partition pruning on Table API & SQL

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5859:
--
Labels: pull-request-available  (was: )

> support partition pruning on Table API & SQL
> 
>
> Key: FLINK-5859
> URL: https://issues.apache.org/jira/browse/FLINK-5859
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Many data sources are partitionable storage, e.g. HDFS, Druid, and many 
> queries just need to read a small subset of the total data. We can use 
> partition information to prune or skip over files irrelevant to the user’s 
> queries. Both query optimization time and execution time can be reduced 
> significantly, especially for a large partitioned table.
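
A hedged sketch of the pruning contract; the actual PR extends FilterableTableSource, while the interface below is simplified for illustration:

{code:java}
import java.util.List;

// Simplified, illustrative contract: the optimizer asks for all partitions,
// prunes them against the query predicates, and gets back a source that
// reads only the remaining partitions.
interface PartitionableTableSource {
    /** All partitions of the underlying storage, e.g. HDFS directories. */
    List<String> getAllPartitions();

    /** Returns a copy of this source restricted to the remaining partitions. */
    PartitionableTableSource applyPartitionPruning(List<String> remainingPartitions);
}
{code}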





[GitHub] godfreyhe closed pull request #3860: [FLINK-6516] [table] using real row count instead of dummy row count when optimizing plan

2019-01-21 Thread GitBox
godfreyhe closed pull request #3860: [FLINK-6516] [table] using real row count 
instead of dummy row count when optimizing plan
URL: https://github.com/apache/flink/pull/3860
 
 
   




[GitHub] godfreyhe closed pull request #4820: [FLINK-7834] [table] cost model extends network cost and memory cost

2019-01-21 Thread GitBox
godfreyhe closed pull request #4820: [FLINK-7834] [table] cost model extends 
network cost and memory cost
URL: https://github.com/apache/flink/pull/4820
 
 
   




[GitHub] godfreyhe closed pull request #4682: [FLINK-7620] [table] Supports user defined optimization phase

2019-01-21 Thread GitBox
godfreyhe closed pull request #4682: [FLINK-7620] [table] Supports user defined 
optimization phase
URL: https://github.com/apache/flink/pull/4682
 
 
   




[GitHub] godfreyhe opened a new pull request #4667: [FLINK-5859] [table] Add PartitionableTableSource for partition pruning

2019-01-21 Thread GitBox
godfreyhe opened a new pull request #4667: [FLINK-5859] [table] Add 
PartitionableTableSource for partition pruning
URL: https://github.com/apache/flink/pull/4667
 
 
   ## What is the purpose of the change
   
   This pull request adds PartitionableTableSource for partition pruning when 
optimizing the query plan. That way both query optimization time and execution 
time can be reduced significantly, especially for a large partitioned table.
   
   ## Brief change log
   
 - *Adds PartitionableTableSource which extends FilterableTableSource*
 - *Adds setRelBuilder method in FilterableTableSource class*
 - *Adds implementation for partition pruning and extracting partition 
predicates*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *Added integration tests for PartitionableTableSource on batch and 
stream sql*
 - *Added a test that validates the correctness of partition pruning*
 - *Added a test that validates the correctness of extracting partition 
predicates*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (JavaDocs)
   
   




[GitHub] zhangxinyu1 commented on issue #6770: [FLINK-10002] [Webfrontend] WebUI shows jm/tm logs more friendly.

2019-01-21 Thread GitBox
zhangxinyu1 commented on issue #6770:  [FLINK-10002] [Webfrontend] WebUI shows 
jm/tm logs more friendly.
URL: https://github.com/apache/flink/pull/6770#issuecomment-456047509
 
 
   @yanghua @tillrohrmann @ifndef-SleePy Sorry to bother you. Could you please 
look at this pr if you have time?




[jira] [Commented] (FLINK-6626) Unifying lifecycle management of SubmittedJobGraph- and CompletedCheckpointStore

2019-01-21 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747808#comment-16747808
 ] 

Till Rohrmann commented on FLINK-6626:
--

Sure [~app-tarush]. Before diving into the implementation details I would be 
interested in your solution idea/design.

> Unifying lifecycle management of SubmittedJobGraph- and 
> CompletedCheckpointStore
> 
>
> Key: FLINK-6626
> URL: https://issues.apache.org/jira/browse/FLINK-6626
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Till Rohrmann
>Assignee: Tarush Grover
>Priority: Major
>
> Currently, Flink uses the {{SubmittedJobGraphStore}} to persist {{JobGraphs}} 
> such that they can be recovered in case of failures. The 
> {{SubmittedJobGraphStore}} is managed by the {{JobManager}}. Additionally, 
> Flink has the {{CompletedCheckpointStore}} which stores checkpoints for a 
> given {{ExecutionGraph}}/job. The {{CompletedCheckpointStore}} is managed by 
> the {{CheckpointCoordinator}}.
> The {{SubmittedJobGraphStore}} and the {{CompletedCheckpointStore}} are 
> somewhat related because in the latter we store checkpoints for jobs 
> contained in the former. I think it would be nice wrt lifecycle management to 
> let the {{SubmittedJobGraphStore}} manage the lifecycle of the 
> {{CompletedCheckpointStore}}, because often it does not make much sense to 
> keep only checkpoints without a job or a job without checkpoints. 
> An idea would be when we register a job with the {{SubmittedJobGraphStore}} 
> then it returns a {{CompletedCheckpointStore}}. This store can then be given 
> to the {{CheckpointCoordinator}} to store the checkpoints. When a job enters 
> a terminal state it could be the responsibility of the 
> {{SubmittedJobGraphStore}} to decide what to do with the job data 
> ({{JobGraph}} and {{Checkpoints}}), e.g. keeping it or cleaning it up.
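
A hedged sketch of the registration idea from the paragraph above; the method shapes are assumptions, not Flink's actual SubmittedJobGraphStore interface:

{code:java}
import org.apache.flink.api.common.JobID;
import org.apache.flink.runtime.checkpoint.CompletedCheckpointStore;
import org.apache.flink.runtime.jobgraph.JobGraph;

// Illustrative only: registering a job returns the checkpoint store whose
// lifecycle is then managed by the job-graph store.
interface LifecycleAwareJobGraphStore {
    /** Registers the job and returns a checkpoint store tied to its lifecycle. */
    CompletedCheckpointStore registerJobGraph(JobGraph jobGraph) throws Exception;

    /**
     * Called when the job enters a terminal state; decides what to do with the
     * job data (JobGraph and checkpoints), e.g. keeping it or cleaning it up.
     */
    void jobReachedTerminalState(JobID jobId) throws Exception;
}
{code}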





[GitHub] wuchong closed pull request #4189: [FLINK-6958] [async] Async I/O hang when the source is bounded and collect timeout

2019-01-21 Thread GitBox
wuchong closed pull request #4189: [FLINK-6958] [async] Async I/O hang when the 
source is bounded and collect timeout
URL: https://github.com/apache/flink/pull/4189
 
 
   




[GitHub] Guibo-Pan commented on issue #6773: [FLINK-10152] [table] Add reverse function in Table API and SQL

2019-01-21 Thread GitBox
Guibo-Pan commented on issue #6773: [FLINK-10152] [table] Add reverse function 
in Table API and SQL
URL: https://github.com/apache/flink/pull/6773#issuecomment-455989720
 
 
   @twalthr @xccui @tillrohrmann Is there anyone who can help review this? 
Thanks~




[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249114818
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -251,6 +269,18 @@ public FlinkKafkaConsumerBase(
//  Configuration
// 

 
+   /**
+* Make sure that auto commit is disabled when our offset commit mode 
is ON_CHECKPOINTS.
+* This overwrites whatever setting the user configured in the 
properties.
+* @param properties - Kafka configuration properties to be adjusted
+* @param offsetCommitMode offset commit mode
+*/
+   protected static void adjustAutoCommitConfig(Properties properties, 
OffsetCommitMode offsetCommitMode) {
 
 Review comment:
   The method can be package-private.
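
   For reference, the suggested visibility change might look like the sketch below; the body is an assumption matching the Javadoc's intent (forcing Kafka's enable.auto.commit off when offsets are committed on checkpoints), not necessarily the PR's exact code:

```java
import java.util.Properties;

import org.apache.flink.streaming.connectors.kafka.config.OffsetCommitMode;
import org.apache.kafka.clients.consumer.ConsumerConfig;

// Stand-in for FlinkKafkaConsumerBase, to keep the sketch self-contained.
class AutoCommitConfigSketch {
    // package-private instead of protected, as suggested in the review
    static void adjustAutoCommitConfig(Properties properties, OffsetCommitMode offsetCommitMode) {
        if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
            // overwrite whatever the user configured
            properties.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        }
    }
}
```

   Package-private keeps the helper reachable from tests in the same package without exposing it to subclasses.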




[GitHub] wuchong closed pull request #4145: [FLINK-6938][FLINK-6939] [cep] Not store IterativeCondition with NFA state and support RichFunction interface

2019-01-21 Thread GitBox
wuchong closed pull request #4145: [FLINK-6938][FLINK-6939] [cep] Not store 
IterativeCondition with NFA state and support RichFunction interface
URL: https://github.com/apache/flink/pull/4145
 
 
   




[jira] [Updated] (FLINK-6958) Async I/O timeout not work

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6958:
--
Labels: pull-request-available  (was: )

> Async I/O timeout not work
> --
>
> Key: FLINK-6958
> URL: https://issues.apache.org/jira/browse/FLINK-6958
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.2.1
>Reporter: feng xiaojie
>Assignee: Jark Wu
>Priority: Major
>  Labels: pull-request-available
>
> When using Async I/O with UnorderedStreamElementQueue, the queue will always be 
> full if you don't call AsyncCollector.collect to ack the entries.
> The timeout should collect these entries when it triggers, but it doesn't work.
> While debugging I found that
> on timeout it calls resultFuture.completeExceptionally(error)
> but does not call UnorderedStreamElementQueue.onCompleteHandler,
> which causes async I/O to hang forever.
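
A generic, self-contained illustration of that failure mode (plain Java, not Flink's classes): completing a queued future exceptionally removes it from the queue only if a completion handler was attached to it.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

// Generic illustration (not Flink code): without the whenComplete handler,
// a timeout that calls completeExceptionally() leaves the entry in the
// queue forever and the queue stays full.
class UnorderedQueueSketch {
    private final Queue<CompletableFuture<String>> queue = new ArrayDeque<>();

    CompletableFuture<String> enqueue() {
        CompletableFuture<String> entry = new CompletableFuture<>();
        // this handler mirrors what an onCompleteHandler must do on timeout, too
        entry.whenComplete((result, error) -> queue.remove(entry));
        queue.add(entry);
        return entry;
    }

    public static void main(String[] args) {
        UnorderedQueueSketch sketch = new UnorderedQueueSketch();
        CompletableFuture<String> entry = sketch.enqueue();
        entry.completeExceptionally(new TimeoutException("timeout"));
        System.out.println("queue size after timeout: " + sketch.queue.size()); // 0
    }
}
{code}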





[GitHub] fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR description template.

2019-01-21 Thread GitBox
fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR 
description template.
URL: https://github.com/apache/flink/pull/6873#issuecomment-456085278
 
 
   I think if we modify the review checklist in place (i.e., update it in the 
PR description) this is almost automatically given. 
   
   AFAIK, only the PR author and members of the ASF Github organization (or 
maybe even registered Flink committers?) can update the PR description. If we 
ask committers to sign off changes with their name, we should be good.
   
   A review progress bot would be great to have, IMO.




[GitHub] hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] Implement proctime DataStream to Table upsert conversion

2019-01-21 Thread GitBox
hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] 
Implement proctime DataStream to Table upsert conversion
URL: https://github.com/apache/flink/pull/6787#discussion_r249476414
 
 

 ##
 File path: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/stream/sql/FromUpsertStreamTest.scala
 ##
 @@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.stream.sql
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo
+import org.apache.flink.api.java.typeutils.{RowTypeInfo, TupleTypeInfo}
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.Types
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableTestUtil.{UpsertTableNode, term, 
unaryNode}
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.apache.flink.types.Row
+import org.junit.Test
+import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2}
+import java.lang.{Boolean => JBool}
+
+class FromUpsertStreamTest extends TableTestBase {
+
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+
+  @Test
+  def testRemoveUpsertToRetraction() = {
+streamUtil.addTableFromUpsert[(Boolean, (Int, String, Long))](
+  "MyTable", 'a, 'b.key, 'c, 'proctime.proctime, 'rowtime.rowtime)
+
+val sql = "SELECT a, b, c, proctime, rowtime FROM MyTable"
+
+val expected =
+  unaryNode(
+"DataStreamCalc",
+UpsertTableNode(0),
+term("select", "a", "b", "c", "PROCTIME(proctime) AS proctime",
+  "CAST(rowtime) AS rowtime")
+  )
+streamUtil.verifySql(sql, expected)
+  }
+
+  @Test
+  def testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose() = {
 
 Review comment:
   `Calc` can't always be pushed down (more details in 
`CalcUpsertToRetractionTransposeRule`), so there are two tests, one for each 
case:
   - `testCalcTransposeUpsertToRetraction()`
   - `testCalcCannotTransposeUpsertToRetraction()`
   
   As for `testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose()`, 
maybe it's better to rename to `testMaterializeTimeIndicator()`. It is 
dedicated to test the materialization logic.
   
   What do you think?
   




[GitHub] StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy

2019-01-21 Thread GitBox
StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state 
when using RestartPipelinedRegionStrategy
URL: https://github.com/apache/flink/pull/7009#issuecomment-456105318
 
 
   I think I agree with the assessment of the existing operators. 
   
   About adding a `RecoveryMode` to consider: would that mean we would 
prevent all jobs that use union state from working with partial recovery? I think if 
we just consider a few popular operators like `KafkaConsumer`, that would 
already prevent a lot of jobs from using different recovery modes.
   
   I can see that this comes from the concern about existing code that uses 
union state. However, strictly speaking it should not be a regression, because 
those recovery modes previously did not support state recovery at all. We also 
cannot prevent users from making wrong implementations, so I feel like a good 
thing to do is document what to take care of regarding union state when using such 
recovery modes.




[jira] [Assigned] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread Kezhu Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kezhu Wang reassigned FLINK-11326:
--

Assignee: Kezhu Wang

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>
> According to comments, we can use offset to adjust windows to timezones other 
> than UTC-0. For example, in China you would have to specify an offset of 
> {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want to window a stream by hour, but the window 
> begins at the 15th minute
>  * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))}; then 
> you will get
>  * time windows that start at 0:15:00, 1:15:00, 2:15:00, etc.
>  *
>  * On the other hand, if you live somewhere that does not use 
> UTC±00:00 time,
>  * such as China, which uses UTC+08:00, and you want a time window with a 
> size of one day
>  * that begins at every 00:00:00 of local time, you may use {@code 
> of(Time.days(1),Time.hours(-8))}.
>  * The offset parameter is {@code Time.hours(-8)} since UTC+08:00 is 8 
> hours earlier than UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> but when the offset is smaller than zero, an IllegalArgumentException will be 
> thrown.
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  
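
One possible fix, shown as a hedged sketch (an assumption, not necessarily the change Flink will adopt), is to accept negative offsets and normalize them into [0, size):

{code:java}
// Hypothetical fix sketch: allow abs(offset) < size and normalize negative
// offsets, so Time.hours(-8) with Time.days(1) becomes an offset of 16 hours.
protected TumblingEventTimeWindows(long size, long offset) {
    if (Math.abs(offset) >= size) {
        throw new IllegalArgumentException(
            "TumblingEventTimeWindows parameters must satisfy abs(offset) < size");
    }
    this.size = size;
    this.offset = (offset % size + size) % size; // maps -8h to +16h for 1-day windows
}
{code}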





[jira] [Updated] (FLINK-6196) Support dynamic schema in Table Function

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6196:
--
Labels: pull-request-available  (was: )

> Support dynamic schema in Table Function
> 
>
> Key: FLINK-6196
> URL: https://issues.apache.org/jira/browse/FLINK-6196
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>Priority: Major
>  Labels: pull-request-available
>
> In many of our use cases, we have to decide the schema of a UDTF at run 
> time. For example, udtf('c1, c2, c3') will generate three columns for a 
> lateral view. 
> Most systems, such as Calcite and Hive, support this feature. However, the 
> current implementation in Flink doesn't implement the feature correctly.
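
A hedged sketch of what a dynamic-schema table function could look like, assuming Flink's TableFunction/getResultType API; the column-list parsing is made up for illustration:

{code:java}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;

// Sketch only: the output schema is decided at construction time from a
// column list such as "c1, c2, c3", rather than being fixed statically.
public class DynamicSchemaUdtf extends TableFunction<Row> {
    private final String[] columns;

    public DynamicSchemaUdtf(String columnList) {
        this.columns = columnList.split("\\s*,\\s*");
    }

    public void eval(String line) {
        String[] fields = line.split(",");
        Row row = new Row(columns.length);
        for (int i = 0; i < columns.length; i++) {
            row.setField(i, i < fields.length ? fields[i] : null);
        }
        collect(row);
    }

    @Override
    public TypeInformation<Row> getResultType() {
        TypeInformation<?>[] fieldTypes = new TypeInformation<?>[columns.length];
        java.util.Arrays.fill(fieldTypes, Types.STRING);
        return new RowTypeInfo(fieldTypes, columns);
    }
}
{code}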





[GitHub] clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema in Table Function

2019-01-21 Thread GitBox
clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema 
in Table Function
URL: https://github.com/apache/flink/pull/3623
 
 
   




[jira] [Created] (FLINK-11399) Parsing nested ROW()s in SQL

2019-01-21 Thread JIRA
Benoît Paris created FLINK-11399:


 Summary: Parsing nested ROW()s in SQL
 Key: FLINK-11399
 URL: https://issues.apache.org/jira/browse/FLINK-11399
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.7.1
Reporter: Benoît Paris


Hi!

I'm trying to build a nested structure in SQL (mapping to json with 
flink-json). This works fine: 
{code:java}
INSERT INTO outputTable
SELECT ROW(col1, col2) 
FROM (
  SELECT 
col1, 
ROW(col1, col1) as col2 
  FROM inputTable
) tbl2
{code}
(and I use it as a workaround), but it fails in the simpler version: 
{code:java}
INSERT INTO outputTable
SELECT ROW(col1, ROW(col1, col1)) 
FROM inputTable
{code}
, yielding the following stacktrace: 
{noformat}
Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL 
parse failed. Encountered ", ROW" at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:94)
at 
org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:803)
at 
org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777)
at TestBug.main(TestBug.java:32)
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered ", ROW" 
at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:347)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:128)
at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137)
at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162)
at 
org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:90)
... 3 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered ", 
ROW" at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23019)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:22836)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.ParenthesizedSimpleIdentifierList(SqlParserImpl.java:4466)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3328)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectExpression(SqlParserImpl.java:1525)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectItem(SqlParserImpl.java:1500)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectList(SqlParserImpl.java:1477)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:912)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:552)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:3030)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2949)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:463)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlInsert(SqlParserImpl.java:1212)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:847)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:869)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:184)
at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:130)
... 5 more{noformat}
 

I was thinking it could be a naming/referencing issue, or that I was not using 
ROW() properly for the json-idiomatic nesting I want to push on it.

Anyway this is very minor, thanks for all the good work on Flink!

Cheers,

Ben 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7834) cost model (RelOptCost) extends network cost and memory cost

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7834:
--
Labels: pull-request-available  (was: )

> cost model (RelOptCost) extends network cost and memory cost
> 
>
> Key: FLINK-7834
> URL: https://issues.apache.org/jira/browse/FLINK-7834
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> {{RelOptCost}} defines an interface for optimizer cost in terms of the number of 
> rows processed, CPU cost, and I/O cost. Flink is a distributed framework, so 
> network and memory are two more important cost metrics that should be 
> considered by the optimizer. This feature is to extend {{RelOptCost}}.
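For illustration, one possible shape of such an extension (a hedged sketch; the interface name and methods are assumptions, not Flink's actual API):

{code:java}
import org.apache.calcite.plan.RelOptCost;

// Hypothetical: adds network and memory dimensions next to Calcite's
// rows/CPU/IO metrics; the units in the comments are illustrative only.
public interface FlinkRelOptCost extends RelOptCost {
	double getNetwork(); // bytes shuffled across the network
	double getMemory();  // managed memory consumed by operators
}
{code}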



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-11249) FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer

2019-01-21 Thread Piotr Nowojski (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747937#comment-16747937
 ] 

Piotr Nowojski commented on FLINK-11249:


I have realised one more issue. This fix alone might not fully solve the 
problem. With this fix in place, users will be able to update their jobs from the 
{{0.11}} connector to the universal one, but what happens if they upgrade the 
Kafka brokers at any point in time? If a user stops the Kafka brokers, upgrades and 
then restarts them, does this process preserve the pending transactions that 
Flink already "pre-committed"? Or are they automatically aborted? If they are 
automatically aborted, we might have data loss from our perspective.

If "pre-committed" transactions are aborted during Kafka broker upgrades, 
we would need a "clean stop with savepoint" feature to handle this user story. I 
guess this needs more experiments and more testing.

CC [~tzulitai] [~aljoscha]

> FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
> ---
>
> Key: FLINK-11249
> URL: https://issues.apache.org/jira/browse/FLINK-11249
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kafka Connector
>Affects Versions: 1.7.0, 1.7.1
>Reporter: Piotr Nowojski
>Assignee: vinoyang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As reported by a user on the mailing list "How to migrate Kafka Producer ?" 
> (on 18th December 2018), {{FlinkKafkaProducer011}} can not be migrated to 
> {{FlinkKafkaProducer}} and the same problem can occur in the future Kafka 
> producer versions/refactorings.
> The issue is that the {{ListState}} field 
> {{FlinkKafkaProducer#nextTransactionalIdHintState}} is serialized using 
> Java serialization, and this is causing problems/collisions between 
> {{FlinkKafkaProducer011.NextTransactionalIdHint}} and 
> {{FlinkKafkaProducer.NextTransactionalIdHint}}.
> To fix that, we probably need to release new versions of those classes that 
> rewrite/upgrade this state field to a new one that doesn't rely on 
> Java serialization. After this, we could drop the support for the old field, 
> and that in turn will allow users to upgrade from the 0.11 connector to the 
> universal one.
> One bright side is that technically speaking our {{FlinkKafkaProducer011}} 
> has the same compatibility matrix as the universal one (it's also forward & 
> backward compatible with the same Kafka versions), so for the time being 
> users can stick to {{FlinkKafkaProducer011}}.
> FYI [~tzulitai] [~yanghua]
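For context, the colliding state field looks roughly like this (a hedged sketch with the generics restored from the class names above, not a quote from the source):

{code:java}
// Inside FlinkKafkaProducer; the Java-serialized element type collides with
// FlinkKafkaProducer011.NextTransactionalIdHint on restore.
private transient ListState<NextTransactionalIdHint> nextTransactionalIdHintState;
{code}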



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes that produce updates

2019-01-21 Thread Hequn Cheng (JIRA)
Hequn Cheng created FLINK-11398:
---

 Summary: Add a dedicated phase to materialize time indicators for 
nodes that produce updates
 Key: FLINK-11398
 URL: https://issues.apache.org/jira/browse/FLINK-11398
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Hequn Cheng
Assignee: Hequn Cheng


As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We can not do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical optimization 
phase, do we know whether it is a non-window outer join, which will produce 
updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.











--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster

2019-01-21 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11400:
-

 Summary: JobManagerRunner does not wait for suspension of JobMaster
 Key: FLINK-11400
 URL: https://issues.apache.org/jira/browse/FLINK-11400
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.7.1, 1.6.3, 1.8.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.8.0


The {{JobManagerRunner}} does not wait for the suspension of the {{JobMaster}} 
to finish before granting leadership again. This can lead to a state where the 
{{JobMaster}} tries to start the {{ExecutionGraph}} but the {{SlotPool}} is 
still stopped.

I suggest linearizing the leadership operations (granting and revoking 
leadership) similarly to the {{Dispatcher}}.
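A minimal sketch of such linearization, assuming a Dispatcher-style chained future (method names like {{startJobMaster}}/{{suspendJobMaster}} are illustrative, not the actual fix):

{code:java}
// Grant/revoke are queued onto a single future so that a new grant cannot
// start the JobMaster before the previous suspension has completed.
private CompletableFuture<Void> leadershipOperation =
	CompletableFuture.completedFuture(null);

public void grantLeadership(UUID leaderSessionId) {
	synchronized (lock) {
		leadershipOperation = leadershipOperation.thenCompose(
			ignored -> startJobMaster(leaderSessionId)); // hypothetical helper
	}
}

public void revokeLeadership() {
	synchronized (lock) {
		leadershipOperation = leadershipOperation.thenCompose(
			ignored -> suspendJobMaster()); // hypothetical helper
	}
}
{code}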



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR description template.

2019-01-21 Thread GitBox
rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR 
description template.
URL: https://github.com/apache/flink/pull/6873#issuecomment-456081384
 
 
   @zentol: I'm a bit against this `**NOTE: THE REVIEW PROGRESS MUST ONLY BE 
UPDATED BY AN APACHE FLINK COMMITTER!**` line, as it seems not very inviting 
for new community members.
   Maybe, as a convention, reviewers could note in the comments what they have 
reviewed; then, if a committer who's merging a PR has doubts about the checklist, 
they can check the comments.
   
   I'm also thinking* about implementing a bot that takes care of tracking the 
checklist.
   
   *strongly enough to have created a project locally already :) 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249367559
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java
 ##
 @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN next, Context context)
 		} else {
 			record = new ProducerRecord<>(targetTopic, null, timestamp, serializedKey, serializedValue);
 		}
+		for (Map.Entry<String, byte[]> header: schema.headers(next)) {
 
 Review comment:
   space before `:`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …

2019-01-21 Thread GitBox
dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add 
support for dynamic proxy classes resolution …
URL: https://github.com/apache/flink/pull/7436#discussion_r249462832
 
 

 ##
 File path: 
flink-core/src/test/java/org/apache/flink/util/InstantiationUtilTest.java
 ##
 @@ -46,6 +50,17 @@
  */
 public class InstantiationUtilTest extends TestLogger {
 
+   @Test
+   public void testResolveProxyClass() throws Exception {
 
 Review comment:
   This test does not check the new behavior. The proxy class is available in 
the same classloader as `InstantiationUtil`, which corresponds to a situation 
where the proxy class is available on the parent classpath in a Flink cluster. 
What you should actually test is the case where the proxy class is only 
available from the user classloader. 
   
   In 
`org.apache.flink.runtime.classloading.ClassLoaderTest#testMessageDecodingWithUnavailableClass`
 you can check how to create a new classloader with a class. Also, please always 
make sure that your test fails without the changes that are supposed to fix the 
tested problem (it is not the case here; it passes without your changes). 
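   For illustration, a rough sketch of that scenario (hedged; building a `userClassLoader` that alone contains the interface is elided, and `com.example.UserOnlyInterface` is hypothetical):
   
   ```java
   // The interface must NOT be visible to InstantiationUtil's own classloader.
   Class<?> userInterface = Class.forName("com.example.UserOnlyInterface", false, userClassLoader);
   Object proxy = Proxy.newProxyInstance(
       userClassLoader, new Class<?>[] {userInterface}, (p, m, args) -> null);
   
   byte[] bytes = InstantiationUtil.serializeObject(proxy);
   // Should succeed with the user classloader and fail with the parent one,
   // proving that the new resolution path is actually exercised.
   Object copy = InstantiationUtil.deserializeObject(bytes, userClassLoader);
   ```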


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes that produce updates

2019-01-21 Thread Hequn Cheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng updated FLINK-11398:

Description: 
As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We can not do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical optimization 
phase, do we know whether it is a non-window outer join, which will produce 
updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.


  was:
As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We can not do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical optimization 
phase, do we know whether it is a non-window outer join, which will produce 
updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.










> Add a dedicated phase to materialize time indicators for nodes that produce updates
> --
>
> Key: FLINK-11398
> URL: https://issues.apache.org/jira/browse/FLINK-11398
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>
> As discussed 
> [here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we 
> need a dedicated phase to materialize time indicators for nodes that produce 
> updates.
> Details:
> Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
> need to introduce another materialization phase that materializes all time 
> attributes on nodes that produce updates. We can not do it inside 
> `RelTimeIndicatorConverter`, because only later, after the physical 
> optimization phase, do we know whether it is a non-window outer join, which 
> will produce updates.
> There are a few other things we need to consider.
> - Whether we can unify the two converter phases.
> - Take windows with early fire into consideration (not implemented yet). 
> In this case, we don't need to materialize time indicators even if the node 
> produces updates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager

2019-01-21 Thread GitBox
bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external 
catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011#issuecomment-456154882
 
 
   closing this PR in favor of new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager

2019-01-21 Thread GitBox
bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external 
catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views and UDFs

2019-01-21 Thread GitBox
bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL]Add APIs to 
ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views 
and UDFs
URL: https://github.com/apache/flink/pull/6970
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java

2019-01-21 Thread GitBox
bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port 
InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java

2019-01-21 Thread GitBox
bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port 
InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012#issuecomment-456154851
 
 
   closing this PR in favor of new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an in-memory catalog that stores Flink's meta objects

2019-01-21 Thread GitBox
bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an 
in-memory catalog that stores Flink's meta objects
URL: https://github.com/apache/flink/pull/6997#issuecomment-456154907
 
 
   closing this PR in favor of new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249512219
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java
 ##
 @@ -370,8 +370,8 @@ else if (partitionsRemoved) {

keyPayload.get(keyBytes);
}
 
-   final T value = 
deserializer.deserialize(keyBytes, valueBytes,
-   
currentPartition.getTopic(), currentPartition.getPartition(), offset);
+   final T value = 
deserializer.deserialize(
 
 Review comment:
   Here we could try:
   ```
   ConsumerRecord<byte[], byte[]> consumerRecord = new ConsumerRecord<>(
       currentPartition.getTopic(),
       currentPartition.getPartition(),
       offset, // kafka-clients' ConsumerRecord(topic, partition, offset, key, value)
       keyBytes,
       valueBytes);
   final T value = deserializer.deserialize(consumerRecord);
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249368384
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.
 
 Review comment:
   could you put it into a `{@link Kafka09ConsumerRecord}` tag?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249478973
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema extends Serializable, 
ResultTypeQueryable {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   kafka-clients 0.8 actually has `ConsumerRecord`, just `SimpleConsumerThread` 
does not use it but it seems to be possible to manually wrap `MessageAndOffset` 
into that `ConsumerRecord`. I would give it a try. It seems to be simpler 
option at the moment and would eliminate currently introduced inheritance for 
the sake of wrapping.
   
   Not sure, though, how big the risk is that the Kafka API changes again. The 
Flink wrapping, as now, seems to be a safer option but it also adds some 
performance overhead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249510977
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -82,93 +84,110 @@
  */
 @Internal
 public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
-	CheckpointListener,
-	ResultTypeQueryable<T>,
-	CheckpointedFunction {
+	CheckpointListener,
 
 Review comment:
   This class contains a lot of unrelated changes, which makes it more 
difficult to review. I would suggest to have either a follow-up PR for them or 
at least put them into a separate commit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249482757
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.
+ */
+class Kafka011ConsumerRecord extends Kafka09ConsumerRecord {
+	/**
+	 * Wraps {@link Header} as Map.Entry.
+	 */
+	private static final Function<Header, Map.Entry<String, byte[]>> HEADER_TO_MAP_ENTRY_FUNCTION =
+		new Function<Header, Map.Entry<String, byte[]>>() {
+			@Nonnull
+			@Override
+			public Map.Entry<String, byte[]> apply(@Nullable Header header) {
+				return new AbstractMap.SimpleImmutableEntry<>(header.key(), header.value());
+			}
+		};
+
+	Kafka011ConsumerRecord(ConsumerRecord<byte[], byte[]> consumerRecord) {
+		super(consumerRecord);
+	}
+
+	@Override
+	public Iterable<Map.Entry<String, byte[]>> headers() {
+		return Iterables.transform(consumerRecord.headers(), HEADER_TO_MAP_ENTRY_FUNCTION);
 
 Review comment:
   Could we avoid relying on non-standard Java libraries like Guava?
   The record/headers processing is also on the time-critical path of 
per-record latency.
   I would suggest implementing our own Iterable wrapper that does only this 
header wrapping.
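   A minimal guava-free sketch of such a wrapper (illustrative; the class name is made up):
   
   ```java
   import java.util.AbstractMap;
   import java.util.Iterator;
   import java.util.Map;
   
   import org.apache.kafka.common.header.Header;
   
   // Wraps Kafka Headers lazily as Map.Entry<String, byte[]> without guava.
   final class HeaderMapEntryIterable implements Iterable<Map.Entry<String, byte[]>> {
       private final Iterable<Header> headers;
   
       HeaderMapEntryIterable(Iterable<Header> headers) {
           this.headers = headers;
       }
   
       @Override
       public Iterator<Map.Entry<String, byte[]>> iterator() {
           final Iterator<Header> it = headers.iterator();
           return new Iterator<Map.Entry<String, byte[]>>() {
               @Override
               public boolean hasNext() {
                   return it.hasNext();
               }
   
               @Override
               public Map.Entry<String, byte[]> next() {
                   final Header h = it.next();
                   return new AbstractMap.SimpleImmutableEntry<>(h.key(), h.value());
               }
           };
       }
   }
   ```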


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster

2019-01-21 Thread TisonKun (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748078#comment-16748078
 ] 

TisonKun commented on FLINK-11400:
--

Do you mean introduce a {{recoveryOperation}} in {{JobManagerRunner}}?

> JobManagerRunner does not wait for suspension of JobMaster
> --
>
> Key: FLINK-11400
> URL: https://issues.apache.org/jira/browse/FLINK-11400
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.3, 1.7.1, 1.8.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
> Fix For: 1.8.0
>
>
> The {{JobManagerRunner}} does not wait for the suspension of the 
> {{JobMaster}} to finish before granting leadership again. This can lead to a 
> state where the {{JobMaster}} tries to start the {{ExecutionGraph}} but the 
> {{SlotPool}} is still stopped.
> I suggest to linearize the leadership operations (granting and revoking 
> leadership) similarly to the {{Dispatcher}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] GJL opened a new pull request #7546: [FLINK-11390][tests] Port YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base

2019-01-21 Thread GitBox
GJL opened a new pull request #7546: [FLINK-11390][tests] Port 
YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base
URL: https://github.com/apache/flink/pull/7546
 
 
   ## What is the purpose of the change
   
   *Port `YARNSessionCapacitySchedulerITCase#testTaskManagerFailure` to new 
code base.*
   
   cc: @tillrohrmann 
   
   ## Brief change log
 - *Extract HA test out of `testTaskManagerFailure`*
 - *Rename test `testTaskManagerFailure` so that the name reflects what is 
actually asserted*
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as *itself*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11401) Allow compression on ParquetBulkWriter

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11401:
---
Labels: pull-request-available  (was: )

> Allow compression on ParquetBulkWriter
> --
>
> Key: FLINK-11401
> URL: https://issues.apache.org/jira/browse/FLINK-11401
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.1
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of compression on ParquetWriter

2019-01-21 Thread GitBox
Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of 
compression on ParquetWriter
URL: https://github.com/apache/flink/pull/7547
 
 
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
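   For context, a hedged sketch of what this change is meant to enable, using Parquet's standard builder API (the Flink-side wiring is an assumption, not the actual diff):
   
   ```java
   // Compression is configured on the Parquet writer builder via
   // org.apache.parquet.hadoop.metadata.CompressionCodecName.
   ParquetWriter<GenericRecord> writer = AvroParquetWriter
       .<GenericRecord>builder(path)
       .withSchema(schema)
       .withCompressionCodec(CompressionCodecName.SNAPPY)
       .build();
   ```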
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11401) Allow compression on ParquetBulkWriter

2019-01-21 Thread Fokko Driesprong (JIRA)
Fokko Driesprong created FLINK-11401:


 Summary: Allow compression on ParquetBulkWriter
 Key: FLINK-11401
 URL: https://issues.apache.org/jira/browse/FLINK-11401
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.7.1
Reporter: Fokko Driesprong
Assignee: Fokko Driesprong
 Fix For: 1.8.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606898
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java
 ##
 @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN next, Context context)
 		} else {
 			record = new ProducerRecord<>(targetTopic, null, timestamp, serializedKey, serializedValue);
 		}
+		for (Map.Entry<String, byte[]> header: schema.headers(next)) {
 
 Review comment:
   Pushed e0ca3e0


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative offset in window assigners

2019-01-21 Thread GitBox
kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative 
offset in window assigners
URL: https://github.com/apache/flink/pull/7548
 
 
   ## What is the purpose of the change
   Allow negative window offset in window assignment as the javadoc says.
   
   ## Brief change log
   - Allow negative window offset in window assignment.
   - Throw `IllegalArgumentException` if offset is out of range for 
`SlidingEventTimeWindows.of` and `SlidingProcessingTimeWindows.of`. 
   
   ## Verifying this change
   
   This change is already covered by existing tests, and new test cases have been 
added to cover negative window offsets.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   
   ## Breaking changes
   * The `@PublicEvolving` APIs `SlidingEventTimeWindows.of` and 
`SlidingProcessingTimeWindows.of` previously allowed out-of-range window offsets; this 
merge request forbids that behavior. This way they behave the same as 
`TumblingEventTimeWindows.of` and `TumblingProcessingTimeWindows.of`. I think 
this makes [sliding 
windows](https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/windows.html#sliding-windows)
 easier for callers to understand.
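   A sketch of the relaxed validation described above (hedged; the actual change may differ in details):
   
   ```java
   // Negative offsets (e.g. Time.hours(-8) for UTC+08:00) are accepted;
   // only offsets whose magnitude reaches the window size are rejected.
   protected TumblingEventTimeWindows(long size, long offset) {
       if (Math.abs(offset) >= size) {
           throw new IllegalArgumentException(
               "TumblingEventTimeWindows parameters must satisfy abs(offset) < size");
       }
       this.size = size;
       this.offset = offset;
   }
   ```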


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Component/s: Local Runtime

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Local Runtime
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread Kezhu Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kezhu Wang updated FLINK-11326:
---
Affects Version/s: 1.7.1

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3, 1.7.1
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>
> According to comments, we can use offset to adjust windows to timezones other 
> than UTC-0. For example, in China you would have to specify an offset of 
> {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want window a stream by hour,but window begins at 
> the 15th minutes
>  * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then 
> you will get
>  * time windows start at 0:15:00,1:15:00,2:15:00,etc.
>  *
>  * Rather than that,if you are living in somewhere which is not using 
> UTC±00:00 time,
>  * such as China which is using UTC+08:00,and you want a time window with 
> size of one day,
>  * and window begins at every 00:00:00 of local time,you may use {@code 
> of(Time.days(1),Time.hours(-8))}.
>  * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 
> hours earlier than UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> But when the offset is smaller than zero, an IllegalArgumentException will be 
> thrown.
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606803
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -251,6 +269,18 @@ public FlinkKafkaConsumerBase(
//  Configuration
// 

 
+   /**
+* Make sure that auto commit is disabled when our offset commit mode 
is ON_CHECKPOINTS.
+* This overwrites whatever setting the user configured in the 
properties.
+* @param properties - Kafka configuration properties to be adjusted
+* @param offsetCommitMode offset commit mode
+*/
+   protected static void adjustAutoCommitConfig(Properties properties, 
OffsetCommitMode offsetCommitMode) {
 
 Review comment:
   Pushed 3cfda16
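   For reference, a plausible body for `adjustAutoCommitConfig` matching its javadoc (a hedged sketch, not necessarily the committed code):
   
   ```java
   if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
       // Overwrite whatever the user configured: auto commit must be off here.
       properties.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
   }
   ```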


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in production-ready doc

2019-01-21 Thread GitBox
KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in 
production-ready doc
URL: https://github.com/apache/flink/pull/7423#issuecomment-456233322
 
 
   @alpinegizmo Thanks for the review.
   In the stale version, this sentence pointed to asynchronous snapshotting. 
However, I keep this sentence because there is an 
[explanation](https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html#the-rocksdbstatebackend)
 of the throughput limitation of the RocksDBStateBackend.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable aggregates for group windows on batch table

2019-01-21 Thread GitBox
dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable 
aggregates for group windows on batch table
URL: https://github.com/apache/flink/pull/7354
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748241#comment-16748241
 ] 

Ufuk Celebi commented on FLINK-11402:
-

Similar issue for RocksDB (FLINK-5408)

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Attachment: hello-snappy.tgz

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz

[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Attachment: hello-snappy-1.0-SNAPSHOT.jar

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz

[GitHub] KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo and make improvement in Kafka Connectors doc

2019-01-21 Thread GitBox
KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo 
and make improvement in Kafka Connectors doc
URL: https://github.com/apache/flink/pull/7260#discussion_r249611548
 
 

 ##
 File path: docs/dev/connectors/kafka.md
 ##
 @@ -681,13 +681,13 @@ chosen by passing appropriate `semantic` parameter to 
the `FlinkKafkaProducer011
 
 `Semantic.EXACTLY_ONCE` mode relies on the ability to commit transactions
 that were started before taking a checkpoint, after recovering from the said 
checkpoint. If the time
-between Flink application crash and completed restart is larger then Kafka's 
transaction timeout
+between Flink application crash and completed restart is larger than Kafka's 
transaction timeout
 there will be data loss (Kafka will automatically abort transactions that 
exceeded timeout time).
 Having this in mind, please configure your transaction timeout appropriately 
to your expected down
 times.
 
 Kafka brokers by default have `transaction.max.timeout.ms` set to 15 minutes. 
This property will
-not allow to set transaction timeouts for the producers larger then it's value.
+not allow to set transaction timeouts for the producers larger than it's value.
 `FlinkKafkaProducer011` by default sets the `transaction.timeout.ms` property 
in producer config to
 
 Review comment:
   It sounds clearer and more fluent to me. Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
stevenzwu commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606472
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema<T> extends Serializable, 
ResultTypeQueryable<T> {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   @alexeyt820 as pointed out by @azagrebin, kafka 0.8 seems to have 
`ConsumerRecord` interface
   
https://archive.apache.org/dist/kafka/0.8.2-beta/java-doc/org/apache/kafka/clients/consumer/ConsumerRecord.html
   
   As Flink is moving in the direction of just one modern Kafka connector, I am 
also wondering if it is OK to drop the Kafka 0.8 connector?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11326:
---
Labels: pull-request-available  (was: )

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3, 1.7.1
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>  Labels: pull-request-available
>
> According to the Javadoc comments, we can use an offset to adjust windows to 
> timezones other than UTC±00:00. For example, in China you would have to 
> specify an offset of {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want window a stream by hour,but window begins at 
> the 15th minutes
>  * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then 
> you will get
>  * time windows start at 0:15:00,1:15:00,2:15:00,etc.
>  *
>  * Rather than that,if you are living in somewhere which is not using 
> UTC±00:00 time,
>  * such as China which is using UTC+08:00,and you want a time window with 
> size of one day,
>  * and window begins at every 00:00:00 of local time,you may use {@code 
> of(Time.days(1),Time.hours(-8))}.
>  * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 
> hours earlier than UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> However, when the offset is smaller than zero, an IllegalArgumentException 
> is thrown:
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-11402:
---

 Summary: User code can fail with an UnsatisfiedLinkError in the 
presence of multiple classloaders
 Key: FLINK-11402
 URL: https://issues.apache.org/jira/browse/FLINK-11402
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.7.0
Reporter: Ufuk Celebi
 Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz

As reported on the user mailing list thread "[`env.java.opts` not persisting 
after job canceled or failed and then 
restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E]",
 there can be issues with using native libraries and user code class loading.

h2. Steps to reproduce

I was able to reproduce the issue reported on the mailing list using 
[snappy-java|https://github.com/xerial/snappy-java] in a user program. Running 
the attached user program works fine on initial submission, but results in a 
failure when re-executed.

I'm using Flink 1.7.0 using a standalone cluster started via 
{{bin/start-cluster.sh}}.

0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
1. Download 
[snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
 and unpack libsnappyjava for your system:
{code}
jar tf snappy-java-1.1.7.2.jar | grep libsnappy
...
org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
...
org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
...
{code}
2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
(path needs to be adjusted for your system):
{code}
env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
{code}
3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
{code}
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program
Program execution finished
Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
Job Runtime: 359 ms
{code}
4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
{code}
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program


 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 
7d69baca58f33180cb9251449ddcd396)
  at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
  at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
  at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
  at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
  at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
  at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
  at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
  at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
  at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
  at 
org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
  at 
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
  at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
  at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
  ... 17 more
Caused by: java.lang.UnsatisfiedLinkError: Native Library 
/.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded in 
another classloader
  at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1907)
  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1861)
  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
  at java.lang.System.loadLibrary(System.java:1122)
  at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:182)
  at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
  at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
  at 


[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
stevenzwu commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249607011
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema<T> extends Serializable, 
ResultTypeQueryable<T> {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   We can add a new deserialize method to the `KeyedDeserializationSchema` 
interface, with a default implementation that simply forwards to the existing 
deserialize method (which we mark as deprecated).
   
   ```java
   default T deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws IOException {
       return deserialize(consumerRecord.key(), consumerRecord.value(),
           consumerRecord.topic(), consumerRecord.partition(), consumerRecord.offset());
   }
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-11214) Support non-mergeable aggregates for group windows on batch tables

2019-01-21 Thread Dian Fu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-11214.
---
Resolution: Won't Do

> Support non-mergeable aggregates for group windows on batch tables
> 
>
> Key: FLINK-11214
> URL: https://issues.apache.org/jira/browse/FLINK-11214
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, non-mergeable aggregates are not supported for group windows on 
> batch tables. It would be nice to support them, as many code paths (but not 
> all) have already been written with them in mind.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink for HBase

2019-01-21 Thread GitBox
Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add 
Pojo, Tuple, Row and Scala Product DataStream Sink for HBase
URL: https://github.com/apache/flink/pull/6628
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637422
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java
 ##
 @@ -370,8 +370,8 @@ else if (partitionsRemoved) {

keyPayload.get(keyBytes);
}
 
-   final T value = 
deserializer.deserialize(keyBytes, valueBytes,
-   
currentPartition.getTopic(), currentPartition.getPartition(), offset);
+   final T value = 
deserializer.deserialize(
 
 Review comment:
   Thank you for the suggestion, I will try.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9920) BucketingSinkFaultToleranceITCase fails on travis

2019-01-21 Thread Yun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748419#comment-16748419
 ] 

Yun Tang commented on FLINK-9920:
-

Another instance:  [https://api.travis-ci.org/v3/job/481937341/log.txt]

[~aljoscha] When you tried to verify this case locally, did you use 
hadoop-2.8.3 just like Travis does, and was the local OS Linux rather than 
macOS?

> BucketingSinkFaultToleranceITCase fails on travis
> -
>
> Key: FLINK-9920
> URL: https://issues.apache.org/jira/browse/FLINK-9920
> Project: Flink
>  Issue Type: Bug
>  Components: filesystem-connector
>Affects Versions: 1.6.0, 1.7.0
>Reporter: Chesnay Schepler
>Assignee: Aljoscha Krettek
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.8.0
>
>
> https://travis-ci.org/zentol/flink/jobs/407021898
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.082 sec 
> <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase
> runCheckpointedProgram(org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase)
>   Time elapsed: 5.696 sec  <<< FAILURE!
> java.lang.AssertionError: Read line does not match expected pattern.
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase.postSubmit(BucketingSinkFaultToleranceITCase.java:182)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (FLINK-11374) See more failovers and filter by time range

2019-01-21 Thread lining (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lining updated FLINK-11374:
---
Comment: was deleted

(was: agg by vertex

!image-2019-01-22-11-42-33-808.png!)

> See more failovers and filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Currently the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, the earlier failovers cannot 
> be seen. Can we add a time filter so that users can see the failovers, 
> including the task attempt failure messages?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11405) REST API can show exceptions filtered by start/end time

2019-01-21 Thread lining (JIRA)
lining created FLINK-11405:
--

 Summary: REST API can show exceptions filtered by start/end time
 Key: FLINK-11405
 URL: https://issues.apache.org/jira/browse/FLINK-11405
 Project: Flink
  Issue Type: Sub-task
Reporter: lining
Assignee: lining






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11404) Web UI: add a page to see failovers, with filtering by time

2019-01-21 Thread lining (JIRA)
lining created FLINK-11404:
--

 Summary: Web UI: add a page to see failovers, with filtering by time
 Key: FLINK-11404
 URL: https://issues.apache.org/jira/browse/FLINK-11404
 Project: Flink
  Issue Type: Sub-task
Reporter: lining
Assignee: lining






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-6101) GroupBy fields with arithmetic expression (include UDF) can not be selected

2019-01-21 Thread lincoln.lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln.lee closed FLINK-6101.
--
Resolution: Later

> GroupBy fields with arithmetic expression (include UDF) can not be selected
> ---
>
> Key: FLINK-6101
> URL: https://issues.apache.org/jira/browse/FLINK-6101
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: lincoln.lee
>Assignee: lincoln.lee
>Priority: Minor
>
> Currently the Table API does not support selecting groupBy fields that 
> contain expressions, either by the original field name or by the expression 
> itself:
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> caused
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> (BTW, this syntax is invalid in an RDBMS such as SQL Server, which will 
> indicate that the selected column is invalid in the select list because it 
> is not contained in either an aggregate function or the GROUP BY clause.)
> and 
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b%3, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> will also cause
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> Adding an alias in the groupBy clause, "group(e, 'b%3 as 'b)", works to no 
> avail, and applying a UDF doesn't work either:
> {code}
>table.groupBy('a, Mod('b, 3)).select('a, Mod('b, 3), 'c.count, 'c.count, 
> 'd.count, 'e.avg)
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [a, org.apache.flink.table.api.scala.batch.table.Mod$('b, 3), TMP_0, 
> TMP_1, TMP_2].
> {code}
> The only way to get this to work is:
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .select('a, 'b%3 as 'b, 'c, 'd, 'e)
> .groupBy('e, 'b)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> One way to solve this is to support aliases in the groupBy clause (it seems 
> a bit odd compared with SQL, though the Table API has a different groupBy 
> grammar); I prefer supporting selecting the original expressions and UDFs of 
> the groupBy clause (consistent with SQL), like this:
> {code}
> // use expression
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b % 3, 'c.min, 'e, 'a.avg, 'd.count)
> // use UDF
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, Mod('b,3))
> .select(Mod('b,3), 'c.min, 'e, 'a.avg, 'd.count)
> {code}

> After a look into the code, I found a problem in the groupBy implementation: 
> validation had not considered the expressions in the groupBy clause. It 
> should be noted that a table is actually changed after a groupBy operation 
> (it becomes a new Table), and the groupBy keys essentially replace the 
> original field references.
>  
> What do you think?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637189
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -82,93 +84,110 @@
  */
 @Internal
 public abstract class FlinkKafkaConsumerBase<T> extends 
RichParallelSourceFunction<T> implements
-   CheckpointListener,
-   ResultTypeQueryable,
-   CheckpointedFunction {
+   CheckpointListener,
 
 Review comment:
   Yes, sorry. Looks like I un-intentionally reformat it. Will revert.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11403) Remove ResultPartitionConsumableNotifier from ResultPartition

2019-01-21 Thread zhijiang (JIRA)
zhijiang created FLINK-11403:


 Summary: Remove ResultPartitionConsumableNotifier from 
ResultPartition
 Key: FLINK-11403
 URL: https://issues.apache.org/jira/browse/FLINK-11403
 Project: Flink
  Issue Type: Sub-task
  Components: Network
Reporter: zhijiang
Assignee: zhijiang
 Fix For: 1.8.0


This is a precondition for introducing a pluggable {{ShuffleService}} on the 
TM side.

In the current process of creating a {{ResultPartition}}, the 
{{ResultPartitionConsumableNotifier}}, a TM-level component, has to be passed 
into the constructor. In order to create a {{ResultPartition}} easily from the 
{{ShuffleService}}, the required information should be covered by the 
{{ResultPartitionDeploymentDescriptor}} as much as possible, so that we can 
remove this notifier from the constructor. It is also reasonable to notify of 
consumable partitions via {{TaskActions}}, which is already available in 
{{ResultPartition}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

