date:20200310

[jira] [Created] (FLINK-16537) Implement ResultPartition state recovery for unaligned checkpoint

2020-03-10 Thread Zhijiang (Jira)

Zhijiang created FLINK-16537:


 Summary: Implement ResultPartition state recovery for unaligned 
checkpoint
 Key: FLINK-16537
 URL: https://issues.apache.org/jira/browse/FLINK-16537
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing, Runtime / Network
Reporter: Zhijiang
 Fix For: 1.11.0


During recovery process for unaligned checkpoint, the partition state should 
also be recovered besides with existing operator states.

The ResultPartition would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective ResultSubpartition queue  in normal way.

It should guarantee that op can not process any inputs before finishing all the 
output recovery to avoid mis-order issue.

Refer to more details by 
[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python UDF Codegen Code

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python 
UDF Codegen Code
URL: https://github.com/apache/flink/pull/11358#issuecomment-596885127
 
 
   
   ## CI report:
   
   * beba9d15aaf629de9e09aa8885cdcc844a96ad30 Travis: 
[SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/152742281) Azure: 
[SUCCESS](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6148)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Updated] (FLINK-16536) Implement InputChannel state recovery for unaligned checkpoint

2020-03-10 Thread Zhijiang (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang updated FLINK-16536:
-
Description: 
During recovery process for unaligned checkpoint, the input channel state 
should also be recovered besides with existing operator states.

The InputGate would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective InputChannel queue for processing in normal way.

It should guarantee that the new data from upstream side should not overtake 
the input state data to avoid mis-order issue.

Refer to more details by 
[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]]

  was:
During recovery process for unaligned checkpoint, the input channel state 
should also be recovered besides with existing operator states.

The InputGate would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective InputChannel queue for processing in normal way.

It should guarantee that the new data from upstream side should not overtake 
the input state data to avoid mis-order issue.

Refer to more details by [design 
doc|[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]]


> Implement InputChannel state recovery for unaligned checkpoint
> --
>
> Key: FLINK-16536
> URL: https://issues.apache.org/jira/browse/FLINK-16536
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / Network
>Reporter: Zhijiang
>Priority: Major
> Fix For: 1.11.0
>
>
> During recovery process for unaligned checkpoint, the input channel state 
> should also be recovered besides with existing operator states.
> The InputGate would request buffer from local pool and then interact with 
> ChannelStateReader to fill in the state data.  The filled buffer would be 
> inserted into respective InputChannel queue for processing in normal way.
> It should guarantee that the new data from upstream side should not overtake 
> the input state data to avoid mis-order issue.
> Refer to more details by 
> [https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16536) Implement InputChannel state recovery for unaligned checkpoint

2020-03-10 Thread Zhijiang (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang updated FLINK-16536:
-
Description: 
During recovery process for unaligned checkpoint, the input channel state 
should also be recovered besides with existing operator states.

The InputGate would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective InputChannel queue for processing in normal way.

It should guarantee that the new data from upstream side should not overtake 
the input state data to avoid mis-order issue.

Refer to more details by 
[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]

  was:
During recovery process for unaligned checkpoint, the input channel state 
should also be recovered besides with existing operator states.

The InputGate would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective InputChannel queue for processing in normal way.

It should guarantee that the new data from upstream side should not overtake 
the input state data to avoid mis-order issue.

Refer to more details by 
[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]]


> Implement InputChannel state recovery for unaligned checkpoint
> --
>
> Key: FLINK-16536
> URL: https://issues.apache.org/jira/browse/FLINK-16536
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / Network
>Reporter: Zhijiang
>Priority: Major
> Fix For: 1.11.0
>
>
> During recovery process for unaligned checkpoint, the input channel state 
> should also be recovered besides with existing operator states.
> The InputGate would request buffer from local pool and then interact with 
> ChannelStateReader to fill in the state data.  The filled buffer would be 
> inserted into respective InputChannel queue for processing in normal way.
> It should guarantee that the new data from upstream side should not overtake 
> the input state data to avoid mis-order issue.
> Refer to more details by 
> [https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16536) Implement InputChannel state recovery for unaligned checkpoint

2020-03-10 Thread Zhijiang (Jira)

Zhijiang created FLINK-16536:


 Summary: Implement InputChannel state recovery for unaligned 
checkpoint
 Key: FLINK-16536
 URL: https://issues.apache.org/jira/browse/FLINK-16536
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing, Runtime / Network
Reporter: Zhijiang
 Fix For: 1.11.0


During recovery process for unaligned checkpoint, the input channel state 
should also be recovered besides with existing operator states.

The InputGate would request buffer from local pool and then interact with 
ChannelStateReader to fill in the state data.  The filled buffer would be 
inserted into respective InputChannel queue for processing in normal way.

It should guarantee that the new data from upstream side should not overtake 
the input state data to avoid mis-order issue.

Refer to more details by [design 
doc|[https://docs.google.com/document/d/16_MOQymzxrKvUHXh6QFr2AAXIKt_2vPUf8vzKy4H_tU/edit]]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390751400
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnResourceManagerTest.java
 ##
 @@ -584,6 +590,85 @@ public void testGetCpuExceedMaxInt() throws Exception {
}};
}
 
+   @Test
+   public void testWorkerSpecContainerResourceAdapter_MatchVcores() {
 
 Review comment:
   I prefer to follow the code style guide if possible but I'm ok with it atm. 
Someone who has better English proficiency may give us another idea about the 
name.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use 
Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407
 
 
   
   ## CI report:
   
   * 4d17b4de75015fade25228f8fa6668f0cf0d9dca UNKNOWN
   * ce48c4289d1f953d5f906c6e350c3ee8d971e225 UNKNOWN
   * a7b433be79f2a32bae1ba2ec0edc2fa772f2d1ca UNKNOWN
   * ecf642eb56230c99c2a1a21e25ba8283208e9c7b Travis: 
[SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/152747904) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6150)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Comment Edited] (FLINK-16477) CorrelateTest.testCorrelatePythonTableFunction test fails

2020-03-10 Thread Zhijiang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056642#comment-17056642
 ] 

Zhijiang edited comment on FLINK-16477 at 3/11/20, 5:13 AM:


Another instance [https://api.travis-ci.org/v3/job/660629180/log.txt]

[https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_apis/build/builds/6146/logs/362]


was (Author: zjwang):
Another instance [https://api.travis-ci.org/v3/job/660629180/log.txt]

> CorrelateTest.testCorrelatePythonTableFunction test fails
> -
>
> Key: FLINK-16477
> URL: https://issues.apache.org/jira/browse/FLINK-16477
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
>
> Build: 
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6038=logs=d47ab8d2-10c7-5d9e-8178-ef06a797a0d8=9a1abf5f-7cf4-58c3-bb2a-282a64aebb1f
> logs
> {code}
> 2020-03-07T00:35:53.9694141Z [ERROR] Failures: 
> 2020-03-07T00:35:53.9695913Z [ERROR]   
> CorrelateTest.testCorrelatePythonTableFunction:127 planBefore 
> expected:<...PythonTableFunction$[8c725e0df55184e548883d4b29709539]($0, $1)], 
> rowType=[...> but 
> was:<...PythonTableFunction$[201b6fc5cb9d565450dafbaaf1a440ab]($0, $1)], 
> rowType=[...>
> 2020-03-07T00:35:53.9698944Z [ERROR]   
> CorrelateTest.testCorrelatePythonTableFunction:188 planBefore 
> expected:<...PythonTableFunction$[8c725e0df55184e548883d4b29709539]($0, $1)], 
> rowType=[...> but 
> was:<...PythonTableFunction$[201b6fc5cb9d565450dafbaaf1a440ab]($0, $1)], 
> rowType=[...>
> 2020-03-07T00:35:53.9700745Z [INFO] 
> 2020-03-07T00:35:53.9701456Z [ERROR] Tests run: 4329, Failures: 2, Errors: 0, 
> Skipped: 14
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-16534) Support specify savepoint path when submitting sql job through sql client

2020-03-10 Thread Yu Li (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056650#comment-17056650
 ] 

Yu Li commented on FLINK-16534:
---

The requirement stems from this user-zh ML discussion: 
http://apache-flink.147419.n8.nabble.com/flink-sql-join-state-state-td1901.html

> Support specify savepoint path when submitting sql job through sql client
> -
>
> Key: FLINK-16534
> URL: https://issues.apache.org/jira/browse/FLINK-16534
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Client
>Reporter: Kurt Young
>Priority: Major
>
> When user submitted a sql job via sql client, they can stop/pause the job 
> with savepoint. But after that, if user want to resume such job with the 
> savepoint, there is no such feature yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16535) BatchTableSink#emitDataSet returns DataSink

2020-03-10 Thread godfrey he (Jira)

godfrey he created FLINK-16535:
--

 Summary: BatchTableSink#emitDataSet returns DataSink
 Key: FLINK-16535
 URL: https://issues.apache.org/jira/browse/FLINK-16535
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: godfrey he
 Fix For: 1.11.0


Add return value for {{BatchTableSink#emitDataSet}} to support generating 
{{DataSet}} plan in {{BatchTableEnvironment}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16533) ExecutionEnvironment supports execution of existing plan

2020-03-10 Thread godfrey he (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-16533:
---
Summary: ExecutionEnvironment supports execution of existing plan  (was: 
ExecutionEnvironment supports executing plan)

> ExecutionEnvironment supports execution of existing plan
> 
>
> Key: FLINK-16533
> URL: https://issues.apache.org/jira/browse/FLINK-16533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: godfrey he
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, {{ExecutionEnvironment}} only supports executing the plan 
> generated by self.
> FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
> the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
> {{DataStream}}/{{DataSet}} program. This requires that 
> {{ExecutionEnvironment}} can execute the plan generated by 
> {{TableEnvironment}}. We propose to add two methods in  
> {{ExecutionEnvironment}}: (which is similar to 
> {{StreamExecutionEnvironment}}#execute(StreamGraph) and 
> {{StreamExecutionEnvironment}}#executeAsync(StreamGraph))
> {code:java}
> public class ExecutionEnvironment {
> @Internal
> public JobExecutionResult execute(Plan plan) throws Exception {
> .
> }
> @Internal
> public JobClient executeAsync(Plan plan) throws Exception {
> .
> }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16533) ExecutionEnvironment supports executing plan

2020-03-10 Thread godfrey he (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-16533:
---
Description: 
Currently, {{ExecutionEnvironment}} only supports executing the plan generated 
by self.
FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
{{DataStream}}/{{DataSet}} program. This requires that {{ExecutionEnvironment}} 
can execute the plan generated by {{TableEnvironment}}. We propose to add two 
methods in  {{ExecutionEnvironment}}: (which is similar to 
{{StreamExecutionEnvironment}}#execute(StreamGraph) and 
{{StreamExecutionEnvironment}}#executeAsync(StreamGraph))

{code:java}
public class ExecutionEnvironment {
@Internal
public JobExecutionResult execute(Plan plan) throws Exception {
.
}

@Internal
public JobClient executeAsync(Plan plan) throws Exception {
.
}
}
{code}






  was:
Currently, {{ExecutionEnvironment}} only supports executing the plan generated 
by self.
FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
{{DataStream}}/{{DataSet}} program. This requires that {{ExecutionEnvironment}} 
can execute the plan generated by {{TableEnvironment}}. We propose to add two 
methods in  {{ExecutionEnvironment}}: (which is similar to 
{{StreamExecutionEnvironment}}#execute(StreamGraph) and 
{{StreamExecutionEnvironment}}#executeAsync(StreamGraph))

{code:java}
@Internal
public JobExecutionResult execute(Plan plan) throws Exception {
.
}

@Internal
public JobClient executeAsync(Plan plan) throws Exception {
.
}
{code}







> ExecutionEnvironment supports executing plan
> 
>
> Key: FLINK-16533
> URL: https://issues.apache.org/jira/browse/FLINK-16533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: godfrey he
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, {{ExecutionEnvironment}} only supports executing the plan 
> generated by self.
> FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
> the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
> {{DataStream}}/{{DataSet}} program. This requires that 
> {{ExecutionEnvironment}} can execute the plan generated by 
> {{TableEnvironment}}. We propose to add two methods in  
> {{ExecutionEnvironment}}: (which is similar to 
> {{StreamExecutionEnvironment}}#execute(StreamGraph) and 
> {{StreamExecutionEnvironment}}#executeAsync(StreamGraph))
> {code:java}
> public class ExecutionEnvironment {
> @Internal
> public JobExecutionResult execute(Plan plan) throws Exception {
> .
> }
> @Internal
> public JobClient executeAsync(Plan plan) throws Exception {
> .
> }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16534) Support specify savepoint path when submitting sql job through sql client

2020-03-10 Thread Kurt Young (Jira)

Kurt Young created FLINK-16534:
--

 Summary: Support specify savepoint path when submitting sql job 
through sql client
 Key: FLINK-16534
 URL: https://issues.apache.org/jira/browse/FLINK-16534
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Client
Reporter: Kurt Young


When user submitted a sql job via sql client, they can stop/pause the job with 
savepoint. But after that, if user want to resume such job with the savepoint, 
there is no such feature yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16533) ExecutionEnvironment supports executing plan

2020-03-10 Thread godfrey he (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-16533:
---
Summary: ExecutionEnvironment supports executing plan  (was: 
ExecutionEnvironment supports executing existing plan)

> ExecutionEnvironment supports executing plan
> 
>
> Key: FLINK-16533
> URL: https://issues.apache.org/jira/browse/FLINK-16533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: godfrey he
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, {{ExecutionEnvironment}} only supports executing the plan 
> generated by self.
> FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
> the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
> {{DataStream}}/{{DataSet}} program. This requires that 
> {{ExecutionEnvironment}} can execute the plan generated by 
> {{TableEnvironment}}. We propose to add two methods in  
> {{ExecutionEnvironment}}: (which is similar to 
> {{StreamExecutionEnvironment}}#execute(StreamGraph) and 
> {{StreamExecutionEnvironment}}#executeAsync(StreamGraph))
> {code:java}
> @Internal
> public JobExecutionResult execute(Plan plan) throws Exception {
> .
> }
> @Internal
> public JobClient executeAsync(Plan plan) throws Exception {
> .
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16533) ExecutionEnvironment supports executing existing plan

2020-03-10 Thread godfrey he (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-16533:
---
Summary: ExecutionEnvironment supports executing existing plan  (was: 
ExecutionEnvironment supports executing plan)

> ExecutionEnvironment supports executing existing plan
> -
>
> Key: FLINK-16533
> URL: https://issues.apache.org/jira/browse/FLINK-16533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Client / Job Submission
>Reporter: godfrey he
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, {{ExecutionEnvironment}} only supports executing the plan 
> generated by self.
> FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
> the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
> {{DataStream}}/{{DataSet}} program. This requires that 
> {{ExecutionEnvironment}} can execute the plan generated by 
> {{TableEnvironment}}. We propose to add two methods in  
> {{ExecutionEnvironment}}: (which is similar to 
> {{StreamExecutionEnvironment}}#execute(StreamGraph) and 
> {{StreamExecutionEnvironment}}#executeAsync(StreamGraph))
> {code:java}
> @Internal
> public JobExecutionResult execute(Plan plan) throws Exception {
> .
> }
> @Internal
> public JobClient executeAsync(Plan plan) throws Exception {
> .
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16533) ExecutionEnvironment supports executing plan

2020-03-10 Thread godfrey he (Jira)

godfrey he created FLINK-16533:
--

 Summary: ExecutionEnvironment supports executing plan
 Key: FLINK-16533
 URL: https://issues.apache.org/jira/browse/FLINK-16533
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Reporter: godfrey he
 Fix For: 1.11.0


Currently, {{ExecutionEnvironment}} only supports executing the plan generated 
by self.
FLIP-84 proposes {{TableEnvironment}} can only trigger the table program and 
the {{StreamExecutionEnvironment}}/{{ExecutionEnvironment}} can only trigger 
{{DataStream}}/{{DataSet}} program. This requires that {{ExecutionEnvironment}} 
can execute the plan generated by {{TableEnvironment}}. We propose to add two 
methods in  {{ExecutionEnvironment}}: (which is similar to 
{{StreamExecutionEnvironment}}#execute(StreamGraph) and 
{{StreamExecutionEnvironment}}#executeAsync(StreamGraph))

{code:java}
@Internal
public JobExecutionResult execute(Plan plan) throws Exception {
.
}

@Internal
public JobClient executeAsync(Plan plan) throws Exception {
.
}
{code}








--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-16477) CorrelateTest.testCorrelatePythonTableFunction test fails

2020-03-10 Thread Zhijiang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056642#comment-17056642
 ] 

Zhijiang commented on FLINK-16477:
--

Another instance [https://api.travis-ci.org/v3/job/660629180/log.txt]

> CorrelateTest.testCorrelatePythonTableFunction test fails
> -
>
> Key: FLINK-16477
> URL: https://issues.apache.org/jira/browse/FLINK-16477
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner, Tests
>Affects Versions: 1.11.0
>Reporter: Robert Metzger
>Priority: Critical
>  Labels: test-stability
>
> Build: 
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=6038=logs=d47ab8d2-10c7-5d9e-8178-ef06a797a0d8=9a1abf5f-7cf4-58c3-bb2a-282a64aebb1f
> logs
> {code}
> 2020-03-07T00:35:53.9694141Z [ERROR] Failures: 
> 2020-03-07T00:35:53.9695913Z [ERROR]   
> CorrelateTest.testCorrelatePythonTableFunction:127 planBefore 
> expected:<...PythonTableFunction$[8c725e0df55184e548883d4b29709539]($0, $1)], 
> rowType=[...> but 
> was:<...PythonTableFunction$[201b6fc5cb9d565450dafbaaf1a440ab]($0, $1)], 
> rowType=[...>
> 2020-03-07T00:35:53.9698944Z [ERROR]   
> CorrelateTest.testCorrelatePythonTableFunction:188 planBefore 
> expected:<...PythonTableFunction$[8c725e0df55184e548883d4b29709539]($0, $1)], 
> rowType=[...> but 
> was:<...PythonTableFunction$[201b6fc5cb9d565450dafbaaf1a440ab]($0, $1)], 
> rowType=[...>
> 2020-03-07T00:35:53.9700745Z [INFO] 
> 2020-03-07T00:35:53.9701456Z [ERROR] Tests run: 4329, Failures: 2, Errors: 0, 
> Skipped: 14
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] 
Make YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390742962
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/TestingYarnAMRMClientAsync.java
 ##
 @@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.yarn;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.yarn.api.records.ContainerId;
+import org.apache.hadoop.yarn.api.records.Priority;
+import org.apache.hadoop.yarn.api.records.Resource;
+import org.apache.hadoop.yarn.client.api.AMRMClient;
+import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
+import org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.function.Consumer;
+import java.util.function.Function;
+
+/**
+ * A Yarn {@link AMRMClientAsync} implementation for testing.
+ */
+public class TestingYarnAMRMClientAsync extends 
AMRMClientAsyncImpl {
 
 Review comment:
   The problem is that `AMRMClientAsync` is an abstract class and we have to 
implement all its abstract methods if extending it. There are some abstract 
methods in later Hadoop versions that we cannot easily implement for early 
versions, because the absence of argument/return value type.
   
   Although `AMRMClientAsyncImpl` is unstable, we are not really depending on 
its implementation. The methods we override are all declared in 
`AMRMClientAsync`, which is stable. So it should not be a problem.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] 
Make YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390741693
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -615,4 +632,85 @@ protected double getCpuCores(final Configuration 
configuration) {
//noinspection NumericCastThatLosesPrecision
return cpuCoresLong;
}
+
+   /**
+* Utility class for converting between Flink {@link 
WorkerResourceSpec} and Yarn {@link Resource}.
+*/
+   @VisibleForTesting
+   static class WorkerSpecContainerResourceAdapter {
+   private final Configuration flinkConfig;
+   private final int minMemMB;
+   private final int minVcore;
+   private final boolean matchVcores;
+   private final Map 
workerSpecToContainerResource;
+   private final Map> 
containerResourceToWorkerSpecs;
+   private final Map> 
containerMemoryToContainerResource;
+
+   @VisibleForTesting
+   WorkerSpecContainerResourceAdapter(
+   final Configuration flinkConfig,
+   final int minMemMB,
+   final int minVcore,
+   final boolean matchVcores) {
+   this.flinkConfig = 
Preconditions.checkNotNull(flinkConfig);
+   this.minMemMB = minMemMB;
+   this.minVcore = minVcore;
+   this.matchVcores = matchVcores;
+   workerSpecToContainerResource = new HashMap<>();
+   containerResourceToWorkerSpecs = new HashMap<>();
+   containerMemoryToContainerResource = new HashMap<>();
+   }
+
+   @VisibleForTesting
+   Resource getContainerResource(final WorkerResourceSpec 
workerResourceSpec) {
+   return workerSpecToContainerResource.computeIfAbsent(
+   Preconditions.checkNotNull(workerResourceSpec),
+   this::createAndMapContainerResource);
+   }
+
+   @VisibleForTesting
+   Collection getWorkerSpecs(final Resource 
containerResource) {
+   return 
getEquivalentContainerResource(containerResource).stream()
+   .flatMap(resource -> 
containerResourceToWorkerSpecs.getOrDefault(resource, 
Collections.emptyList()).stream())
+   .collect(Collectors.toList());
+   }
+
+   @VisibleForTesting
+   Collection getEquivalentContainerResource(final 
Resource containerResource) {
+   // Yarn might ignore the requested vcores, depending on 
its configurations.
+   // In such cases, we should also not matching vcores.
+   return matchVcores ?
+   Collections.singletonList(containerResource) :
+   
containerMemoryToContainerResource.getOrDefault(containerResource.getMemory(), 
Collections.emptyList());
+   }
+
+   private Resource createAndMapContainerResource(final 
WorkerResourceSpec workerResourceSpec) {
+   // TODO: need to unset process/flink memory size from 
configuration if dynamic worker resource is activated
+   final TaskExecutorProcessSpec taskExecutorProcessSpec =
+   
TaskExecutorProcessUtils.processSpecFromWorkerResourceSpec(flinkConfig, 
workerResourceSpec);
+   final Resource containerResource = Resource.newInstance(
+   
normalize(taskExecutorProcessSpec.getTotalProcessMemorySize().getMebiBytes(), 
minMemMB),
+   
normalize(taskExecutorProcessSpec.getCpuCores().getValue().intValue(), 
minVcore));
+   
containerResourceToWorkerSpecs.computeIfAbsent(containerResource, ignored -> 
new ArrayList<>())
+   .add(workerResourceSpec);
+   
containerMemoryToContainerResource.computeIfAbsent(containerResource.getMemory(),
 ignored -> new HashSet<>())
 
 Review comment:
   True, we don't need `containerMemoryToContainerResource` if not matching 
vcores. However, I would try avoid unnecessary `if-else` branches.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] 
Make YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390740551
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -615,4 +632,85 @@ protected double getCpuCores(final Configuration 
configuration) {
//noinspection NumericCastThatLosesPrecision
return cpuCoresLong;
}
+
+   /**
+* Utility class for converting between Flink {@link 
WorkerResourceSpec} and Yarn {@link Resource}.
+*/
+   @VisibleForTesting
+   static class WorkerSpecContainerResourceAdapter {
+   private final Configuration flinkConfig;
+   private final int minMemMB;
+   private final int minVcore;
+   private final boolean matchVcores;
+   private final Map 
workerSpecToContainerResource;
+   private final Map> 
containerResourceToWorkerSpecs;
+   private final Map> 
containerMemoryToContainerResource;
 
 Review comment:
   I think the upper bond of the amount of records really depends on how many 
different `WorkerResourceSpec` do we have.
   If we want to clean the unused records up, the 
`WorkerSpecContainerResourceAdapter` will need `YarnResourceManager` to tell it 
which `WorkerResourceSpec` is no longer needed (all corresponding TMs are 
completed and no pending ones). ATM, I don't see the necessity for such 
complexity. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] 
Make YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390740551
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -615,4 +632,85 @@ protected double getCpuCores(final Configuration 
configuration) {
//noinspection NumericCastThatLosesPrecision
return cpuCoresLong;
}
+
+   /**
+* Utility class for converting between Flink {@link 
WorkerResourceSpec} and Yarn {@link Resource}.
+*/
+   @VisibleForTesting
+   static class WorkerSpecContainerResourceAdapter {
+   private final Configuration flinkConfig;
+   private final int minMemMB;
+   private final int minVcore;
+   private final boolean matchVcores;
+   private final Map 
workerSpecToContainerResource;
+   private final Map> 
containerResourceToWorkerSpecs;
+   private final Map> 
containerMemoryToContainerResource;
 
 Review comment:
   I think the upper bond of the amount of records really depends on how many 
different `WorkerResourceSpec` do we have.
   If we want to clean the unused records up, the 
`WorkerSpecContainerResourceAdapter` will need `YarnResourceManager` to tell it 
which `WorkerResourceSpec` is no longer needed (all corresponding TMs are 
completed and no pending ones). ATM, I don't see the necessity for such 
comlexity. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Closed] (FLINK-16512) Add API to persist channel state

2020-03-10 Thread Zhijiang (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijiang closed FLINK-16512.

Resolution: Fixed

Merged in master: 1ad361b787ad2929b334a72e7f5be944f4fe27af

> Add API to persist channel state
> 
>
> Key: FLINK-16512
> URL: https://issues.apache.org/jira/browse/FLINK-16512
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use 
Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407
 
 
   
   ## CI report:
   
   * 4d17b4de75015fade25228f8fa6668f0cf0d9dca UNKNOWN
   * ce48c4289d1f953d5f906c6e350c3ee8d971e225 UNKNOWN
   * 78fe352f47cd907fe288357a69a14fa877710c28 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/152310129) 
   * a7b433be79f2a32bae1ba2ec0edc2fa772f2d1ca UNKNOWN
   * ecf642eb56230c99c2a1a21e25ba8283208e9c7b Travis: 
[PENDING](https://travis-ci.com/github/flink-ci/flink/builds/152747904) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6150)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] zhijiangW merged pull request #11354: [FLINK-16512][task] Unaligned checkpoints: API for persistence

2020-03-10 Thread GitBox

zhijiangW merged pull request #11354: [FLINK-16512][task] Unaligned 
checkpoints: API for persistence
URL: https://github.com/apache/flink/pull/11354
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Updated] (FLINK-16512) Add API to persist channel state

2020-03-10 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-16512:
---
Labels: pull-request-available  (was: )

> Add API to persist channel state
> 
>
> Key: FLINK-16512
> URL: https://issues.apache.org/jira/browse/FLINK-16512
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Roman Khachatryan
>Assignee: Roman Khachatryan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] 
Make YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390738660
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnResourceManagerTest.java
 ##
 @@ -584,6 +590,85 @@ public void testGetCpuExceedMaxInt() throws Exception {
}};
}
 
+   @Test
+   public void testWorkerSpecContainerResourceAdapter_MatchVcores() {
 
 Review comment:
   I'm sure about this either. That's why I kept it, for discussions in the PR.
   
   On one hand, I do find the following in the [code style 
guide](https://flink.apache.org/contributing/code-style-and-quality-formatting.html).
   > Non-static fields/methods must be in lower camel case.
   
   On the other hand, the underscores do provide better readability in the long 
test case name, and there are already test cases using them (e.g., 
MiniClusterConfigurationTest, CatalogTest, etc.). 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #11189: [FLINK-16236][yarn][hotfix] fix Yarn secure test not loading correct context factory

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #11189: 
[FLINK-16236][yarn][hotfix] fix Yarn secure test not loading correct context 
factory
URL: https://github.com/apache/flink/pull/11189#discussion_r390735933
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/security/contexts/SecurityContextFactory.java
 ##
 @@ -47,5 +48,5 @@ default boolean isCompatibleWith(final SecurityConfiguration 
securityConfig) {
 * @param securityConfig security configuration used to create context.
 * @return the security context object.
 */
-   SecurityContext createContext(SecurityConfiguration securityConfig);
+   SecurityContext createContext(SecurityConfiguration securityConfig) 
throws SecurityContextInitializeException;
 
 Review comment:
   I altered the API to throw explicit exception instead here, to quote the 
suggestion: 
   > we are losing the context of `e`. `e` might contain helpful information 
why the context could not be instantiated
   
   Since the API explicitly requires an override of the `isCompatibleWith` 
method explicitly to check through, I think throwing an exception makes sense 
here 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use 
Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407
 
 
   
   ## CI report:
   
   * 4d17b4de75015fade25228f8fa6668f0cf0d9dca UNKNOWN
   * ce48c4289d1f953d5f906c6e350c3ee8d971e225 UNKNOWN
   * 78fe352f47cd907fe288357a69a14fa877710c28 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/152310129) 
   * a7b433be79f2a32bae1ba2ec0edc2fa772f2d1ca UNKNOWN
   * ecf642eb56230c99c2a1a21e25ba8283208e9c7b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Assigned] (FLINK-16530) Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-16530:
---

Assignee: Jark Wu

> Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode
> 
>
> Key: FLINK-16530
> URL: https://issues.apache.org/jira/browse/FLINK-16530
> Project: Flink
>  Issue Type: Task
>  Components: Documentation, Table SQL / API
>Reporter: Jark Wu
>Assignee: Jark Wu
>Priority: Major
> Fix For: 1.10.1, 1.11.0
>
>
> "GROUPING SETS" and "CUBE" are already supported in streaming mode in blink 
> planner, but we missed to add that feature in the documentation. And that 
> confused some users [1].
> Note that it is only supported in blink planner, not old planner.
> [1]: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390733239
 
 

 ##
 File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnConfigKeys.java
 ##
 @@ -36,7 +36,10 @@
public static final String FLINK_JAR_PATH = "_FLINK_JAR_PATH"; // the 
Flink jar resource location (in HDFS).
public static final String FLINK_YARN_FILES = "_FLINK_YARN_FILES"; // 
the root directory for all yarn application files
 
+   @Deprecated
public static final String KEYTAB_PATH = "_KEYTAB_PATH";
+   public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH";
 
 Review comment:
   This is a good catch (to also response to other comments related) --> 
technically speaking there should only be one ConfigKey in the JM/TM 
perspective: the "REMOTE_KEYTAB_PATH", or previously known as "KEYTAB_PATH". 
   
   I would revert the changes I've done here. since if the keytab is shipped 
over as you mentioned, it doesn't matter whether it is originated from a remote 
ship location or pre-installed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390733239
 
 

 ##
 File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnConfigKeys.java
 ##
 @@ -36,7 +36,10 @@
public static final String FLINK_JAR_PATH = "_FLINK_JAR_PATH"; // the 
Flink jar resource location (in HDFS).
public static final String FLINK_YARN_FILES = "_FLINK_YARN_FILES"; // 
the root directory for all yarn application files
 
+   @Deprecated
public static final String KEYTAB_PATH = "_KEYTAB_PATH";
+   public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH";
 
 Review comment:
   This is a good catch (to also response to other comments related) --> 
technically speaking there should only be one ConfigKey in the JM/TM 
perspective: the "REMOTE_KEYTAB_PATH", or previously known as "KEYTAB_PATH" - 
as you mentioned, it doesn't matter whether it is originated from a remote ship 
location or pre-installed. 
   
   I would revert the changes I've done here. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r39071
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java
 ##
 @@ -941,10 +950,13 @@ private ApplicationReport startAppMaster(
// 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md#identity-on-an-insecure-cluster-hadoop_user_name
appMasterEnv.put(YarnConfigKeys.ENV_HADOOP_USER_NAME, 
UserGroupInformation.getCurrentUser().getUserName());
 
-   if (remotePathKeytab != null) {
-   appMasterEnv.put(YarnConfigKeys.KEYTAB_PATH, 
remotePathKeytab.toString());
+   if (localizedKeytabPath != null) {
 
 Review comment:
   +1, reverting 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390733381
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java
 ##
 @@ -887,20 +887,28 @@ private ApplicationReport startAppMaster(
}
}
 
-   // setup security tokens
Path remotePathKeytab = null;
+   String localizedKeytabPath = null;
String keytab = 
configuration.getString(SecurityOptions.KERBEROS_LOGIN_KEYTAB);
if (keytab != null) {
-   LOG.info("Adding keytab {} to the AM container local 
resource bucket", keytab);
-   remotePathKeytab = setupSingleLocalResource(
-   Utils.KEYTAB_FILE_NAME,
+   boolean requireLocalizedKeytab = 
flinkConfiguration.getBoolean(YarnConfigOptions.SHIP_LOCAL_KEYTAB);
+   localizedKeytabPath = 
flinkConfiguration.getString(YarnConfigOptions.LOCALIZED_KEYTAB_PATH);
+   if (requireLocalizedKeytab) {
+   // Localize the keytab to YARN containers via 
local resource.
+   LOG.info("Adding keytab {} to the AM container 
local resource bucket", keytab);
+   remotePathKeytab = setupSingleLocalResource(
+   localizedKeytabPath,
fs,
appId,
new Path(keytab),
localResources,
homeDir,
"",
fileReplication);
+   } else {
 
 Review comment:
   +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390732821
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/entrypoint/YarnEntrypointUtils.java
 ##
 @@ -90,16 +90,27 @@ public static Configuration loadConfiguration(String 
workingDirectory, Map

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390733373
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java
 ##
 @@ -887,20 +887,28 @@ private ApplicationReport startAppMaster(
}
}
 
-   // setup security tokens
Path remotePathKeytab = null;
+   String localizedKeytabPath = null;
String keytab = 
configuration.getString(SecurityOptions.KERBEROS_LOGIN_KEYTAB);
if (keytab != null) {
-   LOG.info("Adding keytab {} to the AM container local 
resource bucket", keytab);
-   remotePathKeytab = setupSingleLocalResource(
-   Utils.KEYTAB_FILE_NAME,
+   boolean requireLocalizedKeytab = 
flinkConfiguration.getBoolean(YarnConfigOptions.SHIP_LOCAL_KEYTAB);
+   localizedKeytabPath = 
flinkConfiguration.getString(YarnConfigOptions.LOCALIZED_KEYTAB_PATH);
+   if (requireLocalizedKeytab) {
 
 Review comment:
   +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #7702: [FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #7702: 
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files
URL: https://github.com/apache/flink/pull/7702#discussion_r390733239
 
 

 ##
 File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnConfigKeys.java
 ##
 @@ -36,7 +36,10 @@
public static final String FLINK_JAR_PATH = "_FLINK_JAR_PATH"; // the 
Flink jar resource location (in HDFS).
public static final String FLINK_YARN_FILES = "_FLINK_YARN_FILES"; // 
the root directory for all yarn application files
 
+   @Deprecated
public static final String KEYTAB_PATH = "_KEYTAB_PATH";
+   public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH";
 
 Review comment:
   This is a good catch (to also response to other comment --> technically 
speaking there should only be one ConfigKey in the JM/TM perspective: the 
"REMOTE_KEYTAB_PATH", or previously known as "KEYTAB_PATH". 
   
   I would revert the changes I've done here. since if the keytab is shipped 
over as you mentioned, it doesn't matter whether it is originated from a remote 
ship location or pre-installed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python UDF Codegen Code

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python 
UDF Codegen Code
URL: https://github.com/apache/flink/pull/11358#issuecomment-596885127
 
 
   
   ## CI report:
   
   * beba9d15aaf629de9e09aa8885cdcc844a96ad30 Travis: 
[SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/152742281) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6148)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #7368: [FLINK-10742][network] Let Netty use 
Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#issuecomment-567435407
 
 
   
   ## CI report:
   
   * 4d17b4de75015fade25228f8fa6668f0cf0d9dca UNKNOWN
   * ce48c4289d1f953d5f906c6e350c3ee8d971e225 UNKNOWN
   * 78fe352f47cd907fe288357a69a14fa877710c28 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/152310129) 
   * a7b433be79f2a32bae1ba2ec0edc2fa772f2d1ca UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Updated] (FLINK-16532) Cannot insert white space characters as partition value for Hive table

2020-03-10 Thread Rui Li (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated FLINK-16532:
---
Description: 
The following test fails:
{code}
create table dest (x int) partitioned by (p string);
insert into dest select 1,' ';
{code}
With error:
{noformat}
Caused by: MetaException(message:Invalid partition key & values; keys [p, ], 
values [])
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result$get_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result$get_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:2204)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:2189)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:1307)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:169)
at com.sun.proxy.$Proxy33.getPartition(Unknown Source)
at 
org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.getPartition(HiveMetastoreClientWrapper.java:127)
at 
org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.getPartition(HiveTableMetaStoreFactory.java:88)
at 
org.apache.flink.table.filesystem.PartitionLoader.loadPartition(PartitionLoader.java:69)
at 
org.apache.flink.table.filesystem.FileSystemCommitter.commitSingleCheckpoint(FileSystemCommitter.java:111)
at 
org.apache.flink.table.filesystem.FileSystemCommitter.commitUpToCheckpoint(FileSystemCommitter.java:99)
at 
org.apache.flink.table.filesystem.FileSystemOutputFormat.finalizeGlobal(FileSystemOutputFormat.java:90)
... 34 more
{noformat}

> Cannot insert white space characters as partition value for Hive table
> --
>
> Key: FLINK-16532
> URL: https://issues.apache.org/jira/browse/FLINK-16532
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Rui Li
>Priority: Major
>
> The following test fails:
> {code}
> create table dest (x int) partitioned by (p string);
> insert into dest select 1,' ';
> {code}
> With error:
> {noformat}
> Caused by: MetaException(message:Invalid partition key & values; keys [p, ], 
> values [])
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result$get_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result$get_partition_resultStandardScheme.read(ThriftHiveMetastore.java)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partition_result.read(ThriftHiveMetastore.java)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:2204)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:2189)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:1307)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:169)
>   at com.sun.proxy.$Proxy33.getPartition(Unknown Source)
>   at 
> org.apache.flink.table.catalog.hive.client.HiveMetastoreClientWrapper.getPartition(HiveMetastoreClientWrapper.java:127)
>   at 
> org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.getPartition(HiveTableMetaStoreFactory.java:88)
>   at 
>

[jira] [Created] (FLINK-16532) Cannot insert white space characters as partition value for Hive table

2020-03-10 Thread Rui Li (Jira)

Rui Li created FLINK-16532:
--

 Summary: Cannot insert white space characters as partition value 
for Hive table
 Key: FLINK-16532
 URL: https://issues.apache.org/jira/browse/FLINK-16532
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.10.0
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (FLINK-16523) The improvement for local state of rocksdb backend.

2020-03-10 Thread luojiangyu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luojiangyu closed FLINK-16523.
--
Resolution: Fixed

> The improvement for local state of rocksdb backend.
> ---
>
> Key: FLINK-16523
> URL: https://issues.apache.org/jira/browse/FLINK-16523
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / State Backends
>Affects Versions: 1.9.0, 1.9.1, 1.10.0
>Reporter: luojiangyu
>Priority: Major
>
> The state for local recovery of the rocksdb backend.  Using the copy way for 
> local state causes write IO pressure and write amplification。Using HardLink 
> way is more performance than Using copy way， Whether replacing the copy way 
> with hardlink is better choice.(CheckpointStreamWithResultProvider.java) . 
> [~richtesn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390730646
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/TestingYarnAMRMClientAsync.java
 ##
 @@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.yarn;
+
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.util.Preconditions;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.yarn.api.records.ContainerId;
+import org.apache.hadoop.yarn.api.records.Priority;
+import org.apache.hadoop.yarn.api.records.Resource;
+import org.apache.hadoop.yarn.client.api.AMRMClient;
+import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
+import org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.List;
+import java.util.function.Consumer;
+import java.util.function.Function;
+
+/**
+ * A Yarn {@link AMRMClientAsync} implementation for testing.
+ */
+public class TestingYarnAMRMClientAsync extends 
AMRMClientAsyncImpl {
 
 Review comment:
   Is there any benefit we could get from extending `AMRMClientAsyncImpl` 
instead of `AMRMClientAsync`? This class is annotated with `Unstable`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #11344: [FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #11344: 
[FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline
URL: https://github.com/apache/flink/pull/11344#discussion_r390727205
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/test/java/org/apache/flink/ml/pipeline/UserDefinedPipelineStages.java
 ##
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.pipeline;
+
+import org.apache.flink.ml.api.core.Transformer;
+import org.apache.flink.ml.api.misc.param.Params;
+import org.apache.flink.ml.params.shared.colname.HasSelectedCols;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.TableEnvironment;
+
+/**
+ * Util class for testing {@link org.apache.flink.ml.api.core.PipelineStage}.
 
 Review comment:
   was wondering why this java class is created here not earlier. is this only 
to test how pyflink uses JavaTransformer?
   Can we reuse the `FakeTransformer` class in TransfomerBaseTest? we can 
refactor it out or make it public if necessary IMO. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #11344: [FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #11344: 
[FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline
URL: https://github.com/apache/flink/pull/11344#discussion_r390729953
 
 

 ##
 File path: flink-python/pyflink/ml/api/base.py
 ##
 @@ -0,0 +1,275 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+import re
+
+from abc import ABCMeta, abstractmethod
+
+from pyflink.table.table_environment import TableEnvironment
+from pyflink.table.table import Table
+from pyflink.ml.api.param import WithParams, Params
+from py4j.java_gateway import get_field
+
+
+class PipelineStage(WithParams):
+"""
+Base class for a stage in a pipeline. The interface is only a concept, and 
does not have any
+actual functionality. Its subclasses must be either Estimator or 
Transformer. No other classes
+should inherit this interface directly.
+
+Each pipeline stage is with parameters, and requires a public empty 
constructor for
+restoration in Pipeline.
+"""
+
+def __init__(self, params=None):
+if params is None:
+self._params = Params()
+else:
+self._params = params
+
+def get_params(self) -> Params:
+return self._params
+
+def _convert_params_to_java(self, j_pipeline_stage):
+for param in self._params._param_map:
+java_param = self._make_java_param(j_pipeline_stage, param)
+java_value = self._make_java_value(self._params._param_map[param])
+j_pipeline_stage.set(java_param, java_value)
+
+@staticmethod
+def _make_java_param(j_pipeline_stage, param):
+# camel case to snake case
+name = re.sub(r'(? str:
+return self.get_params().to_json()
+
+def load_json(self, json: str) -> None:
+self.get_params().load_json(json)
+
+
+class Transformer(PipelineStage):
+"""
+A transformer is a PipelineStage that transforms an input Table to a 
result Table.
+"""
+
+__metaclass__ = ABCMeta
+
+@abstractmethod
+def transform(self, table_env: TableEnvironment, table: Table) -> Table:
+"""
+Applies the transformer on the input table, and returns the result 
table.
+
+:param table_env: the table environment to which the input table is 
bound.
+:param table: the table to be transformed
+:returns: the transformed table
+"""
+raise NotImplementedError()
+
+
+class JavaTransformer(Transformer):
+"""
+Base class for Transformer that wrap Java implementations. Subclasses 
should
+ensure they have the transformer Java object available as j_obj.
+"""
+
+def __init__(self, j_obj):
+super().__init__()
+self._j_obj = j_obj
+
+def transform(self, table_env: TableEnvironment, table: Table) -> Table:
+"""
+Applies the transformer on the input table, and returns the result 
table.
+
+:param table_env: the table environment to which the input table is 
bound.
+:param table: the table to be transformed
+:returns: the transformed table
+"""
+self._convert_params_to_java(self._j_obj)
+return Table(self._j_obj.transform(table_env._j_tenv, table._j_table))
+
+
+class Model(Transformer):
+"""
+Abstract class for models that are fitted by estimators.
+
+A model is an ordinary Transformer except how it is created. While 
ordinary transformers
+are defined by specifying the parameters directly, a model is usually 
generated by an Estimator
+when Estimator.fit(table_env, table) is invoked.
+"""
+
+__metaclass__ = ABCMeta
+
+
+class JavaModel(JavaTransformer, Model):
+"""
+Base class for JavaTransformer that wrap Java implementations.
+Subclasses should ensure they have the model Java object available as 
j_obj.
+"""
+
+
+class Estimator(PipelineStage):
+"""
+Estimators are PipelineStages responsible for training and generating 
machine learning models.
+
+The implementations are expected to take an input table as

[GitHub] [flink] KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390715114
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -615,4 +632,85 @@ protected double getCpuCores(final Configuration 
configuration) {
//noinspection NumericCastThatLosesPrecision
return cpuCoresLong;
}
+
+   /**
+* Utility class for converting between Flink {@link 
WorkerResourceSpec} and Yarn {@link Resource}.
+*/
+   @VisibleForTesting
+   static class WorkerSpecContainerResourceAdapter {
+   private final Configuration flinkConfig;
+   private final int minMemMB;
+   private final int minVcore;
+   private final boolean matchVcores;
+   private final Map 
workerSpecToContainerResource;
+   private final Map> 
containerResourceToWorkerSpecs;
+   private final Map> 
containerMemoryToContainerResource;
 
 Review comment:
   These records seem never to be cleaned up. It will not cause any problem atm 
though.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] walterddr commented on a change in pull request #11344: [FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline

2020-03-10 Thread GitBox

walterddr commented on a change in pull request #11344: 
[FLINK-16250][python][ml] Add interfaces for PipelineStage and Pipeline
URL: https://github.com/apache/flink/pull/11344#discussion_r390727653
 
 

 ##
 File path: flink-python/pyflink/ml/api/base.py
 ##
 @@ -0,0 +1,275 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+import re
+
+from abc import ABCMeta, abstractmethod
+
+from pyflink.table.table_environment import TableEnvironment
+from pyflink.table.table import Table
+from pyflink.ml.api.param import WithParams, Params
+from py4j.java_gateway import get_field
+
+
+class PipelineStage(WithParams):
+"""
+Base class for a stage in a pipeline. The interface is only a concept, and 
does not have any
+actual functionality. Its subclasses must be either Estimator or 
Transformer. No other classes
+should inherit this interface directly.
+
+Each pipeline stage is with parameters, and requires a public empty 
constructor for
+restoration in Pipeline.
+"""
+
+def __init__(self, params=None):
+if params is None:
+self._params = Params()
+else:
+self._params = params
+
+def get_params(self) -> Params:
+return self._params
+
+def _convert_params_to_java(self, j_pipeline_stage):
+for param in self._params._param_map:
+java_param = self._make_java_param(j_pipeline_stage, param)
+java_value = self._make_java_value(self._params._param_map[param])
+j_pipeline_stage.set(java_param, java_value)
+
+@staticmethod
+def _make_java_param(j_pipeline_stage, param):
+# camel case to snake case
+name = re.sub(r'(?

[GitHub] [flink] godfreyhe commented on issue #11296: [FLINK-16363] [table] Correct the execution behavior of TableEnvironment and StreamTableEnvironment

2020-03-10 Thread GitBox

godfreyhe commented on issue #11296: [FLINK-16363] [table] Correct the 
execution behavior of TableEnvironment and StreamTableEnvironment
URL: https://github.com/apache/flink/pull/11296#issuecomment-597429594
 
 
   @hequn8128 @dianfu could you help review the python test change part ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390711414
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnResourceManagerTest.java
 ##
 @@ -584,6 +590,85 @@ public void testGetCpuExceedMaxInt() throws Exception {
}};
}
 
+   @Test
+   public void testWorkerSpecContainerResourceAdapter_MatchVcores() {
 
 Review comment:
   Not sure we could add underscores in the method name.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

KarmaGYZ commented on a change in pull request #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r390722307
 
 

 ##
 File path: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java
 ##
 @@ -615,4 +632,85 @@ protected double getCpuCores(final Configuration 
configuration) {
//noinspection NumericCastThatLosesPrecision
return cpuCoresLong;
}
+
+   /**
+* Utility class for converting between Flink {@link 
WorkerResourceSpec} and Yarn {@link Resource}.
+*/
+   @VisibleForTesting
+   static class WorkerSpecContainerResourceAdapter {
+   private final Configuration flinkConfig;
+   private final int minMemMB;
+   private final int minVcore;
+   private final boolean matchVcores;
+   private final Map 
workerSpecToContainerResource;
+   private final Map> 
containerResourceToWorkerSpecs;
+   private final Map> 
containerMemoryToContainerResource;
+
+   @VisibleForTesting
+   WorkerSpecContainerResourceAdapter(
+   final Configuration flinkConfig,
+   final int minMemMB,
+   final int minVcore,
+   final boolean matchVcores) {
+   this.flinkConfig = 
Preconditions.checkNotNull(flinkConfig);
+   this.minMemMB = minMemMB;
+   this.minVcore = minVcore;
+   this.matchVcores = matchVcores;
+   workerSpecToContainerResource = new HashMap<>();
+   containerResourceToWorkerSpecs = new HashMap<>();
+   containerMemoryToContainerResource = new HashMap<>();
+   }
+
+   @VisibleForTesting
+   Resource getContainerResource(final WorkerResourceSpec 
workerResourceSpec) {
+   return workerSpecToContainerResource.computeIfAbsent(
+   Preconditions.checkNotNull(workerResourceSpec),
+   this::createAndMapContainerResource);
+   }
+
+   @VisibleForTesting
+   Collection getWorkerSpecs(final Resource 
containerResource) {
+   return 
getEquivalentContainerResource(containerResource).stream()
+   .flatMap(resource -> 
containerResourceToWorkerSpecs.getOrDefault(resource, 
Collections.emptyList()).stream())
+   .collect(Collectors.toList());
+   }
+
+   @VisibleForTesting
+   Collection getEquivalentContainerResource(final 
Resource containerResource) {
+   // Yarn might ignore the requested vcores, depending on 
its configurations.
+   // In such cases, we should also not matching vcores.
+   return matchVcores ?
+   Collections.singletonList(containerResource) :
+   
containerMemoryToContainerResource.getOrDefault(containerResource.getMemory(), 
Collections.emptyList());
+   }
+
+   private Resource createAndMapContainerResource(final 
WorkerResourceSpec workerResourceSpec) {
+   // TODO: need to unset process/flink memory size from 
configuration if dynamic worker resource is activated
+   final TaskExecutorProcessSpec taskExecutorProcessSpec =
+   
TaskExecutorProcessUtils.processSpecFromWorkerResourceSpec(flinkConfig, 
workerResourceSpec);
+   final Resource containerResource = Resource.newInstance(
+   
normalize(taskExecutorProcessSpec.getTotalProcessMemorySize().getMebiBytes(), 
minMemMB),
+   
normalize(taskExecutorProcessSpec.getCpuCores().getValue().intValue(), 
minVcore));
+   
containerResourceToWorkerSpecs.computeIfAbsent(containerResource, ignored -> 
new ArrayList<>())
+   .add(workerResourceSpec);
+   
containerMemoryToContainerResource.computeIfAbsent(containerResource.getMemory(),
 ignored -> new HashSet<>())
 
 Review comment:
   Seems we don't need to add it to `containerMemoryToContainerResource` if 
`matchVcores` is false.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on issue #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on issue #54: [FLINK-16515][docs] Refactor statefun 
documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#issuecomment-597429233
 
 
   > I've also noticed that the ingresses/egress structures are not here, maybe 
@tzulitai can help here.
   
   Yes I can help out here, as a follow-up after merging this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] jinglining commented on a change in pull request #11250: [FLINK-16302][rest]add log list and read log by name for taskmanager

2020-03-10 Thread GitBox

jinglining commented on a change in pull request #11250: [FLINK-16302][rest]add 
log list and read log by name for taskmanager
URL: https://github.com/apache/flink/pull/11250#discussion_r390286564
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java
 ##
 @@ -900,7 +921,7 @@ public void heartbeatFromResourceManager(ResourceID 
resourceID) {
}
 
@Override
-   public CompletableFuture requestFileUpload(FileType 
fileType, Time timeout) {
+   public CompletableFuture requestFileUpload(FileType 
fileType, String fileName, Time timeout) {
 
 Review comment:
   Sounds good,  but also update methods' name may be better. WDYT？
   ```java
   CompletableFuture requestFileUploadByName(String fileName, 
@RpcTimeout Time timeout);
   
   CompletableFuture requestFileUploadByType(FileType 
fileType, @RpcTimeout Time timeout);
   ```
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Created] (FLINK-16531) Add full integration tests for "GROUPING SETS" for streaming mode

2020-03-10 Thread Jark Wu (Jira)

Jark Wu created FLINK-16531:
---

 Summary: Add full integration tests for "GROUPING SETS" for 
streaming mode
 Key: FLINK-16531
 URL: https://issues.apache.org/jira/browse/FLINK-16531
 Project: Flink
  Issue Type: Test
  Components: Table SQL / Planner
Reporter: Jark Wu
 Fix For: 1.11.0


We have a plan test for GROUPING SETS for streaming mode, i.e. 
{{GroupingSetsTest}}. But we should also have a full IT coverage for it, just 
like batch's 
{{org.apache.flink.table.planner.runtime.batch.sql.agg.GroupingSetsITCase}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16530) Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16530:

Description: 
"GROUPING SETS" and "CUBE" are already supported in streaming mode in blink 
planner, but we missed to add that feature in the documentation. And that 
confused some users [1].

Note that it is only supported in blink planner, not old planner.

[1]: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html

  was:
"GROUPING SETS" and "CUBE" are already supported in streaming mode, but we 
missed to add that feature in the documentation. And that confused some users 
[1].

[1]: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html


> Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode
> 
>
> Key: FLINK-16530
> URL: https://issues.apache.org/jira/browse/FLINK-16530
> Project: Flink
>  Issue Type: Task
>  Components: Documentation, Table SQL / API
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.10.1, 1.11.0
>
>
> "GROUPING SETS" and "CUBE" are already supported in streaming mode in blink 
> planner, but we missed to add that feature in the documentation. And that 
> confused some users [1].
> Note that it is only supported in blink planner, not old planner.
> [1]: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16530) Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16530:

Description: 
"GROUPING SETS" and "CUBE" are already supported in streaming mode, but we 
missed to add that feature in the documentation. And that confused some users 
[1].

[1]: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html

> Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode
> 
>
> Key: FLINK-16530
> URL: https://issues.apache.org/jira/browse/FLINK-16530
> Project: Flink
>  Issue Type: Task
>  Components: Documentation, Table SQL / API
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.10.1, 1.11.0
>
>
> "GROUPING SETS" and "CUBE" are already supported in streaming mode, but we 
> missed to add that feature in the documentation. And that confused some users 
> [1].
> [1]: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Question-on-the-SQL-quot-GROUPING-SETS-quot-and-quot-CUBE-quot-syntax-usability-td33504.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16530) Add documentation about "GROUPING SETS" and "CUBE" support in streaming mode

2020-03-10 Thread Jark Wu (Jira)

Jark Wu created FLINK-16530:
---

 Summary: Add documentation about "GROUPING SETS" and "CUBE" 
support in streaming mode
 Key: FLINK-16530
 URL: https://issues.apache.org/jira/browse/FLINK-16530
 Project: Flink
  Issue Type: Task
  Components: Documentation, Table SQL / API
Reporter: Jark Wu
 Fix For: 1.10.1, 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-16526) Escape character doesn't work for computed column

2020-03-10 Thread Danny Chen (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056608#comment-17056608
 ] 

Danny Chen commented on FLINK-16526:


Thanks, [~liuyufei], you should try

{code:sql}
`timestamp`   AS json_row['timestamp']
{code}

instead.


> Escape character doesn't work for computed column
> -
>
> Key: FLINK-16526
> URL: https://issues.apache.org/jira/browse/FLINK-16526
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: YufeiLiu
>Priority: Major
>
> {code:sql}
> json_row  ROW<`timestamp` BIGINT>,
> `timestamp`   AS `json_row`.`timestamp`
> {code}
> It translate to "SELECT json_row.timestamp FROM __temp_table__"
> Throws exception "Encountered ". timestamp" at line 1, column 157. Was 
> expecting one of:..."



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16529) Add ignore_parse_errors() method to Json format in python API

2020-03-10 Thread Jark Wu (Jira)

Jark Wu created FLINK-16529:
---

 Summary: Add ignore_parse_errors() method to Json format in python 
API
 Key: FLINK-16529
 URL: https://issues.apache.org/jira/browse/FLINK-16529
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Reporter: Jark Wu


We forgot to add corresponding {{ignore_parse_errors}} to {{Json}} class in 
Python API in FLINK-15396.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16529) Add ignore_parse_errors() method to Json format in python API

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16529:

Component/s: Table SQL / Ecosystem

> Add ignore_parse_errors() method to Json format in python API
> -
>
> Key: FLINK-16529
> URL: https://issues.apache.org/jira/browse/FLINK-16529
> Project: Flink
>  Issue Type: New Feature
>  Components: API / Python, Table SQL / Ecosystem
>Reporter: Jark Wu
>Priority: Major
>
> We forgot to add corresponding {{ignore_parse_errors}} to {{Json}} class in 
> Python API in FLINK-15396.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (FLINK-15396) Support to ignore parse errors for JSON format

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu resolved FLINK-15396.
-
Fix Version/s: 1.11.0
   Resolution: Fixed

Resolved in master (1.11.0): 87ebe5533d3cd675b416146189abb0f6de61559d

> Support to ignore parse errors for JSON format
> --
>
> Key: FLINK-15396
> URL: https://issues.apache.org/jira/browse/FLINK-15396
> Project: Flink
>  Issue Type: New Feature
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Jark Wu
>Assignee: Zou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We already support {{'format.ignore-parse-errors'}} to skip dirty records in 
> CSV format. We can also support it in JSON format. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] 
Let Netty use Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#discussion_r390719053
 
 

 ##
 File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/CreditBasedPartitionRequestClientHandlerTest.java
 ##
 @@ -443,10 +587,13 @@ private static BufferResponse createBufferResponse(
// Skip general header bytes
serialized.readBytes(NettyMessage.FRAME_HEADER_LENGTH);
 
-   // Deserialize the bytes again. We have to go this way, because 
we only partly deserialize
-   // the header of the response and wait for a buffer from the 
buffer pool to copy the payload
-   // data into.
-   BufferResponse deserialized = 
BufferResponse.readFrom(serialized);
+   // Deserialize the bytes again. We have to go this way to 
ensure the data buffer part
+   // is consistent with the input channel sent to.
 
 Review comment:
   I think the previous comments is trying to explain why it needs to 
deserialize the bytes again: since it need to reach the state that the message 
is deserialized but not copied to a Flink Buffer yet. I thought that the 
comment might not be consistent with current implementation. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python UDF Codegen Code

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python 
UDF Codegen Code
URL: https://github.com/apache/flink/pull/11358#issuecomment-596885127
 
 
   
   ## CI report:
   
   * cf3f3aac4aa052f71bd4954fd55b48c84350edf6 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/152569306) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6104)
 
   * beba9d15aaf629de9e09aa8885cdcc844a96ad30 Travis: 
[PENDING](https://travis-ci.com/github/flink-ci/flink/builds/152742281) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6148)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] 
Let Netty use Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#discussion_r390717265
 
 

 ##
 File path: 
flink-end-to-end-tests/test-scripts/test_taskmanager_direct_memory.sh
 ##
 @@ -0,0 +1,47 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+source "$(dirname "$0")"/common.sh
+
+TEST=flink-taskmanager-direct-memory-test
+TEST_PROGRAM_NAME=TaskManagerDirectMemoryTestProgram
+TEST_PROGRAM_JAR=${END_TO_END_DIR}/$TEST/target/$TEST_PROGRAM_NAME.jar
+
+set_config_key "akka.ask.timeout" "60 s"
+set_config_key "web.timeout" "6"
+
+set_config_key "taskmanager.memory.process.size" "1536m"
+
+set_config_key "taskmanager.memory.managed.size" "8" # 8Mb
+set_config_key "taskmanager.memory.network.min" "256mb"
+set_config_key "taskmanager.memory.network.max" "256mb"
+set_config_key "taskmanager.memory.jvm-metaspace.size" "64m"
+
+set_config_key "taskmanager.numberOfTaskSlots" "20" # 20 slots per TM
+set_config_key "taskmanager.network.netty.num-arenas" "1" # Use only one arena 
for each TM
 
 Review comment:
   Fixed the comments


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai edited a comment on issue #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai edited a comment on issue #54: [FLINK-16515][docs] Refactor statefun 
documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#issuecomment-597414511
 
 
   One last thing:
   
   
![image](https://user-images.githubusercontent.com/5284370/76376742-05102b00-6384-11ea-8a3a-856da861130d.png)
   
   Not sure if it's just me, but just at first sight the navigation under the 
"Concepts" tab seems a bit strange to just have a "Logical Instances" sub-page.
   
   I think it should have at-least 2 sub-pages: 1) "Application building 
blocks"? 2) "Logical Instances".
   
   The application building blocks page basically contains the content you have 
on `concepts/index`, but we move those to a sub-page, and have `concepts/index` 
just providing some overview content and have users click through "Next" 
reading the sub-pages one by one.
   
   Not sure if that makes it better or worse, what do you think?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai edited a comment on issue #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai edited a comment on issue #54: [FLINK-16515][docs] Refactor statefun 
documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#issuecomment-597414511
 
 
   One last thing:
   
   
![image](https://user-images.githubusercontent.com/5284370/76376742-05102b00-6384-11ea-8a3a-856da861130d.png)
   
   Not sure if it's just me, but just at first sight the navigation under the 
"Concepts" tab seems a bit strange to just have a "Logical Instances" sub-page.
   
   I think it should have at-least 2 sub-pages: 1) "Application building 
blocks"? 2) "Logical Instances".
   
   The application building blocks page basically contains the content you have 
on `concepts/index`, but we move those to said sub-page, and have 
`concepts/index` just providing some overview content and have users click 
through "Next" reading the sub-pages one by one.
   
   Not sure if that makes it better or worse, what do you think?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] wuchong closed pull request #11119: [FLINK-15396][json] Support to ignore parse errors for JSON format

2020-03-10 Thread GitBox

wuchong closed pull request #9: [FLINK-15396][json] Support to ignore parse 
errors for JSON format
URL: https://github.com/apache/flink/pull/9
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on issue #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on issue #54: [FLINK-16515][docs] Refactor statefun 
documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#issuecomment-597414511
 
 
   One last thing:
   
   
![image](https://user-images.githubusercontent.com/5284370/76376742-05102b00-6384-11ea-8a3a-856da861130d.png)
   
   Not sure if it's just me, but just at first sight the navigation under the 
"Concepts" tab seems a bit strange to just have a "Logical Instances" sub-page.
   
   I think it should have at-least 2 sub-pages: 1) "Application building 
blocks"? 2) "Logical Instances".
   The application building blocks page basically contain the content you have 
on `concepts/index`, but we move those to a sub-page, and have `concepts/index` 
just providing some overview content and have users click through "Next" 
reading the sub-pages one by one.
   
   Not sure if that makes it better or worse, what do you think?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390706590
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
+
+.. figure:: ../_static/images/concepts/statefun-app-state.svg
+:width: 85%
+:align: center
+
+One of Apache Flink's core strengths is its ability to provide fault-tolerant 
local state.
+When inside a function, while it is performing some computation, you are 
always working with local state in local variables.
+
+Fault Tolerance
+===
+
+For both state and messaging, Stateful Function's is still able to provide the 
exactly once guaruntees users expect from a modern data processessing framework.
+
+.. figure:: ../_static/images/concepts/statefun-app-fault-tolerance.svg
+:width: 85%
+:align: center
+
+In the case of failure, the entire state of the world (both persisted values 
and messages) are rolled back to simulate completely failure free execution.
+
+These guaruntees are provided with no database required, instead Stateful 
Function's leverages Apache Flink's proven snapshotting mechanism.
 
 Review comment:
   ```suggestion
   These guarantees are provided with no database required, instead Stateful 
Function's leverages Apache Flink's proven snapshotting mechanism.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390711708
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
+
+  message Address {
+
+message FunctionType {
+  string namespace = 1;
+  string name  = 2;
+}
+
+FunctionType function_type = 1;
+string id = 2;
+  }
+
+
+An address is made of two components, a ``FunctionType`` and ``ID``.
+A function type is similar to a class in an object-oriented language; it 
declares what sort of function the address references.
+The id is a primary key, it scopes the function call to a specific instances 
of the function type.
+
+When a function is being invoked, all actions - including reads and writes of 
persisted values - are scoped to the current address.
+
+For example, imagine a there was a Stateful Function application to track the 
inventory of a warehouse.
+One possible implementation could include an ``Inventory`` function that 
tracks the number units in stock for a particular item; this would be the 
function type.
+There would then be one logical instance of this type for each SKU the 
warehouse manages.
+If it were clothing, there might be an instance for shirts and another for 
pants; "shirt" and "pant" would be two ids.
+Each instance may be interacted with and messaged independently.
+The application is free to create as many instances as there are types of 
items in inventory.
+
+Function Lifecycle
+==
+
+Logical functions are neither created nor destroyed, but always exist 
throughout the lifetime of an application.
+When an application starts, each parallel worker of the framework will create 
one physical object per function type.
+This object will be used to execute all logical instances of that type that 
are run by that particular worker.
+The first time a message is sent to an address, it will be as if that instance 
had always existed with its persisted values returning ``NULL``.
 
 Review comment:
   ```suggestion
   The first time a message is sent to an address, it will be as if that 
instance had always existed with its persisted states being empty.
   ```
   
   Not all states are initially `null`, some are empty collections for example.
   Suggesting a more general term here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390713139
 
 

 ##
 File path: statefun-docs/docs/getting_started/walkthrough.rst
 ##
 @@ -168,7 +168,7 @@ Each time a message is processed, the function computes a 
personalized message f
 It reads and updates the number of times that user has been seen and sends a 
greeting to the egress.
 
 You can check the full code for the application described in this walkthrough 
`here <{examples}/statefun-greeter-example>`_.
-In particular, take a look at the :ref:`module ` GreetingModule, which 
is the main entry point for the full application, to see how everything gets 
tied together.
+In particular, take a look at the module GreetingModule, which is the main 
entry point for the full application, to see how everything gets tied together.
 
 Review comment:
   ```suggestion
   In particular, take a look at the module ``GreetingModule``, which is the 
main entry point for the full application, to see how everything gets tied 
together.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390712156
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
+
+  message Address {
+
+message FunctionType {
+  string namespace = 1;
+  string name  = 2;
+}
+
+FunctionType function_type = 1;
+string id = 2;
+  }
+
+
+An address is made of two components, a ``FunctionType`` and ``ID``.
+A function type is similar to a class in an object-oriented language; it 
declares what sort of function the address references.
+The id is a primary key, it scopes the function call to a specific instances 
of the function type.
+
+When a function is being invoked, all actions - including reads and writes of 
persisted values - are scoped to the current address.
+
+For example, imagine a there was a Stateful Function application to track the 
inventory of a warehouse.
+One possible implementation could include an ``Inventory`` function that 
tracks the number units in stock for a particular item; this would be the 
function type.
+There would then be one logical instance of this type for each SKU the 
warehouse manages.
+If it were clothing, there might be an instance for shirts and another for 
pants; "shirt" and "pant" would be two ids.
+Each instance may be interacted with and messaged independently.
+The application is free to create as many instances as there are types of 
items in inventory.
+
+Function Lifecycle
+==
+
+Logical functions are neither created nor destroyed, but always exist 
throughout the lifetime of an application.
+When an application starts, each parallel worker of the framework will create 
one physical object per function type.
+This object will be used to execute all logical instances of that type that 
are run by that particular worker.
+The first time a message is sent to an address, it will be as if that instance 
had always existed with its persisted values returning ``NULL``.
+
+Clearing all persisted values of a type is the same as destroying it.
 
 Review comment:
   ```suggestion
   Clearing all persisted states of a function instance is the same as 
destroying it.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390705594
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
+
+.. figure:: ../_static/images/concepts/statefun-app-state.svg
+:width: 85%
+:align: center
+
+One of Apache Flink's core strengths is its ability to provide fault-tolerant 
local state.
 
 Review comment:
   Same comment here about "local".
   While it is true for all Flink function / operators, in Stateful Functions 
it's not always true.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390710405
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
+
+  message Address {
+
+message FunctionType {
+  string namespace = 1;
+  string name  = 2;
+}
+
+FunctionType function_type = 1;
+string id = 2;
+  }
+
+
+An address is made of two components, a ``FunctionType`` and ``ID``.
+A function type is similar to a class in an object-oriented language; it 
declares what sort of function the address references.
+The id is a primary key, it scopes the function call to a specific instances 
of the function type.
+
+When a function is being invoked, all actions - including reads and writes of 
persisted values - are scoped to the current address.
 
 Review comment:
   ```suggestion
   When a function is being invoked, all actions - including reads and writes 
of persisted states - are scoped to the current address.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390714422
 
 

 ##
 File path: statefun-docs/docs/sdk/modules.rst
 ##
 @@ -0,0 +1,96 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _modules:
+
+###
+Modules
+###
+
+Stateful Function applications are composed of one or more ``Modules``.
+A module is a bundle of functions that are loaded by the runtime and available 
to be messaged.
+Functions from all loaded modules are multiplexed and free to message each 
other arbitrarily.
+
+Stateful Functions supports two types of modules: Embedded and Remote.
+
+.. contents:: :local:
+
+.. _embedded_module:
+
+Embedded Module
+===
+
+Embedded modules are co-located with, and embedded within, the {flink} runtime.
+
+This module type only supports JVM based languages and are defined by 
implementing the ``StatefulFunctionModule`` interface.
+Embedded modules offer a single configuration method where stateful functions 
are bound to the system based on their :ref:`function type `.
+Runtime configurations are available through the ``globalConfiguration``, 
which is the union of all configurations in the applications 
``flink-conf.yaml`` under the prefix ``statefun.module.global-config`` and any 
command line arguments passed in the form ``--key value``.
+
+.. literalinclude:: 
../../src/main/java/org/apache/flink/statefun/docs/BasicFunctionModule.java
+:language: java
+:lines: 18-
+
+Embedded modules leverage `Java’s Service Provider Interfaces (SPI) 
`_ for 
discovery.
+This means that every JAR should contain a file 
``org.apache.flink.statefun.sdk.spi.StatefulFunctionModule`` in the 
``META_INF/services`` resource directory that lists all available modules that 
it provides.
 
 Review comment:
   There's a more straightforward way to do that now, using `@AutoService` 
annotation on the module classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390707590
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
+
+.. figure:: ../_static/images/concepts/statefun-app-state.svg
+:width: 85%
+:align: center
+
+One of Apache Flink's core strengths is its ability to provide fault-tolerant 
local state.
+When inside a function, while it is performing some computation, you are 
always working with local state in local variables.
+
+Fault Tolerance
+===
+
+For both state and messaging, Stateful Function's is still able to provide the 
exactly once guaruntees users expect from a modern data processessing framework.
+
+.. figure:: ../_static/images/concepts/statefun-app-fault-tolerance.svg
+:width: 85%
+:align: center
+
+In the case of failure, the entire state of the world (both persisted values 
and messages) are rolled back to simulate completely failure free execution.
+
+These guaruntees are provided with no database required, instead Stateful 
Function's leverages Apache Flink's proven snapshotting mechanism.
+
+Event Egress
+
+
+Finally, applications can output data to external systems via event egress's.
+
+.. figure:: ../_static/images/concepts/statefun-app-egress.svg
+:width: 85%
+:align: center
+
+Of course, functions perform arbitrary computation and can do whatever they 
like, which includes making RPC calls and connecting to other systems.
+By using an event egress, applications can leverage pre-built interfaces and 
data output is tied into the systems fault tolerance to avoid data loss.
 
 Review comment:
   Does this sentence imply that making arbitrary RPC calls / connecting to 
other systems can have data loss, without using event egresses? It sort of read 
to me like that, it being after the `Of course, functions perform arbitrary 
...` sentence.
   
   But that isn't true, as using async operations we still guarantee at-least 
once interactions with external systems.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390704803
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
 
 Review comment:
   ```suggestion
   Persisted State
   
   
   The first is that all functions have locally embedded state, known as 
persisted states.
   ```
   
   Maybe call these persisted states instead. Otherwise, the term persisted 
value now points to a very specific primitive, along side persisted tables / 
appending buffers, etc.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390709762
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
 
 Review comment:
   ```suggestion
   Instead, an ``Address`` is used to reference a specific stateful function in 
the system..
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390709148
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
 
 Review comment:
   As of now: the differences mentioned on this page are:
   - persisted state
   - fault tolerance
   
   Another one is logical instances. While that has its own dedicated page, I 
wonder if we should also very briefly mention it here, or have a link that 
directly jumps to the logical instances page.
   Not sure, if that breaks the "reading flow" for the user.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390706518
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
+
+.. figure:: ../_static/images/concepts/statefun-app-state.svg
+:width: 85%
+:align: center
+
+One of Apache Flink's core strengths is its ability to provide fault-tolerant 
local state.
+When inside a function, while it is performing some computation, you are 
always working with local state in local variables.
+
+Fault Tolerance
+===
+
+For both state and messaging, Stateful Function's is still able to provide the 
exactly once guaruntees users expect from a modern data processessing framework.
+
+.. figure:: ../_static/images/concepts/statefun-app-fault-tolerance.svg
+:width: 85%
+:align: center
+
+In the case of failure, the entire state of the world (both persisted values 
and messages) are rolled back to simulate completely failure free execution.
 
 Review comment:
   ```suggestion
   In the case of failure, the entire state of the world (both persisted states 
and messages) are rolled back to simulate completely failure free execution.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390703680
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
 
 Review comment:
   ```suggestion
   This can be anything from a Kafka topic, to a messsage queue, to an http 
request -
   anything that can get data into the system and trigger the intitial 
functions to begin computation.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390710336
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
+
+  message Address {
+
+message FunctionType {
+  string namespace = 1;
+  string name  = 2;
+}
+
+FunctionType function_type = 1;
+string id = 2;
+  }
+
+
+An address is made of two components, a ``FunctionType`` and ``ID``.
+A function type is similar to a class in an object-oriented language; it 
declares what sort of function the address references.
+The id is a primary key, it scopes the function call to a specific instances 
of the function type.
 
 Review comment:
   ```suggestion
   The id is a primary key, which scopes the function call to a specific 
instance of the function type.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390709575
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
 
 Review comment:
   Can we make this title, and "Fault Tolerance" one level below the other 
titles like "Event Ingress" / "Event Egress" / "Stateful Functions"? The reason 
for that is because these are sub-details further explaining "Stateful 
Functions", and it striked me a bit odd that all these sections are the same 
level.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390712287
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
+
+  message Address {
+
+message FunctionType {
+  string namespace = 1;
+  string name  = 2;
+}
+
+FunctionType function_type = 1;
+string id = 2;
+  }
+
+
+An address is made of two components, a ``FunctionType`` and ``ID``.
+A function type is similar to a class in an object-oriented language; it 
declares what sort of function the address references.
+The id is a primary key, it scopes the function call to a specific instances 
of the function type.
+
+When a function is being invoked, all actions - including reads and writes of 
persisted values - are scoped to the current address.
+
+For example, imagine a there was a Stateful Function application to track the 
inventory of a warehouse.
+One possible implementation could include an ``Inventory`` function that 
tracks the number units in stock for a particular item; this would be the 
function type.
+There would then be one logical instance of this type for each SKU the 
warehouse manages.
+If it were clothing, there might be an instance for shirts and another for 
pants; "shirt" and "pant" would be two ids.
+Each instance may be interacted with and messaged independently.
+The application is free to create as many instances as there are types of 
items in inventory.
+
+Function Lifecycle
+==
+
+Logical functions are neither created nor destroyed, but always exist 
throughout the lifetime of an application.
+When an application starts, each parallel worker of the framework will create 
one physical object per function type.
+This object will be used to execute all logical instances of that type that 
are run by that particular worker.
+The first time a message is sent to an address, it will be as if that instance 
had always existed with its persisted values returning ``NULL``.
+
+Clearing all persisted values of a type is the same as destroying it.
+If an instance has no state and is not actively running, then it occupies no 
CPU, no threads, and no memory.
+
+An instance with data stored in one or more of its persisted values only 
occupies the resources necessary to store that data.
 
 Review comment:
   ```suggestion
   An instance with data stored in one or more of its persisted states only 
occupies the resources necessary to store that data.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390710157
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
+Users are encouraged to model their applications as granularly as possible, 
based on what makes the most sense for their application, instead of desigining 
applications around resource constraints.
+
+.. contents:: :local:
+
+.. _address:
+
+Function Address
+
+
+In a local environment, the address of an object is the same as a reference to 
it.
+But in a Stateful Function's application, function instances are virtual and 
their runtime location is not exposed to the user.
+Instead, an ``Address`` is used to reference a specific stateful functions in 
the system..
+
+.. code-block:: proto
+
+  syntax = "proto3";
 
 Review comment:
   Why use protobuf to illustrate this?
   It seems unrelated, also might confuse users that they need to construct 
`Address` as a protobuf message.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390705435
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
 
 Review comment:
   Also not sure of the mention "locally" here. It seems to be a bit dangerous 
to use, considering that remote request-reply protocol functions do not really 
have their state available locally.
   
   How about something along the lines of:
   `The first is that all functions have *something*-scoped state, known as 
persisted states.`
   I'm afraid while I feel unsure about it, I also don't have much better 
suggestions  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390708228
 
 

 ##
 File path: statefun-docs/docs/concepts/logical.rst
 ##
 @@ -0,0 +1,79 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _logical-functions:
+
+#
+Logical Functions
+#
+
+Stateful Function's are allocated logically, which means the system can 
support an unbounded number of instances with a finite amount of resources.
+Logical instances do not use CPU, memory, or threads when not actively being 
invoked, there is no theoretical upper limit on the number of instances that 
can created.
 
 Review comment:
   ```suggestion
   Logical instances do not use CPU, memory, or threads when not actively being 
invoked, so there is no theoretical upper limit on the number of instances that 
can created.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink-statefun] tzulitai commented on a change in pull request #54: [FLINK-16515][docs] Refactor statefun documentation for multi-language SDKs

2020-03-10 Thread GitBox

tzulitai commented on a change in pull request #54: [FLINK-16515][docs] 
Refactor statefun documentation for multi-language SDKs
URL: https://github.com/apache/flink-statefun/pull/54#discussion_r390706263
 
 

 ##
 File path: statefun-docs/docs/concepts/index.rst
 ##
 @@ -0,0 +1,97 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+   http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _concepts:
+
+
+Concepts
+
+
+.. toctree::
+  :hidden:
+
+  logical
+
+Stateful Functions provides a framework for building event drivent 
applications.
+Here, we explain important aspects of Stateful Function’s architecture.
+
+.. contents:: :local:
+
+Event Ingress
+=
+
+Stateful Function applications sit squarely in the event driven space, so the 
natural place to start is with getting events into the system.
+
+.. figure:: ../_static/images/concepts/statefun-app-ingress.svg
+:width: 85%
+:align: center
+
+In stateful functions, the component that ingests records into the system is 
called an event ingress.
+This can be anything from a Kafka topic, to a messsage queue, to an http 
request.
+Anything that can get data into the system and trigger the intitial functions 
to begin compution.
+
+Stateful Functions
+==
+
+At the core of the diagram are the namesake stateful functions.
+
+.. figure:: ../_static/images/concepts/statefun-app-functions.svg
+:width: 85%
+:align: center
+
+Think of these as the building blocks for your service.
+They can message each other arbitrarily, which is one way in which this 
framework moves away from the traditional stream processing view of the world.
+Instead of building up a static dataflow DAG, these functions can communicate 
with each other in arbitrary, potentially cyclic, even round trip ways.
+
+If you are familiar with actor programming, this does share certain 
similarities in its ability to dynamically message between components.
+However, there are a number of significant differences.
+
+Persisted Values
+
+
+The first is that all functions have locally embedded state, known as 
persisted values.
+
+.. figure:: ../_static/images/concepts/statefun-app-state.svg
+:width: 85%
+:align: center
+
+One of Apache Flink's core strengths is its ability to provide fault-tolerant 
local state.
+When inside a function, while it is performing some computation, you are 
always working with local state in local variables.
+
+Fault Tolerance
+===
+
+For both state and messaging, Stateful Function's is still able to provide the 
exactly once guaruntees users expect from a modern data processessing framework.
 
 Review comment:
   ```suggestion
   For both state and messaging, Stateful Functions is still able to provide 
the exactly-once guarantees users expect from a modern data processessing 
framework.
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python UDF Codegen Code

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #11358: [FLINK-16516][python] Remove Python 
UDF Codegen Code
URL: https://github.com/apache/flink/pull/11358#issuecomment-596885127
 
 
   
   ## CI report:
   
   * cf3f3aac4aa052f71bd4954fd55b48c84350edf6 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/152569306) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6104)
 
   * beba9d15aaf629de9e09aa8885cdcc844a96ad30 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] jiasheng55 commented on a change in pull request #11322: [FLINK-16376][yarn] Use consistent method to get Yarn application dir…

2020-03-10 Thread GitBox

jiasheng55 commented on a change in pull request #11322: [FLINK-16376][yarn] 
Use consistent method to get Yarn application dir…
URL: https://github.com/apache/flink/pull/11322#discussion_r390714696
 
 

 ##
 File path: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnFileStageTestS3ITCase.java
 ##
 @@ -161,10 +161,8 @@ private void testRecursiveUploadForYarn(String scheme, 
String pathSuffix) throws
assumeFalse(fs.exists(basePath));
 
try {
-   final Path directory = new Path(basePath, pathSuffix);
-

YarnFileStageTest.testCopyFromLocalRecursive(fs.getHadoopFileSystem(),
-   new 
org.apache.hadoop.fs.Path(directory.toUri()), Path.CUR_DIR, tempFolder, false);
+   Path.CUR_DIR, tempFolder, false);
 
 Review comment:
   @kl0u Sorry I didn't reproduce the failure on my Travis and I don't know how 
to run `YarnFileStageTestS3ITCase` manually, could you tell me how to reproduce 
this failure?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [flink] flinkbot edited a comment on issue #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager

2020-03-10 Thread GitBox

flinkbot edited a comment on issue #11353: [FLINK-16438][yarn] Make 
YarnResourceManager starts workers using WorkerResourceSpec requested by 
SlotManager
URL: https://github.com/apache/flink/pull/11353#issuecomment-596455079
 
 
   
   ## CI report:
   
   * af8a042f0b1bb2b7492f03144eadabcbf0c9f8d7 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/152422688) Travis: 
[SUCCESS](https://travis-ci.com/github/flink-ci/flink/builds/152422688) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=6082)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Commented] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jingsong Lee (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056593#comment-17056593
 ] 

Jingsong Lee commented on FLINK-16528:
--

+1, it is useful, should not be TOP N, should be pushed down.

> Support Limit push down for streaming sources
> -
>
> Key: FLINK-16528
> URL: https://issues.apache.org/jira/browse/FLINK-16528
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
> into TopN operator and will scan the full data in the source. However, 
> {{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. 
> It doesn't make sense it never stop. 
> We can support such case in streaming mode (ignore the text format):
> {code}
> flink > SELECT * FROM kafka LIMIT 10;
>  kafka_key  |user_name| lang |   created_at
> +-+--+-
>  494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
>  494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
>  494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
>  494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
>  494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
>  494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
>  494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
>  494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
>  494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
>  494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
> (10 rows)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16528:

Labels: usability  (was: )

> Support Limit push down for streaming sources
> -
>
> Key: FLINK-16528
> URL: https://issues.apache.org/jira/browse/FLINK-16528
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
> into TopN operator and will scan the full data in the source. However, 
> {{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. 
> It doesn't make sense it never stop. 
> We can support such case in streaming mode (ignore the text format):
> {code}
> flink > SELECT * FROM kafka LIMIT 10;
>  kafka_key  |user_name| lang |   created_at
> +-+--+-
>  494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
>  494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
>  494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
>  494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
>  494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
>  494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
>  494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
>  494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
>  494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
>  494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
> (10 rows)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16528:

Description: 
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode (ignore the text format):

{code}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{code}


  was:
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:

{code}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{code}



> Support Limit push down for streaming sources
> -
>
> Key: FLINK-16528
> URL: https://issues.apache.org/jira/browse/FLINK-16528
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
> into TopN operator and will scan the full data in the source. However, 
> {{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. 
> It doesn't make sense it never stop. 
> We can support such case in streaming mode (ignore the text format):
> {code}
> flink > SELECT * FROM kafka LIMIT 10;
>  kafka_key  |user_name| lang |   created_at
> +-+--+-
>  494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
>  494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
>  494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
>  494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
>  494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
>  494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
>  494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
>  494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
>  494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
>  494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
> (10 rows)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16528:

Description: 
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:

{code}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{code}


  was:
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:

{{code}}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{{code}}



> Support Limit push down for streaming sources
> -
>
> Key: FLINK-16528
> URL: https://issues.apache.org/jira/browse/FLINK-16528
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
> into TopN operator and will scan the full data in the source. However, 
> {{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. 
> It doesn't make sense it never stop. 
> We can support such case in streaming mode:
> {code}
> flink > SELECT * FROM kafka LIMIT 10;
>  kafka_key  |user_name| lang |   created_at
> +-+--+-
>  494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
>  494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
>  494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
>  494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
>  494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
>  494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
>  494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
>  494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
>  494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
>  494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
> (10 rows)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jark Wu (Jira)

Jark Wu created FLINK-16528:
---

 Summary: Support Limit push down for streaming sources
 Key: FLINK-16528
 URL: https://issues.apache.org/jira/browse/FLINK-16528
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Ecosystem, Table SQL / Planner
Reporter: Jark Wu
 Fix For: 1.11.0


Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:
{{code}}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{{code}}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (FLINK-16528) Support Limit push down for streaming sources

2020-03-10 Thread Jark Wu (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16528:

Description: 
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:

{{code}}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{{code}}


  was:
Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
into TopN operator and will scan the full data in the source. However, 
{{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. It 
doesn't make sense it never stop. 

We can support such case in streaming mode:
{{code}}
flink > SELECT * FROM kafka LIMIT 10;
 kafka_key  |user_name| lang |   created_at
+-+--+-
 494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
 494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
 494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
 494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
 494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
 494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
 494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
 494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
 494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
 494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
(10 rows)
{{code}}



> Support Limit push down for streaming sources
> -
>
> Key: FLINK-16528
> URL: https://issues.apache.org/jira/browse/FLINK-16528
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Ecosystem, Table SQL / Planner
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, a limit query {{SELECT * FROM kafka LIMIT 10}} will be translated 
> into TopN operator and will scan the full data in the source. However, 
> {{LIMIT}} is a very useful feature in SQL CLI to explore data in the source. 
> It doesn't make sense it never stop. 
> We can support such case in streaming mode:
> {{code}}
> flink > SELECT * FROM kafka LIMIT 10;
>  kafka_key  |user_name| lang |   created_at
> +-+--+-
>  494227746231685121 | burncaniff  | en   | 2014-07-29 14:07:31.000
>  494227746214535169 | gu8tn   | ja   | 2014-07-29 14:07:31.000
>  494227746219126785 | pequitamedicen  | es   | 2014-07-29 14:07:31.000
>  494227746201931777 | josnyS  | ht   | 2014-07-29 14:07:31.000
>  494227746219110401 | Cafe510 | en   | 2014-07-29 14:07:31.000
>  494227746210332673 | Da_JuanAnd_Only | en   | 2014-07-29 14:07:31.000
>  494227746193956865 | Smile_Kidrauhl6 | pt   | 2014-07-29 14:07:31.000
>  494227750426017793 | CashforeverCD   | en   | 2014-07-29 14:07:32.000
>  494227750396653569 | FilmArsivimiz   | tr   | 2014-07-29 14:07:32.000
>  494227750388256769 | jmolas  | es   | 2014-07-29 14:07:32.000
> (10 rows)
> {{code}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (FLINK-16526) Escape character doesn't work for computed column

2020-03-10 Thread Kurt Young (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056587#comment-17056587
 ] 

Kurt Young commented on FLINK-16526:


cc [~danny0405]

> Escape character doesn't work for computed column
> -
>
> Key: FLINK-16526
> URL: https://issues.apache.org/jira/browse/FLINK-16526
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: YufeiLiu
>Priority: Major
>
> {code:sql}
> json_row  ROW<`timestamp` BIGINT>,
> `timestamp`   AS `json_row`.`timestamp`
> {code}
> It translate to "SELECT json_row.timestamp FROM __temp_table__"
> Throws exception "Encountered ". timestamp" at line 1, column 157. Was 
> expecting one of:..."



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [flink] gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] Let Netty use Flink's buffers directly in credit-based mode

2020-03-10 Thread GitBox

gaoyunhaii commented on a change in pull request #7368: [FLINK-10742][network] 
Let Netty use Flink's buffers directly in credit-based mode
URL: https://github.com/apache/flink/pull/7368#discussion_r390707610
 
 

 ##
 File path: 
flink-end-to-end-tests/flink-taskmanager-direct-memory-test/src/main/java/org/apache/flink/streaming/tests/TaskManagerDirectMemoryTestProgram.java
 ##
 @@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.tests;
+
+import org.apache.flink.api.java.utils.ParameterTool;
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.ConfigOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
+import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+
+import static org.apache.flink.util.Preconditions.checkArgument;
+
+/**
+ * Test program for taskmanager direct memory consumption.
 
 Review comment:
   Modified the comment.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Commented] (FLINK-16470) Network failure causes Checkpoint Coordinator to flood disk with exceptions

2020-03-10 Thread Jason Kania (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-16470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056583#comment-17056583
 ] 

Jason Kania commented on FLINK-16470:
-

[~gjy], the restart strategy was just the default and had no delay so that 
explains the rapid restart. I did not understand that the restart strategy was 
related to this interaction with the Zookeeper and the RegionStrategy. Thanks 
for connecting the two. I will try with a restart strategy and see what happens 
another time if there is a similar failure.

> Network failure causes Checkpoint Coordinator to flood disk with exceptions
> ---
>
> Key: FLINK-16470
> URL: https://issues.apache.org/jira/browse/FLINK-16470
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / Coordination
>Affects Versions: 1.9.2
> Environment: Latest patch current Ubuntu release with latest java 8 
> JRE.
>Reporter: Jason Kania
>Priority: Major
>
> When a networking error occurred that prevented access to the shared folder 
> mounted over NFS, the CheckpointCoordinator flooded the logs with the 
> following:
>  
> {{org.apache.flink.util.FlinkException: Could not retrieve checkpoint 158365 
> from state handle under /0158365. This indicates that the 
> retrieved state handle is broken. Try cleaning the state handle store.}}
> {{ at 
> org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore.retrieveCompletedCheckpoint(ZooKeeperCompletedCheckpointStore.java:345)}}
> {{ at 
> org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore.recover(ZooKeeperCompletedCheckpointStore.java:175)}}
> {{ at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1014)}}
> {{ at 
> org.apache.flink.runtime.executiongraph.failover.AdaptedRestartPipelinedRegionStrategyNG.resetTasks(AdaptedRestartPipelinedRegionStrategyNG.java:205)}}
> {{ at 
> org.apache.flink.runtime.executiongraph.failover.AdaptedRestartPipelinedRegionStrategyNG.lambda$createResetAndRescheduleTasksCallback$1(AdaptedRestartPipelinedRegionStrategyNG.java:149)}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$scheduleWithDelay$3(FutureUtils.java:202)}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$scheduleWithDelay$4(FutureUtils.java:226)}}
> {{ at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
> {{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
> {{ at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)}}
> {{ at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)}}
> {{ at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)}}
> {{ at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)}}
> {{ at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)}}
> {{ at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)}}
> {{ at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)}}
> {{ at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)}}
> {{ at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)}}
> {{ at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)}}
> {{ at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)}}
> {{ at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)}}
> {{ at akka.actor.Actor.aroundReceive(Actor.scala:517)}}
> {{ at akka.actor.Actor.aroundReceive$(Actor.scala:515)}}
> {{ at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)}}
> {{ at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)}}
> {{ at akka.actor.ActorCell.invoke(ActorCell.scala:561)}}
> {{ at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)}}
> {{ at akka.dispatch.Mailbox.run(Mailbox.scala:225)}}
> {{ at akka.dispatch.Mailbox.exec(Mailbox.scala:235)}}
> {{ at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)}}
> {{ at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)}}
> {{ at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)}}
> {{ at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)}}
> {{Caused by: java.io.FileNotFoundException: 
> /mnt/shared/completedCheckpoint53ed9d9197f7 (No such file or directory)}}
> {{ at java.io.FileInputStream.open0(Native Method)}}
> {{ at java.io.FileInputStream.open(FileInputStream.java:195)}}
> {{ at java.io.FileInputStream.(FileInputStream.java:138)}}
> {{ at 
> org.apache.flink.core.fs.local.LocalDataInputStream.(LocalDataInputStream.java:50)}}
> {{ at 
>

[jira] [Updated] (FLINK-16455) Introduce flink-sql-connector-hive modules to provide hive uber jars

2020-03-10 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-16455:
-
Issue Type: New Feature  (was: Bug)

> Introduce flink-sql-connector-hive modules to provide hive uber jars
> 
>
> Key: FLINK-16455
> URL: https://issues.apache.org/jira/browse/FLINK-16455
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Discussed in: 
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Introduce-flink-connector-hive-xx-modules-td38440.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (FLINK-16455) Introduce flink-sql-connector-hive modules to provide hive uber jars

2020-03-10 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee resolved FLINK-16455.
--
Resolution: Fixed

master: bd18b8715466bfe49d8b31019075f8081a618875

> Introduce flink-sql-connector-hive modules to provide hive uber jars
> 
>
> Key: FLINK-16455
> URL: https://issues.apache.org/jira/browse/FLINK-16455
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Discussed in: 
> [http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Introduce-flink-connector-hive-xx-modules-td38440.html]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

1 2 3 4 5 6 >

1 - 100 of 537 matches

Mail list logo