[jira] [Created] (FLINK-28571) Add Chi-squared test as Transformer to ml.feature

2022-07-15 Thread Simon Tao (Jira)
Simon Tao created FLINK-28571:
-

 Summary: Add Chi-squared test as Transformer to ml.feature
 Key: FLINK-28571
 URL: https://issues.apache.org/jira/browse/FLINK-28571
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Simon Tao


Pearson's chi-squared test: https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test

For more information on chi-squared: http://en.wikipedia.org/wiki/Chi-squared_test
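
For orientation, here is a minimal sketch of the statistic such a Transformer would need to compute for one categorical feature against a label. It is plain Java and does not use any Flink ML API; the class name ChiSquaredSketch and its method names are hypothetical, and the input is assumed to be a contingency table of observed counts (rows are feature categories, columns are label values).

```java
/** Illustrative only: Pearson's chi-squared statistic for a contingency table. */
final class ChiSquaredSketch {

    /** observed[i][j] is the number of samples with feature category i and label j. */
    static double statistic(long[][] observed) {
        int rows = observed.length;
        int cols = observed[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += observed[i][j];
                colSum[j] += observed[i][j];
                total += observed[i][j];
            }
        }
        double chi2 = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count under independence of feature and label.
                double expected = rowSum[i] * colSum[j] / total;
                if (expected > 0) {
                    double diff = observed[i][j] - expected;
                    chi2 += diff * diff / expected;
                }
            }
        }
        return chi2;
    }

    /** Degrees of freedom: (rows - 1) * (cols - 1). */
    static int degreesOfFreedom(long[][] observed) {
        return (observed.length - 1) * (observed[0].length - 1);
    }

    public static void main(String[] args) {
        long[][] table = {{10, 20}, {20, 10}};
        System.out.println("chi2 = " + statistic(table));
        System.out.println("dof  = " + degreesOfFreedom(table));
    }
}
```

For the table above every expected count is 15, so the statistic is 4 * 25 / 15, roughly 6.67, with 1 degree of freedom. Turning the statistic into a p-value requires the chi-squared distribution, which this sketch leaves out.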



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-22089) cdc checkpoint invalid

2021-04-01 Thread Simon Tao (Jira)
Simon Tao created FLINK-22089:
-

 Summary: cdc checkpoint invalid
 Key: FLINK-22089
 URL: https://issues.apache.org/jira/browse/FLINK-22089
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common, Runtime / Checkpointing
Affects Versions: 1.12.0
 Environment: public static void main(String[] args) {
     SourceFunction<String> sourceFunction = MySQLSource.<String>builder()
             .hostname("xxx")
             .port(3306)
             .databaseList("xxx") // monitor all tables under inventory database
             .username("xxx")
             .password("xxx")
             .deserializer(new StringDebeziumDeserializationSchema()) // converts SourceRecord to String
             .build();
     StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
     env.enableCheckpointing(5000);
     try {
         env.setStateBackend(new RocksDBStateBackend("file:///E:/tmp/t2"));
     } catch (IOException e) {
         e.printStackTrace();
     }
     env.addSource(sourceFunction).print().setParallelism(1); // use parallelism 1 for sink to keep message ordering

     try {
         env.execute();
     } catch (Exception e) {
         e.printStackTrace();
     }
 }

 

maven=

 Build plugins:
  maven-assembly-plugin
   configuration: false, jar-with-dependencies
   execution: id make-assembly, phase package, goal assembly
  org.apache.maven.plugins:maven-compiler-plugin
   source/target: 8, 8

 Dependencies (groupId:artifactId:version):
  org.apache.flink:flink-runtime-web_${scala.version}:${flink.version}
  org.apache.flink:flink-connector-jdbc_${scala.version}:${flink.version}
  org.apache.flink:flink-clients_${scala.version}:${flink.version}
  org.apache.flink:flink-test-utils_${scala.version}:${flink.version}
  org.apache.flink:flink-statebackend-rocksdb_${scala.version}:${flink.version}
  com.alibaba.ververica:flink-connector-mysql-cdc:1.3.0
  ru.ivi.opensource:flink-clickhouse-sink:1.3.0

Reporter: Simon Tao
 Fix For: shaded-13.0


I enabled checkpointing, but it does not seem to take effect. While the job 
runs, checkpoint files are written to the local file system and Flink reads the 
CDC records normally, but when I restart the job in IDEA it always re-reads 
records that were already consumed. My code and Maven configuration are in the 
Environment field above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17418) Windows system is not supported currently

2020-04-27 Thread Simon Tao (Jira)
Simon Tao created FLINK-17418:
-

 Summary: Windows system is not supported currently
 Key: FLINK-17418
 URL: https://issues.apache.org/jira/browse/FLINK-17418
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Affects Versions: 1.10.0
Reporter: Simon Tao


Windows is not currently supported when using PyCharm.





[jira] [Created] (FLINK-17411) Add async mode in JDBCLookupFunction

2020-04-27 Thread Simon Tao (Jira)
Simon Tao created FLINK-17411:
-

 Summary: Add async mode in JDBCLookupFunction
 Key: FLINK-17411
 URL: https://issues.apache.org/jira/browse/FLINK-17411
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.10.0
Reporter: Simon Tao


At the moment, JDBCLookupFunction only supports synchronous mode. In some 
scenarios we need asynchronous lookups to improve performance, so we should 
support an async mode in JDBCLookupFunction.
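
To illustrate the idea, here is a minimal sketch of turning a blocking lookup into an asynchronous one by running it on a thread pool and returning a CompletableFuture. It is plain Java and does not use Flink or JDBC; the class AsyncLookupSketch, its constructor parameters, and the in-memory map standing in for a database table are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Illustrative only: wraps a blocking lookup so callers receive a future instead of waiting. */
final class AsyncLookupSketch<K, V> implements AutoCloseable {

    // Stands in for a synchronous JDBC query (assumption for this sketch).
    private final Function<K, V> blockingLookup;
    private final ExecutorService pool;

    AsyncLookupSketch(Function<K, V> blockingLookup, int threads) {
        this.blockingLookup = blockingLookup;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Submits the blocking lookup to the pool and returns immediately. */
    CompletableFuture<V> lookupAsync(K key) {
        return CompletableFuture.supplyAsync(() -> blockingLookup.apply(key), pool);
    }

    @Override
    public void close() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>();
        table.put(1, "alice");
        table.put(2, "bob");
        try (AsyncLookupSketch<Integer, String> lookup = new AsyncLookupSketch<>(table::get, 4)) {
            // Both lookups are in flight before either result is awaited.
            CompletableFuture<String> a = lookup.lookupAsync(1);
            CompletableFuture<String> b = lookup.lookupAsync(2);
            System.out.println(a.join() + ", " + b.join()); // prints "alice, bob"
        }
    }
}
```

The benefit comes from overlapping the waiting time of several lookups; with a thread pool in front of a blocking driver, throughput is still bounded by the pool size and the number of database connections available.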

 

cc [~jark] [~twalthr]


