[jira] [Created] (FLINK-25834) 'flink run' command can not use 'pipeline.classpaths' in flink-conf.yaml

2022-01-26 Thread Ada Wong (Jira)
Ada Wong created FLINK-25834:


 Summary: 'flink run' command can not use 'pipeline.classpaths' in 
flink-conf.yaml
 Key: FLINK-25834
 URL: https://issues.apache.org/jira/browse/FLINK-25834
 Project: Flink
  Issue Type: Bug
  Components: Client / Job Submission, Command Line Client
Affects Versions: 1.14.3
Reporter: Ada Wong


When we use 'flink run' or the CliFrontend class to submit a job without setting -C/--classpath, the 'pipeline.classpaths' value from flink-conf.yaml is silently ignored.

Example:

 flink-conf.yaml content :
{code:java}
pipeline.classpaths: 
file:///opt/flink-1.14.2/other/flink-sql-connector-elasticsearch7_2.12-1.14.2.jar{code}
submit command:
{code:java}
bin/flink run 
/flink14-sql/target/flink14-sql-1.0-SNAPSHOT-jar-with-dependencies.jar{code}
It throws a ClassNotFoundException for the elasticsearch7 connector classes.

There are two reasons for this:
 # ProgramOptions#applyToConfiguration overwrites the 'pipeline.classpaths' value in the configuration with an empty (zero-size) list.
 # ProgramOptions#buildProgram does not set 'pipeline.classpaths' on the PackagedProgram.

To solve problem 1), could we return early from ConfigUtils#encodeCollectionToConfig() when the list size is zero, so that an empty value never overwrites an existing one?

To solve problem 2), could we append the 'pipeline.classpaths' values to the program classpaths and pass them through setUserClassPaths?
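The early-return guard for problem 1) could look like the following minimal sketch. This is an assumption-laden stand-in, not the real Flink code: a plain string map replaces Flink's Configuration, and the real ConfigUtils#encodeCollectionToConfig is generic over the option type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class EncodeGuardSketch {

    // Simplified stand-in for ConfigUtils#encodeCollectionToConfig: a plain
    // string map replaces Flink's Configuration and the signature is reduced
    // to strings. Only the guard at the top is the point of this sketch.
    static void encodeCollectionToConfig(
            Map<String, String> config,
            String key,
            List<String> values,
            Function<String, String> encoder) {
        // Proposed fix: an empty list from the command line must not clobber
        // a value that flink-conf.yaml already put into the configuration.
        if (values == null || values.isEmpty()) {
            return;
        }
        StringBuilder joined = new StringBuilder();
        for (String value : values) {
            if (joined.length() > 0) {
                joined.append(';');
            }
            joined.append(encoder.apply(value));
        }
        config.put(key, joined.toString());
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("pipeline.classpaths", "file:///opt/libs/es7.jar");
        // 'flink run' without -C: the empty list leaves the yaml value alone.
        encodeCollectionToConfig(config, "pipeline.classpaths", List.of(), v -> v);
        System.out.println(config.get("pipeline.classpaths"));
        // prints file:///opt/libs/es7.jar
    }
}
```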



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25804) Loading and running connector code use separated ClassLoader.

2022-01-25 Thread Ada Wong (Jira)
Ada Wong created FLINK-25804:


 Summary: Loading and running connector code use separated 
ClassLoader.
 Key: FLINK-25804
 URL: https://issues.apache.org/jira/browse/FLINK-25804
 Project: Flink
  Issue Type: New Feature
  Components: API / Core, Connectors / Common, Table SQL / Runtime
Affects Versions: 1.14.3
Reporter: Ada Wong


Using multiple connectors together can cause class conflicts, and these conflicts cannot be solved by shading.

The following is example code.
{code:java}
CREATE TABLE es6 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-6',
  'hosts' = 'http://localhost:9200',
  'index' = 'users',
  'document-type' = 'foo'
);


CREATE TABLE es7 (
  user_id STRING,
  user_name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'elasticsearch-7',
  'hosts' = 'http://localhost:9200',
  'index' = 'users'
);

CREATE TABLE ods (
  user_id STRING,
  user_name STRING
) WITH (
  'connector' = 'datagen'
);

INSERT INTO es6 SELECT user_id, user_name FROM ods;
INSERT INTO es7 SELECT user_id, user_name FROM ods;{code}
 
Inspired by the PluginManager, PluginFileSystemFactory and ClassLoaderFixingFileSystem classes, could we create ClassLoaderFixing* classes to avoid class conflicts? For example ClassLoaderFixingDynamicTableFactory, ClassLoaderFixingSink or ClassLoaderFixingSinkFunction.



If we use ClassLoader fixing, every SinkFunction#invoke call will switch the classloader via Thread.currentThread().setContextClassLoader(). Does setContextClassLoader() have a heavy overhead?
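A ClassLoaderFixing wrapper for the SinkFunction case could look like this sketch. The one-method SinkFunction interface here is a hypothetical stand-in (the real Flink interface has more methods and declares checked exceptions); only the save/set/restore pattern around the delegate call is the point.

```java
// Hypothetical one-method stand-in for Flink's SinkFunction.
interface SinkFunction<T> {
    void invoke(T value);
}

// Sketch of a ClassLoaderFixingSinkFunction: every call runs with the thread
// context ClassLoader pinned to the connector's own loader, and the previous
// loader is restored afterwards, even if the delegate throws.
class ClassLoaderFixingSinkFunction<T> implements SinkFunction<T> {
    private final SinkFunction<T> delegate;
    private final ClassLoader connectorClassLoader;

    ClassLoaderFixingSinkFunction(SinkFunction<T> delegate, ClassLoader connectorClassLoader) {
        this.delegate = delegate;
        this.connectorClassLoader = connectorClassLoader;
    }

    @Override
    public void invoke(T value) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(connectorClassLoader);
        try {
            delegate.invoke(value);
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

On the overhead question: setContextClassLoader() is essentially a field write (plus a SecurityManager check when one is installed), so the per-record cost should be small, but it would be worth benchmarking on a hot path before committing to this design.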

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25795) Support Pulsar sink connector in Python DataStream API.

2022-01-24 Thread Ada Wong (Jira)
Ada Wong created FLINK-25795:


 Summary: Support Pulsar sink connector in Python DataStream API.
 Key: FLINK-25795
 URL: https://issues.apache.org/jira/browse/FLINK-25795
 Project: Flink
  Issue Type: New Feature
  Components: API / Python, Connectors / Pulsar
Affects Versions: 1.14.3
Reporter: Ada Wong


[https://issues.apache.org/jira/secure/ViewProfile.jspa?name=knaufk]

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25601) Update 'state.backend' in flink-conf.yaml

2022-01-10 Thread Ada Wong (Jira)
Ada Wong created FLINK-25601:


 Summary: Update 'state.backend' in flink-conf.yaml
 Key: FLINK-25601
 URL: https://issues.apache.org/jira/browse/FLINK-25601
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.14.2
Reporter: Ada Wong


The value and comment for 'state.backend' in flink-conf.yaml use deprecated backend names.
{code:java}
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the
# <class-name-of-factory>.
#
# state.backend: filesystem{code}
We should update it to the following:

 
{code:java}
# Supported backends are 'hashmap', 'rocksdb', or the
# <class-name-of-factory>.
#
# state.backend: hashmap {code}
 

 

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25530) Support Pulsar source connector in Python DataStream API.

2022-01-05 Thread Ada Wong (Jira)
Ada Wong created FLINK-25530:


 Summary: Support Pulsar source connector in Python DataStream API.
 Key: FLINK-25530
 URL: https://issues.apache.org/jira/browse/FLINK-25530
 Project: Flink
  Issue Type: New Feature
  Components: API / Python
Affects Versions: 1.14.2
Reporter: Ada Wong


Flink has already supported the Pulsar source connector:

https://issues.apache.org/jira/browse/FLINK-20726



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25485) JDBC connector implicitly add options when use mysql

2021-12-30 Thread Ada Wong (Jira)
Ada Wong created FLINK-25485:


 Summary: JDBC connector implicitly add options when use mysql
 Key: FLINK-25485
 URL: https://issues.apache.org/jira/browse/FLINK-25485
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.14.2
Reporter: Ada Wong


When we use the MySQL sink directly, the buffer-flush options (sink.buffer-flush.max-rows, sink.buffer-flush.interval) cannot increase throughput.

We must set 'rewriteBatchedStatements=true' in the MySQL JDBC URL for 'sink.buffer-flush' to take effect.

I want to add an appendUrlSuffix method to MySQLDialect to set this JDBC option automatically.

Many users forget this option or don't know it exists.

 

Inspired by Alibaba DataX:

https://github.com/alibaba/DataX/blob/master/plugin-rdbms-util/src/main/java/com/alibaba/datax/plugin/rdbms/util/DataBaseType.java
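The proposed helper could be as small as this sketch. The method name appendUrlSuffix comes from this proposal and is not an existing Flink API; the sketch also leaves an explicit user setting untouched.

```java
public class MySqlUrlSketch {

    // Hypothetical appendUrlSuffix helper for MySQLDialect: ensure that
    // rewriteBatchedStatements=true is present in the JDBC URL unless the
    // user already configured the option themselves.
    static String appendUrlSuffix(String url) {
        if (url.contains("rewriteBatchedStatements")) {
            return url; // respect an explicit user setting
        }
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + "rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        System.out.println(appendUrlSuffix("jdbc:mysql://localhost:3306/db"));
        // prints jdbc:mysql://localhost:3306/db?rewriteBatchedStatements=true
    }
}
```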



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25434) Throw an error when BigDecimal precision overflows.

2021-12-23 Thread Ada Wong (Jira)
Ada Wong created FLINK-25434:


 Summary: Throw an error when BigDecimal precision overflows.
 Key: FLINK-25434
 URL: https://issues.apache.org/jira/browse/FLINK-25434
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner, Table SQL / Runtime
Affects Versions: 1.14.2
Reporter: Ada Wong


 

A lot of data was lost, but no error was thrown.

As the following comment states, if the precision overflows, null is returned:
{code:java}
/** If the precision overflows, null will be returned. */
public static @Nullable DecimalData fromBigDecimal(BigDecimal bd, int precision, int scale) {
    bd = bd.setScale(scale, RoundingMode.HALF_UP);
    if (bd.precision() > precision) {
        return null;
    }

    long longVal = -1;
    if (precision <= MAX_COMPACT_PRECISION) {
        longVal = bd.movePointRight(scale).longValueExact();
    }
    return new DecimalData(precision, scale, longVal, bd);
}{code}
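A stricter variant of the conversion could fail loudly instead of returning null. In this self-contained sketch the rounded BigDecimal stands in for DecimalData; the real fix would live in DecimalData#fromBigDecimal or its callers.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class StrictDecimal {

    // Sketch of the proposed behavior: instead of silently returning null
    // when the precision overflows, raise an error so no data is lost
    // unnoticed. The rounded BigDecimal stands in for DecimalData here.
    static BigDecimal fromBigDecimalStrict(BigDecimal bd, int precision, int scale) {
        bd = bd.setScale(scale, RoundingMode.HALF_UP);
        if (bd.precision() > precision) {
            throw new ArithmeticException(
                    "Value " + bd + " does not fit DECIMAL(" + precision + ", " + scale + ")");
        }
        return bd;
    }
}
```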



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25240) Update log4j2 version to 2.15.0

2021-12-09 Thread Ada Wong (Jira)
Ada Wong created FLINK-25240:


 Summary: Update log4j2 version to 2.15.0 
 Key: FLINK-25240
 URL: https://issues.apache.org/jira/browse/FLINK-25240
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.14.0
Reporter: Ada Wong


Apache Log4j2 versions from 2.0 to 2.14.1 inclusive have an RCE zero-day (CVE-2021-44228).

https://www.cyberkendra.com/2021/12/worst-log4j-rce-zeroday-dropped-on.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25239) Delete useless variables

2021-12-09 Thread Ada Wong (Jira)
Ada Wong created FLINK-25239:


 Summary: Delete useless variables
 Key: FLINK-25239
 URL: https://issues.apache.org/jira/browse/FLINK-25239
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.14.0
Reporter: Ada Wong


These constants in the JDBC connector are unused:
{code:java}
public static final int DEFAULT_FLUSH_MAX_SIZE = 5000;
public static final long DEFAULT_FLUSH_INTERVAL_MILLS = 0L;{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25188) Cannot install PyFlink in M1 CPU

2021-12-06 Thread Ada Wong (Jira)
Ada Wong created FLINK-25188:


 Summary: Cannot install PyFlink in M1 CPU
 Key: FLINK-25188
 URL: https://issues.apache.org/jira/browse/FLINK-25188
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.0
Reporter: Ada Wong


ERROR: Could not find a version that satisfies the requirement 
pandas<1.2.0,>=1.0 (from apache-flink)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25141) Elasticsearch connector customize sink parallelism

2021-12-01 Thread Ada Wong (Jira)
Ada Wong created FLINK-25141:


 Summary: Elasticsearch connector customize sink parallelism
 Key: FLINK-25141
 URL: https://issues.apache.org/jira/browse/FLINK-25141
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / ElasticSearch
Affects Versions: 1.14.0
Reporter: Ada Wong


Inspired by the JDBC and Kafka connectors, add a 'sink.parallelism' option and apply it via SinkProvider#of(sink, sinkParallelism).
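Wiring the option through could start with reading it from the table options, as in this stand-in sketch. The real connector would declare a ConfigOption<Integer> and hand the value to SinkProvider#of(sink, sinkParallelism); the option name here simply follows the JDBC/Kafka convention.

```java
import java.util.Map;
import java.util.Optional;

public class SinkParallelismSketch {

    // Stand-in for reading an optional 'sink.parallelism' table option; the
    // real factory would declare this as a ConfigOption<Integer> and pass
    // the resolved value to SinkProvider#of(sink, sinkParallelism).
    static Optional<Integer> sinkParallelism(Map<String, String> tableOptions) {
        return Optional.ofNullable(tableOptions.get("sink.parallelism"))
                .map(Integer::parseInt);
    }
}
```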

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-24661) ConfigOption add isSecret method to judge sensitive options

2021-10-26 Thread Ada Wong (Jira)
Ada Wong created FLINK-24661:


 Summary: ConfigOption add isSecret method to judge sensitive 
options
 Key: FLINK-24661
 URL: https://issues.apache.org/jira/browse/FLINK-24661
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.13.3
Reporter: Ada Wong


Related ticket https://issues.apache.org/jira/browse/FLINK-24381

[~chesnay]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24381) Hidden password value when Flink SQL connector throw exception.

2021-09-27 Thread Ada Wong (Jira)
Ada Wong created FLINK-24381:


 Summary: Hidden password value when Flink SQL connector throw 
exception.
 Key: FLINK-24381
 URL: https://issues.apache.org/jira/browse/FLINK-24381
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.13.2
Reporter: Ada Wong


The following is the error message. The password is 'bar' and is displayed in plain text.

Could we hide it as password='***' or password='', inspired by the Apache Kafka source code?
{code:java}
Missing required options are:

hosts 

Unable to create a sink for writing table 
'default_catalog.default_database.dws'.

Table options are:

'connector'='elasticsearch7-x'
'index'='foo'
'password'='bar'
at 
org.apache.flink.table.factories.FactoryUtil.createTableSink(FactoryUtil.java:208)
at 
org.apache.flink.table.planner.delegation.PlannerBase.getTableSink(PlannerBase.scala:369)
at 
org.apache.flink.table.planner.delegation.PlannerBase.translateToRel(PlannerBase.scala:221)
at 
org.apache.flink.table.planner.delegation.PlannerBase.$anonfun$translate$1(PlannerBase.scala:159
{code}
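The masking could be applied to the options map before it is rendered into an error message like the one above. A sketch, assuming a simple key-based heuristic; the real fix might instead mark each option as sensitive on the ConfigOption itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionMasker {

    // Sketch: replace sensitive option values with "***" before printing.
    // The key-substring heuristic and the "***" placeholder are assumptions
    // borrowed from the Kafka convention mentioned above.
    static Map<String, String> maskSensitive(Map<String, String> options) {
        Map<String, String> masked = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : options.entrySet()) {
            String key = entry.getKey().toLowerCase();
            boolean sensitive = key.contains("password") || key.contains("secret");
            masked.put(entry.getKey(), sensitive ? "***" : entry.getValue());
        }
        return masked;
    }
}
```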



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23324) Postgres of JDBC Connector enable case-sensitive.

2021-07-09 Thread Ada Wong (Jira)
Ada Wong created FLINK-23324:


 Summary: Postgres of JDBC Connector enable case-sensitive.
 Key: FLINK-23324
 URL: https://issues.apache.org/jira/browse/FLINK-23324
 Project: Flink
  Issue Type: Bug
  Components: Connectors / JDBC
Affects Versions: 1.12.4, 1.13.1
Reporter: Ada Wong


Currently the PostgresDialect is case-insensitive. I think this is a bug.

https://stackoverflow.com/questions/20878932/are-postgresql-column-names-case-sensitive
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS

Could we delete PostgresDialect#quoteIdentifier so that the super class implementation is used?
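Assuming the base dialect quotes identifiers with double quotes (as the ticket implies), the behavioral difference is easy to see: Postgres folds an unquoted identifier to lower case, while a double-quoted identifier keeps its exact case. A minimal sketch of the base-class behavior to fall back to:

```java
public class PgQuoteSketch {

    // Double-quoting as assumed for the base JdbcDialect: "userName" stays
    // mixed-case in Postgres, whereas the unquoted identifier userName
    // would be folded to username by the server.
    static String quoteIdentifier(String identifier) {
        return "\"" + identifier + "\"";
    }
}
```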



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23263) LocalBufferPool can not request memory.

2021-07-05 Thread Ada Wong (Jira)
Ada Wong created FLINK-23263:


 Summary: LocalBufferPool can not request memory.
 Key: FLINK-23263
 URL: https://issues.apache.org/jira/browse/FLINK-23263
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.10.1
Reporter: Ada Wong


The Flink job is running, but it cannot consume Kafka data. The following is the stack trace of the blocked task thread:

"Map -> to: Tuple2 -> Map -> (from: (id, sid, item, val, unit, dt,
after_index, tablename, PROCTIME) -> where: (AND(=(tablename,
CONCAT(_UTF-16LE't_real', currtime2(dt, _UTF-16LE'MMdd'))),
OR(=(after_index, _UTF-16LE'2.6'), AND(=(sid, _UTF-16LE'7296'),
=(after_index, _UTF-16LE'2.10'))), LIKE(item, _UTF-16LE'??5%'),
=(MOD(getminutes(dt), 5), 0))), select: (id AS ID, sid AS STID, item
AS ITEM, val AS VAL, unit AS UNIT, dt AS DATATIME), from: (id, sid,
item, val, unit, dt, after_index, tablename, PROCTIME) -> where:
(AND(=(tablename, CONCAT(_UTF-16LE't_real', currtime2(dt,
_UTF-16LE'MMdd'))), OR(=(after_index, _UTF-16LE'2.6'), AND(=(sid,
_UTF-16LE'7296'), =(after_index, _UTF-16LE'2.10'))), LIKE(item,
_UTF-16LE'??5%'), =(MOD(getminutes(dt), 5), 0))), select: (sid AS
STID, val AS VAL, dt AS DATATIME)) (1/1)" #79 prio=5 os_prio=0
tid=0x7fd4c4a94000 nid=0x21c0 in Object.wait()
[0x7fd4d5719000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment(LocalBufferPool.java:251)
- locked <0x00074e6c8b98> (a java.util.ArrayDeque)
at 
org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(LocalBufferPool.java:218)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.requestNewBufferBuilder(RecordWriter.java:264)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.copyFromSerializerToTargetChannel(RecordWriter.java:192)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:162)
at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128)
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107)
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89)
at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at 
org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
at 
org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:37)
at 
org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:28)
at DataStreamCalcRule$4160.processElement(Unknown Source)
at 
org.apache.flink.table.runtime.CRowProcessRunner.processElement(CRowProcessRunner.scala:66)
at 
org.apache.flink.table.runtime.CRowProcessRunner.processElement(CRowProcessRunner.scala:35)
at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:488)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:465)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:425)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at 
org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
at 
org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:37)
at 
org.apache.flink.table.runtime.CRowWrappingCollector.collect(CRowWrappingCollector.scala:28)
at DataStreamSourceConversion$4104.processElement(Unknown Source)
at 
org.apache.flink.table.runtime.CRowOutputProcessRunner.processElement(CRowOutputProcessRunner.scala:70)
at 
org.apache.flink.streaming.api.operators.ProcessOperator.processElement(ProcessOperator.java:66)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.pushToOperator(OperatorChain.java:488)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:465)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingOutput.collect(OperatorChain.java:425)
at 
org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingBroadcastingOutputCollector.collect(OperatorChain.java:686)

[jira] [Created] (FLINK-23074) There is a class conflict between flink-connector-hive and flink-parquet

2021-06-21 Thread Ada Wong (Jira)
Ada Wong created FLINK-23074:


 Summary: There is a class conflict between flink-connector-hive 
and flink-parquet
 Key: FLINK-23074
 URL: https://issues.apache.org/jira/browse/FLINK-23074
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Reporter: Ada Wong


flink-connector-hive 2.3.6 includes parquet-hadoop 1.8.1, but flink-parquet includes 1.11.1. The class org.apache.parquet.hadoop.example.GroupWriteSupport differs between the two versions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-22776) Delete casting to byte[]

2021-05-25 Thread Ada Wong (Jira)
Ada Wong created FLINK-22776:


 Summary: Delete casting to byte[]
 Key: FLINK-22776
 URL: https://issues.apache.org/jira/browse/FLINK-22776
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.13.0
Reporter: Ada Wong
 Attachments: image-2021-05-26-10-38-04-578.png

Casting to 'byte[]' is redundant. Could we delete it?

 !image-2021-05-26-10-38-04-578.png|thumbnail! 





--
This message was sent by Atlassian Jira
(v8.3.4#803005)